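Problem: count how many times pattern occurs in text. Overlapping occurrences count: TGGACTCTG is one match, and TGGACTCTGGACTCTG contains two, because the second match starts at the final TG of the first.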
text:GGGTTATGGACTCTGGACTCTCAATGGACTCTGGACTCACTGGACTCTGGACTCGGCAATGGACTCATGGACTCTTGGACTCTTGGACTCTACTCGTGGACTCTGGACTCTAATGGACTCAATGGACTCTGGACTCATTTGGACTCGTTGGACTCCTGGACTCGTTGGACTCTGGACTCTTGGACTCCACTCTGGACTCACTGGACTCTGGACTCTTTGGACTCACCGCTGGACTCCTATGGACTCAGTGGACTCGAGTGGACTCCAGTAGTAGGATGTGGACTCTTGGACTCCGGGGACTTAAGCCTGGACTCGGTGGACTCACGATGGACTCTGGACTCTAACTGGACTCGGTGGACTCTGGACTCACCGACAGCGTCTGGACTCCACTGCAGAATGGACTCTGTGGACTCGAACTGGACTCGGCGGGAGATTTGGACTCATGGACTCTGGACTCACTGGACTCCATGGACTCAATTTTGGACTCTGGACTCTCGTGGACTCTGGACTCCCCTGGACTCCTATGGACTCCGTGGACTCTGGACTCTGGACTCTGGACTCTGGACTCGGACTGGACTCATGGACTCTGGACTCTGGACTCCGTTGGACTCTTGGACTCCTCTGGACTCTGTGGACTCAGTGGACTCTTGGACTCGTGGACTCATGGACTCGGTTTTAGTGGACTCGGTATGGACTCTCTGGACTCTGGACTCAGCATGGACTCATGGACTCCCGAGTCATGGACTCTGGACTCGGACGCTGGACTCTGGACTCTATGGACTCATTGGACTCTGGACTCTTTGGACTCAAATGGACTCACTGGACTCTGGACTCCATGGACTCGTGGACTCACTGGACTCGATCCGAATGGACTCTGGACTCACTCCTGGACTCCATGGACTCTTGGACTCGGGTGGACTCGTGGACTCCTCTGGACTCAGATGGACTCAGACCCTGGACTCGTTGGACTCTCTTTGGACTC
pattern:TGGACTCTG
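To make the overlap rule concrete, here is a minimal sketch that lists where the pattern starts in the short example from the problem statement. The helper name match_starts is hypothetical and used only for illustration.

```python
def match_starts(text, pattern):
    """Return every 0-based index at which pattern begins in text."""
    k = len(pattern)
    return [i for i in range(len(text) - k + 1) if text[i:i + k] == pattern]

example = "TGGACTCTGGACTCTG"
pattern = "TGGACTCTG"

# The pattern is 9 letters long, so matches starting at 0 and 7 share two letters.
print(match_starts(example, pattern))  # [0, 7]

# str.count only counts non-overlapping matches, so it reports one here.
print(example.count(pattern))  # 1
```

This is why the implementations that follow compare every window of the text instead of calling a built-in substring counter.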
Implementations:
Python implementation:
# Line 1 of the dataset file holds the text, line 2 holds the pattern.
path = "dataset_2_6.txt"
with open(path, encoding='utf-8') as f:
    lines = f.read().splitlines()
text = lines[0].strip()
pattern = lines[1].strip()

def PatternCount(text, pattern):
    # Compare every window of len(pattern) letters, so overlapping matches count.
    count = 0
    for i in range(len(text) - len(pattern) + 1):
        if text[i:i+len(pattern)] == pattern:
            count = count + 1
    return count

print(PatternCount(text, pattern))
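The loop above tests every start position in the text. As a possible alternative, the sketch below jumps from match to match with str.find and restarts the search one position after each match start, so overlaps are still counted. The function name pattern_count_find is hypothetical, and returning 0 for an empty pattern is an assumption, not something the problem statement specifies.

```python
def pattern_count_find(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    if not pattern:
        return 0  # assumption: an empty pattern is treated as having no matches
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Restart one position after the match start, not after its end,
        # so a match that begins inside the previous one is still found.
        start = text.find(pattern, start + 1)
    return count

print(pattern_count_find("TGGACTCTGGACTCTG", "TGGACTCTG"))  # 2
```

Both versions should give the same count on the same input; this one simply avoids slicing the text at positions where no match begins.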
R implementation:
library(stringr)
# Row 1 of the dataset file holds the text, row 2 holds the pattern.
data = read.table('dataset_2_6.txt', stringsAsFactors = FALSE)
text = data[1, 1]
pattern = data[2, 1]
k = nchar(pattern)
ncount = 0
# Compare every window of k letters with the pattern, so overlapping matches count.
for (i in seq_len(max(nchar(text) - k + 1, 0))) {
  if (str_sub(text, i, i + k - 1) == pattern) {
    ncount = ncount + 1
  }
}
print(ncount)
Shell implementation:
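# Requires bash: uses substring expansion and the C-style for loop.
text=$(sed -n '1p' dataset_2_6.txt)
pattern=$(sed -n '2p' dataset_2_6.txt)
# ${#var} expands to the length of the string.
len_text=${#text}
len_pattern=${#pattern}
len_range=$((len_text - len_pattern))
count=0
# Compare every window of len_pattern letters, so overlapping matches count.
for ((i = 0; i <= len_range; i++))
do
    temp=${text:$i:$len_pattern}
    if [ "$temp" = "$pattern" ]; then
        count=$((count + 1))
    fi
done
echo "$count"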