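Problem
In a string, match the feature substrings that start with Dog or Cat, each including the one word that follows, with words separated by whitespace.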
import re
s = "Dog wins and Cat fails and ...."
# Raw string so that \s reaches the regex engine as written
pattern = re.compile(r"(Dog|Cat)\s[a-z]+\s")
result = pattern.findall(s)
At this point, result turns out to be
['Dog', 'Cat']
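What we want is ['Dog wins', 'Cat fails']. The whole match is lost because, when a pattern contains a capturing group, findall returns the group's content instead of the full match.
How should the regex be changed?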
- Solution 1: wrap the whole regex in parentheses, as follows
pattern = re.compile(r"((Dog|Cat)\s[a-z]+\s)")
The result is
[('Dog wins ', 'Dog'), ('Cat fails ', 'Cat')]
Clearly the result is redundant: the second item of each tuple is not needed.
- Solution 2: use a (?:) non-capturing group
pattern = re.compile(r"((?:Dog|Cat)\s[a-z]+\s)")
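If the pattern has to stay as in Solution 1, one possible workaround is to drop the redundant item after the fact by keeping only the first element of each tuple. The snippet below is a minimal sketch of that idea; the names full_matches, whole and _animal are made up for illustration.

```python
import re

s = "Dog wins and Cat fails and ...."
pattern = re.compile(r"((Dog|Cat)\s[a-z]+\s)")

# Each findall item is a (whole match, inner group) tuple; keep the first part.
full_matches = [whole for whole, _animal in pattern.findall(s)]
print(full_matches)  # ['Dog wins ', 'Cat fails ']
```

This works, but it leaves the unused group in the pattern, which is what Solution 2 avoids.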
result = pattern.findall(s)
The result is:
['Dog wins ', 'Cat fails ']
When matching, the regex engine does not capture the content of a group marked with (?:). Quoted from Stack Overflow:
"?: is used when you want to group an expression, but you do not want to save it as a matched/captured portion of the string"
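As a small follow-up sketch: once the inner group is non-capturing, the outer parentheses are arguably optional, since findall returns the whole match when the pattern has no capturing group. The second pattern is an assumption about the intent rather than part of the original solution: it swaps the trailing \s for a lookahead (?=\s), so the whitespace is still required but not included, which gives exactly the desired ['Dog wins', 'Cat fails']. The variable names are illustrative.

```python
import re

s = "Dog wins and Cat fails and ...."

# No capturing group at all: findall returns the whole match.
with_space = re.findall(r"(?:Dog|Cat)\s[a-z]+\s", s)
print(with_space)  # ['Dog wins ', 'Cat fails ']

# The lookahead (?=\s) checks for whitespace without consuming it.
without_space = re.findall(r"(?:Dog|Cat)\s[a-z]+(?=\s)", s)
print(without_space)  # ['Dog wins', 'Cat fails']
```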