I am trying to match rows in a file that contain a given string, say ACTGGGTAAACTA. If I run
grep "ACTGGGTAAACTA" file
it gives me the rows that have exact matches. Is there a way to allow a certain number of mismatches (substitutions, insertions, or deletions)? For example, I am looking for sequences with:
Up to 3 allowed substitutions, like "AGTGGGTAACCAA", etc.
Insertions/deletions (a partial match like "ACTGGGAAAATAAACTA" or "ACTAAACTA")
Recommended answer
There used to be a tool called agrep for fuzzy regex matching, but it was abandoned.
en.wikipedia/wiki/Agrep has a bit of history and links to related tools.
github/Wikinaut/agrep looks like a revived open source release, but I have not tested it.
Failing that, see if you can find tre-agrep for your distro.
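If neither tool is available, the same idea can be approximated in a few lines of plain Python. The sketch below is not agrep and makes no claim to match its behavior or speed. It assumes one sequence per line and scores each line by the smallest number of errors between the pattern and any substring of that line. The function names (`min_edit_distance_in`, `min_substitutions_in`, `fuzzy_grep`) are made up for this illustration.

```python
def min_edit_distance_in(pattern, text):
    """Smallest edit distance between pattern and any substring of text.

    Standard dynamic programming for approximate substring search:
    substitutions, insertions and deletions each cost 1.
    """
    # prev[j] = cheapest way to match pattern[:j] against a substring
    # of text ending at the previous text position.
    prev = list(range(len(pattern) + 1))
    best = prev[-1]
    for ch in text:
        cur = [0]  # a match may start anywhere in text, so the empty prefix is free
        for j, pc in enumerate(pattern, start=1):
            cur.append(min(
                prev[j - 1] + (pc != ch),  # match or substitution
                prev[j] + 1,               # extra character in text (insertion)
                cur[j - 1] + 1,            # pattern character missing (deletion)
            ))
        best = min(best, cur[-1])
        prev = cur
    return best


def min_substitutions_in(pattern, text):
    """Smallest number of mismatches over all same-length windows of text."""
    m = len(pattern)
    windows = (text[i:i + m] for i in range(len(text) - m + 1))
    counts = (sum(a != b for a, b in zip(pattern, w)) for w in windows)
    # If text is shorter than pattern there is no window, so nothing can match.
    return min(counts, default=float("inf"))


def fuzzy_grep(lines, pattern, max_errors, substitutions_only=False):
    """Return the lines that contain pattern with at most max_errors errors."""
    score = min_substitutions_in if substitutions_only else min_edit_distance_in
    return [ln for ln in lines if score(pattern, ln.rstrip("\n")) <= max_errors]


if __name__ == "__main__":
    # Example sequences taken from the question above.
    rows = ["ACTGGGTAAACTA", "AGTGGGTAACCAA", "ACTGGGAAAATAAACTA", "ACTAAACTA"]
    print(fuzzy_grep(rows, "ACTGGGTAAACTA", 3, substitutions_only=True))
    print(fuzzy_grep(rows, "ACTGGGTAAACTA", 4))
```

To run it on a real file, pass an open file handle as `lines`, for example `fuzzy_grep(open("file"), "ACTGGGTAAACTA", 3)`. This does work proportional to pattern length times line length for every line, so a dedicated tool is likely the better choice for large inputs.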