Calculate the length of a sequence after adding the length of previous sequences
I want to determine the length of each individual sequence in a multi-FASTA file. I got this Biopython code from the manual:

    from Bio import SeqIO
    import sys

    cmdargs = str(sys.argv)
    for seq_record in SeqIO.parse(str(sys.argv[1]), "fasta"):
        output_line = '%s\t%i' % \
            (seq_record.id, len(seq_record))
        print(output_line)

My input file looks like this:

    >Protein1
    MNT
    >Protein2
    TSMN
    >Protein3
    TTQRT

The code yields:

    Protein1 3
    Protein2 4
    Protein3 5

But I want to calculate the length of each sequence after adding the lengths of the previous sequences, so that every sequence gets a range that starts where the previous one ended. The output would look like this:

    Protein1 1-3
    Protein2 4-7
    Protein3 8-12

I don't know which line of the code above I need to change to get that output. I'd appreciate any help on this issue, thanks!
Best answer
It is easy to get just the running total length:

    from Bio import SeqIO
    import sys

    total_len = 0  # combined length of all sequences read so far
    for seq_record in SeqIO.parse(sys.argv[1], "fasta"):
        total_len += len(seq_record)
        output_line = '%s\t%i' % (seq_record.id, total_len)
        print(output_line)

To get a range, save the total before adding the current sequence's length. The range then runs from the saved total plus one to the new total:

    from Bio import SeqIO
    import sys

    total_len = 0  # combined length of all sequences read so far
    for seq_record in SeqIO.parse(sys.argv[1], "fasta"):
        previous_total_len = total_len  # where the previous sequence ended
        total_len += len(seq_record)
        # 1-based, inclusive range, e.g. "Protein2<TAB>4-7"
        output_line = '%s\t%i-%i' % (seq_record.id, previous_total_len + 1, total_len)
        print(output_line)
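The range logic in the answer does not depend on Biopython, so it can be pulled out and checked without a FASTA file. The following is only a minimal sketch of that idea: the function name `cumulative_ranges` and the `(id, length)` input pairs are assumptions made for illustration and are not part of the original answer or of the Biopython API.

```python
from typing import Iterable, Iterator, Tuple


def cumulative_ranges(
    records: Iterable[Tuple[str, int]],
) -> Iterator[Tuple[str, int, int]]:
    """Yield (id, start, end) using 1-based, inclusive coordinates.

    Hypothetical helper: `records` is any iterable of (id, length) pairs.
    """
    offset = 0  # combined length of all records seen so far
    for record_id, length in records:
        yield record_id, offset + 1, offset + length
        offset += length


# Lengths taken from the example input in the question.
example = [("Protein1", 3), ("Protein2", 4), ("Protein3", 5)]
for record_id, start, end in cumulative_ranges(example):
    print(f"{record_id}\t{start}-{end}")
```

With Biopython installed, the pairs could be produced from a file with a generator such as `((rec.id, len(rec)) for rec in SeqIO.parse(path, "fasta"))`, where `path` is the input file name.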