I'm trying to write a shell script that searches for text within a file and prints out the text and associated information to a separate file.
From this file containing list of gene IDs:
    DDIT3 ENSG00000175197
    DNMT1 ENSG00000129757
    DYRK1B ENSG00000105204
I want to look up each of these gene IDs (ENSG*) in a GTF file and pull out its RPKM1 and RPKM2 values:
    chr16 gencodeV7 gene 88772891 88781784 0.126744 + . gene_id "ENSG00000174177.7"; transcript_ids "ENST00000453996.1,ENST00000312060.4,ENST00000378384.3,"; RPKM1 "1.40735"; RPKM2 "1.61345"; iIDR "0.003";
    chr11 gencodeV7 gene 55850277 55851215 0.000000 + . gene_id "ENSG00000225538.1"; transcript_ids "ENST00000425977.1,"; RPKM1 "0"; RPKM2 "0"; iIDR "NA";
and write the results to a separate output file in this format:
    Gene_ID RPKM1 RPKM2
    ENSG00000108270 7.81399 8.149
    ENSG00000101126 12.0082 8.55263
I've done this on the command line, one ID at a time, using:
    grep -w "ENSGno" rnaseq.gtf | awk '{print $10,$13,$14,$15,$16}' > output.file
but when it comes to writing the shell script, I've tried various combinations of for, while, read, do and changing the variables but without success. Any ideas would be great!
Recommended answer
You can do something like:
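    while read -r line
    do
        # The gene ID is the second column of each line in geneIDs.file
        var=$(echo "$line" | awk '{print $2}')
        # Append the matching GTF fields for this ID to the output file
        grep -w "$var" rnaseq.gtf | awk '{print $10,$13,$14,$15,$16}' >> output.file
    done < geneIDs.file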
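The loop above rereads rnaseq.gtf once per gene ID and prints the quoted attribute text as-is. If the goal is the three-column table shown in the question, one possible alternative is a single awk pass. This is only a sketch, not part of the original answer: the function name extract_rpkm is made up for illustration, and it assumes the gene ID sits in column 2 of the ID file and that the GTF attributes appear as `key "value";` pairs, as in the sample lines.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for this example only):
#   extract_rpkm GENE_ID_FILE GTF_FILE
# Prints a header line, then "gene_id RPKM1 RPKM2" for every GTF record
# whose gene_id (version suffix removed) is listed in GENE_ID_FILE.
extract_rpkm() {
    awk '
        BEGIN { print "Gene_ID", "RPKM1", "RPKM2" }

        # First file: remember the gene ID found in column 2.
        NR == FNR { if (NF >= 2) wanted[$2] = 1; next }

        # Second file: scan the attribute fields by keyword instead of
        # relying on fixed positions such as $10 or $14.
        {
            id = ""; r1 = ""; r2 = ""
            for (i = 9; i < NF; i++) {
                val = $(i + 1)
                gsub(/[";]/, "", val)            # drop quotes and semicolon
                if ($i == "gene_id") {
                    id = val
                    sub(/\.[0-9]+$/, "", id)     # ENSG00000174177.7 becomes ENSG00000174177
                } else if ($i == "RPKM1") {
                    r1 = val
                } else if ($i == "RPKM2") {
                    r2 = val
                }
            }
            if (id in wanted) print id, r1, r2
        }
    ' "$1" "$2"
}
```

With the file names used in the answer, it could be called as `extract_rpkm geneIDs.file rnaseq.gtf > output.file`. Rows come out in GTF order rather than in the order of the ID list, and IDs that are missing from the GTF are silently skipped.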