I'm hoping somebody can teach me how to do this task.
I think awk might be a good fit for this, but I'm a real beginner.
I have a file like the one below (tab-separated; the actual file is much bigger). The important columns are the second and the ninth (235 and 15 in the first line of the file).
    S 235 1365 * 0 * * * 15 1 c81 592
    H 235 296 99.7 + 0 0 3I296M1066I 14 1 s15018 1
    H 235 719 95.4 + 0 0 174D545M820I 15 1 c2664 10
    H 235 764 99.1 + 0 0 55I764M546I 15 1 c6519 4
    H 235 792 100 + 0 0 180I792M393I 14 1 c407 107
    S 236 1365 * 0 * * * 15 1 c474 152
    H 236 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
    H 236 301 99.7 - 0 0 908I301M156I 15 1 s8443 1
    H 236 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
    H 236 97 97.9 - 0 0 732I97M536I 17 1 s11472 1

I would like to extract lines by specifying values for the ninth column, with the second column acting as a pivot column. By "pivot column" I mean that lines with the same value in the second column are treated as a single set of data. A set should be extracted only if every line in it has one of the specified values in the ninth column.
So, for example, if I specify "14" and "15" for the ninth column, the output will be:
    S 235 1365 * 0 * * * 15 1 c81 592
    H 235 296 99.7 + 0 0 3I296M1066I 14 1 s15018 1
    H 235 719 95.4 + 0 0 174D545M820I 15 1 c2664 10
    H 235 764 99.1 + 0 0 55I764M546I 15 1 c6519 4
    H 235 792 100 + 0 0 180I792M393I 14 1 c407 107

The 6th and 8th lines have "15" in their ninth column, but other lines in the same set (second column: 236) have values other than "14" or "15", so I do not want to extract any lines from that set.
Recommended answer

    $ cat tst.awk
    # A new pivot value in column 2 means the previous set is complete.
    $2 != prevPivot { prtCurrSet() }
    # Any ninth-column value other than 14 or 15 disqualifies the whole set.
    $9 !~ /^1[45]$/ { isBadSet=1 }
    # Buffer every line of the current set until we know whether it is good.
    { currSet = currSet $0 ORS; prevPivot = $2 }
    END { prtCurrSet() }

    function prtCurrSet() {
        if ( !isBadSet ) {
            printf "%s", currSet
        }
        currSet = ""
        isBadSet = 0
    }

    $ awk -f tst.awk file
    S 235 1365 * 0 * * * 15 1 c81 592
    H 235 296 99.7 + 0 0 3I296M1066I 14 1 s15018 1
    H 235 719 95.4 + 0 0 174D545M820I 15 1 c2664 10
    H 235 764 99.1 + 0 0 55I764M546I 15 1 c6519 4
    H 235 792 100 + 0 0 180I792M393I 14 1 c407 107
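The script in the answer buffers one set at a time, so it relies on lines with the same second-column value being adjacent, and the accepted values are written into the regex. If either of those does not hold for your data, one possible variant is a two-pass approach that reads the file twice and takes the accepted values as a parameter. This is only a sketch, not part of the original answer: the function name `filter_sets`, the awk variable `allowed`, and the file name `sample.txt` are made up for illustration, and it assumes the default whitespace field splitting is good enough for your file (no empty fields).

```shell
# Hypothetical sample file reproducing the rows from the question.
cat > sample.txt <<'EOF'
S 235 1365 * 0 * * * 15 1 c81 592
H 235 296 99.7 + 0 0 3I296M1066I 14 1 s15018 1
H 235 719 95.4 + 0 0 174D545M820I 15 1 c2664 10
H 235 764 99.1 + 0 0 55I764M546I 15 1 c6519 4
H 235 792 100 + 0 0 180I792M393I 14 1 c407 107
S 236 1365 * 0 * * * 15 1 c474 152
H 236 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
H 236 301 99.7 - 0 0 908I301M156I 15 1 s8443 1
H 236 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
H 236 97 97.9 - 0 0 732I97M536I 17 1 s11472 1
EOF

# filter_sets ALLOWED FILE
#   ALLOWED is a comma-separated list of accepted ninth-column values.
#   FILE is read twice, so it must be a regular file, not a pipe.
filter_sets() {
    awk -v allowed="$1" '
        BEGIN {
            # Turn "14,15" into a lookup table: ok["14"], ok["15"].
            n = split(allowed, vals, ",")
            for (i = 1; i <= n; i++) ok[vals[i]] = 1
        }
        # Pass 1: remember every pivot value (column 2) that has at
        # least one line whose ninth column is not an accepted value.
        NR == FNR { if (!($9 in ok)) bad[$2] = 1; next }
        # Pass 2: print only lines whose pivot value was never marked.
        !($2 in bad)
    ' "$2" "$2"
}

filter_sets "14,15" sample.txt
```

The trade-off is that the file is read twice, but memory use then depends on the number of distinct pivot values rather than on the size of the largest set, and the order of the input lines no longer matters.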