我必须创建一个awk脚本来实现以下转换:
I have to create an awk script to achieve the following transformation:
FLAG1和FLAG2:
FLAG1 and FLAG2 are filled given the following conditions: if the VAL is ":" then NUM is null if the VAL is ":" and FLAG "c" then NUM is null and FLAG1 is "c" if the VAL is ":" and FLAG "u" then NUM is null and FLAG2 is "u" if the VAL is "14,385" and FLAG "d" then NUM is "14385" and FLAG(both) is null if the VAL is "14,385" and FLAG "du" then NUM is "14385" and FLAG2 is "u" if the VAL is ":" and FLAG "cd" then NUM is null and FLAG1 is "c" if the VAL is ":" and FLAG "bc" then NUM is null and FLAG1 is "c" and FLAG2 is "b" if the VAL is ":" and FLAG "z" then NUM is 0 and FLAG2 is "z"
csv输入文件是:
"PRIM", "TRD", "GTR", "VAL", "FLAG" "TPP", "T5-78", "HT", ":", c "TCP", "T5-78", "HT", "12,385", c "TZP", "T5-78", "HT", ":", z "TNP", "T5-78", "HT", ":", z "TNP", "T5-78", "HT", ":", cd "TNP", "T5-78", "HT", ":", du "TNP", "T5-78", "HT", "12,524,652", dfg输出的.dat文件应如下所示:
The output .dat file should look like this:
PRIM TRD GTR NUM FLAG1 FLAG2 TPP T5-78 HT null c null TCP T5-78 HT 12385 c null TZP T5-78 HT 0 null z TNP T5-78 HT 0 null z TNP T5-78 HT null c null TNP T5-78 HT null null u TNP T5-78 HT 12524652 null dfg我尝试的代码无法正常工作,因为仅满足前三个要求,而第四个则无法工作.
The code I have tried doesn't work properly as only the first 3 requirements are met while the 4th just doesn't work.
BEGIN { FS=","; OFS="\t"; a["PRIM"]=1;a["TRD"]=1;a["GTR"]=1;a["VAL"]=1;a["FLAG"]=1; } NR==1 { { $a["VAL"] = "NUMB" ; $a["FLAG"] = "FLAG1" ; $5 = "FLAG2" ; print ; next } $a["VAL"]=="12,385" && $a["FLAG"] == "d" { $a["VAL"] = "14385" ; $a["FLAG"] = $5 = "" } $a["VAL"]=="12,385" && $a["FLAG"] == "du" { $a["VAL"] = "14385" ; $a["FLAG"] = "" ; $9 = "u" } $a["VAL"] != ":" { print ; next } $a["FLAG"] == "z" { $a["VAL"] = "0" ; $a["FLAG"] = "" ; $5 = "z" } $a["FLAG"] != "z" { $a["VAL"] = "" } $NF=substr($NF,1,length($NF)-1); for(i=1;i<=NF;i++) if($i in a) a[$i]=i; } { print $a["PRIM"],$a["TRD"],$a["GTR"],NR==1?"NUM":$a["VAL"], NR==1?"FLAG1"OFS"FLAG2":($a["FLAG"]?""OFS$a["FLAG"]:$a["FLAG"]);
这是我认为可以使用的最新代码.现在我无法解决的问题是最后一个值(FLAG2)打印在第二行上.我试图放OFS,但不能解决问题.您能告诉我在这种情况下怎么了.
Here is the latest code which I think it will work. the problem now which I cannot solve is that the last value (FLAG2) is printed on the second line. I tried to put OFS but it doesn't fix the problem. Could you please tell me what is wrong in this case.
BEGIN { FS=","; OFS="\t"; a["PRIM"]=1; a["TRD"]=1; a["GTR"]=1; a["VAL"]=1; a["FLAG"]=1; a["FLAG1"]=1; a["FLAG2"]=1; } NR==1 { $NF=substr($NF,1,length($NF)-1); for(i=1;i<=NF;i++) #if($i in a) a[$i]=i; a["FLAG1"] = i; a["FLAG2"]=i; a["FLAG1"] = a["FLAG"]; # just for testing and it is ok a["FLAG2"] = a["FLAG"]; # just for testing and it is ok } { print $a["PRIM"],$a["TRD"],$a["GTR"],NR==1?"NUM":$a["VAL"], NR==1?"FLAG1":$a["FLAG1"],NR==1?"FLAG2":$a["FLAG2"];}
输出类似于
PRIM TRD GTR NUM FLAG1 FLAG2 TPP T5-78 HT null c null TCP T5-78 HT 12385 c null TZP T5-78 HT 0 null z
经过这么多建议,这是我的最后一个版本,但是仍然不成功...现在,当我添加if语句来满足上述要求时,什么也没有发生.我相信if语句要么不正确,要么放在正确的位置. 如果NR> 1是灾难性的,则打印这些值. 您能告诉我我的脚本出了什么问题吗?我不得不承认,我是3天前开始使用这个awk的,到目前为止很痛苦...问题是我应该从上周开始就完成此脚本
After so many advices this is my last version but it is still unsuccesful... Now when I'm adding if statements for fulfilling the requirements above, nothing happens. I believe the if statements either are not correct nor putted in the right place. printing the values if NR>1 is a catastropha.. could you pls tell me what is wrong with my script? I have to admit that I started with this awk 3 days ago and so far it is painful...the problem is that I have should have finished this script since last week
BEGIN { FS=","; OFS="\t"; a["PRIM"]=1; a["TRD"]=1; a["GTR"]=1; a["VAL"]=1; a["FLAG"]=1; a["FLAG1"]=1; a["FLAG2"]=1; } NR==1 { $NF=substr($NF,1,length($NF)-1); for(i=1;i<=NF;i++) #if($i in a) a[$i]=i; #a["FLAG1"] = a[i]; #a["FLAG2"]=a[i]; a["FLAG1"] = a["FLAG"]; a["FLAG2"] = a["FLAG"]; } { #initialisation of the new flags a["FLAG1"]==""; a["FLAG2"]==""; } #MY IF STATEMENTS GO HERE - TEST MODE a["FLAG"] == "cd" {a["FLAG1"]= "c"} a["FLAG"] == "du" {a["FLAG2"]= "u"} { #print header print $a["PRIM"],$a["TRD"],$a["GTR"],NR==1?"NUM":$a["VAL"], NR==1?"FLAG1":$a["FLAG1"],NR==1?"FLAG2":$a["FLAG2"]; } #print content NR>1 { for(j=1;j<=NF;j++) #if($i in a) a[$j]=j; #a["FLAG1"] = a[i]; #a["FLAG2"]=a[i]; a["FLAG1"] = a["FLAG"]; a["FLAG2"] = a["FLAG"]; } #MY IF STATEMENTS GO HERE - TEST MODE a["FLAG"] == "cd" {a["FLAG1"]= "c"} a["FLAG"] == "du" {a["FLAG2"]= "u"} { print $a["PRIM"],$a["TRD"],$a["GTR"],$a["VAL"], $a["FLAG1"], $a["FLAG2"] }推荐答案
这要求所有输入字段具有双引号.
$ echo '"PRIM", "TRD", "GTR", "VAL", "FLAG" "TPP", "T5-78", "HT", ":", "c" "TCP", "T5-78", "HT", "12,385", "c" "TZP", "T5-78", "HT", ":", "z" "TNP", "T5-78", "HT", ":", "z" "TNP", "T5-78", "HT", ":", "cd" "TNP", "T5-78", "HT", ":", "du" "TNP", "T5-78", "HT", "12,524,652", "dfg"' | awk -F '",[ \t]*"' ' { sub(/^"/, "", $1); sub(/"$/, "", $NF)} NR == 1 { for (i=1; i<=NF; i++) col[$i] = i print "PRIM TRD GTR NUM FLAG1 FLAG2" next } { f = $col["FLAG"] v = $col["VAL"]; gsub(/,/, "", v) num = "null"; flag1 = "null"; flag2 = "null" } v == ":" && f == "c" {flag1 = "c"} v == ":" && f == "u" {flag2 = "u"} v == "14385" && f == "d" {num = $4} v == "14385" && f == "du" {num = $4; flag2 = "u"} v == ":" && f == "cd" {flag1 = "c"} v == ":" && f == "bc" {flag1 = "c"; flag2 = "b"} v == ":" && f == "z" {num = 0; flag2 = "z"} {print $col["PRIM"],$col["TRD"],$col["GTR"],num,flag1,flag2} 'PRIM TRD GTR NUM FLAG1 FLAG2 TPP T5-78 HT null c null TCP T5-78 HT null null null TZP T5-78 HT null null z TNP T5-78 HT null null z TNP T5-78 HT null c null TNP T5-78 HT null null null TNP T5-78 HT null null null
我的输出与您的输出看起来不一样.查看您的规格,并确保样本输入足以涵盖这些规格.
My output does not look like yours. Review your specifications and make sure the sample input is sufficient to cover them.
更多推荐
如何通过多种条件将一列拆分为另一列?
发布评论