I have a file like the one below with n rows. I want to total its 3rd column and distribute the rows across 3 different files, based on the sum in each file.
For example, the values in the 3rd column add up to 516, and 516 divided by 3 is 172.
So I want to add rows to the first file as long as its total does not exceed 172, do the same for the 2nd file, and move all remaining rows to the third file.
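The arithmetic above can be checked with a short awk command. This is only a sketch: it assumes the sample rows from the question are saved in a file called Input_file, and the scratch directory and the `workdir` and `summary` names are illustrative, not part of the original post.

```shell
# Recreate the sample rows from the question in a scratch directory.
# (workdir and summary are illustrative names, not from the original post.)
workdir=$(mktemp -d)
cat > "$workdir/Input_file" <<'EOF'
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
EOF

# Sum the 3rd column, then print the total and the per-file target (total / 3).
summary=$(awk '{ sum += $3 } END { printf "%d %d\n", sum, sum / 3 }' "$workdir/Input_file")
echo "$summary"   # prints: 516 172
```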
Input file

    a aa 10
    b ab 15
    c ac 17
    a dy 30
    y ae 12
    a dl 34
    a fk 45
    l ah 56
    o aj 76
    l ai 12
    q al 09
    d pl 34
    e ik 30
    f ll 10
    g dl 15
    h fr 17
    i dd 23
    j we 27
    k rt 12
    l yt 13
    m tt 19

Expected output
file1 (total - 163)

    a aa 10
    b ab 15
    c ac 17
    a dy 30
    y ae 12
    a dl 34
    a fk 45

file2 (total - 153)

    l ah 56
    o aj 76
    l ai 12
    q al 9

file3 (total - 200)

    d pl 34
    e ik 30
    f ll 10
    g dl 15
    h fr 17
    i dd 23
    j we 27
    k rt 12
    l yt 13
    m tt 19

Recommended answer
Could you please try the following? It was written and tested in GNU awk with the samples shown.
    awk '
    FNR==NR{
      sum+=$NF
      next
    }
    FNR==1{
      count=sum/3
    }
    {
      curr_sum+=$NF
    }
    (curr_sum>=count || FNR==1) && fileCnt<=2{
      close(out_file)
      out_file="file" ++fileCnt
      curr_sum=$NF
    }
    {
      print > (out_file)
    }' Input_file Input_file

Explanation: a detailed, line-by-line explanation of the above.
    awk '  ##Start the awk program.
    FNR==NR{  ##TRUE only while Input_file is being read for the first time.
      sum+=$NF  ##Add the last field of every line to sum, the total for the whole Input_file.
      next  ##Skip all further statements for this line.
    }
    FNR==1{  ##First line of the second read of Input_file.
      count=sum/3  ##Set count, the per-file target, to sum/3.
    }
    {
      curr_sum+=$NF  ##Add the last field to curr_sum, the running total for the current output file.
    }
    (curr_sum>=count || FNR==1) && fileCnt<=2{  ##If the running total has reached count OR this is the first line (of the second read), AND fewer than 3 output files have been started.
      close(out_file)  ##Close the previous output file; may NOT be needed here since there are only 3 output files.
      out_file="file" ++fileCnt  ##Build the next output file name (file1, file2, file3).
      curr_sum=$NF  ##Restart the running total from the current line.
    }
    {
      print > (out_file)  ##Print the current line into the current output file.
    }' Input_file Input_file  ##Input_file is named twice so that it is read twice.
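One way to sanity-check the split is to run the command on the sample rows and total the last column of each output file. The following is a minimal sketch under stated assumptions: the scratch directory and the `workdir` and `totals` names are illustrative, and the increment is parenthesized as `(++fileCnt)` purely for readability, which should not change the behavior.

```shell
# Work in a scratch directory so that file1..file3 do not clutter the current one.
# (workdir and totals are illustrative names, not from the original post.)
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > Input_file <<'EOF'
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
EOF

# Same logic as the answer: pass 1 totals the last field, pass 2 splits the rows.
awk '
FNR==NR{ sum+=$NF; next }
FNR==1{ count=sum/3 }
{ curr_sum+=$NF }
(curr_sum>=count || FNR==1) && fileCnt<=2{
  close(out_file)
  out_file="file" (++fileCnt)
  curr_sum=$NF
}
{ print > (out_file) }
' Input_file Input_file

# Total the last column of each output file and report it next to the file name.
totals=$(for f in file1 file2 file3; do
  printf '%s %s\n' "$f" "$(awk '{ s += $NF } END { print s }' "$f")"
done)
echo "$totals"
# prints:
# file1 163
# file2 153
# file3 200
```

Note that because the test is `curr_sum>=count`, a row that brings the running total to exactly sum/3 starts the next file. If such a row should stay in the current file instead, `>` could be used in place of `>=`.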