Which AWK program can do this?

Updated: 2024-10-19 07:33:04

Problem description

Given a file containing a structure arranged like the following (with fields separated by SP or HT)

    4 5 6 2 9 8 4 8
    m d 6 7 9 5 4 g
    t 7 4 2 4 2 5 3
    h 5 6 2 5 s 3 4
    r 5 7 1 2 2 4 1
    4 1 9 0 5 6 d f
    x c a 2 3 4 5 9
    0 0 3 2 1 4 q w

Which AWK program do I need to get the following output?

    4 5
    m d
    t 7
    h 5
    r 5
    4 1
    x c
    0 0
    6 2
    6 7
    4 2
    6 2
    7 1
    9 0
    a 2
    3 2
    9 8
    9 5
    4 2
    5 s
    2 2
    5 6
    3 4
    1 4
    4 8
    4 g
    5 3
    3 4
    4 1
    d f
    5 9
    q w

Thanks in advance for any and all help.

Postscript

Please bear in mind:

  • My input file is much larger than the one depicted in this question.

  • My computer science skills are seriously limited.

  • This task has been imposed on me.

Recommended answer

    awk -v n=4 '
      function join(start, end,    result, i) {
        for (i=start; i<=end; i++)
          result = result $i (i==end ? ORS : FS)
        return result
      }
      {
        c=0
        for (i=1; i<NF; i+=n) {
          c++
          col[c] = col[c] join(i, i+n-1)
        }
      }
      END {
        for (i=1; i<=c; i++)
          printf "%s", col[i]    # the value already ends with newline
      }
    ' file

The awk info page has a short primer on awk, so read that too.
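As a quick check, the program above can be tried on a small input. The sketch below assumes a chunk width of n=2, which is what the two-field lines in the question's expected output suggest (the command above uses n=4), and feeds it only the first two rows of the sample. The wrapper name `columnize` and the file name `sample.txt` are made up for this illustration.

```shell
# columnize N FILE: hypothetical wrapper around the awk program from the answer
columnize() {
  awk -v n="$1" '
    # join fields start..end of the current record; the last one ends with ORS
    function join(start, end,    result, i) {
      for (i = start; i <= end; i++)
        result = result $i (i == end ? ORS : FS)
      return result
    }
    {
      c = 0
      # append each n-field chunk of this row to the buffer for its chunk index
      for (i = 1; i < NF; i += n) {
        c++
        col[c] = col[c] join(i, i + n - 1)
      }
    }
    END {
      # emit the buffers in chunk order; each already ends with a newline
      for (i = 1; i <= c; i++)
        printf "%s", col[i]
    }
  ' "$2"
}

# first two rows of the sample input from the question
printf '%s\n' '4 5 6 2 9 8 4 8' 'm d 6 7 9 5 4 g' > sample.txt

columnize 2 sample.txt
# prints:
# 4 5
# m d
# 6 2
# 6 7
# 9 8
# 9 5
# 4 8
# 4 g
```

Every chunk is buffered in the `col` array until `END`, so memory use grows with the size of the input.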

  • create an input file with 1,000,000 columns and 8 rows (as specified by OP)

        #!perl
        my $cols = 2**20; # 1,048,576
        my $rows = 8;
        my @alphabet=( 'a'..'z', 0..9 );
        my $size = scalar @alphabet;

        for ($r=1; $r <= $rows; $r++) {
            for ($c = 1; $c <= $cols; $c++) {
                my $idx = int rand $size;
                printf "%s ", $alphabet[$idx];
            }
            printf "\n";
        }

        $ perl createfile.pl > input.file
        $ wc input.file
               8 8388608 16777224 input.file

  • time various implementations: I use the fish shell, so the timing output is different from bash's

    • my awk

        $ time awk -f columnize.awk -v n=4 input.file > output.file

        ________________________________________________________
        Executed in    3.62 secs   fish           external
           usr time    3.49 secs    0.24 millis    3.49 secs
           sys time    0.11 secs    1.96 millis    0.11 secs

        $ wc output.file
         2097152 8388608 16777216 output.file

    • Timur's perl:

        $ time perl -lan columnize.pl input.file > output.file

        ________________________________________________________
        Executed in    3.25 secs   fish           external
           usr time    2.97 secs    0.16 millis    2.97 secs
           sys time    0.27 secs    2.87 millis    0.27 secs

    • Ravinder's awk

        $ time awk -f columnize.ravinder input.file > output.file

        ________________________________________________________
        Executed in    4.01 secs   fish           external
           usr time    3.84 secs    0.18 millis    3.84 secs
           sys time    0.15 secs    3.75 millis    0.14 secs

    • kvantour's awk, first version

        $ time awk -f columnize.kvantour -v n=4 input.file > output.file

        ________________________________________________________
        Executed in    3.84 secs   fish           external
           usr time    3.71 secs  166.00 micros    3.71 secs
           sys time    0.11 secs  1326.00 micros    0.11 secs

    • kvantour's second awk version: interrupted with Ctrl-C after a few minutes

        $ time awk -f columnize.kvantour2 -v n=4 input.file > output.file
        ^C
        ________________________________________________________
        Executed in  260.80 secs   fish           external
           usr time  257.39 secs    0.13 millis  257.39 secs
           sys time    1.68 secs    2.72 millis    1.67 secs

        $ wc output.file
            9728   38912   77824 output.file

      The $0=a[j] line is pretty expensive, as it has to parse the string into fields each time.

    • Doug's python

        $ timeout 60s fish -c 'time python3 columnize.py input.file 4 > output.file'
        [... 60 seconds later ...]
        $ wc output.file
            2049    8196   16392 output.file

  • another interesting data point: using different awk implementations. I'm on a Mac with GNU awk and mawk installed via homebrew

    • with many columns, few rows

        $ time gawk -f columnize.awk -v n=4 input.file > output.file

        ________________________________________________________
        Executed in    3.78 secs   fish           external
           usr time    3.62 secs  174.00 micros    3.62 secs
           sys time    0.13 secs  1259.00 micros    0.13 secs

        $ time /usr/bin/awk -f columnize.awk -v n=4 input.file > output.file

        ________________________________________________________
        Executed in   17.73 secs   fish           external
           usr time   14.95 secs    0.20 millis   14.95 secs
           sys time    2.72 secs    3.45 millis    2.71 secs

        $ time mawk -f columnize.awk -v n=4 input.file > output.file

        ________________________________________________________
        Executed in    2.01 secs   fish           external
           usr time  1892.31 millis    0.11 millis  1892.21 millis
           sys time   95.14 millis    2.17 millis   92.97 millis

    • with many rows, few columns, this test took over half an hour on a MacBook Pro, 6 core Intel cpu, 16GB ram

        $ time mawk -f columnize.awk -v n=4 input.file > output.file

        ________________________________________________________
        Executed in   32.30 mins   fish           external
           usr time   23.58 mins    0.15 millis   23.58 mins
           sys time    8.63 mins    2.52 millis    8.63 mins
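On the remark above that `$0=a[j]` is expensive: assigning to `$0` replaces the current record, and awk then has to split the new record into fields again (at the latest when a field or `NF` is next used), so doing this once per stored string repeats that work every time. The following is only a minimal illustration of that re-splitting, not kvantour's program, which is not reproduced here. The array contents and the function name `resplit_demo` are invented for the example.

```shell
# resplit_demo: hypothetical helper showing that assigning to $0 re-splits fields
resplit_demo() {
  awk 'BEGIN {
    a[1] = "x y z"
    a[2] = "p q"
    for (j = 1; j <= 2; j++) {
      $0 = a[j]        # replaces the record; awk re-splits it into $1..$NF
      print NF, $1     # NF and $1 now reflect a[j]
    }
  }'
}

resplit_demo
# prints:
# 3 x
# 2 p
```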

Published: 2023-11-28 16:41:57
Source: https://www.elefans.com/category/jswz/34/1643171.html