Excel and AWK disagree on CSV totals

Updated: 2024-10-12 05:44:16

Problem description

I have a CSV file that I'm totaling up two ways: one using Excel and the other using awk. Here are the totals of my first 8 columns in Excel:

1) 2640502474.00
2) 1272849386284.00
3) 36785.00
4)
5) 107.00
6) 239259.00
7) 0.00
8) 7418570893330.00

And here is my awk output:

$ cat /home/jason/import.csv | awk -F "\"*,\"*" '{s+=$1} END {printf("%01.2f\n", s)}'
2640502474.00
$ cat /home/jason/import.csv | awk -F "\"*,\"*" '{s+=$2} END {printf("%01.2f\n", s)}'
1272849386284.00
$ cat /home/jason/import.csv | awk -F "\"*,\"*" '{s+=$8} END {printf("%01.2f\n", s)}'
7411306364347.00

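Note how columns 1 and 2 match exactly, but column 8 is off by roughly 7.26 billion. I assume Excel's total is the correct one, so why is awk handling this file differently?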

Recommended answer

You likely have a comma-formatted number contained in quotes. Excel properly handles that number as a single field. Your awk field-separator regex won't: a comma inside a number is a valid separator according to that regex. It is very hard (and mostly futile) to try to handle the optional, nested escaping that CSV allows with a regex.

Compare the following to see what is likely going on:

$ echo '"1","10","15","1,000","14"' | awk -F "\"*,\"*" '{print $4}'
1
$ echo '"1","10","15","1,000","14"' | awk -F "\",\"" '{print $4}'
1,000

Note that the second regex above still has a problem with a trailing " in the last field (and a leading " in the first) and only works at all if all fields are consistently quoted; it is for illustration purposes only.
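One way to sidestep the separator regex entirely is to scan each record character by character and track whether the scanner is currently inside quotes. The following is only a minimal sketch, not a full CSV parser: it assumes no quoted field spans more than one line, the function name sum_csv_column is hypothetical, and stripping commas from the selected field assumes the values are plain numbers with thousands separators such as 1,000.

```shell
#!/bin/sh
# Hypothetical helper: sum one column of a CSV read from stdin.
# $1 is the 1-based column number. Quoted fields are kept whole, so a
# comma inside "1,000" does not start a new field.
# Assumption: no quoted field contains an embedded newline.
sum_csv_column() {
  awk -v col="$1" '
  {
    field = ""; val = ""; n = 1; inq = 0
    len = length($0)
    for (i = 1; i <= len; i++) {
      c = substr($0, i, 1)
      if (inq) {
        if (c == "\"") {
          # A doubled quote inside a quoted field is a literal quote
          if (substr($0, i + 1, 1) == "\"") { field = field c; i++ }
          else inq = 0
        } else {
          field = field c
        }
      } else if (c == "\"") {
        inq = 1
      } else if (c == ",") {
        # An unquoted comma ends the current field
        if (n == col) val = field
        n++
        field = ""
      } else {
        field = field c
      }
    }
    if (n == col) val = field   # last field on the line
    gsub(/,/, "", val)          # drop thousands separators
    s += val
  }
  END { printf("%.2f\n", s) }'
}

# Usage: sum column 4 of two sample rows; prints 3500.00
printf '%s\n' '"1","10","15","1,000","14"' '"2","20","25","2,500","24"' |
  sum_csv_column 4
```

With the original separator regex the same two rows would contribute only 1 and 2 from column 4, which is the kind of shortfall seen in column 8 of the question. For real data files, a CSV-aware tool is likely a safer choice than extending this sketch.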

Published: 2023-11-27 11:28:36

Article link: https://www.elefans.com/category/jswz/34/1637874.html