Java: Detecting the delimiter of a CSV or TXT file

Updated: 2024-10-10 06:21:29

Problem description

I have seen this question asked several times, but the answers were for other languages and I could not follow them.

I am receiving a .csv or .txt file through a socket. Is there any way to detect the delimiter (the "splitter") used on a line of the CSV or TXT file?

This is the server code that handles writing the file:

    try {
        final ServerSocket server = new ServerSocket(8998);
        socket = server.accept();
        File sdcard = Environment.getExternalStorageDirectory();
        File myFile = new File(sdcard, "TestReceived" + curDate + ".csv");
        final BufferedReader br = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        final PrintWriter pw = new PrintWriter(new FileWriter(myFile));
        String line;
        String[] wordsarray;
        // Column positions; each stays 0 unless its header name is found below.
        int bc = 0;
        int dc = 0;
        int pq = 0;
        int rq = 0;
        int id = 0;

        // Read the header row and locate each column by name.
        line = br.readLine();
        wordsarray = line.split(","); // the delimiter is hard-coded as a comma
        for (int x = 0; x < wordsarray.length; x++) {
            switch (wordsarray[x]) {
                case "COLUMN NAME A": id = x; break;
                case "COLUMN NAME B": bc = x; break;
                case "COLUMN NAME C": dc = x; break;
                case "COLUMN NAME D": pq = x; break;
                case "COLUMN NAME E": rq = x; break;
            }
        }
        pw.println(wordsarray[dc] + "\t" + wordsarray[rq] + "\t" + wordsarray[pq] + "\t" + wordsarray[bc] + "\t" + wordsarray[id]);

        // Rewrite every remaining row in the new column order, tab-separated.
        for (line = br.readLine(); line != null; line = br.readLine()) {
            wordsarray = line.split(",");
            pw.println(wordsarray[dc] + "\t" + wordsarray[rq] + "\t" + wordsarray[pq] + "\t" + wordsarray[bc] + "\t" + wordsarray[id]);
        }
        pw.flush();
        pw.close();
        br.close();
        socket.close();
        server.close();
    } catch (Exception e) {
        e.printStackTrace();
    }

If I split on a comma with line.split(",") and the file uses a different delimiter, the output contains repeated lines, and I don't understand why that happens:

COLUMN NAME A COLUMN NAME B COLUMN NAME C COLUMN NAME D COLUMN NAME E COLUMN NAME A COLUMN NAME B COLUMN NAME C COLUMN NAME D COLUMN NAME E COLUMN NAME A COLUMN NAME B COLUMN NAME C COLUMN NAME D COLUMN NAME E COLUMN NAME A COLUMN NAME B COLUMN NAME C COLUMN NAME D COLUMN NAME E COLUMN NAME A COLUMN NAME B COLUMN NAME C COLUMN NAME D COLUMN NAME E COLUMN NAME A COLUMN NAME B COLUMN NAME C COLUMN NAME D COLUMN NAME E

But if the file does use a comma as its delimiter, the output is correct:

COLUMN NAME A COLUMN NAME B COLUMN NAME C COLUMN NAME D COLUMN NAME E

Is there any way to detect the delimiter of a file automatically, so that I don't have to worry about which delimiter the file uses? Or is there a better solution?
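The repeated output described in the question can be reproduced in isolation. When the string passed to split does not occur in the line, String.split returns a one-element array holding the whole line. None of the switch cases then match, so all five column indices keep their initial value of 0 and the same element is printed five times. A minimal sketch, assuming a tab-separated header like the one in the question (the class and method names are hypothetical):

```java
final class SplitMismatchDemo {
    // Hypothetical helper: splits a line on a literal delimiter string.
    static String[] splitOn(String line, String delimiter) {
        return line.split(java.util.regex.Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Assumed input: a tab-separated header, split on the wrong delimiter.
        String line = "COLUMN NAME A\tCOLUMN NAME B\tCOLUMN NAME C";
        String[] words = splitOn(line, ",");

        // No comma occurs in the line, so the array has one element: the whole line.
        System.out.println(words.length);

        // With every index left at 0, each "column" is that same whole line.
        int id = 0, bc = 0;
        System.out.println(words[bc].equals(line) && words[id].equals(line));
    }
}
```

Splitting the same line on "\t" instead yields three elements, which is why the output is correct whenever the delimiter matches.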

Recommended answer

Use a BufferedReader, place a mark(...), and read the first line. If that line contains a tab character (\t), the file is tab-separated; otherwise assume that it is comma-separated.

Then parse the file using a CSV/TSV parser, e.g. Apache Commons CSV.

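    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import org.apache.commons.csv.CSVFormat;
    import org.apache.commons.csv.CSVRecord;

    try (BufferedReader in = Files.newBufferedReader(Paths.get(filename))) {
        // Mark the start so the reader can be rewound after peeking at the first line.
        // The first line should be shorter than the 1024-character read-ahead limit.
        in.mark(1024);
        String line = in.readLine();
        if (line == null)
            throw new IOException("File is empty: " + filename);
        // A tab in the first line means tab-delimited; otherwise assume RFC 4180 CSV.
        CSVFormat fileFormat = (line.indexOf('\t') != -1 ? CSVFormat.TDF : CSVFormat.RFC4180)
                .withHeader();
        in.reset();
        for (CSVRecord record : fileFormat.parse(in)) {
            String lastName = record.get("Last Name");
            String firstName = record.get("First Name");
            ...
        }
    }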
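The tab-or-comma check above can be extended to more candidates using only the JDK. The following is a rough sketch rather than part of the answer: it counts how often each candidate occurs in the header line and picks the most frequent one. The candidate list, the class name DelimiterSniffer, and the fallback to a comma are all assumptions made for illustration, and the count ignores quoting, so a quoted field that contains a delimiter character can mislead it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;

final class DelimiterSniffer {
    // Assumed candidates, checked in this order; on a tie the earlier one wins.
    private static final char[] CANDIDATES = {',', '\t', ';', '|'};

    /** Returns the candidate occurring most often in the line, or ',' if none occurs. */
    static char detect(String headerLine) {
        char best = ',';
        int bestCount = 0;
        for (char candidate : CANDIDATES) {
            int count = 0;
            for (int i = 0; i < headerLine.length(); i++) {
                if (headerLine.charAt(i) == candidate) count++;
            }
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the socket stream in the question.
        String data = "COLUMN NAME A;COLUMN NAME B;COLUMN NAME C\n1;2;3\n";
        try (BufferedReader in = new BufferedReader(new StringReader(data))) {
            String header = in.readLine();
            if (header == null) throw new IOException("Input is empty");
            char delimiter = detect(header);
            // Quote the delimiter so '|' is not read as regex alternation;
            // the limit of -1 keeps trailing empty fields.
            String regex = Pattern.quote(String.valueOf(delimiter));
            System.out.println(header.split(regex, -1).length);
        }
    }
}
```

For files with quoted fields, passing the detected character to a real CSV parser is the safer route; this sketch only replaces the hard-coded "," in a plain split.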

Published: 2023-11-13 02:33:51
Article link: https://www.elefans.com/category/jswz/34/1583181.html