Java XML parsing and original byte offsets

Updated: 2024-10-26 08:29:49
Question

I'd like to parse some well-formed XML into a DOM, but I'd also like to know the offset of each node's tag in the original media.

For example, if I had an XML document with the content something like:

<html> <body> <div>text</div> </body> </html>

I'd like to know that the <div> node starts at offset 13 in the original media, and (more importantly) that "text" starts at offset 18.

Is this possible with standard Java XML parsers? JAXB? If no solution is easily available, what type of changes are necessary along the parsing path to make this possible?

Recommended answer

The SAX API provides a rather obscure mechanism for this - the org.xml.sax.Locator interface. When you use the SAX API, you subclass DefaultHandler and pass it to one of the SAX parse methods, and the SAX parser implementation is supposed to inject a Locator into your DefaultHandler via setDocumentLocator(). As parsing proceeds, the various callback methods on your ContentHandler are invoked (e.g. startElement()), at which point you can consult the Locator to find out the current parsing position (via getLineNumber() and getColumnNumber()).
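A minimal sketch of that mechanism, assuming the SAX parser bundled with the JDK. The class name `LocatorDemo` and the method `locate` are made up for illustration. Note that the Locator reports a line and a column rather than a byte offset, and the SAX javadoc describes the position as an approximation pointing just after the text of the current event, so it is a starting point for the question's goal rather than a complete answer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

public class LocatorDemo {

    /** Returns one "name@line:column" entry per start tag (hypothetical helper). */
    public static List<String> locate(String xml) {
        List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            // Injected by the parser; stays null if the parser does not supply one.
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (locator != null) {
                    // Position as reported by the parser at this callback.
                    events.add(qName + "@" + locator.getLineNumber()
                            + ":" + locator.getColumnNumber());
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<html>\n<body>\n<div>text</div>\n</body>\n</html>";
        System.out.println(locate(xml));
    }
}
```

Turning a line and column pair into a character or byte offset would still require knowing the line lengths and the encoding of the original input, which this sketch does not attempt.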

Technically, this is optional functionality, but the javadoc says that implementations are "strongly encouraged" to provide it, so you can likely assume the SAX parser built into JavaSE will do it.

Of course, this does mean using the SAX API, which is no one's idea of fun, but I can't see a way of accessing this information using a higher-level API.
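If a DOM is still the goal, one possible workaround is to build the tree from the SAX events yourself and attach the Locator's position to each element as DOM user data. The following is only a sketch under that assumption: the class `PositionalDom` and the keys `LINE` and `COLUMN` are hypothetical, namespaces are ignored, and comments and processing instructions are dropped.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

public class PositionalDom {
    // Hypothetical user-data keys under which the position is stored.
    public static final String LINE = "line";
    public static final String COLUMN = "column";

    public static Document parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Deque<Node> stack = new ArrayDeque<>();
            stack.push(doc); // the document is the parent of the root element

            DefaultHandler handler = new DefaultHandler() {
                private Locator locator;
                private final StringBuilder text = new StringBuilder();

                @Override
                public void setDocumentLocator(Locator locator) {
                    this.locator = locator;
                }

                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    flushText();
                    Element element = doc.createElement(qName);
                    for (int i = 0; i < atts.getLength(); i++) {
                        element.setAttribute(atts.getQName(i), atts.getValue(i));
                    }
                    if (locator != null) {
                        // Record where the parser was when it reported this tag.
                        element.setUserData(LINE, locator.getLineNumber(), null);
                        element.setUserData(COLUMN, locator.getColumnNumber(), null);
                    }
                    stack.peek().appendChild(element);
                    stack.push(element);
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    // May be called several times per text run, so buffer it.
                    text.append(ch, start, length);
                }

                @Override
                public void endElement(String uri, String localName, String qName) {
                    flushText();
                    stack.pop();
                }

                private void flushText() {
                    if (text.length() > 0) {
                        stack.peek().appendChild(doc.createTextNode(text.toString()));
                        text.setLength(0);
                    }
                }
            };

            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e); // wrapped to keep the sketch short
        }
    }
}
```

A caller could then read the stored position back with `node.getUserData(PositionalDom.LINE)`. As with the Locator itself, the values are line and column numbers, not byte offsets.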

Edit: found this example.

Published: 2023-10-24 09:25:39
Link: https://www.elefans.com/category/jswz/34/1523574.html