Using the JAR files installed through apt for Saxon-HE and TagSoup, parsing HTML is a one-liner:
    thufir@dur:~/saxon$ java -cp /usr/share/java/Saxon-HE-9.8.0.14.jar:/usr/share/java/tagsoup-1.2.1.jar net.sf.saxon.Query -x:org.ccil.cowan.tagsoup.Parser -qs:doc\(\'books.toscrape/\'\)
    <?xml version="1.0" encoding="UTF-8"?>
    <!--[if lt IE 7]> <html lang="en-us" class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
    <!--[if IE 7]> <html lang="en-us" class="no-js lt-ie9 lt-ie8"> <![endif]-->
    <!--[if IE 8]> <html lang="en-us" class="no-js lt-ie9"> <![endif]-->
    <!--[if gt IE 8]><!--><html xmlns="www.w3/1999/xhtml" xmlns:html="www.w3/1999/xhtml" class="no-js" lang="en-us"><!--<![endif]--><head><title> All products | Books to Scrape - Sandbox
    ..
    <!-- Version: N/A -->
    thufir@dur:~/saxon$
How would I do that from Java? In particular, what imports are required from Saxon for this execution? Perhaps using Saxon and the JAXP interface?
See also:
codingwithpassion.blogspot/2011/03/saxon-xslt-java-example.html
Recommended answer
You will find many simple examples of invoking transformations using Saxon from Java in the saxon-resources download, available on both the Saxonica and SourceForge web sites.
It's difficult to know exactly what you want here, because your command line example isn't using Saxon to do anything useful other than invoking the TagSoup parser and serializing the result. The simplest way to do that from Java is with a JAXP identity transformation, which runs just as well with the built-in XSLT transformer in the JDK as with Saxon:
    TransformerFactory factory = TransformerFactory.newInstance();
    XMLReader xmlReader = XMLReaderFactory.createXMLReader("org.ccil.cowan.tagsoup.Parser");
    Source input = new SAXSource(xmlReader, new InputSource("books.toscrape/"));
    Result output = new StreamResult(System.out);
    factory.newTransformer().transform(input, output);
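Since the question asks about imports, here is a minimal sketch of the same identity transformation as a complete class. The class name `TagSoupIdentity`, the helper `copy`, and taking the URL from `args[0]` are assumptions made for illustration. Note that nothing needs to be imported from Saxon or TagSoup: the parser is named by a string, so only the standard `javax.xml.transform` and `org.xml.sax` packages appear.

```java
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical class name; a sketch, not code from the Saxon distribution.
class TagSoupIdentity {

    // Parses `in` with the given reader and serializes the result unchanged to `out`.
    static void copy(XMLReader reader, InputSource in, Result out) throws Exception {
        Source input = new SAXSource(reader, in);
        TransformerFactory factory = TransformerFactory.newInstance();
        // newTransformer() with no stylesheet yields an identity transformer.
        factory.newTransformer().transform(input, out);
    }

    public static void main(String[] args) throws Exception {
        // The TagSoup JAR must be on the classpath at run time for this lookup to succeed.
        // XMLReaderFactory is deprecated in newer JDKs but still available.
        XMLReader tagSoup = XMLReaderFactory.createXMLReader("org.ccil.cowan.tagsoup.Parser");
        // args[0] is assumed to be the URL or file path of the HTML page.
        copy(tagSoup, new InputSource(args[0]), new StreamResult(System.out));
    }
}
```

With the JAR locations from the question, this could be run as `java -cp .:/usr/share/java/tagsoup-1.2.1.jar TagSoupIdentity <url>`. Adding the Saxon JAR to the classpath is optional here, since the JDK's built-in transformer handles the identity case.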
If you want to add some XSLT or XQuery processing then of course that's perfectly possible (I would always use the s9api API for Saxon, but you can also use JAXP or XQJ), but the details depend on exactly what you want to do.
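As one possible illustration of adding XSLT processing, the sketch below stays with JAXP rather than s9api, so that it also runs on the JDK's built-in transformer. The stylesheet, the class name `TagSoupTitle`, and the helper `extractTitles` are hypothetical; the task (printing the page title) is only an example. It assumes the parsed elements are in the standard XHTML namespace, as the `xmlns` attribute in the command output suggests.

```java
import java.io.StringReader;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical class name; a sketch under the assumptions stated above.
class TagSoupTitle {

    // Illustrative XSLT 1.0 stylesheet: prints the text of each XHTML <title>, one per line.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:h='http://www.w3.org/1999/xhtml'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='//h:title'>"
        + "<xsl:value-of select='normalize-space(.)'/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Parses `in` with the given reader and applies the stylesheet, writing to `out`.
    static void extractTitles(XMLReader reader, InputSource in, Result out) throws Exception {
        Source stylesheet = new StreamSource(new StringReader(XSLT));
        Transformer transformer = TransformerFactory.newInstance().newTransformer(stylesheet);
        transformer.transform(new SAXSource(reader, in), out);
    }

    public static void main(String[] args) throws Exception {
        // The TagSoup JAR must be on the classpath at run time.
        XMLReader tagSoup = XMLReaderFactory.createXMLReader("org.ccil.cowan.tagsoup.Parser");
        // args[0] is assumed to be the URL or file path of the HTML page.
        extractTitles(tagSoup, new InputSource(args[0]), new StreamResult(System.out));
    }
}
```

To be sure Saxon rather than the JDK transformer does the work, `TransformerFactory.newInstance()` can be replaced with `new net.sf.saxon.TransformerFactoryImpl()`, which does require the Saxon JAR at compile time. For XQuery, or for XSLT 2.0 and later features, the s9api interface mentioned above is the more natural fit.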