I have a very specific data import problem, and I am fairly new to XML data sets, so my problems are probably due to my lack of understanding. I would like to read in the German track network from Deutsche Bahn, which is published publicly here: data.deutschebahn/dataset/data-streckennetz (the page is in German, unfortunately)
This would be the direct link: download-data.deutschebahn/static/datasets/streckennetz/INSPIRE_0618.zip
There is also a link to a 200-page document about the INSPIRE data set, but it does not really help me understand how to parse the XML document. inspire.ec.europa.eu/documents/Data_Specifications/INSPIRE_DataSpecification_TN_v3.0.pdf
In a previous question, I got an answer showing how to read in the layers using the sf package:
sf::st_layers("./DB-Netz_INSPIRE_20171116.xml")
nodes <- sf::st_read("./DB-Netz_INSPIRE_20171116.xml", "RailwayNode")
This worked for all layers except one, "RailwayLinkSequence":
Link_Sequence <- sf::st_read("./DB-Netz_INSPIRE_20171116.xml", "RailwayLinkSequence")
returns
Reading layer `RailwayLinkSequence' from data source `J:\Auswertungen Daten\R Beispiele\GIS\10 Data\DB_Inspire_XML_2015\DB-Netz_INSPIRE_20171116.xml' using driver `GML'
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
  arguments imply differing number of rows: 1, 6, 5, 28, 99, 2, 19, 41, 11, 3, 65, 7, 4, 22, 20, 17, 38, 9, 15, 8, 13, 24, 49, 14, 42, 36, 51, 31, 12, 25, 60, 10, 18, 48, 104, 53, 23, 16, 26, 32, 119, 40, 47, 37, 21, 44, 39, 43, 52, 46, 27, 30, 63, 81, 54, 61, 59, 34, 35, 45, 56, 108, 64, 62, 68, 67, 57, 80, 55, 29, 123, 88, 85, 33, 50, 96, 66, 79, 115
In addition: Warning message:
no simple feature geometries present: returning a data.frame or tbl_df
Does anybody have a hint why this layer cannot be read in with st_read?
Recommended answer
Use the xml2 package:
> library(xml2)
Read the file:
> x = read_xml("./DB-Netz_INSPIRE_20171116.xml")
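In case the d1 prefix used in the next step looks mysterious: xml2 gives an unprefixed default namespace a generated name (d1, d2, ...), and xml_ns() lists the prefixes you can use in XPath. Here is a minimal sketch on a toy document. The namespace URIs and the root element are made up for illustration and are not taken from the DB file.

```r
library(xml2)

# Toy document; the namespace URIs and <root> element are hypothetical.
doc <- read_xml('<root xmlns="http://example.org/default" xmlns:net="http://example.org/net"><RailwayLinkSequence/><net:DirectedLink/></root>')

# xml_ns() lists the prefixes xml2 accepts in XPath expressions.
# The unprefixed default namespace shows up under the generated name "d1".
ns <- xml_ns(doc)
print(ns)

# Without a prefix the XPath looks for an element in *no* namespace,
# so it finds nothing; with the d1 prefix it finds the element.
no_prefix <- xml_find_all(doc, ".//RailwayLinkSequence")
with_prefix <- xml_find_all(doc, ".//d1:RailwayLinkSequence")
```

Running xml_ns(x) on the real file should show which prefixes (such as base and net) are available there.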
Find all the RLSs in the default (d1) namespace:
> f = xml_find_all(x, ".//d1:RailwayLinkSequence")
Look at one of them:
> f[123]
{xml_nodeset (1)}
[1] <RailwayLinkSequence gml:id="LSeq-1829209">\n <gml:identifier codeSpace= ...
How many are there?
> length(f)
[1] 7072
What's the localId value for the 123rd RLS?
> xml_find_all(f[123], ".//base:localId")
{xml_nodeset (1)}
[1] <base:localId>LSeq-1829209</base:localId>
What are the DirectedLink objects contained in this RLS:
> xml_find_all(f[123], ".//net:DirectedLink")
{xml_nodeset (36)}
[1] <net:DirectedLink>\n <net:direction>+</net:direction>\n <net:link xlin ...
[2] <net:DirectedLink>\n <net:direction>+</net:direction>\n <net:link xlin ...
[3] <net:DirectedLink>\n <net:direction>+</net:direction>\n <net:link xlin ...
[4] <net:DirectedLink>\n <net:direction>+</net:direction>\n <net:link xlin ...
[5] <net:DirectedLink>\n <net:direction>+</net:direction>\n <net:link xlin ...
[...]
You'll need to learn some more about XML parsing with xml2 and work out how to extract the parts you need for your own purposes, but I hope this helps.
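As a possible starting point for that extraction, here is a rough sketch that flattens each RailwayLinkSequence into one row per DirectedLink, keyed by the sequence's localId. It runs on a small toy document rather than the real file: the element names come from the output above, but the namespace URIs, the root element, and the exact nesting are assumptions, so the XPath expressions may need adjusting for the real data.

```r
library(xml2)

# Toy document: namespace URIs, <root>, and nesting are hypothetical;
# only the element names are taken from the console output above.
x <- read_xml('<root xmlns="http://example.org/tn" xmlns:base="http://example.org/base" xmlns:net="http://example.org/net">
  <RailwayLinkSequence>
    <base:localId>LSeq-1</base:localId>
    <net:DirectedLink><net:direction>+</net:direction></net:DirectedLink>
    <net:DirectedLink><net:direction>-</net:direction></net:DirectedLink>
  </RailwayLinkSequence>
  <RailwayLinkSequence>
    <base:localId>LSeq-2</base:localId>
    <net:DirectedLink><net:direction>+</net:direction></net:DirectedLink>
  </RailwayLinkSequence>
</root>')

f <- xml_find_all(x, ".//d1:RailwayLinkSequence")

# Build one small data frame per sequence: one row per DirectedLink,
# each tagged with the localId of the sequence it belongs to.
rows <- lapply(f, function(seq_node) {
  id <- xml_text(xml_find_first(seq_node, ".//base:localId"))
  directions <- xml_text(
    xml_find_all(seq_node, ".//net:DirectedLink/net:direction")
  )
  # rep() keeps both columns the same length, even for zero links.
  data.frame(
    localId = rep(id, length(directions)),
    direction = directions,
    stringsAsFactors = FALSE
  )
})

# Stack the per-sequence pieces into a single long table.
links <- do.call(rbind, rows)
print(links)
```

The long format (one row per link rather than one row per sequence) sidesteps the problem of each sequence holding a different number of links, which is what a single wide row per sequence cannot represent.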