I'm trying to parse XML into a table-like structure in Python. Imagine XML like this:
    <?xml version="1.0" encoding="UTF-8"?>
    <base>
      <element1>element 1</element1>
      <element2>element 2</element2>
      <element3>
        <subElement3>subElement 3</subElement3>
      </element3>
    </base>

I'd like a result like this:
    KEY                       | VALUE
    base.element1             | "element 1"
    base.element2             | "element 2"
    base.element3.subElement3 | "subElement 3"

I've tried using xml.etree.cElementTree, followed by the functions described in "How to convert an xml string to a dictionary in Python?"
Is there any function that can do this? All the answers I found are written for one particular XML schema and would need to be edited for each new schema. For reference, in R this is easy with the XML and xml2 packages and the xmlToList function.
Recommended answer

I got the desired result with the following script.
XML file:

    <?xml version="1.0" encoding="UTF-8"?>
    <base>
      <element1>element 1</element1>
      <element2>element 2</element2>
      <element3>
        <subElement3>subElement 3</subElement3>
      </element3>
    </base>

Python code:
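    import pandas as pd
    from lxml import etree

    # Path to the XML file shown above
    data = "C:/Path/test.xml"
    tree = etree.parse(data)

    lstKey = []
    lstValue = []
    for p in tree.iter():
        # getpath() returns an XPath such as /base/element1;
        # turn it into a dotted key such as base.element1
        lstKey.append(tree.getpath(p).replace("/", ".")[1:])
        lstValue.append(p.text)

    df = pd.DataFrame({'key': lstKey, 'value': lstValue})
    # sort_values() returns a new DataFrame, so assign it back
    df = df.sort_values('key')
    print(df)

Result: a DataFrame with one row per element, where key is the element's dotted path and value is its text.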
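For comparison, here is a minimal sketch of the same idea using only the standard library's xml.etree.ElementTree, in case lxml or pandas is not available. It is not part of the original answer. The names `flatten`, `xml_bytes` and `rows` are made up for illustration, and the sketch assumes that only leaf elements (those without child elements) should appear in the table, which is what the desired output in the question shows.

```python
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Yield (dotted_path, text) pairs for every leaf element under `element`.

    Hypothetical helper: container elements such as <base> and <element3>
    are skipped because they only hold whitespace between their children.
    """
    path = f"{prefix}.{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        yield path, (element.text or "").strip()
    for child in children:
        yield from flatten(child, path)


# The sample document from the question, inlined as bytes for the example.
xml_bytes = b"""<?xml version="1.0" encoding="UTF-8"?>
<base>
  <element1>element 1</element1>
  <element2>element 2</element2>
  <element3>
    <subElement3>subElement 3</subElement3>
  </element3>
</base>"""

rows = list(flatten(ET.fromstring(xml_bytes)))
for key, value in rows:
    print(f'{key} | "{value}"')
```

Because the function walks whatever tree it is given, it does not need to be edited for a new schema. One limitation to be aware of: repeated sibling tags produce the same key more than once, since no positional index is added to the path.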