Parsing an XML file with Python

Updated: 2024-10-08 06:21:31
Problem description

I am having trouble collecting two more pieces of data while converting XML to CSV with Python.

They are the description tag and the generatedOn tag.

For the description tag I tried item.find('description').text, but it did not work.

For the generatedOn tag, I would like to concatenate the items inside it into one field.

Please see the sample XML below:

<?xml version="1.0" encoding="UTF-8"?>
<omGroups xmlns="urn:nortel:namespaces:mcp:oms"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="urn:nortel:namespaces:mcp:oms OMSchema.xsd">
  <group>
    <name>RecordingSystem</name>
    <row>
      <package>com.nortelnetworks.mcp.ne.base.recsystem.fw.system</package>
      <class>RecSysFileOMRow</class>
      <usage name="closedFileCount" hasThresholds="true">
        <measures>
          closed file count
        </measures>
        <description>
          This register counts the number of closed files in the spool
          directory of a particular stream and a particular system. Files in
          the spool directory store the raw OAM records where they are sent
          to the Element Manager for formatting.
        </description>
        <notes>
          Minor and major alarms when the value of closedFileCount exceeds
          certain thresholds. Configure the threshold values for minor and
          major alarms for this OM through engineering parameters for
          minorBackLogCount and majorBackLogCount, respectively. These
          engineering parameters are grouped under the parameter group of
          Log, OM, and Accounting for the logs’ corresponding system.
        </notes>
      </usage>
      <usage name="processedFileCount" hasThresholds="true">
        <measures>
          Processed file count
        </measures>
        <description>
          The register counts the number of processed files in the spool
          directory of a particular stream and a particular system. Files in
          the spool directory store the raw OAM records and then send the
          records to the Element Manager for formatting.
        </description>
      </usage>
    </row>
    <documentation>
      <description>
        Rows of this OM group provide a count of the number of files
        contained within the directory (which is the OM row key value).
      </description>
      <rowKey>
        The full name of the directory containing the files counted by this row.
      </rowKey>
    </documentation>
    <generatedOn>
      <all/>
    </generatedOn>
  </group>
  <group traffic="true">
    <name>Ports</name>
    <row>
      <package>com.nortelnetworks.ims.cap.mediaportal.host</package>
      <class>PortsOMRow</class>
      <usage name="rtpMpPortUsage">
        <measures>
          BCP port usage
        </measures>
        <description>
          Meter showing number of ports in use.
        </description>
      </usage>
      <lwGauge name="connMapEntriesLWM">
        <measures>
          Lowest simultaneous port usage
        </measures>
        <description>
          Lowest number of simultaneous ports detected to be in use during
          the collection interval
        </description>
      </lwGauge>
      <hwGauge name="connMapEntriesHWM">
        <measures>
          Highest simultaneous port usage
        </measures>
        <description>
          Highest number of simultaneous ports detected to be in use during
          the collection interval.
        </description>
      </hwGauge>
      <waterMark name="connMapEntries">
        <measures>
          Connections map entries
        </measures>
        <description>
          Meter showing the number of connections in the host CPU connection map.
        </description>
        <bwg lwref="connMapEntriesLWM" hwref="connMapEntriesHWM"/>
      </waterMark>
      <counter name="portUsageSampleCnt">
        <measures>
          Usage sample count
        </measures>
        <description>
          The number of 100-second samples taken during the collection
          interval contributing to the average report.
        </description>
      </counter>
      <counter name="sampledRtpMpPortUsage">
        <measures>
          In-use ports usage
        </measures>
        <description>
          Provides the sum of the in-use ports every 100 seconds.
        </description>
      </counter>
      <precollector>
        <package>com.nortelnetworks.ims.cap.mediaportal.host</package>
        <class>PortsOMCenturyPrecollector</class>
        <collector>centurySecond</collector>
      </precollector>
    </row>
    <documentation>
      <description>
      </description>
      <rowKey>
      </rowKey>
    </documentation>
    <generatedOn>
      <list>
        <ne>sessmgr</ne>
        <ne>rtpportal</ne>
      </list>
    </generatedOn>
  </group>
</omGroups>

Code

import csv
from bs4 import BeautifulSoup

# xml_string is assumed to hold the sample XML shown above
soup = BeautifulSoup(xml_string, 'html.parser')

with open('data.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out)
    writer.writerow(['General name:SpecificName', 'RegisterType', 'Measures'])
    # Every element inside <row> that carries a name attribute is one register
    for item in soup.select('row [name]'):
        writer.writerow([item.find_previous('name').text + ':' + item['name'],
                         item.name,
                         item.find('measures').get_text(strip=True)])

Recommended answer

You can try the following code:

import csv
import re
from bs4 import BeautifulSoup

# xml_string is assumed to hold the sample XML from the question
soup = BeautifulSoup(xml_string, 'html.parser')

with open('data.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out)
    writer.writerow(['General name:SpecificName', 'RegisterType', 'Measures',
                     'Description', 'generatedOn'])
    for item in soup.select('row [name]'):
        # Strip the description and collapse runs of whitespace to one space
        desc = item.find('description').get_text(strip=True)
        desc = re.sub(r'\s{2,}', ' ', desc)
        # Join all <ne> values of the enclosing <group>; empty when it has <all/>
        generatedOn = ','.join(ne.get_text(strip=True)
                               for ne in item.find_parent('group').select('ne'))
        writer.writerow([item.find_previous('name').text + ':' + item['name'],
                         item.name,
                         item.find('measures').get_text(strip=True),
                         desc,
                         generatedOn])
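For comparison, the same extraction can be sketched with only the standard library's xml.etree.ElementTree. This is an alternative under stated assumptions, not the answer's method: SAMPLE_XML is a trimmed stand-in for the question's XML with shortened descriptions, the helper names (local_name, clean, extract_rows, rows_to_csv) are hypothetical, registers are assumed to be direct children of row, and the CSV is written to an in-memory buffer rather than data.csv.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed stand-in for the question's XML (same namespace and layout,
# descriptions shortened for illustration).
SAMPLE_XML = """
<omGroups xmlns="urn:nortel:namespaces:mcp:oms">
  <group>
    <name>RecordingSystem</name>
    <row>
      <class>RecSysFileOMRow</class>
      <usage name="closedFileCount" hasThresholds="true">
        <measures> closed file count </measures>
        <description>
          This register counts the number of closed files
          in the spool directory.
        </description>
      </usage>
    </row>
    <generatedOn> <all/> </generatedOn>
  </group>
  <group traffic="true">
    <name>Ports</name>
    <row>
      <usage name="rtpMpPortUsage">
        <measures> BCP port usage </measures>
        <description> Meter showing number of ports in use. </description>
      </usage>
      <waterMark name="connMapEntries">
        <measures> Connections map entries </measures>
        <description> Meter showing the number of connections. </description>
        <bwg lwref="connMapEntriesLWM" hwref="connMapEntriesHWM"/>
      </waterMark>
    </row>
    <generatedOn>
      <list> <ne>sessmgr</ne> <ne>rtpportal</ne> </list>
    </generatedOn>
  </group>
</omGroups>
"""

# The document declares a default namespace, so every lookup must use it.
NS = {"om": "urn:nortel:namespaces:mcp:oms"}
HEADER = ["General name:SpecificName", "RegisterType", "Measures",
          "Description", "generatedOn"]


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def clean(text):
    """Collapse runs of whitespace (including line breaks) to single spaces."""
    return " ".join((text or "").split())


def extract_rows(xml_string):
    """Return one CSV row per named register under each group's <row>."""
    root = ET.fromstring(xml_string)
    rows = []
    for group in root.findall("om:group", NS):
        group_name = clean(group.findtext("om:name", default="", namespaces=NS))
        # Join every <ne> below <generatedOn>; empty string when it holds <all/>.
        generated_on = ",".join(
            clean(ne.text) for ne in group.findall("om:generatedOn//om:ne", NS)
        )
        # Direct children of <row> that carry a name attribute.
        for register in group.findall("om:row/*[@name]", NS):
            rows.append([
                f"{group_name}:{register.get('name')}",
                local_name(register.tag),
                clean(register.findtext("om:measures", default="", namespaces=NS)),
                clean(register.findtext("om:description", default="", namespaces=NS)),
                generated_on,
            ])
    return rows


def rows_to_csv(rows):
    """Serialize the header and rows to CSV text in memory."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_XML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

One visible difference from the BeautifulSoup version: ElementTree keeps the original case of tag names, so the RegisterType column contains waterMark, whereas html.parser lowercases tag names.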

The BeautifulSoup code in this answer generates data.csv:
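The generated file is not reproduced here. One detail worth checking is the generatedOn column: its values are joined with a comma, which is also the CSV delimiter, so csv.writer quotes that field. The sketch below writes to an in-memory buffer instead of data.csv and uses a single illustrative row built from the Ports group of the sample XML; the row is an assumption about what the output looks like, not captured output.

```python
import csv
import io

header = ["General name:SpecificName", "RegisterType", "Measures",
          "Description", "generatedOn"]
# Illustrative row derived by hand from the Ports group in the sample XML.
row = ["Ports:rtpMpPortUsage", "usage", "BCP port usage",
       "Meter showing number of ports in use.", "sessmgr,rtpportal"]

# Write the header and the row the same way the answer's code does.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow(row)
csv_text = buffer.getvalue()

# Read the text back: the quoted field comes out as one value again.
records = list(csv.DictReader(io.StringIO(csv_text, newline="")))
elements = records[0]["generatedOn"].split(",")

print(csv_text)
print(elements)
```

If the quoting is unwanted, joining with a different separator such as ';' would avoid it, at the cost of differing from the comma-separated form used in the answer.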

Published: 2023-11-28 06:11:39
Link: https://www.elefans.com/category/jswz/34/1641244.html