JSoup: strip HTML markup from XML

Updated: 2024-10-27 08:29:51

I have searched Stack Overflow but couldn't find anyone with this kind of problem.

I want to do something like this:

Input string:

    <?xml version="1.0" encoding="UTF-8" ?>
    <List>
      <Object>
        <Section>Fruit</Section>
        <Category>Bananas</Category>
        <Brand>Chiquita</Brand>
        <Obs><p> Vende-se a pe&ccedil;as ou o conjunto.</p><br> </Obs>
      </Object>
    </List>

What I want is to strip the HTML tags, such as <p> and <br>, so that it ends up like this:

    <?xml version="1.0" encoding="UTF-8" ?>
    <List>
      <Object>
        <Section>Fruit</Section>
        <Category>Bananas</Category>
        <Brand>Chiquita</Brand>
        <Obs> Vende-se a pe&ccedil;as ou o conjunto. </Obs>
      </Object>
    </List>

I have been playing around with JSoup, but I can't seem to make it work properly.

This is the code I have:

    import org.jsoup.Jsoup;
    import org.jsoup.safety.Whitelist;

    // Start from an empty whitelist, then allow only the XML element names.
    Whitelist whitelist = Whitelist.none();
    String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?><List><Object><Section>Fruit</Section><Category>Bananas</Category><Brand>Chiquita</Brand><Obs><p>Vende-se a pe&ccedil;as ou o conjunto.</p><br></Obs></Object></List>";
    whitelist.addTags(new String[]{"?xml", "List", "Object", "Section", "Category", "Brand", "Obs"});
    String safe = Jsoup.clean(xml, whitelist);

This is the result I am getting:

FruitBananasChiquitaVende-se a pe&ccedil;as ou o conjunto.

Thanks in advance

Best answer

jsoup's HTML parser normalizes tag names to lowercase, so the whitelist entries must be lowercase too. Use:

whitelist.addTags(new String[] { "?xml", "list", "object", "section", "category", "brand", "obs" });

Output:

<list> <object> <section> Fruit </section> <category> Bananas </category> <brand> Chiquita </brand> <obs> Vende-se a pe&ccedil;as ou o conjunto. </obs></object> </list>
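
Note that the output above has lowercased tag names and no XML declaration, which differs from the result the question asked for. If the original casing and the declaration must survive, one alternative is to skip jsoup and filter tags with a regular expression. The following is only a minimal sketch under stated assumptions: the class name XmlTagStripper and the method stripTags are hypothetical, the matching is case-sensitive, and a regex is not a real XML parser (it does not handle a ">" inside an attribute value or markup inside CDATA sections).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup: removes every tag whose name is not
// in the "keep" set and leaves text, entities, and the XML declaration alone.
class XmlTagStripper {

    // Matches opening, closing, and self-closing tags and captures the tag name.
    // "<?xml ...?>" and "<!-- ... -->" do not match, because the character after
    // "<" (or "</") must be a letter.
    private static final Pattern TAG = Pattern.compile("</?([A-Za-z][\\w:.-]*)[^>]*>");

    public static String stripTags(String xml, Set<String> keep) {
        Matcher m = TAG.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Keep allowed tags verbatim; replace all other tags with nothing.
            String replacement = keep.contains(m.group(1)) ? m.group() : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?><List><Object>"
                + "<Section>Fruit</Section><Category>Bananas</Category><Brand>Chiquita</Brand>"
                + "<Obs><p>Vende-se a pe&ccedil;as ou o conjunto.</p><br></Obs></Object></List>";
        Set<String> keep = new HashSet<>(
                Arrays.asList("List", "Object", "Section", "Category", "Brand", "Obs"));
        // Prints the input unchanged except that <p>, </p>, and <br> are gone.
        System.out.println(stripTags(xml, keep));
    }
}
```

Because the allowed names are compared case-sensitively, "Obs" is kept while a lowercase "obs" would be stripped; adjust the set if the input mixes casings.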

Published: 2023-07-21 23:17:00
Article link: https://www.elefans.com/category/jswz/34/1214870.html