Extracting strings with Jsoup

Updated: 2024-10-10 07:23:19

Problem description

I'm trying to extract names from elements with a given class attribute in a website's HTML page using the Jsoup library. The problem is that I get the elements by class using getElementsByClass("name") and store their text in a String variable, and the result comes out like this: "mike andro rob banks maria gerardo louis....etc". What I want is to separate the individual names and store them in an array. The following is the code snippet:

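import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public String processText(String htmlPage) {
    Document html = Jsoup.parse(htmlPage);
    // text() on the matched elements returns all their text combined into one string
    String names = html.body().getElementsByClass("name").text();
    return names;
}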

More information:

The source page is an HTML page. I save the full HTML code in a string and then process that string to extract only the elements with class="name".

The htmlPage that I pass to the processText method is similar to the following:

            <div class="name"> Rob Kardashian </div>
        </div>
    </a>
</div>
<div class="channelListEntry">
    <a href="/zayn_malik">
        <div class="image">
            <img src="cdn.posh24/images/:profile/014cf47ca44daf8f44a3e0720929ee327" alt="Zayn Malik"/>
        </div>
        <div class="info">
            <div class="status-container">
                <div class="position">4</div>
                <div class="img pos"></div>
                <div class="value">+12</div>
            </div>
            <div class="name"> Zayn Malik </div>
        </div>
    </a>
</div>
<div class="channelListEntry">
    <a href="/kanye_west">
        <div class="image">
            <img src="cdn.posh24/images/:profile/03f352f71ffab135cd81821eb190d4832" alt="Kanye West"/>
        </div>
        <div class="info">
            <div class="status-container">
                <div class="position">5</div>
                <div class="img pos"></div>
                <div class="value">+16</div>
            </div>
            <div class="name"> Kanye West </div>
        </div>
    </a>
</div>
<div class="channelListEntry">
    <a href="/kendall_jenner">
        <div class="image">
            <img src="cdn.posh24/images/:profile/066d5c02547c4357f1bc5f633c68f4085" alt="Kendall Jenner"/>
        </div>

Solution

You can simply use the split function to get an array from the string:

String[] arr = names.trim().split("\\s");

If the names are separated by a mix of spaces and tabs, use:

String[] arr = names.split("\\s+");
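
To make the difference between the two patterns concrete, here is a small sketch that uses only the Java standard library. The class name SplitDemo, its two helper methods, and the sample string are made up for illustration; the string only mimics the kind of text the question describes. Both patterns split on whitespace, so a two-word name such as "Zayn Malik" ends up as two separate entries.

```java
import java.util.Arrays;

public class SplitDemo {
    // Splits on every single whitespace character.
    // Two whitespace characters in a row produce an empty entry between them.
    static String[] splitSingle(String names) {
        return names.trim().split("\\s");
    }

    // Splits on runs of one or more whitespace characters (spaces, tabs, newlines).
    static String[] splitRuns(String names) {
        return names.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Hypothetical input: a double space and a tab between some of the names.
        String names = "Rob Kardashian  Zayn Malik\tKanye West";
        System.out.println(Arrays.toString(splitSingle(names)));
        System.out.println(Arrays.toString(splitRuns(names)));
    }
}
```

With this input, splitSingle returns seven entries (one of them empty, from the double space), while splitRuns returns six. Neither keeps first and last names together.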

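Update: to keep each full name as a single entry, iterate over the matched elements and collect the text of each one:

ArrayList<String> name = new ArrayList<String>();
for (Element output : html.body().getElementsByClass("name")) {
    name.add(output.text());
}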
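
For completeness, the per-element loop can be wrapped into a self-contained sketch. It assumes the jsoup library is on the classpath; the class name NameExtractor, the method extractNames, and the tiny HTML string are hypothetical and stand in for the real page, which is only partly shown in the question.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class NameExtractor {
    // Returns the text of each element with class "name" as its own list entry,
    // so multi-word names stay intact.
    static List<String> extractNames(String htmlPage) {
        Document html = Jsoup.parse(htmlPage);
        List<String> names = new ArrayList<>();
        for (Element el : html.body().getElementsByClass("name")) {
            // Element.text() returns the element's text with whitespace normalized and trimmed.
            names.add(el.text());
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical, simplified stand-in for the page in the question.
        String htmlPage = "<div class=\"info\"><div class=\"name\"> Zayn Malik </div></div>"
                + "<div class=\"info\"><div class=\"name\"> Kanye West </div></div>";
        System.out.println(extractNames(htmlPage));
    }
}
```

Compared with splitting the combined string, this relies on the document structure instead of whitespace, which is why names containing spaces survive.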

The resulting list can then be converted to an array.
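
The target of the original link is not preserved here, so as a stand-in, this is a minimal sketch of one common way to do the conversion, using the standard List.toArray method. The class name ListToArray, the helper toStringArray, and the sample names are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListToArray {
    // Copies the list contents into a new String[].
    // Passing an empty array lets toArray allocate one of the right size.
    static String[] toStringArray(List<String> names) {
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("Zayn Malik");
        names.add("Kanye West");
        String[] arr = toStringArray(names);
        System.out.println(Arrays.toString(arr));
    }
}
```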

Published: 2023-11-30 12:32:11
Article link: https://www.elefans.com/category/jswz/34/1649967.html
Tags: string, Jsoup