Python regular expression for Beautiful Soup

Updated: 2024-10-14 04:25:27

Question

I am using Beautiful Soup to pull out specific div tags, and it seems I can't use simple string matching.

The page has some tags in the form of

<div class="comment form new"...>

which I want to ignore, and also some tags in the form of

<div class="comment comment-xxxx...">

where the x's represent an integer of arbitrary length, and the ellipses represent an arbitrary number of other values separated by whitespace (that I'm not concerned about). I can't figure out the correct regular expression, especially since I've never used Python's re module.

Using

soup.find_all(class_="comment")

finds all tags starting with the word comment. I have tried using

soup.find_all(class_=re.compile(r'(comment)( )(comment)'))
soup.find_all(class_=re.compile(r'comment comment.*'))

and lots of other variations, but I think I'm missing something obvious here about how regex expressions or match() work. Can anyone help me out?

Recommended answer

I think I've got it:

>>> [div['class'] for div in soup.find_all('div')]
[['comment', 'form', 'new'], ['comment', 'comment-xxxx...']]

Notice that, unlike the equivalent in BS3, it's not this:

['comment form new', 'comment comment-xxxx...']

And that's why your regexps won't match.
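
To make the reasoning concrete, here is a minimal sketch using only the standard re module. It models the per-token matching described above and is not Beautiful Soup's actual implementation; the helper any_token_matches and the value 'comment-12345' (standing in for the elided 'comment-xxxx...') are assumptions made for illustration.

```python
import re

# Class tokens as the answer shows BS4 exposing them (one list per div).
# 'comment-12345' is a made-up stand-in for the elided 'comment-xxxx...'.
divs = [
    ["comment", "form", "new"],
    ["comment", "comment-12345"],
]

# One of the patterns tried in the question; note the literal space.
pattern = re.compile(r"comment comment.*")


def any_token_matches(tokens, regex):
    """Return True if the regex is found in at least one single class token."""
    return any(regex.search(token) for token in tokens)


# Tested token by token, the space in the pattern can never be satisfied.
per_token = [any_token_matches(tokens, pattern) for tokens in divs]

# Tested against the space-joined string (the BS3-style value), it can.
joined = [bool(pattern.search(" ".join(tokens))) for tokens in divs]

print(per_token)  # [False, False]
print(joined)     # [False, True]
```

No single token contains a space, so a pattern that requires one only succeeds against the joined string.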

But you can match, e.g., this:

>>> soup.find_all('div', class_=re.compile('comment-'))
[<div class="comment comment-xxxx..."></div>]

Note that BS does the equivalent of re.search, not re.match, so you don't need 'comment-.*'. Of course, if you want to match 'comment-12345' but not 'comment-of-another-kind', you'd want, e.g., 'comment-\d+'.
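
As a rough end-to-end sketch of that last suggestion, the snippet below runs the 'comment-\d+' pattern against a small page. It assumes the bs4 package is installed and uses the built-in "html.parser"; the HTML and the class values 'comment-12345', 'odd' and 'comment-of-another-kind' are invented for illustration, since the original page is not shown.

```python
import re

from bs4 import BeautifulSoup  # third-party: assumes bs4 is installed

# Hypothetical markup modeled on the tags described in the question.
html = """
<div class="comment form new"></div>
<div class="comment comment-12345 odd"></div>
<div class="comment comment-of-another-kind"></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Raw string so that \d reaches the regex engine unchanged.
numbered = re.compile(r"comment-\d+")

# Keep only the divs that have a class token containing 'comment-<digits>'.
matches = soup.find_all("div", class_=numbered)
matched_classes = [div["class"] for div in matches]

print(matched_classes)  # [['comment', 'comment-12345', 'odd']]
```

Because the pattern is searched for rather than anchored, a token such as 'xcomment-1' would also qualify; if that matters, a stricter pattern like r'^comment-\d+$' is one option.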

Published: 2023-06-05 23:06:08
Article link: https://www.elefans.com/category/jswz/34/531137.html