Finding HTML tags in string

Updated: 2024-10-07 20:33:27

I know this question has been asked on SO before, but I can't find the right one, and I'm still bad at regex :/

I have a string that contains valid HTML. I want to find all the tags with a certain name and attribute.

I tried this regex (i.e. a div with a type attribute): /(<div type="my_special_type" src="(.*?)<\/div>)/.

Example string:

<div>Do not match me</div> <div type="special_type" src="bla"> match me</div> <a>not me</a> <div src="blaw" type="special_type" > match me too</div>

If I use preg_match, I only get <div type="special_type" src="bla"> match me</div>, which is logical because the other div has its attributes in a different order.

What regex do I need to get the following array when using preg_match on the example string?

array(0 => '<div type="special_type" src="bla"> match me</div>', 1 => '<div src="blaw" type="special_type" > match me too</div>')

Best answer

General advice: don't use regex to parse HTML. It will get messy if the HTML changes.

Use DOMDocument instead:

$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

// Parse the markup into a DOM tree
$doc = new DOMDocument();
$doc->loadHTML($str);

// Select every div whose type attribute is "special_type",
// regardless of the order in which its attributes appear
$selector = new DOMXPath($doc);
$result = $selector->query('//div[@type="special_type"]');

// loop through all found items
foreach ($result as $node) {
    echo $node->getAttribute('src'), PHP_EOL;
}
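The question asks for an array of the matching tags themselves, not only their src values. A minimal sketch of how the DOMDocument approach above might be extended to collect that array follows. The function name findTagsByAttribute and its parameters are hypothetical and not part of the original answer. It also assumes the tag name, attribute name, and value contain no quote characters, since they are placed directly into the XPath expression.

```php
<?php
// Hypothetical helper (not from the original answer): returns the serialized
// markup of every <$tag> element whose $attr attribute equals $value.
// Assumption: $tag, $attr and $value contain no quotes or XPath metacharacters.
function findTagsByAttribute(string $html, string $tag, string $attr, string $value): array
{
    // Collect libxml parse warnings internally instead of emitting them
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = sprintf('//%s[@%s="%s"]', $tag, $attr, $value);

    $matches = [];
    foreach ($xpath->query($query) as $node) {
        // saveHTML($node) serializes only this element and its children
        $matches[] = $doc->saveHTML($node);
    }
    return $matches;
}

$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

$found = findTagsByAttribute($str, 'div', 'type', 'special_type');
print_r($found);
```

Because the elements are re-serialized from the parsed tree, the returned strings may differ from the source in insignificant whitespace, such as the space before the closing > in the second div. Attribute order within each tag does not affect which elements are selected.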

Published: 2023-07-09 10:06:00
Link: https://www.elefans.com/category/jswz/34/1085529.html