Finding HTML tags in a string

Updated: 2024-10-07 20:33:27

I know this question has been asked on SO before, but I can't find the right one, and I'm still bad at regex :/

I have a string that contains valid HTML. Now I want to find all the tags with a certain name and attribute.

I tried this regex (e.g. for a div with a type attribute): /(<div type="special_type" src="(.*?)<\/div>)/.

Example string:

<div>Do not match me</div> <div type="special_type" src="bla"> match me</div> <a>not me</a> <div src="blaw" type="special_type" > match me too</div>

If I use preg_match, I only get <div type="special_type" src="bla"> match me</div>, which makes sense because the other one has its attributes in a different order.

What regex do I need to get the following array when using preg_match on the example string?

array(0 => '<div type="special_type" src="bla"> match me</div>', 1 => '<div src="blaw" type="special_type" > match me too</div>')

Best answer

General advice: don't use regex to parse HTML. It will get messy if the HTML changes.

Use DOMDocument instead:

    $str = <<<EOF
    <div>Do not match me</div> <div type="special_type" src="bla"> match me</div> <a>not me</a> <div src="blaw" type="special_type" > match me too</div>
    EOF;

    $doc = new DOMDocument();
    $doc->loadHTML($str);

    // select every div whose type attribute is "special_type",
    // regardless of the order of its attributes
    $selector = new DOMXPath($doc);
    $result = $selector->query('//div[@type="special_type"]');

    // loop through all found items
    foreach ($result as $node) {
        echo $node->getAttribute('src');
    }
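The question asks for an array of the matching elements themselves rather than their src values. A minimal sketch of one way to get that with the same DOMDocument approach is below. The function name find_divs_by_type() is hypothetical and not part of the original answer, and the sketch assumes the type value contains no double quotes, since it is inlined into the XPath expression.

```php
<?php
// Sketch: collect the outer HTML of every <div> with a given type attribute.
// find_divs_by_type() is a hypothetical helper name, not from the original answer.
function find_divs_by_type(string $html, string $type): array
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $matches = [];

    // Assumption: $type contains no double quotes (it is inlined into the query).
    foreach ($xpath->query('//div[@type="' . $type . '"]') as $node) {
        // Passing a node to saveHTML() serialises only that element,
        // including its own opening and closing tags.
        $matches[] = $doc->saveHTML($node);
    }

    return $matches;
}

$str = '<div>Do not match me</div> <div type="special_type" src="bla"> match me</div> '
     . '<a>not me</a> <div src="blaw" type="special_type" > match me too</div>';

$found = find_divs_by_type($str, 'special_type');
print_r($found);
```

Both special_type divs end up in $found, whatever order their attributes are in. The strings come from the parser's serialisation, so they may differ slightly from the input text, for example in whitespace inside the opening tag.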

Published: 2023-07-09 10:06:00

Article link: https://www.elefans.com/category/jswz/34/1085529.html