PHP DOMDocument: loadHTML and getElementsByTagName return nothing

Updated: 2024-10-12 01:31:22
$urlToScrap = "https://play.google.com/store/apps/details?id=flipboard.app#?t=W251bGwsMSwxLDIxMiwiZmxpcGJvYXJkLmFwcCJd";
$pageContentData = file_get_contents($urlToScrap);
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$listOfDivs = $doc->getElementsByTagName("div");
foreach ($listOfDivs as $div) {
    if ($div->getAttribute("class") == "doc-banner-icon") {
        $img = $div->getElementsByTagName("img");
        var_dump($img->getAttribute("src"));
    }
}

This returns nothing.

The DOM contains the following element:

<div class="doc-banner-icon"><img src="somesrc"></div>

I'm trying to get the img src. Since the page contains many images, I would like to find the parent div first and then extract the image inside it.

The solution is here:

$urlToScrap = "https://play.google.com/store/apps/details?id=flipboard.app#?t=W251bGwsMSwxLDIxMiwiZmxpcGJvYXJkLmFwcCJd";
$pageContentData = file_get_contents($urlToScrap);
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$listOfDivs = $doc->getElementsByTagName("div");
foreach ($listOfDivs as $div) {
    if ($div->getAttribute("class") == "doc-banner-icon") {
        $listOfImages = $div->getElementsByTagName("img");
        foreach ($listOfImages as $img) {
            var_dump($img->getAttribute("src"));
        }
    }
}

Best answer

You aren't missing anything; var_dump just doesn't work the way you expect on a DOMNodeList. Try this instead:

$listOfImages = $doc->getElementsByTagName("img");
foreach ($listOfImages as $img) {
    $imgClass = $img->getAttribute('class');
    echo $imgClass;
}
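
To see why the loop is needed, it can help to inspect what getElementsByTagName actually hands back. The following is a minimal sketch, assuming a small inline HTML string (the $html value and the "icon" class are made up for illustration) in place of the downloaded page:

```php
<?php
// Hypothetical inline markup standing in for the downloaded page.
$html = '<html><body><div class="doc-banner-icon">'
      . '<img class="icon" src="somesrc"></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$images = $doc->getElementsByTagName('img');

// The result is a list object, even when only one element matches.
var_dump($images instanceof DOMNodeList);
var_dump($images->length);

// The list itself has no getAttribute(); only its items do.
var_dump(method_exists($images, 'getAttribute'));

// Iterating the list yields the individual elements.
foreach ($images as $img) {
    echo $img->getAttribute('class'), ' ', $img->getAttribute('src'), PHP_EOL;
}
```

The attribute lookups only make sense on the items inside the list, which is why the loop (or an explicit index) is required.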

In your updated question, getElementsByTagName returns a DOMNodeList rather than a single element, so just change:

$img->getAttribute("src")

to:

$img->item(0)->getAttribute("src")
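
One caveat with item(0): it returns null when the list is empty, so calling getAttribute on it directly fails for a div without an image. Here is a rough sketch of a guarded version; firstImageSrc is a hypothetical helper written for this example (it is not part of the DOM extension), and the inline HTML with an extra "empty" div is assumed sample data:

```php
<?php
// Hypothetical helper: returns the src of the first <img> inside $parent,
// or null when $parent contains no <img> at all.
function firstImageSrc(DOMElement $parent): ?string
{
    $first = $parent->getElementsByTagName('img')->item(0);
    // item() returns null for an out-of-range index, so guard before use.
    if (!$first instanceof DOMElement) {
        return null;
    }
    return $first->getAttribute('src');
}

// Assumed sample markup: one div with an image, one without.
$html = '<html><body>'
      . '<div class="doc-banner-icon"><img src="somesrc"></div>'
      . '<div class="empty"></div>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// Map each div's class to the src of its first image (or null).
$results = [];
foreach ($doc->getElementsByTagName('div') as $div) {
    $results[$div->getAttribute('class')] = firstImageSrc($div);
}
var_dump($results);
```

Whether returning null is the right fallback depends on the caller; throwing an exception would be just as reasonable.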

Given that your selection criteria are fairly complex, you might consider using XPath instead of navigating manually:

$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$xpath = new DOMXPath($doc);
$img = $xpath->query("//div[@class = 'doc-banner-icon']/img");
var_dump($img->item(0)->getAttribute('src'));
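
Note that @class = 'doc-banner-icon' only matches when the attribute is exactly that string. As a sketch of a more tolerant variant, assuming the div might carry additional classes (the "banner" and "other" classes and the inline HTML below are invented for illustration), the query can test for the class as one whitespace-separated token. It also routes libxml parser warnings into an internal buffer, since real-world pages often contain markup the parser complains about:

```php
<?php
// Assumed sample markup: the target div carries a second, unrelated class.
$html = '<html><body>'
      . '<div class="banner doc-banner-icon"><img src="somesrc"></div>'
      . '<div class="other"><img src="unrelated"></div>'
      . '</body></html>';

// Collect parser warnings internally instead of emitting them,
// then restore the previous setting.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($doc);

// Match "doc-banner-icon" as a single token within @class.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), "
       . "' doc-banner-icon ')]/img";

// Gather the src of every matching image.
$sources = [];
foreach ($xpath->query($query) as $img) {
    $sources[] = $img->getAttribute('src');
}
var_dump($sources);
```

If the class attribute is known to hold exactly one value, the simpler equality test from the answer is enough.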

Published: 2023-08-04 04:53:00
Article link: https://www.elefans.com/category/jswz/34/1410093.html