Extract HTML content using PHP
I have the following code:

    $html = file_get_contents("http://www.jabong.com/giordano-Dtlm60058-Black-Analog-Watch-267058.html");
    $dom = new DOMDocument();
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//*[@id="price_div"]/div[2]/span[2]'); //this catches all elements with
    var_dump($nodes);

I want to extract the price from the page, but this XPath gives me no result.
Best answer
Did you ever solve the problem? Here is some working code:

    $html = file_get_contents("http://www.jabong.com/giordano-Dtlm60058-Black-Analog-Watch-267058.html");

    //suppress errors (there are a lot on the page in question)
    libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    //don't preserve whitespace
    $dom->preserveWhiteSpace = false;
    //as @Larry.Z comments, you forgot to load the $html
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    //assuming there can be more than one "price set" on each page
    $prices = array();
    $price_divs = $xpath->query('//div[@id="price_div"]');
    foreach ($price_divs as $price_div) {
        $price = array();
        foreach ($price_div->childNodes as $price_item) {
            $content = trim($price_item->textContent);
            if ($content != '') $price[] = $content;
        }
        $prices[] = $price;
    }

    echo '<pre>';
    print_r($prices);
    echo '</pre>';

This outputs:

    Array
    (
        [0] => Array
            (
                [0] => Save 66%
                [1] => Rs. 5850
                [2] => Rs. 1999
            )

    )

You can skip the $prices[] part and use only $price if there will never be more than one price set per page.
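If only a single price is needed as a number, the same approach can be narrowed down. The following is a minimal sketch, not a tested scraper for the live site: the inline markup is a hypothetical stand-in for the real page (whose structure is not shown here), and treating the last entry as the current selling price is an assumption based on the output above.

```php
<?php
// Hypothetical markup standing in for the live page; the real structure may differ.
$html = <<<HTML
<html><body>
<div id="price_div">
  <div>Save 66%</div>
  <div>Rs. 5850</div>
  <div>Rs. 1999</div>
</div>
</body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Gather the non-empty text of each element child of #price_div.
$price = array();
foreach ($xpath->query('//div[@id="price_div"]/*') as $node) {
    $text = trim($node->textContent);
    if ($text !== '') {
        $price[] = $text;
    }
}

// Assumption: the last entry is the current selling price.
$current = end($price);

// Drop every non-digit character to get an integer amount.
$amount = (int) preg_replace('/\D/', '', (string) $current);

print_r($price);
echo $amount, "\n";
```

Selecting `/*` instead of iterating over `childNodes` skips whitespace-only text nodes at the XPath level, so the empty-string check becomes a safety net rather than the main filter.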