PHP Simple HTML DOM Parser > Modify fetched links

Updated: 2024-10-28 21:26:15
Question

I have a script that fetches content from a website, and I want to modify all of the links in it. Suppose:

// The inner single quotes are escaped so the PHP string literal stays valid.
$html = str_get_html('<h2 class="r"><a class="l" href="www.example/2009/07/page.html" onmousedown="return curwt(this, \'www.example/2009/07/page.html\')">SEO Result Boost <b> </b></a></h2>');

So, is it possible to modify or rewrite it like this?

<h2 class="r"><a class="l" href="www.site?www.example/2009/07/page.html">SEO Result Boost <b> </b></a></h2>

I have read its manual but cannot figure out how to do it (simplehtmldom.sourceforge/#fragment-12).

Is this possible? Any ideas?

Solution

Assuming the answer to a related question works, you should be able to use the following with Simple HTML DOM:

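require_once 'simple_html_dom.php'; // Simple HTML DOM library; adjust the path as needed

$site = "siteyourgettinglinksfrom"; // address of the site the content was fetched from
$doc = str_get_html($code);         // $code holds the fetched HTML
foreach ($doc->find('a[href]') as $a) {
    $href = $a->href;
    // Assumption: a link counts as absolute if it starts with a scheme (http://, https://, ...) or with //
    if (preg_match('#^(?:[a-z][a-z0-9+.-]*:)?//#i', $href)) {
        $a->href = 'www.site?' . $href;
    } else {
        // $href begins with a relative path: prepend the source site
        $a->href = 'www.site?' . $site . $href;
    }
}
$code = (string) $doc;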

or

Using PHP’s native DOM library:

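$site = "siteyourgettinglinksfrom"; // address of the site the content was fetched from
$doc = new DOMDocument();
$doc->loadHTML($code);              // $code holds the fetched HTML
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//a[@href]') as $a) {
    $href = $a->getAttribute('href');
    // Assumption: a link counts as absolute if it starts with a scheme (http://, https://, ...) or with //
    if (preg_match('#^(?:[a-z][a-z0-9+.-]*:)?//#i', $href)) {
        $a->setAttribute('href', 'www.site?' . $href);
    } else {
        // $href begins with a relative path: prepend the source site
        $a->setAttribute('href', 'www.site?' . $site . $href);
    }
}
$code = $doc->saveHTML();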
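The desired output in the question also drops the onmousedown attribute, which neither loop touches. A minimal sketch of one way to remove it with the native DOM library might look like this; the input is the snippet from the question, and the choice to serialize only the h2 element is an assumption made to keep the output short.

```php
<?php
// Input: the snippet from the question (URLs left exactly as posted).
$code = '<h2 class="r"><a class="l" href="www.example/2009/07/page.html" '
      . 'onmousedown="return curwt(this, \'www.example/2009/07/page.html\')">'
      . 'SEO Result Boost <b> </b></a></h2>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($code);
libxml_clear_errors();

// Remove the inline handler from every link that has one.
foreach ($doc->getElementsByTagName('a') as $a) {
    if ($a->hasAttribute('onmousedown')) {
        $a->removeAttribute('onmousedown');
    }
}

// Serialize only the <h2> element instead of the wrapper document
// (loadHTML adds <html> and <body> around a fragment).
$h2 = $doc->getElementsByTagName('h2')->item(0);
$result = $doc->saveHTML($h2);
echo $result, "\n";
```

The same removeAttribute call can go inside the foreach loop that rewrites the href, so each link is visited only once.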

Checking the $href:

You would check whether each link is relative and, if so, prepend the address of the site you're pulling the content from, since most sites use relative links. (This is where a regular expression matcher would be your best friend.)
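As a rough illustration of such a check, the sketch below sorts an href into the three kinds of link listed under the example links. The function name classify_href and the rule that "absolute" means "starts with a scheme or with //" are assumptions made for this example, and the somesite.example host is a placeholder.

```php
<?php
// Hypothetical helper: classifies an href as absolute, site-relative or document-relative.
function classify_href(string $href): string
{
    // Assumption: absolute links start with a scheme such as http:// or with // (protocol-relative).
    if (preg_match('#^(?:[a-z][a-z0-9+.-]*:)?//#i', $href)) {
        return 'absolute';
    }
    // A single leading slash means "relative to the site root" (// was handled above).
    if (preg_match('#^/#', $href)) {
        return 'site-relative';
    }
    // Everything else is resolved against the current document's directory.
    return 'document-relative';
}

foreach (['/images/picture.jpg', '../images/picture.jpg', 'http://somesite.example/images/picture.jpg'] as $href) {
    echo $href, ' => ', classify_href($href), "\n";
}
```

Links such as mailto: or javascript: have a scheme but no //, so this sketch would label them document-relative; a real script would probably want to skip them.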

For relative links, you prepend the absolute path of the site you are getting the links from:

'www.site?'.$site.$href

For absolute links, you just append the link itself:

'www.site?'.$href
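The two rules can be folded into one small function. This is only a sketch: rewrite_href is a hypothetical name, the 'www.site?' prefix is taken from the answer as-is, and the source.example and somesite.example hosts are placeholders.

```php
<?php
// Hypothetical helper: builds the rewritten href following the two rules above.
function rewrite_href(string $href, string $site, string $prefix = 'www.site?'): string
{
    // Assumption: absolute links start with a scheme (e.g. http://) or with //.
    $isAbsolute = preg_match('#^(?:[a-z][a-z0-9+.-]*:)?//#i', $href) === 1;

    return $isAbsolute
        ? $prefix . $href           // absolute: append the link itself
        : $prefix . $site . $href;  // relative: prepend the source site first
}

echo rewrite_href('http://somesite.example/images/picture.jpg', 'http://source.example'), "\n";
// www.site?http://somesite.example/images/picture.jpg
echo rewrite_href('/images/picture.jpg', 'http://source.example'), "\n";
// www.site?http://source.example/images/picture.jpg
```

Inside either loop, the if/else then shrinks to a single call such as $a->href = rewrite_href($a->href, $site).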

Example links:

site relative: /images/picture.jpg

document relative: ../images/picture.jpg

absolute: somesite/images/picture.jpg

(Note: a little more work is needed here. If you're handling "document relative" links, you have to know which directory you're currently in. Site relative links should be good to go, as long as you have the root folder of the site you're getting links from.)
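One way to handle the "document relative" case is to resolve the link against the URL of the page it was found on. The sketch below is a simplified take on that idea, not a full URL resolver: resolve_href is a hypothetical helper, the source.example URL is a placeholder, and ports, query strings and fragments in the page URL are ignored.

```php
<?php
// Hypothetical helper: resolves a relative href against the URL of the page it came from.
function resolve_href(string $pageUrl, string $href): string
{
    $parts = parse_url($pageUrl);
    $root = $parts['scheme'] . '://' . $parts['host'];

    if (substr($href, 0, 1) === '/') {
        // Site relative: start again from the site root.
        $path = $href;
    } else {
        // Document relative: start from the directory of the current page.
        $pagePath = $parts['path'] ?? '/';
        $dir = substr($pagePath, 0, strrpos($pagePath, '/') + 1);
        $path = $dir . $href;
    }

    // Collapse "." and ".." segments.
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }

    return $root . '/' . implode('/', $resolved);
}

echo resolve_href('http://source.example/2009/07/page.html', '../images/picture.jpg'), "\n";
// http://source.example/2009/images/picture.jpg
echo resolve_href('http://source.example/2009/07/page.html', '/images/picture.jpg'), "\n";
// http://source.example/images/picture.jpg
```

The resolved URL can then be appended to 'www.site?' in the same way as an absolute link.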

Published: 2023-05-29 16:43:40