I have a script that fetches content from a website, and I want to modify all the links in it. Suppose:

    $html = str_get_html('<h2 class="r"><a class="l" href="www.example/2009/07/page.html" onmousedown="return curwt(this, \'www.example/2009/07/page.html\')">SEO Result Boost <b> </b></a></h2>');

So, is it possible to modify or rewrite it this way?

    <h2 class="r"><a class="l" href="www.site?www.example/2009/07/page.html">SEO Result Boost <b> </b></a></h2>

I have read its manual but cannot work out how to do it ( simplehtmldom.sourceforge/#fragment-12 ).
Is it possible? Any ideas?

Solution

Assuming the answer to a related question works, you should be able to use the following with Simple HTML DOM:
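    include 'simple_html_dom.php'; // Simple HTML DOM must be loaded first

    $site = "siteyourgettinglinksfrom";
    $doc = str_get_html($code);
    foreach ($doc->find('a[href]') as $a) {
        $href = $a->href;
        // Assumed test: an href starting with http:// or https:// is absolute
        if (preg_match('#^https?://#i', $href)) {
            // Absolute URL: prepend only the redirect prefix
            $a->href = 'www.site?' . $href;
        } else {
            // Relative path: prepend the source site as well
            $a->href = 'www.site?' . $site . $href;
        }
    }
    $code = (string) $doc;

Or, using PHP's native DOM library: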
    $site = "siteyourgettinglinksfrom";
    $doc = new DOMDocument();
    $doc->loadHTML($code);
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//a[@href]') as $a) {
        $href = $a->getAttribute('href');
        // Assumed test: an href starting with http:// or https:// is absolute
        if (preg_match('#^https?://#i', $href)) {
            // Absolute URL: prepend only the redirect prefix
            $a->setAttribute('href', 'www.site?' . $href);
        } else {
            // Relative path: prepend the source site as well
            $a->setAttribute('href', 'www.site?' . $site . $href);
        }
    }
    $code = $doc->saveHTML();

Checking the $href:
You need to check whether each link is relative and, if it is, prepend the address of the site you are pulling the content from, since most sites use relative links. (This is where a regular expression matcher would be your best friend.)

For relative links, prepend the absolute path of the site you are getting the links from:

    'www.site?' . $site . $href

For absolute links, just append the link as it is:

    'www.site?' . $href

Example links:
site relative: /images/picture.jpg
document relative: ../images/picture.jpg
absolute: somesite/images/picture.jpg
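
The check described above can be sketched as one small function. This is only a minimal sketch under stated assumptions: `rewriteHref` is a hypothetical name, an href counts as absolute only when it starts with `http://` or `https://`, and `http://somesite` is a placeholder host with a scheme added for illustration.

```php
<?php
// Hypothetical helper: wraps one href in the 'www.site?' prefix used above.
// Assumption: an href is absolute only if it starts with http:// or https://.
function rewriteHref(string $href, string $site, string $prefix = 'www.site?'): string
{
    if (preg_match('#^https?://#i', $href) === 1) {
        // Absolute link: append it as it is.
        return $prefix . $href;
    }
    // Relative link: put the source site in front of it first.
    return $prefix . $site . $href;
}

$site = 'http://somesite'; // placeholder for the site being scraped

echo rewriteHref('/images/picture.jpg', $site), "\n";
echo rewriteHref('http://somesite/images/picture.jpg', $site), "\n";
// Both lines print: www.site?http://somesite/images/picture.jpg
```

The same function can be called inside either loop shown earlier, in place of the if/else branch.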
(Note: a little more work is needed here. If you are handling document-relative links, you have to know which directory you are currently in. Site-relative links should be good to go, as long as you have the root folder of the site you are getting the links from.)
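
One way to handle the document-relative case is to resolve the link against the directory of the page being scraped before adding the prefix. The following is a rough sketch, not a full URL resolver: `resolveDocumentRelative` and the directory `/2009/07/` are hypothetical, and query strings, fragments and trailing slashes are ignored.

```php
<?php
// Hypothetical helper: turns a document-relative href into a site-relative path.
// $currentDir is the directory of the page being scraped, e.g. '/2009/07/'.
function resolveDocumentRelative(string $href, string $currentDir): string
{
    // Site-relative links already start at the root, so leave them alone.
    if (strpos($href, '/') === 0) {
        return $href;
    }

    $segments = [];
    foreach (explode('/', $currentDir . '/' . $href) as $part) {
        if ($part === '' || $part === '.') {
            continue;             // skip empty and "same directory" segments
        }
        if ($part === '..') {
            array_pop($segments); // step up one directory
        } else {
            $segments[] = $part;
        }
    }

    return '/' . implode('/', $segments);
}

echo resolveDocumentRelative('../images/picture.jpg', '/2009/07/'), "\n";
// prints: /2009/images/picture.jpg
```

The result is a site-relative path, so it can then be treated exactly like the site-relative example above.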