Replacing with Nokogiri
I am using Nokogiri to scan a document and remove specific files that are stored as attachments. However, I want to note in-line that the file was removed.

For example:

    <a href="...">File Download</a>

should be converted to:

    File Removed

Here is what I tried:

    @doc = Nokogiri::HTML(html).to_html
    @doc.search('a').each do |attachment|
      attachment.remove
      attachment.content = "REMOVED"
      # ALSO TRIED:
      attachment.content = "REMOVED"
    end

The second one does replace the anchor text, but it keeps the href, so the user can still download the file.

How can I replace the anchor and change it to a <p> with a new string?
Best answer
Use a combination of create_element and replace to achieve that. See the inline comments below.

    require 'nokogiri'

    html = '<a href="...">File Download</a>'
    dom = Nokogiri::HTML(html) # parse with nokogiri
    dom.to_s # original content
    #=> "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><a href=\"...\">File Download</a></body></html>\n"

    # scan the dom for hyperlinks
    dom.css('a').each do |a|
      node = dom.create_element 'p' # create paragraph element
      node.inner_html = "REMOVED"   # add content you want
      a.replace node                # replace found link with paragraph
    end

    dom.to_s # modified html
    #=> "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><p>REMOVED</p></body></html>\n"

Hope this helps.