HtmlAgilityPack and HtmlDecode

Updated: 2024-10-28 13:27:16
Problem description

I am currently using HtmlAgilityPack in a console application to scrape a website. Because the HTML is encoded (it returns encoded characters such as &#39;), I have to decode the content before saving it to my database.

Is there a way to decode the returned HTML using HtmlAgilityPack itself, without calling HttpUtility.HtmlDecode? I would like to avoid adding a reference to System.Web to my console application if possible.

Recommended answer

The Html Agility Pack is equipped with a utility class called HtmlEntity. It has a static method with the following signature:

/// <summary>
/// Replace known entities by characters.
/// </summary>
/// <param name="text">The source text.</param>
/// <returns>The result text.</returns>
public static string DeEntitize(string text)

It supports well-known named entities (such as &nbsp;) as well as numerically encoded characters (such as &#039;).
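As a rough illustration of how this might look in the console application from the question, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced. The class name DeEntitizeDemo, the helper Decode, and the sample HTML string are hypothetical names made up for this example. Only HtmlDocument, LoadHtml, SelectSingleNode, InnerText and HtmlEntity.DeEntitize come from the library.

```csharp
using System;
using HtmlAgilityPack;

public static class DeEntitizeDemo
{
    // Hypothetical helper: decodes HTML entities using HtmlAgilityPack only,
    // so no reference to System.Web is needed.
    public static string Decode(string encoded)
    {
        return HtmlEntity.DeEntitize(encoded);
    }

    public static void Main()
    {
        // Made-up fragment standing in for scraped page content.
        var doc = new HtmlDocument();
        doc.LoadHtml("<p>It&#039;s a &quot;test&quot; &amp; more</p>");

        // Take the text of the first <p> element and decode it
        // before it is saved anywhere.
        HtmlNode paragraph = doc.DocumentNode.SelectSingleNode("//p");
        string decoded = Decode(paragraph.InnerText);

        Console.WriteLine(decoded);
    }
}
```

Decoding in a single helper just before the database write keeps the rest of the scraping code working on the raw node text, which is probably the easiest place to apply it.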

Published: 2023-11-04 12:39:59
Article link: https://www.elefans.com/category/jswz/34/1557966.html