Extract text from an HTML tag with C#
I have the tag below in a variable. I need to extract the values of type and id into separate variables using C#. What would be the best approach?
<a href="gana:$type=FlexiPage;id=c828c4ea-075d-4dde-84f0-1876f8b71fa8;title=Workflow%20flexi$">workflow link</a>
Best answer
If I had to parse HTML, I would also use HtmlAgilityPack. You can use SelectSingleNode, GetAttributeValue and string methods to build a dictionary of key/value pairs:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);
    // first <a> element anywhere in the loaded fragment
    var anchor = doc.DocumentNode.SelectSingleNode("//a");
    string href = anchor.GetAttributeValue("href", "");

    // take the text between both $
    int startIndex = href.IndexOf('$') + 1;
    int endIndex = href.LastIndexOf('$');
    href = href.Substring(startIndex, endIndex - startIndex);

    // split "key=value;key=value" into a case-insensitive dictionary
    Dictionary<string, string> pageInfos = href.Split(';')
        .Select(token => token.Split('='))
        .ToDictionary(kv => kv[0].Trim(), kv => kv[1].Trim(), StringComparer.InvariantCultureIgnoreCase);

    string id = pageInfos["id"];     // c828c4ea-075d-4dde-84f0-1876f8b71fa8
    string type = pageInfos["type"]; // FlexiPage
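Once the href string is in hand, the key/value parsing step from the answer can be isolated and made a bit more defensive. The following is only a minimal sketch, not the answerer's code: the `GanaLink` class and its `Parse` method are hypothetical names introduced here, and it assumes the payload always sits between a first and a last `$`. It skips malformed tokens instead of throwing, and percent-decodes values such as the title.

```csharp
using System;
using System.Collections.Generic;

public static class GanaLink
{
    // Hypothetical helper (not part of HtmlAgilityPack): parses the payload
    // between the first and last '$' of a "gana:" href into key/value pairs.
    public static Dictionary<string, string> Parse(string href)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        int start = href.IndexOf('$');
        int end = href.LastIndexOf('$');
        if (start < 0 || end <= start)
            return result; // no "$...$" payload found

        string payload = href.Substring(start + 1, end - start - 1);

        foreach (string token in payload.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Split on the first '=' only, so a value may itself contain '='.
            string[] kv = token.Split(new[] { '=' }, 2);
            if (kv.Length != 2)
                continue; // skip malformed tokens instead of throwing

            // Percent-decode the value, e.g. "Workflow%20flexi" becomes "Workflow flexi".
            result[kv[0].Trim()] = Uri.UnescapeDataString(kv[1].Trim());
        }

        return result;
    }

    public static void Main()
    {
        string href = "gana:$type=FlexiPage;id=c828c4ea-075d-4dde-84f0-1876f8b71fa8;title=Workflow%20flexi$";
        Dictionary<string, string> info = Parse(href);

        string type, id, title;
        if (info.TryGetValue("type", out type)) Console.WriteLine(type);   // FlexiPage
        if (info.TryGetValue("id", out id)) Console.WriteLine(id);         // c828c4ea-075d-4dde-84f0-1876f8b71fa8
        if (info.TryGetValue("title", out title)) Console.WriteLine(title); // Workflow flexi
    }
}
```

Using TryGetValue rather than the indexer avoids a KeyNotFoundException when a link lacks one of the expected keys.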