@pedrolamas
Created April 14, 2012 17:12
Getting all "a.href" values in an HTML string
using System.Linq;
using HtmlAgilityPack;

namespace PhoneApp1
{
    public class HtmlAgilityPackSample
    {
        public static string[] GetLinks(string html)
        {
            // Allow form overlaps (forms inside forms).
            // Note: ElementsFlags is static, so this affects every document parsed afterwards.
            HtmlNode.ElementsFlags["form"] = HtmlElementFlag.CanOverlap;

            var htmlDocument = new HtmlDocument()
            {
                OptionFixNestedTags = true // fixes tags that are not properly closed
            };

            htmlDocument.LoadHtml(html); // load the HTML string

            return htmlDocument.DocumentNode.Descendants("a") // get all "a" elements in HTML
                .Select(x => x.Attributes["href"])            // get the "href" attribute from each element
                .Where(attribute => attribute != null)        // skip anchors without "href" (avoids a NullReferenceException)
                .Select(attribute => attribute.Value)
                .ToArray();
        }
    }
}
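A minimal usage sketch, assuming the HtmlAgilityPack package is referenced. The `GetLinksDemo` class and the sample HTML are hypothetical and only illustrate how the method above might be called; the compact `GetLinks` here is a stand-in so the sketch compiles on its own, not the gist's exact code.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class GetLinksDemo
{
    // Compact stand-in for GetLinks above (hypothetical; omits the form-overlap
    // flag and OptionFixNestedTags for brevity).
    public static string[] GetLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode.Descendants("a")
            .Select(a => a.Attributes["href"])   // null when the anchor has no href
            .Where(attribute => attribute != null)
            .Select(attribute => attribute.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample input: two links and one anchor without an href.
        const string html =
            "<p><a href=\"https://example.com/\">one</a>" +
            "<a name=\"top\">no href</a>" +
            "<a href=\"/about\">two</a></p>";

        // Each href value is written on its own line; the anchor without
        // an href is skipped rather than causing an exception.
        foreach (var link in GetLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

The href values come back exactly as written in the markup, so relative links such as `/about` are not resolved against a base address; that step, if needed, is left to the caller.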