This GitHub gist repository contains the C# code examples used in the Data Extraction section of the Aspose.HTML for .NET documentation. The gists show how to parse, navigate, and extract information from HTML documents with the Aspose.HTML for .NET library. They cover the following topics:
- DOM Traversal. Programmatically navigate and manipulate the DOM tree using W3C-compliant traversal interfaces to inspect and retrieve content from HTML documents.
- Filter and process nodes. Select and manipulate specific parts of the HTML.
- Apply XPath queries and CSS selectors. Navigate the HTML document structure to retrieve specific nodes.
- Download images and SVGs from a website. Programmatically extract various types of images from a website using C#.
- Download a website. Programmatically extract and save websites, and customize the saving process with options such as MaxHandlingDepth and PageUrlRestriction.
- Extract data from tables. Retrieve structured information from HTML tables.
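As a rough illustration of the selector, XPath, and table topics listed above, here is a minimal sketch rather than one of the actual gists. It assumes the Aspose.HTML NuGet package is referenced; the inline markup, the `h2.title` selector, and the variable names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Html;
using Aspose.Html.Collections;
using Aspose.Html.Dom;
using Aspose.Html.Dom.XPath;

// Hypothetical inline markup so the sketch does not depend on any file or URL.
const string html = @"
<html><body>
  <h2 class='title'>First</h2>
  <h2 class='title'>Second</h2>
  <table>
    <tr><th>Name</th><th>Price</th></tr>
    <tr><td>Pen</td><td>1.50</td></tr>
    <tr><td>Ink</td><td>4.00</td></tr>
  </table>
</body></html>";

// The second argument is the base URI used to resolve relative links.
using var document = new HTMLDocument(html, ".");

// CSS selector: collect every <h2> element with the class "title".
NodeList titles = document.QuerySelectorAll("h2.title");
foreach (Element title in titles)
    Console.WriteLine(title.TextContent);

// XPath: select the first <td> of every table row and iterate over the result.
IXPathResult firstCells = document.Evaluate(
    "//tr/td[1]", document, document.CreateNSResolver(document), XPathResultType.Any, null);
for (Node node; (node = firstCells.IterateNext()) != null;)
    Console.WriteLine(node.TextContent);

// Table extraction: turn each row into a list of trimmed cell values.
var rows = new List<List<string>>();
foreach (Element row in document.QuerySelectorAll("table tr"))
{
    var values = new List<string>();
    foreach (Element cell in row.QuerySelectorAll("th, td"))
        values.Add(cell.TextContent.Trim());
    rows.Add(values);
    Console.WriteLine(string.Join(" | ", values));
}
```

The gists themselves typically load documents from files or URLs; inline markup is used here only to keep the sketch self-contained.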
Each example is self-contained and demonstrates a particular aspect of extracting structured or media content from HTML sources.
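The saving options named in the "Download website" item can be sketched as follows. This is an illustrative outline, not the gist's own code: the URL and output path are placeholders, and the chosen values (a depth of 1, same-host restriction) are only one possible configuration.

```csharp
using Aspose.Html;
using Aspose.Html.Saving;

// Placeholder URL; replace it with the site you want to save.
using var document = new HTMLDocument("https://example.com/");

var options = new HTMLSaveOptions
{
    ResourceHandlingOptions =
    {
        // Follow links one level deep from the starting page.
        MaxHandlingDepth = 1,
        // Only save linked pages that live on the same host as the starting page.
        PageUrlRestriction = UrlRestriction.SameHost
    }
};

// Placeholder output path; linked resources are saved alongside this file.
document.Save("output/example.html", options);
```

Raising MaxHandlingDepth pulls in more linked pages, so it is usually worth starting with a small value and a host restriction before widening the crawl.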
To run an example:
- Make sure your .NET project references the Aspose.HTML for .NET library, which is available via NuGet.
- Select the example you're interested in and copy its content.
- Paste the code into your project and run it to see how data extraction works.
You can download a free trial of Aspose.HTML for .NET and use a temporary license for unrestricted access.
These samples support the tutorials in the Data Extraction chapter of the official documentation.
- Documentation – Aspose.HTML for .NET
- Product page – Aspose.HTML for .NET
- Free Support Forum – Aspose.HTML
- Blog – Aspose.HTML Product Family
- API Reference – Aspose.HTML for .NET
- NuGet Package – Aspose.HTML for .NET
Prerequisites:
- .NET 6.0+, .NET Core, or .NET Framework
- Aspose.HTML for .NET library