.NET API for working with real-world HTML
Class library to create, edit, extract data & convert HTML pages to PDF, XPS, Images and other formats.
Aspose.HTML for .NET is an advanced HTML processing API for performing a wide range of management and manipulation tasks within cross-platform applications. The API can generate, modify, extract data from, convert and render HTML documents without any external software. It also supports popular file formats such as EPUB, MHTML, SVG and Markdown, and renders to PDF, XPS and image file formats.
Moreover, the HTML Document Object Model is integrated out-of-the-box with embedded formats and specifications such as CSS, HTML Canvas, SVG, XPath and JavaScript, which extend the manipulation functionality and rendering quality.
Advanced .NET HTML Manipulation API Features
Create HTML pages from Scratch
Load existing HTML from file, stream or URL
Implement W3C specifications
Implement templates using template merger
Fill the template with various data sources
Render HTML Canvas 2D to PDF
Add, replace or remove nodes
Extract data from HTML documents
Load EPUB and MHTML file formats
Render HTML to raster image formats
Render multiple documents at once
Implement Markdown to HTML converter
Apply header and footer during HTML to PDF
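Several of the features listed above, such as creating a page from scratch and adding or removing nodes, go through the W3C-style DOM that the API exposes. The following is a minimal sketch rather than a definitive implementation. It assumes the Aspose.HTML for .NET package is referenced; the output file name created.html and the text content are placeholders chosen for illustration.

```csharp
using Aspose.Html;

// Create an empty HTML document (no file or URL is loaded)
using (var document = new HTMLDocument())
{
    // Build a heading and a paragraph using standard DOM calls
    var heading = document.CreateElement("h1");
    heading.TextContent = "Hello";
    var paragraph = document.CreateElement("p");
    paragraph.AppendChild(document.CreateTextNode("Created from scratch."));

    // Add both nodes to the body, then remove the heading again
    document.Body.AppendChild(heading);
    document.Body.AppendChild(paragraph);
    document.Body.RemoveChild(heading);

    // Save the result; "created.html" is a placeholder path
    document.Save("created.html");
}
```

The same AppendChild and RemoveChild calls apply to a document loaded from a file, stream or URL, since the DOM interface does not depend on how the document was created.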
Convert HTML to PDF, Image and Other Formats
With just a few lines of code, the API lets you implement HTML to PDF, HTML to Image or any other supported conversion in your .NET applications.
Convert HTML to PDF and PNG - C#
// Load the HTML file to be converted
using (var document = new Aspose.Html.HTMLDocument("document.html"))
{
    // Convert HTML to PDF
    Aspose.Html.Converters.Converter.ConvertHTML(document, new Aspose.Html.Saving.PdfSaveOptions(), "output.pdf");
    // Convert HTML to Image (PNG)
    Aspose.Html.Converters.Converter.ConvertHTML(document, new Aspose.Html.Saving.ImageSaveOptions(Aspose.Html.Rendering.Image.ImageFormat.Png), "output.png");
}
Markdown Support
Markdown is a markup language with a plain-text formatting syntax. It is often used for documentation and readme files because it is easy to read and easy to write. Aspose.HTML provides a flexible Markdown converter that works in both directions: from Markdown to HTML and from HTML to Markdown. Moreover, the converter API has a set of predefined rules, so you can convert HTML to Markdown using the original Markdown syntax or the GitLab Flavored Markdown variant, or configure the rules to suit your needs.
Convert HTML to Markdown - C#
// Load HTML file
using (var document = new Aspose.Html.HTMLDocument("document.html"))
{
    // Convert HTML to Markdown using a set of features supported by GitLab Flavored Markdown
    document.Save("output.md", Aspose.Html.Saving.MarkdownSaveOptions.Git);
}
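Configuring the conversion rules yourself might look like the sketch below. It is illustrative only and rests on an assumption: that `MarkdownSaveOptions` exposes a `Features` property taking `MarkdownFeatures` flags from `Aspose.Html.Saving`, with members named `Link` and `AutomaticParagraph`. The inline markup and the output file name are placeholders.

```csharp
using Aspose.Html;
using Aspose.Html.Converters;
using Aspose.Html.Saving;

// Inline markup keeps the sketch self-contained; "." is the base URI
var html = "<h1>Title</h1><p>See the <a href=\"https://example.com\">example</a> page.</p>";
using (var document = new HTMLDocument(html, "."))
{
    // Assumed API: restrict conversion to links and paragraphs only
    var options = new MarkdownSaveOptions
    {
        Features = MarkdownFeatures.Link | MarkdownFeatures.AutomaticParagraph
    };

    // "custom-output.md" is a placeholder path
    Converter.ConvertHTML(document, options, "custom-output.md");
}
```

Under that assumption, elements outside the selected features are not turned into Markdown syntax, which helps when the target renderer supports only a subset of Markdown.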
Convert Markdown to HTML - C#
// Convert Markdown to HTML
Aspose.Html.Converters.Converter.ConvertMarkdown("document.md", "output.html");
Electronic Books and Web Archives
The Electronic Book (EPUB) and Web Archive (MHTML) formats are supported out-of-the-box. The API offers high-fidelity rendering of EPUB and MHTML files to the supported output formats, such as PDF, XPS and image file formats.
Convert EPUB to PDF - C#
// Convert EPUB to PDF.
Aspose.Html.Converters.Converter.ConvertEPUB("document.epub", new Aspose.Html.Saving.PdfSaveOptions(), "output.pdf");
Convert MHTML to PDF - C#
// Convert MHTML to PDF.
Aspose.Html.Converters.Converter.ConvertMHTML("document.mht", new Aspose.Html.Saving.PdfSaveOptions(), "output.pdf");
Web Scraping
Web scraping, also known as web harvesting, web data extraction or web crawling, is a technique for extracting data from a website. Aspose.HTML does not include a dedicated web scraping module. However, because the Aspose.HTML API is based entirely on W3C specifications and supports XPath and CSS Selector queries, you can easily inspect the content of any HTML document and build your own web scraping solution.
Simple Web Data Extraction - C#
// Required for the Cast<T>() and ToList() extension methods
using System.Linq;
// Create an instance of the HTML document with a website as a parameter.
using (var document = new Aspose.Html.HTMLDocument("https://en.wikipedia.org/wiki/Aspose_API"))
{
    // Get all anchor elements
    var elements = document.QuerySelectorAll("a");
    // Dump the anchor element data to the console.
    elements.Cast<Aspose.Html.HTMLAnchorElement>().ToList().ForEach(x =>
    {
        System.Console.WriteLine("[Href]: " + x.Href);
        System.Console.WriteLine("[Content]: " + x.TextContent);
    });
}
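The XPath support mentioned above can be used in a similar way. The sketch below is a minimal illustration, not a complete scraping solution. It assumes the DOM-style `Evaluate` method on the document and the `XPathResultType` enumeration in `Aspose.Html.Dom.XPath`; the inline markup and the `item` class name are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Html;
using Aspose.Html.Dom;
using Aspose.Html.Dom.XPath;

// Inline markup keeps the sketch self-contained; "." is the base URI
var html = "<ul><li class='item'>One</li><li>Two</li><li class='item'>Three</li></ul>";
var texts = new List<string>();

using (var document = new HTMLDocument(html, "."))
{
    // Select every <li> element whose class attribute equals 'item'
    var result = document.Evaluate("//li[@class='item']", document, null, XPathResultType.Any, null);

    // Walk the matched nodes and collect their text content
    for (Node node; (node = result.IterateNext()) != null;)
    {
        texts.Add(node.TextContent);
    }
}

// Print what was collected
texts.ForEach(Console.WriteLine);
```

XPath is a reasonable choice when the selection depends on document structure or attribute values that are awkward to express as a CSS selector; for simple tag or class lookups, QuerySelectorAll is usually shorter.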
Tags: Aspose, Web Application Framework Library, .NET, Aspose.HTML for .NET