PDF to HTML Converter
The Documentize PDF to HTML Converter for .NET converts PDF documents into HTML. Beyond changing the file format, the plugin is intended to make documents more accessible and easier to adapt to web environments.
How to Convert PDF to HTML
To convert a PDF document into HTML, follow these steps:
- Create an instance of the `PdfHtml` class.
- Create an instance of the `PdfToHtmlOptions` class to configure the conversion options.
- Add the input PDF file using the `AddInput` method.
- Add the output HTML file path using the `AddOutput` method.
- Call the `Process` method to convert the PDF to HTML.
var pdfHtml = new PdfHtml();
var options = new PdfToHtmlOptions(PdfToHtmlOptions.SaveDataType.FileWithEmbeddedResources);

// Set input and output file paths
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output.html"));

// Process the PDF to HTML conversion
pdfHtml.Process(options);
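The sample above assumes that the input file exists and that the output folder is already in place. As a minimal sketch, the paths can be checked before the converter is called. `ConversionPaths.PrepareOutput` is a hypothetical helper written for this illustration. It is not part of Documentize and uses only `System.IO`.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of the Documentize API.
public static class ConversionPaths
{
    // Verifies that the input PDF exists, creates the output directory if
    // needed, and returns an output path with the same base name and an
    // .html extension.
    public static string PrepareOutput(string inputPdf, string outputDirectory)
    {
        if (!File.Exists(inputPdf))
            throw new FileNotFoundException("Input PDF not found.", inputPdf);

        // CreateDirectory does nothing if the directory already exists.
        Directory.CreateDirectory(outputDirectory);

        string fileName = Path.GetFileNameWithoutExtension(inputPdf) + ".html";
        return Path.Combine(outputDirectory, fileName);
    }
}
```

The returned path can then be wrapped in a `FileDataSource` and passed to `AddOutput`, as in the sample above.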
How to Convert HTML to PDF
The PDF to HTML Converter also converts HTML files back into PDF. Use the same `PdfHtml` class, but pass an `HtmlToPdfOptions` instance instead of `PdfToHtmlOptions`.
var pdfHtml = new PdfHtml();
var options = new HtmlToPdfOptions();

// Set input and output file paths
options.AddInput(new FileDataSource(@"C:\Samples\input.html"));
options.AddOutput(new FileDataSource(@"C:\Samples\output.pdf"));

// Process the HTML to PDF conversion
pdfHtml.Process(options);
Customizing PDF to HTML Conversion
You can customize the conversion process by specifying encoding, fonts, or other settings. Here’s an example of setting UTF-8 encoding and the Arial font for the conversion:
using System.Text; // required for Encoding.UTF8

var pdfHtml = new PdfHtml();
var options = new PdfToHtmlOptions(PdfToHtmlOptions.SaveDataType.FileWithEmbeddedResources);

// Set encoding and font
options.Encoding = Encoding.UTF8;
options.Font = "Arial";

// Add input and output files
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output.html"));

// Process the conversion
pdfHtml.Process(options);
Batch Conversion from PDF to HTML
This plugin also supports batch processing: add several input PDFs, set an output path for each, and convert them all with a single `Process` call.
var pdfHtml = new PdfHtml();
var options = new PdfToHtmlOptions(PdfToHtmlOptions.SaveDataType.FileWithEmbeddedResources);

// Add multiple input PDF files
options.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
options.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Set output file paths for each conversion
options.AddOutput(new FileDataSource(@"C:\Samples\output_file1.html"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_file2.html"));

// Process the batch conversion
pdfHtml.Process(options);
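For larger batches, listing every file by hand becomes tedious. The following sketch shows one way to build the input and output pairs from a folder. `BatchPlanner.Plan` is a hypothetical helper, not part of Documentize, and the `output_` prefix simply mirrors the naming used in the sample above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only; not part of the Documentize API.
public static class BatchPlanner
{
    // Returns one (input, output) pair per PDF found in inputDirectory,
    // sorted by path so the order is predictable.
    public static List<(string Input, string Output)> Plan(
        string inputDirectory, string outputDirectory)
    {
        return Directory.EnumerateFiles(inputDirectory)
            // Compare the extension explicitly so files such as "a.pdfx" are skipped.
            .Where(p => string.Equals(Path.GetExtension(p), ".pdf",
                                      StringComparison.OrdinalIgnoreCase))
            .OrderBy(p => p, StringComparer.OrdinalIgnoreCase)
            .Select(p => (
                Input: p,
                Output: Path.Combine(
                    outputDirectory,
                    "output_" + Path.GetFileNameWithoutExtension(p) + ".html")))
            .ToList();
    }
}
```

Each pair can then be passed to `AddInput` and `AddOutput` in a loop, keeping inputs and outputs in the same order as in the sample above.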
Key Features:
- Convert PDF to HTML: Seamlessly convert PDF documents into fully functional HTML files.
- Embedded Resources: Choose whether to embed resources (such as images and fonts) directly into the HTML or link them externally.
- Bidirectional Conversion: Convert PDF to HTML and HTML back to PDF.
- Maintain Layout: Ensure that the original layout and formatting are preserved during conversion.
- Custom Encoding: Specify the encoding format such as UTF-8 for precise text rendering in the converted HTML.
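To make the "Embedded Resources" feature more concrete: a common way to embed an image in a single HTML file is to write it as a base64 `data:` URI instead of linking to a separate file. The sketch below is a generic illustration of that technique under this assumption. It does not show how Documentize writes its output, and `DataUriExample` is a hypothetical name.

```csharp
using System;

// Generic illustration of embedding a resource in HTML; not Documentize code.
public static class DataUriExample
{
    // Encodes raw bytes as a data URI with the given media type.
    public static string ToDataUri(byte[] content, string mediaType)
    {
        return "data:" + mediaType + ";base64," + Convert.ToBase64String(content);
    }

    // Builds an <img> tag whose source is carried inline, so no external
    // file is needed next to the HTML document.
    public static string ImgTag(byte[] pngBytes)
    {
        return "<img src=\"" + ToDataUri(pngBytes, "image/png") + "\" />";
    }
}
```

Embedding keeps the HTML self-contained at the cost of a larger file, while external links keep the HTML small but require the resource files to travel with it.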