Developer's Guide

PDF Security

Encrypt and decrypt PDF documents using C# .NET

PDF Optimizer

Reduce file sizes, rotate pages, crop content, and resize documents

PDF Table Generator

Easily generate structured tables in PDF documents, ideal for organizing data, creating interactive forms, and enhancing content readability.

PDF Merger

Merge multiple PDF documents into a single file using C# .NET

PDF Signature

.NET plugin offers a streamlined process for adding signatures, ensuring authenticity, and securing PDF content

PDF Splitter

.NET tool that simplifies the process of splitting large PDF documents into smaller, more manageable files

PDF ChatGPT

Integrate ChatGPT API with .NET PDF applications

PDF Text Extractor

.NET plugin allows you to extract text efficiently while preserving formatting or omitting it based on your needs

PDF Timestamp Adder

Add secure timestamps to your PDF documents using C# .NET

PDF/A Converter

.NET plugin converts PDF documents into the PDF/A format, ensuring that your content remains compliant with long-term archiving standards

PDF to XLS Converter

Effortlessly convert PDF documents into Excel spreadsheets (XLS/XLSX) with Documentize's robust .NET plugin.

PDF to DOC Converter

.NET tool allows converting PDF documents into DOC or DOCX formats

PDF to JPEG Converter

.NET plugin simplifies the conversion of PDF documents into high-quality JPEG images

PDF to PNG Converter

.NET plugin allows you to convert PDF documents into high-quality PNG images

PDF to TIFF Converter

.NET plugin simplifies the conversion of PDF documents into high-quality TIFF images

HTML Converter

Comprehensive guide to Documentize HTML Converter's PDF to HTML and HTML to PDF features.

Sep 12, 2024

PDF Security

Documentize PDF Security for .NET encrypts and decrypts PDF documents, keeping sensitive information confidential and protected from unauthorized access.

Key Features:

  • Encrypt PDF Documents: Secure your PDF files by adding user and owner passwords.
  • Decrypt PDF Documents: Remove encryption from PDFs when needed.
  • Set Permissions: Control permissions such as printing, copying, and modifying content.
  • Automation: Integrate encryption and decryption into your .NET applications for automated workflows.
  • Compliance: Ensure your documents meet industry standards for document security.

How to Encrypt a PDF Document

To encrypt a PDF document, follow these steps:

  1. Create an instance of the Security class.
  2. Create an instance of EncryptionOptions with the desired user and owner passwords.
  3. Add the input PDF file using the AddInput method.
  4. Set the output file path using AddOutput.
  5. Execute the encryption using the Process method.

// Instantiate the Security plugin
var plugin = new Security();

// Configure the encryption options
var opt = new EncryptionOptions("user_password", "owner_password");

// Add input PDF file
opt.AddInput(new FileDataSource("path_to_pdf"));

// Specify the output encrypted PDF file
opt.AddOutput(new FileDataSource("path_to_encrypted_pdf"));

// Perform the encryption process
plugin.Process(opt);

How to Decrypt a PDF Document

To decrypt a PDF document, follow these steps:

  1. Create an instance of the Security class.
  2. Create an instance of DecryptionOptions with the necessary password.
  3. Add the encrypted PDF file using the AddInput method.
  4. Set the output file path using AddOutput.
  5. Execute the decryption using the Process method.

// Instantiate the Security plugin
var plugin = new Security();

// Configure the decryption options
var opt = new DecryptionOptions("user_password");

// Add input encrypted PDF file
opt.AddInput(new FileDataSource("path_to_encrypted_pdf"));

// Specify the output decrypted PDF file
opt.AddOutput(new FileDataSource("path_to_decrypted_pdf"));

// Perform the decryption process
plugin.Process(opt);

Setting Permissions on PDF Documents

When encrypting a PDF, you can set various permissions to control how the document can be used.

  • Printing: Allow or disallow printing of the document.
  • Copying: Allow or disallow copying of content.
  • Modifying: Allow or disallow modifications to the document.

To set permissions, you can configure the EncryptionOptions accordingly.

PDF Optimizer

The Documentize PDF Optimizer is a comprehensive plugin that enhances PDF documents through advanced optimization techniques. It is designed to help reduce file sizes, rotate pages, crop content, and resize documents. These operations improve the quality and manageability of PDF files, making them easier to store, share, and view.

Key Features:

  • Optimization: Reduce PDF file size without losing quality.
  • Rotation: Adjust the orientation of PDF pages.
  • Cropping: Remove unnecessary margins or content from the document.
  • Resizing: Resize pages to specific dimensions (e.g., A4, Letter).

Optimize PDF Document

The following steps demonstrate how to optimize a PDF document by reducing its file size while maintaining quality.

  1. Create an instance of the Optimizer class.
  2. Create an OptimizeOptions object to configure optimization settings.
  3. Add input PDF file(s) and set an output location for the optimized file.
  4. Run the Process method to execute the optimization.

// Instantiate the Optimizer plugin
var optimizer = new Optimizer();

// Configure the optimization options
var optimizeOptions = new OptimizeOptions();
optimizeOptions.AddInput(new FileDataSource("input.pdf"));
optimizeOptions.AddOutput(new FileDataSource("output.pdf"));

// Run the optimization
optimizer.Process(optimizeOptions);

Resize PDF Document

To resize a PDF document, the ResizeOptions class is used to specify the new page size for the document.

  1. Instantiate the Optimizer class.
  2. Create a ResizeOptions object to define the page size.
  3. Add the input file and set the desired output location.
  4. Use the SetPageSize method to specify the new size (e.g., A4).
  5. Call the Process method to apply the changes.

// Instantiate the Optimizer plugin
var optimizer = new Optimizer();

// Configure the resize options
var resizeOptions = new ResizeOptions();
resizeOptions.AddInput(new FileDataSource("input.pdf"));
resizeOptions.SetPageSize(PageSize.A4); // New page size
resizeOptions.AddOutput(new FileDataSource("output.pdf"));

// Apply the new page size
optimizer.Process(resizeOptions);

Rotate PDF Pages

Use the RotateOptions class to adjust the orientation of pages in a PDF file.

  1. Instantiate the Optimizer class.
  2. Create a RotateOptions object.
  3. Add the input PDF file and specify the output file location.
  4. Set the rotation angle (e.g., 90 degrees) using the SetRotation method.
  5. Execute the rotation with the Process method.

// Instantiate the Optimizer plugin
var optimizer = new Optimizer();

// Configure the rotation options
var rotateOptions = new RotateOptions();
rotateOptions.AddInput(new FileDataSource("input.pdf"));
rotateOptions.SetRotation(90); // Rotation angle in degrees
rotateOptions.AddOutput(new FileDataSource("output.pdf"));

// Apply the rotation
optimizer.Process(rotateOptions);

Crop PDF Document

Cropping removes unwanted content or margins from a PDF document. The CropOptions class can be used to define the crop area.

  1. Create an instance of the Optimizer class.
  2. Create a CropOptions object.
  3. Add the input file and specify the output file location.
  4. Use the SetCropBox method to define the crop area.
  5. Execute the cropping with the Process method.

// Instantiate the Optimizer plugin
var optimizer = new Optimizer();

// Configure the crop options
var cropOptions = new CropOptions();
cropOptions.AddInput(new FileDataSource("input.pdf"));
cropOptions.SetCropBox(new Rectangle(50, 50, 500, 700)); // Defines the crop area
cropOptions.AddOutput(new FileDataSource("output.pdf"));

// Apply the crop
optimizer.Process(cropOptions);

PDF Table Generator

The Documentize Table Generator for .NET is a versatile plugin designed to streamline the integration of tables into PDF documents. Whether you’re organizing data, designing forms, or improving document readability, this plugin simplifies the process while maintaining precision and efficiency. Its intuitive API supports both single document and batch processing workflows, making it an essential tool for developers working with structured data.

Key Features:

  • Dynamic Table Creation: Effortlessly generate structured tables in PDF documents.
  • Rich Content Support: Populate tables with text, HTML, images, and LaTeX content.
  • Page Placement: Insert tables at specific locations within a PDF with precision.
  • Customizable Layout: Adjust table structure, cell alignment, and styling.
  • Batch Processing: Process multiple documents simultaneously for maximum efficiency.

Creating a PDF with Tables

Follow these steps to create structured tables in a PDF using the TableGenerator class:

  1. Instantiate the TableGenerator class.
  2. Configure the TableOptions object to define table structure, content, and input/output files.
  3. Add tables, rows, and cells to your PDF.
  4. Finalize the table generation process using the Process method.

Here’s an example:

var generator = new TableGenerator();
var options = new TableOptions();

// Specify input and output PDF files
options.AddInput(new FileDataSource("input.pdf"));
options.AddOutput(new FileDataSource("output.pdf"));

// Define a table with rows and cells
options
    .InsertPageAfter(1) // Add table after the first page
    .AddTable()
        .AddRow()
            .AddCell().AddParagraph(new TextFragment("Cell 1"))
            .AddCell().AddParagraph(new TextFragment("Cell 2"))
            .AddCell().AddParagraph(new TextFragment("Cell 3"));

// Generate the table in the document
generator.Process(options);

Adding Rich Content to Tables

Tables in PDF documents can include a variety of content types to enhance their functionality and appearance. Below is an example of adding HTML content to table cells:

options
    .AddTable()
        .AddRow()
            .AddCell().AddParagraph(new HtmlFragment("<h1>Header 1</h1>"))
            .AddCell().AddParagraph(new HtmlFragment("<h2>Header 2</h2>"))
            .AddCell().AddParagraph(new HtmlFragment("<h3>Header 3</h3>"));

Supported Content Types in Tables

The PDF Table Generator supports various content types, enabling developers to customize tables for a wide range of use cases:

  • HtmlFragment: Add HTML-based content, such as headers, lists, and formatted text.
  • TeXFragment: Include LaTeX-based content for mathematical equations and scientific notation.
  • TextFragment: Insert plain or formatted text.
  • Image: Embed images directly into table cells.

Customizing Table Layout and Structure

The plugin provides flexibility for adjusting table structure, including row height, column width, and cell alignment. These customization options allow you to design tables that match your document’s layout and styling needs.

Processing the Table Generation

After adding all content and customizing the table structure, finalize the process by calling the Process method. This method generates the tables and updates the PDF document. Here’s how to handle the results:

var resultContainer = generator.Process(options);

// Output the number of generated results
Console.WriteLine("Number of results: " + resultContainer.ResultCollection.Count);

Use Cases for the PDF Table Generator

  1. Data Reporting: Present analytics, financial reports, or survey results in a clear and organized format.
  2. Form Design: Create interactive forms with structured table layouts.
  3. Document Enhancement: Improve the readability and usability of user manuals, guides, or instructional materials.
  4. Batch Processing: Automate table generation for multiple PDF documents.

PDF Merger

The Documentize PDF Merger for .NET is a versatile tool designed to merge multiple PDF documents into a single file. It simplifies the consolidation of PDF files, merging documents efficiently while keeping content consistent. The plugin also handles internal resources such as fonts and images to optimize the merged document.

Key Features:

  • Merge Multiple PDFs: Easily combine multiple PDF files into one.
  • Resource Optimization: Removes duplicate fonts and images during merging.
  • Batch Processing: Merge large batches of PDF documents in one go.
  • Secure Merging: Ensure document integrity without data loss or content corruption.

How to Merge PDF Documents

To merge multiple PDF documents into a single file, follow these steps:

  1. Create an instance of the Merger class.
  2. Create an instance of MergeOptions to configure the merging process.
  3. Add input PDF files using the AddInput method.
  4. Set the output file path using AddOutput.
  5. Execute the merge using the Process method.

var merger = new Merger();
var mergeOptions = new MergeOptions();

// Add input PDF files to merge
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file3.pdf"));

// Specify the output file path
mergeOptions.AddOutput(new FileDataSource(@"C:\Samples\mergedOutput.pdf"));

// Merge the PDFs
merger.Process(mergeOptions);

How to Merge PDFs with Page Range

You can also merge specific page ranges from input PDF files using the MergeOptions class. This allows you to combine selected pages into the final output document.

  1. Create an instance of the Merger class.
  2. Configure page ranges using MergeOptions.
  3. Add the input files with specified page ranges.
  4. Set the output path.
  5. Call the Process method.

var merger = new Merger();
var mergeOptions = new MergeOptions();

// Merge specific pages from input PDFs
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"), new PageRange(1, 3));
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"), new PageRange(2, 5));

// Specify the output file path
mergeOptions.AddOutput(new FileDataSource(@"C:\Samples\outputWithSpecificPages.pdf"));

// Merge the PDFs
merger.Process(mergeOptions);

How to Handle Batch Merging

The PDF Merger plugin is optimized for large batches of PDF documents: you can add hundreds of PDFs to a single MergeOptions object and merge them in one operation.

  1. Instantiate the Merger class.
  2. Add all input PDF files to the MergeOptions class.
  3. Specify the output path.
  4. Call the Process method to merge all files in the batch.

var merger = new Merger();
var mergeOptions = new MergeOptions();

// Add a large batch of PDFs for merging
for (int i = 1; i <= 100; i++)
{
    mergeOptions.AddInput(new FileDataSource($@"C:\Samples\file{i}.pdf"));
}

// Specify the output file path
mergeOptions.AddOutput(new FileDataSource(@"C:\Samples\batchMergedOutput.pdf"));

// Process the batch merging
merger.Process(mergeOptions);
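
The loop above assumes that file1.pdf through file100.pdf all exist. As a minimal sketch, the helper below instead gathers whatever PDF files are actually present in a folder, using only standard .NET file APIs. BatchInputs and CollectPdfPaths are hypothetical names introduced for illustration and are not part of Documentize.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper (not part of Documentize): lists the PDF files in a folder.
public static class BatchInputs
{
    // Returns the full paths of all *.pdf files directly inside 'directory',
    // sorted by file name so the merge order is deterministic.
    public static List<string> CollectPdfPaths(string directory)
    {
        return Directory
            .EnumerateFiles(directory)
            // Compare the extension case-insensitively so "A.PDF" is also picked up.
            .Where(path => string.Equals(
                Path.GetExtension(path), ".pdf", StringComparison.OrdinalIgnoreCase))
            .OrderBy(path => Path.GetFileName(path), StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Each returned path can then be passed to mergeOptions.AddInput(new FileDataSource(path)) as in the sample above. The sort is alphabetical, so file10.pdf comes before file2.pdf; zero-pad the file names or sort differently if numeric order matters.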

PDF Signature

The Documentize PDF Signature for .NET plugin allows users to digitally sign PDF documents. It offers a streamlined process for adding signatures, ensuring authenticity, and securing PDF content. The plugin supports both visible and invisible signatures and provides options to customize the signature’s position, reason, contact information, and more.

Key Features:

  • Digitally Sign PDF Documents: Secure your documents with visible or invisible digital signatures.
  • PFX Support: Sign PDF files using a PFX certificate.
  • Customizable Options: Configure signature settings like reason, location, and contact details.
  • Visible and Invisible Signatures: Choose whether the signature is visible on the document.

How to Sign PDF Documents

To sign a PDF document using a PFX file, follow these steps:

  1. Create an instance of the Signature class.
  2. Instantiate the SignOptions class with the PFX file path and password.
  3. Add the input PDF and the output file to the options.
  4. Run the Process method to apply the signature.

var signature = new Signature();
var signOptions = new SignOptions(@"C:\certificates\myCertificate.pfx", "pfxPassword");

// Add the input PDF and specify the output file
signOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
signOptions.AddOutput(new FileDataSource(@"C:\Samples\signedOutput.pdf"));

// Configure signature options
signOptions.Reason = "Contract Agreement";
signOptions.Contact = "johndoe@example.com";
signOptions.Location = "New York";
signOptions.PageNumber = 1;
signOptions.Visible = true;
signOptions.Rectangle = new Rectangle(100, 100, 200, 150);

// Apply the signature to the document
signature.Process(signOptions);

How to Use Stream for PFX File

You can also sign a PDF using a PFX certificate provided as a stream instead of a file path. This allows more flexible handling of certificate storage.

  1. Create an instance of the Signature class.
  2. Instantiate SignOptions with a stream containing the PFX and the password.
  3. Add the input and output files.
  4. Run the Process method to apply the signature.

using System.IO;

// Open the PFX certificate as a stream; it is disposed when it goes out of scope
using var pfxStream = File.OpenRead(@"C:\certificates\myCertificate.pfx");
var signature = new Signature();
var signOptions = new SignOptions(pfxStream, "pfxPassword");

// Add input and output files
signOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
signOptions.AddOutput(new FileDataSource(@"C:\Samples\signedOutput.pdf"));

// Apply signature
signature.Process(signOptions);

How to Apply Invisible Signatures

To add an invisible signature (one that secures the document without displaying the signature on the document), simply set the Visible property to false.

  1. Create an instance of the Signature class.
  2. Create an instance of SignOptions with the PFX file path and password.
  3. Set Visible to false.
  4. Add input and output files.
  5. Call Process to apply the invisible signature.

var signature = new Signature();
var signOptions = new SignOptions(@"C:\certificates\myCertificate.pfx", "pfxPassword");

// Configure invisible signature
signOptions.Visible = false;

// Add input and output files
signOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
signOptions.AddOutput(new FileDataSource(@"C:\Samples\invisiblySigned.pdf"));

// Process signature
signature.Process(signOptions);

PDF Splitter

The Documentize PDF Splitter for .NET is a powerful tool that simplifies the process of splitting large PDF documents into smaller, more manageable files. Whether you need to extract individual pages or divide a document into specific sections, this plugin allows you to achieve it efficiently and with minimal effort.

Key Features:

  • Split PDF by Page: Break down a PDF document into individual pages.
  • Batch Processing: Split large batches of PDFs in one go.
  • Custom Split Options: Configure the splitting process based on your requirements.
  • Organized Output: Easily manage the output files for each split page or section.

How to Split PDF Documents

To split a PDF document into individual pages, follow these steps:

  1. Create an instance of the Splitter class.
  2. Create an instance of SplitOptions to configure the splitting options.
  3. Add the input PDF file using the AddInput method.
  4. Add output files for each split page using the AddOutput method.
  5. Run the Process method to split the document.

var splitter = new Splitter();
var splitOptions = new SplitOptions();

// Add the input PDF file
splitOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Specify output files for each page
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_page_1.pdf"));
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_page_2.pdf"));
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_page_3.pdf"));

// Process the split operation
splitter.Process(splitOptions);
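
Listing one output file per page by hand becomes tedious for longer documents. The following sketch generates the output paths in a loop, following the output_page_N.pdf naming used above. It relies only on standard .NET APIs and assumes the page count is already known. SplitOutputs and BuildPageOutputPaths are hypothetical names, not part of Documentize.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of Documentize): builds one output path per page.
public static class SplitOutputs
{
    // Returns paths such as <outputDirectory>/output_page_1.pdf ... output_page_N.pdf.
    public static List<string> BuildPageOutputPaths(string outputDirectory, int pageCount)
    {
        if (pageCount < 1)
        {
            throw new ArgumentOutOfRangeException(
                nameof(pageCount), "pageCount must be at least 1.");
        }

        var paths = new List<string>(pageCount);
        for (int page = 1; page <= pageCount; page++)
        {
            paths.Add(Path.Combine(outputDirectory, $"output_page_{page}.pdf"));
        }
        return paths;
    }
}
```

Each generated path could then be registered with splitOptions.AddOutput(new FileDataSource(path)) before calling Process.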

Splitting PDF by Page Ranges

You can also split a PDF by specifying page ranges. This allows you to extract specific sections or multiple pages from a PDF into separate documents.

var splitter = new Splitter();
var splitOptions = new SplitOptions();

// Add the input PDF
splitOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Define output for page ranges (e.g., pages 1-3)
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_pages_1_to_3.pdf"));

// Process the split
splitter.Process(splitOptions);

How to Handle Batch Splitting

The PDF Splitter plugin is optimized to handle large batches of PDF documents. You can split hundreds of PDFs into individual pages or sections by leveraging batch processing.

var splitter = new Splitter();
var splitOptions = new SplitOptions();

// Add input PDF files in a batch
splitOptions.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
splitOptions.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Define the output for each file
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_file1_page1.pdf"));
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_file2_page1.pdf"));

// Process the batch split
splitter.Process(splitOptions);

PDF ChatGPT

The Documentize ChatGPT for .NET plugin is a powerful tool designed to integrate the ChatGPT API with PDF applications. This plugin allows developers to generate chat responses based on input messages and save the output in PDF format, making it suitable for creating conversational interfaces or analysis reports directly within PDF documents.

Key Features:

  • Chat Completions: Generate responses using the ChatGPT API based on custom input.
  • System & User Messages: Provide both system context and user input to create dynamic conversations.
  • PDF Output: Save generated chat completions in a structured PDF file for further use.
  • Asynchronous Processing: Ensure responsive applications by processing chat completions asynchronously.

Generate Chat Responses

To generate chat responses and save them to a PDF file using the ChatGPT plugin, follow these steps:

  1. Create an instance of the PdfChatGptRequestOptions class to configure the request options.
  2. Add input and output PDF files.
  3. Set the API key and specify parameters such as maximum token count and the query for the ChatGPT model.
  4. Run the ProcessAsync method to generate the chat completion.

var options = new PdfChatGptRequestOptions();
options.ApiKey = "sk-******";  // Set your API key
options.MaxTokens = 1000;  // Set the maximum number of tokens
options.Query = "Analyze this text for key themes.";

// Add the input PDF file
options.AddInput(new FileDataSource("input.pdf"));

// Specify where to save the output PDF with chat responses
options.AddOutput(new FileDataSource("output.pdf"));

// Create an instance of the PdfChatGpt plugin
var plugin = new PdfChatGpt();

// Run the process asynchronously
var result = await plugin.ProcessAsync(options);
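
The sample above writes the API key directly into the source. A common alternative, sketched here with standard .NET APIs only, is to read the key from an environment variable and fail early if it is missing. ApiKeyProvider and the variable name OPENAI_API_KEY are assumptions made for illustration. Documentize does not prescribe either.

```csharp
using System;

// Hypothetical helper (not part of Documentize): reads a required secret
// from the process environment.
public static class ApiKeyProvider
{
    // Returns the value of the named environment variable, or throws
    // if the variable is unset or blank.
    public static string GetRequired(string variableName)
    {
        var value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }
        return value;
    }
}
```

With this in place, the assignment could become options.ApiKey = ApiKeyProvider.GetRequired("OPENAI_API_KEY"); which keeps the key out of source control.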

Adding System and User Messages

To create a more interactive conversation, you can add both system and user messages. These messages help shape the conversation context.

  1. Add a system message that sets the context for ChatGPT.
  2. Add a user message that represents the user’s input for the conversation.

var options = new PdfChatGptRequestOptions();
options.ApiKey = "sk-******";  // Set your API key

// Add system message for context
options.AddSystemMessage("You are an AI trained to summarize text.");

// Add user message to query the ChatGPT model
options.AddUserMessage("Please summarize the attached document.");

// Add input and output PDFs
options.AddInput(new FileDataSource("input.pdf"));
options.AddOutput(new FileDataSource("output.pdf"));

// Process the request asynchronously
var plugin = new PdfChatGpt();
var result = await plugin.ProcessAsync(options);

PDF Text Extractor

The Documentize PDF Text Extractor for .NET simplifies extracting text from PDF documents. Whether you need pure, raw, or plain text, this plugin allows you to extract text efficiently while preserving formatting or omitting it based on your needs.

Key Features:

  • Pure Mode: Extract text while preserving its original formatting.
  • Raw Mode: Extract text without any formatting.
  • Plain Mode: Extract text without special characters or formatting.
  • Batch Processing: Extract text from multiple PDFs at once.

How to Extract Text from PDF Documents

To extract text from a PDF document, follow these steps:

  1. Create an instance of the TextExtractor class.
  2. Create an instance of TextExtractorOptions to configure the extraction options.
  3. Add the input PDF file using the AddInput method.
  4. Run the Process method to extract the text.
  5. Access the extracted text using the ResultContainer.ResultCollection.

using var extractor = new TextExtractor();
var textExtractorOptions = new TextExtractorOptions();

// Add the input PDF
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Process the text extraction
var resultContainer = extractor.Process(textExtractorOptions);

// Print the extracted text
var extractedText = resultContainer.ResultCollection[0];
Console.WriteLine(extractedText);

Extracting Text from Multiple PDFs

The plugin allows you to extract text from multiple PDFs simultaneously, ensuring quick and efficient processing.

using var extractor = new TextExtractor();
var textExtractorOptions = new TextExtractorOptions();

// Add multiple input PDFs
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\input1.pdf"));
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\input2.pdf"));

// Process the extraction
var resultContainer = extractor.Process(textExtractorOptions);

// Output the extracted text
foreach (var result in resultContainer.ResultCollection)
{
    Console.WriteLine(result);
}

Text Extraction Modes

The TextExtractor plugin offers three extraction modes, providing flexibility based on your needs.

  1. Pure Mode: Preserves the original formatting, including spaces and alignment.
  2. Raw Mode: Extracts the text without formatting, useful for raw data processing.
  3. Plain Mode: Extracts text without special characters or additional formatting.

using var extractor = new TextExtractor();
var textExtractorOptions = new TextExtractorOptions();

// Set to Pure mode
textExtractorOptions.Mode = ExtractionMode.Pure;
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Process and output
var resultContainer = extractor.Process(textExtractorOptions);
Console.WriteLine(resultContainer.ResultCollection[0]);

How to Handle Batch Processing

For large document sets, you can leverage batch processing, enabling you to extract text from multiple PDFs at once.

using var extractor = new TextExtractor();
var textExtractorOptions = new TextExtractorOptions();

// Add multiple input PDFs
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\batch1.pdf"));
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\batch2.pdf"));

// Process the extraction for all inputs
var resultContainer = extractor.Process(textExtractorOptions);

// Handle extracted text
foreach (var result in resultContainer.ResultCollection)
{
    Console.WriteLine(result);
}

PDF Timestamp Adder

The Documentize PDF Timestamp Adder for .NET is a powerful tool designed to add secure timestamps to your PDF documents. It enhances the integrity and authenticity of your documents by providing a trusted time reference, ensuring compliance with digital signature standards.

Key Features:

  • Add Secure Timestamps: Effortlessly add secure timestamps to your PDF documents.
  • Customizable Timestamp Servers: Use custom timestamp server URLs and authentication credentials.
  • Automation: Integrate timestamping into your .NET applications for automated workflows.
  • Compliance: Ensure your documents meet industry standards for digital signatures and timestamps.

How to Add a Timestamp to PDF Documents

To add a secure timestamp to a PDF document, follow these steps:

  1. Create an instance of the Timestamp class.
  2. Create an instance of AddTimestampOptions to configure the timestamping process.
  3. Add the input PDF file using the AddInput method.
  4. Set the output file path using AddOutput.
  5. Execute the timestamping using the Process method.

// Instantiate the Timestamp plugin
var plugin = new Timestamp();

// Configure the timestamping options
var opt = new AddTimestampOptions("path_to_pfx", "password_for_pfx", "timestamp_server_url");

// Add input PDF file
opt.AddInput(new FileDataSource("path_to_pdf"));

// Specify the output PDF file
opt.AddOutput(new FileDataSource("path_to_result_pdf"));

// Perform the timestamping process
plugin.Process(opt);

How to Use Custom Authentication with Timestamp Server

You can provide basic authentication credentials when connecting to the timestamp server. This allows you to authenticate with servers that require a username and password.

  1. Create an instance of the Timestamp class.
  2. Create an instance of AddTimestampOptions, including the serverBasicAuthCredentials.
  3. Add the input file and output file paths.
  4. Call the Process method.

// Instantiate the Timestamp plugin
var plugin = new Timestamp();

// Configure the timestamping options with authentication
var opt = new AddTimestampOptions("path_to_pfx", "password_for_pfx", "timestamp_server_url", "username:password");

// Add input PDF file
opt.AddInput(new FileDataSource("path_to_pdf"));

// Specify the output PDF file
opt.AddOutput(new FileDataSource("path_to_result_pdf"));

// Perform the timestamping process
plugin.Process(opt);

Handling PFX Files and Passwords

The AddTimestampOptions class allows you to use a PFX file for digital signing along with the password.

  • PFX Stream or File Path: You can provide a stream or file path to the PFX file.
  • Password Protection: Ensure you securely manage the password for the PFX file.

PDF/A Converter

The Documentize PDF/A Converter for .NET is a powerful tool designed to convert PDF documents into the PDF/A format, ensuring that your content remains compliant with long-term archiving standards. This plugin also supports validating existing PDF documents for PDF/A compliance, offering both conversion and validation features in a single solution.

Key Features:

  • Convert to PDF/A: Seamlessly transform PDF files into the PDF/A format (such as PDF/A-1a, PDF/A-2b, PDF/A-3b) to ensure compliance with archiving standards.
  • Validate PDF/A Compliance: Check existing PDF documents for conformance with PDF/A standards and identify issues if they do not comply.
  • Batch Processing: Process multiple files at once for conversion or validation.
  • Efficient Workflow: Minimize time and effort with fast and reliable conversion processes.

How to Convert PDF to PDF/A

To convert a PDF document into PDF/A format, follow these steps:

  1. Create an instance of the PdfAConverter class.
  2. Create an instance of PdfAConvertOptions to configure the conversion.
  3. Specify the desired PDF/A version (e.g., PDF/A-3b).
  4. Add the input PDF file using the AddInput method.
  5. Add the output file for the resulting PDF/A using the AddOutput method.
  6. Call the Process method to execute the conversion.

var pdfAConverter = new PdfAConverter();
var pdfAOptions = new PdfAConvertOptions
{
    PdfAVersion = PdfAStandardVersion.PDF_A_3B
};

// Add the input PDF file
pdfAOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Specify the output PDF/A file
pdfAOptions.AddOutput(new FileDataSource(@"C:\Samples\output_pdfa.pdf"));

// Process the conversion
pdfAConverter.Process(pdfAOptions);

Validating PDF/A Compliance

You can validate existing PDF files for PDF/A compliance using the PdfAValidateOptions class.

var pdfAConverter = new PdfAConverter();
var validationOptions = new PdfAValidateOptions
{
    PdfAVersion = PdfAStandardVersion.PDF_A_1A
};

// Add the PDF file to be validated
validationOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Run the validation process
var resultContainer = pdfAConverter.Process(validationOptions);

// Check the validation result
var validationResult = (PdfAValidationResult)resultContainer.ResultCollection[0].Data;
Console.WriteLine("PDF/A Validation Passed: " + validationResult.IsValid);

Batch Processing for PDF/A Conversion

This plugin supports batch processing, allowing you to convert or validate multiple PDF files for PDF/A compliance at once.

var pdfAConverter = new PdfAConverter();
var pdfAOptions = new PdfAConvertOptions
{
    PdfAVersion = PdfAStandardVersion.PDF_A_3B
};

// Add multiple input PDFs
pdfAOptions.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
pdfAOptions.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Specify output files for the converted PDF/As
pdfAOptions.AddOutput(new FileDataSource(@"C:\Samples\file1_pdfa.pdf"));
pdfAOptions.AddOutput(new FileDataSource(@"C:\Samples\file2_pdfa.pdf"));

// Process the batch conversion
pdfAConverter.Process(pdfAOptions);
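
For larger batches, the input and output paths can be generated rather than typed by hand. The following is a minimal sketch that uses only the .NET base library so it runs without Documentize; the helper BuildPdfAPairs is hypothetical, and the "_pdfa" suffix simply mirrors the naming in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: pair each input PDF with an output path that carries
// a "_pdfa" suffix (file1.pdf -> file1_pdfa.pdf), in the same directory.
static List<(string Input, string Output)> BuildPdfAPairs(IEnumerable<string> inputPaths)
{
    var result = new List<(string Input, string Output)>();
    foreach (string inputPath in inputPaths)
    {
        string directory = Path.GetDirectoryName(inputPath) ?? string.Empty;
        string name = Path.GetFileNameWithoutExtension(inputPath);
        result.Add((inputPath, Path.Combine(directory, name + "_pdfa.pdf")));
    }
    return result;
}

var pdfPairs = BuildPdfAPairs(new[] { "file1.pdf", "file2.pdf" });
foreach (var (input, output) in pdfPairs)
{
    // With Documentize referenced, each pair would feed the calls shown above:
    // pdfAOptions.AddInput(new FileDataSource(input));
    // pdfAOptions.AddOutput(new FileDataSource(output));
    Console.WriteLine($"{input} -> {output}");
}
```

The list of inputs could equally come from Directory.EnumerateFiles with a "*.pdf" pattern.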

PDF to XLS Converter

The Documentize PDF to XLS Converter for .NET is a versatile and powerful tool for converting PDF documents into Excel spreadsheets (XLS/XLSX). By leveraging this plugin, developers can seamlessly transform static PDF data into dynamic and editable spreadsheets, simplifying data manipulation, analysis, and sharing.

Key Features:

  • Flexible Conversion Options: Convert PDF files into XLSX, XLS, CSV, or other formats.
  • Content Preservation: Maintain the original structure, layout, and formatting.
  • Customizable Output: Configure page ranges, worksheet names, and output formats.
  • Batch Processing: Handle multiple PDF files simultaneously for high efficiency.
  • Advanced Formatting: Insert blank columns or minimize the number of worksheets.

How to Convert PDF to Excel

To convert a PDF document into an Excel file (XLS/XLSX), follow these steps:

  1. Create an instance of the XlsConverter class.
  2. Configure the conversion settings using the PdfToXlsOptions class.
  3. Add input PDF files using the AddInput method.
  4. Specify the output file path using the AddOutput method.
  5. Execute the Process method to initiate the conversion.

var converter = new XlsConverter();
var options = new PdfToXlsOptions();

// Add input and output file paths
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output.xlsx"));

// Perform the conversion
converter.Process(options);

Customizing the PDF to Excel Conversion

The PdfToXlsOptions class allows you to customize the conversion process. For example, to convert the PDF to an XLSX file, set a worksheet name, and enable advanced formatting options:

// Create the converter used to run the conversion
var converter = new XlsConverter();

var options = new PdfToXlsOptions
{
    Format = PdfToXlsOptions.ExcelFormat.XLSX,    // Specify XLSX format
    WorksheetName = "MySheet",                    // Name the worksheet
    InsertBlankColumnAtFirst = true               // Insert a blank column at the start
};

// Add input and output files
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output.xlsx"));

// Process the conversion
converter.Process(options);

Batch Processing PDF to XLS Conversion

With batch processing, you can convert multiple PDF files into Excel spreadsheets in one go. Here’s an example:

var converter = new XlsConverter();
var options = new PdfToXlsOptions();

// Add multiple input files
options.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
options.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Specify output file paths
options.AddOutput(new FileDataSource(@"C:\Samples\output1.xlsx"));
options.AddOutput(new FileDataSource(@"C:\Samples\output2.xlsx"));

// Perform the batch conversion
converter.Process(options);

Handling Conversion Results

After the conversion process, the Process method returns a ResultContainer object that contains the details of the operation. Here’s how to retrieve the converted file path:

var resultContainer = converter.Process(options);

// Access the output file path
var result = resultContainer.ResultCollection[0];
Console.WriteLine("Converted file path: " + result.Data.ToString());

Supported Output Formats

The PdfToXlsOptions.ExcelFormat enum provides a range of output formats:

  • XLSX: Office Open XML (.xlsx) File Format (default).
  • XLSM: Macro-enabled Excel format.
  • CSV: Comma-separated values.
  • ODS: Open Document Spreadsheet.
  • XMLSpreadSheet2003: Excel 2003 XML format.
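
When the output format is chosen at run time, the output file name should carry a matching extension. The sketch below maps the format names listed above to the extensions conventionally used for them. It keys on plain strings rather than the enum so that it runs without Documentize, and the helper OutputNameFor is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Extensions conventionally used for each format named in the list above.
// Keys are plain strings here so the sketch runs without Documentize.
var extensionByFormat = new Dictionary<string, string>
{
    ["XLSX"] = ".xlsx",
    ["XLSM"] = ".xlsm",
    ["CSV"] = ".csv",
    ["ODS"] = ".ods",
    ["XMLSpreadSheet2003"] = ".xml",
};

// Hypothetical helper: build an output file name for the chosen format.
string OutputNameFor(string baseName, string format) =>
    baseName + extensionByFormat[format];

Console.WriteLine(OutputNameFor("report", "CSV"));
```

The resulting name can then be passed to AddOutput alongside the corresponding Format setting.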

PDF to DOC Converter

The Documentize PDF to DOC Converter for .NET is a powerful tool designed to convert PDF documents into DOC or DOCX formats. This plugin seamlessly transforms PDF pages into editable Microsoft Word documents, making it easy to reuse, edit, and share content across multiple platforms.

Key Features:

  • DOC/DOCX Conversion: Convert PDF documents to editable Microsoft Word formats (DOC or DOCX).
  • Maintain Formatting: Retain the original layout, text, and formatting during the conversion process.
  • Batch Processing: Convert multiple PDF files at once.
  • Custom Conversion Options: Fine-tune the conversion process with different modes, like Enhanced Flow, for better layout.

How to Convert PDF to DOC/DOCX

To convert a PDF document to DOC/DOCX format, follow these steps:

  1. Create an instance of the DocConverter class.
  2. Create an instance of DocConversionOptions to configure the conversion process.
  3. Add the input PDF file using the AddInput method.
  4. Add the output file path for the resulting DOC/DOCX file using the AddOutput method.
  5. Run the Process method to execute the conversion.

var docConverter = new DocConverter();
var options = new DocConversionOptions()
{
    SaveFormat = SaveFormat.DocX,       // Output format as DOCX
    ConversionMode = ConversionMode.EnhancedFlow // Optimize layout and formatting
};

// Add the input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Add the output Word document path
options.AddOutput(new FileDataSource(@"C:\Samples\output.docx"));

// Process the conversion
docConverter.Process(options);

Converting PDF to DOC with Custom Options

The PDF to DOC Converter plugin provides several options to customize your conversion process. You can choose between different modes to control how the layout and structure of the PDF are handled during conversion.

var docConverter = new DocConverter();
var options = new DocConversionOptions()
{
    SaveFormat = SaveFormat.Doc,        // Output format as DOC
    ConversionMode = ConversionMode.Precise // Maintain original PDF layout as closely as possible
};

// Add the input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Add the output Word document path
options.AddOutput(new FileDataSource(@"C:\Samples\output.doc"));

// Process the conversion
docConverter.Process(options);

Batch Processing PDF to DOC/DOCX Conversion

The PDF to DOC Converter supports batch processing, allowing you to convert multiple PDF files at once. Here’s an example of batch conversion:

var docConverter = new DocConverter();
var options = new DocConversionOptions()
{
    SaveFormat = SaveFormat.DocX
};

// Add multiple input PDF files
options.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
options.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Add output file paths for the resulting DOCX files
options.AddOutput(new FileDataSource(@"C:\Samples\output_file1.docx"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_file2.docx"));

// Process the batch conversion
docConverter.Process(options);

PDF to JPEG Converter

The Documentize PDF to JPEG Converter for .NET is a powerful tool that simplifies the conversion of PDF documents into high-quality JPEG images. This plugin is designed to make your content more accessible across platforms by transforming PDF pages into widely-used image formats.

Key Features:

  • Convert PDF to JPEG: Effortlessly convert entire PDF documents or specific pages into JPEG images.
  • Custom Resolution: Adjust the resolution (e.g., 300 dpi) for high-quality outputs.
  • Page Range: Select specific pages or ranges for conversion.
  • Batch Processing: Convert multiple PDF pages or entire documents at once.
  • Quick Conversion: Fast and efficient process with minimal effort.

How to Convert PDF Pages to JPEG

To convert a PDF document into JPEG images, follow these steps:

  1. Create an instance of the Jpeg class.
  2. Create an instance of JpegOptions to configure the conversion process.
  3. Add the input PDF file using the AddInput method.
  4. Specify the output location for the JPEG images using the AddOutput method.
  5. Run the Process method to convert the PDF pages into JPEG images.

var converter = new Jpeg();
var options = new JpegOptions();

// Add the input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Specify output directory for JPEG images
options.AddOutput(new FileDataSource(@"C:\Samples\images"));

// Process the PDF to JPEG conversion
converter.Process(options);

Customizing PDF to JPEG Conversion

You can customize the conversion process by adjusting resolution, selecting page ranges, or setting image quality. Here’s how to convert the first page of a PDF at 300 dpi:

var converter = new Jpeg();
var options = new JpegOptions();

// Set output resolution to 300 dpi and convert only the first page
options.OutputResolution = 300;
options.PageRange = new PageRange(1);

// Add input and output paths
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_1.jpg"));

// Process the conversion
converter.Process(options);
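
To judge what a given resolution means for image size, it helps to work out the pixel dimensions in advance. PDF page sizes are measured in points, with 72 points per inch by default, so the expected pixel size is the size in points divided by 72 and multiplied by the dpi. The sketch below applies that arithmetic; the helper PixelSize is hypothetical, and whether Documentize rounds in exactly this way is an assumption.

```csharp
using System;

// Pixel size = size in points / 72 * dpi
// (PDF user space defaults to 72 units per inch).
static (int Width, int Height) PixelSize(double widthPt, double heightPt, int dpi)
{
    int width = (int)Math.Round(widthPt / 72.0 * dpi);
    int height = (int)Math.Round(heightPt / 72.0 * dpi);
    return (width, height);
}

// An A4 page is about 595 x 842 points.
var a4At300 = PixelSize(595, 842, 300);
Console.WriteLine($"{a4At300.Width} x {a4At300.Height} px");
```

At 300 dpi an A4 page therefore comes out at roughly 2479 x 3508 pixels, which is worth keeping in mind when converting long documents.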

Batch Processing for PDF to JPEG Conversion

The PDF to JPEG Converter plugin supports batch processing, allowing you to convert multiple pages from a PDF into individual JPEG files.

var converter = new Jpeg();
var options = new JpegOptions();

// Add input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Set output paths for each page
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_1.jpg"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_2.jpg"));

// Process the batch conversion
converter.Process(options);
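
For documents with many pages, the per-page output paths can be generated in a loop. This is a minimal sketch using only the .NET base library; the helper PageOutputPaths is hypothetical, and it assumes, as the example above suggests, that each AddOutput call receives one page.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: one output path per page, following the
// output_page_N naming used in the example above.
static List<string> PageOutputPaths(string directory, int pageCount, string extension)
{
    var paths = new List<string>();
    for (int page = 1; page <= pageCount; page++)
    {
        paths.Add(Path.Combine(directory, $"output_page_{page}{extension}"));
    }
    return paths;
}

foreach (string path in PageOutputPaths("images", 3, ".jpg"))
{
    // With Documentize referenced: options.AddOutput(new FileDataSource(path));
    Console.WriteLine(path);
}
```

The same helper works for other image formats by passing a different extension.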

How to Handle Conversion Results

The Process method returns a ResultContainer object that holds information about the conversion results. You can print the paths of the converted JPEG files as shown below:

ResultContainer resultContainer = converter.Process(options);

// Print the output paths of the JPEG images
foreach (FileResult result in resultContainer.ResultCollection)
{
    Console.WriteLine(result.Data.ToString());
}

PDF to PNG Converter

The Documentize PDF to PNG Converter for .NET is an advanced tool that allows you to convert PDF documents into high-quality PNG images. This plugin is designed to make your content more versatile, accessible, and easier to share by transforming PDF pages into widely supported image formats.

Key Features:

  • Convert PDF to PNG: Quickly and efficiently convert entire PDF documents or specific pages into PNG images.
  • Customizable Resolution: Set the desired DPI (e.g., 300 DPI) for high-quality image output.
  • Batch Processing: Convert multiple PDF pages or entire documents in one go.
  • Easy Output Management: Specify output directories for each converted PNG file.
  • Quick Conversion: Fast, efficient, and requires minimal effort to configure.

How to Convert PDF to PNG

To convert a PDF document into PNG images, follow these steps:

  1. Create an instance of the Png class.
  2. Create an instance of PngOptions to configure the conversion process.
  3. Add the input PDF file using the AddInput method.
  4. Specify the output directory for the PNG images using the AddOutput method.
  5. Run the Process method to convert the PDF pages into PNG images.

var converter = new Png();
var options = new PngOptions();

// Add the input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\sample.pdf"));

// Specify output directory for PNG images
options.AddOutput(new FileDataSource(@"C:\Samples\images"));

// Process the PDF to PNG conversion
converter.Process(options);

Customizing PDF to PNG Conversion

You can customize the conversion by adjusting the resolution and selecting specific pages. For example, to convert only the first page of a PDF at 300 DPI:

var converter = new Png();
var options = new PngOptions();

// Set output resolution to 300 DPI
options.OutputResolution = 300;

// Convert only the first page
options.PageRange = new PageRange(1);

// Add input and output paths
options.AddInput(new FileDataSource(@"C:\Samples\sample.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_1.png"));

// Process the conversion
converter.Process(options);

Batch Processing for PDF to PNG Conversion

The PDF to PNG Converter plugin also supports batch processing, allowing you to convert multiple pages or even entire PDF documents into individual PNG files.

var converter = new Png();
var options = new PngOptions();

// Add input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Set output paths for each page
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_1.png"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_2.png"));

// Process the batch conversion
converter.Process(options);

Handling Conversion Results

After processing the conversion, the Process method returns a ResultContainer object containing the conversion results. You can print the output paths of the PNG images as follows:

ResultContainer resultContainer = converter.Process(options);

// Print the output paths of the PNG images
foreach (FileResult result in resultContainer.ResultCollection)
{
    Console.WriteLine(result.Data.ToString());
}

PDF to TIFF Converter

The Documentize PDF to TIFF Converter for .NET is a powerful tool designed to convert PDF documents into high-quality TIFF images. This plugin ensures that your content is accessible across various platforms while maintaining excellent fidelity and versatility.

Key Features:

  • Convert PDF to TIFF: Effortlessly convert entire PDF documents or specific pages into TIFF images.
  • Custom Resolution: Adjust the resolution (e.g., 300 dpi) for superior quality outputs.
  • Multi-Page TIFF: Combine multiple PDF pages into a single multi-page TIFF file.
  • Page Range: Convert specific pages or ranges for precise results.
  • Batch Processing: Convert multiple PDF documents or pages at once.
  • Quick Conversion: Fast and efficient process with minimal effort.

How to Convert PDF Pages to TIFF

To convert a PDF document into TIFF images, follow these steps:

  1. Create an instance of the TiffConverter class.
  2. Create an instance of PdfToTiffOptions to configure the conversion process.
  3. Add the input PDF file using the AddInput method.
  4. Specify the output file path for the TIFF images using the AddOutput method.
  5. Run the Process method to convert the PDF pages into TIFF images.

var converter = new TiffConverter();
var options = new PdfToTiffOptions();

// Add the input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Specify output file for TIFF images
options.AddOutput(new FileDataSource(@"C:\Samples\output.tiff"));

// Process the PDF to TIFF conversion
converter.Process(options);

Customizing PDF to TIFF Conversion

You can customize the conversion process by adjusting resolution, enabling multi-page output, or selecting page ranges. Here’s how to convert the first page of a PDF at 300 dpi into a TIFF file:

using System.Collections.Generic; // Needed for List<int>

var converter = new TiffConverter();
var options = new PdfToTiffOptions();

// Set output resolution to 300 dpi and convert only the first page
options.OutputResolution = 300;
options.PageList = new List<int> { 1 };

// Add input and output paths
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_1.tiff"));

// Process the conversion
converter.Process(options);
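
When the pages to convert come from user input such as "1-3,5", the text has to be expanded into a list of page numbers first. The sketch below is one way to do that with only the .NET base library; the helper ExpandPages and the input syntax are hypothetical and are not part of Documentize.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: expand a specification such as "1-3,5"
// into the individual page numbers 1, 2, 3, 5.
static List<int> ExpandPages(string spec)
{
    var pages = new List<int>();
    foreach (string part in spec.Split(',', StringSplitOptions.RemoveEmptyEntries))
    {
        string[] bounds = part.Trim().Split('-');
        int first = int.Parse(bounds[0]);
        int last = bounds.Length > 1 ? int.Parse(bounds[1]) : first;
        for (int page = first; page <= last; page++)
        {
            pages.Add(page);
        }
    }
    return pages;
}

List<int> pageList = ExpandPages("1-3,5");
Console.WriteLine(string.Join(", ", pageList));
```

With Documentize referenced, the resulting list could be assigned to options.PageList in the same way as the single-page list in the example above.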

Multi-Page TIFF Creation

The PDF to TIFF Converter plugin supports multi-page TIFF generation, allowing you to combine multiple PDF pages into a single TIFF file for efficient archiving or printing.

var converter = new TiffConverter();
var options = new PdfToTiffOptions
{
    MultiPage = true // Enable multi-page TIFF output
};

// Add input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Specify output file for multi-page TIFF
options.AddOutput(new FileDataSource(@"C:\Samples\output.tiff"));

// Process the conversion
converter.Process(options);

Batch Processing for PDF to TIFF Conversion

The PDF to TIFF Converter plugin also supports batch processing, allowing you to convert multiple PDF pages or entire documents simultaneously into separate TIFF files.

var converter = new TiffConverter();
var options = new PdfToTiffOptions();

// Add input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Set output paths for individual pages
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_1.tiff"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_page_2.tiff"));

// Process the batch conversion
converter.Process(options);

How to Handle Conversion Results

The Process method returns a ResultContainer object that provides details about the conversion results. You can print the paths of the converted TIFF files as shown below:

ResultContainer resultContainer = converter.Process(options);

// Print the output paths of the TIFF images
foreach (FileResult result in resultContainer.ResultCollection)
{
    Console.WriteLine(result.Data.ToString());
}

HTML Converter

The Documentize HTML Converter for .NET provides robust capabilities for converting documents between PDF and HTML formats, ideal for web applications, archiving, and report generation. With multiple options for handling resources and layouts, the converter adapts to various project requirements.

Key Features:

PDF to HTML Conversion

Convert PDF files to HTML to make documents accessible for web-based viewing or integration into applications where HTML format is preferred.

HTML to PDF Conversion

Transform HTML content into high-quality PDFs, perfect for generating printable reports, archiving web content, or creating shareable document formats.


Detailed Guide

Converting PDF to HTML

To convert a PDF to HTML:

  1. Initialize the Converter: Create an instance of HtmlConverter.
  2. Set Conversion Options: Use PdfToHtmlOptions to customize output, choosing either embedded or external resources.
  3. Define Input and Output Paths: Set the paths for your input PDF and output HTML.
  4. Execute the Conversion: Call the Process method to convert the file.

Example: Convert PDF to HTML with Embedded Resources

// Step 1: Initialize the HTML Converter
var converter = new HtmlConverter();

// Step 2: Configure options for PDF to HTML conversion
var options = new PdfToHtmlOptions(PdfToHtmlOptions.SaveDataType.FileWithEmbeddedResources);

// Step 3: Set file paths
options.AddInput(new FileDataSource("input.pdf"));
options.AddOutput(new FileDataSource("output.html"));

// Step 4: Run the conversion
converter.Process(options);

Available Options for PDF to HTML Conversion

  • SaveDataType:

    • FileWithEmbeddedResources: Generates a single HTML file with all resources embedded.
    • FileWithExternalResources: Saves resources separately, ideal for large HTML files.
  • Output Customization:

    • BasePath: Set the base path for resources in the HTML document.
    • IsRenderToSinglePage: Optionally render all PDF content on a single HTML page.

Converting HTML to PDF

To convert an HTML document to a PDF, follow these steps:

  1. Initialize the Converter: Create an instance of the HtmlConverter.
  2. Configure PDF Options: Use HtmlToPdfOptions to define layout and media settings.
  3. Specify Paths: Set input HTML and output PDF file paths.
  4. Execute the Conversion: Run the Process method to complete the conversion.

Example: Convert HTML to PDF

// Step 1: Initialize the HTML Converter
var converter = new HtmlConverter();

// Step 2: Configure options for HTML to PDF conversion
var options = new HtmlToPdfOptions();

// Step 3: Set file paths
options.AddInput(new FileDataSource("input.html"));
options.AddOutput(new FileDataSource("output.pdf"));

// Step 4: Execute the conversion
converter.Process(options);

Additional Options for HTML to PDF Conversion

  • Media Type:

    • HtmlMediaType.Print: Ideal for generating PDFs suited for printing.
    • HtmlMediaType.Screen: Use when converting content designed for digital viewing.
  • Layout Adjustments:

    • PageLayoutOption: Adjusts how HTML content fits the PDF layout, like ScaleToPageWidth to ensure the content scales to the PDF width.
    • IsRenderToSinglePage: Enables rendering the entire HTML content on a single PDF page if needed for concise presentations.

This converter is versatile for a variety of applications, from generating PDF reports based on web content to converting archives of PDF documents for web-based accessibility. For more advanced configurations, refer to the full Documentize documentation.
