Subsections of Developer's Guide
PDF ChatGPT
The Documentize ChatGPT for .NET plugin is a powerful tool designed to integrate the ChatGPT API with PDF applications. This plugin allows developers to generate chat responses based on input messages and save the output in PDF format, making it suitable for creating conversational interfaces or analysis reports directly within PDF documents.
Key Features:
- Chat Completions: Generate responses using the ChatGPT API based on custom input.
- System & User Messages: Provide both system context and user input to create dynamic conversations.
- PDF Output: Save generated chat completions in a structured PDF file for further use.
- Asynchronous Processing: Ensure responsive applications by processing chat completions asynchronously.
Generate Chat Responses
To generate chat responses and save them to a PDF file using the ChatGPT plugin, follow these steps:
- Create an instance of the PdfChatGptRequestOptions class to configure the request options.
- Add input and output PDF files.
- Set the API key and specify parameters such as maximum token count and the query for the ChatGPT model.
- Run the ProcessAsync method to generate the chat completion.
var options = new PdfChatGptRequestOptions();
options.ApiKey = "sk-******"; // Set your API key
options.MaxTokens = 1000; // Set the maximum number of tokens
options.Query = "Analyze this text for key themes.";

// Add the input PDF file
options.AddInput(new FileDataSource("input.pdf"));

// Specify where to save the output PDF with chat responses
options.AddOutput(new FileDataSource("output.pdf"));

// Create an instance of the PdfChatGpt plugin
var plugin = new PdfChatGpt();

// Run the process asynchronously
var result = await plugin.ProcessAsync(options);
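The sample above places the API key directly in source code. A minimal sketch of one alternative is to read it from an environment variable at startup. The class name ApiKeyLoader and the variable name OPENAI_API_KEY are hypothetical choices for illustration and are not part of the plugin.

```csharp
using System;

public static class ApiKeyLoader
{
    // Reads the key from the named environment variable.
    // The default variable name is an assumption, not something the plugin requires.
    public static string Load(string variableName = "OPENAI_API_KEY")
    {
        var key = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(key))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }

        // Trim stray whitespace that often sneaks in when the value is pasted.
        return key.Trim();
    }
}
```

With this helper in place, the assignment in the sample could become options.ApiKey = ApiKeyLoader.Load(); so the key never appears in the repository.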
Adding System and User Messages
To create a more interactive conversation, you can add both system and user messages. These messages help shape the conversation context.
- Add a system message that sets the context for ChatGPT.
- Add a user message that represents the user’s input for the conversation.
var options = new PdfChatGptRequestOptions();
options.ApiKey = "sk-******"; // Set your API key

// Add system message for context
options.AddSystemMessage("You are an AI trained to summarize text.");

// Add user message to query the ChatGPT model
options.AddUserMessage("Please summarize the attached document.");

// Add input and output PDFs
options.AddInput(new FileDataSource("input.pdf"));
options.AddOutput(new FileDataSource("output.pdf"));

// Process the request asynchronously
var plugin = new PdfChatGpt();
var result = await plugin.ProcessAsync(options);
PDF Merger
The Documentize PDF Merger for .NET is a versatile tool designed to merge multiple PDF documents into a single file. It consolidates PDF files efficiently while keeping content consistent across the merged document. The plugin also handles internal resources such as fonts and images to optimize the result.
Key Features:
- Merge Multiple PDFs: Easily combine multiple PDF files into one.
- Resource Optimization: Removes duplicate fonts and images during merging.
- Batch Processing: Merge large batches of PDF documents in one go.
- Secure Merging: Ensure document integrity without data loss or content corruption.
How to Merge PDF Documents
To merge multiple PDF documents into a single file, follow these steps:
- Create an instance of the Merger class.
- Create an instance of MergeOptions to configure the merging process.
- Add input PDF files using the AddInput method.
- Set the output file path using AddOutput.
- Execute the merge using the Process method.
var merger = new Merger();
var mergeOptions = new MergeOptions();

// Add input PDF files to merge
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file3.pdf"));

// Specify the output file path
mergeOptions.AddOutput(new FileDataSource(@"C:\Samples\mergedOutput.pdf"));

// Merge the PDFs
merger.Process(mergeOptions);
How to Merge PDFs with Page Range
You can also merge specific page ranges from input PDF files using the MergeOptions class. This allows you to combine selected pages into the final output document.
- Create an instance of the Merger class.
- Configure page ranges using MergeOptions.
- Add the input files with specified page ranges.
- Set the output path.
- Call the Process method.
var merger = new Merger();
var mergeOptions = new MergeOptions();

// Merge specific pages from input PDFs
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"), new PageRange(1, 3));
mergeOptions.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"), new PageRange(2, 5));

// Specify the output file path
mergeOptions.AddOutput(new FileDataSource(@"C:\Samples\outputWithSpecificPages.pdf"));

// Merge the PDFs
merger.Process(mergeOptions);
How to Handle Batch Merging
The PDF Merger plugin is optimized for handling large batches of PDF documents. By leveraging the batch processing feature, you can merge hundreds of PDFs in a single operation, ensuring efficient and fast document management.
- Instantiate the Merger class.
- Add all input PDF files to the MergeOptions class.
- Specify the output path.
- Call the Process method to merge all files in the batch.
var merger = new Merger();
var mergeOptions = new MergeOptions();

// Add a large batch of PDFs for merging
for (int i = 1; i <= 100; i++)
{
    mergeOptions.AddInput(new FileDataSource($@"C:\Samples\file{i}.pdf"));
}

// Specify the output file path
mergeOptions.AddOutput(new FileDataSource(@"C:\Samples\batchMergedOutput.pdf"));

// Process the batch merging
merger.Process(mergeOptions);
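The loop above assumes that file1.pdf through file100.pdf all exist. As a hedged alternative, the sketch below collects whatever PDF files are actually present in a folder, using only standard .NET file APIs. The class and method names (PdfBatch, CollectInputs) are hypothetical and not part of the plugin.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PdfBatch
{
    // Returns the paths of all *.pdf files directly inside 'directory',
    // sorted by file name so the merge order is predictable.
    public static List<string> CollectInputs(string directory)
    {
        if (!Directory.Exists(directory))
        {
            throw new DirectoryNotFoundException(directory);
        }

        return Directory
            .EnumerateFiles(directory, "*.pdf", SearchOption.TopDirectoryOnly)
            .OrderBy(path => Path.GetFileName(path), StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Each returned path could then be passed to mergeOptions.AddInput(new FileDataSource(path)) in a foreach loop. Note that the sort is a plain string sort, so file10.pdf comes before file2.pdf; zero-padded names (file002.pdf) avoid that.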
PDF Optimizer
The Documentize PDF Optimizer is a comprehensive plugin that enhances PDF documents through advanced optimization techniques. It is designed to help reduce file sizes, rotate pages, crop content, and resize documents. These operations improve the quality and manageability of PDF files, making them easier to store, share, and view.
Key Features:
- Optimization: Reduce PDF file size without losing quality.
- Rotation: Adjust the orientation of PDF pages.
- Cropping: Remove unnecessary margins or content from the document.
- Resizing: Resize pages to specific dimensions (e.g., A4, Letter).
Optimize PDF Document
The following steps demonstrate how to optimize a PDF document by reducing its file size while maintaining quality.
- Create an instance of the Optimizer class.
- Create an OptimizeOptions object to configure optimization settings.
- Add input PDF file(s) and set an output location for the optimized file.
- Run the Process method to execute the optimization.
var optimizer = new Optimizer();
var optimizeOptions = new OptimizeOptions();
optimizeOptions.AddInput(new FileDataSource("input.pdf"));
optimizeOptions.AddOutput(new FileDataSource("output.pdf"));
optimizer.Process(optimizeOptions);
Resize PDF Document
To resize a PDF document, the ResizeOptions class is used to specify the new page size for the document.
- Instantiate the Optimizer class.
- Create a ResizeOptions object to define the page size.
- Add the input file and set the desired output location.
- Use the SetPageSize method to specify the new size (e.g., A4).
- Call the Process method to apply the changes.
var optimizer = new Optimizer();
var resizeOptions = new ResizeOptions();
resizeOptions.AddInput(new FileDataSource("input.pdf"));
resizeOptions.SetPageSize(PageSize.A4);
resizeOptions.AddOutput(new FileDataSource("output.pdf"));
optimizer.Process(resizeOptions);
Rotate PDF Pages
Use the RotateOptions class to adjust the orientation of pages in a PDF file.
- Instantiate the Optimizer class.
- Create a RotateOptions object and configure the rotation angle.
- Add the input PDF file and specify the output file location.
- Set the rotation angle (e.g., 90 degrees) using the SetRotation method.
- Execute the rotation with the Process method.
var optimizer = new Optimizer();
var rotateOptions = new RotateOptions();
rotateOptions.AddInput(new FileDataSource("input.pdf"));
rotateOptions.SetRotation(90);
rotateOptions.AddOutput(new FileDataSource("output.pdf"));
optimizer.Process(rotateOptions);
Crop PDF Document
Cropping removes unwanted content or margins from a PDF document. The CropOptions class can be used to define the crop area.
- Create an instance of the Optimizer class.
- Define the crop area with the CropOptions object.
- Add the input file and specify the output file location.
- Use the SetCropBox method to define the crop area.
- Execute the cropping with the Process method.
var optimizer = new Optimizer();
var cropOptions = new CropOptions();
cropOptions.AddInput(new FileDataSource("input.pdf"));
cropOptions.SetCropBox(new Rectangle(50, 50, 500, 700)); // Defines the crop area
cropOptions.AddOutput(new FileDataSource("output.pdf"));
optimizer.Process(cropOptions);
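The crop sample passes bare numbers to Rectangle without stating their unit. The default unit of PDF page coordinates is the point (1/72 inch), so the sketch below assumes, without confirmation from this guide, that the crop box is expressed in points, and shows how to derive such values from millimetres. The helper name PdfUnits is hypothetical, and the meaning and order of the four Rectangle arguments should be checked against the plugin's API reference.

```csharp
using System;

public static class PdfUnits
{
    private const double PointsPerInch = 72.0;
    private const double MillimetresPerInch = 25.4;

    // Converts a length in millimetres to PDF points (1 point = 1/72 inch).
    public static double MmToPoints(double millimetres) =>
        millimetres * PointsPerInch / MillimetresPerInch;
}
```

For orientation, an A4 page measures 210 x 297 mm, which this conversion maps to roughly 595 x 842 points.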
PDF Security
The Documentize PDF Security for .NET is a powerful tool designed to enhance the security of your PDF documents by providing encryption and decryption capabilities. It ensures that your sensitive information remains confidential and protected from unauthorized access.
Key Features:
- Encrypt PDF Documents: Secure your PDF files by adding user and owner passwords.
- Decrypt PDF Documents: Remove encryption from PDFs when needed.
- Set Permissions: Control permissions such as printing, copying, and modifying content.
- Automation: Integrate encryption and decryption into your .NET applications for automated workflows.
- Compliance: Ensure your documents meet industry standards for document security.
How to Encrypt a PDF Document
To encrypt a PDF document, follow these steps:
- Create an instance of the Security class.
- Create an instance of EncryptionOptions with the desired user and owner passwords.
- Add the input PDF file using the AddInput method.
- Set the output file path using AddOutput.
- Execute the encryption using the Process method.
// Instantiate the Security plugin
var plugin = new Security();

// Configure the encryption options
var opt = new EncryptionOptions("user_password", "owner_password");

// Add input PDF file
opt.AddInput(new FileDataSource("path_to_pdf"));

// Specify the output encrypted PDF file
opt.AddOutput(new FileDataSource("path_to_encrypted_pdf"));

// Perform the encryption process
plugin.Process(opt);
How to Decrypt a PDF Document
To decrypt a PDF document, follow these steps:
- Create an instance of the Security class.
- Create an instance of DecryptionOptions with the necessary password.
- Add the encrypted PDF file using the AddInput method.
- Set the output file path using AddOutput.
- Execute the decryption using the Process method.
// Instantiate the Security plugin
var plugin = new Security();

// Configure the decryption options
var opt = new DecryptionOptions("user_password");

// Add input encrypted PDF file
opt.AddInput(new FileDataSource("path_to_encrypted_pdf"));

// Specify the output decrypted PDF file
opt.AddOutput(new FileDataSource("path_to_decrypted_pdf"));

// Perform the decryption process
plugin.Process(opt);
Setting Permissions on PDF Documents
When encrypting a PDF, you can set various permissions to control how the document can be used.
- Printing: Allow or disallow printing of the document.
- Copying: Allow or disallow copying of content.
- Modifying: Allow or disallow modifications to the document.
To set permissions, configure the EncryptionOptions accordingly.
PDF Signature
The Documentize PDF Signature for .NET plugin allows users to digitally sign PDF documents. It offers a streamlined process for adding signatures, ensuring authenticity, and securing PDF content. The plugin supports both visible and invisible signatures and provides options to customize the signature’s position, reason, contact information, and more.
Key Features:
- Digitally Sign PDF Documents: Secure your documents with visible or invisible digital signatures.
- PFX Support: Sign PDF files using a PFX certificate.
- Customizable Options: Configure signature settings like reason, location, and contact details.
- Visible and Invisible Signatures: Choose whether the signature is visible on the document.
How to Sign PDF Documents
To sign a PDF document using a PFX file, follow these steps:
- Create an instance of the Signature class.
- Instantiate the SignOptions class with the PFX file path and password.
- Add the input PDF and the output file to the options.
- Run the Process method to apply the signature.
var signature = new Signature();
var signOptions = new SignOptions(@"C:\certificates\myCertificate.pfx", "pfxPassword");

// Add the input PDF and specify the output file
signOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
signOptions.AddOutput(new FileDataSource(@"C:\Samples\signedOutput.pdf"));

// Configure signature options
signOptions.Reason = "Contract Agreement";
signOptions.Contact = "johndoe@example.com";
signOptions.Location = "New York";
signOptions.PageNumber = 1;
signOptions.Visible = true;
signOptions.Rectangle = new Rectangle(100, 100, 200, 150);

// Apply the signature to the document
signature.Process(signOptions);
How to Use Stream for PFX File
You can also sign a PDF using a PFX certificate provided as a stream instead of a file path. This allows more flexible handling of certificate storage.
- Create an instance of the Signature class.
- Instantiate SignOptions with a stream containing the PFX and the password.
- Add the input and output files.
- Run the Process method to apply the signature.
using var pfxStream = File.OpenRead(@"C:\certificates\myCertificate.pfx");
var signature = new Signature();
var signOptions = new SignOptions(pfxStream, "pfxPassword");

// Add input and output files
signOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
signOptions.AddOutput(new FileDataSource(@"C:\Samples\signedOutput.pdf"));

// Apply signature
signature.Process(signOptions);
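A stream does not have to come from a file on disk. As a minimal sketch, assuming the certificate is delivered as a Base64 string (for example from a secret store or an environment variable), the helper below turns that string into a read-only in-memory stream. The PfxSource class is hypothetical, and it is an assumption that the stream-based SignOptions constructor shown above accepts any readable Stream.

```csharp
using System;
using System.IO;

public static class PfxSource
{
    // Decodes Base64-encoded PFX data into a read-only MemoryStream.
    public static MemoryStream FromBase64(string base64Pfx)
    {
        if (string.IsNullOrWhiteSpace(base64Pfx))
        {
            throw new ArgumentException("No certificate data supplied.", nameof(base64Pfx));
        }

        byte[] bytes = Convert.FromBase64String(base64Pfx.Trim());
        return new MemoryStream(bytes, writable: false);
    }
}
```

The resulting stream could be passed where pfxStream is used in the sample, which keeps the certificate off the file system.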
How to Apply Invisible Signatures
To add an invisible signature (one that secures the document without displaying the signature on the page), set the Visible property to false.
- Create an instance of SignOptions.
- Set Visible to false.
- Add input and output files.
- Call Process to apply the invisible signature.
var signature = new Signature();
var signOptions = new SignOptions(@"C:\certificates\myCertificate.pfx", "pfxPassword");

// Configure invisible signature
signOptions.Visible = false;

// Add input and output files
signOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));
signOptions.AddOutput(new FileDataSource(@"C:\Samples\invisiblySigned.pdf"));

// Process signature
signature.Process(signOptions);
PDF Splitter
The Documentize PDF Splitter for .NET is a powerful tool that simplifies the process of splitting large PDF documents into smaller, more manageable files. Whether you need to extract individual pages or divide a document into specific sections, this plugin allows you to achieve it efficiently and with minimal effort.
Key Features:
- Split PDF by Page: Break down a PDF document into individual pages.
- Batch Processing: Split large batches of PDFs in one go.
- Custom Split Options: Configure the splitting process based on your requirements.
- Organized Output: Easily manage the output files for each split page or section.
How to Split PDF Documents
To split a PDF document into individual pages, follow these steps:
- Create an instance of the Splitter class.
- Create an instance of SplitOptions to configure the splitting options.
- Add the input PDF file using the AddInput method.
- Add output files for each split page using the AddOutput method.
- Run the Process method to split the document.
var splitter = new Splitter();
var splitOptions = new SplitOptions();

// Add the input PDF file
splitOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Specify output files for each page
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_page_1.pdf"));
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_page_2.pdf"));
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_page_3.pdf"));

// Process the split operation
splitter.Process(splitOptions);
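Writing one AddOutput line per page becomes tedious for longer documents. The sketch below generates the output paths in a loop, assuming (as the sample above suggests) that one output is registered per page and that the caller already knows the page count. The SplitOutputs class and its naming pattern are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SplitOutputs
{
    // Builds "<baseName>_page_<n>.pdf" paths for pages 1..pageCount.
    public static List<string> BuildPagePaths(string directory, string baseName, int pageCount)
    {
        if (pageCount < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(pageCount));
        }

        var paths = new List<string>(pageCount);
        for (int page = 1; page <= pageCount; page++)
        {
            paths.Add(Path.Combine(directory, $"{baseName}_page_{page}.pdf"));
        }

        return paths;
    }
}
```

Each generated path could then be wrapped in a FileDataSource and passed to splitOptions.AddOutput.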
Splitting PDF by Page Ranges
You can also split a PDF by specifying page ranges. This allows you to extract specific sections or multiple pages from a PDF into separate documents.
var splitter = new Splitter();
var splitOptions = new SplitOptions();

// Add the input PDF
splitOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Define output for page ranges (e.g., pages 1-3)
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_pages_1_to_3.pdf"));

// Process the split
splitter.Process(splitOptions);
How to Handle Batch Splitting
The PDF Splitter plugin is optimized to handle large batches of PDF documents. You can split hundreds of PDFs into individual pages or sections by leveraging batch processing.
var splitter = new Splitter();
var splitOptions = new SplitOptions();

// Add input PDF files in a batch
splitOptions.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
splitOptions.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Define the output for each file
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_file1_page1.pdf"));
splitOptions.AddOutput(new FileDataSource(@"C:\Samples\output_file2_page1.pdf"));

// Process the batch split
splitter.Process(splitOptions);
PDF Text Extractor
The Documentize PDF Text Extractor for .NET simplifies extracting text from PDF documents. Whether you need pure, raw, or plain text, this plugin lets you extract text efficiently, either preserving formatting or omitting it based on your needs.
Key Features:
- Pure Mode: Extract text while preserving its original formatting.
- Raw Mode: Extract text without any formatting.
- Plain Mode: Extract text without special characters or formatting.
- Batch Processing: Extract text from multiple PDFs at once.
To extract text from a PDF document, follow these steps:
- Create an instance of the TextExtractor class.
- Create an instance of TextExtractorOptions to configure the extraction options.
- Add the input PDF file using the AddInput method.
- Run the Process method to extract the text.
- Access the extracted text using the ResultContainer.ResultCollection.
using var extractor = new TextExtractor();
var textExtractorOptions = new TextExtractorOptions();

// Add the input PDF
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Process the text extraction
var resultContainer = extractor.Process(textExtractorOptions);

// Print the extracted text
var extractedText = resultContainer.ResultCollection[0];
Console.WriteLine(extractedText);
The plugin allows you to extract text from multiple PDFs simultaneously, ensuring quick and efficient processing.
using var extractor = new TextExtractor();
var textExtractorOptions = new TextExtractorOptions();

// Add multiple input PDFs
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\input1.pdf"));
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\input2.pdf"));

// Process the extraction
var resultContainer = extractor.Process(textExtractorOptions);

// Output the extracted text
foreach (var result in resultContainer.ResultCollection)
{
    Console.WriteLine(result);
}
The TextExtractor plugin offers three extraction modes, providing flexibility based on your needs.
- Pure Mode: Preserves the original formatting, including spaces and alignment.
- Raw Mode: Extracts the text without formatting, useful for raw data processing.
- Plain Mode: Extracts text without special characters or additional formatting.
using var extractor = new TextExtractor();
var textExtractorOptions = new TextExtractorOptions();

// Set to Pure mode
textExtractorOptions.Mode = ExtractionMode.Pure;
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Process and output
var resultContainer = extractor.Process(textExtractorOptions);
Console.WriteLine(resultContainer.ResultCollection[0]);
How to Handle Batch Processing
For large document sets, you can leverage batch processing, enabling you to extract text from multiple PDFs at once.
using var extractor = new TextExtractor();
var textExtractorOptions = new TextExtractorOptions();

// Add multiple input PDFs
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\batch1.pdf"));
textExtractorOptions.AddInput(new FileDataSource(@"C:\Samples\batch2.pdf"));

// Process the extraction for all inputs
var resultContainer = extractor.Process(textExtractorOptions);

// Handle extracted text
foreach (var result in resultContainer.ResultCollection)
{
    Console.WriteLine(result);
}
PDF Timestamp Adder
The Documentize PDF Timestamp Adder for .NET is a powerful tool designed to add secure timestamps to your PDF documents. It enhances the integrity and authenticity of your documents by providing a trusted time reference, ensuring compliance with digital signature standards.
Key Features:
- Add Secure Timestamps: Effortlessly add secure timestamps to your PDF documents.
- Customizable Timestamp Servers: Use custom timestamp server URLs and authentication credentials.
- Automation: Integrate timestamping into your .NET applications for automated workflows.
- Compliance: Ensure your documents meet industry standards for digital signatures and timestamps.
How to Add a Timestamp to PDF Documents
To add a secure timestamp to a PDF document, follow these steps:
- Create an instance of the Timestamp class.
- Create an instance of AddTimestampOptions to configure the timestamping process.
- Add the input PDF file using the AddInput method.
- Set the output file path using AddOutput.
- Execute the timestamping using the Process method.
// Instantiate the Timestamp plugin
var plugin = new Timestamp();

// Configure the timestamping options
var opt = new AddTimestampOptions("path_to_pfx", "password_for_pfx", "timestamp_server_url");

// Add input PDF file
opt.AddInput(new FileDataSource("path_to_pdf"));

// Specify the output PDF file
opt.AddOutput(new FileDataSource("path_to_result_pdf"));

// Perform the timestamping process
plugin.Process(opt);
How to Use Custom Authentication with Timestamp Server
You can provide basic authentication credentials when connecting to the timestamp server. This allows you to authenticate with servers that require a username and password.
- Create an instance of the Timestamp class.
- Create an instance of AddTimestampOptions, including the serverBasicAuthCredentials.
- Add the input file and output file paths.
- Call the Process method.
// Instantiate the Timestamp plugin
var plugin = new Timestamp();

// Configure the timestamping options with authentication
var opt = new AddTimestampOptions("path_to_pfx", "password_for_pfx", "timestamp_server_url", "username:password");

// Add input PDF file
opt.AddInput(new FileDataSource("path_to_pdf"));

// Specify the output PDF file
opt.AddOutput(new FileDataSource("path_to_result_pdf"));

// Perform the timestamping process
plugin.Process(opt);
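The sample passes the credentials as a single "username:password" string. The sketch below composes that string from separate values and rejects a user name containing a colon, since HTTP Basic authentication does not allow one in the user-id. The TimestampCredentials class is hypothetical, and the exact format the plugin expects is inferred only from the literal in the sample above.

```csharp
using System;

public static class TimestampCredentials
{
    // Joins user name and password into the "username:password" form.
    public static string Build(string userName, string password)
    {
        if (string.IsNullOrEmpty(userName))
        {
            throw new ArgumentException("User name is required.", nameof(userName));
        }

        if (userName.Contains(":"))
        {
            throw new ArgumentException("User name must not contain ':'.", nameof(userName));
        }

        if (password is null)
        {
            throw new ArgumentNullException(nameof(password));
        }

        return userName + ":" + password;
    }
}
```

The returned value could be supplied as the fourth constructor argument in place of the hard-coded literal, with both parts loaded from configuration rather than source code.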
Handling PFX Files and Passwords
The AddTimestampOptions class allows you to use a PFX file for digital signing along with the password.
- PFX Stream or File Path: You can provide a stream or file path to the PFX file.
- Password Protection: Ensure you securely manage the password for the PFX file.
PDF to DOC Converter
The Documentize PDF to DOC Converter for .NET is a powerful tool designed to convert PDF documents into DOC or DOCX formats. This plugin seamlessly transforms PDF pages into editable Microsoft Word documents, making it easy to reuse, edit, and share content across multiple platforms.
Key Features:
- DOC/DOCX Conversion: Convert PDF documents to editable Microsoft Word formats (DOC or DOCX).
- Maintain Formatting: Retain the original layout, text, and formatting during the conversion process.
- Batch Processing: Convert multiple PDF files at once.
- Custom Conversion Options: Fine-tune the conversion process with different modes, like Enhanced Flow, for better layout.
How to Convert PDF to DOC/DOCX
To convert a PDF document to DOC/DOCX format, follow these steps:
- Create an instance of the PdfDoc class.
- Create an instance of PdfToDocOptions to configure the conversion process.
- Add the input PDF file using the AddInput method.
- Add the output file path for the resulting DOC/DOCX file using the AddOutput method.
- Run the Process method to execute the conversion.
var pdfToWord = new PdfDoc();
var options = new PdfToDocOptions()
{
    SaveFormat = SaveFormat.DocX, // Output format as DOCX
    ConversionMode = ConversionMode.EnhancedFlow // Optimize layout and formatting
};

// Add the input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Add the output Word document path
options.AddOutput(new FileDataSource(@"C:\Samples\output.docx"));

// Process the conversion
pdfToWord.Process(options);
Converting PDF to DOC with Custom Options
The PDF to DOC Converter plugin provides several options to customize your conversion process. You can choose between different modes to control how the layout and structure of the PDF are handled during conversion.
var pdfToWord = new PdfDoc();
var options = new PdfToDocOptions()
{
    SaveFormat = SaveFormat.Doc, // Output format as DOC
    ConversionMode = ConversionMode.Precise // Maintain original PDF layout as closely as possible
};

// Add the input PDF file
options.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Add the output Word document path
options.AddOutput(new FileDataSource(@"C:\Samples\output.doc"));

// Process the conversion
pdfToWord.Process(options);
Batch Processing PDF to DOC/DOCX Conversion
The PDF to DOC Converter supports batch processing, allowing you to convert multiple PDF files at once. Here’s an example of batch conversion:
var pdfToWord = new PdfDoc();
var options = new PdfToDocOptions()
{
    SaveFormat = SaveFormat.DocX
};

// Add multiple input PDF files
options.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
options.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Add output file paths for the resulting DOCX files
options.AddOutput(new FileDataSource(@"C:\Samples\output_file1.docx"));
options.AddOutput(new FileDataSource(@"C:\Samples\output_file2.docx"));

// Process the batch conversion
pdfToWord.Process(options);
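In a batch, every input needs a matching output path. The sketch below derives a .docx path from each PDF path with standard .NET path helpers. It assumes, as the sample above implies but this guide does not state, that outputs are paired with inputs in the order they are added. The ConversionPaths class is hypothetical.

```csharp
using System.IO;

public static class ConversionPaths
{
    // Maps ".../name.pdf" to "<outputDirectory>/name.docx".
    public static string ToDocxPath(string pdfPath, string outputDirectory)
    {
        string fileName = Path.GetFileNameWithoutExtension(pdfPath) + ".docx";
        return Path.Combine(outputDirectory, fileName);
    }
}
```

Calling AddInput and AddOutput together inside one loop over the input paths keeps each pair adjacent and makes a mismatch between the two lists harder to introduce.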
PDF to XLS Converter
The Documentize PDF to XLS Converter for .NET is a powerful tool that allows seamless conversion of PDF documents into Excel spreadsheets (XLS/XLSX). This plugin enhances the accessibility and usability of your PDF content, making it easy to manipulate and analyze data in spreadsheet format.
Key Features:
- Convert PDF to Excel: Transform PDF files into XLS/XLSX spreadsheets for easy data management.
- Custom Output Options: Configure the output format, page range, worksheet name, and more.
- High-Fidelity Conversion: Retain layout, formatting, and content accuracy during conversion.
- Batch Processing: Convert multiple PDF files in one go for large-scale operations.
How to Convert PDF to XLS
To convert a PDF document into an Excel file (XLS/XLSX), follow these steps:
- Create an instance of the PdfXls class.
- Create an instance of PdfToXlsOptions to configure the conversion settings.
- Add the input PDF file using the AddInput method.
- Specify the output Excel file using the AddOutput method.
- Run the Process method to initiate the conversion.
var pdfXlsConverter = new PdfXls();
var options = new PdfToXlsOptions();

// Add input and output file paths
options.AddInput(new FileDataSource(@"C:\Samples\sample.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output.xlsx"));

// Run the conversion process
pdfXlsConverter.Process(options);
Customizing the PDF to Excel Conversion
You can customize the conversion settings by modifying the PdfToXlsOptions class. For instance, to convert the PDF to XLSX format, insert a blank first column, and name the worksheet, you can use the following code:
var pdfXlsConverter = new PdfXls();
var options = new PdfToXlsOptions();

// Set the output format to XLSX
options.Format = PdfToXlsOptions.ExcelFormat.XLSX;

// Insert a blank column at the first position
options.InsertBlankColumnAtFirst = true;

// Set the worksheet name
options.WorksheetName = "MySheet";

// Add input and output files
options.AddInput(new FileDataSource(@"C:\Samples\sample.pdf"));
options.AddOutput(new FileDataSource(@"C:\Samples\output.xlsx"));

// Process the conversion
pdfXlsConverter.Process(options);
Handling Conversion Results
After processing, the Process method returns a ResultContainer object that holds the result of the conversion. You can retrieve the converted file path or other output details:
var resultContainer = pdfXlsConverter.Process(options);

// Access and print the result file path
var result = resultContainer.ResultCollection[0];
Console.WriteLine(result);
Batch Processing for PDF to XLS Conversion
The PDF to XLS Converter plugin also supports batch processing, enabling the conversion of multiple PDF files at once.
var pdfXlsConverter = new PdfXls();
var options = new PdfToXlsOptions();

// Add multiple input PDFs
options.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
options.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Add the output Excel files
options.AddOutput(new FileDataSource(@"C:\Samples\output1.xlsx"));
options.AddOutput(new FileDataSource(@"C:\Samples\output2.xlsx"));

// Process the batch conversion
pdfXlsConverter.Process(options);
PDF/A Converter
The Documentize PDF/A Converter for .NET is a powerful tool designed to convert PDF documents into the PDF/A format, ensuring that your content remains compliant with long-term archiving standards. This plugin also supports validating existing PDF documents for PDF/A compliance, offering both conversion and validation features in a single solution.
Key Features:
- Convert to PDF/A: Seamlessly transform PDF files into the PDF/A format (such as PDF/A-1a, PDF/A-2b, PDF/A-3b) to ensure compliance with archiving standards.
- Validate PDF/A Compliance: Check existing PDF documents for conformance with PDF/A standards and identify issues if they do not comply.
- Batch Processing: Process multiple files at once for conversion or validation.
- Efficient Workflow: Minimize time and effort with fast and reliable conversion processes.
How to Convert PDF to PDF/A
To convert a PDF document into PDF/A format, follow these steps:
- Create an instance of the PdfAConverter class.
- Create an instance of PdfAConvertOptions to configure the conversion.
- Specify the desired PDF/A version (e.g., PDF/A-3B).
- Add the input PDF file using the AddInput method.
- Add the output file for the resulting PDF/A using the AddOutput method.
- Call the Process method to execute the conversion.
var pdfAConverter = new PdfAConverter();
var pdfAOptions = new PdfAConvertOptions
{
    PdfAVersion = PdfAStandardVersion.PDF_A_3B
};

// Add the input PDF file
pdfAOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Specify the output PDF/A file
pdfAOptions.AddOutput(new FileDataSource(@"C:\Samples\output_pdfa.pdf"));

// Process the conversion
pdfAConverter.Process(pdfAOptions);
Validating PDF/A Compliance
You can validate existing PDF files for PDF/A compliance using the PdfAValidateOptions class.
var pdfAConverter = new PdfAConverter();
var validationOptions = new PdfAValidateOptions
{
    PdfAVersion = PdfAStandardVersion.PDF_A_1A
};

// Add the PDF file to be validated
validationOptions.AddInput(new FileDataSource(@"C:\Samples\input.pdf"));

// Run the validation process
var resultContainer = pdfAConverter.Process(validationOptions);

// Check the validation result
var validationResult = (PdfAValidationResult)resultContainer.ResultCollection[0].Data;
Console.WriteLine("PDF/A Validation Passed: " + validationResult.IsValid);
Batch Processing for PDF/A Conversion
This plugin supports batch processing, allowing you to convert or validate multiple PDF files for PDF/A compliance at once.
var pdfAConverter = new PdfAConverter();
var pdfAOptions = new PdfAConvertOptions
{
    PdfAVersion = PdfAStandardVersion.PDF_A_3B
};

// Add multiple input PDFs
pdfAOptions.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
pdfAOptions.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));

// Specify output files for the converted PDF/As
pdfAOptions.AddOutput(new FileDataSource(@"C:\Samples\file1_pdfa.pdf"));
pdfAOptions.AddOutput(new FileDataSource(@"C:\Samples\file2_pdfa.pdf"));

// Process the batch conversion
pdfAConverter.Process(pdfAOptions);
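When the list of files is not known in advance, the same batch pattern can be driven from a folder listing. The following is a minimal sketch, not the library's prescribed approach: it uses only the Documentize calls shown above plus standard System.IO methods. The `using Documentize;` directive, the folder paths, and the assumption that inputs and outputs are paired in the order they are added are all illustrative.

```csharp
using System.IO;
using Documentize; // assumed namespace; adjust to match your installed package

// Hypothetical folders used for illustration only
var inputDir = @"C:\Samples";
var outputDir = Path.Combine(inputDir, "pdfa");
Directory.CreateDirectory(outputDir); // no-op if the folder already exists

var pdfAConverter = new PdfAConverter();
var pdfAOptions = new PdfAConvertOptions
{
    PdfAVersion = PdfAStandardVersion.PDF_A_3B
};

// Add one input and one matching output per PDF found in the folder.
// Assumption: the converter pairs inputs and outputs in the order added,
// as the hand-written batch example above suggests.
foreach (var inputPath in Directory.GetFiles(inputDir, "*.pdf"))
{
    var outputName = Path.GetFileNameWithoutExtension(inputPath) + "_pdfa.pdf";
    pdfAOptions.AddInput(new FileDataSource(inputPath));
    pdfAOptions.AddOutput(new FileDataSource(Path.Combine(outputDir, outputName)));
}

// Process the batch conversion
pdfAConverter.Process(pdfAOptions);
```

Writing the results to a separate subfolder keeps converted files from being picked up as inputs on a later run.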
HTML Converter
The Documentize HTML Converter for .NET provides robust capabilities for converting documents between PDF and HTML formats, ideal for web applications, archiving, and report generation. With multiple options for handling resources and layouts, the converter adapts to various project requirements.
Key Features
PDF to HTML Conversion
Convert PDF files to HTML to make documents accessible for web-based viewing or integration into applications where HTML format is preferred.
HTML to PDF Conversion
Transform HTML content into high-quality PDFs, perfect for generating printable reports, archiving web content, or creating shareable document formats.
Detailed Guide
Converting PDF to HTML
To convert a PDF to HTML:
- Initialize the Converter: Create an instance of HtmlConverter.
- Set Conversion Options: Use PdfToHtmlOptions to customize output, choosing either embedded or external resources.
- Define Input and Output Paths: Set the paths for your input PDF and output HTML.
- Execute the Conversion: Call the Process method to convert the file.
Example: Convert PDF to HTML with Embedded Resources
// Step 1: Initialize the HTML Converter
var converter = new HtmlConverter();
// Step 2: Configure options for PDF to HTML conversion
var options = new PdfToHtmlOptions(PdfToHtmlOptions.SaveDataType.FileWithEmbeddedResources);
// Step 3: Set file paths
options.AddInput(new FileDataSource("input.pdf"));
options.AddOutput(new FileDataSource("output.html"));
// Step 4: Run the conversion
converter.Process(options);
Available Options for PDF to HTML Conversion
SaveDataType:
- FileWithEmbeddedResources: Generates a single HTML file with all resources embedded.
- FileWithExternalResources: Saves resources separately, ideal for large HTML files.
Output Customization:
- BasePath: Set the base path for resources in the HTML document.
- IsRenderToSinglePage: Optionally render all PDF content on a single HTML page.
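To illustrate the SaveDataType choice, the sketch below repeats the earlier conversion with FileWithExternalResources instead of FileWithEmbeddedResources. It reuses only calls that appear in this guide; the `using Documentize;` directive and the file names are assumptions. BasePath and IsRenderToSinglePage are left unset because this guide does not state their types.

```csharp
using Documentize; // assumed namespace; adjust to match your installed package

// Initialize the HTML Converter
var converter = new HtmlConverter();

// Request an HTML file whose resources are saved separately
var options = new PdfToHtmlOptions(PdfToHtmlOptions.SaveDataType.FileWithExternalResources);

// Set file paths (illustrative names)
options.AddInput(new FileDataSource("input.pdf"));
options.AddOutput(new FileDataSource("output.html"));

// Run the conversion
converter.Process(options);
```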
Converting HTML to PDF
To convert an HTML document to a PDF, follow these steps:
- Initialize the Converter: Create an instance of the HtmlConverter.
- Configure PDF Options: Use HtmlToPdfOptions to define layout and media settings.
- Specify Paths: Set input HTML and output PDF file paths.
- Execute the Conversion: Run the Process method to complete the conversion.
Example: Convert HTML to PDF
// Step 1: Initialize the HTML Converter
var converter = new HtmlConverter();
// Step 2: Configure options for HTML to PDF conversion
var options = new HtmlToPdfOptions();
// Step 3: Set file paths
options.AddInput(new FileDataSource("input.html"));
options.AddOutput(new FileDataSource("output.pdf"));
// Step 4: Execute the conversion
converter.Process(options);
Additional Options for HTML to PDF Conversion
Media Type:
- HtmlMediaType.Print: Ideal for generating PDFs suited for printing.
- HtmlMediaType.Screen: Use when converting content designed for digital viewing.
Layout Adjustments:
- PageLayoutOption: Adjusts how HTML content fits the PDF layout; for example, ScaleToPageWidth scales the content to the PDF page width.
- IsRenderToSinglePage: Renders the entire HTML content on a single PDF page, which can be useful for concise presentations.
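The options above could be applied roughly as follows. This is a sketch under stated assumptions rather than confirmed API usage: the guide names the HtmlMediaType values and the IsRenderToSinglePage option, but not the exact property names or types on HtmlToPdfOptions, so the property named HtmlMediaType and the boolean IsRenderToSinglePage are guesses to check against the API reference. PageLayoutOption is omitted because the type that holds ScaleToPageWidth is not named here. The `using Documentize;` directive is also assumed.

```csharp
using Documentize; // assumed namespace; adjust to match your installed package

// Initialize the HTML Converter
var converter = new HtmlConverter();

var options = new HtmlToPdfOptions
{
    // Assumption: the media type is exposed as a property named HtmlMediaType
    HtmlMediaType = HtmlMediaType.Print,
    // Assumption: IsRenderToSinglePage is a settable boolean property
    IsRenderToSinglePage = true
};

// Set file paths (illustrative names)
options.AddInput(new FileDataSource("input.html"));
options.AddOutput(new FileDataSource("output.pdf"));

// Execute the conversion
converter.Process(options);
```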
This converter is versatile for a variety of applications, from generating PDF reports based on web content to converting archives of PDF documents for web-based accessibility. For more advanced configurations, refer to the full Documentize documentation.