PDF Image Extractor
The Documentize PDF Image Extractor for .NET plugin extracts images from PDF documents. It scans your PDF files, identifies embedded images, and extracts them while maintaining their original quality and format, so you can retrieve visual content from PDFs without manual work.
How to Extract Images from a PDF
To extract images from a PDF file, follow these steps:
- Create an instance of the ImageExtractor class.
- Create an instance of the ImageExtractorOptions class.
- Add the input file path to the options.
- Process the image extraction using the plugin.
- Retrieve the extracted images from the result container.
// Create an instance of the ImageExtractor plugin
using var plugin = new ImageExtractor();

// Create an instance of the ImageExtractorOptions class
var imageExtractorOptions = new ImageExtractorOptions();

// Add the input file path
imageExtractorOptions.AddInput(new FileDataSource(Path.Combine(@"C:\Samples\", "sample.pdf")));

// Process the image extraction
var resultContainer = plugin.Process(imageExtractorOptions);

// Get the first extracted image (assumes the PDF contains at least one) and save it to a file
using var extractedImage = resultContainer.ResultCollection[0].ToStream();
// File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes
using var outputStream = File.Create(@"C:\Samples\tmp.jpg");
extractedImage.CopyTo(outputStream);
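The sample above always writes the image as tmp.jpg, although the plugin is described as keeping each image's original format. One way to pick a matching extension is to inspect the first bytes of the extracted stream. The following is a minimal sketch that uses only System.IO. The ImageFormatSniffer class and its GuessExtension method are hypothetical names, not part of the Documentize API, and the sketch assumes the stream returned by ToStream() is seekable.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the Documentize API.
public static class ImageFormatSniffer
{
    // Guesses a file extension from the leading bytes of the stream.
    // Returns ".bin" when the signature is not recognized.
    public static string GuessExtension(Stream image)
    {
        // Assumption: the stream can be rewound after peeking at its header.
        if (!image.CanSeek)
            throw new ArgumentException("A seekable stream is required.", nameof(image));

        long start = image.Position;
        var header = new byte[8];
        int read = 0;

        // Stream.Read may return fewer bytes than requested, so loop
        // until the buffer is full or the stream ends.
        while (read < header.Length)
        {
            int n = image.Read(header, read, header.Length - read);
            if (n == 0) break;
            read += n;
        }

        // Rewind so the caller can still copy the whole image.
        image.Position = start;

        if (StartsWith(header, read, 0xFF, 0xD8, 0xFF)) return ".jpg";
        if (StartsWith(header, read, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) return ".png";
        if (StartsWith(header, read, 0x47, 0x49, 0x46, 0x38)) return ".gif";
        if (StartsWith(header, read, 0x42, 0x4D)) return ".bmp";
        if (StartsWith(header, read, 0x49, 0x49, 0x2A, 0x00)) return ".tif"; // little-endian TIFF
        if (StartsWith(header, read, 0x4D, 0x4D, 0x00, 0x2A)) return ".tif"; // big-endian TIFF
        return ".bin";
    }

    // True when the first bytes of the buffer match the given signature.
    private static bool StartsWith(byte[] buffer, int length, params byte[] signature)
    {
        if (length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (buffer[i] != signature[i]) return false;
        }
        return true;
    }
}
```

With such a helper, the output path could be built as "tmp" + ImageFormatSniffer.GuessExtension(extractedImage) instead of the fixed tmp.jpg name. Checking signatures keeps the sketch free of any image-decoding dependency, at the cost of recognizing only the formats listed.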
Extracting Images from Multiple PDF Files
The ImageExtractor plugin supports batch processing: add several PDFs to the same options object and extract their images in a single Process call. This is useful when you have a collection of PDF files and need to retrieve all of their images at once.
using var plugin = new ImageExtractor();
var options = new ImageExtractorOptions();

// Add multiple input PDF files
options.AddInput(new FileDataSource(@"C:\Samples\file1.pdf"));
options.AddInput(new FileDataSource(@"C:\Samples\file2.pdf"));
options.AddInput(new FileDataSource(@"C:\Samples\file3.pdf"));

// Process the image extraction
var resultContainer = plugin.Process(options);

// Save the extracted images from all files
for (int i = 0; i < resultContainer.ResultCollection.Count; i++)
{
    using var extractedImage = resultContainer.ResultCollection[i].ToStream();
    // File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes
    using var outputStream = File.Create($@"C:\Samples\image_{i + 1}.jpg");
    extractedImage.CopyTo(outputStream);
}
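When the input is a whole folder rather than three known files, the list of paths can be gathered first and then added one by one. The sketch below covers only the gathering step and uses the standard library alone. The PdfBatch class and FindPdfs method are hypothetical names, not part of the Documentize API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the Documentize API.
public static class PdfBatch
{
    // Returns the full paths of the .pdf files directly inside the folder,
    // sorted case-insensitively so the processing order is predictable.
    public static IReadOnlyList<string> FindPdfs(string folder)
    {
        return Directory.EnumerateFiles(folder)
            // Compare the extension explicitly instead of relying on a
            // "*.pdf" search pattern, and ignore case (.pdf, .PDF).
            .Where(path => string.Equals(
                Path.GetExtension(path), ".pdf", StringComparison.OrdinalIgnoreCase))
            .OrderBy(path => path, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Each returned path can then be passed to options.AddInput(new FileDataSource(path)) in a loop, in place of the three hard-coded AddInput calls in the sample above.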
Key Features:
- Extract Embedded Images: Identify and extract images from PDF documents.
- Preserve Image Quality: Retain the original quality of extracted images.
- Batch Processing: Extract images from multiple PDF documents in a single operation.
- Flexible Output: Save extracted images in your preferred format or location.