Extract Images
The Documentize PDF Extractor for .NET plugin extracts images from PDF documents. It scans a PDF file, identifies the embedded images, and extracts them while preserving their original quality and format, which makes the visual content of a PDF easier to retrieve and reuse.
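How to Extract Images from a PDF
To extract images from a PDF file, follow these steps:
- Create an instance of the ExtractImagesOptions class.
- Add the input file path to the options using the AddInput method.
- Set the output directory path for the images using the AddOutput method.
- Run the image extraction with the plugin.
- Retrieve the extracted images from the result container.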
// Create ExtractImagesOptions to set instructions
var options = new ExtractImagesOptions();
// Add input file path
options.AddInput(new FileDataSource("path_to_your_pdf_file.pdf"));
// Set output directory path
options.AddOutput(new DirectoryDataSource("path_to_results_directory"));
// Perform the process
var results = PdfExtractor.Extract(options);
// Get path to the first extracted image
var imageExtracted = results.ResultCollection[0].ToFile();
Extract Images from a PDF File to Streams Without Using Folders
The PdfExtractor plugin supports saving to streams, so you can extract images from PDF files without using temporary folders.
// Create ExtractImagesOptions to set instructions
var options = new ExtractImagesOptions();
// Add input file path
options.AddInput(new FileDataSource("path_to_your_pdf_file.pdf"));
// No output is set, so the results are written to streams
// Perform the process
var results = PdfExtractor.Extract(options);
// Get the stream of the first extracted image
var ms = results.ResultCollection[0].ToStream();
// Copy data to a file for demo purposes
ms.Seek(0, SeekOrigin.Begin);
using (var fs = File.Create("test_file.png"))
{
    ms.CopyTo(fs);
}
Key Features:
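- Extract embedded images: identifies and extracts images from PDF documents.
- Preserve image quality: ensures extracted images keep their original quality.
- Flexible output: saves extracted images in your preferred format or location.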
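As an illustration of the flexible-output point, the stream example above copies only the first result to a file. The sketch below shows one way to write every extracted stream to a folder of your choice. ImageStreamSaver and SaveAll are hypothetical names introduced here for illustration; they are not part of Documentize and rely only on System.IO. The ".png" default extension is also an assumption carried over from the demo file name above, since the actual image format depends on the PDF.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of Documentize: writes each stream to
// "<prefix>_<index><extension>" inside the target directory.
public static class ImageStreamSaver
{
    public static List<string> SaveAll(
        IEnumerable<Stream> streams,
        string directory,
        string prefix = "image",
        string extension = ".png") // assumed extension; adjust to the real format
    {
        Directory.CreateDirectory(directory); // no-op if the directory exists
        var paths = new List<string>();
        var index = 0;
        foreach (var stream in streams)
        {
            // Rewind seekable streams so the copy starts at the first byte.
            if (stream.CanSeek)
            {
                stream.Seek(0, SeekOrigin.Begin);
            }
            var path = Path.Combine(directory, $"{prefix}_{index}{extension}");
            using (var file = File.Create(path))
            {
                stream.CopyTo(file);
            }
            paths.Add(path);
            index++;
        }
        return paths;
    }
}
```

With the results object from the stream example, a call might look like ImageStreamSaver.SaveAll(results.ResultCollection.Select(r => r.ToStream()), "extracted_images"), assuming ResultCollection can be enumerated (for example with System.Linq) and not only indexed.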