Extract Properties / Metadata
The Documentize PDF Extractor for .NET simplifies extracting metadata from PDF documents. The following properties are available: FileName, Title, Author, Subject, Keywords, Created, Modified, Application, PdfProducer, NumberOfPages.
How to extract metadata from a PDF file
This example shows how to extract properties (Title, Author, Subject, Keywords, Number of Pages) from a PDF file. To extract metadata from a PDF document, follow these steps:
- Create an instance of ExtractPropertiesOptions to configure the extraction options and the input PDF file.
- Run the Extract method of PdfExtractor to extract the metadata.
- Access the extracted properties using PdfProperties.
// Create an ExtractPropertiesOptions object to set the input file
var options = new ExtractPropertiesOptions("path_to_your_pdf_file.pdf");
// Run the extraction and get the properties
var pdfProperties = PdfExtractor.Extract(options);
var filename = pdfProperties.FileName;
var title = pdfProperties.Title;
var author = pdfProperties.Author;
var subject = pdfProperties.Subject;
var keywords = pdfProperties.Keywords;
var created = pdfProperties.Created;
var modified = pdfProperties.Modified;
var application = pdfProperties.Application;
var pdfProducer = pdfProperties.PdfProducer;
var numberOfPages = pdfProperties.NumberOfPages;
How to extract metadata from a PDF stream
The input can also be a stream, which you can open in whatever way suits your application.
// Open the input stream; the using declaration disposes it at the end of the scope
using var stream = File.OpenRead("path_to_your_pdf_file.pdf");
// Create an ExtractPropertiesOptions object to set the input stream
var options = new ExtractPropertiesOptions(stream);
// Run the extraction and get the properties
var pdfProperties = PdfExtractor.Extract(options);
var title = pdfProperties.Title;
var author = pdfProperties.Author;
var subject = pdfProperties.Subject;
var keywords = pdfProperties.Keywords;
var created = pdfProperties.Created;
var modified = pdfProperties.Modified;
var application = pdfProperties.Application;
var pdfProducer = pdfProperties.PdfProducer;
var numberOfPages = pdfProperties.NumberOfPages;
Extract metadata from a PDF file in the most concise way
// Run the extraction and get the properties in a single call
var pdfProperties = PdfExtractor.Extract(new ExtractPropertiesOptions("path_to_your_pdf_file.pdf"));
Key features:
- Available metadata: FileName, Title, Author, Subject, Keywords, Created, Modified, Application, PdfProducer, NumberOfPages.
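As a minimal sketch of putting the extracted values to use, the console program below prints each property on its own line. It uses only the calls shown on this page. Two things are assumptions rather than documented facts: that the Documentize NuGet package is referenced, and that its types live in a namespace named Documentize (the using directive is not shown in the original snippets). The file path is a placeholder. String interpolation is used so the code does not depend on the exact type of any property.

```csharp
using System;
using Documentize; // assumed namespace for PdfExtractor and ExtractPropertiesOptions

class Program
{
    static void Main(string[] args)
    {
        // Use the first command-line argument if given, otherwise the placeholder path.
        var path = args.Length > 0 ? args[0] : "path_to_your_pdf_file.pdf";

        // Run the extraction in a single call, as in the concise example.
        var pdfProperties = PdfExtractor.Extract(new ExtractPropertiesOptions(path));

        // Print every available property; interpolation works regardless of property type.
        Console.WriteLine($"FileName:      {pdfProperties.FileName}");
        Console.WriteLine($"Title:         {pdfProperties.Title}");
        Console.WriteLine($"Author:        {pdfProperties.Author}");
        Console.WriteLine($"Subject:       {pdfProperties.Subject}");
        Console.WriteLine($"Keywords:      {pdfProperties.Keywords}");
        Console.WriteLine($"Created:       {pdfProperties.Created}");
        Console.WriteLine($"Modified:      {pdfProperties.Modified}");
        Console.WriteLine($"Application:   {pdfProperties.Application}");
        Console.WriteLine($"PdfProducer:   {pdfProperties.PdfProducer}");
        Console.WriteLine($"NumberOfPages: {pdfProperties.NumberOfPages}");
    }
}
```

Properties the PDF does not set will likely print as empty lines, so a caller that needs to tell "missing" from "empty" should check each value before printing.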