Extract Properties / Metadata

The Documentize PDF Extractor for .NET simplifies extracting Metadata from PDF documents. Available properties that may interest you: FileName, Title, Author, Subject, Keywords, Created, Modified, Application, PDF Producer, Number of Pages.

如何从 PDF 文件中提取元数据

The example demonstrates how to Extract Properties (Title, Author, Subject, Keywords, Number of Pages) from PDF file. To extract metadata from a PDF document, follow these steps:

  1. Create an instance of ExtractPropertiesOptions to configure the extraction options and input PDF file.
  2. Run the Extract method of PdfExtractor to extract the metadata.
  3. Access the extracted properties using the PdfProperties.
 1// Create ExtractPropertiesOptions object to set input file
 2var options = new ExtractPropertiesOptions("path_to_your_pdf_file.pdf");
 3// Perform the process and get Properties
 4var pdfProperties = PdfExtractor.Extract(options);
 5var filename = pdfProperties.FileName;
 6var title = pdfProperties.Title;
 7var author = pdfProperties.Author;
 8var subject = pdfProperties.Subject;
 9var keywords = pdfProperties.Keywords;
10var created = pdfProperties.Created;
11var modified = pdfProperties.Modified;
12var application = pdfProperties.Application;
13var pdfProducer = pdfProperties.PdfProducer;
14var numberOfPages = pdfProperties.NumberOfPages;

如何从 PDF 流中提取元数据

You can open the stream at your own discretion.

 1// Create ExtractPropertiesOptions object to set input stream
 2var stream = File.OpenRead("path_to_your_pdf_file.pdf");
 3var options = new ExtractPropertiesOptions(stream);
 4// Perform the process and get Properties
 5var pdfProperties = PdfExtractor.Extract(options);
 6var title = pdfProperties.Title;
 7var author = pdfProperties.Author;
 8var subject = pdfProperties.Subject;
 9var keywords = pdfProperties.Keywords;
10var created = pdfProperties.Created;
11var modified = pdfProperties.Modified;
12var application = pdfProperties.Application;
13var pdfProducer = pdfProperties.PdfProducer;
14var numberOfPages = pdfProperties.NumberOfPages;

以最简洁方式从 PDF 文件中提取元数据

1// Perform the process and get Properties
2var pdfProperties = PdfExtractor.Extract(new ExtractPropertiesOptions("path_to_your_pdf_file.pdf"));

关键特性:

  • 可用的元数据:FileName、Title、Author、Subject、Keywords、Created、Modified、Application、PDF Producer、Number of Pages.
 中文