Contains information about a page of a PDF document.
[SerializableAttribute()]
public class PDFDocumentPage
The PDFDocumentPage class is used as the type of the PDFDocument.Pages collection.
The PDFDocument.Pages collection is created automatically when a new PDFDocument object is created with the PDFDocument(string fileName) or PDFDocument(string fileName, string password) constructors. This collection is read-only and cannot be modified, since the PDFDocument object is a read-only view of a PDF file.
Each item in the Pages collection corresponds to a page in the PDF document: the item at index 0 holds the properties of page 1, the item at index 1 holds the properties of page 2, and so on. Although the PDFDocumentPage class contains the PageNumber property that specifies the number of the page, this information is for convenience only. The PDFDocument constructors always populate the collection in the correct order, from the first page to the last.
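The index-to-page mapping described above can be sketched as a small lookup helper. This is a minimal sketch and not part of the library: GetPageByNumber and the PageLookup class are hypothetical names, and the code assumes the Pages collection supports an indexer and Count, as the text above implies.

```csharp
using System;
using Leadtools.Pdf;

static class PageLookup
{
    // Hypothetical helper (not part of LEADTOOLS): returns the page
    // for a 1-based page number.
    public static PDFDocumentPage GetPageByNumber(PDFDocument document, int pageNumber)
    {
        if (document == null)
            throw new ArgumentNullException(nameof(document));
        if (pageNumber < 1 || pageNumber > document.Pages.Count)
            throw new ArgumentOutOfRangeException(nameof(pageNumber));

        // Page N is stored at zero-based index N - 1.
        return document.Pages[pageNumber - 1];
    }
}
```

The range check keeps an invalid page number from surfacing as a less descriptive indexing error.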
Each PDF document page contains two rectangular areas, the media box and the crop box. The PDFDocumentPage object loads and stores these values in the MediaBox and CropBox properties. The various width and height values described below are those of the crop box. For more information, refer to PDF Coordinate System.
The PDFDocumentPage class contains the width and height of the page in PDF units, which are read directly from the PDF file (the PDF crop box). PDF units are 1/72 of an inch, so a page size of 612 by 792 corresponds to 8.5 by 11 inches (612/72 by 792/72). The size of each page in PDF units is automatically set in the Width and Height properties. The size of each page in inches is also fixed and is set in the WidthInches and HeightInches properties. The size of the page in pixels depends on the Resolution of the owner document. This value can be changed by the user at any time, and the pixel size of the page changes accordingly. To get the size of the page in pixels at the current resolution, use the WidthPixels and HeightPixels properties.
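The unit arithmetic above can be illustrated with a short, library-independent sketch. PdfUnitMath and its methods are hypothetical names introduced for illustration only. Rounding to the nearest whole pixel is an assumption, and the library may round differently when it computes WidthPixels and HeightPixels.

```csharp
using System;

// Hypothetical helper for illustration; not part of LEADTOOLS.
static class PdfUnitMath
{
    // One PDF unit is 1/72 of an inch.
    public const double UnitsPerInch = 72.0;

    // Converts a length in PDF units to inches.
    public static double UnitsToInches(double pdfUnits)
    {
        return pdfUnits / UnitsPerInch;
    }

    // Converts a length in PDF units to pixels at the given resolution (dots per inch).
    // Assumption: the result is rounded to the nearest whole pixel.
    public static int UnitsToPixels(double pdfUnits, int resolution)
    {
        return (int)Math.Round(UnitsToInches(pdfUnits) * resolution);
    }
}
```

For a 612 by 792 page, UnitsToInches gives 8.5 by 11 inches, and UnitsToPixels at a resolution of 300 gives 2550 by 3300 pixels.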
The total number of pages in the document is Pages.Count.
Each PDFDocumentPage object can also be populated with the various PDF native objects located in the corresponding page of the original PDF document. When you first create a PDFDocument object from a PDF file, all the collections described below have a value of null. You can populate them by using the PDFDocument.ParsePages method. Each parsed page will have the collections described below populated with the objects found in the file, depending on the value of PDFParsePagesOptions passed as the options parameter.
After the ParsePages method returns, the properties are initialized as follows:
If PDFParsePagesOptions.Objects is specified, then the PDFDocumentPage.Objects collection will be populated with a PDFObject object for each object item found in the page. These items can be text (characters), images or rectangles. If there are no object items found in the page, then the PDFDocumentPage.Objects will be initialized with an empty collection (PDFDocumentPage.Objects.Count will be 0).
If PDFParsePagesOptions.Hyperlinks is specified, then the PDFDocumentPage.Hyperlinks collection will be populated with a PDFHyperlink object for each hyperlink item found in the page. If no hyperlinks are found in the page, PDFDocumentPage.Hyperlinks will be initialized with an empty collection (PDFDocumentPage.Hyperlinks.Count will be 0).
If PDFParsePagesOptions.Annotations is specified, then the PDFDocumentPage.Annotations collection will be populated with a PDFAnnotation object for each annotation item found in the page. If no annotations are found in the page, PDFDocumentPage.Annotations will be initialized with an empty collection (PDFDocumentPage.Annotations.Count will be 0).
If PDFParsePagesOptions.FormFields is specified, then the PDFDocumentPage.FormFields collection will be populated with a PDFFormField object for each form field item found in the page. If no form fields are found in the page, PDFDocumentPage.FormFields will be initialized with an empty collection (PDFDocumentPage.FormFields.Count will be 0).
If PDFParsePagesOptions.Signatures is specified, then the PDFDocumentPage.Signatures collection will be populated with a PDFSignature object for each digital signature item found in the page. If no signatures are found in the page, PDFDocumentPage.Signatures will be initialized with an empty collection (PDFDocumentPage.Signatures.Count will be 0).
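The rules above can be sketched as follows for a caller that needs only objects and hyperlinks. This is a hedged sketch rather than a definitive usage: it assumes the PDFParsePagesOptions values can be combined with a bitwise OR, it reuses the 1, -1 page range that the example below passes to cover all pages, and ParseSelectedExample and its pdfFileName parameter are hypothetical names.

```csharp
using System;
using Leadtools.Pdf;

static class ParseSelectedExample
{
    public static void Run(string pdfFileName)
    {
        using (PDFDocument document = new PDFDocument(pdfFileName))
        {
            // Assumption: the option values are flags that can be combined.
            PDFParsePagesOptions options =
                PDFParsePagesOptions.Objects | PDFParsePagesOptions.Hyperlinks;

            // Same page range as the main example (first page through last).
            document.ParsePages(options, 1, -1);

            foreach (PDFDocumentPage page in document.Pages)
            {
                // Requested collections are non-null after parsing, possibly empty.
                Console.WriteLine("Page {0}: {1} objects, {2} hyperlinks",
                    page.PageNumber, page.Objects.Count, page.Hyperlinks.Count);

                // Collections that were not requested are expected to stay null,
                // so check before using them.
                if (page.Annotations == null)
                    Console.WriteLine("  Annotations were not parsed");
            }
        }
    }
}
```

Checking for null before reading a collection that was not requested avoids a NullReferenceException when only a subset of the options is passed.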
The PDFDocumentPage object also contains the ConvertPoint and ConvertRect helper methods, which can be used to convert a point or a rectangle between page/object coordinates and pixel or inch coordinates.
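To give a sense of the arithmetic involved in such a conversion, the sketch below maps a point from PDF page space to pixel space without calling the library. It is not the LEADTOOLS implementation, and PdfPointMath is a hypothetical name. It assumes PDF space has its origin at the bottom-left corner with y increasing upward, that pixel space has its origin at the top-left corner with y increasing downward, and that the crop box starts at (0, 0) with no page rotation.

```csharp
using System;

// Hypothetical illustration only; not the LEADTOOLS implementation.
static class PdfPointMath
{
    // Converts a point in PDF units (origin bottom-left, y up) to pixels
    // (origin top-left, y down) for a page of the given height in PDF units.
    public static void PdfToPixel(double x, double y, double pageHeight, int resolution,
        out double pixelX, out double pixelY)
    {
        // Pixels per PDF unit: resolution is in dots per inch, 72 units per inch.
        double scale = resolution / 72.0;

        pixelX = x * scale;
        // Flip the vertical axis before scaling.
        pixelY = (pageHeight - y) * scale;
    }
}
```

For example, on a page 792 units high at a resolution of 144, the PDF point (72, 720) maps to the pixel position (144, 144). In application code, prefer the ConvertPoint and ConvertRect methods, which work from the actual page data.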
This example loads a PDF document and parses all its objects. For an example of how to draw these objects on the surface of an image, refer to PDFDocumentPage.
using System;
using System.Collections.Generic;
using System.IO;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Pdf;
using Leadtools.WinForms;
public void PDFDocumentPageExample()
{
   string pdfFileName = Path.Combine(LEAD_VARS.ImagesDir, @"Leadtools.pdf");
   string txtFileName = Path.Combine(LEAD_VARS.ImagesDir, @"LEAD_pdf.txt");

   // Open the document
   using (PDFDocument document = new PDFDocument(pdfFileName))
   {
      // Parse everything and for all pages
      PDFParsePagesOptions options = PDFParsePagesOptions.All;
      document.ParsePages(options, 1, -1);

      // Save the results to the text file for examining
      using (StreamWriter writer = File.CreateText(txtFileName))
      {
         foreach (PDFDocumentPage page in document.Pages)
         {
            writer.WriteLine("Page {0}", page.PageNumber);

            IList<PDFObject> objects = page.Objects;
            writer.WriteLine("Objects: {0}", objects.Count);
            foreach (PDFObject obj in objects)
            {
               writer.WriteLine(" ObjectType: {0}", obj.ObjectType.ToString());
               writer.WriteLine(" Bounds: {0}, {1}, {2}, {3}", obj.Bounds.Left, obj.Bounds.Top, obj.Bounds.Right, obj.Bounds.Bottom);
               WriteTextProperties(writer, obj.TextProperties);
               writer.WriteLine(" Code: {0}", obj.Code);
               writer.WriteLine("------");
            }
            writer.WriteLine("---------------------");

            IList<PDFHyperlink> hyperlinks = page.Hyperlinks;
            writer.WriteLine("Hyperlinks: {0}", hyperlinks.Count);
            foreach (PDFHyperlink hyperlink in hyperlinks)
            {
               writer.WriteLine(" Hyperlink: {0}", hyperlink.Hyperlink);
               writer.WriteLine(" Bounds: {0}, {1}, {2}, {3}", hyperlink.Bounds.Left, hyperlink.Bounds.Top, hyperlink.Bounds.Right, hyperlink.Bounds.Bottom);
               WriteTextProperties(writer, hyperlink.TextProperties);
            }
            writer.WriteLine("---------------------");
         }
      }
   }
}

private static void WriteTextProperties(StreamWriter writer, PDFTextProperties textProperties)
{
   writer.WriteLine(" TextProperties.FontHeight: {0}", textProperties.FontHeight.ToString());
   writer.WriteLine(" TextProperties.FontWidth: {0}", textProperties.FontWidth.ToString());
   writer.WriteLine(" TextProperties.FontIndex: {0}", textProperties.FontIndex.ToString());
   writer.WriteLine(" TextProperties.IsEndOfWord: {0}", textProperties.IsEndOfWord.ToString());
   writer.WriteLine(" TextProperties.IsEndOfLine: {0}", textProperties.IsEndOfLine.ToString());
   writer.WriteLine(" TextProperties.Color: {0}", textProperties.Color.ToString());
}

static class LEAD_VARS
{
   public const string ImagesDir = @"C:\LEADTOOLS22\Resources\Images";
}