Leadtools.Pdf Namespace : PDFDocumentPage Class
[SerializableAttribute()] public class PDFDocumentPage
' Declaration
<SerializableAttribute()> Public Class PDFDocumentPage
' Usage
Dim instance As PDFDocumentPage
public sealed class PDFDocumentPage
function Leadtools.Pdf.PDFDocumentPage()
[SerializableAttribute()] public ref class PDFDocumentPage
The PDFDocumentPage class is used as the type of the PDFDocument.Pages collection.
The PDFDocument.Pages collection is automatically created when a new PDFDocument object is created with the PDFDocument(string fileName) or PDFDocument(string fileName, string password) constructors. This collection is read-only and cannot be modified, since the PDFDocument object is a read-only view of a PDF file.
Each item in the Pages collection corresponds to a page in the PDF document: the item at index 0 holds the properties of page 1, the item at index 1 holds the properties of page 2, and so on. Although the PDFDocumentPage class contains the PageNumber property that specifies the number of the page, this information is for convenience only; the PDFDocument constructors always populate the collection in order from the first page to the last.
Each PDF document page contains two rectangular areas, the media box and the crop box. The PDFDocumentPage object loads and stores these values in the MediaBox and CropBox properties. The various width and height values described below are those of the crop box. For more information, refer to PDF Coordinate System.
The PDFDocumentPage class contains the width and height of the page in PDF units, which are read directly from the PDF file (the PDF crop box). A PDF unit is 1/72 of an inch, so a page size of 612 by 792 corresponds to 8.5 by 11 inches (612/72 by 792/72). The size of each page in PDF units is set automatically in the Width and Height properties. The size of each page in inches is also fixed and is set in the WidthInches and HeightInches properties. The size of the page in pixels depends on the Resolution of the owner document. The user can change this value at any time, and the pixel size of the page changes accordingly. To get the size of the page in pixels at the current resolution, use the WidthPixels and HeightPixels properties.
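The unit arithmetic above can be sketched in a few lines of C#. This is a minimal, self-contained illustration that does not call into Leadtools.Pdf: the PageUnits class and its method names are hypothetical, and rounding to the nearest pixel is an assumption, so results may differ slightly from what WidthPixels and HeightPixels report.

```csharp
using System;

// Hypothetical helper (not part of Leadtools.Pdf) that illustrates the unit arithmetic.
public static class PageUnits
{
    // One PDF unit is 1/72 of an inch.
    public const double PdfUnitsPerInch = 72.0;

    // Converts a length in PDF units to inches.
    public static double ToInches(double pdfUnits)
    {
        return pdfUnits / PdfUnitsPerInch;
    }

    // Converts a length in PDF units to pixels at the given resolution (dots per inch).
    // Rounding to the nearest pixel is an assumption made for this sketch.
    public static int ToPixels(double pdfUnits, double resolution)
    {
        return (int)Math.Round(ToInches(pdfUnits) * resolution);
    }
}
```

For a 612 by 792 page, PageUnits.ToInches returns 8.5 and 11, and at a resolution of 300, PageUnits.ToPixels returns 2550 and 3300.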
The total number of pages in the document is Pages.Count.
Each PDFDocumentPage object can also be populated with the various PDF native objects located in the corresponding page of the original PDF document. When you first create a PDFDocument object from a PDF file, all the collections described below have a value of null (Nothing in Visual Basic). You can populate the collections by using the PDFDocument.ParsePages method. Each parsed page will have the collections described below populated with the objects found in the file, depending on the value of PDFParsePagesOptions passed as the options parameter.
After the ParsePages method returns, the following properties will be initialized as follows:
Fonts: (If PDFParsePagesOptions.Fonts is used) will contain a list of PDFFont objects, one for each font found in the page. If no fonts are found in the page, the property is initialized with an empty list.
Objects: (If PDFParsePagesOptions.Objects is used) will contain a list of PDFObject objects, one for each text item (character), image or rectangle found in the page. If no objects are found in the page, the property is initialized with an empty list.
Hyperlinks: (If PDFParsePagesOptions.Hyperlinks is used) will contain a list of PDFHyperlink objects, one for each hyperlink found in the page. If no hyperlinks are found in the page, the property is initialized with an empty list.
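Because a collection stays null until ParsePages is called with the matching option, code that parses with only a subset of PDFParsePagesOptions may want to guard its loops. A minimal sketch of such a guard follows; the PageLists class and the OrEmpty method are hypothetical names introduced for illustration and are not part of Leadtools.Pdf.

```csharp
using System.Collections.Generic;

// Hypothetical helper (not part of Leadtools.Pdf).
public static class PageLists
{
    // Returns the list itself when it has been populated, or an empty
    // array (which implements IList<T>) when the list is still null.
    public static IList<T> OrEmpty<T>(IList<T> list)
    {
        return list ?? new T[0];
    }
}
```

With this in place, a loop such as foreach (PDFFont font in PageLists.OrEmpty(page.Fonts)) runs zero times for a page whose fonts were not parsed, instead of throwing a NullReferenceException.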
The PDFDocumentPage object also contains the ConvertPoint and ConvertRect helper methods, which convert a point or a rectangle between page/object coordinates and pixel/inch coordinates.
Imports System.Collections.Generic
Imports System.IO
Imports Leadtools.Pdf

Public Sub PDFDocumentPageExample()
   Dim pdfFileName As String = Path.Combine(LEAD_VARS.ImagesDir, "LEAD.pdf")
   Dim txtFileName As String = Path.Combine(LEAD_VARS.ImagesDir, "LEAD_pdf.txt")

   ' Open the document
   Using document As New PDFDocument(pdfFileName)
      ' Parse everything, for all pages
      Dim options As PDFParsePagesOptions = PDFParsePagesOptions.All
      document.ParsePages(options, 1, -1)

      ' Save the results to the text file for examining
      Using writer As StreamWriter = File.CreateText(txtFileName)
         For Each page As PDFDocumentPage In document.Pages
            writer.WriteLine("Page {0}", page.PageNumber)

            Dim fonts As IList(Of PDFFont) = page.Fonts
            ' Note, no need to check if fonts is Nothing since we passed .All
            ' This will either get the fonts or an empty list. Same for all
            ' the other objects
            writer.WriteLine("Fonts: {0}", fonts.Count)
            For Each font As PDFFont In fonts
               writer.WriteLine(" FaceName: {0}", font.FaceName)
               writer.WriteLine(" FontStyle: {0}", font.FontStyle.ToString())
               writer.WriteLine("------")
            Next
            writer.WriteLine("---------------------")

            Dim objects As IList(Of PDFObject) = page.Objects
            writer.WriteLine("Objects: {0}", objects.Count)
            For Each obj As PDFObject In objects
               writer.WriteLine(" ObjectType: {0}", obj.ObjectType.ToString())
               writer.WriteLine(" Bounds: {0}, {1}, {2}, {3}", obj.Bounds.Left, obj.Bounds.Top, obj.Bounds.Right, obj.Bounds.Bottom)
               WriteTextProperties(writer, obj.TextProperties)
               writer.WriteLine(" Code: {0}", obj.Code)
               writer.WriteLine("------")
            Next
            writer.WriteLine("---------------------")

            Dim hyperlinks As IList(Of PDFHyperlink) = page.Hyperlinks
            writer.WriteLine("Hyperlinks: {0}", hyperlinks.Count)
            For Each hyperlink As PDFHyperlink In hyperlinks
               writer.WriteLine(" Hyperlink: {0}", hyperlink.Hyperlink)
               writer.WriteLine(" Bounds: {0}, {1}, {2}, {3}", hyperlink.Bounds.Left, hyperlink.Bounds.Top, hyperlink.Bounds.Right, hyperlink.Bounds.Bottom)
               WriteTextProperties(writer, hyperlink.TextProperties)
            Next
            writer.WriteLine("---------------------")
         Next
      End Using
   End Using
End Sub

' Writes the members of a PDFTextProperties value, one per line
Private Shared Sub WriteTextProperties(ByVal writer As StreamWriter, ByVal textProperties As PDFTextProperties)
   writer.WriteLine(" TextProperties.FontHeight: {0}", textProperties.FontHeight.ToString())
   writer.WriteLine(" TextProperties.FontWidth: {0}", textProperties.FontWidth.ToString())
   writer.WriteLine(" TextProperties.FontIndex: {0}", textProperties.FontIndex.ToString())
   writer.WriteLine(" TextProperties.IsEndOfWord: {0}", textProperties.IsEndOfWord.ToString())
   writer.WriteLine(" TextProperties.IsEndOfLine: {0}", textProperties.IsEndOfLine.ToString())
   writer.WriteLine(" TextProperties.Color: {0}", textProperties.Color.ToString())
End Sub

Public NotInheritable Class LEAD_VARS
   Public Const ImagesDir As String = "C:\Users\Public\Documents\LEADTOOLS Images"
End Class
using System.Collections.Generic;
using System.IO;
using Leadtools.Pdf;

public void PDFDocumentPageExample()
{
   string pdfFileName = Path.Combine(LEAD_VARS.ImagesDir, @"LEAD.pdf");
   string txtFileName = Path.Combine(LEAD_VARS.ImagesDir, @"LEAD_pdf.txt");

   // Open the document
   using(PDFDocument document = new PDFDocument(pdfFileName))
   {
      // Parse everything, for all pages
      PDFParsePagesOptions options = PDFParsePagesOptions.All;
      document.ParsePages(options, 1, -1);

      // Save the results to the text file for examining
      using(StreamWriter writer = File.CreateText(txtFileName))
      {
         foreach(PDFDocumentPage page in document.Pages)
         {
            writer.WriteLine("Page {0}", page.PageNumber);

            IList<PDFFont> fonts = page.Fonts;
            // Note, no need to check if fonts is null since we passed .All
            // This will either get the fonts or an empty list. Same for all
            // the other objects
            writer.WriteLine("Fonts: {0}", fonts.Count);
            foreach(PDFFont font in fonts)
            {
               writer.WriteLine(" FaceName: {0}", font.FaceName);
               writer.WriteLine(" FontStyle: {0}", font.FontStyle.ToString());
               writer.WriteLine("------");
            }
            writer.WriteLine("---------------------");

            IList<PDFObject> objects = page.Objects;
            writer.WriteLine("Objects: {0}", objects.Count);
            foreach(PDFObject obj in objects)
            {
               writer.WriteLine(" ObjectType: {0}", obj.ObjectType.ToString());
               writer.WriteLine(" Bounds: {0}, {1}, {2}, {3}", obj.Bounds.Left, obj.Bounds.Top, obj.Bounds.Right, obj.Bounds.Bottom);
               WriteTextProperties(writer, obj.TextProperties);
               writer.WriteLine(" Code: {0}", obj.Code);
               writer.WriteLine("------");
            }
            writer.WriteLine("---------------------");

            IList<PDFHyperlink> hyperlinks = page.Hyperlinks;
            writer.WriteLine("Hyperlinks: {0}", hyperlinks.Count);
            foreach(PDFHyperlink hyperlink in hyperlinks)
            {
               writer.WriteLine(" Hyperlink: {0}", hyperlink.Hyperlink);
               writer.WriteLine(" Bounds: {0}, {1}, {2}, {3}", hyperlink.Bounds.Left, hyperlink.Bounds.Top, hyperlink.Bounds.Right, hyperlink.Bounds.Bottom);
               WriteTextProperties(writer, hyperlink.TextProperties);
            }
            writer.WriteLine("---------------------");
         }
      }
   }
}

// Writes the members of a PDFTextProperties value, one per line
private static void WriteTextProperties(StreamWriter writer, PDFTextProperties textProperties)
{
   writer.WriteLine(" TextProperties.FontHeight: {0}", textProperties.FontHeight.ToString());
   writer.WriteLine(" TextProperties.FontWidth: {0}", textProperties.FontWidth.ToString());
   writer.WriteLine(" TextProperties.FontIndex: {0}", textProperties.FontIndex.ToString());
   writer.WriteLine(" TextProperties.IsEndOfWord: {0}", textProperties.IsEndOfWord.ToString());
   writer.WriteLine(" TextProperties.IsEndOfLine: {0}", textProperties.IsEndOfLine.ToString());
   writer.WriteLine(" TextProperties.Color: {0}", textProperties.Color.ToString());
}

static class LEAD_VARS
{
   public const string ImagesDir = @"C:\Users\Public\Documents\LEADTOOLS Images";
}
Target Platforms: Windows 7, Windows Vista SP1 or later, Windows XP SP3, Windows Server 2008 (Server Core not supported), Windows Server 2008 R2 (Server Core supported with SP1 or later), Windows Server 2003 SP2