LEADTOOLS OCR (Leadtools.Forms.Ocr assembly)

IOcrPage Interface

Members 
Defines an image page in an OCR document.
Object Model
Syntax
public interface IOcrPage : System.IDisposable  
'Declaration
 
Public Interface IOcrPage 
   Inherits System.IDisposable 
'Usage
 
Dim instance As IOcrPage
@interface LTOcrPage : NSObject
public class OcrPage
function Leadtools.Forms.Ocr.IOcrPage() System.IDisposable 
public interface class IOcrPage : public System.IDisposable  
Remarks

IOcrPage defines a page that has been added to the OCR engine. Each page holds the raster image used to create it (the image supplied when the page was loaded or added) and a group of OCR zones, added either manually or through auto-zoning.

Pages can be stand-alone or part of an IOcrDocument. To create a stand-alone page, use IOcrEngine.CreatePage. To create pages as part of IOcrDocument, use the IOcrDocument.Pages collection.

For information on how to create memory-based or file-based documents, or how to load file-based documents from disk, refer to IOcrDocumentManager.CreateDocument and Programming with the LEADTOOLS .NET OCR.

Memory-Based Documents

You can access the pages inside a memory-based OCR document (IOcrDocument) through the IOcrDocument.Pages property. The value of this property is an IOcrPageCollection interface. This interface implements the standard .NET ICollection`1, IList`1, and IEnumerable`1 interfaces, so you can use the members of these interfaces to add, remove, get, set and iterate through the pages of the document.

In memory-based documents, you cannot create IOcrPage objects directly. Instead, add pages to the engine through the various AddPage, AddPages, InsertPage and InsertPages methods of the IOcrPageCollection interface. Once a page is added, access it by index to get the IOcrPage object associated with it.

Pages obtained this way do not need to be disposed. The owner IOcrDocument will automatically destroy the pages when it is disposed.
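
As a minimal sketch of the access pattern described above, the following C# helper walks the pages of a document that already has pages added to it. The class and method names (OcrPageListing, ListPages) are hypothetical; the sketch relies only on the Count property, indexer and enumerator that the standard collection interfaces provide, plus the Width, Height and IsRecognized members used in the example further down.

```csharp
using System;
using Leadtools.Forms.Ocr;

// Hypothetical helper class, not part of LEADTOOLS
static class OcrPageListing
{
   // Prints basic information about every page in ocrDocument.
   // The pages are owned by the document, so they are not disposed here.
   public static void ListPages(IOcrDocument ocrDocument)
   {
      IOcrPageCollection pages = ocrDocument.Pages;
      Console.WriteLine("Document has {0} page(s)", pages.Count);

      // Access by index through the IList`1 indexer
      for (int i = 0; i < pages.Count; i++)
      {
         IOcrPage ocrPage = pages[i];
         Console.WriteLine("Page {0}: {1} by {2} pixels", i, ocrPage.Width, ocrPage.Height);
      }

      // Or iterate with foreach through IEnumerable`1
      foreach (IOcrPage ocrPage in pages)
         Console.WriteLine("Recognized: {0}", ocrPage.IsRecognized);
   }
}
```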

File-Based Documents

Usually, you create a page directly using IOcrEngine.CreatePage. You can then use the usual IOcrPage methods to zone and recognize the page. If the page must be saved to a final output format, add it to a file-based IOcrDocument using the IOcrPageCollection.Add member of Pages.

Pages obtained through CreatePage must be destroyed by the user using the Dispose method.
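
The stand-alone workflow might look like the C# sketch below. It is an outline under stated assumptions rather than a definitive implementation: the class and method names are hypothetical, and the sketch assumes that CreatePage accepts a RasterImage plus an OcrImageSharingMode value, that IOcrEngine exposes a RasterCodecsInstance whose Load method takes a file name and page number, and that CreateDocument has an overload taking a file name and OcrCreateDocumentOptions. Check these signatures against your version of the toolkit.

```csharp
using System;
using Leadtools;
using Leadtools.Forms.DocumentWriters;
using Leadtools.Forms.Ocr;

// Hypothetical helper class, not part of LEADTOOLS
static class StandAlonePageSketch
{
   // ocrEngine is assumed to have been started already with Startup
   public static void RecognizeToPdf(IOcrEngine ocrEngine, string imageFileName, string pdfFileName)
   {
      // Assumption: RasterCodecsInstance.Load(fileName, pageNumber) loads the first page
      RasterImage image = ocrEngine.RasterCodecsInstance.Load(imageFileName, 1);

      // Assumption: AutoDispose hands ownership of the image to the page.
      // The using statement disposes the stand-alone page when done.
      using (IOcrPage ocrPage = ocrEngine.CreatePage(image, OcrImageSharingMode.AutoDispose))
      {
         ocrPage.AutoZone(null);
         ocrPage.Recognize(null);

         // Assumption: a null file name with AutoDeleteFile creates a
         // temporary file-based document that is removed on dispose
         using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.AutoDeleteFile))
         {
            ocrDocument.Pages.Add(ocrPage);
            ocrDocument.Save(pdfFileName, DocumentFormat.Pdf, null);
         }
      }
   }
}
```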

Each page contains a collection of OCR zones, which can be accessed with the Zones member. This member implements the IOcrZoneCollection interface, which in turn implements the same standard .NET collection interfaces as IOcrPageCollection. Hence, you can use Zones to add, remove, get, set and iterate through the zones in the page.
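
Because the zone collection behaves like a standard IList`1, ordinary list code applies to it. The C# sketch below is a generic helper (the names ListSketches and RemoveWhere are hypothetical and not part of LEADTOOLS) that removes items matching a condition while iterating backwards, so the indexes of the remaining items stay valid.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of LEADTOOLS
static class ListSketches
{
   // Removes every item for which shouldRemove returns true and
   // returns the number of items removed. Walking the list backwards
   // keeps the remaining indexes valid while items are removed.
   public static int RemoveWhere<T>(IList<T> list, Predicate<T> shouldRemove)
   {
      int removed = 0;
      for (int i = list.Count - 1; i >= 0; i--)
      {
         if (shouldRemove(list[i]))
         {
            list.RemoveAt(i);
            removed++;
         }
      }
      return removed;
   }
}
```

With a page at hand, a call such as ListSketches.RemoveWhere(ocrPage.Zones, zone => zone.ZoneType != OcrZoneType.Text) would keep only the text zones before recognition, assuming OcrZoneType.Text is the zone type of interest.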

After optionally manipulating the zones inside a page, call Recognize to collect the recognition data of the page. This data is stored internally in the page and can later be saved to one of the many document file formats supported by the engine such as PDF or Microsoft Word.

After a page is recognized, you can examine and modify the recognition data (characters and words) through the GetRecognizedCharacters and SetRecognizedCharacters methods. The GetText method can be used to obtain the recognition data as a simple string object.
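
A short C# sketch of reading the recognition data follows. The class and method names are hypothetical. It also assumes that GetText takes a zone index where -1 means all zones, and that GetRecognizedCharacters returns an IOcrPageCharacters collection holding one IOcrZoneCharacters list per zone; verify both against your toolkit version.

```csharp
using System;
using Leadtools.Forms.Ocr;

// Hypothetical helper class, not part of LEADTOOLS
static class RecognitionDataSketch
{
   public static void ShowRecognitionData(IOcrPage ocrPage)
   {
      // Recognition data only exists after Recognize has been called
      if (!ocrPage.IsRecognized)
         ocrPage.Recognize(null);

      // Assumption: a zone index of -1 returns the text of all zones
      string text = ocrPage.GetText(-1);
      Console.WriteLine(text);

      // Assumption: one IOcrZoneCharacters entry per zone, each a list of characters
      IOcrPageCharacters pageCharacters = ocrPage.GetRecognizedCharacters();
      int zoneNumber = 0;
      foreach (IOcrZoneCharacters zoneCharacters in pageCharacters)
      {
         Console.WriteLine("Zone {0}: {1} character(s)", zoneNumber, zoneCharacters.Count);
         zoneNumber++;
      }
   }
}
```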

Once an IOcrPage object is obtained, you can do the following:

Example

This example creates an OCR document, adds a page to it, displays various information about the page, and then saves it as a PDF file.

Imports System
Imports System.IO
Imports Leadtools
Imports Leadtools.Codecs
Imports Leadtools.Forms.Ocr
Imports Leadtools.Forms
Imports Leadtools.Forms.DocumentWriters
Imports Leadtools.WinForms
Imports Leadtools.Drawing
Imports Leadtools.ImageProcessing
Imports Leadtools.ImageProcessing.Color

<TestMethod>
Public Sub OcrPageExample()
   Dim tifFileName As String = Path.Combine(LEAD_VARS.ImagesDir, "Ocr1.tif")
   Dim pdfFileName As String = Path.Combine(LEAD_VARS.ImagesDir, "Ocr1.pdf")
   ' Create an instance of the engine
   Using ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, False)
      ' Start the engine using default parameters
      ocrEngine.Startup(Nothing, Nothing, Nothing, LEAD_VARS.OcrAdvantageRuntimeDir)

      ' Create an OCR document
      Using ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument()
         ' Add this image to the document
         Dim ocrPage As IOcrPage = ocrDocument.Pages.AddPage(tifFileName, Nothing)

         ' Auto-recognize the zones in the page
         ocrPage.AutoZone(Nothing)

         ' Show its information
         Console.WriteLine("Size: {0} by {1} pixels", ocrPage.Width, ocrPage.Height)
         Console.WriteLine("Resolution: {0} by {1} dots/inch", ocrPage.DpiX, ocrPage.DpiY)
         Console.WriteLine("Bits/Pixel: {0}, Bytes/Line: {1}", ocrPage.BitsPerPixel, ocrPage.BytesPerLine)

         Dim palette As Byte() = ocrPage.GetPalette()
         Dim paletteEntries As Integer
         If palette IsNot Nothing Then
            paletteEntries = palette.Length \ 3
         Else
            paletteEntries = 0
         End If

         Console.WriteLine("Number of entries in the palette: {0}", paletteEntries)
         Console.WriteLine("Original format of this page: {0}", ocrPage.OriginalFormat)
         Console.WriteLine("Has this page been recognized? : {0}", ocrPage.IsRecognized)
         ShowZonesInfo(ocrPage)

         ' Recognize it and save it as PDF
         ocrPage.Recognize(Nothing)
         ocrDocument.Save(pdfFileName, DocumentFormat.Pdf, Nothing)
      End Using

      ' Shutdown the engine
      ' Note: calling Dispose will also automatically shutdown the engine if it has been started
      ocrEngine.Shutdown()
   End Using
End Sub

Private Sub ShowZonesInfo(ocrPage As IOcrPage)
   Console.WriteLine("Zones:")
   For Each ocrZone As OcrZone In ocrPage.Zones
      Dim index As Integer = ocrPage.Zones.IndexOf(ocrZone)
      Console.WriteLine("Zone index: {0}", index)
      Console.WriteLine("  Id                  {0}", ocrZone.Id)
      Console.WriteLine("  Bounds              {0}", ocrZone.Bounds)
      Console.WriteLine("  ZoneType            {0}", ocrZone.ZoneType)
      Console.WriteLine("  CharacterFilters:   {0}", ocrZone.CharacterFilters)
      Console.WriteLine("----------------------------------")
   Next
End Sub

Public NotInheritable Class LEAD_VARS
   Public Const ImagesDir As String = "C:\Users\Public\Documents\LEADTOOLS Images"
   Public Const OcrAdvantageRuntimeDir As String = "C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime"
End Class
using System;
using System.IO;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Forms.Ocr;
using Leadtools.Forms;
using Leadtools.Forms.DocumentWriters;
using Leadtools.WinForms;
using Leadtools.Drawing;
using Leadtools.ImageProcessing;
using Leadtools.ImageProcessing.Color;

public void OcrPageExample()
{
   string tifFileName = Path.Combine(LEAD_VARS.ImagesDir, "Ocr1.tif");
   string pdfFileName = Path.Combine(LEAD_VARS.ImagesDir, "Ocr1.pdf");
   // Create an instance of the engine
   using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false))
   {
      // Start the engine using default parameters
      ocrEngine.Startup(null, null, null, LEAD_VARS.OcrAdvantageRuntimeDir);

      // Create an OCR document
      using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
      {
         // Add this image to the document
         IOcrPage ocrPage = ocrDocument.Pages.AddPage(tifFileName, null);

         // Auto-recognize the zones in the page
         ocrPage.AutoZone(null);

         // Show its information
         Console.WriteLine("Size: {0} by {1} pixels", ocrPage.Width, ocrPage.Height);
         Console.WriteLine("Resolution: {0} by {1} dots/inch", ocrPage.DpiX, ocrPage.DpiY);
         Console.WriteLine("Bits/Pixel: {0}, Bytes/Line: {1}", ocrPage.BitsPerPixel, ocrPage.BytesPerLine);

         byte[] palette = ocrPage.GetPalette();
         int paletteEntries;
         if (palette != null)
            paletteEntries = palette.Length / 3;
         else
            paletteEntries = 0;

         Console.WriteLine("Number of entries in the palette: {0}", paletteEntries);
         Console.WriteLine("Original format of this page: {0}", ocrPage.OriginalFormat);
         Console.WriteLine("Has this page been recognized? : {0}", ocrPage.IsRecognized);
         ShowZonesInfo(ocrPage);

         // Recognize it and save it as PDF
         ocrPage.Recognize(null);
         ocrDocument.Save(pdfFileName, DocumentFormat.Pdf, null);
      }

      // Shutdown the engine
      // Note: calling Dispose will also automatically shutdown the engine if it has been started
      ocrEngine.Shutdown();
   }
}

private void ShowZonesInfo(IOcrPage ocrPage)
{
   Console.WriteLine("Zones:");
   foreach (OcrZone ocrZone in ocrPage.Zones)
   {
      int index = ocrPage.Zones.IndexOf(ocrZone);
      Console.WriteLine("Zone index: {0}", index);
      Console.WriteLine("  Id                  {0}", ocrZone.Id);
      Console.WriteLine("  Bounds              {0}", ocrZone.Bounds);
      Console.WriteLine("  ZoneType            {0}", ocrZone.ZoneType);
      Console.WriteLine("  CharacterFilters:   {0}", ocrZone.CharacterFilters);
      Console.WriteLine("----------------------------------");
   }
}

static class LEAD_VARS
{
   public const string ImagesDir = @"C:\Users\Public\Documents\LEADTOOLS Images";
   public const string OcrAdvantageRuntimeDir = @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime";
}
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Windows.Storage;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Controls;
using Leadtools.Forms.Ocr;
using Leadtools.Forms;
using Leadtools.Forms.DocumentWriters;
using Leadtools.ImageProcessing;

      
public async Task OcrPageExample()
{
   string tifFileName = @"Assets\Ocr1.tif";
   string pdfFileName = "Ocr1.pdf";
   // Create an instance of the engine
   IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false);

   // Start the engine using default parameters
   ocrEngine.Startup(null, null, String.Empty, Tools.OcrEnginePath);

   // Create an OCR document
   IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument();

   // Add this image to the document
   IOcrPage ocrPage = null;
   using (RasterCodecs codecs = new RasterCodecs())
   {
      StorageFile loadFile = await Tools.AppInstallFolder.GetFileAsync(tifFileName);
      using (RasterImage image = await codecs.LoadAsync(LeadStreamFactory.Create(loadFile)))
         ocrPage = ocrDocument.Pages.AddPage(image, null);
   }

   // Auto-recognize the zones in the page
   ocrPage.AutoZone(null);

   // Show its information
   Debug.WriteLine("Size: {0} by {1} pixels", ocrPage.Width, ocrPage.Height);
   Debug.WriteLine("Resolution: {0} by {1} dots/inch", ocrPage.DpiX, ocrPage.DpiY);
   Debug.WriteLine("Bits/Pixel: {0}, Bytes/Line: {1}", ocrPage.BitsPerPixel, ocrPage.BytesPerLine);

   byte[] palette = ocrPage.GetPalette();
   int paletteEntries;
   if(palette != null)
      paletteEntries = palette.Length / 3;
   else
      paletteEntries = 0;

   Debug.WriteLine("Number of entries in the palette: {0}", paletteEntries);
   Debug.WriteLine("Original format of this page: {0}", ocrPage.OriginalFormat);
   Debug.WriteLine("Has this page been recognized? : {0}", ocrPage.IsRecognized);
   ShowZonesInfo(ocrPage);

   // Recognize it and save it as PDF
   ocrPage.Recognize(null);
   StorageFile saveFile = await Tools.AppLocalFolder.CreateFileAsync(pdfFileName, CreationCollisionOption.ReplaceExisting);
   await ocrDocument.SaveAsync(LeadStreamFactory.Create(saveFile), DocumentFormat.Pdf, null);

   // Shutdown the engine
   ocrEngine.Shutdown();
}

private void ShowZonesInfo(IOcrPage ocrPage)
{
   Debug.WriteLine("Zones:");
   foreach(OcrZone ocrZone in ocrPage.Zones)
   {
      int index = ocrPage.Zones.IndexOf(ocrZone);
      Debug.WriteLine("Zone index: {0}", index);
      Debug.WriteLine("  Id                  {0}", ocrZone.Id);
      Debug.WriteLine("  Bounds              {0}", ocrZone.Bounds);
      Debug.WriteLine("  ZoneType            {0}", ocrZone.ZoneType);
      Debug.WriteLine("  FillMethod:         {0}", ocrZone.FillMethod);
      Debug.WriteLine("  RecognitionModule:  {0}", ocrZone.RecognitionModule);
      Debug.WriteLine("  CharacterFilters:   {0}", ocrZone.CharacterFilters);
      Debug.WriteLine("----------------------------------");
   }
}
Requirements

Target Platforms

See Also

Reference

IOcrPage Members
Leadtools.Forms.Ocr Namespace
OcrEngineManager Class
OcrEngineType Enumeration
IOcrPageCollection Interface
IOcrZoneCollection Interface
OcrZone Structure
Programming with the LEADTOOLS .NET OCR
Working with OCR Pages

Leadtools.Forms.Ocr requires a Recognition or Document Imaging Suite license and unlock key. For more information, refer to: LEADTOOLS Toolkit Features