LEADTOOLS Support
Document
Document SDK Examples
How to remove Graphic zone types from a PDF while using LEADTOOLS OCR
Posted: Wednesday, June 7, 2017 3:48:03 PM (UTC)
There are multiple OCR zone types that can be found when the LEADTOOLS OCR engine calls AutoZone to detect all the zones on a document. Sometimes the zoning may detect more zones than you would like to recognize.
The code below demonstrates how to remove Graphic zones from the collection of zones that AutoZone found. It also saves the remaining zones to a file, prints each zone's index, Id, bounds and type, recognizes the document, and saves the recognized text as a text file.
Code:
// Namespaces used (LEADTOOLS 19): System, Leadtools, Leadtools.Codecs,
// Leadtools.Forms.Ocr, Leadtools.Forms.DocumentWriters
// "input" and "output" are assumed to be file paths defined elsewhere

// Create an instance of the engine
using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false))
{
    // Start the engine using default parameters
    ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime");

    // Create an OCR document
    using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
    {
        RasterImage image = ocrEngine.RasterCodecsInstance.Load(input, 0, CodecsLoadByteOrder.Rgb, 1, -1);

        // Add this image to the document
        IOcrPage ocrPage = ocrDocument.Pages.AddPage(image, null);

        // Perform default AutoZoning on the page
        ocrPage.AutoZone(null);

        // Remove every Graphic zone. Walk the collection backwards so that
        // removing a zone does not shift the index of a zone not yet visited.
        for (int i = ocrPage.Zones.Count - 1; i >= 0; i--)
        {
            if (ocrPage.Zones[i].ZoneType == OcrZoneType.Graphic)
            {
                ocrPage.Zones.RemoveAt(i);
            }
        }

        // Save the remaining zones
        ocrPage.SaveZones(@"SAVE LOCATION.ozf");

        // Print the index, Id, bounds and type of each remaining zone
        for (int index = 0; index < ocrPage.Zones.Count; index++)
        {
            OcrZone ocrZone = ocrPage.Zones[index];
            Console.WriteLine("Zone index: {0}", index);
            Console.WriteLine(" Id {0}", ocrZone.Id);
            Console.WriteLine(" Bounds {0}", ocrZone.Bounds);
            Console.WriteLine(" ZoneType {0}", ocrZone.ZoneType);
            Console.WriteLine("----------------------------------");
        }

        // Recognize all pages in the document
        // (to recognize a single page instead, use ocrPage.Recognize(null))
        ocrDocument.Pages.Recognize(null);

        // Save the recognized text as a plain text file
        ocrDocument.Save(output, DocumentFormat.Text, null);
    }

    // Shutdown the engine
    // Note: calling Dispose will also automatically shutdown the engine if it has been started
    ocrEngine.Shutdown();
}
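A page can contain several Graphic zones, and removing items from a list while walking it forward can skip elements or invalidate a foreach enumerator. The following is a minimal sketch of the backward-index removal pattern that does not depend on LEADTOOLS. ZoneKind, ZoneFilter and RemoveAllOfKind are hypothetical names used only for illustration, with a plain List<T> standing in for ocrPage.Zones and ZoneKind standing in for OcrZoneType.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for OcrZoneType; this is not the LEADTOOLS enum.
enum ZoneKind { Text, Graphic, Table }

static class ZoneFilter
{
    // Removes every item of the given kind and returns how many were removed.
    // The loop runs from the last index to the first, so RemoveAt never
    // shifts the position of an item that has not been examined yet.
    public static int RemoveAllOfKind(IList<ZoneKind> zones, ZoneKind kind)
    {
        int removed = 0;
        for (int i = zones.Count - 1; i >= 0; i--)
        {
            if (zones[i] == kind)
            {
                zones.RemoveAt(i);
                removed++;
            }
        }
        return removed;
    }
}

static class Program
{
    static void Main()
    {
        // Sample data: three Graphic entries, two of them adjacent.
        var zones = new List<ZoneKind>
        {
            ZoneKind.Text, ZoneKind.Graphic, ZoneKind.Graphic,
            ZoneKind.Table, ZoneKind.Graphic
        };

        int removed = ZoneFilter.RemoveAllOfKind(zones, ZoneKind.Graphic);

        // Prints "Removed 3, remaining: Text, Table"
        Console.WriteLine("Removed {0}, remaining: {1}", removed, string.Join(", ", zones));
    }
}
```

The same loop shape should carry over to any collection that exposes Count, an indexer and RemoveAt. Whether a particular LEADTOOLS zone collection tolerates removal inside foreach is not something this sketch verifies.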
Nick Villalobos
Developer Support Engineer
LEAD Technologies, Inc.