Posted: Friday, May 26, 2017 11:08:29 AM (UTC)
LEADTOOLS OCR allows you to extract text from images in any of the 40+ languages that we support. The supported engines can intelligently detect the language in the image and return the results as a text string. Earlier this week I worked with a customer who was curious whether our OCR engine could detect multiple languages on the same page, and the answer is yes.
While you can use the DetectLanguage method of the IOcrLanguageManager, it detects one language for the entire page from a list of languages you provide. Instead, to detect multiple languages on the same page, you can leverage the Language property of the OcrZone structure. Setting this property to null or an empty string will trigger the language detection. See the following chart for additional usage information:
I put together a quick test to showcase the language detection performance our OCR engine offers. The code is below, and the test file is attached.
Code:
// Namespaces used (LEADTOOLS 19): System, System.Globalization, Leadtools, Leadtools.Codecs, Leadtools.Forms.Ocr
string inputFile = @"PATH TO IMAGE FILE";
using (RasterCodecs codecs = new RasterCodecs())
{
    // Load document formats (such as PDF) at 300 DPI
    codecs.Options.RasterizeDocument.Load.XResolution = 300;
    codecs.Options.RasterizeDocument.Load.YResolution = 300;
    using (RasterImage image = codecs.Load(inputFile))
    using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false))
    {
        // The last argument is the path to the OCR runtime files
        ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime");
        // Enable as many languages as you want, need, or think your application will encounter
        ocrEngine.LanguageManager.EnableLanguages(new string[] { "en", "de", "es", "pt", "uk", "el", "cs" });
        using (IOcrDocument document = ocrEngine.DocumentManager.CreateDocument())
        {
            document.Pages.AddPage(image, null);
            document.Pages[0].Recognize(null);
            // Report the language of each zone on the page
            for (int i = 0; i < document.Pages[0].Zones.Count; i++)
            {
                OcrZone currentZone = document.Pages[0].Zones[i];
                // Language is already a string, so no ToString() call is needed
                CultureInfo info = new CultureInfo(currentZone.Language);
                Console.WriteLine($"Zone#: {i} contains {info.EnglishName} text");
            }
        }
    }
}
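One thing to watch in the loop above: each zone's language code goes straight into the CultureInfo constructor, which throws on a null name and can throw for codes .NET does not recognize. Here is a minimal sketch of a safer lookup, assuming a zone might come back with a missing or unmapped code. ZoneLanguageNames and Describe are hypothetical names of my own, not part of LEADTOOLS; only System.Globalization is used.

```csharp
using System;
using System.Globalization;

public static class ZoneLanguageNames
{
    // Hypothetical helper (not a LEADTOOLS API): converts a zone's language
    // code into a display name without throwing on missing or unknown codes.
    public static string Describe(string languageCode)
    {
        // A null or empty code means no language was reported for the zone.
        if (string.IsNullOrWhiteSpace(languageCode))
            return "an undetected language";

        try
        {
            // GetCultureInfo returns a cached, read-only culture object.
            return CultureInfo.GetCultureInfo(languageCode).EnglishName;
        }
        catch (CultureNotFoundException)
        {
            // Fall back to the raw code when .NET has no matching culture.
            return languageCode;
        }
    }
}
```

In the loop, the two CultureInfo lines could then become Console.WriteLine($"Zone#: {i} contains {ZoneLanguageNames.Describe(currentZone.Language)} text");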
Roberto Rodriguez
Developer Support Engineer
LEAD Technologies, Inc.