OCR Languages and Spell-Checking

The LEADTOOLS .NET OCR Toolkit supports languages and spell-checking through the following separate parts:

The Language Environment

The language environment defines the character set(s) recognized by the OCR engine. For example, if you enable the English and German languages, the German characters (ä, Ä, é, ö, Ö, ü, Ü, ß) will be combined with the English characters to define the set recognized by the engine.

To set the character sets to use in the engine, use the IOcrLanguageManager.EnableLanguages method. To get the character sets supported by the engine, use the IOcrLanguageManager.GetSupportedLanguages and IOcrLanguageManager.IsLanguageSupported methods. You can enable as many character sets as required.
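
The check-then-enable flow described above can be sketched as a small helper. This is a minimal sketch, not toolkit code: the class and method names are hypothetical, and the two delegates stand in for IOcrLanguageManager.IsLanguageSupported and IOcrLanguageManager.EnableLanguages, whose signatures (a string language code in, and a string array in) are assumptions here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of the LEADTOOLS API.
public static class OcrLanguageSetup
{
    // Enables every requested language that the engine reports as supported
    // and returns the requested languages that were skipped.
    // isLanguageSupported and enableLanguages are stand-ins for
    // IOcrLanguageManager.IsLanguageSupported and
    // IOcrLanguageManager.EnableLanguages (signatures assumed).
    public static List<string> EnableSupported(
        IEnumerable<string> requested,
        Func<string, bool> isLanguageSupported,
        Action<string[]> enableLanguages)
    {
        var toEnable = new List<string>();
        var skipped = new List<string>();

        foreach (string language in requested)
        {
            if (isLanguageSupported(language))
                toEnable.Add(language);
            else
                skipped.Add(language);
        }

        // Pass the whole supported set in a single call.
        if (toEnable.Count > 0)
            enableLanguages(toEnable.ToArray());

        return skipped;
    }
}
```

With a started engine, the call might look like OcrLanguageSetup.EnableSupported(new[] { "en", "de" }, manager.IsLanguageSupported, manager.EnableLanguages), where manager is the engine's IOcrLanguageManager and the "en" and "de" codes are assumed to follow the same form as the codes listed later on this page. Taking delegates rather than the interface keeps the helper testable without an engine.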

The language environment does not automatically perform spell-checking. To enable it, use the spell-checking subsystem.

Spell-Checking Subsystem

The functionality of the spell-checking subsystem consists of three separate parts:

LEADTOOLS OCR supports spell-checking and correction through the use of external dictionaries. The value of IOcrSpellCheckManager.SpellCheckEngine acts as a global switch to use a particular spell checker or turn spell checking off.

When you set the IOcrSpellCheckManager.SpellCheckEngine property to a value other than None, the OCR engine automatically tries to load the requested spell checker and queries the language dictionaries found on your machine. You can change SpellCheckEngine at any time during the lifetime of the IOcrEngine, depending on your application's needs. For example, you can disable spell-checking while performing recognition on certain types of documents and then re-enable it for other types.

To determine which languages have a dictionary available in an engine, use IOcrSpellCheckManager.GetSupportedSpellLanguages. Only one language dictionary can be used at a time inside the engine.
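
Because only one dictionary can be active at a time, an application that enables several languages has to pick one of them for spell-checking. The sketch below shows one way to make that choice. The class and method names are hypothetical, and it assumes the enabled languages and the result of IOcrSpellCheckManager.GetSupportedSpellLanguages are both available as collections of string language codes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of the LEADTOOLS API.
public static class OcrSpellSetup
{
    // Picks the first enabled language that also has a dictionary.
    // The order of enabledLanguages expresses the caller's preference.
    // Returns false (and an empty string) when no enabled language
    // has a dictionary.
    public static bool TryChooseSpellLanguage(
        IEnumerable<string> enabledLanguages,
        IEnumerable<string> supportedSpellLanguages,
        out string spellLanguage)
    {
        // Compare codes case-insensitively, so "DE" matches "de".
        var withDictionary = new HashSet<string>(
            supportedSpellLanguages, StringComparer.OrdinalIgnoreCase);

        foreach (string language in enabledLanguages)
        {
            if (withDictionary.Contains(language))
            {
                spellLanguage = language;
                return true;
            }
        }

        spellLanguage = string.Empty;
        return false;
    }
}
```

When the method returns false, one reasonable choice is to set SpellCheckEngine to None so that recognition runs without spell-checking. When it returns true, the chosen code would be handed to the spell-check manager. This page does not name the member used to select the dictionary, so refer to IOcrSpellCheckManager for that step.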

Language Character Sets Supported by Engine

For more information, refer to IOcrLanguageManager.

LEADTOOLS OCR Module - LEAD Engine

OmniPage Engine

And the following Asian character sets (available with the Asian OCR Module):

Chinese Simplified (zh-Hans)
Chinese Traditional (zh-Hant)
Japanese (ja)
Korean (ko)

Arabic OCR Engine

Arabic (ar)

Language Dictionary Support, by Engine

For more information, refer to IOcrSpellCheckManager.

LEADTOOLS OCR Module - LEAD Engine

For more information, refer to OcrSpellCheckEngine.

OmniPage Engine

Arabic OCR Engine

Currently, the Arabic OCR engine does not support language dictionaries.

Help Version 20.0.2020.4.3
© 1991-2020 LEAD Technologies, Inc. All Rights Reserved.