The OcrCharacter Structure is available as an add-on to the LEADTOOLS Document and Medical Imaging toolkits.
Recognized character data.
Visual Basic (Declaration) | |
---|---|
<SerializableAttribute()> Public Structure OcrCharacter Inherits System.ValueType |
Visual Basic (Usage) | |
---|---|
Dim instance As OcrCharacter |
C# | |
---|---|
[SerializableAttribute()] public struct OcrCharacter : System.ValueType |
C++/CLI | |
---|---|
[SerializableAttribute()] public value class OcrCharacter : public System.ValueType |
To get the recognized characters of a page, call IOcrPage.GetRecognizedCharacters after IOcrPage.Recognize or IOcrPage.RecognizeText.
To update the recognized characters of a page, call IOcrPage.SetRecognizedCharacters before calling IOcrDocument.Save or IOcrDocument.SaveXml.
IOcrPageCharacters implements the standard IList, ICollection and IEnumerable interfaces with items of type IOcrZoneCharacters. Each item in the IOcrPageCharacters holds the recognized characters of one zone of the page.
The IOcrZoneCharacters interface contains a collection of the characters for a particular zone.
IOcrZoneCharacters also implements the IList, ICollection and IEnumerable interfaces, but with items of type OcrCharacter. Each item in the IOcrZoneCharacters describes a single recognized character of the zone.
For example, if you are interested in iterating through the characters of the 2nd zone in the page, you can do the following:
    // Get the page characters
    IOcrPageCharacters pageCharacters = ocrPage.GetRecognizedCharacters();

    // Get the 2nd zone characters. Note, index is zero-based so 2nd zone is index 1
    // You can also iterate through the pageCharacters collection and find the IOcrZoneCharacters item with ZoneIndex = 1
    IOcrZoneCharacters zoneCharacters = pageCharacters.FindZoneCharacters(1);

    // Loop through the characters
    foreach(OcrCharacter ocrCharacter in zoneCharacters)
    {
       // Do something with ocrCharacter
    }
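Because OcrCharacter is a structure (a value type), reading an item from an IList returns a copy, so a change has to be written back to the collection before it takes effect. The following is a minimal sketch of that copy, modify and write-back pattern. It does not use the toolkit: `SketchCharacter` is a hypothetical stand-in for OcrCharacter, its `Code` and `Confidence` fields are assumed to mirror the members of the same name, and `MaskLowConfidence` is an illustrative helper, not part of LEADTOOLS.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for OcrCharacter, reduced to two fields.
// It exists only so that this sketch compiles without the toolkit.
public struct SketchCharacter
{
    public char Code;       // assumed to mirror OcrCharacter.Code
    public int Confidence;  // assumed to mirror OcrCharacter.Confidence

    public SketchCharacter(char code, int confidence)
    {
        Code = code;
        Confidence = confidence;
    }
}

public static class CharacterUpdates
{
    // Replaces the code of every character whose confidence is below
    // the threshold and returns how many characters were replaced.
    public static int MaskLowConfidence(
        IList<SketchCharacter> zoneCharacters, int threshold, char placeholder)
    {
        int replaced = 0;
        for (int i = 0; i < zoneCharacters.Count; i++)
        {
            // The indexer returns a copy because the item is a struct
            SketchCharacter character = zoneCharacters[i];
            if (character.Confidence < threshold)
            {
                character.Code = placeholder;  // changes the local copy only
                zoneCharacters[i] = character; // write the copy back
                replaced++;
            }
        }
        return replaced;
    }
}
```

With the real interfaces, the same loop would presumably run over an IOcrZoneCharacters instance, and the page characters would then be handed to IOcrPage.SetRecognizedCharacters so that the change reaches the saved document.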
OcrCharacter is the most detailed information available about the recognized characters.
Touching characters, those whose shapes are physically joined in the page passed to the OCR engine, will result in a separate OcrCharacter structure for each recognized character within the block. However, the Bounds property of each of these characters will hold the same coordinates, which define the bounding rectangle of the whole character block. The order of the OcrCharacter structures representing a character block gives the order of the touching characters on the original document; the coordinates alone give no information on the order of characters inside a block.
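One consequence is that a block of touching characters can be detected by looking for consecutive items that share the same rectangle. The sketch below illustrates the idea under stated assumptions: `CharBox` is a hypothetical stand-in for OcrCharacter, its four integer fields stand in for the Bounds rectangle (whose actual type is not reproduced here), and `GroupByBlock` is an illustrative helper rather than a toolkit method.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the parts of OcrCharacter used here:
// the character code and a bounding rectangle.
public struct CharBox
{
    public char Code;
    public int Left, Top, Width, Height;

    public CharBox(char code, int left, int top, int width, int height)
    {
        Code = code;
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }

    // True when both characters report exactly the same rectangle
    public bool SameBounds(CharBox other)
    {
        return Left == other.Left && Top == other.Top &&
               Width == other.Width && Height == other.Height;
    }
}

public static class TouchingCharacters
{
    // Groups consecutive characters that share one bounding rectangle.
    // Collection order is preserved, so each string reads in the same
    // order as the characters appear in the input list.
    public static List<string> GroupByBlock(IList<CharBox> characters)
    {
        List<string> blocks = new List<string>();
        for (int i = 0; i < characters.Count; i++)
        {
            bool continuesBlock =
                i > 0 && characters[i].SameBounds(characters[i - 1]);
            if (continuesBlock)
                blocks[blocks.Count - 1] += characters[i].Code;
            else
                blocks.Add(characters[i].Code.ToString());
        }
        return blocks;
    }
}
```

The helper relies only on list order to sequence the characters inside a block, since identical rectangles cannot say which character comes first.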
System.Object
System.ValueType
Leadtools.Forms.Ocr.OcrCharacter
Target Platforms: Microsoft .NET Framework 2.0, Windows 2000, Windows XP, Windows Server 2003 family, Windows Server 2008 family, Windows Vista, Windows 7
Reference
OcrCharacter Members
Leadtools.Forms.Ocr Namespace
IOcrPage.SetRecognizedCharacters
IOcrPage.GetRecognizedCharacters
IOcrPage.Recognize
IOcrPage.IsRecognized
OcrCharacter Structure
IOcrPageCharacters Interface
IOcrZoneCharacters Interface
IOcrPageCollection Interface
IOcrZoneCollection Interface
OcrZone Structure
Programming with Leadtools .NET OCR
OCR Confidence Reporting