LEADTOOLS Support
Document
Document SDK Questions
OCR word low confidence but all characters max high confidence?
This topic and its replies were posted before the current version of LEADTOOLS was released and may no longer be applicable.
#1
Posted: Thursday, December 21, 2006 4:39:51 AM(UTC)
Groups: Registered
Posts: 15
I am noticing that the OCR engine sometimes returns a 1 for the high bit of a character's confidence property, which the documentation describes as "uncertain", yet the remaining bits of every character in the word are all 0, which the documentation describes as the highest confidence. If every component character in the word has full confidence, what is causing the word confidence to be low?
The documentation states that the converse can happen: "In some cases a word may have some or all characters that are individually suspicious but the characters are not be marked suspicious in the word bit." This makes sense, but there is no mention of what would cause the behavior I've described above.
#2
Posted: Tuesday, December 26, 2006 4:16:00 AM(UTC)
Groups: Guests
Posts: 3,022
Was thanked: 2 time(s) in 2 post(s)
This behavior is possibly caused when the OCR engine recognizes the individual characters of a word, but the word itself is not in the user dictionary. For example, if you got the word "Guten", the characters are not marked suspicious in the word bit, but the word itself is uncertain. This means that the word was validated by the checking subsystem.
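To make the case discussed in this thread easier to spot in your own results, here is a minimal sketch in Python. It is not LEADTOOLS code and does not call any LEADTOOLS API. It assumes the confidence value is a single byte in which the high bit (0x80) is the word "uncertain" flag and the remaining seven bits hold the character confidence, with 0 as the highest confidence, as described in the first post. The names decode_confidence and word_flagged_despite_confident_chars are hypothetical helpers made up for this illustration.

```python
from typing import Iterable, NamedTuple

# Assumed layout (taken from the description in this thread, not from the SDK):
# an 8-bit value whose high bit marks the word as uncertain and whose
# remaining seven bits hold the character confidence (0 = highest).
WORD_UNCERTAIN_BIT = 0x80
CHAR_LEVEL_MASK = 0x7F


class Confidence(NamedTuple):
    word_uncertain: bool  # True when the high (word) bit is set
    char_level: int       # value of the remaining bits; 0 means most confident


def decode_confidence(value: int) -> Confidence:
    """Split one raw confidence value into the word flag and the character level."""
    return Confidence(bool(value & WORD_UNCERTAIN_BIT), value & CHAR_LEVEL_MASK)


def word_flagged_despite_confident_chars(values: Iterable[int]) -> bool:
    """Return True when a word carries the uncertain bit on at least one
    character while every character has the best possible character level."""
    decoded = [decode_confidence(v) for v in values]
    if not decoded:
        return False
    any_word_bit = any(c.word_uncertain for c in decoded)
    all_chars_best = all(c.char_level == 0 for c in decoded)
    return any_word_bit and all_chars_best
```

Passing the raw confidence values of one word's characters to word_flagged_despite_confident_chars picks out words like the "Guten" example, where the characters are individually trusted but the word as a whole is uncertain. If the real confidence value is wider than one byte or laid out differently, the two constants would need to be adjusted.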