LEADTOOLS Support
Document
Document SDK Questions
No recognized text available, either because the zone is empty or the required recognition module ha
This topic and its replies were posted before the current version of LEADTOOLS was released and may no longer be applicable.
#1
Posted: Wednesday, August 8, 2007 6:53:34 AM(UTC)
Groups: Registered
Posts: 32
Hi,
We are using Leadtools 15 with C#.
Even when there is content on the page, we are getting the following error when trying to OCR some images.
-------------
No recognized text available, either because the zone is empty or the required recognition module has not been initialized properly.
-------------
What could be wrong? Please help...
Suresh
#2
Posted: Friday, August 10, 2007 6:57:34 AM(UTC)
Groups: Registered, Tech Support, Administrators
Posts: 764
This error usually means that the image contains no text, that the text cannot be found because it is too close in color to the background, that the image is of poor quality, or some combination of these.
Since it works for some images, I doubt that the latter part of the error "...not been initialized properly" is the cause.
Please send an image or two that reproduces this problem. Make sure that you do not press the preview button before posting, or the attachment will be dropped. If you do not want to post the file publicly, or the file is too large (>5MB), send an email to support@leadtools.com and either attach the file or ask for FTP instructions. Be sure to include a link to this forum post.
#3
Posted: Monday, August 13, 2007 10:01:24 AM(UTC)
Groups: Registered
Posts: 32
Please find one of the images.
#4
Posted: Monday, August 13, 2007 10:02:33 AM(UTC)
Groups: Registered
Posts: 32
Please check the attachment in zipped format (It contains a tif)
#5
Posted: Tuesday, August 14, 2007 4:16:05 AM(UTC)
Groups: Registered, Tech Support, Administrators
Posts: 764
The problem with your file is the background. It appears that this was once a color image that was converted to black and white with some dithering. The dithering used small dots to give the appearance of grayscale, but those background dots interfered with the OCR results. I ran DotRemove on your image and was then able to OCR the text (the dot-removed image is attached).
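For anyone wondering what a dot-removal pass does conceptually, here is a minimal Python sketch. It is not the LEADTOOLS implementation or its API: the function name `remove_dots`, its parameters, and the list-of-lists image format are all assumptions made up for illustration. It simply clears every small, isolated group of black pixels and leaves larger shapes (such as text strokes) alone.

```python
from collections import deque


def remove_dots(image, max_dot_width=2, max_dot_height=2):
    """Illustrative only (hypothetical helper, not a LEADTOOLS call).

    `image` is a list of rows; 1 = black pixel, 0 = white pixel.
    Returns a copy in which every 4-connected black component whose
    bounding box fits inside max_dot_width x max_dot_height is cleared.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue

            # Collect one connected component with a breadth-first search.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))

            # Measure the component's bounding box.
            ys = [p[0] for p in component]
            xs = [p[1] for p in component]
            box_height = max(ys) - min(ys) + 1
            box_width = max(xs) - min(xs) + 1

            # Small enough to count as a dither dot: paint it white.
            if box_width <= max_dot_width and box_height <= max_dot_height:
                for py, px in component:
                    result[py][px] = 0

    return result
```

The size limits are the part worth tuning: set them too high and punctuation or the dots on the letter i may disappear along with the dithering.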
#6
Posted: Wednesday, August 29, 2007 6:27:59 AM(UTC)
Groups: Registered
Posts: 32
Thanks, Greg. Please check the attached picture. It has text that is not recognized: only the top text (the headers) is recognized. How can we extract the text in between?
Thanks,
Suresh
#7
Posted: Wednesday, August 29, 2007 6:29:12 AM(UTC)
Groups: Registered
Posts: 32
Please check the attachment here.
#8
Posted: Thursday, August 30, 2007 6:23:51 AM(UTC)
Groups: Registered, Tech Support, Administrators
Posts: 764
The text in this image (I'm assuming the problem is the light text in the middle) is too light. As you can see, it is heavily half-toned, so the strokes are not solid but broken up by gaps. I used the L_MinFilterBitmap function (MinimumCommand in .NET) to dilate the dark pixels and fill in the gaps left by the half-toning, and was then able to recognize the text. The accuracy was not as good, but you should be able to tweak the function's settings and combine it with others, such as SmoothCommand, to get better results.
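To show why a minimum filter closes these gaps, here is a small Python sketch of the general technique. It is not the LEADTOOLS code or API: `minimum_filter` and its `dimension` parameter are names chosen for illustration, and the image is assumed to be a list of rows of grayscale values where 0 is black and 255 is white. Each pixel is replaced by the darkest value in its neighborhood, so dark text pixels grow outward and merge.

```python
def minimum_filter(image, dimension=3):
    """Illustrative only (hypothetical helper, not a LEADTOOLS call).

    `image` is a list of rows of grayscale values (0 = black, 255 = white).
    Each output pixel is the minimum over a dimension x dimension window
    centered on it; the window is clipped at the image edges.
    """
    if dimension < 1 or dimension % 2 == 0:
        raise ValueError("dimension must be a positive odd number")

    radius = dimension // 2
    height = len(image)
    width = len(image[0]) if height else 0
    result = []

    for y in range(height):
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        row = []
        for x in range(width):
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            # The darkest neighbor wins, which thickens dark strokes.
            row.append(min(image[ny][nx]
                           for ny in range(y0, y1)
                           for nx in range(x0, x1)))
        result.append(row)

    return result


# A half-toned stroke: two dark pixels separated by a one-pixel gap.
stroke = [[255, 255, 0, 255, 0, 255, 255]]
filled = minimum_filter(stroke, dimension=3)
# filled == [[255, 0, 0, 0, 0, 0, 255]]
```

A larger window fills wider gaps but also thickens every stroke, which can make neighboring characters run together. That trade-off is one likely reason the recognition accuracy drops and why the settings need tuning per image.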