This topic and its replies were posted before the current version of LEADTOOLS was released and may no longer be applicable.
#1
Posted: Thursday, June 8, 2006 3:47:17 AM (UTC)
Groups: Registered
Posts: 10
I am using the DotNet class library for OCR in LeadTools Document Imaging 14.5.
I have some questions, which are listed below:
1. I want to store the text in the same format as it appears in the image, but I am not able to get the format of the text. I want to know the font size, format, etc. of each word recognized by OCR, so that I can extract data from the image on the basis of these attributes. I am attaching an image which has yellow boxes, from which I want to extract information. This information is not at fixed coordinates in an image; it varies from image to image. If you have any alternative solution, please tell me.
2. I want an OCR editor, but I could not find one in the LeadTools Document Imaging 14.5 toolkit.
I hope for a positive response.
Thanks,
Neeraj Kaushik
#2
Posted: Monday, June 12, 2006 12:15:52 AM (UTC)
Groups: Guests
Posts: 3,022
To get the font name and size of a recognized character, use the
RecognizedCharater.Font and RecognizedCharater.FontSize properties.
If you want to extract specific information from an image, then this
information has to be in a fixed position in all images so that you can
create a zone on that position and recognize it. However, if this
information is the same string in all images, then you can just
recognize the whole image and look for this string in the result.
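As a rough illustration of the second approach (recognize everything, then search the result), here is a minimal language-neutral sketch in Python. The `Word` record, its fields, and the sample values are hypothetical stand-ins for whatever the OCR result exposes; they are not LEADTOOLS types. The same idea applies in .NET once the recognized words and their font properties have been read.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical record for one recognized word (not a LEADTOOLS type)."""
    text: str
    font: str
    font_size: int
    left: int
    top: int


def find_label(words, label):
    """Return the words whose text equals `label`, ignoring case."""
    wanted = label.casefold()
    return [w for w in words if w.text.casefold() == wanted]


def words_with_font(words, font=None, min_size=0):
    """Return the words that match a font name (if given) and a minimum size."""
    return [
        w for w in words
        if (font is None or w.font == font) and w.font_size >= min_size
    ]


# Made-up sample data standing in for a recognized page.
words = [
    Word("Invoice", "Arial", 18, 40, 20),
    Word("Date", "Arial", 10, 40, 80),
    Word("08/06/2006", "Courier", 10, 120, 80),
]

label_hits = find_label(words, "date")
large_words = words_with_font(words, min_size=12)
```

Once a label such as "Date" is located, its position can be used to look at neighbouring words on the same line, which avoids depending on fixed coordinates.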
Not sure what you mean by "OCR Editor". Can you please explain further?
#3
Posted: Monday, June 12, 2006 10:06:19 PM (UTC)
Groups: Registered
Posts: 10
But my problem is that I want to recognize each word's font and size, because I can extract information on that basis. The text I want to extract has no fixed position and no fixed string in the OCR output. Suppose there is a date that I want to extract from an image, and this date can be in the header, the body, or elsewhere. I want to build a system that helps me identify the position of the date. For this I want to write a training program that extracts all the information from an image and stores it in a database, and this information will be referred to in the live project. I am unsure which information would be suitable to store so that it helps the system in a live scenario.
Is there any method by which we can classify images without doing OCR? That would also help me, but I have not found one in the SDK.
By OCR editor I mean an editor that shows the extracted data during OCR and lets us correct it.
I hope you will give me a positive response.
#4
Posted: Tuesday, June 13, 2006 12:20:38 AM (UTC)
Groups: Guests
Posts: 3,022
LEADTOOLS does not provide any functionality to train the OCR engine to
classify images. If you want to extract specific information whose
content or position may differ between your documents, then
you will have to recognize the whole document and use third-party
functions/algorithms to extract the desired data from the recognized
text.
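For the date example raised in the question, one simple technique of this kind is a regular expression run over the full recognized text. The sketch below is in Python and is only an illustration: the pattern, the function name `find_dates`, and the assumption of numeric day/month/year dates are all mine, not part of LEADTOOLS. Real documents would likely need more patterns, such as month names.

```python
import re
from datetime import datetime

# Numeric dates such as 08/06/2006, 8-6-2006 or 08.06.2006 (assumed formats).
DATE_PATTERN = re.compile(r"\b(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})\b")


def find_dates(text, day_first=True):
    """Return (offset, date) pairs for every valid date found in `text`."""
    found = []
    for match in DATE_PATTERN.finditer(text):
        first, second, year = (int(group) for group in match.groups())
        day, month = (first, second) if day_first else (second, first)
        try:
            found.append((match.start(), datetime(year, month, day).date()))
        except ValueError:
            # Looks like a date but is not one (for example 31-02-2006).
            continue
    return found


# Made-up text standing in for the recognized output of a whole page.
sample = "Invoice No. 4471\nIssued: 08/06/2006\nDue 31-02-2006"
dates = find_dates(sample)
```

The returned offset shows where in the recognized text the date occurred, so the date can be found whether it sits in the header, the body, or elsewhere.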
LEADTOOLS does not provide a built-in editor, but it provides you with the
means to create one. The Verification event is fired for each
recognized word, with a suggested word for that recognized word.
You can also call GetRecognizedWords to get the recognized words at any
point.
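The core of such an editor is a review loop: each recognized word is shown with its suggestion, and the user's choice is kept. Here is a minimal Python sketch of that loop. The callbacks `suggest` and `choose` are hypothetical and do not correspond to the signature of the Verification event; in a real editor `choose` would show a dialog instead of deciding automatically.

```python
def review_words(recognized, suggest, choose):
    """Run every recognized word through a suggest/choose step.

    `suggest(word)` returns a proposed replacement or None.
    `choose(word, suggestion)` returns the text to keep.
    """
    corrected = []
    for word in recognized:
        suggestion = suggest(word)
        corrected.append(choose(word, suggestion))
    return corrected


# Made-up suggestion table standing in for the engine's suggested words.
SUGGESTIONS = {"lnvoice": "Invoice", "Dafe": "Date"}


def suggest_from_table(word):
    return SUGGESTIONS.get(word)


def accept_suggestion(word, suggestion):
    # Automatic stand-in for the user: take the suggestion when there is one.
    return suggestion if suggestion is not None else word


result = review_words(["lnvoice", "Dafe", "2006"], suggest_from_table, accept_suggestion)
```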