#1 Posted : Tuesday, April 12, 2011 6:23:30 AM(UTC)
it-dimension

Groups: Registered
Posts: 2


Hello!

I cannot understand how to use the SDK. I need to recognize a PDF and get the entire text, as well as the position and size of the text (width and height, or top-left and bottom-right coordinates).

So far I have settled on this:

            IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Professional, false);
            ocrEngine.Startup(null, null, null, null);

            ocrEngine.AutoRecognizeManager.EnableTrace = false;
            ocrEngine.AutoRecognizeManager.MaximumThreadsPerJob = 0;
            ocrEngine.AutoRecognizeManager.MaximumPagesBeforeLtd = 0;
            ocrEngine.AutoRecognizeManager.JobErrorMode = OcrAutoRecognizeManagerJobErrorMode.Abort;
            ocrEngine.AutoRecognizeManager.PreprocessPageCommands.Clear();

            OcrAutoRecognizeJobData ocrJobData = new OcrAutoRecognizeJobData(path,  DocumentFormat.Html, "output.html");

            IOcrAutoRecognizeJob ocrJob = ocrEngine.AutoRecognizeManager.CreateJob(ocrJobData);
            ocrEngine.AutoRecognizeManager.RunJob(ocrJob);

As a result, I get an exception: PDF codec is needed to use this feature.

We plan to use the v17 SDK in a .NET 4 project.

Could you please explain what I'm doing wrong?

Thanks in advance.
 


#2 Posted : Wednesday, April 13, 2011 1:25:16 AM(UTC)
Maen Hasan

Groups: Registered, Tech Support
Posts: 1,326

Was thanked: 1 time(s) in 1 post(s)

To resolve the exception, you might need to add the Leadtools.Codecs.Pdf.dll file as a reference in your project.

However, to get the locations of the recognized words, you might need to use the low-level OCR methods, such as ocrDocument.Pages.AddPage() and ocrPage.Recognize(). You can then get the recognized character data of the page by calling the IOcrPage.GetRecognizedCharacters method.
For more information and sample code, please see the help topic "GetRecognizedCharacters Method" in the LEADTOOLS .NET documentation.
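
Once you have per-character data, you would typically merge the character rectangles into word rectangles to get each word's position and size. The following is only a minimal sketch of that merging step. CharBox, WordBox, WordBoxBuilder and the EndOfWord flag are hypothetical stand-ins written for illustration; they are not LEADTOOLS types, so please map them to whatever GetRecognizedCharacters actually returns in your SDK version.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for one recognized character (NOT a LEADTOOLS type).
public struct CharBox
{
    public char Code;
    public int Left, Top, Right, Bottom;
    public bool EndOfWord; // assumption: true on the last character of a word

    public CharBox(char code, int left, int top, int right, int bottom, bool endOfWord)
    {
        Code = code;
        Left = left; Top = top; Right = right; Bottom = bottom;
        EndOfWord = endOfWord;
    }
}

// Hypothetical result type: the word text plus its bounding rectangle.
public struct WordBox
{
    public string Text;
    public int Left, Top, Right, Bottom;
    public int Width { get { return Right - Left; } }
    public int Height { get { return Bottom - Top; } }
}

public static class WordBoxBuilder
{
    // Merges consecutive characters into words; each word's rectangle is
    // the union of the rectangles of its characters.
    public static List<WordBox> Build(IEnumerable<CharBox> characters)
    {
        var words = new List<WordBox>();
        var text = new StringBuilder();
        int left = 0, top = 0, right = 0, bottom = 0;

        foreach (CharBox c in characters)
        {
            if (text.Length == 0)
            {
                // First character of a new word starts the rectangle.
                left = c.Left; top = c.Top; right = c.Right; bottom = c.Bottom;
            }
            else
            {
                // Grow the rectangle to include this character.
                left = Math.Min(left, c.Left);
                top = Math.Min(top, c.Top);
                right = Math.Max(right, c.Right);
                bottom = Math.Max(bottom, c.Bottom);
            }
            text.Append(c.Code);

            if (c.EndOfWord)
            {
                words.Add(new WordBox { Text = text.ToString(), Left = left, Top = top, Right = right, Bottom = bottom });
                text.Length = 0;
            }
        }

        // Flush a trailing word that was never marked as ended.
        if (text.Length > 0)
        {
            words.Add(new WordBox { Text = text.ToString(), Left = left, Top = top, Right = right, Bottom = bottom });
        }
        return words;
    }
}
```

Each WordBox then gives you both forms you asked about: Left/Top and Right/Bottom as the corner coordinates, or Left/Top with Width/Height.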

Thanks,
Maen Badwan
LEADTOOLS Technical Support
 
#3 Posted : Wednesday, April 13, 2011 2:49:26 AM(UTC)
it-dimension

Groups: Registered
Posts: 2


Thanks!
 