This topic and its replies were posted before the current version of LEADTOOLS was released and may no longer be applicable.
#1
Posted: Wednesday, March 11, 2009 6:21:39 AM (UTC)
My app is crashing with a NOT ENOUGH MEMORY error. I have a 142-page PDF that I need to OCR.
Perhaps I need to free up resources as I go?
Dim pdfFileName As String = "C:\Documents and Settings\jus\Desktop\New WORK\TEST.pdf"
RasterCodecs.Startup()
Dim codecs As New RasterCodecs()
' Create the OCR engine once
Dim ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Plus, False)
Try
    ' Unlock features
    RasterSupport.Unlock(RasterSupportType.OcrPlus, "")
    RasterSupport.Unlock(RasterSupportType.OcrPlusPdfOutput, "")
    RasterSupport.Unlock(RasterSupportType.PdfSave, "")
    ' Start the engine using default parameters
    ocrEngine.Startup(Nothing, Nothing, Nothing)
    ' Set the PDF load resolution
    codecs.Options.Pdf.Load.XResolution = 150
    codecs.Options.Pdf.Load.YResolution = 150
    ' Create an OCR document
    Using ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument()
        ' Load the whole PDF as one multi-page raster image (first page 1, last page -1 = all pages)
        Using _image As RasterImage = codecs.Load(pdfFileName, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1)
            ' Pages are 1-based; visit every page, including the last one
            For i As Integer = 1 To _image.PageCount
                _image.Page = i
                ' Add the current page to the document
                Dim ocrPage As IOcrPage = ocrDocument.Pages.AddPage(_image, Nothing)
                ' Recognize the page
                ' Note: Recognize can be called without calling AutoZone or manually adding zones.
                ' The engine will check and automatically auto-zone the page
                ocrPage.AutoZone(Nothing)
                ocrPage.Recognize(Nothing)
            Next
        End Using
        ' Save the document we have as PDF
        ocrDocument.Save(pdfFileName & "-2.pdf", OcrDocumentFormat.Pdf, Nothing)
    End Using
Finally
    ' Shut down the engine
    ' Note: calling Dispose will also automatically shut down the engine if it has been started
    ocrEngine.Dispose()
    codecs.Dispose()
    RasterCodecs.Shutdown()
End Try
#2
Posted: Thursday, March 12, 2009 1:06:25 AM (UTC)
Try loading and OCRing the PDF in two steps:
- Load the first 71 pages, OCR them, and then free the image.
- Load the remaining 71 pages, OCR them, and then free the image.
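
The two-step split above generalizes to any batch size. The following is only a minimal sketch of the page-range arithmetic, not LEADTOOLS code: GetPageRanges is a hypothetical helper name, and the batch size of 71 simply mirrors the two halves suggested here. Each resulting range would presumably be passed as the first-page and last-page arguments of the codecs.Load call in the original post (where they are currently 1 and -1), with the image disposed before the next batch is loaded.

```vb
Imports System
Imports System.Collections.Generic

Module PageBatches
    ' Hypothetical helper: splits pages 1..totalPages into consecutive
    ' (first, last) ranges holding at most batchSize pages each.
    Function GetPageRanges(totalPages As Integer, batchSize As Integer) As List(Of Tuple(Of Integer, Integer))
        If totalPages < 0 Then Throw New ArgumentOutOfRangeException("totalPages")
        If batchSize < 1 Then Throw New ArgumentOutOfRangeException("batchSize")

        Dim ranges As New List(Of Tuple(Of Integer, Integer))()
        Dim first As Integer = 1
        Do While first <= totalPages
            ' The last batch may be shorter than batchSize
            Dim last As Integer = Math.Min(first + batchSize - 1, totalPages)
            ranges.Add(Tuple.Create(first, last))
            first = last + 1
        Loop
        Return ranges
    End Function

    Sub Main()
        ' 142 pages in batches of 71, as in the suggestion above
        For Each range As Tuple(Of Integer, Integer) In GetPageRanges(142, 71)
            Console.WriteLine("Load pages {0} to {1}", range.Item1, range.Item2)
            ' Assumed use inside the poster's code (not verified here):
            ' load only pages range.Item1..range.Item2, OCR them,
            ' then dispose the image before moving to the next range.
        Next
    End Sub
End Module
```

For the 142-page file this yields the ranges 1 to 71 and 72 to 142. A smaller batch size trades more load calls for a lower peak image size in memory; whether the OCR document itself still grows too large is a separate question this sketch does not address.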
Thanks,
Maen Badwan
LEADTOOLS Technical Support
#3
Posted: Friday, June 12, 2009 9:14:32 AM (UTC)
In LEADTOOLS 16.5, you can use the new DocumentFormat.Ltd format to add pages to an existing temporary file on disk and then convert that temporary file to PDF. This technique is suited precisely to OCRing a large number of pages.
Please refer to the DocumentFormat enumeration documentation in LEADTOOLS 16.5 help.