#1 Posted : Thursday, July 25, 2019 3:54:02 PM(UTC)
Chekhovych

Groups: Registered
Posts: 11


Could you please provide any information on how to optimize memory usage during the OCR process for large non-searchable PDF documents in Java?
Thanks.

Code:
        OcrEngine ocrEngine = OcrEngineManager.createEngine(OcrEngineType.LEAD);
        ocrEngine.startup(null, null, null, ocrRuntimePath);
        setInitialOcrConfiguration(processingDocument, ocrEngine);

        RasterCodecs rasterCodecs = ocrEngine.getRasterCodecsInstance();
        OcrDocument ocrDocument = ocrEngine.getDocumentManager()
                .createDocument(null, OcrCreateDocumentOptions.AUTO_DELETE_FILE.getValue());
        OcrProgressCallback ocrProgressCallback = ....;

        Path inputPath = ....;
        Path outputPath = generateOutputDocumentPath(ocredDocumentDirectory, processingDocument.getName());

        try {
            int pageCount = rasterCodecs.getTotalPages(inputPath.toString());
            IntStream.rangeClosed(1, pageCount)
                    .forEach(pageNumber -> {
                        // monitorOcrProgress(processingDocument, pageCount, pageNumber);

                        RasterImage rasterImage = rasterCodecs.load(inputPath.toString(), pageNumber);
                        OcrPage ocrPage = ocrEngine.createPage(rasterImage, OcrImageSharingMode.AUTO_DISPOSE);

                        ocrPage.autoZone(null);
                        ocrPage.recognize(ocrProgressCallback);
                        ocrDocument.getPages().add(ocrPage);

                        ocrPage.dispose();
                    });

            ocrDocument.save(outputPath, DocumentFormat.PDF, null);
        } finally {
            if (Objects.nonNull(rasterCodecs)) {
                rasterCodecs.dispose();
            }

            if (Objects.nonNull(ocrDocument)) {
                ocrDocument.dispose();
            }

            ocrEngine.dispose();

            if (Objects.nonNull(outputPath)) {
                FileUtils.deleteQuietly(outputPath.getParent().toFile());
            }
        }
    }
 


#2 Posted : Thursday, August 1, 2019 3:19:16 PM(UTC)
Josh Clark

Groups: Registered, Tech Support, Administrators
Posts: 54

Thanks: 2 times
Was thanked: 10 time(s) in 10 post(s)

Apologies for the delayed response. I would need to see the rest of your project to understand the issue fully. Could you email a sample project to support@leadtools.com? I can take a look at it and see whether there are any optimizations we could make to reduce memory usage and improve performance.
Josh Clark
Developer Support Engineer
LEAD Technologies, Inc.
