This tutorial shows how to create a Java application that uses the LEADTOOLS SDK to preprocess images for OCR.
Overview | |
---|---|
Summary | This tutorial covers how to use LEADTOOLS Image Processing SDK technology in a Java application |
Completion Time | 30 minutes |
Project | Download tutorial project (2 KB) |
Platform | Java Application |
IDE | Eclipse |
Runtime License | Download LEADTOOLS |
Before working on the Preprocess Image for OCR - Java tutorial, get familiar with the basic steps of creating a project by reviewing the Add References and Set a License tutorial.
In Eclipse, create a new Java project, and add the necessary LEADTOOLS references.
The references needed depend upon the purpose of the project. This tutorial needs the following JAR files, which are located at <INSTALL_DIR>\LEADTOOLS21\Bin\Java:
leadtools.jar
leadtools.codecs.jar
leadtools.document.writer.jar
leadtools.ocr.jar
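Before adding the references in Eclipse, it can help to confirm that all four JAR files are actually present. The following is a minimal sketch using only the Java standard library, not part of the LEADTOOLS SDK. The class name `JarCheck`, the method `missingJars`, and the default folder `C:\LEADTOOLS21\Bin\Java` (which assumes `<INSTALL_DIR>` is `C:\`) are assumptions made for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class JarCheck {
    // The four JAR files this tutorial lists as required.
    static final List<String> REQUIRED = Arrays.asList(
            "leadtools.jar",
            "leadtools.codecs.jar",
            "leadtools.document.writer.jar",
            "leadtools.ocr.jar");

    // Returns the required JAR names that are absent from the given file names
    // (exact, case-sensitive match), in the order they are listed above.
    static List<String> missingJars(Set<String> presentFileNames) {
        List<String> missing = new ArrayList<>();
        for (String jar : REQUIRED) {
            if (!presentFileNames.contains(jar)) {
                missing.add(jar);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: default install folder; pass another folder as the first argument.
        Path binDir = Paths.get(args.length > 0 ? args[0] : "C:\\LEADTOOLS21\\Bin\\Java");
        Set<String> present = new HashSet<>();
        if (Files.isDirectory(binDir)) {
            // Collect the plain file names found directly inside the folder.
            try (Stream<Path> files = Files.list(binDir)) {
                present = files.map(p -> p.getFileName().toString())
                               .collect(Collectors.toSet());
            }
        }
        List<String> missing = missingJars(present);
        System.out.println(missing.isEmpty()
                ? "All required JARs found."
                : "Missing: " + missing);
    }
}
```

If anything is reported missing, check the installation folder before continuing; the tutorial code will not compile without these references.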
The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.
There are two types of runtime licenses:
Note
Adding LEADTOOLS references and setting a license are covered in more detail in the Add References and Set a License tutorial.
With the project created, the references added, and the license set, coding can begin.
In the Package Explorer, open the _Main.java class. Add the following import statements to the import block at the top.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import leadtools.*;
import leadtools.codecs.*;
import leadtools.document.writer.*;
import leadtools.ocr.*;
Add a new method called OCRPreprocess() to the _Main class. Call it inside the main method, after the SetLicense() call.
public static void main(String[] args) throws IOException
{
    // Point the toolkit at the native libraries and load the ones this tutorial uses
    Platform.setLibPath("C:\\LEADTOOLS21\\Bin\\CDLL\\x64");
    Platform.loadLibrary(LTLibrary.LEADTOOLS);
    Platform.loadLibrary(LTLibrary.CODECS);
    Platform.loadLibrary(LTLibrary.DOCUMENT_WRITER);
    Platform.loadLibrary(LTLibrary.OCR);

    // The license must be set before any other toolkit function is called
    SetLicense();
    OCRPreprocess();
}

static void OCRPreprocess()
{
    // Input image and output document paths
    String tifFileName = "C:\\LEADTOOLS21\\Resources\\Images\\ocr1.tif";
    String pdfFileName = "C:\\LEADTOOLS21\\Resources\\Images\\cleanupTIF.pdf";

    // Load the TIFF to be recognized
    RasterCodecs codecs = new RasterCodecs();
    RasterImage image = codecs.load(tifFileName);

    // Create and start the LEAD OCR engine
    OcrEngine ocrEngine = OcrEngineManager.createEngine(OcrEngineType.LEAD);
    ocrEngine.startup(new RasterCodecs(), new DocumentWriter(), null, null);

    // Create an OCR document and add the image to it as a page
    OcrDocument ocrDocument = ocrEngine.getDocumentManager().createDocument();
    OcrPage ocrPage = ocrDocument.getPages().addPage(image, null);

    // Auto-preprocess it
    ocrPage.autoPreprocess(OcrAutoPreprocessPageCommand.DESKEW, null);
    ocrPage.autoPreprocess(OcrAutoPreprocessPageCommand.INVERT, null);
    ocrPage.autoPreprocess(OcrAutoPreprocessPageCommand.ROTATE, null);

    // Recognize it and save it as PDF
    ocrPage.recognize(null);
    ocrDocument.save(pdfFileName, DocumentFormat.PDF, null);
    System.out.println("File saved successfully.");
}
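To give a feel for what an automatic inversion step is for, the sketch below flips an image only when it is mostly dark, that is, when it looks like light text on a dark background. This is an illustration of the general idea written with the standard `java.awt.image.BufferedImage` class. It is not the algorithm LEADTOOLS uses for `OcrAutoPreprocessPageCommand.INVERT`, which this tutorial does not describe, and the class name, method names, and the mid-gray threshold are assumptions.

```java
import java.awt.image.BufferedImage;

public class InvertSketch {
    static final int WHITE = 0xFFFFFFFF;

    // True when more than half of the pixels are darker than mid-gray
    // (assumed threshold: average channel value below 128).
    static boolean isMostlyDark(BufferedImage img) {
        long dark = 0;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                if ((r + g + b) / 3 < 128) {
                    dark++;
                }
            }
        }
        return dark * 2 > (long) img.getWidth() * img.getHeight();
    }

    // Flips every color channel in place; the alpha bits are left untouched.
    static void invert(BufferedImage img) {
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                img.setRGB(x, y, img.getRGB(x, y) ^ 0x00FFFFFF);
            }
        }
    }

    // Inverts the image only when it is mostly dark; returns whether it did.
    static boolean invertIfMostlyDark(BufferedImage img) {
        if (!isMostlyDark(img)) {
            return false;
        }
        invert(img);
        return true;
    }

    public static void main(String[] args) {
        // 4x4 sample: black background (the default) with one white "text" pixel.
        BufferedImage img = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        img.setRGB(1, 1, WHITE);
        System.out.println("Inverted: " + invertIfMostlyDark(img));
    }
}
```

The check-then-act shape is the point here: a page that is already dark text on a light background is left alone, so running the step on a normal scan is harmless.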
Run the project by selecting Run -> Run.
If the steps were followed correctly, the application should run OCR on the TIFF and save a cleaned-up, searchable PDF document to the path set in pdfFileName.
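One quick way to confirm the result without opening a viewer is to check that the output file exists and starts with the `%PDF-` signature that PDF files begin with. The sketch below uses only the Java standard library. The class name `PdfOutputCheck`, the method `hasPdfSignature`, and the default path (taken from `pdfFileName` in the code above) are illustrative assumptions, and a passing check only shows that a PDF was written, not that the recognized text is accurate.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PdfOutputCheck {
    // True when the given bytes start with the ASCII signature "%PDF-".
    static boolean hasPdfSignature(byte[] head) {
        byte[] signature = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (head == null || head.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if (head[i] != signature[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: same output path as the tutorial code; override with the first argument.
        Path pdf = Paths.get(args.length > 0
                ? args[0]
                : "C:\\LEADTOOLS21\\Resources\\Images\\cleanupTIF.pdf");
        if (!Files.isRegularFile(pdf)) {
            System.out.println("Output not found: " + pdf);
            return;
        }
        // Read just the first five bytes of the file.
        byte[] head = new byte[5];
        int read;
        try (InputStream in = Files.newInputStream(pdf)) {
            read = in.read(head);
        }
        boolean ok = read == head.length && hasPdfSignature(head);
        System.out.println(ok ? "Output looks like a PDF." : "Output is not a PDF.");
    }
}
```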
This tutorial showed how to initialize the LEAD OCR Engine, load the specified input file, preprocess it, recognize it, and save the recognition results to the specified output file in the specified format.