This tutorial shows how to set up OCR processing to convert any supported file to a searchable PDF using the LEADTOOLS SDK in a Windows C DLL application.
Overview | |
---|---|
Summary | This tutorial shows how to load, recognize, and save OCR data to a PDF in a Windows C DLL Application. |
Completion Time | 20 minutes |
Visual Studio Project | Download tutorial project (19 KB) |
Platform | Windows C DLL Application |
IDE | Visual Studio 2019, 2022 |
Development License | Download LEADTOOLS |
Before working on this tutorial, review the Add References and Set a License tutorial and the Load, Display and Save Images tutorial to get familiar with the basic steps of creating a project and loading images.
Start with a copy of the 64-bit Windows API project created in the Load, Display and Save Images tutorial. If the project is not available, create it by following the steps in that tutorial.
To use LEADTOOLS OCR functionality, additional header and DLL files are required. Open the pre-compiled headers file (either pch.h or stdafx.h, depending on the version of Visual Studio used) and add the following lines:
#include "C:\LEADTOOLS23\Include\ltocr.h"
#pragma comment (lib, "C:\\LEADTOOLS23\\Lib\\CDLL\\x64\\Ltocr_x.lib") // OCR support
The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details including tutorials for different platforms, refer to Setting a Runtime License.
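There are two types of runtime licenses: an evaluation license, which is obtained when the evaluation toolkit is downloaded, and a deployment license.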
With the project created, the references added, the license set, and the load file code added, coding can begin.
Add a new Ocr menu with a new item in it, Recognize and Export Results, and set the item's ID to ID_OCR_RECOGNIZEANDEXPORTRESULTS.
Note: For details on how to add menu items in Visual Studio 2022, see the Load, Display and Save Images tutorial.
For a full list of which DLLs are required for specific toolkit features or file formats, refer to Files to be Included With Your Application.
For details on the LEADTOOLS OCR Module files, see LEAD Engine Runtime Redistributables.
Open the project's CPP file and add the following declarations to the Global Variables section at the top.
// OCR Global Variables:
L_OcrEngine ocrEngine = NULL;
L_OcrDocument ocrDocument = NULL;
L_OcrPage ocrPage = NULL;
L_OcrDocumentManager ocrDocumentManager = NULL;
Go to the InitInstance function and add the following initialization code below the LEADTOOLS set license code:
if ((SUCCESS != L_OcrEngineManager_CreateEngine(L_OcrEngineType_LEAD, &ocrEngine))
    || (SUCCESS != L_OcrEngine_Startup(ocrEngine, NULL, TEXT("C:\\LEADTOOLS23\\Bin\\Common\\OcrLEADRuntime")))
    || (SUCCESS != L_OcrEngine_GetDocumentManager(ocrEngine, &ocrDocumentManager)))
{
    MessageBox(NULL, TEXT("Error initializing OCR..\nAborting"), TEXT("LEADTOOLS Demo"), MB_ICONERROR);
    return FALSE;
}
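The chained condition above stops at the first call that fails, but the message box cannot say which call it was. The following is a minimal portable sketch of one way to keep track of that, not the tutorial's method: `InitStep`, `RunInitSteps`, and `kSuccess` are hypothetical names introduced here, and the steps are plain callables rather than LEADTOOLS functions.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in status code for this sketch (assumption); not taken from the toolkit headers.
constexpr int kSuccess = 1;

// One named initialization step. The callable returns a status code.
struct InitStep {
    std::string name;
    std::function<int()> run;
};

// Runs the steps in order and stops at the first failure.
// Returns an empty string when every step succeeds, otherwise the
// name of the step that failed. Later steps are not executed.
std::string RunInitSteps(const std::vector<InitStep>& steps) {
    for (const InitStep& step : steps) {
        if (step.run() != kSuccess) {
            return step.name;
        }
    }
    return std::string();
}
```

In the tutorial's code, each step could wrap one of the three OCR startup calls, and a non-empty result could be appended to the error text.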
Navigate to the WndProc function and modify the code under the ID_FILE_OPEN case. Add the following code immediately below the call to the L_LoadBitmap() function:
if (LEADBmp.Flags.Allocated)
{
    // Release the document that belongs to any previously loaded image
    if (ocrDocument)
        L_OcrDocument_Destroy(ocrDocument);
    ocrDocument = NULL;
    ocrPage = NULL;
    // Create a new file-based OCR document and an OCR page from the loaded bitmap
    L_OcrDocumentManager_CreateDocument(ocrDocumentManager, &ocrDocument, L_OcrCreateDocumentOptions_AutoDeleteFile, NULL);
    L_OcrPage_FromBitmap(ocrEngine, &ocrPage, &LEADBmp, L_OcrBitmapSharingMode_None, NULL, NULL);
}
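The open handler destroys the previous document by hand before creating a new one. As an illustration of the same destroy-before-replace rule, the sketch below lets `std::unique_ptr` with a custom deleter do the work. `FakeDocument`, `CreateFakeDocument`, and `DestroyFakeDocument` are hypothetical stand-ins, not LEADTOOLS types or calls.

```cpp
#include <memory>

// Hypothetical stand-in for an opaque toolkit handle (assumption for this sketch).
struct FakeDocument {
    int id;
};

// Counts destroy calls so the behavior can be observed.
static int g_destroyCount = 0;

FakeDocument* CreateFakeDocument(int id) {
    return new FakeDocument{id};
}

void DestroyFakeDocument(FakeDocument* doc) {
    ++g_destroyCount;
    delete doc;
}

// Deleter that forwards to the destroy function. unique_ptr never
// invokes it for a null pointer.
struct FakeDocumentDeleter {
    void operator()(FakeDocument* doc) const { DestroyFakeDocument(doc); }
};

using DocumentPtr = std::unique_ptr<FakeDocument, FakeDocumentDeleter>;

// Replaces the current document with a new one. reset() takes ownership
// of the new handle and then destroys the old one, if there was one.
void ReplaceDocument(DocumentPtr& current, int newId) {
    current.reset(CreateFakeDocument(newId));
}
```

With this arrangement the explicit `if (ocrDocument)` check and the manual reset to NULL would no longer be needed, at the cost of wrapping each handle type once.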
Next, modify the WM_DESTROY case code and add the following lines before the call to PostQuitMessage().
if (ocrDocument)
    L_OcrDocument_Destroy(ocrDocument);
if (ocrEngine)
    L_OcrEngine_Destroy(ocrEngine);
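The cleanup above releases the document before the engine, which is the reverse of the order in which they were created. A small sketch of keeping that ordering automatic, assuming a hypothetical `CleanupStack` helper that is not part of the toolkit:

```cpp
#include <functional>
#include <utility>
#include <vector>

// Collects cleanup actions and runs them in reverse order of registration,
// so objects created later are released before the objects created earlier.
class CleanupStack {
public:
    CleanupStack() = default;
    CleanupStack(const CleanupStack&) = delete;
    CleanupStack& operator=(const CleanupStack&) = delete;

    // Any actions still pending are run when the stack goes out of scope.
    ~CleanupStack() { RunAll(); }

    void Push(std::function<void()> action) {
        actions_.push_back(std::move(action));
    }

    // Runs and removes every pending action, newest first.
    void RunAll() {
        while (!actions_.empty()) {
            std::function<void()> action = std::move(actions_.back());
            actions_.pop_back();
            action();
        }
    }

private:
    std::vector<std::function<void()>> actions_;
};
```

An application could push the engine's destroy call right after the engine is created and the document's destroy call after each document is created, then call `RunAll()` in the WM_DESTROY handler.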
In the WndProc function, under the switch (wmId) statement that is below the WM_COMMAND case, add a new case:
switch (wmId)
{
case ID_OCR_RECOGNIZEANDEXPORTRESULTS:
{
    if (!LEADBmp.Flags.Allocated)
    {
        MessageBox(hWnd, TEXT("Cannot perform OCR. No image loaded"), TEXT("LEADTOOLS Demo"), MB_ICONERROR);
        break;
    }
    OcrAndSaveResult(hWnd);
    break;
}
// Keep rest of the code as is
Create a new function named OcrAndSaveResult and place it above the WndProc function. Add the code below to the new function.
void OcrAndSaveResult(HWND hwnd)
{
    if (!ocrEngine || !ocrDocument || !ocrPage)
    {
        MessageBox(hwnd, TEXT("OCR Engine not properly initialized"), TEXT("LEADTOOLS OCR Demo"), MB_OK);
        return;
    }

    L_OcrLanguageManager languageManager = NULL;
    L_OcrLanguage langs[] = { L_OcrLanguage_EN };
    const TCHAR* outputFile = TEXT("C:\\Temp\\output.pdf");

    // Enable English as the recognition language
    L_OcrEngine_GetLanguageManager(ocrEngine, &languageManager);
    L_OcrLanguageManager_EnableLanguages(languageManager, langs, _countof(langs));

    // Try to recognize the text in the document
    L_OcrPage_Recognize(ocrPage, NULL, NULL);

    // Add the created OCR page into the file-based OCR document
    L_OcrDocument_AddPage(ocrDocument, ocrPage);

    // Save the document as a PDF and report the result
    TCHAR message[1024];
    if (L_OcrDocument_Save(ocrDocument, outputFile, DOCUMENTFORMAT_PDF, NULL, NULL) == SUCCESS)
        wsprintf(message, TEXT("OCR succeeded and result saved to %s"), outputFile);
    else
        wsprintf(message, TEXT("OCR failed"));
    MessageBox(hwnd, message, TEXT("LEADTOOLS OCR Demo"), MB_OK);
}
Run the project by pressing F5 or by selecting Debug -> Start Debugging.
If the steps were followed correctly, the application should run and enable the user to select File -> Open to load the file on which OCR will be performed.
For this example, this scanned document will be used. Select Ocr -> Recognize and Export Results to have the application run OCR on the input file and output to a searchable PDF file here: C:\Temp\output.pdf
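The sample writes to C:\Temp\output.pdf. The tutorial does not say whether the save call creates a missing folder, so the sketch below assumes it does not and creates the folder first. It relies on C++17 `std::filesystem`; `EnsureParentFolderExists` is a hypothetical helper name.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical helper (assumption: the save call needs the folder to exist).
// Creates the parent folder of outputFile if it is missing.
// Returns true when the folder exists afterwards.
bool EnsureParentFolderExists(const std::filesystem::path& outputFile) {
    const std::filesystem::path folder = outputFile.parent_path();
    if (folder.empty()) {
        return true;  // Bare file name: no parent folder to create.
    }
    std::error_code ec;
    // Creates every missing folder along the path; an existing folder is not an error.
    std::filesystem::create_directories(folder, ec);
    if (ec) {
        return false;
    }
    return std::filesystem::is_directory(folder, ec);
}
```

The helper could be called with the output path before saving, with the save skipped and an error shown when it returns false.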
Here is the expected output from the original scanned document: Output PDF
This tutorial covered how to create a Windows C DLL OCR application that takes an input file and exports it to a searchable PDF.