LEADTOOLS Support
Document
Document SDK Examples
HOW TO: Extract All Text from a File Using the RecognitionEngine
Posted: Monday, December 4, 2017 10:40:25 AM (UTC)
The RecognitionEngine in LEADTOOLS CloudServices provides a high-level method, ExtractText, for extracting all text from a file. ExtractText pulls the text from each page in the range specified by the LoadDocumentOptions (FirstPageNumber through LastPageNumber) and returns an enumerable collection of DocumentPageText objects.
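Code:
// Open the source file read-only; the using block closes the stream when done.
using (var inputStream = File.OpenRead(@"leadtools.pdf"))
{
    // Limit extraction to the first page.
    LoadDocumentOptions loadOptions = new LoadDocumentOptions()
    {
        FirstPageNumber = 1,
        LastPageNumber = 1
    };
    RecognitionEngine recognitionEngine = new RecognitionEngine();
    // Create and start the LEAD OCR engine; the using block disposes it afterwards.
    using (IOcrEngine engine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD, false))
    {
        engine.Startup(null, null, null, @"path/to/OCR Bin Directory");
        var textList = recognitionEngine.ExtractText(inputStream, loadOptions, engine, null);
        // Enumerate the results while the engine is still running.
        foreach (var pageText in textList)
        {
            // Process text
        }
        engine.Shutdown();
    }
}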
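One possible way to fill in the "Process text" placeholder is to collect one string per page and merge them into a single block with page markers. The sketch below is only an illustration: PageTextJoiner and JoinPages are hypothetical names, not part of LEADTOOLS, and the helper works on plain strings. How you obtain a string from each DocumentPageText is left open here; see the DocumentPageText class documentation for its members.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a LEADTOOLS API): merges per-page strings
// into one string, labeling each page with its page number.
public static class PageTextJoiner
{
    public static string JoinPages(IEnumerable<string> pages, int firstPageNumber = 1)
    {
        var parts = new List<string>();
        int pageNumber = firstPageNumber;
        foreach (var page in pages)
        {
            // Header line identifying the page.
            parts.Add("--- Page " + pageNumber + " ---");
            // Treat a missing page as empty text and trim stray whitespace.
            parts.Add((page ?? string.Empty).Trim());
            pageNumber++;
        }
        return string.Join("\n", parts);
    }
}
```

Inside the foreach loop you would add each page's string to a List<string>, then call PageTextJoiner.JoinPages(list, loadOptions.FirstPageNumber) after the loop so the markers match the requested page range.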
DocumentPageText Class:
https://www.leadtools.co...ox/documentpagetext.html
Edited by user Friday, December 29, 2017 4:24:31 PM (UTC)
Duncan Quirk
Developer Support Engineer
LEAD Technologies, Inc.