Extracts text from a file. Call it by sending a POST request to the following URL:
[POST] https://azure.leadtools.com/api/Recognition/ExtractText
The following parameters are required unless indicated otherwise, and are used by all Conversion and Recognition API calls:
Parameter | Description | Accepted Values |
---|---|---|
fileUrl (Optional) | The URL to the file to be processed. For more information, refer to the Cloud Services Overview section. | A string or URI containing a valid URL to the file to be uploaded. |
firstPage | The first page in the file to process. | An integer between 1 and the total number of pages in the file. |
lastPage | The last page in the file to process. | Pass -1 or 0 to process every page from firstPage through the last page in the file. Otherwise, pass an integer between 1 and the total number of pages in the file that is greater than or equal to firstPage. |
guid (Optional) | Unique identifier corresponding to an uploaded file. This value is returned when a file is uploaded using the UploadFile service call. | A valid GUID. |
filePassword (Optional) | The password to unlock a password-protected file. | A string containing the password for a secure PDF. |
callbackUrl (Optional) | A URL that the service notifies when your file has finished processing. If the callbackUrl is invalid or malicious, it is ignored. The LEADTOOLS Cloud Services send the request’s ID in the body of the message sent to the callbackUrl. | A string or URI containing a valid URL to message. |
ocrLanguage (Optional) | The OCR language to use when running OCR on a raster file. Defaults to en (English) if no language is specified. | 0 - en, 1 - bg, 2 - hr, 3 - cs, 4 - da, 5 - nl, 6 - fr, 7 - de, 8 - el, 9 - hu, 10 - it, 11 - pl, 12 - pt, 13 - sr, 14 - es, 15 - sv, 16 - tr, 17 - uk |
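The page-range rules in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the LEADTOOLS API: `buildExtractTextUrl` is a hypothetical helper that validates `firstPage` and `lastPage` and URL-encodes the values using the standard `URL` class. The upper bound (the total number of pages in the file) is not known on the client, so it is left for the service to check.

```javascript
// Hypothetical helper (not part of the LEADTOOLS API): builds an ExtractText
// request URL and applies the documented firstPage/lastPage rules.
const EXTRACT_TEXT_URL = 'https://azure.leadtools.com/api/Recognition/ExtractText';

function buildExtractTextUrl({ firstPage, lastPage, fileUrl, guid }) {
  if (!Number.isInteger(firstPage) || firstPage < 1) {
    throw new RangeError('firstPage must be an integer of at least 1');
  }
  // -1 or 0 asks the service to process through the last page of the file.
  const toEnd = lastPage === -1 || lastPage === 0;
  if (!toEnd && (!Number.isInteger(lastPage) || lastPage < firstPage)) {
    throw new RangeError('lastPage must be -1, 0, or an integer >= firstPage');
  }

  const url = new URL(EXTRACT_TEXT_URL);
  url.searchParams.set('firstPage', String(firstPage));
  url.searchParams.set('lastPage', String(lastPage));
  // Optional parameters are added only when supplied; values are URL-encoded.
  if (guid !== undefined) url.searchParams.set('guid', guid);
  if (fileUrl !== undefined) url.searchParams.set('fileUrl', fileUrl);
  return url.toString();
}
```

For example, `buildExtractTextUrl({ firstPage: 1, lastPage: -1, fileUrl: 'https://demo.leadtools.com/images/pdf/leadtools.pdf' })` returns a URL whose `fileUrl` value is percent-encoded, while `buildExtractTextUrl({ firstPage: 3, lastPage: 2 })` throws a `RangeError`.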
Additional parameters available are listed below.
Parameter | Description | Accepted Values |
---|---|---|
characterinfo (Optional) | Value indicating whether you want to receive additional data regarding the characters found in each page and their locations. | A Boolean |
The following status codes will be returned when the method is called:
Status | Description |
---|---|
200 | The request has been successfully received. |
400 | The request was not valid for one of the following reasons: Required request parameters were not included. GUID value was not provided. File information provided was malformed. Attempting to queue a request on a file that has not yet been verified. |
401 | The AppID/Password combination is not valid or does not correspond with the GUID provided. |
402 | There are not enough pages left in the Application to process the request. |
500 | There was an internal error processing your request. |
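One way to act on these status codes is a small lookup that turns a code into a readable result. This is a sketch only: `describeStatus` and `EXTRACT_TEXT_STATUSES` are hypothetical names, the messages are taken from the table above (the 400 entry is shortened), and treating any unlisted code as a failure is an assumption.

```javascript
// Messages taken from the status table; names here are illustrative only.
const EXTRACT_TEXT_STATUSES = {
  200: { ok: true, message: 'The request has been successfully received.' },
  400: { ok: false, message: 'The request was not valid.' },
  401: { ok: false, message: 'The AppID/Password combination is not valid or does not correspond with the GUID provided.' },
  402: { ok: false, message: 'There are not enough pages left in the Application to process the request.' },
  500: { ok: false, message: 'There was an internal error processing your request.' },
};

function describeStatus(statusCode) {
  const known = EXTRACT_TEXT_STATUSES[statusCode];
  if (known) {
    return { statusCode, ...known };
  }
  // Assumption: any code not listed in the table is treated as a failure.
  return { statusCode, ok: false, message: 'Undocumented status code.' };
}
```

A response handler could then log `describeStatus(response.statusCode).message` whenever `ok` is false.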
When performing a single-service call, the method returns a unique identifier that can be used to query the progress of the extraction.
This method is available for free in our live Online Demo. You do not need an account and you can test out your own files to see the results.
//Simple script to make and process the results of an ExtractText request to the LEADTOOLS CloudServices.
const request = require('request');
var servicesUrl = "https://azure.leadtools.com/api/";
//The first page in the file to mark for processing
var firstPage = 1;
//Sending a value of -1 will indicate to the services that the rest of the pages in the file should be processed.
var lastPage = -1;
//We will upload the file via a URL. Files can also be passed by adding a PostFile to the request. Only 1 file will be accepted per request.
//The services use the following priority when determining what a request is trying to do: GUID > URL > Request Body Content
var fileURL = 'https://demo.leadtools.com/images/pdf/leadtools.pdf';
//Encode the file URL so that reserved characters in it cannot corrupt the query string.
var recognitionUrl = servicesUrl + 'Recognition/ExtractText?firstPage=' + firstPage + '&lastPage=' + lastPage + '&fileurl=' + encodeURIComponent(fileURL);
request.post(getRequestOptions(recognitionUrl), recognitionCallback);
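function recognitionCallback(error, response, body){
    if(error || response.statusCode !== 200){
        //Report transport errors and non-200 responses instead of failing silently.
        console.error("ExtractText request failed: " + (error ? error.message : response.statusCode));
        return;
    }
    //On success, the body contains the unique ID for this request.
    var guid = body;
    console.log("Unique ID returned by the Services: " + guid);
}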
function getRequestOptions(url){
    //Function to generate and return HTTP request options.
    var requestOptions = {
        url: url,
        headers: {
            'Content-Length': 0
        },
        auth: {
            user: "Enter Application ID",
            password: "Enter Application Password"
        }
    };
    return requestOptions;
}