This tutorial shows how to extract text data from an image using the LEADTOOLS Cloud Services in a Java application.
Overview | |
---|---|
Summary | This tutorial covers how to use LEADTOOLS Cloud Services to make an ExtractText request in a Java application. |
Completion Time | 30 minutes |
Project | Download tutorial project (7 KB) |
Platform | LEADTOOLS Cloud Services API |
IDE | IntelliJ |
Language | Java |
Development License | Download LEADTOOLS |
Be sure to review the following sites for information about LEADTOOLS Cloud Services API.
Create an Account with LEADTOOLS Hosted Cloud Services to obtain both Application ID and Password strings.
LEADTOOLS Service Plan offerings:
Service Plan | Description |
---|---|
Free Trial | Free Evaluation |
Page Packages | Prepaid Page Packs |
Subscriptions | Prepaid Monthly Processed Pages |
To further explore the offerings, refer to the LEADTOOLS Hosted Cloud Services page.
To obtain the necessary Application ID and Application Password, refer to Create an Account and Application with the LEADTOOLS Hosted Cloud Services.
In the IDE, create a new Java project with Maven, and add the following required Maven dependency to the pom.xml file:
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20210307</version>
</dependency>
With the project created and the package added, coding can begin. The code below uses java.net.http.HttpClient, so the project must target Java 11 or later.
In Project Files, open App.java. Add the following import statements at the top.
import org.json.JSONArray;
import org.json.JSONObject;
import java.io.File;
import java.io.FileNotFoundException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublisher;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
Add a method named extractText(). The extractText() method sends an ExtractText request to the LEADTOOLS Cloud Services API. If the request succeeds, the service returns a unique identifier (GUID), which is then used to query for the results.
Add the code below to the extractText() method.
private static void extractText() {
    HttpClient client = HttpClient.newHttpClient();
    // The first page in the file to mark for processing
    int firstPage = 1;
    // Sending a value of -1 indicates to the service that all pages in the file should be processed
    int lastPage = -1;
    // If using a URL to the file:
    String fileURL = "http://demo.leadtools.com/images/cloud_samples/ocr1-4.tif";
    String conversionUrl = String.format("Recognition/ExtractText?firstPage=%s&lastPage=%s&fileurl=%s", firstPage, lastPage, fileURL);
    Results results = postAsync(conversionUrl, null, client);
    // If uploading a file as multi-part content:
    // File uploadFile = new File("path/to/file");
    // String conversionUrl = String.format("Recognition/ExtractText?firstPage=%s&lastPage=%s", firstPage, lastPage);
    // Results results = postAsync(conversionUrl, uploadFile, client);
    if (results.getStatusCode() == 200) {
        // Unique ID returned by the services
        System.out.println("Unique ID returned by the services: " + results.getData());
        query(results.getData(), client);
    } else {
        System.out.println("Request failed with the following response: " + results.getStatusCode());
    }
}
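The listing above places the file URL into the query string as-is. The following is a minimal sketch of building the same request path with the file URL percent-encoded, which matters once a file URL contains spaces, `&`, or other reserved characters. The class name `ExtractTextUrlBuilder` and its `build` method are hypothetical helpers, not part of the tutorial project, and the sketch assumes the service decodes the `fileurl` value as a standard query parameter.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the tutorial project.
final class ExtractTextUrlBuilder {

    // Builds the relative ExtractText path and percent-encodes the file URL
    // so reserved characters cannot break the query string.
    static String build(int firstPage, int lastPage, String fileUrl) {
        String encoded = URLEncoder.encode(fileUrl, StandardCharsets.UTF_8);
        return String.format(
                "Recognition/ExtractText?firstPage=%d&lastPage=%d&fileurl=%s",
                firstPage, lastPage, encoded);
    }

    public static void main(String[] args) {
        // Same sample image URL as in extractText()
        System.out.println(build(1, -1,
                "http://demo.leadtools.com/images/cloud_samples/ocr1-4.tif"));
    }
}
```

The returned string could be passed to postAsync() in place of the conversionUrl built with String.format in extractText().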
Next, create a new method called query(String id, HttpClient client) that uses the GUID returned by the ExtractText request in the extractText() method. If the query succeeds, the response body contains all the request data in JSON format. Be sure to call the query() method inside the extractText() method, as shown above. Add the code below to the query() method.
private static void query(String id, HttpClient client) {
    String queryUrl = String.format("Query?id=%s", id);
    Results results;
    JSONObject returnedData = new JSONObject();
    int fileStatus = 0;
    // Poll the service until it reports a status other than 0, 100, or 123
    do {
        try {
            results = postAsync(queryUrl, null, client);
            returnedData = new JSONObject(results.getData());
            fileStatus = returnedData.getInt("FileStatus");
            if (fileStatus != 200) Thread.sleep(5000);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop polling
            Thread.currentThread().interrupt();
            return;
        }
    } while (fileStatus == 0 || fileStatus == 100 || fileStatus == 123);
    System.out.println("File has finished processing with return code: " + fileStatus);
    if (fileStatus != 200) {
        return;
    }
    parseJson(returnedData.get("RequestData").toString());
}
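The query() method polls without an upper bound, so it keeps running for as long as the service reports an unfinished status. The sketch below shows one way to cap the number of attempts. `StatusPoller`, `isPending`, and `pollUntilDone` are hypothetical names, the status values 0, 100, and 123 are taken from the loop condition above, and the status source is simulated rather than an actual Query call.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not part of the tutorial project.
final class StatusPoller {

    // Status values that the tutorial's query() loop treats as "not finished yet".
    static boolean isPending(int fileStatus) {
        return fileStatus == 0 || fileStatus == 100 || fileStatus == 123;
    }

    // Calls statusSource until it returns a non-pending status or maxAttempts
    // is reached, sleeping delayMillis after each pending result.
    // Returns the last status seen.
    static int pollUntilDone(IntSupplier statusSource, int maxAttempts, long delayMillis) {
        int status = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            status = statusSource.getAsInt();
            if (!isPending(status)) {
                return status;
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up early
                Thread.currentThread().interrupt();
                return status;
            }
        }
        return status;
    }

    public static void main(String[] args) {
        // Simulated status sequence standing in for repeated Query requests
        int[] simulated = {100, 123, 200};
        int[] index = {0};
        int finalStatus = pollUntilDone(() -> simulated[index[0]++], 10, 10);
        System.out.println("Final status: " + finalStatus);
    }
}
```

In the tutorial's code, the supplier would wrap the postAsync() call and return the parsed FileStatus value. A caller can then tell a finished request from an exhausted retry budget by checking isPending() on the result.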
Then, create two new methods named parseJson(String json) and postAsync(String path, Object body, HttpClient client). The parseJson() method is called inside the query() method, and the postAsync() method is called from both the extractText() and query() methods, as shown above.
Add the code below to the parseJson() method to process the returned JSON data.
private static void parseJson(String json) {
    JSONArray requestArray = new JSONArray(json);
    for (Object requestObject : requestArray) {
        if (requestObject instanceof JSONObject) {
            JSONObject requestReturn = (JSONObject) requestObject;
            System.out.println("Service Type: " + requestReturn.getString("ServiceType"));
            System.out.println("Returned Data:" + requestReturn.get("data").toString());
        }
    }
}
Add the code below to the postAsync() method to create the client connection through which the GUID and JSON data are requested. Where the code states Replace with Application ID and Replace with Application Password, substitute your own Application ID and Application Password.
private static Results postAsync(String path, Object body, HttpClient client) {
    String AppId = "Replace with Application ID";
    String Password = "Replace with Application Password";
    // HTTP Basic authentication: Base64-encoded "AppId:Password"
    String authHeader = "Basic " + Base64.getEncoder().encodeToString((AppId + ":" + Password).getBytes(StandardCharsets.UTF_8));
    String hostedServicesUrl = "https://azure.leadtools.com/api/";
    // Default body used when no file or publisher is supplied
    BodyPublisher thisBody = HttpRequest.BodyPublishers.ofString("null");
    if (body instanceof BodyPublisher) {
        thisBody = (BodyPublisher) body;
    } else if (body instanceof File) {
        try {
            thisBody = HttpRequest.BodyPublishers.ofFile(((File) body).toPath());
        } catch (FileNotFoundException e) {
            //e.printStackTrace();
        }
    }
    HttpRequest request = HttpRequest.newBuilder()
            .POST(thisBody)
            .uri(URI.create(hostedServicesUrl + path))
            .header("Accept", "application/json")
            .header("Authorization", authHeader)
            .header("Content-Type", "text/plain")
            .build();
    CompletableFuture<HttpResponse<String>> result = client.sendAsync(request, HttpResponse.BodyHandlers.ofString());
    Results results = new Results();
    results.setData("");
    results.setStatusCode(0);
    try {
        // Wait up to 5 seconds for the response body and status code
        results.setData(result.thenApply(HttpResponse::body).get(5, TimeUnit.SECONDS));
        results.setStatusCode(result.thenApply(HttpResponse::statusCode).get(5, TimeUnit.SECONDS));
    } catch (InterruptedException | ExecutionException | TimeoutException e) {
        e.printStackTrace();
    }
    return results;
}
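The Authorization header built in postAsync() is a standard HTTP Basic credential: the Application ID and Password joined by a colon and Base64-encoded. The sketch below isolates that step and, as one possible alternative to hard-coding the strings, reads them from environment variables. The class `BasicAuth` and the variable names `LEADTOOLS_APP_ID` and `LEADTOOLS_APP_PASSWORD` are hypothetical and are not defined by LEADTOOLS.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the tutorial project.
final class BasicAuth {

    // Builds an HTTP Basic Authorization header value from an ID and password.
    static String header(String appId, String password) {
        String credentials = appId + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Returns the environment variable's value, or the fallback when it is unset.
    static String envOrDefault(String name, String fallback) {
        String value = System.getenv(name);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        // Assumed variable names; choose whatever fits your deployment.
        String appId = envOrDefault("LEADTOOLS_APP_ID", "Replace with Application ID");
        String password = envOrDefault("LEADTOOLS_APP_PASSWORD", "Replace with Application Password");
        // Avoid printing the header itself, since it contains the credentials.
        System.out.println("Header length: " + header(appId, password).length());
    }
}
```

Keeping the credentials out of the source file makes it less likely that they end up in version control along with the sample code.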
The application requires a helper class to store the results of the POST requests. Create a new nested class called Results inside the App class and add the code below to it.
private static class Results {
    private String data;
    private int statusCode;
    public String getData() {
        return data;
    }
    public void setData(String data) {
        this.data = data;
    }
    public int getStatusCode() {
        return statusCode;
    }
    public void setStatusCode(int statusCode) {
        this.statusCode = statusCode;
    }
}
To test this code, add a call to extractText(); in the static void main method.
Run the project by pressing Alt F5, or by selecting Run -> Debug App.
If the steps were followed correctly, the console appears and the application displays the parsed text information from the returned JSON data.
This tutorial showed how to extract text information from an image via the LEADTOOLS Cloud Services API in a Java application.