LBitmap::AutoZone

Summary

Automatically detects different zone types (text, graphics and tables) in an image. Classifying zones by type improves the recognition results obtained from OCR pre-processing. This function is useful for any application that needs to automatically separate images, tables and text within mixed raster content (MRC) images.

Syntax

#include "ltwrappr.h"

L_INT LBitmap::AutoZone(phZones, puCount, uFlags = 0)

Parameters

HGLOBAL * phZones

Pointer to a handle to be updated with the detected zones. To retrieve the data of the detected zones, call the GlobalLock function on the handle and cast the returned pointer to pLEADZONE.

L_UINT32 * puCount

Pointer to a variable to be updated with the number of detected zones.

L_UINT32 uFlags

Flags indicating how the function should behave. You can combine values when appropriate by using a bitwise OR ( | ). Possible values are:

Flags indicating which types of zones to detect: (At least one is required)

Value Meaning
AUTOZONE_DETECT_TEXT [0x0001] Detect text zones.
AUTOZONE_DETECT_GRAPHIC [0x0002] Detect graphic zones.
AUTOZONE_DETECT_TABLE [0x0004] Detect table zones.
AUTOZONE_DETECT_ALL [0x00010007] Detect all zone types (text, graphics and tables).

Flags indicating whether to allow overlapping among zones: (Optional)

Value Meaning
AUTOZONE_DONT_ALLOW_OVERLAP [0x0000] Do not allow zones to overlap.
AUTOZONE_ALLOW_OVERLAP [0x0010] Allow zones to overlap.

Flags indicating how to merge text zones: (Optional)

Value Meaning
AUTOZONE_ACCURATE_ZONES [0x0000] Do not merge text zones. Keep them separated (as paragraphs).
AUTOZONE_GENERAL_ZONES [0x0100] Merge text zones as much as possible.

Flags indicating options for table zone detection: (Optional)

Value Meaning
AUTOZONE_DONT_RECOGNIZE_ONE_CELL_TABLE [0x0000] Ignore one-cell tables (borders). Detect what is inside.
AUTOZONE_RECOGNIZE_ONE_CELL_TABLE [0x1000] Consider borders to be one-cell tables.
AUTOZONE_NORMAL_TABLE (version 17 or after) [0x0000] Use normal table detection.
AUTOZONE_ADVANCED_TABLE (version 17 or after) [0x2000] Use advanced table detection which is more accurate and which can detect complex tables.
AUTOZONE_LINES_RECONSTRUCTION (version 17 or after) [0x4000] Use line reconstruction to connect broken lines and for patterned tables.

Flags indicating whether to use multi-threading: (Optional)

Value Meaning
AUTOZONE_USE_MULTITHREADING [0x00000000] Use multi-threading (faster for multi-core CPUs).
AUTOZONE_DONTUSE_MULTITHREADING [0x80000000] Do not use multi-threading (Use this with single-core CPUs).

Flag indicating that the bitmap should be modified so that it contains only text: (Optional)

Value Meaning
AUTOZONE_TEXT_DETECTION [0x8000] If set, the function does not return text zones in phZones. Instead, it modifies the class object's bitmap so that it contains text areas only: graphic and table areas are deleted from the input image.

NOTE: Make a copy of the original image if you want to keep it. When the AUTOZONE_TEXT_DETECTION flag is set, the original image is modified.

Flags indicating the type of document to be auto-zoned: (Optional)

Value Meaning
AUTOZONE_TEXTBOOK [0x40000] Use improved zoning on a book's pages.
AUTOZONE_FAVOR_GRAPHICS [0x400000] Use improved detection of figure zones in documents.

Flag indicating whether to detect vertical text: (Optional)

Value Meaning
AUTOZONE_DETECT_VERTICAL_TEXT [0x100000] Detect vertical text lines.

Flag indicating whether to detect checkbox zones and the sensitivity of the detection: (Optional)

Value Meaning
AUTOZONE_DETECT_CHECKBOX [0x10000] Set this flag to detect checkbox zones in the document.
AUTOZONE_CHECKBOX_SENSITIVITY_HIGH [0x00000000] The sensitivity of checkbox shape matching is high, so the number of false negative detections is low.
AUTOZONE_CHECKBOX_SENSITIVITY_LOW [0x200000] The sensitivity of checkbox shape matching is low, so the number of false negative detections is high.

Flag indicating that the document contains Asian zones:

Value Meaning
AUTOZONE_ASIAN_ZONING [0x200] Detect Asian text (Japanese, Chinese, Korean, etc.)

Flag indicating that the document contains handwritten zones:

Value Meaning
AUTOZONE_ICR_ZONING [0x00000080] Detect handwritten text.
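
The flag groups above are combined with a bitwise OR. The following is a minimal sketch of that composition, not LEADTOOLS code: the constants are redeclared locally with the values listed in the tables so that the snippet compiles on its own (a real application takes them from the LEADTOOLS headers), and the names kDetectText, HasDetectFlag, ExampleFlags and so on are hypothetical.

```cpp
#include <cstdint>

// Values copied from the tables above. They are redeclared here only so the
// sketch compiles without the LEADTOOLS headers; the k-prefixed names are
// hypothetical stand-ins for the AUTOZONE_* constants.
constexpr std::uint32_t kDetectText            = 0x0001;      // AUTOZONE_DETECT_TEXT
constexpr std::uint32_t kDetectGraphic         = 0x0002;      // AUTOZONE_DETECT_GRAPHIC
constexpr std::uint32_t kDetectTable           = 0x0004;      // AUTOZONE_DETECT_TABLE
constexpr std::uint32_t kDetectCheckbox        = 0x10000;     // AUTOZONE_DETECT_CHECKBOX
constexpr std::uint32_t kDetectAll             = 0x00010007;  // AUTOZONE_DETECT_ALL
constexpr std::uint32_t kAllowOverlap          = 0x0010;      // AUTOZONE_ALLOW_OVERLAP
constexpr std::uint32_t kAdvancedTable         = 0x2000;      // AUTOZONE_ADVANCED_TABLE
constexpr std::uint32_t kDontUseMultithreading = 0x80000000u; // AUTOZONE_DONTUSE_MULTITHREADING

// Going by the listed values, DETECT_ALL is the three detection bits plus the
// 0x10000 bit that the checkbox table lists for AUTOZONE_DETECT_CHECKBOX.
static_assert((kDetectText | kDetectGraphic | kDetectTable | kDetectCheckbox) == kDetectAll,
              "DETECT_ALL decomposes into the listed bits");

// True when at least one of the text/graphic/table detection flags is set,
// which the documentation above states is required.
constexpr bool HasDetectFlag(std::uint32_t flags) {
    return (flags & (kDetectText | kDetectGraphic | kDetectTable)) != 0;
}

// Example combination: text and table zones, advanced table detection,
// multi-threading disabled.
constexpr std::uint32_t ExampleFlags() {
    return kDetectText | kDetectTable | kAdvancedTable | kDontUseMultithreading;
}
```

Flags whose value is 0x0000 (for example AUTOZONE_ACCURATE_ZONES) are the defaults of their group, so leaving them out of the OR expression has the same effect as including them.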

Returns

Value Meaning
SUCCESS The function was successful.
< 1 An error occurred. Refer to Return Codes.

Comments

After using this function, free the phZones using the LBitmap::FreeZoneData function.
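
Because the zone data has to be freed on every exit path, including early error returns, one option is a small scope guard. This is a sketch in standard C++ only, under the assumption that the cleanup in a real application would be the LBitmap::FreeZoneData call; ScopeGuard and ProcessZones are hypothetical names, and the counter stands in for the actual free call.

```cpp
#include <functional>
#include <utility>

// Runs the stored cleanup callable when the guard goes out of scope.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> cleanup) : cleanup_(std::move(cleanup)) {}
    ~ScopeGuard() { if (cleanup_) cleanup_(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
private:
    std::function<void()> cleanup_;
};

// Hypothetical stand-in for code that works with detected zones.
// freeCalls counts how often the cleanup ran; in a real application the
// lambda would call FreeZoneData on the handle returned by AutoZone.
int ProcessZones(bool failEarly, int& freeCalls) {
    ScopeGuard guard([&freeCalls] { ++freeCalls; });
    if (failEarly)
        return -1;  // cleanup still runs on this early return
    return 1;       // cleanup runs here as well
}
```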

This function does not support 12- and 16-bit grayscale images or 48- and 64-bit color images. If the image is 12- or 16-bit grayscale, or 48- or 64-bit color, the function will not return an error.

This function does not support signed data images. It returns the ERROR_SIGNED_DATA_NOT_SUPPORTED error code if a signed data image is passed to this function.

This function does not support 32-bit grayscale images. It returns the ERROR_GRAY32_UNSUPPORTED error code if a 32-bit grayscale image is passed to this function.
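
Since some unsupported formats are rejected with an error code while others are not, an application may want to check the image format itself before calling the function. The sketch below encodes the three rules above as a plain C++ helper. ImageInfo, ZoneSupport and CheckAutoZoneSupport are hypothetical names, the image properties would come from the bitmap in a real application, and the order in which the checks are made is an assumption.

```cpp
// Hypothetical description of an image; not a LEADTOOLS type.
struct ImageInfo {
    int  bitsPerPixel;
    bool grayscale;
    bool signedData;
};

enum class ZoneSupport {
    Supported,            // none of the documented restrictions apply
    UnsupportedNoError,   // 12/16-bit grayscale or 48/64-bit color: no error is returned
    ErrorSignedData,      // documented as ERROR_SIGNED_DATA_NOT_SUPPORTED
    ErrorGray32           // documented as ERROR_GRAY32_UNSUPPORTED
};

// Classifies an image according to the restrictions listed above.
// The order of the checks is an assumption of this sketch.
ZoneSupport CheckAutoZoneSupport(const ImageInfo& info) {
    if (info.signedData)
        return ZoneSupport::ErrorSignedData;
    if (info.grayscale && info.bitsPerPixel == 32)
        return ZoneSupport::ErrorGray32;
    if (info.grayscale && (info.bitsPerPixel == 12 || info.bitsPerPixel == 16))
        return ZoneSupport::UnsupportedNoError;
    if (!info.grayscale && (info.bitsPerPixel == 48 || info.bitsPerPixel == 64))
        return ZoneSupport::UnsupportedNoError;
    return ZoneSupport::Supported;
}
```

The UnsupportedNoError case is the one worth guarding against explicitly, because the function's return value alone does not reveal it.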

The behavior of this function can be modified by overriding LBitmap::AutoZoneCallback.

Required DLLs and Libraries

Platforms

Win32, x64.

See Also

Functions

Topics

Example

Detects the image zones and returns the results.

L_INT LBitmap__AutoZoneBitmapExample(LBitmap & LeadBitmap)
{
   L_INT nRet;
   HGLOBAL pointer = NULL;
   L_UINT32 count = 0;
   L_INT TextZonesCount = 0;
   L_INT GraphicZonesCount = 0;
   L_INT TableZonesCount = 0;
   L_INT CellsCount = 0;
   L_UINT i;

   nRet = LeadBitmap.Load(MAKE_IMAGE_PATH(TEXT("Clean.tif")));
   if(nRet != SUCCESS)
      return nRet;

   // At least one detection flag is required: request text, graphic and table zones.
   nRet = LeadBitmap.AutoZone(&pointer, &count,
                              AUTOZONE_DETECT_TEXT | AUTOZONE_DETECT_GRAPHIC | AUTOZONE_DETECT_TABLE);
   if(nRet != SUCCESS)
      return nRet;

   // Lock the handle to access the array of detected zones.
   LEADZONE * pZone = (LEADZONE *)GlobalLock(pointer);
   if(pZone != NULL)
   {
      for(i = 0; i < count; i++)
      {
         if (pZone[i].uZoneType == LEAD_ZONE_TYPE_TEXT)
         {
            TextZonesCount++;
         }

         if (pZone[i].uZoneType == LEAD_ZONE_TYPE_GRAPHIC)
         {
            GraphicZonesCount++;
         }

         if (pZone[i].uZoneType == LEAD_ZONE_TYPE_TABLE)
         {
            TableZonesCount++;
            TABLEZONE * pTable = (TABLEZONE *)GlobalLock(pZone[i].pZoneData);
            if(pTable != NULL)
            {
               // Accumulate the cell count over all detected tables.
               CellsCount += pTable->Rows * pTable->Columns;

               // GlobalUnlock takes the handle, not the locked pointer.
               GlobalUnlock(pZone[i].pZoneData);
            }
         }
      }

      // Unlock the zones before their data is freed.
      GlobalUnlock(pointer);
   }

   nRet = LeadBitmap.FreeZoneData(pointer, count);
   if(nRet != SUCCESS)
      return nRet;

   return SUCCESS;
}

Help Version 22.0.2023.2.2
© 1991-2023 LEAD Technologies, Inc. All Rights Reserved.

LEADTOOLS Raster Imaging C++ Class Library Help