Using Filter Data to Speed up Loading Large Files

Some file formats become slow to load or convert when they contain many pages. That is because reaching page N sometimes requires reading through pages 1, 2, ... N-1 first. For such formats, loading page N usually gets slower as N increases.
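The growth in cost can be made concrete with a small model. The sketch below is not LEADTOOLS code; StepsToReach and TotalSteps are hypothetical names, and the model simply assumes that reaching a page costs one step per page that has to be walked. It compares loading every page from scratch with loading every page while remembering the position reached so far.

```c
#include <assert.h>

/* Hypothetical cost model: number of pages that must be walked to reach
   page nTarget, starting from the last known page (0 = nothing known). */
static unsigned StepsToReach(unsigned nKnownPage, unsigned nTarget)
{
   assert(nTarget >= 1);
   /* A known position past the target is of no use: restart from the beginning */
   if (nKnownPage > nTarget)
      nKnownPage = 0;
   return nTarget - nKnownPage;
}

/* Total steps needed to load pages 1..nPages one after another.
   bKeepState != 0 models reusing saved state between loads. */
static unsigned long long TotalSteps(unsigned nPages, int bKeepState)
{
   unsigned long long total = 0;
   unsigned known = 0;
   for (unsigned n = 1; n <= nPages; n++)
   {
      total += StepsToReach(bKeepState ? known : 0, n);
      known = n;
   }
   return total;
}
```

Under this model, converting a 1000-page file without saved state costs 1 + 2 + ... + 1000 = 500500 steps, while reusing saved state costs 1000 steps.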

LEADTOOLS provides a mechanism for speeding up the loading or conversion of such files. File formats that can be handled this way include the following:

To use this mechanism, perform the following steps:

  1. Get the file's filter data by calling L_FileInfo with the FILEINFO_USEFILTERDATA flag.
  2. For each page, pass the filter data to the load or convert function through the LOADFILEOPTION structure.
  3. Free the filter data using L_FreeFilterData.

Do not use the filter data in more than one thread at a time, for the following reasons:

Example:

/* This helper function saves one page as a JPEG file in the specified folder */ 
static L_INT SavePage(pBITMAPHANDLE pBitmap, L_INT nPageNumber, L_TCHAR *pszDir) 
{ 
   L_TCHAR szPath[MAX_PATH]; 
   _stprintf_s(szPath, _countof(szPath), _T("%s%d.jpg"), pszDir, nPageNumber); 
   return L_SaveBitmap(szPath, pBitmap, FILE_JPEG_411, 0, 20, NULL); 
} 
/* This function shows how to use filter data to speed up the conversion of a file with many pages */ 
L_INT ConvertFileWithManyPages() 
{ 
#define SRC_PDF_FILE _T("c:\\temp\\FileWithManyPages.pdf") 
   BITMAPHANDLE Bitmap = {0}; 
   FILEINFO fileInfo = {0}; 
   LOADFILEOPTION LoadFileOption; 
   L_GetDefaultLoadFileOption(&LoadFileOption, sizeof(LoadFileOption)); 
   LoadFileOption.PageNumber = 1; 
   /* calculate the total number of pages and get the filter data */ 
   L_INT nRet = L_FileInfo(SRC_PDF_FILE, &fileInfo, sizeof(FILEINFO), FILEINFO_TOTALPAGES | FILEINFO_USEFILTERDATA, &LoadFileOption); 
   if(nRet == SUCCESS) 
   { 
      /* Copy the filter data into the LOADFILEOPTION structure to speed up the load. 
      Keeping it there also preserves it, because the FILEINFO structure will be overwritten. 
      */ 
      if(fileInfo.pFilterData) 
      { 
         LoadFileOption.pFilterData = fileInfo.pFilterData; 
         LoadFileOption.uFilterDataSize = fileInfo.uFilterDataSize; 
         LoadFileOption.nFilter = fileInfo.nFilter; 
         LoadFileOption.Flags2 |= ELO2_USEFILTERDATA; 
      } 
      for(L_INT i = 1; i <= fileInfo.TotalPages; i++) 
      { 
         /* The fileInfo structure is complete only for the first page. For the others, it only indicates the format */ 
         if(i > 1) 
            fileInfo.Flags = FILEINFO_FORMATVALID; 
         LoadFileOption.PageNumber = i; 
         nRet = L_LoadBitmap(SRC_PDF_FILE, &Bitmap, sizeof(BITMAPHANDLE), 0, ORDER_BGRORGRAY, &LoadFileOption, &fileInfo); 
         if(nRet == SUCCESS) 
         { 
            nRet = SavePage(&Bitmap, LoadFileOption.PageNumber, _T("c:\\temp\\out\\")); 
            L_FreeBitmap(&Bitmap); 
         } 
         if(nRet != SUCCESS) 
            break; 
      } 
      /* Free the filter data */ 
      L_FreeFilterData(LoadFileOption.nFilter, LoadFileOption.pFilterData, LoadFileOption.uFilterDataSize, L_TRUE); 
   } 
   return nRet; 
} 

The above example uses filter data to convert all of the pages of a PDF file. For a PDF file, the speed improvement will be noticeable when large files (1000 pages or more) are converted. For other file formats, the speed improvement can be significant even with fewer pages (as few as 10 pages).

A file's filter data depends on the contents of the source file, so use the filter data only with the file for which it was created. Also, do not use the filter data after the file contents change. Whenever file contents change, free the filter data and retrieve another copy by calling L_FileInfo again.
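One way to notice that a file has changed is to record its size and modification time when the filter data is retrieved, and to compare them before each reuse. The sketch below is an assumption about how an application might do this, not part of the LEADTOOLS API: FileStamp, GetFileStamp, and FilterDataIsStale are hypothetical names, and the check relies only on the standard stat function.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical snapshot of the file state, taken when the filter data is retrieved */
typedef struct
{
   long long size;   /* file size in bytes */
   time_t    mtime;  /* last modification time */
} FileStamp;

/* Fills pStamp for pszFile. Returns 0 on success, -1 if the file cannot be examined. */
static int GetFileStamp(const char *pszFile, FileStamp *pStamp)
{
   struct stat st;
   assert(pszFile != NULL && pStamp != NULL);
   if (stat(pszFile, &st) != 0)
      return -1;
   pStamp->size = (long long)st.st_size;
   pStamp->mtime = st.st_mtime;
   return 0;
}

/* Returns 1 if the file no longer matches the snapshot (or cannot be examined),
   meaning the cached filter data should be freed and retrieved again. */
static int FilterDataIsStale(const char *pszFile, const FileStamp *pCached)
{
   FileStamp now;
   assert(pCached != NULL);
   if (GetFileStamp(pszFile, &now) != 0)
      return 1;
   return now.size != pCached->size || now.mtime != pCached->mtime;
}
```

An application would call GetFileStamp right after the L_FileInfo call that returns the filter data, and FilterDataIsStale before each later use. Size and modification time are only a heuristic: an edit that preserves both goes undetected, so an application that modifies the file itself should refresh the filter data unconditionally.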

You can use filter data to speed up any function that takes a LOADFILEOPTION structure, such as L_FileInfo, L_FileConvert, and the L_LoadXXX functions (all the load functions). For a list of functions that utilize the LOADFILEOPTION structure, refer to the Usage section of its documentation.

TIFF/BigTIFF files use a simpler mechanism (the IFD, or file offset of each page). For more information about using the IFD, refer to Loading and Saving Large TIFF/BigTIFF Files.
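To illustrate why a page offset is enough for TIFF, the sketch below walks the IFD chain of a classic TIFF held in memory and records the offset of each IFD. It is based on the general TIFF layout (an 8-byte header, then IFDs linked by a "next IFD" offset), not on any LEADTOOLS function; CollectIfdOffsets and the helper readers are hypothetical names, and the sketch assumes little-endian classic TIFF only (BigTIFF uses wider fields).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t ReadU16LE(const uint8_t *p)
{
   return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t ReadU32LE(const uint8_t *p)
{
   return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
          ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walks the IFD chain of a little-endian classic TIFF in memory.
   Stores up to nMax IFD offsets (one per page) in pOffsets.
   Returns the number of offsets stored, or -1 if the data is malformed. */
static int CollectIfdOffsets(const uint8_t *pData, size_t nSize,
                             uint32_t *pOffsets, int nMax)
{
   int nCount = 0;
   uint32_t nOffset;
   assert(pData != NULL && pOffsets != NULL);
   /* Header: "II", the number 42, then the offset of the first IFD */
   if (nSize < 8 || pData[0] != 'I' || pData[1] != 'I' || ReadU16LE(pData + 2) != 42)
      return -1;
   nOffset = ReadU32LE(pData + 4);
   while (nOffset != 0 && nCount < nMax)
   {
      uint16_t nEntries;
      uint64_t nNextPos;
      if (nOffset > nSize - 2)
         return -1;
      /* IFD: 2-byte entry count, 12 bytes per entry, 4-byte next-IFD offset */
      nEntries = ReadU16LE(pData + nOffset);
      nNextPos = (uint64_t)nOffset + 2 + (uint64_t)nEntries * 12;
      if (nNextPos + 4 > nSize)
         return -1;
      pOffsets[nCount++] = nOffset;
      nOffset = ReadU32LE(pData + nNextPos);
   }
   return nCount;
}
```

Once the offsets have been collected in a single pass, any page can be located directly from its offset instead of walking the chain from the first IFD again.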

Help Version 22.0.2023.7.11
Products | Support | Contact Us | Intellectual Property Notices
© 1991-2023 LEAD Technologies, Inc. All Rights Reserved.

LEADTOOLS Raster Imaging C API Help