Efficiently Convert a Document to an Image

Posted on 2019-12-19 Nick Villalobos

One of the new features in the latest LEADTOOLS V19 update is a refactored load algorithm that greatly reduces load times for document formats such as PDF, MS Office 97-2013 formats (Word, Excel, and PowerPoint), and TXT. The speedup scales with the number of pages in the document: the more pages, the greater the gain.

Below is a C# code snippet showing how to use the new feature. The new calls, StartOptimizedLoad and StopOptimizedLoad, are marked with comments.


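// Required namespaces (place these at the top of the source file)
using System;
using System.IO;
using Leadtools;
using Leadtools.Codecs;

// Loads each page of a document at 300 DPI and appends it to the output image file
private static void ConvertDocumentToImage(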
    string inputFile,
    string outputFile,
    RasterImageFormat outputFormat,
    int bitsPerPixel)
{
    if (!File.Exists(inputFile))
        throw new ArgumentException($"{inputFile} not found.", nameof(inputFile));

    if (bitsPerPixel != 0 && bitsPerPixel != 1 && bitsPerPixel != 2 && bitsPerPixel != 4 &&
        bitsPerPixel != 8 && bitsPerPixel != 16 && bitsPerPixel != 24 && bitsPerPixel != 32)
        throw new ArgumentOutOfRangeException(nameof(bitsPerPixel), bitsPerPixel, 
            $"Invalid {nameof(bitsPerPixel)} value");

    using (var codecs = new RasterCodecs())
    {
        codecs.Options.RasterizeDocument.Load.XResolution = 300;
        codecs.Options.RasterizeDocument.Load.YResolution = 300;

        // NEW: signals the start of a series of loads from the same source file
        codecs.StartOptimizedLoad();

        var totalPages = codecs.GetTotalPages(inputFile);
        if (totalPages > 1 && !RasterCodecs.FormatSupportsMultipageSave(outputFormat))
            throw new NotSupportedException(
                $"The {outputFormat} format does not support multiple pages.");

        for (var pageNumber = 1; pageNumber <= totalPages; pageNumber++)
        {
            Console.WriteLine($"Loading and saving page {pageNumber}");
            using (var rasterImage = codecs.Load(inputFile, bitsPerPixel, CodecsLoadByteOrder.Bgr, pageNumber, pageNumber))
                codecs.Save(rasterImage, outputFile, outputFormat, bitsPerPixel, 1, -1, 1, CodecsSavePageMode.Append);
        }

        // NEW: signals the end of the series of loads from the source file
        codecs.StopOptimizedLoad();
    }
}
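
To try the method, something like the following console entry point could be used. This is only a sketch under a few assumptions: the file paths are hypothetical, the Program class name is arbitrary, ConvertDocumentToImage is assumed to have been pasted into the same class, and a valid LEADTOOLS license is assumed to have been set beforehand. Because the method saves with CodecsSavePageMode.Append, the sketch deletes any existing output file first so that pages are not appended to an older result.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Leadtools;

internal static class Program
{
    private static void Main()
    {
        // Hypothetical paths, for illustration only; replace with real files.
        const string inputFile = @"C:\Temp\input.pdf";
        const string outputFile = @"C:\Temp\output.tif";

        // Assumption: the LEADTOOLS license has already been set at this point.

        // The conversion appends pages, so start from a clean output file.
        // File.Delete does nothing if the file does not exist.
        File.Delete(outputFile);

        // Time the conversion to see the effect of the optimized load.
        var stopwatch = Stopwatch.StartNew();
        ConvertDocumentToImage(inputFile, outputFile, RasterImageFormat.Tif, 24);
        stopwatch.Stop();

        Console.WriteLine($"Converted in {stopwatch.ElapsedMilliseconds} ms");
    }

    // Paste ConvertDocumentToImage from the listing above here.
}
```

Running this once as written, and once with the StartOptimizedLoad and StopOptimizedLoad calls commented out, gives a rough way to compare load times on a multipage document.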

 

Get the Code

A Visual Studio 2017 Windows Console project that includes the sample code above, set up to convert documents to TIFF, is available for download.

 

 

 
