Digital images are everywhere you look. There’s no escaping them. They can be found in just about every email, they’re all over social media, and they can be embedded throughout PDFs. Some people embed images into PDFs to make the document look better or to provide visuals. Others, such as insurance adjusters, embed images because they need them on record for legal reasons.
Let’s keep the focus on embedded images in PDFs, and how you can use the LEADTOOLS PDF SDK to extract them. A PDF contains different types of objects, including Text, Rectangle, and Image. To extract the images from a PDF, LEADTOOLS provides the DecodeImage method of the PDFDocument class. This method does exactly what you would think: it decodes the specified image object embedded in the PDF document.
The following is the core code for extracting all image objects from a PDF.
using (PDFDocument document = new PDFDocument(sourceFileNamePath))
{
    document.Resolution = 200;
    // Parse the objects in all pages
    document.ParsePages(PDFParsePagesOptions.Objects, 1, -1);
    using (RasterCodecs codecs = new RasterCodecs())
    {
        // Look through each page in the document
        foreach (PDFDocumentPage page in document.Pages)
        {
            // Check the page for PDFObjects
            if (page.Objects != null && page.Objects.Count > 0)
            {
                // If the object type is an image, save it
                foreach (PDFObject obj in page.Objects)
                {
                    if (obj.ObjectType == PDFObjectType.Image)
                    {
                        using (RasterImage image = document.DecodeImage(obj.ImageObjectNumber))
                        {
                            codecs.Save(image, destinationFileNamePath, RasterImageFormat.Png,
                                image.BitsPerPixel, 1, 1, -1, CodecsSavePageMode.Append);
                        }
                    }
                }
            }
        }
    }
}
I also have a full project that will scan all PDFs from a given directory and extract all image objects. The application will then save each image to disk in its own folder based on the initial file name. Using my example from earlier, insurance adjusters who create PDFs with embedded images could use this to extract images of accidents, damage to property, etc.
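To give a rough idea of how the directory scan and the per-file output folders could fit together, here is a minimal sketch. It is not the full project itself. The class name PdfImageExtractor, the helper BuildImagePath, and the image_0001.png naming scheme are hypothetical names chosen for illustration, and the sketch assumes your LEADTOOLS license has already been set and that the four-argument RasterCodecs.Save overload is available in your SDK version. The PDF calls are the same ones used in the core code above.

```csharp
using System;
using System.IO;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Pdf;

static class PdfImageExtractor
{
    // Hypothetical helper: builds "<outputRoot>/<pdf name without extension>/image_0001.png".
    public static string BuildImagePath(string outputRoot, string pdfPath, int index)
    {
        string folder = Path.Combine(outputRoot, Path.GetFileNameWithoutExtension(pdfPath));
        return Path.Combine(folder, $"image_{index:D4}.png");
    }

    // Scans inputDirectory for PDFs and writes each embedded image to its own PNG file.
    public static void ExtractAll(string inputDirectory, string outputRoot)
    {
        using (RasterCodecs codecs = new RasterCodecs())
        {
            foreach (string pdfPath in Directory.EnumerateFiles(inputDirectory, "*.pdf"))
            {
                int index = 0; // running image counter for this PDF
                using (PDFDocument document = new PDFDocument(pdfPath))
                {
                    document.Resolution = 200;
                    // Parse the objects in all pages
                    document.ParsePages(PDFParsePagesOptions.Objects, 1, -1);
                    foreach (PDFDocumentPage page in document.Pages)
                    {
                        if (page.Objects == null) continue;
                        foreach (PDFObject obj in page.Objects)
                        {
                            if (obj.ObjectType != PDFObjectType.Image) continue;
                            string target = BuildImagePath(outputRoot, pdfPath, ++index);
                            // Create the per-PDF folder on first use (no-op if it exists)
                            Directory.CreateDirectory(Path.GetDirectoryName(target));
                            using (RasterImage image = document.DecodeImage(obj.ImageObjectNumber))
                            {
                                // Assumed overload: Save(image, fileName, format, bitsPerPixel)
                                codecs.Save(image, target, RasterImageFormat.Png, image.BitsPerPixel);
                            }
                        }
                    }
                }
            }
        }
    }
}
```

Calling PdfImageExtractor.ExtractAll with an input folder and an output folder would then produce one subfolder per PDF. Writing each image to its own file avoids relying on appending pages to a single output file, which only works for formats that support multiple pages.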
Support
Need help getting this sample up and going? Contact our support team for free technical support! For pricing or licensing questions, you can contact our sales team (sales@leadtools.com) or call us at 704-332-5532.
Hi Team,
We have a requirement to compare two PDF files pixel by pixel, so that any images in the PDF files are compared as well. Is this type of comparison possible with the LEADTOOLS comparison features in .NET Core?
Regards,
Sirisha
Hello Sirisha,
The LEADTOOLS SDK can accomplish this type of comparison. We have an online demo that showcases how we achieve this functionality. The UI is in HTML but the actual comparisons are done on the server in .NET. You can find the demo here:
https://demo.leadtools.com/JavaScript/DocumentComparison/index.html
There is a tutorial walkthrough for setting up and using this demo here:
https://www.leadtools.com/help/sdk/v22/tutorials/html5-get-started-with-the-document-compare-demo.html
The documentation for the Leadtools.Document.Compare namespace can be found here:
https://www.leadtools.com/help/sdk/v22/dh/dc/namespace.html
Feel free to download our free 60-day evaluation from this link:
https://www.leadtools.com/downloads
And also feel free to reach out to our free technical support via email or chat:
support@leadtools.com
https://www.leadtools.com/support/chat
Thanks,
Zac