Handling images in PDF documents with C# and .NET

SautinSoft.Pdf supports exporting images from PDF files in JPEG, BMP, PNG, and TIFF formats. Extracting images from PDF documents is a common task in data analysis, digital archiving, and content repurposing. This article shows how to extract images from a PDF in C# and .NET with the SautinSoft.Pdf library.

Extracting images from PDFs can be useful for:

  • Reusing images in other documents or presentations.
  • Analyzing visual data.
  • Archiving images separately for better organization.
  • Enhancing content management systems.

The following example shows how to export a single image from a PDF file:

  1. Add SautinSoft.Pdf from NuGet.
  2. Load a PDF document.
  3. Iterate through PDF pages.
  4. Get all image content elements on the page.
  5. Export the first image element to an image file.
  6. Save the image.

Complete code (C#)

using System;
using System.IO;
using System.Linq;
using SautinSoft;
using SautinSoft.Pdf;
using SautinSoft.Pdf.Content;

namespace Sample
{
    class Sample
    {
        /// <summary>
        /// Export the first image found in a PDF file.
        /// </summary>
        /// <remarks>
        /// Details: https://sautinsoft.com/products/pdf/help/net/developer-guide/extract-images-from-pdf.php
        /// </remarks>
        static void Main(string[] args)
        {
            // Before starting this example, please get a free 100-day trial key:
            // https://sautinsoft.com/start-for-free/

            // Apply the key here:
            // PdfDocument.SetLicense("...");

            string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");

            using (var document = PdfDocument.Load(pdfFile))
            {
                // Iterate through PDF pages.
                foreach (var page in document.Pages)
                {
                    // Get all image content elements on the page.
                    var imageElements = page.Content.Elements.All().OfType<PdfImageContent>().ToList();

                    // Export the first image element to an image file.
                    if (imageElements.Count > 0)
                    {
                        imageElements[0].Save("Export Images.jpeg");
                        System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo("Export Images.jpeg") { UseShellExecute = true });
                        break;
                    }
                }
            }
        }
    }
}

Complete code (VB.NET)

Option Infer On

Imports System
Imports System.IO
Imports System.Linq
Imports SautinSoft
Imports SautinSoft.Pdf
Imports SautinSoft.Pdf.Content

Namespace Sample
	Friend Class Sample
		''' <summary>
		''' Export the first image found in a PDF file.
		''' </summary>
		''' <remarks>
		''' Details: https://sautinsoft.com/products/pdf/help/net/developer-guide/extract-images-from-pdf.php
		''' </remarks>
		Shared Sub Main(ByVal args() As String)
			' Before starting this example, please get a free license:
			' https://sautinsoft.com/start-for-free/

			' Apply the key here:
			' PdfDocument.SetLicense("...")

			Dim pdfFile As String = Path.GetFullPath("..\..\..\simple text.pdf")

			Using document = PdfDocument.Load(pdfFile)
				' Iterate through PDF pages.
				For Each page In document.Pages
					' Get all image content elements on the page.
					Dim imageElements = page.Content.Elements.All().OfType(Of PdfImageContent)().ToList()

					' Export the first image element to an image file.
					If imageElements.Count > 0 Then
						imageElements(0).Save("Export Images.jpeg")
						System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo("Export Images.jpeg") With {.UseShellExecute = True})
						Exit For
					End If
				Next page
			End Using
		End Sub
	End Class
End Namespace
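
The samples above stop at the first image they find. The following is a minimal C# sketch of exporting every image on every page instead. It uses only the calls already shown in the sample (PdfDocument.Load, page.Content.Elements.All() and PdfImageContent.Save) and assumes they behave the same way inside a loop. The BuildImageFileName helper, the ExportedImages folder, and the file-name pattern are hypothetical names introduced for illustration; the .jpeg extension is carried over from the sample.

```csharp
using System;
using System.IO;
using System.Linq;
using SautinSoft.Pdf;
using SautinSoft.Pdf.Content;

namespace SampleAllImages
{
    public static class ImageExportSketch
    {
        // Hypothetical helper (not part of SautinSoft.Pdf): builds a unique,
        // sortable file name from 1-based page and image numbers.
        public static string BuildImageFileName(int pageNumber, int imageNumber)
        {
            return $"page-{pageNumber:D3}-image-{imageNumber:D3}.jpeg";
        }

        static void Main(string[] args)
        {
            // Same input file as in the sample above.
            string pdfFile = Path.GetFullPath(@"..\..\..\simple text.pdf");

            // Hypothetical output folder, created if it does not exist yet.
            string outputDir = Path.GetFullPath("ExportedImages");
            Directory.CreateDirectory(outputDir);

            using (var document = PdfDocument.Load(pdfFile))
            {
                int pageNumber = 0;

                // Iterate through PDF pages.
                foreach (var page in document.Pages)
                {
                    pageNumber++;

                    // Get all image content elements on the page.
                    var imageElements = page.Content.Elements.All().OfType<PdfImageContent>().ToList();

                    // Save every image on the page instead of only the first one.
                    for (int i = 0; i < imageElements.Count; i++)
                    {
                        string outputFile = Path.Combine(outputDir, BuildImageFileName(pageNumber, i + 1));
                        imageElements[i].Save(outputFile);
                    }
                }
            }
        }
    }
}
```

Numbering the files by page and position keeps the exported images in document order and avoids overwriting one file with the next, which would happen if the fixed name from the single-image sample were reused.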

If you need a code sample or have a question, email us at support@sautinsoft.ru. Questions and suggestions are always welcome!

We have been developing .NET components since 2002. We know the PDF, DOCX, RTF, HTML, XLSX, and image formats. If you need help creating, modifying, or converting documents in various formats, we can help, and we will write any code sample for you free of charge.