iText
Download .NET prerequisites
set-strictMode -version 3
add-Type -assembly System.IO.Compression.FileSystem
function download-NuGet-package {
    param (
        [string  ] $name,
        [string  ] $version,
        [string[]] $dllPaths
    )
    write-host "Downloading $name - $version"
    # A NuGet package is a zip file, so it can be opened with ZipFile.
    $downloadURL = "https://www.nuget.org/api/v2/package/$name/$version"
    invoke-webRequest $downloadURL -outFile $env:temp/nugetPkg.zip
    $zip = [IO.Compression.ZipFile]::OpenRead("$env:temp/nugetPkg.zip")
    foreach ($dllPath in $dllPaths) {
        # Keep only the file name, that is, the part after the last / or \
        $dllName = $dllPath -replace '.*[/\\](.*)', '$1'
        # The DLL is extracted into the directory of this script.
        [IO.Compression.ZipFileExtensions]::ExtractToFile($zip.GetEntry($dllPath), "$psScriptRoot/$dllName")
    }
    $zip.Dispose()
}
download-NuGet-package BouncyCastle 1.8.9 lib/BouncyCastle.Crypto.dll
download-NuGet-package itext7.commons 7.2.0 lib/net461/itext.commons.dll
download-NuGet-package itext7 7.2.0 lib/net461/itext.kernel.dll, lib/net461/itext.io.dll, lib/net461/itext.layout.dll
download-NuGet-package Microsoft.Bcl.AsyncInterfaces 5.0.0 lib/net461/Microsoft.Bcl.AsyncInterfaces.dll
download-NuGet-package Microsoft.Extensions.Logging 5.0.0 lib/net461/Microsoft.Extensions.Logging.dll
download-NuGet-package Microsoft.Extensions.DependencyInjection 5.0.2 lib/net461/Microsoft.Extensions.DependencyInjection.dll
download-NuGet-package Microsoft.Extensions.DependencyInjection.Abstractions 5.0.0 lib/net461/Microsoft.Extensions.DependencyInjection.Abstractions.dll
download-NuGet-package Microsoft.Extensions.Logging.Abstractions 5.0.0 lib/net461/Microsoft.Extensions.Logging.Abstractions.dll
download-NuGet-package Microsoft.Extensions.Options 5.0.0 lib/net461/Microsoft.Extensions.Options.dll
download-NuGet-package Microsoft.Extensions.Primitives 5.0.0 lib/net461/Microsoft.Extensions.Primitives.dll
download-NuGet-package System.Threading.Tasks.Extensions 4.5.4 lib/net461/System.Threading.Tasks.Extensions.dll
download-NuGet-package System.Memory 4.5.4 lib/net461/System.Memory.dll
download-NuGet-package System.Runtime.CompilerServices.Unsafe 5.0.0 lib/net45/System.Runtime.CompilerServices.Unsafe.dll
download-NuGet-package System.Diagnostics.DiagnosticSource 5.0.1 lib/net46/System.Diagnostics.DiagnosticSource.dll
download-NuGet-package System.ValueTuple 4.5.0 lib/net461/System.ValueTuple.dll
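The loop in download-NuGet-package relies on two details: the -replace expression keeps only the part of the entry path after the last slash or backslash, and ZipArchive.GetEntry returns $null when the archive has no entry with the given path (for instance when a package version uses a different lib/ sub-folder). The following is a minimal sketch of a guarded variant of that step. The function name extract-zipEntry is hypothetical and not part of the original script.

```powershell
set-strictMode -version 3
add-type -assembly System.IO.Compression
add-type -assembly System.IO.Compression.FileSystem

# Hypothetical helper: extracts one entry of a zip file into $destDir
# and returns the path of the extracted file.
function extract-zipEntry {
    param (
        [string] $zipPath,
        [string] $entryPath,
        [string] $destDir
    )

    $zip = [IO.Compression.ZipFile]::OpenRead($zipPath)
    try {
        # GetEntry returns $null if the archive has no such entry.
        $entry = $zip.GetEntry($entryPath)
        if ($null -eq $entry) {
            throw "$entryPath not found in $zipPath"
        }
        # Same expression as in download-NuGet-package: keep the file name only.
        $fileName = $entryPath -replace '.*[/\\](.*)', '$1'
        $destFile = join-path $destDir $fileName
        # $true: overwrite a file that is left over from a previous run.
        [IO.Compression.ZipFileExtensions]::ExtractToFile($entry, $destFile, $true)
        return $destFile
    }
    finally {
        # Release the zip file even if the extraction failed.
        $zip.Dispose()
    }
}
```

Called as extract-zipEntry $env:temp/nugetPkg.zip lib/net461/itext.io.dll $psScriptRoot, it reports a missing entry with a readable message instead of failing inside ExtractToFile.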
add-types.ps1
add-types.ps1 adds the necessary types to a PowerShell session so that iText can be used from PowerShell.
$libDir = "$psScriptRoot/lib"
add-type -path $libDir/BouncyCastle.Crypto.dll
add-type -path $libDir/itext.commons.dll
add-type -path $libDir/itext.io.dll
add-type -path $libDir/itext.kernel.dll
add-type -path $libDir/itext.layout.dll
add-type -path $libDir/System.Threading.Tasks.Extensions.dll
add-type -path $libDir/Microsoft.Bcl.AsyncInterfaces.dll
add-type -path $libDir/Microsoft.Extensions.DependencyInjection.Abstractions.dll
add-type -path $libDir/Microsoft.Extensions.DependencyInjection.dll
add-type -path $libDir/Microsoft.Extensions.Logging.Abstractions.dll
add-type -path $libDir/Microsoft.Extensions.Logging.dll
add-type -path $libDir/Microsoft.Extensions.Options.dll
add-type -path $libDir/Microsoft.Extensions.Primitives.dll
add-type -path $libDir/System.Diagnostics.DiagnosticSource.dll
add-type -path $libDir/System.Memory.dll
add-type -path $libDir/System.Runtime.CompilerServices.Unsafe.dll
add-type -path $libDir/System.ValueTuple.dll
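add-type -path reports an error for every file it cannot find, so it can be useful to check the lib directory before dot-sourcing add-types.ps1. The following is a minimal sketch of such a check. The helper name get-missingDll is hypothetical and not part of the original scripts.

```powershell
set-strictMode -version 3

# Hypothetical helper: returns those names in $dllNames for which
# no file exists in $libDir.
function get-missingDll {
    param (
        [string  ] $libDir,
        [string[]] $dllNames
    )

    foreach ($dllName in $dllNames) {
        if (-not (test-path -pathType leaf (join-path $libDir $dllName))) {
            $dllName
        }
    }
}
```

For example, @(get-missingDll "$psScriptRoot/lib" itext.io.dll, itext.kernel.dll, itext.layout.dll) is an empty array if the three DLLs are present.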
Hello World
A simple Hello World that creates a PDF document:
. ./add-types.ps1
$pdfWriter   = [iText.Kernel.Pdf.PdfWriter    ]::new("$psScriptRoot/Hello-World.pdf")
$pdfDocument = [iText.Kernel.Pdf.PdfDocument  ]::new($pdfWriter)
$document    = [iText.Layout.Document         ]::new($pdfDocument)
$paragraph   = [iText.Layout.Element.Paragraph]::new('Hello World!')
$null = $document.Add($paragraph)
# Closing the layout Document also closes the underlying PdfDocument.
$document.Close()
The same script, shortened with using namespace:
using namespace iText.Kernel.Pdf
using namespace iText.Layout
using namespace iText.Layout.Element
. ./add-types.ps1
$pdfWriter   = [PdfWriter  ]::new("$psScriptRoot/Hello-World-using-namespace.pdf")
$pdfDocument = [PdfDocument]::new($pdfWriter)
$document    = [Document   ]::new($pdfDocument)
$paragraph   = [Paragraph  ]::new('Hello World!')
$null = $document.Add($paragraph)
# Closing the layout Document also closes the underlying PdfDocument.
$document.Close()
Extract text from a PDF document
This script extracts text from a PDF document:
using namespace iText.Layout
using namespace iText.Layout.Element
using namespace iText.Kernel.Pdf
using namespace iText.Kernel.Pdf.Canvas.Parser
using namespace iText.Kernel.Pdf.Canvas.Parser.Listener
param (
    [string] $pdfName
)
if (-not (test-path $pdfName)) {
    write-host "$pdfName does not exist"
    return
}
. ./add-types.ps1
#
# A PdfReader reads and parses a PDF document.
#
$pdfReader   = [PdfReader  ]::new($pdfName)
$pdfDocument = [PdfDocument]::new($pdfReader)
$totalPages  = $pdfDocument.GetNumberOfPages()
write-host "Number of pages: $totalPages"
for ($p = 1; $p -le $totalPages; $p++) {
    write-host " page: $p"
    $page = $pdfDocument.GetPage($p)
    #
    # A SimpleTextExtractionStrategy accumulates the text it has processed,
    # so a new instance is created for each page.
    #
    [ITextExtractionStrategy] $strategy = [SimpleTextExtractionStrategy]::new()
    $text = [PdfTextExtractor]::GetTextFromPage($page, $strategy)
    write-host $text
}
#
# Closing the PdfDocument also closes its PdfReader.
#
$pdfDocument.Close()
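test-path interprets a relative $pdfName against the current PowerShell location, whereas PdfReader is a .NET class and resolves a relative path against the current directory of the process, which PowerShell does not change on set-location. The two can therefore disagree. The following is a minimal sketch of a way around this, assuming a hypothetical helper named get-fullPath that is not part of the original script.

```powershell
set-strictMode -version 3

# Hypothetical helper: turns a path that is relative to the current
# PowerShell location into a full file system path.
function get-fullPath {
    param (
        [string] $path
    )
    # Unlike resolve-path, this also works for files that do not exist yet.
    $executionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($path)
}
```

In the extraction script, the reader could then be created with [PdfReader]::new((get-fullPath $pdfName)).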