PDFInfo Explained: Retrieve Author, Dates, and Page Count Fast

PDFInfo: Quick Guide to Extracting Metadata from PDFs

PDF metadata—title, author, creation date, page count, and more—helps you organize, audit, and automate workflows that involve PDF files. pdfinfo, a lightweight command-line tool from the Poppler (or Xpdf) suite, quickly exposes this metadata so you can inspect PDFs without opening them in a GUI. This guide covers installation, common commands, useful flags, output parsing, and automation tips.

What pdfinfo shows

pdfinfo reports common metadata and file details such as:

Title, Author, Subject, Keywords
Creator (software that generated the PDF)
Producer (PDF library that produced the file)
CreationDate / ModDate
Tagged (whether the PDF includes accessibility tagging)
Encrypted (encryption status)
Page count, Page size, and PDF version
File size and Optimized (linearization / fast web view) status

Install pdfinfo

macOS: brew install poppler
Debian/Ubuntu: sudo apt-get install poppler-utils
Fedora: sudo dnf install poppler-utils
Windows: install Poppler binaries (add to PATH) or use WSL and follow Linux steps.

Basic usage

Run pdfinfo against a PDF file:

Code
pdfinfo file.pdf

Typical output is a line-by-line list of metadata fields and values.
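To use these fields from a program, one option is to split each output line at its first colon. The sketch below assumes the usual "Key: value" layout of pdfinfo output; the function names parse_pdfinfo and run_pdfinfo are illustrative and not part of Poppler.

```python
import subprocess


def parse_pdfinfo(text: str) -> dict:
    """Turn pdfinfo's "Key:   value" lines into a dict.

    Only the first colon splits the line, so values that contain
    colons (dates, encryption details) are kept whole.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that have no colon at all
            info[key.strip()] = value.strip()
    return info


def run_pdfinfo(path: str) -> dict:
    """Run pdfinfo on one file and parse its standard output.

    Raises subprocess.CalledProcessError if pdfinfo exits with an error.
    """
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo(result.stdout)
```

With pdfinfo on the PATH, run_pdfinfo("file.pdf").get("Pages") would return the page count as a string.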

Useful flags

-meta
Prints XML metadata block (XMP) if present: pdfinfo -meta file.pdf
-box
Shows page box sizes (MediaBox, CropBox, BleedBox, TrimBox, ArtBox): pdfinfo -box file.pdf
-f n -l m
Examine only pages n through m; the size of each page in that range is printed (and its boxes, when combined with -box): pdfinfo -f 1 -l 5 file.pdf
-rawdates
Show raw date strings from the PDF (no post-processing): pdfinfo -rawdates file.pdf
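-enc encoding-name
Sets the text encoding used for the output (UTF-8 by default). It is unrelated to encryption; encryption status always appears in the Encrypted field of the default output.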

Check pdfinfo -help for the full list on your system.
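When a page range is requested with -f and -l, per-page size lines can be collected programmatically. The sketch below assumes lines shaped like "Page    1 size: 612 x 792 pts (letter)", which is how Poppler builds commonly print them; the function name page_sizes is hypothetical, and the pattern may need adjusting for other versions.

```python
import re

# Assumed line shape: "Page    1 size: 612 x 792 pts (letter)"
PAGE_SIZE = re.compile(
    r"^Page\s+(\d+)\s+size:\s+([\d.]+)\s+x\s+([\d.]+)\s+pts"
)


def page_sizes(text: str) -> dict:
    """Map page number -> (width, height) in points from pdfinfo output."""
    sizes = {}
    for line in text.splitlines():
        match = PAGE_SIZE.match(line)
        if match:
            page = int(match.group(1))
            sizes[page] = (float(match.group(2)), float(match.group(3)))
    return sizes
```

The text passed in would be the captured output of a command such as pdfinfo -f 1 -l 5 file.pdf.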

Parsing pdfinfo output in scripts

pdfinfo output is plain text; use standard CLI tools to extract fields.

Extract page count (bash):

Code
pages=$(pdfinfo file.pdf | awk '/^Pages:/ {print $2}')

Get title or fallback to filename:

Code
title=$(pdfinfo file.pdf | sed -n 's/^Title:[[:space:]]*//p')
[ -z "$title" ] && title="$(basename file.pdf)"

Extract the raw creation date, ready for conversion to ISO 8601:

Code
raw=$(pdfinfo -rawdates file.pdf | sed -n 's/^CreationDate:[[:space:]]*//p')
# raw might look like D:20220303120000-05'00'
# convert with custom parsing or use a library in higher-level languages

For robust parsing, prefer using a scripting language (Python, Node.js) and a PDF library that reads XMP or Info dictionaries directly.
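As one possible way to do that conversion, the sketch below turns a raw PDF date string into ISO 8601 using only the standard library. It assumes the D:YYYYMMDDHHmmSSOHH'mm' layout with optional trailing fields; the function name pdf_date_to_iso is illustrative, and unusual producer quirks may need extra handling.

```python
import re
from datetime import datetime, timedelta, timezone

# Year is required; every later field is optional, as is the UTC offset.
PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([+\-Z])(?:(\d{2})'?(\d{2})?'?)?)?$"
)


def pdf_date_to_iso(raw: str) -> str:
    """Convert a raw PDF date (as printed by pdfinfo -rawdates) to ISO 8601."""
    match = PDF_DATE.match(raw.strip())
    if not match:
        raise ValueError(f"not a PDF date string: {raw!r}")
    year, month, day, hour, minute, second, sign, off_h, off_m = match.groups()

    tz = None  # no offset given: return a naive timestamp
    if sign == "Z":
        tz = timezone.utc
    elif sign in ("+", "-"):
        delta = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
        tz = timezone(delta if sign == "+" else -delta)

    # Missing month/day default to 1; missing time parts default to 0.
    stamp = datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )
    return stamp.isoformat()
```

For example, pdf_date_to_iso("D:20220303120000-05'00'") returns "2022-03-03T12:00:00-05:00".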

Examples in Python

Using PyPDF2 to read basic metadata:

python
from PyPDF2 import PdfReader

reader = PdfReader("file.pdf")
info = reader.metadata  # can be None when the PDF has no Info dictionary
if info is not None:
    print(info.title, info.author, info.get("/CreationDate"))

Note: reader.metadata reads only the document Info dictionary. XMP metadata lives in a separate XML packet and needs its own handling (for example PyPDF2's reader.xmp_metadata, or direct XML parsing).
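For the direct XML parsing route, a minimal sketch with the standard library follows. It assumes the XMP packet (for instance the text printed by pdfinfo -meta) stores the title under dc:title and the authors under dc:creator, which is the common Dublin Core layout; the function name xmp_title_and_creators is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RDF and Dublin Core inside XMP packets.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def xmp_title_and_creators(xmp_xml: str):
    """Return (title or None, list of creator names) from an XMP packet."""
    root = ET.fromstring(xmp_xml)
    # dc:title is a language-alternative list; take its first entry.
    title = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    # dc:creator is an ordered list of names.
    creators = root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)
    return (
        title.text if title is not None else None,
        [c.text for c in creators],
    )
```

Fields stored in other schemas (such as xmp:CreateDate) would need their namespaces added to NS in the same way.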

Automation tips

Batch-check PDFs for missing metadata: loop through files, call pdfinfo, and log missing Title/Author fields.
Integrate into CI: fail builds if PDFs lack required metadata or are encrypted.
Combine with exiftool or custom scripts to update metadata (some tools allow editing; pdfinfo is read-only).
Normalize dates and author names using a mapping file in scripts.
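The batch and CI checks above could be sketched as follows. This assumes pdfinfo is on the PATH and that an absent or empty Title/Author line counts as missing; the names REQUIRED, problems, check_directory, and main are illustrative choices, not an established tool.

```python
import subprocess
import sys
from pathlib import Path

REQUIRED = ("Title", "Author")  # assumed policy; adjust to your needs


def parse_pdfinfo(text: str) -> dict:
    """Split each "Key: value" line of pdfinfo output at the first colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def problems(info: dict, required=REQUIRED) -> list:
    """List policy violations for one file's parsed pdfinfo output."""
    issues = [f"missing {key}" for key in required if not info.get(key)]
    if info.get("Encrypted", "no").lower().startswith("yes"):
        issues.append("encrypted")
    return issues


def check_directory(root: str) -> dict:
    """Run pdfinfo on every *.pdf under root; return {path: [issues]}."""
    failures = {}
    for pdf in sorted(Path(root).rglob("*.pdf")):
        result = subprocess.run(
            ["pdfinfo", str(pdf)], capture_output=True, text=True
        )
        if result.returncode != 0:
            failures[str(pdf)] = ["pdfinfo failed"]
            continue
        issues = problems(parse_pdfinfo(result.stdout))
        if issues:
            failures[str(pdf)] = issues
    return failures


def main(argv: list) -> int:
    """Print one line per failing file; return 1 if any file failed."""
    failed = check_directory(argv[1] if len(argv) > 1 else ".")
    for path, issues in failed.items():
        print(f"{path}: {', '.join(issues)}")
    return 1 if failed else 0
```

Calling sys.exit(main(sys.argv)) from a CI step would fail the build whenever a PDF lacks a required field or is encrypted.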

Troubleshooting

No metadata shown: PDF may lack an Info dictionary or XMP block; consider extracting XMP via pdfinfo -meta or using a PDF library.
Dates look odd: PDF dates use the D:YYYYMMDDHHmmSSOHH'mm' format, where O is +, -, or Z and the trailing fields are optional; use parsing utilities or libraries to normalize.
Encrypted PDFs: pdfinfo will flag encryption; you may need to supply a password (-upw or -opw) or decrypt the file (if permitted) before extracting metadata.

Security and permissions

pdfinfo reads files locally, so make sure you have permission to access them.
pdfinfo does not run PDF content, but a parser bug could still be exploited by a malicious file; process untrusted PDFs in a sandbox or container.

Quick checklist

Install poppler/poppler-utils.
Run pdfinfo file.pdf for a quick view.
Use -meta for XMP, -box for page boxes, -rawdates for raw timestamps.
Script parsing with awk/sed or use PyPDF2 for programmatic access.
Automate checks and integrate into CI for consistency.

Use pdfinfo whenever you need a fast, scriptable way to inspect PDF metadata without opening a viewer.
