Part Six — Workflow and Publishing Chapter Twenty-four

Toolchain Options

Paged.js in the browser is the right tool for design work and iterative layout. For production, automation, and scale, the question of which tool to use becomes consequential.

Paged.js: browser polyfill and CLI
WeasyPrint: Python-native PDF generation
Prince: the professional-grade alternative
Choosing between them: a decision framework

Throughout this book, the tool has been Paged.js running in a browser — open the HTML file in Chrome, inspect the paginated preview, adjust the CSS, print to PDF. For the work of designing a document, learning the layout system, and building the project document, this workflow is exactly right: immediate visual feedback, live CSS editing, nothing between you and the result. For other kinds of work — generating hundreds of PDFs from a database, integrating document production into a web application, producing output that must be identical on every machine regardless of who runs it — the browser workflow has limitations. This chapter examines the full landscape of HTML-to-PDF tools available to the designer-developer, with an honest assessment of what each one does well, where each one falls short, and how to choose between them for a specific situation.

The tools fall into three overlapping categories. Browser-based tools (Paged.js in Chrome, Puppeteer) generate PDFs through Chromium's rendering engine and are the closest to what you see during design. Command-line tools (Paged.js CLI, WeasyPrint, Prince) accept HTML and CSS as input and produce PDF as output without anyone opening a browser window, which makes them appropriate for automation; note that the Paged.js CLI still drives a headless Chromium behind the scenes. Commercial professional tools (Prince, Antenna House) implement the CSS Paged Media specification most completely and produce the highest-quality output, at significant cost. Understanding which category your use case falls into narrows the choice considerably before the detailed comparison begins.

· · ·

The landscape: four tools, four different trade-offs

Paged.js (browser) · Browser polyfill · pagedjs.org

Free · Open source · Browser · Used in this book

A JavaScript polyfill that implements the CSS Paged Media specification in any modern browser. Include the script in your HTML, open in Chrome or Chromium, and the browser renders a paginated preview. Print to PDF using the browser's built-in print function. The design and debugging workflow described throughout this book is built on this tool.

Strengths

Zero installation — just include a script tag
Live CSS editing with immediate visual feedback
All browser DevTools available for debugging
Handles web fonts, SVG, and modern CSS
Large community, active development

Limitations

PDF quality depends on the browser's print engine
Cannot be automated without a headless browser
Some CSS Paged Media features not implemented
Output may vary slightly between browser versions

Paged.js CLI · Command-line tool · npm package

Free · Open source · CLI

The Paged.js polyfill wrapped in a Puppeteer (headless Chromium) shell. Accepts an HTML file path or URL, renders it using the same Paged.js engine as the browser preview, and saves the PDF output. Since it uses the same rendering engine, output is consistent with the browser preview. Suitable for automated pipelines where Chromium can be installed.

Strengths

Consistent with browser preview — no surprises
Scriptable in Node.js or shell scripts
Handles the same CSS as the browser version
Free and open source

Limitations

Requires Node.js and Chromium installation
Chromium is a large dependency (~300MB)
Not suitable for serverless or minimal environments
Inherits all browser PDF engine limitations

WeasyPrint · Python library · weasyprint.org

Free · Open source · CLI + Python API

A Python HTML-to-PDF converter that renders CSS directly to PDF without a browser intermediary. There is no Chromium dependency, although it does rely on a small number of system libraries. Well-suited to Python web applications and lightweight server environments. Implements most CSS Paged Media features natively, with its own layout engine independent of any browser.

Strengths

No browser dependency: Python plus a few system libraries
Excellent for Flask/Django integration
Deterministic output across platforms
Good CSS Paged Media support
Active maintenance and documentation

Limitations

CSS support lags behind browsers
Some modern CSS (Grid, custom properties) limited
Output may differ from browser preview
Complex layouts can require CSS adjustments

Prince · Commercial tool · princexml.com

Paid licence · CLI + API

A commercial HTML-to-PDF converter with the most complete CSS Paged Media implementation of any tool. Used by publishers, academic presses, and enterprise document systems. Produces print-ready PDF/X output with full OpenType support, correct footnote handling, advanced hyphenation and justification. The industry standard for high-volume professional document production.

Strengths

Most complete CSS Paged Media support
Professional-grade typography (H&J, hyphenation)
PDF/X output for print
No browser dependency, fast execution
Consistent across platforms

Limitations

Expensive — server licence from ~$3,800
Some modern CSS features missing
Closed source — no community contributions
Output may differ from browser preview

Figure 24.1 — The four main HTML-to-PDF tools compared. Paged.js in the browser is the correct tool for design and layout work. Paged.js CLI extends this to automated pipelines at no cost, using Chromium as the rendering engine. WeasyPrint replaces Chromium with a Python rendering engine, eliminating a large dependency at the cost of some CSS coverage. Prince provides the most typographically sophisticated output at significant expense, appropriate for high-volume professional production.

· · ·

Paged.js CLI: automating the browser workflow

The Paged.js command-line interface wraps the browser polyfill in Puppeteer, a Node.js library that controls a headless Chromium instance. This means the rendering is identical to what you see in the browser preview — the same CSS is interpreted, the same fonts are loaded, the same Paged.js JavaScript runs. The only difference is that the result is written directly to a PDF file rather than displayed in a browser window.

Installation requires Node.js. Puppeteer downloads a compatible version of Chromium automatically during installation, so no separate browser installation is needed. On most systems, the full installation takes under five minutes:

# Install the Paged.js CLI globally
npm install -g pagedjs-cli

# Generate a PDF from an HTML file
pagedjs-cli document.html -o document.pdf

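# With a specific page size (overrides the @page size in the CSS).
# The CLI help documents --width and --height in millimetres:
# 139.7 x 215.9 mm is 5.5 x 8.5 in.
pagedjs-cli document.html \
  --width 139.7 \
  --height 215.9 \
  -o document.pdf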

# With additional CSS for print-specific overrides
pagedjs-cli document.html \
  --style print-overrides.css \
  -o document.pdf

# Batch: generate PDFs for all HTML files in a folder
for f in *.html; do
  pagedjs-cli "$f" -o "${f%.html}.pdf"
done

The --style flag accepts an additional CSS file that is injected alongside the document's own stylesheet. This is useful for print-specific overrides — removing debug styles, adjusting colors for print rather than screen, enabling bleeds and crop marks — without modifying the document's source HTML or its primary stylesheet. Keeping print-specific overrides in a separate file maintains the separation between design CSS and production CSS.

One practical limitation of the Paged.js CLI is its handling of network resources. Puppeteer does not, by default, wait for every asynchronously loaded resource to finish before the PDF is captured, so a document that loads fonts from Google Fonts or another CDN can be rendered before those fonts arrive. The CLI provides a --timeout flag, given in milliseconds, that sets the time budget for the page to become ready. The flag is a time limit rather than a font check: it does not itself wait for the document.fonts.ready promise to resolve. For documents with network-loaded fonts, set it to a generous value, 10,000 to 30,000 milliseconds.¹

# Allow 15 seconds for fonts and resources to load
pagedjs-cli document.html \
  --timeout 15000 \
  -o document.pdf

# For offline use with self-hosted fonts:
# no timeout needed — local fonts load instantly
pagedjs-cli document-offline.html -o document.pdf
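
When the CLI is called from a Python build script rather than a shell, the same invocation can be assembled as an argument list. The following is a minimal sketch that uses only the flags shown above (-o, --timeout, --style). The helper names build_pagedjs_command and render are hypothetical, and the sketch assumes pagedjs-cli is available on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_pagedjs_command(
    html: Path,
    pdf: Path,
    timeout_ms: Optional[int] = None,
    style: Optional[Path] = None,
) -> List[str]:
    """Assemble a pagedjs-cli argument list (hypothetical helper)."""
    cmd = ["pagedjs-cli", str(html)]
    if timeout_ms is not None:
        cmd += ["--timeout", str(timeout_ms)]
    if style is not None:
        cmd += ["--style", str(style)]
    cmd += ["-o", str(pdf)]
    return cmd


def render(html: Path, pdf: Path, **options) -> None:
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_pagedjs_command(html, pdf, **options), check=True)


if __name__ == "__main__":
    # Print the command that would be run, without invoking the CLI.
    print(" ".join(build_pagedjs_command(
        Path("document.html"), Path("document.pdf"), timeout_ms=15000)))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.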

· · ·

WeasyPrint: Python-native generation without a browser

WeasyPrint takes a fundamentally different approach from Paged.js. Where Paged.js is a JavaScript polyfill that runs inside a browser and uses the browser's CSS engine for everything except the paged media layer, WeasyPrint is a standalone Python library with its own HTML parser, its own CSS engine, and its own PDF generator. No browser is involved at any stage. The advantage is that WeasyPrint can run in any Python environment — including minimal server environments where installing Chromium is impractical or prohibited.

WeasyPrint is particularly well suited to Python web applications. A Flask or Django view can generate a PDF on the fly from an HTML template, serve it directly to the user, or save it to storage, all without spawning a browser process. The only non-Python dependencies are the system libraries WeasyPrint itself relies on: Pango for text layout in current releases, plus Cairo in versions before 53.

# Install WeasyPrint
pip install weasyprint

# Generate from command line
weasyprint document.html document.pdf

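# Python API: generate and serve a PDF from a Flask view
import os

from flask import Flask, render_template, make_response
from weasyprint import HTML, CSS

from models import Report  # assumption: your app's Flask-SQLAlchemy model

app = Flask(__name__)

@app.route('/report/<int:report_id>/pdf')
def report_pdf(report_id):
    # Respond with 404 if the report does not exist
    report = Report.query.get_or_404(report_id)
    html = render_template('report.html', report=report)

    # Resolve the stylesheet against the app's static folder,
    # not the process's working directory
    css_path = os.path.join(app.static_folder, 'report.css')
    pdf = HTML(string=html).write_pdf(stylesheets=[CSS(filename=css_path)])

    response = make_response(pdf)
    response.headers['Content-Type'] = 'application/pdf'
    response.headers['Content-Disposition'] = (
        f'attachment; filename=report-{report_id}.pdf'
    )
    return response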

The most important CSS limitation to know before committing to WeasyPrint for a document designed with this book's CSS system: WeasyPrint's support for CSS custom properties (variables) has improved significantly in recent versions but is not always complete, and its CSS Grid implementation covers common cases but may require adjustments for complex grid layouts. The safest approach is to test your specific document in WeasyPrint early in development, rather than designing against the browser and discovering incompatibilities at the end.
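
Before that early test, it can help to know where the risky features sit in your stylesheet. The sketch below is a crude textual scan, assuming the two feature families named above (custom properties and Grid) are the ones worth flagging. It is not a CSS parser and says nothing about whether WeasyPrint will actually render a given rule correctly; the names RISKY_FEATURES and scan_css are hypothetical.

```python
import re
from typing import Dict, List

# Patterns for the two feature families the text flags as risky.
# This is a line-by-line text match, not a real CSS parse.
RISKY_FEATURES = {
    "custom properties": re.compile(r"var\(\s*--[\w-]+|(?<![\w-])--[\w-]+\s*:"),
    "grid layout": re.compile(r"display\s*:\s*(inline-)?grid\b|\bgrid-template-"),
}


def scan_css(css_text: str) -> Dict[str, List[int]]:
    """Return, per feature, the 1-based line numbers where it appears."""
    hits: Dict[str, List[int]] = {name: [] for name in RISKY_FEATURES}
    for lineno, line in enumerate(css_text.splitlines(), start=1):
        for name, pattern in RISKY_FEATURES.items():
            if pattern.search(line):
                hits[name].append(lineno)
    return hits


if __name__ == "__main__":
    sample = (
        ":root { --measure: 28rem; }\n"
        ".page { display: grid; grid-template-columns: 1fr 2fr; }\n"
        "p { max-width: var(--measure); }\n"
        "h1 { color: black; }\n"
    )
    for feature, lines in scan_css(sample).items():
        print(f"{feature}: lines {lines}")
```

The line numbers give you a short list of rules to inspect first when the WeasyPrint output differs from the browser preview.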

· · ·

Prince: when quality is the constraint

Prince (princexml.com) is the tool that professional book publishers, academic presses, and enterprise document systems use when quality is the primary constraint and cost is secondary. It implements the CSS Paged Media specification more completely than any other tool, produces PDF/X output with correct color management, handles OpenType features with typographic precision, and applies sophisticated hyphenation and justification algorithms that produce noticeably better line-breaking than any browser-based tool.

The practical scenarios where Prince justifies its cost are specific: high-volume production of documents where typographic quality directly affects perception (financial reports, legal documents, published books), automated pipelines where the same document is generated thousands of times and any rendering inconsistency is unacceptable, and projects where PDF/X output for offset printing is a hard requirement that Chromium's PDF engine cannot reliably meet.

For individual designers, academic researchers, and small organisations producing documents for internal or limited-distribution use, the cost is difficult to justify. Paged.js produces visually excellent results for all the document types in this book; the difference between Paged.js and Prince output, while real, is visible mainly at the level of fine typographic detail — slightly better H&J in Prince, more reliable footnote placement, more complete CSS Paged Media coverage — rather than at the level of overall quality and readability.

Prince can be evaluated free of charge for non-commercial use; its free output includes a small watermark on the first page. If evaluating it for a specific production use case, running the same document through both Paged.js CLI and Prince and comparing the output directly is the most informative test.
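
A side-by-side run of this kind can be scripted so that both PDFs land next to the source file under predictable names. This is a minimal sketch, assuming the two executables are installed under the names pagedjs-cli and prince; the function names and the output naming scheme are hypothetical choices made for illustration.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Dict, List


def comparison_commands(html: Path) -> Dict[str, List[str]]:
    """Map each renderer to a command that writes <stem>.<renderer>.pdf."""
    stem = html.with_suffix("")
    return {
        "pagedjs": ["pagedjs-cli", str(html), "-o", f"{stem}.pagedjs.pdf"],
        "prince": ["prince", str(html), "-o", f"{stem}.prince.pdf"],
    }


def render_both(html: Path) -> None:
    """Run every renderer that is installed; skip the ones that are not."""
    for name, cmd in comparison_commands(html).items():
        if shutil.which(cmd[0]) is None:
            print(f"skipping {name}: {cmd[0]} not found on PATH")
            continue
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Show the commands without running them.
    for name, cmd in comparison_commands(Path("document.html")).items():
        print(name, "->", " ".join(cmd))
```

Open the two resulting files side by side and compare line breaks, footnote placement, and page breaks on the same pages.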

Choosing your toolchain — a decision tree

Figure 24.2 — Toolchain decision framework. The primary fork: designing or generating? For design and layout iteration, Paged.js in the browser is always the right tool. For production PDF generation, the choice between WeasyPrint and Paged.js CLI depends on the application stack — Python applications integrate naturally with WeasyPrint; Node.js and shell environments use the Paged.js CLI for guaranteed consistency with the browser preview. Prince sits as a premium option across all production paths when typographic quality is the primary constraint.
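
The logic of the framework is small enough to write down as a function. The sketch below is one reading of the caption above, not a definitive encoding of the figure: the parameter names and the order of the checks are assumptions, and the budget condition is taken from the chapter's remarks on Prince's cost.

```python
from enum import Enum


class Tool(Enum):
    PAGEDJS_BROWSER = "Paged.js (browser)"
    PAGEDJS_CLI = "Paged.js CLI"
    WEASYPRINT = "WeasyPrint"
    PRINCE = "Prince"


def choose_tool(
    designing: bool,
    python_stack: bool = False,
    quality_is_primary: bool = False,
    has_budget: bool = False,
) -> Tool:
    """Walk the decision tree: design first, then the production paths."""
    if designing:
        # Design and layout iteration always happens in the browser.
        return Tool.PAGEDJS_BROWSER
    if quality_is_primary and has_budget:
        # Premium option across all production paths.
        return Tool.PRINCE
    if python_stack:
        return Tool.WEASYPRINT
    # Node.js and shell environments: consistent with the browser preview.
    return Tool.PAGEDJS_CLI


if __name__ == "__main__":
    print(choose_tool(designing=False, python_stack=True).value)
```

Checking the quality constraint before the application stack reflects the caption's point that Prince sits across every production path.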

· · ·

Setting up a production pipeline: a complete automated workflow

A production PDF pipeline has three stages: source preparation, rendering, and output management. Source preparation ensures the HTML is fully assembled with all assets resolved and all fonts available before the renderer is invoked. Rendering converts HTML to PDF using whichever tool is appropriate for the context. Output management handles the resulting PDF — naming it, moving it, verifying it, and delivering it to wherever it needs to go.

The following shows a complete shell script pipeline for batch generation of reports from HTML templates, using the Paged.js CLI. It handles font loading with a generous timeout, names output files consistently, and performs a basic file-size sanity check on each generated PDF:

#!/usr/bin/env bash
# generate-reports.sh
# Batch PDF generation using Paged.js CLI
# Usage: ./generate-reports.sh reports/

set -euo pipefail
shopt -s nullglob   # an empty folder yields zero iterations, not a literal "*.html"

REPORTS_DIR="${1:-.}"
OUTPUT_DIR="${REPORTS_DIR}/pdf"
TIMEOUT=20000      # 20s for font loading
MIN_SIZE=50000     # 50KB minimum — catch empty PDFs

mkdir -p "$OUTPUT_DIR"

for html_file in "$REPORTS_DIR"/*.html; do
  base=$(basename "$html_file" .html)
  out="$OUTPUT_DIR/${base}.pdf"

  echo "Generating: $base.pdf"

  pagedjs-cli "$html_file" \
    --timeout "$TIMEOUT" \
    --style print-production.css \
    -o "$out"

  # Sanity check: verify file is not empty
  size=$(wc -c < "$out")
  if [ "$size" -lt "$MIN_SIZE" ]; then
    echo "WARNING: $out may be empty ($size bytes)"
  else
    echo "  ✓ ${size} bytes"
  fi
done

echo "Done. PDFs in $OUTPUT_DIR"

The print-production.css file injected via --style contains the production-specific overrides — removing debug visualisations, enabling crop marks, setting print-specific background color rendering — kept separate from the design CSS so the same HTML file is usable for both browser preview and production generation without modification.
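
The output-management stage can go one step beyond a size check. The sketch below, in Python, repeats the size threshold from the shell script and adds a test that the file begins with the %PDF- signature that opens a PDF file. The function name check_pdf is hypothetical, and the check is a cheap smoke test, not a validation of the PDF's structure.

```python
from pathlib import Path
from typing import List

MIN_SIZE = 50_000  # bytes; same threshold as the shell script above


def check_pdf(path: Path, min_size: int = MIN_SIZE) -> List[str]:
    """Return a list of problems found; an empty list means the file passed."""
    if not path.is_file():
        return [f"{path}: missing"]
    problems = []
    size = path.stat().st_size
    if size < min_size:
        problems.append(f"{path}: only {size} bytes")
    with path.open("rb") as fh:
        # A PDF file starts with the five bytes "%PDF-".
        if fh.read(5) != b"%PDF-":
            problems.append(f"{path}: does not start with %PDF-")
    return problems


if __name__ == "__main__":
    # Check every PDF in the current folder and report any problems.
    for pdf in sorted(Path(".").glob("*.pdf")):
        for problem in check_pdf(pdf):
            print("WARNING:", problem)
```

Returning a list of problems rather than raising on the first one lets a batch run report every suspect file in a single pass.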

Choosing for the project document

For The Compositor's Garden and the other document projects in Part Five, the recommended workflow is: design and iterate using Paged.js in the browser, then generate the final PDF using either the browser's print function (for one-off production) or Paged.js CLI (for repeatability and scripting). If you need to embed PDF generation in a Python web application, switch to WeasyPrint and test the document's CSS against WeasyPrint's coverage early. If you are producing high-volume, professionally printed output and the budget exists, evaluate Prince against your specific document and printing requirements.

In all cases, the CSS developed throughout this book — the custom properties, the grid, the page geometry, the furniture — is compatible with all four tools at the level of fundamental layout. The differences between tools manifest in the finer typographic details: footnote placement, complex break scenarios, and edge cases in the CSS Paged Media specification. For the documents in this book, Paged.js in the browser produces output that is excellent by any professional standard.

· · ·

The toolchain is the last technical question in the book. Chapter 25 addresses something different: not how to use the tools, but how to structure the project so that the tools remain easy to use as the document grows, the team changes, or the requirements shift. File organisation, naming conventions, and the practical patterns that make HTML-based document projects maintainable are the subject of the next chapter.

Notes

1. The Paged.js CLI's --timeout flag controls how long Puppeteer waits after the page's load event before capturing the PDF. It does not directly wait for document.fonts.ready; it simply waits a fixed number of milliseconds. For a more precise approach, bypass the CLI and drive Puppeteer directly from a Node.js script, calling page.waitForFunction(() => document.fonts.ready) before the PDF capture. The Paged.js CLI documentation covers this pattern for custom Puppeteer scripts.
2. Antenna House Formatter is a second commercial option, comparable to Prince in capability and price, with particularly strong support for complex scripts (Arabic, Hebrew, CJK) and some CSS features not covered by Prince. It is the preferred tool for publishers producing multilingual documents. For the Latin-script documents in this book, Prince and Antenna House are functionally equivalent for most practical purposes; the choice between them for a production system should be made by testing both against the specific document.