Structuring Your Project
A document project that starts in one file and grows without structure becomes unworkable before it is finished. The decisions made at the start — about folders, about CSS layers, about what goes where — determine whether the project is maintainable or merely functional.
- Folder structure and file naming conventions
- Layering CSS: base, theme, layout, print
- Separating content from presentation
- Git for document version control
Every document project starts small and legible. One HTML file, one CSS file, a handful of images, maybe a fonts folder. Everything is visible, everything is findable, everything is a single step from anything else. Then the project grows. The HTML file is split into multiple chapters. The CSS accumulates layers of rules, overrides, and print-specific declarations. The images folder fills with figures in various states of revision. The font situation becomes complicated when a second typeface is introduced. A year later, or six months, or even six weeks, the person opening the project for the first time — which may be you — cannot easily see what connects to what, which CSS file to edit for which kind of change, or which version of the document is the final one. This chapter is about making the decisions at the start that prevent that experience.
Good project structure is not a bureaucratic imposition. It is a form of respect for future work — your own or someone else's — on the same files. It makes changes faster to make, errors easier to spot, and collaboration possible without the overhead of explaining the project's hidden logic to every new person who touches it. The patterns described here are not the only correct ones, but they are the result of thinking through the specific requirements of HTML-based document projects and arriving at something coherent.
Folder structure A place for everything before anything needs a place
The guiding principle for folder structure in a document project is: the type of file determines its location, not its function. A CSS file that controls the print layout goes in the css/ folder alongside all other CSS files. A font file goes in fonts/ regardless of which page it appears on. This type-based organisation is more maintainable than function-based organisation (where, for instance, all assets for Chapter 3 live in a chapter-3/ folder) because it keeps the same kind of file in the same place, so you always know where to look for a given file type without knowing the project structure in advance.
The following is the reference folder structure for a single-document project — a book, report, or paper — produced with Paged.js. It accommodates everything needed for the document types covered in this book, scales to multi-chapter documents, and separates production concerns (the dist/ folder) from source concerns cleanly:
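One way to create this layout from an empty directory is sketched below as a shell script. The folder names and the 01, 06, and 07 CSS file names come from this chapter. The names given to layers 02 through 05 are assumptions based on the list of concerns, and the root folder name compositor-garden is borrowed from the PDF file names used later in the chapter.

```shell
#!/bin/sh
# Scaffold the reference layout (illustrative sketch).
# CSS file names for layers 02-05 are assumed, not taken from the book.
set -e

project="compositor-garden"   # hypothetical project root

mkdir -p "$project/content" "$project/css" "$project/fonts" \
         "$project/images" "$project/dist"

# One CSS file per concern, numbered in loading order.
# touch leaves existing files untouched, so re-running is safe.
for layer in 01-tokens 02-base 03-typography 04-layout \
             05-components 06-page 07-print; do
  touch "$project/css/$layer.css"
done

# Entry point and documentation at the project root.
touch "$project/index.html" "$project/README.md"

# List everything that now exists under the project root.
find "$project" | sort
```

Because the script only creates what is missing, it can also be run against an existing project to add any folder or layer file that has not been created yet.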
A few decisions in this structure deserve explanation. The content/ folder holds HTML fragments: not complete HTML documents, but the body content of each chapter or section, without the <html>, <head>, or <body> tags. These fragments are assembled into the main index.html by one of three mechanisms: server-side includes, a build step, or, for simpler projects, the HTML <template> element loaded with a small JavaScript utility. This separation lets each chapter be written and edited in its own file, while the single index.html assembles everything into one coherent paginated document.
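As an illustration of the build-step option, the sketch below defines a shell function that wraps the fragments in a minimal document shell. The function name assemble_document, the placeholder title, and the convention of giving fragments numeric prefixes (so that the shell's sorted glob expansion yields reading order) are all assumptions made for this example, not part of the book's own build.

```shell
# Assemble a complete document from chapter fragments (illustrative sketch).
# Usage: assemble_document CONTENT_DIR OUTPUT_FILE
# Assumes fragment names sort into reading order, e.g. 01-intro.html.
assemble_document() {
  content_dir=$1
  output=$2
  {
    cat <<'EOF'
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Document title</title>
<!-- stylesheet <link> tags go here, in layer order -->
</head>
<body>
EOF
    # The shell expands the glob in sorted order, so the numeric
    # prefixes decide the order of chapters in the assembled file.
    cat "$content_dir"/*.html
    printf '%s\n' '</body>' '</html>'
  } > "$output"
}
```

Calling assemble_document content index.html from the project root would regenerate index.html from whatever fragments are present in content/.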
The numbered CSS files, 01-tokens.css through 07-print.css, are loaded in order by index.html. The numbering makes the intended loading order visible in the file listing itself, so nobody has to read the HTML to discover it. The <link> tags in index.html must still appear in that same order, because the cascade follows the order of the tags, not the file names. Each file's scope is strictly defined: 01-tokens.css contains nothing but custom property definitions and never uses any selector other than :root and *. 06-page.css contains nothing but @page rules and the CSS that assigns elements to named pages. This strict file-by-file scope means you always know which file to open for a given kind of change, and changes in one layer do not accidentally affect another.
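Since the file names only document the order and the <link> tags enforce it, a small check can confirm that the two agree. The sketch below is one possible approach. The function name css_links_in_order is hypothetical, and it assumes each stylesheet <link> sits on its own line with an href beginning css/.

```shell
# Succeed only if the stylesheet links in an HTML file appear in
# ascending (numeric-prefix) order. Illustrative sketch.
# Usage: css_links_in_order index.html
css_links_in_order() {
  # Extract the css/... href from every <link> line, in document order,
  # then let sort -c report whether that sequence is already sorted.
  sed -n 's|.*<link[^>]*href="\(css/[^"]*\)".*|\1|p' "$1" | LC_ALL=C sort -c
}
```

The function exits with a non-zero status, and sort prints the first out-of-order entry, when a link has been placed ahead of a lower-numbered layer.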
The dist/ folder contains generated output only — files that are produced by the build process, not files that are edited by hand. Including it in version control is a deliberate choice: it allows collaborators to access the latest built PDF without needing to run the build themselves, and it preserves a record of exactly what was distributed at each version. The trade-off is increased repository size; for documents with large embedded images, the alternative is to exclude dist/ from version control and distribute PDFs separately.
Layering CSS Seven files, seven concerns, zero confusion
The CSS layer structure above reflects a specific mental model: each layer has one concern, and each concern has one layer. This is a stronger separation than is common in web development, where CSS files are often organised by component or page rather than by concern. For a document project, concern-based organisation is more appropriate because the concerns — tokens, base styles, typography, layout, components, page media, production — are genuinely orthogonal. A change to the type scale should not require touching the layout file. A change to the @page rule should not require touching the typography file. Keeping these concerns separate in different files makes both the intention and the effect of each change easier to understand.
- Layer 1, tokens (01-tokens.css): custom property definitions on :root. Never references other layers. Everything else references this.
- Layer 2, base styles: html and body declarations, font-size, background, color. Establishes the document ground. Minimal: if a rule here is not needed by every element, it belongs elsewhere.
- Layer 6, page media (06-page.css): @page rules, named pages, string-set declarations, margin box content, break properties, widows and orphans. The Paged.js layer. Nothing here affects screen rendering.
- Layer 7, production (07-print.css): injected at build time through the Paged.js CLI --style option. Enables bleeds and crop marks, removes debug classes, sets print-color-adjust: exact for background colors. Never loaded during design or preview.
The strict separation at layer 6 (everything Paged.js-specific goes in 06-page.css, and nothing else goes there) makes it straightforward to test the document in a non-Paged.js context. If you need to render the HTML as a screen document, for a web preview for example, you simply omit 06-page.css and 07-print.css from the <link> tags. Layers 1 through 5 remain fully functional without any paged-media-specific CSS. This separation also makes it easier to port the document's visual design to a different rendering tool, such as WeasyPrint or Prince, by swapping the @page rules in 06-page.css for that tool's specific syntax while leaving all the typography and layout CSS untouched.
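Omitting the two paged-media layers can be done by hand or scripted. The following sketch writes a screen-only copy of a document with those <link> lines removed. The function name make_screen_preview is hypothetical, and the approach assumes each stylesheet <link> occupies its own line.

```shell
# Write a screen-only copy of a document by dropping the paged-media
# layers. Illustrative sketch.
# Usage: make_screen_preview index.html preview.html
make_screen_preview() {
  # Delete any line whose <link> tag references layer 06 or layer 07;
  # every other line is copied through unchanged.
  sed -e '/<link[^>]*06-page\.css/d' \
      -e '/<link[^>]*07-print\.css/d' "$1" > "$2"
}
```

The source file is never modified, so the preview copy can be regenerated or deleted freely.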
Separating content from presentation The HTML that will outlast the CSS
The principle of separating content from presentation, keeping what the document says in the HTML and how it looks in the CSS, is one of the oldest and most consistently violated principles in web development. In document design, the stakes of violating it are higher, because documents are often long-lived. A report produced in 2024 may need to be restyled for a new brand identity in 2027. A book that was typeset for one trim size may need to be reformatted for a different one. If the visual design is tangled into the HTML (through inline styles, through class names that describe appearance rather than content type, through structural choices made for visual rather than semantic reasons), these changes require editing the content files, which is slow and risks introducing errors into the text.
In practice, this means class names in the HTML should describe what an element is, not how it looks. A callout box should be class="callout", not class="tinted-box". A pull quote should be class="pull-quote", not class="large-italic-centered". The visual treatment of a callout or pull quote can change entirely — background color, border, size, position — without any change to the HTML, because the class name refers to the element's semantic function rather than its current visual state.
This discipline also applies to the HTML structure itself. Where a semantic HTML element exists and applies, use it instead of a <div> with a class: <figure> and <figcaption> for figures and their captions, <blockquote> for extended quotations, <aside> for sidebars and marginal material, <table> (with appropriate <thead>, <tbody>, <th>) for tabular data. Semantic HTML is not merely a matter of theoretical purity; it makes the document more legible to other tools, such as screen readers, search indexers, and other CSS stylesheets, and it makes the CSS selectors more meaningful.
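Inline styles are the easiest violation of this separation to detect mechanically. The sketch below is one way to check a folder of fragments for them. The function name check_no_inline_styles and the warning text are hypothetical, and a simple text search like this cannot judge whether class names are semantic.

```shell
# Fail if any HTML fragment in a directory carries an inline style
# attribute. Illustrative sketch.
# Usage: check_no_inline_styles content
check_no_inline_styles() {
  # /dev/null is added so grep always prefixes matches with a file name,
  # even when the directory holds a single fragment.
  if grep -n ' style="' "$1"/*.html /dev/null; then
    echo "inline styles found: move them into the CSS layers" >&2
    return 1
  fi
}
```

Each offending line is printed with its file name and line number, which makes the check suitable for running before a commit.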
Version control with Git Tracking what changed, when, and why
A document produced with HTML and CSS is software, and like any software project it benefits enormously from version control. Git — the distributed version control system used by virtually every software project of any scale — is fully appropriate for document projects, and the skills needed to use it for documents are a small subset of what developers use for code. You need: git init to start, git add and git commit to record changes, git log to see history, and git diff to see what changed between versions.
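A first session with these five commands might look like the following sketch, which runs in a throwaway directory so that it cannot disturb a real project. The file name, its contents, and the author identity are placeholders invented for the example.

```shell
#!/bin/sh
# A first session with the five commands, run in a scratch directory.
set -e
cd "$(mktemp -d)"

git init -q -b main           # -b requires Git 2.28 or later
# Identity for this repository only; name and address are placeholders.
git config user.name "Example Author"
git config user.email "author@example.com"

# Record a first version of a chapter.
printf '<h1>Chapter 1</h1>\n' > chapter-01.html
git add chapter-01.html
git commit -q -m "Add Chapter 1 heading"

# Edit the chapter, inspect the change, then record it.
printf '<p>Opening paragraph.</p>\n' >> chapter-01.html
git diff                      # shows the uncommitted paragraph
git add chapter-01.html
git commit -q -m "Draft Chapter 1 opening paragraph"

git log --oneline             # two commits, newest first
```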
The document-specific Git workflow differs from a software workflow in a few ways worth knowing. Commit messages for documents benefit from a different vocabulary — "Revise Chapter 3 opening paragraph," "Adjust outer margin to 0.875in," "Fix widow in §4.2" — rather than the feature-branch vocabulary of software development. Branches are useful for major structural alternatives (a landscape version of a portrait document, for instance) but less useful for incremental content editing, which can proceed directly on the main branch without ceremony. And the .gitignore file should be set up carefully to avoid committing files that are regenerated automatically — Paged.js cache files, operating system metadata files, and typically the dist/ folder unless you have specifically decided to track generated PDFs.
# Recommended .gitignore for a Paged.js document project
# OS metadata
.DS_Store
Thumbs.db
# Node / npm
node_modules/
npm-debug.log
# Editor temp files
*.swp
*.swo
.vscode/
.idea/
# Generated output (optional — remove these lines
# if you want to track PDFs in the repository)
dist/*.pdf
For documents where content is edited collaboratively (multiple authors contributing to different chapters, a designer and an editor working in parallel), Git branches work well for separating authoring and design work. The author works on a content branch, the designer on a design branch, and the two branches are merged periodically. Because the author mostly edits files in content/ and the designer mostly edits files in css/, the two branches rarely change the same lines, so merge conflicts are uncommon; when one does occur, it appears in readable HTML or CSS and can be resolved by hand. Git's three-way merge handles most document editing scenarios correctly.
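The two-branch arrangement can be tried out in a scratch repository. In the sketch below, the branch names content and design follow the text, while the file names (including 04-layout.css), the file contents, and the author identity are assumptions made for the example.

```shell
#!/bin/sh
# Author and designer work on separate branches, then merge (scratch repo).
set -e
cd "$(mktemp -d)"
git init -q -b main           # -b requires Git 2.28 or later
git config user.name "Example Author"
git config user.email "author@example.com"

mkdir content css
printf '<p>Draft.</p>\n' > content/01-intro.html
printf 'p { margin: 0; }\n' > css/04-layout.css
git add -A
git commit -q -m "Initial content and layout"

# The author edits only files under content/ ...
git checkout -q -b content
printf '<p>Revised draft.</p>\n' > content/01-intro.html
git commit -q -am "Revise introduction"

# ... while the designer, starting again from main, edits only css/.
git checkout -q main
git checkout -q -b design
printf 'p { margin: 0; text-indent: 1em; }\n' > css/04-layout.css
git commit -q -am "Indent paragraphs"

# Merge both into main. The branches touched different files,
# so Git combines them without conflicts.
git checkout -q main
git merge -q content
git merge -q --no-edit design
```

After the second merge, main contains both the revised text and the new paragraph style, along with a merge commit recording where the two lines of work were joined.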
Tagging releases (git tag v1.0, git tag v1.1-review-draft) creates permanent anchors in the history corresponding to specific versions of the document. This is invaluable for documents that go through review cycles, where the specific version sent to a reviewer must be recoverable months later when their comments arrive. The dist/ folder PDF name should echo the tag exactly: compositor-garden-v1.0.pdf, compositor-garden-v1.1-review-draft.pdf.
# Tag a release and generate the corresponding PDF
git add -A
git commit -m "Finalise layout for v1.0 submission"
git tag v1.0
# Generate the PDF at this tagged version
pagedjs-cli index.html \
  --style css/07-print.css \
  --timeout 20000 \
  -o dist/compositor-garden-v1.0.pdf
# Push tags to remote repository
git push origin main --tags
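To keep the PDF name and the tag from drifting apart, the name can be derived from the tag rather than typed twice. This is a minimal sketch under that assumption; the function name release_pdf_name is hypothetical, and the compositor-garden prefix is taken from the file names above.

```shell
# Derive the PDF file name from the tag that points at the current commit.
# Fails if HEAD is not exactly at a tag, which guards against building
# a "release" PDF from untagged work. Illustrative sketch.
release_pdf_name() {
  # --tags also considers lightweight tags such as those made by "git tag v1.0".
  tag=$(git describe --tags --exact-match) || return 1
  printf 'dist/compositor-garden-%s.pdf\n' "$tag"
}
```

The build command can then end with -o "$(release_pdf_name)" in place of a hand-typed file name.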
Figure 25.1 — Complete project workflow. Source files (HTML, CSS, fonts, images) in a structured project folder. Design iteration happens in the browser via Paged.js live preview — edit CSS, observe result, repeat. Production PDF generation uses Paged.js CLI with the print production CSS injected at build time. Generated PDFs go to dist/, named with version numbers. Git tracks the full history of source changes and tags specific versions. The browser never changes; the content and CSS are the only moving parts.
The README is not optional
Every document project should have a README.md at the root that answers three questions for someone encountering the project for the first time: how do I view the document in preview mode, how do I generate the production PDF, and what goes where. This does not need to be long — four or five paragraphs is sufficient. But its absence means that anyone returning to the project after a gap — including you, in six months — will spend time reconstructing knowledge that should have been written down when it was fresh.
A minimal README for the project document:
Prerequisites: Node.js ≥ 18. Run npm install once to install Paged.js CLI.
Preview: Open index.html in Chrome. Fonts load from Google Fonts; ensure internet access.
Build: Run npm run build (defined in package.json) to generate dist/compositor-garden.pdf.
Structure: Content in content/, one HTML fragment per chapter. CSS in css/, layered by concern. Fonts in fonts/ (local copies; the Google Fonts URL in index.html is for preview only).
The complete project is available alongside this book — source files, CSS layers, and build configuration included.
Good project structure is invisible when it is working — you simply open the right file, make the change, see the result, and commit. It becomes visible when it is absent: the moment you cannot remember which CSS file controls the heading spacing, or which version of the document was sent to the printer, or why the chapter opener looks different in the PDF than it did in the browser. The patterns in this chapter do not prevent that moment from ever occurring. They make it occur far less often, and make it much faster to resolve when it does.
Chapter 26 closes the book with the long-term picture: how HTML-based documents are maintained and updated over time, how to handle content changes without disturbing the layout, and how to think about the life of a document project that will outlast the circumstances of its first production.