There is a question nobody asks about software, because the answer has always seemed too obvious to bother with. The question is: who is the application for?
The obvious answer is: for you, the user. The application exists to serve your needs. The calendar is for tracking your time. The email client is for managing your correspondence. The note-taking app is for capturing your thoughts. The applications are tools, and you are the person who wields them.
This book argues that the obvious answer is wrong. Or rather — that it was once approximately right, and has become, over forty years of accumulated design decisions, profoundly and systematically wrong. The application is no longer primarily for you. It is for the company that built it. Your needs are accommodated insofar as accommodation keeps you inside the application. Your data is stored insofar as storage creates dependency. Your experience is designed insofar as design increases retention. You are not the customer of the application. In the oldest and most precise sense of the word, you are its captive.
This is not a conspiracy. Nobody sat in a room and decided to trap users. It is the emergent consequence of a specific technical architecture — the application silo — combined with a specific economic model — the software subscription — repeated across every category of personal software until the pattern became invisible through ubiquity. We do not notice the cage because we have never seen the outside of it.
In the beginning — and here "the beginning" means roughly 1969, in the computing laboratories of Bell Labs and MIT — there were no applications. There was text. Programs read text and wrote text. The output of one program could be the input to another, because they all spoke the same language. A user who wanted to split a document into words, sort them alphabetically, and count how many times each one appeared could do so by connecting three small programs with two vertical bars: tr | sort | uniq. No application needed. No vendor required. The data was text. The operations were text transformations. The user owned both.
This architecture had a name — the Unix philosophy — and it had a beauty that its practitioners recognised immediately. The beauty was structural. Any piece of data could flow into any operation. The system was composable: you could combine its parts freely, because the parts all spoke the same language.
Then came the graphical interface, and everything changed — for better and for worse simultaneously. The graphical interface was genuinely liberating. It made computers accessible to people who had no desire to learn a command language. But it carried a hidden cost, paid slowly over decades, that has now come due. The hidden cost was composability. Graphical applications do not speak a common language. An email client and a calendar application can share data only if someone has written a specific integration between them — a bridge that had to be designed, built, maintained, and constantly repaired. The bridge is not free. It is not guaranteed. And there are never enough of them.
The result is the world we inhabit: a landscape of applications that are individually capable and collectively stupid. None of them is intelligent about the relationships between things — the email that references the meeting that produced the note that generated the task — because intelligence about relationships requires a shared substrate, and the application architecture has no shared substrate. The shared substrate is you. You are the integration layer.
The argument of this book is that there is a better substrate, and that it has been available since before the graphical interface was invented. The substrate is text. Not text in the diminished sense — not the pale imitation of a word processor, not the blinking cursor in a search box. Text in the full sense: structured, composable, human-readable, machine-processable, version-controllable, greppable, archivable text.
What this book proposes is an architecture in which text is the ground truth and applications are renderers. Your data lives in a document. The document is a text file. Applications are summoned into the document when you need them, render their output inline, and write their state back to the text when something changes. The document is always the source of truth. And when you want to compute — to total a column of numbers, to project a budget, to analyse a dataset — you write code (this book uses Python, but it could be Lua, Scheme or something else), directly in the document, and the result appears inline. No spreadsheet application. No proprietary formula language. The computation lives where the reasoning lives: in the document, in prose, in context.
"The future is already here — it's just not evenly distributed."
— William Gibson, The Economist, 2003
The ideas in this book are not new. Ted Nelson was describing transclusion — the inclusion of one document inside another by reference — as early as the 1960s. Doug Engelbart demonstrated collaborative hypertext in 1968. The Unix philosophy was articulated in 1978. Emacs org-mode proved in 2003 that a single text file could serve as calendar, task manager, spreadsheet, and programming environment simultaneously. Literate programming — the practice of writing code and prose together — was proposed by Donald Knuth in 1984. The future that these thinkers described has been here for decades. This book is an attempt to make it available to everyone — not by requiring them to learn Emacs, but by designing an architecture that delivers the benefits of text-native computing through an interface worthy of 2026.
A note on what this book is not. It is not a critique of the people who built the applications I am arguing against. The engineers and designers who created Gmail and Notion and Apple Calendar are not villains. They were working within constraints that made the application silo the natural unit of software. I am not indicting them. I am indicting the constraints. It is not a proposal for destroying existing software. It is not nostalgic. And it is not a technical manual, though there is technical content — enough that an implementation team could begin from it.
The title of this book is The Document is the Computer. It says that the primary unit of personal computing is not the application, not the service, not the platform, but the document — the file, the text, the thing you author and own and can hold in your hand as a string of bytes on your own storage. The document is where you live, and you should not have to leave home every time you need to compute.
That is the argument. The rest of the book makes it in detail.
Imagine opening your computer in the morning to a single document. Not an inbox, not a dashboard, not a home screen scattered with application icons. A document — a long, scrollable page that is yours, that you have been writing into and that has been written into on your behalf, that contains everything relevant to your day in the order you have chosen to arrange it.
Near the top, perhaps, a few sentences you wrote last night about what you intended to accomplish. Below that, without any border or transition, a live view of your calendar: today's meetings rendered inline, inside the prose, as naturally as a table appears in a newspaper article. Below that, the three emails that arrived overnight and that your document has decided are worth surfacing here. Below those emails, a block of Python you wrote last week that recalculates your monthly budget from first principles every time the document opens, and whose output — a small table, a single summary line — sits quietly beneath the code, updated, current, requiring nothing from you.
You add a sentence. You check a task. You accept a meeting invitation by clicking a button that sits inside a paragraph. The acceptance is recorded — in the calendar backend, yes, but also in the document itself, as a line of text appended to the prose: accepted team sync, 14:00, 24 March. The document knows what you did. It will know forever, because it is a text file, and text files do not forget.
This is not a fantasy. Every component of this description exists, in some form, right now. What does not exist is the architecture that assembles them coherently. This book describes that architecture in enough detail that it could be built, and argues that it should be.
The central concept is simple enough to state in a sentence: the document is the operating system, and applications are renderers.
In the world this book describes, you do not open an email application. You write ::email[inbox]{filter=unread} in your document, and the document's rendering layer summons a mail renderer, which displays your unread email inline. You do not open a spreadsheet. You write a ::py block containing the computation you need, and the result appears directly below the code. You do not open a calendar application. You write ::cal[today]{view=agenda} and your day appears, in context, between the paragraphs that motivated you to look at it.
This fragment — prose, computation, table, task — is a single document. It is also a single text file. The reasoning and the computation and the action are all in the same place, in the order they happened, in the words that make them meaningful. No application switch. No copy and paste. No context lost in transit.
The architecture has four primitives. Everything in this book is built from them.
The embed directive — ::app[id]{params} — is the universal syntax for summoning any app renderer into the document. Email, calendar, contacts, chat, browser, files, terminal: all invoked by the same grammar, all rendering inline, all writing state back to the same text layer.
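The grammar is simple enough to parse in a few lines. Here is a minimal sketch in Python; the exact details (what characters an id may contain, how parameters are delimited and escaped) are assumptions of the sketch, not a specification.

```python
import re

# A minimal parser for the ::app[id]{params} embed grammar.
# The precise grammar below is an assumption; only the shape is given.
DIRECTIVE = re.compile(
    r"::(?P<app>\w+)"               # app type: email, cal, table, ...
    r"(?:\[(?P<id>[^\]]*)\])?"      # optional id: inbox, today, ...
    r"(?:\{(?P<params>[^}]*)\})?"   # optional params: key=value pairs
)

def parse_directive(line):
    m = DIRECTIVE.match(line.strip())
    if not m:
        return None
    params = {}
    if m.group("params"):
        for pair in m.group("params").split(","):
            key, _, value = pair.partition("=")
            params[key.strip()] = value.strip()
    return m.group("app"), m.group("id"), params

print(parse_directive("::email[inbox]{filter=unread}"))
# ('email', 'inbox', {'filter': 'unread'})
```

The point of the uniform grammar is that this single parser serves every app: email, calendar, table, and terminal are all the same three captures.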
The computation block — ::py — evaluates Python in a sandboxed interpreter, captures output, and renders it inline. Blocks share a document-scoped namespace. The document is the notebook. There is no separate kernel, no separate file, no separate application.
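The shared namespace is the whole mechanism. A sketch, assuming each block in a document is executed against a single document-scoped dictionary so that later blocks see names bound by earlier ones:

```python
import contextlib
import io

# Sketch: successive ::py blocks share one document-scoped namespace,
# so a name bound in one block is visible in the next.
def run_blocks(blocks):
    namespace = {}            # one namespace for the whole document
    outputs = []
    for source in blocks:
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(source, namespace)   # later blocks see earlier bindings
        outputs.append(buffer.getvalue())
    return outputs

doc = [
    "values = [120, 45, 310]",       # first block: binds a name
    "total = sum(values)\nprint(total)",  # second block: uses it
]
print(run_blocks(doc))
# ['', '475\n']
```

A real evaluator would add per-block sandboxing and richer output capture, but the notebook-like behaviour falls out of nothing more than a shared dictionary.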
The table directive — ::table — is a text-serialisable structured data block with a live renderer. It can be authored by hand, generated by a ::py block, or edited directly in the rendered view — with the text always updated to match. The spreadsheet is not replaced. It is dissolved.
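The round trip between text and structure can be sketched directly. The pipe-separated serialisation below is illustrative only; it stands in for whatever text form the ::table directive would actually use.

```python
# Sketch of the ::table round trip: the text form is authoritative,
# the structured form is derived, and an edit in the rendered view is
# written straight back to the text layer.
def parse_table(text):
    lines = text.strip().splitlines()
    header = [c.strip() for c in lines[0].split("|")]
    rows = [[c.strip() for c in line.split("|")] for line in lines[1:]]
    return header, rows

def serialise_table(header, rows):
    return "\n".join(" | ".join(row) for row in [header, *rows])

text = "item | cost\nrent | 1200\nfood | 400"
header, rows = parse_table(text)
rows[1][1] = "450"                    # edit made in the rendered view...
print(serialise_table(header, rows))  # ...flows back into the text
```

Because parse and serialise are inverses, the rendered view and the text can never drift apart for long: every edit on either side is a text edit in the end.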
The capability registry — invisible to the user, foundational to the system — maps directive types to renderer implementations, enforces sandboxing, and handles the bidirectional sync protocol that keeps rendered state and text state in agreement.
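A registry of this kind reduces, at its core, to a mapping from directive type to renderer function, with unknown types degrading gracefully to raw text. A sketch, with hypothetical names:

```python
# Sketch of a capability registry: directive types map to renderer
# functions; a type with no registered renderer stays visible as text.
registry = {}

def renderer(directive_type):
    def register(fn):
        registry[directive_type] = fn
        return fn
    return register

@renderer("cal")
def render_calendar(target, params):
    # A stand-in renderer; a real one would draw an agenda inline.
    return f"[calendar: {target}, view={params.get('view', 'month')}]"

def render(directive_type, target, params):
    fn = registry.get(directive_type)
    if fn is None:
        return f"::{directive_type}[{target}]"   # unknown: degrade to text
    return fn(target, params)

print(render("cal", "today", {"view": "agenda"}))
# [calendar: today, view=agenda]
```

The graceful degradation is the important design choice: a document that mentions a renderer you do not have installed is still a readable text file, not a broken one.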
A word about Python. I chose it not because it is the best language but because it is the most legible for people who are not primarily programmers. A product manager reading a colleague's document should be able to understand a ::py block without knowing what a monad is. total = sum(values) is not a program. It is a sentence. The sandbox is strict: a ::py block cannot read from disk, make network requests, or import arbitrary libraries. It can compute. It can produce output. That is all. Computation without side effects is computation you can trust.
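The policy (compute, produce output, nothing else) can be illustrated by executing a block against a whitelist of builtins. This is a sketch of the policy, not a real sandbox: genuine isolation requires process-level or interpreter-level enforcement, since CPython's exec alone can be escaped.

```python
# Illustration of the sandbox policy: a ::py block gets arithmetic,
# data structures, and print, but no file, network, or import machinery.
# NOT a real sandbox; real isolation needs a separate process or VM.
SAFE_BUILTINS = {"sum": sum, "len": len, "print": print, "range": range,
                 "min": min, "max": max, "sorted": sorted}

def run_sandboxed(source):
    env = {"__builtins__": SAFE_BUILTINS}
    exec(source, env)
    return env

env = run_sandboxed("total = sum([120, 45, 310])")
print(env["total"])
# 475

try:
    run_sandboxed("open('/etc/passwd')")
except NameError as e:
    print("blocked:", e)   # open is simply not a defined name here
```

With no __import__ in the whitelist, import statements fail too: the block can compute over the values the document gives it, and nothing else.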
The book is in four parts.
Each chapter in Parts II and III contains at least one interactive figure — a live prototype of the concept being described, embedded in the text in exactly the way the architecture proposes all embeds should work. The book is, in this sense, a demonstration of itself.
I hope that someone reads it and starts building. I hope that someone is you.
On the morning of 9 December 1968, Douglas Engelbart walked onto a stage in San Francisco and changed the way the world understood what a computer could be. The audience of roughly a thousand computer professionals watched as he demonstrated, for the first time in public, a system that supported real-time collaborative editing, hypertext linking, video conferencing, and a small handheld device he called a mouse. He called the session "A Research Center for Augmenting Human Intellect." History would call it the Mother of All Demos.
What is remarkable about the demo, watching it today, is not how futuristic it looks. It is how familiar, and how strange simultaneously. Familiar because nearly everything Engelbart showed has been incorporated into the computers we use every day. Strange because the spirit of what he showed — the vision of the computer as a tool for augmenting human thought, for connecting ideas to ideas, for making reasoning visible and shareable — was never fully incorporated. We took the mouse. We left behind the philosophy.
Engelbart's system, NLS, treated the document as the primary unit of computing. Everything lived in documents. Documents contained other documents by reference. Computation was not separate from writing — it was woven into it. We did not build that system. We built something different: a world of windows.
The graphical interface as most people know it was born at Xerox PARC in the early 1970s, and popularised by Apple with the Macintosh in 1984. Its designers borrowed deliberately from the visual language of offices: a desktop, folders, files, a trash can. The metaphor was chosen for accessibility. A person who had never touched a computer could look at the screen and find something recognisable.
The metaphor worked. Personal computing went from a hobbyist pursuit to a mass-market phenomenon in the span of a decade. By the early 1990s, the visual vocabulary of WIMP — Windows, Icons, Menus, Pointer — was universal. By the 2000s, it was invisible: not a metaphor users were consciously applying, but the unchallenged reality of what a computer looked like and how it behaved.
Invisible metaphors are the most consequential kind. When you can see a metaphor, you can interrogate it. When the metaphor becomes the world — when nobody alive remembers computing before windows and icons — you lose the ability to see it clearly enough to question it.
The desktop metaphor rests on a physical object — a desk — that the vast majority of computer users no longer have in any meaningful sense. The desk of 1984 was a real working surface covered in real papers, real folders, a real telephone. The computer sat on one corner of it. The desktop metaphor made the computer's screen look like the rest of the desk, so that the transition between physical and digital would be cognitively smooth. The desk of 2026 is often the computer itself. There are no physical papers. The metaphorical desktop on the screen refers to physical objects that have not existed in most offices for twenty years. It is a metaphor for a world that has vanished, running as the primary interface of the world that replaced it.
The word "desktop" appears in the name of a feature Microsoft shipped for Windows 95 with Internet Explorer 4, called "Active Desktop," which allowed the desktop background to be a live HTML page. This was, in retrospect, a brief and failed attempt to make the desktop metaphor computational. It lingered through Windows XP and was removed in Windows Vista. The impulse — to make the desktop surface dynamic and data-bearing — was correct. The implementation was not.
The deeper problem with WIMP is not the desktop metaphor itself but the architectural principle it encodes: that information lives inside containers, which live inside other containers, in a strict hierarchy. This is so natural-feeling, after forty years of conditioning, that it is hard to see it as a choice. But it is a choice, and it has costs.
The costs can be made visible with a simple experiment. Choose any task you performed on a computer today. Now count the number of container boundaries you crossed to complete it. Suppose the task was reading an email that contained a link to a shared document, and leaving a comment on that document. Starting from a locked screen: you unlocked the device (boundary one: operating system), opened the email client (boundary two: application), found the email (boundary three: inbox folder), clicked the link (boundary four: browser opened), navigated to the document (boundary five: web application), found the relevant section (boundary six: document structure), and left a comment (boundary seven: comment thread). Seven container crossings for a task that should feel like three: read, navigate, respond.
Each container crossing requires a context switch. The email client does not know about the document. The document does not know about the email. The browser does not remember why it opened. The containers are individually functional and collectively amnesiac.
Notice the third bar. The computer holds none of the context — zero, at every depth. The entire burden of remembering why you opened the browser, what was in the email, which section of the document you were looking for, is carried by you. This is not a flaw in any individual application. It is a structural property of the container hierarchy. Containers do not communicate context upward or downward. They simply contain.
It is worth being precise about what the operating system's window manager actually knows, because the gap between what it knows and what you need it to know is the measure of the problem.
The window manager knows the position and size of each application window on screen. It knows which window is in focus. It knows how to draw window borders, title bars, and scroll bars. This is, essentially, the complete list. It does not know what application is in each window, beyond a name and an icon. It does not know what content the application is displaying. It does not know the relationship between the content in one window and the content in another. It does not know anything that would allow it to help you — to surface connections, to suggest relevant information, to remember context across application boundaries.
The window manager is a sophisticated picture frame. It knows how to display rectangles. It does not know what is in them.
An icon is a picture that represents a program. The picture is meant to communicate what the program does. The icon is a promise: this picture stands for this capability, and the capability lives here, inside this container, and nowhere else.
The promise is false. Or rather — it was true in 1984, when a computer had perhaps a dozen applications and each application did one clearly bounded thing. Consider what the envelope icon on your dock actually promises today. It promises email. But email is not a bounded capability. Email is communication, which involves people (contacts), time (calendar), tasks (todo items), documents (files), and memory (search and history). An email about a project involves all of these simultaneously. The envelope icon promises access to a container. What the user needs is access to a relationship. The icon cannot represent a relationship. It can only represent a container. And so the user must manually reconstruct the relationship — four icons, four containers, one relationship held together by memory and effort rather than by the system.
The iOS home screen of 2007 was perhaps the purest expression of icon-as-promise thinking. A grid of identical rounded squares, each standing for a container, the user's entire computing life organised as a set of destinations to be navigated to and departed from. Sixteen years later, the home screen of iOS 17 is a grid of identical rounded squares. The paradigm has not moved.
Return, for a moment, to that stage in San Francisco in 1968. Engelbart's question — the question that animated the entire demo — was not "how do we make computers easier to use?" It was a harder and more important question: how do we use computers to augment human intellect?
Augmentation is a specific claim. It means that the human-computer system, taken together, should be capable of things that neither the human nor the computer could accomplish alone. The WIMP paradigm, despite its many genuine achievements, has not produced augmentation in this sense. It has produced acceleration — computers that let us do more of what we were already doing, faster. We write more emails, not better ones. We attend more meetings, not more productive ones. The speed is real. The augmentation is largely absent.
Augmentation requires that the computer hold context. It requires exactly what the container hierarchy prevents: a shared substrate in which all of this information lives together, accessible to every operation, composable with everything.
"The digital computer is... a means for the extension of the human intellect, as the printed book was, or the pen, or language itself."
— J.C.R. Licklider, Man-Computer Symbiosis, 1960
There is a response to everything in this chapter that is worth addressing directly: but we have multi-window setups for exactly this reason. If the problem is that email and calendar and notes are in separate containers, arrange them side by side. Problem solved.
It is not solved. It is deferred. Arranging windows side by side addresses the visual problem of proximity, not the architectural problem of isolation. The email window placed next to the calendar window does not share data with the calendar window. Moving information between them still requires manual intervention — copy, paste, type, click. The windows are next to each other. They do not know about each other. The cognitive burden of maintaining the connection between them has not been reduced.
The window is not the answer because the window is the problem. The window is the container. Adding more containers, or arranging them differently, does not dissolve the container hierarchy. It tiles it. The answer is not a better arrangement of windows. It is an architecture in which the container boundary does not exist — in which the email and the calendar event and the note and the task are all aspects of the same underlying thing, all readable and writable through the same interface, all composable with each other because they live in the same substrate. That substrate is the document.
A note on what this chapter has not argued: it has not argued that the GUI will or should disappear. The visual interface is not the problem. The container hierarchy is the problem. A text-native document architecture can be visually rich — richer, in some respects, than the current paradigm. The visual surface of computing is not what we are replacing. We are replacing what lies beneath it.
There is a specific kind of frustration that every person who uses a computer regularly has felt but rarely named. You are in a meeting. Someone references a document. You open your email to find the link. You find the email, click the link, the document opens in a browser. The meeting is now about something in that document. You want to make a note. You open your notes application. The note has no connection to the email, to the meeting, to the document, to the calendar event that convened you all. It is a note that exists in isolation, in a silo, knowing nothing about the context that produced it.
This is the silo problem. It is not a bug. It is not an oversight. It is the direct and inevitable consequence of the application architecture described in Chapter 1, combined with a set of economic and technical incentives that have made the problem progressively worse over the decades in which it should, by rights, have been getting better.
Every application stores its data in a format. In practice, the format is a fence. It determines who can read the data, who can write it, and who must ask permission to do either. Some formats are open. HTML is a format — anyone can write it, open it in any browser, read it in any text editor. Most application formats are not open in this sense. A .pages file requires Pages. A .sketch design file is opaque to every application except the one that created it. The format encodes a power relationship: the application that writes the file is the canonical reader of the file. All other applications are guests, and the terms of their access can be revoked at any time.
In 2019, Google announced that it would phase out Gmail access for what it called "less secure apps" — third-party clients that authenticated over IMAP with a plain password rather than the OAuth-based XOAUTH2 mechanism. The effect was that many email clients — applications that users had relied on for years to access their own email — simply stopped working until they were rewritten. The users' email had not changed. A policy decision by the company that ran the server revoked the access. The data was in a silo. The silo's owner changed the lock.
The modern response to the format problem is the API. In practice, the API is a more sophisticated version of the same problem. The API is controlled by the application that exposes it. The application decides what can be requested, at what rate, by whom, and under what terms. The API can be restricted, rate-limited, priced, versioned, and shut down. It is not a window between silos. It is a controlled aperture — a hole in the wall whose size and position are determined entirely by the party with the most to lose from genuine openness.
The history of consumer software is littered with APIs that were opened generously during a growth phase and closed abruptly when the platform reached dominance. Twitter opened its API in 2006 and built an ecosystem of third-party applications on top of it. In 2023, it priced its enterprise API tier at $42,000 per month, destroying that ecosystem overnight. Facebook did it in 2015. Netflix did it in 2014. The pattern is consistent: open during growth, close at dominance. The API is not a bridge between silos. It is a drawbridge, and the people inside the castle control when it is raised.
The shift from one-time software purchases to subscription pricing created a new and more durable form of the silo. Under the purchase model, the user bought a copy of an application and received a file. The file was theirs. Under the subscription model, the data lives on the vendor's servers. The user does not receive a file. They receive access to a view of their data, for as long as they continue paying. If they stop paying, access is suspended. If the vendor shuts down, access ends permanently. The subscription model transfers custody of the user's data from the user to the vendor.
Evernote, for many years one of the most popular personal note-taking applications, changed its free tier limits multiple times between 2016 and 2023 — restricting the number of devices, reducing offline access, and limiting the size of uploads. Users who had stored years of notes in the application faced a choice: pay an increasing subscription fee or accept degraded access to their own notes. The notes had not changed. The bargain had.
Consider what should be the simplest possible inter-application transaction: receiving a meeting invitation by email and accepting it so that it appears on your calendar. The iCalendar standard is twenty-six years old. Calendar invites from Outlook do not reliably render in Gmail. Google Calendar events do not always round-trip correctly through Apple Calendar. Timezone handling is a persistent source of errors across all combinations. Reply handling — the mechanism by which the organiser learns that you have accepted — breaks silently in a meaningful fraction of cases.
Why? Because the standard defines a format, but not a rendering model, not a conflict resolution protocol, not a timezone normalisation procedure, not a reply handling mechanism. Each application implements the gaps differently. The gaps are where the silos live. And this is the easy case — two applications with a documented standard, a clear user intent, and decades of implementation experience. Now consider the cases with no standard: the email about a project that should update a task list, the chat message that should create a calendar event. For these, there is no standard. There is only the user, copy-pasting between silos.
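One of these gaps can be made concrete in a few lines. An iCalendar DTSTART may carry a timezone or may be "floating", with no timezone at all, and the standard does not say how clients should reconcile the two. Python's datetime makes the ambiguity visible by refusing to compare them:

```python
from datetime import datetime, timezone

# The same wall-clock time, once with a timezone and once floating.
aware = datetime(2026, 3, 24, 14, 0, tzinfo=timezone.utc)   # DTSTART with TZID
floating = datetime(2026, 3, 24, 14, 0)                     # DTSTART without one

try:
    print(aware < floating)
except TypeError as e:
    print("undefined:", e)
# Each calendar client invents its own answer to this undefined
# comparison, which is one reason the same invite can land at
# different hours in different applications.
```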
Given the silo problem, a natural market response would be integration — products that connect the silos. This market exists. Zapier, Make, and dozens of similar services have built businesses on the premise that people need their applications to talk to each other. These services are valuable. They are also a damning indictment of the underlying architecture. A world in which a thriving industry exists solely to pass data between applications that should already share a substrate is a world that has systematically failed to solve a foundational problem.
Integration services fail in predictable ways. They fail when APIs change — and APIs always change. They fail when vendors implement the same standard differently. They fail silently, leaving users with the impression that data has been transferred when it has not. And they introduce a new party — the integration service itself — into the custody chain of the user's data, adding a third silo to the problem of two. More fundamentally, integration services treat the symptom rather than the disease.
Here are five tasks that a person performing knowledge work routinely needs to perform, and a count of the silo boundaries each one crosses in a typical modern setup. Finding the context for a meeting you are about to join — four applications, four silos, zero of them shared; three to five minutes per meeting, performed manually, every time. Creating a task from an email — two silos, one manual transfer, the task now having no persistent connection to the email that generated it. Writing a weekly status update — five silos synthesised into a sixth that immediately becomes a frozen snapshot, stale by Monday. Onboarding a new colleague — six silos, one hour, always incomplete. Finding something you know you have — four separate searches across four separate indexes, may still fail.
The silo problem has resisted solution for forty years not because nobody has noticed it, and not because the technical problems of integration are genuinely unsolvable. It has resisted solution because the solutions attempted have all worked within the existing architecture rather than replacing it. Better APIs do not dissolve silos — they create better-maintained walls with better-maintained drawbridges. Integration services do not dissolve silos. All-in-one workspaces do not dissolve silos — they create larger silos with more internal connections. The silo is not a local problem that can be fixed locally. It is a systemic consequence of the architectural decision to make the application the primary unit of data ownership.
The structural solution requires a different primary unit. Not the application, but the substrate — a shared layer that all applications can read from and write to, that no single vendor owns, that persists independently of any particular product's continued existence, and that the user controls absolutely. That substrate is text. Specifically: a plain text document in which data from any application can live as a structured directive, rendered by a registered handler, and written back as text when something changes.
In 1969, Ken Thompson sat down and wrote a text editor. The editor was called ed. It accepted commands as text, produced output as text, stored its files as text. It had no graphical interface, no mouse support, no menus, no icons. It was, by any contemporary standard of interface design, a tool of extraordinary austerity.
The files that ed created are still readable today. Not readable in the sense that they can be opened with compatibility software, or converted through an intermediate format. Readable in the literal sense: you can open them in any text editor on any operating system on any device manufactured in the past forty years, and the text will be there, in the same order, in the same encoding, as legible as the day it was written. Try this with a .wps file from Microsoft Works 3.0. Try it with an .lwp file from Lotus Word Pro. These formats existed. They held real documents. They are now effectively inaccessible — not because the information was lost, but because the formats that encoded it became orphans when the applications that wrote them were discontinued. The containers decayed. The text inside them was never the problem.
Plain text is the single format that consistently survives digital preservation challenges. The reason is structural: plain text has no rendering layer. A .docx file contains not just the text of a document but instructions for how to render it — font specifications, layout rules, embedded objects, tracked changes, macro definitions. The rendering instructions require a renderer. When the renderer is discontinued, the instructions become uninterpretable, and the document, though technically present, is effectively lost. A plain text file contains nothing but characters and the order in which they appear. There are no rendering instructions. There is nothing to interpret beyond the characters themselves. The file is complete. It needs nothing.
The Voyager spacecraft, launched in 1977, carries a golden record — a physical disc containing sounds and images from Earth, encoded in a format chosen specifically for durability: analogue grooves that any sufficiently advanced civilisation could reconstruct from the physics of the medium alone, with no knowledge of any particular encoding standard. The designers of the Voyager record understood something that software vendors prefer not to discuss: the most durable format is the one that requires the fewest assumptions about what the reader already knows.
Of text's properties, composability is the one that the graphical interface most completely abandoned, and the one whose loss has been most costly. Composability, in the Unix sense, means that any program can read from any source and write to any destination, because all programs speak the same language: a stream of text. This is enforced by the architecture. A Unix program that reads from standard input receives text. A program that writes to standard output produces text. The pipe operator — the vertical bar — connects the output of one program to the input of another. The programs do not need to know about each other. They need only to speak text. A program written in 1975 can pipe its output to a program written in 2024. Composition is free, because the common language was chosen once and never changed. The graphical interface broke this. Graphical applications write to windows. Windows are not pipeable.
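The same pattern can be sketched in a few lines of Python: each "program" below is a text-to-text function (the names are invented for illustration), and the pipe is ordinary function composition.

```python
from itertools import groupby

# Three tiny "programs", each reading text and writing text, in the spirit
# of tr | sort | uniq -c. Composition is free because the interface is text.
def split_words(text):                      # like tr ' ' '\n'
    return "\n".join(text.split())

def sort_lines(text):                       # like sort
    return "\n".join(sorted(text.splitlines()))

def count_runs(text):                       # like uniq -c
    return "\n".join(f"{len(list(g))} {k}"
                     for k, g in groupby(text.splitlines()))

def pipeline(text):
    # The "pipe": output of one program becomes input of the next.
    return count_runs(sort_lines(split_words(text)))

print(pipeline("the cat sat on the mat the end"))
```

None of the three functions knows about the others; they agree only on the interface, which is text.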
In 1978, Doug McIlroy, the inventor of the Unix pipe, summarised the Unix philosophy in three rules: write programs that do one thing and do it well; write programs that work together; write programs that handle text streams, because that is a universal interface. The third rule is the foundational one. Text streams are a universal interface because every program, regardless of its purpose, domain, or implementation language, can produce and consume them. The universality is not a feature of any particular program. It is a property of the interface itself — and a property that holds precisely because the interface is not owned by any particular vendor.
"Write programs that do one thing and do it well. Write programs that work together. Write programs that handle text streams, because that is a universal interface."
— Doug McIlroy, A Quarter Century of Unix, 1994
In 1984, Donald Knuth published a paper introducing a practice he called literate programming. The idea was simple and radical: that a program should be written primarily for a human reader, with the machine as a secondary audience. The programmer would write a document — prose and code interleaved — in which the explanation of the algorithm and the implementation of the algorithm were the same artifact.
Knuth was describing, in 1984, something very close to the ::py block in the architecture of this book. The ::py block is a literate programming construct: it is code embedded in prose, where the prose explains the intent and the code implements it, and both are part of the same document. Literate programming never achieved mainstream adoption. The tooling was complex, the discipline required was high, and the payoff was diffuse. What changed in the years since is the arrival of the Jupyter notebook, which made the insight accessible to data scientists. The architecture of this book takes the Jupyter notebook's core insight and removes the application boundary entirely.
If you want to see the architecture of this book already built, partially, by one person, in a tool that has been in continuous use for more than twenty years, open Emacs and type M-x org-mode. A .org file can be a task manager, a calendar, a note-taking system, a spreadsheet, a programming notebook, and a publishing system — all of this from a text file, all composable with every Unix tool, all version-controllable with git. Org-mode is proof — running, deployed, actively used for more than twenty years — that the model works. One text file. Many rendering modes. Everything composable. No silos.
Text possesses five properties that proprietary application formats lack. Every design decision in the chapters that follow should be evaluated against this checklist. If a proposed feature requires storing state in a database rather than in the text file, it compromises durability. If it requires a specific application to read the file's contents, it compromises portability. If it makes the file's text opaque to general tools, it compromises composability. If it stores state that cannot be diffed, it compromises diffability. If it makes the raw file illegible to a careful human reader, it compromises legibility.
A text file's history is equally textual: the difference between two versions is itself text (a diff) that any version control system can produce and interpret. The history of a text file is a text file.

There is a fifth reason, not available to the architects of Unix or the designers of org-mode, why text is the right substrate for personal computing in 2026: language models. A language model is a system that reads text and produces text. It is, in this sense, a Unix program — one that participates naturally in text-based workflows, that can read any text file without special integration and produce output that any text-processing tool can consume.
This has a direct implication for personal computing. A computing environment built on text is natively legible to language models. An AI assistant in a text-native document can read the entire document — the prose, the directives, the Python blocks, the table data, the completed tasks — without any special interface, without any API integration, without any export step. It reads the same file you read. It knows what you know, in the same format you know it. Contrast this with the current state of AI assistants embedded in applications: the AI sees what the integration layer permits, in the format the API returns. The AI is inside the silo, looking out through the same controlled aperture as everything else. A text-native document has no controlled aperture. The AI reads the file. The file is the context. The context is complete.
Part I has made its argument. The WIMP paradigm created a cage. The silo problem is the cage at scale, reinforced by economic and technical forces that have made it self-perpetuating. Text, by contrast, has properties that make it the right substrate for a different architecture, and a tradition of practice going back fifty years that has proven the model works. Part II builds the architecture.
An operating system, in the formal sense, is software that manages hardware resources and provides services to application programs. We use the phrase "the document as OS" in a more functional sense: the document as the layer that the user actually lives in. The layer that organises access to capabilities — email, calendar, computation, files — without requiring the user to navigate to a separate application for each one. The layer that holds context across capabilities, so that the email and the calendar event and the task and the note are all visible in the same place, at the same time, in the order that makes sense to the person looking at them. The document OS is not invisible. It is the primary thing the user sees. It is the page they open in the morning. It is where their thinking lives.
The primary interface of the text-native computing environment is a scrollable document — not a fixed layout, not a grid of panels, not a dashboard, but a flowing page of text that grows as the day progresses and that can be reorganised at any time by the user without breaking anything. The daily document is not a place to write things down about what is happening elsewhere. It is the place where things happen. The email is not summarised in the daily document. It is there, rendered inline, reply-able in context. The calendar is not referenced in the daily document. It is there, showing the agenda, accepting invites in place.
This fragment — prose, calendar, email, computation, tasks — is a single text file. The user did not switch applications to compose it. The Python ran because the document opened. The tasks are connected to each other through the blocked-by parameter: send-q3-sara cannot be completed until call-finance is checked. This dependency is in the text. It is not in a project management database.
The daily document has structure, but the structure is not imposed. There are no mandatory sections, no required fields, no template that must be filled in. The structure emerges from the user's practice and is recorded in the text itself. A user who finds it useful to begin every document with a calendar view and a task list will write those directives at the top of each day's document. A user who prefers to start with prose reflection and pull in email only when needed will write it that way. The document accommodates both, because the directives are text and text can go anywhere in a document. This is structurally different from most productivity applications, which impose a data model on the user. The document has no predefined categories. A thing is whatever the prose around it says it is.
A critical design principle of the document OS: the renderer is replaceable. The document — the text file — is permanent. The renderer — the application that parses the directives and displays the widgets — is a software component that can be swapped, upgraded, or replaced without affecting the document's content. This is not how applications work today. In most applications, the renderer and the data model are inseparable. In the document OS, the renderer is explicitly downstream of the document. The document is the source of truth. Any renderer that understands the directive grammar can render any document. Your data stays in your text file. The renderer is just the lens through which you see it.
The closest existing analogy is CSS applied to HTML. An HTML file is a document with semantic structure; a CSS file determines how that structure is displayed. Changing the CSS does not change the HTML. The document OS applies the same principle to personal computing: the text file is the HTML, the directive grammar is the semantic layer, and the renderer is the CSS — separable from the content, swappable without data loss.
The document OS has a political consequence that is worth stating plainly. When your data lives in a text file on your disk, it is yours — not in the contractual sense of "you retain ownership of your content" as defined in section 14(b) of a terms of service agreement. Yours in the physical sense: you have the bytes. You can read them. You can move them. You can delete them. Nobody can revoke your access, change the pricing model, discontinue the product, or sell your data to an advertiser, because nobody else has the data. The renderer may be a service. The calendar backend may be hosted. But the document itself — the text file that holds your thinking, your directives, your computation, your tasks — is yours unconditionally. If every service you use shuts down tomorrow, your document remains. It is readable in any text editor. It is yours.
Niklaus Wirth, designing the Oberon system in 1988, articulated a principle that this architecture inherits but has not yet stated with sufficient precision: the state of the system should be practically determined by what is visible to the user. Wirth observed that hidden state forces users to maintain a mental model of something the computer is concealing. Every hidden field, every in-memory application state, every formula invisible behind a cell value, every mode that changes what a button does — each is a tax on human working memory, paid continuously, invisibly, in full.
The Four Laws of this architecture are consequences of this deeper principle. Text is always ground truth because hidden databases are hidden state. All mutations are text mutations because silent in-memory changes are hidden state. Embeds are sandboxed and declared because undeclared capabilities are hidden state. Graceful degradation is mandatory because renderer-only content is hidden state — content that exists but cannot be read.
The practical test is Wirth's: can the user determine the complete state of the system by looking at what is visible? In the daily document, the answer must be yes. The computation is visible as code. Its output is visible inline. The tasks are visible as text. The accepted calendar invite is visible as a log line. The diverged table cell is visible as an amber-highlighted field. Nothing is concealed. Nothing requires the user to remember a prior action, maintain an awareness of a mode, or infer a state from an effect. The document is a complete and readable record of everything that has happened in it.
Wirth's specific formulation: "The absence of hidden states. The state of the system is practically determined by what is visible to the user. This makes it unnecessary to remember a long history of previously activated commands, started programs, entered modes, etc. Modes are in our view the hallmark of user-unfriendly systems." — Wirth & Gutknecht, Project Oberon, 1992. Oberon achieved this for a 1988 single-user workstation. The text-native document achieves it for personal computing at large.
Every architecture needs a syntax. The text-native document architecture's syntax is the embed directive: a compact, human-readable expression that tells the document renderer to summon a specific application capability at a specific point in the text. The directive is text. It looks like text. A person reading the raw document without a renderer can parse it mentally in seconds. A renderer that encounters it knows exactly what to do. The directive is the single point of contact between the document and the application ecosystem — the seam where prose becomes computation, where text becomes a calendar view, where a line of markup becomes an inbox. Getting the grammar right is the most important design decision in the entire architecture, because every other decision builds on top of it.
The embed directive follows a single grammar, with no exceptions: a sigil (::), a directive type, an optional identifier in square brackets, and optional parameters in curly braces. In full: ::type[identifier]{key=value}.
The sigil :: was chosen because it does not appear in natural prose at the start of a line. A sentence will never begin with two colons in English or in any major natural language. This means the parser can unambiguously identify a directive: any line beginning with :: is a directive; all other lines are prose. No ambiguity. No escape sequences. No edge cases.
The identifier in square brackets is optional but strongly encouraged. It serves two purposes: it gives the directive a name that the document context can reference (a ::py block with id=q3-analysis makes its variables available to subsequent blocks), and it makes the raw text more legible to a human reader. ::cal[today] is self-describing even without documentation. The parameters in curly braces are key-value pairs that configure the renderer — key=value, space-separated, double-quoted for values containing spaces.
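A parser for this grammar fits in a handful of lines. The sketch below is illustrative rather than normative: the regular expressions are one plausible reading of the grammar as described here, not the book's reference implementation.

```python
import re

# Grammar: ::type[identifier]{key=value key2="quoted value"}
# A sigil, a type, an optional identifier, optional parameters.
# Any line not beginning with :: is prose.
DIRECTIVE = re.compile(
    r'^::(?P<type>[\w:-]+)'             # directive type, e.g. cal or cal:google
    r'(?:\[(?P<id>[^\]]*)\])?'          # optional identifier in square brackets
    r'(?:\{(?P<params>[^}]*)\})?\s*$'   # optional key=value parameters
)
PARAM = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')

def parse_line(line):
    """Return (type, id, params) for a directive line, or None for prose."""
    m = DIRECTIVE.match(line)
    if m is None:
        return None
    params = {key: (bare if bare else quoted)
              for key, quoted, bare in PARAM.findall(m['params'] or '')}
    return m['type'], m['id'], params
```

A line either matches the one pattern or it is prose; there is no third case, which is what makes the parser trivial.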
Every directive must be meaningful as raw text. This is a hard requirement, for two reasons. First, durability: if the document is to outlast the renderer — and it must, because that is what text-nativity means — then the directives must be interpretable without the renderer. ::cal[2026-03-23]{view=agenda} tells a human reader: this is a calendar embed, for 23 March 2026, in agenda view. The reader knows what was meant, even if they cannot see the rendered calendar. Second, interoperability: not every environment that reads a document will be a full renderer. Partial renderers should be able to operate on documents containing directives they do not handle, rendering what they can and displaying the others as text. The test for any directive syntax is: can a careful human reader, who has never seen this system before, understand what it means from the raw text alone?
A natural question arises: why not use HTML? The answer is legibility. HTML is not legible to a human reader without a renderer. A person who encounters <div class="cal-embed" data-view="agenda" data-date="2026-03-23"></div> in a text file needs to know HTML, the application's class naming conventions, and the data attribute schema to interpret it. This fails the graceful degradation test. ::cal[2026-03-23]{view=agenda} passes. Any literate person can read it. The directive syntax is also intentionally minimal — one construct, with two optional extensions. The parser is trivial to implement. The writer does not need to remember a markup language. The syntax is a notation, not a language.
The closest prior art for this syntax is the CommonMark Generic Directives Proposal, a draft specification by John MacFarlane and others that would add a ::directive[argument]{key=value} syntax to the CommonMark Markdown standard. The proposal has been discussed since 2017 and has not yet been formally adopted, primarily because the use cases it addresses — embedding application capabilities in prose — were not yet considered mainstream. The text-native document architecture makes those use cases central.
The ::py block is the most consequential addition to the document architecture — more consequential, even, than the embed directive — because it changes what a document fundamentally is. An embed directive makes a document a surface on which applications render. A Python block makes the document itself computational. The document does not just display results from elsewhere. It produces results, from code that lives inside it, in context, alongside the prose that motivated the computation. This is the literate programming insight, delivered at last to a mainstream audience. The code is in the document, the prose is in the document, the output is in the document. The reasoning and the result are inseparable, because they are literally in the same text file.
All ::py blocks in a document share a single Python namespace — the document context. A variable defined in the first block is available in every subsequent block. The document is, in this sense, a single Python program whose source code is interspersed with prose.
The context is rebuilt by evaluating blocks in document order, top to bottom, every time the document opens. This means the document's computation is always reproducible: given the same document text, the same context is always produced. There is no hidden kernel state, no out-of-order evaluation, no "restart and run all" button, because the kernel can never fall out of sync with the document. The document is the program. The program runs from the top. Every time.
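The evaluation rule can be stated as code. The following minimal evaluator is a sketch under one stated assumption: that a ::py block is closed by a ::end line, a delimiter invented here for illustration. It execs each block, in document order, into a single shared dictionary.

```python
def evaluate(document_text):
    """Rebuild the document context: exec ::py blocks in order, one namespace."""
    context = {}                 # the single shared namespace
    in_block, lines = False, []
    for line in document_text.splitlines():
        if line.startswith("::py"):
            in_block, lines = True, []
        elif in_block and line.startswith("::end"):   # assumed block delimiter
            exec("\n".join(lines), context)           # same dict every time
            in_block = False
        elif in_block:
            lines.append(line)
    return context

doc = """
Some prose motivating a computation.
::py
x = 10
::end
More prose between the blocks.
::py
y = x * 2
::end
"""
ctx = evaluate(doc)
```

The second block reads x without declaring it, because both blocks run in the same dictionary; re-running evaluate on the same text always yields the same context.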
March budget review. Software came in under — we cancelled two subscriptions. Hardware over again. Running the numbers:
The table below is generated from the Python block above — not a separate widget. Edit any cell to override; the block tracks divergence and offers reconciliation.
| Category | Budget | Actual | Delta | % used |
|---|---|---|---|---|
Edit the actual list, hit run ▶ — the table regenerates from the new values. Click ctx ▾ to inspect the live Python namespace. Then edit a table cell directly — the divergence badge appears and the reconcile bar offers to re-run or detach. Toggle to source view to see the exact text written to disk at every step.

When a ::table block is generated by a ::py block, a connection exists between the two. The table's source= parameter records which block generated it. This connection creates three possible states: synchronised, where the table matches the generating block's output; diverged, where a cell has been edited by hand and no longer matches; and detached, where the source= link has been removed and the table stands alone.
The spreadsheet application has three jobs: storing data, computing over it, and displaying it. The text-native document splits these three jobs along cleaner lines. Data is stored as ::table text — human-readable, diffable, greppable, writable by any program. Computation is expressed as ::py blocks — readable as prose, editable in place, auditable by anyone reading the document. Display is handled by the renderer — sortable, editable, optionally chartable.
The deeper cost of the spreadsheet is that its data is trapped. A .xlsx file requires Excel or a compatible application to read. A ::table block in a document is text. grep, awk, python, git diff — every tool you already have can read it, transform it, version it, and pipe it somewhere else. The spreadsheet is a silo. The ::table directive is a first-class citizen of the text layer.
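Because the block is plain text, totalling a column needs nothing beyond string operations. The sketch below parses an illustrative table body in the standard pipe-delimited form (the figures are invented) with the standard library alone.

```python
# An illustrative ::table body; any program can read it as plain text.
TABLE = """\
| Category | Budget | Actual |
|---|---|---|
| Software | 400 | 310 |
| Hardware | 250 | 395 |
| Travel | 600 | 580 |
"""

def column(table_text, name):
    """Return the values of one named column from a pipe-delimited table."""
    rows = [[cell.strip() for cell in line.strip("|").split("|")]
            for line in table_text.strip().splitlines()]
    header, body = rows[0], rows[2:]      # rows[1] is the |---| separator
    i = header.index(name)
    return [row[i] for row in body]

total_actual = sum(int(v) for v in column(TABLE, "Actual"))
```

No spreadsheet application was consulted; the data was never anywhere but in the text.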
The embed directive grammar tells the document what to render. The capability registry tells the document how. The registry is the mapping between directive types and renderer implementations — the table that says "when you encounter ::cal, use the CalendarRenderer; when you encounter ::email, use the MailRenderer." It is invisible to the user and foundational to the system. Without it, the directive is syntax. With it, the directive is a live capability.
A renderer registers itself with the capability registry by declaring its manifest — a small text document that states what directive types it handles, what parameters it accepts, what capabilities it requires, and what it promises not to do. The manifest is verifiable: the renderer runtime enforces the capability declarations at execution time.
The registry resolves directive types to renderers in order of specificity: a renderer registered for cal:google takes precedence over one registered for cal. If no renderer is registered for a directive type, the directive is rendered as its raw text — the graceful degradation fallback.
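The resolution rule can be sketched directly. The class below is hypothetical, not the book's implementation: it walks from the most specific registration to the least, and falls back to returning the raw directive text when nothing matches.

```python
class Registry:
    """Map directive types to renderers; the most specific registration wins."""

    def __init__(self):
        self.renderers = {}

    def register(self, dtype, renderer):
        self.renderers[dtype] = renderer

    def resolve(self, dtype):
        # cal:google falls back to cal, then to graceful degradation.
        parts = dtype.split(":")
        while parts:
            key = ":".join(parts)
            if key in self.renderers:
                return self.renderers[key]
            parts.pop()
        return lambda directive: directive   # unhandled: render as raw text

registry = Registry()
registry.register("cal", lambda d: f"<calendar {d}>")
registry.register("cal:google", lambda d: f"<google calendar {d}>")
```

An unregistered type is not an error; it degrades to the text it already was.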
Each renderer runs in an isolated execution context. It receives only the identifier and parameters declared in the directive. It cannot read the document text, cannot access other renderers' output, cannot read the user's filesystem, and cannot make network requests to origins not declared in its manifest. The isolation is enforced by the renderer runtime, not by trust in the renderer's code. This is the architectural guarantee that makes it safe to install third-party renderers. A renderer for a specialised calendar backend, written by an unknown developer, can be installed and used without fear that it will access the email embed rendered next to it, or exfiltrate the content of the Python blocks above it. The sandbox is the contract, and the contract is enforced structurally.
The registry also manages the bidirectional sync protocol. When a user clicks "Accept" on a calendar invite rendered by the CalendarRenderer, the renderer does not directly edit the document. It emits a sync event — a structured record of what changed — to the registry, which applies the change to the document text and notifies any other renderer whose displayed content depends on the changed data. This indirection is load-bearing. It ensures that the document text is always the authoritative source of state, that changes are atomic, and that the document's version history accurately reflects every user action, whether that action was a keystroke in prose or a click in a rendered widget.
There is a consequence of the capability registry design that deserves its own treatment, because it is the most powerful thing the architecture enables and the one least obvious from the structural description: when a ::py block evaluates, the registry injects every registered app's read interface into the Python namespace as a first-class object.
This means that a Python block does not query a database. It does not call an API. It does not import a library. It simply calls cal.events(date="today") or tasks.query(done=False, overdue=True) or email.query(unread=True, starred=True) — and the registry routes each call to the appropriate renderer's read interface, which returns plain Python objects. The document context becomes a unified query layer over all eleven apps simultaneously.
The capability model is carefully asymmetric. ::py blocks have read access to app namespaces — they can query, filter, join, and compute over any app's data. They do not have write access. Mutations — accepting a calendar invite, sending an email, completing a task — go through the sync protocol described in Chapter 9, not through the Python namespace. This separation is load-bearing: it means a ::py block you received in a document from a stranger can read your data to produce output, but cannot send emails on your behalf or modify your calendar. The sandbox constraint (no network, no filesystem) and the read-only namespace together define a safe computation model.
The practical consequence is that Python becomes what SQL is to relational databases — a universal query language — except that the "tables" being queried are your live calendar, your email, your tasks, your contacts, and your files, all at once, with no schema to learn and no joins to configure. The query is Python. The result is rendered inline in the document. The reasoning that produced the query lives in the prose around it.
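A sketch of the injection, using stub read interfaces in place of real backends. The facade and method names (cal.events, tasks.query) follow the text; the data and the exec mechanism are illustrative.

```python
from types import SimpleNamespace

# Stub read interfaces standing in for the renderers' real backends.
cal = SimpleNamespace(events=lambda date: [
    {"title": "Team sync", "date": date, "attendees": 4},
])
tasks = SimpleNamespace(query=lambda done, overdue: [
    {"id": "call-finance", "done": done, "overdue": overdue},
])

# What the registry would inject into a ::py block's namespace.
namespace = {"cal": cal, "tasks": tasks}

block = """
today = cal.events(date="today")
open_overdue = tasks.query(done=False, overdue=True)
summary = (len(today), len(open_overdue))
"""
exec(block, namespace)
```

The block never imported anything and never opened a connection; the capability was present in its namespace before its first line ran.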
A ::py block querying app namespaces injected by the capability registry. Toggle to code view to read the query, then hit run ▶.
Oberon's module system had an elegant property that the capability registry should inherit: a module is only loaded when it is actually needed. In Oberon, the system does not load a compiler when it boots. It loads the compiler the first time a command invokes it, and thereafter keeps it resident. This is not a performance optimisation — or not only that. It is a security and transparency principle: a module that is not loaded cannot have side effects, cannot consume resources, and cannot be exploited.
The capability registry should enforce the same rule. A renderer for ::cal is loaded only when a document contains at least one ::cal directive. A renderer for ::sh — the privileged shell executor — is loaded only when a document explicitly contains a ::sh block. A document with no shell directives has no shell renderer loaded, which means it has no shell access, which is a security property expressible without any additional mechanism: the absence of the directive is the absence of the capability.
This has a useful consequence for document metadata. Every document that declares its directives in a preamble — a short block at the top listing which renderer types it uses — becomes self-describing in terms of capabilities. A reader encountering the document for the first time can see, before rendering anything, that this document requires a calendar renderer and a Python sandbox but no shell access and no email renderer. The capability manifest is the document's permission declaration, readable by a human without any tooling.
The preamble is optional — a document without one is rendered with whatever renderers the registry has available for the directive types it encounters. But a document with one is auditable, portable, and safe to open in restricted environments. It is, in miniature, the same principle as a Unix process's file descriptor table: a complete declaration of what the process can touch, visible before the process runs.
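The audit is mechanical. In the sketch below the preamble syntax (a ::uses line) is invented for illustration; the check itself, that the set of directive types used is a subset of the set declared, is the point.

```python
import re

DOC = """\
::uses{directives="cal py"}

Morning notes.
::cal[today]{view=agenda}
::py
x = 1
::end
"""

def declared(doc):
    """Directive types listed in the (hypothetical) ::uses preamble."""
    m = re.search(r'^::uses\{directives="([^"]*)"\}', doc, re.M)
    return set(m.group(1).split()) if m else None

def used(doc):
    """Directive types actually appearing in the document body."""
    return {m.group(1).split(":")[0]
            for m in re.finditer(r'^::([\w:-]+)', doc, re.M)} - {"uses", "end"}

def audit(doc):
    """True if every directive the document uses is declared up front."""
    decl = declared(doc)
    return decl is not None and used(doc) <= decl
```

A restricted environment can run this check, and refuse to load any renderer, before a single directive is rendered.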
Every system that claims to keep two representations of the same information in sync — a rendered widget and a text file, a local document and a remote backend — faces the synchronisation problem: what happens when both representations change at the same time? Which one wins? How do conflicts get resolved? How does the user know what happened? The text-native document architecture resolves the synchronisation problem by making it asymmetric. The text file is always the source of truth. The rendered widgets are always derived from the text. Changes flow in one direction for the authoritative state — from text to widget — and in one direction for user actions — from widget to text.
Every user action in a rendered widget produces exactly one of three mutation types: Append — new text is added to the document; sending a reply appends a block, accepting an invite appends a log line, completing a task appends a timestamp. Replace — existing text is modified in place; editing a table cell replaces the cell value, changing a task's due date replaces the parameter value, toggling a checkbox replaces [ ] with [x]. Delete — existing text is removed; dismissing a directive removes it, detaching a table from its source removes the parameter. Every mutation is expressed as a text edit. Every text edit is recorded in the document's version history. The document's history is a complete audit trail of every action the user took.
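The three mutation types reduce to ordinary string edits. In this sketch the concrete text shapes (the task line, the ::log entry) are invented; what matters is that each mutation is a plain text operation on the document.

```python
doc = "- [ ] call-finance due=2026-03-25\n"

# Replace: checking the box replaces "[ ]" with "[x]" in place.
doc = doc.replace("- [ ]", "- [x]", 1)

# Append: the completion is recorded as a new line of text.
doc += "::log completed=call-finance at=2026-03-24T09:15\n"

# Delete: dismissing the log entry removes its line.
doc = "\n".join(line for line in doc.splitlines()
                if not line.startswith("::log")) + "\n"
```

Each step is diffable, because each step changed nothing but text.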
Had a call with Sara this morning about the product roadmap. Need to schedule a follow-up with the full team next week.
When a directive interacts with a remote backend — sending an email, accepting a calendar invite, posting a chat message — the mutation model extends to cover the remote state. The renderer sends the action to the backend and waits for confirmation. On success, it emits a sync event that writes the mutation to the document text. On failure, it emits an error that renders inline in the document. The key property is that the document text reflects reality. If an email was sent, the document contains a record of it. The document is not an interface to the backends. It is a journal of what has happened, with the backends as the execution layer.
This model has a direct analogy in event sourcing — an architectural pattern in software systems where the system's state is derived by replaying a sequence of events rather than by reading a current state snapshot. The text-native document is an event-sourced personal information system: the document text is the event log, the rendered widgets are derived views, and the "current state" is always computable from the log. The benefits of event sourcing — auditability, reproducibility, time travel — apply directly.
Conflicts arise when the same document is edited simultaneously from two locations — two devices, two users, two renderers. The conflict resolution strategy follows the git model: conflicts are surfaced as text in the document, using a standard conflict marker format, and resolved by the user editing the text to select one version or the other. This is not as elegant as automatic conflict resolution. It is more honest. Automatic conflict resolution requires rules about whose edit takes precedence, and those rules are inevitably wrong in some cases. Surfacing the conflict as text puts the resolution decision in the user's hands, expressed in the same medium as everything else. The conflict is text. The resolution is text. The history shows the conflict and its resolution. Nothing is hidden.
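A resolver for such conflicts is itself a small text program. The sketch below keeps one side of each git-style conflict block; the marker format is git's, the function is illustrative.

```python
CONFLICTED = """\
Notes from the call.
<<<<<<< laptop
- [x] call-finance
=======
- [ ] call-finance due=2026-03-26
>>>>>>> phone
Next steps below.
"""

def resolve(text, keep="ours"):
    """Keep one side of each git-style conflict block; the edit is just text."""
    out, side = [], None
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            side = "ours"
        elif line.startswith("======="):
            side = "theirs"
        elif line.startswith(">>>>>>>"):
            side = None
        elif side is None or side == keep:
            out.append(line)
    return "\n".join(out) + "\n"
```

The user could equally resolve the conflict by hand in any editor; the markers, the choice, and the result are all ordinary lines of the document.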
Part III redesigns each of the eleven applications in the ecosystem by applying the four primitives developed in Part II. Each application is approached the same way: what does this app fundamentally do, what data does it own, how does that data become a directive, and what does bidirectional sync mean for this particular kind of interaction? The result, in each case, is an application that no longer requires you to leave the document — that renders where you need it, interacts in context, and writes every action back to the text layer. The chapters below cover each app in turn, with a design principle and worked directive examples for each.
Email is the oldest and most durable of personal computing's communication tools. It has survived the rise and fall of instant messaging, social networking, enterprise chat, and every other communication platform that promised to replace it. It survives because it is federated, because it is asynchronous, and because it is text. These are precisely the properties that make it natural in the text-native document.
The ::email directive renders email where it is relevant — not in a separate application, not in a dedicated pane, but in the document, next to the prose that motivated you to check it. The meeting you are preparing for has an email thread; the thread appears above your preparation notes. The task you created from a message has the message embedded beside it.
Inbox view — ::email[inbox]{filter=unread} — renders a filtered list of messages. The filter uses the same query syntax as the underlying mail backend. Thread view — ::email[thread-id] — renders a full email thread inline; each message in the thread is a paragraph. Compose view — ::email[draft]{to=sara@example.com subject="Q3 report"} — renders a compose form; the draft is stored in the document text until sent.
The calendar application commits the original sin of the WIMP paradigm more completely than any other: it is a destination. You go to the calendar to see your time. You leave the calendar to use your time. The calendar knows nothing about why you accepted the meeting, what you were working on before it, or what you need to do after it. In the text-native document, time is always in context. The calendar view for today appears between the paragraph that motivated you to check it and the notes you took in response to what you found.
The ::cal directive renders in three modes switched by the view= parameter: agenda, week, and month. The user switches between views inline; the switch writes the new view= value to the directive text, so the document remembers the preferred view. Accepting an invite from a rendered calendar view writes two things to the document: the acceptance is sent to the calendar backend, and a one-line log entry is appended to the document text.
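The view= write-back can be sketched in a few lines of Python; set_param is my name, not part of the specification, and the sketch assumes a directive with an existing {…} parameter block:

```python
import re

def set_param(directive_line: str, key: str, value: str) -> str:
    """Rewrite one parameter inside a directive's {...} block.

    If the key is present its value is replaced; if absent it is
    appended. The change is made in the text itself, so the document
    remembers the preferred view across sessions.
    """
    def update(match: re.Match) -> str:
        params = match.group(1)
        pattern = rf"\b{re.escape(key)}=\S+"
        if re.search(pattern, params):
            params = re.sub(pattern, f"{key}={value}", params)
        else:
            params = f"{params} {key}={value}".strip()
        return "{" + params + "}"

    return re.sub(r"\{([^}]*)\}", update, directive_line, count=1)

print(set_param("::cal[today]{view=agenda}", "view", "week"))
# → ::cal[today]{view=week}
```

Switching the rendered view is, underneath, exactly this: one parameter rewritten in one line of text.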
Accepting an invite can also generate a ::task directive automatically — "prepare agenda for team sync" — anchored to the event's date and connected by ID. The task and the event know about each other through the document text. No separate integration required.

A task, reduced to its essential nature, is a line of text with a state. It is either done or not done. Every other attribute — due date, priority, project, assignee, recurrence — is metadata attached to that line. The ::task directive makes this explicit. A task is a directive with an identifier, a state, and optional metadata. It renders as a checkbox. Checking the box replaces [ ] with [x] in the document text. The entire task state is visible in the text. There is no hidden database.
Every task in every document is findable with a single search: grep -r "::task\[" ~/documents/ | grep "done=false". The task list — across all documents, all projects, all contexts — is a filtered view of the document collection. The ::task directive with a query parameter does exactly this: ::task[today]{filter="due=today done=false"} renders all incomplete tasks due today, across all documents, as a live interactive list.
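The same query, sketched in the document's own language (the function names are mine; the line shapes follow the ::task examples above):

```python
import re

# Assumed line shape: ::task[some-id]{due=2026-03-23 done=false} trailing text
TASK_RE = re.compile(r"::task\[(?P<id>[^\]]+)\]\{(?P<params>[^}]*)\}")

def parse_tasks(lines):
    """Extract task dicts from document lines: a code sketch of the
    grep pipeline above."""
    tasks = []
    for line in lines:
        m = TASK_RE.search(line)
        if m:
            params = dict(p.split("=", 1) for p in m.group("params").split())
            tasks.append({"id": m.group("id"), **params})
    return tasks

def due_today(tasks, today):
    """Incomplete tasks due on the given ISO date."""
    return [t for t in tasks if t.get("due") == today and t.get("done") != "true"]

doc = [
    "::task[call-finance]{due=2026-03-23 done=false}",
    "::task[send-report]{due=2026-03-23 done=true}",
    "::task[book-travel]{due=2026-04-01 done=false}",
]
print(due_today(parse_tasks(doc), "2026-03-23"))
# → [{'id': 'call-finance', 'due': '2026-03-23', 'done': 'false'}]
```

Whether the filter runs in grep or in a ::py block, it is the same operation over the same text.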
Dependencies between tasks use a blocked-by= parameter that references another task by ID. The dependency is expressed in text. The renderer shows the blocked task as inactive until its blocker is completed. Project management without a project management database.

The note-taking application has a fundamental design flaw that its users have learned to work around: it creates copies. When you want to include content from one note in another, you copy it. The copy immediately begins to diverge from the original. The text-native document resolves this with transclusion — the principle that any document can embed any section of any other document by reference, not by copy. The embedded content renders inline, reads from the source, and changes at the source are visible at the reference.
::note[document-name]{section=heading} renders the specified section of the named document inline. Edits made in a transcluded section propagate back to the source document. The transcluded content is not a snapshot. It is a live view. This is the behaviour Ted Nelson described in 1963 and the web, with its hyperlinks-not-transclusions model, never implemented.
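The read side of transclusion can be sketched as a section extractor, assuming Markdown-style headings in the source document (the function is illustrative, not the renderer's actual code):

```python
def extract_section(source_text: str, heading: str) -> str:
    """Return the body of the section titled `heading`, up to the next
    heading of the same or higher level. Sub-headings are kept."""
    out, capturing, level = [], False, 0
    for line in source_text.splitlines():
        if line.startswith("#"):
            this_level = len(line) - len(line.lstrip("#"))
            if capturing:
                if this_level <= level:
                    break          # next sibling/parent section: stop
                out.append(line)   # deeper sub-heading: keep it
            elif line.lstrip("#").strip() == heading:
                capturing, level = True, this_level
        elif capturing:
            out.append(line)
    return "\n".join(out).strip()

source = "# Notes\nintro\n## Budget\nQ3 is over by 8%.\n## Travel\nbook flights\n"
print(extract_section(source, "Budget"))
# → Q3 is over by 8%.
```

The write side runs the same logic in reverse: locate the section in the source file, splice the edited lines in, save. The source document is the single store; the transclusion is a window onto it.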
Chapter 14 is where the architecture earns its most ambitious claim: that the spreadsheet, as an application category, can be dissolved into the document. Not replaced by a better spreadsheet. Dissolved — because the functions a spreadsheet serves (data storage, computation, display) are better served, separately, by the text layer, the Python block, and the table renderer respectively. A worksheet in the text-native document is a document section containing a ::py block that defines the data and computation, a ::table block that displays the results, and the prose that explains what the numbers mean and why they were computed.
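A minimal worksheet of this kind, sketched as plain Python (the table() helper here is a stand-in for the document's own, and the figures are invented for illustration):

```python
# Data and computation as plain Python; display as a Markdown table,
# which is the ::table renderer's job in a live document.
expenses = [
    {"category": "travel", "budget": 5000, "actual": 6200},
    {"category": "tools",  "budget": 2000, "actual": 1800},
]

def table(rows):
    """Render a list of dicts as a Markdown table (stand-in for the
    document's table() helper)."""
    headers = list(rows[0])
    lines = ["| " + " | ".join(headers) + " |",
             "| " + " | ".join("---" for _ in headers) + " |"]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)

for row in expenses:
    row["overrun"] = row["actual"] - row["budget"]

print(table(expenses))
```

The data is text, the computation is text, the display is text. Nothing in this worksheet is hidden in a cell.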
The spreadsheet formula =SUMIF(B2:B10,">0",C2:C10) is a spell. The Python equivalent is a sentence: total = sum(v for v in values if v > 0). The document makes the reasoning visible by making the computation legible.

The ::file directive does not replace the file system. It makes it navigable from the document — which is where the context for finding a file usually lives. When you are writing about a project and want to reference a PDF report, the directive ::file[~/documents/q3-report.pdf]{preview=true} renders the PDF preview inline, in context, without requiring you to open a file manager. The file stays where it is. The reference to it lives where it is relevant. Dragging a file into the document writes a ::file directive at the cursor position. The file is not embedded; it is referenced. Moving the file updates the directive. Deleting the file marks the directive as broken.
In the text-native document, a person is a named entity in the text. Typing @sara.chen anywhere in the document creates a live reference to Sara's contact record. The ::contact directive renders her card inline when needed. Actions taken from the card — sending an email, scheduling a meeting — write back to the document as log entries, creating a running account of your relationship with this person in the natural flow of your writing. The document becomes the relationship log. Every @mention is a timestamped reference. A ::py block can query all mentions of a person across all documents: mentions = [l for l in doc_lines if "@sara.chen" in l]. The CRM is a view over the document collection.
Chat is the most context-destroying of all communication formats. A Slack message is a fragment — a sentence or two, deprived of the thread that preceded it, disconnected from the document it was discussing, unlinked from the task it created or the decision it recorded. The ::chat directive renders a slice of a channel or thread in the document — the last ten messages, or the messages since yesterday, or a specific thread. The slice is in context: below the paragraph that motivated you to check the channel, above the notes you take in response to what you find. A decision made in a chat thread can be promoted to the document with one action: the message text is written as a prose line below the ::chat embed, prefixed with the date and the participants. The decision is now in the document. The document is the record.
The ::web directive addresses the web's text-accessibility problem by defaulting to reader mode — the cleaned, text-extracted, typography-first rendering of a web page that modern browsers offer as an optional mode and that most users have never discovered. Reader mode strips the web page to its content: the prose, the headings, the images that are part of the content. The result is text. The result can be saved to the document, searched, and processed by Python blocks. Saving a web clip writes three things to the document: the URL as a ::web directive, the page title as a heading, and the first two sentences of the article as a summary. The clip is saved as text. The document holds the essence; the web holds the detail.
The ::sh directive is the terminal's representative in the document. It is distinct from the ::py block in one critical way: it is not sandboxed. A ::sh block runs in the user's shell, with the user's permissions, with access to the full file system and network. It can do anything a terminal command can do. This power requires explicit acknowledgement: ::sh blocks in documents received from others are never auto-executed; they are shown as text until the user explicitly chooses to run them. For a user who works at the command line, the text-native document becomes a runnable notebook: a record of every command run, every output received, every script executed, integrated with the prose that explains why each command was run and what the output meant.
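The execution policy can be stated as a few lines of code; this is a sketch of the rule described above, with names of my choosing:

```python
def may_auto_execute(directive_type: str, document_trusted: bool) -> bool:
    """Execution policy sketch: sandboxed ::py may auto-run anywhere;
    privileged ::sh runs only in documents the user authored locally,
    and is shown as inert text in documents received from others until
    the user explicitly chooses to run it."""
    if directive_type == "sh":
        return document_trusted
    return True  # sandboxed directives are safe to evaluate on render

print(may_auto_execute("sh", document_trusted=False))
# → False
```

The asymmetry is the point: the cost of a wrongly blocked ::sh block is one extra click; the cost of a wrongly executed one is unbounded.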
::sh is the one directive that can have external side effects. This is by design. The shell is the privileged context for operations that reach beyond the document. The distinction between ::py (sandboxed computation) and ::sh (privileged execution) maps onto the distinction between reasoning and acting.

Part IV does not describe features. It describes consequences. The five properties in these chapters are not things you have to implement, configure, or pay for. They are things that fall out of the architecture automatically, by virtue of the data being text. They are the second-order benefits of the design — the reasons why, even if the text-native document did not have a single interactive feature, it would still be a better place to keep your information than any application silo.
When your computing life is text, you get version control for free. Not as a feature you must configure, but as a consequence of the format. Any directory of text files can be a git repository. Any change to any file is tracked. Any previous state is recoverable. The entire history of your documents — every task completed, every note written, every budget revised, every email drafted — is accessible with git log.
The most underused capability of version control in a personal context is the diff — the comparison between two versions of a file that shows exactly what changed and when. In a text-native document, every decision is visible in the diff. Accepting a meeting invite is a line added. Completing a task is a character changed from [ ] to [x]. Changing a budget number is an integer replaced.
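The diff for such a day might look like this (the file name and exact wording are illustrative; the conventions are the ones used throughout this book):

```diff
--- a/2026-03-23.md
+++ b/2026-03-23.md
-::task[call-finance]{done=false} Call finance about the overrun
+::task[call-finance]{done=true} Call finance about the overrun
+accepted: team-sync 2026-03-23 14:00
```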
This diff tells you that on 23 March 2026, at some point during the day, you completed the task to call finance, and you accepted the team sync meeting. The diff is the record. The record is text. The text is yours.
When all your information is in text files, you can search all of it at once, with a single tool, in a fraction of a second. grep -r "Sara" ~/documents/ returns every reference to Sara across every document you have ever written — meeting notes, tasks, email drafts, budget comments, project documents, daily notes. The result is a list of lines, each preceded by its file path and line number. The search crossed no silo boundaries, because there are no silo boundaries. Sara is a name in text files. The text files are a directory. The directory is searchable.
Because the document uses a consistent directive grammar, search can be made structurally precise. Finding all incomplete high-priority tasks: grep -r "::task\[" ~/documents/ | grep "priority=high" | grep -v "done=true". Finding every calendar event accepted in the past month: grep -r "^accepted:" ~/documents/2026-02* ~/documents/2026-03*. These are not queries to a database. They are text searches. The tools that run them are fifty years old, universally available, and require no installation, no account, and no subscription.
The spreadsheet's formula is a secret. It lives in the cell, invisible in the default view, accessible only by clicking into the cell and reading a syntax that takes years to read fluently. A spreadsheet can contain years of accumulated business logic, and none of it is visible in the output. The numbers appear. Their provenance does not. The ::py block is the opposite of this. The computation is in the document. The code is in the text. Any reader of the document can see exactly what was computed, in language designed to be read, in the same scroll as the result.
Consider a budget overrun report shared with a finance team. In a spreadsheet, the finance team receives a grid of numbers. If they want to understand how any number was derived, they must click into cells, trace formula chains, and hope the formula references are labelled. In practice, they rarely do this. In a text-native document, the finance team receives the prose explanation, the Python block, and the table. The computation is right there. Anyone with basic Python literacy can verify the derivation in sixty seconds. Auditability becomes the natural consequence of keeping computation in prose.
The academic concept of "reproducible research" — the practice of publishing not just results but the code and data that produced them — is exactly the text-native document ideal applied to science. The Jupyter notebook made reproducible research practical for data scientists. The text-native document makes it practical for personal computing at every scale.
The language model is the first fundamentally new computing tool since the spreadsheet. It reads text and produces text — which means it is, structurally, a Unix program: it participates naturally in text-based workflows without any special integration. A language model can read your text-native document because your document is text. This is, when you think about it, remarkable: the most powerful AI tool ever built is natively compatible with the simplest possible data format.
In the current application paradigm, integrating a language model with personal data requires significant work: connect the model to Gmail via OAuth, connect it to Calendar via a separate OAuth, connect it to Notion via yet another API, manage the permissions, manage the rate limits, manage the data residency questions. Each connection is a negotiation. In the text-native document, there are no apertures to negotiate. The language model reads the document. The document contains everything relevant. The model knows what the user knows, in the same format the user uses.
The language model context window — the maximum amount of text a model can reason about at once — is the binding constraint on AI assistance for personal computing. A well-maintained daily document for one week is approximately 5,000–10,000 words. A model that reads the entire week's document has complete context for any question about that week. The text-native document is already shaped for this — a curated selection of relevant information assembled in the order that makes sense to the person writing it. The document is already the right size and shape for AI assistance, because it was designed to be the right size and shape for human attention. These turn out to be the same thing.
Every design system should be tested at its weakest point. For the text-native document architecture, the weakest point is the failure mode: what happens when the renderer is unavailable, when a directive type has no registered handler, when the user opens the document in a plain text editor because nothing else is installed? The answer must be: the document remains useful. Not as useful — the calendar does not render, the Python does not evaluate, the tasks do not have checkboxes. But the information is all there, in text, legible to any reader.
This book was written in a text file. It contains directives that I could not render while writing it, because the renderer does not yet fully exist. The interactive figures in the chapters were built as prototypes — widgets demonstrating how the architecture would work — rather than as genuine ::py blocks evaluated in a live document context. I have written a book about a system that describes itself, using a partial implementation of the system it describes. This is not irony. It is the honest state of an architecture that is ahead of its implementation.
The gap between the architecture and the implementation is not large. The components exist. Sandboxed Python evaluators exist; Pyodide runs Python in a browser sandbox with no server required. Markdown table parsers exist; there are dozens of implementations in every language. Bidirectional sync protocols exist; operational transformation and CRDTs are well-understood. The directive grammar is fifty lines of parsing logic. The capability registry is a manifest format and a dispatch table. None of this requires research. It requires engineering.
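That parsing logic fits comfortably in a screenful of Python; this regex sketch follows the grammar in the appendix, and the names are mine:

```python
import re

# A directive line per the appendix grammar:
#   ::type [identifier] {key=value key="quoted value"}
DIRECTIVE = re.compile(
    r"^::(?P<type>[A-Za-z0-9-]+)"           # type
    r"(?:\[(?P<id>[A-Za-z0-9./~@-]+)\])?"   # optional [identifier]
    r"(?:\{(?P<params>[^}]*)\})?\s*$"       # optional {params}
)

def parse_directive(line: str):
    """Parse one directive line into a dict, or return None for prose.
    (Trailing free text after the directive is not handled here.)"""
    m = DIRECTIVE.match(line.strip())
    if not m:
        return None
    params = {}
    for key, value in re.findall(r'(\S+?)=(".*?"|\S+)', m.group("params") or ""):
        params[key] = value.strip('"')
    return {"type": m.group("type"), "id": m.group("id"), "params": params}

print(parse_directive('::email[draft]{to=sara@example.com subject="Q3 report"}'))
# → {'type': 'email', 'id': 'draft', 'params': {'to': 'sara@example.com', 'subject': 'Q3 report'}}
```

Everything downstream of this function — dispatch to a renderer, capability lookup, fallback to plain text — is a table keyed on the "type" field.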
What it also requires is a decision about where to start. I suggest starting with the daily document and the ::task and ::py directives — the two capabilities that demonstrate the core value proposition with the least integration complexity. A daily document that evaluates Python inline and renders tasks as live checkboxes is already more powerful than most note-taking applications. It is the minimum viable document OS. Everything else can be added as renderers are built and registered.
I want to be clear about what I am not asking for. I am not asking for a single company to build this as a product, own the format, and charge a subscription. That would reproduce the exact dynamic this book argues against. I am asking for an open specification — a directive grammar, a capability registry protocol, a sync event format, a sandbox specification — that any implementer can build to and any user can adopt without vendor dependency. The specification is already partially written in the Appendices.
I hope this book is the beginning of a community. I hope someone reads it and starts building. I hope someone else reads it and starts writing their daily document in a text editor and discovers that even without a renderer, the format is useful — that naming things with the directive syntax helps, that keeping everything in one file helps, that writing prose around your tasks and computations helps. The document I am writing is not finished. It is the document you continue when you put this one down.
"The best way to predict the future is to invent it."
— Alan Kay
— end —
directive  = "::" type [ "[" identifier "]" ] [ "{" params "}" ]
block      = directive CRLF body CRLF "::" "end"
type       = 1*( ALPHA / DIGIT / "-" )
identifier = 1*( ALPHA / DIGIT / "-" / "." / "/" / "~" / "@" )
params     = param *( SP param )
param      = key "=" value
key        = 1*( ALPHA / DIGIT / "-" )
value      = token / quoted-str
token      = 1*( ALPHA / DIGIT / "-" / "." / "/" / ":" / "+" )
quoted-str = DQUOTE *( any char except DQUOTE ) DQUOTE
cal · email · task · note · py · table · contact · file · chat · web · sh · end (block closer, not a directive type)
id= — unique identifier within document
render=false — suppress rendering, show as text only
comment= — human-readable annotation, ignored by renderer
version= — schema version, for forward compatibility
Blocks execute on document open (run=auto blocks) or by user action. Total execution time limit: 10 seconds per document open. Individual block time limit: 3 seconds. Memory limit: 64MB per document context.

Permitted standard library modules: math · cmath · decimal · fractions · statistics · random · datetime · calendar · collections · itertools · functools · operator · re · string · textwrap · json · csv · enum · dataclasses · typing · abc · copy · pprint

Document API:
doc.tasks(filter=None) — list of task dicts from all blocks in document
doc.tables(id=None) — list of table dicts, optionally filtered by id
doc.blocks(type=None) — list of all directive blocks in document
doc.metadata — dict of document-level metadata (date, title, tags)
table(data) — helper: outputs a list-of-dicts as a ::table block
chart(type, data) — helper: outputs a chart specification
Niklaus Wirth's Oberon system anticipated this architecture: commands as executable text (a name of the form Module.Proc in any document can be clicked to invoke it), modules as lazy-loaded renderers, a shared text substrate as the universal data layer, and no-hidden-state as an explicit design principle. Wirth reached the same architecture from a different direction — systems programming rather than personal productivity — and in doing so validated it. The text-native document architecture and Oberon are the same answer to the same question, separated by thirty-five years and a change in the question's scale.

The notation has a separate lineage: the generic-directives proposal for Markdown defines the ::directive[id]{params} syntax that this architecture adopts and extends.
The Document is the Computer — complete manuscript draft · 2026
Written in a text file · Rendered in a document · Owned by nobody but the reader