Cluedoc: document your codebase as interlinked visual papers

Coding agents write software faster than any human can read it. Cluedoc has your agent document as it builds, one human-readable paper per feature, so the system stays understandable to the people who own it. Fittingly, this page is one such paper.

flowchart TD
    ROOT["◆ Cluedoc document a codebase as papers"]
    ROOT --> CT["Capability tree one paper per feature"]
    ROOT --> PS["Progressive sync driven by code changes"]
    ROOT --> CG["Citation graph papers cite papers"]
    ROOT --> RG["Reading guide points you at the right papers"]
    PS -. maintains .-> CT
    CG -. links .-> CT
    RG -. walks .-> CG

Figure 1 · Cluedoc's own capability tree. Solid edges nest features; dashed edges are citations.

Abstract

Cluedoc is an Agent Skill that keeps a codebase understandable to humans while coding agents rapidly change it. It treats a software system as a group of features and writes one document per feature in the style of an academic paper. That format is long established for explaining complex ideas, which is what a system's features usually are. The papers are visual, organized as a capability tree, and cross-referenced like citations. Cluedoc writes and maintains them as the code changes, so people can keep track of a system that now grows faster than they could read it line by line.

Introduction

Coding agents have changed who writes software. Increasingly the human does not type the code; they direct agents that do, and codebases grow far faster than before. The scarce resource shifts from writing to understanding: when you did not write a line of it, and it changed again this morning, how do you know what the system does, or whether the next change is safe?

Hand-written documentation cannot keep that pace, and a README rots the moment the agent moves on. Cluedoc closes the loop by putting the docs in the same hands as the code: the agent writes and maintains human-readable documentation as it builds. Cluedoc defines a software system as a group of features and gives each feature one document. It writes those documents as academic papers (abstract, introduction, figures, related work) because that is a well-established way to explain a complex idea, and a nontrivial feature usually is one. The reader needs a single mental model to begin: a system is a hierarchy of features, each explained by one visual paper that links to the ones around it.

Related Work

Cluedoc's closest relative is the LLM-wiki pattern: an LLM incrementally builds and maintains a persistent, interlinked wiki instead of re-reading raw sources at query time. Cluedoc shares that bet (documentation as a living artifact an agent keeps current) and is best read as that pattern specialized for source code. Where LLM-wiki is domain-agnostic, leaving structure, format, and workflow to you, Cluedoc fixes all three for a codebase. It is driven by code changes rather than a curated corpus, following callers and callees to revise the right papers. It prescribes one shape (a capability tree, one paper per feature, the six-section academic form). And it stays in the loop, with the same agent updating docs as it edits code.

Nearer in tooling is DeepWiki-Open, an open-source AI wiki generator that clones a repo, indexes it into a vector database, and serves a browsable wiki (with diagrams and a RAG-backed "Ask" chat) from its own web app. It shares Cluedoc's aim but sits on the other side of the tooling boundary. Its wiki lives in an external app that you host and regenerate on demand, so it drifts from the code between regenerations. Cluedoc's papers are plain Markdown under .cluedoc/, versioned with the code and kept current in the loop, with no server and no vector store.

Cluedoc also contrasts with classic generators (Doxygen, Sphinx, JSDoc) that extract a reference from source and comments. Those describe symbols, one entry per function; Cluedoc explains features, one paper per capability, in language a designer would recognize. The skill itself is defined in SKILL.md, built on the open Agent Skills specification.

Description

One paper per feature, organized as a hierarchy

The unit of documentation is one feature. Features form a hierarchy (large features contain smaller sub-features), and Cluedoc mirrors that hierarchy in the folder structure: higher-level features live in higher folders, finer detail lives deeper. Everything sits in a .cluedoc/ folder at the repository root, where every feature is a folder and its paper is the README.md inside it.

.cluedoc/
├── README.md                  ← root paper: the whole system
├── authentication/
│   ├── README.md              ← the "Authentication" feature
│   ├── login/
│   │   └── README.md          ← sub-feature: "Login"
│   └── session-management/
│       ├── README.md          ← sub-feature: "Session Management"
│       └── token-refresh/
│           └── README.md      ← deeper still: "Token Refresh"
└── billing/
    └── README.md
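
The folder convention above is simple enough to walk mechanically. The following is a minimal sketch, not part of Cluedoc itself: `list_papers` and `render` are hypothetical helper names, and the only thing assumed from the text is that every feature is a folder under .cluedoc/ whose paper is the README.md inside it.

```python
from pathlib import Path


def list_papers(root: Path) -> list[tuple[int, str]]:
    """Return (depth, feature path) for every folder under root that holds a README.md."""
    features = [readme.parent.relative_to(root) for readme in root.rglob("README.md")]
    # Sorting by path parts keeps every parent feature ahead of its sub-features.
    features.sort(key=lambda rel: rel.parts)
    return [
        (len(rel.parts), rel.as_posix() if rel.parts else "(root)")
        for rel in features
    ]


def render(papers: list[tuple[int, str]]) -> str:
    """Indent each feature by its depth to show the capability tree as text."""
    return "\n".join("  " * depth + name.split("/")[-1] for depth, name in papers)
```

Calling `render(list_papers(Path(".cluedoc")))` prints one indented line per feature. Folders without a README.md are skipped, since under the stated convention they hold no paper.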

Splitting stops at the one-paper rule: divide a feature only when it has distinct sub-capabilities that each deserve their own hero visual; stop when a single focused paper says it all. It is a judgment about capability, never a mirror of the code's directory layout, so the "monorepo vs. single-package" question never comes up. One package or twenty, the docs are the same kind of capability tree.

It builds progressively, driven by your code, in both directions

Cluedoc does not document the whole repository in one pass. It works progressively: a single code change can ripple up and down the feature hierarchy, so Cluedoc updates parent and child papers alike. Upward, it scans where the changed code is used (its callers) to find the larger feature it belongs to; downward, it scans what the code uses (its callees) to find the collaborators worth documenting.
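
The two-directional scan can be pictured as a bounded walk over a dependency map. The sketch below is an illustration under stated assumptions rather than Cluedoc's actual procedure: the `uses` map (file to the files it calls), the `paper_of` map (file to the paper that covers it), and the `hops` limit are all hypothetical inputs. In practice the agent gathers this information by reading the code.

```python
def within_hops(start, edges, hops):
    """Nodes reachable from `start` by following at most `hops` edges."""
    seen = set(start)
    frontier = set(start)
    for _ in range(hops):
        frontier = {n for f in frontier for n in edges.get(f, ())} - seen
        seen |= frontier
    return seen


def affected_papers(changed, uses, paper_of, hops=1):
    """Papers touched by a change, looking both up (callers) and down (callees)."""
    # Invert "uses" (file -> callees) to get "used by" (file -> callers).
    used_by = {}
    for src, targets in uses.items():
        for target in targets:
            used_by.setdefault(target, set()).add(src)
    upward = within_hops(changed, used_by, hops)   # callers: the containing features
    downward = within_hops(changed, uses, hops)    # callees: the collaborators
    return sorted({paper_of[f] for f in upward | downward if f in paper_of})
```

Bounding the walk with `hops` is a design choice of this sketch. An unbounded walk upward from a widely used helper would flag nearly every paper, which defeats the point of a progressive update.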

Over many changes the tree fills in and sharpens from every direction. Early on it may be shallow; it deepens as features reveal both the structures that contain them and the collaborators they rely on. As features appear, move, and disappear, Cluedoc creates, splits, renames, and deletes folders to match, and re-checks that every cross-link still resolves.

Abstract prose, anchored to code

Papers are about the code but never contain it. The prose stays abstract and human, with no snippets, no symbols, no file paths. Concepts are named the way a user or designer would (the option's value source, the parse loop), not the way the compiler would. The link to the implementation lives in structured metadata instead: a sources list in the frontmatter, kept at the granularity of files so it survives ordinary refactors.

title: Option Parsing
sources:
  - lib/option.js
  - lib/command.js   # the parse loop

That is the only place raw paths appear, which is exactly why the prose rarely changes when code moves.
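
Because the sources list is structured, a tool can read it without understanding the prose. The following is a minimal sketch that assumes the frontmatter is delimited by `---` lines, as is conventional for YAML frontmatter but not stated here. `read_sources` and `index_by_source` are hypothetical names, and the hand-rolled parsing handles only the simple list form shown above, not general YAML.

```python
def read_sources(paper_text: str) -> list[str]:
    """Extract the `sources` entries from a paper's frontmatter (assumed '---' delimited)."""
    lines = paper_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    sources, in_sources = [], False
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        if line.startswith("sources:"):
            in_sources = True
        elif in_sources and line.lstrip().startswith("- "):
            # Drop a trailing "# comment" and surrounding whitespace.
            entry = line.lstrip()[2:].split(" #", 1)[0].strip()
            if entry:
                sources.append(entry)
        elif line.strip() and not line[0].isspace():
            in_sources = False  # a new top-level key ends the list
    return sources


def index_by_source(papers: dict[str, str]) -> dict[str, list[str]]:
    """Map each source file to the papers that list it."""
    index: dict[str, list[str]] = {}
    for paper_path, text in papers.items():
        for src in read_sources(text):
            index.setdefault(src, []).append(paper_path)
    return index
```

With such an index, finding the papers to revisit after a file changes is a dictionary lookup, which is one reason file-level granularity is convenient.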

Every paper has the same shape

Each paper has the shape of the one you are reading: YAML frontmatter followed by six sections, always in the same order. It opens with a hero visual and leans on diagrams throughout, using Mermaid for structure and flow and terminal graphics for trees and layouts. A wall of prose never stands where a picture is clearer.

┌─ a Cluedoc paper ───────────────────────────────┐
│  frontmatter       title · sources              │
│  1  Hero visual    a diagram, before any prose  │
│  2  Abstract       what it is, and why          │
│  3  Introduction   the problem it solves        │
│  4  Related Work   links to other papers        │
│  5  Description    how it works · visual-heavy  │
│  6  Conclusion     the takeaway; where to go    │
└─────────────────────────────────────────────────┘
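
A fixed shape is easy to check mechanically. The sketch below is one way to do it, under assumptions the text does not confirm: that the five named sections appear as Markdown `#` headings with exactly these titles, and that the hero visual is a fenced block placed before the Abstract. `check_shape` is a hypothetical helper, not a Cluedoc command.

```python
import re

SECTIONS = ["Abstract", "Introduction", "Related Work", "Description", "Conclusion"]
FENCE = "`" * 3  # a Markdown code fence, built indirectly so this sketch can sit inside one
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def check_shape(body: str) -> list[str]:
    """Return a list of problems; an empty list means the paper has the expected shape."""
    problems = []
    # Keep only the headings that name a required section; sub-headings are ignored.
    found = [h for h in HEADING.findall(body) if h in SECTIONS]
    if found != SECTIONS:
        problems.append(f"sections missing or out of order: {found}")
    # The hero visual must come before any prose, so look only ahead of the Abstract.
    abstract = re.search(r"^#{1,6}[ \t]+Abstract[ \t]*$", body, flags=re.MULTILINE)
    before = body[: abstract.start()] if abstract else body
    if FENCE not in before:
        problems.append("no hero visual before the Abstract")
    return problems
```

The check is deliberately structural. It says nothing about whether the diagram is a good one, which remains a judgment for the writer.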

"Related Work" is the connective tissue. Every cross-paper link lives there (the parent above, the children below, and any non-adjacent paper that adds context), turning the docs into a citation graph you can traverse. One rule holds it together: link only papers that already exist, never a dead link to a planned one.

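The no-dead-links rule can be enforced by treating the papers as a graph and checking every edge. This is a minimal sketch assuming citations are written as relative Markdown links such as `[Login](login/README.md)`, a form the text does not spell out. `citations` and `dead_links` are hypothetical names, and for brevity the sketch scans the whole paper rather than only its Related Work section.

```python
import posixpath
import re

# Capture the target of a Markdown link, stopping before any "#fragment".
LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+)")


def citations(paper_path: str, text: str) -> set[str]:
    """Resolve each relative link in a paper to a normalized path inside .cluedoc/."""
    folder = posixpath.dirname(paper_path)
    resolved = set()
    for target in LINK.findall(text):
        if "://" in target:
            continue  # external URL, not a citation of another paper
        resolved.add(posixpath.normpath(posixpath.join(folder, target)))
    return resolved


def dead_links(papers: dict[str, str]) -> list[tuple[str, str]]:
    """Pairs (citing paper, missing target) for links that point at no existing paper."""
    return sorted(
        (path, target)
        for path, text in papers.items()
        for target in citations(path, text)
        if target not in papers
    )
```

Here `papers` maps each paper's path (relative to .cluedoc/) to its text. An empty result from `dead_links` means every citation resolves to a paper that already exists.
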
It also guides your reading

The papers are a map for reading, not only writing. Ask how something works (a feature, a flow, "where does X happen") and Cluedoc answers, then appends a short Reading Guide: the two to five papers most worth reading, in a suggested order, each with a clause on why it matters to your question. (It kicks in only once a .cluedoc/ folder exists, and never invents a paper that isn't there.)

Getting started

One command. The skills CLI auto-detects your agent and installs Cluedoc for it:

$ npx skills add KeunwooPark/cluedoc

Prefer to do it by hand? See the manual install.

Cluedoc is an Agent Skill, so it runs through your coding agent rather than as a background daemon; there is no file watcher or git hook. Run it explicitly the first time, to bootstrap the tree:

$ /cluedoc

After that, the agent will often update the affected papers on its own as it edits code in a session; that is what the skill asks it to do. But that proactive pass is best-effort, not a guarantee: changes you make outside the agent, or turns where it simply doesn't reach for the skill, won't be picked up. When you want a sure sync (before a commit, or a full pass over the repository), call /cluedoc by name. For hands-off updates, wire it into a commit hook or CI step that invokes the skill against your changes.
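
How a hook would invoke the skill depends on your agent and is not specified here. What a hook can do on its own is flag papers that look stale. The following sketch is an assumption-laden illustration: `stale_papers` and `staged_files` are hypothetical helpers, and `sources_by_paper` stands for a mapping from each paper's repository path to the source files listed in its frontmatter. The only external call is `git diff --cached --name-only`, which lists the files staged for the next commit.

```python
import subprocess


def staged_files() -> set[str]:
    """Paths staged for the next commit, as reported by git."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    ).stdout
    return {line for line in out.splitlines() if line}


def stale_papers(changed: set[str], sources_by_paper: dict[str, list[str]]) -> list[str]:
    """Papers whose sources changed while the paper itself did not."""
    return sorted(
        paper
        for paper, sources in sources_by_paper.items()
        if paper not in changed and changed.intersection(sources)
    )
```

A pre-commit script could call `stale_papers(staged_files(), ...)` and, if the list is non-empty, stop the commit or prompt you to run /cluedoc. The check is only a heuristic: a source file can change without the paper needing a revision, since the prose is kept abstract on purpose.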

Conclusion

Cluedoc turns a codebase into a navigable body of knowledge: one visual paper per capability, organized as a tree, cross-linked like a bibliography, and kept honest by the code it tracks. Read it the way you would a research corpus: start at the root, follow the citations, stop when you understand. The natural next stop is a real one: browse the example docs and see the format on projects you already know.