Giving a coding agent readable context
The point: an agent re-reads your documents dozens of times per task — addresses, grep-ability and honest gaps pay compound interest.
Chat use reads a document once. An agent — a coding assistant grinding through a task with tools — comes back to the same material again and again: greps it, quotes it, re-opens it after compaction, cites it in a commit message. That access pattern changes what "good input" means, in three specific ways.
1 · Addresses beat descriptions
An agent that can say "per analysis.md, Cell [12]"
doesn't need to re-establish context each time — it navigates. This
is why converted notebooks keep stable cell addresses and why the
RAG preset emits <a id="slug"></a> anchors:
they turn prose into a coordinate system. Instructions to the agent
get shorter and more reliable too ("update the constants documented
in #configuration" is unambiguous; "the settings part
of the doc" is a dice roll).
2 · Grep-ability is a real requirement
Agents search files with literal tools before reading them. A
.docx is a ZIP — ungreppable; a PDF greps as glyph
soup; a notebook greps through JSON escapes. The same content as
Markdown is one grep -n away, and heading lines make
the search results self-locating. If your repo carries reference
documents agents must consult (specs, vendor docs, analysis
notebooks), a docs-md/ mirror of converted copies is
cheap infrastructure: one batch drop here produces it.
3 · Token discipline compounds
A human pays a document's token cost once; an agent pays it on every re-read across a long session, competing with its own working memory. The conversion properties that were nice-to-have in chat — base64 stripped, outputs truncated with a note, boilerplate gone — become budget policy. Our measured notebook case (3.5 MB → ~64K tokens) is the difference between "the agent can consult the analysis" and "the analysis evicts the agent's plan".
4 · The honest-gaps rule matters most for agents
A human notices when quoted text looks thin. An agent takes its
input as ground truth and acts — writes code against it,
asserts facts from it. That's why explicit markers
(Showing the first 50 of 4,812 rows, scanned-page
warnings, [Figure: …] placeholders) aren't cosmetic
here: they're the difference between an agent that says "the data
continues beyond my excerpt, I'll compute over the real file" and
one that hard-codes a wrong total. Silent loss turns into wrong
commits with surprising efficiency.
5 · A working recipe
- Batch-convert reference documents with the RAG preset (anchors + chunk comments double as agent navigation).
- Commit them to the repo the agent works in, filenames stable,
one
docs-md/directory. - In the agent's instructions, point to sections by anchor, and state the convention once: "documents in docs-md/ carry explicit truncation notes; trust them."
- Re-convert when sources change — conversion is deterministic, so diffs stay reviewable.
Build your docs-md/ mirror in one batch drop — local, instant, loss-accounted.