Advanced · 02

Giving a coding agent readable context

The point: an agent re-reads your documents dozens of times per task — addresses, grep-ability and honest gaps pay compound interest.

Chat use reads a document once. An agent (a coding assistant grinding through a task with tools) comes back to the same material again and again: it greps it, quotes it, re-opens it after compaction, cites it in a commit message. That access pattern changes what "good input" means, in four specific ways.

1 · Addresses beat descriptions

An agent that can say "per analysis.md, Cell [12]" doesn't need to re-establish context each time; it navigates. This is why converted notebooks keep stable cell addresses and why the RAG preset emits <a id="slug"></a> anchors: they turn prose into a coordinate system. Instructions to the agent get shorter and more reliable too: "update the constants documented in #configuration" is unambiguous, while "the settings part of the doc" is a dice roll.

[Figure: a description ("the latency part…?") versus an address (#latency · Cell [12]).]
A description sends the agent re-reading the whole file; an address lands on the line.
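
To make the idea of "navigating by address" concrete, here is a minimal Python sketch of resolving an anchor to its section. It assumes each <a id="slug"></a> anchor sits on its own line directly before the section it names; the function name section_for_anchor is hypothetical and not part of any converter's API.

```python
import re

# Matches anchors of the form <a id="slug"></a>, as described in the text.
ANCHOR = re.compile(r'<a id="([^"]+)"></a>')


def section_for_anchor(markdown: str, slug: str) -> str:
    """Return the text from the named anchor up to the next anchor (or end of file).

    Assumption: an anchor line marks the start of the section it names.
    """
    lines = markdown.splitlines()
    start = None
    for i, line in enumerate(lines):
        match = ANCHOR.search(line)
        if match is None:
            continue
        if start is not None:
            # We were already inside the requested section; the next anchor ends it.
            return "\n".join(lines[start:i]).strip()
        if match.group(1) == slug:
            start = i
    if start is None:
        raise KeyError(f"no anchor named {slug!r}")
    return "\n".join(lines[start:]).strip()
```

With a helper like this, an instruction such as "update the constants documented in #configuration" maps to one bounded slice of the file rather than a full re-read.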

2 · Grep-ability is a real requirement

Agents search files with literal tools before reading them. A .docx is a ZIP — ungreppable; a PDF greps as glyph soup; a notebook greps through JSON escapes. The same content as Markdown is one grep -n away, and heading lines make the search results self-locating. If your repo carries reference documents agents must consult (specs, vendor docs, analysis notebooks), a docs-md/ mirror of converted copies is cheap infrastructure: one batch drop here produces it.
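
As an illustration of "self-locating" search results, the sketch below does in Python what grep -n does over a Markdown mirror, and additionally reports the nearest heading above each hit. The function names are hypothetical, and the heading detection is deliberately simple: it treats any line starting with one to six # characters as a heading, so a # comment inside a code fence would be misread.

```python
import re
from pathlib import Path

# ATX-style Markdown heading: one to six '#' followed by whitespace and a title.
HEADING = re.compile(r"^#{1,6}\s+(.*)")


def grep_with_headings(text: str, needle: str):
    """Yield (line_number, nearest_heading, line) for each line containing needle."""
    heading = None
    for number, line in enumerate(text.splitlines(), start=1):
        match = HEADING.match(line)
        if match:
            heading = match.group(1).strip()
        if needle in line:
            yield number, heading, line.strip()


def grep_mirror(root: str, needle: str) -> None:
    """Print every hit under root (e.g. a docs-md/ directory) with its heading."""
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, heading, line in grep_with_headings(text, needle):
            print(f"{path}:{number}: [{heading}] {line}")
```

Calling grep_mirror("docs-md", "retry_limit") would list each matching line together with the file, line number and section it lives in.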

3 · Token discipline compounds

A human pays a document's token cost once; an agent pays it on every re-read across a long session, competing with its own working memory. The conversion properties that were nice-to-have in chat — base64 stripped, outputs truncated with a note, boilerplate gone — become budget policy. Our measured notebook case (3.5 MB → ~64K tokens) is the difference between "the agent can consult the analysis" and "the analysis evicts the agent's plan".
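
The arithmetic behind "compounds" can be sketched in a few lines of Python. The ~64K figure comes from the case above; the re-read count, the context window size, the reserved working budget and the four-characters-per-token heuristic are all illustrative assumptions, not measurements.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; real counts depend on the tokenizer in use."""
    return round(len(text) / chars_per_token)


def session_cost(doc_tokens: int, rereads: int) -> int:
    """Total tokens spent on one document if it is re-read `rereads` times."""
    return doc_tokens * rereads


def fits(doc_tokens: int, context_window: int, reserved_for_work: int) -> bool:
    """True if the document fits next to the budget reserved for plan and code."""
    return doc_tokens <= context_window - reserved_for_work


# The ~64K-token notebook from the text, re-read 12 times (hypothetical count).
total = session_cost(64_000, rereads=12)  # 768_000 tokens over the session

# Hypothetical 200K-token window with 100K reserved for the agent's own work.
ok = fits(64_000, context_window=200_000, reserved_for_work=100_000)
```

The per-read cost is what decides whether the document can sit in context at all; the session total is what the re-read pattern multiplies.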

4 · The honest-gaps rule matters most for agents

A human notices when quoted text looks thin. An agent takes its input as ground truth and acts on it: it writes code against it and asserts facts from it. That's why explicit markers ("Showing the first 50 of 4,812 rows", scanned-page warnings, [Figure: …] placeholders) aren't cosmetic here: they're the difference between an agent that says "the data continues beyond my excerpt, I'll compute over the real file" and one that hard-codes a wrong total. Silent loss turns into wrong commits with surprising efficiency.
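
A short Python sketch shows how tooling around the agent could act on such a marker instead of ignoring it. It assumes the row-truncation note has exactly the wording quoted above; the regular expression and the function name find_truncations are illustrative, not the converter's documented format.

```python
import re

# Assumed wording of the truncation note, taken from the example in the text.
TRUNCATION = re.compile(r"Showing the first ([\d,]+) of ([\d,]+) rows")


def find_truncations(markdown: str) -> list[tuple[int, int, int]]:
    """Return (line_number, rows_shown, rows_total) for each truncation note."""
    results = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = TRUNCATION.search(line)
        if match:
            shown = int(match.group(1).replace(",", ""))
            total = int(match.group(2).replace(",", ""))
            results.append((number, shown, total))
    return results
```

Any hit where rows_shown is smaller than rows_total is a signal that totals and aggregates must be computed from the source file, not from the excerpt.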

5 · A working recipe

  1. Batch-convert reference documents with the RAG preset (anchors + chunk comments double as agent navigation).
  2. Commit them to the repo the agent works in, filenames stable, one docs-md/ directory.
  3. In the agent's instructions, point to sections by anchor, and state the convention once: "documents in docs-md/ carry explicit truncation notes; trust them."
  4. Re-convert when sources change — conversion is deterministic, so diffs stay reviewable.
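
Step 4 needs some way to notice that a source has changed. The Python sketch below is one possible check, under two stated assumptions: converted copies are named after the source file with ".md" appended (a hypothetical convention), and file modification times are meaningful, which is not guaranteed after a fresh git checkout. It only reports what is stale; running the conversion itself is left to whatever tool you use.

```python
from pathlib import Path


def stale_sources(source_dir: str, mirror_dir: str) -> list[Path]:
    """Return source files whose converted copy is missing or older than the source.

    Assumes the mirror of "spec.docx" is "<mirror_dir>/spec.docx.md".
    """
    stale = []
    for source in sorted(Path(source_dir).iterdir()):
        if not source.is_file():
            continue
        mirror = Path(mirror_dir) / (source.name + ".md")
        if not mirror.exists() or mirror.stat().st_mtime < source.stat().st_mtime:
            stale.append(source)
    return stale
```

A content hash stored next to each converted file would be a more robust trigger than modification times, at the cost of a little more bookkeeping.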

Build your docs-md/ mirror in one batch drop — local, instant, loss-accounted.