Under the hood · 01

What a converter built for LLMs does differently

Search "convert docx to markdown" and you'll find dozens of tools. Most share an unstated goal: make the output look like the original, for a human eyeball. That goal is why their output quietly fails when the reader is a language model. A model doesn't see how the page looks. It sees a token stream — and it needs different things from that stream than your eyes need from a page.

MakeItMarkdown is built around that difference. Here's what changes, element by element.

1 · Headings: semantics from styles, not from font size

In a .docx, "Heading 2" is a style object, and we map style names directly to ## levels. Converters that render the page visually and then transcribe it tend to emit big bold paragraphs — which a model reads as ordinary text. The document's outline, the single most useful retrieval structure it has, evaporates. Style-aware mapping keeps it: your section tree arrives as a real #/##/### hierarchy.

What Word stores

"Results"
  style: Heading 2
  (bold · 14 pt · spacing before)

What the model needs

## Results
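
A minimal sketch of style-aware heading mapping, using only the Python standard library. It assumes paragraphs reference their style through `w:pPr/w:pStyle` with ids such as `Heading2`, which is how WordprocessingML commonly stores built-in heading styles. The function names and the sample XML are illustrative, not MakeItMarkdown's actual code.

```python
import re
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def heading_level(style_id):
    """Return 1-6 for style ids like 'Heading2' or 'Heading 2', else None."""
    if not style_id:
        return None
    match = re.fullmatch(r"heading\s*([1-6])", style_id.strip(), re.IGNORECASE)
    return int(match.group(1)) if match else None


def paragraph_to_markdown(p):
    """Map one <w:p> element to a Markdown line, using its style, not its look."""
    style = p.find("w:pPr/w:pStyle", NS)
    style_id = style.get(f"{{{W}}}val") if style is not None else None
    text = "".join(t.text or "" for t in p.iterfind(".//w:t", NS))
    level = heading_level(style_id)
    return f"{'#' * level} {text}" if level else text


def convert(xml_text):
    """Convert every direct child paragraph of the body element."""
    body = ET.fromstring(xml_text)
    return [paragraph_to_markdown(p) for p in body.iterfind("w:p", NS)]


# Hypothetical sample: both paragraphs are bold, only the first is a heading.
SAMPLE = f"""<w:body xmlns:w="{W}">
  <w:p><w:pPr><w:pStyle w:val="Heading2"/></w:pPr>
    <w:r><w:rPr><w:b/></w:rPr><w:t>Results</w:t></w:r></w:p>
  <w:p><w:r><w:rPr><w:b/></w:rPr><w:t>Bold but not a heading</w:t></w:r></w:p>
</w:body>"""

print(convert(SAMPLE))
```

Both sample paragraphs are bold, but only the one carrying a heading style becomes `## Results`. Visual formatting alone never promotes text to a heading.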

2 · Tables: kept rectangular, typed, and honest about truncation

Tables are where converted documents lie most: merged cells, header rows demoted to data, thousands of rows silently clipped. Our table handling does three unusual things: it keeps tables rectangular, typed, and honest about truncation.
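
As a rough illustration of the "rectangular" and "honest about truncation" ideas, here is a small sketch that pads ragged rows to a common width and states explicitly when rows were cut. The function name, the `max_rows` parameter, and the wording of the truncation note are assumptions made for this example. Column typing is not covered.

```python
def table_to_markdown(rows, max_rows=50):
    """Render rows (first row = header) as a Markdown table.

    Short rows are padded so every row has the same number of cells, and any
    truncation is reported in the output instead of happening silently.
    """
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def clean(cell):
        # Escape pipes so cell content cannot break the table grid.
        return str(cell).replace("|", "\\|")

    padded = [[clean(c) for c in row] + [""] * (width - len(row)) for row in rows]
    header, body = padded[0], padded[1:]
    shown = body[:max_rows]

    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join(["---"] * width) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in shown]
    if len(body) > len(shown):
        lines += ["", f"[Table truncated: showing {len(shown)} of {len(body)} rows]"]
    return "\n".join(lines)


print(table_to_markdown([["a", "b"], ["1"], ["2", "3"], ["4", "5"]], max_rows=2))
```

The design choice worth noting is the last branch: the row limit is allowed, but the output always says how many rows were dropped.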

3 · Figures: placeholders, never base64 walls

Embedded images are the silent token bomb. One notebook we tested carried a 515 KB base64 screenshot inside a markdown cell. Pasted into a chat window, that's roughly 130,000 tokens of pure noise (at a rough four characters per token), enough to fill a 128K-token context window on its own, spent on one image the model can't even decode from text. We extract every embedded image to a real file and leave an explicit, addressable marker in the text:

[Figure: cell_12_figure_1.png]

The model sees that a figure exists, where it sits in the document, and can refer to it by name. Your token budget goes to words.
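
The extraction step can be sketched in a few lines. This version assumes images are embedded as Markdown `data:` URIs and reuses the `cell_N_figure_M` naming shown above. The function name and the returned dictionary of files are illustrative choices, not the product's API.

```python
import base64
import re

# Matches ![alt](data:image/<type>;base64,<payload>) in Markdown text.
DATA_URI = re.compile(
    r"!\[[^\]]*\]\(data:image/(png|jpeg|gif);base64,([A-Za-z0-9+/=\s]+)\)"
)


def extract_figures(markdown, cell_index):
    """Replace embedded base64 images with [Figure: name] markers.

    Returns the rewritten text and a dict mapping file name -> decoded bytes,
    which the caller can then write to disk.
    """
    files = {}

    def replace(match):
        ext = "jpg" if match.group(1) == "jpeg" else match.group(1)
        name = f"cell_{cell_index}_figure_{len(files) + 1}.{ext}"
        files[name] = base64.b64decode(match.group(2))
        return f"[Figure: {name}]"

    return DATA_URI.sub(replace, markdown), files


payload = base64.b64encode(b"\x89PNG fake").decode()
text, files = extract_figures(f"Before ![shot](data:image/png;base64,{payload}) after", 12)
print(text)
```

For scale: 515 KB of base64 is 515 × 1024 = 527,360 characters, and at a rule-of-thumb four characters per token that is about 131,840 tokens, all replaced by a marker a few tokens long.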

4 · Notebooks: cells you can point at

A .ipynb is JSON wrapping code, outputs, images and metadata. Our notebook parser gives every cell a stable address (Cell [7]), flags execution-order anomalies (the classic "ran cell 40 before cell 12" state that makes results unreproducible), and adds approximate dependency hints between cells, so a model can trace where a variable came from without you pasting the whole notebook twice. The full build log, with real numbers from a 100-cell notebook, is in "Making Jupyter notebooks LLM-addressable".
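
To make those three ideas concrete, here is a hedged sketch that works on a parsed notebook (the dict you get from `json.load` on a .ipynb file). It assumes a cell's address is its position in the notebook, and it approximates dependencies by comparing names assigned and names read in each cell via `ast`. Imports, function definitions, and cells containing magics are ignored in this version, so treat the hints as rough.

```python
import ast


def code_cells(nb):
    """Yield (address, execution_count, source) for each code cell."""
    for index, cell in enumerate(nb["cells"]):
        if cell["cell_type"] != "code":
            continue
        source = cell["source"]
        if isinstance(source, list):  # nbformat allows a list of lines
            source = "".join(source)
        yield f"Cell [{index}]", cell.get("execution_count"), source


def execution_order_warnings(nb):
    """Flag cells whose execution count is lower than an earlier cell's."""
    warnings, last_addr, last_count = [], None, None
    for addr, count, _ in code_cells(nb):
        if count is None:  # never executed
            continue
        if last_count is not None and count < last_count:
            warnings.append(
                f"{addr} (execution {count}) appears after "
                f"{last_addr} (execution {last_count}) but ran earlier"
            )
        last_addr, last_count = addr, count
    return warnings


def names(source):
    """Return (assigned, read) variable names; empty sets if unparsable."""
    try:
        tree = ast.parse(source)
    except SyntaxError:  # e.g. cells with IPython magics
        return set(), set()
    stored, loaded = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            (stored if isinstance(node.ctx, ast.Store) else loaded).add(node.id)
    return stored, loaded


def dependency_hints(nb):
    """Return (cell, name, defining_cell) triples, in document order."""
    defined_in, hints = {}, []
    for addr, _, source in code_cells(nb):
        stored, loaded = names(source)
        for name in sorted(loaded):
            if name in defined_in:
                hints.append((addr, name, defined_in[name]))
        for name in stored:
            defined_in[name] = addr
    return hints


# Hypothetical three-cell notebook with an out-of-order execution.
SAMPLE_NB = {"cells": [
    {"cell_type": "markdown", "source": "# Demo"},
    {"cell_type": "code", "execution_count": 1, "source": "df = 1"},
    {"cell_type": "code", "execution_count": 40, "source": ["total = df + 1"]},
    {"cell_type": "code", "execution_count": 12, "source": "print(total)"},
]}

print(execution_order_warnings(SAMPLE_NB))
print(dependency_hints(SAMPLE_NB))
```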

5 · The fidelity report: "detected", never "preserved"

This is the part we consider non-negotiable, and the part most converters simply don't have. Every conversion here returns three panels: the original, the Markdown, and a fidelity report — counts of tables, figures, equations and code cells the parser detected, every warning it accumulated, and a weighted structural quality score.

The fidelity report with the QC breakdown expanded — detected counts below.

The wording is deliberate. "Preserved" is a promise nobody converting real-world files can keep; "detected" is a measurement. If your document contains four tables and the report says two, you've learned something vital before the model hallucinates around the missing half. Silent loss becomes visible loss. That's the whole trust model of the product, and it's why the report sits on equal footing with the output itself.
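
One way to picture such a report as a data structure is sketched below. Everything here is hypothetical: the field names, the idea of per-check values between 0 and 1, and the weights are assumptions for illustration, and the actual scoring formula is not described in this post.

```python
from dataclasses import dataclass, field


@dataclass
class FidelityReport:
    """Hypothetical report: detected counts, warnings, and scored checks."""
    detected: dict = field(default_factory=dict)  # e.g. {"tables": 2}
    warnings: list = field(default_factory=list)
    checks: dict = field(default_factory=dict)    # check name -> value in [0, 1]

    def detect(self, kind, n=1):
        """Record that the parser detected n more elements of this kind."""
        self.detected[kind] = self.detected.get(kind, 0) + n

    def warn(self, message):
        self.warnings.append(message)

    def score(self, weights):
        """Weighted mean of the check values; None if no check has weight."""
        total = sum(weights.get(name, 0) for name in self.checks)
        if total == 0:
            return None
        weighted = sum(weights.get(name, 0) * value
                       for name, value in self.checks.items())
        return weighted / total

    def summary(self):
        """Lines that say 'detected', never 'preserved'."""
        lines = [f"{kind} detected: {n}" for kind, n in sorted(self.detected.items())]
        lines += [f"warning: {message}" for message in self.warnings]
        return lines


report = FidelityReport()
report.detect("tables", 2)
report.detect("figures")
report.warn("table 2: merged cells expanded")
report.checks = {"tables": 0.5, "headings": 1.0}
print(report.summary())
print(report.score({"tables": 3, "headings": 1}))
```

The report only states what the parser found. Comparing "tables detected: 2" against the four tables you know are in the source is left to the reader, which is the point of the wording.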


6 · What we refuse to do

7 · One structure, many targets

Because every parser emits the same internal structure, the output presets are cheap and consistent: token-lean Chat paste, RAG with chunk boundaries and stable anchors, Obsidian with callouts and wikilinks, and a faithful Archive with full frontmatter. Same detected structure, four disciplines of output — pick per destination, not per file.
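
A toy version of "one structure, many targets" might look like the sketch below. The `Block` type, the preset names as dictionary keys, and the exact output of each renderer are assumptions for illustration. The Obsidian preset here only shows an embed link, and the RAG anchor is simply a slug of the heading text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical internal structure shared by every parser."""
    kind: str       # "heading", "paragraph", or "figure"
    text: str
    level: int = 0  # heading depth; unused for other kinds


def base(block):
    """Default Markdown for one block, shared by all presets."""
    if block.kind == "heading":
        return "#" * block.level + " " + block.text
    if block.kind == "figure":
        return f"[Figure: {block.text}]"
    return block.text


def render_chat(blocks, title):
    # Token-lean: single newlines, no metadata.
    return "\n".join(base(b) for b in blocks)


def render_rag(blocks, title):
    # Start a chunk at every heading, anchored by a slug of its text.
    out = []
    for b in blocks:
        if b.kind == "heading":
            slug = re.sub(r"[^a-z0-9]+", "-", b.text.lower()).strip("-")
            out.append(f"<!-- chunk: {slug} -->")
        out.append(base(b))
    return "\n".join(out)


def render_obsidian(blocks, title):
    # Figures become Obsidian embed links.
    return "\n\n".join(
        f"![[{b.text}]]" if b.kind == "figure" else base(b) for b in blocks
    )


def render_archive(blocks, title):
    # Frontmatter first, then the body with blank lines between blocks.
    return f"---\ntitle: {title}\n---\n\n" + "\n\n".join(base(b) for b in blocks)


PRESETS = {"chat": render_chat, "rag": render_rag,
           "obsidian": render_obsidian, "archive": render_archive}

DOC = [Block("heading", "Results", 2),
       Block("paragraph", "Accuracy improved."),
       Block("figure", "cell_12_figure_1.png")]

for name, render in PRESETS.items():
    print(f"--- {name} ---")
    print(render(DOC, "demo"))
```

Because every renderer takes the same list of blocks, adding a preset means adding one function, not touching any parser.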

Drop a file and read its fidelity report. If the report surprises you, that's the point.