Choosing tools · 04

MakeItMarkdown vs cloud parsing APIs

A different species of alternative: hosted document-intelligence APIs — the parsing services attached to RAG frameworks and cloud providers. You POST a document, they return structured Markdown/JSON, you pay per page. These are serious tools solving a harder problem than ours, and the honest comparison is about which problem you actually have.

1 · What the APIs genuinely add

2 · What you trade for it

3 · The decision, plainly

Your situation | Right tool
Scanned/photographed documents at volume | Cloud parsing API (that's the OCR you're paying for)
Automated ingestion service, thousands of files | Cloud parsing API (or pandoc in a worker for clean formats)
Born-digital files, interactive use, confidential material | MakeItMarkdown: free, local, instant, loss-accounted
Notebooks specifically | MakeItMarkdown (cell addresses and dependency hints don't exist elsewhere)
Feeding a personal/team workspace (Projects, Spaces) | MakeItMarkdown (the measured wins)

A pattern we like: use the local converter as the default path for everything born-digital (which is most of a typical corpus), and reserve the paid API for the scanned residue it's actually built for. The fidelity report tells you which pile each file belongs to — scans get flagged at drop time.
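
The split described above can be sketched in a few lines. This is an illustration only: the `FileReport` record, its `is_scan` flag, and the price are hypothetical stand-ins, not the fidelity report's real field names or any vendor's actual rate. It assumes you have already noted, per file, whether it was flagged as a scan.

```python
from dataclasses import dataclass


@dataclass
class FileReport:
    """Hypothetical per-file summary; field names are assumptions."""
    name: str
    pages: int
    is_scan: bool  # assumed to mirror the "flagged as scan" result at drop time


def triage(reports):
    """Split files into the local pile (born-digital) and the OCR pile (scans)."""
    local = [r for r in reports if not r.is_scan]
    ocr = [r for r in reports if r.is_scan]
    return local, ocr


def api_cost_cents(pile, cents_per_page):
    """Per-page billing: only pages actually sent to the paid API are metered."""
    return sum(r.pages for r in pile) * cents_per_page


# Made-up corpus for illustration.
corpus = [
    FileReport("handbook.docx", 40, False),
    FileReport("quarterly-report.pdf", 12, False),
    FileReport("signed-contract-scan.pdf", 6, True),
]

CENTS_PER_PAGE = 1  # placeholder value, not a real price

local_pile, ocr_pile = triage(corpus)
print("convert locally:", [r.name for r in local_pile])
print("send to OCR API:", [r.name for r in ocr_pile])
print("metered cents, scans only:", api_cost_cents(ocr_pile, CENTS_PER_PAGE))
print("metered cents, everything uploaded:", api_cost_cents(corpus, CENTS_PER_PAGE))
```

Keeping the price in whole cents avoids floating-point rounding in the totals. The useful comparison is the last two lines: the bill for the scanned residue against the bill for uploading the whole corpus.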

[Figure: file.pdf either stays on your machine (this tab · $0) or is uploaded to their servers (their OCR · $ per page).]
The APIs' OCR is real — the trade is that your file crosses the line, and the meter runs per page.

Sort your corpus in one batch drop: clean files convert now, scans get flagged for the OCR pile.