MakeItMarkdown vs cloud parsing APIs
A different species of alternative: hosted document-intelligence APIs — the parsing services attached to RAG frameworks and cloud providers. You POST a document, they return structured Markdown/JSON, you pay per page. These are serious tools solving a harder problem than ours, and the honest comparison is about which problem you actually have.
1 · What the APIs genuinely add
- OCR and layout models. Scanned pages, photographed documents, complex multi-column layouts, visual table reconstruction — ML-powered extraction that no rule-based local parser matches on hostile input.
- Scale. Ten thousand documents, queued, retried, webhooked.
- Pipeline integration. An HTTP call fits an ingestion service; a browser tab doesn't.
2 · What you trade for it
- Custody. Every document is uploaded to, and processed on, the provider's infrastructure: fine for public docs, a real question for client or unpublished material.
- Cost with volume. Per-page pricing is cheap until the corpus is large or reprocessed often.
- Opacity. Model-based extraction can hallucinate structure — a reconstructed table can contain cells the page never had, and you'll rarely get a per-file account of confidence or loss.
- A dependency. Keys, quotas, deprecations, latency.
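To make the cost point concrete, here is a back-of-the-envelope sketch. The per-page rate and corpus sizes are made-up illustration values, not any provider's actual pricing, and `api_cost` is a hypothetical helper name.

```python
def api_cost(pages: int, price_per_page: float, passes: int = 1) -> float:
    """Total spend for parsing `pages` pages `passes` times at a flat per-page rate.

    All inputs are assumptions you supply; real pricing often has tiers,
    minimums, or per-feature surcharges that this sketch ignores.
    """
    if pages < 0 or price_per_page < 0 or passes < 1:
        raise ValueError("pages and price must be non-negative, passes at least 1")
    # Round to cents so floating-point noise does not leak into the estimate.
    return round(pages * price_per_page * passes, 2)


if __name__ == "__main__":
    rate = 0.003  # hypothetical dollars per page

    # A small one-off job: 200 documents of about 10 pages each.
    print(api_cost(200 * 10, rate))

    # A large corpus reprocessed four times as the pipeline changes.
    print(api_cost(50_000 * 20, rate, passes=4))
```

The second call is the one that bites: the per-page rate is unchanged, but page count and reprocessing passes multiply together.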
3 · The decision, plainly
| Your situation | Right tool |
|---|---|
| Scanned/photographed documents at volume | Cloud parsing API (that's the OCR you're paying for) |
| Automated ingestion service, thousands of files | Cloud parsing API (or pandoc in a worker for clean formats) |
| Born-digital files, interactive use, confidential material | MakeItMarkdown — free, local, instant, loss-accounted |
| Notebooks specifically | MakeItMarkdown (cell addresses and dependency hints don't exist elsewhere) |
| Feeding a personal/team workspace (Projects, Spaces) | MakeItMarkdown (this is where its measured wins are) |
A pattern we like: use the local converter as the default path for everything born-digital (which is most of a typical corpus), and reserve the paid API for the scanned residue it's actually built for. The fidelity report tells you which pile each file belongs to — scans get flagged at drop time.
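If you script around that split, the routing step could look like the minimal sketch below. The `ReportEntry` shape and its `flagged_as_scan` field are hypothetical stand-ins for whatever the fidelity report actually exposes. Adapt the names to the real output.

```python
from dataclasses import dataclass


@dataclass
class ReportEntry:
    # Hypothetical record: one row per dropped file.
    path: str
    flagged_as_scan: bool  # assumed flag; the real report may name this differently


def split_piles(entries: list[ReportEntry]) -> tuple[list[str], list[str]]:
    """Return (local_pile, ocr_pile) as lists of file paths."""
    local_pile: list[str] = []
    ocr_pile: list[str] = []
    for entry in entries:
        # Scans go to the paid OCR path; everything else stays local.
        target = ocr_pile if entry.flagged_as_scan else local_pile
        target.append(entry.path)
    return local_pile, ocr_pile


if __name__ == "__main__":
    report = [
        ReportEntry("spec.docx", flagged_as_scan=False),
        ReportEntry("contract_scan.pdf", flagged_as_scan=True),
        ReportEntry("analysis.ipynb", flagged_as_scan=False),
    ]
    local, ocr = split_piles(report)
    print("convert locally:", local)
    print("send to OCR API:", ocr)
```

Only the `ocr` list ever needs to leave your machine, which keeps both the custody question and the per-page bill confined to the files that genuinely need OCR.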
Sort your corpus in one batch drop: clean files convert now, scans get flagged for the OCR pile.