Context Lab
Start here · Why Markdown wins
What actually happens when a model reads your document — and why the input format decides the answer quality.
Format guides
The reference shelf — what's really inside each file type and the exact element mapping applied. All thirteen guides →
.ipynb .docx .pptx .xlsx .csv .pdf .html .json .srt .tex .eml .mbox .md
Under the hood
How the converter earns trust: structure detection element by element, and a fidelity report that never says "preserved".
-
01
What a converter built for LLMs does differently
Headings from styles, typed tables, figure placeholders — and a report of what was detected, not promised.
-
02
Making Jupyter notebooks LLM-addressable
Cell addresses, execution-order confessions, dependency hints — a build log with real numbers.
-
03
Tables LLMs can actually read
Column types, explicit truncation, serial dates, cached formulas — spreadsheets without invented data.
-
04
Which output preset should I use?
Pick by destination, not by file — the 30-second decision table with live sample links.
Real-world workflows
Measured, story-driven guides: what changes in Claude Projects, Perplexity Spaces and other AI workspaces when the uploads are Markdown.
-
01
Case study: lecture notes that made a smaller model smarter
Same workspace, same questions — the only change was PDF → Markdown. A first-person report with numbers.
-
02
Markdown is the best upload format for AI workspaces
Storage caps, token budgets, retrieval accuracy — three bottlenecks, one format change, measured.
-
03
Meeting recordings → minutes your model can quote
The caption track is the transcript — minutes with [mm:ss] receipts and a searchable archive.
Choosing tools
Honest comparisons — including the cases where the other tool wins.
-
01
vs pandoc
When you want the universal CLI, when you want the fidelity report.
-
02
vs online converter sites
"Upload" means your file lands on their servers. Here it does not, and you can check that for yourself.
-
03
vs just attaching the file
Platforms convert attachments with a hidden extractor. Converting first puts that step under your control.
-
04
vs cloud parsing APIs
They bring OCR at per-page prices; the local default handles the born-digital majority.
Advanced
For people building pipelines: retrieval systems, agent context, automation.
-
01
RAG-ready Markdown: chunks, anchors, honest citations
Most RAG quality problems are chunking problems. Fix retrieval at the source, before embedding.
-
02
Giving a coding agent readable context
Agents re-read documents dozens of times — addresses, grep-ability and honest gaps compound.
-
03
Obsidian as an LLM knowledge base
Callouts, wikilinked figures and front matter — documents becoming vault citizens.
-
04
Token budgeting 101
The estimate lies by ~20% on technical text — how to count, and where document tokens hide.
-
05
Build an LLM-readable wiki
Archive-preset Markdown in git: humans browse it, models retrieve it, diffs audit it.
-
06
Roadmap: a CLI, agent skills, stronger local OCR
Same parsers, new surfaces — terminal, agents, and model-based OCR on your CPU/GPU. Plans, not promises.
FAQ · Fixes for common errors
Hit an error? Find your symptom below, understand the mechanism, apply the two-minute fix — the converter's own error messages link straight here.
-
01
Claude Projects can't read your PDF?
Refusals, misquotes, wrong-file hunts — why PDF retrieval fails and the two-minute fix.
-
02
"I can't access this file"
The four real causes behind every upload error, and how to tell them apart.
-
03
Your notebook is too big to paste
3.5 MB of .ipynb is mostly base64 — beating the token limit with cell addresses intact.
-
04
Pasted a spreadsheet, got garbage
Whitespace collapse, serial dates, merged cells — and the table format that survives.
-
05
The model invents numbers from your data
Silent truncation and headerless values cause most of it — both fixable before the prompt.
-
06
Scanned PDFs — and the OCR fallback
Pictures of text contain no text — the diagnosis, your options, and the opt-in local OCR.
-
07
Copying from a webpage floods the chat with junk
Menus, banners and teasers ride along — article extraction keeps only the article.
-
08
Your workspace keeps missing half a document
Three retrieval failure modes — bad chunks, duplicate copies, ambiguous filenames.