Obsidian as an LLM knowledge base
An Obsidian vault has quietly become the best personal infrastructure for LLM work: a folder of plain Markdown, linked into a graph, fully local. Every AI integration that reads vaults — and every copy-paste into a chat — inherits whatever quality your notes have. Which makes the conversion question concrete: when a PDF, notebook or report enters your vault, what should it become?
1 · Why the vault shape fits LLMs
- It's already Markdown — the model's native register, no extraction step between your notes and any assistant.
- Links are structure. `[[wikilinks]]` encode the relations retrieval wishes it could infer; an assistant following your links reads your knowledge graph, not just your files.
- Local-first matches confidential work: the vault never leaves your disk, and neither does conversion here.
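To make "links are structure" concrete, here is a minimal sketch of how a tool might turn wikilinks into a graph. The function names (`extract_links`, `build_graph`) and the in-memory `notes` dictionary are illustrative assumptions, not part of any Obsidian API. The regular expression covers only the common `[[Target]]`, `[[Target|alias]]` and `[[Target#Heading]]` forms and skips `![[...]]` embeds.

```python
import re
from collections import defaultdict

# Matches [[Target]], [[Target|alias]] and [[Target#Heading]].
# The negative lookbehind skips embeds such as ![[figures/chart_1.png]].
WIKILINK = re.compile(r"(?<!!)\[\[([^\]\|#]+)(?:#[^\]\|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(markdown: str) -> list[str]:
    """Return link targets in order of appearance, without aliases or heading anchors."""
    return [m.group(1).strip() for m in WIKILINK.finditer(markdown)]


def build_graph(notes: dict[str, str]) -> dict[str, set[str]]:
    """Map each note name to the set of note names it links to (hypothetical helper)."""
    graph = defaultdict(set)
    for name, text in notes.items():
        graph[name].update(extract_links(text))
    return dict(graph)
```

An assistant with this graph in hand can pull in a note's neighbours alongside the note itself, rather than guessing relatedness from text similarity alone.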
2 · What the Obsidian preset emits
| Element | Output |
|---|---|
| Document metadata | YAML front matter (title, source file, date) — Dataview-queryable |
| Overview / summary sections | > [!abstract] callouts |
| Method-like sections | > [!info] / > [!warning] callouts by role (limitations get warnings) |
| Figures | ![[figures/chart_1.png]] embeds — drop the .zip's figures folder into the vault and they render |
| Notebook cells | Kept as plain sections (cell headers stay addressable, not callout-wrapped) |
The callout mapping is heuristic and says so — it styles the note for scanning; the underlying text is untouched.
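The shape of such a note can be sketched in a few lines. This is an illustration of the elements in the table, not the preset's actual implementation. The function names, the exact front matter keys (`title`, `source`, `date`) and the choice of which section gets which callout are assumptions made for the example.

```python
def yaml_str(value: str) -> str:
    """Quote a string for YAML, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def callout(kind: str, text: str) -> str:
    """Wrap text in an Obsidian callout: a blockquote whose first line is [!kind]."""
    body = "\n".join(f"> {line}" if line else ">" for line in text.splitlines())
    return f"> [!{kind}]\n{body}"


def render_note(title, source, date, abstract, limitations, figures):
    """Assemble front matter, two callouts and figure embeds into one note (hypothetical layout)."""
    front_matter = "\n".join([
        "---",
        f"title: {yaml_str(title)}",
        f"source: {yaml_str(source)}",
        f"date: {date}",
        "---",
    ])
    parts = [
        front_matter,
        callout("abstract", abstract),    # overview / summary section
        callout("warning", limitations),  # limitations get warnings
    ]
    parts += [f"![[figures/{name}]]" for name in figures]
    return "\n\n".join(parts) + "\n"
```

Calling `render_note("Q3 report", "q3.pdf", "2024-01-01", "Summary line", "Small sample", ["chart_1.png"])` yields a note whose text is the input text with styling added around it, which is the property the callout mapping is meant to preserve.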
3 · Three vault workflows this unlocks
- The reading inbox. Saved articles (.html) convert to notes with the chrome already stripped — your highlights-and-links pass starts from clean text.
- The analysis archive. Notebooks convert with cell addresses, so a note can cite `Cell [12]` and a future chat about that note can too.
- The AI-readable reference shelf. Specs, reports and transcripts (.srt meeting captions included) become vault citizens that any assistant with vault access can retrieve properly: the same chunking logic as hosted workspaces, but on your disk.
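As a rough sketch of the cell-address idea, the snippet below walks a Jupyter notebook's JSON and emits one addressable section per code cell. The `cells`, `cell_type`, `source` and `execution_count` fields are part of the standard .ipynb format. The `## Cell [n]` heading style, the function name and the assumption that code cells are Python are choices made for this example, not a description of the converter's output.

```python
import json

FENCE = "`" * 3  # built at runtime so this example does not nest code fences


def notebook_to_sections(ipynb_json: str) -> str:
    """Turn notebook JSON into Markdown with one 'Cell [n]' section per code cell."""
    nb = json.loads(ipynb_json)
    sections = []
    for index, cell in enumerate(nb.get("cells", []), start=1):
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb stores source as a list of lines
            source = "".join(source)
        if cell.get("cell_type") == "code":
            # Fall back to the cell's position when it has never been executed.
            number = cell.get("execution_count") or index
            sections.append(f"## Cell [{number}]\n\n{FENCE}python\n{source}\n{FENCE}")
        else:
            sections.append(source)  # markdown cells pass through unchanged
    return "\n\n".join(sections)
```

Because each code cell keeps a stable heading, a wikilink such as `[[analysis#Cell [12]]]` or a plain-text citation can point at the exact step being discussed.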
4 · Vault hygiene that pays off with models
- Descriptive filenames (they become link text and retrieval titles).
- One document, one note — split monsters by section if a note exceeds a few thousand tokens (budgeting guide).
- Keep the front matter: future-you will Dataview it, and assistants read `source:` to cite properly.
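The "one document, one note" rule can be checked mechanically. The sketch below is one possible approach, under two stated assumptions: token counts are approximated as four characters per token (a rough heuristic, not a real tokenizer), and sections are delimited by level-2 `## ` headings. The function names and the default budget of 3000 are illustrative.

```python
import re


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption, not a tokenizer)."""
    return len(text) // 4


def split_by_section(markdown: str, budget: int = 3000) -> list[str]:
    """Return the note whole if it fits the budget, else one chunk per '## ' section."""
    if estimate_tokens(markdown) <= budget:
        return [markdown]
    # Split just before each level-2 heading so every heading stays with its section.
    chunks = re.split(r"(?m)^(?=## )", markdown)
    return [chunk for chunk in chunks if chunk.strip()]
```

Each returned chunk can then be saved as its own note, ideally with the original front matter copied over so `source:` survives the split.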
Convert something with the Obsidian preset and drop it straight into your vault — front matter, callouts, figure embeds.