Feeding LaTeX sources (.tex) to an LLM
LaTeX sources are the native format of papers, theses and course
notes — and the best possible input when you have it, because the
structure a PDF destroys is still explicit in the .tex.
Converting the source beats OCR-ing its output every time.
The element mapping
| In the .tex | In the Markdown |
|---|---|
| \section / \subsection | ## / ### outline |
| \title, \author, abstract | Document title, authors, an Abstract section |
| equation / align environments | $$ … $$ fences, math body verbatim |
| tabular | GFM pipe tables |
| \includegraphics + \caption | [Figure: name.png] placeholder with caption |
| \cite / \ref | [key] / (ref: label), readable citation anchors |
| thebibliography | A References section, keyed |
| Preamble (packages, macros) | Dropped, with a macro-count warning |
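
To make a few rows of the mapping concrete, here is a minimal regex-based sketch in Python. It is an illustration only, not the converter's actual implementation: the function name `tex_to_markdown` is hypothetical, and it covers only headings, `equation` environments, `\cite` and `\ref`. It assumes simple, unnested arguments.

```python
import re


def tex_to_markdown(tex: str) -> str:
    """Hypothetical sketch: apply a handful of the mapping rows with regexes."""
    out = tex
    # Drop \label{...}; the Markdown heading carries the structure instead.
    out = re.sub(r"\\label\{[^}]*\}", "", out)
    # \section / \subsection -> ## / ### (starred forms included).
    out = re.sub(r"\\section\*?\{([^}]*)\}", r"## \1", out)
    out = re.sub(r"\\subsection\*?\{([^}]*)\}", r"### \1", out)
    # equation environments -> $$ fences; the math body is left verbatim.
    out = re.sub(r"\\begin\{equation\*?\}", "$$", out)
    out = re.sub(r"\\end\{equation\*?\}", "$$", out)
    # A non-breaking tie (~) before \cite or \ref becomes a plain space.
    out = re.sub(r"~(?=\\(?:cite|ref)\{)", " ", out)
    # \cite{key} -> [key]; \ref{label} -> (ref: label).
    out = re.sub(r"\\cite\{([^}]*)\}", r"[\1]", out)
    out = re.sub(r"\\ref\{([^}]*)\}", r"(ref: \1)", out)
    return out
```

Regexes are enough for a sketch like this, but they break on nested braces and on custom macros, which is why the mapping is described per element rather than as a full TeX parse.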
Before → after
In the file
\section{Method}\label{sec:method}
Throughput $T$ follows
\begin{equation}
T = \frac{U}{\Delta t} \cdot \eta
\end{equation}
as shown in~\cite{lee2024}.
In the Markdown
## Method
Throughput $T$ follows
$$
T = \frac{U}{\Delta t} \cdot \eta
$$
as shown in [lee2024].
Honest limits
- Not a TeX engine. Custom macros (\newcommand) are not expanded: text using them converts literally, and the fidelity report counts the definitions it saw.
- Math is preserved, not rendered: bodies stay verbatim inside $$ fences, which is exactly what LLMs read best anyway.
- Figure images are referenced by path, not extracted: the .tex doesn't contain them.
- Multi-file projects (\input/\include) convert one file at a time; batch-drop the parts.
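
Two of these limits can be checked before you convert anything. The sketch below is a hypothetical pre-flight helper, not part of the converter: it counts macro definitions (text using them will convert literally) and lists the files pulled in by `\input`/`\include` (the parts you would batch-drop). The name `preflight`, the report keys, and the choice to count `\renewcommand` alongside `\newcommand` are all assumptions made for illustration.

```python
import re

# \newcommand and \renewcommand, optionally starred (counting \renewcommand is an assumption).
MACRO_DEF = re.compile(r"\\(?:re)?newcommand\*?")
# Braced \input{...} / \include{...}; does not match \includegraphics or \includeonly.
INCLUDE = re.compile(r"\\(?:input|include)\{([^}]*)\}")


def preflight(tex: str) -> dict:
    """Hypothetical check: what will convert literally, and which parts are missing."""
    # Strip TeX comments (an unescaped % to end of line) so commented-out code is ignored.
    body = re.sub(r"(?<!\\)%.*", "", tex)
    return {
        "macro_definitions": len(MACRO_DEF.findall(body)),
        "included_files": INCLUDE.findall(body),
    }


if __name__ == "__main__":
    sample = r"""\newcommand{\R}{\mathbb{R}}
\input{intro}
\include{chapters/method}"""
    print(preflight(sample))
```

The brace-less form `\input intro` is not detected by this sketch, and macros defined with `\def` are not counted.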
FAQ
Why not just paste the PDF? Because the PDF is the painted output; the .tex is the structure itself.
BibTeX .bib files? Not yet. thebibliography environments convert; to request .bib support, send a sample file.
Drop a .tex and check the outline, fenced math and keyed references.