Your notebook is too big to paste: beating the token limit

MakeItMarkdown · July 2026 · 5 min read

The point: your notebook isn't too big — its embedded output images are; the text a model can actually use fits comfortably.

"Message exceeds the maximum length." "File too large." Or the upload works and the model clearly saw only the first third. Jupyter notebooks hit input limits long before their content justifies it, and the reason is what a .ipynb actually stores.

1 · Where the megabytes live

A notebook file is JSON wrapping four things: your code, your prose, every output the cells ever produced, and metadata. The killers are outputs: each rendered chart is a base64-encoded PNG embedded as text — hundreds of KB each — and dataframe previews, progress bars and tracebacks pile up behind them. Metadata adds widget state and execution bookkeeping the model has no use for. In the notebook we use as a stress test, the meaningful text is about 7% of the bytes:

The bar is the file; the sliver is what a model can read. Conversion keeps the sliver and the addresses.

	Raw .ipynb	Converted Markdown
Size	3,465 KB	261 KB
Fits a chat input?	No	Yes (~64K tokens)
Cell addresses	—	kept (`Cell [7]`)
Figures	base64 walls	`[Figure: cell_5_output_1.png]` placeholders

One markdown cell in that notebook contained a single pasted screenshot worth ~130,000 tokens of base64 — a third of a large context window, spent on characters no model can even decode back into an image.
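To see the same split on your own notebook, a minimal sketch like the one below tallies bytes per category. It assumes the nbformat 4 JSON layout (`cells`, `source`, `outputs`, `attachments`, `metadata`); the function names `byte_breakdown` and `readable_share` are ours for illustration, not part of any tool, and the sizes are approximations based on re-serialized JSON rather than exact on-disk bytes.

```python
import json


def _size(obj) -> int:
    """Approximate serialized size in bytes; empty containers count as 0."""
    if not obj:
        return 0
    return len(json.dumps(obj).encode("utf-8"))


def _text(value) -> str:
    """nbformat stores multi-line strings as a str or a list of str."""
    return "".join(value) if isinstance(value, list) else (value or "")


def byte_breakdown(nb: dict) -> dict:
    """Tally bytes per category for a parsed nbformat-4 notebook."""
    totals = {
        "code": 0,
        "prose": 0,        # markdown and raw cells
        "outputs": 0,      # charts, dataframe previews, tracebacks
        "attachments": 0,  # images pasted into markdown cells
        "metadata": _size(nb.get("metadata")),
    }
    for cell in nb.get("cells", []):
        kind = "code" if cell.get("cell_type") == "code" else "prose"
        totals[kind] += len(_text(cell.get("source")).encode("utf-8"))
        totals["outputs"] += _size(cell.get("outputs"))
        totals["attachments"] += _size(cell.get("attachments"))
        totals["metadata"] += _size(cell.get("metadata"))
    return totals


def readable_share(totals: dict) -> float:
    """Fraction of bytes that is code or prose a model can actually read."""
    total = sum(totals.values())
    return (totals["code"] + totals["prose"]) / total if total else 0.0


# Tiny in-memory demo: one code cell with a fake base64 "chart".
demo = {
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "source": ["df.plot()\n"],
            "metadata": {},
            "outputs": [
                {"output_type": "display_data",
                 "data": {"image/png": "A" * 5000}, "metadata": {}}
            ],
        }
    ],
}
totals = byte_breakdown(demo)
print(totals, f"{readable_share(totals):.1%} readable")
```

For a real file, load it with `json.load(open(path, encoding="utf-8"))` and pass the result in. Images pasted as inline `data:` URIs sit inside the markdown source itself, so this sketch would count them as prose rather than attachments.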

2 · What conversion keeps (this is the part that matters)

Shrinking is easy: jupyter nbconvert --to script strips outputs too, but it throws away everything else you care about along with them. The point is what survives:

Cell addresses — every cell keeps a stable label (Cell [12]), so you and the model can point at code precisely;
Outputs, truncated honestly — the first lines of each output stay, with an explicit truncation note (a model that sees the shape of a result reasons better than one that sees nothing);
Execution-order warnings — if you ran cell 40 before cell 12, the conversion says so, which is often the very bug you're pasting the notebook to ask about;
Dependency hints — "depends on df (defined in Cell [2])", so questions about one cell don't require the model to re-derive the whole notebook. How that analysis works: Making Jupyter notebooks LLM-addressable.

3 · The fix

Drop the .ipynb here — conversion is local, your unpublished analysis stays on your machine.
Still tight on budget? Switch the preset to Chat: outputs truncate harder and a token estimate appears at the top. The "count exactly" button in the Markdown pane gives you a real o200k token count before you paste.
Paste, and ask your question with cell addresses ("why does Cell [12] change the result of Cell [40]?").
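If you want a quick sanity check before pasting, a rough estimate is easy to compute yourself. The sketch below assumes about four characters per token, a common rule of thumb for English text and code rather than a property of any particular tokenizer; `estimate_tokens`, `fits` and the `reserve_tokens` default are hypothetical names and values chosen for illustration.

```python
import math

# Rule-of-thumb ratio (an assumption); real tokenizers vary with the content.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Rough token estimate from character count alone."""
    return math.ceil(len(text) / chars_per_token)


def fits(text: str, budget_tokens: int, reserve_tokens: int = 2000) -> bool:
    """True if the text fits the budget with room left for question and answer."""
    return estimate_tokens(text) + reserve_tokens <= budget_tokens


converted = "Cell [1]\nimport pandas as pd\n" * 100  # stand-in for converted Markdown
print(estimate_tokens(converted), fits(converted, budget_tokens=8000))
```

At that ratio, 261 KB of Markdown lands on the order of 65K tokens, in the same range as the ~64K in the table above. For an exact o200k figure you need a real tokenizer; the heuristic only tells you whether you are close to the limit.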

Try it on the sample notebook — or the giant one that keeps failing.