Fixes · 03

Your notebook is too big to paste: beating the token limit

The point: your notebook isn't too big — its embedded output images are; the text a model can actually use fits comfortably.

"Message exceeds the maximum length." "File too large." Or the upload works and the model clearly saw only the first third. Jupyter notebooks hit input limits long before their content justifies it, and the reason is what a .ipynb actually stores.

1 · Where the megabytes live

A notebook file is JSON wrapping four things: your code, your prose, every output the cells ever produced, and metadata. The killers are outputs: each rendered chart is a base64-encoded PNG embedded as text — hundreds of KB each — and dataframe previews, progress bars and tracebacks pile up behind them. Metadata adds widget state and execution bookkeeping the model has no use for. In the notebook we use as a stress test, the meaningful text is about 7% of the bytes:

[Figure: the 3.5 MB .ipynb drawn as a bar. Base64 output images: ~93%. What the model can use: 261 KB as Markdown (code + text, addresses intact).]
The bar is the file; the sliver is what a model can read. Conversion keeps the sliver and the addresses.
                      Raw .ipynb      Converted Markdown
Size                  3,465 KB        261 KB
Fits a chat input?    No              Yes (~64K tokens)
Cell addresses                        kept (Cell [7])
Figures               base64 walls    [Figure: cell_5_output_1.png] placeholders

One markdown cell in that notebook contained a single pasted screenshot worth ~130,000 tokens of base64 — a third of a large context window, spent on characters no model can even decode back into an image.
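To see the same split in your own notebook, a minimal sketch like the one below will do. It assumes the usual nbformat 4 layout (a top-level `cells` list, with `outputs` on code cells and `attachments` holding pasted images); `byte_breakdown` is a name made up for this example, and the figures are approximate because each part is re-serialized on its own.

```python
import json


def _size(obj) -> int:
    """Approximate bytes this object occupies once serialized back to JSON."""
    return len(json.dumps(obj).encode("utf-8"))


def _text(value) -> str:
    """nbformat stores source as either one string or a list of lines."""
    return "".join(value) if isinstance(value, list) else str(value)


def byte_breakdown(nb: dict) -> dict:
    """Split a parsed notebook into bytes of code, prose, outputs, attachments, metadata."""
    totals = {"code": 0, "prose": 0, "outputs": 0, "attachments": 0,
              "metadata": _size(nb.get("metadata", {}))}
    for cell in nb.get("cells", []):
        source_bytes = len(_text(cell.get("source", "")).encode("utf-8"))
        if cell.get("cell_type") == "code":
            totals["code"] += source_bytes
            # Rendered charts, dataframe previews and tracebacks all live here.
            totals["outputs"] += _size(cell.get("outputs", []))
        else:
            totals["prose"] += source_bytes
        if cell.get("attachments"):
            # Screenshots pasted into markdown cells are stored as base64 attachments.
            totals["attachments"] += _size(cell["attachments"])
        totals["metadata"] += _size(cell.get("metadata", {}))
    return totals


# Usage (the path is a placeholder):
#   with open("analysis.ipynb", encoding="utf-8") as f:
#       print(byte_breakdown(json.load(f)))
```

Code plus prose is the part a model can read; if outputs and attachments dominate the total, the notebook has the problem described above.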

2 · What conversion keeps (this is the part that matters)

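Shrinking is easy: jupyter nbconvert --to script strips outputs too, and loses everything else you care about. What matters is what survives: the code and prose, the cell addresses (Cell [7]), and a [Figure: cell_5_output_1.png] placeholder where each image was.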
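To make that concrete, here is a rough sketch of a conversion that keeps addresses and swaps images for placeholders. It is an illustration under assumptions, not the converter's actual code: it assumes nbformat 4 JSON, numbers cells and outputs from 1 by position, and the names `notebook_to_markdown`, `render_output` and the 500-character `limit` are all invented for the example.

```python
FENCE = "`" * 3  # built at runtime so this listing needs no nested fence


def _text(value) -> str:
    """nbformat stores multi-line strings as either a str or a list of lines."""
    return "".join(value) if isinstance(value, list) else str(value)


def render_output(output: dict, cell_no: int, out_no: int, limit: int) -> str:
    """Turn one output into text, or into a placeholder if it is an image."""
    kind = output.get("output_type")
    if kind == "stream":
        text = _text(output.get("text", ""))
    elif kind in ("execute_result", "display_data"):
        data = output.get("data", {})
        image_types = [mime for mime in data if mime.startswith("image/")]
        if image_types:
            # Drop the base64 payload, keep an address the reader can cite.
            ext = image_types[0].split("/")[1].split("+")[0]
            return f"[Figure: cell_{cell_no}_output_{out_no}.{ext}]"
        text = _text(data.get("text/plain", ""))
    elif kind == "error":
        # Keep the error name and message, skip the long traceback.
        text = f"{output.get('ename', '')}: {output.get('evalue', '')}"
    else:
        text = ""
    text = text.rstrip()
    if len(text) > limit:
        text = text[:limit] + " [truncated]"
    return text


def notebook_to_markdown(nb: dict, limit: int = 500) -> str:
    """Emit every cell under a 'Cell [n]' address, with outputs reduced to text."""
    parts = []
    for cell_no, cell in enumerate(nb.get("cells", []), start=1):
        source = _text(cell.get("source", ""))
        parts.append(f"Cell [{cell_no}]")
        if cell.get("cell_type") != "code":
            # Markdown attachments are simply never copied across.
            parts.append(source)
            continue
        parts.append(f"{FENCE}python\n{source}\n{FENCE}")
        for out_no, output in enumerate(cell.get("outputs", []), start=1):
            rendered = render_output(output, cell_no, out_no, limit)
            if rendered:
                parts.append(rendered)
    return "\n\n".join(parts)
```

The design choice worth noting is that nothing is silently deleted: every image leaves a named placeholder behind, so a question like "what is wrong with the figure in Cell [5]?" still has something to point at.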

3 · The fix

  1. Drop the .ipynb here — conversion is local, your unpublished analysis stays on your machine.
  2. Still tight on budget? Switch the preset to Chat: outputs truncate harder and a token estimate appears at the top. The Markdown pane's "count exactly" button gives you a real o200k token count before you paste.
  3. Paste, and ask your question with cell addresses ("why does Cell [12] change the result of Cell [40]?").
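If you want a sanity check on the budget outside the tool, a back-of-the-envelope estimate is easy to sketch. The ratio of roughly 4 bytes per token is an assumption read off the numbers above (261 KB came out at ~64K tokens) and will drift with language and code density; `estimate_tokens` and `fits_budget` are names made up here. The exact variant assumes "o200k" refers to the `o200k_base` encoding in the third-party tiktoken package, which has to be installed separately.

```python
import math

BYTES_PER_TOKEN = 4.0  # assumption: 261 KB of Markdown was ~64K tokens


def estimate_tokens(text: str, bytes_per_token: float = BYTES_PER_TOKEN) -> int:
    """Rough token estimate from UTF-8 size; good for budgeting, not billing."""
    return math.ceil(len(text.encode("utf-8")) / bytes_per_token)


def fits_budget(text: str, budget_tokens: int, reserve_tokens: int = 0) -> bool:
    """True if the text, plus room reserved for question and answer, fits the budget."""
    return estimate_tokens(text) + reserve_tokens <= budget_tokens


def count_tokens_exact(text: str) -> int:
    """Exact count with tiktoken's o200k_base encoding (optional dependency)."""
    import tiktoken  # imported lazily so the estimate works without it

    encoding = tiktoken.get_encoding("o200k_base")
    # disallowed_special=() treats strings like "<|endoftext|>" as plain text.
    return len(encoding.encode(text, disallowed_special=()))
```

Leaving a reserve matters: a paste that only just fits the input leaves no room for your question or for the model's reply.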

Try it on the sample notebook — or the giant one that keeps failing.