Fixes · 07

Copying from a webpage floods the chat with junk

Select-all on an article, paste into a chat, and the model receives: a navigation menu, a cookie notice, the newsletter box, fourteen "related stories", the share bar — and somewhere inside, your article. The junk isn't just ugly. It costs tokens (often more than the article itself), and it actively misleads: models quote teaser headlines from the sidebar as if they were part of the text.

1 · Why paste picks up the chrome

The clipboard copies what the page renders, not what the article is. Modern pages are often 80–95% scaffolding by markup weight; the article body is one branch of a very noisy tree. Your eye filters the chrome instantly. A paste doesn't.
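A rough way to check the scaffolding share on a page you have saved, sketched in Python. This is only an illustration under stated assumptions: it assumes the page wraps its body in an `<article>` element (many pages don't), it measures "weight" as characters, and the names `ArticleTextCounter` and `scaffolding_share` are made up for the example.

```python
from html.parser import HTMLParser


class ArticleTextCounter(HTMLParser):
    """Counts non-whitespace text characters that sit inside <article>."""

    def __init__(self):
        super().__init__()
        self.article_depth = 0  # > 0 while inside an <article>
        self.hidden_depth = 0   # > 0 while inside <script> or <style>
        self.article_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
        elif tag in ("script", "style"):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            self.article_depth -= 1
        elif tag in ("script", "style") and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        # Script and style bodies are never rendered, so they are not article text.
        if self.article_depth and not self.hidden_depth:
            self.article_chars += len("".join(data.split()))


def scaffolding_share(html):
    """Fraction of the page's characters that are NOT article text."""
    if not html:
        return 0.0
    counter = ArticleTextCounter()
    counter.feed(html)
    counter.close()
    return 1 - counter.article_chars / len(html)
```

Calling `scaffolding_share(open("saved-page.html", encoding="utf-8").read())` on a saved page (the filename is a placeholder) gives a single number between 0 and 1; the closer to 1, the more of the page is chrome and markup rather than article text.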

2 · What article extraction does

The fix is a readability pass — the same family of algorithm behind your browser's Reader View. It scores the DOM for the densest coherent text block, keeps the article with its headings, links, images and tables, and discards navigation, ads and boilerplate. MakeItMarkdown runs exactly that (Mozilla's Readability, in your browser), then converts the surviving article to clean Markdown:

Raw paste

Home Products Pricing Blog
Accept all cookies?
Subscribe to our newsletter →
The measured results held across
every configuration we tried…
RELATED: 10 stories like this

Article extraction

# The measured results

The measured results held across
every configuration we tried…
Figure: side-by-side comparison of a raw select-all paste of a news article (menus and banners visible) and the converted Markdown pane.
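The scoring idea behind a readability pass can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Mozilla's Readability and not MakeItMarkdown's code: the names `ParagraphScorer` and `extract_main_block`, the 25-character cutoff, and the weights are assumptions chosen for the example, and the sample HTML is made up. Each paragraph earns points for length and commas, loses them in proportion to how much of it is link text, and credits its score to the element that directly contains it.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not go on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
MIN_CHARS = 25  # assumed cutoff: shorter paragraphs are treated as noise


def squash(text):
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


class ParagraphScorer(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open = []       # stack of (tag, serial) for currently open elements
        self.serial = 0
        self.blocks = {}     # (tag, serial) -> {"score": float, "paras": [str]}
        self.para = None     # text chunks of the <p> being read, else None
        self.link_chars = 0  # characters of the current <p> that sit inside <a>
        self.link_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        self.serial += 1
        self.open.append((tag, self.serial))
        if tag == "p":
            self.para, self.link_chars = [], 0
        elif tag == "a":
            self.link_depth += 1

    def handle_data(self, data):
        if self.para is not None:
            self.para.append(data)
            if self.link_depth:
                self.link_chars += len(squash(data))

    def handle_endtag(self, tag):
        if all(t != tag for t, _ in self.open):
            return  # stray closing tag: ignore it
        while True:  # pop until the matching element is closed
            closed, _ = self.open.pop()
            if closed == "a":
                self.link_depth -= 1
            elif closed == "p":
                self._finish_paragraph()
            if closed == tag:
                break

    def _finish_paragraph(self):
        if self.para is None:
            return
        text = squash("".join(self.para))
        self.para = None
        if len(text) < MIN_CHARS:
            return
        link_density = min(1.0, self.link_chars / len(text))
        # Reward length and commas, then discount link-heavy paragraphs.
        score = (1 + text.count(",") + min(len(text) // 100, 3)) * (1 - link_density)
        # The <p> is already popped, so the top of the stack is its parent.
        parent = self.open[-1] if self.open else ("root", 0)
        block = self.blocks.setdefault(parent, {"score": 0.0, "paras": []})
        block["score"] += score
        block["paras"].append(text)


def extract_main_block(html):
    """Return (tag, paragraphs) for the highest-scoring container."""
    scorer = ParagraphScorer()
    scorer.feed(html)
    scorer.close()
    if not scorer.blocks:
        return None, []
    (tag, _), best = max(scorer.blocks.items(), key=lambda kv: kv[1]["score"])
    return tag, best["paras"]


# Made-up page: menu, cookie notice and teaser around a two-paragraph article.
SAMPLE = """
<body>
  <nav><p><a href="/">Home</a> <a href="/p">Products</a> <a href="/b">Blog</a></p></nav>
  <div class="cookie"><p>Accept all cookies?</p></div>
  <article>
    <h1>The measured results</h1>
    <p>The measured results held across every configuration we tried, including the slow ones, the odd ones, and the ones nobody expected to work.</p>
    <p>That consistency matters, because it means the effect is not an artifact of one lucky setup.</p>
  </article>
  <aside><p><a href="/r">RELATED: 10 stories like this</a></p></aside>
</body>
"""
```

On the sample, `extract_main_block(SAMPLE)` picks the `<article>` element and returns its two paragraphs: the menu and cookie lines fall under the length cutoff, and the teaser scores zero because it is all link text. A real extractor does far more (class-name hints, score propagation to ancestors, cleanup of the winning block), which is why a maintained library is the better choice in practice.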

3 · The fix, two ways

  1. Save the page (Ctrl/Cmd-S → "HTML only"), then drop the .html file into MakeItMarkdown.
  2. Or paste the page's HTML source directly onto the landing page (Ctrl/Cmd-V works there); it takes the same extraction path.

Compare the Markdown pane with what a raw paste would have carried. The fidelity report lists what was detected (title, sections, tables, figures), so you can spot when extraction picked the wrong main block. That happens on unusual layouts, and the report makes the miss visible instead of silent.
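To make the idea of a fidelity report concrete, here is a minimal Python sketch that counts the same four things in a Markdown string. It is not MakeItMarkdown's report: the function name `fidelity_report`, the output keys, and the 50-word "suspicious" threshold are all hypothetical, and the sketch does not skip fenced code blocks, so a `#` line inside one would be miscounted.

```python
import re


def is_table_delimiter(line):
    """True for a table delimiter row such as '| --- | :---: |'."""
    if "|" not in line:
        return False
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return all(re.fullmatch(r":?-+:?", c) for c in cells)


def fidelity_report(markdown):
    lines = markdown.splitlines()
    # First level-1 heading is taken as the title, if there is one.
    title = next((l[2:].strip() for l in lines if l.startswith("# ")), None)
    # Level 2-6 headings are counted as sections.
    sections = sum(1 for l in lines if re.match(r"#{2,6} ", l))
    # One delimiter row per table.
    tables = sum(1 for l in lines if is_table_delimiter(l))
    # Inline images: ![alt](target)
    figures = len(re.findall(r"!\[[^\]]*\]\([^)]+\)", markdown))
    words = len(markdown.split())
    return {
        "title": title,
        "sections": sections,
        "tables": tables,
        "figures": figures,
        "words": words,
        # Assumed heuristic: no title or very little text hints that
        # extraction may have picked the wrong main block.
        "suspicious": title is None or words < 50,
    }
```

A result with no title, or with a word count far below what the page visibly contains, is the kind of signal worth checking against the original before handing the Markdown to a model.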

4 · When raw paste is fine

Short, plain pages — documentation without heavy chrome, a gist, a plain-text mail archive — paste fine. The extraction pass earns its keep on anything with a menu bar and a business model.

Try it on the sample article — nav and ads in, clean Markdown out.