What Is a Context Window? Why Your AI Coding Tool Forgets
Your AI coding tool isn't broken — it's hit its context window. Here's what that means in Claude Code, Cursor, and Windsurf, and how to fix mid-project drift.
Sam Okafor is a fictional AI persona, not a real person. This article was written by AI and reviewed by a human editor before publishing.

Why Your AI Just "Forgot" What You Said
You've been building something for 45 minutes. The AI was right there with you: it knew your component structure, your naming conventions, the bug you fixed half an hour ago. Then, without warning, it starts suggesting code that contradicts what it already wrote. It ignores a constraint you mentioned three messages back. It regenerates something you already told it to remove.
Nothing broke. The AI isn't confused. It literally cannot remember what you said — because it ran out of room.
The symptom: mid-project drift
This specific experience has a name: mid-project drift. The AI's responses start to feel disconnected from the project's earlier context. It stops referencing prior decisions. Answers get more generic. You find yourself repeating things.
It feels like the AI is getting lazy, but the real cause is mechanical. The AI only has access to what's currently loaded in its context window. Once that window fills up, older content gets pushed out — and the AI has no access to it at all.
It's not a bug — it's a hard limit
Every AI model has a fixed amount of working memory. That memory is called the context window. It holds everything: your messages, the AI's responses, any code it reads, any files it scans. When the total amount of content exceeds the window's size, something has to give.
Different tools handle this differently — and most don't warn you in plain English when it's about to happen.
What a Context Window Actually Is
Think of the context window as a whiteboard the AI can see. It can only work with what's currently written on that whiteboard. When the whiteboard fills up, the AI starts erasing older content from the top to make room for new content at the bottom. What gets erased is gone — the AI can't see it anymore.
Tokens, not words
The whiteboard isn't measured in words; it's measured in tokens. Tokens are chunks of text, typically a short word or a fragment of a longer one. As a rough rule:
- ~750 words ≈ 1,000 tokens
- A typical full-length article like this one ≈ 1,500–2,000 tokens
- A file with 500 lines of TypeScript ≈ 3,000–5,000 tokens
When an AI tool reads your codebase files, or you paste in a long error log, those tokens add up fast.
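To get a feel for those numbers, here is a minimal Python sketch that turns the rule of thumb above into a ballpark estimate. The function name `estimate_tokens` and the sample error log are made up for illustration, and counting whitespace-separated words is only an approximation: real tokenizers differ from model to model.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~750 words per 1,000 tokens rule of thumb.

    Assumption: whitespace-separated words stand in for real tokenization,
    which varies by model. Treat the result as a ballpark only.
    """
    words = len(text.split())
    # Integer ceiling of words * 1000 / 750, avoiding float rounding.
    return (words * 1000 + 749) // 750


# Hypothetical example: the same stack trace line repeated 200 times.
error_log = "TypeError: cannot read properties of undefined\n" * 200
print(f"~{estimate_tokens(error_log)} tokens")
```

Code tends to use more tokens per word than prose, so for source files an estimate like this is likely to run low.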
The window fills from both ends
Here's the part most explainers skip: the context window fills with both your input and the AI's output. Every response the AI writes counts against the limit, not just the messages you send. Long explanations, large code blocks, step-by-step plans — those all eat into the shared budget.
In a long coding session, a back-and-forth conversation can hit tens of thousands of tokens just from accumulated replies.
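The shared budget can be pictured with a small tally. This is an illustrative sketch, not how any particular tool does its accounting: the `Message` class, the role names, and the word-based estimate are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user" or "assistant" (assumed labels)
    text: str


def estimate_tokens(text: str) -> int:
    # Ballpark only: ~750 words per 1,000 tokens, rounded up.
    return (len(text.split()) * 1000 + 749) // 750


def tokens_by_role(messages):
    """Sum estimated tokens per role; both sides draw on the same window."""
    totals = {}
    for message in messages:
        totals[message.role] = totals.get(message.role, 0) + estimate_tokens(message.text)
    return totals


# Hypothetical session: a short request and a long generated reply.
session = [
    Message("user", "add a logout button " * 10),
    Message("assistant", "here is the updated component " * 200),
]
totals = tokens_by_role(session)
print(totals, "total:", sum(totals.values()))
```

In a session like this one, the assistant's replies account for most of the total, which is why long generated answers shorten the useful life of a chat.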
What happens when it overflows
When the window is full, the oldest content is silently dropped. Most tools don't show a "context almost full" banner; the conversation just keeps going, and the AI answers with less information than it had before. (Windsurf is an exception: it now displays a visual indicator that turns yellow near capacity.)
This is the root cause of mid-project drift. The AI isn't degrading. It's answering with a smaller slice of your conversation than when you started.
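The whiteboard behavior can be simulated in a few lines. The sketch below assumes the simplest possible policy, keeping the newest messages that fit and discarding everything older. Real tools may summarize or compact older content instead, so take `trim_to_window` as a hypothetical model of the idea rather than any product's implementation.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    # Ballpark only: ~750 words per 1,000 tokens, rounded up.
    return (len(text.split()) * 1000 + 749) // 750


def trim_to_window(messages, window_tokens):
    """Keep the newest messages that fit in the window; drop the oldest."""
    kept = deque()
    used = 0
    for text in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(text)
        if used + cost > window_tokens:
            break  # this message and everything older falls off the board
        kept.appendleft(text)
        used += cost
    return list(kept)


# Hypothetical three-message history with a deliberately tiny window.
history = [
    "never use default exports",
    "build the login form",
    "now add password validation",
]
print(trim_to_window(history, window_tokens=12))
```

With a window this small, the oldest message (the constraint about default exports) is the first to go, which is the drift described above in miniature.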
Context Limits in the Tools You're Using
Context window sizes vary by model and tool. Here's where the three most popular AI coding tools currently stand.
Claude Code (Sonnet 4.6 / Opus 4.7)
Claude Code runs on Anthropic's models, which currently support a 1,000,000-token context window. That's one of the largest in the consumer AI space, roughly equivalent to 750,000 words, or a substantial portion of a large codebase.
In practice this means a single Claude Code session can hold an enormous amount before drift kicks in. But large agentic tasks that repeatedly scan files, run commands, and generate long outputs can still fill it over time. The model itself doesn't give you a real-time token counter in the terminal.
For pricing and plan details, see Anthropic's pricing page.
Cursor
Cursor's context window depends on which underlying model you've selected. As of Cursor 3 (April 2026), the old Composer pane has been replaced by a full Agents Window designed for running and managing AI agents across multiple files. Inline edit mode (Cmd+K / Ctrl+K) still operates with a narrower local context focused on the surrounding code selection.
Cursor 3.3 added a context usage breakdown in the agent interface, so you can see how context is distributed across rules, skills, MCPs, and subagents. Prior to this, the context indicator was inconsistently present — if you're on an older version and responses start feeling less accurate deep in an agent session, that's usually the first signal.
For a direct comparison of how Cursor and Claude Code handle larger projects, see the Cursor vs. Claude Code guide for beginners.
Windsurf (Cascade)
Windsurf's Cascade mode is designed to handle multi-step agentic tasks across a codebase. Codeium (Windsurf's developer) has described Cascade as using a "flows" model that manages context across a session rather than a simple rolling window — but the underlying limits still apply based on the model being used.
Unlike Claude Code, Windsurf shows a visual context window indicator. Added in early 2026, the real-time indicator in the Cascade interface shows how full your context window is and turns yellow when you're approaching capacity, signaling that it's time to start a new conversation. A prompt cache timer is also built into the indicator to help track caching status.
For a deeper look at how Windsurf handles longer sessions, see the Windsurf review.
3 Workarounds That Actually Help
You can't increase the window — but you can work within it. These three approaches have the lowest friction for beginners.
Start a fresh chat and paste a summary
When a session starts drifting, don't keep pushing the same thread. Open a new chat and begin with a short summary of the project state: what you're building, what's already done, the key decisions made so far, and what you need next.
This effectively resets the whiteboard with only the most important content, leaving maximum space for the actual work ahead. Three to five sentences is usually enough.
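If you want the summary to follow the same shape every time, a tiny helper can assemble it. This is a sketch under assumptions: the function name, the field labels, and the sample project are all hypothetical, and a summary typed by hand works just as well.

```python
def build_handoff_summary(building, done, decisions, next_step):
    """Assemble a short project summary to paste at the top of a new chat."""
    lines = [
        f"Project: {building}",
        "Already done: " + "; ".join(done),
        "Key decisions: " + "; ".join(decisions),
        f"Next: {next_step}",
    ]
    return "\n".join(lines)


# Hypothetical project state at the moment the old chat started drifting.
print(build_handoff_summary(
    building="a recipe app with a React front end",
    done=["login page", "recipe list view"],
    decisions=["named exports only", "all API calls go through api.ts"],
    next_step="add a search box to the recipe list",
))
```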
Use a CLAUDE.md (or equivalent rules file) to anchor key decisions
The most durable fix is to write down your project's core decisions somewhere the AI will always read it — before the conversation starts.
In Claude Code, this is the CLAUDE.md file. It sits in your project root and gets loaded into every session automatically. Architecture choices, naming conventions, constraints, and preferences you write there still take up a small share of the window, but they're loaded fresh each time, so they don't depend on chat history that may have been dropped.
Cursor and Windsurf have equivalent "rules" or "instructions" files that work the same way. Use them. A few bullet points in a rules file can save you from re-explaining the same thing in every session.
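A rules file is plain Markdown, so a short one is easy to draft. The sketch below renders a starter file from a dictionary. The section names and the rules themselves are hypothetical examples, not a required format, and `render_rules_file` is a made-up helper name.

```python
def render_rules_file(title, sections):
    """Render a short Markdown rules file from {section name: [rules]}."""
    lines = [f"# {title}", ""]
    for heading, rules in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- {rule}" for rule in rules)
        lines.append("")
    return "\n".join(lines)


# Hypothetical decisions for an example project.
rules_text = render_rules_file(
    "Project rules",
    {
        "Architecture": ["React front end, Express API", "All API calls go through api.ts"],
        "Naming": ["Components in PascalCase", "Named exports only"],
        "Constraints": ["Do not add new dependencies without asking"],
    },
)
print(rules_text)
```

Saving the printed text as CLAUDE.md in the project root (or pasting it into your tool's rules file) is enough to start. A handful of bullets per section keeps the file quick to read and easy to maintain.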
Keep sessions scoped: one feature, one chat
Long, sprawling sessions are the fastest way to overflow a context window. A session that starts with "let's build the login page" and drifts into backend API design, database schema changes, and a completely different component will burn through context fast.
Instead: one feature per chat. When you've finished the login page, commit the work, summarize what was done, and open a fresh session for the next piece. This discipline keeps each context window focused on exactly what's relevant.
For more techniques on keeping AI sessions productive, see how to write better prompts for AI coding tools. And if you're already dealing with broken or confused output from a long session, how to fix AI-generated code covers recovery steps.
When Context Size Actually Matters for Choosing a Tool
Does a bigger window mean a better tool?
A larger context window gives you more headroom before drift kicks in, but it's not the whole picture. A tool that uses its context intelligently — by prioritizing the most relevant file content and summarizing what it doesn't need in full — will outperform a tool with a bigger window that loads everything indiscriminately.
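To make "using context intelligently" concrete, here is a deliberately naive sketch of the idea: rank files by how many words they share with the request, then load only the best matches that fit a token budget. None of the tools above are known to work this way; the scoring method, function names, and file contents are all assumptions for illustration.

```python
def estimate_tokens(text):
    # Ballpark only: ~750 words per 1,000 tokens, rounded up.
    return (len(text.split()) * 1000 + 749) // 750


def pick_relevant_files(files, request, budget_tokens):
    """Return file names worth loading, most relevant first, within a budget.

    files maps a file name to its contents. Relevance is a naive count of
    words the file shares with the request.
    """
    request_words = set(request.lower().split())

    def relevance(name):
        return len(request_words & set(files[name].lower().split()))

    ranked = sorted(files, key=relevance, reverse=True)
    chosen, used = [], 0
    for name in ranked:
        cost = estimate_tokens(files[name])
        if relevance(name) == 0 or used + cost > budget_tokens:
            continue  # skip unrelated files and files that would not fit
        chosen.append(name)
        used += cost
    return chosen


# Hypothetical project with stand-in file contents.
project_files = {
    "login.ts": "login form password submit",
    "chart.ts": "render chart axis legend",
    "auth.ts": "login token session",
}
print(pick_relevant_files(project_files, "fix login password bug", budget_tokens=100))
```

The unrelated chart file never enters the window, so its tokens stay available for the work at hand. That is the advantage a selective tool has over one that loads everything.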
For most beginners, context window size is rarely the deciding factor for choosing a tool. Workflow, price, and editor integration matter more day to day. Where context size becomes relevant is when you're working on a large, complex project and need the AI to hold a lot of state across a long session — that's where Claude Code's 1M-token window gives it a practical edge over tools that hit limits sooner.
The honest answer: learn to work within the window before you optimize for window size. The three habits above — fresh chats, rules files, and scoped sessions — will serve you in any tool you use.
Some links may be affiliate links. We may earn a commission at no extra cost to you.
Ready to go deeper? Start with what a CLAUDE.md file is and how to write one — it's the highest-leverage fix for context drift and takes about ten minutes to set up.