“OpenAI tried MCP for Codex. It didn't work. Here's what they built instead.”
The codex-1 model is just one component. The real engineering went into the agent loop, prompt management, and a custom protocol built for needs MCP couldn't handle. OpenAI rejected MCP because it couldn't support streaming progress, mid-task approvals, or code diffs, so they built their own JSON-RPC protocol. Because the full conversation is re-sent each turn, total prompt tokens grow quadratically with conversation length, but prompt caching keeps the actual computation roughly linear. 66k GitHub stars in under a year.
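To make the protocol gap concrete, here is a sketch of the kind of JSON-RPC traffic such a protocol needs: the server pushes progress notifications and, crucially, sends an approval *request* back to the client mid-task. The method names and fields below are illustrative assumptions, not OpenAI's actual wire format.

```python
import json

# Server -> client: streaming progress. A JSON-RPC notification
# (no "id"), so no response is expected.
progress = {
    "jsonrpc": "2.0",
    "method": "task/progress",
    "params": {"taskId": "t1", "message": "running test suite", "pct": 40},
}

# Server -> client: mid-task approval. A full request (it has an "id"),
# so the agent blocks until the client answers.
approval = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "task/requestApproval",
    "params": {"taskId": "t1", "command": "rm -rf build/"},
}

# Client -> server: the decision, matched to the request by "id".
decision = {"jsonrpc": "2.0", "id": 7, "result": {"approved": False}}

# Newline-delimited JSON is a common framing for this style of protocol.
wire = "\n".join(json.dumps(m) for m in (progress, approval, decision))
print(wire)
```

The key asymmetry versus a plain request/response tool protocol: the server initiates requests too, which is what mid-task approvals require.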
You know that feeling when you ask an AI to fix a bug and it gives you code that doesn't work because it can't run your tests? Before Codex: you'd paste code back and forth, manually run tests, and iterate. Now: Codex runs in an isolated sandbox with your full repo, executes tests, reads linter output, and keeps iterating until tests pass — all while you work on something else.
Think of Codex as a tireless junior developer in a sandbox. You give it a task ('fix the auth bug'), and it enters an agent loop: read files, form a plan, run commands, see what happens, adjust, repeat. Each turn can involve dozens of tool calls (shell commands, file edits, test runs) before it responds. The prompt stacks context like layers: system rules, your AGENTS.md instructions, sandbox permissions, tool definitions, and conversation history. When the context window fills up, Codex 'compacts' the conversation — replacing full history with an encrypted summary that preserves the model's understanding.
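The loop and the layered prompt described above can be sketched as follows. Everything here (function names, layer strings, the toy model) is an illustrative assumption under the article's description, not Codex's actual implementation.

```python
# Fixed context layers, most stable first — a stable prefix is what
# lets a prompt cache reuse work even as the conversation tail grows.
SYSTEM_RULES = "You are a coding agent."
AGENTS_MD = "Follow the conventions in the repo's AGENTS.md."
SANDBOX = "Workspace-write only; network disabled."
TOOLS = "shell(cmd), apply_patch(diff)"

def build_prompt(history):
    """Stack context layers: system rules, AGENTS.md, sandbox
    permissions, tool definitions, then conversation history."""
    return [
        {"role": "system", "content": SYSTEM_RULES},
        {"role": "system", "content": AGENTS_MD},
        {"role": "system", "content": SANDBOX},
        {"role": "system", "content": TOOLS},
        *history,
    ]

def agent_turn(task, model, run_tool):
    """One user turn: loop tool calls until the model answers.
    (A real agent would also compact history into a summary when
    the context window fills up.)"""
    history = [{"role": "user", "content": task}]
    while True:
        reply = model(build_prompt(history))
        history.append(reply)
        call = reply.get("tool_call")
        if call is None:                      # no tool requested: done
            return reply["content"], history
        result = run_tool(call)               # shell cmd, edit, test run
        history.append({"role": "tool", "content": result})

# Toy stand-in: the "model" runs the tests once, then answers.
def fake_model(prompt):
    if not any(m["role"] == "tool" for m in prompt):
        return {"role": "assistant", "content": "", "tool_call": "pytest -q"}
    return {"role": "assistant", "content": "Tests pass; bug fixed.",
            "tool_call": None}

answer, history = agent_turn("fix the auth bug", fake_model,
                             lambda cmd: "3 passed")
print(answer)  # Tests pass; bug fixed.
```

The design point is the inner `while` loop: a single user-visible turn can contain many model/tool round-trips, and only the final no-tool reply is surfaced.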
If you're a developer who spends time on repetitive, well-scoped tasks like refactoring, writing tests, fixing bugs, or triaging on-call issues — this is for you. OpenAI's own engineers use it to offload work that breaks focus. Not useful yet if you need image inputs for frontend work or want to course-correct mid-task. Best for teams with good test coverage and clear coding conventions.
Yes — if you have a ChatGPT Pro, Plus, or Enterprise subscription, you already have access. The CLI is open source (66k stars) and free to try. The engineering blog posts reveal production-grade patterns for building agents: prompt layering, context management, protocol design. Even if you don't use Codex, the architecture is worth studying. The main gotcha: usage limits on Plus plans have been fluctuating, and cloud tasks take longer than interactive editing.
View original source

This page gives you the hook. The full Snaplyze digest goes deeper so you can move from curiosity to decision with less noise.
Open the full digest for the deeper breakdown, comparative viewpoints, Easy Mode, Pro Mode, and practical next-step playbooks you can actually use.
Install Snaplyze