“When the Codex team needed their agent to work inside VS Code, they first tried the obvious approach and exposed it through MCP. It didn't work.” (ByteByteGo)
You know the feeling: you ask an AI to fix a bug, and it hands you code that doesn't work because it can't run your tests. Before Codex, you'd paste code back and forth, run the tests by hand, and iterate. Now Codex runs in an isolated sandbox with your full repo, executes tests, reads linter output, and keeps iterating until the tests pass, all while you work on something else.
Think of Codex as a tireless junior developer in a sandbox. You give it a task ('fix the auth bug'), and it enters an agent loop: read files, form a plan, run commands, see what happens, adjust, repeat. Each turn can involve dozens of tool calls (shell commands, file edits, test runs) before it responds. The prompt stacks context like layers: system rules, your AGENTS.md instructions, sandbox permissions, tool definitions, and conversation history. When the context window fills up, Codex 'compacts' the conversation — replacing full history with an encrypted summary that preserves the model's understanding.
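To make the loop concrete, here is a minimal Python sketch of the three ideas above: layered prompt, agent loop, and compaction. It is an illustration under stated assumptions, not how Codex is implemented. The names `Context`, `run_turn`, and `LAYER_ORDER`, the action dictionaries, and the character budget are all hypothetical. The "summary" is a plain placeholder string, where the real system is described as producing an encrypted, model-generated one.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical layer names, in the order the text lists them.
LAYER_ORDER = ["system", "agents_md", "sandbox", "tools"]


@dataclass
class Context:
    layers: Dict[str, str]                      # fixed layers, stacked first
    history: List[str] = field(default_factory=list)
    max_chars: int = 2000                       # crude stand-in for a token budget

    def prompt(self) -> str:
        # Stack the fixed layers in order, then append conversation history.
        fixed = [self.layers[name] for name in LAYER_ORDER if name in self.layers]
        return "\n".join(fixed + self.history)

    def compact(self) -> None:
        # Replace the full history with one summary entry.
        # Assumption: a real agent would have the model write this summary;
        # here it is only a placeholder that records how much was dropped.
        self.history = [f"[summary of {len(self.history)} earlier entries]"]


def run_turn(
    task: str,
    ctx: Context,
    model: Callable[[str], dict],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 50,
) -> str:
    """Run one turn: call tools repeatedly until the model replies."""
    ctx.history.append(f"user: {task}")
    for _ in range(max_steps):
        if len(ctx.prompt()) > ctx.max_chars:
            ctx.compact()
        action = model(ctx.prompt())
        if action["type"] == "reply":
            ctx.history.append(f"assistant: {action['text']}")
            return action["text"]
        # Otherwise the model asked for a tool call: run it and record the result.
        output = tools[action["tool"]](action["arg"])
        ctx.history.append(f"tool {action['tool']}({action['arg']}) -> {output}")
    return "step limit reached"
```

In use, `model` would wrap a call to a language model and `tools` would map names such as `"shell"` to functions that run inside the sandbox. A scripted stand-in for the model is enough to watch the loop run two tool calls before it replies.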
If you're a developer who spends time on repetitive, well-scoped tasks like refactoring, writing tests, fixing bugs, or triaging on-call issues, this is for you. OpenAI's own engineers use it to offload work that breaks focus. It is not useful yet if you need image inputs for frontend work or want to course-correct mid-task. It works best for teams with good test coverage and clear coding conventions.
Yes, it is worth a look. If you have a ChatGPT Pro, Plus, or Enterprise subscription, you already have access. The CLI is open source (66k stars) and free to try. The engineering blog posts reveal production-grade patterns for building agents: prompt layering, context management, and protocol design. Even if you don't use Codex, the architecture is worth studying. The main gotcha: usage limits on Plus plans have been fluctuating, and cloud tasks take longer than interactive editing.