Using GPT-5 Codex in VS Code to Build Agentic Workflows

Foreword

On September 15, 2025, OpenAI released GPT-5 Codex, a version of GPT-5 optimized for software development. It is considered a major upgrade to the Codex model, extending its core capabilities from assisted code generation to an AI-driven development framework that can plan, execute, and test autonomously.

Below is a summary of its key features.

Vibe Coding

Traditionally, developers wrote code in their IDE. When they hit an issue, they copied the snippet into the ChatGPT web interface, reviewed the suggestion, and pasted it back to compile and test. This constant context-switching slowed them down.

In an earlier post, “Boosting Coding Efficiency and Developer Experience with GitHub Copilot Agent Mode,” I explained how to integrate AI programming assistants into the IDE. The key idea is that developers ask the AI to fix or generate code through dialogue or commands inside the IDE, grant it permission to edit files directly, and then review the changes. This model is known as “vibe coding,” and it represents the current and likely future mainstream of AI-collaborative development.

The tooling around GPT-5 Codex extends and deepens this model.

Market Competition and Tool Comparison

The launch of GPT-5 Codex puts it in direct competition with other AI coding tools. Key competitors include Anthropic’s Claude Code, Microsoft’s GitHub Copilot, Cursor, Qodo Gen, and Google’s Jules, as summarized in the table below:

| Tool | Starting Price | Official Release Date |
| --- | --- | --- |
| GPT-5 Codex (OpenAI) | Plus: USD $20/mo; Pro: USD $200/mo | 2025-09-15 |
| GitHub Copilot (Microsoft) | Personal Pro: USD $10/mo; Business: USD $19/user/mo | 2022-06-21 |
| Claude Code (Anthropic) | Pro: USD $17/mo; Max: USD $100/user/mo | 2025-05-22 |
| Cursor | Basic: USD $0/mo; Pro: USD $20/mo; Teams: USD $40/user/mo | 2023-03-14 |
| Qodo Gen (Qodo) | Developer: USD $0/mo; Teams: USD $19–30/user/mo | 2024-03-22 |
| Jules (Google) | Free: USD $0/mo; Pro: USD $19.99/mo; Ultra: USD $124.99/mo | 2025-08-06 |

Installing and Using Codex in VS Code

Besides VS Code, Codex also supports Cursor, Windsurf, and other VS Code-compatible editors.

According to the official documentation, the Codex extension currently supports macOS and Linux, while Windows support is still experimental. For the best experience on Windows, run it through the Windows Subsystem for Linux (WSL).

First, install the extension from the official page: Codex – OpenAI’s coding agent

Codex extension on the VS Code Marketplace page

After installing, click the Codex button (OpenAI icon) in the left sidebar of VS Code. When the Codex panel appears, click Sign in with ChatGPT to open a login window.

Codex sign-in option (VS Code)

You will be redirected to a webpage explaining the permissions that need to be authorized.

Codex web authorization notification

After authorizing, return to VS Code and you can start using Codex. Basic usage is similar to GitHub Copilot.

Codex main interface

Approval Modes

Mode selection

Chat Mode (Chat only)
- Capabilities: Chat, explain code, and suggest fixes. It never edits files or runs commands.
- Risk: Lowest (does not alter the project). Suitable for teaching, planning, or security-sensitive contexts.
- Use Case: When you want to discuss a design first or get suggestions without automatically changing the code.

Agent (Default)
- Capabilities: Can read files, edit them, and run commands automatically within the working directory. It asks for your approval before touching files outside your project root or accessing the network.
- Risk: Medium. Automatic operations within the project directory improve efficiency, while external operations are blocked until you approve them.
- Use Case: Daily development, such as fixing bugs, refactoring files, and running test commands, when you are willing to approve risky actions one step at a time.

Agent (Full Access)
- Capabilities: Can fully automatically read/write files, execute commands, and access the network (without requiring approval each time).
- Risk: Highest — Potential risk of data leakage or accidental execution of sensitive commands.
- Use Case: Highly trusted environments (e.g., sandbox, CI runner, or non-sensitive projects) requiring maximum automation.
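
The three modes can be read as a small decision table. The sketch below is a toy model of that table, not Codex's implementation: the names `Mode`, `Action`, `Decision`, and `decide` are hypothetical, and it assumes Chat mode may still read files in order to explain them.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    CHAT = "chat"
    AGENT = "agent"
    FULL_ACCESS = "agent-full-access"


class Decision(Enum):
    ALLOW = "allow"   # proceeds without prompting
    ASK = "ask"       # blocked until the user approves
    DENY = "deny"     # never performed in this mode


@dataclass(frozen=True)
class Action:
    kind: str                      # "read", "edit", "run", or "network"
    inside_workspace: bool = True  # False for paths outside the project root


def decide(mode: Mode, action: Action) -> Decision:
    """Toy model of the approval modes described above (hypothetical)."""
    if mode is Mode.CHAT:
        # Chat only explains and suggests; it never edits or runs anything.
        return Decision.ALLOW if action.kind == "read" else Decision.DENY
    if mode is Mode.FULL_ACCESS:
        # Full Access skips per-action approval entirely.
        return Decision.ALLOW
    # Agent (Default): automatic inside the project, approval for the rest.
    if action.kind == "network" or not action.inside_workspace:
        return Decision.ASK
    return Decision.ALLOW
```

For example, `decide(Mode.AGENT, Action("edit", inside_workspace=False))` evaluates to `Decision.ASK`, which mirrors the "asks for your approval before touching files outside your project root" behavior.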

Practical Advice (Trade-offs)
During initial development or when handling unknown/sensitive projects: Use Chat or Agent (Default) and retain step-by-step approval.
In CI or closed automated environments: If a risk assessment has been conducted, consider Agent (Full Access) with monitoring and auditing (logs).
If necessary, restrict network or external tool access and regularly review Codex’s change history and commit diffs.

Differences in Models Offered by Codex

Model selection

GPT-5
- Role: A general-purpose large model suitable for a wide range of code and language tasks.
- Advantages: High versatility, capable of handling diverse tasks.
- Disadvantages: Less optimized for agentic coding (automatic modification and execution) than a model tuned for it.

GPT-5 Codex (Recommended for Codex)
- Role: A version optimized within Codex for code editing, execution, and agentic operations.
- Advantages: Generally better reliability and consistency in code understanding, generation, change suggestions, and execution sequences. Recommended for scenarios requiring automated modifications and execution.
- Disadvantages: May not differ much from GPT-5 on non-code language tasks; may be slightly slower (depending on reasoning settings).

Suggested Use Cases for Switching Models
Use GPT-5 Codex as the first choice when the work is primarily focused on automatic modifications, running tests, or complex refactoring.
Fall back to GPT-5 for high-level design discussions or natural-language tasks where you prefer faster responses.

Reasoning Effort

Low: Fastest but shallow. Best for quick edits or rapid Q&A.
Medium: Balances speed and depth. The recommended setting for general development.
High: More reliable for complex problems or when you need multi-step execution sequences generated automatically (e.g., cross-file refactoring, complex bug fixes), at the cost of longer response times.

Recommendation: Use Medium for daily use; switch to High for complex automated tasks; choose Low for numerous short interactions or quick editing feedback to speed things up.
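
The model and reasoning recommendations can be condensed into a lookup table. This is only a sketch of the advice above: the task categories and the `recommend` helper are hypothetical names chosen for illustration, and nothing here is read by Codex itself.

```python
from typing import NamedTuple


class Settings(NamedTuple):
    model: str
    reasoning: str


# Hypothetical task categories mapped to the recommendations in this article.
RECOMMENDED = {
    "quick_question": Settings("GPT-5", "Low"),
    "daily_dev": Settings("GPT-5 Codex", "Medium"),
    "cross_file_refactor": Settings("GPT-5 Codex", "High"),
    "design_discussion": Settings("GPT-5", "Medium"),
}


def recommend(task_kind: str) -> Settings:
    """Return the suggested settings, defaulting to the daily-use setup."""
    return RECOMMENDED.get(task_kind, Settings("GPT-5 Codex", "Medium"))
```

Unknown task kinds fall back to GPT-5 Codex at Medium, the setting recommended here for general development.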

Practical Settings and Best Practices

IDE and Platform: macOS and Linux offer the best experience; on Windows, use WSL, since native support is still experimental.
Authentication: Prefer signing in with a ChatGPT account to use the quota included in your plan; if you use an API key instead, store the key and any related environment variables securely.
Permission Management: Use Agent (requires approval for external actions) by default, and only enable Full Access in controlled environments.
Model Selection: Prefer GPT-5 Codex for agentic coding; switch to GPT-5 for quick and short suggestions.
Reasoning: Medium by default; High for complex tasks; Low for quick interactions.
Review Process: Even with automatic modifications enabled, maintain a commit diff / PR review and testing process.
Logging and Traceability: Enable and save Codex’s operational records (commands, file modifications, execution output) for auditing purposes.
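
One way to keep the review step lightweight is to triage the agent's changes before reading the full diff. The sketch below parses the text printed by `git diff --numstat` (tab-separated added lines, deleted lines, and path, with `-` in both count columns for binary files) and flags files that deserve a closer look. The 200-line threshold and the list of sensitive path fragments are arbitrary assumptions, and `needs_close_review` is a hypothetical helper.

```python
from typing import NamedTuple


class FileChange(NamedTuple):
    path: str
    added: int
    deleted: int
    binary: bool


def parse_numstat(output: str) -> list[FileChange]:
    """Parse the text printed by `git diff --numstat`."""
    changes = []
    for line in output.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        binary = added == "-"  # git prints "-" instead of counts for binary files
        changes.append(
            FileChange(
                path,
                0 if binary else int(added),
                0 if binary else int(deleted),
                binary,
            )
        )
    return changes


def needs_close_review(
    change: FileChange,
    max_lines: int = 200,  # assumed threshold, tune per project
    sensitive: tuple[str, ...] = (".env", "secrets", ".github/workflows"),
) -> bool:
    """Flag binary files, large changes, and changes to sensitive paths."""
    touched = change.added + change.deleted
    return (
        change.binary
        or touched > max_lines
        or any(fragment in change.path for fragment in sensitive)
    )
```

In practice the input would come from running something like `git diff --numstat main` after an agent session; the flagged paths are the ones to read line by line before merging.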

Writing Prompts for the Codex Agent

To get the most out of Codex, giving clear and specific instructions is key.

If you can’t spell out specific metrics or requirements yourself, you can first use another LLM to draft a detailed prompt, then hand it to Codex.

It is also important to point Codex at the relevant code and state any constraints clearly. For example, the “@” symbol references specific files so that Codex analyzes them directly.

You can also include verification steps to help it confirm the correctness of its work. Breaking down large, complex tasks into several smaller steps can improve processing performance and review efficiency. When debugging, directly pasting detailed error logs allows Codex to assist in analyzing and finding the root cause.

In addition to specific task instructions, you can further customize how Codex works. Explicitly ask it to follow a specific process, use or avoid certain tools, or generate messages according to a specified template. An important technique is to instruct Codex to treat a specific file (like AGENTS.md) as a configuration file containing special rules, thereby guiding its execution strategy. Open-ended questions also work well for creative tasks such as code cleanup, brainstorming, or writing documentation.
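
The advice in this section (a clear task, referenced files, constraints, verification steps, and pasted error logs) can be captured in a small template. The sketch below is a hypothetical helper, not part of Codex: `build_prompt` and its section labels are invented for illustration, and it assumes that writing `@` followed by a path is how a file is referenced in the prompt.

```python
def build_prompt(task, files=(), constraints=(), verify=(), error_log=None):
    """Assemble a structured prompt from the pieces recommended above."""
    parts = [task.strip()]
    if files:
        # "@path" file references, as described in this section (assumed syntax).
        parts.append("Files: " + " ".join(f"@{path}" for path in files))
    if constraints:
        parts.append("Constraints:\n" + "\n".join(f"- {c}" for c in constraints))
    if verify:
        steps = "\n".join(f"{i}. {step}" for i, step in enumerate(verify, 1))
        parts.append("Verify by:\n" + steps)
    if error_log:
        parts.append("Error log:\n" + error_log.strip())
    return "\n\n".join(parts)


prompt = build_prompt(
    "Fix the login redirect bug.",
    files=["src/app/login/page.tsx"],
    constraints=["Do not change the public API."],
    verify=["Run `pnpm lint`", "Run `pnpm test`"],
)
```

Keeping each concern in its own section also makes it easier to split a large request into several smaller prompts later.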

Using AGENTS.md in Codex

The previous section noted that guiding the AI agent with an AGENTS.md file is an important technique. The concept emerged precisely to address the challenges and opportunities brought by highly autonomous agent tools like GPT-5 Codex.

Think of AGENTS.md as a README for AI agents. While README.md is a project guide for human developers, AGENTS.md is a machine-readable operations manual for tools like Codex. It turns the project’s implicit knowledge, such as how to set up the environment, run tests, and follow the code style, into explicit instructions that the AI can follow directly.

This is particularly crucial for the agent mode of GPT-5 Codex. When a developer gives a high-level command, such as “refactor the user login module and ensure all tests pass,” Codex no longer just guesses but consults AGENTS.md to perform the following tasks:

Project Setup: Runs pnpm install to fetch dependencies.
Development Guidelines: Ensures the code follows the Next.js 14+ framework, enables TypeScript Strict Mode, and uses Tailwind CSS for styling instead of traditional CSS.
Build & Test: Runs pnpm lint and pnpm test after modifications to verify code quality and functional correctness.
Pull Request: Follows the Conventional Commits specification when writing pull request titles and commit messages.

Here is an example of an AGENTS.md that fits the scenario described above:

# AGENTS.md

## Project Setup

-   Install dependencies: `pnpm install`
-   Start development server: `pnpm dev`

---
## Build & Test

-   Production build: `pnpm build`
-   Run unit tests: `pnpm test`
-   Code style check: `pnpm lint`

## Development Guidelines

-   Framework versions: Next.js 14+ / React 18+
-   Node.js version: v20.x (please follow the `.nvmrc` file settings)
-   Styling:
    -   Must use Tailwind CSS for styling.
    -   Writing traditional CSS files (`.css`, `.scss`) or using CSS-in-JS is prohibited in this project.
    -   Shared components should be placed in the `/components` directory.
-   Syntax:
    -   TypeScript's `Strict Mode` must be enabled.
    -   Use `ESLint` and `Prettier` for code formatting, following the project's configuration files.

## Pull Request

-   Title format: Follow the Conventional Commits specification (e.g., `feat: Add button component`).
-   Pre-commit check: Before submitting, you must run `pnpm lint` and `pnpm test` locally and ensure all checks pass.

Codex Privacy Statement

Business User Protection
By default, OpenAI does not use the input or output data from business users (including ChatGPT Business, ChatGPT Enterprise, and the API) to improve its models.
General User Settings
Conversation content from general users may be used to improve the models.
If you do not want your data to be used for training, please go to ChatGPT → Settings and turn off Improve the model for everyone.

How to disable ChatGPT data training

Conclusion

GPT-5 Codex signals a major shift: from “assistive” to fully agent-driven development. Its core value lies not just in the quality of the code it produces, but in its ability to plan and execute tasks autonomously. Through configurable permission modes and machine-readable instruction files like AGENTS.md, Codex turns steps in the development process that previously required manual intervention into automated workflows.

From a business model and market competition perspective, Codex’s greatest appeal is its integration with the existing ChatGPT subscription system, allowing paying users to avoid an additional GitHub Copilot subscription. Its initial usage limits are also relatively generous compared to other mature products (Plus: 30-150 messages per 5 hours; Pro and Business: 300-1,500 messages per 5 hours).

Based on my own experience, though, I would still recommend the more mature GitHub Copilot for its convenience and tighter integration into the development workflow. The market’s focus going forward will be on how OpenAI iterates on Codex’s user experience, narrows the convenience gap with competitors, and leverages the unique advantages of its agent framework. The developer community will keep watching its adjustments and feature expansions, anticipating the new possibilities it brings to automated development.