GPT-5: Key Characteristics, Pricing and Model Card

Published: August 7, 2025

A detailed analysis of the GPT-5 model family, including its architecture, API models, pricing, and safety features like 'safe-completions', based on early access and the official system card.

Tags: llm-reasoning, llm-release, pelican-riding-a-bicycle

Video Preview

A video by Simon Willison providing a preview of the new GPT-5 model family, based on two weeks of early access.

Key Article Sections

Key Model Characteristics

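GPT-5 is a hybrid system that combines a fast "smart" model, a deeper reasoning model, and mini models. The API versions support four reasoning levels (minimal, low, medium, and high) and accept up to 272,000 input tokens and 128,000 output tokens.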

Pricing is Aggressively Competitive

GPT-5's pricing is highly competitive: the main model costs $1.25 per 1M input tokens, half the input cost of GPT-4o, and cached input tokens receive a 90% discount.

Prompt Injection Analysis

GPT-5 shows improvement, but with a 56.8% attack success rate still reported, prompt injection remains an unsolved problem.

Thinking Traces in the API

API users can access model "thinking traces" by including the "reasoning": {"summary": "auto"} parameter in the request body.

Safe-Completions

The system card details training data, improvements in reducing hallucinations and sycophancy, and the new 'safe-completions' safety approach.

SVGs of Pelicans

The author's benchmark test of generating an SVG of a pelican on a bicycle shows impressive results from all GPT-5 variants.

Models & Pricing

GPT-5 Model Family

GPT-5

The main model, with the highest competence and the strongest complex-reasoning capability.

Pricing:

Input: $1.25 per 1M tokens

Output: $10.00 per 1M tokens

Knowledge cutoff: September 30, 2024

Input limit: 272,000 tokens

Output limit: 128,000 tokens

GPT-5 Mini

Good balance of cost and performance for many tasks.

Pricing:

Input: $0.25 per 1M tokens

Output: $2.00 per 1M tokens

Knowledge cutoff: May 30, 2024

Input limit: 272,000 tokens

Output limit: 128,000 tokens

GPT-5 Nano

Fastest responses and lowest cost for simpler, high-volume tasks.

Pricing:

Input: $0.05 per 1M tokens

Output: $0.40 per 1M tokens

Knowledge cutoff: May 30, 2024

Input limit: 272,000 tokens

Output limit: 128,000 tokens

Competitor Pricing Comparison

Claude Opus 4.1

Input: $15.00 per 1M tokens

Output: $75.00 per 1M tokens

Gemini 2.5 Pro

Input: $1.25 per 1M tokens

Output: $10.00 per 1M tokens

(These prices apply to contexts under 200,000 tokens.)

Grok 4

Input: $3.00 per 1M tokens

Output: $15.00 per 1M tokens
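
To make the comparison concrete, the listed prices can be applied to a sample workload. The following is a minimal sketch in Python, assuming a hypothetical request of 100,000 input tokens and 10,000 output tokens. The prices are the ones listed above; the names PRICES and job_cost are illustrative only, and the sketch ignores caching discounts.

```python
# Prices in USD per 1M tokens (input, output), copied from the lists above.
PRICES = {
    "GPT-5": (1.25, 10.00),
    "Claude Opus 4.1": (15.00, 75.00),
    "Gemini 2.5 Pro": (1.25, 10.00),  # listed for contexts under 200,000 tokens
    "Grok 4": (3.00, 15.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at list price (no caching)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def compare(input_tokens: int, output_tokens: int) -> dict:
    """Return the cost of the same workload for every listed model."""
    return {name: job_cost(name, input_tokens, output_tokens) for name in PRICES}
```

Calling compare(100_000, 10_000) under these assumptions gives roughly $0.225 for GPT-5 and Gemini 2.5 Pro, $0.45 for Grok 4, and $2.25 for Claude Opus 4.1.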

Key Pricing Features

90% caching discount on input tokens that were used within the previous few minutes
Four reasoning levels: minimal, low, medium, and high
Competitive positioning: GPT-5 main model at half the input cost of GPT-4o
High token limits: 272,000 input / 128,000 output (including reasoning tokens)
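
The effect of the caching discount on a single request can be worked through with a little arithmetic. This is a sketch only: it assumes that cached input tokens are billed at 10% of the list price and that the rest of the input is billed in full, and the function name input_cost is hypothetical.

```python
def input_cost(total_input_tokens: int, cached_tokens: int,
               price_per_million: float, cache_discount: float = 0.90) -> float:
    """Return the cost in USD of the input side of one request.

    ASSUMPTION: cached tokens are billed at (1 - cache_discount) of the
    list price and the remaining tokens at the full list price.
    """
    if not 0 <= cached_tokens <= total_input_tokens:
        raise ValueError("cached_tokens must be between 0 and total_input_tokens")
    fresh_tokens = total_input_tokens - cached_tokens
    cached_price = price_per_million * (1 - cache_discount)
    total = fresh_tokens * price_per_million + cached_tokens * cached_price
    return total / 1_000_000
```

For example, with 200,000 input tokens at GPT-5's $1.25 rate and 180,000 of them served from cache, input_cost(200_000, 180_000, 1.25) comes to about $0.0475, compared with $0.25 when nothing is cached.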

How-To Guides

How to Access GPT-5 Thinking Traces via the API

A step-by-step guide to retrieve reasoning traces from the GPT-5 API using a curl command.

Step 1: Construct the API Request

Begin by constructing a curl command targeting the https://api.openai.com/v1/responses endpoint.

Step 2: Add Authorization and Content-Type Headers

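Include the necessary headers: Authorization: Bearer YOUR_API_KEY and Content-Type: application/json. The example command fills in the key with $(llm keys get openai), which assumes the LLM command-line tool is installed and has an OpenAI key stored; otherwise substitute your own key.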

Step 3: Define the Request Body

In the data payload, specify the model (e.g., "gpt-5"), your input prompt, and the crucial reasoning parameter: "reasoning": {"summary": "auto"}. The payload is JSON, so keys and string values must use double quotes.

curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $(llm keys get openai)" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "gpt-5",
  "input": "Give me a one-sentence fun fact about octopuses.",
  "reasoning": {"summary": "auto"}
  }'

Step 4: Execute and Inspect Response

Run the curl command. The JSON response contains the model's thinking traces, returned as reasoning summaries, alongside the final output.
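
The same request can be sketched in Python using only the standard library. The endpoint, headers, and body mirror the curl command above. The layout of the response is an assumption: the parsing helper expects an "output" list whose reasoning items carry a "summary" list of text parts, which should be checked against a real response. The function names are hypothetical.

```python
import json
import urllib.request
from typing import List

API_URL = "https://api.openai.com/v1/responses"


def build_request(api_key: str, prompt: str, model: str = "gpt-5") -> urllib.request.Request:
    """Build the same POST request as the curl command above."""
    body = {
        "model": model,
        "input": prompt,
        "reasoning": {"summary": "auto"},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def reasoning_summaries(response: dict) -> List[str]:
    """Collect summary text from reasoning items in a parsed response.

    ASSUMPTION: reasoning items look like
    {"type": "reasoning", "summary": [{"text": "..."}]}.
    """
    texts = []
    for item in response.get("output", []):
        if item.get("type") == "reasoning":
            for part in item.get("summary", []):
                if "text" in part:
                    texts.append(part["text"])
    return texts


def fetch_summaries(api_key: str, prompt: str) -> List[str]:
    """Send the request over the network and return any reasoning summaries."""
    with urllib.request.urlopen(build_request(api_key, prompt)) as reply:
        return reasoning_summaries(json.load(reply))
```

Calling fetch_summaries with a valid key and the octopus prompt sends one billable request. Keeping build_request and reasoning_summaries separate makes it possible to check both without network access.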

How to Choose the Right GPT-5 Model

A guide to selecting between gpt-5, gpt-5-mini, and gpt-5-nano based on cost, performance, and task complexity.

Assess Performance Needs

Choose the main GPT-5 model for the highest competence and complex reasoning. It is the most capable but also the most expensive.

Balance Cost and Performance

Select GPT-5 Mini for a good balance. It is cheaper than the main model while still being highly capable for many tasks.

Prioritize Speed and Low Cost

Use GPT-5 Nano for the fastest responses and lowest cost, ideal for simpler, high-volume tasks where top-tier reasoning is not required.
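
The three rules above can be written down as a small decision function. This is an illustrative sketch, not an official selection method: the two boolean flags are a hypothetical simplification of "task complexity" and "volume", and the function name choose_model is made up for this example. The model identifiers are the ones named in this guide.

```python
def choose_model(needs_complex_reasoning: bool, high_volume_simple_task: bool) -> str:
    """Map the guide's three rules to a model identifier.

    The flags are a hypothetical simplification; a real choice is better
    made by benchmarking each model on representative prompts.
    """
    if needs_complex_reasoning:
        return "gpt-5"       # highest competence, highest price
    if high_volume_simple_task:
        return "gpt-5-nano"  # fastest responses, lowest cost
    return "gpt-5-mini"      # balance of cost and performance
```

Complex reasoning takes priority here, so a task flagged as both complex and high-volume still maps to gpt-5. That ordering is a design choice of this sketch.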

Key Terminology

Safe-Completions

A safety-training approach that centers on the safety of the assistant's output rather than a binary classification of the user's intent.

Sycophancy

The tendency of a model to produce outputs that align with a user's perceived preferences or biases, which OpenAI has post-trained GPT-5 to reduce.

Thinking Traces

The internal reasoning steps a model takes to generate a final response, which the API can expose when the request includes the "reasoning": {"summary": "auto"} parameter.

Prompt Injection

An attack technique where malicious input is crafted to make an LLM ignore its original instructions and perform unintended actions.

Reasoning Effort

A parameter for GPT-5 API models that controls the amount of computational "thinking" the model performs before generating a response. It has four levels: minimal, low, medium, and high.
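
A request body that selects a reasoning level might be built as follows. This is a sketch under a stated assumption: the text names the four levels but not the field that carries them, so the "effort" key inside the "reasoning" object is an assumption, and the helper name request_body is hypothetical.

```python
REASONING_LEVELS = ("minimal", "low", "medium", "high")


def request_body(prompt: str, effort: str, model: str = "gpt-5") -> dict:
    """Return a request body that asks for a given reasoning level."""
    if effort not in REASONING_LEVELS:
        raise ValueError(f"effort must be one of {REASONING_LEVELS}, got {effort!r}")
    return {
        "model": model,
        "input": prompt,
        # ASSUMPTION: the field name "effort" is not given in the text above.
        "reasoning": {"effort": effort},
    }
```

Validating the level locally catches a misspelled value before any request is sent.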