# LLM Pricing Data Reference

Last Updated: 2025-06

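All prices are in USD per 1 million tokens unless noted. Context is the maximum context window in tokens (K = thousand, M = million). A "-" in the Cached Input column means no cached rate is listed for that model.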
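
As a rough illustration of how per-million-token prices translate into a per-request cost, here is a minimal sketch. The function name `request_cost` and the token counts are hypothetical; the example prices are the gpt-4.1 rates listed in the OpenAI table in this document.

```python
TOKENS_PER_UNIT = 1_000_000  # prices in this document are quoted per 1 million tokens


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Return the estimated USD cost of one request.

    input_price and output_price are USD per 1 million tokens.
    """
    return (input_tokens * input_price + output_tokens * output_price) / TOKENS_PER_UNIT


# Hypothetical request: 10,000 input tokens and 2,000 output tokens
# at the gpt-4.1 rates from the table ($2.00 input, $8.00 output).
cost = request_cost(10_000, 2_000, input_price=2.00, output_price=8.00)
print(f"${cost:.4f}")  # prints "$0.0360"
```

This ignores cached-input rates, batch discounts, and any provider-specific billing rules.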

## Anthropic (Claude)

| Model | Input | Output | Cached Input | Context |
|-------|-------|--------|--------------|---------|
| claude-opus-4 | $15.00 | $75.00 | $1.50 | 200K |
| claude-sonnet-4 | $3.00 | $15.00 | $0.30 | 200K |
| claude-haiku-4 | $0.80 | $4.00 | $0.08 | 200K |
| claude-3.5-sonnet | $3.00 | $15.00 | $0.30 | 200K |
| claude-3-opus | $15.00 | $75.00 | $1.50 | 200K |
| claude-3-sonnet | $3.00 | $15.00 | $0.30 | 200K |
| claude-3-haiku | $0.25 | $1.25 | $0.03 | 200K |

## OpenAI

| Model | Input | Output | Cached Input | Context |
|-------|-------|--------|--------------|---------|
| gpt-4.1 | $2.00 | $8.00 | $0.50 | 1M |
| gpt-4.1-mini | $0.40 | $1.60 | $0.10 | 1M |
| gpt-4.1-nano | $0.10 | $0.40 | $0.025 | 1M |
| gpt-4o | $2.50 | $10.00 | $1.25 | 128K |
| gpt-4o-mini | $0.15 | $0.60 | $0.075 | 128K |
| gpt-4-turbo | $10.00 | $30.00 | - | 128K |
| gpt-3.5-turbo | $0.50 | $1.50 | - | 16K |
| o1 | $15.00 | $60.00 | $7.50 | 200K |
| o1-mini | $3.00 | $12.00 | $1.50 | 128K |
| o3-mini | $1.10 | $4.40 | $0.55 | 200K |

## Google (Gemini)

| Model | Input | Output | Cached Input | Context |
|-------|-------|--------|--------------|---------|
| gemini-2.5-pro | $1.25 | $10.00 | $0.31 | 1M |
| gemini-2.5-flash | $0.15 | $0.60 | $0.0375 | 1M |
| gemini-2.5-flash-lite | $0.075 | $0.30 | $0.01875 | 1M |
| gemini-2.0-flash | $0.10 | $0.40 | $0.025 | 1M |
| gemini-1.5-pro | $1.25 | $5.00 | $0.31 | 2M |
| gemini-1.5-flash | $0.075 | $0.30 | $0.01875 | 1M |
| gemini-1.5-flash-8b | $0.0375 | $0.15 | $0.01 | 1M |

## Mistral

| Model | Input | Output | Context |
|-------|-------|--------|---------|
| mistral-large | $2.00 | $6.00 | 128K |
| mistral-medium | $2.70 | $8.10 | 32K |
| mistral-small | $0.20 | $0.60 | 32K |
| codestral | $0.30 | $0.90 | 32K |
| ministral-8b | $0.10 | $0.10 | 128K |
| ministral-3b | $0.04 | $0.04 | 128K |
| pixtral-large | $2.00 | $6.00 | 128K |

## Meta (Llama via Providers)

Typical hosted pricing (varies by provider):

| Model | Input | Output | Context |
|-------|-------|--------|---------|
| llama-3.3-70b | $0.40 | $0.40 | 128K |
| llama-3.1-405b | $3.00 | $3.00 | 128K |
| llama-3.1-70b | $0.35 | $0.40 | 128K |
| llama-3.1-8b | $0.05 | $0.08 | 128K |

## xAI (Grok)

| Model | Input | Output | Context |
|-------|-------|--------|---------|
| grok-3 | $3.00 | $15.00 | 128K |
| grok-3-mini | $0.30 | $0.50 | 128K |
| grok-2 | $2.00 | $10.00 | 128K |

## DeepSeek

| Model | Input | Output | Cached Input | Context |
|-------|-------|--------|--------------|---------|
| deepseek-v3 | $0.27 | $1.10 | $0.07 | 64K |
| deepseek-r1 | $0.55 | $2.19 | $0.14 | 64K |
| deepseek-coder | $0.14 | $0.28 | - | 64K |

## Amazon Bedrock (Additional)

| Model | Input | Output | Context |
|-------|-------|--------|---------|
| amazon-nova-pro | $0.80 | $3.20 | 300K |
| amazon-nova-lite | $0.06 | $0.24 | 300K |
| amazon-nova-micro | $0.035 | $0.14 | 128K |

## Cost Tiers Summary

### Budget Tier (< $0.50/1M input)
- amazon-nova-micro ($0.035)
- gemini-1.5-flash-8b ($0.0375)
- ministral-3b ($0.04)
- llama-3.1-8b ($0.05)
- amazon-nova-lite ($0.06)
- gemini-1.5-flash ($0.075)
- gemini-2.5-flash-lite ($0.075)
- gpt-4.1-nano ($0.10)
- gemini-2.0-flash ($0.10)
- ministral-8b ($0.10)
- deepseek-coder ($0.14)
- gpt-4o-mini ($0.15)
- gemini-2.5-flash ($0.15)
- mistral-small ($0.20)
- claude-3-haiku ($0.25)
- deepseek-v3 ($0.27)
- codestral ($0.30)
- grok-3-mini ($0.30)
- llama-3.1-70b ($0.35)
- llama-3.3-70b ($0.40)
- gpt-4.1-mini ($0.40)

### Mid Tier ($0.50 - $3.00/1M input)
- gpt-3.5-turbo ($0.50)
- deepseek-r1 ($0.55)
- claude-haiku-4 ($0.80)
- amazon-nova-pro ($0.80)
- o3-mini ($1.10)
- gemini-2.5-pro ($1.25)
- gemini-1.5-pro ($1.25)
- mistral-large ($2.00)
- pixtral-large ($2.00)
- gpt-4.1 ($2.00)
- grok-2 ($2.00)
- gpt-4o ($2.50)
- mistral-medium ($2.70)
- o1-mini ($3.00)
- claude-sonnet-4 ($3.00)
- claude-3.5-sonnet ($3.00)
- claude-3-sonnet ($3.00)
- llama-3.1-405b ($3.00)
- grok-3 ($3.00)

### Premium Tier (> $3.00/1M input)
- gpt-4-turbo ($10.00)
- o1 ($15.00)
- claude-opus-4 ($15.00)
- claude-3-opus ($15.00)
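
The tier boundaries above can be expressed as a small classifier. This is a sketch only: the function name `cost_tier` and the tier labels are hypothetical, and the thresholds are taken directly from the tier headings (budget below $0.50, mid from $0.50 through $3.00 inclusive, premium above $3.00).

```python
def cost_tier(input_price: float) -> str:
    """Classify a model by its input price in USD per 1 million tokens."""
    if input_price < 0.50:
        return "budget"
    if input_price <= 3.00:
        return "mid"
    return "premium"


# A few input prices copied from the tables in this document.
sample_prices = {
    "gemini-1.5-flash-8b": 0.0375,
    "gpt-3.5-turbo": 0.50,
    "claude-sonnet-4": 3.00,
    "gpt-4-turbo": 10.00,
}

for model, price in sample_prices.items():
    print(f"{model}: {cost_tier(price)}")
```

Note that the tiers consider input price only. A model with a low input price and a high output price may cost more in practice for output-heavy workloads.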

## Notes

1. **Cached Input**: Discounted rate for input tokens served from a prompt cache (repeated context)
2. **Batch API**: Many providers offer a 50% discount for asynchronous batch processing
3. **Volume Discounts**: Enterprise agreements may reduce prices by 20-40%
4. **Free Tiers**: Most providers offer limited free usage for testing
5. **Regional Pricing**: Some providers charge different rates by region
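
To illustrate how the cached-input rate and a batch discount might affect a request's cost, here is a hedged sketch. The function name `effective_cost`, the token counts, and the way the discounts are combined are all assumptions: it bills cached tokens at the cached rate, all other input at the regular rate, and applies the batch discount to the total. Whether a given provider actually stacks these discounts, or charges extra for cache writes, is not covered by this document.

```python
def effective_cost(fresh_input_tokens: int, cached_input_tokens: int,
                   output_tokens: int, input_price: float,
                   cached_price: float, output_price: float,
                   batch_discount: float = 0.0) -> float:
    """Estimate USD cost with cached input and an optional batch discount.

    Prices are USD per 1 million tokens. batch_discount is a fraction
    (0.5 means 50% off). Applying it to the whole total is an assumption.
    """
    total = (fresh_input_tokens * input_price
             + cached_input_tokens * cached_price
             + output_tokens * output_price) / 1_000_000
    return total * (1.0 - batch_discount)


# Hypothetical request at the claude-sonnet-4 rates from the table
# ($3.00 input, $0.30 cached input, $15.00 output).
standard = effective_cost(20_000, 80_000, 5_000, 3.00, 0.30, 15.00)
batched = effective_cost(20_000, 80_000, 5_000, 3.00, 0.30, 15.00,
                         batch_discount=0.5)
print(f"${standard:.4f}")  # prints "$0.1590"
print(f"${batched:.4f}")   # prints "$0.0795"
```

Without caching, the same 100,000 input tokens would cost $0.30 at the regular input rate, so most of the saving in this example comes from the cached portion.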
