AI Benchmarks
Aggregated benchmark data from leading AI research sources.
LLM Benchmarks
Data: Artificial Analysis
Updated 12/16/2025
| # | Model | Intelligence | Coding | Math | Speed | Input $/1M | Output $/1M |
|---|---|---|---|---|---|---|---|
| 1 | Gemini 3 Pro Preview (high) | 72.8 | — | — | — | — | — |
| 2 | GPT-5.2 (xhigh) | 72.6 | — | — | — | — | — |
| 3 | Claude Opus 4.5 (Reasoning) | 69.8 | — | — | — | — | — |
| 4 | GPT-5.1 (high) | 69.7 | — | — | — | — | — |
| 5 | GPT-5 (high) | 68.5 | — | — | — | — | — |
| 6 | GPT-5.1 Codex (high) | 66.9 | — | — | — | — | — |
| 7 | GPT-5 (medium) | 66.4 | — | — | — | — | — |
| 8 | DeepSeek V3.2 (Reasoning) | 65.9 | — | — | — | — | — |
| 9 | o3 | 65.5 | — | — | — | — | — |
| 10 | Grok 4 | 65.3 | — | — | — | — | — |
| 11 | Gemini 3 Pro Preview (low) | 64.5 | — | — | — | — | — |
| 12 | GPT-5 mini (high) | 64.3 | — | — | — | — | — |
| 13 | Grok 4.1 Fast (Reasoning) | 64.1 | — | — | — | — | — |
| 14 | Claude 4.5 Sonnet (Reasoning) | 62.7 | — | — | — | — | — |
| 15 | Nova 2.0 Pro Preview (medium) | 62.4 | — | — | — | — | — |
| 16 | GPT-5.1 Codex mini (high) | 62.3 | — | — | — | — | — |
| 17 | GPT-5 (low) | 61.8 | — | — | — | — | — |
| 18 | MiniMax-M2 | 61.4 | — | — | — | — | — |
| 19 | GPT-5 mini (medium) | 60.8 | — | — | — | — | — |
| 20 | gpt-oss-120B (high) | 60.5 | — | — | — | — | — |
ARC-AGI Benchmark
Data: ARC Prize (arcprize.org)
No ARC Prize data available yet. Source is disabled.
SemiAnalysis Research
Source: SemiAnalysis (may require subscription)
No SemiAnalysis headlines available yet. Source is disabled.
Data is aggregated from external sources. Always verify figures against the original sources before making decisions.