Leaderboard
overall rankings
Updated weekly
Each leaderboard uses transparent weighted scoring, current model context, and supporting analysis to help teams interpret the results with confidence. Only full-profile entries appear in rankings; broader catalog records remain available elsewhere on the site when only source-backed metadata is currently available.
Ranked entries are models with enough metadata and scoring coverage to be meaningfully ranked in this category.
Scores combine benchmark evidence, product metadata, and cost/context signals when those fields are published.
Tracked models without full scoring remain in the directory and provider pages, but are not relied on for analytical ranking claims.
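The scoring and eligibility rules above can be sketched roughly as follows. This is a minimal illustration, not the leaderboard's actual formula: the signal names, the weights in `WEIGHTS`, and the `MIN_COVERAGE` threshold are all hypothetical, and each signal is assumed to be normalized to a 0-100 scale. Unpublished fields are passed as `None` and are left out of the weighted mean.

```python
from typing import Mapping, Optional

# Hypothetical weights (in percent); the real leaderboard weights are not given here.
WEIGHTS = {"benchmarks": 60, "metadata": 20, "cost_context": 20}
# Hypothetical minimum share of weight that must be published to rank a model.
MIN_COVERAGE = 80


def coverage(signals: Mapping[str, Optional[float]]) -> int:
    """Return the total weight (0-100) of the signals that are published."""
    return sum(w for name, w in WEIGHTS.items() if signals.get(name) is not None)


def weighted_score(signals: Mapping[str, Optional[float]]) -> Optional[int]:
    """Weighted mean of published signals, or None if coverage is too low."""
    covered = coverage(signals)
    if covered < MIN_COVERAGE:
        return None  # tracked in the directory, but not ranked
    total = sum(
        signals[name] * w
        for name, w in WEIGHTS.items()
        if signals.get(name) is not None
    )
    # Renormalize by the weight that is actually covered.
    return round(total / covered)
```

Under these assumed weights, a model missing only its metadata signal would still be scored from the remaining 80% of the weight, while a model missing benchmark evidence would return `None` and stay unranked.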
| Rank | Model | Provider | Score | Context |
|---|---|---|---|---|
| #1 | GPT-5.4 | OpenAI | 93 | 1,000,000 |
| #2 | Claude Sonnet 4.6 | Anthropic | 92 | 1,000,000 |
| #3 | Claude Opus 4.6 | Anthropic | 91 | 1,000,000 |
| #4 | Claude Sonnet 4.5 | Anthropic | 91 | 1,000,000 |
| #5 | Gemini 3.1 Pro | Google DeepMind | 91 | 1,048,576 |
| #6 | GPT-4o | OpenAI | 91 | 128,000 |
| #7 | GPT-5.2 | OpenAI | 91 | 1,000,000 |
| #8 | GPT-5.4 Pro | OpenAI | 91 | 1,000,000 |
| #9 | Claude 3.7 Sonnet | Anthropic | 90 | 200,000 |
| #10 | Gemini 2.5 Pro | Google DeepMind | 90 | 1,048,576 |
| #11 | GPT-5.2 Pro | OpenAI | 90 | 1,000,000 |
| #12 | Claude Opus 4.1 | Anthropic | 89 | 200,000 |
| #13 | Claude Sonnet 4 | Anthropic | 89 | 1,000,000 |
| #14 | Gemini 3.0 Pro | Google DeepMind | 89 | 1,048,576 |
| #15 | Gemini 3.1 Flash | Google DeepMind | 89 | 1,048,576 |
| #16 | GPT-5 | OpenAI | 89 | 1,000,000 |
| #17 | Gemini 2.5 Pro TTS | Google DeepMind | 88 | 1,048,576 |
| #18 | Claude Opus 4 | Anthropic | 87 | 200,000 |
| #19 | Gemini 2.5 Flash | Google DeepMind | 87 | 1,048,576 |
| #20 | Gemini 3.0 Flash | Google DeepMind | 87 | 1,048,576 |
| #21 | GPT-5.3-Codex | OpenAI | 87 | 1,000,000 |
| #22 | GPT-5 mini | OpenAI | 86 | 1,000,000 |
| #23 | Mistral Large 25 | Mistral AI | 86 | 128,000 |
| #24 | Gemini 2.0 Flash | Google DeepMind | 85 | 1,048,576 |
| #25 | GPT-4.1 | OpenAI | 85 | 1,048,576 |
| #26 | GPT-5.3 Instant | OpenAI | 85 | 1,000,000 |
| #27 | Claude Haiku 4.5 | Anthropic | 84 | 200,000 |
| #28 | Llama 4 Maverick | Meta | 84 | 1,048,576 |
| #29 | Gemini 2.5 Flash Live | Google DeepMind | 83 | 1,048,576 |
| #30 | Gemini 2.5 Flash Native Audio Preview | Google DeepMind | 83 | 1,048,576 |
| #31 | Gemini 2.5 Flash-Lite | Google DeepMind | 83 | 1,048,576 |
| #32 | Command R+ 2026 | Cohere | 82 | 128,000 |
| #33 | Gemini 1.5 Pro | Google DeepMind | 82 | 2,097,152 |
| #34 | Gemini 3.1 Flash-Lite | Google DeepMind | 82 | 1,048,576 |
| #35 | o3 | OpenAI | 81 | 200,000 |
| #36 | o4-mini | OpenAI | 81 | 200,000 |
| #37 | GPT-4.1 mini | OpenAI | 80 | 1,048,576 |
| #38 | o3-deep-research | OpenAI | 80 | 200,000 |
| #39 | o4-mini-deep-research | OpenAI | 80 | 200,000 |
| #40 | Claude Haiku 3.5 | Anthropic | 79 | 200,000 |
| #41 | Claude Opus 3 | Anthropic | 79 | 200,000 |
| #42 | Gemini 2.0 Flash-Lite | Google DeepMind | 79 | 1,048,576 |
| #43 | GPT-5 nano | OpenAI | 79 | 1,000,000 |
| #44 | Nova Pro | Amazon Web Services | 79 | 300,000 |
| #45 | Claude Sonnet 3 | Anthropic | 78 | 200,000 |
| #46 | Gemini 1.5 Flash | Google DeepMind | 78 | 1,048,576 |
| #47 | GPT-4o-mini | OpenAI | 78 | 128,000 |
| #48 | o1 | OpenAI | 75 | 200,000 |
| #49 | Sonar Reasoning Pro | Perplexity | 75 | 200,000 |
| #50 | Llama 3.3 70B Instruct | Meta | 74 | 128,000 |
| #51 | Nova Lite | Amazon Web Services | 74 | 300,000 |
| #52 | o1-mini | OpenAI | 74 | 128,000 |
| #53 | Sonar Pro | Perplexity | 74 | 200,000 |
| #54 | Gemini 1.5 Flash-8B | Google DeepMind | 73 | 1,048,576 |
| #55 | Llama 4 Scout | Meta | 73 | 10,485,760 |
| #56 | Claude Haiku 3 | Anthropic | 72 | 200,000 |
| #57 | Claude Haiku 3 | Anthropic | 72 | 200,000 |
| #58 | Claude Haiku 3.5 | Anthropic | 72 | 200,000 |
| #59 | Claude Haiku 4.5 | Anthropic | 72 | 200,000 |
| #60 | Codestral | Mistral AI | 72 | 256,000 |
| #61 | Codestral Embed | Mistral AI | 72 | 32,768 |
| #62 | CogVideoX | Z.AI | 72 | 128,000 |
| #63 | CogView 4 | Z.AI | 72 | 128,000 |
| #64 | Devstral Medium 1.0 | Mistral AI | 72 | 128,000 |
| #65 | Doubao-Seed-1.6 | ByteDance / Doubao | 72 | 128,000 |
| #66 | Doubao-Seed-1.6-Flash | ByteDance / Doubao | 72 | 128,000 |
| #67 | Doubao-Seed-2.0-Code | ByteDance / Doubao | 72 | 128,000 |
| #68 | Doubao-Seed-Code | ByteDance / Doubao | 72 | 128,000 |
| #69 | ERNIE 3.5 128K | Baidu / ERNIE | 72 | 128,000 |
| #70 | ERNIE 4.0 Turbo 8K | Baidu / ERNIE | 72 | 128,000 |
| #71 | ERNIE Functions 8K | Baidu / ERNIE | 72 | 128,000 |
| #72 | ERNIE Speed 128K | Baidu / ERNIE | 72 | 128,000 |
| #73 | GLM-4.5 | Z.AI | 72 | 128,000 |
| #74 | GLM-4.5V | Z.AI | 72 | 128,000 |
| #75 | GLM-4.6 | Z.AI | 72 | 128,000 |
| #76 | GLM-4.6V | Z.AI | 72 | 128,000 |
| #77 | GLM-4.7 | Z.AI | 72 | 128,000 |
| #78 | GLM-5 | Z.AI | 72 | 128,000 |
| #79 | GLM-Image | Z.AI | 72 | 128,000 |
| #80 | GLM-OCR | Z.AI | 72 | 128,000 |
| #81 | gpt-audio | OpenAI | 72 | 128,000 |
| #82 | gpt-oss-120b | OpenAI | 72 | 131,072 |
| #83 | gpt-realtime | OpenAI | 72 | 128,000 |
| #84 | Hunyuan Code | Tencent / Hunyuan | 72 | 128,000 |
| #85 | Hunyuan Lite | Tencent / Hunyuan | 72 | 128,000 |
| #86 | Hunyuan Standard | Tencent / Hunyuan | 72 | 128,000 |
| #87 | Hunyuan T1 | Tencent / Hunyuan | 72 | 256,000 |
| #88 | Hunyuan T1 Vision | Tencent / Hunyuan | 72 | 128,000 |
| #89 | Hunyuan TurboS | Tencent / Hunyuan | 72 | 128,000 |
| #90 | Hunyuan TurboS LongText 128K | Tencent / Hunyuan | 72 | 128,000 |
| #91 | Kimi K2 | Moonshot AI / Kimi | 72 | 131,072 |
| #92 | Kimi K2 Thinking | Moonshot AI / Kimi | 72 | 256,000 |
| #93 | Kimi K2 Turbo Preview | Moonshot AI / Kimi | 72 | 256,000 |
| #94 | Kimi K2.5 | Moonshot AI / Kimi | 72 | 256,000 |
| #95 | Magistral Medium 1.2 | Mistral AI | 72 | 128,000 |
| #96 | MiniMax-M1 | MiniMax | 72 | 204,800 |
| #97 | MiniMax-M2 | MiniMax | 72 | 204,800 |
| #98 | MiniMax-M2.1 | MiniMax | 72 | 204,800 |
| #99 | MiniMax-M2.1-highspeed | MiniMax | 72 | 204,800 |
| #100 | MiniMax-M2.5 | MiniMax | 72 | 204,800 |
| #101 | MiniMax-M2.5-highspeed | MiniMax | 72 | 204,800 |
| #102 | MiniMax-Text-01 | MiniMax | 72 | 204,800 |
| #103 | MiniMax-VL-01 | MiniMax | 72 | 204,800 |
| #104 | Mistral Embed | Mistral AI | 72 | 32,768 |
| #105 | Mistral Large 3 | Mistral AI | 72 | 128,000 |
| #106 | Mistral Medium 3.1 | Mistral AI | 72 | 128,000 |
| #107 | Mistral Moderation | Mistral AI | 72 | 32,768 |
| #108 | Mistral Small 3.1 | Mistral AI | 72 | 128,000 |
| #109 | Mistral Small 3.2 Open | Mistral AI | 72 | 128,000 |
| #110 | Pixtral 12B | Mistral AI | 72 | 131,072 |
| #111 | Pixtral Large | Mistral AI | 72 | 131,072 |
| #112 | Sonar Deep Research | Perplexity | 72 | 200,000 |
| #113 | Vidu Q1 | Z.AI | 72 | 128,000 |
| #114 | Voxtral Mini Open | Mistral AI | 72 | 131,072 |
| #115 | Voxtral Mini Transcribe | Mistral AI | 72 | 131,072 |
| #116 | Voxtral Small Open | Mistral AI | 72 | 131,072 |
| #117 | Claude Sonnet 3 | Anthropic | 71 | 200,000 |
| #118 | Claude Sonnet 4 | Anthropic | 71 | 1,000,000 |
| #119 | Command A | Cohere | 71 | 256,000 |
| #120 | Command A Reasoning | Cohere | 71 | 128,000 |
| #121 | Command A Translate | Cohere | 71 | 128,000 |
| #122 | Command A Vision | Cohere | 71 | 128,000 |
| #123 | Command R+ | Cohere | 71 | 128,000 |
| #124 | Command R7B | Cohere | 71 | 128,000 |
| #125 | DeepSeek-Coder-V2 | DeepSeek | 71 | 128,000 |
| #126 | DeepSeek-Math-V2 | DeepSeek | 71 | 128,000 |
| #127 | DeepSeek-R1 | DeepSeek | 71 | 128,000 |
| #128 | DeepSeek-R1-Distill-Llama-70B | DeepSeek | 71 | 128,000 |
| #129 | DeepSeek-V2.5 | DeepSeek | 71 | 128,000 |
| #130 | DeepSeek-V3 | DeepSeek | 71 | 128,000 |
| #131 | DeepSeek-V3.1 | DeepSeek | 71 | 128,000 |
| #132 | DeepSeek-V3.1-Base | DeepSeek | 71 | 128,000 |
| #133 | DeepSeek-V3.2 | DeepSeek | 71 | 128,000 |
| #134 | DeepSeek-V3.2-Exp | DeepSeek | 71 | 128,000 |
| #135 | Devstral 2 Open | Mistral AI | 71 | 128,000 |
| #136 | Devstral Small 2 | Mistral AI | 71 | 128,000 |
| #137 | Embed 4 | Cohere | 71 | 128,000 |
| #138 | ERNIE 4.5 Turbo 32K | Baidu / ERNIE | 71 | 32,768 |
| #139 | Grok 3 | xAI | 71 | 131,072 |
| #140 | Grok 3 Mini | xAI | 71 | 131,072 |
| #141 | Grok 4 | xAI | 71 | 256,000 |
| #142 | Grok 4 Fast Reasoning | xAI | 71 | 131,072 |
| #143 | grok-image | xAI | 71 | 131,072 |
| #144 | Jamba 3B | AI21 Labs | 71 | 256,000 |
| #145 | Jamba Large | AI21 Labs | 71 | 256,000 |
| #146 | Jamba Large 1.6 | AI21 Labs | 71 | 256,000 |
| #147 | Jamba Mini | AI21 Labs | 71 | 256,000 |
| #148 | Jamba Mini 1.6 | AI21 Labs | 71 | 256,000 |
| #149 | Jamba Mini 1.7 | AI21 Labs | 71 | 256,000 |
| #150 | Magistral Small 1.2 Open | Mistral AI | 71 | 128,000 |
| #151 | Ministral 3 14B Open | Mistral AI | 71 | 128,000 |
| #152 | Ministral 3 3B Open | Mistral AI | 71 | 128,000 |
| #153 | Ministral 3 8B Open | Mistral AI | 71 | 128,000 |
| #154 | Mistral Large 3 Open | Mistral AI | 71 | 128,000 |
| #155 | Mistral Nemo 12B | Mistral AI | 71 | 128,000 |
| #156 | Mistral OCR 2505 | Mistral AI | 71 | 32,768 |
| #157 | morph-v3-fast-apply | Morph | 71 | 128,000 |
| #158 | Phi-3-vision-128k-instruct | Microsoft | 71 | 128,000 |
| #159 | Phi-3.5-mini-instruct | Microsoft | 71 | 131,072 |
| #160 | Phi-3.5-MoE-instruct | Microsoft | 71 | 131,072 |
| #161 | Phi-3.5-vision-instruct | Microsoft | 71 | 131,072 |
| #162 | Phi-4-mini-flash-reasoning | Microsoft | 71 | 131,072 |
| #163 | Phi-4-mini-instruct | Microsoft | 71 | 131,072 |
| #164 | Phi-4-multimodal-instruct | Microsoft | 71 | 131,072 |
| #165 | Phi-4-reasoning | Microsoft | 71 | 131,072 |
| #166 | Phi-4-reasoning-plus | Microsoft | 71 | 131,072 |
| #167 | Phi-4-reasoning-vision-15B | Microsoft | 71 | 131,072 |
| #168 | Qwen2.5-1.5B-Instruct | Alibaba Qwen | 71 | 131,072 |
| #169 | Qwen2.5-14B-Instruct | Alibaba Qwen | 71 | 131,072 |
| #170 | Qwen2.5-32B-Instruct | Alibaba Qwen | 71 | 131,072 |
| #171 | Qwen2.5-3B-Instruct | Alibaba Qwen | 71 | 131,072 |
| #172 | Qwen2.5-72B-Instruct | Alibaba Qwen | 71 | 131,072 |
| #173 | Qwen2.5-7B-Instruct | Alibaba Qwen | 71 | 131,072 |
| #174 | Qwen2.5-Max | Alibaba Qwen | 71 | 131,072 |
| #175 | Qwen2.5-VL-72B-Instruct | Alibaba Qwen | 71 | 131,072 |
| #176 | Qwen2.5-VL-7B-Instruct | Alibaba Qwen | 71 | 131,072 |
| #177 | Qwen3-Coder-Next | Alibaba Qwen | 71 | 131,072 |
| #178 | Qwen3.5-0.8B | Alibaba Qwen | 71 | 131,072 |
| #179 | Qwen3.5-122B-A10B | Alibaba Qwen | 71 | 131,072 |
| #180 | Qwen3.5-27B | Alibaba Qwen | 71 | 131,072 |
| #181 | Qwen3.5-2B | Alibaba Qwen | 71 | 131,072 |
| #182 | Qwen3.5-35B-A3B | Alibaba Qwen | 71 | 131,072 |
| #183 | Qwen3.5-397B-A17B | Alibaba Qwen | 71 | 131,072 |
| #184 | Qwen3.5-4B | Alibaba Qwen | 71 | 131,072 |
| #185 | Qwen3.5-9B | Alibaba Qwen | 71 | 131,072 |
| #186 | Sonar | Perplexity | 71 | 128,000 |
| #187 | warpgrep-v2 | Morph | 71 | 128,000 |
| #188 | DeepSeek-OCR | DeepSeek | 70 | 16,384 |
| #189 | DeepSeek-OCR-2 | DeepSeek | 70 | 16,384 |
| #190 | DeepSeek-VL2-Small | DeepSeek | 70 | 16,384 |
| #191 | image-01 | MiniMax | 70 | 8,192 |
| #192 | image-01-live | MiniMax | 70 | 8,192 |
| #193 | Janus-Pro-7B | DeepSeek | 70 | 16,384 |
| #194 | Llama 3.1 405B Instruct | Meta | 70 | 128,000 |
| #195 | Llama 3.1 70B Instruct | Meta | 70 | 128,000 |
| #196 | MiniMax-Speech-02 | MiniMax | 70 | 8,192 |
| #197 | music-2.0 | MiniMax | 70 | 8,192 |
| #198 | Phi-4 | Microsoft | 70 | 16,384 |
| #199 | Step-3.5-Flash | StepFun | 70 | 131,072 |
| #200 | Claude Opus 3 | Anthropic | 69 | 200,000 |
| #201 | Claude Opus 4 | Anthropic | 69 | 200,000 |
| #202 | Llama 3.2 90B Vision Instruct | Meta | 69 | 128,000 |
| #203 | phi-1 | Microsoft | 69 | 4,096 |
| #204 | phi-1_5 | Microsoft | 69 | 4,096 |
| #205 | phi-2 | Microsoft | 69 | 4,096 |
| #206 | Phi-3-medium-4k-instruct | Microsoft | 69 | 4,096 |
| #207 | Phi-3-mini-4k-instruct | Microsoft | 69 | 4,096 |
| #208 | Phi-tiny-MoE-instruct | Microsoft | 69 | 4,096 |
| #209 | flash-compact | Morph | 68 | 200,000 |
| #210 | GPT Image 1 | OpenAI | 67 | 32,768 |
| #211 | Nova Micro | Amazon Web Services | 67 | 128,000 |
| #212 | Step3-VL-10B | StepFun | 67 | 131,072 |
| #213 | gpt-audio-mini | OpenAI | 66 | 128,000 |
| #214 | gpt-realtime-mini | OpenAI | 66 | 128,000 |
| #215 | MiMo-VL-7B | Xiaomi | 66 | 131,072 |
| #216 | chatgpt-image-latest | OpenAI | 65 | 32,768 |
| #217 | gpt-image-1-mini | OpenAI | 65 | 32,768 |
| #218 | gpt-oss-20b | OpenAI | 65 | 131,072 |
| #219 | Llama 3.1 8B Instruct | Meta | 64 | 128,000 |
| #220 | Llama 3.2 11B Vision Instruct | Meta | 64 | 128,000 |
| #221 | Llama 3.2 1B Instruct | Meta | 61 | 128,000 |
| #222 | Llama 3.2 3B Instruct | Meta | 61 | 128,000 |
| #223 | MiMo-Audio-7B | Xiaomi | 61 | 131,072 |
| #224 | Code Llama 70B Instruct | Meta | 60 | 8,192 |
| #225 | LFM2-24B-A2B | Liquid AI | 60 | 32,768 |
| #226 | Llama Guard 4 12B | Meta | 60 | 131,072 |
| #227 | Meta Llama 3 70B Instruct | Meta | 60 | 8,192 |
| #228 | Step-Audio-R1.1 | StepFun | 59 | 131,072 |
| #229 | GPT-4o mini Transcribe | OpenAI | 58 | 128,000 |
| #230 | GPT-4o mini TTS | OpenAI | 58 | 128,000 |
| #231 | GPT-4o Transcribe | OpenAI | 58 | 128,000 |
| #232 | Code Llama 34B Instruct | Meta | 57 | 8,192 |
| #233 | Llama Guard 3 11B Vision | Meta | 57 | 131,072 |
| #234 | Meta Llama 3 8B Instruct | Meta | 57 | 8,192 |
| #235 | LFM2-8B-A1B | Liquid AI | 55 | 32,768 |
| #236 | FLUX 1.1 Pro | Black Forest Labs | 51 | 512 |
| #237 | FLUX 1.1 Pro Ultra | Black Forest Labs | 51 | 512 |
| #238 | FLUX 1 Pro | Black Forest Labs | 49 | 512 |
| #239 | LFM2.5-1.2B-Instruct | Liquid AI | 49 | 131,072 |
| #240 | LFM2.5-1.2B-Thinking | Liquid AI | 49 | 131,072 |
| #241 | LFM2-2.6B | Liquid AI | 47 | 32,768 |
| #242 | pplx-embed-v1-4b | Perplexity | 44 | 8,192 |
| #243 | pplx-embed-v1-0.6b | Perplexity | 43 | 8,192 |
| #244 | NextStep-1.1 | StepFun | 42 | 512 |
| #245 | Prompt Guard 86M | Meta | 41 | 512 |
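
The ordering in the table above appears to be score descending, with ties broken alphabetically by model name regardless of case. This is an inference from the rows themselves, not a documented rule. A minimal sketch of that ordering, using a hypothetical `Entry` type and `rank` helper:

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    model: str
    provider: str
    score: int


def rank(entries: List[Entry]) -> List[Tuple[int, Entry]]:
    """Order by score (high to low), then by case-insensitive model name."""
    ordered = sorted(entries, key=lambda e: (-e.score, e.model.casefold()))
    # Ranks are sequential positions; tied scores do not share a rank.
    return list(enumerate(ordered, start=1))


sample = [
    Entry("GPT-5.4 Pro", "OpenAI", 91),
    Entry("Claude Opus 4.6", "Anthropic", 91),
    Entry("GPT-5.4", "OpenAI", 93),
    Entry("GPT-4o", "OpenAI", 91),
    Entry("Claude Sonnet 4.6", "Anthropic", 92),
    Entry("Gemini 3.1 Pro", "Google DeepMind", 91),
]
ranked_names = [entry.model for _, entry in rank(sample)]
```

Because ranks are assigned by position, models with identical scores (for example, the many entries at 72 and 71) still receive distinct rank numbers, so small rank differences within a tied group carry no analytical weight.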
Why #1: GPT-5.4
GPT-5.4 is OpenAI's most capable and efficient frontier model for professional work. The first general-purpose model with native computer-use capabilities, it combines the industry-leading coding of GPT-5.3-Codex with improved agentic workflows.
This model meets the current full-profile threshold in the leaderboard methodology.
Why #2: Claude Sonnet 4.6
Claude Sonnet 4.6 is Anthropic's current Sonnet tier for fast frontier reasoning, coding, and long-context agent work.
This model meets the current full-profile threshold in the leaderboard methodology.
Why #3: Claude Opus 4.6
Claude Opus 4.6 is Anthropic's most intelligent Claude model for complex agents, coding, and deep reasoning, with a 1M-token context window and a 128K-token output limit.
This model meets the current full-profile threshold in the leaderboard methodology.