Leaderboard
Open-source rankings
Updated weekly. Each leaderboard uses transparent weighted scoring, current model context, and supporting analysis to help teams interpret the results with confidence. Only full-profile entries appear in the rankings; broader catalog records that currently have only source-backed metadata remain available elsewhere on the site.
Models with complete enough metadata and scoring coverage to be meaningfully ranked in this category.
Scores combine benchmark evidence, product metadata, and cost/context signals when those fields are published.
Tracked models without full scoring remain in the directory and on provider pages, but they are excluded from analytical ranking claims.
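The weighted-scoring idea above can be sketched in a few lines. The weights, field names, and rounding below are illustrative assumptions for this sketch, not the leaderboard's actual formula:

```python
# Hypothetical sketch of combining sub-scores into one ranking score.
# The three components mirror the signals described above: benchmark
# evidence, product metadata, and cost/context. Weights are assumptions.

def composite_score(benchmark: float, metadata: float, cost_context: float,
                    weights: tuple[float, float, float] = (0.6, 0.25, 0.15)) -> int:
    """Combine 0-100 sub-scores into a single 0-100 ranking score."""
    w_bench, w_meta, w_cost = weights
    raw = w_bench * benchmark + w_meta * metadata + w_cost * cost_context
    return round(raw)

# A model with uniform sub-scores keeps that score regardless of weights.
print(composite_score(80, 80, 80))
```

A real pipeline would also need to handle missing fields (e.g. renormalizing weights when a cost signal is unpublished), which is why only full-profile entries are ranked here.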
| Rank | Model | Provider | Score | Context |
|---|---|---|---|---|
| #1 | Llama 4 Maverick | Meta | 91 | 1,048,576 |
| #2 | gpt-oss-120b | OpenAI | 85 | 131,072 |
| #3 | Llama 3.3 70B Instruct | Meta | 85 | 128,000 |
| #4 | Step-3.5-Flash | StepFun | 84 | 131,072 |
| #5 | Llama 3.1 70B Instruct | Meta | 82 | 128,000 |
| #6 | Llama 4 Scout | Meta | 82 | 10,485,760 |
| #7 | Pixtral 12B | Mistral AI | 82 | 131,072 |
| #8 | Voxtral Mini Open | Mistral AI | 82 | 131,072 |
| #9 | Voxtral Small Open | Mistral AI | 82 | 131,072 |
| #10 | DeepSeek-Coder-V2 | DeepSeek | 81 | 128,000 |
| #11 | DeepSeek-Math-V2 | DeepSeek | 81 | 128,000 |
| #12 | DeepSeek-R1 | DeepSeek | 81 | 128,000 |
| #13 | DeepSeek-R1-Distill-Llama-70B | DeepSeek | 81 | 128,000 |
| #14 | DeepSeek-V2.5 | DeepSeek | 81 | 128,000 |
| #15 | DeepSeek-V3 | DeepSeek | 81 | 128,000 |
| #16 | DeepSeek-V3.1 | DeepSeek | 81 | 128,000 |
| #17 | DeepSeek-V3.1-Base | DeepSeek | 81 | 128,000 |
| #18 | DeepSeek-V3.2 | DeepSeek | 81 | 128,000 |
| #19 | DeepSeek-V3.2-Exp | DeepSeek | 81 | 128,000 |
| #20 | Devstral 2 Open | Mistral AI | 81 | 128,000 |
| #21 | Devstral Small 2 | Mistral AI | 81 | 128,000 |
| #22 | Jamba 3B | AI21 Labs | 81 | 256,000 |
| #23 | Jamba Large | AI21 Labs | 81 | 256,000 |
| #24 | Jamba Large 1.6 | AI21 Labs | 81 | 256,000 |
| #25 | Jamba Mini | AI21 Labs | 81 | 256,000 |
| #26 | Jamba Mini 1.6 | AI21 Labs | 81 | 256,000 |
| #27 | Jamba Mini 1.7 | AI21 Labs | 81 | 256,000 |
| #28 | Llama 3.1 405B Instruct | Meta | 81 | 128,000 |
| #29 | Magistral Small 1.2 Open | Mistral AI | 81 | 128,000 |
| #30 | Ministral 3 14B Open | Mistral AI | 81 | 128,000 |
| #31 | Ministral 3 3B Open | Mistral AI | 81 | 128,000 |
| #32 | Ministral 3 8B Open | Mistral AI | 81 | 128,000 |
| #33 | Mistral Large 3 Open | Mistral AI | 81 | 128,000 |
| #34 | Mistral Nemo 12B | Mistral AI | 81 | 128,000 |
| #35 | Phi-3-vision-128k-instruct | Microsoft | 81 | 128,000 |
| #36 | Phi-3.5-mini-instruct | Microsoft | 81 | 131,072 |
| #37 | Phi-3.5-MoE-instruct | Microsoft | 81 | 131,072 |
| #38 | Phi-3.5-vision-instruct | Microsoft | 81 | 131,072 |
| #39 | Phi-4-mini-flash-reasoning | Microsoft | 81 | 131,072 |
| #40 | Phi-4-mini-instruct | Microsoft | 81 | 131,072 |
| #41 | Phi-4-multimodal-instruct | Microsoft | 81 | 131,072 |
| #42 | Phi-4-reasoning | Microsoft | 81 | 131,072 |
| #43 | Phi-4-reasoning-plus | Microsoft | 81 | 131,072 |
| #44 | Phi-4-reasoning-vision-15B | Microsoft | 81 | 131,072 |
| #45 | Qwen2.5-1.5B-Instruct | Alibaba Qwen | 81 | 131,072 |
| #46 | Qwen2.5-14B-Instruct | Alibaba Qwen | 81 | 131,072 |
| #47 | Qwen2.5-32B-Instruct | Alibaba Qwen | 81 | 131,072 |
| #48 | Qwen2.5-3B-Instruct | Alibaba Qwen | 81 | 131,072 |
| #49 | Qwen2.5-72B-Instruct | Alibaba Qwen | 81 | 131,072 |
| #50 | Qwen2.5-7B-Instruct | Alibaba Qwen | 81 | 131,072 |
| #51 | Qwen2.5-Max | Alibaba Qwen | 81 | 131,072 |
| #52 | Qwen2.5-VL-72B-Instruct | Alibaba Qwen | 81 | 131,072 |
| #53 | Qwen2.5-VL-7B-Instruct | Alibaba Qwen | 81 | 131,072 |
| #54 | Qwen3-Coder-Next | Alibaba Qwen | 81 | 131,072 |
| #55 | Qwen3.5-0.8B | Alibaba Qwen | 81 | 131,072 |
| #56 | Qwen3.5-122B-A10B | Alibaba Qwen | 81 | 131,072 |
| #57 | Qwen3.5-27B | Alibaba Qwen | 81 | 131,072 |
| #58 | Qwen3.5-2B | Alibaba Qwen | 81 | 131,072 |
| #59 | Qwen3.5-35B-A3B | Alibaba Qwen | 81 | 131,072 |
| #60 | Qwen3.5-397B-A17B | Alibaba Qwen | 81 | 131,072 |
| #61 | Qwen3.5-4B | Alibaba Qwen | 81 | 131,072 |
| #62 | Qwen3.5-9B | Alibaba Qwen | 81 | 131,072 |
| #63 | DeepSeek-OCR | DeepSeek | 79 | 16,384 |
| #64 | DeepSeek-OCR-2 | DeepSeek | 79 | 16,384 |
| #65 | DeepSeek-VL2-Small | DeepSeek | 79 | 16,384 |
| #66 | gpt-oss-20b | OpenAI | 79 | 131,072 |
| #67 | Janus-Pro-7B | DeepSeek | 79 | 16,384 |
| #68 | Phi-4 | Microsoft | 79 | 16,384 |
| #69 | Llama 3.1 8B Instruct | Meta | 78 | 128,000 |
| #70 | phi-1 | Microsoft | 78 | 4,096 |
| #71 | phi-1_5 | Microsoft | 78 | 4,096 |
| #72 | phi-2 | Microsoft | 78 | 4,096 |
| #73 | Phi-3-medium-4k-instruct | Microsoft | 78 | 4,096 |
| #74 | Phi-3-mini-4k-instruct | Microsoft | 78 | 4,096 |
| #75 | Phi-tiny-MoE-instruct | Microsoft | 78 | 4,096 |
| #76 | LFM2-24B-A2B | Liquid AI | 76 | 32,768 |
| #77 | Llama 3.2 90B Vision Instruct | Meta | 76 | 128,000 |
| #78 | Code Llama 70B Instruct | Meta | 75 | 8,192 |
| #79 | Llama 3.2 1B Instruct | Meta | 75 | 128,000 |
| #80 | Llama 3.2 3B Instruct | Meta | 75 | 128,000 |
| #81 | Meta Llama 3 70B Instruct | Meta | 75 | 8,192 |
| #82 | Step3-VL-10B | StepFun | 75 | 131,072 |
| #83 | Llama 3.2 11B Vision Instruct | Meta | 74 | 128,000 |
| #84 | MiMo-VL-7B | Xiaomi | 74 | 131,072 |
| #85 | Code Llama 34B Instruct | Meta | 73 | 8,192 |
| #86 | Gemini 3.1 Flash | Google DeepMind | 73 | 1,048,576 |
| #87 | Meta Llama 3 8B Instruct | Meta | 73 | 8,192 |
| #88 | Gemini 2.5 Flash | Google DeepMind | 72 | 1,048,576 |
| #89 | Gemini 3.0 Flash | Google DeepMind | 72 | 1,048,576 |
| #90 | Gemini 3.1 Pro | Google DeepMind | 72 | 1,048,576 |
| #91 | GPT-5.2 | OpenAI | 72 | 1,000,000 |
| #92 | GPT-5.4 | OpenAI | 72 | 1,000,000 |
| #93 | MiMo-Audio-7B | Xiaomi | 72 | 131,072 |
| #94 | Claude 3.7 Sonnet | Anthropic | 71 | 200,000 |
| #95 | Claude Sonnet 4.5 | Anthropic | 71 | 1,000,000 |
| #96 | Claude Sonnet 4.6 | Anthropic | 71 | 1,000,000 |
| #97 | Gemini 2.0 Flash | Google DeepMind | 71 | 1,048,576 |
| #98 | Gemini 2.5 Pro | Google DeepMind | 71 | 1,048,576 |
| #99 | Gemini 3.0 Pro | Google DeepMind | 71 | 1,048,576 |
| #100 | GPT-5 mini | OpenAI | 71 | 1,000,000 |
| #101 | GPT-5.3 Instant | OpenAI | 71 | 1,000,000 |
| #102 | GPT-5.3-Codex | OpenAI | 71 | 1,000,000 |
| #103 | LFM2-8B-A1B | Liquid AI | 71 | 32,768 |
| #104 | Step-Audio-R1.1 | StepFun | 71 | 131,072 |
| #105 | Claude Haiku 4.5 | Anthropic | 70 | 200,000 |
| #106 | Claude Sonnet 4 | Anthropic | 70 | 1,000,000 |
| #107 | Gemini 2.5 Pro TTS | Google DeepMind | 70 | 1,048,576 |
| #108 | GPT-4.1 | OpenAI | 70 | 1,048,576 |
| #109 | GPT-4o | OpenAI | 70 | 128,000 |
| #110 | GPT-5 | OpenAI | 70 | 1,000,000 |
| #111 | Mistral Large 25 | Mistral AI | 70 | 128,000 |
| #112 | o4-mini | OpenAI | 70 | 200,000 |
| #113 | Command R+ 2026 | Cohere | 69 | 128,000 |
| #114 | Gemini 2.5 Flash Live | Google DeepMind | 69 | 1,048,576 |
| #115 | Gemini 2.5 Flash Native Audio Preview | Google DeepMind | 69 | 1,048,576 |
| #116 | Gemini 2.5 Flash-Lite | Google DeepMind | 69 | 1,048,576 |
| #117 | Gemini 3.1 Flash-Lite | Google DeepMind | 69 | 1,048,576 |
| #118 | GPT-4.1 mini | OpenAI | 69 | 1,048,576 |
| #119 | morph-v3-fast-apply | Morph | 69 | 128,000 |
| #120 | warpgrep-v2 | Morph | 69 | 128,000 |
| #121 | Claude Opus 4.6 | Anthropic | 68 | 1,000,000 |
| #122 | GPT-5 nano | OpenAI | 68 | 1,000,000 |
| #123 | flash-compact | Morph | 67 | 200,000 |
| #124 | Gemini 1.5 Flash | Google DeepMind | 67 | 1,048,576 |
| #125 | Gemini 1.5 Pro | Google DeepMind | 67 | 2,097,152 |
| #126 | Gemini 2.0 Flash-Lite | Google DeepMind | 67 | 1,048,576 |
| #127 | GPT-4o-mini | OpenAI | 67 | 128,000 |
| #128 | LFM2.5-1.2B-Instruct | Liquid AI | 67 | 131,072 |
| #129 | LFM2.5-1.2B-Thinking | Liquid AI | 67 | 131,072 |
| #130 | Claude Haiku 3.5 | Anthropic | 66 | 200,000 |
| #131 | GPT-5.2 Pro | OpenAI | 66 | 1,000,000 |
| #132 | GPT-5.4 Pro | OpenAI | 66 | 1,000,000 |
| #133 | Llama Guard 4 12B | Meta | 66 | 131,072 |
| #134 | Nova Pro | Amazon Web Services | 66 | 300,000 |
| #135 | o4-mini-deep-research | OpenAI | 66 | 200,000 |
| #136 | Sonar Reasoning Pro | Perplexity | 66 | 200,000 |
| #137 | LFM2-2.6B | Liquid AI | 65 | 32,768 |
| #138 | Sonar | Perplexity | 65 | 128,000 |
| #139 | Claude Haiku 3 | Anthropic | 64 | 200,000 |
| #140 | Nova Lite | Amazon Web Services | 64 | 300,000 |
| #141 | o1-mini | OpenAI | 64 | 128,000 |
| #142 | Sonar Deep Research | Perplexity | 64 | 200,000 |
| #143 | Claude Haiku 3 | Anthropic | 63 | 200,000 |
| #144 | Claude Haiku 3.5 | Anthropic | 63 | 200,000 |
| #145 | Claude Haiku 4.5 | Anthropic | 63 | 200,000 |
| #146 | Claude Sonnet 3 | Anthropic | 63 | 200,000 |
| #147 | Codestral | Mistral AI | 63 | 256,000 |
| #148 | Codestral Embed | Mistral AI | 63 | 32,768 |
| #149 | Doubao-Seed-1.6 | ByteDance / Doubao | 63 | 128,000 |
| #150 | Doubao-Seed-1.6-Flash | ByteDance / Doubao | 63 | 128,000 |
| #151 | Doubao-Seed-2.0-Code | ByteDance / Doubao | 63 | 128,000 |
| #152 | Doubao-Seed-Code | ByteDance / Doubao | 63 | 128,000 |
| #153 | Gemini 1.5 Flash-8B | Google DeepMind | 63 | 1,048,576 |
| #154 | Llama Guard 3 11B Vision | Meta | 63 | 131,072 |
| #155 | MiniMax-M1 | MiniMax | 63 | 204,800 |
| #156 | MiniMax-Text-01 | MiniMax | 63 | 204,800 |
| #157 | MiniMax-VL-01 | MiniMax | 63 | 204,800 |
| #158 | Mistral Embed | Mistral AI | 63 | 32,768 |
| #159 | Mistral Moderation | Mistral AI | 63 | 32,768 |
| #160 | Sonar Pro | Perplexity | 63 | 200,000 |
| #161 | Voxtral Mini Transcribe | Mistral AI | 63 | 131,072 |
| #162 | Claude Opus 4 | Anthropic | 62 | 200,000 |
| #163 | Claude Opus 4.1 | Anthropic | 62 | 200,000 |
| #164 | CogVideoX | Z.AI | 62 | 128,000 |
| #165 | CogView 4 | Z.AI | 62 | 128,000 |
| #166 | Devstral Medium 1.0 | Mistral AI | 62 | 128,000 |
| #167 | ERNIE 3.5 128K | Baidu / ERNIE | 62 | 128,000 |
| #168 | ERNIE 4.0 Turbo 8K | Baidu / ERNIE | 62 | 128,000 |
| #169 | ERNIE 4.5 Turbo 32K | Baidu / ERNIE | 62 | 32,768 |
| #170 | ERNIE Functions 8K | Baidu / ERNIE | 62 | 128,000 |
| #171 | ERNIE Speed 128K | Baidu / ERNIE | 62 | 128,000 |
| #172 | GLM-4.5 | Z.AI | 62 | 128,000 |
| #173 | GLM-4.5V | Z.AI | 62 | 128,000 |
| #174 | GLM-4.6 | Z.AI | 62 | 128,000 |
| #175 | GLM-4.6V | Z.AI | 62 | 128,000 |
| #176 | GLM-4.7 | Z.AI | 62 | 128,000 |
| #177 | GLM-5 | Z.AI | 62 | 128,000 |
| #178 | GLM-Image | Z.AI | 62 | 128,000 |
| #179 | GLM-OCR | Z.AI | 62 | 128,000 |
| #180 | Hunyuan Code | Tencent / Hunyuan | 62 | 128,000 |
| #181 | Hunyuan Lite | Tencent / Hunyuan | 62 | 128,000 |
| #182 | Hunyuan Standard | Tencent / Hunyuan | 62 | 128,000 |
| #183 | Hunyuan T1 | Tencent / Hunyuan | 62 | 256,000 |
| #184 | Hunyuan T1 Vision | Tencent / Hunyuan | 62 | 128,000 |
| #185 | Hunyuan TurboS | Tencent / Hunyuan | 62 | 128,000 |
| #186 | Hunyuan TurboS LongText 128K | Tencent / Hunyuan | 62 | 128,000 |
| #187 | Kimi K2 | Moonshot AI / Kimi | 62 | 131,072 |
| #188 | Kimi K2 Thinking | Moonshot AI / Kimi | 62 | 256,000 |
| #189 | Kimi K2 Turbo Preview | Moonshot AI / Kimi | 62 | 256,000 |
| #190 | Kimi K2.5 | Moonshot AI / Kimi | 62 | 256,000 |
| #191 | Magistral Medium 1.2 | Mistral AI | 62 | 128,000 |
| #192 | Mistral Large 3 | Mistral AI | 62 | 128,000 |
| #193 | Mistral Medium 3.1 | Mistral AI | 62 | 128,000 |
| #194 | Mistral OCR 2505 | Mistral AI | 62 | 32,768 |
| #195 | Mistral Small 3.1 | Mistral AI | 62 | 128,000 |
| #196 | Mistral Small 3.2 Open | Mistral AI | 62 | 128,000 |
| #197 | Pixtral Large | Mistral AI | 62 | 131,072 |
| #198 | Vidu Q1 | Z.AI | 62 | 128,000 |
| #199 | image-01 | MiniMax | 61 | 8,192 |
| #200 | image-01-live | MiniMax | 61 | 8,192 |
| #201 | MiniMax-M2 | MiniMax | 61 | 204,800 |
| #202 | MiniMax-M2.1 | MiniMax | 61 | 204,800 |
| #203 | MiniMax-M2.1-highspeed | MiniMax | 61 | 204,800 |
| #204 | MiniMax-M2.5 | MiniMax | 61 | 204,800 |
| #205 | MiniMax-M2.5-highspeed | MiniMax | 61 | 204,800 |
| #206 | MiniMax-Speech-02 | MiniMax | 61 | 8,192 |
| #207 | music-2.0 | MiniMax | 61 | 8,192 |
| #208 | Nova Micro | Amazon Web Services | 61 | 128,000 |
| #209 | Claude Sonnet 3 | Anthropic | 59 | 200,000 |
| #210 | Claude Sonnet 4 | Anthropic | 59 | 1,000,000 |
| #211 | Command A | Cohere | 59 | 256,000 |
| #212 | Command A Reasoning | Cohere | 59 | 128,000 |
| #213 | Command A Translate | Cohere | 59 | 128,000 |
| #214 | Command A Vision | Cohere | 59 | 128,000 |
| #215 | Command R+ | Cohere | 59 | 128,000 |
| #216 | Command R7B | Cohere | 59 | 128,000 |
| #217 | Embed 4 | Cohere | 59 | 128,000 |
| #218 | gpt-audio | OpenAI | 59 | 128,000 |
| #219 | gpt-realtime | OpenAI | 59 | 128,000 |
| #220 | Grok 3 | xAI | 59 | 131,072 |
| #221 | Grok 3 Mini | xAI | 59 | 131,072 |
| #222 | Grok 4 | xAI | 59 | 256,000 |
| #223 | Grok 4 Fast Reasoning | xAI | 59 | 131,072 |
| #224 | grok-image | xAI | 59 | 131,072 |
| #225 | o3 | OpenAI | 59 | 200,000 |
| #226 | o3-deep-research | OpenAI | 59 | 200,000 |
| #227 | Claude Opus 3 | Anthropic | 56 | 200,000 |
| #228 | o1 | OpenAI | 56 | 200,000 |
| #229 | gpt-audio-mini | OpenAI | 55 | 128,000 |
| #230 | gpt-realtime-mini | OpenAI | 55 | 128,000 |
| #231 | Prompt Guard 86M | Meta | 53 | 512 |
| #232 | Claude Opus 3 | Anthropic | 52 | 200,000 |
| #233 | Claude Opus 4 | Anthropic | 52 | 200,000 |
| #234 | GPT Image 1 | OpenAI | 49 | 32,768 |
| #235 | GPT-4o mini Transcribe | OpenAI | 49 | 128,000 |
| #236 | GPT-4o mini TTS | OpenAI | 49 | 128,000 |
| #237 | GPT-4o Transcribe | OpenAI | 49 | 128,000 |
| #238 | chatgpt-image-latest | OpenAI | 48 | 32,768 |
| #239 | gpt-image-1-mini | OpenAI | 48 | 32,768 |
| #240 | NextStep-1.1 | StepFun | 48 | 512 |
| #241 | pplx-embed-v1-0.6b | Perplexity | 43 | 8,192 |
| #242 | pplx-embed-v1-4b | Perplexity | 43 | 8,192 |
| #243 | FLUX 1.1 Pro | Black Forest Labs | 39 | 512 |
| #244 | FLUX 1 Pro | Black Forest Labs | 37 | 512 |
| #245 | FLUX 1.1 Pro Ultra | Black Forest Labs | 37 | 512 |
Why #1: Llama 4 Maverick
Meta's open-weight MoE model (17B active parameters across 128 experts) with a 1M-token context window, pretrained on roughly 22T tokens. Strong multimodal and multilingual capabilities for teams that need control, private deployment, and customization.
This model meets the full-profile data threshold required by the leaderboard methodology.
Why #2: gpt-oss-120b
OpenAI's 120B open-weight model, built for frontier-style reasoning and self-hosted deployment.
This model meets the full-profile data threshold required by the leaderboard methodology.
Why #3: Llama 3.3 70B Instruct
Meta's latest Llama 3.x dense model: 70B parameters with a 128K-token context window.
This model meets the full-profile data threshold required by the leaderboard methodology.