Leaderboard

Reasoning rankings

Updated weekly

Each leaderboard uses transparent weighted scoring, current model context, and supporting analysis to help teams interpret the results with confidence. Only full-profile entries appear in rankings; models for which only source-backed metadata is currently available remain in the broader catalog elsewhere on the site.

Full-profile entries

245

Models whose metadata and scoring coverage are complete enough to be ranked meaningfully in this category.

Ranking basis

reasoning

Scores combine benchmark evidence, product metadata, and cost/context signals when those fields are published.

Catalog caveat

Tracked models without full scoring remain in the directory and provider pages, but are not relied on for analytical ranking claims.
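As a rough illustration of how a weighted score and a full-profile check like the ones described above could fit together, here is a minimal Python sketch. The weights, the field names (benchmark, metadata, cost_context), the 0-100 scale, and the min_signals threshold are all assumptions made for this example; the page does not publish its actual formula.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights: the page does not publish the real values.
WEIGHTS = {"benchmark": 0.6, "metadata": 0.2, "cost_context": 0.2}


@dataclass
class Entry:
    name: str
    # Each signal is assumed to be on a 0-100 scale; None means "not published".
    benchmark: Optional[float] = None
    metadata: Optional[float] = None
    cost_context: Optional[float] = None


def weighted_score(entry: Entry) -> Optional[float]:
    """Weighted mean over the published signals, renormalising the weights."""
    published = {
        key: getattr(entry, key)
        for key in WEIGHTS
        if getattr(entry, key) is not None
    }
    if not published:
        return None
    total_weight = sum(WEIGHTS[key] for key in published)
    return sum(WEIGHTS[key] * value for key, value in published.items()) / total_weight


def is_full_profile(entry: Entry, min_signals: int = 2) -> bool:
    """Assumed rule: an entry is rankable once enough signals are published."""
    return sum(getattr(entry, key) is not None for key in WEIGHTS) >= min_signals
```

In this sketch, entries that fail is_full_profile would stay in the catalog but be left out of the ranked table, which mirrors the caveat above.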

Rank	Model	Provider	Score	Context (tokens)
#1	Claude Opus 4.6	Anthropic	94	1,000,000
#2	Claude Sonnet 4.6	Anthropic	94	1,000,000
#3	GPT-5.4	OpenAI	94	1,000,000
#4	Claude 3.7 Sonnet	Anthropic	93	200,000
#5	Claude Sonnet 4.5	Anthropic	93	1,000,000
#6	GPT-5.2	OpenAI	93	1,000,000
#7	GPT-5.4 Pro	OpenAI	93	1,000,000
#8	Gemini 2.5 Pro	Google DeepMind	92	1,048,576
#9	Gemini 3.1 Pro	Google DeepMind	92	1,048,576
#10	GPT-5.2 Pro	OpenAI	92	1,000,000
#11	Claude Opus 4.1	Anthropic	91	200,000
#12	Claude Sonnet 4	Anthropic	91	1,000,000
#13	Gemini 3.0 Pro	Google DeepMind	91	1,048,576
#14	GPT-4o	OpenAI	91	128,000
#15	GPT-5	OpenAI	91	1,000,000
#16	Claude Opus 4	Anthropic	90	200,000
#17	Gemini 3.1 Flash	Google DeepMind	90	1,048,576
#18	GPT-5.3-Codex	OpenAI	90	1,000,000
#19	Gemini 2.5 Pro TTS	Google DeepMind	89	1,048,576
#20	Gemini 2.5 Flash	Google DeepMind	88	1,048,576
#21	Gemini 3.0 Flash	Google DeepMind	88	1,048,576
#22	GPT-5 mini	OpenAI	88	1,000,000
#23	Command R+ 2026	Cohere	87	128,000
#24	GPT-4.1	OpenAI	87	1,048,576
#25	GPT-5.3 Instant	OpenAI	87	1,000,000
#26	Mistral Large 25	Mistral AI	87	128,000
#27	o3	OpenAI	87	200,000
#28	o3-deep-research	OpenAI	87	200,000
#29	Sonar Deep Research	Perplexity	87	200,000
#30	Claude Haiku 4.5	Anthropic	86	200,000
#31	Gemini 2.0 Flash	Google DeepMind	86	1,048,576
#32	Llama 4 Maverick	Meta	86	1,048,576
#33	o4-mini	OpenAI	86	200,000
#34	o4-mini-deep-research	OpenAI	86	200,000
#35	Sonar Reasoning Pro	Perplexity	86	200,000
#36	Sonar Pro	Perplexity	85	200,000
#37	Gemini 1.5 Pro	Google DeepMind	84	2,097,152
#38	Gemini 2.5 Flash Live	Google DeepMind	84	1,048,576
#39	Gemini 2.5 Flash Native Audio Preview	Google DeepMind	84	1,048,576
#40	Gemini 2.5 Flash-Lite	Google DeepMind	84	1,048,576
#41	Claude Opus 3	Anthropic	83	200,000
#42	Gemini 3.1 Flash-Lite	Google DeepMind	83	1,048,576
#43	o1	OpenAI	83	200,000
#44	Claude Sonnet 3	Anthropic	82	200,000
#45	GPT-4.1 mini	OpenAI	82	1,048,576
#46	Claude Haiku 3.5	Anthropic	81	200,000
#47	Gemini 2.0 Flash-Lite	Google DeepMind	81	1,048,576
#48	GPT-5 nano	OpenAI	81	1,000,000
#49	gpt-audio	OpenAI	81	128,000
#50	gpt-realtime	OpenAI	81	128,000
#51	Nova Pro	Amazon Web Services	81	300,000
#52	o1-mini	OpenAI	81	128,000
#53	Sonar	Perplexity	81	128,000
#54	Gemini 1.5 Flash	Google DeepMind	80	1,048,576
#55	GPT-4o-mini	OpenAI	80	128,000
#56	gpt-oss-120b	OpenAI	80	131,072
#57	Llama 3.3 70B Instruct	Meta	80	128,000
#58	Llama 4 Scout	Meta	78	10,485,760
#59	Step-3.5-Flash	StepFun	78	131,072
#60	Claude Haiku 3	Anthropic	77	200,000
#61	Claude Haiku 3.5	Anthropic	77	200,000
#62	Claude Haiku 4.5	Anthropic	77	200,000
#63	Codestral	Mistral AI	77	256,000
#64	Doubao-Seed-1.6	ByteDance / Doubao	77	128,000
#65	Doubao-Seed-1.6-Flash	ByteDance / Doubao	77	128,000
#66	Doubao-Seed-2.0-Code	ByteDance / Doubao	77	128,000
#67	Doubao-Seed-Code	ByteDance / Doubao	77	128,000
#68	Llama 3.1 405B Instruct	Meta	77	128,000
#69	MiniMax-M1	MiniMax	77	204,800
#70	MiniMax-Text-01	MiniMax	77	204,800
#71	MiniMax-VL-01	MiniMax	77	204,800
#72	Voxtral Mini Transcribe	Mistral AI	77	131,072
#73	Claude Haiku 3	Anthropic	76	200,000
#74	Claude Sonnet 3	Anthropic	76	200,000
#75	Claude Sonnet 4	Anthropic	76	1,000,000
#76	CogVideoX	Z.AI	76	128,000
#77	CogView 4	Z.AI	76	128,000
#78	Command A	Cohere	76	256,000
#79	Command A Reasoning	Cohere	76	128,000
#80	Command A Translate	Cohere	76	128,000
#81	Command A Vision	Cohere	76	128,000
#82	Command R+	Cohere	76	128,000
#83	Command R7B	Cohere	76	128,000
#84	DeepSeek-Coder-V2	DeepSeek	76	128,000
#85	DeepSeek-Math-V2	DeepSeek	76	128,000
#86	DeepSeek-R1	DeepSeek	76	128,000
#87	DeepSeek-R1-Distill-Llama-70B	DeepSeek	76	128,000
#88	DeepSeek-V2.5	DeepSeek	76	128,000
#89	DeepSeek-V3	DeepSeek	76	128,000
#90	DeepSeek-V3.1	DeepSeek	76	128,000
#91	DeepSeek-V3.1-Base	DeepSeek	76	128,000
#92	DeepSeek-V3.2	DeepSeek	76	128,000
#93	DeepSeek-V3.2-Exp	DeepSeek	76	128,000
#94	Devstral 2 Open	Mistral AI	76	128,000
#95	Devstral Medium 1.0	Mistral AI	76	128,000
#96	Devstral Small 2	Mistral AI	76	128,000
#97	Embed 4	Cohere	76	128,000
#98	ERNIE 3.5 128K	Baidu / ERNIE	76	128,000
#99	ERNIE 4.0 Turbo 8K	Baidu / ERNIE	76	128,000
#100	ERNIE Functions 8K	Baidu / ERNIE	76	128,000
#101	ERNIE Speed 128K	Baidu / ERNIE	76	128,000
#102	GLM-4.5	Z.AI	76	128,000
#103	GLM-4.5V	Z.AI	76	128,000
#104	GLM-4.6	Z.AI	76	128,000
#105	GLM-4.6V	Z.AI	76	128,000
#106	GLM-4.7	Z.AI	76	128,000
#107	GLM-5	Z.AI	76	128,000
#108	GLM-Image	Z.AI	76	128,000
#109	GLM-OCR	Z.AI	76	128,000
#110	Grok 3	xAI	76	131,072
#111	Grok 3 Mini	xAI	76	131,072
#112	Grok 4	xAI	76	256,000
#113	Grok 4 Fast Reasoning	xAI	76	131,072
#114	grok-image	xAI	76	131,072
#115	Hunyuan Code	Tencent / Hunyuan	76	128,000
#116	Hunyuan Lite	Tencent / Hunyuan	76	128,000
#117	Hunyuan Standard	Tencent / Hunyuan	76	128,000
#118	Hunyuan T1	Tencent / Hunyuan	76	256,000
#119	Hunyuan T1 Vision	Tencent / Hunyuan	76	128,000
#120	Hunyuan TurboS	Tencent / Hunyuan	76	128,000
#121	Hunyuan TurboS LongText 128K	Tencent / Hunyuan	76	128,000
#122	Jamba 3B	AI21 Labs	76	256,000
#123	Jamba Large	AI21 Labs	76	256,000
#124	Jamba Large 1.6	AI21 Labs	76	256,000
#125	Jamba Mini	AI21 Labs	76	256,000
#126	Jamba Mini 1.6	AI21 Labs	76	256,000
#127	Jamba Mini 1.7	AI21 Labs	76	256,000
#128	Kimi K2	Moonshot AI / Kimi	76	131,072
#129	Kimi K2 Thinking	Moonshot AI / Kimi	76	256,000
#130	Kimi K2 Turbo Preview	Moonshot AI / Kimi	76	256,000
#131	Kimi K2.5	Moonshot AI / Kimi	76	256,000
#132	Llama 3.1 70B Instruct	Meta	76	128,000
#133	Magistral Medium 1.2	Mistral AI	76	128,000
#134	Magistral Small 1.2 Open	Mistral AI	76	128,000
#135	MiniMax-M2	MiniMax	76	204,800
#136	MiniMax-M2.1	MiniMax	76	204,800
#137	MiniMax-M2.1-highspeed	MiniMax	76	204,800
#138	MiniMax-M2.5	MiniMax	76	204,800
#139	MiniMax-M2.5-highspeed	MiniMax	76	204,800
#140	Ministral 3 14B Open	Mistral AI	76	128,000
#141	Ministral 3 3B Open	Mistral AI	76	128,000
#142	Ministral 3 8B Open	Mistral AI	76	128,000
#143	Mistral Large 3	Mistral AI	76	128,000
#144	Mistral Large 3 Open	Mistral AI	76	128,000
#145	Mistral Medium 3.1	Mistral AI	76	128,000
#146	Mistral Nemo 12B	Mistral AI	76	128,000
#147	Mistral Small 3.1	Mistral AI	76	128,000
#148	Mistral Small 3.2 Open	Mistral AI	76	128,000
#149	Nova Lite	Amazon Web Services	76	300,000
#150	Phi-3-vision-128k-instruct	Microsoft	76	128,000
#151	Phi-3.5-mini-instruct	Microsoft	76	131,072
#152	Phi-3.5-MoE-instruct	Microsoft	76	131,072
#153	Phi-3.5-vision-instruct	Microsoft	76	131,072
#154	Phi-4-mini-flash-reasoning	Microsoft	76	131,072
#155	Phi-4-mini-instruct	Microsoft	76	131,072
#156	Phi-4-multimodal-instruct	Microsoft	76	131,072
#157	Phi-4-reasoning	Microsoft	76	131,072
#158	Phi-4-reasoning-plus	Microsoft	76	131,072
#159	Phi-4-reasoning-vision-15B	Microsoft	76	131,072
#160	Pixtral 12B	Mistral AI	76	131,072
#161	Pixtral Large	Mistral AI	76	131,072
#162	Qwen2.5-1.5B-Instruct	Alibaba Qwen	76	131,072
#163	Qwen2.5-14B-Instruct	Alibaba Qwen	76	131,072
#164	Qwen2.5-32B-Instruct	Alibaba Qwen	76	131,072
#165	Qwen2.5-3B-Instruct	Alibaba Qwen	76	131,072
#166	Qwen2.5-72B-Instruct	Alibaba Qwen	76	131,072
#167	Qwen2.5-7B-Instruct	Alibaba Qwen	76	131,072
#168	Qwen2.5-Max	Alibaba Qwen	76	131,072
#169	Qwen2.5-VL-72B-Instruct	Alibaba Qwen	76	131,072
#170	Qwen2.5-VL-7B-Instruct	Alibaba Qwen	76	131,072
#171	Qwen3-Coder-Next	Alibaba Qwen	76	131,072
#172	Qwen3.5-0.8B	Alibaba Qwen	76	131,072
#173	Qwen3.5-122B-A10B	Alibaba Qwen	76	131,072
#174	Qwen3.5-27B	Alibaba Qwen	76	131,072
#175	Qwen3.5-2B	Alibaba Qwen	76	131,072
#176	Qwen3.5-35B-A3B	Alibaba Qwen	76	131,072
#177	Qwen3.5-397B-A17B	Alibaba Qwen	76	131,072
#178	Qwen3.5-4B	Alibaba Qwen	76	131,072
#179	Qwen3.5-9B	Alibaba Qwen	76	131,072
#180	Vidu Q1	Z.AI	76	128,000
#181	Voxtral Mini Open	Mistral AI	76	131,072
#182	Voxtral Small Open	Mistral AI	76	131,072
#183	Codestral Embed	Mistral AI	75	32,768
#184	ERNIE 4.5 Turbo 32K	Baidu / ERNIE	75	32,768
#185	Gemini 1.5 Flash-8B	Google DeepMind	75	1,048,576
#186	Mistral Embed	Mistral AI	75	32,768
#187	Mistral Moderation	Mistral AI	75	32,768
#188	Mistral OCR 2505	Mistral AI	75	32,768
#189	warpgrep-v2	Morph	75	128,000
#190	Claude Opus 3	Anthropic	74	200,000
#191	Claude Opus 4	Anthropic	74	200,000
#192	gpt-audio-mini	OpenAI	74	128,000
#193	gpt-realtime-mini	OpenAI	74	128,000
#194	DeepSeek-OCR	DeepSeek	73	16,384
#195	DeepSeek-OCR-2	DeepSeek	73	16,384
#196	DeepSeek-VL2-Small	DeepSeek	73	16,384
#197	gpt-oss-20b	OpenAI	73	131,072
#198	image-01	MiniMax	73	8,192
#199	image-01-live	MiniMax	73	8,192
#200	Janus-Pro-7B	DeepSeek	73	16,384
#201	MiniMax-Speech-02	MiniMax	73	8,192
#202	morph-v3-fast-apply	Morph	73	128,000
#203	music-2.0	MiniMax	73	8,192
#204	Phi-4	Microsoft	73	16,384
#205	Nova Micro	Amazon Web Services	72	128,000
#206	Llama 3.2 90B Vision Instruct	Meta	71	128,000
#207	phi-1	Microsoft	71	4,096
#208	phi-1_5	Microsoft	71	4,096
#209	phi-2	Microsoft	71	4,096
#210	Phi-3-medium-4k-instruct	Microsoft	71	4,096
#211	Phi-3-mini-4k-instruct	Microsoft	71	4,096
#212	Phi-tiny-MoE-instruct	Microsoft	71	4,096
#213	flash-compact	Morph	70	200,000
#214	LFM2-24B-A2B	Liquid AI	70	32,768
#215	Llama 3.1 8B Instruct	Meta	70	128,000
#216	Step3-VL-10B	StepFun	69	131,072
#217	GPT Image 1	OpenAI	68	32,768
#218	MiMo-VL-7B	Xiaomi	68	131,072
#219	Code Llama 70B Instruct	Meta	67	8,192
#220	GPT-4o mini Transcribe	OpenAI	67	128,000
#221	GPT-4o mini TTS	OpenAI	67	128,000
#222	GPT-4o Transcribe	OpenAI	67	128,000
#223	Llama 3.2 11B Vision Instruct	Meta	67	128,000
#224	Llama 3.2 1B Instruct	Meta	67	128,000
#225	Llama 3.2 3B Instruct	Meta	67	128,000
#226	Meta Llama 3 70B Instruct	Meta	67	8,192
#227	MiMo-Audio-7B	Xiaomi	66	131,072
#228	chatgpt-image-latest	OpenAI	65	32,768
#229	gpt-image-1-mini	OpenAI	65	32,768
#230	Llama Guard 4 12B	Meta	65	131,072
#231	Step-Audio-R1.1	StepFun	65	131,072
#232	LFM2-8B-A1B	Liquid AI	64	32,768
#233	Code Llama 34B Instruct	Meta	63	8,192
#234	Meta Llama 3 8B Instruct	Meta	63	8,192
#235	Llama Guard 3 11B Vision	Meta	62	131,072
#236	LFM2.5-1.2B-Instruct	Liquid AI	59	131,072
#237	LFM2.5-1.2B-Thinking	Liquid AI	59	131,072
#238	LFM2-2.6B	Liquid AI	55	32,768
#239	pplx-embed-v1-4b	Perplexity	48	8,192
#240	pplx-embed-v1-0.6b	Perplexity	46	8,192
#241	Prompt Guard 86M	Meta	44	512
#242	FLUX 1.1 Pro	Black Forest Labs	43	512
#243	FLUX 1.1 Pro Ultra	Black Forest Labs	43	512
#244	FLUX 1 Pro	Black Forest Labs	41	512
#245	NextStep-1.1	StepFun	34	512
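Many rows in the table share a score, and within each tied group the models appear to be listed alphabetically, ignoring case. The site does not document its tie-breaking rule, so the following Python sketch only reproduces the ordering observed above; the rank function and its (name, score) tuple layout are hypothetical.

```python
from typing import List, Tuple


def rank(rows: List[Tuple[str, int]]) -> List[Tuple[int, str, int]]:
    """Order by score descending, then by case-insensitive model name."""
    ordered = sorted(rows, key=lambda row: (-row[1], row[0].casefold()))
    # Ranks are sequential positions, so tied models still get distinct ranks.
    return [(position, name, score)
            for position, (name, score) in enumerate(ordered, start=1)]


rows = [
    ("GPT-5.4", 94),
    ("Claude 3.7 Sonnet", 93),
    ("Claude Sonnet 4.6", 94),
    ("Claude Opus 4.6", 94),
]
ranked = rank(rows)
```

Because ranks here are positions rather than shared ranks, a gap between two adjacent rows with the same score reflects name order only, not a difference in measured quality.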

Why #1: Claude Opus 4.6

Anthropic's most intelligent Claude model for complex agents, coding, and deep reasoning, with 1M token context and 128K output.

This model meets the current full-profile threshold set by the leaderboard methodology.

Why #2: Claude Sonnet 4.6

Anthropic's current Sonnet tier for fast frontier reasoning, coding, and long-context agent work.

This model meets the current full-profile threshold set by the leaderboard methodology.

Why #3: GPT-5.4

OpenAI's GPT-5.4 is described as the most capable and efficient frontier model for professional work, and as the first general-purpose model with native computer-use capabilities. It combines the industry-leading coding of GPT-5.3-Codex with improved agentic workflows.

This model meets the current full-profile threshold set by the leaderboard methodology.