Leaderboard

Overall rankings

Updated weekly

Each leaderboard uses transparent weighted scoring, current model context, and supporting analysis to help teams interpret the results with confidence. Only full-profile entries appear in the rankings; models for which only source-backed metadata is currently available remain listed elsewhere on the site as catalog records.

Full-profile entries

245

Models with sufficiently complete metadata and scoring coverage to be ranked meaningfully in this category.

Ranking basis

overall

Scores combine benchmark evidence, product metadata, and cost/context signals when those fields are published.
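As a rough illustration of how signals like these might be combined, the sketch below computes a weighted mean over whichever fields are published. The field names, the relative weights, and the choice to renormalize over the published fields are all assumptions made for illustration; the leaderboard's actual formula and weights are not stated here.

```python
from typing import Mapping, Optional

# Hypothetical relative weights (not published by the leaderboard).
WEIGHTS = {"benchmark": 6, "metadata": 2, "cost_context": 2}


def combined_score(
    signals: Mapping[str, Optional[float]],
    weights: Mapping[str, int] = WEIGHTS,
) -> Optional[float]:
    """Weighted mean of the signals that are published (not None).

    Unpublished signals are skipped and the remaining weights are
    renormalized, so a missing field neither helps nor hurts by itself.
    Returns None when no weighted signal is published.
    """
    published = {
        name: value
        for name, value in signals.items()
        if value is not None and name in weights
    }
    total_weight = sum(weights[name] for name in published)
    if total_weight == 0:
        return None
    weighted_sum = sum(weights[name] * value for name, value in published.items())
    return weighted_sum / total_weight


# All three signals published: (6*90 + 2*80 + 2*70) / 10
full = combined_score({"benchmark": 90, "metadata": 80, "cost_context": 70})

# Cost/context not published: (6*90 + 2*80) / 8
partial = combined_score({"benchmark": 90, "metadata": 80, "cost_context": None})
```

Under these assumed weights, `full` is 84.0 and `partial` is 87.5. Renormalizing is only one possible design; a scheme that penalizes missing fields would instead divide by the full weight total.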

Catalog caveat

Tracked models without full scoring remain in the directory and on provider pages, but are not used to support analytical ranking claims.

Rank	Model	Provider	Score	Context (tokens)
#1	GPT-5.4	OpenAI	93	1,000,000
#2	Claude Sonnet 4.6	Anthropic	92	1,000,000
#3	Claude Opus 4.6	Anthropic	91	1,000,000
#4	Claude Sonnet 4.5	Anthropic	91	1,000,000
#5	Gemini 3.1 Pro	Google DeepMind	91	1,048,576
#6	GPT-4o	OpenAI	91	128,000
#7	GPT-5.2	OpenAI	91	1,000,000
#8	GPT-5.4 Pro	OpenAI	91	1,000,000
#9	Claude 3.7 Sonnet	Anthropic	90	200,000
#10	Gemini 2.5 Pro	Google DeepMind	90	1,048,576
#11	GPT-5.2 Pro	OpenAI	90	1,000,000
#12	Claude Opus 4.1	Anthropic	89	200,000
#13	Claude Sonnet 4	Anthropic	89	1,000,000
#14	Gemini 3.0 Pro	Google DeepMind	89	1,048,576
#15	Gemini 3.1 Flash	Google DeepMind	89	1,048,576
#16	GPT-5	OpenAI	89	1,000,000
#17	Gemini 2.5 Pro TTS	Google DeepMind	88	1,048,576
#18	Claude Opus 4	Anthropic	87	200,000
#19	Gemini 2.5 Flash	Google DeepMind	87	1,048,576
#20	Gemini 3.0 Flash	Google DeepMind	87	1,048,576
#21	GPT-5.3-Codex	OpenAI	87	1,000,000
#22	GPT-5 mini	OpenAI	86	1,000,000
#23	Mistral Large 25	Mistral AI	86	128,000
#24	Gemini 2.0 Flash	Google DeepMind	85	1,048,576
#25	GPT-4.1	OpenAI	85	1,048,576
#26	GPT-5.3 Instant	OpenAI	85	1,000,000
#27	Claude Haiku 4.5	Anthropic	84	200,000
#28	Llama 4 Maverick	Meta	84	1,048,576
#29	Gemini 2.5 Flash Live	Google DeepMind	83	1,048,576
#30	Gemini 2.5 Flash Native Audio Preview	Google DeepMind	83	1,048,576
#31	Gemini 2.5 Flash-Lite	Google DeepMind	83	1,048,576
#32	Command R+ 2026	Cohere	82	128,000
#33	Gemini 1.5 Pro	Google DeepMind	82	2,097,152
#34	Gemini 3.1 Flash-Lite	Google DeepMind	82	1,048,576
#35	o3	OpenAI	81	200,000
#36	o4-mini	OpenAI	81	200,000
#37	GPT-4.1 mini	OpenAI	80	1,048,576
#38	o3-deep-research	OpenAI	80	200,000
#39	o4-mini-deep-research	OpenAI	80	200,000
#40	Claude Haiku 3.5	Anthropic	79	200,000
#41	Claude Opus 3	Anthropic	79	200,000
#42	Gemini 2.0 Flash-Lite	Google DeepMind	79	1,048,576
#43	GPT-5 nano	OpenAI	79	1,000,000
#44	Nova Pro	Amazon Web Services	79	300,000
#45	Claude Sonnet 3	Anthropic	78	200,000
#46	Gemini 1.5 Flash	Google DeepMind	78	1,048,576
#47	GPT-4o-mini	OpenAI	78	128,000
#48	o1	OpenAI	75	200,000
#49	Sonar Reasoning Pro	Perplexity	75	200,000
#50	Llama 3.3 70B Instruct	Meta	74	128,000
#51	Nova Lite	Amazon Web Services	74	300,000
#52	o1-mini	OpenAI	74	128,000
#53	Sonar Pro	Perplexity	74	200,000
#54	Gemini 1.5 Flash-8B	Google DeepMind	73	1,048,576
#55	Llama 4 Scout	Meta	73	10,485,760
#56	Claude Haiku 3	Anthropic	72	200,000
#57	Claude Haiku 3	Anthropic	72	200,000
#58	Claude Haiku 3.5	Anthropic	72	200,000
#59	Claude Haiku 4.5	Anthropic	72	200,000
#60	Codestral	Mistral AI	72	256,000
#61	Codestral Embed	Mistral AI	72	32,768
#62	CogVideoX	Z.AI	72	128,000
#63	CogView 4	Z.AI	72	128,000
#64	Devstral Medium 1.0	Mistral AI	72	128,000
#65	Doubao-Seed-1.6	ByteDance / Doubao	72	128,000
#66	Doubao-Seed-1.6-Flash	ByteDance / Doubao	72	128,000
#67	Doubao-Seed-2.0-Code	ByteDance / Doubao	72	128,000
#68	Doubao-Seed-Code	ByteDance / Doubao	72	128,000
#69	ERNIE 3.5 128K	Baidu / ERNIE	72	128,000
#70	ERNIE 4.0 Turbo 8K	Baidu / ERNIE	72	128,000
#71	ERNIE Functions 8K	Baidu / ERNIE	72	128,000
#72	ERNIE Speed 128K	Baidu / ERNIE	72	128,000
#73	GLM-4.5	Z.AI	72	128,000
#74	GLM-4.5V	Z.AI	72	128,000
#75	GLM-4.6	Z.AI	72	128,000
#76	GLM-4.6V	Z.AI	72	128,000
#77	GLM-4.7	Z.AI	72	128,000
#78	GLM-5	Z.AI	72	128,000
#79	GLM-Image	Z.AI	72	128,000
#80	GLM-OCR	Z.AI	72	128,000
#81	gpt-audio	OpenAI	72	128,000
#82	gpt-oss-120b	OpenAI	72	131,072
#83	gpt-realtime	OpenAI	72	128,000
#84	Hunyuan Code	Tencent / Hunyuan	72	128,000
#85	Hunyuan Lite	Tencent / Hunyuan	72	128,000
#86	Hunyuan Standard	Tencent / Hunyuan	72	128,000
#87	Hunyuan T1	Tencent / Hunyuan	72	256,000
#88	Hunyuan T1 Vision	Tencent / Hunyuan	72	128,000
#89	Hunyuan TurboS	Tencent / Hunyuan	72	128,000
#90	Hunyuan TurboS LongText 128K	Tencent / Hunyuan	72	128,000
#91	Kimi K2	Moonshot AI / Kimi	72	131,072
#92	Kimi K2 Thinking	Moonshot AI / Kimi	72	256,000
#93	Kimi K2 Turbo Preview	Moonshot AI / Kimi	72	256,000
#94	Kimi K2.5	Moonshot AI / Kimi	72	256,000
#95	Magistral Medium 1.2	Mistral AI	72	128,000
#96	MiniMax-M1	MiniMax	72	204,800
#97	MiniMax-M2	MiniMax	72	204,800
#98	MiniMax-M2.1	MiniMax	72	204,800
#99	MiniMax-M2.1-highspeed	MiniMax	72	204,800
#100	MiniMax-M2.5	MiniMax	72	204,800
#101	MiniMax-M2.5-highspeed	MiniMax	72	204,800
#102	MiniMax-Text-01	MiniMax	72	204,800
#103	MiniMax-VL-01	MiniMax	72	204,800
#104	Mistral Embed	Mistral AI	72	32,768
#105	Mistral Large 3	Mistral AI	72	128,000
#106	Mistral Medium 3.1	Mistral AI	72	128,000
#107	Mistral Moderation	Mistral AI	72	32,768
#108	Mistral Small 3.1	Mistral AI	72	128,000
#109	Mistral Small 3.2 Open	Mistral AI	72	128,000
#110	Pixtral 12B	Mistral AI	72	131,072
#111	Pixtral Large	Mistral AI	72	131,072
#112	Sonar Deep Research	Perplexity	72	200,000
#113	Vidu Q1	Z.AI	72	128,000
#114	Voxtral Mini Open	Mistral AI	72	131,072
#115	Voxtral Mini Transcribe	Mistral AI	72	131,072
#116	Voxtral Small Open	Mistral AI	72	131,072
#117	Claude Sonnet 3	Anthropic	71	200,000
#118	Claude Sonnet 4	Anthropic	71	1,000,000
#119	Command A	Cohere	71	256,000
#120	Command A Reasoning	Cohere	71	128,000
#121	Command A Translate	Cohere	71	128,000
#122	Command A Vision	Cohere	71	128,000
#123	Command R+	Cohere	71	128,000
#124	Command R7B	Cohere	71	128,000
#125	DeepSeek-Coder-V2	DeepSeek	71	128,000
#126	DeepSeek-Math-V2	DeepSeek	71	128,000
#127	DeepSeek-R1	DeepSeek	71	128,000
#128	DeepSeek-R1-Distill-Llama-70B	DeepSeek	71	128,000
#129	DeepSeek-V2.5	DeepSeek	71	128,000
#130	DeepSeek-V3	DeepSeek	71	128,000
#131	DeepSeek-V3.1	DeepSeek	71	128,000
#132	DeepSeek-V3.1-Base	DeepSeek	71	128,000
#133	DeepSeek-V3.2	DeepSeek	71	128,000
#134	DeepSeek-V3.2-Exp	DeepSeek	71	128,000
#135	Devstral 2 Open	Mistral AI	71	128,000
#136	Devstral Small 2	Mistral AI	71	128,000
#137	Embed 4	Cohere	71	128,000
#138	ERNIE 4.5 Turbo 32K	Baidu / ERNIE	71	32,768
#139	Grok 3	xAI	71	131,072
#140	Grok 3 Mini	xAI	71	131,072
#141	Grok 4	xAI	71	256,000
#142	Grok 4 Fast Reasoning	xAI	71	131,072
#143	grok-image	xAI	71	131,072
#144	Jamba 3B	AI21 Labs	71	256,000
#145	Jamba Large	AI21 Labs	71	256,000
#146	Jamba Large 1.6	AI21 Labs	71	256,000
#147	Jamba Mini	AI21 Labs	71	256,000
#148	Jamba Mini 1.6	AI21 Labs	71	256,000
#149	Jamba Mini 1.7	AI21 Labs	71	256,000
#150	Magistral Small 1.2 Open	Mistral AI	71	128,000
#151	Ministral 3 14B Open	Mistral AI	71	128,000
#152	Ministral 3 3B Open	Mistral AI	71	128,000
#153	Ministral 3 8B Open	Mistral AI	71	128,000
#154	Mistral Large 3 Open	Mistral AI	71	128,000
#155	Mistral Nemo 12B	Mistral AI	71	128,000
#156	Mistral OCR 2505	Mistral AI	71	32,768
#157	morph-v3-fast-apply	Morph	71	128,000
#158	Phi-3-vision-128k-instruct	Microsoft	71	128,000
#159	Phi-3.5-mini-instruct	Microsoft	71	131,072
#160	Phi-3.5-MoE-instruct	Microsoft	71	131,072
#161	Phi-3.5-vision-instruct	Microsoft	71	131,072
#162	Phi-4-mini-flash-reasoning	Microsoft	71	131,072
#163	Phi-4-mini-instruct	Microsoft	71	131,072
#164	Phi-4-multimodal-instruct	Microsoft	71	131,072
#165	Phi-4-reasoning	Microsoft	71	131,072
#166	Phi-4-reasoning-plus	Microsoft	71	131,072
#167	Phi-4-reasoning-vision-15B	Microsoft	71	131,072
#168	Qwen2.5-1.5B-Instruct	Alibaba Qwen	71	131,072
#169	Qwen2.5-14B-Instruct	Alibaba Qwen	71	131,072
#170	Qwen2.5-32B-Instruct	Alibaba Qwen	71	131,072
#171	Qwen2.5-3B-Instruct	Alibaba Qwen	71	131,072
#172	Qwen2.5-72B-Instruct	Alibaba Qwen	71	131,072
#173	Qwen2.5-7B-Instruct	Alibaba Qwen	71	131,072
#174	Qwen2.5-Max	Alibaba Qwen	71	131,072
#175	Qwen2.5-VL-72B-Instruct	Alibaba Qwen	71	131,072
#176	Qwen2.5-VL-7B-Instruct	Alibaba Qwen	71	131,072
#177	Qwen3-Coder-Next	Alibaba Qwen	71	131,072
#178	Qwen3.5-0.8B	Alibaba Qwen	71	131,072
#179	Qwen3.5-122B-A10B	Alibaba Qwen	71	131,072
#180	Qwen3.5-27B	Alibaba Qwen	71	131,072
#181	Qwen3.5-2B	Alibaba Qwen	71	131,072
#182	Qwen3.5-35B-A3B	Alibaba Qwen	71	131,072
#183	Qwen3.5-397B-A17B	Alibaba Qwen	71	131,072
#184	Qwen3.5-4B	Alibaba Qwen	71	131,072
#185	Qwen3.5-9B	Alibaba Qwen	71	131,072
#186	Sonar	Perplexity	71	128,000
#187	warpgrep-v2	Morph	71	128,000
#188	DeepSeek-OCR	DeepSeek	70	16,384
#189	DeepSeek-OCR-2	DeepSeek	70	16,384
#190	DeepSeek-VL2-Small	DeepSeek	70	16,384
#191	image-01	MiniMax	70	8,192
#192	image-01-live	MiniMax	70	8,192
#193	Janus-Pro-7B	DeepSeek	70	16,384
#194	Llama 3.1 405B Instruct	Meta	70	128,000
#195	Llama 3.1 70B Instruct	Meta	70	128,000
#196	MiniMax-Speech-02	MiniMax	70	8,192
#197	music-2.0	MiniMax	70	8,192
#198	Phi-4	Microsoft	70	16,384
#199	Step-3.5-Flash	StepFun	70	131,072
#200	Claude Opus 3	Anthropic	69	200,000
#201	Claude Opus 4	Anthropic	69	200,000
#202	Llama 3.2 90B Vision Instruct	Meta	69	128,000
#203	phi-1	Microsoft	69	4,096
#204	phi-1_5	Microsoft	69	4,096
#205	phi-2	Microsoft	69	4,096
#206	Phi-3-medium-4k-instruct	Microsoft	69	4,096
#207	Phi-3-mini-4k-instruct	Microsoft	69	4,096
#208	Phi-tiny-MoE-instruct	Microsoft	69	4,096
#209	flash-compact	Morph	68	200,000
#210	GPT Image 1	OpenAI	67	32,768
#211	Nova Micro	Amazon Web Services	67	128,000
#212	Step3-VL-10B	StepFun	67	131,072
#213	gpt-audio-mini	OpenAI	66	128,000
#214	gpt-realtime-mini	OpenAI	66	128,000
#215	MiMo-VL-7B	Xiaomi	66	131,072
#216	chatgpt-image-latest	OpenAI	65	32,768
#217	gpt-image-1-mini	OpenAI	65	32,768
#218	gpt-oss-20b	OpenAI	65	131,072
#219	Llama 3.1 8B Instruct	Meta	64	128,000
#220	Llama 3.2 11B Vision Instruct	Meta	64	128,000
#221	Llama 3.2 1B Instruct	Meta	61	128,000
#222	Llama 3.2 3B Instruct	Meta	61	128,000
#223	MiMo-Audio-7B	Xiaomi	61	131,072
#224	Code Llama 70B Instruct	Meta	60	8,192
#225	LFM2-24B-A2B	Liquid AI	60	32,768
#226	Llama Guard 4 12B	Meta	60	131,072
#227	Meta Llama 3 70B Instruct	Meta	60	8,192
#228	Step-Audio-R1.1	StepFun	59	131,072
#229	GPT-4o mini Transcribe	OpenAI	58	128,000
#230	GPT-4o mini TTS	OpenAI	58	128,000
#231	GPT-4o Transcribe	OpenAI	58	128,000
#232	Code Llama 34B Instruct	Meta	57	8,192
#233	Llama Guard 3 11B Vision	Meta	57	131,072
#234	Meta Llama 3 8B Instruct	Meta	57	8,192
#235	LFM2-8B-A1B	Liquid AI	55	32,768
#236	FLUX 1.1 Pro	Black Forest Labs	51	512
#237	FLUX 1.1 Pro Ultra	Black Forest Labs	51	512
#238	FLUX 1 Pro	Black Forest Labs	49	512
#239	LFM2.5-1.2B-Instruct	Liquid AI	49	131,072
#240	LFM2.5-1.2B-Thinking	Liquid AI	49	131,072
#241	LFM2-2.6B	Liquid AI	47	32,768
#242	pplx-embed-v1-4b	Perplexity	44	8,192
#243	pplx-embed-v1-0.6b	Perplexity	43	8,192
#244	NextStep-1.1	StepFun	42	512
#245	Prompt Guard 86M	Meta	41	512
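
The table above appears to be ordered by score, highest first, with tied scores listed alphabetically by model name regardless of letter case. A minimal sketch of that ordering follows, assuming this reading of the table is correct; the `Entry` type and `rank` function are hypothetical names used only for illustration, not part of the site.

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    model: str
    score: int


def rank(entries: List[Entry], start: int = 1) -> List[Tuple[int, str, int]]:
    """Order by score (descending), then by model name (case-insensitive).

    Each entry gets its own sequential rank, so tied scores still receive
    distinct rank numbers, as in the table.
    """
    ordered = sorted(entries, key=lambda e: (-e.score, e.model.casefold()))
    return [(i, e.model, e.score) for i, e in enumerate(ordered, start=start)]


# Four rows taken from the table, deliberately shuffled.
rows = [
    Entry("GPT-4o mini TTS", 58),
    Entry("GPT-4o Transcribe", 58),
    Entry("Step-Audio-R1.1", 59),
    Entry("GPT-4o mini Transcribe", 58),
]

for position, model, score in rank(rows, start=228):
    print(f"#{position}\t{model}\t{score}")
# #228	Step-Audio-R1.1	59
# #229	GPT-4o mini Transcribe	58
# #230	GPT-4o mini TTS	58
# #231	GPT-4o Transcribe	58
```

A case-sensitive comparison would place "GPT-4o mini TTS" ahead of "GPT-4o mini Transcribe", which is why the sketch uses `casefold()` to match the order shown in the table.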

Why #1: GPT-5.4

OpenAI's GPT-5.4 is billed as the most capable and efficient frontier model for professional work and as the first general-purpose model with native computer-use capabilities. It combines the industry-leading coding of GPT-5.3-Codex with improved agentic workflows.

This model meets the current full-profile threshold in the leaderboard methodology.

Why #2: Claude Sonnet 4.6

Anthropic's current Sonnet tier for fast frontier reasoning, coding, and long-context agent work.

This model meets the current full-profile threshold in the leaderboard methodology.

Why #3: Claude Opus 4.6

Anthropic's most intelligent Claude model for complex agents, coding, and deep reasoning, with a 1M-token context window and 128K-token output.

This model meets the current full-profile threshold in the leaderboard methodology.