Leaderboard
Safety rankings
Updated weekly

Each leaderboard uses transparent weighted scoring, current model context, and supporting analysis to help teams interpret the results with confidence. Only full-profile entries appear in the rankings; broader catalog records remain available elsewhere on the site when only source-backed metadata is currently available.
Only models with sufficiently complete metadata and scoring coverage to be meaningfully ranked appear in this category.
Scores combine benchmark evidence, product metadata, and cost/context signals whenever those fields are published.
Tracked models without full scoring remain in the directory and on provider pages, but they are not used to support analytical ranking claims.
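The weighted-scoring idea can be sketched as a simple blend of sub-scores. The weights and field names below are illustrative assumptions for clarity, not the actual methodology behind this leaderboard.

```python
# Minimal sketch of a weighted composite score, assuming three 0-100
# sub-scores. The 0.6/0.25/0.15 split is a hypothetical example, not
# the weighting this site actually uses.

def composite_score(benchmark, metadata, cost_context,
                    weights=(0.6, 0.25, 0.15)):
    """Blend three 0-100 sub-scores into a single 0-100 composite."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    w_bench, w_meta, w_cost = weights
    return round(w_bench * benchmark + w_meta * metadata + w_cost * cost_context)

# Example: strong benchmarks, solid metadata, middling cost profile.
print(composite_score(95, 90, 80))  # → 92
```

A model missing one sub-score would simply be excluded from the ranked list, which mirrors the full-profile rule described above.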
| Rank | Model | Provider | Score | Context |
|---|---|---|---|---|
| #1 | Claude Opus 4.6 | Anthropic | 92 | 1,000,000 |
| #2 | Claude Sonnet 4.5 | Anthropic | 92 | 1,000,000 |
| #3 | Claude Sonnet 4.6 | Anthropic | 92 | 1,000,000 |
| #4 | Claude 3.7 Sonnet | Anthropic | 91 | 200,000 |
| #5 | Claude Opus 4.1 | Anthropic | 90 | 200,000 |
| #6 | Claude Sonnet 4 | Anthropic | 90 | 1,000,000 |
| #7 | Claude Opus 4 | Anthropic | 89 | 200,000 |
| #8 | Command R+ 2026 | Cohere | 89 | 128,000 |
| #9 | Gemini 2.5 Pro | Google DeepMind | 88 | 1,048,576 |
| #10 | GPT-4o | OpenAI | 88 | 128,000 |
| #11 | GPT-5.4 | OpenAI | 88 | 1,000,000 |
| #12 | Claude Haiku 4.5 | Anthropic | 87 | 200,000 |
| #13 | Gemini 3.1 Pro | Google DeepMind | 86 | 1,048,576 |
| #14 | GPT-5.2 | OpenAI | 86 | 1,000,000 |
| #15 | GPT-5.4 Pro | OpenAI | 86 | 1,000,000 |
| #16 | Gemini 3.0 Pro | Google DeepMind | 85 | 1,048,576 |
| #17 | Gemini 3.1 Flash | Google DeepMind | 85 | 1,048,576 |
| #18 | GPT-5 | OpenAI | 85 | 1,000,000 |
| #19 | GPT-5.2 Pro | OpenAI | 85 | 1,000,000 |
| #20 | GPT-5.3 Instant | OpenAI | 85 | 1,000,000 |
| #21 | Claude Opus 3 | Anthropic | 84 | 200,000 |
| #22 | Claude Sonnet 3 | Anthropic | 84 | 200,000 |
| #23 | Gemini 2.0 Flash | Google DeepMind | 84 | 1,048,576 |
| #24 | Gemini 2.5 Flash | Google DeepMind | 84 | 1,048,576 |
| #25 | Gemini 2.5 Pro TTS | Google DeepMind | 84 | 1,048,576 |
| #26 | Gemini 3.0 Flash | Google DeepMind | 84 | 1,048,576 |
| #27 | GPT-5 mini | OpenAI | 84 | 1,000,000 |
| #28 | GPT-5.3-Codex | OpenAI | 84 | 1,000,000 |
| #29 | Mistral Large 25 | Mistral AI | 84 | 128,000 |
| #30 | Sonar Deep Research | Perplexity | 84 | 200,000 |
| #31 | Claude Haiku 3.5 | Anthropic | 83 | 200,000 |
| #32 | Gemini 2.5 Flash Live | Google DeepMind | 83 | 1,048,576 |
| #33 | Gemini 2.5 Flash Native Audio Preview | Google DeepMind | 83 | 1,048,576 |
| #34 | Gemini 2.5 Flash-Lite | Google DeepMind | 83 | 1,048,576 |
| #35 | Gemini 3.1 Flash-Lite | Google DeepMind | 83 | 1,048,576 |
| #36 | GPT-4.1 | OpenAI | 83 | 1,048,576 |
| #37 | GPT-4o-mini | OpenAI | 83 | 128,000 |
| #38 | Llama Guard 4 12B | Meta | 83 | 131,072 |
| #39 | Nova Pro | Amazon Web Services | 83 | 300,000 |
| #40 | o4-mini | OpenAI | 83 | 200,000 |
| #41 | Sonar Reasoning Pro | Perplexity | 83 | 200,000 |
| #42 | GPT-4.1 mini | OpenAI | 82 | 1,048,576 |
| #43 | GPT-5 nano | OpenAI | 82 | 1,000,000 |
| #44 | Sonar Pro | Perplexity | 82 | 200,000 |
| #45 | Claude Haiku 3 | Anthropic | 81 | 200,000 |
| #46 | Gemini 1.5 Pro | Google DeepMind | 81 | 2,097,152 |
| #47 | Gemini 2.0 Flash-Lite | Google DeepMind | 81 | 1,048,576 |
| #48 | Llama 4 Maverick | Meta | 81 | 1,048,576 |
| #49 | Nova Lite | Amazon Web Services | 81 | 300,000 |
| #50 | o4-mini-deep-research | OpenAI | 81 | 200,000 |
| #51 | Sonar | Perplexity | 81 | 128,000 |
| #52 | Gemini 1.5 Flash | Google DeepMind | 80 | 1,048,576 |
| #53 | Llama Guard 3 11B Vision | Meta | 80 | 131,072 |
| #54 | o1-mini | OpenAI | 80 | 128,000 |
| #55 | o3 | OpenAI | 80 | 200,000 |
| #56 | o3-deep-research | OpenAI | 80 | 200,000 |
| #57 | morph-v3-fast-apply | Morph | 79 | 128,000 |
| #58 | Nova Micro | Amazon Web Services | 79 | 128,000 |
| #59 | warpgrep-v2 | Morph | 79 | 128,000 |
| #60 | Gemini 1.5 Flash-8B | Google DeepMind | 78 | 1,048,576 |
| #61 | gpt-audio | OpenAI | 78 | 128,000 |
| #62 | gpt-realtime | OpenAI | 78 | 128,000 |
| #63 | o1 | OpenAI | 78 | 200,000 |
| #64 | flash-compact | Morph | 77 | 200,000 |
| #65 | Claude Haiku 3 | Anthropic | 76 | 200,000 |
| #66 | Codestral | Mistral AI | 76 | 256,000 |
| #67 | GPT Image 1 | OpenAI | 76 | 32,768 |
| #68 | gpt-audio-mini | OpenAI | 76 | 128,000 |
| #69 | gpt-realtime-mini | OpenAI | 76 | 128,000 |
| #70 | Voxtral Mini Transcribe | Mistral AI | 76 | 131,072 |
| #71 | chatgpt-image-latest | OpenAI | 75 | 32,768 |
| #72 | Claude Haiku 3.5 | Anthropic | 75 | 200,000 |
| #73 | Claude Haiku 4.5 | Anthropic | 75 | 200,000 |
| #74 | Codestral Embed | Mistral AI | 75 | 32,768 |
| #75 | CogVideoX | Z.AI | 75 | 128,000 |
| #76 | CogView 4 | Z.AI | 75 | 128,000 |
| #77 | Devstral Medium 1.0 | Mistral AI | 75 | 128,000 |
| #78 | Doubao-Seed-1.6 | ByteDance / Doubao | 75 | 128,000 |
| #79 | Doubao-Seed-1.6-Flash | ByteDance / Doubao | 75 | 128,000 |
| #80 | Doubao-Seed-2.0-Code | ByteDance / Doubao | 75 | 128,000 |
| #81 | Doubao-Seed-Code | ByteDance / Doubao | 75 | 128,000 |
| #82 | ERNIE 3.5 128K | Baidu / ERNIE | 75 | 128,000 |
| #83 | ERNIE 4.0 Turbo 8K | Baidu / ERNIE | 75 | 128,000 |
| #84 | ERNIE Functions 8K | Baidu / ERNIE | 75 | 128,000 |
| #85 | ERNIE Speed 128K | Baidu / ERNIE | 75 | 128,000 |
| #86 | GLM-4.5 | Z.AI | 75 | 128,000 |
| #87 | GLM-4.5V | Z.AI | 75 | 128,000 |
| #88 | GLM-4.6 | Z.AI | 75 | 128,000 |
| #89 | GLM-4.6V | Z.AI | 75 | 128,000 |
| #90 | GLM-4.7 | Z.AI | 75 | 128,000 |
| #91 | GLM-5 | Z.AI | 75 | 128,000 |
| #92 | GLM-Image | Z.AI | 75 | 128,000 |
| #93 | GLM-OCR | Z.AI | 75 | 128,000 |
| #94 | gpt-image-1-mini | OpenAI | 75 | 32,768 |
| #95 | Hunyuan Code | Tencent / Hunyuan | 75 | 128,000 |
| #96 | Hunyuan Lite | Tencent / Hunyuan | 75 | 128,000 |
| #97 | Hunyuan Standard | Tencent / Hunyuan | 75 | 128,000 |
| #98 | Hunyuan T1 | Tencent / Hunyuan | 75 | 256,000 |
| #99 | Hunyuan T1 Vision | Tencent / Hunyuan | 75 | 128,000 |
| #100 | Hunyuan TurboS | Tencent / Hunyuan | 75 | 128,000 |
| #101 | Hunyuan TurboS LongText 128K | Tencent / Hunyuan | 75 | 128,000 |
| #102 | Kimi K2 | Moonshot AI / Kimi | 75 | 131,072 |
| #103 | Kimi K2 Thinking | Moonshot AI / Kimi | 75 | 256,000 |
| #104 | Kimi K2 Turbo Preview | Moonshot AI / Kimi | 75 | 256,000 |
| #105 | Kimi K2.5 | Moonshot AI / Kimi | 75 | 256,000 |
| #106 | Magistral Medium 1.2 | Mistral AI | 75 | 128,000 |
| #107 | MiniMax-M1 | MiniMax | 75 | 204,800 |
| #108 | MiniMax-M2 | MiniMax | 75 | 204,800 |
| #109 | MiniMax-M2.1 | MiniMax | 75 | 204,800 |
| #110 | MiniMax-M2.1-highspeed | MiniMax | 75 | 204,800 |
| #111 | MiniMax-M2.5 | MiniMax | 75 | 204,800 |
| #112 | MiniMax-M2.5-highspeed | MiniMax | 75 | 204,800 |
| #113 | MiniMax-Text-01 | MiniMax | 75 | 204,800 |
| #114 | MiniMax-VL-01 | MiniMax | 75 | 204,800 |
| #115 | Mistral Embed | Mistral AI | 75 | 32,768 |
| #116 | Mistral Large 3 | Mistral AI | 75 | 128,000 |
| #117 | Mistral Medium 3.1 | Mistral AI | 75 | 128,000 |
| #118 | Mistral Moderation | Mistral AI | 75 | 32,768 |
| #119 | Mistral Small 3.1 | Mistral AI | 75 | 128,000 |
| #120 | Mistral Small 3.2 Open | Mistral AI | 75 | 128,000 |
| #121 | Pixtral 12B | Mistral AI | 75 | 131,072 |
| #122 | Pixtral Large | Mistral AI | 75 | 131,072 |
| #123 | Vidu Q1 | Z.AI | 75 | 128,000 |
| #124 | Voxtral Mini Open | Mistral AI | 75 | 131,072 |
| #125 | Voxtral Small Open | Mistral AI | 75 | 131,072 |
| #126 | Claude Sonnet 3 | Anthropic | 74 | 200,000 |
| #127 | Claude Sonnet 4 | Anthropic | 74 | 1,000,000 |
| #128 | Command A | Cohere | 74 | 256,000 |
| #129 | Command A Reasoning | Cohere | 74 | 128,000 |
| #130 | Command A Translate | Cohere | 74 | 128,000 |
| #131 | Command A Vision | Cohere | 74 | 128,000 |
| #132 | Command R+ | Cohere | 74 | 128,000 |
| #133 | Command R7B | Cohere | 74 | 128,000 |
| #134 | DeepSeek-Coder-V2 | DeepSeek | 74 | 128,000 |
| #135 | DeepSeek-Math-V2 | DeepSeek | 74 | 128,000 |
| #136 | DeepSeek-R1 | DeepSeek | 74 | 128,000 |
| #137 | DeepSeek-R1-Distill-Llama-70B | DeepSeek | 74 | 128,000 |
| #138 | DeepSeek-V2.5 | DeepSeek | 74 | 128,000 |
| #139 | DeepSeek-V3 | DeepSeek | 74 | 128,000 |
| #140 | DeepSeek-V3.1 | DeepSeek | 74 | 128,000 |
| #141 | DeepSeek-V3.1-Base | DeepSeek | 74 | 128,000 |
| #142 | DeepSeek-V3.2 | DeepSeek | 74 | 128,000 |
| #143 | DeepSeek-V3.2-Exp | DeepSeek | 74 | 128,000 |
| #144 | Devstral 2 Open | Mistral AI | 74 | 128,000 |
| #145 | Devstral Small 2 | Mistral AI | 74 | 128,000 |
| #146 | Embed 4 | Cohere | 74 | 128,000 |
| #147 | ERNIE 4.5 Turbo 32K | Baidu / ERNIE | 74 | 32,768 |
| #148 | Grok 3 | xAI | 74 | 131,072 |
| #149 | Grok 3 Mini | xAI | 74 | 131,072 |
| #150 | Grok 4 | xAI | 74 | 256,000 |
| #151 | Grok 4 Fast Reasoning | xAI | 74 | 131,072 |
| #152 | grok-image | xAI | 74 | 131,072 |
| #153 | image-01 | MiniMax | 74 | 8,192 |
| #154 | image-01-live | MiniMax | 74 | 8,192 |
| #155 | Jamba 3B | AI21 Labs | 74 | 256,000 |
| #156 | Jamba Large | AI21 Labs | 74 | 256,000 |
| #157 | Jamba Large 1.6 | AI21 Labs | 74 | 256,000 |
| #158 | Jamba Mini | AI21 Labs | 74 | 256,000 |
| #159 | Jamba Mini 1.6 | AI21 Labs | 74 | 256,000 |
| #160 | Jamba Mini 1.7 | AI21 Labs | 74 | 256,000 |
| #161 | Magistral Small 1.2 Open | Mistral AI | 74 | 128,000 |
| #162 | MiniMax-Speech-02 | MiniMax | 74 | 8,192 |
| #163 | Ministral 3 14B Open | Mistral AI | 74 | 128,000 |
| #164 | Ministral 3 3B Open | Mistral AI | 74 | 128,000 |
| #165 | Ministral 3 8B Open | Mistral AI | 74 | 128,000 |
| #166 | Mistral Large 3 Open | Mistral AI | 74 | 128,000 |
| #167 | Mistral Nemo 12B | Mistral AI | 74 | 128,000 |
| #168 | Mistral OCR 2505 | Mistral AI | 74 | 32,768 |
| #169 | music-2.0 | MiniMax | 74 | 8,192 |
| #170 | Phi-3-vision-128k-instruct | Microsoft | 74 | 128,000 |
| #171 | Phi-3.5-mini-instruct | Microsoft | 74 | 131,072 |
| #172 | Phi-3.5-MoE-instruct | Microsoft | 74 | 131,072 |
| #173 | Phi-3.5-vision-instruct | Microsoft | 74 | 131,072 |
| #174 | Phi-4-mini-flash-reasoning | Microsoft | 74 | 131,072 |
| #175 | Phi-4-mini-instruct | Microsoft | 74 | 131,072 |
| #176 | Phi-4-multimodal-instruct | Microsoft | 74 | 131,072 |
| #177 | Phi-4-reasoning | Microsoft | 74 | 131,072 |
| #178 | Phi-4-reasoning-plus | Microsoft | 74 | 131,072 |
| #179 | Phi-4-reasoning-vision-15B | Microsoft | 74 | 131,072 |
| #180 | Qwen2.5-1.5B-Instruct | Alibaba Qwen | 74 | 131,072 |
| #181 | Qwen2.5-14B-Instruct | Alibaba Qwen | 74 | 131,072 |
| #182 | Qwen2.5-32B-Instruct | Alibaba Qwen | 74 | 131,072 |
| #183 | Qwen2.5-3B-Instruct | Alibaba Qwen | 74 | 131,072 |
| #184 | Qwen2.5-72B-Instruct | Alibaba Qwen | 74 | 131,072 |
| #185 | Qwen2.5-7B-Instruct | Alibaba Qwen | 74 | 131,072 |
| #186 | Qwen2.5-Max | Alibaba Qwen | 74 | 131,072 |
| #187 | Qwen2.5-VL-72B-Instruct | Alibaba Qwen | 74 | 131,072 |
| #188 | Qwen2.5-VL-7B-Instruct | Alibaba Qwen | 74 | 131,072 |
| #189 | Qwen3-Coder-Next | Alibaba Qwen | 74 | 131,072 |
| #190 | Qwen3.5-0.8B | Alibaba Qwen | 74 | 131,072 |
| #191 | Qwen3.5-122B-A10B | Alibaba Qwen | 74 | 131,072 |
| #192 | Qwen3.5-27B | Alibaba Qwen | 74 | 131,072 |
| #193 | Qwen3.5-2B | Alibaba Qwen | 74 | 131,072 |
| #194 | Qwen3.5-35B-A3B | Alibaba Qwen | 74 | 131,072 |
| #195 | Qwen3.5-397B-A17B | Alibaba Qwen | 74 | 131,072 |
| #196 | Qwen3.5-4B | Alibaba Qwen | 74 | 131,072 |
| #197 | Qwen3.5-9B | Alibaba Qwen | 74 | 131,072 |
| #198 | DeepSeek-OCR | DeepSeek | 73 | 16,384 |
| #199 | DeepSeek-OCR-2 | DeepSeek | 73 | 16,384 |
| #200 | DeepSeek-VL2-Small | DeepSeek | 73 | 16,384 |
| #201 | GPT-4o mini Transcribe | OpenAI | 73 | 128,000 |
| #202 | GPT-4o mini TTS | OpenAI | 73 | 128,000 |
| #203 | GPT-4o Transcribe | OpenAI | 73 | 128,000 |
| #204 | gpt-oss-120b | OpenAI | 73 | 131,072 |
| #205 | Janus-Pro-7B | DeepSeek | 73 | 16,384 |
| #206 | Phi-4 | Microsoft | 73 | 16,384 |
| #207 | Claude Opus 3 | Anthropic | 72 | 200,000 |
| #208 | Claude Opus 4 | Anthropic | 72 | 200,000 |
| #209 | Llama 3.3 70B Instruct | Meta | 72 | 128,000 |
| #210 | phi-1 | Microsoft | 72 | 4,096 |
| #211 | phi-1_5 | Microsoft | 72 | 4,096 |
| #212 | phi-2 | Microsoft | 72 | 4,096 |
| #213 | Phi-3-medium-4k-instruct | Microsoft | 72 | 4,096 |
| #214 | Phi-3-mini-4k-instruct | Microsoft | 72 | 4,096 |
| #215 | Phi-tiny-MoE-instruct | Microsoft | 72 | 4,096 |
| #216 | pplx-embed-v1-4b | Perplexity | 72 | 8,192 |
| #217 | Prompt Guard 86M | Meta | 72 | 512 |
| #218 | pplx-embed-v1-0.6b | Perplexity | 71 | 8,192 |
| #219 | Step-3.5-Flash | StepFun | 71 | 131,072 |
| #220 | gpt-oss-20b | OpenAI | 70 | 131,072 |
| #221 | Llama 3.1 405B Instruct | Meta | 70 | 128,000 |
| #222 | Llama 4 Scout | Meta | 70 | 10,485,760 |
| #223 | LFM2-24B-A2B | Liquid AI | 69 | 32,768 |
| #224 | Llama 3.1 70B Instruct | Meta | 69 | 128,000 |
| #225 | MiMo-VL-7B | Xiaomi | 68 | 131,072 |
| #226 | Step3-VL-10B | StepFun | 68 | 131,072 |
| #227 | Llama 3.1 8B Instruct | Meta | 67 | 128,000 |
| #228 | Llama 3.2 90B Vision Instruct | Meta | 67 | 128,000 |
| #229 | MiMo-Audio-7B | Xiaomi | 67 | 131,072 |
| #230 | LFM2-8B-A1B | Liquid AI | 66 | 32,768 |
| #231 | Llama 3.2 1B Instruct | Meta | 66 | 128,000 |
| #232 | Llama 3.2 3B Instruct | Meta | 66 | 128,000 |
| #233 | Step-Audio-R1.1 | StepFun | 66 | 131,072 |
| #234 | Llama 3.2 11B Vision Instruct | Meta | 65 | 128,000 |
| #235 | Code Llama 70B Instruct | Meta | 64 | 8,192 |
| #236 | Meta Llama 3 70B Instruct | Meta | 64 | 8,192 |
| #237 | FLUX 1.1 Pro | Black Forest Labs | 63 | 512 |
| #238 | LFM2.5-1.2B-Instruct | Liquid AI | 63 | 131,072 |
| #239 | LFM2.5-1.2B-Thinking | Liquid AI | 63 | 131,072 |
| #240 | Code Llama 34B Instruct | Meta | 62 | 8,192 |
| #241 | FLUX 1.1 Pro Ultra | Black Forest Labs | 62 | 512 |
| #242 | Meta Llama 3 8B Instruct | Meta | 62 | 8,192 |
| #243 | FLUX 1 Pro | Black Forest Labs | 61 | 512 |
| #244 | LFM2-2.6B | Liquid AI | 60 | 32,768 |
| #245 | NextStep-1.1 | StepFun | 54 | 512 |
Why #1: Claude Opus 4.6
Anthropic's most intelligent Claude model for complex agents, coding, and deep reasoning, with 1M token context and 128K output.
This model clears the current full-profile threshold under the leaderboard methodology.
Why #2: Claude Sonnet 4.5
Anthropic's Sonnet 4.5 with 1M token context for fast frontier reasoning, coding, and long-context agent work.
This model clears the current full-profile threshold under the leaderboard methodology.
Why #3: Claude Sonnet 4.6
Anthropic's current Sonnet tier for fast frontier reasoning, coding, and long-context agent work.
This model clears the current full-profile threshold under the leaderboard methodology.