Benchmark
GlobalMMLU
Reasoning
Multilingual reasoning benchmark. Tests knowledge and reasoning across 40+ languages and cultural contexts.
Interpretation
GlobalMMLU evaluates reasoning and problem-solving capability across languages. It ranks 11 models, from GPT-5.4 (85) at the top to Llama 4 Maverick (75) at the bottom. Scores from this benchmark contribute to the reasoning score shown on model pages and rankings.
Methodology: Translated and culturally adapted MMLU questions across 40+ languages. Tests multilingual reasoning capability.
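A multiple-choice benchmark like this is typically scored as accuracy per language, then macro-averaged so high-resource languages do not dominate. The sketch below illustrates that scoring pattern with made-up data; it is not the official GlobalMMLU evaluation code, and the language codes and answers are illustrative.

```python
from collections import defaultdict

def macro_accuracy(results):
    """results: list of (language, predicted_choice, gold_choice).

    Returns the mean of per-language accuracies (macro average),
    a common aggregation for multilingual benchmarks.
    """
    per_lang = defaultdict(lambda: [0, 0])  # language -> [correct, total]
    for lang, pred, gold in results:
        per_lang[lang][0] += int(pred == gold)
        per_lang[lang][1] += 1
    accs = [correct / total for correct, total in per_lang.values()]
    return sum(accs) / len(accs)

# Illustrative toy data: English 2/2 correct, Swahili 1/2 correct.
results = [
    ("en", "A", "A"), ("en", "B", "B"),
    ("sw", "C", "D"), ("sw", "A", "A"),
]
print(macro_accuracy(results))  # 0.75
```

A micro average (total correct over total questions) would give the same 0.75 here only because both languages have equal counts; with unbalanced per-language sample sizes the two aggregations diverge.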
Source: https://huggingface.co/datasets/CohereForAI/Global-MMLU