Benchmark

Vision Vista

vision

Synthetic multimodal benchmark for image understanding and analysis. Tests visual reasoning, OCR, document understanding, and image captioning.

Interpretation

Vision Vista is a vision benchmark that evaluates visual understanding and analysis. It ranks 17 models, with scores ranging from 94 (Gemini 3.1 Pro) down to 52.7 (Command R+ 2026). Its results contribute to the vision score shown on model pages and in rankings.

Methodology: The benchmark uses synthetic and real-world image understanding tasks, including OCR, visual QA, document parsing, and image captioning.