NVIDIA

Llama-Embed-Nemotron 8B

Open weight

NVIDIA's Llama-Embed-Nemotron 8B ranked #1 on the multilingual MTEB leaderboard and supports text and image retrieval.

Last verified: 2026-03-29 | Confidence: Medium | Sources: 4

text, image, open-source

Input price

Unpublished

Output price

Unpublished

Context window

32,768

Max output

Release date

2025-10-21

Access

open-weight, self-hosted, api

License

Llama Community License

Last verified

2026-03-29

Capability profile

Radar view of the model's practical strengths. This chart is backed by textual summaries below for crawlability.

Benchmark summary

Top multilingual embedding model for multimodal retrieval workflows.

No benchmark series is attached to this model yet. Source links and product metadata are available below.

Strengths

• #1 multilingual MTEB
• Text+image retrieval
• Strong RAG performance

Trade-offs

• Embedding only, no generation
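Because the model returns vectors rather than generated text, retrieval is done by comparing embeddings. The sketch below is a minimal illustration of that step, assuming the embeddings have already been computed elsewhere: the three-dimensional vectors and the document names are made-up stand-ins, not real model output, and no NVIDIA API is called.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Sort (doc_id, score) pairs from most to least similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; a real embedding has thousands of dimensions.
doc_vecs = {
    "invoice": [0.9, 0.1, 0.0],
    "contract": [0.1, 0.9, 0.1],
    "photo": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG pipeline the top-ranked documents would then be passed to a separate generative model, since this entry does not produce text itself.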

Crawlable benchmark analysis

Llama-Embed-Nemotron 8B is an embedding model, so its published scores emphasize deployment fit rather than generative ability.

Published scores highlight reasoning 22/100, coding 12/100, enterprise readiness 82/100, vision 60/100, speed 78/100, and safety 72/100.

Input and output pricing are not published for this entry. The 32,768-token context window supports large-document analysis and retrieval workflows.
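Documents longer than the context window have to be split before they are embedded. The sketch below shows one common way to do that, assuming the text has already been tokenized: the 32,768 limit comes from this entry, while the 256-token overlap and the `chunk_tokens` helper are hypothetical choices for illustration, and the placeholder token list stands in for a real tokenizer's output.

```python
CONTEXT_WINDOW = 32_768  # token limit listed for this entry


def chunk_tokens(tokens, max_tokens=CONTEXT_WINDOW, overlap=256):
    """Split a token list into windows of at most max_tokens.

    Consecutive windows share `overlap` tokens so that text cut at a
    boundary still appears intact in one of the chunks.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Placeholder tokens; a real pipeline would use the model's own tokenizer.
tokens = ["tok"] * 70_000
chunks = chunk_tokens(tokens)
print([len(chunk) for chunk in chunks])
```

Each chunk is embedded separately, and the resulting vectors are indexed alongside a pointer back to the source document.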

Benchmark coverage is still limited for this entry, so this section focuses on published metadata and deployment fit.

Sources

Provider and distribution links used to verify this model record.


• NVIDIA website (official-website)
• NVIDIA NIM docs (official-docs)
• NVIDIA NIM pricing (official-pricing)
• NVIDIA HuggingFace (cloud-platform)

Related models

OpenAI

GPT-5.4

OpenAI

OpenAI's GPT-5.4 is described as its most capable and efficient frontier model for professional work and the first general-purpose model with native computer-use capabilities. It combines the coding ability of GPT-5.3-Codex with improved agentic workflows.

Score: 93 | 3 sources

text, reasoning, tool-use, vision, api, hosted

Context: 1,000,000
Input: $0.005/1K tok
Output: $0.02/1K tok
Coverage: Full profile


Anthropic

Claude Sonnet 4.6

Claude 4.6

Anthropic's current Sonnet tier for fast frontier reasoning, coding, and long-context agent work.

Score: 92 | 3 sources

text, vision, reasoning, code, tool-use, api, hosted

Context: 1,000,000
Input: $0.003/1K tok
Output: $0.02/1K tok
Coverage: Full profile


Anthropic

Claude Opus 4.6

Claude 1M

Anthropic's most intelligent Claude model for complex agents, coding, and deep reasoning, with 1M token context and 128K output.

Score: 91 | 3 sources

text, vision, reasoning, api, hosted

Context: 1,000,000
Input: $0.005/1K tok
Output: $0.03/1K tok
Coverage: Full profile
