OpenAI

gpt-audio

Overall 72commercial

OpenAI's realtime and audio models for low-latency voice interfaces.

Last verified: 2026-03-29Confidence: HighSources: 3

audiotext

Input price

$0.006/1K tok

Output price

$0.02/1K tok

Context window

128,000

Max output

16,384

Release date

2025-06-01

Access

api, hosted

License

Proprietary / not disclosed

Last verified

2026-03-29

Capability profile

Radar view of the model's practical strengths. This chart is backed by textual summaries below for crawlability.

Benchmark summary

This entry has source-backed metadata, but benchmark coverage is still sparse, so the page emphasizes published specs and deployment fit.

No benchmark series is attached to this model yet. Source links and product metadata are available below.

Strengths

• Source-backed listing for audio / realtime buyers
• Useful when deeper benchmark profiling is still incomplete

Trade-offs

• Detailed benchmark coverage is still limited for this entry
• Some pricing or capability fields may remain unpublished

Crawlable benchmark analysis

gpt-audio is positioned as an audio / realtime model with published scores that emphasize its practical fit for buyers evaluating the entry.

Published scores highlight reasoning 78/100, coding 62/100, enterprise readiness 78/100, vision 38/100, speed 95/100, and safety 76/100.

Pricing starts at $0.006 per 1K input tokens and $0.02 per 1K output tokens. With a context window of 128,000 tokens, it supports large-document analysis and retrieval workflows.

Benchmark coverage is still limited for this entry, so this section focuses on published metadata and deployment fit.

Sources

Provider and distribution links used to verify this model record.

Last verified: 2026-03-29

OpenAI website
official-website
Open link
OpenAI docs
official-docs
Open link
OpenAI pricing
official-pricing
Open link

Related models

OpenAI

GPT-5.4

OpenAI

OpenAI's GPT-5.4, the most capable and efficient frontier model for professional work. First general-purpose model with native computer-use capabilities. Combines industry-leading coding from GPT-5.3-Codex with improved agentic workflows.

Score 933 sources

textreasoningtool-usevisionapihosted

Context: 1,000,000
Input: $0.005/1K tok
Output: $0.02/1K tok
Coverage: Full profile

View analysis

Anthropic

Claude Sonnet 4.6

Claude 4.6

Anthropic's current Sonnet tier for fast frontier reasoning, coding, and long-context agent work.

Score 923 sources

textvisionreasoningcodetool-useapihosted

Context: 1,000,000
Input: $0.003/1K tok
Output: $0.02/1K tok
Coverage: Full profile

View analysis

Anthropic

Claude Opus 4.6

Claude 1M

Anthropic's most intelligent Claude model for complex agents, coding, and deep reasoning, with 1M token context and 128K output.

Score 913 sources

textvisionreasoningapihosted

Context: 1,000,000
Input: $0.005/1K tok
Output: $0.03/1K tok
Coverage: Full profile

View analysis