
NVIDIA | Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) by NVIDIA: text model; time to first token (TTFT) 0.721 s, output speed 42.5 tokens/s.
Tags: artificial-analysis, manufacturer, aa-bootstrap
Latency: 721 ms
Throughput: -
Total Context: -
Max Output: -
Input Price: $0.6/M tokens
Output Price: $1.8/M tokens
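As a rough illustration of how the listed prices and speed figures combine, the sketch below estimates the cost and wall-clock time of a single request. The constants are copied from this listing; the function names and the example request sizes are hypothetical, and the latency estimate assumes the output streams at the listed average speed once the first token arrives. It also assumes any reasoning tokens are counted as output tokens, which may not match how a given provider bills.

```python
# Figures copied from the listing; everything else here is an illustrative assumption.
INPUT_PRICE_PER_M = 0.6     # USD per million input tokens
OUTPUT_PRICE_PER_M = 1.8    # USD per million output tokens
TTFT_SECONDS = 0.721        # time to first token
TOKENS_PER_SECOND = 42.502  # listed inference speed


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def estimate_latency(output_tokens: int) -> float:
    """Estimated seconds until the full response has streamed (hypothetical helper)."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens, 500 generated tokens.
    print(f"cost: ${estimate_cost(2_000, 500):.4f}")
    print(f"latency: {estimate_latency(500):.1f} s")
```

For the hypothetical 2,000-token prompt with 500 output tokens, the cost works out to about $0.0021. Measured latency will vary with load and provider, so treat the time estimate as an order-of-magnitude guide.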
API Parameters & Capabilities
Model Type: -
Parameter Size: -
Input Modality: -
Output Modality: -
Inference Speed: 42.502 tokens/s
Success Rate: -
Peak Concurrency: -
Release Date: 5/7/2026
Integration & Pricing Details
Pricing Mode: -
Free Tier: -
Supported Languages: -
SDK: -
API Key Acquisition: -
Rate Limit: -
User Reviews
0 verified user reviews
