Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)
Model Type
Text
Max Context
-
Max Output
-
Parameter Scale
253B (per the model name)
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) by NVIDIA: text model; time to first token (TTFT) 0.721 s, output speed 42.5 tokens/s.
Input Modality
-
Output Modality
-
Inference Speed
42.502 tokens/s
Latest Release Date
4/7/2025
SDK Ecosystem
-
artificial-analysis, manufacturer, aa-bootstrap
Model Overview Fields
name
Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)
Release date
4/7/2025
Performance
42.502 tokens/s
model_creator.name
NVIDIA
AA Evaluation Scores
AA Intelligence Index
15
AA Coding Index
13.1
AA Math Index
63.7
MMLU Pro
0.825
GPQA
0.728
HLE
0.081
LiveCodeBench
0.641
SciCode
0.347
Math-500
0.952
AIME
0.747
AIME 2025
0.637
IFBench
0.382
LCR
0.073
TerminalBench Hard
0.023
TAU2
0.114
Output Speed (tokens/s)
42.502 tok/s
TTFT (s)
0.721s
First Answer Token (s)
47.778s
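
The three latency figures above can be related with simple arithmetic. The following is a minimal sketch, not a description of how the benchmark defines its metrics: it assumes tokens are emitted at the listed output speed from the first token onward, so the gap between TTFT and First Answer Token approximates the time spent on reasoning tokens. The function names and the 500-token answer length are hypothetical and chosen only for illustration.

```python
# Figures taken from the listing above.
TTFT_S = 0.721            # time to first token, seconds
FIRST_ANSWER_S = 47.778   # time to first answer token, seconds
OUTPUT_SPEED = 42.502     # output speed, tokens per second


def estimated_tokens_before_answer(ttft_s: float, first_answer_s: float,
                                   tokens_per_s: float) -> float:
    """Estimate tokens generated before the first answer token.

    Assumption: generation runs at a constant tokens_per_s once the
    first token has been produced.
    """
    return (first_answer_s - ttft_s) * tokens_per_s


def estimated_total_time(first_answer_s: float, answer_tokens: int,
                         tokens_per_s: float) -> float:
    """Estimate end-to-end seconds for an answer of answer_tokens tokens.

    Assumption: the answer streams at the same constant speed after the
    first answer token appears.
    """
    return first_answer_s + answer_tokens / tokens_per_s


if __name__ == "__main__":
    reasoning = estimated_tokens_before_answer(TTFT_S, FIRST_ANSWER_S, OUTPUT_SPEED)
    total = estimated_total_time(FIRST_ANSWER_S, 500, OUTPUT_SPEED)
    print(f"Estimated tokens before the answer: {reasoning:.0f}")
    print(f"Estimated time for a 500-token answer: {total:.1f} s")
```

Under these assumptions the listed figures imply roughly 2,000 tokens generated before the answer begins, which is why First Answer Token is so much larger than TTFT for this reasoning variant.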

