Llama 3.1 Nemotron 70B Reward
About
NVIDIA reward model based on Llama 3.1 70B, used for RLHF and preference ranking.
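As a rough illustration of preference ranking, the sketch below sorts candidate responses by a scalar reward. It is not this model's API: `rank_by_reward` and `toy_score` are hypothetical names, and `toy_score` is a trivial word-overlap stand-in for a real reward model call, used only so the example runs on its own.

```python
from typing import Callable, List, Tuple


def rank_by_reward(
    prompt: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Return (response, score) pairs sorted from highest to lowest reward."""
    scored = [(response, score_fn(prompt, response)) for response in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a reward model: counts response words found in the prompt."""
    prompt_words = set(prompt.lower().split())
    return float(sum(word in prompt_words for word in response.lower().split()))


if __name__ == "__main__":
    ranked = rank_by_reward(
        "What is the capital of France?",
        ["I do not know.", "The capital of France is Paris."],
        toy_score,
    )
    for response, score in ranked:
        print(f"{score:.1f}  {response}")
```

In practice, `score_fn` would wrap a call to whichever provider serves the reward model. The ranking logic stays the same as long as a higher score means a more preferred response.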
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, JSON Mode, Code Execution
Providers (1)
| Provider | Input (per 1M) | Output (per 1M) | Type |
|---|---|---|---|
| NVIDIA NIM | — | — | Serverless |
Specifications
Family: Nvidia
Released: 2024-10-01
Parameters: 70B
Context: 4K
Architecture: Decoder Only
Specialization: reward_model
Training: pretraining