Nemotron 3 8B on Azure OpenAI

Nemotron-3 · NVIDIA AI

Provisioned

Pricing

Type            Price (per 1M)
Input tokens    $0.37
Output tokens   $1.10
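
As a rough illustration of how the per-1M-token prices above translate into a per-request cost, here is a minimal sketch. The helper name `estimate_cost` is hypothetical, the prices are assumed to be in USD, and actual billing may differ (for example under provisioned deployments), so treat the result as an estimate only.

```python
from decimal import Decimal

# Prices per 1M tokens, copied from the pricing table above (assumed USD).
INPUT_PRICE_PER_M = Decimal("0.37")
OUTPUT_PRICE_PER_M = Decimal("1.10")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not an official API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Each side is billed separately: tokens / 1M * price per 1M.
    input_cost = Decimal(input_tokens) / TOKENS_PER_UNIT * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_UNIT * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 3,000-token prompt with a 1,000-token completion.
    print(estimate_cost(3_000, 1_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many requests.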

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, JSON Mode, Code Execution

About Nemotron 3 8B

Nemotron-3 8B is a family of large language models from NVIDIA, aimed at enterprises building custom LLMs. The base model uses a GPT-3-style transformer architecture with 8 billion parameters and a 4,096-token context length. It serves as the foundation for several specialized variants: Nemotron-3-8B-Base-4k, intended for customization; the Nemotron-3-8B-Chat models, which produce steerable outputs and are refined with RLHF; and Nemotron-3-8B-QA, optimized for question answering. The models are compatible with the NVIDIA NeMo framework, support fine-tuning methods such as LoRA, and are designed for efficient deployment on NVIDIA GPUs. They were trained on 3.5 to 3.8 trillion tokens of multilingual data spanning a diverse range of languages and have been evaluated on a range of benchmarks, although they may exhibit biases and inaccuracies inherited from their training data.
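
The 4,096-token context length is the main practical constraint when building requests. The sketch below shows one way to budget tokens, assuming the prompt and the generated output share the same window (typical for decoder-only models, but not confirmed by this page). The helper names are hypothetical, and token counts must come from the model's own tokenizer.

```python
CONTEXT_LENGTH = 4096  # token context length stated in the description above


def max_output_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Tokens left for generation after the prompt (hypothetical helper)."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Assumption: prompt and completion share one context window.
    return max(context_length - prompt_tokens, 0)


def fits_in_context(
    prompt_tokens: int,
    max_new_tokens: int,
    context_length: int = CONTEXT_LENGTH,
) -> bool:
    """Check whether a prompt plus the requested completion fits in the window."""
    return prompt_tokens + max_new_tokens <= context_length


if __name__ == "__main__":
    print(max_output_budget(3000))
    print(fits_in_context(3000, 1200))
```

A check like this can run before a request is sent, so that an over-long prompt is trimmed or rejected on the client side.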

Model Specs

Released       2026-03-01
Parameters     8B
Context        4K
Architecture   Decoder Only