LLM Reference

Nemotron 3 8B

About

Nemotron-3 8B is a family of large language models from NVIDIA aimed at enterprises building custom LLMs. Built on a GPT-style decoder-only transformer architecture, the base model has 8 billion parameters and supports a 4,096-token context length. It serves as the foundation for several specialized variants: Nemotron-3-8B-Base-4k for customization, the Nemotron-3-8B-Chat models, which produce steerable outputs and are refined via RLHF, and Nemotron-3-8B-QA, which is optimized for question answering. The models integrate with the NVIDIA NeMo framework, support parameter-efficient fine-tuning methods such as LoRA, and are designed for efficient deployment on NVIDIA GPUs. They were trained on 3.5 to 3.8 trillion tokens of multilingual data spanning a diverse range of languages and have been evaluated on standard benchmarks, although they may still exhibit biases and inaccuracies inherited from their training data.
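The LoRA fine-tuning method mentioned above can be illustrated with a minimal numpy sketch. This is a toy illustration of the technique itself, not NeMo's actual API; all names and dimensions here are hypothetical. LoRA freezes the pretrained weight matrix and learns a low-rank update, so only a small fraction of parameters are trained.

```python
import numpy as np

# Minimal sketch of LoRA (Low-Rank Adaptation). Illustrative only --
# not the NeMo framework's actual fine-tuning interface.

rng = np.random.default_rng(0)

d_out, d_in, r = 16, 32, 4   # toy layer dimensions; r is the LoRA rank
alpha = 8                    # LoRA scaling hyperparameter

W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in))       # trainable low-rank factor
B = np.zeros((d_out, r))                 # zero-initialized, so the
                                         # adapter starts as a no-op

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A, applied without
    # ever materializing the full update matrix.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Before any training step, B is zero, so the adapted layer exactly
# matches the frozen base layer.
assert np.allclose(lora_forward(x), W @ x)
```

Only `A` and `B` (roughly `r * (d_in + d_out)` values) would be updated during fine-tuning, which is why LoRA is attractive for adapting an 8B-parameter model on modest hardware.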

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode

Providers (1)

Provider: Azure OpenAI
Input (per 1M tokens): —
Output (per 1M tokens): —
Type: Provisioned

Specifications

Parameters: 8B
Context: 4K
Architecture: Decoder-only
Specialization: General