LLM Reference

NVIDIA Llama 3 ChatQA 70B

Released: 2024-08-15
Last refreshed: 2026-05-19
Status: Deprecated (researched 16d ago)

NVIDIA Llama 3 ChatQA 70B is a legacy integration reference; keep it only while you identify a current replacement.

Use it for

  • Teams maintaining an existing integration
  • Workloads that fit within an 8k context window
  • Buyers comparing 2 tracked provider routes

Do not use it for

  • New production launches
  • Vision or document-understanding workloads
  • Strict JSON or tool-calling flows

Specifications

Released: 2024-08-15
Context: 8k
Parameters: 70B
Architecture: Decoder Only
Specialization: general
Training: finetuned

Created by

Accelerated AI for enterprise solutions

Santa Clara, California, United States
Founded 2015

Pricing

Input / 1M: $3.78
Output / 1M: $11.34

Cheapest of 2 routes · Microsoft Foundry

About

NVIDIA Llama 3 ChatQA 70B is a 70B-parameter model from NVIDIA AI. It is deprecated (originally released 2024-08-15); use it only to reproduce earlier results or to evaluate drift over time.

Within the NVIDIA Llama 3 ChatQA family, the structured metadata tracks an 8k-token context window for this model. This page tracks two provider routes, NVIDIA NIM and Microsoft Foundry; the cheapest tracked route (Microsoft Foundry) is listed at $3.78 input and $11.34 output per 1M tokens. No headline benchmark score is tracked for NVIDIA Llama 3 ChatQA 70B yet.
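
To make the per-1M-token prices concrete, the following is a minimal sketch of a per-request cost estimate. The two rates come from this page; the function name `estimate_cost` and the example token counts are hypothetical and chosen only for illustration. Actual billing may differ by provider and contract.

```python
from decimal import Decimal

# Rates listed on this page for the cheapest tracked route, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("3.78")
OUTPUT_PRICE_PER_M = Decimal("11.34")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed rates.

    Hypothetical helper: it ignores batch discounts, cached reads,
    and any provisioned-throughput pricing a provider may apply.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# Illustrative request: 6,000 prompt tokens and 1,000 completion tokens.
cost = estimate_cost(6_000, 1_000)
print(f"${cost:.5f}")  # prints $0.03402
```

At these rates, output tokens cost three times as much as input tokens, so long completions dominate the bill even when prompts are near the context limit.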

Top use-case fit

No primary decision-task fit is mapped for this model yet.

Provider price ladder

Compare API pricing across 2 providers for input and output tokens, batch, and cached reads when available.

Provider            Input / 1M   Output / 1M   Route
Microsoft Foundry   $3.78        $11.34        Provisioned
NVIDIA NIM          -            -             Provisioned, Partial
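
As a rough illustration of how a "cheapest route" can be picked when some routes have no listed price, here is a small sketch. The `Route` type, the `cheapest_route` function, and the `input_share` weighting are assumptions made for this example, not part of any provider API; only the provider names and prices are taken from the table above.

```python
from typing import NamedTuple, Optional, Sequence


class Route(NamedTuple):
    provider: str
    input_per_m: Optional[float]   # USD per 1M input tokens, None if unlisted
    output_per_m: Optional[float]  # USD per 1M output tokens, None if unlisted


# Values copied from the price ladder; NVIDIA NIM shows "-" for both prices.
ROUTES = [
    Route("Microsoft Foundry", 3.78, 11.34),
    Route("NVIDIA NIM", None, None),
]


def cheapest_route(
    routes: Sequence[Route], input_share: float = 0.5
) -> Optional[Route]:
    """Return the priced route with the lowest blended rate, or None.

    input_share is the assumed fraction of tokens that are input tokens;
    routes without both prices are skipped rather than treated as free.
    """
    priced = [
        r for r in routes
        if r.input_per_m is not None and r.output_per_m is not None
    ]
    if not priced:
        return None
    return min(
        priced,
        key=lambda r: input_share * r.input_per_m
        + (1 - input_share) * r.output_per_m,
    )


best = cheapest_route(ROUTES)
print(best.provider if best else "no priced route")  # prints Microsoft Foundry
```

Skipping unpriced routes matters here: treating a missing price as zero would wrongly rank NVIDIA NIM as the cheapest option.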

Capabilities

No model capability flags are currently sourced.

Benchmark peer bars for Coding

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.