Last refreshed 2026-04-15. Next refresh: weekly.
Why use Llama 3.2 3B on Fireworks AI?
Fireworks AI serves Llama 3.2 3B with pay-as-you-go pricing of $0.10 per 1M input tokens and $0.10 per 1M output tokens. Fireworks AI is a generative AI platform as a service focused on rapid product iteration and cost-efficient AI deployment.
Input / 1M: $0.10
Output / 1M: $0.10
Cache: Not sourced
Batch: Not sourced
Setup recipe
Python + curl
Install
pip install openai
Auth
export FIREWORKS_API_KEY=...
Call
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1"
)
Model ID
llama-3.2-3b
Request example
import os
from openai import OpenAI

# Point the OpenAI client at the Fireworks AI OpenAI-compatible endpoint.
client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1"
)

# Send a single-turn chat request and print the reply text.
response = client.chat.completions.create(
    model="llama-3.2-3b",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)
Gotchas
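- Fireworks model IDs use the "accounts/fireworks/models/{model-name}" format, e.g. "accounts/fireworks/models/llama4-scout-instruct-basic" or "accounts/fireworks/models/deepseek-r1". The short ID "llama-3.2-3b" in the request example does not follow this format; if the API rejects it, look up the full ID for this model in the Fireworks model catalog.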
- The examples expect FIREWORKS_API_KEY; rename it only if your application config maps the new variable.
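The two gotchas above can be guarded against with a small pre-flight check. This is a minimal sketch, not part of any Fireworks SDK: `qualify_model_id` and `read_api_key` are hypothetical helper names, and the helper only prepends the prefix described above. It cannot tell you the catalog name of a model, which may differ from the short ID shown on this page.

```python
import os

# Prefix taken from the model ID format described in the gotchas above.
FIREWORKS_MODEL_PREFIX = "accounts/fireworks/models/"


def qualify_model_id(name: str) -> str:
    """Return the ID in 'accounts/fireworks/models/{model-name}' form.

    IDs that already start with 'accounts/' are returned unchanged.
    """
    if name.startswith("accounts/"):
        return name
    return FIREWORKS_MODEL_PREFIX + name


def read_api_key(env=None, var_name: str = "FIREWORKS_API_KEY") -> str:
    """Read the API key from the environment, failing early with a clear error.

    Pass a different var_name only if your application config maps it.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "")
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before creating the client")
    return key
```

Calling `read_api_key()` before constructing the client turns a missing variable into an explicit error message instead of a bare `KeyError`.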
Pricing
| Type | Price (per 1M) |
|---|---|
| Input tokens | $0.10 |
| Output tokens | $0.10 |
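To turn the per-million rates in the table into a request or monthly estimate, the arithmetic can be sketched as follows. The function name `estimate_cost_usd` is illustrative, and the rates are hard-coded from the table above, so they will go stale if the listing changes.

```python
# Rates copied from the pricing table above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.10


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate pay-as-you-go cost in USD for the given token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens.
    print(f"${estimate_cost_usd(2_000_000, 500_000):.2f}")
```

Because input and output are priced identically here, the estimate depends only on the total token count; the two rates are kept separate so the sketch still works if they diverge.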
Capabilities
No model capability flags are currently sourced.
FAQ
What does Llama 3.2 3B cost on Fireworks AI?
On Fireworks AI, Llama 3.2 3B costs $0.10 per 1M input tokens and $0.10 per 1M output tokens.
What is the context window for Llama 3.2 3B on Fireworks AI?
Llama 3.2 3B supports a 128,000 token context window on Fireworks AI.
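A rough pre-flight check against the 128,000 token window might look like the sketch below. It rests on two assumptions that are not from this page: that a token averages about 4 characters (a common heuristic for English text, not the model's real tokenizer), and that the requested output tokens count toward the same window as the prompt. Use an actual tokenizer when the margin is tight.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # from the listing above
CHARS_PER_TOKEN = 4              # assumption: rough heuristic, not the real tokenizer


def rough_token_count(text: str) -> int:
    """Approximate the token count as ceil(len(text) / CHARS_PER_TOKEN)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_context(prompt: str, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the estimated prompt plus requested output fits the window."""
    return rough_token_count(prompt) + max_output_tokens <= window
```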
Who created Llama 3.2 3B?
Llama 3.2 3B was created by AI at Meta as part of the Llama 3.2 model family.
Is Llama 3.2 3B open source?
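Llama 3.2 3B is an open-weight model: Meta publishes the weights under the Llama 3.2 Community License, which allows broad use but is not an OSI-approved open source license.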
Model Specs
Released: 2024-09-25
Parameters: 3.21B
Context: 128K
Architecture: Decoder-only
Knowledge cutoff: 2023-12