Last refreshed 2026-05-07. Next refresh: weekly.
Why use Cogito v1 Preview Llama 8B on Fireworks AI?
Fireworks AI offers Cogito v1 Preview Llama 8B with pay-as-you-go pricing at $0.20 per 1M input tokens and $0.20 per 1M output tokens. The platform is a generative AI service focused on rapid product iteration and cost-efficient AI deployment.
Setup recipe
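Install the OpenAI Python SDK and export your API key:
pip install openai
export FIREWORKS_API_KEY=...
Then create a client that points at the Fireworks endpoint:
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1",
)
Model ID: cogito-v1-preview-llama-8b (full Fireworks ID: accounts/fireworks/models/cogito-v1-preview-llama-8b)
Request example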
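import os
from openai import OpenAI
# Fireworks exposes an OpenAI-compatible endpoint, so the OpenAI SDK can be reused.
client = OpenAI(
    api_key=os.environ["FIREWORKS_API_KEY"],
    base_url="https://api.fireworks.ai/inference/v1",
)
response = client.chat.completions.create(
    # Fully qualified ID, following the "accounts/fireworks/models/{model-name}" format.
    model="accounts/fireworks/models/cogito-v1-preview-llama-8b",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
Gotchas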
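- Fireworks model IDs use the "accounts/fireworks/models/{model-name}" format, e.g. "accounts/fireworks/models/llama4-scout-instruct-basic" or "accounts/fireworks/models/deepseek-r1". Following that format, this model's ID is "accounts/fireworks/models/cogito-v1-preview-llama-8b".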
- The examples read the API key from FIREWORKS_API_KEY. If your application stores the key under a different variable name, update the os.environ lookup to match.
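Both gotchas can be caught before any request is sent. The following is a minimal sketch, not part of any SDK: `full_model_id` and `require_api_key` are hypothetical helper names, and the prefix is assumed to apply only to models hosted under the shared `fireworks` account.

```python
import os

# Prefix taken from the model ID format described above; models deployed
# under a different account would use a different prefix (assumption).
FIREWORKS_MODEL_PREFIX = "accounts/fireworks/models/"


def full_model_id(name: str) -> str:
    """Return a fully qualified model ID, leaving qualified IDs untouched."""
    if name.startswith("accounts/"):
        return name
    return FIREWORKS_MODEL_PREFIX + name


def require_api_key(var_name: str = "FIREWORKS_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set {var_name} before creating the client")
    return key


print(full_model_id("cogito-v1-preview-llama-8b"))
```

Passing the result of `full_model_id(...)` as `model=` and `require_api_key()` as `api_key=` keeps both conventions in one place.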
Pricing
| Type | Price (USD per 1M tokens) |
|---|---|
| Input tokens | $0.20 |
| Output tokens | $0.20 |
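To turn the table into a per-request estimate, the arithmetic can be sketched as below. The function name `estimate_cost_usd` is illustrative, and the prices are hard-coded from the table, so they would need updating if the listing changes.

```python
# Prices copied from the pricing table above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 0.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# 50,000 input tokens and 10,000 output tokens:
# (50,000 * 0.20 + 10,000 * 0.20) / 1,000,000 = 0.012 USD
print(f"${estimate_cost_usd(50_000, 10_000):.4f}")
```

With the OpenAI SDK, the token counts for a finished request are typically available as `response.usage.prompt_tokens` and `response.usage.completion_tokens`.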
Capabilities
About Cogito v1 Preview Llama 8B
Cogito v1 Preview Llama 8B is a hybrid reasoning model fine-tuned from Llama 3.1 8B using Iterated Distillation and Amplification (IDA). Supports direct and extended-thinking modes, tool calling, and 30+ languages.
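A sketch of how the two modes might be selected from client code. It assumes the system-prompt switch described in Deep Cogito's model card (the string "Enable deep thinking subroutine."); whether the Fireworks deployment honors it is not confirmed here, and `build_messages` is a hypothetical helper.

```python
# Assumption: this instruction string comes from Deep Cogito's model card
# and switches the model into extended-thinking mode.
DEEP_THINKING_PROMPT = "Enable deep thinking subroutine."


def build_messages(user_text: str, extended_thinking: bool = False,
                   system_text: str = "") -> list[dict]:
    """Build a chat message list for direct or extended-thinking mode."""
    system_parts = []
    if extended_thinking:
        system_parts.append(DEEP_THINKING_PROMPT)
    if system_text:
        system_parts.append(system_text)

    messages = []
    if system_parts:
        # The thinking instruction goes first, followed by any custom system text.
        messages.append({"role": "system", "content": "\n\n".join(system_parts)})
    messages.append({"role": "user", "content": user_text})
    return messages


print(build_messages("Is 91 prime?", extended_thinking=True))
```

The returned list can be passed as `messages=` to `client.chat.completions.create(...)`; with `extended_thinking=False` and no system text it reduces to the single user message used in the request example.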
FAQ
What does Cogito v1 Preview Llama 8B cost on Fireworks AI?
On Fireworks AI, Cogito v1 Preview Llama 8B costs $0.20 per 1M input tokens and $0.20 per 1M output tokens.
What is the context window for Cogito v1 Preview Llama 8B on Fireworks AI?
Cogito v1 Preview Llama 8B supports a 128,000 token context window on Fireworks AI.
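A rough pre-flight check against that limit might look like the sketch below. It assumes about 4 characters per token, which is only a heuristic (the model's tokenizer gives the real count), and that the requested output tokens share the same window as the prompt. `fits_context` is a hypothetical helper.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # from the answer above
CHARS_PER_TOKEN = 4              # rough heuristic, not the real tokenizer


def fits_context(prompt: str, max_output_tokens: int) -> bool:
    """Estimate whether prompt plus requested output fits the context window."""
    estimated_prompt_tokens = len(prompt) // CHARS_PER_TOKEN + 1
    return estimated_prompt_tokens + max_output_tokens <= CONTEXT_WINDOW_TOKENS


print(fits_context("Summarize this paragraph.", max_output_tokens=1_000))
```

Because the estimate is coarse, it is safer to leave headroom than to rely on it near the limit.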
Who created Cogito v1 Preview Llama 8B?
Cogito v1 Preview Llama 8B was created by Deep Cogito as part of the Cogito model family.
Is Cogito v1 Preview Llama 8B open source?
Cogito v1 Preview Llama 8B is listed as an open-source model.