Last refreshed 2026-05-19. Next refresh: weekly.
Why use OctoML Llama-2-70b-chat on OctoML (Deprecated)?
OctoML (Deprecated) offers OctoML Llama-2-70b-chat with pay-as-you-go pricing at $0.40/1M input tokens. OctoML is an optimized inference platform for foundation models, offering serverless and dedicated deployment with performance tuning for production AI workloads.
Setup recipe
Docs fallbackUse the provider REST API or SDKCreate a provider API keymodel: octoml-llama-2-70b-chatoctoml-llama-2-70b-chatRequest example
octoml-llama-2-70b-chat.Gotchas
No curated gotchas have been sourced for this exact provider/model route yet.
Pricing
| Type | Price (per 1M) |
|---|---|
| Input tokens | $0.40 |
| Output tokens | $0.60 |
Capabilities
No model capability flags are currently sourced.
About OctoML Llama-2-70b-chat
OctoML Llama-2-70b-chat is Meta's Llama 2 model. Weights are openly available for self-hosting.
FAQ
What does OctoML Llama-2-70b-chat cost on OctoML (Deprecated)?
On OctoML (Deprecated), OctoML Llama-2-70b-chat costs $0.4 per 1M input tokens and $0.6 per 1M output tokens.
What is the context window for OctoML Llama-2-70b-chat on OctoML (Deprecated)?
OctoML Llama-2-70b-chat supports a 4,096 token context window on OctoML (Deprecated).
Who created OctoML Llama-2-70b-chat?
OctoML Llama-2-70b-chat was created by AI at Meta as part of the Llama 2 model family.
Is OctoML Llama-2-70b-chat open source?
OctoML Llama-2-70b-chat is open source according to the seed data.