Using OctoML Llama-2-70b-chat on OctoML (Deprecated)
Implementation guide · Llama 2 · AI at Meta
Serverless · Open Source
Quick Start
- 1
- 2 Use the OctoML (Deprecated) SDK or REST API to call octoml-llama-2-70b-chat; see the documentation for request format.
- 3
Code Examples
About OctoML (Deprecated)
Optimized inference platform for foundation models
OctoML is an optimized inference platform for foundation models, offering serverless and dedicated deployment with performance tuning for production AI workloads.
Pricing on OctoML (Deprecated)
| Type | Price (USD per 1M tokens) |
|---|---|
| Input tokens | $0.40 |
| Output tokens | $0.60 |
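The per-token rates above can be turned into a per-request estimate. The following is a minimal sketch, not part of any OctoML SDK: the helper name `estimate_cost` is hypothetical, and it assumes billing is strictly linear in token count at the listed rates, with no minimums or rounding.

```python
from decimal import Decimal

# Prices copied from the pricing table, in USD per 1,000,000 tokens.
INPUT_PRICE_PER_M = Decimal("0.40")
OUTPUT_PRICE_PER_M = Decimal("0.60")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cost scales linearly with token counts at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M
```

For example, a request with 3,000 input tokens and 500 output tokens comes to (3000 × 0.40 + 500 × 0.60) / 1,000,000 = $0.0015 under these assumptions. `Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.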
Capabilities
No model capability flags are currently sourced.
About OctoML Llama-2-70b-chat
OctoML Llama-2-70b-chat is the OctoML-hosted version of Meta's Llama 2 70B chat model. The model weights are openly available for self-hosting.
Model Specs
Released: 2023-07-18
Parameters: 70B
Context: 4K
Architecture: Decoder Only
Knowledge cutoff: 2022-09
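Because the prompt and the generated reply share the same context window, it can help to check how much room a prompt leaves for output. This is a rough sketch with a hypothetical helper, `max_output_tokens`; it assumes "4K" means the 4,096-token window of Llama 2 and that token counts come from a tokenizer you run yourself.

```python
# Assumption: the "4K" context listed above is Llama 2's 4,096-token window.
CONTEXT_WINDOW = 4096


def max_output_tokens(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens remain for the reply (hypothetical helper).

    Returns 0 when the prompt alone already fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)
```

A prompt of 3,000 tokens would leave 1,096 tokens for the reply; a result of 0 signals that the prompt needs to be shortened before sending.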