LLM Reference

Using OctoML Llama-2-70b-chat on OctoML (Deprecated)

Implementation guide · Llama 2 · AI at Meta

Serverless · Open Source

Quick Start

  1. Create an account at OctoML (Deprecated) and generate an API key.
  2. Use the OctoML (Deprecated) SDK or REST API to call octoml-llama-2-70b-chat; see the documentation for request format.
  3. You'll be billed $0.40 per 1M input tokens and $0.60 per 1M output tokens. See full pricing.

Code Examples

See OctoML (Deprecated) documentation for integration details.

About OctoML (Deprecated)

Optimized inference platform for foundation models

OctoML is an optimized inference platform for foundation models, offering serverless and dedicated deployment with performance tuning for production AI workloads.

Pricing on OctoML (Deprecated)

Type             Price (per 1M tokens)
Input tokens     $0.40
Output tokens    $0.60
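
To make the per-token rates concrete, here is a minimal sketch of a cost estimate based on the prices listed above. The function name estimate_cost and the token counts in the usage lines are hypothetical; the sketch assumes billing is strictly linear in token count, with no minimum charge or rounding rule, which this page does not confirm.

```python
from decimal import Decimal

# Rates taken from the pricing table above (USD per 1M tokens).
INPUT_PRICE_PER_MILLION = Decimal("0.40")
OUTPUT_PRICE_PER_MILLION = Decimal("0.60")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a request, assuming purely linear billing.

    Hypothetical helper: the provider's actual rounding and minimum-charge
    rules are not stated in the source.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


# Example: a 3,000-token prompt with a 500-token reply.
cost = estimate_cost(3_000, 500)
print(f"Estimated cost: ${cost}")
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift when summed over many calls.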

Capabilities

No model capability flags are currently sourced.

About OctoML Llama-2-70b-chat

OctoML Llama-2-70b-chat is Meta's Llama 2 70B chat model as served by OctoML. The weights are openly available for self-hosting.

Model Specs

Released            2023-07-18
Parameters          70B
Context             4K
Architecture        Decoder only
Knowledge cutoff    2022-09
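
The context figure bounds the prompt and the generated reply together, so a request has to budget both. The sketch below shows that arithmetic under the assumption that "4K" means 4,096 tokens (the Llama 2 context length). The helper name max_completion_tokens is hypothetical, and the prompt token count is assumed to come from whatever tokenizer you already use.

```python
# Assumption: "4K" in the spec table means 4,096 tokens.
CONTEXT_WINDOW_TOKENS = 4096


def max_completion_tokens(
    prompt_tokens: int, context_window: int = CONTEXT_WINDOW_TOKENS
) -> int:
    """Return how many tokens remain for the reply after the prompt.

    Hypothetical helper: returns 0 when the prompt alone fills or
    exceeds the context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


# Example: a 3,000-token prompt leaves room for a reply of this many tokens.
remaining = max_completion_tokens(3_000)
print(f"Tokens left for the reply: {remaining}")
```

A result of 0 signals that the prompt should be shortened or truncated before sending the request.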

Provider

OctoML (Deprecated)

OctoML

Seattle, Washington, United States