llmreference
OctoAI API (Deprecated)

Using Llama 3.1 8B Instruct on OctoAI API (Deprecated)

Implementation guide · Llama 3.1 · AI at Meta

ServerlessOpen Source

Quick Start

  1. 1
    Create an account at OctoAI API (Deprecated) and generate an API key.
  2. 2
    Use the OctoAI API (Deprecated) SDK or REST API to call llama3.1-8b-instruct — see the documentation for request format.
  3. 3
    You'll be billed $0.15/1M input, $0.15/1M output tokens. See full pricing.

Code Examples

See OctoAI API (Deprecated) documentation for integration details.

About OctoAI API (Deprecated)

OctoAI's generative AI platform offers a versatile and scalable solution for running, tuning, and scaling various AI models. The platform's core feature, OctoStack, provides a turnkey production stack that enables model deployment in cloud or on-premises environments, ensuring data control and privacy. Users can access a library of pre-built templates for popular open-source models, facilitating quick development and integration into existing workflows. The platform also incorporates advanced performance optimizations, significantly improving GPU utilization and reducing operational costs, making it suitable for high-demand applications. The platform emphasizes user experience through easy-to-use APIs and customizable features. It employs automated hardware selection to optimize price-performance trade-offs, enabling efficient scaling of applications. With capabilities such as intelligent request routing, efficient auto-scaling, and reduced cold start times, the platform can handle millions of daily image generations seamlessly. Additionally, it offers fine-tuning options and dynamic customizations, allowing users to create unique, high-quality outputs tailored to their specific needs, thereby enhancing overall application performance and user satisfaction.

OctoAI is a powerful AI infrastructure platform designed to help developers run, tune, and scale generative AI applications efficiently. The platform offers access to some of the fastest foundation models available, including Llama-2, Stable Diffusion, and SDXL, along with integrated customization solutions. OctoAI's infrastructure allows developers to focus on building impressive AI applications without becoming AI infrastructure experts. Key features of OctoAI's platform include: 1. Easy access to optimized models and fine-tuning capabilities 2. Seamless scaling from development to production 3. World-class machine learning systems 4. SaaS offering or deployment in the user's environment 5. Infrastructure optimized for running the latest AI models OctoAI aims to make models work for developers, streamlining the process of integrating AI into applications and ensuring efficient performance. The platform is particularly suited for developers looking to leverage generative AI technologies in their projects without the complexity of managing the underlying infrastructure. Founded by the creators of Apache TVM, an open-source ML stack for model performance and portability, OctoAI brings extensive expertise in optimizing machine learning models and systems to its platform offerings.

Pricing on OctoAI API (Deprecated)

TypePrice (per 1M)
Input tokens$0.15
Output tokens$0.15

Capabilities

Structured Outputs

About Llama 3.1 8B Instruct

The Llama 3.1 8B Instruct model, released on July 23, 2024, is a multilingual large language model with 8 billion parameters, optimized for instruction-following tasks. It features an enhanced transformer architecture, supporting languages like English, German, French, and others. The model excels in dialogue applications, having been fine-tuned using supervised fine-tuning and reinforcement learning with human feedback. Trained on approximately 15 trillion tokens with a December 2023 data cutoff, it outperforms many existing open-source and closed chat models in various benchmarks. Ideal for commercial and research applications such as conversational agents and content generation, the model can be accessed on Hugging Face .

Model Specs

Released2024-07-23
Parameters8B
Context128K
ArchitectureDecoder Only
Knowledge cutoff2023-12

Provider

OctoAI API (Deprecated)
OctoAI API (Deprecated)

OctoAI

Seattle, Washington, United States