Models on OctoAI API

13 models available · OctoAI

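| Model | Input (per 1M) | Output (per 1M) | Context |
|---|---|---|---|
| Hermes 2 Pro Llama 3 8B | $0.15 | $0.15 | not listed |
| Llama 3 8B Instruct | $0.15 | $0.15 | 8K |
| Llama 3.1 8B Instruct | $0.15 | $0.15 | 128K |
| Llama Guard 2 8B | $0.15 | $0.15 | 8K |
| Mistral 7B v0.1 | $0.15 | $0.15 | 8K |
| Nous Hermes 2 Mixtral 8x7B | $0.15 | $0.15 | not listed |
| Qwen2 7B | $0.15 | $0.15 | 128K |
| Mixtral 8x7B | $0.45 | $0.45 | 32K |
| Llama 3 70B Instruct | $0.90 | $0.90 | 8K |
| Llama 3.1 70B Instruct | $0.90 | $0.90 | 128K |
| Mixtral 8x22B v0.1 | $1.20 | $1.20 | 64K |
| WizardLM-2 8x22B | $1.20 | $1.20 | not listed |
| Llama 3.1 405B Instruct | $3.00 | $9.00 | 128K |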

Pricing Overview

Cheapest: $0.15/1M (input)
Most expensive: $3.00/1M (input; Llama 3.1 405B Instruct output is $9.00/1M)
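
The per-1M prices listed for each model can be turned into a rough per-request cost estimate. The sketch below is illustrative only: it assumes "per 1M" means per one million tokens, and the `PRICES_PER_1M` dictionary and `estimate_cost` function are hypothetical names introduced here, not part of any OctoAI SDK. Only a few models from the listing are included.

```python
# Prices in USD per 1M tokens as (input, output), copied from the model listing.
# Assumption: "per 1M" refers to one million tokens.
PRICES_PER_1M = {
    "Llama 3.1 8B Instruct": (0.15, 0.15),
    "Mixtral 8x7B": (0.45, 0.45),
    "Llama 3.1 70B Instruct": (0.90, 0.90),
    "Llama 3.1 405B Instruct": (3.00, 9.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_1M[model]
    # Each price applies per 1,000,000 tokens, so scale the token counts down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Compare the same workload (2,000 input tokens, 500 output tokens) across models.
    for name in PRICES_PER_1M:
        cost = estimate_cost(name, input_tokens=2_000, output_tokens=500)
        print(f"{name}: ${cost:.6f}")
```

Because Llama 3.1 405B Instruct charges more for output than for input, its cost depends on the input/output mix of a request, while the flat-priced models depend only on the total token count.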

About OctoAI API

OctoAI's generative AI platform is built for running, tuning, and scaling AI models. Its core feature, OctoStack, is a turnkey production stack that deploys models in cloud or on-premises environments, so users keep control of their data. A library of pre-built templates for popular open-source models speeds up development and integration into existing workflows. Performance optimizations improve GPU utilization and reduce operating costs, which suits the platform to high-demand applications.

The platform exposes its features through APIs and customizable options. Automated hardware selection balances price against performance, and intelligent request routing, auto-scaling, and reduced cold start times let it handle millions of image generations per day. Fine-tuning and dynamic customization let users tailor outputs to their specific needs.
