Using Gemma 2 9B Instruct on Chutes AI

Implementation guide · Gemma 2 · Google DeepMind

ServerlessOpen Weights

Quick Start

1
Create an account at Chutes AI and generate an API key.
2
Use the Chutes AI SDK or REST API to call gemma-2-9b-it.
3
You'll be billed $0.10/1M input, $0.30/1M output tokens.

Code Examples

Code examples for this provider have not been sourced yet.

About Chutes AI

Chutes AI is a decentralized GPU compute network providing inference API access to open-source AI models.

View all models on Chutes AI →

Pricing on Chutes AI

Type	Price (per 1M)
Input tokens	$0.10
Output tokens	$0.30

Capabilities

Structured Outputs

About Gemma 2 9B Instruct

Gemma 2 9B Instruct, developed by Google, is a state-of-the-art large language model based on the advanced Gemini framework. It is a decoder-only transformer model with 9 billion parameters, offering a balance between size and performance. The model is trained on an expansive dataset comprising 8 trillion tokens, including web documents, code, and mathematical text, a notable 30% increase from its predecessor, Gemma 1.1. This allows it to adeptly handle diverse tasks such as question answering, creative writing, coding, and mathematical problem-solving. However, it shares common limitations of large language models, such as potential biases and the risk of generating inaccuracies or outdated information. Notably, Gemma 2 9B Instruct incorporates Grouped-Query Attention (GQA) and uses the GeGLU activation function, and is specifically fine-tuned to follow instructions and participate effectively in multi-turn dialogues.

Full model details →

Model Specs

Released2024-06-27

Parameters9B

Context8k

ArchitectureDecoder Only

Also available on(4)

Replicate API$0.10/1M Fireworks AI$0.20/1M NVIDIA NIM

Compare all providers →

Provider

Chutes AI