Using CodeLlama 70B Python on Together AI

Implementation guide · Code Llama · AI at Meta

Serverless · Open Weights

Quick Start

1. Create an account at Together AI and generate an API key.
2. Use the Together AI SDK or REST API to call codellama-70b-python; see the documentation for the request format.
3. You are billed $0.90 per 1M input tokens and $0.90 per 1M output tokens. See full pricing.
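Step 2 mentions the REST API. As a rough sketch, assuming Together exposes an OpenAI-compatible chat completions endpoint at https://api.together.xyz/v1/chat/completions with Bearer-token authentication (the documentation remains the authority on the request format), a request could be built with the standard library alone. The names build_request and send are illustrative, not part of any SDK.

```python
import json
import os
import urllib.request

# Assumed endpoint: Together's OpenAI-compatible chat completions route.
API_URL = "https://api.together.xyz/v1/chat/completions"


def build_request(prompt, model="codellama-70b-python", api_key=None):
    """Build (but do not send) a chat completion request."""
    api_key = api_key or os.environ.get("TOGETHER_API_KEY", "")
    if not api_key:
        raise ValueError("Set TOGETHER_API_KEY or pass api_key explicitly")
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling send(build_request("Write a function that reverses a string.")) would perform the actual network call; keeping the two steps separate makes the request easy to inspect before any tokens are billed.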

Code Examples

Install

pip install together

API key

TOGETHER_API_KEY

Model ID

codellama-70b-python

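Together uses the "organization/model-name" format for model IDs, e.g. "meta-llama/Llama-4-Scout-17B-16E-Instruct" or "Qwen/QwQ-32B". The short ID shown above does not follow that format, so check the Together model catalog for the exact ID before making requests.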
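The "organization/model-name" convention can be checked locally before a request is sent. The following is a minimal sketch; the helper name looks_like_together_id and the set of allowed characters are assumptions made for illustration, and passing the check does not prove that a model exists in the catalog.

```python
import re

# Assumption: organization and model name are non-empty runs of letters,
# digits, dots, underscores, or hyphens, separated by exactly one slash.
_MODEL_ID = re.compile(r"[A-Za-z0-9._-]+/[A-Za-z0-9._-]+")


def looks_like_together_id(model_id: str) -> bool:
    """Return True if model_id has the organization/model-name shape."""
    return _MODEL_ID.fullmatch(model_id) is not None
```

With this check, the two example IDs quoted above pass, while a bare name such as "codellama-70b-python" does not, which is a hint to look up the full ID in the catalog.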

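from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

# Send a single-turn chat request. If the API rejects the short ID, replace it
# with the exact "organization/model-name" ID from the Together model catalog.
response = client.chat.completions.create(
    model="codellama-70b-python",
    messages=[{"role": "user", "content": "Hello"}],
)

# The generated text is in the first choice's message.
print(response.choices[0].message.content)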
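For code generation it can help to wrap the call in a small helper that caps the output length and lowers the sampling temperature. This is a minimal sketch, assuming the SDK accepts the OpenAI-style max_tokens and temperature arguments; the name generate_python and its default values are illustrative choices, not recommendations from Together.

```python
def generate_python(client, task, model, max_tokens=256, temperature=0.2):
    """Ask the model for Python code and return the reply text.

    `client` is expected to be a `together.Together` instance, but any
    object with the same `chat.completions.create` method will work.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": task}],
        max_tokens=max_tokens,      # cap the length of the completion
        temperature=temperature,    # lower values give more repeatable code
    )
    return response.choices[0].message.content
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object, so it can be tested without network access or billing.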

About Together AI

Together AI is a platform for running open-source and proprietary LLMs with fast serverless and dedicated endpoints at competitive inference pricing.

Pricing on Together AI

Type	Price (USD per 1M tokens)
Input tokens	$0.90
Output tokens	$0.90
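To see what these rates mean for a single request, the arithmetic can be sketched as follows. The prices are taken from the table above; the function name estimate_cost is illustrative, and actual invoices may differ from this estimate.

```python
from decimal import Decimal

# Prices from the pricing table, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.90")
OUTPUT_PRICE_PER_M = Decimal("0.90")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    million = Decimal(1_000_000)
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / million
```

For example, a request with 2,000 input tokens and 500 output tokens comes to (2,000 × 0.90 + 500 × 0.90) / 1,000,000 = $0.00225. Decimal is used instead of float so that small per-request amounts add up without rounding drift.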

Capabilities

Structured Outputs
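As a hedged sketch of how structured outputs might be used: the code below assumes that JSON mode is requested through a response_format argument carrying a JSON schema, which should be confirmed against Together's structured output documentation. The schema, structured_request_kwargs, and parse_structured are all hypothetical names introduced for illustration.

```python
import json

# Hypothetical schema, for illustration only.
FUNCTION_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "code": {"type": "string"},
    },
    "required": ["name", "code"],
}


def structured_request_kwargs(prompt, model="codellama-70b-python"):
    """Build keyword arguments for a chat completion call in JSON mode."""
    # Assumption: the API accepts a response_format of this shape.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object", "schema": FUNCTION_SCHEMA},
    }


def parse_structured(content):
    """Parse the model's reply and check that the required keys are present."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in FUNCTION_SCHEMA["required"] if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data
```

The keyword arguments would be passed as client.chat.completions.create(**structured_request_kwargs(...)), and the reply text handed to parse_structured. Validating the reply locally is a cheap safeguard in case the output does not match the schema.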

About CodeLlama 70B Python

CodeLlama 70B Python is a 70-billion-parameter model from Meta, specialized for Python code synthesis and understanding. Its main strength is code completion; per the Code Llama paper, the Python variants are trained without infilling, and instruction following is the focus of the separate Instruct variants. The model uses an optimized transformer architecture and was fine-tuned on sequences of up to 16,000 tokens, which suits Python-centric development workflows. Unlike the smaller Code Llama models, it does not support long contexts of up to 100,000 tokens. It is available for both commercial and research use. More details can be found in the research paper "Code Llama: Open Foundation Models for Code".
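Because the context is limited to roughly 16,000 tokens, the prompt and the requested completion have to fit in that window together. The sketch below budgets for this under a loudly stated assumption: it estimates tokens at about 4 characters each, which is only a rough heuristic, and a real tokenizer would be needed for exact counts. The function names are illustrative.

```python
CONTEXT_TOKENS = 16_000   # context size quoted in the model description
CHARS_PER_TOKEN = 4       # rough assumption; use a real tokenizer for accuracy


def approx_tokens(text: str) -> int:
    """Estimate the token count of text, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def max_output_tokens(prompt: str, reserve: int = 0) -> int:
    """Tokens left for the completion after the prompt, never below zero."""
    return max(0, CONTEXT_TOKENS - approx_tokens(prompt) - reserve)
```

A result of zero signals that the prompt alone likely fills the window and should be shortened, for instance by sending only the relevant functions rather than a whole file.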

Model Specs

Released: 2024-01-29
Parameters: 70B
Context: 16k
Architecture: Decoder Only
Knowledge cutoff: 2022-09
