GPT-3.5 Turbo on Replicate API

Name: GPT-3.5 Turbo on Replicate API
Brand: OpenAI
SKU: gpt-3.5-turbo-replicate
Price: 0.5 USD

Serverless

Last refreshed 2026-05-10. Refreshed weekly.

Why use GPT-3.5 Turbo on Replicate API?

Replicate API offers GPT-3.5 Turbo with pay-as-you-go pricing at $0.50 per 1M input tokens and $1.50 per 1M output tokens. Replicate is a cloud platform for running machine learning models through an API.

Compare GPT-3.5 Turbo across 6 providers to find the best fit for your use case

Type	Price (per 1M)
Input	$0.50
Output	$1.50
Cache	Not sourced
Batch	Not sourced

Setup recipe

Python + curl

Install

pip install replicate

Auth

export REPLICATE_API_TOKEN=...

Call

import replicate

# Reads REPLICATE_API_TOKEN from the environment
output = replicate.run(
    "gpt-3.5-turbo",
    input={"prompt": "Hello"}
)

Model ID

gpt-3.5-turbo

Request example

import replicate

# Reads REPLICATE_API_TOKEN from the environment.
# Replicate model identifiers take the form "owner/model-name" (latest version)
# or "owner/model-name:version-hash"; the bare ID below may need an owner prefix.
output = replicate.run(
    "gpt-3.5-turbo",
    input={"prompt": "Hello"}
)
# Output is a list or generator depending on the model
print("".join(output))
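
Because the output may be a list or a generator depending on the model, it can help to normalize it before use. The following is a minimal sketch; `collect_output` is a hypothetical helper name and is not part of the `replicate` package. It assumes each chunk can be converted to text with `str()`.

```python
from collections.abc import Iterable


def collect_output(output: object) -> str:
    """Return model output as one string.

    Hypothetical helper: accepts a plain string, a list of chunks,
    or a generator of chunks, and joins the chunks in order.
    """
    if isinstance(output, str):
        # Already a single string; nothing to join.
        return output
    if isinstance(output, Iterable):
        # Lists and generators are consumed chunk by chunk.
        return "".join(str(chunk) for chunk in output)
    # Fallback for any other object type.
    return str(output)


if __name__ == "__main__":
    print(collect_output(["Hel", "lo"]))
```

With this in place, `print(collect_output(output))` could replace the `"".join(output)` call without depending on which shape a given model returns.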

Gotchas

Replicate uses "owner/model-name" format (e.g. "meta/meta-llama-3-8b-instruct") for the latest version, or "owner/model-name:version-sha" to pin to a specific version. The REST endpoint splits owner and model-name into the path: /v1/models/{owner}/{model-name}/predictions.
The examples expect REPLICATE_API_TOKEN; rename it only if your application config maps the new variable.
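
The path layout and token handling described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Replicate's client code: `build_prediction_request` is a hypothetical helper, the `{"input": {"prompt": ...}}` body mirrors the Python example on this page, and the request is only constructed, never sent. Version-pinned identifiers are rejected because only the model-path endpoint is described here.

```python
import json
import os
import urllib.request

API_BASE = "https://api.replicate.com"


def build_prediction_request(model: str, prompt: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/models/{owner}/{model-name}/predictions."""
    owner, sep, name = model.partition("/")
    if not sep or not owner or not name or ":" in name:
        raise ValueError('expected "owner/model-name" without a version suffix')
    url = f"{API_BASE}/v1/models/{owner}/{name}/predictions"
    # Assumed body shape: the model input wrapped under an "input" key.
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # The token is read from REPLICATE_API_TOKEN, as in the examples above.
    token = os.environ.get("REPLICATE_API_TOKEN", "")
    req = build_prediction_request("meta/meta-llama-3-8b-instruct", "Hello", token)
    print(req.get_method(), req.full_url)
```

If the token lives under a different variable name, only the `os.environ.get` line needs to change, which keeps the mapping in one place.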

Compare GPT-3.5 Turbo Across Providers

Provider	Input (per 1M)	Output (per 1M)
Azure OpenAI	$0.50	$1.50
OpenAI API	$0.50	$1.50
Salesforce Einstein Generative AI	—	—
OpenRouter	$0.50	$1.50
Replicate API	$0.50	$1.50

Pricing

Type	Price (per 1M)
Input tokens	$0.50
Output tokens	$1.50
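
As a worked example of the rates above, the cost of a request can be estimated from its token counts. This is a minimal sketch: `estimate_cost_usd` is a hypothetical helper, the two constants are copied from the table, and it assumes billing is linear in tokens with no cache or batch discount (neither is sourced here).

```python
INPUT_USD_PER_1M = 0.50   # input tokens, from the pricing table
OUTPUT_USD_PER_1M = 1.50  # output tokens, from the pricing table


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed per-1M-token rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_1M + output_tokens * OUTPUT_USD_PER_1M
    return total / 1_000_000


if __name__ == "__main__":
    # 2,000 input tokens and 500 output tokens
    print(f"${estimate_cost_usd(2_000, 500):.6f}")  # prints $0.001750
```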

Capabilities

Structured Outputs

About GPT-3.5 Turbo

GPT-3.5 Turbo is a language model developed by OpenAI that improves on GPT-3. As the engine behind the ChatGPT application, it handles text generation, translation, question answering, summarization, and code generation. The model uses Reinforcement Learning from Human Feedback (RLHF) to improve the accuracy of its responses. It has a knowledge cutoff of September 2021 and can reflect biases in its training data. It may occasionally generate incorrect or nonsensical content, known as "hallucination," and it is sensitive to variations in input phrasing. The free version may also slow down under high demand. Within these limits, GPT-3.5 Turbo remains useful across a wide range of applications.