LLM Reference

GPT-3.5 Turbo on Replicate API

GPT-3.5 · OpenAI

Serverless

Last refreshed 2026-05-10. Next refresh: weekly.

Why use GPT-3.5 Turbo on Replicate API?

Replicate API offers GPT-3.5 Turbo with pay-as-you-go pricing at $0.50/1M input tokens. Replicate is a cloud-based platform that enables users to run machine learning models easily and efficiently.

Compare GPT-3.5 Turbo across 5 providers to find the best fit for your use case
Input / 1M tokens: $0.50
Output / 1M tokens: $1.50
Cache: Not sourced
Batch: Not sourced

Setup recipe

Python + curl
Install
pip install replicate
Auth
export REPLICATE_API_TOKEN=...
Call
import replicate
output = replicate.run(
    "gpt-3.5-turbo",
    input={"prompt": "Hello"}
)
Model ID
gpt-3.5-turbo

Request example

import replicate

# reads REPLICATE_API_TOKEN from env
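# Model reference format: "owner/model-name" (latest version) or "owner/model-name:version-hash".
# The bare ID below omits the owner; substitute the full reference shown on the model's Replicate page.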
output = replicate.run(
    "gpt-3.5-turbo",
    input={"prompt": "Hello"}
)
# Output is a list or generator depending on the model
print("".join(output))
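Because the output may be a list or a generator depending on the model, it can help to normalize it before use. The following is a minimal sketch, not part of the Replicate client: output_to_text is a hypothetical helper, and the sample values stand in for real responses so that no network call is made.

```python
from collections.abc import Iterable


def output_to_text(output) -> str:
    """Collapse a model output into one string.

    Accepts a plain string or any iterable of chunks (list, generator).
    """
    if isinstance(output, str):
        return output
    if isinstance(output, Iterable):
        # str() each chunk so non-string items do not break the join
        return "".join(str(chunk) for chunk in output)
    return str(output)


# Stand-ins for what a call might return; nothing is sent to the API here.
as_list = ["Hel", "lo"]
as_generator = (chunk for chunk in ["Hel", "lo"])

print(output_to_text(as_list))
print(output_to_text(as_generator))
print(output_to_text("Hello"))
```

A generator can only be consumed once, so store the joined string if the text is needed more than once.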

Gotchas

  • Replicate uses "owner/model-name" format (e.g. "meta/meta-llama-3-8b-instruct") for the latest version, or "owner/model-name:version-sha" to pin to a specific version. The REST endpoint splits owner and model-name into the path: /v1/models/{owner}/{model-name}/predictions.
  • The examples expect REPLICATE_API_TOKEN; rename it only if your application config maps the new variable.
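The reference format in the first gotcha can be checked before any request is sent. The sketch below is illustrative only: parse_model_ref, predictions_url, and ModelRef are hypothetical names, and the host https://api.replicate.com is an assumption not stated on this page. It splits a reference into owner, model name, and optional version, then builds the path described above.

```python
from typing import NamedTuple, Optional

API_BASE = "https://api.replicate.com"  # assumed host, not stated in the text above


class ModelRef(NamedTuple):
    owner: str
    name: str
    version: Optional[str]


def parse_model_ref(ref: str) -> ModelRef:
    """Split "owner/model-name" or "owner/model-name:version" into parts."""
    path, _, version = ref.partition(":")
    owner, sep, name = path.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/model-name[:version]', got {ref!r}")
    return ModelRef(owner, name, version or None)


def predictions_url(ref: str) -> str:
    """Build the /v1/models/{owner}/{model-name}/predictions URL.

    Any pinned version is ignored when building this path.
    """
    model = parse_model_ref(ref)
    return f"{API_BASE}/v1/models/{model.owner}/{model.name}/predictions"


print(predictions_url("meta/meta-llama-3-8b-instruct"))
```

A bare ID such as "gpt-3.5-turbo" has no owner segment, so parse_model_ref raises ValueError for it rather than producing a malformed path.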

Compare GPT-3.5 Turbo Across Providers

Provider                             Input (per 1M)   Output (per 1M)
Azure OpenAI                         $0.50            $1.50
OpenAI API                           $0.50            $1.50
Salesforce Einstein Generative AI    Not sourced      Not sourced
OpenRouter                           $0.50            $1.50
Replicate API                        $0.50            $1.50

Pricing

Type             Price (per 1M)
Input tokens     $0.50
Output tokens    $1.50
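To turn the per-million rates into a per-request figure, multiply each token count by its rate and divide by one million. The sketch below uses the two prices listed above; estimate_cost is a hypothetical helper, and the token counts are made-up inputs for illustration. For 200,000 input tokens and 50,000 output tokens the arithmetic is $0.10 + $0.075 = $0.175.

```python
from decimal import Decimal

# Rates from the pricing table above, in USD per 1M tokens
INPUT_PER_1M = Decimal("0.50")
OUTPUT_PER_1M = Decimal("1.50")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for one workload."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = Decimal(input_tokens) * INPUT_PER_1M + Decimal(output_tokens) * OUTPUT_PER_1M
    return total / ONE_MILLION


# Illustrative workload: 200,000 input tokens and 50,000 output tokens
print(estimate_cost(200_000, 50_000))
```

Decimal is used instead of float so that small per-request amounts do not pick up binary rounding noise when summed.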

Capabilities

Structured Outputs

About GPT-3.5 Turbo

GPT-3.5 Turbo is a language model developed by OpenAI that improves on GPT-3 and earlier GPT-3.5 models. As the engine behind the popular ChatGPT application, it handles text generation, translation, question answering, summarization, and code generation. The model uses Reinforcement Learning from Human Feedback (RLHF) to improve accuracy and align its responses with human preferences. It has a knowledge cutoff of September 2021 and can reflect biases in its training data. It may occasionally generate incorrect or nonsensical content, known as "hallucination," and it is sensitive to variations in input phrasing. The free version of ChatGPT may also slow down under high demand. Within these limits, GPT-3.5 Turbo remains a versatile general-purpose model.

FAQ

What does GPT-3.5 Turbo cost on Replicate API?

On Replicate API, GPT-3.5 Turbo costs $0.50 per 1M input tokens and $1.50 per 1M output tokens.

What is the context window for GPT-3.5 Turbo on Replicate API?

GPT-3.5 Turbo supports a 16K-token context window on Replicate API.
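A context window is a budget for a single request. The sketch below assumes that prompt tokens and generated tokens share the same window, and it takes the 16K figure as 16,000 tokens; both are assumptions, and the helper names are hypothetical. Token counts are passed in directly because counting them depends on the tokenizer.

```python
CONTEXT_WINDOW = 16_000  # assumed value for the "16K" context listed on this page


def remaining_output_budget(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens are left for the response after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int, max_output_tokens: int,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """Check that the prompt plus the requested output stays within the window."""
    return prompt_tokens + max_output_tokens <= context_window


print(remaining_output_budget(12_500))
print(fits_in_context(12_500, 4_000))
```

When the check fails, the usual options are to shorten the prompt or lower the requested output length.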

How does Replicate API compare to other GPT-3.5 Turbo providers?

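GPT-3.5 Turbo is available from 5 providers. The lowest listed input price is $0.50/1M tokens, shared by Azure OpenAI, OpenAI API, OpenRouter, and Replicate API; no price is listed for Salesforce Einstein Generative AI.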
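A price comparison like this one needs to handle ties and providers with no listed price. The sketch below copies the figures from the provider comparison table on this page; lowest_input_price is a hypothetical helper, and None marks the provider whose prices are not listed.

```python
from decimal import Decimal

# (input, output) in USD per 1M tokens, copied from the comparison table;
# None means the table lists no price for that provider.
PRICES = {
    "Azure OpenAI": (Decimal("0.50"), Decimal("1.50")),
    "OpenAI API": (Decimal("0.50"), Decimal("1.50")),
    "Salesforce Einstein Generative AI": None,
    "OpenRouter": (Decimal("0.50"), Decimal("1.50")),
    "Replicate API": (Decimal("0.50"), Decimal("1.50")),
}


def lowest_input_price(prices):
    """Return the lowest input price and every provider that offers it."""
    priced = {name: p for name, p in prices.items() if p is not None}
    if not priced:
        raise ValueError("no provider has a listed price")
    lowest = min(p[0] for p in priced.values())
    providers = sorted(name for name, p in priced.items() if p[0] == lowest)
    return lowest, providers


price, providers = lowest_input_price(PRICES)
print(price, providers)
```

Returning all tied providers avoids naming a single "cheapest" provider when several share the same rate.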

Who created GPT-3.5 Turbo?

GPT-3.5 Turbo was created by OpenAI as part of the GPT-3.5 model family.

Is GPT-3.5 Turbo open source?

GPT-3.5 Turbo is not open source; it is a proprietary OpenAI model.

Model Specs

Released: 2023-03-01
Parameters: 20B
Context: 16K
Architecture: Decoder Only
Knowledge cutoff: 2021-09