Granite 3.3 8B Instruct

Name: Granite 3.3 8B Instruct
Author: IBM Research

Released

2025-03-01

Last refreshed

2026-05-01

Status

Researched 182d ago

Open source · Commercial use: permitted · RAG · Agents · Long context · JSON / Tool use

Granite 3.3 8B Instruct is worth evaluating for RAG, agent, and long-context workloads when one of its provider routes and its 128k context window match the workload.

Use it for

Teams evaluating RAG, agents, and long context
Workloads that can use a 128k context window
Buyers comparing 2 tracked provider routes
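As a rough way to check whether a workload can use the 128k window, the sketch below tests whether a prompt plus its reserved output budget fits. It assumes "128k" means 128,000 tokens and that prompt and generated tokens share one window, which is typical for decoder-only models but is not stated on this page. The names `CONTEXT_WINDOW_TOKENS` and `fits_context` are hypothetical, and token counts must come from the provider's own tokenizer.

```python
# Assumption: "128k" is read as 128,000 tokens; a provider route may
# expose a different exact limit.
CONTEXT_WINDOW_TOKENS = 128_000


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= window


if __name__ == "__main__":
    # A 120,000-token prompt leaves room for 4,000 output tokens.
    print(fits_context(120_000, 4_000))
    # The same prompt does not leave room for 10,000 output tokens.
    print(fits_context(120_000, 10_000))
```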

Do not use it for

Vision or document-understanding workloads

Specifications

Family: Granite 3
Released: 2025-03-01
Context: 128k
Parameters: 8B
Architecture: Decoder Only
Knowledge cutoff: 2024-04
Specialization: general
Openness: Open source
License: Apache 2.0 · OSI-approved · Commercial use: permitted
Training: Fine-tuned

Created by

IBM Research

Creating reliable and adaptable AI solutions

Armonk, New York, United States

Founded 1945

Pricing

Output / 1M

$0.250

Input / 1M

$0.030

Cheapest of 2 routes · Replicate API

Providers (2)

NVIDIA NIM, Replicate API

About

Granite 3.3 8B is an IBM model with improved reasoning capabilities. It is part of IBM's enterprise-focused Granite model family, which is optimized for instruction following.

Granite 3.3 8B Instruct is an open-source model in the Granite 3 family. The structured metadata tracks a 128k-token context window, function calling, and tool use. This page tracks provider routes through NVIDIA NIM and Replicate API, with the cheapest tracked route listed at $0.03 input and $0.25 output per 1M tokens. No headline benchmark score is tracked for Granite 3.3 8B Instruct yet.

Top use-case fit: coding, agents, and build tasks

RAG

Included by capability and metadata signals in the decision map.

Agents

Included by capability and metadata signals in the decision map.

Long context

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare API pricing across 2 providers for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
Replicate API	$0.030	$0.250	Serverless
NVIDIA NIM	-	-	Serverless · Partial
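
To turn the per-1M-token prices into a per-request figure, a minimal sketch follows. It uses only the Replicate API prices from the table above. The function name `estimate_cost` is hypothetical, and the calculation ignores batch or cached-read discounts, which this page does not list for this model.

```python
from decimal import Decimal

# Prices copied from the Replicate API row above (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("0.030")
OUTPUT_PRICE_PER_M = Decimal("0.250")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token completion.
    print(estimate_cost(100_000, 2_000))
```

Decimal arithmetic is used here so that small per-request amounts do not pick up binary floating-point rounding error when summed over many requests.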

Available via routers & gateways (1)

NVIDIA LLM Router Blueprint

Router

NVIDIA's open-source AI blueprint for LLM routing, which selects the optimal model per prompt via intent classification or neural auto-routing. It is scheduled for deprecation on 2026-06-20.

Free OSS · NVIDIA NIM