LLM ReferenceLLM Reference
NVIDIA NIM

Kimi K2 Thinking on NVIDIA NIM

Kimi K2 · Moonshot AI

Serverless

Last refreshed 2026-05-01. Next refresh: weekly.

Why use Kimi K2 Thinking on NVIDIA NIM?

NVIDIA NIM offers Kimi K2 Thinking with competitive pricing. NVIDIA NIM is NVIDIA's deployment platform for GPU-accelerated inference microservices.

Compare Kimi K2 Thinking across 5 providers to find the best fit for your use case
Input / 1M
-
Output / 1M
-
Cache
Not sourced
Batch
Not sourced

Setup recipe

Docs fallback
Install
Use the provider REST API or SDK
Auth
Create a provider API key
Call
model: moonshotai/kimi-k2-thinking
Model ID
moonshotai/kimi-k2-thinking

Request example

Curated snippets for this provider are not sourced yet. Use NVIDIA NIM documentation with model ID moonshotai/kimi-k2-thinking.

Gotchas

  • Use provider model ID "moonshotai/kimi-k2-thinking", not the LLMReference slug "kimi-k2-thinking".

Compare Kimi K2 Thinking Across Providers

ProviderInput (per 1M)Output (per 1M)
Fireworks AI$0.60$2.50
GCP Vertex AI$0.60$2.50
NVIDIA NIM
AWS Bedrock$0.60$2.50
OpenRouter$0.60$2.50

Pricing

TypeRate
GPU Hour Rate$1.00/GPU·hr

Capabilities

ReasoningStructured Outputs

About Kimi K2 Thinking

Extended thinking variant of Kimi K2 with native reasoning capabilities. 256K context.

FAQ

What is the context window for Kimi K2 Thinking on NVIDIA NIM?

Kimi K2 Thinking supports a 256,000 token context window on NVIDIA NIM.

How does NVIDIA NIM compare to other Kimi K2 Thinking providers?

Kimi K2 Thinking is available from 5 providers. The cheapest input pricing is $0.6/1M tokens from Fireworks AI.

What API model ID do I use for Kimi K2 Thinking on NVIDIA NIM?

Use the model ID moonshotai/kimi-k2-thinking when calling NVIDIA NIM's API.

Who created Kimi K2 Thinking?

Kimi K2 Thinking was created by Moonshot AI as part of the Kimi K2 model family.

Is Kimi K2 Thinking open source?

Kimi K2 Thinking is not open source; the seed data lists it as proprietary.

Get Started

Model Specs

Released2025-01-01
Context256K
ArchitectureDecoder Only

Related Models on NVIDIA NIM