Last refreshed 2026-05-01. Next refresh: weekly.
Why use Kimi K2 Thinking on NVIDIA NIM?
NVIDIA NIM offers Kimi K2 Thinking with competitive pricing. NVIDIA NIM is NVIDIA's deployment platform for GPU-accelerated inference microservices.
Compare Kimi K2 Thinking across 5 providers to find the best fit for your use caseInput / 1M
-
Output / 1M
-
Cache
Not sourced
Batch
Not sourced
Setup recipe
Docs fallbackInstall
Use the provider REST API or SDKAuth
Create a provider API keyCall
model: moonshotai/kimi-k2-thinkingModel ID
moonshotai/kimi-k2-thinkingRequest example
Curated snippets for this provider are not sourced yet. Use NVIDIA NIM documentation with model ID
moonshotai/kimi-k2-thinking.Gotchas
- Use provider model ID "moonshotai/kimi-k2-thinking", not the LLMReference slug "kimi-k2-thinking".
Compare Kimi K2 Thinking Across Providers
| Provider | Input (per 1M) | Output (per 1M) |
|---|---|---|
| Fireworks AI | $0.60 | $2.50 |
| GCP Vertex AI | $0.60 | $2.50 |
| NVIDIA NIM | — | — |
| AWS Bedrock | $0.60 | $2.50 |
| OpenRouter | $0.60 | $2.50 |
Pricing
| Type | Rate |
|---|---|
| GPU Hour Rate | $1.00/GPU·hr |
Capabilities
ReasoningStructured Outputs
About Kimi K2 Thinking
Extended thinking variant of Kimi K2 with native reasoning capabilities. 256K context.
FAQ
What is the context window for Kimi K2 Thinking on NVIDIA NIM?
Kimi K2 Thinking supports a 256,000 token context window on NVIDIA NIM.
How does NVIDIA NIM compare to other Kimi K2 Thinking providers?
Kimi K2 Thinking is available from 5 providers. The cheapest input pricing is $0.6/1M tokens from Fireworks AI.
What API model ID do I use for Kimi K2 Thinking on NVIDIA NIM?
Use the model ID moonshotai/kimi-k2-thinking when calling NVIDIA NIM's API.
Who created Kimi K2 Thinking?
Kimi K2 Thinking was created by Moonshot AI as part of the Kimi K2 model family.
Is Kimi K2 Thinking open source?
Kimi K2 Thinking is not open source; the seed data lists it as proprietary.
Model Specs
Released2025-01-01
Context256K
ArchitectureDecoder Only