Cohere Embed v4.0
Cohere Embed v4.0 is worth evaluating for embedding workloads such as semantic search, classification, and clustering when its provider route and context window match the workload.
Use it for
- Teams evaluating embedding models for semantic search, classification, or clustering
- Workloads that can use a 128k context window
- Buyers comparing 3 tracked provider routes
Do not use it for
- Strict JSON or tool-calling flows
- Family: Cohere Embed
- Released: 2025-04-01
- Context: 128k
- Architecture: Encoder Only
- Specialization: embedding
- Openness: Proprietary
- License: Proprietary (commercial use: conditional)
Cheapest of 3 routes · Microsoft Foundry
About
Embed v4.0 is Cohere's latest multimodal embedding model, supporting text, images, and mixed content (e.g., PDFs). It offers variable embedding dimensions (256, 512, 1024, or 1536, the default) and supports multiple similarity metrics (Cosine, Dot Product, Euclidean Distance). Ideal for semantic search, classification, and clustering across multimodal data.
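The three similarity metrics named above can be illustrated with a minimal sketch in plain Python. The function names and the toy 4-dimensional vectors are assumptions made for illustration; they are not part of any Cohere SDK, and real embeddings would have one of the dimensions listed above.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the vectors divided by the product of their lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy vectors standing in for real embeddings.
query = [1.0, 0.0, 2.0, 0.0]
doc_a = [2.0, 0.0, 4.0, 0.0]  # same direction as query, twice the length
doc_b = [0.0, 3.0, 0.0, 1.0]  # orthogonal to query

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(
        name,
        cosine_similarity(query, doc),
        dot_product(query, doc),
        euclidean_distance(query, doc),
    )
```

Cosine similarity ignores vector length, so `doc_a` scores as a perfect match for `query` even though it is twice as long, while dot product and Euclidean distance are both sensitive to magnitude. Which metric fits best depends on how the downstream index was built.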
Cohere Embed v4.0 is a proprietary model in the Cohere Embed family. The structured metadata tracks a 128k-token context window and multimodal input. This page tracks provider routes through Microsoft Foundry, Vercel AI Gateway, and Cohere API. No headline benchmark score is tracked for Cohere Embed v4.0 yet.
Top use-case fit
No primary decision-task fit is mapped for this model yet.
Provider price ladder
Compare API pricing across 3 providers for input and output tokens, batch, and cached reads when available.
| Provider | Input / 1M tokens | Output / 1M tokens | Route |
|---|---|---|---|
| Microsoft Foundry | $0.120 | - | Serverless, Partial |
| Vercel AI Gateway | $0.120 | - | Serverless, Partial |
| Cohere API | - | - | Serverless, Partial |
Available via routers & gateways (6)
LiteLLM
Gateway: Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.
OpenRouter
Hybrid: Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.
Portkey
Gateway: Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.
Azure AI Foundry Model Router
Router: Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.
Helicone
Gateway: Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.
Kong AI Gateway
Gateway: Multi-LLM AI gateway built on Kong Gateway 3.x, adding semantic routing, load balancing, guardrails, and MCP traffic analytics as plugins over Kong's existing API management platform.
Capabilities
Benchmark peer bars for Coding
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.
Frequently asked questions
What is the context window of Cohere Embed v4.0?
Cohere Embed v4.0 has a context window of 128k tokens.
How much does Cohere Embed v4.0 cost?
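Cohere Embed v4.0 is available at $0.12 per 1M input tokens through Microsoft Foundry. Vercel AI Gateway is tracked at the same input price, and no price is tracked for the Cohere API route.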
When was Cohere Embed v4.0 released?
Cohere Embed v4.0 was released on 2025-04-01.
Which providers offer Cohere Embed v4.0?
Cohere Embed v4.0 is available from 3 providers: Microsoft Foundry, Vercel AI Gateway, Cohere API.