LTM-2-mini
LTM-2-mini is Magic's research prototype supporting a 100 million token context window, announced August 29, 2024. Magic reports that its sequence-dimension algorithm is roughly 1,000× cheaper than transformer attention at this scale, and that it needs only a fraction of a single H100's HBM, versus the 638 H100s needed to hold the KV cache for Llama 3.1 405B at the same context length. It has not been publicly released for API access or self-hosting; Magic stated that it was separately training a full LTM-2 model. Specialization: coding and software development. Source: https://magic.dev/blog/100m-token-context-windows
2024-08-29
Researched 2d ago
100M context
No tracked provider route
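The 638-H100 comparison in the LTM-2-mini entry can be sanity-checked with a back-of-the-envelope KV-cache estimate. The sketch below is an illustration, not Magic's own calculation: the Llama 3.1 405B shape (126 layers, 8 KV heads, head dimension 128), the 16-bit cache values, and the 80 GB of HBM per H100 are assumptions that the entry itself does not state.

```python
import math


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes of cache stored per token: one key and one value vector
    for every layer and every KV head."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


# Assumed Llama 3.1 405B shape (not given in the catalog entry).
PER_TOKEN = kv_cache_bytes_per_token(layers=126, kv_heads=8, head_dim=128)

CONTEXT_TOKENS = 100_000_000   # 100M-token context from the entry
H100_HBM_BYTES = 80 * 10**9    # assumed 80 GB of HBM per H100

total_bytes = PER_TOKEN * CONTEXT_TOKENS
gpus_needed = math.ceil(total_bytes / H100_HBM_BYTES)

print(f"{PER_TOKEN} bytes of KV cache per token")
print(f"{total_bytes / 1e12:.1f} TB for a 100M-token cache")
print(f"about {gpus_needed} H100s just to hold the cache")
```

Under these assumptions the estimate lands in the mid-600s, the same range as the quoted 638; the exact figure depends on how HBM capacity and cache precision are counted.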
Llama 4 Scout 17B
Multimodal Llama 4 model with 17B active parameters and 16 experts. Supports a 10M token context window for long-document processing.
2025-10-01
Researched 32d ago
10M context, Multimodal, JSON, Batch
Llama 4 Scout 17B Instruct
Llama 4 Scout 17B Instruct is Meta's Llama 4 model with multimodal text and image input. It scores 1295 on the Chatbot Arena benchmark.
2025-04-05
Researched 2d ago
10M context, Multimodal, JSON, Batch
LTM-1
LTM-1 (Long-Term Memory 1) is Magic's first model with a 5 million token context window, announced June 6, 2023. Designed to process entire codebases in context for AI-assisted software development. Architecture and parameter count not publicly disclosed. Not available as a public API; Magic used it in an early-access coding product. Source: https://magic.dev/blog/ltm-1
2023-06-06
Researched 2d ago
5M context
No tracked provider route
Gemini 1.5 Pro
Gemini 1.5 Pro is Google DeepMind's multimodal large language model for processing and analyzing large inputs across text, images, audio, and video. Its context window extends to 2 million tokens, which helps it stay coherent over long interactions and documents. The model is reported to have over 200 billion parameters, a figure Google has not confirmed, and it targets nuanced language processing, coding assistance, and advanced reasoning. It is integrated into Google platforms such as Vertex AI, with an emphasis on safe and appropriate deployment.
2024-02-15
Researched 32d ago
2M context, JSON
Gemini 1.5 Pro 002
Stable Gemini 1.5 Pro release (September 2024 variant) optimized for complex reasoning and high-quality multimodal analysis. Supports 2M context for extended document and video processing.
2024-09-24
Researched 2d ago
2M context
No tracked provider route
2024-08-27
Researched 2d ago
2M context
No tracked provider route
2024-08-01
Researched 2d ago
2M context
No tracked provider route
Grok 4.20 Multi-Agent
Grok 4.20 Multi-Agent is the extended-context xAI API variant launched around March 10, 2026 as grok-4.20-multi-agent-0309. Its reasoning.effort parameter controls how many collaborating agents are used, and the variant carries a 2M token context window.
2026-03-10
Researched 1d ago
2M context, Reasoning, Vision, Multimodal, Tool use, Functions
GPT-5.4 Pro
Premium extended-reasoning GPT-5.4 variant producing smarter and more precise responses. Replacement for o3-deep-research and o4-mini-deep-research. No prompt caching discount.
2026-03-01
Researched 11d ago
1.1M context, Reasoning, Vision, Multimodal, Tool use, Functions
GPT-5.5 Pro
GPT-5.5 Pro is OpenAI's premium variant of GPT-5.5, released April 23, 2026. Targets large quality gains for business, legal, education, and data science use cases. Scores 39.6% on FrontierMath Tier 4 (postdoctoral-level math problems), compared to 22.9% for Claude Opus 4.7. Priced at 6× the standard GPT-5.5 API rate. Available to ChatGPT subscribers and via API.
2026-04-23
Researched 2d ago
1.1M context, Reasoning, Vision, Multimodal, Tool use, Functions
GPT-5.5
GPT-5.5 is OpenAI's fully retrained agentic model, released April 23, 2026. Optimized for agentic coding, computer use, knowledge work, and early scientific research. Achieves 82.7% on Terminal-Bench 2.0, 84.9% on GDPval, and 58.6% on SWE-Bench Pro. Individual factual claims are 23% more likely to be correct versus GPT-5.4, with factual errors 3% less frequent. Uses fewer tokens than GPT-5.4 for equivalent tasks. Supports text and image inputs. Available to ChatGPT Plus, Business, and Enterprise subscribers; API access coming soon. Model ID: gpt-5.5.
2026-04-23
Researched 2d ago
1.1M context, Reasoning, Vision, Multimodal, Tool use, Functions
GPT-5.4
GPT-5.4 is OpenAI's flagship frontier reasoning model, released March 5, 2026. It incorporates advances from GPT-5.3-Codex for coding and agentic workflows, and adds 'Thinking' mode with editable reasoning plans. Key capabilities include computer use (navigating interfaces via Playwright), image understanding and generation integration, full-stack web app generation, tool calling, and deep research. Knowledge cutoff is August 31, 2025. Model ID: gpt-5.4.
2026-03-05
Researched 11d ago
1.1M context, Reasoning, Multimodal, Tool use, Functions, JSON
Xiaomi MiMo-V2.5
Xiaomi MiMo-V2.5 is the lower-cost native omnimodal sibling in the MiMo-V2.5 series. OpenRouter describes it as supporting text, image, audio, and video inputs with text output, Pro-level agentic performance at roughly half the inference cost, and improved multimodal perception over MiMo-V2-Omni. Xiaomi's official April 22 release page highlights MiMo-V2.5 alongside MiMo-V2.5-Pro in benchmark data and says the V2.5 series will be open-sourced soon; no public weights/license were verified at research time.
2026-04-22
Researched 28d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
Xiaomi MiMo-V2.5-Pro
Xiaomi's April 22, 2026 public-beta flagship in the MiMo-V2.5 series. The official Xiaomi MiMo page describes MiMo-V2.5-Pro as its most capable model to date, focused on general agentic capability, complex software engineering, long-horizon tasks, and ultra-long-context instruction following. OpenRouter lists it as text-to-text with 1,048,576 token context, 131,072 max completion tokens, reasoning controls, tool use, and response_format support. Xiaomi says the V2.5 series will be open-sourced soon, but no public weights/license were verified at research time.
2026-04-22
Researched 28d ago
1M context, Tool use, Functions, JSON
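The two limits quoted for MiMo-V2.5-Pro (1,048,576 context tokens and 131,072 max completion tokens) imply a prompt budget once completion space is reserved. The sketch below assumes that prompt and completion share the same context window, which the listing does not state explicitly; the helper name `max_prompt_tokens` is hypothetical.

```python
CONTEXT_WINDOW = 1_048_576   # context size from the OpenRouter listing (2**20)
MAX_COMPLETION = 131_072     # max completion tokens from the listing (2**17)


def max_prompt_tokens(reserved_completion: int,
                      context_window: int = CONTEXT_WINDOW,
                      max_completion: int = MAX_COMPLETION) -> int:
    """Largest prompt that still leaves `reserved_completion` tokens free.

    Assumes prompt and completion tokens count against one shared window.
    """
    if not 0 <= reserved_completion <= max_completion:
        raise ValueError("reserved completion must be between 0 and the listed maximum")
    return context_window - reserved_completion


# Reserving the full completion allowance leaves this much room for the prompt.
print(max_prompt_tokens(MAX_COMPLETION))
# Reserving a shorter answer leaves more room for input.
print(max_prompt_tokens(4_096))
```

Reserving the full 131,072-token completion leaves 917,504 tokens for the prompt under this assumption.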
Nemotron 3 Super-120B-A12B
NVIDIA Nemotron 3 Super-120B-A12B is a 120B total / 12B active hybrid Latent MoE model with interleaved Mamba-2 and MoE layers for agentic, reasoning, and conversational tasks. Fireworks lists the NVFP4 variant for on-demand deployment with 262k context.
2026-03-11
Researched 5d ago
1M context, JSON
2026-03-19
Researched 7d ago
1M context
No tracked provider route
Gemini 3.5 Flash
Gemini 3.5 Flash is Google DeepMind's generally available Flash model for sustained frontier-level performance on agentic and coding tasks. It supports multimodal inputs, native thinking, tool and function calling, structured outputs, code execution, search grounding, batch processing, and long contexts up to 1M tokens.
2026-05-19
Researched 2d ago
1M context, Reasoning, Vision, Multimodal, Audio, Tool use
Gemini 3.1 Flash-Lite
Gemini 3.1 Flash-Lite is Google's generally available low-latency Gemini 3.1 model, launched May 7, 2026. It is optimized for high-volume, cost-sensitive workloads with text, image, and video inputs, a 1M token context window, and a 66K token maximum output. The GA model uses the stable API ID gemini-3.1-flash-lite and replaces gemini-3.1-flash-lite-preview, which is scheduled to shut down on May 25, 2026. Pricing is $0.25 per 1M input tokens and $1.50 per 1M output tokens.
2026-05-07
Researched 13d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
2025-10-01
Researched 23d ago
1.048576M context (1,048,576 tokens), Vision, Multimodal, Tool use, Functions, JSON
MiMo-V2-Pro
Xiaomi MiMo-V2-Pro language model. The larger, higher-capability model in the MiMo V2 series with an extended 1M token context window.
2026-03-18
Researched 17d ago
1M context
Llama 3 70B Gradient 1048K
Llama 3 70B Gradient 1048K is Gradient's long-context version of Llama 3 70B. It offers a 1048K-token context window.
2024-04-18
Researched 2d ago
1048K context
No tracked provider route
Llama 3 8B Gradient 1048K
Llama 3 8B Gradient 1048K is Gradient's long-context version of Llama 3 8B. It offers a 1048K-token context window.
2024-04-18
Researched 2d ago
1048K context
No tracked provider route
2024-04-18
Researched 2d ago
1048K context
No tracked provider route
GPT-4.1
OpenAI's GPT-4.1 model released April 2025, excelling at coding tasks, precise instruction following, and web development. Outperforms GPT-4o in these areas with a 1 million token context window. Available via API and in ChatGPT for Plus, Pro, Team, Enterprise, and Edu users.
2025-04-01
Researched 11d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
GPT-4.1 Mini
Fast and efficient small model from OpenAI replacing GPT-4o mini. Released April 2025 alongside GPT-4.1. Shows improvements in instruction-following, coding, and intelligence with a 1 million token context window. Available in ChatGPT for paid users.
2025-04-01
Researched 11d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
Amazon Nova Premier
Amazon Nova Premier is Amazon's most capable standard Bedrock Nova understanding model for complex reasoning, agentic workflows, and model distillation. It supports a 1M-token context window, text/image/video inputs, text output, reasoning, tool calling, and prompt caching; use it as the standard Bedrock Nova frontier pick instead of Nova 2 Omni early-access Forge checkpoints.
2025-03-17
Researched 1d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
DeepSeek V4 Pro
DeepSeek V4 Pro is the flagship 1.6T parameter (49B activated) Mixture-of-Experts language model with 1M-token context. Features hybrid attention (CSA+HCA) requiring only 27% of inference FLOPs vs DeepSeek-V3.2 at 1M context, Manifold-Constrained Hyper-Connections (mHC), and Muon Optimizer for training stability. Achieves 93.5% on LiveCodeBench, 89.8% on IMOAnswerBench, and 90.1% on MMLU. Supports Non-Think, Think High, and Think Max reasoning modes. Pricing: $1.74/1M input, $3.48/1M output (cache hit: $0.145/1M input). MIT licensed. Pricing note: DeepSeek API docs state that deepseek-v4-pro is currently offered at a 75% discount, extended until 2026/05/31 15:59 UTC.
2026-04-24
Researched 7d ago
1M context, Reasoning, Tool use, Functions, JSON, Prompt cache
Qwen3.6-Plus
Qwen3.6-Plus is Alibaba Cloud's GA Qwen3.6 flagship for long-context reasoning, coding, tool use, and multimodal workflows. DashScope lists it with a 1M-token context window, structured output support, and standard public token pricing.
2026-04-01
Researched 1d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
Gemini 1.5 Flash
Gemini 1.5 Flash is a large language AI model by Google, crafted for speed and efficiency in high-volume scenarios. As a lightweight model, it's optimized for fast processing and cost-effectiveness, making it ideal for real-time applications and high-frequency tasks. With its multimodal capabilities, Gemini 1.5 Flash effectively processes and reasons across multiple data types, including text, images, audio, video, and PDFs. Despite its smaller size compared to Gemini 1.5 Pro, it excels in tasks like summarization, chat applications, and data extraction from lengthy documents, employing "knowledge distillation" to transfer essential knowledge from larger models. Additionally, it features an extensive context window of up to 1 million tokens, allowing it to manage large information volumes effectively.
2024-05-14
Researched 32d ago
1M context, JSON
Gemini 1.5 Flash 8B
Lightweight 8B variant of Gemini 1.5 Flash optimized for speed and cost-efficiency. Supports 1M token context with fast inference for real-time applications.
2024-10-03
Researched 2d ago
1M context
Gemini 1.5 Flash on Google Vertex AI
Gemini 1.5 Flash on Google Vertex AI is Google DeepMind's Gemini 1.5 model with multimodal text and image input. It offers a 1M-token context window.
2024-02-15
Researched 2d ago
1M context, Vision, Multimodal, JSON
2024-02-15
Researched 2d ago
1M context, Vision, Multimodal, JSON
Gemini 1.5 Pro on Google Vertex AI
Gemini 1.5 Pro on Google Vertex AI is Google DeepMind's Gemini 1.5 model with multimodal text and image input. It offers a 1M-token context window.
2024-02-15
Researched 2d ago
1M context, Vision, Multimodal, JSON
2024-02-15
Researched 2d ago
1M context, Vision, Multimodal, JSON
Gemini 1.0 Ultra
Google's Gemini 1.0 Ultra is a leading large language model designed for tackling highly complex tasks with advanced analytical capabilities. As the largest model in the Gemini 1.0 family, it excels in coding, mathematical reasoning, and multimodal reasoning. Its strength lies in its ability to seamlessly understand and process diverse data types, including text, code, audio, images, and video. Gemini Ultra surpasses human experts on the MMLU benchmark with a 90% score, although it has limitations in image generation and some multimodal tasks. The model features a 32,000-token context window, less than some competitors, and access is primarily through a paid subscription or via Google Cloud for developers.
2023-12-13
Researched 140d ago
32K context
MiniMax M1
MiniMax-M1 is a large-scale open-weight reasoning model from MiniMax with 456B total parameters and a 1M token context window, designed for extended reasoning and high-efficiency inference.
2025-09-01
Researched 24d ago
1M context, Reasoning, Tool use, Functions, JSON
No tracked provider route
2025-02-05
Researched 2d ago
1M context
No tracked provider route
2025-02-05
Researched 2d ago
1M context, JSON
No tracked provider route
2024-12-11
Researched 140d ago
1M context
No tracked provider route
2024-11-19
Researched 2d ago
1M context
No tracked provider route
Gemini 1.5 Flash 002
Stable Gemini 1.5 Flash release (September 2024 variant) optimized for high-speed processing and cost efficiency. Supports 1M context with fast token generation for real-time use.
2024-09-24
Researched 2d ago
1M context
No tracked provider route
2024-09-24
Researched 2d ago
1M context
No tracked provider route
2024-08-27
Researched 2d ago
1M context
No tracked provider route
2024-08-27
Researched 2d ago
1M context
No tracked provider route
Gemini 3 Flash
Gemini 3 Flash is Google's speed-optimized Gemini 3 model, available in public preview via the Gemini API and Vertex AI. It supports text, image, audio, and video inputs with a 1M token context window and is priced at $0.50 per 1M input tokens and $3.00 per 1M output tokens.
2025-12-17
Researched 4d ago
1M context, Vision, Multimodal, Audio, Tool use, Functions
Gemini 3 Pro
Google DeepMind's most advanced reasoning Gemini model. Part of the Gemini 3 series with frontier-class intelligence, multimodal understanding, and 1M token context window.
2025-12-11
Researched 140d ago
1M context, Vision, Multimodal, Tool use, Functions, Code exec
Gemini 3 Flash Preview
Frontier-class performance rivaling larger models at a fraction of the cost. Most intelligent Gemini model built for speed, combining frontier intelligence with superior search and grounding. $0.50 input / $3.00 output per 1M tokens.
2025-12-17
Researched 32d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
Amazon Nova 2 Lite
Amazon Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that processes text, images, and videos at 1M token context with improved reasoning over Nova Lite v1.
2026-03-01
Researched 24d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
No tracked provider route
Llama 4 Maverick 17B Instruct FP8
Meta's Llama 4 Maverick 17B with 128 experts, FP8-optimized for cost-efficient inference. Supports native Model Router integration on Microsoft Foundry.
2025-04-05
Researched 32d ago
1M context, JSON
Claude Mythos Preview
Claude Mythos Preview is Anthropic's frontier research model, positioned above the public Claude 4 family and released exclusively via invitation-only Project Glasswing to roughly 12 launch partners and over 40 organizations working on critical infrastructure. No public API or self-serve access. Specializes in defensive cybersecurity: it autonomously identified zero-day vulnerabilities including a 27-year-old OpenBSD TCP SACK remote code execution bug and a 17-year-old FreeBSD NFS RCE. Codenamed Capybara internally. Scores 93.9% on SWE-bench Verified, 82.0% on Terminal-Bench 2.0, and 97.6% on USAMO 2026. Partner pricing: $25/$125 per million tokens (input/output). Max output: 128K tokens. Knowledge cutoff: December 2025.
2026-04-07
Researched 20d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
Palmyra X5
Palmyra X5 is Writer's most advanced model, purpose-built for enterprise AI agents. It delivers high capability at 1M token context for large-scale document processing and complex multi-step agent workflows.
2026-02-01
Researched 24d ago
1M context, Tool use, Functions, JSON
No tracked provider route
Claude Sonnet 4.6
Claude Sonnet 4.6 is Anthropic's best combination of speed and intelligence. Proprietary decoder-only model with 1M-token context, 64K max output, multimodal vision, extended thinking, and function calling. Available via Anthropic API, AWS Bedrock, GCP Vertex AI, and OpenRouter at $3/1M input and $15/1M output tokens.
2026-02-17
Researched 13d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
Claude Opus 4.7
Claude Opus 4.7 is Anthropic's generally available flagship model with 1M context, 128K max output, adaptive thinking, and a new tokenizer with roughly 555K words per 1M tokens.
2026-04-16
Researched 7d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
Claude Opus 4.6
Claude Opus 4.6 is Anthropic's Claude 4.6 model with multimodal text and image input and an optional reasoning mode. It offers a 1M-token context window and scores 80.8 on SWE-bench Verified.
2026-02-05
Researched 2d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
DeepSeek V4 Flash
DeepSeek V4 Flash is a 284B parameter (13B activated) Mixture-of-Experts language model with 1M-token context. Features a hybrid attention architecture combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) for efficient long-context inference. Supports thinking and non-thinking modes. Legacy API aliases deepseek-chat and deepseek-reasoner map to this model's non-thinking and thinking modes respectively. Pricing: $0.14/1M input, $0.28/1M output (cache hit: $0.0028/1M input). MIT licensed.
2026-04-24
Researched 1d ago
1M context, Reasoning, Tool use, Functions, JSON, Prompt cache
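The per-million-token prices in the DeepSeek V4 Flash entry translate into per-request cost as sketched below. This is a minimal illustration, assuming that cache-hit input tokens are billed at the cache-hit rate instead of the regular input rate; the function name `request_cost_usd` and the example token counts are hypothetical.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the catalog entry above.
INPUT_PER_M = Decimal("0.14")
CACHED_INPUT_PER_M = Decimal("0.0028")
OUTPUT_PER_M = Decimal("0.28")
MILLION = Decimal(1_000_000)


def request_cost_usd(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Cost of one request; cached_tokens is the cache-hit part of input_tokens."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * INPUT_PER_M
             + cached_tokens * CACHED_INPUT_PER_M
             + output_tokens * OUTPUT_PER_M)
    return total / MILLION


# Hypothetical long-context request: 1M input tokens, 10K output tokens.
cold = request_cost_usd(1_000_000, 0, 10_000)        # nothing cached
warm = request_cost_usd(1_000_000, 900_000, 10_000)  # 90% of the prompt cached
print(f"cold: ${cold}  warm: ${warm}")
```

`Decimal` is used so that small per-token prices do not pick up binary floating-point rounding noise.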
Gemini 3.1 Pro Preview
Google's Gemini 3.1 Pro Preview, available via OpenRouter. Pricing: $2/1M input, $12/1M output.
2026-02-19
Researched 32d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
Gemini 2.5 Flash
Google's Gemini 2.5 Flash, available via OpenRouter. Pricing: $0.3/1M input, $2.5/1M output.
2025-06-17
Researched 32d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
2025-09-01
Researched 32d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
Gemini 2.5 Flash Lite
Google's Gemini 2.5 Flash Lite, available via OpenRouter. Pricing: $0.1/1M input, $0.4/1M output.
2025-07-22
Researched 32d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
Gemini 2.5 Pro
Google's Gemini 2.5 Pro, available via OpenRouter. Pricing: $1.25/1M input, $10/1M output.
2025-06-17
Researched 32d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
2026-01-01
Researched 32d ago
1M context, JSON
2025-05-06
Researched 32d ago
1M context, JSON
Gemini 2.0 Flash Lite
Google's Gemini 2.0 Flash Lite, available via OpenRouter. Pricing: $0.075/1M input, $0.3/1M output.
2025-02-12
Researched 32d ago
1M context, JSON
SubQ 1M-Preview
SubQ 1M-Preview is Subquadratic's first large language model, built on a fully sub-quadratic sparse-attention architecture that scales compute linearly with context length (O(n) vs. traditional O(n²)). Supports a production context window of 1M tokens (architecture tested to 12M). Achieves 81.8% on SWE-Bench Verified, 95.0% on RULER @128K, and 65.9% on MRCR v2 (8-needle, 1M). Claims 50x faster and 50x cheaper than leading frontier models at 1M context length. Available via OpenAI-compatible API with streaming and tool use support. Model is proprietary and not open-source; fine-tuning for customer-specific use cases is mentioned as a future capability.
2026-05-05
Researched 2d ago
1M context, Reasoning, Tool use, Functions
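The O(n) versus O(n²) contrast in the SubQ 1M-Preview entry can be made concrete by comparing how cost grows from a baseline context length. The sketch below models only the asymptotic growth rate and ignores constant factors, so it says nothing about the vendor's 50x claims; the 128K baseline and the helper `relative_cost` are illustrative choices.

```python
def relative_cost(n_tokens: int, base_tokens: int, exponent: int) -> float:
    """Cost at n_tokens relative to base_tokens when cost grows as n**exponent."""
    return (n_tokens / base_tokens) ** exponent


BASE = 128_000  # illustrative baseline context length

for n in (128_000, 1_000_000, 12_000_000):
    linear = relative_cost(n, BASE, 1)      # O(n) growth
    quadratic = relative_cost(n, BASE, 2)   # O(n^2) growth
    print(f"{n:>10} tokens: linear x{linear:,.1f}, quadratic x{quadratic:,.1f}")
```

Going from 128K to 1M tokens multiplies a linear-cost workload by about 7.8 and a quadratic-cost one by about 61; at 12M tokens the gap widens to roughly 94 versus 8,789.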
Grok 4.3
xAI's Grok 4.3 is the current flagship API chat model for agentic tool calling and instruction following. xAI lists text and image input, text output, configurable reasoning, a 1,000,000 token context window, cached-input pricing, function calling, and structured outputs.
2026-05-06
Researched 1d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
Qwen3.6-Flash
Qwen3.6-Flash is a native vision-language Flash model delivering a significant performance boost over Qwen3.5-Flash, with particular excellence in agentic coding capabilities and substantially improved spatial intelligence. Vision enhancements include notably better object localization and object detection.
2026-04-16
Researched 30d ago
1M context, Multimodal
Qwen3.5-Flash
Qwen3.5-Flash is a fast, cost-effective native vision-language model in the Qwen3.5 series, delivering outstanding performance comparable to the latest state-of-the-art models with significant leaps in both pure-text and multimodal capabilities compared to the Qwen3 series.
2026-02-23
Researched 30d ago
1M context, Multimodal
Grok 4.20
Grok 4.20 is xAI's February 2026 Grok 4-series model, first previewed under the informal Grok 4.2 beta label. Standard API variants launched around March 10, 2026 as grok-4.20-0309-reasoning and grok-4.20-0309-non-reasoning with a 1M context window.
2026-02-17
Researched 1d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
Qwen3.5-Plus
Qwen3.5-Plus is the flagship commercial API model of the Qwen3.5 native vision-language series, delivering outstanding performance comparable to state-of-the-art models with significant leaps in both pure-text and multimodal capabilities compared to the Qwen3 series.
2026-02-15
Researched 30d ago
1M context, Multimodal
Gemini Deep Research Max Preview
Maximum-comprehensiveness version of Google's Deep Research agent, built on Gemini 3.1 Pro and released April 21, 2026. Spends more compute than the standard preview to consult more sources, refine reports, and capture nuanced details. Designed for accuracy-critical long-form investigations synthesizing information from hundreds of sources. Supports MCP servers, File Search, and multi-step planning. Context: 1M tokens; max output: 65,536 tokens. Runs at Gemini 3.1 Pro rates ($2.00/$12.00 per MTok). API ID: deep-research-max-preview-04-2026.
2026-04-21
Researched 19d ago
1M context, Vision, Multimodal, Audio, Tool use, Functions
Gemini Deep Research Preview
Google's agentic deep research model built on Gemini 3.1 Pro, released April 21, 2026. Designed for speed and efficiency in autonomous multi-step research: ingests text, images, PDFs, audio, and video to produce comprehensive cited reports from public web sources and private workspace data. Supports collaborative planning, visualization, MCP servers, and File Search. Context window: 1M tokens; max output: 65,536 tokens. Runs at Gemini 3.1 Pro rates ($2.00/$12.00 per MTok). API ID: deep-research-preview-04-2026.
2026-04-21
Researched 19d ago
1M context, Vision, Multimodal, Audio, Tool use, Functions
Grok 4.20 Non-Reasoning
Grok 4.20 Non-Reasoning is the xAI API non-reasoning variant launched around March 10, 2026 as grok-4.20-0309-non-reasoning. It is the live replacement target for retired non-reasoning fast models.
2026-03-10
Researched 1d ago
1M context, Vision, Multimodal, Tool use, Functions, JSON
Grok 4.20 Reasoning
Grok 4.20 Reasoning is the xAI API reasoning variant launched around March 10, 2026 as grok-4.20-0309-reasoning. The prior May 2026 seed date was a placeholder; this model was already available months earlier and remains active.
2026-03-10
Researched 1d ago
1M context, Reasoning, Vision, Multimodal, Tool use, Functions
Qwen-Plus-Character
Qwen-Plus-Character is the Plus-tier role-playing model in the Qwen series, optimized for anthropomorphic role-playing with advanced capabilities in following predefined character instructions, advancing conversations, and demonstrating active listening and empathy. It supports deep restoration of personalized characters and is dynamically updated.
2026-01-29
Researched 30d ago
1M context
Qwen-Flash-Character
Qwen-Flash-Character is the Flash-tier role-playing model from the Qwen series, optimized for multi-language anthropomorphic interaction with advanced character consistency, context-aware dialogue progression, and empathetic engagement. Features enhanced Japanese linguistic localization, human-like role-playing authenticity, and narrative coherence control.
2026-01-12
Researched 30d ago
1M context
Qwen-Plus
Qwen-Plus is an enhanced commercial API endpoint in the Qwen series, supporting Chinese, English, and multiple other languages. The backbone has been upgraded to the Qwen3 architecture, achieving effective integration of thinking and non-thinking modes with seamless switching during conversations.
2025-11-30
Researched 30d ago
1M context
Qwen3-Coder-Plus
Qwen3-Coder-Plus is a Qwen3-based code generation model with strong coding agent capabilities, excelling at tool invocation and environment interaction. It enables autonomous programming with outstanding code capability while retaining general-purpose reasoning.
2025-09-23
Researched 30d ago
1M context
Qwen-Flash
Qwen-Flash is a Qwen3 series Flash model that seamlessly integrates thinking and non-thinking modes switchable mid-dialogue, excelling at complex thinking tasks with significant improvements in instruction adherence and text understanding. It supports 1M context length with tiered pricing based on context length.
2025-08-01
Researched 30d ago
1M context
Qwen3-Coder-Flash
Qwen3-Coder-Flash inherits the coding agent capabilities of Qwen3-Coder-Plus with support for multi-turn tool interaction, focused optimization on repository-level understanding, and enhanced tool-calling stability.
2025-07-29
Researched 30d ago
1M context