Also known as: reasoning model, deliberative reasoning, deliberate problem solving
127 matching active models
27 tracked providers
87 models with routes
Reasoning capability describes models marketed or tracked as stronger at multi-step problem solving, planning, math, coding, and answer checking. It is not a guarantee for every workload; use it with benchmark and provider-route evidence when choosing a production model.
Showing the first 80 decision-sorted matches, with model flags and provider-route evidence from seed data.
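As a rough illustration of how the reasoning flag can be combined with provider-route and benchmark evidence when shortlisting a production model, the sketch below filters and ranks hypothetical seed records. The field names (`flags`, `provider_routes`, `benchmarks`) and the ranking rule are illustrative assumptions, not the actual seed-data schema or decision-sort implementation.

```python
# Hypothetical sketch: shortlist reasoning-flagged models by corroborating evidence.
# Field names (flags, provider_routes, benchmarks) are illustrative, not the real schema.
from dataclasses import dataclass, field

@dataclass
class ModelRecord:
    slug: str
    flags: set[str] = field(default_factory=set)
    provider_routes: list[str] = field(default_factory=list)
    benchmarks: dict[str, float] = field(default_factory=dict)

def decision_sort(records: list[ModelRecord]) -> list[ModelRecord]:
    """Keep reasoning-flagged models, then rank by route and benchmark evidence."""
    matches = [r for r in records if "reasoning" in r.flags]
    return sorted(
        matches,
        key=lambda r: (len(r.provider_routes), len(r.benchmarks)),
        reverse=True,
    )

records = [
    ModelRecord("deepseek-r1", {"reasoning"}, ["deepseek", "fireworks"], {"aime-2024": 79.8}),
    ModelRecord("grok-0", {"reasoning"}, [], {}),
]
for r in decision_sort(records):
    print(r.slug, len(r.provider_routes), "routes")
```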
Lightweight reasoning variant of Microsoft Phi-4 Mini optimized for fast inference.
2025-12-01
Researched 134d ago
128K
128,000 tokens
Mistral Magistral Small reasoning model released June 2025.
2025-06-10
Researched 1d ago
128K
128,000 tokens
Tencent HunYuan's Hy3 Preview is a high-efficiency Mixture-of-Experts language model for agentic and production workflows. OpenRouter lists tencent/hy3-preview as released Apr 22, 2026 with 262,144 context, free preview pricing, reasoning controls, and tool-use support. Hugging Face Transformers documents Hy3-preview as a Tencent HunYuan MoE model with a dense-MoE hybrid architecture, 192 routed experts, and one always-active shared expert per MoE layer. SCMP's Apr 23 coverage reports Tencent described HY3-Preview as a new flagship model developed by the HunYuan and Yuanbao teams with 295B parameters. Treat release metadata as high confidence for existence/context and medium confidence for exact parameter count until Tencent publishes a primary technical card.
2026-04-22
Researched 22d ago
262K
262,144 tokens
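To make the dense-MoE hybrid layout described above more concrete, here is a minimal PyTorch sketch of a layer with routed experts plus one always-active shared expert. It is not Tencent's implementation; the hidden sizes, expert count, and top-k value are placeholders chosen only to keep the example small.

```python
# Illustrative sketch of an MoE layer with routed experts plus one always-active
# shared expert, in the spirit of the Hy3-preview description. Sizes and top-k
# are placeholders, not Tencent's configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedExpertMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=128, n_routed=8, top_k=2):
        super().__init__()
        self.routed = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_routed)
        )
        self.shared = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.router = nn.Linear(d_model, n_routed)
        self.top_k = top_k

    def forward(self, x):  # x: [tokens, d_model]
        gate = F.softmax(self.router(x), dim=-1)
        weights, idx = gate.topk(self.top_k, dim=-1)            # per-token expert choice
        weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize over top-k
        routed_out = torch.zeros_like(x)
        for t in range(x.size(0)):                              # naive per-token dispatch
            routed_out[t] = sum(
                w * self.routed[e](x[t]) for w, e in zip(weights[t], idx[t])
            )
        return self.shared(x) + routed_out                      # shared expert is always active

layer = SharedExpertMoE()
print(layer(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```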
Xiaomi MiMo-V2.5 is the lower-cost native omnimodal sibling in the MiMo-V2.5 series. OpenRouter describes it as supporting text, image, audio, and video inputs with text output, Pro-level agentic performance at roughly half the inference cost, and improved multimodal perception over MiMo-V2-Omni. Xiaomi's official April 22 release page highlights MiMo-V2.5 alongside MiMo-V2.5-Pro in benchmark data and says the V2.5 series will be open-sourced soon; no public weights/license were verified at research time.
2026-04-22
Researched 22d ago
1M
1,048,576 tokens
Kimi K2.6 is Moonshot AI's latest agentic reasoning model, launched April 13 2026 as a code preview for Kimi Code subscribers. Built on a 1-trillion-parameter MoE architecture (32B active, 384 experts), it inherits K2.5's 256K context window and adds enhanced reliability for long-horizon agentic workflows — supporting 200–300 sequential tool calls without drift. Optimized for coding, multi-step agent planning, and vision-assisted tasks such as processing screenshots, PDFs, and spreadsheets.
2026-04-20
Researched 9d ago
262K
262,144 tokens
Hermes 4's 405B hosted variant on Nous Portal. The portal describes it as the largest Hermes 4 model, focused on advanced reasoning and creative depth rather than inference speed or cost.
2025-09-22
Researched 22d ago
128K
128,000 tokens
Nanbeige4-3B is an open-source 3B-parameter language model by Nanbeige LLM Lab (BOSS Zhipin), released December 2025. Pre-trained on 23 trillion high-quality tokens with SFT on 30M+ diverse instructions. Context extended to 64K via Adjusted Base Frequency (ABF). Sets state-of-the-art on AIME 2024 (90.4), AIME 2025 (85.6), and GPQA-Diamond (82.2) for sub-10B models, outperforming models up to 10× larger including Qwen3-32B. arxiv: 2512.06266. HuggingFace: Nanbeige/Nanbeige4-3B-Base.
2025-12-13
Researched 13d ago
64K
64,000 tokens
No tracked provider route
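The Adjusted Base Frequency (ABF) extension mentioned for Nanbeige4-3B works by raising the RoPE base so positional frequencies rotate more slowly, keeping far-apart positions distinguishable out to the new window. A minimal sketch follows; the 10,000 to 640,000 base pair is an illustrative assumption, not Nanbeige's published setting.

```python
# Sketch of Adjusted Base Frequency (ABF) for RoPE context extension.
# Raising the base slows the rotation of each frequency band, so positions far
# beyond the original training window stay distinguishable. Base values below
# are illustrative, not Nanbeige's published configuration.
def rope_inv_freq(dim: int, base: float) -> list[float]:
    """Inverse frequencies for rotary position embeddings."""
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

dim = 128
original = rope_inv_freq(dim, base=10_000.0)   # typical pretraining base
adjusted = rope_inv_freq(dim, base=640_000.0)  # ABF: larger base for long context

# Rotation angle of the slowest band at position 65,536 (64K):
pos = 65_536
print(f"original slowest-band angle: {pos * original[-1]:.2f} rad")
print(f"adjusted slowest-band angle: {pos * adjusted[-1]:.2f} rad")
```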
Grok-0, developed by xAI, is a large language model with 33 billion parameters. It performs on par with Meta's 70-billion-parameter LLaMA 2 despite using only half the training resources, highlighting its architectural efficiency. Grok-0 served as the prototype for this efficient design before being succeeded by Grok-1, which further improved reasoning and coding capabilities.
2023-08-18
Researched 134d ago
—
No window data
No tracked provider route
Grok-1, created by xAI, is a 314-billion-parameter Mixture-of-Experts (MoE) language model. Its architecture uses 8 experts, 2 of which are active per token, across 64 layers with 48 attention heads per query. The model was trained from scratch on a custom training stack built on JAX and Rust, completing pre-training in October 2023. Released as a base model under the permissive Apache 2.0 license, it permits both commercial and non-commercial use, though it is not fine-tuned for specific tasks. Benchmarks highlight strong reasoning across a range of tasks, but the model can still produce inaccuracies ("hallucinations"). Running it locally requires substantial hardware, including a multi-GPU system.
2023-11-03
Researched 134d ago
—
No window data
No tracked provider route
Grok-1.5V, created by xAI, is a multimodal large language model that combines text and image processing. It excels at interpreting diverse visual data, including documents, diagrams, charts, screenshots, and photographs. Its multimodal nature allows it to perform advanced tasks such as translating diagrams into code, generating image descriptions, and answering questions based on visual inputs, with a strong grasp of spatial information. Grok-1.5V has demonstrated competitive results against top models such as GPT-4V and Gemini Pro 1.5, particularly in tasks requiring spatial reasoning. Initially, access is limited to early testers and existing Grok users, with broader availability planned.
2024-04-12
Researched 134d ago
—
No window data
No tracked provider route
Grok-1.5, developed by xAI, Elon Musk's AI company, is a large language model focused on advanced reasoning in coding and mathematics, with strong performance on benchmarks such as MATH, GSM8K, and HumanEval. It handles long contexts of up to 128,000 tokens, surpassing its predecessor, and is built on a custom distributed training framework using JAX, Rust, and Kubernetes. Designed for comprehensive context understanding and logical reasoning, it was rolled out to early testers and users. A multimodal version, Grok-1.5V, adds visual input processing for documents, diagrams, and photographs.
2024-03-29
Researched 134d ago
—
No window data
No tracked provider route
Post-training variant of GLM-5 from Zhipu AI with enhanced reasoning and coding capabilities. 754B parameters (40B active) in Mixture of Experts architecture. Optimized for complex agentic workflows and multi-step reasoning. Available via Z.AI API and open weights under the MIT license.
2026-04-07
Researched 11d ago
200K
200,000 tokens
Lightweight variant of Grok-2 from xAI with extended context for general reasoning tasks.
2024-08-01
Researched 134d ago
128K
128,000 tokens
No tracked provider route
Enhanced contextual memory with limited image input; political filter added.
2024-08-01
Researched 26d ago
128K
128,000 tokens
Claude 3 Sonnet by Anthropic is a versatile large language model, balancing intelligence and speed for diverse enterprise use cases. It is part of the Claude 3 family, positioned between the more powerful Opus and the faster Haiku. Sonnet excels at nuanced content creation, accurate summarization, and complex scientific queries, and is proficient in non-English languages and coding tasks. It also offers strong vision capabilities, including visual reasoning over charts and graphs and transcribing text from imperfect images, which benefits industries such as retail, logistics, and finance. Faster than Claude 3 Opus, Sonnet suits context-sensitive customer support and multi-step workflows. It is classified at AI Safety Level 2 (ASL-2) and is accessible through Claude.ai, the Claude iOS app, the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI.
2024-03-04
Researched 26d ago
200K
200,000 tokens
DeepSeek R1: Reasoning-optimized model with extended thinking capabilities. 128K context.
2025-01-20
Researched 26d ago
128K
128,000 tokens
Claude 3.7 Sonnet is Anthropic's advanced model with extended thinking capabilities, offering state-of-the-art reasoning for complex tasks.
2025-02-24
Researched 26d ago
200K
200,000 tokens
Flagship open-weight foundation model from Zhipu AI with 744B parameters (40B active per token) in Mixture of Experts architecture. Trained on 28.5T tokens using DeepSeek Sparse Attention on Huawei Ascend hardware. Achieves state-of-the-art performance on coding and agentic benchmarks (SWE-bench Verified: 77.8%). Supports autonomous planning, multi-step tool use, and self-correction.
2026-02-11
Researched 26d ago
200K
200,000 tokens
Large-scale distilled DeepSeek R1 leveraging Llama 70B for complex reasoning.
2025-01-20
Researched 26d ago
128K
128,000 tokens
DeepSeek V4 Pro is the flagship 1.6T parameter (49B activated) Mixture-of-Experts language model with 1M-token context. Features hybrid attention (CSA+HCA) requiring only 27% of inference FLOPs vs DeepSeek-V3.2 at 1M context, Manifold-Constrained Hyper-Connections (mHC), and Muon Optimizer for training stability. Achieves 93.5% on LiveCodeBench, 89.8% on IMOAnswerBench, and 90.1% on MMLU. Supports Non-Think, Think High, and Think Max reasoning modes. Pricing: $1.74/1M input, $3.48/1M output (cache hit: $0.145/1M input). MIT licensed. Pricing note: DeepSeek API docs state that deepseek-v4-pro is currently offered at a 75% discount, extended until 2026/05/31 15:59 UTC.
2026-04-24
Researched 1d ago
1M
1,000,000 tokens
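A back-of-envelope cost estimate from the listed DeepSeek V4 Pro rates can clarify the cache-hit pricing. The sketch below is an assumption-laden calculator: it takes the listed per-million-token figures at face value and leaves the advertised 75% promotional discount behind an explicit flag, since it is not stated here whether the listed rates already include it.

```python
# Back-of-envelope cost estimate for a DeepSeek V4 Pro call from the listed
# per-million-token rates. Whether those rates already include the advertised
# 75% promotional discount is not specified, so the discount sits behind a flag.
RATE_INPUT = 1.74          # USD per 1M input tokens (cache miss)
RATE_INPUT_CACHED = 0.145  # USD per 1M input tokens (cache hit)
RATE_OUTPUT = 3.48         # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_hit_ratio: float = 0.0,
                  apply_promo_discount: bool = False) -> float:
    cached = input_tokens * cache_hit_ratio
    fresh = input_tokens - cached
    cost = (fresh * RATE_INPUT + cached * RATE_INPUT_CACHED
            + output_tokens * RATE_OUTPUT) / 1_000_000
    return cost * 0.25 if apply_promo_discount else cost

# Example: 200K-token prompt with 60% of it served from cache, 8K-token answer.
print(f"${estimate_cost(200_000, 8_000, cache_hit_ratio=0.6):.4f}")
```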
Distilled DeepSeek R1 reasoning in Qwen-32B for advanced problem-solving.
2025-01-20
Researched 26d ago
128K
128,000 tokens
Kimi K2 Instruct is an instruction-tuned language model from Moonshot AI, available via Fireworks AI.
2025-01-01
Researched 26d ago
—
No window data
GPT-5.2 is OpenAI's incremental update in the GPT-5 series, offering improvements in agentic coding and long-context performance.
2025-12-11
Researched 1d ago
400K
400,000 tokens
OpenAI o3 reasoning model with advanced multi-step problem-solving capabilities.
2025-03-31
Researched 5d ago
200K
200,000 tokens
Distilled DeepSeek R1 reasoning encoded into Llama 8B architecture.
2025-01-20
Researched 134d ago
128K
128,000 tokens
Distilled DeepSeek R1 with reasoning in Qwen-14B for mid-scale inference.
2025-01-20
Researched 134d ago
128K
128,000 tokens
Distilled DeepSeek R1 reasoning capabilities in Qwen-7B form factor.
2025-01-20
Researched 134d ago
128K
128,000 tokens
Hermes 4's 70B hosted variant on Nous Portal. The portal describes it as a hybrid-mode reasoning model that balances capability and size, staying fast and cost-effective for complex reasoning tasks.
2025-09-22
Researched 22d ago
128K
128,000 tokens
Distilled DeepSeek R1 based on Qwen-1.5B for compact reasoning.
2025-01-20
Researched 134d ago
128K
128,000 tokens
OpenAI o1-mini model emphasizing fast reasoning for smaller tasks and problems.
2024-09-12
Researched 134d ago
128K
128,000 tokens
Nanbeige4.1-3B is an open-source 3.93B-parameter reasoning and agentic language model by Nanbeige LLM Lab (BOSS Zhipin), released February 11, 2026. Built on Nanbeige4-3B-Base with further SFT and reinforcement learning. Supports 256K token context. A unified generalist model achieving strong reasoning, preference alignment, and agentic tool-use capabilities at the ~3B scale. Competitive with much larger open models. arxiv: 2602.13367. HuggingFace: Nanbeige/Nanbeige4.1-3B.
2026-02-11
Researched 13d ago
256K
256,000 tokens
No tracked provider route
MiniMax-M1 is a large-scale open-weight reasoning model from MiniMax with 456B total parameters and a 1M token context window, designed for extended reasoning and high-efficiency inference.
2025-09-01
Researched 18d ago
1M
1,000,000 tokens
No tracked provider route
Tencent Hunyuan T1 is a deep-thinking reasoning model released March 21, 2025. Built on Hunyuan TurboS base, using a Hybrid-Transformer-Mamba MoE architecture — the first ultra-large-scale Mamba-powered LLM with 16 total experts and 52B activated parameters via dynamic routing. 96.7% of compute allocated to reinforcement learning post-training. Benchmarks: MATH-500 96.2, LiveCodeBench 64.9, GPQA Diamond 69.3, MMLU-PRO 87.2 (second only to o1). Decoding speed 2× faster than comparable transformer models. 256K token context. Available on Tencent Cloud API. Source: https://tencent.github.io/llm.hunyuan.T1/README_EN.html
2025-03-21
Researched 13d ago
256K
256,000 tokens
No tracked provider route
2025-01-20
Researched 134d ago
128K
128,000 tokens
No tracked provider route
Lightweight DeepSeek R1 reasoning model optimized for speed.
2024-11-21
Researched 134d ago
128K
128,000 tokens
No tracked provider route
OpenAI o1 preview model emphasizing reasoning and complex problem-solving.
2024-09-12
Researched 134d ago
128K
128,000 tokens
No tracked provider route
The Cerebras GPT 590M is a robust language model featuring 590 million parameters and a transformer architecture akin to GPT-3. It is optimized for natural language processing tasks such as text generation, completion, and summarization. Trained using the Chinchilla scaling laws and Cerebras' weight streaming technology, this model achieves high efficiency, offering faster training times and reduced costs. The Andromeda AI supercomputer facilitated its training on the extensive Pile dataset. Open-sourced under the Apache 2.0 license, it primarily supports English and requires additional tuning for other languages and conversational applications due to its lack of reinforcement learning from human feedback.
2023-03-13
Researched 134d ago
—
No window data
No tracked provider route
The NeMo Megatron-GPT 5B is a transformer-based language model with 5 billion trainable parameters, inspired by models like GPT-2 and GPT-3. Its architecture is a decoder-only transformer, designed to process input sequentially for text generation and language-understanding tasks. Trained on EleutherAI's The Pile dataset, it produces coherent, natural-sounding text and can answer questions and complete sentences. Despite its strengths, the model can reflect biases and toxic language from its training data, sometimes yielding inappropriate outputs. Evaluations on benchmarks such as the LM Evaluation Test Suite show varying performance, scoring 0.5566 on ARC-Easy and 0.6133 on Winogrande, indicating both strengths and limitations across tasks.
2019-08-28
Researched 134d ago
—
No window data
No tracked provider route
OpenAI's previous-generation flagship reasoning model with configurable reasoning effort. Released August 2025. Supports minimal, low, medium, and high reasoning levels. Succeeded by GPT-5.1 and later models.
2025-08-07
Researched 5d ago
400K
400,000 tokens
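The configurable reasoning effort noted for GPT-5 maps onto the OpenAI Responses API's reasoning parameter. A minimal sketch, assuming the `gpt-5` model identifier and the documented effort levels; check the current API reference before relying on it.

```python
# Sketch: selecting a reasoning-effort level via the OpenAI Python SDK's
# Responses API. The reasoning.effort control trades latency/cost against
# reasoning depth; verify the model identifier against current docs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "minimal"},  # one of: minimal, low, medium, high
    input="How many weeks are there between 2025-08-07 and 2026-04-16?",
)
print(response.output_text)
```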
Near-frontier intelligence for cost-sensitive, low-latency, high-volume workloads. Released August 2025. Replaces o4-mini (shutting down Oct 2026).
2025-08-07
Researched 5d ago
400K
400,000 tokens
Fastest, cheapest GPT-5 variant for summarization and classification tasks. Also available via Realtime API.
2025-08-07
Researched 5d ago
400K
400,000 tokens
Premium extended-reasoning GPT-5.4 variant producing smarter and more precise responses. Replacement for o3-deep-research and o4-mini-deep-research. No prompt caching discount.
2026-03-01
Researched 5d ago
1.1M
1,050,000 tokens
Advanced o3 reasoning model for complex math, science, and coding problems. Supports tools, vision, and extended thinking. Available to Pro users. Released June 10, 2025.
2025-06-10
Researched 26d ago
—
No window data
Seed 1.6 Flash is ByteDance Seed's ultra-fast multimodal thinking model supporting text and visual understanding at 256K context, optimized for low-latency inference.
2026-03-01
Researched 18d ago
256K
256,000 tokens
No tracked provider route
Amazon Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that processes text, images, and videos at 1M token context with improved reasoning over Nova Lite v1.
2026-03-01
Researched 18d ago
1M
1,000,000 tokens
No tracked provider route
Seed 1.6 is a general-purpose multimodal model from ByteDance Seed supporting text, image, and video inputs. It incorporates multimodal capabilities and deep thinking for complex tasks at 256K context.
2026-03-01
Researched 18d ago
256K
256,000 tokens
No tracked provider route
o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex multi-step research tasks by synthesizing information from multiple sources at 200K context.
2025-10-10
Researched 1d ago
200K
200,000 tokens
No tracked provider route
Open-weight dense Qwen3.6 27B model with native multimodal support across text, image, and video. Apache 2.0.
2026-04-27
Researched 1d ago
262K
262,144 tokens
Aion-1.0-Mini is a 32B parameter model distilled from DeepSeek-R1, designed for strong performance in reasoning and coding at a smaller footprint.
2026-01-01
Researched 18d ago
128K
128,000 tokens
No tracked provider route
Claude 3.5 Haiku is Anthropic's speed- and efficiency-focused model, maintaining high intelligence while being optimized for applications that need rapid responses, such as interactive chatbots and real-time content moderation. Initially text-only, with image input planned for a later update. It excels at delivering fast, accurate code suggestions, processing and categorizing information swiftly, and handling large volumes of user interactions, with advanced coding, tool-use, and reasoning abilities. It surpasses Claude 3 Haiku on benchmarks, and its pricing reflects the improved performance.
2024-10-22
Researched 26d ago
200K
200,000 tokens
OLMo 3 32B Think is Allen Institute for AI's reasoning-focused model with extended thinking chains for complex logic problems and multi-step reasoning.
2026-03-01
Researched 18d ago
64K
64,000 tokens
No tracked provider route
Claude Sonnet 4.5 available on AWS Bedrock
2025-09-29
Researched 26d ago
200K
200,000 tokens
Claude 3.5 Sonnet combines strong reasoning, coding, and natural language understanding with multimodal (vision) processing. First released in June 2024 and upgraded in October 2024, it outperforms Anthropic's previous models and many competitors on standard benchmarks. It supports applications from software development and data analysis to customer support and content creation, and its "Artifacts" feature enables real-time collaborative workflows. The model is accessible through Claude.ai and the Anthropic API at competitive pricing.
2024-06-20
Researched 26d ago
200K
200,000 tokens
Claude Opus 4.5 available on AWS Bedrock
2025-11-01
Researched 26d ago
200K
200,000 tokens
Claude Mythos Preview is Anthropic's frontier research model, positioned above the public Claude 4 family and released exclusively via invitation-only Project Glasswing to roughly 12 launch partners and over 40 organizations working on critical infrastructure. No public API or self-serve access. Specializes in defensive cybersecurity — autonomously identified zero-day vulnerabilities including a 27-year-old OpenBSD TCP SACK remote code execution bug and a 17-year-old FreeBSD NFS RCE. Codenamed Capybara internally. Scores 93.9% on SWE-bench Verified, 82.0% on Terminal-Bench 2.0, and 97.6% on USAMO 2026. Partner pricing: $25/$125 per million tokens (input/output). Max output: 128K tokens. Knowledge cutoff: December 2025.
2026-04-07
Researched 14d ago
1M
1,000,000 tokens
INTELLECT-3 is Prime Intellect's 106B-parameter MoE model with 12B active parameters, post-trained from GLM-4.5-Air-Base via SFT and reinforcement learning, matching frontier closed-model performance.
2026-04-01
Researched 18d ago
128K
128,000 tokens
No tracked provider route
Aion-1.0 is a multi-model system from AionLabs designed for high performance across reasoning and coding tasks, trained with direct distillation from frontier models.
2026-01-01
Researched 18d ago
128K
128,000 tokens
No tracked provider route
Maestro Reasoning is Arcee AI's flagship 32B analysis and reasoning model, fine-tuned with DPO from Qwen 2.5-32B for cross-domain reasoning, mathematics, and structured analysis tasks.
2025-12-01
Researched 18d ago
128K
128,000 tokens
No tracked provider route
Cogito v2.1 671B MoE is Deep Cogito's strongest open model, matching performance of frontier closed models. It features deep thinking capabilities and strong results on coding, reasoning, and math benchmarks.
2025-11-19
Researched 8d ago
128K
128,000 tokens
No tracked provider route
Claude Sonnet 4.6 is Anthropic's best combination of speed and intelligence. Proprietary decoder-only model with 1M-token context, 64K max output, multimodal vision, extended thinking, and function calling. Available via Anthropic API, AWS Bedrock, GCP Vertex AI, and OpenRouter at $3/1M input and $15/1M output tokens.
2026-02-17
Researched 7d ago
1M
1,000,000 tokens
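Extended thinking on Claude Sonnet 4.6 would be requested through the Anthropic Messages API's thinking parameter, as with earlier Claude releases. A minimal sketch follows; the `claude-sonnet-4-6` identifier is assumed from Anthropic's naming pattern and should be verified against the current model list.

```python
# Sketch: requesting extended thinking via the Anthropic Messages API. The
# thinking parameter and budget_tokens field are the documented controls; the
# model ID below is assumed from Anthropic's naming pattern, not confirmed.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-sonnet-4-6",  # assumed identifier
    max_tokens=2048,            # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 1024},
    messages=[{"role": "user", "content": "Plan a three-step migration from REST to gRPC."}],
)
for block in message.content:
    if block.type == "text":
        print(block.text)
```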
Claude Opus 4.7 is Anthropic's generally available flagship model with 1M context, 128K max output, adaptive thinking, and a new tokenizer with roughly 555K words per 1M tokens.
2026-04-16
Researched 1d ago
1M
1,000,000 tokens
Claude Opus 4.6 available on AWS Bedrock
2026-02-05
Researched 26d ago
1M
1,000,000 tokens
Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse MoE architecture, available for preview as part of the Qwen3.6 series.
2026-04-20
Researched 3d ago
256K
256,000 tokens
Mistral Medium 3.5 is Mistral AI's first flagship merged model, combining instruction-following, reasoning, coding, and vision in one dense 128B model. It supports configurable reasoning effort, text and image input, native function calling, JSON output, and a 256K context window. Released as open weights under Mistral's Modified MIT license, it can be self-hosted on as few as four H100/H200 GPUs and scores 77.6% on SWE-bench Verified.
2026-04-29
Researched 1d ago
256K
256,000 tokens
DeepSeek V4 Flash is a 284B parameter (13B activated) Mixture-of-Experts language model with 1M-token context. Features a hybrid attention architecture combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) for efficient long-context inference. Supports thinking and non-thinking modes. Legacy API aliases deepseek-chat and deepseek-reasoner map to this model's non-thinking and thinking modes respectively. Pricing: $0.14/1M input, $0.28/1M output (cache hit: $0.0028/1M input). MIT licensed.
2026-04-24
Researched 1d ago
1M
1,000,000 tokens
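Since the entry above says the legacy aliases deepseek-chat and deepseek-reasoner map to DeepSeek V4 Flash's non-thinking and thinking modes, selecting a mode reduces to choosing the model string on DeepSeek's OpenAI-compatible endpoint. A minimal sketch, assuming the documented base URL and that both aliases remain live:

```python
# Sketch: calling the legacy aliases on DeepSeek's OpenAI-compatible endpoint.
# deepseek-chat -> non-thinking mode, deepseek-reasoner -> thinking mode,
# per the entry above; availability of the aliases is an assumption.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",
)

for alias in ("deepseek-chat", "deepseek-reasoner"):
    reply = client.chat.completions.create(
        model=alias,
        messages=[{"role": "user", "content": "Is 2027 a prime number?"}],
    )
    print(alias, "->", reply.choices[0].message.content[:80])
```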
2025-01-01
Researched 26d ago
160K
160,000 tokens
Extended thinking variant of Kimi K2 with native reasoning capabilities. 256K context.
2025-01-01
Researched 26d ago
256K
256,000 tokens
MiniMax M2.7 is MiniMax's self-improving frontier model, released March 18, 2026. It introduces native multi-agent collaboration, complex skill orchestration, and early recursive self-improvement capabilities. The model uses 10B active parameters, supports a 204,800-token context window, and was released alongside MiniMax-M2.7-highspeed, a 66% faster latency-optimized variant. Public provider listings price standard M2.7 at $0.30 per 1M input tokens and $1.20 per 1M output tokens.
2026-03-18
Researched 11d ago
205K
204,800 tokens
MiniMax M2.5 Highspeed is MiniMax's inference-optimized variant of M2.5, released simultaneously in February 2026. It delivers identical intelligence and outputs to standard M2.5 through a specialized inference engine at lower latency. The model supports a 204,800-token context window, 131,072-token max output, function calling, structured output, and reasoning. API model ID: MiniMax-M2.5-highspeed. It is designed for latency-sensitive interactive applications and automated agent pipelines.
2026-02-12
Researched 11d ago
205K
204,800 tokens
MiniMax M2.7 Highspeed is the inference-optimized variant of MiniMax M2.7, released simultaneously on March 18, 2026. It reaches 100 tokens per second output speed, about 66% faster than standard M2.7, while preserving identical intelligence and outputs through engine optimization rather than weight changes. It supports a 204,800-token context window, 131,072-token max output, function calling, structured output, and reasoning. API model ID: MiniMax-M2.7-highspeed.
2026-03-18
Researched 11d ago
205K
204,800 tokens
Cogito v1 Preview Llama 3B is Deep Cogito's smallest hybrid reasoning model. Fine-tuned from Llama 3.2 3B using Iterated Distillation and Amplification (IDA). Supports both direct and extended-thinking (reasoning) modes, tool calling, and 30+ languages.
2025-04-08
Researched 8d ago
128K
128,000 tokens
Cogito v1 Preview Llama 70B is Deep Cogito's largest v1 dense model. Fine-tuned from a Llama 70B base using Iterated Distillation and Amplification (IDA). Outperforms Llama 4 109B MoE on standard benchmarks according to Deep Cogito. Supports direct and reasoning modes with tool calling.
2025-04-08
Researched 8d ago
128K
128,000 tokens
Cogito v1 Preview Llama 8B is a hybrid reasoning model fine-tuned from Llama 3.1 8B using Iterated Distillation and Amplification (IDA). Supports direct and extended-thinking modes, tool calling, and 30+ languages.
2025-04-08
Researched 8d ago
128K
128,000 tokens
Cogito v1 Preview Qwen-14B is a hybrid reasoning model fine-tuned from Qwen 2.5 14B using Iterated Distillation and Amplification (IDA). Supports direct and extended-thinking modes, tool calling, and 30+ languages.
2025-04-08
Researched 8d ago
128K
128,000 tokens
Cogito v1 Preview Qwen-32B is a hybrid reasoning model fine-tuned from Qwen 2.5 32B using Iterated Distillation and Amplification (IDA). Supports direct and extended-thinking modes, tool calling, and 30+ languages.
2025-04-08
Researched 8d ago
128K
128,000 tokens
2025-01-01
Researched 134d ago
160K
160,000 tokens
2025-01-01
Researched 134d ago
160K
160,000 tokens
2025-01-01
Researched 134d ago
160K
160,000 tokens
2025-01-01
Researched 134d ago
160K
160,000 tokens
2025-01-01
Researched 134d ago
128K
128,000 tokens