LLM Reference

Using SubQ 1M-Preview on SubQ API

Implementation guide · SubQ · Subquadratic

Serverless

Quick Start

  1. Create an account at SubQ API and generate an API key.
  2. Use the SubQ API SDK or REST API to call subq-1m-preview; see the documentation for the request format.
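The steps above can be sketched in Python using only the standard library. Since SubQ API is described as OpenAI-compatible, the request shape below follows the OpenAI chat-completions schema; the base URL is an assumption (check the SubQ API documentation for the real endpoint).

```python
# Minimal sketch of a chat-completions request against SubQ API's
# OpenAI-compatible endpoint. BASE_URL is hypothetical; consult the
# SubQ API documentation for the actual value.
import json
import os
import urllib.request

BASE_URL = "https://api.subq.example/v1"  # assumption, not the real endpoint
API_KEY = os.environ.get("SUBQ_API_KEY", "")

def build_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for subq-1m-preview."""
    payload = {
        "model": "subq-1m-preview",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Summarize this document in one sentence.")
# urllib.request.urlopen(req) would send the request; an OpenAI-compatible
# response carries the reply in choices[0].message.content.
```

Sending the request is left commented out so the sketch runs without a live key; any OpenAI-compatible client library could replace the hand-built request.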

Code Examples

See SubQ API documentation for integration details.
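Since the platform advertises streaming over an OpenAI-compatible API, a consumer loop would look like the sketch below. The SDK call in the comment is illustrative (the base URL is an assumption); the `assemble_deltas` helper is a hypothetical name introduced here and works on any iterable of OpenAI-style content deltas.

```python
# Sketch of consuming a streamed chat-completions response. Streaming
# chunks in the OpenAI schema carry incremental text in
# choices[0].delta.content, with None for keep-alive/empty chunks.
from typing import Iterable, Optional

def assemble_deltas(deltas: Iterable[Optional[str]]) -> str:
    """Join streamed content deltas, skipping None chunks."""
    return "".join(d for d in deltas if d)

# With the OpenAI Python SDK (pip install openai), usage would look like:
#
#   from openai import OpenAI
#   client = OpenAI(base_url="https://api.subq.example/v1",  # hypothetical
#                   api_key="YOUR_SUBQ_API_KEY")
#   stream = client.chat.completions.create(
#       model="subq-1m-preview",
#       messages=[{"role": "user", "content": "Explain sparse attention."}],
#       stream=True)
#   text = assemble_deltas(c.choices[0].delta.content for c in stream)

print(assemble_deltas(["Hel", None, "lo"]))  # → Hello
```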

About SubQ API

SubQ API is Subquadratic's OpenAI-compatible API platform, providing access to its SubQ large language models. The platform features a context window of up to 12 million tokens, powered by a novel sub-quadratic sparse-attention architecture with O(n) compute complexity, enabling long-context reasoning at a fraction of the cost of comparable transformer-based models. Key capabilities include streaming, tool use, and a coding agent integration (SubQ Code) compatible with Claude Code, Codex, and Cursor. The API is currently in private preview, with early access available by request.
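Because the platform is OpenAI-compatible, tool use would presumably follow the OpenAI function-calling schema. The definition below is a hedged sketch: the `get_file` tool is hypothetical, shown only to illustrate the expected request shape.

```python
# Sketch of an OpenAI-style tool definition, assuming SubQ API accepts
# the standard function-calling schema. The get_file tool is hypothetical.
tools = [{
    "type": "function",
    "function": {
        "name": "get_file",  # hypothetical tool name
        "description": "Read a file from the agent's workspace.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]
# This list would be passed as the `tools` field of a chat-completions
# request; the model may then respond with a tool_calls entry naming
# get_file and supplying a JSON-encoded `path` argument.
```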

Subquadratic is a frontier AI research and infrastructure company launched in May 2026 with $29M in seed funding. Led by CEO Justin Dangel (five-time founder) and CTO Alexander Whedon (former Meta engineer), the company's team includes 11 PhD researchers and engineers from Meta, Google, Oxford, Cambridge, ByteDance, Adobe, and Microsoft. Their flagship model SubQ is built on a fully sub-quadratic sparse-attention architecture that scales linearly with context length, enabling a 12M-token context window with 50x lower compute cost than leading frontier models at the same context length. Products include SubQ API (OpenAI-compatible REST API), SubQ Code (CLI coding agent), and SubQ Search (long-context search).


Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution

About SubQ 1M-Preview

SubQ 1M-Preview is Subquadratic's first large language model, built on a fully sub-quadratic sparse-attention architecture that scales compute linearly with context length (O(n) rather than the O(n²) of standard attention). It supports a production context window of 1M tokens (the architecture has been tested to 12M) and scores 81.8% on SWE-Bench Verified, 95.0% on RULER at 128K, and 65.9% on MRCR v2 (8-needle, 1M). Subquadratic claims it is 50x faster and 50x cheaper than leading frontier models at 1M context length. It is available via an OpenAI-compatible API with streaming and tool-use support. The model is proprietary, not open-source; fine-tuning for customer-specific use cases is mentioned as a future capability.
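The O(n) vs. O(n²) claim can be illustrated with back-of-envelope arithmetic. The snippet below ignores constant factors and real FLOP counts; it shows only how the scaling gap widens with context length, which is why sub-quadratic architectures can claim large cost reductions at long context (actual speedups depend on constants and hardware).

```python
# Relative scaling of quadratic vs. linear attention cost.
# Constants are ignored; only the growth rate is illustrated.
def quadratic_cost(n: int) -> int:
    """Pairwise token interactions: grows as n^2."""
    return n * n

def linear_cost(n: int) -> int:
    """Cost proportional to sequence length: grows as n."""
    return n

for n in (128_000, 1_000_000):
    ratio = quadratic_cost(n) // linear_cost(n)
    print(f"n={n:>9,}: quadratic/linear cost ratio = {ratio:,}")
# At n = 1M the quadratic term is 1,000,000x the linear term, since
# n^2 / n = n.
```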

Model Specs

Released: 2026-05-05
Context: 1M
Architecture: Decoder Only

Provider: SubQ API (Subquadratic)