LLM Reference
AI Glossary
foundation, beginner

Model

Definition

A model is the large language model itself — the trained neural-network weights that turn input tokens into output tokens. It is the raw computational substrate of every AI system: a frozen artifact produced by pretraining on trillions of tokens, often further shaped by instruction tuning and alignment. Examples include Claude Opus 4.7 and Sonnet 4.6 from Anthropic, GPT-5 from OpenAI, and Google's Gemini family.

On its own a model is stateless and narrow. Given input tokens, it predicts the next-token distribution — that is the whole job. It has no memory between API calls, no filesystem, no ability to browse the web, no "session." Everything else users associate with AI — conversation history, tool access, long-horizon behavior, retrieval — is supplied by the layers wrapping it.
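A minimal sketch of what statelessness means in practice: because the model retains nothing between calls, the caller must re-send the full conversation on every turn. `toy_complete` is a hypothetical stand-in for a real chat-completion API, not any vendor's actual client.

```python
# Toy illustration of model statelessness. `toy_complete` is a
# hypothetical placeholder for a real chat-completion endpoint.

def toy_complete(messages: list[dict]) -> str:
    # A real model maps the token sequence to a next-token distribution;
    # here we just report how much context it was handed this call.
    return f"(reply conditioned on {len(messages)} messages)"

history = [{"role": "user", "content": "Hello"}]
history.append({"role": "assistant", "content": toy_complete(history)})

# Second turn: the first exchange must be re-sent in full, because the
# model has no memory of the previous call.
history.append({"role": "user", "content": "What did I just say?"})
reply = toy_complete(history)
print(reply)  # the model only "remembers" what the caller resends
```

Conversation history, in other words, lives entirely in the wrapping layers: the harness accumulates the transcript and replays it on each request.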

When choosing a model you typically trade off capability (frontier reasoning, coding, math), context length, latency and price per million tokens, and modality support (text, vision, audio, video). Post-training recipe, training cutoff, and parameter count all shape real-world behavior even when the architecture looks similar.

In our five-layer mental model — model → tools → context → harness → agent — the model is the primitive that every other layer wraps. Swapping the underlying model (say, Sonnet 4.6 for Opus 4.7 inside the same harness) is one of the most common agentic-system upgrades.
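A model swap of the kind described above can be sketched as a one-field change in a harness configuration. The `HarnessConfig` class and the model ID strings below are illustrative assumptions, not a real harness's API.

```python
# Hypothetical harness config: the model is one swappable field,
# while tools and other settings stay put.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class HarnessConfig:
    model: str              # underlying model ID (the layer being swapped)
    max_tokens: int = 4096
    tools_enabled: bool = True

base = HarnessConfig(model="claude-sonnet-4-6")  # illustrative ID
# Upgrade only the model layer; everything else in the harness is untouched.
upgraded = replace(base, model="claude-opus-4-7")  # illustrative ID
print(upgraded.model)
```

Keeping the model ID as an isolated configuration value is what makes this kind of upgrade a one-line change rather than a rewrite.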

As a pricing anchor: $1.10 (input) / $4.40 (output) per 1M tokens is the May 2026 standard-rate pair we track on the canonical OpenAI API routing row for o4-mini (always re-check routing before estimating spend).
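Given a per-1M-token rate pair like the anchor above, a back-of-envelope spend estimate is simple arithmetic. The function below is a generic sketch with the anchor rates as defaults; re-check current rates before using it for real budgeting.

```python
# Back-of-envelope API spend estimate. Default rates are the anchor
# pair quoted above ($1.10 input / $4.40 output per 1M tokens);
# always verify current pricing before relying on the result.

def estimate_cost(input_tokens: int, output_tokens: int,
                  in_rate: float = 1.10, out_rate: float = 4.40) -> float:
    """Return dollar cost given per-1M-token input/output rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# 200k input + 50k output tokens at the anchor rates:
print(f"${estimate_cost(200_000, 50_000):.2f}")  # → $0.44
```

Note that output tokens often cost several times more than input tokens, so long generations dominate spend even when prompts are large.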

See the models directory for the full catalog with pricing, context windows, and benchmark results.
