Llama 3 Sonar Small 32K Chat on Perplexity Labs

Sonar · Perplexity Labs

Serverless

Pricing

Type             Price (per 1M tokens)
Input tokens     $0.20
Output tokens    $0.20
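
As a rough illustration of how the per-million rates listed above translate into per-request cost, here is a minimal sketch. The function name `estimate_cost` and the constant names are hypothetical and not part of any Perplexity API; only the two $0.20 rates come from this page.

```python
# Rates copied from the pricing table on this page (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 0.20


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A request with 30,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost(30_000, 2_000):.4f}")
```

Because input and output are priced identically here, a request that uses 32,000 tokens in total comes to about $0.0064 regardless of how the tokens are split.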

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · JSON Mode · Code Execution

About Llama 3 Sonar Small 32K Chat

Llama 3 Sonar Small 32K Chat is a large language model from Perplexity AI optimized for chat applications. It is positioned as cost-effective and fast, with improved performance over earlier Sonar models. The model supports a 32,000-token context window, which lets it carry long conversation histories, although some users report that it does not always make full use of that context. It is aimed at conversational AI such as chatbots and virtual assistants, where it provides coherent, context-aware responses. Although it is one of the smaller models in the Llama 3 lineup, it balances performance against resource efficiency. Like other LLMs, it can return inaccurate or outdated information, so its output should be verified independently.
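
To make the 32,000-token context limit concrete, the sketch below trims a long conversation so that only the most recent messages that fit are sent. It rests on several assumptions: messages are dictionaries with `role` and `content` keys (a common chat format, not confirmed by this page), token counts are approximated at about four characters per token rather than measured with the model's real tokenizer, and `trim_history` and `approx_tokens` are hypothetical helper names.

```python
from typing import Dict, List

CONTEXT_WINDOW = 32_000  # context size stated in the model description


def approx_tokens(text: str) -> int:
    """Crude estimate of ~4 characters per token (assumption, not the real tokenizer)."""
    return max(1, len(text) // 4)


def trim_history(
    messages: List[Dict[str, str]],
    reserve_for_reply: int = 1_000,
) -> List[Dict[str, str]]:
    """Keep the newest messages whose estimated size fits the context window."""
    budget = CONTEXT_WINDOW - reserve_for_reply
    kept: List[Dict[str, str]] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for msg in reversed(messages):
        cost = approx_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest turns first is only one possible policy; summarizing older turns is another, and a production client would count tokens with the actual tokenizer.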

Model Specs

Released            2024-05-05
Parameters          8B
Context             32K
Architecture        Decoder Only
Knowledge cutoff    2024-03