Llama 3 Sonar Small 32K Chat
Deprecated
About
Llama 3 Sonar Small 32K Chat is a large language model from Perplexity AI, optimized for chat applications. It is positioned as cheaper, faster, and better performing than earlier Sonar models. The model supports a context window of 32,000 tokens, which lets it carry long conversation histories, although some users report that it does not always make full use of that context. It is intended for conversational AI settings such as chatbots and virtual assistants, where it provides coherent, context-aware responses. As one of the smaller models in the Llama 3 lineup, it aims to balance performance against resource efficiency. Like other LLMs, it can return inaccurate or outdated information, so its output should be verified independently.
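To make the 32,000-token context window concrete, here is a minimal sketch of how a chat client might trim older messages so that a conversation history stays within that limit. Everything except the 32,000 figure is an assumption for illustration: the helper names `estimate_tokens` and `trim_history` are hypothetical, the `role`/`content` message shape is a common convention rather than something this page specifies, and the four-characters-per-token estimate is a rough heuristic, not the model's actual tokenizer.

```python
from typing import Dict, List

CONTEXT_WINDOW = 32_000  # tokens, as stated in the model description


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (a heuristic assumption)."""
    return max(1, (len(text) + 3) // 4)


def trim_history(
    messages: List[Dict[str, str]],
    budget: int = CONTEXT_WINDOW,
    reserve_for_reply: int = 1_000,
) -> List[Dict[str, str]]:
    """Keep the most recent messages whose estimated size fits the budget.

    `reserve_for_reply` sets aside room for the model's answer
    (the value is hypothetical; tune it for your application).
    """
    remaining = budget - reserve_for_reply
    kept: List[Dict[str, str]] = []
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(messages):
        cost = estimate_tokens(message["content"])
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest turns first is only one possible policy. A real client would likely count tokens with the model's own tokenizer and might summarize older turns instead of discarding them.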
Providers (1)
| Provider | Input (per 1M tokens) | Output (per 1M tokens) | Type |
|---|---|---|---|
| Perplexity Labs | — | — | Serverless |