LLM Reference

Llama 3 Sonar Small 32K Chat

llama-3-sonar-small-32k-chat

Deprecated
Open Source

About

The Llama 3 Sonar Small 32K Chat model by Perplexity AI is a large language model optimized for chat applications. It stands out for its cost-effectiveness, speed, and enhanced performance compared to earlier Sonar models. This model supports a context window of 32,000 tokens, allowing it to sustain lengthy conversation histories, although some users note it might not fully utilize this memory capacity. It targets use in conversational AI environments such as chatbots and virtual assistants, providing coherent and contextually aware responses. Despite its relatively smaller size within the Llama 3 lineup, the model ensures a balance between performance and resource efficiency. However, like other LLMs, it can sometimes deliver inaccurate or outdated information, making independent verification essential.

Llama 3 Sonar Small 32K Chat has a 32K-token context window. Input and output tokens are both priced at $0.20 per 1M.
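At a flat $0.20 per 1M tokens in each direction, estimating the cost of a request is simple arithmetic. A minimal sketch (the function name and defaults are illustrative, not part of any official SDK):

```python
def cost_usd(input_tokens: int, output_tokens: int,
             input_rate: float = 0.20, output_rate: float = 0.20) -> float:
    """Estimated request cost in USD, with rates given per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# e.g. a prompt filling the full 32K-token context, plus a 1K-token reply:
print(round(cost_usd(32_000, 1_000), 6))  # → 0.0066
```

So even a maximally long conversation history costs well under a cent per request at these rates.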

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution

Providers (1)

Provider | Input (per 1M) | Output (per 1M) | Type
Perplexity Labs | $0.20 | $0.20 | Serverless
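Perplexity served this model through an OpenAI-compatible chat completions API. A minimal sketch of the request body (the endpoint URL, field names, and system prompt here are assumptions based on that convention, not taken from this page):

```python
import json

def build_chat_request(user_message: str,
                       model: str = "llama-3-sonar-small-32k-chat",
                       max_tokens: int = 512) -> dict:
    # Request body in the OpenAI-compatible chat-completions shape
    # used by Perplexity's serverless API for the Sonar chat models.
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [
            {"role": "system", "content": "Be concise and accurate."},
            {"role": "user", "content": user_message},
        ],
    }

# POST this as JSON to https://api.perplexity.ai/chat/completions
# with an "Authorization: Bearer <API key>" header.
payload = build_chat_request("Summarize the Llama 3 Sonar model family.")
print(json.dumps(payload, indent=2))
```

Because the shape matches the OpenAI chat format, existing OpenAI client libraries could typically be pointed at Perplexity's base URL without code changes.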

Specifications

Family: Sonar
Released: 2024-05-05
Parameters: 8B
Context: 32K
Architecture: Decoder Only
Knowledge cutoff: 2024-03
Specialization: general
Training: finetuned

Created by

Perplexity AI: developing AI for complex problem-solving.

San Francisco, California, United States
Founded 2022
