Vicuna Models by LMSYS Org
About
The Vicuna family of large language models (LLMs), developed by LMSYS, consists of open-source chat assistants fine-tuned from LLaMA and Llama 2 on user-shared conversations collected from ShareGPT [1][4][5]. The models are designed to give detailed, well-structured responses comparable to those of commercial assistants such as ChatGPT and Google Bard, and they come in 7B and 13B parameter sizes with context windows of up to 16k tokens [1]. They are intended primarily for research: the code, weights, and demos are available under a non-commercial license, which keeps them accessible for experimentation and development [4]. Initial evaluations found the models reaching about 90% of ChatGPT's performance, prompting ongoing refinement [1].
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Vicuna 13B 16K | Use when the workload needs 16k context, 13B parameters, and structured outputs. | 2023-10 | 16k context, 13B parameters, structured outputs | Current |
| Vicuna 13B | Use when the workload needs 2k context, 13B parameters, and structured outputs. | 2023-10 | 2k context, 13B parameters, structured outputs | Current |
| Vicuna 7B 16K | Use when the workload needs 16k context, 7B parameters, and structured outputs. | 2023-10 | 16k context, 7B parameters, structured outputs | Current |
| Vicuna 7B | Use when the workload needs 2k context, 7B parameters, and structured outputs. | 2023-10 | 2k context, 7B parameters, structured outputs | Current |
| Vicuna 13B V1.5 16K | Use when the workload needs 16k context and 13B parameters. | 2023-10 | 16k context, 13B parameters | Current |
| Vicuna 13B V1.5 | Use when the workload needs 2k context, 13B parameters, and structured outputs. | 2023-10 | 2k context, 13B parameters, structured outputs | Current |
| Vicuna 7B V1.5 16K | Use when the workload needs 16k context and 7B parameters. | 2023-10 | 16k context, 7B parameters | Current |
| Vicuna 7B V1.5 | Use when the workload needs 2k context, 7B parameters, and structured outputs. | 2023-10 | 2k context, 7B parameters, structured outputs | Current |
Release Timeline
1 release group
Specifications (8 models)
| Model | Released | Context | Parameters | Structured Outputs |
|---|---|---|---|---|
| Vicuna 13B 16K | 2023-10 | 16k | 13B | Yes |
| Vicuna 13B | 2023-10 | 2k | 13B | Yes |
| Vicuna 7B 16K | 2023-10 | 16k | 7B | Yes |
| Vicuna 7B | 2023-10 | 2k | 7B | Yes |
| Vicuna 13B V1.5 16K | 2023-10 | 16k | 13B | No |
| Vicuna 13B V1.5 | 2023-10 | 2k | 13B | Yes |
| Vicuna 7B V1.5 16K | 2023-10 | 16k | 7B | No |
| Vicuna 7B V1.5 | 2023-10 | 2k | 7B | Yes |
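The specifications table can be read as a simple filter: keep the variants whose context window, parameter count, and structured-output support fit the workload. The sketch below is one hedged way to express that in Python. The `Variant` class, the `VARIANTS` list, and the `pick_variants` helper are hypothetical names introduced for illustration; the values are copied from the table above and are not queried from any provider API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    """One row of the specifications table (hypothetical helper type)."""
    name: str
    context_k: int            # context window, in thousands of tokens
    params_b: int             # parameter count, in billions
    structured_outputs: bool  # "Structured Outputs" column


# Values transcribed from the specifications table, in table order.
VARIANTS = [
    Variant("Vicuna 13B 16K", 16, 13, True),
    Variant("Vicuna 13B", 2, 13, True),
    Variant("Vicuna 7B 16K", 16, 7, True),
    Variant("Vicuna 7B", 2, 7, True),
    Variant("Vicuna 13B V1.5 16K", 16, 13, False),
    Variant("Vicuna 13B V1.5", 2, 13, True),
    Variant("Vicuna 7B V1.5 16K", 16, 7, False),
    Variant("Vicuna 7B V1.5", 2, 7, True),
]


def pick_variants(
    min_context_k: int,
    max_params_b: Optional[int] = None,
    need_structured: bool = False,
) -> List[str]:
    """Return the names of variants that satisfy the given constraints."""
    matches = []
    for v in VARIANTS:
        if v.context_k < min_context_k:
            continue  # context window too small
        if max_params_b is not None and v.params_b > max_params_b:
            continue  # model too large for the budget
        if need_structured and not v.structured_outputs:
            continue  # table lists no structured-output support
        matches.append(v.name)
    return matches


if __name__ == "__main__":
    # Long-context workloads that also need structured outputs.
    print(pick_variants(min_context_k=16, need_structured=True))
```

With the table as listed, asking for 16k context plus structured outputs leaves only the two non-V1.5 16K variants, since the table marks both V1.5 16K models as "No" in the Structured Outputs column.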
Available From (3 providers)
Pricing
| Model | Provider | Input (USD / 1M tokens) | Output (USD / 1M tokens) | Type |
|---|---|---|---|---|
| Vicuna 13B | Replicate API | $0.10 | $0.50 | Serverless |
| Vicuna 7B V1.5 | Together AI | $0.20 | $0.20 | Serverless |
| Vicuna 13B V1.5 | Together AI | $0.30 | $0.30 | Serverless |
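Because input and output tokens are priced separately per million tokens, the cost of a workload is a weighted sum of the two. The following is a minimal sketch of that arithmetic, assuming the listed prices are in US dollars and are still current. The `PRICES` dictionary and the `estimate_cost` function are hypothetical names for illustration; actual provider billing may round or meter differently.

```python
from typing import Dict, Tuple

# (model, provider) -> (input price, output price), USD per 1M tokens.
# Values transcribed from the pricing table; assumed, not fetched live.
PRICES: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("Vicuna 13B", "Replicate API"): (0.10, 0.50),
    ("Vicuna 7B V1.5", "Together AI"): (0.20, 0.20),
    ("Vicuna 13B V1.5", "Together AI"): (0.30, 0.30),
}


def estimate_cost(
    model: str, provider: str, input_tokens: int, output_tokens: int
) -> float:
    """Estimate the USD cost of a workload from per-million-token prices."""
    in_price, out_price = PRICES[(model, provider)]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Compare all listed offers for 2M input tokens and 1M output tokens.
    for (model, provider) in PRICES:
        cost = estimate_cost(model, provider, 2_000_000, 1_000_000)
        print(f"{model} via {provider}: ${cost:.2f}")
```

Note that the lowest input price does not always give the lowest total: for an output-heavy workload, the higher output price of Vicuna 13B on Replicate API can outweigh its cheaper input rate.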
Frequently Asked Questions
- What is Vicuna used for?
- Vicuna is used for structured outputs, coding, and math-heavy prompts. The family description and listed model capabilities point to those workloads as the best fit.
- How does Vicuna compare to MOSS-Audio?
- Vicuna by LMSYS Org is strongest where you need structured outputs, while MOSS-Audio by MOSI Intelligence is the closest related family to consider for multimodal workloads. Vicuna has 8 listed variants and reaches up to 16k context, so compare the specifications and pricing tables before choosing a production model.
- Which Vicuna model should I use?
- For the lowest listed input price, start with Vicuna 13B through Replicate API at $0.10 per 1M input tokens. For the most capable local option, evaluate Vicuna 13B 16K, which offers 16k context and structured outputs.




