Aya 23 35B
About
Aya 23 35B is a multilingual large language model developed by Cohere For AI, with 35 billion parameters. It targets 23 languages and supports tasks such as text generation, translation, summarization, and question answering. The model uses an optimized decoder-only transformer architecture with parallel attention, SwiGLU activations, and rotary positional embeddings, and it tokenizes input with a BPE tokenizer with a 256k vocabulary, generating output one token at a time.

Training included instruction fine-tuning on a variety of data sources, yielding strong performance in the supported languages. However, the model does not cover the full spectrum of global languages, and its 8192-token context length may limit some applications relative to other LLMs. Despite occasional unexpected responses, Aya 23 35B remains a capable tool for applications such as content creation and language learning.
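To illustrate how a byte-pair-encoding tokenizer like the one described above builds merged subword units, here is a minimal, generic sketch of a single BPE merge step on a toy corpus. This is an educational example, not Cohere's actual tokenizer or vocabulary:

```python
from collections import Counter

def count_pairs(words):
    """Count adjacent symbol pairs across a corpus.

    `words` maps a tuple of symbols (initially characters) to its
    frequency in the corpus.
    """
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])  # fuse the pair
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Toy corpus: "low" appears 5 times, "lower" appears 2 times.
corpus = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2}
best = count_pairs(corpus).most_common(1)[0][0]  # most frequent adjacent pair
corpus = merge_pair(corpus, best)
```

A production tokenizer repeats this merge loop until the vocabulary reaches its target size (256k entries in Aya 23's case) and operates on bytes rather than characters.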
Providers (1)
| Provider | Input (per 1M) | Output (per 1M) | Type |
|---|---|---|---|
| Cohere API | — | — | Serverless |
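A minimal sketch of reaching the model through the Cohere Chat API over plain HTTP. The model identifier `c4ai-aya-23-35b` and the `v1/chat` endpoint shape are assumptions here; verify both against Cohere's current API reference before use:

```python
import json
import urllib.request

API_URL = "https://api.cohere.com/v1/chat"  # assumed v1 Chat endpoint
MODEL_ID = "c4ai-aya-23-35b"                # assumed model identifier

def build_chat_request(message, api_key, model=MODEL_ID):
    """Build a POST request for a single-turn chat completion."""
    payload = {"model": model, "message": message}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("Translate 'good morning' to Turkish.",
                         api_key="YOUR_API_KEY")
# urllib.request.urlopen(req)  # sending requires a valid Cohere API key
```

Cohere's official SDKs wrap this same endpoint and are usually the more convenient choice in practice.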