LLM Reference

Aya 23 35B

About

Aya 23 35B is a multilingual large language model developed by Cohere For AI, with 35 billion parameters. It is designed for advanced multilingual use across 23 languages, supporting tasks such as text generation, translation, summarization, and question answering. The model uses an optimized decoder-only transformer architecture with innovations such as parallel attention, SwiGLU activation, and rotary positional embeddings, together with a BPE tokenizer with a 256k vocabulary that splits input into subword tokens. Training included instruction fine-tuning on a variety of data sources, yielding strong performance in the supported languages. It does not, however, cover the full spectrum of global languages, and its context length of 8192 tokens is shorter than that of some other LLMs, which can limit certain applications. Despite occasional unexpected responses, Aya 23 35B remains a capable model for applications such as content creation and language learning.
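A minimal sketch of calling the model through the Cohere API is shown below, using only the Python standard library. The model identifier "c4ai-aya-23" and the exact endpoint shape are assumptions; check Cohere's API reference for the current values. Building the request payload in a separate function keeps the network call isolated and easy to inspect:

```python
import json
import urllib.request

# Assumed model ID for Aya 23 35B on the Cohere API; verify against
# Cohere's documentation before use.
AYA_MODEL_ID = "c4ai-aya-23"

def build_chat_payload(message: str, model: str = AYA_MODEL_ID) -> dict:
    """Assemble the JSON body for a Cohere-style single-turn chat request."""
    return {
        "model": model,
        "message": message,      # the user's prompt
        "temperature": 0.3,      # lower temperature suits translation tasks
    }

def send_chat(payload: dict, api_key: str) -> dict:
    """POST the payload to the (assumed) chat endpoint. Needs a real API key."""
    req = urllib.request.Request(
        "https://api.cohere.com/v1/chat",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_chat_payload("Translate to Turkish: Good morning!")
```

The payload builder runs without credentials; only `send_chat` requires a key and network access.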

Capabilities

Multimodal
Function Calling
Tool Use
JSON Mode

Providers (1)

Provider      Input (per 1M)   Output (per 1M)   Type
Cohere API                                       Serverless

Specifications

Family: Aya
Parameters: 35B
Architecture: Decoder-only
Specialization: General
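Because the context window is 8192 tokens, long prompts should be checked before they are sent. The sketch below uses a rough 4-characters-per-token heuristic for English text (an assumption; exact counts require the model's own 256k-vocabulary BPE tokenizer):

```python
CONTEXT_LENGTH = 8192  # Aya 23 35B's context window, in tokens

def fits_context(prompt: str, max_output_tokens: int = 512,
                 chars_per_token: float = 4.0) -> bool:
    """Rough pre-flight check that the prompt plus reserved output
    tokens fits inside the model's context window.

    chars_per_token is a heuristic, not the tokenizer's true ratio.
    """
    est_prompt_tokens = len(prompt) / chars_per_token
    return est_prompt_tokens + max_output_tokens <= CONTEXT_LENGTH
```

For precise budgeting, tokenize with the model's actual tokenizer instead of the heuristic.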