LLM Reference

Aya 23 8B

About

Aya-23-8B is a multilingual large language model developed by Cohere For AI, with 8 billion parameters. As an instruction-fine-tuned model, it is adept at following instructions and is optimized for text generation and understanding. It uses a decoder-only Transformer architecture with efficiency enhancements such as parallel attention and feed-forward layers. The model supports 23 languages, including Arabic, Chinese, English, and French, and handles tasks such as machine translation, chatbot interactions, and text summarization. Performance may vary across languages, particularly lower-resource ones, and the context length is limited to 8192 tokens. Training drew on diverse data sources, including human annotations and synthetic datasets, to strengthen its multilingual proficiency.
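One practical consequence of the 8192-token context limit is that a client must budget prompt tokens against the tokens it wants generated. The sketch below illustrates that bookkeeping; the helper names (`fits_context`, `truncate_to_fit`) are hypothetical, and the whitespace tokenizer is a crude stand-in (an assumption) for the model's real tokenizer, which would be used in practice to count tokens.

```python
# Sketch: budgeting prompt tokens against Aya-23-8B's documented
# 8192-token context window before sending a generation request.

AYA_23_CONTEXT_LIMIT = 8192  # tokens, per the model's documented context length

def count_tokens(text: str) -> int:
    """Crude whitespace stand-in for the model's real tokenizer (assumption)."""
    return len(text.split())

def fits_context(prompt: str, max_new_tokens: int,
                 limit: int = AYA_23_CONTEXT_LIMIT) -> bool:
    """True if the prompt plus the requested generation budget fits the window."""
    return count_tokens(prompt) + max_new_tokens <= limit

def truncate_to_fit(prompt: str, max_new_tokens: int,
                    limit: int = AYA_23_CONTEXT_LIMIT) -> str:
    """Drop the oldest (leading) tokens so prompt + generation fits the window."""
    words = prompt.split()
    budget = limit - max_new_tokens
    return " ".join(words[-budget:]) if len(words) > budget else prompt

short = "Translate to French: Hello, world."
print(fits_context(short, max_new_tokens=256))  # prints True
```

Truncating from the front keeps the most recent turns of a chat, which is the usual choice for chatbot interactions; a summarization workload might instead chunk the input.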

Capabilities

Multimodal
Function Calling
Tool Use
JSON Mode

Specifications

Family: Aya
Parameters: 8B
Architecture: Decoder Only
Specialization: General