LLM Reference

Aya 23 8B

About

Aya-23-8B is a multilingual large language model developed by Cohere For AI, with 8 billion parameters. As an instruction-fine-tuned model, it follows instructions well and is optimized for text generation and understanding. It uses a decoder-only Transformer architecture with efficiency enhancements such as parallel attention and feed-forward layers. The model supports 23 languages, including Arabic, Chinese, English, and French, and handles tasks such as machine translation, chatbot interactions, and text summarization. Its performance may vary across languages, particularly lower-resource ones, and its context length is limited to 8192 tokens. Training drew on diverse data sources, including human annotations and synthetic datasets, to strengthen its multilingual proficiency. A minimal usage sketch follows.
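
Because the model is instruction-tuned, prompts should be formatted with its chat template. The sketch below shows one plausible way to generate text locally, assuming the public Hugging Face checkpoint CohereForAI/aya-23-8B, the transformers and accelerate libraries, and a GPU with enough memory; it is an illustration, not an official quickstart.

```python
# Minimal sketch: text generation with Aya-23-8B via Hugging Face transformers.
# Assumes the checkpoint "CohereForAI/aya-23-8B"; device_map="auto" needs accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/aya-23-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Instruction-tuned models expect a chat-formatted prompt, so we use the
# tokenizer's chat template rather than raw text.
messages = [
    {"role": "user", "content": "Translate to French: The weather is nice today."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Keeping the prompt within the 8192-token context limit (prompt plus generated tokens) is the caller's responsibility; longer inputs must be truncated or summarized first.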

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Benchmark Scores (4)

Benchmark                                 Score  Version  Source
Google-Proof Q&A                          45.2   diamond  Open LLM Leaderboard
HellaSwag                                 87.3   10-shot  Open LLM Leaderboard
HumanEval                                 68.5   pass@1   Open LLM Leaderboard
Massive Multitask Language Understanding  72.8   5-shot   Open LLM Leaderboard
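
The HumanEval row reports pass@1, the probability that a single sampled completion passes the unit tests. As a hedged illustration of how such scores are typically computed (not taken from this page), the sketch below implements the standard unbiased pass@k estimator from Chen et al. (2021); the sample counts in the example are made up.

```python
# Sketch of the unbiased pass@k estimator (Chen et al., 2021).
# Given n sampled completions per problem, of which c pass the tests,
# it estimates the chance that at least one of k samples passes.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # too few failures to fill k samples: guaranteed pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative numbers only: 200 samples per problem, 137 correct.
# For k=1 the estimator reduces to c/n, i.e. 0.685 -> a 68.5 score.
print(pass_at_k(200, 137, 1))  # 0.685
```

The other rows use few-shot prompting (e.g., 5-shot MMLU gives the model five worked examples before each question), so the scores are not directly comparable across evaluation protocols.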

Specifications

Family: Aya
Released: 2024-02-21
Parameters: 8B
Architecture: Decoder-only
Specialization: General
Training: Fine-tuning

Created by

Cohere For AI
Empowering developers with advanced language AI.

Toronto, Ontario, Canada
Founded 2022