LLM Reference

Persimmon 8B

About

Persimmon-8B is an open-source large language model from Adept AI with approximately 8 billion parameters. It is a decoder-only transformer that uses squared ReLU activations and rotary positional encodings. Its context window is 16K tokens, four times that of LLaMA 2 and eight times that of GPT-3. It was trained on 737 billion tokens of mixed text and code, using an improved version of FlashAttention to handle long sequences efficiently. Although it was trained on less data than LLaMA 2, it achieves comparable performance on several benchmarks. Released under the Apache 2.0 license, Persimmon-8B includes unused embeddings that leave room for multimodal extensions, and it supports fast, flexible inference. It still needs further fine-tuning to mitigate bias.
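As a rough illustration of the two architectural pieces named above, here is a minimal pure-Python sketch of a squared ReLU activation and a rotary position rotation. It is not Adept's implementation. The function names, the pairing of adjacent vector elements, and the base of 10000 are assumptions chosen for readability and are not confirmed settings for Persimmon-8B.

```python
import math


def squared_relu(x: float) -> float:
    """Squared ReLU: zero for negative inputs, x squared otherwise."""
    return max(0.0, x) ** 2


def rotary_embed(vec, position, base=10000.0):
    """Apply a rotary position rotation to one vector.

    Assumption: adjacent elements (vec[2i], vec[2i+1]) form a pair, and
    pair i is rotated by the angle position * base ** (-2i / d).
    Real implementations differ in pairing layout and in the base value.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(d // 2):
        angle = position * base ** (-2.0 * i / d)
        c, s = math.cos(angle), math.sin(angle)
        x0, x1 = vec[2 * i], vec[2 * i + 1]
        # Standard 2D rotation of the pair (x0, x1) by `angle`.
        out.extend([x0 * c - x1 * s, x0 * s + x1 * c])
    return out
```

Because each pair is only rotated, the vector's length is unchanged and position 0 leaves the vector as it was. The position information shows up in the relative angle between rotated query and key vectors.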

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Family: Persimmon
Released: 2023-09-16
Parameters: 8B
Architecture: Decoder Only
Specialization: general
Training: finetuning

Created by

Adept AI, an AI lab that automates software processes.

San Francisco, California, United States
Founded 2022