LLM Reference

DeciLM 6B

About

DeciLM 6B is a decoder-only large language model with 5.7 billion parameters. Its architecture uses variable Grouped-Query Attention (GQA): instead of a fixed number of key-value heads, the group size varies from layer to layer, balancing computational efficiency against output quality on a per-layer basis. The architecture was generated with AutoNAC, Deci's Neural Architecture Search engine, which the company credits with faster training and improved performance. The model supports a 4,096-token context window and was trained on the SlimPajama dataset. A fine-tuned variant, DeciLM 6B-Instruct, uses LoRA for instruction following on the OpenOrca dataset. Deci reports up to 15 times the throughput of Llama 2 7B at comparable quality, though this claim requires independent verification. The model is available under the Llama 2 Community License, with an extension covering hosting services, and permits both commercial and research use.
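To make the GQA idea concrete, here is a minimal numpy sketch (not DeciLM's actual implementation) in which several query heads share each key/value head; in DeciLM the group size would additionally vary per layer, but a single layer with one group size illustrates the mechanism:

```python
import numpy as np

def grouped_query_attention(q, k, v, n_q_heads, n_kv_heads):
    """Toy GQA for one sequence: n_q_heads query heads share
    n_kv_heads key/value heads (n_q_heads must divide evenly).
    Shapes: q is (n_q_heads, seq, d); k and v are (n_kv_heads, seq, d)."""
    group = n_q_heads // n_kv_heads
    # Each K/V head is reused by `group` consecutive query heads.
    k = np.repeat(k, group, axis=0)          # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)   # (n_q_heads, seq, seq)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                        # (n_q_heads, seq, d)

# Example: 8 query heads attend using only 2 shared K/V heads,
# so the K/V cache is 4x smaller than full multi-head attention.
rng = np.random.default_rng(0)
seq, d = 4, 8
q = rng.standard_normal((8, seq, d))
k = rng.standard_normal((2, seq, d))
v = rng.standard_normal((2, seq, d))
out = grouped_query_attention(q, k, v, n_q_heads=8, n_kv_heads=2)
print(out.shape)  # (8, 4, 8)
```

The efficiency gain comes from the smaller key/value cache: with 2 K/V heads instead of 8, memory traffic during autoregressive decoding drops by the same factor, which is the lever a variable-GQA architecture tunes independently at each layer.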

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Family: DeciLM
Released: 2024-01-16
Parameters: 6B
Architecture: Decoder Only
Specialization: general
Training: fine-tuning

Created by

Deci
Automating neural architecture design

Tel Aviv, Israel
Founded 2019