LLM Reference

OLMo 7B Twin-2T

About

OLMo 7B Twin-2T is an open-source large language model from the Allen Institute for AI (AI2). It uses a decoder-only transformer architecture with modifications for training stability and performance: non-parametric layer normalization, SwiGLU activation functions, and rotary positional embeddings (RoPE) for better sequence handling. The model has 32 layers and 32 attention heads, was trained on approximately 2 trillion tokens, and supports a context length of 2048 tokens. It is notable for its transparency: the training data, code, and evaluations are all publicly available, enabling reproducible and collaborative research. The model performs well across a range of NLP tasks and can be fine-tuned, and its developers advocate responsible use to mitigate risks of bias and inaccuracy.
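The SwiGLU activation mentioned above gates the feed-forward layer with a SiLU (Swish) nonlinearity. A minimal NumPy sketch follows; the dimensions are illustrative only, not OLMo's actual hidden sizes, and the weight names (`W_gate`, `W_up`) are hypothetical:

```python
import numpy as np

def silu(x):
    # SiLU / Swish activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu(x, W_gate, W_up):
    # SwiGLU: SiLU(x @ W_gate) elementwise-multiplied with (x @ W_up).
    # The gated product is what a transformer block would then project back down.
    return silu(x @ W_gate) * (x @ W_up)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))        # 2 token vectors, toy model dim 8
W_gate = rng.standard_normal((8, 16))  # toy feed-forward dim 16
W_up = rng.standard_normal((8, 16))
print(swiglu(x, W_gate, W_up).shape)   # (2, 16)
```

Compared with a plain ReLU feed-forward layer, the gating path lets the network modulate each hidden unit smoothly, which is one of the stability-oriented choices noted in the description above.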

Capabilities

Multimodal
Function Calling
Tool Use
JSON Mode

Providers (1)

Provider          Input (per 1M)   Output (per 1M)   Type
Together AI API   $0.20            $0.20             Serverless
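The per-1M-token rates above translate directly into a request cost. A small sketch, using the listed Together AI rate of $0.20 per million tokens for both input and output (the function name and parameters are illustrative, not part of any provider API):

```python
def request_cost_usd(input_tokens, output_tokens,
                     price_in_per_m=0.20, price_out_per_m=0.20):
    # Flat per-1M-token pricing: defaults are the listed $0.20 / $0.20 rate.
    return (input_tokens / 1e6) * price_in_per_m + (output_tokens / 1e6) * price_out_per_m

# e.g. a request with 500k prompt tokens and 100k completion tokens
print(request_cost_usd(500_000, 100_000))  # ≈ $0.12
```

Because input and output are priced identically here, the total simplifies to $0.20 per million tokens processed in either direction.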

Specifications

Family: OLMo
Parameters: 7B
Architecture: Decoder-only
Specialization: General