LLM Reference

StripedHyena Hessian 7B

About

StripedHyena-Hessian-7B (SH 7B) is a large language model developed by Together. It uses a hybrid architecture that combines multi-head, grouped-query attention with gated convolutions arranged in Hyena blocks, setting it apart from traditional Transformer models in both performance and efficiency. Designed for long-context processing, it draws on state-space model (SSM) formulations for efficient, low-memory inference, and it performs well on tasks such as multi-document question answering and long-form text summarization. The model supports sequences of up to 32k tokens with fast decoding and high throughput, and a companion variant, SH-N 7B, is tailored for chat applications. Known limitations: it requires custom kernels when run outside its hosted playground, it is a mixed-precision model, and ongoing research is needed to explore further improvements.
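To make the "gated convolution" idea concrete, here is a minimal, purely illustrative sketch of the pattern used in Hyena-style blocks: one branch of the input passes through a long causal convolution, and a second branch gates the result elementwise. All names, shapes, and values are hypothetical; this is not SH 7B's actual implementation.

```python
def causal_conv1d(x, kernel):
    """Causal 1-D convolution: output[t] depends only on x[:t + 1]."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i in range(len(kernel)):
            if t - i >= 0:
                acc += kernel[i] * x[t - i]
        out.append(acc)
    return out

def gated_conv_block(x, kernel, gate):
    """Elementwise gate applied to the convolution output (Hyena-style gating)."""
    conv = causal_conv1d(x, kernel)
    return [g * c for g, c in zip(gate, conv)]

# Toy sequence of length 4, a 2-tap kernel, and a fixed gate vector.
x = [1.0, 2.0, 3.0, 4.0]
y = gated_conv_block(x, kernel=[0.5, 0.5], gate=[1.0, 1.0, 0.0, 1.0])
# → [0.5, 1.5, 0.0, 3.5]
```

In the real architecture the convolution filters are long and implicitly parameterized, which is what enables sub-quadratic scaling with sequence length compared to full attention.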

Capabilities

Multimodal · Function Calling · Tool Use · JSON Mode

Providers (1)

Provider: Together AI API
Input (per 1M tokens): $0.20
Output (per 1M tokens): $0.20
Type: Serverless
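Given the listed Together AI pricing of $0.20 per 1M tokens for both input and output, the cost of a request can be estimated as follows. The helper function name is our own; only the per-million rates come from the table above.

```python
# Per-1M-token rates from the provider table above (USD).
INPUT_PER_M = 0.20
OUTPUT_PER_M = 0.20

def request_cost(input_tokens, output_tokens):
    """Estimated USD cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PER_M

# Example: a long-context request with 30,000 input tokens
# and 1,000 output tokens costs about $0.0062.
cost = request_cost(30_000, 1_000)
```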

Specifications

Released: 2023-12-08
Parameters: 7B
Architecture: Decoder Only
Specialization: General
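Since the model is listed as available through the Together AI API (serverless), a request can be sketched as an OpenAI-style chat completion. The endpoint and model identifier below are assumptions not confirmed by this page; check Together's API documentation for the exact values before use.

```python
import json

# Assumed endpoint and model ID for the chat-tuned variant (SH-N 7B);
# verify both against Together's current API documentation.
ENDPOINT = "https://api.together.xyz/v1/chat/completions"
payload = {
    "model": "togethercomputer/StripedHyena-Nous-7B",
    "messages": [
        {"role": "user", "content": "Summarize the following document."},
    ],
    "max_tokens": 256,
}

# Serialize the request body; sending it (e.g. with an HTTP client and a
# bearer token) is omitted here.
body = json.dumps(payload)
```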