LLM Reference

Athene Models by Nexusflow

Nexusflow | Llama 3 Community | Open weights
1 model | 2024 | Up to 4k ctx

Details

Researcher: Nexusflow
Commercial use: Commercial use with conditions
Models: 1
Released: 2024
Max context: 4k

About

The Athene family of LLMs comes from Nexusflow, a company focused on post-training optimization of large language models. Its flagship model, Athene-Llama3-70B (commonly known as Athene-70B), is an open-weights model fine-tuned from Meta AI's Llama-3-70B-Instruct using reinforcement learning from human feedback (RLHF) [5]. After this post-training, the model scores 77.8% on Arena-Hard-Auto, a benchmark that correlates strongly with human evaluation on Chatbot Arena [5]. As a result, Athene-70B ranks among the top-performing open-weights models and is reported to be on par with leading proprietary models [5]. Nexusflow's post-training mainly targeted the model's instruction following, reasoning, coding, creative writing, and multilingual abilities [5]. Other references also mention smaller Athene models and later versions referred to as Athene v2, v3, and v4 [3, 8, 11], though the information available here gives less detail about them.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

1 in view
Athene 70B | Current

Use when the workload needs 4k context and 70B parameters.

2024-07 | 4k context | 70B parameters

Release Timeline

1 release group
2024-07
1 current
Athene 70B
4k context | 70B parameters
Current

Specifications (1 model)

Athene model specifications comparison
Model | Released | Context | Parameters
Athene 70B | 2024-07 | 4k | 70B

Frequently Asked Questions

What is Athene used for?
Athene is used for coding, chatbot, and role-playing workloads. The family description and the listed model capabilities point to these as the best fit.
How does Athene compare to Starling?
Athene by Nexusflow is best suited to coding workloads, and Starling, also by Nexusflow, is the closest related family to compare for the same use. Athene has 1 listed variant with up to 4k context, while Starling reaches up to 8k context, so compare the specs and pricing tables before choosing a production model.
Which Athene model should I use?
If price is the main constraint, check the pricing table first, keeping in mind that the local data does not include complete provider pricing for Athene. For the most capable and most recent listed option, evaluate Athene 70B with 4k context.