LLM Reference

Merlinite Models by IBM Research

IBM Research · Apache 2.0
1 model · 2024 · Up to 32k ctx · From $0.6/1M input

About

The IBM Merlinite family, currently represented by Merlinite-7B, is a line of large language models developed for enterprise and research applications. Merlinite-7B is built on the Mistral-7B base model and aligned with IBM's LAB (Large-scale Alignment for chatBots) methodology. LAB combines taxonomy-driven data curation, synthetic data generation, and a two-phase training process with replay buffers. The goal is to integrate new knowledge and skills incrementally while avoiding catastrophic forgetting, the tendency of a fine-tuned model to lose what it learned earlier. This makes the model well suited to enterprise-specific use cases.

Merlinite-7B performs well across a variety of benchmarks, particularly in reading comprehension, knowledge retrieval, and logic tasks. Its synthetic training data is generated with Mixtral-8x7B-Instruct as the teacher model, and the taxonomy steers that generation toward a diverse, tailored knowledge base. This training method lets Merlinite compete with larger models while remaining efficient and adaptable.

With its focus on domain-specific alignment, Merlinite-7B targets enterprise applications that require high-context understanding, such as customer support, knowledge management, and technical documentation. The family is positioned for continued development as a solution for complex language-based tasks.
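The replay-buffer idea mentioned above can be illustrated with a minimal sketch. This is not IBM's LAB implementation: the helper name `mix_with_replay`, the `replay_fraction` parameter, and the example labels are all hypothetical, chosen only to show how a second training phase might blend new examples with a sample of earlier ones so that earlier material keeps appearing during later training.

```python
import random


def mix_with_replay(new_examples, replay_buffer, replay_fraction, seed=0):
    """Blend second-phase examples with a sample of earlier ones.

    Hypothetical helper, not part of LAB: replayed examples make up roughly
    `replay_fraction` of the returned list, capped by the buffer size.
    """
    if not 0 <= replay_fraction < 1:
        raise ValueError("replay_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_replay / (n_new + n_replay) = replay_fraction for n_replay.
    wanted = round(len(new_examples) * replay_fraction / (1 - replay_fraction))
    n_replay = min(wanted, len(replay_buffer))
    mixed = list(new_examples) + rng.sample(list(replay_buffer), n_replay)
    rng.shuffle(mixed)
    return mixed


# Placeholder data: labels are illustrative, not real training examples.
phase_one = [f"knowledge-{i}" for i in range(5)]  # examples seen earlier
phase_two = [f"skill-{i}" for i in range(8)]      # new examples
batch = mix_with_replay(phase_two, phase_one, replay_fraction=0.2)
```

With 8 new examples and a replay fraction of 0.2, the mixed batch holds 10 examples, 2 of them drawn from the earlier phase. A real pipeline would choose the fraction and sampling strategy empirically.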

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Use when the workload needs 32k context and 7B parameters.

2024-08 · 32k context · 7B parameters

Release Timeline

1 release group
2024-08 (1 current)
Merlinite 7B · 32k context · 7B parameters · Current

Specifications (1 model)

Merlinite model specifications comparison
Model         Released  Context  Parameters
Merlinite 7B  2024-08   32k      7B

Available From (1 provider)

Pricing

Merlinite model pricing by provider
Model         Provider     Input / 1M  Output / 1M  Type
Merlinite 7B  IBM watsonx  $0.6        $0.6         Serverless
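
As a rough illustration of how the per-million-token prices above translate into a bill, the sketch below estimates the cost of a workload from its token counts. The function name `estimate_cost` and the sample token counts are hypothetical. The two rates are taken from the listing above, and actual provider billing may differ (for example through rounding or minimum charges).

```python
from decimal import Decimal

# Listed rates for Merlinite 7B on IBM watsonx, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.6")
OUTPUT_PRICE_PER_M = Decimal("0.6")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for the given token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / TOKENS_PER_MILLION


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost(2_000_000, 500_000)
print(f"Estimated cost: ${cost}")
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding error.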

Frequently Asked Questions

What is Merlinite used for?
Merlinite is used mainly for chatbot and role-playing workloads. The family description and listed model capabilities point to those workloads as the best fit.

How does Merlinite compare to Granite 4?
Merlinite by IBM Research is strongest for chatbot and role-playing use cases, while Granite 4, also by IBM Research, is the closest related family to consider for audio. Merlinite has 1 listed variant with up to 32k context, while Granite 4 reaches up to 131k context. Compare the specs and pricing tables before choosing a production model.

Which Merlinite model should I use?
Merlinite 7B is the only listed variant. For the lowest listed input price, use it through IBM watsonx at $0.6/1M input tokens. For a local deployment, evaluate the same model, which is also the most capable and most recent choice, with 32k context.

Models (1)