Granite Guardian 4.1 8B
granite-guardian-4.1-8b
Open Source
About
IBM Granite Guardian 4.1 8B is a safety and risk-detection model fine-tuned from Granite 4.1 8B. It offers a hybrid thinking mode: detailed reasoning inside <think> tags, or a fast yes/no verdict. It detects harm, social bias, jailbreaking, violence, profanity, sexual content, unethical behavior, RAG hallucination (groundedness, context relevance, answer relevance), and function-calling hallucination in agentic workflows. It supports custom judging criteria (BYOC) and can act as a reward model for best-of-N selection. Reported benchmarks: out-of-distribution (OOD) safety F1 0.79, RAG hallucination average balanced accuracy (BAcc) 0.76, BFCL function-calling BAcc 0.79. Licensed under Apache 2.0.
Granite Guardian 4.1 8B has an 8K-token context window.
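The description above mentions two usage patterns: a verdict that may be preceded by <think> reasoning, and best-of-N selection. The Python sketch below illustrates how a caller might handle both. It is not taken from IBM's documentation: the exact output layout (an optional <think>...</think> span followed by a yes/no token) is an assumption inferred from the description, and `parse_guardian_output`, `best_of_n`, and the `risk_score` callable are hypothetical names introduced here for illustration.

```python
import re
from typing import Callable, Optional, Sequence, Tuple

# Assumed layout: optional <think>...</think> reasoning, then a yes/no verdict.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL | re.IGNORECASE)
_VERDICT_RE = re.compile(r"\b(yes|no)\b", re.IGNORECASE)


def parse_guardian_output(text: str) -> Tuple[Optional[str], Optional[bool]]:
    """Split raw model output into (reasoning, risk_detected).

    reasoning is None when no <think> span is present (fast mode);
    risk_detected is None when no yes/no verdict can be found.
    """
    match = _THINK_RE.search(text)
    reasoning = match.group(1).strip() if match else None
    # Look for the verdict only outside the reasoning span.
    remainder = _THINK_RE.sub("", text)
    verdict = _VERDICT_RE.search(remainder)
    risk = (verdict.group(1).lower() == "yes") if verdict else None
    return reasoning, risk


def best_of_n(candidates: Sequence[str],
              risk_score: Callable[[str], float]) -> str:
    """Return the candidate with the lowest risk score.

    risk_score is a hypothetical callable that would wrap a call to the
    guardian model and return a number where lower means safer.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    return min(candidates, key=risk_score)
```

In practice `risk_score` would wrap whatever inference endpoint serves the model. Because the context window is 8K tokens, the prompt, any retrieved context, and the response being judged must fit within that budget together.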
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution
Created by
Creating reliable and adaptable AI solutions
Armonk, New York, United States
Founded 1945