Falcon2 11B
About
Falcon2-11B is an 11-billion-parameter causal decoder-only large language model developed by the Technology Innovation Institute (TII). It is built on the Transformer architecture and uses grouped-query attention together with FlashAttention-2 for faster, more memory-efficient attention. Trained on more than 5 trillion tokens from the RefinedWeb dataset supplemented with curated corpora, it supports eleven languages, including English, German, and Spanish. Falcon2-11B performs strongly on multilingual and code-generation tasks, often outperforming comparably sized models such as Llama3-8B and Mistral-7B and matching Gemma-7B. Because of the nature of its training data, it may still exhibit biases and uneven performance across languages. The model is open-source, and its multimodal variant, Falcon2-11B-vlm, adds image-understanding capabilities.
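
As a concrete starting point, the sketch below shows one way to load the model for text generation with the Hugging Face transformers library. It assumes the hub id tiiuae/falcon-11B (check the hub for the exact published checkpoint name) and a GPU with enough memory for the bfloat16 weights; it is a minimal example, not the only supported setup.

```python
# Minimal sketch: load Falcon2-11B for text generation via transformers.
# Assumes the hub id "tiiuae/falcon-11B" and that `accelerate` is installed
# (required for device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-11B"  # assumed hub id for Falcon2-11B

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 11B weights manageable
    device_map="auto",           # place layers across available GPUs automatically
)

# Generate a short continuation from a prompt.
inputs = tokenizer("The Technology Innovation Institute is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```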