
ELYZA Japanese CodeLlama
About
The ELYZA Japanese CodeLlama family of large language models (LLMs), developed by ELYZA, Inc., is designed for Japanese language processing, with a particular focus on code generation and completion. Built on Meta's Code Llama architecture, the family extends the base model's Japanese language ability through continued pre-training on roughly 18 billion tokens of Japanese text drawn from datasets such as OSCAR and Wikipedia. It comprises the base model ELYZA-japanese-CodeLlama-7b and its instruction-tuned counterpart ELYZA-japanese-CodeLlama-7b-instruct, each with distinct performance characteristics. Both models have a vocabulary size of 32,016 and 6.27 billion parameters, and the instruction-tuned variant is further fine-tuned to follow user instructions across a variety of tasks. Both models are released under the Llama 2 Community License, which permits research and commercial use subject to the Acceptable Use Policy.
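
For orientation, the sketch below shows one plausible way to load and query the instruction-tuned model with the Hugging Face transformers library. The checkpoint name elyza/ELYZA-japanese-CodeLlama-7b-instruct, the Llama 2-style [INST]/<<SYS>> prompt format, and the generation settings are assumptions based on common Code Llama usage, not details stated in this section.

```python
# Minimal sketch: loading and prompting the instruction-tuned model.
# Assumptions: torch and transformers are installed, the checkpoint is
# published on Hugging Face as "elyza/ELYZA-japanese-CodeLlama-7b-instruct",
# and the model follows the Llama 2 chat prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "elyza/ELYZA-japanese-CodeLlama-7b-instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on one GPU
    device_map="auto",
)

# Assumed Llama 2 chat template: system prompt wrapped in <<SYS>> tags,
# user request wrapped in [INST] ... [/INST].
prompt = (
    "[INST] <<SYS>>\nYou are a helpful coding assistant. Answer in Japanese.\n<</SYS>>\n\n"
    "Write a Python function that returns all prime numbers below n "
    "using the Sieve of Eratosthenes. [/INST]"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the echoed prompt.
generated = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

The same loading code should work for the base ELYZA-japanese-CodeLlama-7b checkpoint; only the prompt format would differ, since the base model is a plain completion model rather than an instruction follower.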