LLM Reference

BERT Large

About

BERT, or Bidirectional Encoder Representations from Transformers, is a sophisticated large language model developed by Google AI in 2018. It utilizes a transformer architecture based on self-attention mechanisms, enabling it to process text bidirectionally by considering context from both preceding and succeeding words. This capability allows BERT to capture complex language structures and word relationships more effectively than its predecessors. BERT's architecture primarily comprises encoder layers that convert input text into contextualized representations for various tasks. Pre-trained on extensive datasets including BooksCorpus and English Wikipedia, it leverages masked language modeling and next sentence prediction during training. BERT can be fine-tuned for specific NLP tasks like question answering, text classification, and named entity recognition. Initially, BERT was released in two model sizes: BERT-Base with 110 million parameters and BERT-Large with 340 million. Over time, many variants and adaptations have emerged to cater to specialized applications.
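The masked language modeling objective mentioned above corrupts the input before the model sees it: each token is selected with some probability (15% in the original paper), and a selected token is replaced by a [MASK] token 80% of the time, by a random token 10% of the time, and left unchanged 10% of the time. A minimal sketch of that corruption step in plain Python, with an illustrative toy vocabulary (the real procedure operates on WordPiece token IDs, not strings):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, vocab=None, seed=0):
    """BERT-style masked language modeling corruption (sketch).

    Each token is selected with probability mask_prob; a selected token is
    replaced by [MASK] 80% of the time, by a random vocabulary token 10% of
    the time, and kept unchanged 10% of the time. Returns the corrupted
    sequence and the indices the model must predict.
    """
    rng = random.Random(seed)
    vocab = vocab or ["the", "cat", "sat", "mat", "dog"]  # toy placeholder
    corrupted, targets = list(tokens), []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets.append(i)
            r = rng.random()
            if r < 0.8:
                corrupted[i] = "[MASK]"          # 80%: mask out
            elif r < 0.9:
                corrupted[i] = rng.choice(vocab)  # 10%: random token
            # else: 10%: keep the original token unchanged
    return corrupted, targets
```

During pre-training, the model is trained to recover the original tokens at the returned target positions from the corrupted sequence; the bidirectional encoder can use context on both sides of each masked position to do so.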

Capabilities

Multimodal · Function Calling · Tool Use · JSON Mode

Specifications

Family: BERT
Parameters: 340M
Architecture: Encoder Only
Specialization: general