LLM Reference

About

BERT, short for Bidirectional Encoder Representations from Transformers, is a prominent family of language models originally introduced by Google AI in 2018. These models use the transformer architecture to process text bidirectionally, building an understanding of context from both the preceding and the following words in a sentence. Pre-training techniques such as masked language modeling (MLM) and next sentence prediction (NSP) gave BERT superior performance on a wide range of natural language processing (NLP) tasks compared with earlier, unidirectional models. BERT was initially released in two configurations: BERTBASE with 110 million parameters and BERTLARGE with 340 million parameters, both trained on large corpora such as BookCorpus and English Wikipedia. The family has since expanded to include multilingual versions and smaller distilled models such as DistilBERT and TinyBERT, which target specific tasks and resource constraints. This adaptability has made BERT integral to applications such as question answering, text classification, and named entity recognition.
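The masked language modeling objective mentioned above corrupts the input before training: in the original BERT recipe, roughly 15% of tokens are selected, and each selected token is replaced by a [MASK] token 80% of the time, by a random vocabulary token 10% of the time, and left unchanged 10% of the time, with the model trained to recover the originals. As a rough illustration (the function name, toy vocabulary, and interface here are hypothetical, not from any BERT library), the corruption step can be sketched as:

```python
import random

MASK = "[MASK]"
VOCAB = ["the", "cat", "dog", "sat", "on", "mat"]  # toy vocabulary for the sketch

def mask_tokens(tokens, rng, mask_prob=0.15):
    """BERT-style MLM corruption: select each token with probability
    mask_prob; replace a selected token with [MASK] (80%), a random
    vocabulary token (10%), or keep it unchanged (10%).
    Returns the corrupted sequence and a dict of prediction targets
    (position -> original token) for the selected positions."""
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok  # the model must predict the original token here
            r = rng.random()
            if r < 0.8:
                corrupted.append(MASK)
            elif r < 0.9:
                corrupted.append(rng.choice(VOCAB))
            else:
                corrupted.append(tok)
        else:
            corrupted.append(tok)
    return corrupted, targets

# Example: corrupt a short sentence with a seeded RNG for reproducibility.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
corrupted, targets = mask_tokens(tokens, random.Random(0))
```

Keeping 10% of selected tokens unchanged forces the model to maintain a useful representation of every input position, since any token, masked or not, may be a prediction target.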

Models (2)

Details

Researcher: Google DeepMind
Models: 2