LLM Reference

BERT Large

About

BERT, or Bidirectional Encoder Representations from Transformers, is a sophisticated large language model developed by Google AI in 2018. It utilizes a transformer architecture based on self-attention mechanisms, enabling it to process text bidirectionally by considering context from both preceding and succeeding words. This capability allows BERT to capture complex language structures and word relationships more effectively than its predecessors. BERT's architecture primarily comprises encoder layers that convert input text into contextualized representations for various tasks. Pre-trained on extensive datasets including BooksCorpus and English Wikipedia, it leverages masked language modeling and next sentence prediction during training. BERT can be fine-tuned for specific NLP tasks like question answering, text classification, and named entity recognition. Initially, BERT was released in two model sizes: BERT-Base with 110 million parameters and BERT-Large with 340 million. Over time, many variants and adaptations have emerged to cater to specialized applications.
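The masked language modeling objective mentioned above corrupts the input before the model sees it: each token is selected with some probability (15% in the original paper), and a selected token is replaced by a [MASK] token 80% of the time, by a random token 10% of the time, and left unchanged 10% of the time. A minimal sketch of that corruption step in plain Python, with an illustrative toy vocabulary (the real procedure operates on WordPiece token IDs, not strings):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, vocab=None, seed=0):
    """BERT-style masked language modeling corruption (sketch).

    Each token is selected with probability mask_prob; a selected token is
    replaced by [MASK] 80% of the time, by a random vocabulary token 10% of
    the time, and kept unchanged 10% of the time. Returns the corrupted
    sequence and the indices the model must predict.
    """
    rng = random.Random(seed)
    vocab = vocab or ["the", "cat", "sat", "mat", "dog"]  # toy placeholder
    corrupted, targets = list(tokens), []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets.append(i)
            r = rng.random()
            if r < 0.8:
                corrupted[i] = "[MASK]"          # 80%: mask out
            elif r < 0.9:
                corrupted[i] = rng.choice(vocab)  # 10%: random token
            # else: 10%: keep the original token unchanged
    return corrupted, targets
```

During pre-training, the model is trained to recover the original tokens at the returned target positions from the corrupted sequence; the bidirectional encoder can use context on both sides of each masked position to do so.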

Capabilities

Multimodal · Function Calling · Tool Use · JSON Mode

Specifications

Family: BERT
Parameters: 340M
Architecture: Encoder Only
Specialization: general