GPT-1
About
GPT-1, released by OpenAI in 2018, was a groundbreaking large language model built on a 12-layer decoder-only transformer architecture. Each layer used 12 masked self-attention heads with 64-dimensional states, giving a model width of 12 × 64 = 768 dimensions [1][3][10]. Despite its relatively modest size of 117 million parameters, GPT-1 demonstrated the potential for generating human-like text, answering questions, and completing sentences [5][6]. Its main limitations were a short context window of 512 tokens and the need for supervised fine-tuning on labeled data for each downstream task [4]. Nevertheless, GPT-1 laid a crucial foundation for later models such as GPT-2 and GPT-3, which expanded and improved upon its capabilities [2].
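The head arithmetic above can be checked directly, and the same figures give a rough parameter count. The following is a minimal sketch rather than OpenAI's code: the vocabulary size (40,478), the 512 learned position embeddings, the 4x feed-forward width, and the two layer norms per block are assumptions that the paragraph above does not state, and the function names are hypothetical.

```python
# Values taken from the description above.
N_LAYERS = 12
N_HEADS = 12
HEAD_DIM = 64

# Assumptions (not stated above): BPE vocabulary size, number of learned
# position embeddings, and feed-forward expansion factor.
VOCAB_SIZE = 40478
N_POSITIONS = 512
FFN_MULT = 4


def model_width(n_heads: int = N_HEADS, head_dim: int = HEAD_DIM) -> int:
    """Concatenating all attention heads gives the model width."""
    return n_heads * head_dim


def estimate_parameters(
    n_layers: int = N_LAYERS,
    d_model: int = model_width(),
    vocab_size: int = VOCAB_SIZE,
    n_positions: int = N_POSITIONS,
    ffn_mult: int = FFN_MULT,
) -> int:
    """Rough parameter count for a GPT-1-style decoder (weights and biases)."""
    d_ffn = ffn_mult * d_model

    # Token and position embedding tables.
    embeddings = vocab_size * d_model + n_positions * d_model

    # Attention: fused query/key/value projection plus the output projection.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)

    # Position-wise feed-forward network: up-projection then down-projection.
    feed_forward = (d_model * d_ffn + d_ffn) + (d_ffn * d_model + d_model)

    # Assumed: two layer norms per block, each with a gain and a bias vector.
    layer_norms = 2 * (2 * d_model)

    per_layer = attention + feed_forward + layer_norms
    return embeddings + n_layers * per_layer


if __name__ == "__main__":
    print("model width:", model_width())
    print("estimated parameters:", estimate_parameters())
```

Under these assumptions the estimate comes to about 116.5 million parameters, in line with the commonly quoted figure of 117 million. Roughly a quarter of that total sits in the token embedding table, so the estimate is sensitive to the assumed vocabulary size.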