Phi-1.5
About
Phi-1.5 is a 1.3-billion-parameter large language model (LLM) developed by Microsoft, designed to excel at complex reasoning tasks, particularly those requiring common sense, language understanding, and logical reasoning. Unlike many LLMs that depend heavily on web-scraped data, Phi-1.5 is trained primarily on a high-quality synthetic dataset of about 30 billion tokens, curated to resemble "textbook-like" content focused on common sense and general knowledge; this data recipe is what sets the model apart.

Architecturally, Phi-1.5 is a standard Transformer with 24 layers, 32 attention heads, and a head dimension of 64. It incorporates rotary embeddings and flash-attention for efficiency and uses the codegen-mono tokenizer.

The model performs impressively on a range of natural language benchmarks, rivaling models five times its size, with notable strength in multi-step reasoning tasks such as math word problems and coding challenges. Phi-1.5 is not without limitations, however: it may produce inaccurate code or facts and is sensitive to prompt variations. Its training draws on several data sources, and variants such as Phi-1.5-web-only and Phi-1.5-web provide insight into how different datasets affect performance.
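The stated architecture (24 layers, 32 heads, head dimension 64) implies a model width of 2048, and a back-of-the-envelope tally lands near the advertised 1.3 billion parameters. The sketch below checks this arithmetic; the 4x MLP expansion and the 51,200-entry vocabulary are assumptions not stated above, and biases, layer norms, and an untied output head are ignored.

```python
# Rough parameter count for Phi-1.5's stated architecture:
# 24 Transformer layers, 32 attention heads, head dimension 64.
# MLP width (4x hidden) and vocabulary size (51,200) are assumptions.

n_layers = 24
n_heads = 32
head_dim = 64

hidden = n_heads * head_dim      # model width: 32 * 64 = 2048
mlp_width = 4 * hidden           # assumed standard 4x expansion
vocab = 51_200                   # assumed tokenizer vocabulary size

embed_params = vocab * hidden                # token embedding table
attn_params = 4 * hidden * hidden            # Q, K, V, and output projections
mlp_params = 2 * hidden * mlp_width          # up- and down-projection
per_layer = attn_params + mlp_params         # ignoring biases and norms

total = embed_params + n_layers * per_layer  # output head assumed tied

print(f"hidden size: {hidden}")
print(f"approx. parameters: {total / 1e9:.2f}B")  # ~1.31B
```

The estimate is dominated by the 24 per-layer blocks (~50M parameters each); the exact published count differs slightly depending on whether the output head is tied to the embeddings.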
Capabilities
Providers (1)
| Provider | Input (per 1M) | Output (per 1M) | Type |
|---|---|---|---|
| Azure OpenAI | — | — | Provisioned |