DeepSeek 67B Chat
About
DeepSeek LLM 67B Chat is a 67-billion-parameter language model built on the LLaMA architecture, with enhancements such as Grouped-Query Attention across its 95 layers. Trained on a corpus of 2 trillion English and Chinese tokens, it excels at text generation, question answering, and fluent conversation, and outperforms some larger models in reasoning, coding, and mathematics. Like other large language models, it can exhibit biases inherited from its training data, hallucinate, and produce repetitive output. Its size demands substantial computational resources for inference, though quantization can shrink the model at a potential cost in output quality.
Providers (1)
| Provider | Input (per 1M tokens) | Output (per 1M tokens) | Type |
|---|---|---|---|
| Together AI API | $0.90 | $0.90 | Serverless |
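As a quick sanity check on the pricing above, the following sketch estimates the cost of a single request at the listed serverless rate of $0.90 per 1M tokens for both input and output (the function name and token counts are illustrative, not part of any provider API):

```python
# Estimate per-request cost for DeepSeek 67B Chat on Together AI,
# assuming the listed flat rate of $0.90 per 1M tokens (input and output alike).

PRICE_PER_MILLION_TOKENS = 0.90  # USD


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the flat rate."""
    total_tokens = input_tokens + output_tokens
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


# Example: a 2,000-token prompt with a 1,000-token completion
print(f"${estimate_cost(2_000, 1_000):.4f}")  # prints $0.0027
```

Because input and output are billed at the same rate here, only the total token count matters; providers that price input and output differently would need the two terms computed separately.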