DeepSeek-R1

This project builds on the DeepSeek V3 base model and applies large-scale reinforcement learning to train a reasoning model whose reasoning ability is enhanced entirely through reinforcement learning. Its performance is described as comparable to the official release of OpenAI o1, at a very low training cost. The model weights are open-sourced, and the training methods and techniques are publicly documented.

