DeepScaleR 1.5B is a fine-tuned version of the DeepSeek-R1-Distill-Qwen-1.5B model, built to make Reinforcement Learning (RL) for Large Language Models (LLMs) more accessible.
The model runs on macOS, Linux, and Windows, which makes it easy for researchers and developers to adopt.
Key Features of DeepScaleR 1.5B