DeepSeek-R1-Distill-Qwen-32B-Q2-6
This model was converted to MLX from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B, using mixed 2/6 bit quantization. This scheme preserves quality much more than a standard 2-bit quantization.
Use with mlx
pip install mlx-lm
python -m mlx_lm.chat --model pcuenq/DeepSeek-R1-Distill-Qwen-32B-Q2-6 --max-tokens 10000 --temp 0.6 --top-p 0.7
- Downloads last month
- 13
Model size
4B params
Tensor type
BF16
·
U32 ·
Hardware compatibility
Log In to add your hardware
4-bit
Model tree for pcuenq/DeepSeek-R1-Distill-Qwen-32B-Q2-6
Base model
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B