Block Diffusion for Flash Speculative Decoding
AI & ML interests
Efficient AI
Papers
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Paper • 2511.10645 • Published • 8
z-lab/gemma-4-31B-it-PARO
Image-Text-to-Text • 6B • Updated • 80 • 2
z-lab/Qwen3.5-9B-PARO
Image-Text-to-Text • 3B • Updated • 55k • 41
z-lab/Qwen3.5-4B-PARO
Image-Text-to-Text • 1B • Updated • 17.2k • 14
Models 32
z-lab/Kimi-K2.5-DFlash
Text Generation • Updated • 28
z-lab/Qwen3.5-35B-A3B-PARO
Image-Text-to-Text • 6B • Updated • 227 • 4
z-lab/Qwen3.5-27B-PARO
Image-Text-to-Text • 6B • Updated • 2.71k • 15
z-lab/Qwen3.5-9B-PARO
Image-Text-to-Text • 3B • Updated • 55k • 41
z-lab/Qwen3.5-4B-PARO
Image-Text-to-Text • 1B • Updated • 17.2k • 14
z-lab/Qwen3.5-2B-PARO
Image-Text-to-Text • 1B • Updated • 343 • 2
z-lab/Qwen3.5-0.8B-PARO
Image-Text-to-Text • 0.4B • Updated • 881 • 1
z-lab/Qwen3-14B-PARO
Text Generation • 2B • Updated • 576 • 2
z-lab/Qwen3-8B-PARO
Text Generation • 1B • Updated • 378 • 1
z-lab/Qwen3-4B-PARO
Text Generation • 0.9B • Updated • 629 • 1