bartowski/allura-forge_Llama-3.3-8B-Instruct-GGUF Text Generation • 8B • Updated 4 days ago • 3.98k • 17
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 14 days ago • 76
VTP Collection Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated 18 days ago • 39
Teacher Logits Collection Logits captured from large models to act as the teacher for distillation • 3 items • Updated 18 days ago • 7