Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 11 days ago • 186
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 10 days ago • 261
DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification Paper • 2605.09269 • Published 13 days ago • 6
trl-internal-testing/tiny-Qwen2ForCausalLM-2.5 Text Generation • 2.43M • Updated Dec 19, 2025 • 5.47M • 6
SciLT: Long-Tailed Classification in Scientific Image Domains Paper • 2604.03687 • Published Apr 4 • 8
TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders Paper • 2604.07340 • Published Apr 8 • 17
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping Paper • 2604.08364 • Published Apr 9 • 101
InCoder-32B-Thinking: Industrial Code World Model for Thinking Paper • 2604.03144 • Published Apr 3 • 233