Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text • Paper • 2601.22975 • Published 9 days ago • 86
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report • Paper • 2601.21051 • Published 11 days ago • 12
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models • Paper • 2512.07783 • Published Dec 8, 2025 • 38
More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models • Paper • 2509.25848 • Published Sep 30, 2025 • 80