Article Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models about 13 hours ago • 38
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 14 days ago • 211
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 15 days ago • 89
Olmo 3 Post-training Collection All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 6 days ago • 44
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 21 days ago • 57
nvidia/Nemotron-RL-instruction_following-structured_outputs Viewer • Updated 4 days ago • 9.95k • 766 • 23
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning Paper • 2508.03501 • Published Aug 5 • 59