FunctionGemma Tuning Lab is a new no-code tool by @google that lets you fine-tune a model directly from the browser, no coding required, using TRL behind the scenes.
It includes GDPO, the latest variant of GRPO for multi-reward RL ✨ GDPO decouples reward normalization to avoid reward collapse and improve per-reward convergence, developed by @sliuau, @SimonX et al.
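For intuition, here is a toy sketch (not the TRL implementation, and the paper's exact formulation may differ) contrasting GRPO-style normalization of the summed reward with a GDPO-style decoupled normalization per reward channel:

```python
import numpy as np

def grpo_advantages(rewards):
    # rewards: (G, K) array — G sampled completions in a group, K reward functions.
    # GRPO-style: sum the rewards first, then normalize within the group.
    total = rewards.sum(axis=1)                                   # (G,)
    return (total - total.mean()) / (total.std() + 1e-8)

def gdpo_advantages(rewards):
    # GDPO-style (conceptual): normalize each reward channel separately
    # within the group, then combine, so one dominant reward can't wash
    # out the learning signal of the others.
    per_reward = (rewards - rewards.mean(axis=0)) / (rewards.std(axis=0) + 1e-8)  # (G, K)
    return per_reward.sum(axis=1)                                 # (G,)

# Example: 4 completions scored by 2 reward functions
# (say, format correctness and answer accuracy).
rewards = np.array([[1.0, 0.2],
                    [1.0, 0.9],
                    [0.0, 0.8],
                    [1.0, 0.1]])
print(grpo_advantages(rewards))
print(gdpo_advantages(rewards))
```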
Recursive Language Models (RLMs) are a new interface for LLMs, with cool ideas by Alex Zhang!
⚠️ LLMs struggle with long prompts → attention overload & lost info
🔄 RLMs inspect, split & call themselves on chunks, then aggregate results
✅ Handles millions of tokens, reduces noise, improves reasoning
💡 System prompt guides recursion
🎯 RLM trajectories can be used for RL training or distillation (OpenEnv + TRL!!)
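A rough sketch of the recursion idea (not Alex Zhang's implementation; `call_llm` stands in for whatever hypothetical single LLM call you use):

```python
def call_llm(prompt: str) -> str:
    """Hypothetical single LLM call — plug in any chat-completions client here."""
    raise NotImplementedError

def rlm(task: str, context: str, max_chars: int = 8_000) -> str:
    # Base case: the context fits comfortably in a single call.
    if len(context) <= max_chars:
        return call_llm(f"{task}\n\nContext:\n{context}")

    # Recursive case: split the context into chunks, call ourselves on each
    # chunk, then aggregate the partial answers in one final call.
    chunks = [context[i:i + max_chars] for i in range(0, len(context), max_chars)]
    partials = [rlm(task, chunk, max_chars) for chunk in chunks]
    summary = "\n\n".join(f"Chunk {i + 1}: {p}" for i, p in enumerate(partials))
    return call_llm(f"{task}\n\nAggregate these partial results:\n{summary}")
```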
The list of hands-on notebooks (some beginner-friendly!) to get started with fine-tuning using TRL keeps growing!!
• SFT
• GRPO
• Tool calling & agents
• RL environments with OpenEnv
• LLMs and VLMs
✨ Many run on FREE Colab, making it super easy to get started fast!
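To give a flavor of what the notebooks walk through, supervised fine-tuning with TRL boils down to a few lines (the model and dataset names below are just examples; the free-Colab notebooks use similarly small models):

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Any text or conversational dataset from the Hub works; this one is an example.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",               # small model that fits on a free Colab GPU
    train_dataset=dataset,
    args=SFTConfig(output_dir="my-sft-model"),
)
trainer.train()
```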
The Christmas holidays are here! 🎄 Thinking about learning something new in AI?
@huggingface offers 12 FREE courses covering all the relevant topics, for every level of experience. A great challenge for the holidays (and worth saving for later 🙄)