edbeeching/decision-transformer-gym-hopper-expert • Reinforcement Learning • Updated Jun 29, 2022 • 300 • 20
mradermacher/Tifa-Deepsex-14b-CoT-i1-GGUF • Reinforcement Learning • 15B • Updated Feb 13, 2025 • 481 • 14
Open-Reasoner-Zero/Open-Reasoner-Zero-7B • Reinforcement Learning • 8B • Updated Apr 7, 2025 • 1.75k • 34
ValueFX9507/Tifa-DeepsexV3-14b-GGUF-Q6 • Reinforcement Learning • 15B • Updated Jul 1, 2025 • 13.4k • 43
tensorblock/Nellyw888_VeriReason-codeLlama-7b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF • Reinforcement Learning • 7B • Updated Jan 27 • 3 • 1