waiting - a D-YZ Collection

updated Sep 2, 2025

Adapting Vision-Language Models Without Labels: A Comprehensive Survey

Paper • 2508.05547 • Published Aug 7, 2025 • 11

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

Paper • 2508.10751 • Published Aug 14, 2025 • 28

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97

Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation

Paper • 2508.12040 • Published Aug 16, 2025 • 14

Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21, 2025 • 90

A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models

Paper • 2508.12903 • Published Aug 18, 2025 • 11

Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published Aug 19, 2025 • 118

Controlling Multimodal LLMs via Reward-guided Decoding

Paper • 2508.11616 • Published Aug 15, 2025 • 7

VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning

Paper • 2507.22607 • Published Jul 30, 2025 • 46