Submitted by floyed · 157 · VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training · rednote-hilab · 14 · 2
Submitted by taesiri · 14 · Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control · 6 authors · 3
Submitted by hba123 · 7 · Decoding as Optimisation on the Probability Simplex: From Top-K to Top-P (Nucleus) to Best-of-K Samplers · 4 authors · 1
Submitted by taesiri · 5 · EgoPush: Learning End-to-End Egocentric Multi-Object Rearrangement for Mobile Robots · 7 authors · 1
Submitted by nielsr · 2 · VidEoMT: Your ViT is Secretly Also a Video Segmentation Model · Mobile Perception Systems Lab · 10 · 1
Submitted by skylenage · 2 · DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning · 8 authors · 2 · 1
Submitted by taesiri · 1 · Learning Smooth Time-Varying Linear Policies with an Action Jacobian Penalty · 3 authors · 1
Submitted by aidar-myrzakhan · 1 · Sink-Aware Pruning for Diffusion Language Models · Mohamed Bin Zayed University of Artificial Intelligence · 5 · 1
Submitted by beopst · 1 · Selective Training for Large Vision Language Models via Visual Information Gain · Seoul National University of Science and Technology · 1
Submitted by Luo-Yihang · 1 · 4RC: 4D Reconstruction via Conditional Querying Anytime and Anywhere · 5 authors · 2