XYX's picture

XYX

xuyd16

·

AI & ML interests

None yet

Recent Activity

authored a paper 3 days ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

upvoted a paper 3 days ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

submitted a paper 3 days ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

View all activity

Organizations

None yet

upvoted a paper 3 days ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Paper • 2605.12483 • Published 4 days ago • 8

upvoted a paper 13 days ago

TIP: Token Importance in On-Policy Distillation

Paper • 2604.14084 • Published Apr 15 • 15

upvoted 4 papers 30 days ago

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

Paper • 2604.10866 • Published Apr 13 • 66

SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

Paper • 2604.14144 • Published Apr 15 • 63

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

Paper • 2604.11626 • Published Apr 13 • 101

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 160

upvoted 3 papers 2 months ago

PACED: Distillation at the Frontier of Student Competence

Paper • 2603.11178 • Published Mar 11 • 4

Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning

Paper • 2602.21420 • Published Feb 24 • 6

Flash-KMeans: Fast and Memory-Efficient Exact K-Means

Paper • 2603.09229 • Published Mar 10 • 82