Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 8 days ago • 419
Learning POMDP World Models from Observations with Language-Model Priors Paper • 2605.13740 • Published 22 days ago • 6
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 246
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 22 days ago • 270
KernelBench-X: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels Paper • 2605.04956 • Published 29 days ago • 7
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published May 3 • 166
ImplicitMemBench: Measuring Unconscious Behavioral Adaptation in Large Language Models Paper • 2604.08064 • Published Apr 9 • 8
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published Apr 9 • 291
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504
AvatarPointillist: AutoRegressive 4D Gaussian Avatarization Paper • 2604.04787 • Published Apr 6 • 12
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation Paper • 2604.03922 • Published Apr 5 • 53
TrajectoryMover: Generative Movement of Object Trajectories in Videos Paper • 2603.29092 • Published Mar 31 • 3
Distilling Human-Aligned Privacy Sensitivity Assessment from Large Language Models Paper • 2603.29497 • Published Mar 31 • 6
BioVITA: Biological Dataset, Model, and Benchmark for Visual-Textual-Acoustic Alignment Paper • 2603.23883 • Published Mar 25 • 6
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published Mar 26 • 156