WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models • Paper • 2602.02537 • Published 7 days ago • 5
Balancing Understanding and Generation in Discrete Diffusion Models • Paper • 2602.01362 • Published 3 days ago • 9
Unified Personalized Reward Model for Vision Generation • Paper • 2602.02380 • Published 1 day ago • 15
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration • Paper • 2602.03786 • Published about 17 hours ago • 28
SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training • Paper • 2602.03411 • Published about 23 hours ago • 26
SWE-World: Building Software Engineering Agents in Docker-Free Environments • Paper • 2602.03419 • Published about 23 hours ago • 28
No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs • Paper • 2602.02103 • Published 2 days ago • 31
daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently • Paper • 2602.02619 • Published 2 days ago • 37
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding • Paper • 2602.01785 • Published 2 days ago • 52
Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration • Paper • 2602.03647 • Published about 19 hours ago • 1
Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation • Paper • 2602.03806 • Published about 16 hours ago • 3
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs • Paper • 2602.03048 • Published 1 day ago • 23
WideSeek: Advancing Wide Research via Multi-Agent Scaling • Paper • 2602.02636 • Published 1 day ago • 11
Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation • Paper • 2602.03619 • Published about 19 hours ago • 20
MARS: Modular Agent with Reflective Search for Automated AI Research • Paper • 2602.02660 • Published 1 day ago • 33
Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks • Paper • 2602.01630 • Published 2 days ago • 39
AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios • Paper • 2601.20613 • Published 7 days ago • 9
Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling • Paper • 2602.02453 • Published 1 day ago • 33
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning • Paper • 2602.01058 • Published 3 days ago • 32