Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions • Paper • 2507.05257 • Published Jul 7, 2025 • 14
ReasonMap • Collection • A fine-grained visual reasoning benchmark (We show more question types in the extension dataset.) • 3 items • Updated Oct 1, 2025 • 8
RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning • Paper • 2510.02240 • Published Oct 2, 2025 • 17
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents • Paper • 2506.03143 • Published Jun 3, 2025 • 53
ShowUI: One Vision-Language-Action Model for GUI Visual Agent • Paper • 2411.17465 • Published Nov 26, 2024 • 89
Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts • Paper • 2506.10357 • Published Jun 12, 2025 • 21
AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents • Paper • 2506.14205 • Published Jun 17, 2025 • 7
Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search • Paper • 2507.02652 • Published Jul 3, 2025 • 26
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps • Paper • 2505.18675 • Published May 24, 2025 • 26