4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration Paper • 2506.22242 • Published Jun 27, 2025
UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding Paper • 2508.11952 • Published Aug 16, 2025
From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D Paper • 2503.22976 • Published Mar 29, 2025