SpatialTree: How Spatial Abilities Branch Out in MLLMs Paper • 2512.20617 • Published 11 days ago • 42
Depth Anything 3 Space • Create detailed depth maps from images using Depth Anything 3 • 336
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models Paper • 2507.13344 • Published Jul 17, 2025 • 57
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published Jan 2, 2025 • 44
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published Apr 3, 2025 • 57
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization Paper • 2503.10615 • Published Mar 13, 2025 • 17
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation Paper • 2412.14015 • Published Dec 18, 2024 • 12
StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models Paper • 2412.13188 • Published Dec 17, 2024