yunzhi yan

yunzhiy

https://yunzhiy.github.io/

AI & ML interests

3D Reconstruction

Organizations

None yet

Recent Activity

upvoted a paper 10 days ago

SpatialTree: How Spatial Abilities Branch Out in MLLMs

Paper • 2512.20617 • Published 11 days ago • 42

liked a Space about 2 months ago

Depth Anything 3

🏢

336

Create detailed depth maps from images using Depth Anything 3

liked a dataset 2 months ago

nvidia/PhysicalAI-Autonomous-Vehicles

Updated 29 days ago • 169k • 579

liked a model 4 months ago

stdstu123/Yume-I2V-540P

Image-to-Video • Updated Jul 24, 2025 • 30

liked a model 5 months ago

lightx2v/Wan2.1-T2V-14B-StepDistill-CfgDistill

Text-to-Video • Updated Oct 17, 2025 • 128

upvoted a paper 6 months ago

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17, 2025 • 57

liked a dataset 6 months ago

TencentARC/MiraData

Viewer • Updated Jul 19, 2024 • 475k • 296 • 39

liked a Space 6 months ago

SpatialTrackerV2

⚡

102

Official Space for SpatialTrackerV2

liked a Space 7 months ago

vggt

🏆

444

VGGT (CVPR 2025)

upvoted a paper 8 months ago

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published Mar 12, 2025 • 45

liked a dataset 8 months ago

hw-liang/Diffusion4D

Updated Jan 20, 2025 • 858 • 29

upvoted 2 papers 9 months ago

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2, 2025 • 44

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

Paper • 2504.02782 • Published Apr 3, 2025 • 57

upvoted a paper 10 months ago

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

Paper • 2503.10615 • Published Mar 13, 2025 • 17

liked a model 11 months ago

Fancy-MLLM/R1-Onevision-7B

Image-Text-to-Text • 8B • Updated Feb 25, 2025 • 496 • 44

liked a Space 11 months ago

MatchAnything

🏢

245

Find matching images based on input criteria

upvoted a paper 11 months ago

Neural Gaffer: Relighting Any Object via Diffusion

Paper • 2406.07520 • Published Jun 11, 2024 • 6

upvoted a paper about 1 year ago

Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation

Paper • 2412.14015 • Published Dec 18, 2024 • 12

authored a paper about 1 year ago

StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models

Paper • 2412.13188 • Published Dec 17, 2024

liked a Space about 1 year ago

Prompt Depth Anything

🐠

Generate detailed depth maps from iPhone captures