Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2508.18733

3D CAD Generation

Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vector Drawings

Paper • 2508.18733 • Published Aug 26, 2025 • 9

Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion

Paper • 2412.09593 • Published Dec 12, 2024 • 18
wushuang98/Direct3D-S2

Image-to-3D • Updated Jun 14, 2025 • 35 • 79
yejunliang23/ShapeLLM-7B-omni

Image-to-3D • 8B • Updated Jun 18, 2025 • 683 • 14
tencent/Hunyuan3D-2.1

Image-to-3D • Updated Oct 17, 2025 • 19.4k • 796

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13, 2025 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14, 2025 • 19
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6, 2025 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19, 2025 • 48

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22, 2025 • 63
EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment

Paper • 2410.09604 • Published Oct 12, 2024
Geospatial Mechanistic Interpretability of Large Language Models

Paper • 2505.03368 • Published May 6, 2025 • 11
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

Paper • 2505.02836 • Published May 5, 2025 • 8

3D CAD Generation

Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vector Drawings

Paper • 2508.18733 • Published Aug 26, 2025 • 9

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13, 2025 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14, 2025 • 19
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6, 2025 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19, 2025 • 48

Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion

Paper • 2412.09593 • Published Dec 12, 2024 • 18
wushuang98/Direct3D-S2

Image-to-3D • Updated Jun 14, 2025 • 35 • 79
yejunliang23/ShapeLLM-7B-omni

Image-to-3D • 8B • Updated Jun 18, 2025 • 683 • 14
tencent/Hunyuan3D-2.1

Image-to-3D • Updated Oct 17, 2025 • 19.4k • 796

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22, 2025 • 63
EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment

Paper • 2410.09604 • Published Oct 12, 2024
Geospatial Mechanistic Interpretability of Large Language Models

Paper • 2505.03368 • Published May 6, 2025 • 11
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

Paper • 2505.02836 • Published May 5, 2025 • 8

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs