Collections including paper arxiv:2604.02296

Potential Papers

RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details

Paper • 2604.06870 • Published 16 days ago • 41
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision

Paper • 2604.04934 • Published 18 days ago • 45
VOID: Video Object and Interaction Deletion

Paper • 2604.02296 • Published 22 days ago • 53
FIT: A Large-Scale Dataset for Fit-Aware Virtual Try-On

Paper • 2604.08526 • Published 15 days ago • 20

Interesting work but not directly related

about 18 hours ago

VOID: Video Object and Interaction Deletion

Paper • 2604.02296 • Published 22 days ago • 53
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Paper • 2604.18486 • Published 4 days ago • 79
WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published 15 days ago • 240

Interesting paper

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 202
Steerable Visual Representations

Paper • 2604.02327 • Published 22 days ago • 53
VOID: Video Object and Interaction Deletion

Paper • 2604.02296 • Published 22 days ago • 53

Tasks related to video generation, editing, acceleration, etc.

DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

Paper • 2601.01425 • Published Jan 4 • 53
DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment

Paper • 2601.20218 • Published Jan 28 • 16
FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space

Paper • 2602.02092 • Published Feb 2 • 18
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation

Paper • 2602.03796 • Published Feb 3 • 64

The Papers with interesting work and industrial experts.

VOID: Video Object and Interaction Deletion

Paper • 2604.02296 • Published 22 days ago • 53

VOID: Video Object and Interaction Deletion

Paper • 2604.02296 • Published 22 days ago • 53

[papers] Image & Video

The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation

Paper • 2601.17737 • Published Jan 25 • 56
Advancing Open-source World Models

Paper • 2601.20540 • Published Jan 28 • 135
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Paper • 2601.15369 • Published Jan 21 • 21
Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published Oct 23, 2025 • 50
