UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Paper • 2605.00658 • Published 17 days ago • 82
Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer Paper • 2603.19227 • Published Mar 19 • 42
FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-Language Navigation Paper • 2601.13976 • Published Jan 20 • 22
Article New ViT and ALIGN Models From Kakao Brain +2 adirik, Unso, dylan-m, jun-untitled • Mar 6, 2023 • 6