Multi-Resolution Flow Matching: Training-Free Diffusion Acceleration via Staged Sampling Paper • 2607.01642 • Published 4 days ago • 28
PhotoQuilt: Training-Free Arbitrary-Resolution Photomosaics via Bootstrapped Tiled Denoising Paper • 2606.30968 • Published 7 days ago • 24
SEGA: Spectral-Energy Guided Attention for Resolution Extrapolation in Diffusion Transformers Paper • 2605.22668 • Published May 21 • 41
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published Apr 15 • 127
DyaDiT: A Multi-Modal Diffusion Transformer for Socially Favorable Dyadic Gesture Generation Paper • 2602.23165 • Published Feb 26 • 3
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens Paper • 2603.02138 • Published Mar 2 • 151
When Does RL Help Medical VLMs? Disentangling Vision, SFT, and RL Gains Paper • 2603.01301 • Published Mar 1 • 8
LoopFormer: Elastic-Depth Looped Transformers for Latent Reasoning via Shortcut Modulation Paper • 2602.11451 • Published Feb 11 • 17
EasyV2V: A High-quality Instruction-based Video Editing Framework Paper • 2512.16920 • Published Dec 18, 2025 • 18
PuzzleCraft: Exploration-Aware Curriculum Learning for Puzzle-Based RLVR in VLMs Paper • 2512.14944 • Published Mar 13 • 36