Taming LLMs by Scaling Learning Rates with Gradient Grouping Paper • 2506.01049 • Published Jun 1, 2025 • 38
DexUMI: Using Human Hand as the Universal Manipulation Interface for Dexterous Manipulation Paper • 2505.21864 • Published May 28, 2025 • 9
OMNIGUARD: An Efficient Approach for AI Safety Moderation Across Modalities Paper • 2505.23856 • Published May 29, 2025 • 2
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents Paper • 2504.13203 • Published Apr 15, 2025 • 35
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published Jan 7, 2025 • 27
MatAnyone: Stable Video Matting with Consistent Memory Propagation Paper • 2501.14677 • Published Jan 24, 2025 • 34
Small Models Struggle to Learn from Strong Reasoners Paper • 2502.12143 • Published Feb 17, 2025 • 39
AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting Paper • 2502.05176 • Published Feb 7, 2025 • 39
MoM: Linear Sequence Modeling with Mixture-of-Memories Paper • 2502.13685 • Published Feb 19, 2025 • 36
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos Paper • 2504.17343 • Published Apr 24, 2025 • 13
ViSMaP: Unsupervised Hour-long Video Summarisation by Meta-Prompting Paper • 2504.15921 • Published Apr 22, 2025 • 7
SkyReels-A2: Compose Anything in Video Diffusion Transformers Paper • 2504.02436 • Published Apr 3, 2025 • 39
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability Paper • 2504.08003 • Published Apr 9, 2025 • 49
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Paper • 2504.00999 • Published Apr 1, 2025 • 95
You Do Not Fully Utilize Transformer's Representation Capacity Paper • 2502.09245 • Published Feb 13, 2025 • 37
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features Paper • 2502.04320 • Published Feb 6, 2025 • 36
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 92
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024 • 151