Submitted by akhaliq 54 Uni-SMART: Universal Science Multimodal Analysis and Research Transformer · 17 authors 4
Submitted by akhaliq 37 VideoAgent: Long-form Video Understanding with Large Language Model as Agent · 4 authors 136 2
Submitted by akhaliq 32 Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations · 19 authors 2
Submitted by akhaliq 21 Recurrent Drafter for Fast Speculative Decoding in Large Language Models · 5 authors 221 1
Submitted by akhaliq 11 FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model · 4 authors 2
Submitted by akhaliq 10 EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba · 3 authors 243 1
Submitted by akhaliq 8 Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding · 7 authors 81 1
Submitted by akhaliq 7 Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting · 4 authors 214 1
Submitted by akhaliq 3 NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices · 3 authors 100 1