ShareGPT4Video

community

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

Lin-Chen

authored 2 papers 16 days ago

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Paper • 2602.02185 • Published 17 days ago • 125

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Paper • 2601.22060 • Published 21 days ago • 153

LanguageBind

submitted a paper to Daily Papers 23 days ago

iFSQ: Improving FSQ for Image Generation with 1 Line of Code

Paper • 2601.17124 • Published 27 days ago • 32

Lin-Chen

authored a paper about 1 month ago

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published Jan 6 • 47

Lin-Chen

submitted a paper to Daily Papers about 1 month ago

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published Jan 6 • 47

Lin-Chen

authored a paper 3 months ago

DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action

Paper • 2511.22134 • Published Nov 27, 2025 • 22

LanguageBind

authored 5 papers 4 months ago

Look-Back: Implicit Visual Re-focusing in MLLM Reasoning

Paper • 2507.03019 • Published Jul 2, 2025 • 1

Can Understanding and Generation Truly Benefit Together -- or Just Coexist?

Paper • 2509.09666 • Published Sep 11, 2025 • 34

FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation

Paper • 2509.25187 • Published Sep 29, 2025 • 2

GIR-Bench: Versatile Benchmark for Generating Images with Reasoning

Paper • 2510.11026 • Published Oct 13, 2025 • 18

Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback

Paper • 2510.16888 • Published Oct 19, 2025 • 22

Lin-Chen

authored a paper 4 months ago

Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models

Paper • 2510.01304 • Published Oct 1, 2025 • 11

Wiselnn

authored a paper 5 months ago

SIM-CoT: Supervised Implicit Chain-of-Thought

Paper • 2509.20317 • Published Sep 24, 2025 • 42

Jinsong-Li

authored 3 papers 6 months ago

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Paper • 2502.08590 • Published Feb 12, 2025 • 42

Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings

Paper • 2506.04997 • Published Jun 5, 2025

ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing

Paper • 2506.19848 • Published Jun 24, 2025 • 26

Jinsong-Li

authored a paper 7 months ago

Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

Paper • 2508.00819 • Published Aug 1, 2025 • 63

LanguageBind

authored 3 papers 9 months ago

Next Patch Prediction for Autoregressive Visual Generation

Paper • 2412.15321 • Published Dec 19, 2024 • 1

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Paper • 2412.00397 • Published Nov 30, 2024

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Paper • 2503.07265 • Published Mar 10, 2025 • 4

Team members: 4