BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14, 2025 • 98
LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models Paper • 2504.10430 • Published Apr 14, 2025 • 5
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models Paper • 2410.03290 • Published Oct 4, 2024 • 7
Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning Paper • 2402.11690 • Published Feb 18, 2024 • 10