Vision-Language Models
Qwen/Qwen2.5-VL-7B-Instruct • Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 4.72M • 1.58k
microsoft/Florence-2-large • Image-Text-to-Text • 0.8B • Updated Aug 4, 2025 • 397k • 1.82k
google/paligemma2-3b-pt-224 • Image-Text-to-Text • 3B • Updated Dec 5, 2024 • 26.4k • 173
OCR & Document AI
nvidia/nemotron-ocr-v2 • Image-to-Text • Updated 23 days ago • 9.11k • 205
deepseek-ai/DeepSeek-OCR • Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 1.68M • 3.28k
zai-org/GLM-OCR • Image-Text-to-Text • 1B • Updated 25 days ago • 2.7M • 1.82k