Vision-Language Models
  Qwen/Qwen2.5-VL-7B-Instruct • Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 4.93M • 1.58k
  microsoft/Florence-2-large • Image-Text-to-Text • 0.8B • Updated Aug 4, 2025 • 402k • 1.82k
  google/paligemma2-3b-pt-224 • Image-Text-to-Text • 3B • Updated Dec 5, 2024 • 26.6k • 173
Aerial & Drone Object Detection
  PekingU/rtdetr_r50vd • Object Detection • 43M • Updated Feb 6, 2025 • 40.4k • 33
  silveroupti/VisDrone • Updated Apr 10 • 51
  PekingU/rtdetr_r50vd_coco_o365 • Object Detection • 43M • Updated Jul 1, 2024 • 112k • 17
  IDEA-Research/grounding-dino-base • Zero-Shot Object Detection • 0.2B • Updated May 12, 2024 • 1.56M • 187
Applied Machine Learning for Computer Vision and VLMs
  Qwen/Qwen3.5-9B • Image-Text-to-Text • 10B • Updated Mar 2 • 5.63M • 1.56k
  dx8152/Qwen-Edit-2509-Multiple-angles • Image-to-Image • Updated Apr 21 • 91.5k • 951
  KangLiao/Puffin • Text-to-3D • Updated Mar 6 • 24
OCR & Document AI
  nvidia/nemotron-ocr-v2 • Image-to-Text • Updated 24 days ago • 9.28k • 205
  deepseek-ai/DeepSeek-OCR • Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 1.66M • 3.28k
  zai-org/GLM-OCR • Image-Text-to-Text • 1B • Updated 27 days ago • 2.64M • 1.83k
Theory and Representation learning
  V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning • Paper • 2603.14482 • Published Mar 15 • 36
  Real-Time Object Detection Meets DINOv3 • Paper • 2509.20787 • Published Sep 25, 2025 • 11