Visual Question Answering
Transformers
Safetensors
English
videollama2_qwen2
text-generation
Audio-visual Question Answering
Audio Question Answering
multimodal large language model
Instructions to use lym0302/VideoLLaMA2.1-7B-AV-QA with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use lym0302/VideoLLaMA2.1-7B-AV-QA with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("visual-question-answering", model="lym0302/VideoLLaMA2.1-7B-AV-QA")

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("lym0302/VideoLLaMA2.1-7B-AV-QA", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Xet hash: 05fbfa5ae228a893927e805607016ec538d3011d67e49532d4628270b4f57292
- Size of remote file: 182 MB
- SHA256: cb2ae5c2d39ef3e5550cf3af103503c3a970941d7d3ae37bfea3464e980f002c
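The SHA256 listed above can be used to check that a downloaded copy of the 182 MB file is intact. The following is a minimal sketch using only the Python standard library. The page does not name the file, so the local path is left as a command-line argument, and the helper names `sha256_of_file` and `matches_expected` are illustrative rather than part of any Hugging Face API.

```python
import hashlib
import sys

# SHA256 copied from the file listing above.
EXPECTED_SHA256 = "cb2ae5c2d39ef3e5550cf3af103503c3a970941d7d3ae37bfea3464e980f002c"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large file is never held in memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python verify.py <path-to-downloaded-file>
    local_path = sys.argv[1]
    print("OK" if matches_expected(local_path) else "MISMATCH")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest fix.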
Xet efficiently stores large files inside Git, intelligently splitting files into unique chunks and accelerating uploads and downloads.