Visual Question Answering
Transformers
Safetensors
English
videollama2_qwen2
text-generation
Audio-visual Question Answering
Audio Question Answering
multimodal large language model
Instructions to use lym0302/VideoLLaMA2.1-7B-AV-QA with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use lym0302/VideoLLaMA2.1-7B-AV-QA with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("visual-question-answering", model="lym0302/VideoLLaMA2.1-7B-AV-QA")

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("lym0302/VideoLLaMA2.1-7B-AV-QA", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Xet hash: 75641ae3a8e66ff56a0ed1f28403610f6088de0d29af289a7f50a8b89eb5caf6
- Size of remote file: 6.9 kB
- SHA256: e5c1aec160f970dd7f051a11cf368ffb25bd33880921204414ed90fa25b101af
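The SHA256 value listed above can be used to check a downloaded copy of the file. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the page does not say which file in the repository the hash belongs to, so the path you pass in is an assumption on your side.

```python
import hashlib

# SHA256 value copied from the file metadata listed on this page.
EXPECTED_SHA256 = "e5c1aec160f970dd7f051a11cf368ffb25bd33880921204414ed90fa25b101af"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare the file's digest with `expected`, ignoring hex letter case."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("path/to/downloaded_file")` with the local path of the downloaded file (a placeholder here) returns `True` only when the digest equals the listed value. A mismatch may simply mean a different file was checked.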