Usage

The model can be used with the pipeline function from the Hugging Face Transformers library.


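import torch
from transformers import pipeline

# Path to the audio file to be transcribed
audio = "path to the audio file to be transcribed"
# Use the first GPU if one is available, otherwise fall back to the CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"
modelTags = "ARTPARK-IISc/whisper-large-v3-vaani-telugu"

# Build the speech-recognition pipeline; long audio is processed in 30-second chunks
transcribe = pipeline(
    task="automatic-speech-recognition",
    model=modelTags,
    chunk_length_s=30,
    device=device,
)

# Clear any forced decoder IDs stored in the model and generation configs
transcribe.model.config.forced_decoder_ids = None
transcribe.model.generation_config.forced_decoder_ids = None

print("Transcription:", transcribe(audio)["text"])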
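
To transcribe several files in one go, a thin wrapper around the pipeline object may be convenient. The sketch below is not part of the original model card: transcribe_files is a hypothetical helper, and it assumes only that the object passed in can be called on a file path and returns a dict with a "text" key, as the transcribe pipeline above does. A stand-in function replaces the real pipeline here so the sketch runs without downloading the model.

```python
from typing import Callable, Dict, Iterable


def transcribe_files(
    transcribe: Callable[[str], dict], paths: Iterable[str]
) -> Dict[str, str]:
    """Call an ASR callable on each path and map path -> stripped transcript.

    Hypothetical helper: `transcribe` is assumed to behave like the pipeline
    object created above, i.e. transcribe(path) returns {"text": ...}.
    """
    results = {}
    for path in paths:
        output = transcribe(path)
        # Strip leading and trailing whitespace from the transcript.
        results[path] = output["text"].strip()
    return results


if __name__ == "__main__":
    # Stand-in for the real pipeline (an assumption for illustration only).
    def fake_transcribe(path: str) -> dict:
        return {"text": f" transcript of {path} "}

    for path, text in transcribe_files(fake_transcribe, ["a.wav", "b.wav"]).items():
        print(path, "->", text)
```

With the real model, the transcribe pipeline from the example above would be passed in place of fake_transcribe.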

Citation

If you use this model, please cite the following:

@misc{pulikodan2026vaanicapturinglanguagelandscape,
      title={VAANI: Capturing the language landscape for an inclusive digital India}, 
      author={Sujith Pulikodan and Abhayjeet Singh and Agneedh Basu and Nihar Desai and Pavan Kumar J and Pranav D Bhat and Raghu Dharmaraju and Ritika Gupta and Sathvik Udupa and Saurabh Kumar and Sumit Sharma and Vaibhav Vishwakarma and Visruth Sanka and Dinesh Tewari and Harsh Dhand and Amrita Kamat and Sukhwinder Singh and Shikhar Vashishth and Partha Talukdar and Raj Acharya and Prasanta Kumar Ghosh},
      year={2026},
      eprint={2603.28714},
      archivePrefix={arXiv},
      primaryClass={eess.AS},
      url={https://arxiv.org/abs/2603.28714}, 
}