**Tags:** Text Generation · Safetensors · Hindi · English · llama · text-to-speech · hinglish · audio-generation · fine-tuned · unsloth · conversational
Instructions to use Itsharshi/tts_500 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Local Apps: Unsloth Studio
How to use Itsharshi/tts_500 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh
# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Itsharshi/tts_500 to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex
# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Itsharshi/tts_500 to start chatting
```
Using HuggingFace Spaces for Unsloth
```sh
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Itsharshi/tts_500 to start chatting
```
Load model with FastModel
```sh
pip install unsloth
```

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="Itsharshi/tts_500",
    max_seq_length=2048,
)
```
Hinglish TTS 3B Model
This is a fine-tuned version of canopylabs/3b-hi-pretrain-research_release specialized for Hinglish (Hindi-English mixed) text-to-speech generation.
Model Details
- Base Model: canopylabs/3b-hi-pretrain-research_release
- Fine-tuning Method: LoRA with Unsloth (merged)
- Languages: Hindi, English, Hinglish
- Task: Text-to-Speech via audio token generation
- Model Size: ~3B parameters
Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "Itsharshi/tts_500"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate audio tokens from a Hinglish prompt
prompt = "Hello doston, main aapka dost hun"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1200)
```
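Note that `generate` returns the prompt ids followed by the newly generated ids, and only the new ids are audio tokens. A minimal sketch with stand-in ids (plain lists instead of tensors; with real tensors you would slice `outputs[0]` past `inputs["input_ids"].shape[1]`):

```python
# Stand-in ids for illustration only (not real token ids)
prompt_ids = [101, 2023, 2003]              # hypothetical prompt token ids
output_ids = prompt_ids + [501, 502, 503]   # hypothetical generate() result
audio_token_ids = output_ids[len(prompt_ids):]
print(audio_token_ids)  # [501, 502, 503]
```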
Fine-tuning Details
- LoRA Rank: 64
- LoRA Alpha: 64
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training Framework: Unsloth
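With rank 64 and alpha 64, the LoRA scaling factor is alpha/r = 1.0. As a sketch of what "merged" means here (illustrative shapes and NumPy in place of the actual Unsloth training code), the adapter's low-rank update is folded back into each target weight:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 128, 128, 64, 64   # rank/alpha from the card; dims illustrative

W = rng.standard_normal((d_out, d_in))     # frozen base weight (e.g. q_proj)
A = rng.standard_normal((r, d_in)) * 0.01  # LoRA A matrix
B = np.zeros((d_out, r))                   # LoRA B matrix (zero-initialized)

# Merging folds the low-rank update into the base weight,
# so inference needs no adapter at all:
W_merged = W + (alpha / r) * (B @ A)
```

With `B` still at its zero initialization the merge is a no-op; after training, `B @ A` carries the learned update.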
Audio Generation
This model generates audio tokens that must be decoded into a waveform with a SNAC (Scalable Neural Audio Codec) model:

```python
from snac import SNAC

# Load the SNAC decoder
snac_model = SNAC.from_pretrained("hubertsiuzdak/snac_24khz")

# Process generated tokens into audio codes and decode
# (see the full implementation in the original training code)
```
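The card defers the token-to-codes step to the original training code. As a hedged sketch, Orpheus-family models typically emit 7 audio tokens per frame, with each slot carrying an offset of `slot_index * 4096` that must be removed before the codes are split across SNAC's three codebook layers; verify this layout against the actual training code before relying on it:

```python
def redistribute_codes(frame_codes):
    """Split flat 7-token audio frames into SNAC's three codebook layers.

    Assumes an Orpheus-style layout (an assumption, not confirmed by the
    card): slot i within each frame carries an offset of i * 4096.
    """
    layer_1, layer_2, layer_3 = [], [], []
    for i in range(len(frame_codes) // 7):
        f = frame_codes[7 * i : 7 * i + 7]
        layer_1.append(f[0])
        layer_2.append(f[1] - 4096)
        layer_3.append(f[2] - 2 * 4096)
        layer_3.append(f[3] - 3 * 4096)
        layer_2.append(f[4] - 4 * 4096)
        layer_3.append(f[5] - 5 * 4096)
        layer_3.append(f[6] - 6 * 4096)
    return layer_1, layer_2, layer_3
```

The three lists would then be wrapped in long tensors and passed to `snac_model.decode(...)` to produce a 24 kHz waveform.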
Limitations
- Requires SNAC model for audio generation
- Optimized for Hinglish content
- May not perform well on pure English or pure Hindi in some cases
Citation
If you use this model, please cite the original base model:
```bibtex
@misc{canopylabs-3b-hi,
  title={3B Hindi Pretrained Model},
  author={Canopy Labs},
  year={2024},
  url={https://huggingface.co/canopylabs/3b-hi-pretrain-research_release}
}
```
Model tree for Itsharshi/tts_500
- Base model: meta-llama/Llama-3.2-3B-Instruct
- Finetuned from: canopylabs/orpheus-3b-0.1-pretrained