Whisper large-v3 fine-tuned with LoRA on the Hindi portion of the Common Voice 13.0 dataset
import torch
from peft import PeftModel, PeftConfig
from transformers import WhisperForConditionalGeneration, WhisperProcessor, pipeline

peft_model_id = "kasunw/whisper-large-v3-hindi"
peft_config = PeftConfig.from_pretrained(peft_model_id)

# Use half precision on GPU and full precision on CPU.
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the base Whisper model, then attach the LoRA adapter weights.
model = WhisperForConditionalGeneration.from_pretrained(
    peft_config.base_model_name_or_path, device_map="auto", torch_dtype=torch_dtype
)
model = PeftModel.from_pretrained(model, peft_model_id)
model.config.use_cache = True

processor = WhisperProcessor.from_pretrained(peft_config.base_model_name_or_path, language="Hindi", task="transcribe")

# No device argument is passed: device_map="auto" has already placed the model.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
)

path_to_audio = "audio.mp3"
result = pipe(path_to_audio)
print(result["text"])
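Because the pipeline is created with return_timestamps=True, the result should also carry a "chunks" list next to "text", where each entry holds a (start, end) timestamp tuple in seconds and the transcribed segment. The sketch below shows one way to print those segments. The helper name format_chunks and the sample dictionary are hypothetical and only illustrate the expected shape; they are not real model output.

```python
def format_chunks(result):
    """Return one '[start - end] text' line per timestamped chunk."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        # A timestamp can be None, for example the end of the final chunk
        # when the audio is cut off mid-segment.
        start_s = f"{start:.2f}" if start is not None else "?"
        end_s = f"{end:.2f}" if end is not None else "?"
        lines.append(f"[{start_s} - {end_s}] {chunk['text'].strip()}")
    return lines


# Hypothetical sample shaped like the pipeline output (not real model output).
sample = {
    "text": " नमस्ते दुनिया",
    "chunks": [
        {"timestamp": (0.0, 1.5), "text": " नमस्ते"},
        {"timestamp": (1.5, None), "text": " दुनिया"},
    ],
}

for line in format_chunks(sample):
    print(line)
```

With the pipeline above, the same helper would be called as format_chunks(result).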
Training data: the Hindi portion of Common Voice 13.0 (mozilla-foundation/common_voice_13_0).
Training followed the instructions given in this notebook.