# Whisper Medium Turbo

This is a "turbo" variant of openai/whisper-medium, created by reducing the decoder to 4 layers (following the same approach used for whisper-large-v3-turbo).
## Model Description
- Base model: openai/whisper-medium
- Parameters: 427.97M
- Decoder layers: 4
- Encoder layers: 24 (unchanged)
## Architecture
| Component | Layers |
|---|---|
| Encoder | 24 |
| Decoder | 4 |
## Usage

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("mekpro/whisper-medium-turbo")
model = WhisperForConditionalGeneration.from_pretrained("mekpro/whisper-medium-turbo")

# Transcribe audio: `audio` should be a 16 kHz mono waveform (e.g. a NumPy array)
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
```
## Creation Method
This model was created by:

1. Loading the original openai/whisper-medium model
2. Creating a new model with `decoder_layers=4`
3. Copying the encoder weights unchanged
4. Copying the first 4 decoder layers
5. Copying the embeddings and layer norms
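The steps above can be sketched with the `transformers` Whisper classes. This is a minimal illustration, not the actual creation script: a tiny random config stands in for openai/whisper-medium so the example runs without downloading weights, and the decoder is truncated to 2 layers rather than 4.

```python
from transformers import WhisperConfig, WhisperForConditionalGeneration

# Stand-in for openai/whisper-medium (the real model uses d_model=1024
# and 24 encoder / 24 decoder layers); tiny dims keep this cheap to run.
src_cfg = WhisperConfig(
    d_model=64, encoder_layers=4, decoder_layers=4,
    encoder_attention_heads=2, decoder_attention_heads=2,
    encoder_ffn_dim=128, decoder_ffn_dim=128,
)
src = WhisperForConditionalGeneration(src_cfg)

# New model whose config is identical except for a truncated decoder.
dst_cfg = WhisperConfig.from_dict({**src_cfg.to_dict(), "decoder_layers": 2})
dst = WhisperForConditionalGeneration(dst_cfg)

# Copy the encoder weights unchanged.
dst.model.encoder.load_state_dict(src.model.encoder.state_dict())

# Copy the first N decoder layers from the source model.
for i, layer in enumerate(dst.model.decoder.layers):
    layer.load_state_dict(src.model.decoder.layers[i].state_dict())

# Copy the decoder embeddings and final layer norm. Whisper ties
# proj_out to embed_tokens, so the output projection follows along.
dst.model.decoder.embed_tokens.load_state_dict(src.model.decoder.embed_tokens.state_dict())
dst.model.decoder.embed_positions.load_state_dict(src.model.decoder.embed_positions.state_dict())
dst.model.decoder.layer_norm.load_state_dict(src.model.decoder.layer_norm.state_dict())
```

Swapping in the real checkpoint (`WhisperForConditionalGeneration.from_pretrained("openai/whisper-medium")` as `src`, with `decoder_layers=4`) gives the procedure described above.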
Note: This model has not been fine-tuned after pruning. For best results, consider fine-tuning on your target domain.
## Created With

```shell
python create_whisper_turbo.py --model openai/whisper-medium
```