Whisper Medium Turbo

This is a "turbo" variant of openai/whisper-medium, created by reducing the number of decoder layers from 24 to 4 (following the same approach used for whisper-large-v3-turbo).

Model Description

  • Base model: openai/whisper-medium
  • Parameters: 427.97M
  • Decoder layers: 4
  • Encoder layers: 24 (unchanged)

Architecture

Component   Layers
Encoder     24
Decoder     4

Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("mekpro/whisper-medium-turbo")
model = WhisperForConditionalGeneration.from_pretrained("mekpro/whisper-medium-turbo")

# Transcribe audio (`audio` is a 16 kHz mono waveform as a float NumPy array)
# input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
# predicted_ids = model.generate(input_features)
# transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)  # list of strings
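The processor expects a 16 kHz mono float waveform. A minimal sketch of preparing one, assuming a 44.1 kHz stereo int16 recording is already loaded as a NumPy array (the function name and rates are illustrative, not part of this model's API):

```python
import numpy as np

def prepare_waveform(pcm: np.ndarray, src_rate: int, target_rate: int = 16000) -> np.ndarray:
    """Downmix to mono, scale int16 PCM to [-1, 1], and resample."""
    if pcm.ndim == 2:                          # (samples, channels) -> mono
        pcm = pcm.mean(axis=1)
    audio = pcm.astype(np.float32) / 32768.0   # int16 full scale
    n_out = int(round(len(audio) * target_rate / src_rate))
    # Linear interpolation is a rough stand-in for a proper resampler
    # (use librosa or torchaudio for production-quality resampling).
    src_t = np.linspace(0.0, 1.0, num=len(audio), endpoint=False)
    dst_t = np.linspace(0.0, 1.0, num=n_out, endpoint=False)
    return np.interp(dst_t, src_t, audio).astype(np.float32)
```

`prepare_waveform(pcm, 44100)` then yields the `audio` array used in the commented lines above.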

Creation Method

This model was created by:

  1. Loading the original openai/whisper-medium model
  2. Creating a new model with decoder_layers=4
  3. Copying encoder weights (unchanged)
  4. Copying first 4 decoder layers
  5. Copying embeddings and layer norms

Note: This model has not been fine-tuned after pruning. For best results, consider fine-tuning on your target domain.

Created With

python create_whisper_turbo.py --model openai/whisper-medium