whisper-small-hi-cv

This model is a fine-tuned version of openai/whisper-small on the Common Voice 15 dataset. It achieves the following results on the evaluation set:

  • Wer: 13.9913
  • Cer: 5.8844

View the results on Kaggle Notebook: https://www.kaggle.com/code/kingabzpro/whisper-hindi-eval

Evaluation

from datasets import load_dataset,load_metric,Audio
from transformers import WhisperForConditionalGeneration, WhisperProcessor
import torch
import torchaudio

test_dataset = load_dataset("mozilla-foundation/common_voice_13_0", "hi", split="test")
wer = load_metric("wer")
cer = load_metric("cer")

processor = WhisperProcessor.from_pretrained("kingabzpro/whisper-small-hi-cv")
model = WhisperForConditionalGeneration.from_pretrained("kingabzpro/whisper-small-hi-cv").to("cuda")
test_dataset = test_dataset.cast_column("audio", Audio(sampling_rate=16000))

def map_to_pred(batch):
    audio = batch["audio"]
    input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
    batch["reference"] = processor.tokenizer._normalize(batch['sentence'])

    with torch.no_grad():
        predicted_ids = model.generate(input_features.to("cuda"))[0]
    transcription = processor.decode(predicted_ids)
    batch["prediction"] = processor.tokenizer._normalize(transcription)
    return batch

result = test_dataset.map(map_to_pred)

print("WER: {:2f}".format(100 * wer.compute(predictions=result["prediction"], references=result["reference"])))
print("CER: {:2f}".format(100 * cer.compute(predictions=result["prediction"], references=result["reference"])))
WER: 23.3824
CER: 10.5288
Downloads last month
11
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for kingabzpro/whisper-small-hi-cv

Finetuned
(2146)
this model

Datasets used to train kingabzpro/whisper-small-hi-cv

Evaluation results