---
license: mit
language:
- de
metrics:
- wer
pipeline_tag: automatic-speech-recognition
---

## Model Description

Whisper-tiny fine-tuned on the SwissDial-ZH dataset to transcribe Swiss German dialects.

## Model Details

- **Model Name**: nizarmichaud/whisper-tiny-swiss-german
- **Base Model**: Whisper-tiny-v3
- **Dataset**: SwissDial-ZH (8 Swiss German dialects): https://mtc.ethz.ch/publications/open-source/swiss-dial.html
- **Languages**: Swiss German

## Training

- **Duration**: 4 hours
- **Hardware**: NVIDIA RTX 3080
- **Batch Size**: 32
- **Train/Test Split**: 90%/10% (specific sentence selection)

## Performance

- **WER**: ~37% on test set

## Usage

```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the fine-tuned model and its matching processor
model_name = "nizarmichaud/whisper-tiny-swiss-german"
model = WhisperForConditionalGeneration.from_pretrained(model_name)
processor = WhisperProcessor.from_pretrained(model_name)

# Your audio input here: a 1-D float array of mono audio sampled at 16 kHz
audio_input = ...

# Convert the raw waveform into log-mel input features
inputs = processor(audio_input, sampling_rate=16000, return_tensors="pt")

# Generate token IDs and decode them into text
generated_ids = model.generate(inputs["input_features"])
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(transcription)
```
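
The WER listed under Performance is the word error rate: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below shows one plain-Python way to compute it. The function name `word_error_rate` and the example sentences are hypothetical, and this is not necessarily the evaluation script or text normalization used to obtain the reported figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of four reference words is dropped.
example_wer = word_error_rate("das ist ein test", "das ist test")
print(f"WER: {example_wer:.2%}")
```

In practice, reference and hypothesis are usually normalized the same way (lowercasing, punctuation removal) before scoring, since that choice can shift the reported WER noticeably.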