Fine-tuning Whisper in more than one language
Suppose I have a dataset in two or more languages (one of them under-represented in Whisper's pre-trained models), and I want to fine-tune on all of those languages so that the model stays multilingual and avoids catastrophic forgetting. Is this kind of fine-tuning possible?
Can I define the tokenizer and the processor without indicating the language?
Hi! I am looking at a similar scenario. In case you managed to find a solution, would you be able to share it? :)
Hi @bmichele!
I've found some sort of solution on this thread:
https://huggingface.co/spaces/openai/whisper/discussions/6#643d8bc551e2958ef6cd69ef
However, I'm still wondering which is the best strategy:
- I've tried fine-tuning sequentially, but the results get worse with each fine-tuning cycle.
- I've tried fine-tuning in a multilingual way, omitting the "lang" label in the tokenizer and in the processor and relying on Whisper's ability to detect the language, and the results are promising.
- Finally, I've tried fine-tuning in a multilingual way, setting the lang label as described in the discussion. However, the results are not as expected, so I'm wondering if I did something wrong.
It would be nice if someone else tried this approach to confirm my results :)
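For anyone who wants to reproduce the third approach (language label set per sample), here is a minimal sketch of how the labels could be prepared. It assumes a `transformers` `WhisperTokenizer` loaded without a fixed language, and a dataset whose rows have `sentence` and `language` columns; those column names and the `add_labels` helper are hypothetical, not something taken from the linked discussion.

```python
def add_labels(example, tokenizer, task="transcribe"):
    """Tokenize one transcript using the prefix tokens for that row's language.

    Assumptions (not from the thread): `tokenizer` is a WhisperTokenizer
    loaded without a `language` argument, and each row carries a
    `language` value the tokenizer accepts (e.g. "spanish") plus the
    transcript text under `sentence`.
    """
    # Switch the language/task prefix before encoding this row, so that
    # each sample gets the language token matching its own language.
    tokenizer.set_prefix_tokens(language=example["language"], task=task)

    # Return a new dict so the input row is left untouched.
    out = dict(example)
    out["labels"] = tokenizer(example["sentence"]).input_ids
    return out
```

With a Hugging Face `datasets` object this would typically be applied as `dataset.map(add_labels, fn_kwargs={"tokenizer": tokenizer})`. For the language-agnostic variant (second bullet), the same tokenizer would simply be used without ever calling `set_prefix_tokens` with a language, so no language token is added to the labels.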
@andrespm hey, any update on this? Over the past months, has anyone found a solution that gets the model to converge to a good WER on multiple languages?
Any updates on this thread would be helpful. I would also like to know how to improve the translation task along with transcription.
I was also wondering if this same multi-language approach works for LoRA fine-tuning. I tried LoRA, and the language-agnostic approach gave me lower accuracy.