greenw0lf/wav2vec2-large-xls-r-1b-frisian

@@ -3,7 +3,7 @@ license: apache-2.0
 tags:
 - generated_from_trainer
 datasets:
-- mozilla-foundation/common_voice_12_0
 metrics:
 - wer
 model-index:
@@ -13,17 +13,15 @@ model-index:
       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: common_voice_12_0
-      type: common_voice_12_0
       config: fy-NL
-      split: test
       args: fy-NL
     metrics:
     - name: Wer
       type: wer
-      value: 0.15990775235054105
-language:
-- fy
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,72 +29,73 @@ should probably proofread and complete it, then remove this comment. -->
 # wav2vec2-large-xls-r-1b-frisian
-This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on the common_voice_12_0 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2634
-- WER: 0.1599
-This model was developed together with [golesheed](https://huggingface.co/golesheed) for the course "Speech Recognition II" of the "MSc Voice Technology" program at Rijksuniversiteit Groningen - Campus Fryslân.
-## Intended uses & limitations
-Intended use is for recognizing Frisian speech.
-Limitations include not enough hyperparameter tuning, no LM rescoring, and using v12 of Common Voice instead of v13.
 ## Training and evaluation data
-Training and evaluation splits used are the ones available in the Common Voice dataset.
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 8e-05
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 32
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 500
-- num_epochs: 50
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Wer    |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
-| 4.7284        | 2.1   | 250  | 2.9453          | 1.0    |
-| 1.7496        | 4.2   | 500  | 0.5141          | 0.4771 |
-| 0.8168        | 6.3   | 750  | 0.3220          | 0.3148 |
-| 0.7403        | 8.4   | 1000 | 0.2988          | 0.2573 |
-| 0.7298        | 10.5  | 1250 | 0.2794          | 0.2347 |
-| 0.6303        | 12.61 | 1500 | 0.2577          | 0.2164 |
-| 0.5201        | 14.71 | 1750 | 0.2746          | 0.2162 |
-| 0.5189        | 16.81 | 2000 | 0.2543          | 0.2034 |
-| 0.5054        | 18.91 | 2250 | 0.2847          | 0.2071 |
-| 0.5112        | 21.01 | 2500 | 0.2772          | 0.1979 |
-| 0.5105        | 23.11 | 2750 | 0.2633          | 0.1920 |
-| 0.5032        | 25.21 | 3000 | 0.2667          | 0.1856 |
-| 0.46          | 27.31 | 3250 | 0.2730          | 0.1852 |
-| 0.4992        | 29.41 | 3500 | 0.2626          | 0.1782 |
-| 0.4535        | 31.51 | 3750 | 0.2778          | 0.1749 |
-| 0.4036        | 33.61 | 4000 | 0.2825          | 0.1747 |
-| 0.3347        | 35.71 | 4250 | 0.2797          | 0.1708 |
-| 0.2708        | 37.82 | 4500 | 0.2662          | 0.1712 |
-| 0.1825        | 39.92 | 4750 | 0.2652          | 0.1648 |
-| 0.1654        | 42.02 | 5000 | 0.2719          | 0.1628 |
-| 0.1387        | 44.12 | 5250 | 0.2552          | 0.1607 |
-| 0.1367        | 46.22 | 5500 | 0.2641          | 0.1591 |
-| 0.1218        | 48.32 | 5750 | 0.2634          | 0.1598 |
 ### Framework versions
-- Transformers 4.27.3
 - Pytorch 2.0.0+cu117
-- Datasets 2.10.1
-- Tokenizers 0.13.2

 tags:
 - generated_from_trainer
 datasets:
+- common_voice_13_0
 metrics:
 - wer
 model-index:
       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
+      name: common_voice_13_0
+      type: common_voice_13_0
       config: fy-NL
+      split: validation
       args: fy-NL
     metrics:
     - name: Wer
       type: wer
+      value: 0.15077102723494865
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # wav2vec2-large-xls-r-1b-frisian
+This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on the common_voice_13_0 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.2206
+- Wer: 0.1508
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
 ## Training and evaluation data
+More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 7e-05
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 32
+- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 60
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Wer    |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
+| 4.9606        | 2.45  | 300  | 2.6184          | 1.0    |
+| 1.4992        | 4.9   | 600  | 0.4233          | 0.4143 |
+| 0.9757        | 7.35  | 900  | 0.2765          | 0.3021 |
+| 0.8773        | 9.8   | 1200 | 0.2529          | 0.2528 |
+| 0.7448        | 12.24 | 1500 | 0.2363          | 0.2258 |
+| 0.7039        | 14.69 | 1800 | 0.2258          | 0.2103 |
+| 0.6811        | 17.14 | 2100 | 0.2217          | 0.2074 |
+| 0.6279        | 19.59 | 2400 | 0.2050          | 0.1915 |
+| 0.5938        | 22.04 | 2700 | 0.2229          | 0.1922 |
+| 0.6227        | 24.49 | 3000 | 0.2088          | 0.2019 |
+| 0.5682        | 26.94 | 3300 | 0.2127          | 0.1874 |
+| 0.5939        | 29.39 | 3600 | 0.2044          | 0.1789 |
+| 0.5427        | 31.84 | 3900 | 0.2185          | 0.1791 |
+| 0.5551        | 34.41 | 4200 | 0.2097          | 0.1644 |
+| 0.5021        | 36.86 | 4500 | 0.2180          | 0.1678 |
+| 0.4589        | 39.31 | 4800 | 0.2076          | 0.1581 |
+| 0.5204        | 41.76 | 5100 | 0.2181          | 0.1587 |
+| 0.512         | 44.21 | 5400 | 0.2263          | 0.1607 |
+| 0.465         | 46.66 | 5700 | 0.2204          | 0.1493 |
+| 0.4482        | 49.11 | 6000 | 0.2143          | 0.1527 |
+| 0.3972        | 51.63 | 6300 | 0.2198          | 0.1617 |
+| 0.3168        | 54.09 | 6600 | 0.2170          | 0.1528 |
+| 0.2432        | 56.53 | 6900 | 0.2182          | 0.1529 |
+| 0.252         | 58.98 | 7200 | 0.2206          | 0.1508 |
 ### Framework versions
+- Transformers 4.28.1
 - Pytorch 2.0.0+cu117
+- Datasets 2.11.0
+- Tokenizers 0.13.3
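
The updated training-results table can be sanity-checked with a short script. The sketch below is not part of the commit: it copies the (step, validation loss, WER) values from the added table rows and looks up the checkpoints with the lowest WER and lowest validation loss. The names `ROWS` and `best_by` are illustrative only.

```python
# Rows copied from the added ("+") training-results table:
# (step, validation_loss, wer)
ROWS = [
    (300, 2.6184, 1.0),
    (600, 0.4233, 0.4143),
    (900, 0.2765, 0.3021),
    (1200, 0.2529, 0.2528),
    (1500, 0.2363, 0.2258),
    (1800, 0.2258, 0.2103),
    (2100, 0.2217, 0.2074),
    (2400, 0.2050, 0.1915),
    (2700, 0.2229, 0.1922),
    (3000, 0.2088, 0.2019),
    (3300, 0.2127, 0.1874),
    (3600, 0.2044, 0.1789),
    (3900, 0.2185, 0.1791),
    (4200, 0.2097, 0.1644),
    (4500, 0.2180, 0.1678),
    (4800, 0.2076, 0.1581),
    (5100, 0.2181, 0.1587),
    (5400, 0.2263, 0.1607),
    (5700, 0.2204, 0.1493),
    (6000, 0.2143, 0.1527),
    (6300, 0.2198, 0.1617),
    (6600, 0.2170, 0.1528),
    (6900, 0.2182, 0.1529),
    (7200, 0.2206, 0.1508),
]


def best_by(rows, column):
    """Return the row with the smallest value at the given tuple index."""
    return min(rows, key=lambda row: row[column])


final = ROWS[-1]              # last logged evaluation
best_wer = best_by(ROWS, 2)   # lowest word error rate
best_loss = best_by(ROWS, 1)  # lowest validation loss

print(f"final:     step {final[0]}, loss {final[1]:.4f}, WER {final[2]:.4f}")
print(f"best WER:  step {best_wer[0]}, loss {best_wer[1]:.4f}, WER {best_wer[2]:.4f}")
print(f"best loss: step {best_loss[0]}, loss {best_loss[1]:.4f}, WER {best_loss[2]:.4f}")
```

The values reported in the card (Loss 0.2206, Wer 0.1508) match the last logged row at step 7200, while the lowest logged WER is 0.1493 at step 5700. The diff does not say which checkpoint was uploaded. Note also that the old figure (0.1599) was measured on the `test` split of Common Voice 12 and the new one on the `validation` split of Common Voice 13, so the two numbers are not directly comparable.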