greenw0lf committed on
Commit 6e2b018 · 1 Parent(s): a113273

update model card README.md

Files changed (1)
  1. README.md +44 -45
README.md CHANGED
@@ -3,7 +3,7 @@ license: apache-2.0
  tags:
  - generated_from_trainer
  datasets:
- - mozilla-foundation/common_voice_12_0
  metrics:
  - wer
  model-index:
@@ -13,17 +13,15 @@ model-index:
  name: Automatic Speech Recognition
  type: automatic-speech-recognition
  dataset:
- name: common_voice_12_0
- type: common_voice_12_0
  config: fy-NL
- split: test
  args: fy-NL
  metrics:
  - name: Wer
  type: wer
- value: 0.15990775235054105
- language:
- - fy
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,72 +29,73 @@ should probably proofread and complete it, then remove this comment. -->
 
  # wav2vec2-large-xls-r-1b-frisian
 
- This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on the common_voice_12_0 dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.2634
- - WER: 0.1599
 
- This model was developed together with [golesheed](https://huggingface.co/golesheed) for the course "Speech Recognition II" of the "MSc Voice Technology" program at Rijksuniversiteit Groningen - Campus Fryslân.
 
- ## Intended uses & limitations
 
- Intended use is for recognizing Frisian speech.
 
- Limitations include not enough hyperparameter tuning, no LM rescoring, and using v12 of Common Voice instead of v13.
 
  ## Training and evaluation data
 
- Training and evaluation splits used are the ones available in the Common Voice dataset.
 
  ## Training procedure
 
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 8e-05
  - train_batch_size: 16
  - eval_batch_size: 8
  - seed: 42
  - gradient_accumulation_steps: 2
  - total_train_batch_size: 32
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 500
- - num_epochs: 50
  - mixed_precision_training: Native AMP
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss | Wer |
  |:-------------:|:-----:|:----:|:---------------:|:------:|
- | 4.7284 | 2.1 | 250 | 2.9453 | 1.0 |
- | 1.7496 | 4.2 | 500 | 0.5141 | 0.4771 |
- | 0.8168 | 6.3 | 750 | 0.3220 | 0.3148 |
- | 0.7403 | 8.4 | 1000 | 0.2988 | 0.2573 |
- | 0.7298 | 10.5 | 1250 | 0.2794 | 0.2347 |
- | 0.6303 | 12.61 | 1500 | 0.2577 | 0.2164 |
- | 0.5201 | 14.71 | 1750 | 0.2746 | 0.2162 |
- | 0.5189 | 16.81 | 2000 | 0.2543 | 0.2034 |
- | 0.5054 | 18.91 | 2250 | 0.2847 | 0.2071 |
- | 0.5112 | 21.01 | 2500 | 0.2772 | 0.1979 |
- | 0.5105 | 23.11 | 2750 | 0.2633 | 0.1920 |
- | 0.5032 | 25.21 | 3000 | 0.2667 | 0.1856 |
- | 0.46 | 27.31 | 3250 | 0.2730 | 0.1852 |
- | 0.4992 | 29.41 | 3500 | 0.2626 | 0.1782 |
- | 0.4535 | 31.51 | 3750 | 0.2778 | 0.1749 |
- | 0.4036 | 33.61 | 4000 | 0.2825 | 0.1747 |
- | 0.3347 | 35.71 | 4250 | 0.2797 | 0.1708 |
- | 0.2708 | 37.82 | 4500 | 0.2662 | 0.1712 |
- | 0.1825 | 39.92 | 4750 | 0.2652 | 0.1648 |
- | 0.1654 | 42.02 | 5000 | 0.2719 | 0.1628 |
- | 0.1387 | 44.12 | 5250 | 0.2552 | 0.1607 |
- | 0.1367 | 46.22 | 5500 | 0.2641 | 0.1591 |
- | 0.1218 | 48.32 | 5750 | 0.2634 | 0.1598 |
 
 
  ### Framework versions
 
- - Transformers 4.27.3
  - Pytorch 2.0.0+cu117
- - Datasets 2.10.1
- - Tokenizers 0.13.2
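
Both sides of this diff report word error rate (WER) as the evaluation metric. As a reading aid, here is a minimal sketch of how WER is commonly defined: the word-level edit distance between reference and hypothesis, divided by the number of reference words. This is not the evaluation code used for this model, which the commit does not show; the function name `wer` and the whitespace tokenization are assumptions made for illustration.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Illustrative sketch only; splitting on whitespace is an assumption,
    and real evaluation pipelines usually normalize text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition a value such as 0.15 means roughly 15 word-level errors per 100 reference words. Because insertions count as errors, WER can exceed 1.0.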
 
  tags:
  - generated_from_trainer
  datasets:
+ - common_voice_13_0
  metrics:
  - wer
  model-index:
 
  name: Automatic Speech Recognition
  type: automatic-speech-recognition
  dataset:
+ name: common_voice_13_0
+ type: common_voice_13_0
  config: fy-NL
+ split: validation
  args: fy-NL
  metrics:
  - name: Wer
  type: wer
+ value: 0.15077102723494865
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
 
  # wav2vec2-large-xls-r-1b-frisian
 
+ This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on the common_voice_13_0 dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 0.2206
+ - Wer: 0.1508
 
+ ## Model description
 
+ More information needed
 
+ ## Intended uses & limitations
 
+ More information needed
 
  ## Training and evaluation data
 
+ More information needed
 
  ## Training procedure
 
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
+ - learning_rate: 7e-05
  - train_batch_size: 16
  - eval_batch_size: 8
  - seed: 42
  - gradient_accumulation_steps: 2
  - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
  - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 60
  - mixed_precision_training: Native AMP
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss | Wer |
  |:-------------:|:-----:|:----:|:---------------:|:------:|
+ | 4.9606 | 2.45 | 300 | 2.6184 | 1.0 |
+ | 1.4992 | 4.9 | 600 | 0.4233 | 0.4143 |
+ | 0.9757 | 7.35 | 900 | 0.2765 | 0.3021 |
+ | 0.8773 | 9.8 | 1200 | 0.2529 | 0.2528 |
+ | 0.7448 | 12.24 | 1500 | 0.2363 | 0.2258 |
+ | 0.7039 | 14.69 | 1800 | 0.2258 | 0.2103 |
+ | 0.6811 | 17.14 | 2100 | 0.2217 | 0.2074 |
+ | 0.6279 | 19.59 | 2400 | 0.2050 | 0.1915 |
+ | 0.5938 | 22.04 | 2700 | 0.2229 | 0.1922 |
+ | 0.6227 | 24.49 | 3000 | 0.2088 | 0.2019 |
+ | 0.5682 | 26.94 | 3300 | 0.2127 | 0.1874 |
+ | 0.5939 | 29.39 | 3600 | 0.2044 | 0.1789 |
+ | 0.5427 | 31.84 | 3900 | 0.2185 | 0.1791 |
+ | 0.5551 | 34.41 | 4200 | 0.2097 | 0.1644 |
+ | 0.5021 | 36.86 | 4500 | 0.2180 | 0.1678 |
+ | 0.4589 | 39.31 | 4800 | 0.2076 | 0.1581 |
+ | 0.5204 | 41.76 | 5100 | 0.2181 | 0.1587 |
+ | 0.512 | 44.21 | 5400 | 0.2263 | 0.1607 |
+ | 0.465 | 46.66 | 5700 | 0.2204 | 0.1493 |
+ | 0.4482 | 49.11 | 6000 | 0.2143 | 0.1527 |
+ | 0.3972 | 51.63 | 6300 | 0.2198 | 0.1617 |
+ | 0.3168 | 54.09 | 6600 | 0.2170 | 0.1528 |
+ | 0.2432 | 56.53 | 6900 | 0.2182 | 0.1529 |
+ | 0.252 | 58.98 | 7200 | 0.2206 | 0.1508 |
 
 
  ### Framework versions
 
+ - Transformers 4.28.1
  - Pytorch 2.0.0+cu117
+ - Datasets 2.11.0
+ - Tokenizers 0.13.3
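
This commit swaps a fixed warmup (`lr_scheduler_warmup_steps: 500`) for a proportional one (`lr_scheduler_warmup_ratio: 0.1`) while keeping `lr_scheduler_type: linear`. The sketch below shows what a linear schedule with warmup typically looks like: the rate rises linearly from 0 to the peak during warmup, then decays linearly to 0 at the last step. It is plain Python rather than the Trainer's own code. `TOTAL_STEPS` is a hypothetical figure (about 122 optimizer steps per epoch, estimated from the results table, times 60 epochs) because the commit does not state the total, and rounding the warmup length up is likewise an assumption.

```python
import math

PEAK_LR = 7e-05        # learning_rate from the updated card
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the updated card
TOTAL_STEPS = 7320     # hypothetical: not stated in the commit


def warmup_steps_from_ratio(total_steps: int, ratio: float) -> int:
    """Convert a warmup ratio into a step count (rounding up is an assumption)."""
    return math.ceil(total_steps * ratio)


def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    warmup = warmup_steps_from_ratio(TOTAL_STEPS, WARMUP_RATIO)
    # Print the rate at a few points along the run.
    for step in (0, warmup // 2, warmup, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step, TOTAL_STEPS, PEAK_LR, warmup))
```

With a ratio, the warmup length scales with the run length, so changing `num_epochs` (here from 50 to 60) lengthens the warmup without touching a separate step count.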