gorkemgoknar committed on
Commit · b2756ff
1 Parent(s): a250161
Update README.md
README.md CHANGED
@@ -90,6 +90,7 @@ predicted_ids = torch.argmax(logits, dim=-1)
 print("Prediction:", processor.batch_decode(predicted_ids))
 print("Reference:", test_dataset["sentence"][:2])
 ```
+
 ## Evaluation
 The model can be evaluated as follows on the Turkish test data of Common Voice.
 ```python
@@ -103,10 +104,8 @@ wer = load_metric("wer")
 processor = Wav2Vec2Processor.from_pretrained("gorkemgoknar/wav2vec2-large-xlsr-53-turkish")
 model = Wav2Vec2ForCTC.from_pretrained("gorkemgoknar/wav2vec2-large-xlsr-53-turkish")
 model.to("cuda")
-
-
-chars_to_ignore_regex = '[\,\?\.\!\-\;\:\"\“\%\‘\”\�\#\>\<\_\’\[\]\{\}]'
-
+# Note: Not ignoring "'" on this one
+chars_to_ignore_regex = '[\\,\\?\\.\\!\\-\\;\\:\\"\\“\\%\\‘\\”\\�\\#\\>\\<\\_\\’\\[\\]\\{\\}]'
 resampler = torchaudio.transforms.Resample(48_000, 16_000)
 # Preprocessing the datasets.
 # We need to read the aduio files as arrays
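
To illustrate what the added comment means in practice, the sketch below applies the same character class with `re.sub`. It is not part of the README: the `normalize` helper and the sample sentences are hypothetical, and writing the pattern as a raw string is an assumption about intent rather than what the commit does. The point is only that the ASCII apostrophe is absent from the class and so survives, while the curly quotes and other punctuation are stripped.

```python
import re

# Same character class as in the diff, written as a raw string (an assumption)
# so that every backslash reaches the regex engine unchanged.
chars_to_ignore_regex = r'[\,\?\.\!\-\;\:\"\“\%\‘\”\�\#\>\<\_\’\[\]\{\}]'


def normalize(sentence: str) -> str:
    """Hypothetical helper: drop every character matched by the ignore class."""
    return re.sub(chars_to_ignore_regex, "", sentence)


# The straight apostrophe in "Ali'nin" is not in the class, so it is kept.
cleaned = normalize("Ali'nin kitabı, nerede?")
print(cleaned)  # Ali'nin kitabı nerede
```

Keeping the apostrophe matters for Turkish, where it separates proper nouns from their suffixes; whether it should count towards WER is a choice the evaluation script has to make consistently for both predictions and references.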