Update README.md
README.md
CHANGED
@@ -30,14 +30,25 @@ tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spac
 |----------------------|-------------------------------------------------------------------------------------------------|
 | Dataset | WMT14-de-en |
 | Translation Pairs | 4.5M (83M tokens total) |
-| Epochs |
+| Epochs | 24 |
 | Batch Size | 16 |
 | Accumulation Batch | 8 |
 | Effective Batch Size | 128 (16 * 8) |
 | Training Script | [train.py](https://github.com/ubaada/scratch-transformer/blob/main/train.py) |
 | Optimiser | Adam (learning rate = 0.0001) |
 | Loss Type | Cross Entropy |
-| Final Test Loss | 1.
+| Final Test Loss | 1.87 |
 | GPU. | RTX 4070 (12GB) |
 
+<p align="center" style="width:500px;">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/62a7d1e152aa8695f9209345/0p4eEHiYFaeaibjk_Rf1y.png" />
+</p>
+
+
+## Results
+
+<p align="center" style="width:500px;">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/62a7d1e152aa8695f9209345/Gip1Ox-M1_z3qdafGGh3-.png" />
+</p>
+
 
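
The table lists a batch size of 16, 8 accumulation steps, and an effective batch size of 128 (16 * 8). The sketch below illustrates why those numbers combine that way. It is not taken from `train.py`, which may do this differently. It uses a toy one-parameter model and mean squared error in place of the transformer and cross entropy, and every name in it (`grad_mse`, `accumulated_grad`, the constants) is hypothetical. It assumes each micro-batch loss is a mean and that each micro-batch gradient is scaled by `1 / accumulation_steps` before summing.

```python
import random


def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of the toy model y = w * x, w.r.t. w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size, accumulation_steps):
    """Sum micro-batch gradients, each scaled by 1 / accumulation_steps."""
    total = 0.0
    for step in range(accumulation_steps):
        start = step * micro_batch_size
        stop = start + micro_batch_size
        total += grad_mse(w, xs[start:stop], ys[start:stop]) / accumulation_steps
    return total


# Values from the table; the constant names are illustrative only.
BATCH_SIZE = 16
ACCUMULATION_STEPS = 8
EFFECTIVE_BATCH_SIZE = BATCH_SIZE * ACCUMULATION_STEPS  # 16 * 8 = 128

# Synthetic data standing in for one effective batch.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(EFFECTIVE_BATCH_SIZE)]
ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]

w = 0.5
full = grad_mse(w, xs, ys)  # gradient over all 128 examples at once
accum = accumulated_grad(w, xs, ys, BATCH_SIZE, ACCUMULATION_STEPS)

# The two gradients agree up to floating-point rounding.
print(abs(full - accum) < 1e-9)
```

Under these assumptions, one optimiser step after 8 accumulated micro-batches of 16 uses the same gradient as a single batch of 128, while only 16 examples need to fit in GPU memory at a time.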