ubaada/original-transformer

Text2Text Generation

original_transformer


ubaada committed on Nov 10, 2024

Commit 09ad330 · verified · 1 Parent(s): e7c1c98

Update README.md

Files changed (1)

README.md +5 -1


@@ -19,7 +19,7 @@ output = model.generate(**tokenizer(text, return_tensors="pt", add_special_token
 tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
 # Output: ' Das ist meine Katze.'
 ```
-(remember the `trust_remote_code=True` because of custom modeling fiel)
+(remember the `trust_remote_code=True` because of custom modeling file)
 ## Training:
 | Parameter            | Value                                                                                           |
 |----------------------|-------------------------------------------------------------------------------------------------|
@@ -31,4 +31,8 @@ tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spac
 | Effective Batch Size | 128 (16 * 8)                                                                                    |
 | Training Script      | [train.py](https://github.com/ubaada/scratch-transformer/blob/main/train.py)             |
 | Optimiser            | Adam (learning rate = 0.0001)                                                                   |
+| Loss Type            | Cross Entropy |
+| Final Test Loss      | 1.9 |
+| GPU.                 | RTX 4070 (12GB) |
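
The numbers in the Training table can be sanity-checked with a little arithmetic. The sketch below is illustrative only and rests on two assumptions the README does not state: that `16 * 8` means a per-step batch of 16 with 8 gradient-accumulation steps, and that the reported cross-entropy loss is a mean over tokens in nats (natural log), in which case it converts to perplexity via `exp(loss)`. The variable names are hypothetical.

```python
import math

# Values copied from the Training table in the diff above.
micro_batch_size = 16    # assumption: examples per forward/backward pass
accumulation_steps = 8   # assumption: gradient-accumulation steps per update
final_test_loss = 1.9    # assumption: mean per-token cross entropy in nats

# Effective batch size is the product of the two factors.
effective_batch_size = micro_batch_size * accumulation_steps

# Perplexity is the exponential of the mean natural-log cross entropy.
perplexity = math.exp(final_test_loss)

print(effective_batch_size)   # 128
print(round(perplexity, 2))   # 6.69
```

If the loss were instead summed per sequence or measured in bits, the perplexity conversion above would not apply as written.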