ubaada committed
Commit e7c1c98 · verified · 1 Parent(s): e79a50f

Update README.md

Files changed (1):
  1. README.md +16 -3
README.md CHANGED
@@ -8,9 +8,9 @@ language:
 pipeline_tag: text2text-generation
 ---
 
-This is a huggingface port of the [PyTorch implementation of the original transformer](https://github.com/ubaada/scratch-transformer) model from 2017 introduced in the paper "[Attention Is All You Need](https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf)". This is the 65M parameter base model version trained to do English-to-German translations.
+This is a custom huggingface model port of the [PyTorch implementation of the original transformer](https://github.com/ubaada/scratch-transformer) model from 2017 introduced in the paper "[Attention Is All You Need](https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf)". This is the 65M parameter base model version trained to do English-to-German translations.
 
-Usage:
+## Usage:
 ```python
 model = AutoModel.from_pretrained("ubaada/original-transformer", trust_remote_code=True)
 tokenizer = AutoTokenizer.from_pretrained("ubaada/original-transformer")
@@ -18,4 +18,17 @@ text = 'This is my cat'
 output = model.generate(**tokenizer(text, return_tensors="pt", add_special_tokens=True, truncation=True, max_length=100))
 tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
 # Output: ' Das ist meine Katze.'
-```
+```
+(remember `trust_remote_code=True` because of the custom modeling file)
+## Training:
+| Parameter            | Value |
+|----------------------|-------|
+| Dataset              | WMT14-de-en |
+| Translation Pairs    | 4.5M (83M tokens total) |
+| Epochs               | 25 |
+| Batch Size           | 16 |
+| Accumulation Batch   | 8 |
+| Effective Batch Size | 128 (16 * 8) |
+| Training Script      | [train.py](https://github.com/ubaada/scratch-transformer/blob/main/train.py) |
+| Optimiser            | Adam (learning rate = 0.0001) |
+
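For anyone trying the snippet outside the README, a complete version might look like the sketch below. The import line is the standard `transformers` API and the `text` variable comes from the hunk context above; neither appears verbatim in the diff body, so treat them as filled-in assumptions:

```python
# Complete version of the README usage snippet. The import is the standard
# transformers API; `text` is taken from the diff's hunk context line.
from transformers import AutoModel, AutoTokenizer

# trust_remote_code=True is required because the repo ships a custom modeling file.
model = AutoModel.from_pretrained("ubaada/original-transformer", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ubaada/original-transformer")

text = 'This is my cat'
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=True,
                   truncation=True, max_length=100)
output = model.generate(**inputs)
print(tokenizer.decode(output[0], skip_special_tokens=True,
                       clean_up_tokenization_spaces=True))
# Output: ' Das ist meine Katze.'
```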
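The training table implies gradient accumulation: a micro-batch of 16 with 8 accumulation steps gives one optimiser update per 128 examples. Below is a minimal, self-contained PyTorch sketch of that pattern; the dummy linear model and random data are illustrative stand-ins, not the repo's train.py:

```python
import torch
import torch.nn as nn

# Gradient accumulation matching the table: micro-batch 16, 8 accumulation
# steps, so each optimiser update averages gradients over 128 examples.
BATCH, ACCUM_STEPS = 16, 8

model = nn.Linear(10, 2)  # stand-in for the transformer (illustrative only)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, lr = 0.0001

for step in range(32):  # 32 micro-batches -> 4 optimiser updates
    x = torch.randn(BATCH, 10)            # dummy micro-batch of 16 examples
    y = torch.randint(0, 2, (BATCH,))
    loss = nn.functional.cross_entropy(model(x), y)
    (loss / ACCUM_STEPS).backward()       # scale so summed grads average over 128
    if (step + 1) % ACCUM_STEPS == 0:
        optimizer.step()                  # one update per 8 micro-batches
        optimizer.zero_grad()
```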