---
license: apache-2.0
datasets:
- wmt/wmt14
language:
- en
- de
---
```python
from tokenizers import Tokenizer
tok = Tokenizer.from_pretrained("llm-scratch/wmt-14-en-de-tok")
```