---
license: apache-2.0
language:
- en
- is
library_name: fairseq
tags:
- translation
- wmt
---

## Model description
This is a translation model which translates text from English to Icelandic. It follows the architecture of the transformer model described in [Attention Is All You Need](https://arxiv.org/pdf/1706.03762) and was trained with [fairseq](https://github.com/facebookresearch/fairseq) for [WMT24](https://www2.statmt.org/wmt24/).

This is the base version of our model. See also: [wmt24-en-is-transformer-base](https://huggingface.co/arnastofnun/wmt24-en-is-transformer-base), [wmt24-en-is-transformer-base-deep](https://huggingface.co/arnastofnun/wmt24-en-is-transformer-base-deep), [wmt24-en-is-transformer-big-deep](https://huggingface.co/arnastofnun/wmt24-en-is-transformer-big-deep).

| model     | d_model | d_ff | h  | N_enc | N_dec |
|:----------|:--------|:-----|:---|:------|:------|
| Base      | 512     | 2048 | 8  | 6     | 6     |
| Base_deep | 512     | 2048 | 8  | 36    | 12    |
| Big       | 1024    | 4096 | 16 | 6     | 6     |
| Big_deep  | 1024    | 4096 | 16 | 36    | 12    |

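The columns above map directly onto fairseq's transformer hyperparameters. As a sketch only (the exact training command is not published in this card), the Base row could be expressed with `fairseq-train` flags like this; the data path and optimization flags are illustrative placeholders:

```shell
# Hypothetical fairseq-train invocation for the Base row above
# (d_model=512, d_ff=2048, h=8, N_enc=N_dec=6).
# Data path and optimizer settings are illustrative, not the authors' setup.
fairseq-train data-bin/wmt24.en-is \
    --arch transformer \
    --encoder-embed-dim 512 --decoder-embed-dim 512 \
    --encoder-ffn-embed-dim 2048 --decoder-ffn-embed-dim 2048 \
    --encoder-attention-heads 8 --decoder-attention-heads 8 \
    --encoder-layers 6 --decoder-layers 6 \
    --optimizer adam --lr 0.0005 --max-tokens 4096
```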
#### How to use

```python
from fairseq.models.transformer import TransformerModel

TRANSLATION_MODEL_NAME = 'checkpoint_best.pt'
# from_pretrained returns a hub interface that bundles the model
# with its SentencePiece tokenizer
TRANSLATION_MODEL = TransformerModel.from_pretrained(
    'path/to/model',
    checkpoint_file=TRANSLATION_MODEL_NAME,
    bpe='sentencepiece',
    sentencepiece_model='sentencepiece.bpe.model',
)

src_sentences = ['This is a test sentence.', 'This is another test sentence.']
# The hub interface exposes translate() directly; beam sets the beam size
translated_sentences = TRANSLATION_MODEL.translate(src_sentences, beam=5)
print(translated_sentences)
```

## Eval results
We evaluated our models on the [WMT21 test set](https://github.com/wmt-conference/wmt21-news-systems/). These are the chrF scores for our published models:

| model     | chrF |
|:----------|:-----|
| Base      | 56.8 |
| Base_deep | 57.1 |
| Big       | 57.7 |
| Big_deep  | 57.7 |

## BibTeX entry and citation info

```bibtex
@inproceedings{...,
  year={XXX},
  title={XXX},
  author={XXX},
  booktitle={XXX},
}
```