---
license: apache-2.0
language:
- en
- is
library_name: fairseq
tags:
- translation
- wmt
---

## Model description
This is a translation model which translates text from English to Icelandic. It follows the architecture of the transformer model described in [Attention Is All You Need](https://arxiv.org/pdf/1706.03762) and was trained with [fairseq](https://github.com/facebookresearch/fairseq) for [WMT24](https://www2.statmt.org/wmt24/).

This is the base version of our model. See also: [wmt24-en-is-transformer-base](https://huggingface.co/arnastofnun/wmt24-en-is-transformer-base), [wmt24-en-is-transformer-base-deep](https://huggingface.co/arnastofnun/wmt24-en-is-transformer-base-deep), [wmt24-en-is-transformer-big-deep](https://huggingface.co/arnastofnun/wmt24-en-is-transformer-big-deep).

| model     | d_model | d_ff | h  | N_enc | N_dec |
|:----------|:--------|:-----|:---|:------|:------|
| Base      | 512     | 2048 | 8  | 6     | 6     |
| Base_deep | 512     | 2048 | 8  | 36    | 12    |
| Big       | 1024    | 4096 | 16 | 6     | 6     |
| Big_deep  | 1024    | 4096 | 16 | 36    | 12    |

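The columns above map directly onto fairseq's transformer hyperparameters. As a sketch only (the exact training command is not published in this card), the Base row could be expressed with `fairseq-train` flags like this; the data path and optimization flags are illustrative placeholders:

```shell
# Hypothetical fairseq-train invocation for the Base row above
# (d_model=512, d_ff=2048, h=8, N_enc=N_dec=6).
# Data path and optimizer settings are illustrative, not the authors' setup.
fairseq-train data-bin/wmt24.en-is \
    --arch transformer \
    --encoder-embed-dim 512 --decoder-embed-dim 512 \
    --encoder-ffn-embed-dim 2048 --decoder-ffn-embed-dim 2048 \
    --encoder-attention-heads 8 --decoder-attention-heads 8 \
    --encoder-layers 6 --decoder-layers 6 \
    --optimizer adam --lr 0.0005 --max-tokens 4096
```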
#### How to use

```python
from fairseq.models.transformer import TransformerModel

TRANSLATION_MODEL_NAME = 'checkpoint_best.pt'
# from_pretrained returns a hub interface that bundles the model
# with its SentencePiece tokenizer
TRANSLATION_MODEL = TransformerModel.from_pretrained(
    'path/to/model',
    checkpoint_file=TRANSLATION_MODEL_NAME,
    bpe='sentencepiece',
    sentencepiece_model='sentencepiece.bpe.model',
)

src_sentences = ['This is a test sentence.', 'This is another test sentence.']
# The hub interface exposes translate() directly; beam sets the beam size
translated_sentences = TRANSLATION_MODEL.translate(src_sentences, beam=5)
print(translated_sentences)
```

## Eval results
We evaluated our models on the [WMT21 test set](https://github.com/wmt-conference/wmt21-news-systems/). These are the chrF scores for our published models:

| model     | chrF |
|:----------|:-----|
| Base      | 56.8 |
| Base_deep | 57.1 |
| Big       | 57.7 |
| Big_deep  | 57.7 |

## BibTeX entry and citation info

```bibtex
@inproceedings{...,
  year={XXX},
  title={XXX},
  author={XXX},
  booktitle={XXX},
}
```