mwizakunda
committed on
Commit
·
4ea7b32
1
Parent(s):
dfba56e
Add information about T5
README.md
CHANGED
@@ -11,7 +11,9 @@ Through HuggingFace Optimum, Graphcore released ready-to-use IPU-trained model c
## Model description
-Multilingual Text-to-Text Transfer Transformer (mT5) is the multilingual variant of [T5](https://arxiv.org/abs/1910.10683). mT5 is
+Multilingual Text-to-Text Transfer Transformer (mT5) is the multilingual variant of [T5](https://arxiv.org/abs/1910.10683). T5 is a Transformer-based model that uses a text-to-text approach for translation, question answering, and classification. It introduces a unified framework that converts all text-based language problems into a text-to-text format for transfer learning in NLP. This allows the same model, loss function, and hyperparameters to be used across a diverse set of tasks.
+
+mT5 is pretrained on the mC4 corpus, covering 101 languages:
Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu.
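
The text-to-text framing described in the model description can be illustrated with a minimal sketch. It uses only the Python standard library and does not load any model. The class `TextToTextExample` and the helper `to_text_to_text` are hypothetical names introduced here for illustration. The task prefixes follow the convention used in the T5 paper and are an assumption for this model: mT5 checkpoints are pretrained on mC4 only, so a prefix has meaning only after fine-tuning on the corresponding task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextToTextExample:
    """One training pair in which both input and output are plain strings."""
    source: str
    target: str


def to_text_to_text(prefix: str, text: str, target: str) -> TextToTextExample:
    # Hypothetical helper: prepend a task prefix so that a single
    # sequence-to-sequence model can tell the tasks apart from the input alone.
    return TextToTextExample(source=f"{prefix} {text}", target=target)


# Translation, classification, and summarization all reduce to string -> string.
# Prefixes are assumptions borrowed from the T5 paper's examples.
examples = [
    to_text_to_text("translate English to German:", "That is good.", "Das ist gut."),
    to_text_to_text("cola sentence:", "The course is jumping well.", "not acceptable"),
    to_text_to_text("summarize:", "A long article body would go here.", "A short summary."),
]

for example in examples:
    print(f"{example.source!r} -> {example.target!r}")
```

Because every task shares this string-to-string shape, the same model, loss function, and hyperparameters can be reused across tasks; only the data preparation differs.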