metadata
language:
- es
tags:
- twitter
- pos-tagging
POS Tagging model for Spanish/English
robertuito-pos
Repository: https://github.com/pysentimiento/pysentimiento/
Model trained on the Spanish/English split of the LinCE POS corpus, a code-switched benchmark. The base model is RoBERTuito, a RoBERTa model trained on Spanish tweets.
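A minimal usage sketch, assuming the model is published on the Hugging Face Hub as `pysentimiento/robertuito-pos` (an id inferred from the card title and the repository name, not confirmed here) and that it loads through the generic `transformers` token-classification pipeline. The helper names `load_tagger` and `to_pairs` are hypothetical. The sketch skips any tweet preprocessing that the pysentimiento toolkit may apply, and it returns one tag per subword token rather than per word.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: Hub id inferred from the card title and repository name.
MODEL_ID = "pysentimiento/robertuito-pos"


def load_tagger(model_id: str = MODEL_ID) -> Callable[[str], List[Dict]]:
    """Build a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so that to_pairs() works without transformers installed.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def to_pairs(predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Reduce the pipeline's per-token dicts to (token, tag) pairs."""
    return [(p["word"], p["entity"]) for p in predictions]
```

To tag a code-switched sentence, call `to_pairs(load_tagger()("Me encanta este weekend con mis friends"))`. The tag inventory depends on the label set stored with the model.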
Results
Results are taken from the LinCE leaderboard.
Model | Sentiment | NER | POS |
---|---|---|---|
RoBERTuito | 60.6 | 68.5 | 97.2 |
XLM Large | -- | 69.5 | 97.2 |
XLM Base | -- | 64.9 | 97.0 |
C2S mBERT | 59.1 | 64.6 | 96.9 |
mBERT | 56.4 | 64.0 | 97.1 |
BERT | 58.4 | 61.1 | 96.9 |
BETO | 56.5 | -- | -- |
Citation
If you use this model in your research, please cite the pysentimiento, RoBERTuito, and LinCE papers:
@misc{perez2021pysentimiento,
title={pysentimiento: A Python Toolkit for Sentiment Analysis and SocialNLP tasks},
author={Juan Manuel Pérez and Juan Carlos Giudici and Franco Luque},
year={2021},
eprint={2106.09462},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@inproceedings{ortega2019overview,
title={Overview of the task on irony detection in Spanish variants},
author={Ortega-Bueno, Reynier and Rangel, Francisco and Hern{\'a}ndez Far{\'\i}as, D and Rosso, Paolo and Montes-y-G{\'o}mez, Manuel and Medina Pagola, Jos{\'e} E},
booktitle={Proceedings of the Iberian languages evaluation forum (IberLEF 2019), co-located with 34th conference of the Spanish Society for natural language processing (SEPLN 2019). CEUR-WS.org},
volume={2421},
pages={229--256},
year={2019}
}
@inproceedings{aguilar2020lince,
title={LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation},
author={Aguilar, Gustavo and Kar, Sudipta and Solorio, Thamar},
booktitle={Proceedings of the 12th Language Resources and Evaluation Conference},
pages={1803--1813},
year={2020}
}