metadata

language:
  - es
tags:
  - twitter
  - pos-tagging

POS Tagging model for Spanish/English

robertuito-pos

Repository: https://github.com/pysentimiento/pysentimiento/

Model trained on the Spanish/English split of the LinCE POS corpus, a code-switched benchmark. The base model is RoBERTuito, a RoBERTa model trained on Spanish tweets.
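
A minimal sketch of how a token-classification model like this one could be queried for word-level tags with the `transformers` library. The hub identifier `pysentimiento/robertuito-pos` is an assumption based on the model and repository names above, and `first_subtoken_labels` and `tag_words` are hypothetical helpers, not part of pysentimiento. The sketch also assumes the tokenizer is a "fast" tokenizer (needed for `word_ids`) and skips any tweet preprocessing that pysentimiento may apply, so its predictions could differ from the toolkit's.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def first_subtoken_labels(
    word_ids: Sequence[Optional[int]],
    token_label_ids: Sequence[int],
    id2label: Dict[int, str],
) -> List[str]:
    """Keep the predicted label of the first sub-token of each word."""
    labels: List[str] = []
    previous = None
    for word_id, label_id in zip(word_ids, token_label_ids):
        # None marks special tokens; a repeated id marks a continuation piece.
        if word_id is not None and word_id != previous:
            labels.append(id2label[label_id])
        previous = word_id
    return labels


def tag_words(
    words: List[str],
    model_id: str = "pysentimiento/robertuito-pos",  # assumed hub id
) -> List[Tuple[str, str]]:
    """Return (word, tag) pairs for a pre-split sentence."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)

    # Passing a list of words lets the tokenizer track which word
    # each sub-token came from.
    encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits

    label_ids = logits[0].argmax(dim=-1).tolist()
    labels = first_subtoken_labels(
        encoding.word_ids(0), label_ids, model.config.id2label
    )
    return list(zip(words, labels))
```

Calling `tag_words("I love this canción".split())` would download the model and return one tag per word. Taking the first sub-token's label is one common convention for word-level tagging; the tag set itself comes from the model's `id2label` configuration.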

Results

Results are taken from the LinCE leaderboard.

Model	Sentiment	NER	POS
RoBERTuito	60.6	68.5	97.2
XLM Large	--	69.5	97.2
XLM Base	--	64.9	97.0
C2S mBERT	59.1	64.6	96.9
mBERT	56.4	64.0	97.1
BERT	58.4	61.1	96.9
BETO	56.5	--	--

Citation

If you use this model in your research, please cite the pysentimiento, RoBERTuito, and LinCE papers:

@misc{perez2021pysentimiento,
      title={pysentimiento: A Python Toolkit for Sentiment Analysis and SocialNLP tasks},
      author={Juan Manuel Pérez and Juan Carlos Giudici and Franco Luque},
      year={2021},
      eprint={2106.09462},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
@inproceedings{ortega2019overview,
  title={Overview of the task on irony detection in Spanish variants},
  author={Ortega-Bueno, Reynier and Rangel, Francisco and Hern{\'a}ndez Far{\'\i}as, D and Rosso, Paolo and Montes-y-G{\'o}mez, Manuel and Medina Pagola, Jos{\'e} E},
  booktitle={Proceedings of the Iberian languages evaluation forum (IberLEF 2019), co-located with 34th conference of the Spanish Society for natural language processing (SEPLN 2019). CEUR-WS.org},
  volume={2421},
  pages={229--256},
  year={2019}
}

@inproceedings{aguilar2020lince,
  title={LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation},
  author={Aguilar, Gustavo and Kar, Sudipta and Solorio, Thamar},
  booktitle={Proceedings of the 12th Language Resources and Evaluation Conference},
  pages={1803--1813},
  year={2020}
}