Transformer language model for Croatian and Serbian
Trained on 6GB datasets that contain Croatian and Serbian language for two epochs (500k steps).
Leipzig, OSCAR and srWac datasets
Model |
#params |
Arch. |
Training data |
Andrija/SRoBERTa-L |
80M |
Third |
Leipzig Corpus, OSCAR and srWac (6 GB of text) |