Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa)


XLM-R (XLM-RoBERTa) is a generic cross-lingual sentence encoder that obtains state-of-the-art results on many cross-lingual understanding (XLU) benchmarks. It is trained on 2.5 TB of filtered CommonCrawl data in 100 languages.
