-
CroSloEngual BERT
Trilingual BERT (Bidirectional Encoder Representations from Transformers) model, trained on Croatian, Slovenian, and English data. State of the art tool representing... -
SimLex-999 Slovenian translation SimLex-999-sl 1.0
The resource contains English SimLex-999 (Hill et al. 2015) and their Slovene translations. In the translation process, the word pairs were first translated by two translators... -
24sata news comment dataset 1.0
The dataset of user comments provided for research purposes for the EMBEDDIA, a Horizon 2020 project, extracted from the database of user comments from the 24sata.hr news... -
CroSloEngual BERT 1.1
Trilingual BERT (Bidirectional Encoder Representations from Transformers) model, trained on Croatian, Slovenian, and English data. State of the art tool representing... -
Latvian Delfi article archive (in Latvian and Russian) 1.0
This dataset is an archive of articles from the Delfi news site from 2015-2019, containing over 180,000 articles (c. 50% in Latvian and 50% in the Russian language). Keywords... -
Multilingual Culture-Independent Word Analogy Datasets
Word analogy task evaluates word embeddings, based on analagous word pairs (eg. "Paris - France" should be equivalent to "Rome - Italy", "son - daughter" should be equivalent to...
