6 datasets found

None: downloadable_files_count: 3 FundingReference: info:eu-repo/grantAgreement/EC/H2020/825153

  • CroSloEngual BERT

    Trilingual BERT (Bidirectional Encoder Representations from Transformers) model, trained on Croatian, Slovenian, and English data. State-of-the-art tool representing...
  • SimLex-999 Slovenian translation SimLex-999-sl 1.0

    The resource contains the English SimLex-999 word pairs (Hill et al. 2015) and their Slovene translations. In the translation process, the word pairs were first translated by two translators...
  • 24sata news comment dataset 1.0

    A dataset of user comments, provided for research purposes for EMBEDDIA, a Horizon 2020 project, and extracted from the database of user comments from the 24sata.hr news...
  • CroSloEngual BERT 1.1

    Trilingual BERT (Bidirectional Encoder Representations from Transformers) model, trained on Croatian, Slovenian, and English data. State-of-the-art tool representing...
  • Latvian Delfi article archive (in Latvian and Russian) 1.0

    This dataset is an archive of articles from the Delfi news site from 2015 to 2019, containing over 180,000 articles (c. 50% in Latvian and 50% in Russian). Keywords...
  • Multilingual Culture-Independent Word Analogy Datasets

    The word analogy task evaluates word embeddings based on analogous word pairs (e.g. "Paris - France" should be equivalent to "Rome - Italy", "son - daughter" should be equivalent to...
You can also access this registry using the API (see API Docs).