-
A Resource for Evaluating Graded Word Similarity in Context: CoSimLex
The dataset contains human similarity ratings for pairs of words. The annotators were presented with contexts that contained both of the words in the pair and the dataset... -
Slovenian RoBERTa contextual embeddings model: SloBERTa 1.0
The monolingual Slovene RoBERTa (A Robustly Optimized Bidirectional Encoder Representations from Transformers) model is a state-of-the-art model representing words/tokens as... -
CroSloEngual BERT
Trilingual BERT (Bidirectional Encoder Representations from Transformers) model, trained on Croatian, Slovenian, and English data. A state-of-the-art tool representing... -
ELMo embeddings model, Slovenian
ELMo language model (https://github.com/allenai/bilm-tf) used to produce contextual word embeddings, trained on the entire Gigafida 2.0 corpus... -
Slovenian RoBERTa contextual embeddings model: SloBERTa 2.0
The monolingual Slovene RoBERTa (A Robustly Optimized Bidirectional Encoder Representations from Transformers) model is a state-of-the-art model representing words/tokens as... -
CroSloEngual BERT 1.1
Trilingual BERT (Bidirectional Encoder Representations from Transformers) model, trained on Croatian, Slovenian, and English data. A state-of-the-art tool representing... -
ELMo embeddings models for seven languages
ELMo language model (https://github.com/allenai/bilm-tf) used to produce contextual word embeddings, trained on large monolingual corpora for 7 languages: Slovenian, Croatian,... -
Multilingual Culture-Independent Word Analogy Datasets
The word analogy task evaluates word embeddings based on analogous word pairs (e.g. "Paris - France" should be equivalent to "Rome - Italy", "son - daughter" should be equivalent to... -
List of single-word male and female occupations in Slovenian
The list of single-word occupations in Slovene is based on the Slovene Standard Classification of Occupations...
