Wordnet-based Evaluation of Large Distributional Models for Polish

Dataset

The paper presents the construction of large-scale test datasets for word embeddings on the basis of a very large wordnet. The datasets were then used to evaluate word embedding models and to assess and compare the usefulness of different word embeddings extracted from a very large corpus of Polish. We also analysed and compared several publicly available models described in the literature. In addition, several large word embedding models built on the basis of a very large Polish corpus are presented.

Identifier
PID	http://hdl.handle.net/11321/997
Metadata Access	https://clarin-pl.eu/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:clarin-pl.eu:11321/997

Provenance
Creator	Piasecki, Maciej; Czachor, Gabriela; Janz, Arkadiusz; Kaszewski, Dominik; Kędzia, Paweł
Publisher	Global Wordnet Association
Publication Year	2018
Rights	Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; CC
OpenAccess	true
Contact	clarin-pl(at)pwr.edu.pl

Representation
Language	English; Polish
Resource Type	languageDescription
Format	text/plain; charset=utf-8; application/pdf; downloadable_files_count: 1
Discipline	Linguistics