Wordnet-based Evaluation of Large Distributional Models for Polish

The paper presents the construction of large-scale test datasets for word embeddings on the basis of a very large wordnet. The datasets were then used to evaluate word embedding models and to assess and compare the usefulness of different word embeddings extracted from a very large corpus of Polish. We also analysed and compared several publicly available models described in the literature. In addition, several large word embedding models built on the basis of a very large Polish corpus are presented.

Identifier
PID http://hdl.handle.net/11321/997
Metadata Access https://clarin-pl.eu/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:clarin-pl.eu:11321/997
Provenance
Creator Piasecki, Maciej; Czachor, Gabriela; Janz, Arkadiusz; Kaszewski, Dominik; Kędzia, Paweł
Publisher Global Wordnet Association
Publication Year 2018
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; CC
OpenAccess true
Contact clarin-pl(at)pwr.edu.pl
Representation
Language English; Polish
Resource Type languageDescription
Format text/plain; charset=utf-8; application/pdf; downloadable_files_count: 1
Discipline Linguistics