Wordnet for Definition Augmentation with Encoder-Decoder Architecture

Data augmentation is a difficult task in Natural Language Processing. Simple methods such as insertion, deletion, or substitution, which can be applied relatively easily in other domains, mostly change the meaning of a sentence significantly and produce incorrect examples. Wordnets are potentially an ideal source of rich, high-quality data that, when combined with the capacity of generative models, can help to solve this complex task. In this work, we use plWordNet, a wordnet of the Polish language, to explore the capability of encoder-decoder architectures in data augmentation of sense glosses. We discuss the limitations of generative methods and perform a qualitative review of generated data samples.

Identifier
PID http://hdl.handle.net/11321/1003
Metadata Access https://clarin-pl.eu/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:clarin-pl.eu:11321/1003
Provenance
Creator Wojtasik, Konrad; Janz, Arkadiusz; Alberski, Bartłomiej; Piasecki, Maciej
Publisher Global Wordnet Association
Publication Year 2023
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; CC
OpenAccess true
Contact clarin-pl(at)pwr.edu.pl
Representation
Language English; Polish
Resource Type languageDescription
Format text/plain; charset=utf-8; application/pdf; downloadable_files_count: 1
Discipline Linguistics