Supporting data for: Ukrainian Indefinite Pronouns and Language Typology

Dataset abstract: To shed light on the distribution of Ukrainian indefinite pronouns and adverbs, we carried out corpus searches and compiled the results into datasets for empirical analysis. Ukrainian indefinite pronouns and adverbs consist of two parts, viz. an interrogative (question word, e.g. що ‘what’) and an indefiniteness marker. The datasets are organized so that the frequency and meaning of the attested combinations of interrogatives and indefiniteness markers can be investigated. The search results were exported to a spreadsheet, where additional annotation was added.
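
As a rough illustration of the kind of frequency count this organization makes possible, the following minimal Python sketch tallies combinations of interrogative and indefiniteness marker in a CSV file. The column names `interrogative` and `marker` are assumptions made for this example and may not match the headers in the published files. The toy rows are placeholders, not data from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; the real files may use different headers.
INTERROGATIVE_COL = "interrogative"
MARKER_COL = "marker"


def count_combinations(csv_file):
    """Count rows per (interrogative, marker) pair in an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        counts[(row[INTERROGATIVE_COL], row[MARKER_COL])] += 1
    return counts


# Toy rows for illustration only; these are not taken from the dataset.
toy_csv = io.StringIO(
    "interrogative,marker\n"
    "що,de-\n"
    "що,de-\n"
    "що,aby-\n"
)

counts = count_combinations(toy_csv)

# Print the combinations from most to least frequent.
for (interrogative, marker), n in counts.most_common():
    print(interrogative, marker, n)
```

To run the same count on a downloaded file, one would pass a handle opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory toy input, after adjusting the two column-name constants to the actual headers.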

Abstract from related publication: The present article offers an empirical analysis of Ukrainian indefinite pronouns and adverbs based on data from the GRAC corpus. The proposed analysis has ramifications for Ukrainian linguistics, Slavic linguistics, and language typology. With regard to Ukrainian linguistics, we identify substantial frequency differences and suggest distinguishing between a “core” system including the indefiniteness markers de-, -s’ and bud’-, and three “peripheral” markers, viz. -nebud’, aby-, and kazna. From the perspective of Slavic linguistics, the proposed analysis facilitates comparison with other Slavic languages, such as Polish and Russian. Our analysis pinpoints a number of similarities across Ukrainian, Polish and Russian, but also demonstrates that Ukrainian has a distinct system that merits investigation in its own right. For language typology, the analysis we propose shows how frequency information can be integrated in semantic maps, which arguably makes semantic maps a more powerful tool for cross-linguistic comparison.

Excel, 16.103.4 (25120717)

Identifier
DOI https://doi.org/10.18710/VESH4N
Metadata Access https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/VESH4N
Provenance
Creator Nesset, Tore; Palii, Yuliia (ORCID: 0000-0003-0686-593X)
Publisher DataverseNO
Contributor Nesset, Tore; UiT The Arctic University of Norway; Palii, Yuliia; The Tromsø Repository of Language and Linguistics (TROLLing)
Publication Year 2025
Rights CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess true
Contact Nesset, Tore (UiT Norges arktiske universitet)
Representation
Resource Type corpus data; Dataset
Format text/plain; text/comma-separated-values
Size 9300; 234992; 2238391; 29694423; 11952081; 244050
Version 1.0
Discipline Humanities; Linguistics