Supporting data for: Ukrainian Indefinite Pronouns and Language Typology

Dataset

Dataset abstract: To shed light on the distribution of Ukrainian indefinite pronouns and adverbs, we carried out corpus searches and exported the data to a spreadsheet, where additional annotation was added; the resulting datasets were then subjected to empirical analysis. Ukrainian indefinite pronouns and adverbs consist of two parts, viz. an interrogative (question word, e.g. що ‘what’) and an indefiniteness marker. The datasets are organized in a way that enables us to investigate the frequency and meaning of the attested combinations of interrogatives and indefiniteness markers.
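
As a rough illustration of how the frequency of marker and interrogative combinations could be tallied from one of the CSV files, the sketch below counts pairs in a small inline sample. The column names `marker` and `interrogative` and all sample rows are hypothetical and invented for illustration; they are not taken from the dataset, whose actual column layout should be checked before adapting this.

```python
import csv
import io
from collections import Counter

# Toy rows invented for illustration only; they are NOT taken from the dataset.
# The column names "marker" and "interrogative" are hypothetical.
SAMPLE_CSV = """marker,interrogative
-s',що
-s',хто
de-,що
bud'-,що
-s',що
"""


def count_combinations(csv_text):
    """Count how often each (marker, interrogative) pair occurs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["marker"], row["interrogative"]) for row in reader)


counts = count_combinations(SAMPLE_CSV)

# Print the combinations from most to least frequent.
for (marker, interrogative), n in counts.most_common():
    print(marker, interrogative, n)
```

To run this on a real file, the inline string could be replaced by the contents of the file, read with UTF-8 encoding, once the actual column names are known.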

Abstract from related publication: The present article offers an empirical analysis of Ukrainian indefinite pronouns and adverbs based on data from the GRAC corpus. The proposed analysis has ramifications for Ukrainian linguistics, Slavic linguistics, and language typology. With regard to Ukrainian linguistics, we identify substantial frequency differences and suggest distinguishing between a “core” system including the indefiniteness markers de-, -s’ and bud’-, and three “peripheral” markers, viz. -nebud’, aby-, and kazna. From the perspective of Slavic linguistics, the proposed analysis facilitates comparison with other Slavic languages, such as Polish and Russian. Our analysis pinpoints a number of similarities across Ukrainian, Polish and Russian, but also demonstrates that Ukrainian has a distinct system that merits investigation in its own right. For language typology, the analysis we propose shows how frequency information can be integrated in semantic maps, which arguably makes semantic maps a more powerful tool for cross-linguistic comparison.

Software: Excel, 16.103.4 (25120717)

Identifier
DOI	https://doi.org/10.18710/VESH4N
Metadata Access	https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/VESH4N
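
The Metadata Access link above is an OAI-PMH `GetRecord` request. A minimal sketch of how such a URL can be assembled from the DOI is given below; the function name `build_get_record_url` is hypothetical, and the endpoint and parameter values are simply those visible in the link above.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the Metadata Access link of this record.
OAI_ENDPOINT = "https://dataverse.no/oai"


def build_get_record_url(doi, metadata_prefix="oai_datacite"):
    """Build an OAI-PMH GetRecord URL for a DOI (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"doi:{doi}",
    }
    # Keep ':' and '/' unescaped so the identifier reads as in the record.
    return f"{OAI_ENDPOINT}?{urlencode(params, safe=':/')}"


url = build_get_record_url("10.18710/VESH4N")
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the response is expected to be an XML document in the requested metadata format.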

Provenance
Creator	Nesset, Tore; Palii, Yuliia (ORCID: 0000-0003-0686-593X)
Publisher	DataverseNO
Contributor	Nesset, Tore; UiT The Arctic University of Norway; Palii, Yuliia; The Tromsø Repository of Language and Linguistics (TROLLing)
Publication Year	2025
Rights	CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess	true
Contact	Nesset, Tore (UiT Norges arktiske universitet)

Representation
Resource Type	corpus data; Dataset
Format	text/plain; text/comma-separated-values
Size	9300; 234992; 2238391; 29694423; 11952081; 244050
Version	1.0
Discipline	Humanities; Linguistics