Replication Data for: The many guises of productivity: a case-study of Spanish inchoative constructions

Dataset

The dataset contains the quantitative data used as input for the Principal Components Analysis conducted in the article "The many guises of productivity: a case-study of Spanish inchoative constructions". The data originate from the Spanish Web Corpus (esTenTen18), accessed via Sketch Engine (Kilgarriff & Renau 2013). Only the subcorpus for European Spanish data was selected. After downloading, the samples were manually cleaned. At most 500 tokens per auxiliary were retained in the dataset. The data were annotated for 'Subject', 'AUX', 'Filler', 'Person', 'Tense', 'LexicalTypeInf', 'SyntaxInf', 'Intercalation', 'Intentionality', and 'Abruptness', as well as other criteria that are not taken into account in this study. For this analysis, only two variables are used: the auxiliary (abbreviated as 'AUX') and the infinitive (abbreviated as 'INF'). See data-specific sections below for more information about the variables.
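
As a rough illustration of the two preparation steps described above (capping each auxiliary at 500 tokens and reducing the data to the 'AUX' and 'INF' variables), the following Python sketch builds an auxiliary-by-infinitive frequency table. It is not the authors' script: the function names are hypothetical, the toy rows are invented rather than taken from the dataset, and because the description does not say how the retained tokens were selected, the sketch simply assumes the first tokens encountered are kept.

```python
from collections import Counter, defaultdict


def cap_per_auxiliary(rows, limit=500):
    """Keep at most `limit` rows per 'AUX' value, preserving input order.

    Assumption: the first `limit` rows are kept; the dataset description
    does not state how the retained tokens were actually chosen.
    """
    seen = Counter()
    kept = []
    for row in rows:
        if seen[row["AUX"]] < limit:
            seen[row["AUX"]] += 1
            kept.append(row)
    return kept


def aux_inf_table(rows):
    """Count tokens per (AUX, INF) pair as a nested dict: table[aux][inf]."""
    table = defaultdict(Counter)
    for row in rows:
        table[row["AUX"]][row["INF"]] += 1
    return {aux: dict(counts) for aux, counts in table.items()}


# Invented toy rows for illustration only; not taken from the dataset.
sample = [
    {"AUX": "ponerse", "INF": "trabajar"},
    {"AUX": "ponerse", "INF": "trabajar"},
    {"AUX": "ponerse", "INF": "llorar"},
    {"AUX": "echarse", "INF": "llorar"},
]

# A limit of 2 is used here only so the cap is visible on the toy data.
capped = cap_per_auxiliary(sample, limit=2)
table = aux_inf_table(capped)
```

With real data, the rows could be read from the CSV file with `csv.DictReader`, provided the file uses 'AUX' and 'INF' as column headers, which should be checked against the file itself.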

Identifier
DOI	https://doi.org/10.18710/5E8I0T
Metadata Access	https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/5E8I0T

Provenance
Creator	Van Hulle, Sven
Publisher	DataverseNO
Contributor	Van Hulle, Sven; Ghent University; Enghels, Renata; Lauwers, Peter; The Tromsø Repository of Language and Linguistics (TROLLing)
Publication Year	2024
Rights	CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess	true
Contact	Van Hulle, Sven (Ghent University)

Representation
Resource Type	annotated corpus data; Dataset
Format	text/plain; text/comma-separated-values; type/x-r-syntax
Size	5473; 21761; 1877
Version	1.0
Discipline	Humanities; Linguistics