Replication Data for: The copular subschema [become/devenir + past participle] in English and French: Productivity and degrees of passivity

These data form the basis for a contrastive analysis of the English copular subschema [become + past participle] and the equivalent copular subschema [devenir + past participle] in French. See the article abstract below.

The dataset contains 2500 corpus examples for each copular subschema. These two samples were extracted from the English Web corpus 2013 and the French Web corpus 2012, respectively, both of which belong to the Sketch Engine family of corpora (https://www.sketchengine.eu/). Several variables were also encoded: they describe the past participles in subject complement position and provide quantitative measurements for these participles and for the infinitives from which they are derived. See the codebook file for more details. Finally, the accompanying R script can be used to analyze the dataset and reproduce the findings of the associated research article.
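
As a rough illustration of the kind of measurement involved, the sketch below counts participle tokens and participle types per copular subschema. It is not the accompanying R script and does not reproduce the article's method. The function name `type_token_counts`, the two-field row layout, and the toy rows are all assumptions made for this example; the actual variable names are documented in the codebook file.

```python
from collections import Counter, defaultdict


def type_token_counts(rows):
    """Return {construction: (token_count, type_count)}.

    `rows` is an iterable of (construction, participle) pairs.
    This layout is hypothetical; the real dataset columns are
    described in the codebook file.
    """
    participles = defaultdict(Counter)
    for construction, participle in rows:
        # Lowercase so that spelling variants in case count as one type.
        participles[construction][participle.lower()] += 1
    return {
        construction: (sum(counts.values()), len(counts))
        for construction, counts in participles.items()
    }


# Invented toy rows for illustration only, NOT taken from the dataset.
toy_rows = [
    ("become", "involved"),
    ("become", "involved"),
    ("become", "known"),
    ("devenir", "connu"),
    ("devenir", "connu"),
    ("devenir", "connu"),
    ("devenir", "habitué"),
]

for construction, (tokens, types) in type_token_counts(toy_rows).items():
    print(f"{construction}: {tokens} tokens, {types} types")
```

Token counts show how often a subschema occurs with participles overall, whereas type counts show how many distinct participles it combines with. The abstract draws on both kinds of frequency when comparing become and devenir.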

Article abstract: This article presents a contrastive analysis of the English copular subschema [become + past participle] and the equivalent copular subschema [devenir + past participle] in French, based on web data. It is shown that both patterns are almost equally productive at the subject complement level. Furthermore, a more in-depth analysis demonstrates that, in the segment of participles with a high adjectival potential, devenir accumulates more participle tokens than become. Conversely, the reverse holds true for participles with a high verbal potential, in which case become is characterized by more participle tokens than devenir. This high amount of combinations between become and eventive participles also suggests a higher degree of passivity for become. However, in the segment of participles with an intermediate verbal potential, devenir is slightly more type frequent than become, which hints at an emerging productivity in this area for devenir as well.

Identifier
DOI https://doi.org/10.18710/UDVRZM
Related Identifier IsCitedBy https://doi.org/10.1075/lic.19013.van
Metadata Access https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/UDVRZM
Provenance
Creator Van Wettere, Niek (ORCID: 0000-0002-9455-368X)
Publisher DataverseNO
Contributor Van Wettere, Niek; Vrije Universiteit Brussel; Ghent University; The Tromsø Repository of Language and Linguistics (TROLLing)
Publication Year 2020
Rights CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess true
Contact Van Wettere, Niek (Universiteit Gent & Vrije Universiteit Brussel)
Representation
Resource Type corpus data; Dataset
Format text/plain; application/pdf; text/csv; type/x-r-syntax
Size 4935; 518234; 471239; 3158064; 15073; 1955
Version 1.1
Discipline Humanities