Diachronic emergence of German universal concessive conditionals with wh-clause-initial marking

These are the data analysed in Chapter 7 of Vander Haegen's dissertation entitled "Konstruktionsgrammatik und Variation. Eine Mikrotypologie universaler Irrelevanzgefüge im Gegenwartsdeutschen" (for a full reference, see "Related Publication" metadata field). The dataset contains German construction types in which an expression of irrelevance such as "egal", "gleichgültig" or "wurscht" (all: 'no matter') is followed by the wh word "was" 'what' or "wer" 'who', including the oblique forms of the latter ("wessen" 'whose', "wem" 'to.whom', "wen" 'whom'). The construction types under scrutiny can be arranged along a cline with two end points, illustrated in 1. and 2. below:

1. Es ist egal, was ich bin, ich bin ich. 'It doesn't matter who I am, I am me.' Lit.: 'It is no.matter who ...'
2. Egal was ich bin, ich bin ich. 'No matter who I am, I am me.'

In 1., the irrelevance expression "egal" 'no matter' appears as a predicative within a copular clause that embeds a wh interrogative. The copular clause is paratactically linked to the clause 'I am me'. In 2., however, the irrelevance expression introduces an adverbial subclause, which is hypotactically linked to the clause 'I am me'. The construction type in 1. will henceforth be referred to as a "clause complex with an irrelevance predicate" (cf. Leuschner 2006: 77); the construction type in 2. is known as a "universal concessive conditional" (or UCC for short, cf. inter alia Haspelmath/König 1998: 563f.).

The dataset was compiled in the context of a doctoral dissertation project which sought, inter alia, to trace the macro- and microdiachronic emergence of German UCCs as in 2. from clause complexes with irrelevance predicates as in 1. To this end, two samples were exported: a macrodiachronic sample covering the time period 1609–1948 and a microdiachronic one covering the time period 1947–2018. The macrodiachronic sample consists of N = 1,463 tokens from the DWDS historical metacorpus; the microdiachronic sample consists of N = 24,113 tokens from the German Reference Corpus DeReKo that were originally collected by Vander Haegen (2023). The full samples (including the corpus data) and the samples including only ID numbers and annotations (to facilitate processing with statistical software) are shared in four separate .csv files. An R Markdown file with the data analysis and an HTML file with the R code and output are shared as well.
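
The selection criterion described above (an irrelevance expression followed by a form of "was" or "wer") can be illustrated with a minimal Python sketch. It is not the corpus query that was actually used to export the samples. The pattern, the optional comma, and the helper name `find_candidates` are assumptions made for illustration, and only the three irrelevance expressions named in the description are covered.

```python
import re

# Irrelevance expressions and wh forms named in the dataset description.
# The real samples may cover further expressions; this list is illustrative.
IRRELEVANCE = ("egal", "gleichgültig", "wurscht")
WH_FORMS = ("was", "wer", "wessen", "wem", "wen")

# Hypothetical pattern: an irrelevance expression, an optional comma,
# then one of the wh forms as a whole word.
CANDIDATE = re.compile(
    r"\b(?P<irr>" + "|".join(IRRELEVANCE) + r")\b"
    r"\s*,?\s*"
    r"\b(?P<wh>" + "|".join(WH_FORMS) + r")\b",
    re.IGNORECASE,
)


def find_candidates(sentence):
    """Return (irrelevance expression, wh form) pairs found in a sentence."""
    return [
        (m.group("irr").lower(), m.group("wh").lower())
        for m in CANDIDATE.finditer(sentence)
    ]


# Both example sentences yield one candidate pair each.
print(find_candidates("Es ist egal, was ich bin, ich bin ich."))
print(find_candidates("Egal was ich bin, ich bin ich."))
```

A surface pattern of this kind only retrieves candidate tokens. Deciding whether a token is a clause complex with an irrelevance predicate or a UCC still requires the manual annotation that the dataset provides.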

COSMAS II web, 2.4.5.4

Microsoft Excel for Mac, 16.103

R (aarch64-apple-darwin20), 4.4.2

RStudio, 2025.09.01+401

The data source of the macrodiachronic sample is publicly available without log-in via https://www.dwds.de/d/korpora/dtaxl. The data source of the microdiachronic sample is publicly available after logging in on https://cosmas2.ids-mannheim.de/cosmas2-web/ or https://korap.ids-mannheim.de/.

Identifier
DOI https://doi.org/10.18710/EDL4BF
Metadata Access https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/EDL4BF
Provenance
Creator Vander Haegen, Flor
Publisher DataverseNO
Contributor Vander Haegen, Flor; Fonds Wetenschappelijk Onderzoek – Vlaanderen; Ghent University; The Tromsø Repository of Language and Linguistics (TROLLing)
Publication Year 2025
Funding Reference Fonds Wetenschappelijk Onderzoek – Vlaanderen 1197324N
Rights CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess true
Contact Vander Haegen, Flor (Ghent University)
Representation
Resource Type manually annotated corpus data; Dataset
Format text/plain; text/comma-separated-values; text/x-r-notebook; text/html; type/x-r-syntax
Size 19310; 577039; 138594; 13762005; 1785511; 23109; 1985016; 1441
Version 1.0
Discipline Humanities