Replication Data for: Metaphor analysis meets lexical strings: Finetuning the Metaphor Identification Procedure for quantitative semantic analyses

Dataset

Dataset Abstract: This dataset underlies a methodological article that proposes and illustrates two ways to extend the Metaphor Identification Procedure so that it can capture (metaphorical) lexical strings in addition to (simple) metaphor-related lexical units. It includes a sample of 25 linguistic metaphors drawn from a larger corpus compiled by the first author between October 2021 and May 2023. The corpus contains newspaper articles published in the Spanish-language, US-based newspaper El Diario (the El Paso and Juárez local editions). These articles revolve around the DACA (Deferred Action for Childhood Arrivals) immigration program and were published between November 2020 and May 2023.

Article abstract: Recent years have witnessed the development of the Metaphor Identification Procedure (MIP/VU), a step-by-step protocol designed to identify metaphorically-used words in discourse. However, MIP(VU)’s merits notwithstanding, the procedure poses a problem to scholars intending to use its output as the basis for a semantic field analysis involving a quantitative component. Depending on the research question, metaphor analysts may be interested in chunks of language situated above the procedure’s standardized level of analysis (i.e., the lexical unit), including phrases and sentences. Yet, attempts to decenter the method’s exclusive focus on metaphor-related words have been the target of critique, among others on the grounds of their lack of clear unit-formation guidelines and, hence, their inconsistent unit of analysis and measurement. Drawing on data derived from a Spanish-language US-based newspaper’s coverage of the migration program known as DACA (Deferred Action for Childhood Arrivals), this article describes challenges that analysts can run into when attempting to use a dataset containing atomized metaphor-related words as the input for subsequent quantitative semantic analyses. Its main methodological contribution consists in a proposal and illustration of two possible ways to extend the existing MIP(VU)-protocol in such a way as to allow it to catch metaphorical strings, on top of words, in a reliable and systematic manner. One approach is procedural, and entails formulating a-priori grouping-directives based on the research question(s). The other is exploratory, involving the ad hoc grouping of units and adding a descriptive parameter meant to keep track of grouping-decisions made by the analyst, thereby safeguarding transparency at all times.

Identifier
DOI	https://doi.org/10.18710/SJ89E3
Metadata Access	https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/SJ89E3

Provenance
Creator	De Backer, Laurence
Publisher	DataverseNO
Contributor	De Backer, Laurence; Ghent University; Goethals, Patrick; Van Hulle, Sven; The Tromsø Repository of Language and Linguistics (TROLLing)
Publication Year	2025
Funding Reference	Fonds Wetenschappelijk Onderzoek Vlaanderen
Rights	CC BY 4.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0
OpenAccess	true
Contact	De Backer, Laurence (Ghent University)

Representation
Resource Type	Linguistic data; Dataset
Format	text/plain; text/comma-separated-values; type/x-r-syntax
Size	15390; 6326; 24834; 434047; 6910; 3633; 48144; 3056
Version	1.0
Discipline	Humanities