Tweetplomacy 23 – An Annotated Collection of Tweets Outlining Strategies of Political Risk Communication during Global Crises (2018-2023)

Tweetplomacy 23 is a semantically annotated corpus of tweets capturing digital communicative interaction between international political leaders, peer groups and citizens in the wake of three major global crises: (1) the increasing emphasis on the security of energy supplies following Russia’s invasion of Ukraine; (2) the political and geo-economic consequences of the COVID-19 pandemic; (3) the intensified debate on the progression of climate change. These events occurred between 2018 and 2023, each of them marking a significant shake-up of the international system. The dataset focuses on the strategic use of networked information on X (formerly Twitter) by executive political actors facing exogenous shocks in the context of a global crisis situation. It is extracted from an X archive covering more than 14 billion tweets collected from the 1% random sample API. To extract the dataset, we use a list of top executives of the political administration (heads of state, heads of government, ministers of foreign affairs) or their respective public-relations offices. Their tweets are filtered using a list of thematically relevant keywords in four languages (English, German, French, Spanish), reflecting the discourse on the three crises mentioned above. The sample covers tweets from the beginning of 2018 up to May 2023, representing statements made by leading politicians from 83 countries on all continents. As a subset, tweets published by the political leaders of the 38 member states of the OECD and the five BRICS countries (Brazil, Russia, India, China, South Africa) have been extracted. Additionally, the sample comprises a selection of 10 international organizations.
The entire data collection consists of the following files:

(1) users: Excel file with a list of 654 Twitter user handles (usernames) of top executives of the political administration (and/or their institutional accounts), their nationalities, functions/roles and tenure;
(2) keywords: Excel file with a list of 60 crisis-related keywords (five keywords for each of the three individual crises in four languages);
(3) a gzipped JSONL file per language: each line in the JSONL files represents a JSON object containing metadata about a tweet matching one or more of the user handles and one or more of the keywords in the respective language. Additionally, semantic enrichments (i.e., entities and sentiments) calculated on the basis of the tweet text are provided.

The JSON object includes the following fields:

tweetId: integer, unique ID for an original tweet
timeStamp: format ("EEE MMM dd HH:mm:ss Z yyyy"), the timestamp of the original tweet
userName: JSON object containing the MD5-hashed user names for private persons or the user names for public persons and institutions
userBio: string (available only for public users and institutions), metadata at the time point of the original tweets or of retweets
followers: integer, metadata at the time point of the original tweets or of retweets
followees: integer, metadata at the time point of the original tweets or of retweets
retweets: integer, metadata at the time point of the original tweets or of retweets
favorites: integer, metadata at the time point of the original tweets or of retweets
replies: integer, metadata at the time point of the original tweets or of retweets
matchingKeywords: list of strings representing the matching keywords
matchingUserMentions: list of strings representing the matching user mentions
matchingUserName: string representing the matching user name
sentiments: JSON object containing the output of the VADER sentiment analysis tool (available only for English, German and French)
entities: JSON object containing the output of the Entity Fishing named entity linking tool
hashtags: list of strings containing the hashtags extracted from the tweet text
mentions: list of strings containing the user mentions extracted from the tweet text
urls: JSON object containing (resolved) URLs extracted from the tweet text
retweetId: integer, unique ID for the retweet of an original tweet with an ID captured in the tweetId field
retweetTimeStamp: format ("EEE MMM dd HH:mm:ss Z yyyy"), the timestamp of the retweet
retweetUserName: JSON object containing the MD5-hashed username of the retweeting user

The dataset may serve to track and examine the repercussions/resonance produced by the ‘digital audience’ of the most influential political leaders in the course of the three crises, thus hinting at the political and societal impact their communicative actions had in the digital realm. Additionally, changes in sentiments, argumentation and/or tonality as well as more general breakpoints of discussion might be identified by conducting in-depth analyses of the online discourse relating to each of the three debates. Ultimately, the data may yield new insights into networks of communication among ‘online champions’ in the diplomatic community with regard to global political crises. To this end, researchers will be able to employ both quantitative/statistical and qualitative/hermeneutic methodologies to further explore and compare specific communicative motivations of national political leaders and the global ‘digital public’ in such cases. The data might therefore be used as a valuable empirical input not merely for political or media scientists, but also for scholars focusing on sociological, economic or socio-psychological aspects of crisis communication.
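As a rough illustration of the file layout described above, the following Python sketch reads one of the gzipped JSONL files line by line, parses the timeStamp field and counts the entries in matchingKeywords. It is only a minimal sketch under stated assumptions: the function names are hypothetical, the strptime pattern is our translation of the documented "EEE MMM dd HH:mm:ss Z yyyy" format, and it assumes English day and month abbreviations with a numeric UTC offset.

```python
import gzip
import json
from collections import Counter
from datetime import datetime

# Assumed Python equivalent of the documented pattern
# "EEE MMM dd HH:mm:ss Z yyyy" (English abbreviations, numeric offset).
TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_timestamp(value):
    """Parse a timeStamp or retweetTimeStamp string into an aware datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def read_tweets(path):
    """Yield one dict per non-empty line of a gzipped JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def keyword_counts(records):
    """Count how often each entry of matchingKeywords occurs."""
    counts = Counter()
    for record in records:
        # A missing or null field is treated as an empty list.
        for keyword in record.get("matchingKeywords") or []:
            counts[keyword] += 1
    return counts
```

With a downloaded file, a call such as keyword_counts(read_tweets("tweets_en.jsonl.gz")) would return the keyword frequencies for that language; the file name here is a placeholder, not the actual name used in the collection. Reading the file as a stream keeps memory use low, since each line is an independent JSON object.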

Identifier
DOI https://doi.org/10.7802/2985
Source https://search.gesis.org/research_data/SDN-10.7802-2985?lang=de
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=cd9a39adf851d89b16866210934d3fb6b40d94272aaa9237aa757a60e56ac614
Provenance
Creator Petermann, Jan-Henrik; Bensmann, Felix; Zhang, Yudong; Dimitrov, Dimitar
Publisher GESIS Data Archive for the Social Sciences; GESIS Datenarchiv für Sozialwissenschaften
Publication Year 2026
Rights Free access (with registration) - The research data can be downloaded by registered users. CC BY-NC 4.0: Attribution – NonCommercial (https://creativecommons.org/licenses/by-nc/4.0/deed.de)
OpenAccess true
Contact http://www.gesis.org/
Representation
Discipline Social Sciences