This dataset contains replication data for the study "Fiscal Policy in the Bundestag: Textual Analysis and Macroeconomic Effects" by Albina Latifi, Viktoriia Naboka-Krell, Peter Tillmann, and Peter Winker. It covers all speeches given in the German Bundestag between September 7, 1949, and September 7, 2021, a total of 877,140 speeches.
doc_id : Overall document ID.
doc_lp_id : Document ID within each legislative period.
speech_identification_ent : Entity used for speech identification, obtained from a Named Entity Recognition model.
date : Date of the speech (date object).
period : Legislative period.
session : Session within the legislative period.
pos_speechbeginning : Potential start of a speech. Needed to disaggregate the corpus into individual speeches.
Party : Party affiliation of the speaker. Values: {'no-text' (e.g. chair), 'CDU/CSU', 'KPD', 'SPD', 'FDP', 'BP', 'Cabinet' (e.g. members of government), 'DP', 'Zentrum', 'NR', 'WAV', 'parteilos', 'NS', 'DRP', 'SRP', 'GB/BHE', 'fraktionslos', 'FU', 'DPB', 'DA', 'GRÜNE', 'PDS', 'LINKE', 'AfD'}
Role : Role of the speaker. Values: {'Alterspraesident', 'MdB', 'Bundestagspraesident', 'Schriftfuehrer', 'Bundeskanzler', 'Bundesminister', 'Vizepraesident', 'Staatssekretär', 'Staatsminister', 'Landesminister', 'Senator', 'Buergermeister', 'Gastredner', 'Wehrbeauftragter', 'Beauftragter'}
governing_Party : Indicates whether the speaker's party is in government. Values: {0: Opposition, 1: Governing Party, 'no-text': without assignment (e.g. chair), nan: non-party members / non-affiliated members of parliament}
text : Speech content.
text_length : Total number of words in the speech.
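Because governing_Party mixes numeric codes (0, 1), a string ('no-text'), and missing values (nan), it can help to decode it into one label per case before analysis. The following is a minimal Python sketch under stated assumptions: the function name decode_governing_party, the short label texts, the handling of string forms such as "0" or "nan" (which may appear after a CSV import), and the example rows are all hypothetical and not part of the dataset.

```python
import math

# Labels follow the variable description above; the short wording of the
# labels and the accepted string spellings are assumptions of this sketch.
GOVERNING_LABELS = {0: "Opposition", 1: "Governing Party"}
NO_ASSIGNMENT = "without assignment"            # 'no-text', e.g. chair
NON_AFFILIATED = "non-party / non-affiliated"   # nan


def decode_governing_party(value):
    """Map a raw governing_Party value to a readable label."""
    # Missing values: None or float NaN.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return NON_AFFILIATED
    if isinstance(value, str):
        stripped = value.strip()
        if stripped == "no-text":
            return NO_ASSIGNMENT
        # Assumed string forms of a missing value after a text import.
        if stripped.lower() in ("", "nan"):
            return NON_AFFILIATED
        try:
            value = float(stripped)  # e.g. "0", "1", "1.0"
        except ValueError:
            raise ValueError(f"unexpected governing_Party value: {value!r}") from None
    if value in GOVERNING_LABELS:
        return GOVERNING_LABELS[value]
    raise ValueError(f"unexpected governing_Party value: {value!r}")


# Made-up rows for illustration only; they are not taken from the dataset.
rows = [
    {"Party": "SPD", "governing_Party": 1},
    {"Party": "no-text", "governing_Party": "no-text"},
    {"Party": "fraktionslos", "governing_Party": float("nan")},
    {"Party": "FDP", "governing_Party": "0"},
]
labels = [decode_governing_party(row["governing_Party"]) for row in rows]
```

Values outside the documented set raise a ValueError instead of being silently mapped, so unexpected codes surface early.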