Corpus of political party programs Programi2022

A corpus of political party programs for the 2022 parliamentary elections in Slovenia. It includes the programs of 19 parties that fielded candidates in the election. The programs were extracted from party-published sources (websites, PDF files) and linguistically annotated with CLASSLA (https://github.com/clarinsi/classla).

The corpus is split by party. Extracted text files are available for all 19 programs, while PDF files are available only for the parties that published them. The corpus contains 330,559 tokens in total; the longest party program contains about 80 thousand tokens and the shortest about 300.

Identifier
PID http://hdl.handle.net/11356/1734
Related Identifier https://www.clarin.si/info/services/projects/#Compiling_a_corpus_of_political_party_programmes_for_the_2022_Parliamentary_Election
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1734
Provenance
Creator Polanič, Petra; Dobranić, Filip
Publisher Institute of Contemporary History
Publication Year 2022
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); https://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 3
Discipline Linguistics
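
The Metadata Access URL above is an OAI-PMH GetRecord request for the Dublin Core (oai_dc) form of this record. A minimal Python sketch of how such a request could be built and its response read is given below. The function names build_get_record_url and dublin_core_fields are illustrative, not part of any CLARIN.SI tooling, and the sketch assumes the response uses the standard Dublin Core element namespace.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint taken from the Metadata Access field of this record.
OAI_ENDPOINT = "http://www.clarin.si/repository/oai/request"
# Standard Dublin Core element namespace (assumed to be used in the response).
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_get_record_url(handle, prefix="oai_dc"):
    """Build an OAI-PMH GetRecord URL for a handle such as '11356/1734'."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": prefix,
        "identifier": f"oai:www.clarin.si:{handle}",
    }
    # Keep ':' and '/' unescaped so the URL matches the form shown in the record.
    return OAI_ENDPOINT + "?" + urlencode(params, safe=":/")


def dublin_core_fields(xml_text):
    """Collect Dublin Core elements from an XML response as {name: [values]}."""
    fields = {}
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag.startswith(DC_NS) and elem.text and elem.text.strip():
            name = elem.tag[len(DC_NS):]
            fields.setdefault(name, []).append(elem.text.strip())
    return fields


if __name__ == "__main__":
    print(build_get_record_url("11356/1734"))
```

The response body could then be fetched with urllib.request.urlopen and passed to dublin_core_fields. Repeated elements such as creator are kept as lists, since this record names two creators.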