Corpus of 1968 Slovenian literature Maj68 2.0

The Maj68 corpus contains 1,521 texts by 198 known authors, published between 1964 and 1972 in the periodicals "Tribuna", "Problemi" and "Problemi. Literatura". Each text carries complete bibliographical data and is classified by text type, language type, and the degree to which it contains non-standard Slovenian, foreign languages, modernism, and visual elements. Author metadata includes gender and year of birth. The presence of visual elements is marked in the corpus; note that 48 texts consist only of visual elements, i.e. contain no text.

The corpus is available as facsimiles (PDFs), in TEI encoding, as plain text files accompanied by metadata files, as a linguistically annotated TEI corpus, and as derived vertical files with a registry file for mounting on CWB-type concordancers. The TEI encoding follows the CLARIN.SI TEI customisation (https://github.com/clarinsi/TEI-schema).

The automatic linguistic annotation includes lemmas, MULTEXT-East morphosyntactic descriptions, and Universal Dependencies morphological features and syntactic annotation.
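
For readers who want to inspect the vertical files without a concordancer, the following is a minimal sketch of reading the general CWB-style vertical format (one token per line, tab-separated attributes, XML-like structural tags on their own lines). The column order in `ASSUMED_COLUMNS` and the sample lines are illustrative assumptions, not taken from the corpus; the actual attributes are declared in the registry file distributed with it.

```python
from typing import Dict, Iterable, Iterator, Sequence

# Hypothetical column order: check the corpus registry file for the real one.
ASSUMED_COLUMNS = ("word", "lemma", "msd")


def read_vertical(
    lines: Iterable[str], columns: Sequence[str] = ASSUMED_COLUMNS
) -> Iterator[Dict[str, str]]:
    """Yield one dict per token line, skipping structural tag lines."""
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        # Heuristic: a tab-free line wrapped in <...> is a structural tag.
        if line.startswith("<") and line.endswith(">") and "\t" not in line:
            continue
        # zip() silently ignores columns beyond those named in `columns`.
        yield dict(zip(columns, line.split("\t")))


# Made-up sample in vertical format (not real corpus content).
sample = [
    '<text id="example">\n',
    "<s>\n",
    "Pesem\tpesem\tNcfsn\n",
    "je\tbiti\tVa-r3s-n\n",
    "</s>\n",
    "</text>\n",
]
tokens = list(read_vertical(sample))
```

To read an actual file, pass an open file handle (opened with `encoding="utf-8"`) instead of `sample`, and adjust `columns` to match the registry.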

Compared with version 1 of this corpus, 647 new texts from Tribuna and Problemi have been added, and some metadata errors have been corrected.

Identifier
PID http://hdl.handle.net/11356/1491
Related Identifier http://hdl.handle.net/11356/1430
Related Identifier https://maj68.zrc-sazu.si/
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1491
Provenance
Creator Juvan, Marko; Žejn, Andrejka; Šorli, Mojca; Mandić, Lucija; Tomažin, Andrej; Jež, Andraž; Balžalorsky Antić, Varja; Erjavec, Tomaž
Publisher ZRC SAZU
Publication Year 2022
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); https://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 5
Discipline Linguistics
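
The Metadata Access URL listed above is an OAI-PMH `GetRecord` request. As a minimal sketch, it can be rebuilt programmatically from its parts using only the Python standard library. The function names `record_url` and `fetch_record` are hypothetical helpers introduced here for illustration; the base URL, verb, metadata prefix, and identifier are the ones shown in this record. `fetch_record` is defined but not called, since it needs network access.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import urlopen

# Base endpoint as it appears in the Metadata Access field of this record.
BASE = "http://www.clarin.si/repository/oai/request"


def record_url(handle_suffix: str = "11356/1491", prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": prefix,
        "identifier": f"oai:www.clarin.si:{handle_suffix}",
    }
    # urlencode percent-encodes ':' and '/', which servers decode back.
    return f"{BASE}?{urlencode(params)}"


def fetch_record(url: str, timeout: float = 30.0) -> str:
    """Download the XML response as text (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = record_url()
query = parse_qs(urlsplit(url).query)
```

Passing `"11356/1430"` as `handle_suffix` would build the equivalent request for the handle listed as a related identifier, assuming that record is exposed through the same endpoint.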