SLäNDa 2.0

SLäNDa, the Swedish literature corpus of narrative and dialogue, consists of eight Swedish literary novels from the 19th and early 20th centuries, manually annotated mainly for different aspects of dialogue. The full annotation also covers other cited material, such as thoughts, signs, and letters. These categories are included mainly so that the main narrative can be identified as all remaining unannotated text.

SLäNDa version 2.0 extends version 1.0 mainly by adding more data, but also through additional quality control and a slight modification of the annotation scheme. In addition, the data is organized into test sets with different types of speech marking: quotation marks, dashes, and no marking.

Identifier
PID http://hdl.handle.net/11372/LRT-4739
Related Identifier http://hdl.handle.net/11372/LRT-3169
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11372/LRT-4739
Provenance
Creator Stymne, Sara; Östman, Carin
Publisher Uppsala University
Publication Year 2022
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); http://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language Swedish
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 1
Discipline Linguistics