Parallel corpus EN-SL RSDO4 2.0

The RSDO4 parallel corpus of English-Slovene and Slovene-English translation pairs was collected as part of work package 4 of the Slovene in the Digital Environment project. It contains texts collected from public institutions and texts submitted by individual donors through the text collection portal created within the project. The updated corpus consists of 3,143,624 translation pairs (previously 964,433), which were either extracted from standard translation formats (TMX, XLIFF) or manually aligned. The pairs are in randomized order and can be used for machine translation training.
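As an illustration of the extraction step mentioned above, the following is a minimal sketch of how translation pairs could be pulled from a TMX file and shuffled. It is not the project's actual pipeline: the function name `tmx_pairs`, the sample data, and the matching of languages by two-letter prefix are all assumptions made for this example, and it relies only on the Python standard library.

```python
import random
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data; real TMX files carry a fuller <header>.
SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Good morning.</seg></tuv>
    <tuv xml:lang="sl"><seg>Dobro jutro.</seg></tuv></tu>
<tu><tuv xml:lang="en-GB"><seg>Thank you.</seg></tuv>
    <tuv xml:lang="sl-SI"><seg>Hvala.</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Untranslated</seg></tuv></tu>
</body></tmx>"""


def tmx_pairs(tmx_text, src="en", tgt="sl"):
    """Return (source, target) segment pairs from a TMX document string."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Assumption: match on the two-letter prefix (en-GB -> en).
                segments[lang[:2]] = "".join(seg.itertext()).strip()
        # Keep only units that have non-empty text on both sides.
        if segments.get(src) and segments.get(tgt):
            pairs.append((segments[src], segments[tgt]))
    return pairs


pairs = tmx_pairs(SAMPLE_TMX)
# Fixed seed so the randomized order is reproducible.
random.Random(0).shuffle(pairs)
for english, slovene in pairs:
    print(f"{english}\t{slovene}")
```

Translation units that lack one of the two languages are skipped, so every emitted pair has both an English and a Slovene side.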

Identifier
PID http://hdl.handle.net/11356/1698
Related Identifier http://hdl.handle.net/11356/1457
Related Identifier https://rsdo.slovenscina.eu/en/machine-translation
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1698
Provenance
Creator Repar, Andraž; Lebar Bajec, Iztok
Publisher Centre for Language Resources and Technologies, University of Ljubljana
Publication Year 2021
Rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0); https://creativecommons.org/licenses/by-sa/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language English; Slovenian; Slovene
Resource Type corpus
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 1
Discipline Linguistics