JRC EU DGT Translation Memory Parsebank DGT-UD 1.0


DGT-UD is a 2-billion-word, 23-language parallel corpus with automatic syntactic parses. It consists of the JRC DGT translation memory of European law, automatically annotated with UDPipe 1.2 (http://ufal.mff.cuni.cz/udpipe) using Universal Dependencies 2.0 models (http://hdl.handle.net/11234/1-2364). Note that the European Commission retains ownership of the source texts.
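
UDPipe writes its annotations in the CoNLL-U format by default, so the corpus files can most likely be read as ten tab-separated columns per token, with blank lines between sentences. This record does not state the file format explicitly, so treat that as an assumption. The minimal sketch below reads such text without any third-party library; the `read_conllu` helper and the sample sentence are hypothetical illustrations, not material taken from the corpus.

```python
from typing import Dict, Iterator, List

# The ten CoNLL-U columns, in order, as defined by Universal Dependencies.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dicts (hypothetical helper)."""
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# text = ..." are skipped.
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
        sentence.append(dict(zip(FIELDS, cols)))
    if sentence:
        yield sentence


# Hypothetical sample sentence, written by hand for illustration only.
SAMPLE = "\n".join([
    "# text = The Commission adopted the regulation.",
    "\t".join(["1", "The", "the", "DET", "_", "_", "2", "det", "_", "_"]),
    "\t".join(["2", "Commission", "Commission", "NOUN", "_", "_", "3", "nsubj", "_", "_"]),
    "\t".join(["3", "adopted", "adopt", "VERB", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["4", "the", "the", "DET", "_", "_", "5", "det", "_", "_"]),
    "\t".join(["5", "regulation", "regulation", "NOUN", "_", "_", "3", "obj", "_", "SpaceAfter=No"]),
    "\t".join(["6", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"]),
    "",
])

sentences = list(read_conllu(SAMPLE))
# The syntactic root is the token whose head is 0.
root = next(tok for tok in sentences[0] if tok["head"] == "0")
```

For files of this size it is worth keeping the generator form and feeding it one file at a time rather than loading the whole corpus into memory.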

PID http://hdl.handle.net/11356/1197
Related Identifier https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1197
Creator Ljubešić, Nikola; Erjavec, Tomaž
Publisher Jožef Stefan Institute
Publication Year 2018
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Language Bulgarian; Czech; Danish; German; Modern Greek; English; Spanish; Estonian; Finnish; French; Hungarian; Italian; Lithuanian; Latvian; Dutch; Polish; Portuguese; Romanian; Croatian; Slovak; Slovenian; Swedish; Irish
Resource Type corpus
Format application/zip; text/plain; charset=utf-8; downloadable_files_count: 24
Discipline Linguistics
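
The Metadata Access URL above is an OAI-PMH `GetRecord` request for the Dublin Core (`oai_dc`) version of this record. As a rough sketch, the same URL can be assembled from its parts and the returned XML reduced to field/value pairs. The helper names `oai_get_record_url` and `dc_fields` are hypothetical, the XML snippet is a hand-written stand-in for a real response, and no network request is made here.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple
from urllib.parse import urlencode

# Endpoint taken from the Metadata Access field of this record.
BASE_URL = "http://www.clarin.si/repository/oai/request"
# Standard Dublin Core element namespace used by the oai_dc format.
DC_NS = "http://purl.org/dc/elements/1.1/"


def oai_get_record_url(identifier: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL (hypothetical helper)."""
    query = urlencode(
        {"verb": "GetRecord",
         "metadataPrefix": metadata_prefix,
         "identifier": identifier},
        safe=":/",  # keep the identifier readable, as in the record above
    )
    return f"{BASE_URL}?{query}"


def dc_fields(xml_text: str) -> List[Tuple[str, str]]:
    """Return (element name, text) pairs for every Dublin Core element."""
    prefix = "{" + DC_NS + "}"
    root = ET.fromstring(xml_text)
    return [(el.tag[len(prefix):], (el.text or "").strip())
            for el in root.iter() if el.tag.startswith(prefix)]


url = oai_get_record_url("oai:www.clarin.si:11356/1197")

# Hand-written stand-in for a response body, for illustration only.
EXAMPLE_XML = (
    '<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:publisher>Jožef Stefan Institute</dc:publisher>"
    "<dc:type>corpus</dc:type>"
    "</record>"
)
fields = dc_fields(EXAMPLE_XML)
```

To retrieve the live record, the URL can be passed to any HTTP client and the response body handed to `dc_fields`.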