EPIC-EuroParl-UdS: A GPT-2 and NMT Surprisal-Annotated Corpus for Translation and Interpreting

EPIC-EuroParl-UdS is a bidirectional, document- and sentence-aligned English–German corpus of European Parliament debates (up to mid-July 2018). It includes the official written versions of the speeches and their translations, as well as manual transcriptions of the audio recordings of the original speeches as delivered and of their simultaneous interpreting.

Identifier
PID https://hdl.handle.net/epic-europarl-uds
Metadata Access http://fedora.clarin-d.uni-saarland.de/oaiprovider/?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:fedora.clarin-d.uni-saarland.de:clarind-uds:epic-europarl-uds
Provenance
Creator Kunilovskaya, Maria; Pollkläsener, Christina
Publisher CLARIND-UdS: Language Resources Repository, Department of Language Science and Technology, Saarland University, Germany: https://www.re3data.org/repository/r3d100010384
Contributor Teich, Elke
Publication Year 2026
Rights Creative Commons Attribution 4.0 International: https://creativecommons.org/licenses/by/4.0/legalcode; Copyright © 2026 Maria Kunilovskaya
OpenAccess true
Contact j.knappen(at)mx.uni-saarland.de
Representation
Language German; English
Resource Type Dataset; parallel corpus
Format text/tab-separated-values
Discipline Linguistics
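
The Metadata Access entry above is an OAI-PMH GetRecord request. As a minimal sketch, the following Python snippet splits that URL into its query arguments and rebuilds it, which may help when adapting the request (for example, to a different metadataPrefix). The helper names oai_params and build_oai_url are hypothetical and not part of the repository; no network request is made, and which other metadata prefixes the provider supports is not stated in this record.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# URL copied from the "Metadata Access" field of this record.
METADATA_URL = (
    "http://fedora.clarin-d.uni-saarland.de/oaiprovider/"
    "?verb=GetRecord&metadataPrefix=oai_dc"
    "&identifier=oai:fedora.clarin-d.uni-saarland.de:clarind-uds:epic-europarl-uds"
)


def oai_params(url):
    """Return the query arguments of an OAI-PMH request URL as a flat dict.

    Hypothetical helper: assumes each argument occurs only once.
    """
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


def build_oai_url(base, params):
    """Rebuild a request URL from a base URL and a dict of arguments.

    Colons are kept unescaped so the identifier stays readable.
    """
    return base + "?" + urlencode(params, safe=":")


params = oai_params(METADATA_URL)
base_url = METADATA_URL.split("?", 1)[0]
rebuilt_url = build_oai_url(base_url, params)

for key, value in params.items():
    print(f"{key}: {value}")
```

The returned dictionary holds the three arguments visible in the URL (verb, metadataPrefix, identifier); passing a modified copy to build_oai_url yields a new request URL with the same base.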