LegISTyr is a machine translation test set for evaluating how well legal terminology is translated from Italian into South Tyrolean German, a minor standard variety of German. It consists of eight subsets, each covering a specific legal subdomain or legal translation issue: 1) standardised terminology, 2) occupational health and safety, 3) subsidised housing, 4) family law, 5) criminal and criminal procedure law, 6) homonyms, 7) abbreviated forms, 8) gender-inclusive writing strategies. Each subset contains at least 250 examples: five examples for each term or, in the gender-inclusive subset, twenty examples for each inclusive writing strategy. The test set contains 2067 examples in total.
The example sentences in the test set showcase single-word and multi-word terms from the Italian legal system, each paired with its correct South Tyrolean German target hypothesis, which may be standardised or non-standardised. The test set also lists other acceptable or less acceptable variants used in South Tyrol and, where available, equivalent terms from other German-speaking legal systems (mainly Austria, Germany and Switzerland). The legal subdomain is specified for each example in every subset except the last one, on gender-inclusive writing. That subset contains examples of the different strategies used in Italian but no target hypotheses, as several may be acceptable.
LegISTyr can be used, for example, to assess how well terminology enforcement strategies work when legal and administrative texts are machine-translated from Italian into German, and to gauge the influence of major varieties of legal German on translations into a minor standard variety.
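
As a rough illustration of such an assessment, the sketch below labels each machine translation output by the kind of term it contains and reports the share of each label. It is a minimal sketch under stated assumptions, not the evaluation procedure of the test set: the `Example` record layout, its field names, and the functions `classify` and `term_report` are all hypothetical, and the toy strings are placeholders rather than real terminology. Matching is a plain case-insensitive substring test, so inflected German forms would be missed unless the outputs are lemmatised first.

```python
from collections import Counter
from dataclasses import dataclass, field

LABELS = ("target", "local_variant", "foreign_variant", "miss")


@dataclass
class Example:
    # Hypothetical record layout; the real test set's field names may differ.
    source: str                                                  # Italian sentence
    target_term: str                                             # expected South Tyrolean German term
    local_variants: list[str] = field(default_factory=list)      # other variants used in South Tyrol
    foreign_variants: list[str] = field(default_factory=list)    # terms from other German-speaking systems


def classify(output: str, ex: Example) -> str:
    """Label one MT output by the kind of term it contains (first match wins)."""
    text = output.casefold()
    if ex.target_term.casefold() in text:
        return "target"
    if any(v.casefold() in text for v in ex.local_variants):
        return "local_variant"
    if any(v.casefold() in text for v in ex.foreign_variants):
        return "foreign_variant"
    return "miss"


def term_report(outputs: list[str], examples: list[Example]) -> dict[str, float]:
    """Return the share of outputs per label, aligned one-to-one with the examples."""
    if not examples or len(outputs) != len(examples):
        raise ValueError("outputs and examples must be non-empty and of equal length")
    counts = Counter(classify(o, e) for o, e in zip(outputs, examples))
    return {label: counts[label] / len(examples) for label in LABELS}


if __name__ == "__main__":
    # Placeholder strings invented for illustration; not taken from LegISTyr.
    ex = Example(
        source="frase italiana",
        target_term="Zielterminus",
        local_variants=["Lokalvariante"],
        foreign_variants=["Fremdvariante"],
    )
    outputs = ["Der Zielterminus gilt.", "Die Fremdvariante gilt."]
    print(term_report(outputs, [ex, ex]))
```

In this sketch, a high `foreign_variant` share would suggest that a system falls back on terms from other German-speaking legal systems, while comparing the `target` share with and without terminology enforcement gives a simple measure of how well enforcement worked.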