ccGigafida ARPA language model 1.0

The ccGigafida ARPA language model was built from the ccGigafida written corpus of Slovenian (https://www.clarin.si/repository/xmlui/handle/11356/1035) using the KenLM toolkit in the Moses machine translation framework. It is a general-purpose language model of contemporary standard Slovenian, suitable for use as the language model component in statistical machine translation systems.
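
As a rough illustration of how an ARPA-format model can be read and queried, the sketch below parses ARPA text into a lookup table and scores a word with the standard backoff rule. It is not the tooling used to build this resource. The function names and the tiny sample model are made up for the example, and the parser assumes whitespace-separated fields. In practice a dedicated reader such as KenLM would normally be used.

```python
import gzip

def read_arpa(lines):
    """Parse ARPA lines into {ngram_tuple: (log10_prob, log10_backoff)}."""
    table = {}
    order = 0  # 0 means we are outside any "\N-grams:" section
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # Section markers: \data\, \1-grams:, \2-grams:, ..., \end\
            order = int(line[1:line.index("-")]) if line.endswith("-grams:") else 0
            continue
        if order == 0:
            continue  # header counts such as "ngram 1=4"
        fields = line.split()
        words = tuple(fields[1:1 + order])
        # The backoff weight is optional; a missing weight means log10(1) = 0.
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else 0.0
        table[words] = (float(fields[0]), backoff)
    return table

def log10_prob(table, context, word):
    """Backoff estimate of log10 P(word | context); context is a tuple."""
    ngram = tuple(context) + (word,)
    if ngram in table:
        return table[ngram][0]
    if not context:
        # Unknown word: use <unk> if the model defines it, else -infinity.
        return table.get(("<unk>",), (float("-inf"), 0.0))[0]
    backoff = table.get(tuple(context), (0.0, 0.0))[1]
    return backoff + log10_prob(table, tuple(context)[1:], word)

def load_arpa_gz(path):
    """Read a gzip-compressed ARPA file (the path is supplied by the caller)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return read_arpa(handle)

# Tiny made-up model, for illustration only (not taken from ccGigafida).
SAMPLE = r"""
\data\
ngram 1=4
ngram 2=2

\1-grams:
-1.0 <s> -0.5
-0.7 dober -0.3
-0.9 dan -0.2
-1.2 </s>

\2-grams:
-0.2 <s> dober
-0.4 dober dan

\end\
"""

table = read_arpa(SAMPLE.splitlines())
seen = log10_prob(table, ("dober",), "dan")    # bigram listed in the model
unseen = log10_prob(table, ("dan",), "dober")  # backs off to the unigram
```

Here `seen` is read directly from the bigram entry, while `unseen` adds the backoff weight of the context `dan` to the unigram score of `dober`. For the downloaded file, `load_arpa_gz` would be called with its local path. Holding a full-size model in a Python dictionary may need a lot of memory.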

The language model was created as part of a master's thesis: Kadivec, Jože. 2016. Prilagoditev statističnega strojnega prevajalnika za specifično domeno v slovenskem jeziku (Domain specific adaptation of a statistical machine translation engine in Slovene language). Master's thesis, Faculty of Computer and Information Science, University of Ljubljana. https://repozitorij.uni-lj.si/IzpisGradiva.php?id=84815

Identifier
PID http://hdl.handle.net/11356/1119
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1119
Provenance
Creator Kadivec, Jože; Robnik-Šikonja, Marko; Vintar, Špela
Publisher Faculty of Computer and Information Science, University of Ljubljana
Publication Year 2017
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene
Resource Type toolService
Format application/gzip; text/plain; charset=utf-8; downloadable_files_count: 1
Discipline Linguistics