Big Data language model - second version - ARPA
Big Data language model - second version - ARPA -
Polish Grapheme-to-phoneme tool and service
This archive contains the source code of the Polish grapheme-to-phoneme conversion tool and the web service located at http://mowa.clarin-pl.eu/transcriber/ -
EU Parliament Speech corpus
A collection of 1040 EU parliament speeches with transcription and annotations. Includes original speeches and PL/EN translations. -
Clarin-PL Studio Corpus (EMU)
Polish speech corpus of read speech recorded in a studio. Contains many speakers, each reading a few dozen different sentences and a list of words with rare phonemes. Useful for... -
Speech tools plugin for Annotation Pro
This resource describes the Annotation Pro plugin containing various tools for automatic processing of speech data. The initial tool provides only a speech aligner, but more are... -
Clarin-PL Mobile Corpus (EMU)
Polish speech corpus of read speech recorded over the phone. Contains many speakers, each reading a few dozen different sentences and a list of words with rare phonemes. Useful... -
Big Data language model in Word2Vec Skip-gram format.
Big Data language model in Word2Vec Skip-gram format. -
Big data language model stemmed with BPE in RAW format
Big data language model stemmed with BPE in RAW format -
Big Data language model - subword - SYLLABED - RAW
Big data language model based on syllables in RAW format -
Speech activity annotation for a subset of the Clarin-PL studio corpus
This is a hand-checked annotation of speech activity within a subset of the Clarin-PL studio corpus, containing 20 sessions with 619 recordings. This submission does not contain... -
Speech Recognition System for Polish: Studio Quality
This resource contains dockerized models and scripts of an automatic speech recognition system for Polish trained on studio quality speech. The system is based on the Kaldi... -
Parallel Corpora from Comparable Corpora tool
The script consists of 2 parts: an article parser and an aligner. Required software (install before using the script): yalign; additional Ubuntu packages: mongodb, ipython, python-nose... -
Polish Speech Services
This archive contains the source code and configuration of the speech tools web service available at http://mowa.clarin-pl.eu/mowa. The services provided include: + speech to... -
Big data language model tagged with POS - ARPA
Big Data language model tagged with POS - ARPA -
Big Data language model - second version - RAW
Big Data language model - second version - RAW -
Clarin-PL Studio Corpus (EMU;updated phonetics)
Polish speech corpus of read speech recorded in a studio. Contains many speakers, each reading a few dozen different sentences and a list of words with rare phonemes. Useful for... -
Long term archive operating system source code
This submission contains the operating system of the long-term archive, built in the Polish-Japanese Academy of Information Technology for the Clarin-PL project. Basic elements... -