-
Clarin-PL Studio Corpus (EMU;updated phonetics)
Polish speech corpus of read speech recorded in a studio. Contains many speakers, each reading a few dozen different sentences and a list of words with rare phonemes. Useful for... -
Big data language model with part of speech tags stemmed in RAW format
Big data language model with part of speech tags stemmed in RAW format -
Big data language model stemmed with BPE in ARPA format
Big data language model stemmed with BPE in ARPA format -
Big Data language model - second version - RAW
Big Data language model - second version - RAW -
Speech tools plugin for Annotation Pro
This resource describes the Annotation Pro plugin containing various tools for automatic processing of speech data. The initial tool provides only a speech aligner, but more are... -
Big Data language model in FastText Skip-gram format.
Big Data language model in FastText Skip-gram format. -
Big Data language model with grammatical groups - ARPA
Big Data language model tagged with grammatical groups, in ARPA format. -
Long term archive operating system source code
This submission contains the operating system of the long-term archive, built in the Polish-Japanese Academy of Information Technology for the Clarin-PL project. Basic elements... -
Big Data language model - subword - SYLLABED - ARPA
Big data language model based on syllables, in ARPA format. -
Big Data language model - second version - ARPA
Big Data language model - second version - ARPA -
Big Data language model tagged with POS - RAW.
Big data language model tagged with POS - RAW -
Clarin-PL Studio Corpus (EMU)
Polish speech corpus of read speech recorded in a studio. Contains many speakers, each reading a few dozen different sentences and a list of words with rare phonemes. Useful for... -
Polish Speech Services
This archive contains the source code and configuration of the speech tools web service available at http://mowa.clarin-pl.eu/mowa. The services provided include: + speech to... -
Big data language model stemmed with BPE in RAW format
Big data language model stemmed with BPE in RAW format -
Big Data language model - subword - BPE - ARPA
Big data language model built on subword units obtained with byte pair encoding, in ARPA format -
Big Data language model in Word2Vec CBOW format.
Big Data language model in Word2Vec CBOW format. -
Big Data language model in FastText CBOW format
Big Data language model in FastText CBOW format -
Big Data language model - STEMMED - RAW data
Big data language model stemmed in RAW format
