-
Finnish Locative Cases for Nouns
Picking the right locative case in Finnish can be quite a challenge. Some words seem to prefer the internal locative case, such as Suomessa ("in Finland"), while other words are... -
Orthography-based dating and localisation of Middle Dutch charters
In this study we build models for the localisation and dating of Middle Dutch charters. First, we extract character trigrams and use these to train a machine learner (K Nearest... -
Časování sloves v bengálštině (Verb conjugation in Bengali)
Description of verbal paradigms in Bengali. The description is written in Czech. -
Language Learning Stimulus Video
This is a video recording that is used for studying language learning by young children. -
Syntactically annotated Czech legal texts
Two legal texts with manual syntactic annotation according to the Prague Dependency Treebank framework. Dependency trees are presented as images. The annotation editor TrEd was... -
B2 eta C1 mailetako azterketen etiketatzea eta analisia (Labelling and analysis of B2 and C1 level exams)
We collected language learners' exams. They are B2 and C1 level tests under the European framework, 20 samples from each section. They were labelled and then, using the assigned labels, an analysis... -
Comparison of the usage of nouns by female and male members of the Polish par...
Dataset based on the Polish Parliamentary Corpus: utterances from male and female Members of Parliament (MPs), extracted from the current (8th) term of the Sejm, between... -
HD graduondokoa (Magia argibideak) [HD postgraduate course (Magic instructions)]
A set of instructions for performing magic tricks -
Replication of part of the IFA corpus
The IFA Spoken Language corpus is a free (GPL) database of hand-segmented Dutch speech. It was constructed with off-the-shelf software using speech from 8 speakers in a variety... -
Syntax Maker - The NLG tool for Finnish
Syntax Maker is a natural language generation tool for automatically generating syntactically correct sentences in Finnish. The tool is especially useful in the case of... -
Finnish Words and their Concreteness Values
Context: This data has been produced for poem generation in Finnish. If you use this dataset in your publication, please cite: Hämäläinen, M., & Alnajjar, K. (2019). Let’s... -
Haur Hezkuntzako ipuin-bilduma (Early childhood education story collection)
A collection of stories used in the Basque Country's association of ikastolas (Euskal Herriko Ikastolen Elkartea) -
Sign Language Interaction
This is a sign language interaction recording made for scientific purposes. -
SemMyv - Semantic Database for Erzya
This SQLite database contains Erzya lemmas and their frequencies in a big corpus. The lemmas are linked to each other based on the syntactic relations they have had in the... -
Natas - Python 3 library for processing historical English
This library will have methods for processing historical English corpora, especially for studying neologisms. The first functionalities to be released relate to normalization of... -
SemSms - Semantic Database for Skolt Sami
This SQLite database contains Skolt Sami lemmas and their frequencies in a big corpus. The lemmas are linked to each other based on the syntactic relations they have had in the... -
Prague Dependency Treebank 2.0 Sample Data
This is a small sample dataset from PDT 2.0. As such it can be released under a very permissive CC-BY license. -
FinMeter - Tools for assessing Finnish poetry
FinMeter is a library for analyzing poetry in Finnish. It handles typical rhyming such as alliteration, assonance and consonance, Japanese meters and Kalevala meter. It can also... -
Interaction and dialogue with large-scale textual data: Parliamentary speeche...
Prof. Dr. Andreas Blätte's keynote talk at the CLARIN Annual Conference 2015. Additional material, including the presented 3D visualisations, is available via... -
Model of English OCR Post-Correction
This is an OpenNMT-py model for OCR post-correction in English. For usage, see: https://github.com/mikahama/natas. This is part of the following publication: Mika Hämäläinen, and...
