-
Comparison of the usage of nouns by female and male members of the Polish par...
Dataset based on the Polish Parliamentary Corpus: utterances from male and female Members of Parliament (MPs), extracted from the current (8th) term of the Sejm, between... -
HD postgraduate course (Magic instructions)
A set of instructions for performing magic tricks -
Replication of part of the IFA corpus
The IFA Spoken Language corpus is a free (GPL) database of hand-segmented Dutch speech. It was constructed with off-the-shelf software using speech from 8 speakers in a variety... -
Syntax Maker - The NLG tool for Finnish
Syntax Maker is a natural language generation tool that automatically generates syntactically correct sentences in Finnish. The tool is especially useful in the case of... -
Finnish Words and their Concreteness Values
Context: This data has been produced for poem generation in Finnish. If you use this dataset in your publication, please cite: Hämäläinen, M., & Alnajjar, K. (2019). Let’s... -
Early Childhood Education story collection
A collection of the stories worked on at Euskal Herriko Ikastolen Elkartea (the association of ikastolas of the Basque Country) -
Sign Language Interaction
This is a sign language interaction recording made for scientific purposes. -
SemMyv - Semantic Database for Erzya
This SQLite database contains Erzya lemmas and their frequencies in a big corpus. The lemmas are linked to each other based on the syntactic relations they have had in the... -
Natas - Python 3 library for processing historical English
This library will have methods for processing historical English corpora, especially for studying neologisms. The first functionalities to be released relate to normalization of... -
SemSms - Semantic Database for Skolt Sami
This SQLite database contains Skolt Sami lemmas and their frequencies in a big corpus. The lemmas are linked to each other based on the syntactic relations they have had in the... -
Prague Dependency Treebank 2.0 Sample Data
This is a small sample dataset from PDT 2.0. As such it can be released under a very permissive CC-BY license. -
FinMeter - Tools for assessing Finnish poetry
FinMeter is a library for analyzing poetry in Finnish. It handles typical rhyming such as alliteration, assonance and consonance, Japanese meters and Kalevala meter. It can also... -
Interaction and dialogue with large-scale textual data: Parliamentary speeche...
Prof. Dr. Andreas Blätte's keynote talk at the CLARIN Annual Conference 2015. Additional material, including the presented 3D visualisations, is available via... -
Model of English OCR Post-Correction
This is an OpenNMT-py model for OCR post-correction in English. Usage: see https://github.com/mikahama/natas. This is a part of the following publication: Mika Hämäläinen, and... -
CATUC: Corpus académico de textos universitarios en castellano
This research was conducted on a corpus of texts produced by first-year undergraduate students at the University of the Basque Country (UPV/EHU). The corpus is called CATUC:... -
NoticIA
We present NoticIA, a dataset consisting of 850 Spanish news articles featuring prominent clickbait headlines, each paired with high-quality, single-sentence generative... -
Psycholinguistic Experiment Video
This is a video recording that is being used in psycholinguistic experiments. -
Laburpen corpusa: The Basque Summaries Corpus
School summaries obtained from Unai Atutxa's thesis (Atutxa, 2022) are available under the CC BY-NC 4.0 license. A total of 1676 extractions and abstractions have been... -
SemMdf - Semantic Database for Moksha
This SQLite database contains Moksha lemmas and their frequencies in a big corpus. The lemmas are linked to each other based on the syntactic relations they have had in the... -
SemKpv - Semantic Database for Komi-Zyrian
This SQLite database contains Komi-Zyrian lemmas and their frequencies in a big corpus. The lemmas are linked to each other based on the syntactic relations they have had in the...
