Dataset - B2FIND

Dataset for color terms, 2012

This dataset comprises adjective-noun phrases with color terms.

Psychotherapy Transcription Standards [data]

This dataset contains the psychotherapeutically informed transcription rules newly developed within our study. These transcription rules can be used to...

Supplemental Material for "Visual-Explainable AI: The Use Case of Language Mo...

Supplemental material for the poster "Visual-Explainable AI: The Use Case of Language Models" published at the International Conference on Data-Integrated Simulation Science...

Visual Analytics System for Hidden States in Recurrent Neural Networks

Source code of our visual analytics system for the interpretation of hidden states in recurrent neural networks. This project contains source code for preprocessing data and the...

EXCEPTIUS Corpus

EXCEPTIUS Corpus v1.0, containing the following data: - raw documents for 21 countries at national level - pre-processed data with spacy-udpipe v1.0 - automatically annotated...

Impact of manipulating word boundaries on the information distributed in morp...

These plots are part of the study "Impact of manipulating word boundaries on the information distributed in morphology and syntax". Each plot represents the word-structure...

Dataset: tweets and analyses related to the paper 'The (Un)Predictability of ...

This dataset features all the tweetids and labels that were used to model the language of 24 hashtags, and test the performance on predicting the hashtags in unseen tweets. This...

To NER or not to NER? A case study of low-resource deontic modalities in EU l...

Deontic modality (obligation, permission, prohibition) in legal documents can convey critical information, and identification of deontic modalities is often performed using...

Data: Timely identification of event start dates from Twitter

This directory features data that is discussed in the paper: F. Kunneman, A. Hürriyetoglu, N. Oostdijk and A. Van den Bosch (2014), Timely identification of event start dates...

Dataset: output related to the paper 'Event detection in Twitter: A machine-l...

This dataset features the output of intermediate steps and the final output of the research that is described in the paper: F. Kunneman and A. Van den Bosch (2014), Event...

Dataset: Events and periodicity analysis related to the paper 'Automatically ...

This dataset features information on all the events that were automatically extracted from Twitter and used as input to periodicity detection, as described in the paper: F....

Dataset: tweets and events linked to the paper 'Open-domain extraction of fut...

Input data and output of research conducted in the study described in the paper: F. Kunneman and A. Van den Bosch (2016), Open-domain extraction of future events from Twitter,...

International Medical Students' Improvement in Communication Skills in Psycho...

In this study, we evaluated international medical students' communication skills in the context of psychosomatic medicine, after attending a three-day training seminar. Using a...

Dataset: input and results related to the paper 'Anticipointment detection in...

This dataset features the training models, emotion classifications and emotion patterns before and after events, related to the paper: F. Kunneman, M. van Mulken and A. Van den...

LLM4Reuse

Low-code development platforms afford fast and easy process automation. Their intuitive drag-and-drop interfaces enable employees without formal programming skills to...

CorpusExplorer

Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks...

DBS Corpus

The DBS corpus contains 93 multi-document summaries for 293 German documents about 30 education-related topics. We sampled the topics from the Deutscher Bildungsserver (DBS)...

36 datasets found