-
Dataset for color terms, 2012
This dataset comprises adjective-noun phrases with color terms. -
Supplemental Material for "Visual-Explainable AI: The Use Case of Language Mo...
Supplemental material for the poster "Visual-Explainable AI: The Use Case of Language Models" published at the International Conference on Data-Integrated Simulation Science... -
Visual Analytics System for Hidden States in Recurrent Neural Networks
Source code of our visual analytics system for the interpretation of hidden states in recurrent neural networks. This project contains source code for preprocessing data and the... -
Dataset: tweets and analyses related to the paper 'The (Un)Predictability of ...
This dataset features all the tweetids and labels that were used to model the language of 24 hashtags, and test the performance on predicting the hashtags in unseen tweets. This... -
Data: Timely identification of event start dates from Twitter
This directory features data that is discussed in the paper: F. Kunneman, A. Hürriyetoglu, N. Oostdijk and A. Van den Bosch (2014), Timely identification of event start dates... -
Dataset: input and results related to the paper 'Anticipointment detection in...
This dataset features the training models, emotion classifications and emotion patterns before and after events, related to the paper: F. Kunneman, M. van Mulken and A. Van den... -
Dataset: output related to the paper 'Event detection in Twitter: A machine-l...
This dataset features the output of intermediate steps and the final output of the research that is described in the paper: F. Kunneman and A. Van den Bosch (2014), Event... -
Dataset: Events and periodicity analysis related to the paper 'Automatically ...
This dataset features information on all the events that were automatically extracted from Twitter and used as input to periodicity detection, as described in the paper: F.... -
Dataset: tweets and events linked to the paper 'Open-domain extraction of fut...
Input data and output of research conducted in the study described in the paper: F. Kunneman and A. Van den Bosch (2016), Open-domain extraction of future events from Twitter,... -
Svensk text
Swedish text resources (e.g., names of men, women, cities, municipalities, Swedish government agencies) for simple and efficient computer processing. Collection of language resources... -
Engelsk-svensk guldstandard för ordlänkning (GES)
Reference corpus for word linking, divided into training data and test data. The sentences come from the English and Swedish parts of Europarl. Data are created from the... -
TexPrax
Dataset collected and annotated in the project TexPrax -
Data Linking Workshop 2023: Computer Vision and Natural Language Processing –...
The humanities meet computer science to create new synergies using computer vision and natural language processing. Aim & Scope: Historians are increasingly using... -
3rd Workshop on Humanities-Centred Artificial Intelligence (CHAI 2023)
AI can support research in the Humanities, making it easier and more efficient. It is thus essential that AI practitioners and Humanities scholars take a Humanities-centred... -
MDSWriter
MDSWriter is software for manually creating multi-document summarization corpora and a platform for developing complex annotation tasks spanning multiple steps. If you use or... -
DBS Corpus
The DBS corpus contains 93 multi-document summaries for 293 German documents about 30 education-related topics. We sampled the topics from the Deutscher Bildungsserver (DBS)... -
Impact of manipulating word boundaries on the information distributed in morp...
These plots are part of the study "Impact of manipulating word boundaries on the information distributed in morphology and syntax". Each plot represents the word-structure... -
CorpusExplorer
Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks... -
EXCEPTIUS Corpus
EXCEPTIUS Corpus v1.0, containing the following data: - raw documents for 21 countries at national level - pre-processed data with spacy-udpipe v1.0 - automatically annotated...