-
UKP Convincing Arguments v1
Corpus content. UKPConvArg1-full-XML: This is the full corpus as referred to in the article (Table 2, UKPConvArgAll). It contains 32 XML files, each file corresponding to one... -
Cognate pairs for several languages
Cognates for the following language pairs can be used for research purposes: en-es, en-de, en-ru, en-el, en-fa, de-cz. Includes: * The training and test data for the en-es... -
Fine-tuned model weights for Stance Detection Benchmark System
This collection includes model weights (BERT-based), fine-tuned in a multi-task setting on 10 heterogeneous stance detection datasets. For more information, please refer to the... -
Forum Post Quality Dataset
The dataset has been compiled from Nabble.com. It has been used and is described in the papers listed below. The data can be obtained on request. -
Football Coreference Corpus
This script generates: the original sentence-level Football Coreference Corpus (FCC), a version of the sentence-level FCC which was cleaned and updated after manual review,... -
Personality Profiling of Fictional Characters using Sense-Level Links between...
This dataset contains the personality gold standard of 298 book characters annotated for their MBTI traits, gathered manually from the http://mbti-databank.com/ website and... -
Visual Feature Track Dataset
This dataset contains 282 visual feature tracks. A visual feature track is a sequence of feature observations of the same real 3D-landmark in consecutive image frames. These... -
WWW 2019 X-Ling Question Retrieval Data v1
This repository contains the data and code to reproduce the results of our paper "Improved Cross-Lingual Question Retrieval for Community Question Answering"... -
Whittle Networks datasets
Datasets for the paper "Whittle Networks: A Deep Likelihood Model for Time Series". Paper at http://proceedings.mlr.press/v139/yu21c.html. Code at... -
Verb Sense Labelling
Vocabulary used for the creation of sense patterns: -
Fast Axiomatic Attribution for Neural Networks
Mitigating the dependence on spurious correlations present in the training dataset is a quickly emerging and important topic of deep learning. Recent approaches include priors... -
Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate. In contrast to previous work, we abandon the use of... -
Dense Unsupervised Learning for Video Segmentation
We present a novel approach to unsupervised learning for video object segmentation (VOS). Unlike previous work, our formulation allows learning dense feature representations... -
Single-stage Semantic Segmentation from Image Labels
Recent years have seen a rapid growth in new approaches improving the accuracy of semantic segmentation in a weakly supervised setting, i.e. with only image-level labels... -
On emergence
Output files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?". -
The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cel...
TYC dataset proposed in the paper "The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cells in Microstructures" [ICCVW 2023]. Project page:... -
Analyzing Dataset Annotation Quality Management in the Wild
This is the accompanying data for the paper "Analyzing Dataset Annotation Quality Management in the Wild". Data quality is crucial for training accurate, unbiased, and... -
Lessons Learned from a Citizen Science Project for Natural Language Processing
This is the accompanying data for our paper "Lessons Learned from a Citizen Science Project for Natural Language Processing". Many Natural Language Processing (NLP) systems use... -
Annotation Error Detection: Analyzing the Past and Present for a More Coheren...
This is the accompanying data for our paper "Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future". Annotated data is an essential ingredient in... -
DRZ Living Lab Tracked Robot SLAM Dataset
Data set for the evaluation of SLAM systems in challenging terrains. The data set covers four sequences with challenging terrain, each tracked with a high-performance Qualisys...
