-
OwnReality API-only web application
This dataset contains the data platform for the research project "OwnReality. To Each His Own Reality". During the course of the project, data was gathered and entered into a... -
A harmonised testsuite for social media POS tagging (DE)
A harmonised POS testsuite of web data, CMC and Twitter microtext, with word forms and STTS POS tags (+ some additional CMC-specific tags). UD POS tags have been automatically... -
Source code and data for the PhD Thesis "Learning Neural Graph Representation...
This dataset contains source code and data used in the PhD thesis "Learning Neural Graph Representations in Non-Euclidean Geometries". The dataset is split into four... -
Cataloging Cultural Objects (CCO) – The CCO Commons examples in VRA Core 4 XML
“Cataloging Cultural Objects - a Guide to Describing Cultural Works and Their Images” (CCO) provides a data content standard for catalogers of cultural heritage. It is a... -
DeModify
deModify consists of 3631 instances, each with three annotations obtained through CrowdFlower. An instance is a short story in which a modifier is annotated with respect to its... -
CoCo-Ex
CoCo-Ex extracts meaningful concepts from natural language texts and maps them to conjunct concept nodes in ConceptNet, utilizing the maximum of relational information stored in... -
Sensor-Based Measurements in Paraplegia: Classified References from a Systema...
This dataset contains the results (publication references) of the systematic review "Current Use of Sensor-Based Measurements for Paraplegics", presented at MIE 2020, Geneva. -
The MSC Data Set
From this page you can download resources we created for modal sense classification as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2015)... -
Twitter Titling Corpus
The Twitter Titling Corpus contains 4002 stance-annotated tweets collected between 20 June 2017 and 30 August 2017 mentioning 6 presidents. Each tweet is annotated for the... -
HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprech...
The number of known cuneiform tablets is assumed to be in the hundreds of thousands. A fraction has been published by printing photographs and manual tracings in books, which is... -
X-SRL Dataset and mBERT Word Aligner
This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source... -
Knowledge-Enhanced Neural Networks for Machine Reading Comprehension [Source ...
Machine Reading Comprehension is a language understanding task where a system is expected to read a given passage of text and typically answer questions about it. When humans... -
Machine learning models featuring somatic and mental comorbidities for prolon...
Abstract Background: Knowledge about the influencing factors on hospital in-patient length-of-stay is integral for optimizing care and resource planning. Existing studies on... -
Collagen breaks at weak sacrificial bonds taming its mechanoradicals [Data]
This dataset contains input files for MD simulations, derived breakage counts from these simulations that are used to generate the figures in the publication, and the... -
Converter for content-to-head style syntactic dependencies
A set of Python scripts that convert function-head style encodings in dependency treebanks into a content-head style encoding (as used in the UD treebanks) and vice versa (for... -
Expert Sample Consensus (ESAC) for Visual Re-Localization [Data]
Supplementary training data for visual camera re-localization, particularly pre-computed scene coordinates for the MSR 7Scenes dataset and the Stanford 12Scenes dataset. We also... -
Selectional Preference Embeddings (EMNLP 2017)
Joint embeddings of selectional preferences, words, and fine-grained entity types. The vocabulary consists of: verbs and their dependency relation separated by "@", e.g.... -
Multilingual Modal Sense Classification using a Convolutional Neural Network ...
Abstract Modal sense classification (MSC) is a special WSD task that depends on the meaning of the proposition in the modal’s scope. We explore a CNN architecture for... -
DSAC++ Visual Camera Re-Localization [Data]
Supplementary training data for visual camera re-localization, particularly rendered depth maps to be used in combination with the Cambridge Landmarks dataset. We also provide... -
Datasets for Dependency Tree Reranking
This resource contains the datasets for dependency tree reranking in 3 languages: English, German and Czech. The creation, analysis and experiment results of the datasets are...