-
SLTrans
The dataset consists of pairs of source code and LLVM IR generated from accepted, deduplicated programming contest solutions. The dataset is divided into language configs and mode... -
SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for U...
Dataset resource released as part of our ACL 2024 paper "SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability... -
Systematic Task Exploration with LLMs: A Study in Citation Text Generation
The components of this dataset are used in the experiments of the paper "Systematic Task Exploration with LLMs: A Study in Citation Text Generation" published at main conference... -
AR-CP: Uncertainty-Aware Perception in Adverse Conditions with Conformal Pred...
Deep learning models play a crucial role in improving driver assistance systems and environmental perception. However, their tendency toward overconfident predictions poses... -
Common Vulnerability Scoring System Prediction Based on Open Source Intellige...
This repository contains the dataset used to train BERT-based models on open information sources. We aimed to build a more robust model to predict the Common Vulnerability... -
TexPrax
Dataset collected and annotated in the TexPrax project. -
Constrained C-Test Generation via Mixed-Integer Programming (Supplementary Ma...
This work proposes a novel method to generate C-Tests, a variant of cloze tests (gap-filling exercises) in which only the last part of a word is turned into a gap. In... -
FEDI Dataset
FEDI is the first task-oriented document-grounded dialogue dataset for learning from demographic information, user emotions and implicit user feedback. In its current version,... -
Document Structure in Long Document Transformers
This repository contains the data for the paper "Document Structure in Long Document Transformers", accepted at EACL 2024. Please see README.md for more information. -
Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals
This is the resource for the dataset and models released as part of our EMNLP 2023 paper "Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals". -
Pose Prediction for Mobile Ground Robots Evaluation Dataset
This dataset provides ground truth robot trajectories in rough terrain for the evaluation of pose prediction approaches for mobile ground robots. It is composed of six datasets... -
Hector Enrich 2023 Radiation Mapping Dataset
Data set for the evaluation of radiation mapping methods for mobile robots accompanying our SSRR 2023 paper "Online 2D-3D Radiation Mapping and Source Localization using... -
DRZ Living Lab Tracked Robot SLAM Dataset
Data set for the evaluation of SLAM systems in challenging terrains. The data set covers four sequences with challenging terrain, each tracked with a high-performance Qualisys... -
Annotation Error Detection: Analyzing the Past and Present for a More Coheren...
This is the accompanying data for our paper "Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future". Annotated data is an essential ingredient in... -
Lessons Learned from a Citizen Science Project for Natural Language Processing
This is the accompanying data for our paper "Lessons Learned from a Citizen Science Project for Natural Language Processing". Many Natural Language Processing (NLP) systems use... -
Analyzing Dataset Annotation Quality Management in the Wild
This is the accompanying data for the paper "Analyzing Dataset Annotation Quality Management in the Wild". Data quality is crucial for training accurate, unbiased, and... -
The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cel...
TYC dataset proposed in the paper "The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cells in Microstructures" [ICCVW 2023]. Project page:... -
On emergence
Output files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?". -
NLPEER: A Unified Resource for the Computational Study of Peer Review
Dataset of peer review reports and paper drafts from diverse domains and venues. We provide multiple versions of the dataset; when in doubt, download the newest version. You can... -
Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate. In contrast to previous work, we abandon the use of...