UKP Snopes Corpus

This corpus is based on the Snopes fact-checking website and provides annotations for training machine learning models on four tasks in the fact-checking process: document retrieval, stance detection, evidence identification, and claim validation. The corpus contains 6,422 validated claims, 16,507 evidence text snippets (annotated with sentence-level evidence), and 14,296 documents with their sources (URLs).

Please note: We crawled the data and provide it in accordance with the German text and data mining policy, and we are allowed to share the corpus for research purposes only. To download the corpus, you therefore need to contact us.

If you use the corpus in academic work, please cite our CoNLL paper.

Identifier
Source https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/2081
Related Identifier https://github.com/UKPLab/conll2019-snopes-experiments
Related Identifier https://github.com/UKPLab/conll2019-snopes-crawling
Metadata Access https://tudatalib.ulb.tu-darmstadt.de/oai/openairedata?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:tudatalib.ulb.tu-darmstadt.de:tudatalib/2081
Provenance
Creator Hanselowski, Andreas; Stab, Christian; Schulz, Claudia; Li, Zile; Gurevych, Iryna
Publisher TU Darmstadt
Publication Year 2019
Rights in Copyright; info:eu-repo/semantics/openAccess
OpenAccess true
Contact https://tudatalib.ulb.tu-darmstadt.de/page/contact
Representation
Language English
Resource Type Dataset
Format application/zip
Discipline Other
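
The Metadata Access URL listed above is an OAI-PMH GetRecord request. As a minimal sketch, the same URL can be assembled from its parts in Python. The function name build_get_record_url and the constant BASE_URL are hypothetical names chosen for illustration; the endpoint, verb, metadata prefix, and identifier are taken from the record itself. The sketch only builds the URL and sends no request.

```python
from urllib.parse import urlencode

# Endpoint taken from the "Metadata Access" field of this record.
BASE_URL = "https://tudatalib.ulb.tu-darmstadt.de/oai/openairedata"


def build_get_record_url(identifier, metadata_prefix="oai_datacite"):
    """Assemble an OAI-PMH GetRecord URL (hypothetical helper, illustration only)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Keep ':' and '/' unescaped so the result matches the form used in the record.
    return BASE_URL + "?" + urlencode(params, safe=":/")


if __name__ == "__main__":
    print(build_get_record_url("oai:tudatalib.ulb.tu-darmstadt.de:tudatalib/2081"))
```

Fetching the resulting URL should return the metadata record as XML; note that this gives access to the metadata only, not to the corpus itself, which must be requested from the authors.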