NCSE v2.0: A Dataset of OCR-Processed 19th Century English Newspapers
NCSE v2.0 Dataset Repository. This repository contains the NCSE v2.0 dataset and associated supporting data used in the paper "Reading the unreadable: Creating a dataset of 19th...
Transcribed newspaper articles from the NCSE collection
CLOCR-C: Transcribed newspaper articles from the NCSE collection. This dataset contains 91 pairs of newspaper articles from the Nineteenth Century Serials Edition (NCSE). The...
Scrambled text: training Language Models to correct OCR errors using syntheti...
This data repository contains the key datasets required to reproduce the paper "Scrambled text: training Language Models to correct OCR errors using synthetic data". In addition...
PitVQA: A Dataset of Visual Question Answering in Pituitary Surgery
The PitVQA dataset comprises 25 videos of endoscopic pituitary surgeries from the National Hospital of Neurology and Neurosurgery in London, United Kingdom, similar to the dataset...
RESPONSE: Dataset for Commonsense Reasoning about Disaster Management
This dataset contains 1789 data instances with problem-identification, missing-resource, and time-dependent question-and-answer pairs for disaster management.