Office for Open Science & Scholarship Operational Plan 2025-28
This document sets out the vision, context and high-level project plans for the UCL Office for Open Science & Scholarship, 2025-28.
NCSE v2.0: A Dataset of OCR-Processed 19th Century English Newspapers
NCSE v2.0 Dataset Repository. This repository contains the NCSE v2.0 dataset and associated supporting data used in the paper "Reading the unreadable: Creating a dataset of 19th...
Transcribed newspaper articles from the NCSE collection
CLOCR-C: Transcribed newspaper articles from the NCSE collection. This dataset contains 91 pairs of newspaper articles from the Nineteenth Century Serials Edition (NCSE). The...
Resources from first UCL Citizen Science Community Event
On Monday 9 December, the UCL Office for Open Science & Scholarship hosted the first Citizen Science Community event. These resources include the event programme and master...
Scrambled text: training Language Models to correct OCR errors using syntheti...
This data repository contains the key datasets required to reproduce the paper "Scrambled text: training Language Models to correct OCR errors using synthetic data". In addition...