This dataset is a processed version of the CAMELYON17 dataset, used in the NeurIPS 2024 paper "Are nuclear masks all you need for improved out-of-domain generalization? A closer look at cancer classification in histopathology". It consists of patches (tiles) extracted from the 50 Whole Slide Images (WSIs) in CAMELYON17 for which a tumour segmentation is available: 10 WSIs from each of the 5 hospitals. Tiles were selected so that each hospital contributes an equal number of tumourous and non-tumourous tiles. Each tile is 270x270 pixels. A tile is labelled tumourous if its centre region (90x90 pixels) contains at least 1 pixel that lies inside the tumour segmentation map.
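The labelling rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used to build the dataset. It assumes the tumour segmentation for a tile is available as a 270x270 array in which non-zero values mark tumour pixels, and that the 90x90 centre region is exactly centred (rows and columns 90 to 179). The function name `is_tumourous` and both constants are hypothetical.

```python
import numpy as np

TILE_SIZE = 270    # tile side length in pixels, as stated above
CENTRE_SIZE = 90   # side length of the centre region used for labelling


def is_tumourous(tile_mask: np.ndarray) -> bool:
    """Return True if the centre region of a tile overlaps the tumour mask.

    `tile_mask` is assumed to be the tumour segmentation cropped to the tile,
    with non-zero values marking tumour pixels (an assumption of this sketch).
    """
    if tile_mask.shape[:2] != (TILE_SIZE, TILE_SIZE):
        raise ValueError(f"expected a {TILE_SIZE}x{TILE_SIZE} mask, got {tile_mask.shape}")

    # Assumed exact centring: the region starts at (270 - 90) // 2 = 90.
    start = (TILE_SIZE - CENTRE_SIZE) // 2
    centre = tile_mask[start:start + CENTRE_SIZE, start:start + CENTRE_SIZE]

    # A single tumour pixel inside the centre region is enough.
    return bool(np.any(centre))
```

Under these assumptions, tumour pixels that fall only in the outer border of the tile do not change its label; only the centre region decides.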
The dataset also contains nuclear segmentation masks for all the tiles. Masks were generated using HoVer-Net trained on the CoNSeP dataset.