Sample preparation artefacts are a significant source of error in high-content screening datasets and can lead to misinterpretation of results in drug discovery. To address this, we have created a multispectral high-content imaging dataset of cultured HeLa ATCC cells to which typical sample preparation artefacts were deliberately added.
To obtain a dataset similar to the experimental setup of a high-content image-based screen, we used a black 96-well (rows A to H and columns 1 to 12) polystyrene imaging plate (Corning, Sigma). HeLa cells were seeded one day before the experiment in a volume of 200 µL per well, from a suspension of 250000 cells per mL in Dulbecco’s Modified Eagle’s Medium (Sigma) containing 10% fetal calf serum (Sigma), 4500 mg/L glucose (Sigma), sodium bicarbonate (Sigma), L-glutamine (Sigma), sodium pyruvate (Sigma), and non-essential amino acids (Sigma). To obtain a gradient of cell density, the cell suspension was diluted stepwise at a 1:2 ratio during seeding (columns 2 to 12). The first column was reserved as a no-cells control. After seeding, the HeLa cells were incubated overnight at 37 °C in a 5% CO2 atmosphere with humidity control. The next day, the cells were fixed with a 4% paraformaldehyde (Sigma) solution prepared in phosphate-buffered saline (PBS, Sigma). After fixation, HeLa cell nuclei were stained with Hoechst 33342 dye (Sigma) at a concentration of 40 µg/mL in PBS. Row A was kept unstained as the control without Hoechst dye. Once this bona fide artefact-free experimental plate had been prepared, we collected samples of dust from across the approximately 100 m² laboratory and prepared a suspension of these dust samples in PBS. We then added this suspension to rows A to G of the 96-well plate, leaving row H as an artefact-free control.
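The plate layout described above can be written down as a small lookup table, which is convenient when annotating images per well. The following is a minimal sketch, not part of the dataset itself. It assumes that column 2 received the undiluted suspension and that each subsequent column is one further 1:2 dilution step; the well naming ("A01", "B12") and the function names are hypothetical.

```python
import string

ROWS = string.ascii_uppercase[:8]   # rows A to H
COLUMNS = range(1, 13)              # columns 1 to 12
STOCK_CELLS_PER_ML = 250_000        # seeding suspension, as stated in the text
WELL_VOLUME_UL = 200                # 200 µL per well, as stated in the text


def cells_seeded(column: int) -> float:
    """Nominal number of cells seeded into a well of the given column.

    Assumption: column 2 holds the undiluted suspension and every following
    column is a further 1:2 dilution. Column 1 is the no-cells control.
    """
    if column == 1:
        return 0.0
    undiluted = STOCK_CELLS_PER_ML * WELL_VOLUME_UL / 1000  # cells per well
    return undiluted / 2 ** (column - 2)


def plate_map() -> dict:
    """Map a hypothetical well name such as 'A01' to its nominal conditions."""
    layout = {}
    for row in ROWS:
        for col in COLUMNS:
            layout[f"{row}{col:02d}"] = {
                "cells_seeded": cells_seeded(col),
                "hoechst": row != "A",  # row A is the unstained control
                "dust": row != "H",     # row H is the artefact-free control
            }
    return layout
```

Under these assumptions, `cells_seeded(2)` gives 50000 cells per well and the value halves with every column up to column 12. These are nominal seeding numbers, not counts of the cells present after overnight incubation and fixation.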
The dataset consists of images obtained with 4x and 10x objectives using fluorescence filter cube assemblies for the DAPI, CFP, GFP, TRITC and Cy5 channels. For hardware reasons, images with the CFP filter cube were obtained separately from images with the DAPI, GFP, TRITC and Cy5 filter cubes. Furthermore, CFP images (and in some cases DAPI images) were obtained with varying exposure times, which correspond to the filename suffixes “_w1”, “_w2” and so on. Images were obtained using an ImageXpress Micro XL high-content microscope (Molecular Devices). Images are organised into the following folders, referred to below by the letters A to G:
A: 4x-cfp
B: 4x-dapi-gfp-tritc-cy5
C: 10x-6cfp
D: 10x-6dapi
E: 10x-cfp
F: dapi-gfp-tritc-cy5
G: filters_spectra
Here, folders A and B correspond to 4x magnification and contain images obtained with the CFP filter cube (folder A) and with the other filter cubes (folder B). Each folder contains a “TimePoint_1” subfolder with the raw images. In the 4x images, each field of view (a “site”, designated by the suffixes “_s1”, “_s2” and so on) corresponds to almost exactly one quarter of a well of the 96-well plate. In addition to the raw images in “TimePoint_1”, a “Stitched” subfolder contains images of entire wells. In folder B, which contains all other fluorescence channels, “_w1”, “_w2”, “_w3” and “_w4” correspond to a single optimal exposure time for the DAPI, GFP, TRITC and Cy5 filters, respectively.
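The site and wavelength suffixes described above can be pulled out of a filename with a short regular expression. This is a minimal sketch: only the “_s<N>” and “_w<N>” suffixes and the channel order for folder B are taken from the text. The rest of the filename layout, the example filenames and the function name are assumptions, so the pattern may need adjusting to the actual files.

```python
import re

# Channel order stated in the text for the 4x multi-channel folder (folder B).
FOLDER_B_CHANNELS = {1: "DAPI", 2: "GFP", 3: "TRITC", 4: "Cy5"}

# Assumed pattern: a suffix is "_s" or "_w" followed directly by digits.
SUFFIX_RE = re.compile(r"_s(?P<site>\d+)|_w(?P<wavelength>\d+)")


def parse_suffixes(filename: str) -> dict:
    """Return the site and wavelength indices found in a filename.

    A key is None when the corresponding suffix is absent, for example
    the site index of a stitched whole-well image.
    """
    result = {"site": None, "wavelength": None}
    for match in SUFFIX_RE.finditer(filename):
        for key, value in match.groupdict().items():
            if value is not None:
                result[key] = int(value)
    return result


# Hypothetical filename, used for illustration only.
info = parse_suffixes("example_B03_s2_w3.tif")
channel = FOLDER_B_CHANNELS[info["wavelength"]]  # valid for folder B only
```

The channel lookup applies only to folder B. In the multiple-exposure folders the “_w” index denotes an exposure time rather than a filter, so it should not be mapped to a channel name there.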
Similarly, folders C to F correspond to 10x magnification and contain multiple exposures of CFP and DAPI (folders C and D, respectively) and single exposures of CFP and of the other channels (folders E and F, respectively). In the multiple-exposure CFP and DAPI folders, the varying exposure times correspond to the suffixes “_w1”, “_w2” and so on. Finally, folder G contains metadata on the filter cubes used in the dataset, including the excitation and emission filter spectra for each filter cube.
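For scripted access, the folder descriptions can be collected into one table. The sketch below restates only what the text says about each folder. The dictionary layout and the helper name are hypothetical, and a value of None marks a property that the description does not state or that does not apply.

```python
# Folder properties as described in the text; None means "not stated"
# or "not applicable".
FOLDERS = {
    "4x-cfp": {"magnification": "4x", "channels": ("CFP",),
               "multiple_exposures": None},
    "4x-dapi-gfp-tritc-cy5": {"magnification": "4x",
                              "channels": ("DAPI", "GFP", "TRITC", "Cy5"),
                              "multiple_exposures": False},
    "10x-6cfp": {"magnification": "10x", "channels": ("CFP",),
                 "multiple_exposures": True},
    "10x-6dapi": {"magnification": "10x", "channels": ("DAPI",),
                  "multiple_exposures": True},
    "10x-cfp": {"magnification": "10x", "channels": ("CFP",),
                "multiple_exposures": False},
    "dapi-gfp-tritc-cy5": {"magnification": "10x",
                           "channels": ("DAPI", "GFP", "TRITC", "Cy5"),
                           "multiple_exposures": False},
    # Filter cube metadata only, no images.
    "filters_spectra": {"magnification": None, "channels": (),
                        "multiple_exposures": None},
}


def folders_at(magnification: str) -> list:
    """List the image folders acquired at the given magnification."""
    return [name for name, props in FOLDERS.items()
            if props["magnification"] == magnification]
```

For example, `folders_at("4x")` returns the two 4x folders in the order listed, which is a convenient starting point for looping over the “TimePoint_1” subfolders.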