Replication Data for: The 3-billion fossil question: How to automate classification of microfossils


This dataset consists of 100,000 PNG images of individual microfossils extracted from whole slide images of fossils from a single wellbore. The well belongs to the Mikkel field on the Norwegian continental shelf. The wellbore, with ID NO 6407-6-5, is located at 64 degrees north, 7 degrees east. All images are in RGB format with a resolution of 224 by 224 pixels. The images are unlabeled, so the number of species represented is unknown; it is likely to be on the order of 1000. The dataset was created using the method described in the associated paper. The purpose was to create a medium-sized dataset of preprocessed microfossil crops for self-supervised training of small to medium-sized deep learning models, for which this dataset is sufficiently large. Large-scale training of, e.g., Vision Transformers will require more data. See the following fact page for more geological information: https://factpages.sodir.no/en/wellbore/PageView/Exploration/With/PalySlides/3921. Note that the original whole slide images from which this dataset was created were provided by the Norwegian national data repository for petroleum data (Diskos) under the Norwegian Licence for Open Government Data (NLOD) 2.0.
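The image properties stated above (PNG, RGB, 224 by 224 pixels) can be checked after unpacking the archive. The following is a minimal sketch using only the Python standard library, not part of the dataset's own tooling. It reads the IHDR chunk at the start of each PNG file and reports files that deviate from the description. The function names and the directory name in the usage note are hypothetical, and the folder layout inside the zip file is an assumption.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
EXPECTED_SIZE = (224, 224)  # width, height as stated in the dataset description
RGB_COLOR_TYPE = 2          # PNG colour type 2 means truecolour (RGB) without alpha


def read_png_header(path):
    """Return (width, height, bit_depth, color_type) from a PNG's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(26)
    # Bytes 0-7: signature, 8-11: chunk length, 12-15: chunk type "IHDR".
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    # Bytes 16-25: width and height (big-endian uint32), bit depth, colour type.
    return struct.unpack(">IIBB", header[16:26])


def find_unexpected_images(root):
    """List PNG files under root whose size or colour type differs from the description."""
    problems = []
    for path in sorted(Path(root).rglob("*.png")):
        width, height, _, color_type = read_png_header(path)
        if (width, height) != EXPECTED_SIZE or color_type != RGB_COLOR_TYPE:
            problems.append((path, width, height, color_type))
    return problems
```

For example, `find_unexpected_images("microfossils")` would return an empty list if every PNG under the hypothetical folder `microfossils` matches the description. Reading only the first 26 bytes of each file keeps the check fast for 100,000 images, but it does not verify that the pixel data itself decodes correctly.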

Python, 3.xx

VS Code, 1.xx

Identifier
DOI https://doi.org/10.18710/KWP9WA
Metadata Access https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/KWP9WA
Provenance
Creator Martinsen, Iver; Ricaud, Benjamin; Godtliebsen, Fred; Wade, David
Publisher DataverseNO
Contributor Martinsen, Iver; UiT The Arctic University of Norway; Wade, David; Ricaud, Benjamin; Godtliebsen, Fred
Publication Year 2024
Funding Reference The Research Council of Norway 309439
Rights info:eu-repo/semantics/openAccess
OpenAccess true
Contact Martinsen, Iver (UiT The Arctic University of Norway)
Representation
Resource Type Curated survey data; Dataset
Format text/plain; application/zip
Size 3465 bytes; 6577283622 bytes
Version 1.0
Discipline Earth and Environmental Science; Environmental Research; Geosciences; Natural Sciences