Dense and taxonomically detailed habitat maps of coral reef benthos machine-generated from underwater hyperspectral transects in Curaçao

This dataset contains 248 benthic habitat maps created from 31 underwater hyperspectral images captured with the HyperDiver device at 8 reef sites along the western coastline of Curaçao (see https://doi.org/10.3390/data5010019 for information on the acquisition of the transects). Each image was mapped with all 8 combinations of two semantic labelspaces (detailed and reefgroups), two machine learning classifiers (patched and segmented), and two spectral signals (radiance and reflectance). Maps in the detailed labelspace have each pixel assigned to one of 43 labels: taxonomic labels at family, genus and species levels for biotic components of the reef (corals, sponges, macroalgae, etc.), substrate labels (sediment, cyanobacterial mats, turf algae) and survey material labels (transect tape, reference board, etc.). Maps in the reefgroups labelspace cluster the labels of the detailed labelspace into 11 classes that describe reef functional groups (e.g. corals, sponges, algae). All habitat maps were produced with high accuracy (Fbeta 87%) by two different machine learning methods: a random forest ensemble classifier (segmented method) and a deep learning neural network classifier (patched method). The maps are further divided by the signal type taken from the hyperspectral image, either radiance or reflectance (the latter was calculated with a reference board located at the beginning and end of each transect). These benthic habitat maps can be used to obtain accurate descriptions of the benthic community and habitat structure of coral reef sites in Curaçao.
The dataset also contains: an assessment of the accuracy and data efficiency of the machine learning methods, a consistency assessment of the mapped regions, a comparison of habitat metrics (class coverage, biodiversity indices, composition and configuration) between habitat maps produced by each method, and an effort-vs-error analysis of sparse sampling techniques on the densely classified maps.
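As a rough illustration of the habitat metrics mentioned above, the following sketch computes class coverage and a diversity index from a small class map. The choice of the Shannon index is an assumption (the record does not say which biodiversity indices were used), and the function names and toy values are hypothetical.

```python
import math
from collections import Counter


def class_coverage(classmap):
    """Return {class_code: fraction of pixels} for a 2D grid of integer class codes."""
    counts = Counter(code for row in classmap for code in row)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


def shannon_index(coverage):
    """Shannon diversity H' = -sum(p * ln p) over classes with non-zero cover.

    Assumption: Shannon is used here only as a common example of a
    biodiversity index; the dataset's own analysis may use other indices.
    """
    return -sum(p * math.log(p) for p in coverage.values() if p > 0)


# Hypothetical toy class map (2 rows x 4 columns) with three class codes.
toy_map = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
]

cover = class_coverage(toy_map)
h = shannon_index(cover)
print(cover)
print(round(h, 3))
```

The same two functions could be applied to maps of the same transect produced by different methods to compare their coverage estimates.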

The files for the habitat maps are in the form:
habitat_maps_dataset/habitat_maps/transects/transect_/habitat_map.
where the parameters can have the following values:
num: 005, 006, 019, 024, 026, 028, 031, 043, 044, 046, 054, 080, 081, 082, 084, 085, 086, 090, 091, 095, 097, 102, 107, 114, 118, 125, 129, 130, 132, 134, 141
labelspace: detailed, reefgroups
spectrum: radiance, reflectance
method: patched, segmented
ext: nc or jpg

For each transect, the following files are available:
habitatmap_.nc in netCDF4 format: contains the habitat map data for the given combination of semantic labelspace, signal type (or spectrum) and machine learning method used. The map data contains a 2D (Y, X) data array, classmap, which holds an integer class code at each position. To decode the class integers into class labels, a lookup table for each labelspace is provided in the attributes 'label' and 'label_id' of the data array for each class map.
habitat_map___.jpg: an image file that visualizes the habitat map with a corresponding color for each class.
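The parameter values and the decoding step described above can be sketched as follows. This is a minimal illustration using plain Python lists: reading the netCDF4 file itself is left out, and the toy 'label' and 'label_id' values are hypothetical stand-ins for the attributes stored with each class map.

```python
from itertools import product

# Parameter values as listed in this record.
TRANSECTS = [
    "005", "006", "019", "024", "026", "028", "031", "043", "044", "046",
    "054", "080", "081", "082", "084", "085", "086", "090", "091", "095",
    "097", "102", "107", "114", "118", "125", "129", "130", "132", "134",
    "141",
]
LABELSPACES = ["detailed", "reefgroups"]
SPECTRA = ["radiance", "reflectance"]
METHODS = ["patched", "segmented"]

# Every (transect, labelspace, spectrum, method) combination is one map.
combos = list(product(TRANSECTS, LABELSPACES, SPECTRA, METHODS))
print(len(combos))  # 31 transects x 8 combinations


def decode_classmap(classmap, label_ids, labels):
    """Map each integer class code in a 2D grid to its text label.

    `label_ids` and `labels` are parallel sequences, standing in for the
    'label_id' and 'label' attributes of the class map data array.
    """
    lookup = dict(zip(label_ids, labels))
    return [[lookup[code] for code in row] for row in classmap]


# Hypothetical toy values, not taken from the dataset.
toy_label_ids = [0, 1, 2]
toy_labels = ["sediment", "coral", "sponge"]
toy_classmap = [
    [0, 1],
    [2, 1],
]

decoded = decode_classmap(toy_classmap, toy_label_ids, toy_labels)
print(decoded)
```

A dictionary lookup keeps the decoding independent of whether the class codes are contiguous or start at zero.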

Identifier
DOI https://doi.org/10.1594/PANGAEA.946315
Related Identifier https://doi.org/10.1111/2041-210X.14029
Related Identifier https://doi.org/10.1038/s41598-017-07337-y
Related Identifier https://doi.org/10.1594/PANGAEA.911300
Related Identifier https://doi.org/10.3390/data5010019
Metadata Access https://ws.pangaea.de/oai/provider?verb=GetRecord&metadataPrefix=datacite4&identifier=oai:pangaea.de:doi:10.1594/PANGAEA.946315
Provenance
Creator Schürholz, Daniel; Chennu, Arjun
Publisher PANGAEA
Publication Year 2022
Funding Reference Horizon 2020 https://doi.org/10.13039/501100007601 Crossref Funder ID 813360 https://doi.org/10.3030/813360 4D_REEF
Rights Creative Commons Attribution 4.0 International; https://creativecommons.org/licenses/by/4.0/
OpenAccess true
Representation
Resource Type Dataset
Format text/tab-separated-values
Size 12 data points
Discipline Earth System Research
Spatial Coverage (-69.159W, 12.042S, -68.745E, 12.375N); Curacao