UVP6Net : plankton images captured with the UVP6

Plankton was imaged with the UVP6 in contrasting oceanic regions. The full images were processed by the UVP6 firmware and the regions of interest (ROIs) around each individual object were recorded. A set of associated features was measured on the objects (see Picheral et al. 2021, doi:10.1002/lom3.10475, for more information). All objects were classified by a limited number of operators into 110 different classes using the web application EcoTaxa (http://ecotaxa.obs-vlfr.fr). This dataset corresponds to the 634,459 objects with an area greater than 73 pixels (equivalent spherical diameter of 9.8 pixels, corresponding to the default size limit of 620 µm in the UVP6 configuration). The files provide the features of the objects, their taxonomic identification, and the raw images. For the purpose of training machine learning classifiers, the images in each class were split into training, validation, and test sets, with proportions of 70%, 15%, and 15%.

An additional folder is provided, which contains the subset of images used to train the single embedded classification model of the UVP6 deployed on the NKE CTS5 floats (doi:10.5281/zenodo.10694203). These images are the UVP6Net objects with a size of at least 79 pixels, to match the 645 µm class from EcoPart, for a total of 595,595 objects. The taxonomic identification was also made coarser (from 110 classes to 20) to ensure adequate performance of the classification model on power-constrained hardware. Images in this subset display objects as shades of grey/white on a black background.

1. The folder UVP6Net_data.tar contains:

taxa.csv.gz
Table of the classification of each object in the dataset, with columns:
- objid: unique object identifier in EcoTaxa (integer number)
- taxon_level1: name of the taxon corresponding to the level 1 classification
- lineage_level1: taxonomic lineage corresponding to the level 1 classification
- taxon_level2: name of the taxon corresponding to the level 2 classification
- plankton: whether the object is plankton or not (boolean)
- set: subset to which the image is assigned (train: training, val: validation, or test)
- img_path: local path of the image, filed under its level 1 taxon and named according to the object id

features_native.csv.gz
Table of metadata of each object, including the features computed by the UVPapp application. All features are computed on the object only, excluding the background. All area and length measures are in pixels. All grey levels are encoded in 8 bits (0 = black, 255 = white). With columns:
- objid: unique object identifier in EcoTaxa (integer number)
And 62 features:
- area
- mean
- stddev
- mode
- min
- max
- perim
- width
- height
- major
- minor
- angle
- circ
- feret
- intden
- median
- skew
- kurt
- %area
- area_exc
- fractal
- skelarea
- slope
- histcum1, 2, 3
- nb1, nb2, nb3
- symetrieh
- symetriev
- symetriehc
- symetrievc
- convperim
- convarea
- fcons
- thickr
- elongation
- range
- meanpos
- cv
- sr
- perimareaexc
- feretareaexc
- perimferet
- perimmajor
- circex
- kurt_mean
- skew_mean
- convperim_perim
- convarea_area
- symetrieh_area
- symetriev_area
- nb1, nb2, nb3_area
- nb1, nb2, nb3_range
- median_mean/median_mean_range
- skeleton_area
See OBJECT measurements at https://doi.org/10.5281/zenodo.14704250 for definitions.

features_skimage.csv.gz
Table of morphological features recomputed with skimage.measure.regionprops on the ROIs produced by the UVP6 firmware. See http://scikit-image.org/docs/dev/api/skimage.measure.html#skimage.measure.regionprops for documentation.

inventory.tsv
Tree view of the taxonomy and number of images in each taxon, displayed as text. With columns:
- lineage_level1: taxonomic lineage corresponding to the level 1 classification
- taxon_level1: name of the taxon corresponding to the level 1 classification
- n: number of objects in each class

2. The second folder, UVP6Net_imgs.tar, contains:

imgs
Images of each object, named according to the object id (objid) and sorted in subdirectories according to their taxon.

3. The last folder, UVPEC_imgs.tar, contains:

imgs
Images of each object on a black background, stored in the format required to train an embedded classifier with the UVPEC package (https://github.com/ecotaxa/uvpec); i.e. each image is stored as "objid.jpg" in folders corresponding to their taxon (20 different classes), named "taxon_name__taxon_id".

4. And:

map.png
Map of the sampling locations, to give an idea of the diversity sampled in this dataset.

Identifier
DOI https://doi.org/10.17882/101948
Metadata Access http://www.seanoe.org/oai/OAIHandler?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:seanoe.org:101948
Provenance
Creator Picheral, Marc; Jalabert, Laetitia; Motreuil, Solène; Courchet, Lucas; Carray-counil, Louis; Ricour, Florian; Panaiotis, Thelma; Petit, Flavien; Elineau, Amanda
Publisher SEANOE
Publication Year 2024
Rights CC-BY-NC
OpenAccess true
Contact SEANOE
Representation
Resource Type Dataset
Discipline Marine Science