SnakeCLEF 2021

The dataset contains 409,679 images of 772 snake species from 188 countries and all continents (386,006 labeled images for development and 23,673 unlabeled images for testing). In addition, we provide a simple train/val (90% / 10%) split for validating preliminary results; the split preserves the species distribution in both parts. Furthermore, we prepared a compact subset (70,208 images) for fast prototyping. The test set consists of 23,673 images submitted to the iNaturalist platform within the first four months of 2021. All data were gathered from online biodiversity platforms (i.e., iNaturalist, HerpMapper) and further extended by data scraped from Flickr. The dataset has a heavily long-tailed class distribution: the most frequent species (Thamnophis sirtalis) is represented by 22,163 images and the least frequent (Achalinus formosanus) by just 10.
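A 90% / 10% split that keeps the species distribution the same in both parts can be approximated with a per-class (stratified) split. The following is a minimal sketch only, not the procedure used to produce the provided split; the function name `stratified_split`, its parameters, and the toy label counts are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.1, seed=0):
    """Return (train_idx, val_idx) with roughly val_fraction of each class in val.

    Hypothetical helper for illustration; not part of the dataset release.
    """
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Aim for at least one validation sample per class,
        # but always leave at least one sample for training.
        n_val = min(len(indices) - 1, max(1, round(len(indices) * val_fraction)))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels with made-up counts that mimic a long-tailed distribution.
labels = ["Thamnophis sirtalis"] * 100 + ["Achalinus formosanus"] * 10
train_idx, val_idx = stratified_split(labels)
```

Splitting per class rather than over the whole pool matters for long-tailed data: a plain random 10% sample could easily leave a 10-image species with no validation images at all.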

Identifier
PID http://hdl.handle.net/20.500.12800/1-4773
Related Identifier https://dspace5.zcu.cz/bitstream/11025/47274/1/paper-125.pdf
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:20.500.12800/1-4773
Provenance
Creator Picek, Lukáš; Bolon, Isabelle; Durso, Andrew M.; Castañeda, Rafael Ruiz de
Publisher CEUR Workshop Proceedings (CEUR-WS.org)
Publication Year 2021
Rights BSD 3-Clause "New" or "Revised" license; http://opensource.org/licenses/BSD-3-Clause; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language No linguistic content; Not applicable
Resource Type corpus
Format text/plain; charset=utf-8; application/x-gzip; application/octet-stream; downloadable_files_count: 6
Discipline Linguistics