Aim: Species distribution models (SDMs) are standard tools for predicting species distributions, but they can suffer from observation and sampling biases. This is particularly true of presence-only SDMs, which often rely on species observations from non-standardized sampling efforts. A common remedy is to sample background points with a target-group strategy, although more robust strategies and refinements could be implemented. Here, we used a dataset of plant species from the European Alps to propose and demonstrate efficient ways to correct for observer and sampling bias in presence-only models.
Innovation: Recent methods correct for observer bias by including covariates related to accessibility in model calibration (classic bias covariate correction, Classic-BCC). However, depending on how species are sampled, accessibility covariates may not sufficiently capture observer bias. Here, we introduced BCCs more directly related to sampling effort, as well as a novel corrective method based on stratified resampling of the observational dataset before model calibration (environmental bias correction, EBC). We compared the effects of EBC and different BCC strategies, applied individually and jointly, when modelling the distributions of 1,900 plant species. We evaluated model performance with spatial block split-sampling and independent test data, and assessed the accuracy of plant diversity predictions across the European Alps.
Main conclusions: Implementing EBC together with BCC gave the best results under every evaluation method. In particular, adding the observation density of a target group as a bias covariate (Target-BCC) yielded the most realistic modelled species distributions, with a clear positive correlation (r ≈ 0.5) between predicted and expert-based species richness. Although EBC must be carefully implemented in a species-specific manner, this limitation may be addressed via the automated diagnostics included in the R function we provide. Implementing EBC and bias covariate correction together may allow future studies to efficiently address observer bias in presence-only models, and to overcome the standard need for an independent test dataset for model evaluation.
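The stratified resampling behind EBC can be illustrated with a minimal sketch. This is not the R function mentioned above. The abstract does not say how strata are defined or how many records are retained, so the sketch assumes equal-width bins over each environmental covariate and a fixed cap on records per bin. The function name environmental_resample, its parameters, and the example covariates (temperature, precipitation) are all hypothetical.

```python
import random
from collections import defaultdict


def environmental_resample(records, env_keys, n_bins=5, per_stratum=1, seed=0):
    """Thin observations so that no environmental stratum dominates.

    Assumption (not from the source): strata are equal-width bins over each
    covariate in env_keys, and at most per_stratum records are kept per bin.
    """
    if not records:
        return []
    rng = random.Random(seed)

    # Observed range of every environmental covariate.
    bounds = {}
    for key in env_keys:
        values = [rec[key] for rec in records]
        bounds[key] = (min(values), max(values))

    def stratum(rec):
        # Map a record to a tuple of bin indices, one per covariate.
        index = []
        for key in env_keys:
            low, high = bounds[key]
            if high == low:
                index.append(0)
            else:
                b = int((rec[key] - low) / (high - low) * n_bins)
                index.append(min(b, n_bins - 1))  # the maximum falls in the last bin
        return tuple(index)

    # Group records by environmental stratum.
    groups = defaultdict(list)
    for rec in records:
        groups[stratum(rec)].append(rec)

    # Keep sparse strata whole and subsample over-represented ones.
    kept = []
    for key in sorted(groups):
        members = groups[key]
        if len(members) <= per_stratum:
            kept.extend(members)
        else:
            kept.extend(rng.sample(members, per_stratum))
    return kept


# Hypothetical data: 50 records clustered in one environment, 3 elsewhere.
observations = [
    {"id": i, "temperature": 10.0 + 0.01 * i, "precipitation": 1000.0 + i}
    for i in range(50)
] + [
    {"id": 50, "temperature": 0.0, "precipitation": 2000.0},
    {"id": 51, "temperature": 5.0, "precipitation": 1500.0},
    {"id": 52, "temperature": 2.0, "precipitation": 1800.0},
]

thinned = environmental_resample(
    observations, ["temperature", "precipitation"], n_bins=4, per_stratum=2
)
print(len(observations), "records before,", len(thinned), "after")
```

In this sketch the heavily sampled environment is reduced to the cap while sparsely sampled environments are retained in full, which is the general intent of resampling before model calibration. Suitable bin counts and caps would likely need to be chosen per species, in line with the species-specific caveat noted above.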