Bove Pasquale, Bertini Andrea, Coro Gianpaolo
Species Commonness, Species Prevalence, Artificial Intelligence, Deep Learning, Wetlands
The prevalence of a species in a given area is crucial for estimating the environmental conditions associated with its subsistence within ecological niche models (ENMs). Prevalence is defined as the proportion of presences relative to the total number of sampled sites, and it reflects a prior expectation of the species’ commonness or rarity. However, reliable estimation is often hampered by limited or biased occurrence data, particularly for rare or poorly monitored species. This work presents a data-driven, multi-species methodology to estimate species prevalence for use in ENMs. The methodology is entirely unsupervised and relies on species occurrence records from the Global Biodiversity Information Facility. It combines two clustering methods, one deep-learning model, and an ensemble model with statistical analysis to classify species commonness and to transform the classifications into prevalence probabilities. A case study is presented for 161 species living in the Massaciuccoli Lake basin (Tuscany, Italy), a wetland of high biodiversity value and ecological sensitivity. The models classified the species’ prevalence based on observations from other Italian wetland sites and were evaluated against expert-based assessments. All models achieved high accuracy, with the deep-learning model scoring highest (approximately 81–90%). The proposed methodology is scalable and reproducible, and it can inform ENMs with objective, robust prevalence estimates.
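The definition of prevalence given in the abstract (presences divided by the total number of sampled sites) can be illustrated with a minimal sketch. The function name `prevalence` and the counts used below are hypothetical and chosen only for illustration. They are not taken from the study, and the sketch does not reproduce the unsupervised methodology the paper proposes.

```python
def prevalence(presences: int, sampled_sites: int) -> float:
    """Return the proportion of sampled sites at which the species was recorded."""
    if sampled_sites <= 0:
        raise ValueError("sampled_sites must be positive")
    if not 0 <= presences <= sampled_sites:
        raise ValueError("presences must lie between 0 and sampled_sites")
    return presences / sampled_sites


# Hypothetical counts, for illustration only (not data from the study):
# a species recorded at 12 of 48 sampled sites.
example = prevalence(presences=12, sampled_sites=48)
print(example)  # 0.25
```

A naive ratio of this kind is only as reliable as the sampling behind it. With limited or biased occurrence data the counts themselves are unreliable, which is the limitation the abstract cites as motivation for the data-driven approach.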
Source: SCIENTIFIC REPORTS, pp. 1-30
@article{oai:iris.cnr.it:20.500.14243/568241,
title = {Estimating species commonness and prevalence through unsupervised methods},
author = {Bove Pasquale and Bertini Andrea and Coro Gianpaolo},
doi = {10.1038/s41598-026-38900-1},
year = {2026}
}
Italian Integrated Environmental Research Infrastructures System