Assessing sampling coverage of species distribution in biodiversity databases

Sporbert M, Bruelheide H, Seidler G, Keil P, Jandt U, 21 other authors, Welk E 2019. Journal of Vegetation Science 30: 620-632

Abstract

Aim Biodiversity databases are valuable resources for understanding plant species distributions and dynamics, but they may insufficiently represent the actual geographic distribution and climatic niches of species. Here we propose and test a method to assess sampling coverage of species distribution in biodiversity databases in geographic and climatic space.

Location Europe.

Methods Using a test selection of 808,794 vegetation plots from the European Vegetation Archive (EVA), we assessed the sampling coverage of 564 European vascular plant species across both their geographic ranges and realized climatic niches. Range maps from the Chorological Database Halle (CDH) were used as background reference data to capture species' geographic ranges and to derive species' climatic niches. To quantify sampling coverage, we developed a box-counting method, the Dynamic Match Coefficient (DMC), which quantifies how well a set of occurrences of a given species matches its geographic range or climatic niche. DMC is the area under the curve measuring the match between occurrence data and background reference (geographic range or climatic niche) across grids with variable resolution. High DMC values indicate good sampling coverage. We applied null models to compare observed DMC values with expectations from random distributions across species ranges and niches.
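The box-counting idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that both the occurrences and the background reference are given as 2-D points (longitude/latitude, or two climate axes), that the "match" at one resolution is the share of reference grid cells containing at least one occurrence, and that the area under the curve is taken with the trapezoidal rule over equally spaced resolution steps and rescaled to [0, 1]. The function names (`match_fraction`, `dmc`, `null_dmc`), the grid sizes, and the toy data are all hypothetical; the paper's exact match measure and axis scaling may differ.

```python
import math
import random


def _cells(points, cell_size):
    """Map 2-D points to the set of grid cells they fall in."""
    return {(math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points}


def match_fraction(occurrences, reference, cell_size):
    """Share of reference cells (at this resolution) holding >= 1 occurrence."""
    ref_cells = _cells(reference, cell_size)
    occ_cells = _cells(occurrences, cell_size)
    return len(ref_cells & occ_cells) / len(ref_cells)


def dmc(occurrences, reference, cell_sizes):
    """Area under the match-vs-resolution curve, rescaled to [0, 1].

    Assumption: resolution steps are treated as equally spaced, so the
    trapezoidal rule reduces to averaging neighbouring match values.
    """
    sizes = sorted(cell_sizes)
    matches = [match_fraction(occurrences, reference, s) for s in sizes]
    area = sum((a + b) / 2 for a, b in zip(matches, matches[1:]))
    return area / (len(sizes) - 1)


def null_dmc(reference, n_occ, cell_sizes, n_draws=99, seed=0):
    """DMC values for random draws of n_occ points from the reference."""
    rng = random.Random(seed)
    return [
        dmc(rng.sample(reference, n_occ), reference, cell_sizes)
        for _ in range(n_draws)
    ]


# Toy data: the reference range is an 8 x 8 block of cell centres;
# the occurrences are clumped in one 4 x 4 corner of that range.
reference = [(i + 0.5, j + 0.5) for i in range(8) for j in range(8)]
clumped = [(i + 0.5, j + 0.5) for i in range(4) for j in range(4)]
cell_sizes = [1, 2, 4, 8]

observed = dmc(clumped, reference, cell_sizes)
null_values = null_dmc(reference, len(clumped), cell_sizes)
null_mean = sum(null_values) / len(null_values)

print(f"observed DMC: {observed:.3f}")
print(f"mean null DMC: {null_mean:.3f}")
```

In this toy case the clumped occurrences fill only a quarter of the reference cells until the coarsest grid, so the observed DMC falls below the mean of the random draws, which is the pattern a clumped sample would be expected to show under these assumptions.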

Results Comparisons with null models showed that, for most species, actual distributions within EVA deviate from null model expectations and are more clumped than expected in both geographic and climatic space. Despite high interspecific variation, we found a positive relationship in DMC values between geographic and climatic space, but sampling coverage was in general more random across geographic space.

Conclusion Because DMC values are species-specific and most biodiversity databases are clearly biased in terms of sampling coverage of species occurrences, we recommend using DMC values as covariates in macro-ecological models that use species as the observation unit.