The soil environment contains a large, but historically underexplored, reservoir of biodiversity. Sequencing prokaryotic marker genes has become commonplace for the discovery and characterization of soil bacteria and archaea. Increasingly, this approach is also applied to eukaryotic marker genes to characterize the diversity and distribution of soil eukaryotes. However, understanding the properties and limitations of eukaryotic marker sequences is essential for correctly analysing, interpreting, and synthesizing the resulting data.
Here, we illustrate several biases in sequencing data that affect measurements of biodiversity. These biases arise from variation in morphology, taxonomy and phylogeny between organisms, as well as from sampling designs. We recommend analytical approaches to overcome these limitations, and outline how the benchmarking and standardization of sequencing protocols may improve the comparability of the data.