The completeness of taxonomic inventories for describing the global diversity and distribution of marine fishes

Abstract
Taxonomic inventories (or species censuses) are the most elementary data in biogeography, macroecology and conservation biology. They play fundamental roles in the construction of species richness patterns, delineation of species ranges, quantification of extinction risk and prioritization of conservation efforts in hotspot areas. Given their importance, any issue related to the completeness of taxonomic inventories can have far-reaching consequences. Here, we used the largest publicly available database of georeferenced marine fish records to determine its usefulness in depicting the diversity and distribution of this taxonomic group. All records were grouped at multiple spatial resolutions to generate accumulation curves, from which the expected number of species was extrapolated using a variety of nonlinear models. The inventoried number of species was then compared with that expected from the models to calculate the completeness of the taxonomic inventory at each resolution. In terms of the global number of fish species, we found that approximately 21% of the species remain to be described. In terms of spatial distribution, we found that the completeness of taxonomic data was highly scale dependent, with completeness being lower at finer spatial resolutions. At a spatial resolution of 3° (approx. 350 km²), less than 1.8% of the world's oceans have more than 80% of their fish fauna currently described. Censuses of species were particularly incomplete in tropical areas and were incomplete across the entire range of countries' gross domestic product (GDP), although the few censuses nearing completion were all along the coasts of a few developed countries or territories. Our findings highlight that failure to quantify the completeness of taxonomic inventories can introduce substantial flaws in the description of diversity patterns, and raise concerns over the effectiveness of conservation strategies based upon data that remain largely precarious.
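
The workflow summarised above (accumulation curve, nonlinear extrapolation, completeness ratio) can be illustrated with a minimal sketch. The abstract does not name the models used, so the Clench (Michaelis-Menten type) curve S(n) = a·n / (1 + b·n) is assumed here purely as one common choice, and it is fitted through its linearised form rather than by true nonlinear regression. All function names are hypothetical and none of this should be read as the authors' implementation.

```python
def accumulation_curve(records):
    """Cumulative number of distinct species after each successive record."""
    seen = set()
    curve = []
    for species in records:
        seen.add(species)
        curve.append(len(seen))
    return curve


def fit_clench_asymptote(effort, richness):
    """Estimate the asymptote a/b of the assumed model S(n) = a*n / (1 + b*n).

    Uses the linearised form 1/S = (1/a)*(1/n) + b/a, fitted by ordinary
    least squares. The intercept equals b/a, so the asymptote is 1/intercept.
    This is a simplification chosen to keep the sketch dependency-free.
    """
    xs = [1.0 / n for n in effort]
    ys = [1.0 / s for s in richness]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if intercept <= 0:
        # A non-positive intercept means the curve is not yet saturating.
        raise ValueError("asymptote undefined: curve shows no saturation")
    return 1.0 / intercept


def completeness(observed, expected):
    """Fraction of the expected species pool that is already inventoried."""
    return observed / expected
```

Under these assumptions, a grid cell whose curve is extrapolated to 100 species while 79 have been recorded would have a completeness of 0.79, which corresponds to 21% of its species remaining undescribed.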