Sample data and training modules for cleaning biodiversity information

Marlon E Cobos
Laura Jiménez
Claudia Nuñez-Penichet
Daniel Romero-Alvarez
Marianna Simoes

Abstract

Large-scale biodiversity databases have become crucial sources of information for analyses in biogeography, macroecology, and conservation biology, which often involve developing empirical models of species’ ecological niches and predicting their geographic distributions. These analyses, however, can be impaired by errors in the data, particularly in taxonomic identifications and geographic coordinates. Here, we present a detailed data-cleaning exercise based on two contrasting datasets; we link these example data to a step-by-step guide for overcoming these problems and improving the quality of the data before analysis.

Article Details

Section
Biodiversity Informatics Training Modules

Author Biographies

Marlon E Cobos, University of Kansas

Department of Ecology and Evolutionary Biology and Biodiversity Institute

Ph.D. student

Laura Jiménez, University of Kansas

Department of Ecology and Evolutionary Biology and Biodiversity Institute

Ph.D. student

Claudia Nuñez-Penichet, University of Kansas

Department of Ecology and Evolutionary Biology and Biodiversity Institute

Graduate student

Daniel Romero-Alvarez, University of Kansas

Department of Ecology and Evolutionary Biology and Biodiversity Institute

Ph.D. student

Marianna Simoes, University of Kansas

Department of Ecology and Evolutionary Biology and Biodiversity Institute

Ph.D. student