On the dates of GBIF mobilised primary biodiversity records

Javier Otegui
Arturo H. Ariño
Vishwas Chavan
Samy Gaiji

Abstract

More than 267 million primary biodiversity data records have been published by hundreds of data publishers through the GBIF network, making it the single most comprehensive index for this kind of data. Ensuring or, at least, assessing data quality is of capital importance for the reliability and usability of these data. While conducting a time-data gap analysis on this mass of data, we detected some issues with the way date information is processed and shared. Dates can be obscured or altered when a publisher’s error or date-handling features combine with faulty or inadequate date parsing and processing routines. The extent of date unreliability (either at the source or through the GBIF portal) is not high, and it is concentrated in a few data publishers. We analyse the types of errors and misprocessing in dates in the sources and the published records, assess their impact on the overall data quality of the published index, and suggest corrective measures.

Article Details

Section
Articles (peer-reviewed)
Author Biographies

Javier Otegui, University of Navarra

PhD Candidate, Department of Zoology and Ecology

Arturo H. Ariño, University of Navarra

Professor of Ecology, Department of Zoology and Ecology

Vishwas Chavan, Global Biodiversity Information Facility Secretariat

Senior Programme Officer for Digitisation and Mobilisation of Primary Biodiversity Data (DIGIT)

Samy Gaiji, Global Biodiversity Information Facility Secretariat

Senior Programme Officer for Science & Scientific Liaison