Downloading images from GBIF: Licenses, citation and link rot

Authors

Quentin Cronk, Mark Pitblado

DOI:

https://doi.org/10.17161/bi.v20i1.24326

Abstract

Bulk downloading of images of preserved specimens is increasingly important for many research projects, especially those involving machine learning and image analysis. A useful source of images is the standard biodiversity aggregator, the Global Biodiversity Information Facility (GBIF). Here we identify four major issues connected to GBIF image downloads, distinct from those associated with text downloads: (1) license considerations, (2) citation issues, (3) restricting downloads to specific providers for project reasons or cybersecurity concerns, and (4) links that no longer function (often referred to as “link rot” or “data rot”). We recommend an incremental approach to downloading and suggest techniques for improving image downloads. We provide a Python implementation of our suggestions (gbifimage-downloader).
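As an illustration of issues (1) and (3), the following is a minimal sketch of screening image records by license and provider host before any download is attempted. It is not taken from gbifimage-downloader. The function name `select_images`, the record keys `identifier` and `license`, the license URIs, and the host `images.example.org` are all assumptions chosen for the example.

```python
from urllib.parse import urlparse

# Hypothetical policy values for illustration; a real project would set its own.
ALLOWED_LICENSES = {
    "http://creativecommons.org/publicdomain/zero/1.0/legalcode",
    "http://creativecommons.org/licenses/by/4.0/legalcode",
}
ALLOWED_HOSTS = {"images.example.org"}


def select_images(records, allowed_licenses=ALLOWED_LICENSES, allowed_hosts=ALLOWED_HOSTS):
    """Split media records into accepted records and (url, reason) rejections.

    Each record is assumed to be a dict with an image URL under "identifier"
    and a license URI under "license"; both key names are assumptions.
    """
    accepted, rejected = [], []
    for record in records:
        url = record.get("identifier") or ""
        license_uri = record.get("license") or ""
        host = urlparse(url).hostname or ""
        if license_uri not in allowed_licenses:
            # A missing or unlisted license is treated as not permitted.
            rejected.append((url, "license not allowed"))
        elif host not in allowed_hosts:
            # Only explicitly trusted provider hosts are contacted.
            rejected.append((url, "host not allowed"))
        else:
            accepted.append(record)
    return accepted, rejected
```

Keeping the rejected URLs together with a reason gives a record of what was skipped and why. The accepted records could then be fetched in small batches, with failed links logged rather than retried indefinitely.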

Author Biography

  • Mark Pitblado, Beaty Biodiversity Museum

    Collections Curator of Biodiversity Informatics

    Beaty Biodiversity Museum

    University of British Columbia, Vancouver, BC, Canada

Published

2026-01-08

Issue

Vol. 20 No. 1 (2026)

Section

Software and Protocols (peer-reviewed)

How to Cite

Cronk, Quentin, and Mark Pitblado. 2026. “Downloading Images from GBIF: Licenses, Citation and Link Rot”. Biodiversity Informatics 20 (1). https://doi.org/10.17161/bi.v20i1.24326.