Downloading images from GBIF: Licenses, citation and link rot
DOI: https://doi.org/10.17161/bi.v20i1.24326
Abstract
Downloading images of preserved specimens in bulk is becoming increasingly important for many research projects, especially those connected with machine learning and image analysis. A useful source of images is the standard biodiversity aggregator, the Global Biodiversity Information Facility (GBIF). Here we identify four major issues connected to GBIF image downloads, distinct from those associated with text downloads. These are (1) license considerations, (2) citation issues, (3) restricting downloads to specific providers for project reasons or cybersecurity concerns, and, finally, (4) attempting to use links that no longer function (often referred to as “link rot” or “data rot”). We recommend an incremental approach to downloading and describe techniques for more reliable image download. We provide an implementation of our suggestions in Python (gbifimage-downloader).
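As a rough illustration of issues (1), (3) and (4), the sketch below triages a batch of media records before any image is downloaded. It is not the gbifimage-downloader code or its API. The function names, the record keys `identifier` and `license`, the list of accepted license markers and the example hosts are all assumptions made for this example.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse

# Assumption: the project accepts only CC0 and CC BY 4.0 images.
ALLOWED_LICENSE_MARKERS = ("publicdomain/zero", "licenses/by/4.0")


def is_allowed(record, allowed_hosts, license_markers=ALLOWED_LICENSE_MARKERS):
    """True if the image URL is on an approved host and the license is accepted."""
    url = record.get("identifier") or ""
    license_text = (record.get("license") or "").lower()
    host = urlparse(url).hostname or ""
    return host in allowed_hosts and any(m in license_text for m in license_markers)


def fetch_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:  # covers URLError, timeouts and connection errors
        return None


def triage(records, allowed_hosts, check=fetch_status):
    """Split records into (kept, skipped, dead) without downloading any image."""
    kept, skipped, dead = [], [], []
    for record in records:
        if not is_allowed(record, allowed_hosts):
            skipped.append(record)  # wrong license or unapproved provider
        elif check(record["identifier"]) == 200:
            kept.append(record)  # link is live; safe to download and cite
        else:
            dead.append(record)  # candidate link rot; log it and move on
    return kept, skipped, dead
```

Running `triage` on one small batch at a time keeps the process incremental, and the `dead` list gives a record of non-functioning links that can be retried or reported later. Some servers reject HEAD requests, so a real tool may need to fall back to a GET.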
License
Copyright (c) 2026 Quentin Cronk, Mark Pitblado

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright for articles published in this journal is retained by the authors, with first publication rights granted to the journal. All articles are licensed under a Creative Commons Attribution Non-Commercial license.
Competing Interests: The authors have declared that no competing interests exist.