Character Selection During Interactive Taxonomic Identification: “Best Characters”

Authors

  • Nadia Talent, Royal Ontario Museum
  • Richard B. Dickinson
  • Timothy A. Dickinson, Royal Ontario Museum

DOI:

https://doi.org/10.17161/bi.v9i1.4611

Keywords:

Separation coefficient, polyclave, multi-access key, entropy, Delta-Intkey, information theory

Abstract

Software interfaces for interactive multiple-entry taxonomic identification (polyclaves) sometimes provide a “best character” or “separation” coefficient to guide the user toward a character that could most effectively reduce the number of identification steps required. The coefficient could be particularly helpful when forensic identification requires difficult or expensive tasks, and in very large databases; both uses appear likely to increase in importance. Several current systems also provide tools to develop taxonomies or single-entry identification keys, with a variety of coefficients that are appropriate to that purpose. For the identification task, however, information theory applies neatly and provides the most appropriate coefficient. To our knowledge, Delta-Intkey is the only currently available system that uses a coefficient related to information theory, and it is currently being reimplemented, which may allow for improvement. We describe two improvements to the algorithm used by Delta-Intkey. The first improves transparency as the number of remaining taxa decreases, by normalizing the range of the coefficient to [0,1]. The second concerns numeric ranges, which require consistent treatment of sub-intervals and their end-points. A stand-alone Bestchar program for categorical data is provided in the Python and R languages. The source code is freely available and dedicated to the Public Domain.
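
As an illustration of the kind of coefficient the abstract describes, the following is a minimal sketch of an entropy-based character ranking whose values fall in [0,1]. It is not the Bestchar program or the Delta-Intkey algorithm; it assumes that every remaining taxon has exactly one state per categorical character, that all taxa are equally likely, and that dividing the Shannon entropy by log2 of the number of remaining taxa is an acceptable normalization. The function names and the example taxa are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(states):
    """Shannon entropy of a character's state distribution, scaled to [0, 1].

    `states` holds one state label per remaining taxon (an assumed input
    format). Dividing by log2(n) means a character that separates all n
    taxa scores 1 and a character that separates none scores 0.
    """
    n = len(states)
    if n < 2:
        return 0.0  # nothing left to separate
    counts = Counter(states)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(n)


def rank_characters(taxa):
    """Return (character, coefficient) pairs, best character first.

    `taxa` maps a taxon name to a dict of character -> state
    (a hypothetical data layout chosen for this sketch).
    """
    characters = sorted({ch for states in taxa.values() for ch in states})
    scored = [
        (ch, normalized_entropy([states[ch] for states in taxa.values()]))
        for ch in characters
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical remaining taxa after some identification steps.
taxa = {
    "taxon_a": {"leaf_shape": "lobed", "fruit_colour": "red"},
    "taxon_b": {"leaf_shape": "lobed", "fruit_colour": "black"},
    "taxon_c": {"leaf_shape": "entire", "fruit_colour": "yellow"},
    "taxon_d": {"leaf_shape": "lobed", "fruit_colour": "red"},
}

for character, coefficient in rank_characters(taxa):
    print(f"{character}: {coefficient:.3f}")
```

In this toy data set, fruit colour splits the four taxa into three groups and so ranks above leaf shape, which splits them into only two. Because the coefficient is rescaled by the number of remaining taxa, values stay comparable as the candidate list shrinks.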

Published

2014-03-27

Issue

Section

Articles (peer-reviewed)

How to Cite

Talent, Nadia, Richard B. Dickinson, and Timothy A. Dickinson. 2014. “Character Selection During Interactive Taxonomic Identification: ‘Best Characters’”. Biodiversity Informatics 9 (1). https://doi.org/10.17161/bi.v9i1.4611.