Background & Summary

Hormones are central regulators of phenotype, whose effects span multiple fields of research, from molecular biology to population biology1–5. Because of their role in regulating organismal function and flexibility, selection might be expected to constrain hormone levels or their context-dependent flexibility around one or more fitness optima6–9. Nevertheless, endocrine responses vary markedly both within and among populations8,10–12. Why do some individuals mount a hormonal response that is two or more orders of magnitude greater than others, when faced with the same stimulus? Similarly, why have some species evolved to express plasma testosterone levels that are an order of magnitude greater than others during reproduction, when testosterone mediates the same basic reproductive processes?

A particularly promising approach to answering such questions – and many others of broad interest to animal behaviour and organismal biology – lies in large-scale comparative analyses of the multitude of endocrine data that have been collected over the past several decades. Such analyses, conducted within a rigorous phylogenetic, environmental, and life-history framework, have the potential to illuminate the factors driving divergence in the hormonal mechanisms of behaviour, physiology, and morphology13,14. To date, most analyses have focused on relatively small taxonomic scales, and on comparing mean trait values across populations and species15–19. However, resources are rapidly becoming available to aggregate and analyse decades of available data on circulating hormone levels and their variation within free-living populations, across taxonomic groups. Identifying and characterizing the variation in endocrine traits, and their links with environment, life history, and fitness, could provide insight into how endocrine systems evolve, and how selection on these phenotypic integrators may influence the dynamics and distribution of populations20–23.

In this context, we present HormoneBase, a resource of compiled endocrine data across vertebrates. Included in this dataset are >6,580 measures of mean and within-population variation in glucocorticoids and androgens from 476 species (Figs 1,2; Table 1) that were reported in 648 publications – and additional unpublished resources – between 1967 and 2015. Additional information on geographic location (Fig. 3), life history, study design, and time period accompanies each entry. By making HormoneBase publicly available we aim to encourage data sharing across the scientific community and facilitate research into the function and evolution of physiological traits.

Figure 1: Total data entries in HormoneBase for each steroid by measurement type and sex.

Within each category, counts are shown separately for mean, coefficient of variation, and range.

Figure 2: The number of species with data on mean hormone concentrations in HormoneBase.

Counts are shown separately for males and females, and for androgens, baseline glucocorticoids, and stress-induced glucocorticoids.

Table 1: The taxonomic distribution of entries of mean circulating hormone levels in HormoneBase for males (M) and females (F).

Figure 3: The geographical distribution of entries in HormoneBase.

Points represent the location of measurements of (a) androgens, (b) baseline glucocorticoids, and (c) stress-induced glucocorticoids. Precipitation patterns reflect sums for December 2015 and were acquired from the CRU-TS 4.0 Climate Database36.

Methods

Hormonal Data

Endocrine data were obtained from publications, and from several unpublished datasets (Data Citation 1). We searched for studies that conformed to our inclusion criteria using: (i) online academic databases (e.g., Google Scholar, Web of Science), and (ii) cross-referencing from other published works. Studies were selected for inclusion if they included data on circulating glucocorticoids (baseline or stress-induced corticosterone/cortisol) or androgens (testosterone/11-ketotestosterone) that: (i) were from free-living populations, (ii) were collected from adults that had not been subject to an experimental manipulation prior to sampling (e.g., of hormones or the environment), (iii) measured plasma levels, (iv) did not pool data across males and females, or across adults and juveniles, and (v) were reported in or could be converted to a standard unit of measurement (ng/mL).
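Criterion (v) requires values that are reported in, or can be converted to, ng/mL. The source does not describe how conversions were performed, so the following is only a minimal sketch of one way to do it: mass-based units scale by a constant factor, while molar units also need the molar mass of the steroid. The function name `to_ng_per_ml`, the accepted unit strings, and the rounded molar masses are illustrative assumptions, not part of HormoneBase.

```python
# Approximate molar masses (g/mol) for the four steroids named in the text.
# These rounded values are assumptions supplied for illustration.
MOLAR_MASS = {
    "corticosterone": 346.46,
    "cortisol": 362.46,
    "testosterone": 288.42,
    "11-ketotestosterone": 302.41,
}

# Multipliers that turn mass-per-volume units into ng/mL.
MASS_FACTORS = {
    "ng/mL": 1.0,
    "pg/mL": 1e-3,
    "ng/dL": 0.01,
    "ug/dL": 10.0,  # 1 ug/dL = 1000 ng per 100 mL
}


def to_ng_per_ml(value, unit, hormone=None):
    """Convert a hormone concentration to ng/mL (hypothetical helper)."""
    if unit in MASS_FACTORS:
        return value * MASS_FACTORS[unit]
    if unit == "nmol/L":
        if hormone not in MOLAR_MASS:
            raise ValueError("a known hormone is needed for molar units")
        # 1 nmol/L = molar_mass ng/L = molar_mass / 1000 ng/mL
        return value * MOLAR_MASS[hormone] / 1000.0
    raise ValueError(f"unsupported unit: {unit}")
```

For example, `to_ng_per_ml(500, "pg/mL")` gives 0.5 ng/mL, and 10 nmol/L of testosterone corresponds to roughly 2.88 ng/mL under the molar mass assumed here.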

Published values were obtained from text, tables, or supplementary materials, or extracted from published figures using the program Data Thief III (http://datathief.org). Entries include mean circulating concentrations (ng/mL) for each population/group and time period; whenever possible, data on within-population variation (coefficient of variation, standard error), range (maximum and minimum values), and sample size are also included. When papers did not directly report the coefficient of variation (CV), it was calculated from the standard deviation (SD) or standard error (SE) and sample size (n), according to the following formulas: $\mathrm{CV} = \frac{\mathrm{SD}}{\mathrm{mean}} \times 100$ or $\mathrm{CV} = \frac{\mathrm{SE} \times \sqrt{n}}{\mathrm{mean}} \times 100$. If papers reported that outliers had been excluded we noted this for each hormone measure, and noted the criteria for exclusion where provided.
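The two CV conversions can be written as a pair of small functions. This is a sketch, with function names chosen here for illustration; the second function relies on the standard relationship SD = SE × √n.

```python
import math


def cv_from_sd(sd, mean):
    """Coefficient of variation (%) from a standard deviation and a mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100.0


def cv_from_se(se, n, mean):
    """Coefficient of variation (%) from a standard error and sample size.

    Recovers the standard deviation as SE * sqrt(n) before dividing by the mean.
    """
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return se * math.sqrt(n) / mean * 100.0
```

With a mean of 8.0 ng/mL, an SD of 2.0 and an SE of 0.5 from n = 16 individuals describe the same spread, and both functions return a CV of 25%.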

When a single reference reported multiple means for different groups of individuals (e.g., populations or life history stages), or from different time points, data were entered on separate lines. In cases where papers reported a single hormonal mean from data collected across multiple populations, the location of up to three of the sampled populations was noted in the entry. When stress-induced glucocorticoid levels were measured at multiple time points during a standardized stress series, only the time period at which mean glucocorticoid levels were highest was included.
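The rule for stress series, keeping only the time point with the highest mean glucocorticoid level, amounts to taking a maximum over the series. The sketch below assumes each series is available as a list of (minutes after capture, mean in ng/mL) pairs; that representation and the function name are assumptions made for illustration, and the source does not say how ties were handled.

```python
def peak_stress_sample(series):
    """Return the (minutes, mean_ng_per_ml) pair with the highest mean.

    `series` is a list of (minutes_after_capture, mean_ng_per_ml) tuples.
    If several time points share the highest mean, the earliest in the
    list is returned (the behaviour of Python's built-in max).
    """
    if not series:
        raise ValueError("the stress series is empty")
    return max(series, key=lambda point: point[1])


# Hypothetical series: means sampled 3, 15, 30 and 60 minutes after capture.
example_series = [(3, 2.1), (15, 18.4), (30, 31.7), (60, 27.9)]
peak = peak_stress_sample(example_series)  # the 30-minute sample
```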

We focused HormoneBase on androgens and glucocorticoids because these are currently the most widely sampled hormones across vertebrates. Because hormone concentrations are not directly comparable across biological matrices, we included only plasma hormone concentrations. Hormone levels are also increasingly being measured in other biological matrices (e.g., feces, feathers), but these sample types are poorly suited to large-scale comparative analyses because they rely on hormone metabolites, which differ within and across species and among assay/antibody types24.

Sample Collection and Assay Method Data

Because sampling method and assay technique may influence measured hormone levels, we included specific information about capture, sampling, and assay approach. The time of day (i.e., range of hours) that samples were collected was recorded as provided. The specific method used to capture free-living individuals was noted, and the capture/sampling method was assigned to one of three categories. “Active” methods were those in which a blood sample was obtained rapidly and within a known period of time after targeting a previously undisturbed animal. “Passive” methods were those in which animals were sampled after an unknown period of restraint (e.g., non-continuously monitored traps or nets). “Attractant” methods were those in which animals were drawn to the site of capture using some type of attractant (e.g., song playbacks, baited traps). The maximum sampling latency (interval from capture to blood sampling) was recorded for androgens and baseline glucocorticoids. For stress-induced glucocorticoids, the type of acute stressor and the interval from initial capture to the collection of the stress-induced sample were also recorded.

To explore and control for potential differences in assay technique, we included information on the assay method used to assess plasma hormone levels (e.g., radioimmunoassay, enzyme immunoassay), and, where provided, the specific antibody or commercial kit that was used. Because laboratory identity can also influence measured hormone levels25, we recorded the identity of the laboratory in which hormone assays were conducted. For collaborative papers that did not identify where assays were conducted, and were the product of multiple endocrine laboratories, we arbitrarily assigned one of the collaborating laboratories as the assay location.

Taxonomic and Geographical Data

All endocrine data include associated taxonomic information, using common and scientific names. Where relevant, scientific names were updated to reflect recent reclassifications. Taxonomy was determined using major lineage-specific trees (ray-finned fishes26, amphibians27,28, mammals29, squamates30, turtles31, and birds32).

The location name, geographic coordinates (latitude and longitude in decimal degrees), and elevation (in meters) of the population from which the data were collected are also recorded for each entry. When not provided in the original publication, approximate geographic coordinates and elevation were determined by searching for the location name in Google Earth.

Temporal and Life History Data

To enable assessment of seasonal and life history patterns17,33,34, we included information on the time period of sampling as reported (the month and year in which data were collected) and the life history stage of sampled individuals. Measurements were characterized as coming from breeding or non-breeding individuals, or a combination of the two. Designations were based on author classifications when provided. When life history stage was not provided in the original data source, samples were classified as coming from a combination of breeding and non-breeding individuals, except in cases where seasonally breeding populations were sampled only during months that did not overlap with the breeding season.

When life history sub-stage was provided in the original data source, this information was also included in the database. To provide some standardization across species and across widely varying terminology, reported sub-stages were combined into fourteen categories: pre-breeding, courtship, incubation, copulation, gravid/pregnant, non-gravid/pregnant, laying, young care, lactation, post-breeding, migration, torpor, hibernation, and pre-basic moult. When information about life history sub-stage was not contained in the original data source, this field was left blank. An associated column provides information about whether the sampled individuals were confirmed to be in a given life history stage (e.g., incubating birds captured off their nests), or whether the life history stage reflected the typical stage for individuals in that population at the time of sampling (e.g., birds sampled in mist nets during the breeding season but not traced to a specific nest). For birds, information on moult status was also recorded as provided35.

Data Records

The HormoneBase dataset (Data Citation 1) is provided as two comma-separated values text files: a single file that includes all data described above (HormoneBase_v1.csv), and a file that contains the reference information for the source of each entry (HormoneBase_references_v1.csv). Variable names are provided in the first row, with details of each variable and units measured summarized in Table 2 (available online only). These files are accompanied by a metadata pdf file (HormoneBase_MetadataData.pdf).
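Because both files are plain CSV with variable names in the first row, they can be read with Python's standard `csv` module and linked on a shared reference identifier. The sketch below is one possible way to do this, not a documented workflow: the helper names are hypothetical, the UTF-8 encoding is an assumption, and the name of the linking column must be taken from Table 2 or the metadata file, since it is not given in the text.

```python
import csv


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row.

    The encoding is assumed to be UTF-8; adjust it if the files differ.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def attach_references(entries, references, key):
    """Pair each data entry with its reference row (or None if unmatched).

    `key` is the name of the column shared by both files. It is a
    placeholder here and must be replaced by the real variable name.
    """
    by_key = {ref[key]: ref for ref in references}
    return [(entry, by_key.get(entry[key])) for entry in entries]
```

A typical call would load `HormoneBase_v1.csv` and `HormoneBase_references_v1.csv` with `load_rows` and pass both lists to `attach_references` along with the shared column name. All values are returned as strings, so numeric fields need explicit conversion before analysis.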

Table 2: Variables included in HormoneBase.

Technical Validation

The data presented in HormoneBase are primarily from published, peer-reviewed sources, but also contain unpublished data provided by authors. Data entry was initially proofed by each lab that entered the data to confirm that the entries matched reported data. Upon submission to the central repository, two members of the database entry team independently examined each entry to identify incomplete entries or extreme values. All hormone measures were also mapped onto a phylogeny to reveal putative taxonomic outliers. When such cases were identified, entries were confirmed or corrected by consulting the original source.
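Users who extend the dataset may want an automated first pass before manual checking of this kind. The following sketch is a hypothetical screening aid and not the authors' validation procedure, which relied on independent examination and phylogenetic mapping. It flags rows with blank required fields and values lying more than a chosen number of orders of magnitude from the median; the field names and the threshold are assumptions.

```python
import math
from statistics import median


def _is_blank(value):
    """True for None or for strings that are empty or only whitespace."""
    return value is None or not str(value).strip()


def flag_entries(rows, required, value_field, max_log10_dev=2.0):
    """Return (row_index, reason) pairs for entries that need a manual look.

    rows: list of dicts, e.g. as produced by csv.DictReader.
    required: field names that must not be blank.
    value_field: field holding the hormone concentration.
    max_log10_dev: allowed distance from the median, in log10 units.
    """
    needed = list(dict.fromkeys(list(required) + [value_field]))
    flags = []
    usable = []
    for index, row in enumerate(rows):
        missing = [name for name in needed if _is_blank(row.get(name))]
        if missing:
            flags.append((index, "incomplete: " + ", ".join(missing)))
            continue
        value = float(row[value_field])
        if value <= 0:
            flags.append((index, "non-positive value"))
        else:
            usable.append((index, math.log10(value)))
    if usable:
        centre = median(log_value for _, log_value in usable)
        for index, log_value in usable:
            if abs(log_value - centre) > max_log10_dev:
                flags.append((index, "extreme value"))
    return flags
```

Flagged rows are candidates for re-checking against the original source rather than for automatic removal, since a large value may reflect genuine biological variation.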

Usage Notes

The data are available to access and download from the Figshare repository (Data Citation 1). Three files are provided:

  1. HormoneBase_v1.csv

  2. HormoneBase_references_v1.csv

  3. Metadata.pdf

Additional information

How to cite this article: Vitousek, M. N. et al. HormoneBase, a population-level database of steroid hormone levels across vertebrates. Sci. Data 5:180097 doi: 10.1038/sdata.2018.97 (2018).

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.