Long-term surveys of age structure in 13 ungulate and one ostrich species in the Serengeti, 1926–2018

The Serengeti ecosystem spans an extensive network of protected areas in Tanzania, eastern Africa, and is a UNESCO World Heritage Site. It is home to some of the largest animal migrations on the planet. Here, we describe a dataset consisting of the sample counts of three age classes (infant, juvenile and adult) of 13 ungulate and one ostrich species. Sample counts were tallied visually from the ground or, in some instances, from aerial photographs, during a period extending from 1926 to 2018. Observed animals were assigned to age classes based on specific criteria for each species. For nine of the 14 species in this dataset, the number of sampling years is over 30. This resulted in a total of 533 different records of counts across age classes. By computing age-class ratios, these data can be used to measure long-term recruitment success at different ages of the tallied species. In particular, the temporal extent of these data allows comparison of patterns to other long-term processes, such as the El Niño-Southern Oscillation (ENSO).

1785). In addition, this dataset includes similar measurements for the ostrich (Struthio camelus Linnaeus 1758). Samples were obtained by driving along roads and recording the age and sex of animals out to approximately 100 m, a distance at which age classes can still be readily identified. For common species, listed in Method 1 below, this sampling was conducted once or twice a year at specific times, while for rarer species, listed in Method 2 below, observations were ad hoc and all records for the year were summed. A special case of Method 1 was the very large herds of wildebeest (C. taurinus) and zebra (Eq. quagga), for which subsampling along transects was needed. Very early samples of African buffalo (Sy. caffer), giraffe (G. camelopardalis) and wildebeest (C. taurinus) were obtained from aerial photographs (Method 3, explained below). Although they do not provide a complete census of the populations, these data can be used to estimate rates of reproduction (# infants / # adult females) or effective recruitment (# juveniles / # adult females) across the years of sampling.
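The two ratios named above can be illustrated with a minimal Python sketch. The function name `age_class_ratio` and the counts are hypothetical and are not taken from the dataset; the sketch only assumes that each sample supplies counts of infants, juveniles and adult females.

```python
from typing import Optional


def age_class_ratio(young: int, adults: int) -> Optional[float]:
    """Return young animals per adult, or None when no adults were counted."""
    if young < 0 or adults < 0:
        raise ValueError("counts must be non-negative")
    if adults == 0:
        return None  # the ratio is undefined without a denominator
    return young / adults


# Hypothetical counts for a single sample (illustrative only).
infants, juveniles, adult_females = 30, 20, 100

# Rate of reproduction: # infants / # adult females.
reproduction_rate = age_class_ratio(infants, adult_females)
# Effective recruitment: # juveniles / # adult females.
effective_recruitment = age_class_ratio(juveniles, adult_females)

print(reproduction_rate, effective_recruitment)
```

Returning None rather than zero for a sample without adults keeps undefined ratios from being mistaken for recruitment failure.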

Methods
There were three methods of sampling the populations. For Methods 1 and 2, records were obtained by driving along the road transects, and stopping to score the age groups in herds within some 100 m of the road. There were three road transects, entirely in the administrative boundaries of Serengeti National Park and consistent every year (1962-2018), with records summed over the three for each data entry. Transect 1 was from Seronera (34.823°E, 2.428°S) west to Kirawira (34.208°E, 2.151°S; 120 km), Transect 2 from Seronera to Bologonja (35.173°E, 1.757°S; 115 km), and Transect 3 from Seronera to Olduvai Gorge (35.35°E, 2.993°S; 75 km) (Fig. 1). The first two transects were in similar savanna ecosystems, and comparison of samples from these two showed close similarity.
The criteria for age classes in each species are given in Online-only Table 1. The sample was the herd within view (such as a group of impalas (Ae. melampus) or hartebeests (Al. buselaphus), which occur in discrete groups), or a subset of it if the herd was very large. One observer, using 8-10 x magnification binoculars, called out the age category while a recorder entered the records on data sheets. These were later entered digitally.
Two exceptions to this were the immense herds of migrant wildebeest (C. taurinus) and zebra (Eq. quagga). Because they were numerous and extensive, herds had to be sampled in a systematic way. A vehicle drove through the herds, stopping every half kilometer, where a 180 degree scan out to 100 m was conducted to count the sample within view. The transects ran from the start to the end of the herd, with some being 30 km long through a single, continuous herd. Method 3 used aerial pictures of the herds to score age groups. Although the sampling protocol differed among the three methods (due to different distributions of each species), the same criteria for identifying age classes were used in all methods. All methods used either systematic or random sampling of the populations.
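To give a sense of scale for the systematic herd transects, the following Python sketch computes the number of scan stops and the ground area covered by each scan. It assumes, for illustration only, that a scan is made at the start of the herd as well as at every 500 m thereafter, and that each 180 degree scan covers a half-disc of radius 100 m. The function names are hypothetical.

```python
import math


def scan_stops(herd_length_m: int, spacing_m: int = 500) -> int:
    """Number of scan stops, assuming one stop at the start of the herd."""
    if herd_length_m < 0 or spacing_m <= 0:
        raise ValueError("length must be >= 0 and spacing must be > 0")
    return herd_length_m // spacing_m + 1


def half_disc_area_m2(radius_m: float = 100.0) -> float:
    """Area of a 180 degree scan out to radius_m, treated as a half-disc."""
    return math.pi * radius_m ** 2 / 2


stops = scan_stops(30_000)  # a 30 km transect through one continuous herd
area_per_scan = half_disc_area_m2()
total_area_km2 = stops * area_per_scan / 1e6

print(stops, round(area_per_scan), round(total_area_km2, 2))
```

Because stops are 500 m apart and each scan reaches only 100 m, successive scans do not overlap under these assumptions, so no animal is counted twice by adjacent scans.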
All species were classified as either migrants, if the species shows seasonal variation in habitat, or residents, if the species remains in the same area of the park year-round. A notable exception is the wildebeest (C. taurinus), which had two populations: a large migrant herd and a small resident herd at the far western end of the ecosystem. These two were sampled separately and scored as either migrant or resident.
Populations of the common species (Method 1) were sampled once or twice a year at specific times, depending on the availability of different age classes in the areas near transects. Because ungulates had different birth seasons, samples were collected at two time periods, once in mid-year and once at year-end. Only one time period per year was used for each species. The early age group, "infants", was usually sampled near the end of the rainy season (March-June), since many species give birth during the rainy season. For some species, there was a second sampling period (August-December) at the end of the dry season, to measure the survival of juveniles during this period of ecological stress. In a few cases, more than two samples were obtained in a single year, so as to track the survival of the whole cohort throughout the year.
The rarer species (Method 2) were sufficiently scarce that an adequate sample could not be obtained at specific times. For these, records were scored whenever the species was seen in a sampling period, and the records for all sampling periods of a given year were then summed. A special case was Thomson's gazelle (Eu. thomsonii), which, although numerous, was scored only during one short time period (1992-1994) in the months of August and September.
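The yearly summation used for the rarer species can be sketched in Python as follows. The species label, the record layout (species, year, infants, juveniles, adults) and the counts are hypothetical and serve only to illustrate pooling several ad hoc sightings into one record per species and year.

```python
from collections import defaultdict

# Hypothetical ad hoc sightings: (species, year, infants, juveniles, adults).
sightings = [
    ("species_a", 1990, 1, 0, 4),
    ("species_a", 1990, 0, 2, 3),
    ("species_a", 1991, 2, 1, 5),
]


def sum_by_year(records):
    """Sum age-class counts over all sightings of a species within a year."""
    totals = defaultdict(lambda: [0, 0, 0])
    for species, year, infants, juveniles, adults in records:
        entry = totals[(species, year)]
        entry[0] += infants
        entry[1] += juveniles
        entry[2] += adults
    return {key: tuple(counts) for key, counts in totals.items()}


yearly = sum_by_year(sightings)
for (species, year), counts in sorted(yearly.items()):
    print(species, year, counts)
```

Summing counts before computing any ratio weights each sighting by the number of animals seen, which is what pooling all records for the year implies.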

Method 3. This method was used in sample years 1965-1973 for African buffalo (Sy. caffer), and 1926-1933 for giraffe (G. camelopardalis tippelskirchi), wildebeest (C. taurinus), and zebra (Eq. quagga). The area covered was in all cases within the Serengeti ecosystem. Buffalo and giraffe were only found in the savanna, while wildebeest were sampled when they were on the plains. Flights were made systematically over the area: wildebeest were sampled using photographs at regular intervals, whereas buffalo and giraffe were sampled when they were encountered.
The third method, applied only in the very early years, used aerial photographs to identify age classes and females. The same criteria for identifying age classes were used as for Methods 1 and 2 (Online-only Table 1).
Table 1 (partial; only the following column descriptions are recoverable).
Column: Description
(column name not recoverable): Whether the population is migrant ("M") or resident ("R") in the ecosystem
sampling_type: Which age classes were recorded for that specific sample: infants + juveniles + females ("ijf"), infants + juveniles + all adults ("ija"), infants + females ("if"), juveniles + females ("jf"), infants + all adults ("ia"), or juveniles + all adults ("ja")
sampling_method: Whether sampling Method 1 (sampling once or twice a year at specific times), 2 (yearly counts for rare species) or 3 (tallying using aerial pictures) was used for that specific sample
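The sampling_type codes described above determine which ratios can be computed for a given record. A minimal Python sketch of decoding them follows; the function name `decode_sampling_type` and the keys of the returned dictionary are hypothetical, and only the six code strings come from the column description.

```python
VALID_CODES = {"ijf", "ija", "if", "jf", "ia", "ja"}


def decode_sampling_type(code: str) -> dict:
    """Translate a sampling_type code into the age classes it records."""
    cleaned = code.strip().lower()  # tolerate stray whitespace around the code
    if cleaned not in VALID_CODES:
        raise ValueError(f"unknown sampling_type: {code!r}")
    return {
        "infants": "i" in cleaned,
        "juveniles": "j" in cleaned,
        # The last letter gives the adult class: "f" females, "a" all adults.
        "adult_class": "females" if cleaned.endswith("f") else "all adults",
    }


for code in sorted(VALID_CODES):
    print(code, decode_sampling_type(code))
```

A record coded "ja", for example, supports a juvenile-to-adult ratio but not an infant ratio, and its denominator includes males as well as females.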

Data Records
The dataset includes 533 different year-month-species measurements, represented as a list of species names and their counts in each age class. As shown in Table 1, the data consist of 15 columns comprising taxonomic information, the count in each age class, and information on sampling. The data are provided in a .txt file 12.

Technical Validation
Sampling of all herds seen along transects was designed to provide an unbiased measurement of recruitment success in the populations relative to the number of females. Therefore, males were not part of this sampling program for most species. This focus on females was important because males of many species separate from the female herds and become solitary or form bachelor herds. An unbiased sampling of males would therefore require an unmanageably large sample over the whole ecosystem. However, the sexes of two species, zebra (Eq. quagga) and warthog (P. africanus), could not be identified with certainty, so males and females were recorded together as adults. In both of these species, males are evenly distributed with the females, so sampling remained relatively unbiased. This observation is based on a subset of data where the sexes could be distinguished and on published research 13. In addition, although age is continuous, our age categories were discrete. This could induce bias when observed young are at the border between two age categories. However, the observers used consistent criteria to identify age classes (Online-only Table 1). To reduce the individual biases that could arise from different observers, only one observer (A.R.E. Sinclair) scored the observations before 1997, including the photographs, and four observers conducted the survey between 1997 and 2018. All other observers were thoroughly trained by A.R.E. Sinclair.

Code availability
The code for compiling and generating the dataset is available in the GitHub repository for the project 14.