Main

Anogenital distance (AGD), the distance from the anus to the genitals, is longer in males than in females in both rodents and humans (1, 2, 3, 4, 5). In male offspring, normal differentiation and development of the male reproductive system, including AGD, is dependent on adequate androgen levels in utero during the masculinization programming window (6, 7, 8, 9). In rodents, deficient androgen action during this critical window, due to exposure to phthalates or other compounds with anti-androgenic effects, leads to a decrease in AGD, along with an increased occurrence of other reproductive disorders (10, 11, 12).

Swan et al. suggested that AGD may also be a marker of anti-androgen exposure during the masculinization programming window in humans, and found that mothers with high phthalate levels gave birth to boys with shorter AGD (13), findings that have subsequently been replicated in other cohorts (14, 15, 16, 17). Furthermore, AGD has been associated with other reproductive end points in humans. Boys born with cryptorchidism or hypospadias have shorter AGD (18, 19, 20, 21), and cross-sectional studies have found associations between shorter AGD and lower testosterone levels, poorer semen quality, and infertility in adult men (22, 23, 24, 25). In women, a longer adult AGD has been associated with higher testosterone levels and larger ovarian follicle number, whereas shorter AGD has been linked to endometriosis (26, 27, 28). However, it is not known whether AGD represents a phenotypic signature that correlates within an individual from infancy through childhood and puberty to adulthood, which is important if AGD is to be used as a lifelong marker of in utero disruption of the reproductive system. Two previous longitudinal studies have assessed age-related changes in AGD and showed correlations between AGD measurements over time during the first years of life (29, 30). As AGD measures are short in infants, precision of measurements needs to be taken into account, and there is a need for larger longitudinal studies with longer follow-up. We therefore measured AGD repeatedly at 3 and 18 months of age in 689 children to assess age-related changes and reproducibility of measurements.

Methods

Study Population and Measurements

The study population is part of the ongoing population-based cohort, the Odense Child Cohort, described in detail in Kyhl et al. (31). In short, pregnant women living in the Municipality of Odense, Denmark, between January 2010 and December 2012, were invited to participate in the cohort at their first antenatal visit, between gestational week 8 and 16. Of the eligible population of 6,707 pregnant women, 43% accepted to be enrolled. The study complied with the Helsinki II declaration and was approved by The Health Research Ethics Committee in Denmark and the Danish Data Protection Agency (j.no. 2008−58−0035).

The participating women answered several questionnaires, including information on country of origin. Data on gestational age, weight, and length at birth were obtained from birth records. The children were invited to participate in the first examination 3 months after the expected date of birth, regardless of chronological age, and again at the chronological age of 18 months. At the examinations, the children’s length and weight were assessed and AGDs were measured.

For AGD measurements, the child was placed on a flat surface and positioned with legs held back and apart in frog position in accordance with standardized methods developed for “The Infant Development and the Environment Study” (TIDES) (32), but with the legs held in a 45−60° angle from the torso at the hip instead of a 60−90° angle. With a skin marker, a mark was made close to the center of the anus and used for the subsequent measurements, which were conducted with a Vernier caliper with numbers facing away from the examiner. In boys, AGD was measured from the center of the anus to the posterior base of the scrotum (AGDas), and to the cephalad insertion of the penis (AGDap). Penile width (PW) was measured at the base of the penis. Correspondingly, in girls the AGD was measured from the center of the anus to the posterior fourchette (AGDaf) and to the top of the clitoris (AGDac). Without repositioning the child, the measurements were repeated three times, closing the caliper in between, and an arithmetic mean was calculated. All examiners went through special training sessions and supervision to obtain the highest possible accuracy. The first 46 AGD measurements in girls were excluded because of low accuracy.

The present study includes singleton children born to women of Caucasian origin who had a measurement of AGD at both the three and eighteen months’ examinations, leaving 407 boys and 282 girls eligible for analyses. Other children in the cohort with an AGD measurement were included as a reference population (boys 3 months N=966; boys 18 months N=788; girls 3 months N=791; girls 18 months N=633).

Statistics

Summary statistics (mean and SD or percentage) were calculated separately for boys and girls on data from birth and from each of the two examinations. The AGD was plotted to illustrate differences with age and sex of the child. The change in AGD between the two examinations within each child was illustrated graphically and the paired difference was calculated. Subanalyses were stratified by whether measurements had been conducted by the same or by two different examiners and by study period, as precision may improve over time. Furthermore, for each child, an AGD z-score for each of the two examinations was calculated based on the average of the child’s three measurements, using all children in the Odense Child Cohort with a measurement at the given examination as reference. At each examination, the z-score was calculated as the difference between the child’s AGD and the mean AGD in the reference population divided by the AGD SD at ~3 or 18 months of age, respectively. All calculations were performed separately for boys and girls because of differences in reference values between the sexes. The AGD z-scores from the two examinations were plotted against each other and a paired intra-class correlation coefficient was calculated. In subanalyses, AGD z-scores were calculated and stratified by age at examination (monthly intervals) for the three months’ examinations, as age within this examination period was positively associated with AGD, whereas this was not the case at 18 months.
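The z-score and paired correlation described above can be sketched as follows. This is a minimal illustration, not the authors' SAS code: the function names and all numbers are hypothetical, and because the text does not specify which intra-class correlation variant was used, a one-way random-effects ICC for two values per child is assumed here.

```python
from statistics import mean, stdev


def z_score(child_measurements, reference_values):
    """Z-score of a child's mean AGD against a same-sex reference population
    measured at the same examination (hypothetical helper)."""
    child_agd = mean(child_measurements)   # average of the three repetitions
    ref_mean = mean(reference_values)      # mean AGD in the reference population
    ref_sd = stdev(reference_values)       # sample SD in the reference population
    return (child_agd - ref_mean) / ref_sd


def icc_oneway(pairs):
    """One-way random-effects ICC for paired values (two per child).
    Assumption: the source does not state which ICC variant was used."""
    n = len(pairs)
    k = 2  # two examinations per child
    grand = mean([v for p in pairs for v in p])
    # Between-child and within-child mean squares from a one-way ANOVA
    ms_between = k * sum((mean(p) - grand) ** 2 for p in pairs) / (n - 1)
    ms_within = sum((v - mean(p)) ** 2 for p in pairs for v in p) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Made-up values in mm, for illustration only
z_3_months = z_score([22.0, 23.0, 24.0], [20.0, 22.0, 24.0, 26.0, 28.0])
```

In use, each child would contribute one pair of z-scores (3 months, 18 months) to `icc_oneway`, computed separately for boys and girls.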

To investigate reproducibility of the measures, we added extra measurements from children who had been examined twice by two different examiners as part of the quality-control program (nboys=23 and ngirls=17). The measurements were repeated by the second examiner without repositioning the child. Mixed-effects models were fitted to estimate the variance components for each of the AGDs and PW. An AGD z-score for each of the three repetitions was calculated for both the three and eighteen months’ examinations. Child, examiner, and occasion were accounted for using random effects, whereas the age of the child at examination (continuous variable) and an age × examination time (first or second) interaction were included as fixed effects. In subanalyses, the child’s weight (continuous variable) and a weight × time interaction or the child’s length (continuous variable) and a length × time interaction were, furthermore, included. The model divided the total variance into components, which are presented as percentages: the percentage of total variation due to biological differences between children, differences between examiners, differences within examiners, and unaccounted variance. The reproducibility of the measures within examiners was expressed as reliability coefficients, calculated as the between-children, between-examiner, and unaccounted variance components divided by the total variance, with a coefficient of 1 indicating perfect reliability. The impact of multiple measurements on reliability coefficients was investigated as described in Papadopoulou et al. (30: page 92, line 8−12): “The reliability of the mean of m replicate measurements (ρm) was obtained as

$$\rho_m = \frac{m\rho}{1 + (m - 1)\rho}$$

where m is the number of the repetitions and ρ is the reliability of a single measurement.”
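A minimal sketch of these two calculations, under stated assumptions: the single-measurement reliability follows the definition given in the text (all variance components except the within-examiner component, divided by the total), and the reliability of a mean is assumed to follow the standard Spearman-Brown form ρm = mρ / (1 + (m − 1)ρ). The function names and the variance values are hypothetical and are not taken from Table 2.

```python
def reliability_single(var_children, var_between_examiners,
                       var_within_examiner, var_unaccounted):
    """Within-examiner reliability coefficient as described in the text:
    (between-children + between-examiner + unaccounted) / total variance."""
    total = (var_children + var_between_examiners
             + var_within_examiner + var_unaccounted)
    return (var_children + var_between_examiners + var_unaccounted) / total


def reliability_of_mean(rho, m):
    """Reliability of the mean of m replicate measurements, assuming the
    Spearman-Brown form m*rho / (1 + (m - 1)*rho)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return m * rho / (1 + (m - 1) * rho)


# Made-up variance components (arbitrary units), for illustration only
rho_1 = reliability_single(60.0, 5.0, 5.0, 30.0)
rho_2 = reliability_of_mean(rho_1, 2)
rho_3 = reliability_of_mean(rho_1, 3)
```

When ρ is already close to 1, the formula yields only small gains from additional repetitions, since ρm can never exceed 1.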

Statistical analyses were performed using SAS, version 9.4 (SAS Institute, Cary, NC).

Results

Summary Statistics

Anthropometric characteristics of the children at birth, 3 months, and 18 months of age are reported in Table 1. AGD showed a sex-dimorphic pattern with clearly different means and only little overlap of the ranges between sexes (Table 1 and Figure 1). The average male to female ratio was almost equal—1.8 for AGDas vs. AGDaf and 1.9 for AGDap vs. AGDac at 3 months of age and 1.9 for both AGD ratios at 18 months of age.

Table 1 Children characteristics at birth and at the three and eighteen months’ examination (mean±SD or %)
Figure 1

Distribution of AGD in boys and girls at the three and eighteen months’ examination. (a) AGDas and AGDaf and (b) AGDap and AGDac. AGDac, anogenital distance from anus to clitoris; AGDaf, anogenital distance from anus to posterior fourchette; AGDap, anogenital distance from anus to penis; AGDas, anogenital distance from anus to scrotum.

Age-Related Changes in AGD

AGD and PW in boys increased significantly from 3 to 18 months of age (Table 1 and Supplementary Figure S1 online). However, for some children a minor decrease was observed. Among the boys, a slight decrease in AGDas, AGDap, and PW was observed from age 3 to 18 months in 19%, 15%, and 42%, respectively. Among the girls, a decrease was observed in 37% for AGDaf and 26% for AGDac (Supplementary Figure S1). Similar findings were observed throughout the study period and regardless of whether the child had been examined by the same or by two different examiners (data not shown). The AGD z-score for each child was significantly correlated between the two examinations (P<0.01). The paired intra-class correlation coefficients for AGDas, AGDap, and PW were 0.63, 0.35, and 0.43, respectively. Correlation coefficients were lower for female AGD: 0.26 for AGDaf and 0.19 for AGDac (Figure 2). The subanalyses on z-scores calculated based on monthly intervals for the three months’ examination showed similar results (data not shown).

Figure 2

AGD and penile width z-scores at 3 and 18 months of age and paired intra-class correlation coefficients (R) with corresponding P values. AGD, anogenital distance; AGDac, anogenital distance from anus to clitoris; AGDaf, anogenital distance from anus to posterior fourchette; AGDap, anogenital distance from anus to penis; AGDas, anogenital distance from anus to scrotum.

Sources of Variation

For all genital measures, major sources of variance were the true variation between the children and the unaccounted variation, whereas variation between and within examiners contributed less (Table 2). A higher percentage of variance in AGD was due to between-children differences in the boys; the proportion of between-children variance was 62% for AGDas and 40% for AGDap in boys, whereas in girls the between-children variance was 30% for AGDaf and 21% for AGDac (Table 2). Similar results for the variance components were found when including weight or length of the children, whereas the effect of age became insignificant, and the overall variability due to the factors included as random-effects decreased slightly, more so when including weight than length. For all AGD measures, differences within examiners explained only between 4 and 7% of the total variation, and differences between examiners between 0.3 and 7%. The between-examiner variance was not statistically significant (P>0.05) in most models. For all genital measures, the reliability coefficient increased only slightly when the average of two or three repeated measurements was used, i.e., only minor gains in precision were achieved by repeating the measurement (Table 2).

Table 2 Variance components (% of variance due to between-children, between-examiners, within-examiner, and unaccounted variance) and reliability coefficients

Discussion

Among 689 children with repeated AGD measurements, a clear sex-dimorphic pattern in AGD was found. A large proportion of the observed variation in AGD was because of true differences between the children, and AGD correlated between 3 and 18 months of age, especially in boys and particularly for AGDas. Measurement error due to between- and within-examiner variation was low. This shows that AGD can be assessed reliably with few repetitions of measurements and by different examiners after training. The results support that AGD is a phenotypic signature during infancy.

To our knowledge, only two studies have performed longitudinal AGD measurements. In a British study only modest correlations between AGD at birth and subsequent measurements up to 2 years of age were reported. In boys, the correlation coefficients for AGDas were 0.30 at 3 months (N=204), 0.24 at 12 months (N=131), 0.15 at 18 months (insignificant, N=82), and 0.26 at 24 months (N=52). In girls, a significant correlation of 0.26 was found for AGDaf between birth and 3 months (N=191) of age, but not in subsequent measurements (29). In a Greek study, measurements at birth and during the second year of life were compared in 61 boys and 51 girls. The highest correlation of 0.63 was observed for AGDas in boys compared with 0.19 for AGDap (insignificant), which is in accordance with our findings. In girls, the correlation coefficient of 0.53 for AGDaf and 0.32 for AGDac was higher than in our study (30).

The physical landmarks used for AGD measurements in boys, especially the distance from anus to scrotum, are more distinct than in girls, which decreases the variability in measurements in boys (1). In addition, the distance in girls is shorter and, thus, the same absolute measurement error is of relatively larger importance. This may explain the better reproducibility of AGD in boys compared with girls, and the higher correlation for AGDas compared with AGDap, in the present cohort. However, another possibility is that the different AGD measures, as well as AGD in boys vs. girls, reflect different developmental mechanisms, which could have an impact on how well they correlate over time.

The unaccounted variation adds considerable imprecision to the AGD measures and could not be explained by differences in general growth patterns (weight or length). Although examination techniques are standardized, we speculate that the exact position of the child still plays a role as may room temperature, mood of the child etc. Most studies, including ours, have used the average of three AGD measurements, which were conducted by marking the mid-anus once and measuring the AGD repeatedly without repositioning the child. The between-examiner variation in this study was also calculated based on measurements conducted with the child in the same position. Thus, if the position of the child has an influence, the within- and between-examiner variations are likely underestimated and the reliability coefficients are overestimated. Measuring AGD in the same child at two different occasions, or at least after re-positioning, and using the average of these measurements may thus reduce the true measurement error. Studies thoroughly investigating the impact of child position are needed.

It has been suggested that a shortened male AGD may be a manifestation of the testicular dysgenesis syndrome, which comprises cryptorchidism, hypospadias, testicular cancer, decreased testosterone production, and impaired spermatogenesis, alone or in combination. These conditions can, similar to de-masculinization of the male AGD, occur because of disruptions in utero compromising the development and function of the male reproductive system (33, 34). In women, a corresponding ovarian dysgenesis syndrome has been suggested (35). Our study supports the hypothesis that AGD may be a reliable marker of androgen action in utero, but we cannot rule out that it may also be affected by postnatal factors. Rodent studies have indicated plasticity in AGD and demonstrated the need for adequate androgen levels not only prenatally but also postnatally in order to achieve or retain the AGD programmed in utero (36, 37, 38). However, according to Kita et al., AGD in rats is clearly most sensitive to exposures impairing androgen action in utero, whereas responses are much smaller if exposures occur during puberty (39).

The strength of the present study is the longitudinal set-up including 689 children examined repeatedly under similar conditions. The decrease in AGD between 3 and 18 months, seen in between 15 and 36% of the children, has not previously been reported. This may be explained by measurement uncertainty and unaccounted variance, and, accordingly, decreases were more frequently observed for the girls, where the unaccounted variance was larger. In the British study, the largest increase in AGD was observed between birth and 3 months of age and AGD was subsequently stable during the second year of life (29). The British and Greek studies conducted the first AGD measurement at birth, and thus included the time period with the steepest AGD increase. We measured AGD at 3 months of age, which is the time of the “mini-puberty”, where testosterone and other reproductive hormones increase during a short period (40). In boys, this period is associated with penile and testicular growth (41, 42), although effects on AGD have not been studied. Thus, the increased androgen levels, peaking between 1 and 3 months of age, may temporarily stimulate growth of AGD, and thereby explain the decline in AGD measurements from 3 to 18 months of age in some children (40). It would therefore have been preferable to have measured AGD at birth to assess temporary or lasting effects of mini-puberty and other postnatal influences on AGD.

In conclusion, AGD is an easily obtained measure in newborns and young children representing a phenotypic signature with good reproducibility and correlating between age 3 and 18 months, especially in boys. This supports the hypothesis that AGD is determined in utero and could be a readout of prenatal androgen action during infancy. The reproducibility of AGD measurements within examiners was high and increased only slightly with more than one measurement. However, the children in our study were measured without repositioning between measurements, which may have inflated the apparent reproducibility. It still needs to be determined whether AGD represents a phenotypic signature throughout childhood, puberty, and into adulthood, and we are therefore planning to follow our cohort until the age of 18 years.