Introduction

Developmental stability is defined as the ability of an organism to buffer its development against genetic or environmental disturbances encountered during development to produce the genetically predetermined phenotype (Waddington, 1942) and, as such, it is a fundamental characteristic of development. Developmental stability is influenced by both genotype and environment, as evidenced by different genotypes displaying different levels of stability under identical environments and identical genotypes displaying different stability under varying environments (Zakharov, 1989). The most commonly used estimate of developmental stability has been fluctuating asymmetry. The underlying assumption of fluctuating asymmetry analysis is that the development of the two sides of a bilaterally symmetrical organism is influenced by identical genes and, therefore, nondirectional differences between the sides must be environmental in origin and reflect accidents occurring during development (Waddington, 1942). Because developmental stability acts to suppress such accidents, fluctuating asymmetry will reflect the efficiency of developmental stability mechanisms (Van Valen, 1962; Palmer & Strobeck, 1986). The efficiency of developmental stability mechanisms is thought to be reduced during exposure to environmental or genetic stress, as evidenced by stressed populations displaying higher levels of asymmetry than nonstressed populations.

The genetic basis of developmental stability is poorly understood, and a genetic model of stability has not been developed. Both statistical (Palmer & Strobeck, 1992; Palmer et al., 1993) and mechanistic (Emlen et al., 1993; Graham et al., 1993a) models of stability have been developed, but these lack reference to genetic processes. The current hypotheses suggest that stability is controlled by genome-wide genetic characteristics, such as the level of genomic heterozygosity or genomic co-adaptation (Clarke, 1993a). Specifically, these hypotheses argue that heterozygous or well-co-adapted genotypes are developmentally more stable than homozygous or poorly adapted genotypes. The supporting evidence for both of these alternatives is, for the most part, unconvincing, particularly for the heterozygosity theory, and there are numerous examples that fail to show any relationship between stability, heterozygosity or gene balance (Clarke, 1993a). Certainly there is insufficient evidence to attempt any generalization of a genetic mechanism of stability.

In addition, these hypotheses fail to explain some of the commonly observed patterns of stability within populations and individuals. At the population level, if one population is more asymmetrical (less stable) than another population for one character, there is a tendency that it will also be more asymmetrical for other characters (Soulé, 1967; Soulé & Baker, 1968; Felley, 1980; Kat, 1982; Jagoe & Haines, 1985). This relationship led Soulé (1967) to propose a ‘population asymmetry parameter’. Such differences in stability between populations have led to the widespread use of developmental stability analysis as a technique for identifying and characterizing populations subject to systemic stress (Leary & Allendorf, 1989; Parsons, 1990, 1992; Zakharov, 1990; Clarke, 1992, 1993b, 1994; Graham et al., 1993b). At the individual level, however, there is no equivalent ‘individual asymmetry parameter’. It has been generally observed that, if one individual is more asymmetrical than another individual for one character, there is no tendency for that individual to be more asymmetrical for other characters, even those correlated phenotypically (Hubbs & Hubbs, 1945; Van Valen, 1962; Sakai & Shimamoto, 1965; Ames et al., 1979; Clarke et al., 1992). However, Leamy (1993) did find evidence for an individual asymmetry parameter among nine morphometric characters in mice. If developmental stability is controlled by genome-wide genetic characteristics such as those proposed, an individual asymmetry parameter might be predicted to exist, as stability of all characters can be expected to be under the same control.

One possible explanation for the lack of an individual asymmetry parameter is that there are only certain ‘windows of opportunity’ during the development of a character when environmental perturbation may result in the production of an aberrant phenotype. If such windows of opportunity are different for different characters, and environmental perturbation or exposure to stress for a given individual is haphazard, then different characters might not be expected to show a correlated response or pattern of stability. When all individual responses for all characters are averaged over a population, then a population response or population asymmetry parameter can be envisaged. However, in nearly all experiments in which attempts have been made to identify an individual asymmetry parameter, stress exposure has been constant throughout development of the individual, such that, even if characters did possess different windows of opportunity, a correlated response among characters should have been manifest.

Although this explanation is consistent with hypotheses of genome-wide genetic control of stability, it also introduces the concept that developmental stability may be character specific. It has been commonly observed that certain characters are developmentally more stable than others within populations, showing very little response after exposure to stress, whereas other characters are highly variable even under ‘optimal’ conditions (Clarke, 1995).

As with many of the other perceived ‘tenets’ of developmental stability, such as the positive relationship between stability and heterozygosity (Clarke, 1993a) and decreased stability of extreme phenotypes (Clarke, 1995), the presence or absence of individual and population asymmetry parameters has not been subjected to critical evaluation. The number of studies specifically testing for the presence of individual or population asymmetry parameters is very limited. The current paper seeks to redress this situation by examining a number of existing data sets for the presence of these asymmetry parameters by addressing the following four questions.

1 Are characters consistently different from each other among individuals (i.e. within samples)? That is, is character X consistently more (or less) asymmetrical than character Y across all individuals within a sample?
2 Are individuals within a sample consistently different from each other across all characters? That is, is individual X consistently more (or less) asymmetrical than individual Y for all characters? An affirmative answer would indicate the presence of a significant individual asymmetry parameter (IAP).
3 Are characters consistently different from each other among samples? That is, is character X consistently more (or less) asymmetrical than character Y across all samples?
4 Are samples consistently different from each other across all characters? That is, is sample X consistently more (or less) asymmetrical than sample Y for all characters? An affirmative answer would indicate a significant population asymmetry parameter (PAP).

Materials and methods

Data sets

All data used come from existing data sets. Fifty samples from 11 invertebrate species are used: Apis cerana, A. mellifera, A. m. capensis, A. m. scutellata (Hymenoptera: Apidae), Chrysopa perla (Neuroptera: Chrysopidae), Heptacarpus brevirostris (Crustacea: Decapoda), Lucilia cuprina (Diptera: Calliphoridae), Solenopsis invicta (Hymenoptera: Formicidae), Tisbe holothuriae (Copepoda: Harpacticoida), Trichocolletes affvenutus (Hymenoptera: Colletidae) and Vespula germanica (Hymenoptera: Vespidae). Details of the colonies, experimental protocols used and characters examined for the generation of these data sets can be found elsewhere (Clarke et al., 1986; Clarke & Mackenzie, 1992; Clarke, 1993c; Clarke & Oldroyd, 1996; Clarke, 1997). All characters examined display true fluctuating asymmetry, with no evidence of directional asymmetry, antisymmetry or size dependence. In addition, measurement error was not significant in any sample (as tested by methods outlined by Palmer, 1994).

Statistical analysis

Nonparametric methods were used for all analyses, based on the estimation of Kendall's coefficient of concordance, W (Siegel, 1956). As this measure is based on ranks, the four questions may be restated as follows.

1 Is the rank order of characters consistent across individuals within a sample?
2 Is the rank order of individuals within a sample consistent across all characters?
3 Is the rank order of characters consistent across samples?
4 Is the rank order of samples consistent across all characters?

Procedures for each analysis are given below.

1 In order to compare different types of characters (morphometric and meristic) measured on different scales, it is first necessary to express the asymmetry values for each individual for each character on a common scale. After consideration of many options it was decided to use |log(Li/Ri)|. This measure provides an accurate measurement of fluctuating asymmetry and does not depend on the units of measurement (i.e. it is dimensionless) (A. R. Palmer, personal communication). For each individual within each sample, asymmetry values for each character were expressed as |log(Li/Ri)|. For each individual, each character was ranked (lowest to highest) based on its asymmetry score. Ranks for each character were summed across all individuals within a sample and used to estimate W.
2 For each character, each individual was ranked (lowest to highest) based on its absolute asymmetry score (i.e. |Li−Ri|). It was not necessary to use transformed values, as comparisons are among individuals and not among characters. Ranks for each individual were summed across all characters and used to estimate W.
3 The rank sums for each character within a sample (obtained from procedure 1 above) were compared across samples. For each sample, each character was ranked (lowest to highest) based on its within-sample rank sum. For each character, these new ranks were summed across samples and used to estimate W.
4 For each character, each sample was ranked (lowest to highest) based on its mean absolute asymmetry value (i.e. Σ|Li−Ri|/N). Ranks for each sample were summed across all characters and used to estimate W.
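The ranking steps of the first procedure can be sketched in code. The Python fragment below is a minimal illustration only and is not taken from the original analyses: the function names and the small data set (four individuals, three characters) are hypothetical, tied values are assumed to receive the mean of the ranks they span, and W is computed here without any correction for ties.

```python
import math


def average_ranks(values):
    """Rank values from lowest to highest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def log_ratio_asymmetry(left, right):
    """Dimensionless asymmetry |log(L/R)| for one character on one individual."""
    return abs(math.log(left / right))


def kendall_w(rankings):
    """Uncorrected Kendall's W for m rankings (rows) of n items (columns)."""
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(row[j] for row in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical (left, right) values: rows are individuals, columns are
# three characters (two lengths and one count).
individuals = [
    [(9.00, 9.02), (4.0, 4.1), (20, 22)],
    [(9.10, 9.05), (4.2, 4.0), (21, 19)],
    [(8.95, 8.99), (4.1, 4.3), (22, 20)],
    [(9.05, 9.00), (4.0, 4.4), (21, 20)],
]

# Procedure 1: within each individual, rank the characters by |log(L/R)|.
rankings = [
    average_ranks([log_ratio_asymmetry(l, r) for l, r in row])
    for row in individuals
]
rank_sums = [sum(row[j] for row in rankings) for j in range(3)]
w_characters = kendall_w(rankings)
```

The second procedure follows the same pattern with the roles exchanged: each character supplies one ranking of the individuals, based on |Li−Ri|, so that the rows passed to `kendall_w` are characters and the columns are individuals.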

This coefficient is particularly sensitive to the presence of tied observations. The effect of tied ranks is to depress the value of W. If the proportion of ties is large, a correction factor should be used as detailed in Siegel (1956). The majority of popular biometrics texts, including the widely used Sokal & Rohlf (1981), do not mention this correction factor, and most statistical software packages do not incorporate it into their algorithms for estimating W. In the search for a significant IAP, the number of tied observations is likely to be large. This is because, within each character being examined, there are likely to be few different asymmetry values, particularly for meristic characters, and given the large numbers of individuals being ranked, significant proportions of individuals will share the same rank. As the presence of ties depresses the value of W (driving it towards nonsignificance), failure to correct for ties would almost guarantee a failure to detect a significant IAP. Within the current IAP analyses, correcting for ties was found to increase the value of W by an average of 18 per cent with a range of 5–65 per cent. As the significance of W is tested by approximation against the χ²-distribution, when sample sizes (and thus degrees of freedom) are large, as in the case of tests for a significant IAP, a 20 per cent increase in W can result in as much as a fivefold decrease in the associated probability value.
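The effect of the correction can be illustrated with a small sketch. The Python fragment below is hypothetical throughout (function names, data and layout are not from the original analyses). It assumes the usual tie-corrected form of W, in which m Σ(t³ − t) is subtracted from the denominator m²(n³ − n), with t the size of each group of tied ranks; this is taken here to be equivalent to the correction detailed in Siegel (1956). The test statistic is assumed to be χ² = m(n − 1)W with n − 1 degrees of freedom.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from lowest to highest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendall_w(rankings, correct_ties=True):
    """Kendall's W for m rankings (rows) of n items, optionally tie-corrected."""
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(row[j] for row in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    tie_term = 0
    if correct_ties:
        for row in rankings:
            # Each group of t tied ranks within a ranking contributes t^3 - t.
            tie_term += sum(t ** 3 - t for t in Counter(row).values())
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * tie_term)


def chi_square(w, m, n):
    """Chi-square approximation for W: statistic and degrees of freedom."""
    return m * (n - 1) * w, n - 1


# Hypothetical |L - R| counts: rows are three meristic characters,
# columns are six individuals; every character contains tied values.
abs_asymmetry = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 2, 1, 2],
    [0, 1, 0, 1, 2, 2],
]
rankings = [average_ranks(row) for row in abs_asymmetry]
m, n = len(rankings), len(rankings[0])

w_raw = kendall_w(rankings, correct_ties=False)
w_corrected = kendall_w(rankings, correct_ties=True)
chi2_raw, df = chi_square(w_raw, m, n)
chi2_corrected, _ = chi_square(w_corrected, m, n)
```

In this toy case W rises from about 0.711 to about 0.778, and the χ² statistic from about 10.67 to about 11.67 on 5 d.f., values that fall either side of the tabulated 5% critical value of 11.07. With so few individuals the χ² approximation is rough, so the example shows only the direction of the effect.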

It is worth noting that Palmer (1994) has suggested alternative statistical procedures for testing for the presence of individual and population asymmetry parameters. These tests are based on conducting a series of two-way analyses of variance (for details, see Palmer, 1994). However, as these analyses are based on actual measures rather than ranks, a given term will be significant only if the magnitude of the asymmetry values differs significantly between the groups (either characters, individuals or samples) being compared. Thus, consistent and significant trends in the direction of differences between any two samples may be missed if the actual magnitude of the asymmetry values does not differ significantly between the samples. Caution is therefore needed in using this approach.
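The distinction between direction and magnitude can be made concrete with a toy comparison. In the Python sketch below, which uses hypothetical names and invented values, two samples are ranked for each of five characters. Sample Y is more asymmetrical than sample X for every character, first by 50 per cent and then by 1 per cent. Because W depends only on ranks, it is the same in both cases, whereas a test based on the actual measures would be expected to respond to the size of the difference.

```python
def rank_low_to_high(values):
    """Ranks 1..n for values without ties (sufficient for this toy case)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def kendall_w(rankings):
    """Uncorrected Kendall's W for m rankings (rows) of n items (columns)."""
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(row[j] for row in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def sample_concordance(sample_x, sample_y):
    """Rank the two samples within each character, then estimate W."""
    rankings = [rank_low_to_high([x, y]) for x, y in zip(sample_x, sample_y)]
    return kendall_w(rankings)


# Hypothetical mean absolute asymmetry values for five characters.
sample_x = [0.10, 0.20, 0.15, 0.30, 0.25]
sample_y_large_gap = [v * 1.50 for v in sample_x]  # 50 per cent higher
sample_y_small_gap = [v * 1.01 for v in sample_x]  # 1 per cent higher

w_large = sample_concordance(sample_x, sample_y_large_gap)
w_small = sample_concordance(sample_x, sample_y_small_gap)
```

Both comparisons give W = 1, because the rank order of the two samples is identical for every character regardless of how far apart the values are.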

Results

Characters

In general, asymmetry values for different characters are consistently different both within individuals (Table 1) and within samples (Table 2). That is, after measurement of all characters within a single individual or sample, it is possible to predict the relative order of characters (i.e. which characters display the greater or lesser levels of asymmetry) in additional individuals or samples. In other words, if character X is more (or less) asymmetrical than character Y in one individual or sample, there is a significant tendency for it to be more (or less) asymmetrical than character Y in all other individuals or samples. Fewer than 20% of samples fail to show this pattern; for these samples, results indicate no differences in the level of developmental stability among characters.

Table 1 Summary of results of significance tests of Kendall's coefficient of concordance on a variety of species
Table 2 Summary of results of significance tests of Kendall's coefficient of concordance on a variety of species

Individual asymmetry parameter

There was no indication of a significant individual asymmetry parameter in 48 of the 50 samples tested. That is, within a sample it is not possible to predict the relative order of individuals (i.e. which individuals display the greater or lesser levels of asymmetry) for any additional characters based on the relative order estimated from any single character. Therefore, if individual X is more (or less) asymmetrical than individual Y for one character, there is no significant tendency for it to be more (or less) asymmetrical than individual Y for any other character(s).

A significant IAP was found for two samples, males and females from a single colony of Apis cerana. Pooling males and females within this colony results in an estimate of W=0.2673, P=0.009, again a significant IAP. It should be noted that these significant results are only revealed using the corrected Kendall's coefficient. Without the correction factor, none of the three W values is significant.

Population asymmetry parameter

In four of 11 species, results indicate the existence of a significant PAP (Table 2). For some of these species, subgroupings existed among the samples and, thus, additional ‘among samples’ comparisons were possible. Of these, only five of 22 revealed a significant PAP (Table 3). Thus, the majority of cases failed to display a significant PAP. That is, it is not possible to predict the relative order of samples (i.e. which samples display the greater or lesser levels of asymmetry) for any additional characters based on the relative order estimated from any single character. Therefore, if sample X is more (or less) asymmetrical than sample Y for one character, there is no significant tendency for it to be more (or less) asymmetrical than sample Y for any other character(s).

Table 3 Summary of results of significance tests of Kendall's coefficient of concordance on a variety of species

Examination of the original data of the cases for which a significant PAP was observed (Table 2 and Table 3) revealed that almost all of these involved comparisons between the sexes, in which males (which are haploid in all these cases) were consistently ranked higher on their asymmetry values than females (diploids). In the case of Tisbe holothuriae, the inbred sample ranked consistently higher than the outbred sample. It must be remembered that these results are based on consistent differences in ranks between samples and not on the magnitude of the asymmetry differences. The observation that males consistently rank higher than females with respect to asymmetry values does not imply that males are significantly more asymmetrical than females across characters.

Discussion

The consistent differences between characters observed both within and among samples indicate that some characters are developmentally more stable than others, i.e. stability is character dependent. It has been suggested that the degree of developmental stability of a given character depends on the relationship of the character to the fitness of the organism (Palmer & Strobeck, 1986; Leamy, 1993; Clarke, 1995; Gummer & Brigham, 1995). Under such an hypothesis, characters for which phenotypic constancy is important for the efficient functioning of the organism would be expected to display greater developmental stability, and thus reduced fluctuating asymmetry, than characters for which constancy is less important. Thus, the relative rankings of a series of characters may reflect the relative functional significance of the characters to the organism (Palmer & Strobeck, 1986). This hypothesis is yet to be rigorously tested.

Results herein for the four species of Apis are consistent, in that forewing length consistently displayed the lowest level of asymmetry both within and among samples and the number of hamuli invariably displayed the highest levels of asymmetry. This pattern was repeated in both T. affvenutus and V. germanica. The hamuli function to link the fore- and hindwings together in flight, and it could be argued that the absolute number of hamuli and the relative number on each side are unimportant for efficient functioning. It should be noted that the number of hamuli varies considerably among individuals compared with wing lengths (coefficients of variation are typically four to five times higher for hamuli). In T. holothuriae, tibia length was consistently the least asymmetrical character compared with a number of setal count characters within an outbred population, yet there were no differences between the characters in inbred individuals from the same population. Thus, the detrimental effect of inbreeding on character development in this species (Clarke et al., 1986) has not been consistent across characters. Although tibia length still ranks the lowest within the inbred individuals, the second lowest ranked character in the outbred sample now ranks highest in the inbreds. In L. cuprina, three bristle characters (counts) were scored. Interestingly, the two characters that consistently displayed the lowest and highest levels of asymmetry are both wing bristle characters. In both C. perla and H. brevirostris, only meristic characters were scored, and there were no differences among them. Although these apparent patterns of association between degree of asymmetry and character function may be coincidental, similar associations have been revealed previously within some of these species (Clarke, 1995).

The general failure to detect an individual asymmetry parameter is consistent with the majority of previous studies. If developmental noise is random in its impact on individual characters, regardless of whether the genetic basis of stability is genome wide or character dependent, one should not expect a concordant response to noise across all characters within an individual. If, however, the impact of developmental noise is nonrandom across characters, a significant concordance would be expected. The lack of concordance among characters within individuals would therefore argue that developmental noise is random in its effect on individual characters. The significant and consistent differences between characters strongly suggest that the genetic control of stability is character dependent. However, significant differences among individuals may be expected if the genetic differences or exposure to environmental disturbance between individuals are large enough.

Whitlock (1996) has suggested an alternative explanation for the lack of correlation among characters within individuals. His arguments are based on the problems associated with attempting to assess fluctuating asymmetry in an individual as opposed to a sample, a problem also highlighted by Palmer (1994). As asymmetry is a trait with very low repeatability within an individual (as opposed to a sample), he argues that it is a very poor estimator of underlying developmental stability and thus will tend to underestimate any true differences in stability between characters. Such statistical problems add to the difficulty in trying to ‘measure’ an inherently noisy and heterogeneous system.

The results for the PAP analyses appear somewhat equivocal. Soulé (1967) defines the PAP as a ‘property (of a population) that can be estimated by a random sample of uncorrelated character asymmetries’. That is, within a population, estimates of asymmetry values for different characters effectively estimate the same underlying mechanisms of stability within the population. For this to be true, any single population should yield a consistent pattern of differences when compared with any other population across all the characters examined. This is clearly not the case in the species and populations examined here. For example, in Apis mellifera, although males and females differed within one colony, there were no differences between these males or females when compared with males and females from another colony. The general inconsistency of the pattern of significant PAP results certainly prevents any generality associated with the existence of such a population property. In fact, the only comparisons that revealed a significant PAP are those for which previous analyses had revealed significant differences in the magnitude of asymmetry between samples for each character independently.

The extent to which populations display significant heterogeneity among population-based asymmetry estimates (PAPs) will depend on the level of genetic differentiation among them for the genetic factors controlling development and the magnitude of the environmental variation experienced during development, i.e. a classic genotype-by-environment interaction. This can be seen by examination of the PAPs in Soulé's (1967) and Soulé & Baker's (1968) original studies. Undertaking a number of pairwise comparisons of populations in either of these studies yields a wide variety of both significant and nonsignificant results, depending on which populations are being compared. The same heterogeneity in results is revealed by excluding certain characters from the analysis.

All of the results discussed herein indicate that developmental stability is character, taxon and population specific. Palmer & Strobeck (1986) state in their extensive review of fluctuating asymmetry that ‘A notable feature of this summary is the widespread lack of consistency among studies. Although some patterns are moderately consistent, exceptions are present for nearly all.’ If developmental stability is specific to the character and population under study and developmental noise acts randomly throughout development, then such heterogeneity among studies is to be expected and any quest for a unifying theory of stability is probably foolhardy. This does not imply that the use of fluctuating asymmetry is damned as an indicator of environmental or genetic stress within populations, but only that we should take particular care in the choice of characters and populations we study.