Introduction

Genetically controlled self-incompatibility (SI) systems are widespread among angiosperm species and have evolved repeatedly, presumably as a mechanism to avoid self-fertilization and enforce outcrossing. As plants are sessile, often cosexual (ie, possessing both male and female gender function), and typically produce multiple flowers within individuals, SI systems have the potential to govern both individual and population patterns of mating. In addition, SI systems can affect the evolution of other traits associated with the mating system (eg, inbreeding depression: Jain, 1976; Charlesworth and Charlesworth, 1987; gender expression: Charlesworth, 1985; Miller and Venable, 2000).

Among homomorphic SI systems, gametophytic self-incompatibility (GSI) is widespread among angiosperm lineages (Kao and McCubbin, 1996). Two genes control the incompatibility reaction in Solanaceae, one expressed in the pistil and another in the pollen. Molecular studies have shown that the pistil S-gene product is a glycoprotein (Kheyr-Pour and Pernes, 1985; Anderson et al, 1986) with ribonuclease (RNase) activity (McClure et al, 1990). More recently, the pollen determinant (PiSLF) was confirmed in Petunia inflata and, like the pistil S-RNase, PiSLF shows high levels of amino acid (AA) diversity for the alleles examined to date (Sijacic et al, 2004). When the haploid S-genotype of the pollen grain matches that of either of the two S-RNases expressed in the pistil of the maternal plant, pollen tube growth is terminated. At the level of individuals, such genetic control enforces heterozygosity. At the level of populations, strong negative frequency dependence shelters alleles from extinction and creates strong selection for novel alleles; both these factors result in the maintenance of large numbers of S-alleles within populations (Wright, 1939).

GSI is well studied in Solanaceae and has been characterized for natural populations of species from several genera (eg, Lycium, Richman, 2000; Petunia, Wang et al, 2001; Physalis, Richman et al, 1996, Richman and Kohn, 1999; Solanum, Richman et al, 1995; Witheringia, Richman and Kohn, 2000, Stone and Pierce, 2005). Data from these species generally meet expectations; individuals are heterozygous and populations are highly polymorphic at the S-RNase locus (see reviews by Richman and Kohn, 1996; Lawrence, 2000; Castric and Vekemans, 2004). In addition, S-RNases are often highly divergent from one another (Richman and Kohn, 1996), and genealogies indicate that S-RNases are ancient, often grouping more closely with alleles from other species and genera (Ioerger et al, 1990; Richman and Kohn, 2000). This latter phenomenon is known as transpecific (TS) or transgeneric (TG) evolution, where polymorphisms are passed down from a common ancestor to multiple descendant species and even genera (Ioerger et al, 1990).

TG evolution is extensive among S-RNases sampled across several genera in Solanaceae (Ioerger et al, 1990), and patterns of TG evolution have been used to infer historical events in the evolutionary history of various Solanaceae. Whereas alleles from Lycium andersonii and Solanum carolinense showed extensive TG evolution (12 and eight TG lineages, respectively), alleles from two species of Physalis and Witheringia maculata were confined to three TG lineages (Richman and Kohn, 2000). The authors suggest that a genetic bottleneck in the common ancestor of Physalis and Witheringia resulted in the loss of entire S-RNase lineages. Such a bottleneck is consistent with the observed pattern of both the shared identity and reduced number of TG lineages in Physalis and Witheringia, as well as current estimates of species phylogenies in Solanaceae (Olmstead et al, 1999).

In addition to variation in the number of TS or TG lineages across species and genera, S-RNase allele number also varies. Despite equal numbers of TS lineages between two species of Witheringia, estimates of S-RNase diversity were over twice as high for W. solanacea compared to W. maculata (Stone and Pierce, 2005). Similarly, Richman and Kohn's (1999) comparison of Physalis cinerascens and Physalis crassifolia showed variation in allele number but not allelic age, as measured by the number of TG lineages. Richman and Kohn (2000) point out that variation in TG lineages is expected to evolve more slowly (compared to allele number) since every member-allele must be lost before the extinction of a TG lineage. In contrast, the number of alleles in a population may be more sensitive to recent changes in species ecology. These authors suggest that a major determinant of allele number in populations is variation among species in life-history characteristics that affect both population size and longevity. Several studies have considered the relationship between S-RNase diversity and life-history characteristics (Richman and Kohn, 1999, 2000; Lu, 2001; Stone and Pierce, 2005); however, no clear association has emerged in Solanaceae. Further sampling of species closely related to those previously studied can strengthen overall patterns of TG evolution in Solanaceae and highlight differences between congeners.

Patterns of molecular evolution (ie, Dn/Ds, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site) are often used to measure positive selection at the S-RNase gene in species with GSI. Substitution rates can be calculated over the entire region; however, given the conserved and hypervariable regions described by Ioerger et al (1991), substitution patterns may vary between regions. Structural analysis of an S-RNase from Nicotiana alata (Ida et al, 2001) indicates that the hypervariable regions are located on the surface of the protein, thus corroborating previous suggestions that these regions are involved in the recognition mechanism (Ioerger et al, 1991). Genealogical relationships among alleles can also complicate patterns of molecular evolution. Specifically, distantly related alleles are expected to accumulate neutral mutations over time, and these substitutions may obscure patterns of positive selection (Richman et al, 1996). To take this into account, Richman et al (1996) characterized Dn/Ds ratios for closely related alleles in Physalis crassifolia and S. carolinense; in both species, Dn/Ds was greater than 1.0 when comparing alleles that were relatively similar, but <1.0 for comparisons of more divergent alleles (see also Lu, 2002).

TG and TS lineages are useful for examining deep genealogical relationships among alleles (eg, ascertaining bottleneck events, Richman and Kohn, 1999, 2000), but may confound more recent evolutionary patterns such as the timing of diversification of alleles. The Dn/Ds ratio can also be used to compare pre- versus post-speciation diversification of S-alleles for closely related species. If S-RNase diversification largely precedes speciation, the Dn/Ds ratio should be similar for allelic comparisons within species and between closely related species. In contrast, if diversification is predominantly post-speciation, Dn/Ds is expected to be higher in intraspecific comparisons (compared to interspecific allele pairs), since replacement substitutions among alleles within a single species would be contributing to allelic diversity. More extensive sampling from closely related species within genera is needed to detect species-specific patterns of S-RNase diversification.

Miller and Venable (2002) demonstrated that fruit production, seed set, and pollen tube number in a natural population of hermaphroditic L. parishii were significantly higher following outcross pollination compared to self-pollination. Based on these crossing studies, these authors suggested that L. parishii was self-incompatible. Miller and Venable (2000, 2002) also argued that, for some species of Lycium, GSI had been rendered nonfunctional as a result of polyploidy, eventually leading to the evolution of gender dimorphism (ie, separate females and hermaphrodites). An important assumption of this hypothesis is that GSI is ancestral in Lycium; thus, confirming GSI in the hermaphroditic close relatives (eg, L. parishii; Levin and Miller, 2005) of the dimorphic species is an important first step in evaluating this hypothesis.

Here, we analyze partial S-RNase sequences from L. parishii and investigate SI in this species by genotyping 25 individuals from a population in southern Arizona. Specifically, we: (1) investigate the molecular signature of SI (ie, heterozygosity of individuals and allelic diversity within populations), (2) explore molecular evolution of the S-RNase gene using Dn/Ds ratios calculated for different regions of the gene and for site-specific models of substitution patterns, (3) compare S-RNase diversity and patterns of allele diversification in L. parishii to previous work on its congener L. andersonii (Richman, 2000), and (4) explore the genealogy (including the number of TS and TG lineages) of Solanaceae S-RNases using species pairs for five genera in the family.

Methods

Study species and population sampling

Lycium L. (Solanaceae) is a genus of approximately 80 species distributed worldwide, but particularly concentrated in South America, southwestern North America and southern Africa (Hitchcock, 1932; Chiang-Cabrera, 1981; Bernardello, 1986; Venter, 2000; Miller, 2002; Levin and Miller, 2005). Lycium species are long-lived perennial shrubs and many inhabit arid to semiarid environments, though some are found in coastal saline habitats. The majority of species are hermaphroditic, have perfect flowers, and produce red, fleshy berries. The study population of L. parishii was from two adjacent sampling sites in Organ Pipe National Monument (OPNM) located along the US-Mexican border along South Puerto Blanco Drive (31°54′00″ N, 112°51′00″ W, Pima County, AZ, USA). L. parishii is common throughout OPNM, and plants in this population were fairly evenly spaced, although plants were larger and found at higher densities along washes. Ten to twenty styles from mature floral buds were dissected from flowers on each plant and frozen immediately on dry ice. Styles were collected from nine plants in March 2001 and an additional 16 plants in March 2004. Individual plants selected for sampling were a minimum of 5 m apart, although it is possible that what appeared to be a single individual may have been more than one genetic individual, especially where plants were large.

RT-PCR, cloning, RFLP analysis, and S-RNase sequencing

Three to 12 styles from each individual were ground in liquid nitrogen and stylar mRNA was obtained using Qiagen's RNeasy Plant Mini Kit (Valencia, CA, USA). We synthesized cDNA using the Novagen First Strand cDNA synthesis kit (Darmstadt, Germany) and amplified a portion of the S-RNase gene between conserved regions C2 and C5 (Ioerger et al, 1991) using degenerate primers PR1 (5′-GAA TTC A(TC)G GN(TC) TNT GGC CNG A-3′) and PR3 (5′-NNG TCG AC(GA) AAA CAT ATN CCT A(TC)(TC) TCN (ATC)NT AG(TC) TC-3′) from Richman et al (1995). We ran 40 cycles (94°C, 15 s; 45°C, 60 s; 72°C, 60 s) on a GeneAmp PCR system 2700 (Applied Biosystems, Foster City, CA, USA). Amplification products were cloned into the pT7Blue vector using the Novagen Perfectly Blunt Cloning kit (Darmstadt, Germany) to separate allele copies. Individual colonies were isolated with sterile pipette tips and amplified using the PCR primers and conditions described above. Inserts can be incorporated into the vector in either direction (5′–3′ or vice versa), thus primers PR1 and PR3 were used for this amplification to avoid incorrect RFLP interpretations. The colony PCR products were screened for polymorphisms using RFLP analysis with a battery of restriction enzymes including TaqI, HhaI, HaeIII, and DdeI. Approximately 20 colonies were selected from each individual for RFLP analysis. All colonies identified as having unique RFLP banding patterns were amplified with the vector primers U19 (5′-GTT TTC CCA GTC ACG ACG T-3′) and R20 (5′-CAG CTA TGA CCA TGA TTA CG-3′) and sequenced on an Applied Biosystems Automated 3730 DNA Analyzer at Cornell University (Ithaca, NY, USA). Nine of the RFLP patterns recovered were sequenced multiple (range, 2–13) times to confirm the reliability of RFLP analysis to identify S-alleles. For all of the 49 sequenced colonies corresponding to the nine RFLP patterns, sequences with the same RFLP pattern had identical nucleotide sequences.

L. parishii S-RNase diversity

The number of S-alleles for a diploid, self-incompatible population was estimated from the number of alleles found in a sample of that population by iteratively solving Paxman's (1963) estimator. Given n GSI alleles in a sample of r diploid individuals, the number of alleles in the population, N, is given by Equation (1).

This equation assumes that individuals are under balancing selection and that alleles exist in equal frequencies within the population. To examine the assumption that the alleles in the sample were drawn from a uniform frequency distribution, Mantel's (1974) test was used (equation (2)), where Cj is the number of times an allele occurs, n is the number of alleles found and r is the number of diploid individuals sampled.
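The iterative solution can be illustrated with a minimal sketch. It assumes the commonly cited form of Paxman's estimator, n = N[1 − (1 − 2/N)^r], which is not reproduced in the text above; the function names and the bisection search are illustrative choices, not necessarily the procedure used in this study.

```python
def expected_alleles(N, r):
    """Expected number of distinct alleles in a sample of r diploid
    heterozygous plants, given N equally frequent alleles in the population.
    Assumed form of Paxman's (1963) estimator: n = N * (1 - (1 - 2/N)**r)."""
    return N * (1.0 - (1.0 - 2.0 / N) ** r)


def paxman_estimate(n, r, tol=1e-9):
    """Solve expected_alleles(N, r) = n for N by bisection (illustrative)."""
    if n >= 2 * r:
        # Every sampled allele copy is distinct: no finite solution exists.
        raise ValueError("n must be smaller than 2r for a finite estimate")
    lo, hi = float(n), 1e6  # the root lies above n; 1e6 is an arbitrary cap
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_alleles(mid, r) < n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Sample sizes (n alleles, r individuals) taken from the Results section.
for n, r in [(24, 25), (24, 30), (19, 20)]:
    print(n, r, paxman_estimate(n, r))
```

Under this assumed form, the real-valued roots for the three sample configurations fall between 28 and 29, between 26 and 27, and between 22 and 23 alleles, respectively, each within one allele of the estimates reported in the Results.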

Molecular evolution, allele diversification, and terminal branch lengths

L. parishii S-RNase sequences were aligned in Clustal X (Thompson et al, 1994) using default settings and adjusted by eye following Ioerger et al (1991). Partial S-RNase sequences from L. andersonii that spanned regions C2–C5 from Ioerger et al (1991) were included in the alignment (Richman, 2000; GenBank accession numbers AF05343–4, AF105347–9, AF105353, AF105355, AF105358–9, and AF105362–3).

Nonsynonymous and synonymous substitution rates were measured in MEGA3 (Kumar et al, 2004) using the modified Nei-Gojobori model with Jukes–Cantor correction for the calculated transition/transversion ratio (R=0.75) among L. parishii S-RNases. Values of Dn and Ds and their ratio were calculated for the entire S-RNase sequence, as well as for the conserved (C3 and C4) and hypervariable (HVa and HVb) regions separately. Nucleotide substitution patterns were also assessed using site-specific codon models in Phylogenetic Analysis using Maximum Likelihood (PAML, version 3.13d; Yang, 2002). Likelihood ratio tests were used to compare models of neutral (M1) and positive (M2) selection for both L. parishii and L. andersonii.
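The Jukes–Cantor step of this calculation can be sketched as follows. The sketch takes the proportions of nonsynonymous and synonymous differences as given and does not reproduce the site counting of the modified Nei-Gojobori method, which MEGA3 performs; the input proportions are hypothetical and the function names are illustrative.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion of differing sites p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(p_n, p_s):
    """Return (Dn, Ds, Dn/Ds) from proportions of nonsynonymous (p_n)
    and synonymous (p_s) differences between two sequences."""
    dn = jukes_cantor(p_n)
    ds = jukes_cantor(p_s)
    if ds == 0.0:
        raise ValueError("Ds is zero; the ratio is undefined")
    return dn, ds, dn / ds


# Hypothetical proportions for one pair of alleles (not data from this study).
dn, ds, ratio = dn_ds(p_n=0.30, p_s=0.35)
print(f"Dn={dn:.3f} Ds={ds:.3f} Dn/Ds={ratio:.3f}")
```

A ratio above one for a region is taken as evidence of positive selection, and a ratio well below one as evidence of purifying selection.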

To assess patterns of allelic diversification, we identified pairs of closely related alleles using the maximum likelihood (ML) topology (see below in Methods and Figure 3 in Results) and calculated the Dn, Ds, and Dn/Ds ratios for each pair of alleles. This resulted in 14 intraspecific allele pairs within L. parishii (S01, S03; S01, S02; S02, S03; S08, S18; S10, S13; S10, S11; S11, S13; S04, S12; S06, S07; S06, S23; S07, S23; S21, S22; S21, S25; S22, S25), five intraspecific allele pairs within L. andersonii (S07, S17; S06, S21; S07, S20; S17, S20; S05, S16), and 14 interspecific allele pairs between L. parishii and L. andersonii (P-S24, A-S13; P-S09, A-S06; P-S09, A-S21; P-S18, A-S11; P-S10, A-S01; P-S11, A-S01; P-S06, A-S16; P-S07, A-S16; P-S23, A-S16; P-S21, A-S05; P-S22, A-S05; P-S25, A-S05). We used non-parametric Mann–Whitney U-tests to determine if Dn, Ds, or Dn/Ds for intraspecific pairs of alleles differed from interspecific allele pairs.
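For readers unfamiliar with the test, the U statistic can be computed directly from its definition, as in the sketch below. The Dn/Ds values are hypothetical placeholders rather than data from this study, and the function name is illustrative.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two samples, counting ties as 0.5.

    U_x is the number of (x_i, y_j) pairs in which x_i exceeds y_j;
    the two statistics always sum to len(x) * len(y)."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical Dn/Ds values for intra- and interspecific allele pairs.
intraspecific = [0.42, 0.77, 1.10, 0.65, 0.90]
interspecific = [0.55, 0.81, 0.60, 0.95]
print(mann_whitney_u(intraspecific, interspecific))
```

With 19 intraspecific and 12 interspecific pairs, the two U values sum to 19 × 12 = 228; the test statistic is then compared against a tabulated critical value for those sample sizes.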

Figure 3

ML genealogy of 90 S-RNase sequences from Solanaceae. The 90 sequences were selected from a larger set of L. andersonii, Petunia spp., Physalis spp., Solanum spp., and Witheringia spp. S-RNases obtained from GenBank (see Methods for accession numbers and sampling strategy). Clades 1–10 are L. parishii TG lineages and clades A–C are the three lineages including all Witheringia and Physalis S-RNases.

We measured terminal branch lengths in L. parishii to facilitate comparisons with previous studies in Lycium (Richman and Kohn, 2000) and Solanaceae (Richman and Kohn, 2000; Stone and Pierce, 2005). After verifying that the L. parishii sequences did not significantly deviate from clocklike behavior using a two-cluster test in LINTREE (Takezaki et al, 1995), terminal branch lengths were measured using KITSCH (Felsenstein, 2005). Relative terminal branch lengths were calculated following Uyenoyama (1997) using equation (3), where S is the sum of terminal branch lengths, D is the time since coalescence of all alleles (total depth of the genealogy), and n is the number of alleles in the sample:

Genealogical analysis of Solanaceae S-RNases

Initially we constructed a genealogy using the neighbor-joining algorithm (Saitou and Nei, 1987) including all L. parishii S-RNases reported here, those mentioned above for L. andersonii, Petunia axillaris (AF239907–10, AY180048–50), Petunia integrifolia (AF301167–80), Physalis cinerascens (AF058930–41), P. crassifolia (L46653, L46655–63, L46665–6, L46668–9, L46672–3, L46677, L46679–80), Solanum carolinense (L40539–48, L40551), Solanum chacoense (AF191732, L36666–7, S69589, X56896–7), Witheringia maculata (AF102065–75), and W. solanacea (AY454099–107, AY454109–21). The S-like RNase S2 from Antirrhinum hispanicum (X96465) was used as the outgroup.

From this genealogy (not shown), one allele representing each TG lineage from each species of Petunia, Physalis, Solanum, and Witheringia was selected in order to preserve TG relationships across alleles from these genera plus Lycium. The selected alleles, as well as all alleles from L. parishii and L. andersonii, were included in an ML analysis containing 90 alleles sampled across these five genera within Solanaceae. The ML model parameters were determined using the Akaike information criterion in Modeltest version 3.6 (Posada and Crandall, 1998). The best-fit model (GTR+I+G) was used in an ML analysis in PAUP* (Swofford, 2002) using the heuristic search option, TBR branch swapping, MulTrees option in effect, and a single neighbor-joining tree as a starting topology.

We also generated a Bayesian consensus tree to test for differences in branching relationships compared to the ML tree. We included the entire 138-allele data set of Lycium, Petunia, Physalis, Solanum, and Witheringia S-alleles in the Bayesian tree to determine if any TG lineages were lost by reducing allelic sampling in the ML analysis. Bayesian analysis was run in MrBayes version 3.0b4 (Ronquist and Huelsenbeck, 2003) using four simultaneous Markov chain Monte Carlo chains each starting from a random tree, with a general time reversible substitution model and gamma-distributed rate variation across sites. Two million generations were run with a tree saved every 100 generations, and trees preceding the stabilization of likelihood values were excluded in the construction of a 50% majority rule consensus tree in PAUP*. The number of TG lineages was determined for L. andersonii and L. parishii for both the ML and Bayesian analyses; we define a TG lineage as the most recent node including alleles from more than one genus.
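The TG lineage definition can be made concrete with a small sketch. It represents a rooted genealogy as nested tuples and, for each allele of a focal species, records the smallest enclosing clade that contains alleles from more than one genus. The tree, the leaf labels, and the function names are hypothetical, and this is one reading of the definition rather than the procedure used to count lineages in this study.

```python
def genera(clade):
    """Set of genus names under a clade; leaves are 'Genus_species_allele'."""
    if isinstance(clade, str):
        return {clade.split("_")[0]}
    return set().union(*(genera(child) for child in clade))


def tg_lineages(tree, species_prefix):
    """Return the smallest multi-genus clades that contain at least one
    allele of the focal species (one clade per TG lineage)."""
    found = []

    def visit(clade):
        # True if the clade holds focal alleles not yet assigned to a lineage.
        if isinstance(clade, str):
            return clade.startswith(species_prefix)
        pending = [visit(child) for child in clade]  # visit every child
        if any(pending):
            if len(genera(clade)) > 1:
                found.append(clade)  # most recent node spanning >1 genus
                return False
            return True
        return False

    visit(tree)
    return found


# Hypothetical genealogy with two TG lineages for the focal species.
tree = (
    (("Lycium_parishii_S01", "Lycium_parishii_S02"), "Solanum_sp_S1"),
    (("Lycium_parishii_S03", "Petunia_sp_S1"), "Physalis_sp_S1"),
)
print(len(tg_lineages(tree, "Lycium_parishii")))
```

In this toy tree, alleles S01 and S02 share one TG lineage with a Solanum allele, and S03 forms a second lineage with a Petunia allele.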

Results

L. parishii S-RNase diversity

The 25 sampled L. parishii plants contained 24 unique S-RNase sequences ranging in length from 369 to 390 bp (GenBank accession numbers DQ367853–76). Twenty-one of the 25 plants had unique S-RNase genotypes, while genotypes S03 S12 and S01 S04 were each found in two individuals (Table 1). Only one allele was recovered from one of the sampled plants, whereas several plants had either three or four alleles. The possibility of cross-contamination among samples can be eliminated for genotype 04-P13; thus, it is possible that individuals of L. parishii can have >2 S-RNases. In addition, further sequencing of genotype 04-P12 found no evidence of contamination (ie, the presence of the second allele from the potentially contaminating individual). While we recognize that this region is susceptible to contamination (due to the necessity of multiple PCRs and cloning), we do not believe it explains individuals with >2 S-RNases in our sample.

Table 1 S-RNase genotypes for 25 Lycium parishii plants sampled from Organ Pipe National Monument, Pima County, AZ, USA

The frequency of alleles in the sample did not significantly differ from a uniform distribution, as assessed using Mantel's (1974) test, and so Paxman's (1963) estimator was used to measure population allelic diversity. We estimated population allele number for three combinations of the number of individuals (r, equation (1)) and alleles (n, equation (1)). First, we included all plants (r=25) and alleles (n=24), which returned a population estimate of 29 alleles. We also estimated allele number by treating those individuals with greater than two alleles (Table 1) as multiple individuals (r=30 and n=24); doing so reduced the estimate of the number of alleles to 27. Finally, excluding all of the individuals with greater than two S-alleles (r=20 and n=19), Paxman's estimator returned a population estimate of 23 S-alleles.

Molecular evolution at the S-RNase locus in Lycium

Among L. parishii S-RNases, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site (Dn/Ds) was 0.758 for the entire amplified region of the S-RNase gene, which included both the conserved and hypervariable regions specified by Ioerger et al (1991). Including only the hypervariable regions HVa and HVb, the Dn/Ds ratio increased to 1.068, whereas including only conserved regions returned a very low Dn/Ds of 0.107 (Figure 1).

Figure 1

Pairwise comparison of nonsynonymous substitutions per nonsynonymous site (Dn) versus synonymous substitutions per synonymous site (Ds) among L. parishii S-RNases. Open symbols indicate comparisons including the conserved regions C3 and C4, whereas closed symbols are for HVa and HVb regions (Ioerger et al, 1991).

Previous studies indicating that variation at fewer than five AA positions can be sufficient to result in a new S-allele specificity (eg, Matton et al, 1997; Wang et al, 2001) suggest that a site-by-site analysis of nucleotide substitutions may be more appropriate for detecting positive selection at this locus. Likelihood ratio tests comparing site-by-site models M1 (neutral evolution) to M2 (positive selection) were conducted in PAML; these analyses found that the model of positive selection fit the data significantly better than the model of neutral evolution for both L. parishii (2Δ𝓁=68.12, df=2, P<0.0001) and L. andersonii (2Δ𝓁=65.97, df=2, P<0.0001) S-RNases (Table 2). Among L. parishii alleles, positive selection was detected at 27% of AA sites (p2=0.27, Table 2) for a total of 35 sites for L. parishii, and there was a probability of at least 0.90 that the Dn/Ds ratio exceeded one for 17 of these sites. For L. andersonii, positive selection was detected at 37.4% of sites, with a probability of at least 0.90 that the Dn/Ds ratio exceeded one at a total of 25 sites (Table 2).
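The P-values for these likelihood ratio tests can be checked by hand, because the upper tail of a chi-square distribution with two degrees of freedom has the closed form exp(−x/2). The sketch below applies that identity to the two test statistics reported above; the function name is illustrative, and the calculation assumes the usual chi-square approximation for the test statistic.

```python
import math


def lrt_p_value_df2(statistic):
    """Upper-tail probability of a chi-square distribution with df = 2.

    For df = 2 the survival function is exactly exp(-x / 2)."""
    if statistic < 0:
        raise ValueError("a likelihood ratio statistic cannot be negative")
    return math.exp(-statistic / 2.0)


# Test statistics (2 * delta log-likelihood) reported for M1 versus M2.
reported = {"L. parishii": 68.12, "L. andersonii": 65.97}
for species, stat in reported.items():
    print(species, lrt_p_value_df2(stat))
```

Both statistics lie far beyond the 5% critical value of 5.99 for two degrees of freedom, consistent with the reported P<0.0001.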

Table 2 Likelihood ratio tests comparing models of neutral evolution (M1) and positive selection (M2) among S-RNases for Lycium parishii and L. andersonii as implemented in PAML (Yang, 2002)

Figure 2 depicts positively selected sites across the different structural regions of the S-RNase gene. Eighteen of the 35 positively selected sites among L. parishii alleles occurred within the hypervariable regions HVa and HVb. Further, among the 17 sites with a 90% probability or higher that the Dn/Ds ratio exceeded one, 11 were located in the hypervariable regions. In contrast, 20 of 47 positively selected sites, and only 13 of the 25 sites with a 90% probability or higher that the Dn/Ds ratio exceeded one, occurred within hypervariable regions among L. andersonii alleles. Positive selection was not detected at any site within conserved regions C3 or C4 for either species (Figure 2).

Figure 2

Schematic structure of the S-RNase gene between conserved regions C2 and C5 (Ioerger et al, 1991) in (a) L. parishii and (b) L. andersonii. Regions are drawn to scale and the AA positions of the hypervariable (HVa and HVb) and conserved regions (C3 and C4) are indicated above or below the regions, respectively. Numbers directly below a region are those AA sites reported to be under positive selection using site-specific analyses in PAML. Those sites in bold had a probability of positive selection >90%, whereas others had probabilities between 50 and 90%. Underlined sites represent those shared between L. parishii and L. andersonii.

L. andersonii and L. parishii share 24 positively selected sites (underlined in Figure 2). There are eight common hypervariable sites between the two species where there is at least a 90% probability of positive selection for both species, five in HVa (AA sites 15, 18, 23, 26, and 28) and three in HVb (AA sites 43, 44, and 50). Many of the positively selected sites among L. andersonii alleles occur in the region beyond C4, which has not been previously designated hypervariable or conserved.

Alleles S08 and S13 recovered from L. parishii (this study) were identical to alleles S11 and S01 recovered by Richman (2000) for L. andersonii, reducing the number of interspecific comparisons of closely related alleles to 12. There was no difference in Dn, Ds, or Dn/Ds for the comparison of all intraspecific allele pairs (n=19) to interspecific pairs of closely related alleles (Ds, U=120; Dn, U=136; Dn/Ds, U=136; all n1=19, n2=12, Ucritical=163, P>0.05; Table 3).

Table 3 Average Ds, Dn, and Dn/Ds for closely-related alleles within Lycium parishii, within L. andersonii and between L. parishii and L. andersonii

A two-cluster test (Takezaki et al, 1995) did not reject a molecular clock for L. parishii (Q=33.7, df=23, P=0.0694). Terminal branch lengths relative to the depth of the genealogy were significantly longer than expected (Rsd=5.22) when compared to Uyenoyama's (1997) simulated frequency distribution for a gene under balancing selection. There was no negative association between the age (measured either by the number of TG lineages or Rsd) and number of alleles for L. parishii (Table 4).

Table 4 The number of S-RNases sampled from plants within natural populations for seven species of Solanaceae and estimates of allelic diversity in these populations (n, number individuals sampled)

Solanaceae S-RNase genealogies

S-RNase alleles from L. parishii are located across numerous clades in the ML (Figure 3) and Bayesian genealogies (Figure 4). Both the ML and Bayesian topologies cluster L. parishii alleles into ten TG lineages (labeled 1–10 in Figures 3 and 4). L. parishii allele S15 does not group closely with alleles from any other species, but is so divergent from all conspecific alleles that we consider it a TG lineage and attribute its isolation to incomplete sampling of S-RNase polymorphism in other genera. L. parishii alleles are not evenly distributed: seven occur in lineage nine, while the remaining lineages have only one, two or three alleles. In contrast, the L. andersonii S-alleles have a relatively even distribution, although substantially fewer alleles (11 S-alleles) were included in the analyses compared to L. parishii (24 S-alleles). S-RNases from both species of Lycium group together in many lineages (Figures 3 and 4) with only a few exceptions; allele S26 from L. parishii is separate from L. andersonii alleles, as is the lineage containing alleles S04 and S12 from L. parishii.

Figure 4

Bayesian genealogy including 138 S-RNase sequences from Solanaceae. Sequences for alleles from L. andersonii, Petunia spp., Physalis spp., Solanum spp., and Witheringia spp. were taken from GenBank (see Methods for accession numbers). Posterior probabilities are indicated above the branches. Clades 1–10 are L. parishii TG lineages and clades A–C are the three lineages including all Witheringia and Physalis S-RNases.

At the level of genera, Lycium exhibits levels of TG polymorphism similar to those of Solanum (11 TG lineages) and Petunia (eight TG lineages) (Figure 3). Consistent with Richman and Kohn's (2000) neighbor-joining Solanaceae S-allele genealogy, Witheringia and Physalis alleles are restricted to three TG lineages (lineages A–C in Figures 3 and 4). There is extensive diversification within two of these three lineages (A and B), but not the third (C) (Figure 4). Similar to L. andersonii but not to L. parishii, Solanum and Petunia alleles are generally evenly distributed across lineages, with the exception of one P. integrifolia TG lineage that includes six alleles.

Discussion

SI in L. parishii

One purpose of this study was to verify SI in L. parishii using molecular sequence data from the S-RNase locus. We recovered a total of 24 S-RNases from L. parishii; however, several individuals had >2 alleles. One explanation for the three (or four) allele plants could be that these individuals are polyploid. Chromosome counts have revealed mixed ploidy (ie, both diploid and polyploid individuals) for several species of Lycium (L. chilense, Stiefkens and Bernardello, 2000; L. gariepense, L. horridum, and L. mascarenense, Venter, 2000; L. californicum, Yeung et al, 2005). The L. parishii population is located within a national monument, which is home to six additional Lycium species; tetraploidy following hybridization between L. parishii and one or more of these other species is a possible explanation for plants bearing >2 S-RNases. Whether the putative polyploids are self-incompatible is presently unknown and can only be confirmed via experimental crosses; thus, conclusions on the compatibility status of these plants are premature. In addition, it is unclear if the plants bearing three S-RNases are partially homozygous or whether additional intensive sampling of clones could detect a fourth allele. An alternative interpretation of the >2 S-RNase individuals is that the style samples included pistils collected from two different genotypes. L. parishii often grows in dense patches along washes and it is possible that styles from multiple individuals were inadvertently pooled in these samples (see Methods). Treating individuals with three or four alleles as separate genotypes, our population estimate of the number of alleles is 27. Excluding these individuals, the population estimate is 23 alleles. Regardless of which estimate is used, both are within the range of alleles recovered for other SI species (Table 4), and genealogical analysis reveals that TG polymorphism and terminal branch lengths are comparable to other Solanaceae with GSI (Figures 3 and 4).

S-RNases in L. parishii were highly polymorphic (average pairwise AA divergence was 52%), and all but a single individual were heterozygous, as expected under GSI. It is unclear why we were not able to retrieve two alleles from 01-P10; perhaps this plant has an extremely divergent S-RNase that we were unable to amplify with the degenerate primers. Alternatively, this plant may be homozygous and possibly self-compatible. This does not necessarily indicate a breakdown of GSI in the L. parishii population, as a low frequency of SC individuals has been reported for other populations with GSI (eg, Tsukamoto et al, 1999; Stone and Pierce, 2005).

The data presented here, coupled with the experimental crosses in Miller and Venable (2002), confirm the presence of SI in this species. However, it should be noted that Miller and Venable (2002) reported low levels of seed production following self-crosses for L. parishii. Given that SI in many Solanaceae appears to be disrupted by polyploidy (reviewed in Stone, 2002), the relationship between the low levels of self-fertility reported by Miller and Venable (2002) and the >2 S-RNase genotypes should be investigated. Indeed, Miller and Venable (2000) have argued that in Lycium the loss of SI leads to the evolution of gender dimorphism. Confirmation of GSI in the hermaphroditic close relatives of the dimorphic species is an important first step in evaluating this hypothesis, and an extension of the present study to the dimorphic taxa (Miller and Savage, unpublished data) may shed light on Miller and Venable's suggestion.

Patterns of molecular evolution at the S-RNase locus

Despite the classic explanation that balancing selection maintains the extreme allelic polymorphism commonly found at the S-RNase locus (Wright, 1939), S-allele studies are often unable to demonstrate such selection using Dn/Ds ratios averaged across the entire amplified S-RNase gene (eg, Ma and Oliveira, 2002; Kato and Mukai, 2004). This is not surprising, as the S-RNase gene includes five highly conserved regions that are structurally necessary for RNase function and would be expected to evolve primarily under purifying selection. In contrast, hypervariable regions HVa and HVb are located on the surface of the S-RNase protein and are thought to be involved in recognition, making these regions likely candidates for positive selection (Ioerger et al, 1991). In L. parishii, the low Dn/Ds for conserved regions is consistent with functional constraints, whereas the only comparisons that had a Dn/Ds ratio greater than one were for hypervariable regions (Figure 1). Considering individual sites, many of the positively selected sites in both L. parishii and L. andersonii were within the hypervariable regions, and there were no positively selected sites detected in the conserved regions (Figure 2). In addition, L. parishii and L. andersonii S-RNases both fit a model of positive selection significantly better than a model of neutral evolution (Table 2).
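The comparison between a model of positive selection and a model of neutral evolution can be illustrated with a likelihood ratio test. The sketch below assumes that the two models are nested and that the selection model has two additional free parameters, so that the test statistic is compared against a chi-square distribution with two degrees of freedom; for that distribution the upper-tail probability has the closed form exp(-x/2). Both the degrees of freedom and the log-likelihood values are assumptions chosen for illustration and are not the values in Table 2.

```python
import math


def likelihood_ratio_test_df2(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by two parameters.

    Returns the test statistic 2 * (lnL_alt - lnL_null) and its p-value
    under a chi-square distribution with 2 degrees of freedom, whose
    survival function is exactly exp(-x / 2).
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    # A nested alternative cannot fit worse than the null; clamp
    # small negative values caused by incomplete optimisation.
    statistic = max(statistic, 0.0)
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods (not the values reported in Table 2).
lnl_neutral = -1000.0
lnl_selection = -990.0
lrt_statistic, lrt_p = likelihood_ratio_test_df2(lnl_neutral, lnl_selection)
```

With these invented inputs the statistic is 20 and the p-value is exp(-10), well below conventional significance thresholds; a different number of extra parameters would require the corresponding chi-square distribution.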

Congruence across genera in the hypervariable sites that determine specificity suggests that a single change within targeted regions of the S-RNase locus may be capable of generating new alleles. Ida et al (2001) compared the hypervariable regions of 10 S-RNases across three genera (Nicotiana, Petunia, and Solanum) in Solanaceae to determine the most variable sites within HVa and HVb. In our alignment, these sites correspond to sites 21 (HVa) and 43 (HVb; Figure 2). There is at least a 95% probability of positive selection at site 21 and at least a 99% probability of positive selection at site 43 for the L. andersonii S-RNases. For L. parishii, there is only a 74% probability of positive selection at site 21, but a 99% probability of selection at site 43. There are four sites (in our alignment, sites 18, 20, 21, and 46) within the S-RNase hypervariable regions that Matton et al (1997) modified in the S. chacoense S11 allele to match the S13 allele, resulting in a change to the S13 specificity. When only three of these four sites were modified (sites 18, 21, and 46), the result was a dual-specificity allele that rejected both S11 and S13 pollen (Matton et al, 1999). In our analyses, these four sites were found to be under positive selection for both species (Figure 2) with the exception of site 46 in L. parishii and site 20 in L. andersonii. Overlap across multiple genera in the specific hypervariable sites that differ reinforces the hypothesis that a few targeted AA changes can generate new specificities. This process could enable the rapid regeneration of allelic diversity following a bottleneck event, such as the one suggested in the ancestor of Witheringia and Physalis (Richman and Kohn, 1999), and may explain the maintenance of GSI in these genera in the period of reduced Ne directly following the bottleneck.

S-RNases recovered from this L. parishii population may provide evidence of a new lower limit for the number of AA changes necessary to generate a novel S-RNase specificity. Previous empirical research found a minimum of four AA differences between functionally distinct S-RNases (Matton et al, 1997). In L. parishii, two alleles, S21 and S22, vary at only two nonsynonymous sites (AA sites 44 and 123, Figure 2) and at a single synonymous site (site 46). Both L. parishii alleles were sequenced twice to verify these nucleotide differences, although functional differences in these alleles can only be confirmed through experimental crosses. Empirical verification that these two sequences are unique and functional S-RNases would reinforce the importance of the hypervariable regions in determining specificity, and suggest that even fewer AA changes may be important to the generation of new alleles.
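The distinction between nonsynonymous and synonymous differences used to compare two alleles can be sketched as a codon-by-codon tally over a pair of aligned coding sequences. The code below is a simplified illustration that assumes the standard genetic code, in-frame ungapped sequences, and a count of differing codons rather than individual nucleotide changes; the function name and the toy sequences are hypothetical and do not represent the S21 and S22 alleles.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Tally differing codons as nonsynonymous or synonymous.

    Simplification: each differing codon is counted once, even if it
    differs at more than one nucleotide position.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    counts = {"nonsynonymous": 0, "synonymous": 0}
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] != CODON_TABLE[codon_b]:
            counts["nonsynonymous"] += 1
        else:
            counts["synonymous"] += 1
    return counts


# Hypothetical toy sequences: GCT -> GCC keeps alanine (synonymous),
# AAA -> AGA changes lysine to arginine (nonsynonymous).
toy_counts = count_codon_differences("ATGGCTAAA", "ATGGCCAGA")
```

Methods that estimate Dn and Ds additionally normalise these counts by the numbers of nonsynonymous and synonymous sites and correct for multiple substitutions, which this sketch does not attempt.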

Solanaceae S-RNase genealogy

Genealogies constructed using neighbor-joining, ML, and Bayesian methods show extensive TG polymorphism among L. parishii S-RNases. This is consistent with both theoretical expectations under GSI (Vekemans and Slatkin, 1994) and empirical findings from the closely related L. andersonii (Richman, 2000). Although there are only six TG lineages for L. andersonii in our genealogies (Figures 3 and 4), we restricted our sampling to the 11 L. andersonii S-RNase sequences that fully spanned regions C2–C5. Richman's (2000) original analysis included 22 L. andersonii alleles and recovered 11 distinct TG lineages. Thus, the 10 TG lineages found in L. parishii are similar in number to the 11 recovered from the complete set of L. andersonii alleles (see also Table 4).

Contrary to the results of Stone and Pierce (2005) for Witheringia, we find no evidence for post-speciation radiation of S-RNases in L. parishii or L. andersonii. Estimates of Ds, Dn, and Dn/Ds were equivalent for intraspecific and interspecific allelic comparisons (Table 3), and alleles from both species group within similar TG lineages in the genealogies (Figures 3 and 4). Thus, it would appear that considerable S-RNase diversity was present in the ancestor of these species. It is intriguing then to consider how S-RNases S08 and S13, recovered here for L. parishii, could have identical nucleotide sequences to S-RNases S11 (AF10535) and S01 (AF10534) recovered for L. andersonii by Richman (2000). Although L. parishii and L. andersonii are closely related (Miller, 2002; Levin and Miller, 2005), it is unlikely that S-RNases in separate species would retain identical DNA sequences. Further, as our laboratory has never worked with L. andersonii S-RNases, this finding cannot be explained by cross-contamination. Such a result seems best explained by gene flow between species; additional studies investigating gene flow between Lycium species are warranted. It is interesting to note that both the S08 and S13 alleles from L. parishii were found in individuals with more than two S-RNases (Table 1).

For Lycium, Physalis, and Witheringia, species pairs within genera have similar numbers of TG lineages and terminal branch lengths (Table 4). However, whereas the level of S-RNase diversity between Lycium species is comparable, the numbers of alleles are strikingly different for intrageneric comparisons within Physalis (44 versus 14 alleles) and Witheringia (37 versus 15 alleles; Table 4). Richman and Kohn (1999, 2000) argue that for Physalis the difference in allelic diversity is a result of divergent life-history strategies in the two species that affect the long-term stability of populations. Specifically, ephemeral populations of P. cinerascens result in small effective population sizes and lower S-RNase diversity compared to larger, stable populations of P. crassifolia, which experience strong selection for S-RNase diversification. Similarly, Stone and Pierce (2005) note that the larger altitudinal range of W. solanacea (compared to W. maculata) may be responsible for maintaining high S-RNase diversity in this species, primarily as a result of seed rain between populations of W. solanacea.

In contrast to comparisons within Physalis and Witheringia, L. parishii and L. andersonii have similar measures of terminal branch lengths (L. parishii, Rsd=5.22 versus L. andersonii, Rsd=5.42; Table 4), indicating similar rates of allelic turnover, and relatively similar and high numbers of S-RNases (38 versus 27 alleles for L. andersonii and L. parishii, respectively; Table 4). L. andersonii has a wider geographic range than does L. parishii (Chiang-Cabrera, 1981), and it would be informative to investigate S-RNase diversity in multiple populations of both taxa to obtain species-wide estimates of allele number. Although both species are primarily pollinated by anthophorid bees and seeds are dispersed by birds, the greater range of L. andersonii may harbor higher numbers of S-RNases in the species as a whole.