Introduction

Avian infectious bronchitis virus (AIBV), a typical species of the genus Coronavirus 1, causes a highly contagious disease in chickens. The AIBV genome encodes four structural proteins: the envelope protein (E), the spike protein (S), the membrane protein (M) and the nucleocapsid protein (N) 2, 3. Among them, the S protein is the major envelope glycoprotein of AIBV and is believed to be involved in attachment 4 and in the induction of virus-neutralizing antibodies 5, as is the case in many other coronaviruses.

Extensive antigenic and genetic variation is a distinct feature of AIBV. Antigenic variants are generally related to the appearance of positively selected, single point mutations in the antigenic domains of the viral proteins. These mutations alter virulence and allow the viruses to escape host defenses 6. Thus, even though vaccines have been used extensively 7, outbreaks of the disease continue. In the present study, we investigate, for the first time, the features of amino acid substitutions in the AIBV antigenic domain between a vaccine serotype (DE072) and a virulent strain (GA98) to better understand the adaptive evolution of AIBV. More importantly, we compare our AIBV results with those of a similar analysis of the well-known severe acute respiratory syndrome coronavirus (SARS-CoV), identifying the locations where amino acid changes are likely to be important to the survival of the virus.

Materials and Methods

Data mining

AIBV S protein sequences of the DE072 vaccine and the virulent strain GA98, which is thought to have arisen by escaping the immunity induced by the DE072 vaccine 6, were collected from the literature 6. After excluding questionable and short sequences, 31 AIBV sequences remained for our analyses, comprising 4 DE072 and 27 GA98 S protein sequences. Their names and GenBank accession numbers are listed in Figure 1.

Figure 1

Phylogenetic tree of the 31 AIBV sequences. After removal of ambiguous sites, the tree was constructed using the neighbor-joining (NJ) method. Bootstrap percentages higher than 50 are shown above the branches. The scale bar represents 0.01 nucleotide substitutions per site.

In addition, 86 sequences from the first epidemic, comprising 4 CoV sequences from palm civets (Paguma larvata) and 82 SARS-CoV sequences representing the three phases of the first SARS outbreak 8, as well as 13 sequences from the second epidemic, were retrieved from the literature 8, 9. The phylogenetic relationships among these sequences are presented in Figure 2.

Figure 2

Phylogenetic tree of the SARS sequences examined in this study. After removal of identical sequences, 46 sequences were used to construct the tree by the NJ method. The sequences from palm civets are marked by a star.

Evolutionary analyses

The amino acid sequences were first aligned with CLUSTAL W 10. The nucleotide sequences were then aligned according to the amino acid alignment and used for phylogenetic reconstruction. Phylogenetic analysis was conducted with the program MEGA2.1 11. The reliability of the resulting trees was evaluated by the bootstrap method 12 with 1,000 replications.
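The step of aligning nucleotide sequences according to the amino acid alignment can be illustrated with a minimal sketch. It is not the procedure actually used in the study, whose details are not given; the function name thread_codons and the toy sequences are hypothetical, and the sketch assumes each coding sequence is in frame and starts at the first aligned residue.

```python
def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Map an unaligned coding sequence onto an aligned protein sequence.

    Each residue in the protein alignment receives the next codon of the
    coding sequence; each gap column receives a three-character gap.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is too short for the protein")

    codons = []
    pos = 0  # current offset in the unaligned coding sequence
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


# Toy example with made-up sequences (not data from this study).
aligned = "MK-L"
cds = "ATGAAACTG"
print(thread_codons(aligned, cds))  # ATGAAA---CTG
```

Threading codons in this way keeps the reading frame intact, which is what codon-based estimates of dN and dS require.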

To identify putative amino acids that are subject to positive selection, the likelihood method of Yang et al. 13 was used with the PAML package 14 to estimate the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN). First, the likelihood ratio test (LRT) was employed to test whether positive selection exists by comparing a null model with a more general one. The null model does not allow for sites with ω (dN/dS) > 1, while the more general one does. Here the LRT compares M7 (null model) with M8 (general model) for the presence of sites under positive selection. M7 assumes that ω ratios are distributed among sites according to a beta distribution, and M8 adds a discrete ω class to M7. Positive selection is inferred when the LRT favors M8 and the ω value estimated for the additional class under M8 is greater than 1. Second, when the LRT suggests the presence of such sites, Bayes' theorem is used to calculate the posterior probability that a site has ω > 1 and thereby to identify residues under positive selection.
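The two steps described above can be sketched numerically. The following is an illustration only, not output from PAML. It assumes the usual convention that twice the log-likelihood difference between M8 and M7 is compared with a chi-square distribution with 2 degrees of freedom (M8 has two more free parameters than M7). The function names, log-likelihoods, class proportions and per-site likelihoods are all hypothetical.

```python
import math


def lrt_m7_vs_m8(lnl_m7: float, lnl_m8: float) -> tuple[float, float]:
    """Return the LRT statistic 2*(lnL_M8 - lnL_M7) and its p-value.

    Assumes a chi-square reference distribution with 2 degrees of freedom,
    whose upper-tail probability is exactly exp(-x / 2).
    """
    stat = max(2.0 * (lnl_m8 - lnl_m7), 0.0)
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


def posterior_of_class(priors, site_likelihoods, k):
    """Bayes' theorem for one site: P(class k | data at the site).

    priors[i] is the proportion of sites in class i, and
    site_likelihoods[i] is the likelihood of the site's data given class i.
    """
    weights = [p * l for p, l in zip(priors, site_likelihoods)]
    return weights[k] / sum(weights)


# Hypothetical log-likelihoods, not values from this study.
stat, p = lrt_m7_vs_m8(lnl_m7=-1000.0, lnl_m8=-994.0)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")

# Hypothetical two-class example: class 1 is the class with omega > 1.
post = posterior_of_class([0.9, 0.1], [0.001, 0.02], k=1)
print(f"posterior P(omega > 1) = {post:.3f}")
```

With 2 degrees of freedom, the 5% and 1% critical values of the statistic are 5.99 and 9.21, respectively.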

Results and Discussion

In this study, we analyzed the spike protein (S) of the DE072 vaccine and the GA98 strain to identify putative amino acids that are subject to positive selection and contribute to evasion of host immunity. The inferred amino acid substitutions occurred predominantly within the regions of conformational virus-neutralizing epitopes previously recognized by monoclonal antibody analyses 15, indicating that these substitutions may be responsible for the emergence of antigenic variation and may be crucial for the evasion of DE072 immunity by the GA98 serotype. Specifically, we found that 6 of the 9 amino acid changes under positive selection in the GA98 strain tended to switch from polar to non-polar or from neutral to charged residues (Table 1). For example, positions 60, 67 and 121 change from neutral residues (A, G, Q) in DE072 to charged residues (E, R, K) in the GA98 population, and positions 23, 123 and 278 change from polar residues (E, T, E) in DE072 to non-polar residues (V, L, A) in the GA98 population. Therefore, it appears that shifts of residue charge in the S protein are especially important in determining the potential epitopes for antibodies, as has also been demonstrated in other viruses 16. Finally, the phylogenetic analysis showed that the GA98 sequences can be divided into two clusters, and at some inferred sites different amino acid substitutions were observed between the two clusters (Table 1). It is possible that these substitutions are associated with differences in virulence between the two GA98 clusters.
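The charge and polarity shifts listed above can be checked with a small sketch. The residue groupings below follow one common textbook classification and are an assumption, not the scheme stated by the study: histidine is treated as neutral, and alanine and glycine are counted as non-polar. The pairing of positions with residues assumes the lists in the text are given in matching order.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
POSITIVE = set("KR")          # lysine, arginine
NEGATIVE = set("DE")          # aspartate, glutamate
NONPOLAR = set("AGVLIMFWP")   # assumed grouping; textbooks differ on C and Y


def charge(aa: str) -> str:
    """Classify a residue as positive, negative or neutral."""
    if aa not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {aa}")
    if aa in POSITIVE:
        return "positive"
    if aa in NEGATIVE:
        return "negative"
    return "neutral"


def polarity(aa: str) -> str:
    """Classify a residue as polar or non-polar."""
    if aa not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {aa}")
    return "non-polar" if aa in NONPOLAR else "polar"


def describe(old: str, new: str) -> dict:
    """Summarize how charge and polarity change for a substitution."""
    return {
        "charge": (charge(old), charge(new)),
        "polarity": (polarity(old), polarity(new)),
    }


# (position, residue in DE072, residue in GA98), as read from the text.
SUBSTITUTIONS = [
    (60, "A", "E"), (67, "G", "R"), (121, "Q", "K"),
    (23, "E", "V"), (123, "T", "L"), (278, "E", "A"),
]

for pos, old, new in SUBSTITUTIONS:
    d = describe(old, new)
    print(f"{pos} {old}->{new}: "
          f"charge {d['charge'][0]} -> {d['charge'][1]}, "
          f"polarity {d['polarity'][0]} -> {d['polarity'][1]}")
```

Under these groupings, the first three substitutions go from neutral to charged residues and the last three from polar to non-polar residues, matching the description in the text.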

Table 1 LRTs of positive selection for SARS and AIBV

It is interesting to see whether the substitution features identified as important for the survival of GA98 AIBV are also characteristic of SARS-CoV. Human SARS epidemics are generally believed to have originated from animals, with the palm civet as the primary suspect 8, 9, 17. Comparing the CoV sequences of palm civets with SARS-CoV sequences allows inference of the importance of the amino acid changes that have occurred since the divergence of the virus 9. After the first SARS epidemic in 2002-2003, scattered new cases were reported in Guangzhou during 2003-2004. Phylogenetic analysis has suggested that the new epidemic may have been caused by an independent viral invasion from animals to humans 17. Thus, we analyzed the sequences of animal and human viruses from the two epidemics separately. In the first epidemic, we found 8 amino acid changes under positive selection that distinguish animal and human viruses (Table 1). The first three occur in the S1 subunit, and all exhibit an interesting transformation from a positively charged residue (K) to neutral residues (N or T). Four of the remaining five show changes of amino acid polarity from non-polar to polar residues, more specifically from P, L, A and A in the CoV of palm civets to S, S, R/T and T in SARS-CoV, respectively. These changes may influence the conformation of the S protein and consequently the strength of antibody binding. Similarly, among the 4 amino acid changes under positive selection in the second epidemic, we identified two shifts of residue charge (Table 1). Importantly, one of these two positively selected sites (479) was observed not only in this epidemic but also in the first epidemic, with the same trend of change from positively charged residues (K, R) to neutral residues (N or T). Interestingly, this amino acid has also been suggested to be under positive selection by Song et al. (2005), who analyzed single-nucleotide variations (SNVs) in different populations. Moreover, this amino acid is located in the region (residues 318–510) that efficiently binds angiotensin-converting enzyme 2, a functional receptor for SARS-CoV 18. Since the CoV of palm civets and SARS-CoV are analogous to the DE072 vaccine and the GA98 strain of AIBV, respectively, our analysis suggests that significant changes of residue charge and polarity in critical proteins have enabled these viruses to establish themselves, although the directions of change differ between the two situations.

In addition, as in the AIBV case, amino acid substitutions were found to differ among the three clusters corresponding to the three phases of the first SARS outbreak. For instance, two substitutions (75 T→R and 311 G→R) were found only in the early-phase Guangzhou and Zhongshan lineages 8, but not in the subsequent phases. Similarly, substitutions unique to the middle and late phases, respectively, were also present. This evidence suggests that these particular amino acid substitutions may affect strain virulence and infectivity. It follows that different viral genotypes must be taken into account if an attenuated virus is adopted as a SARS vaccine, similar to the common practice in administering influenza vaccines 19.

In conclusion, this study implies that during vaccine development special attention needs to be paid to amino acid changes that result in an overall shift of residue charge and polarity. The current findings are therefore expected not only to shed new light on the evolution of avian infectious bronchitis virus, but also to provide useful information for drug and vaccine development against SARS, whether the vaccine is based on attenuated virus or DNA.