Introduction

The crested ibis (Nipponia nippon) is one of the most endangered avian species in the world. It has suffered a catastrophic decline since the late nineteenth century and was thought to be extinct until 7 wild birds were rediscovered in China in 1981 (ref. 1). Since then, this tiny population has steadily expanded under great conservation efforts and is currently the only wild population in the world1. In addition, several captive populations were established to help its recovery. By 2013, the crested ibis population in China had increased to more than 1500 (ref. 2). However, as a bottlenecked species with intensive inbreeding, recent populations suffer from low genetic diversity3. The mortality of ibis embryos and nestlings is also high1,4,5 and many pathogens can fatally infect the bird2,6,7,8. Therefore, the genetic conservation and immunological function of the crested ibis require rigorous investigation.

Construction of a genomic library for the crested ibis can conserve its DNA resources and thereby promote detailed genomic research on this rare species and development of strategies for better protection. Bacterial artificial chromosomes (BACs) provide a capacity to clone genomic fragments larger than 100 kb9, which is sufficient for most research on a genomic scale. Moreover, BACs can be stably transformed into bacterial hosts with low occurrence of chimerism and are more easily purified than bacteriophage λ cosmid vectors or yeast artificial chromosomes10. Given these advantages, BACs have been widely used in the construction of DNA libraries and in genomic studies10.

Defensins are a collection of small peptides (commonly less than 100 amino acids) enriched in hydrophobic and cationic residues11,12 that play a critical role in regulating the innate immune systems13 of a wide range of organisms including fungi, plants, invertebrates and vertebrates12,14,15,16. These antimicrobial peptides can protect the host against a broad spectrum of pathogens, such as bacteria, enveloped viruses, certain fungi and parasitic protozoa13,17, mainly by binding to and disturbing pathogen membranes, blocking viral replication cycles, or changing host cell recognition sites13,18. Moreover, defensins are chemotactic for T cells, macrophages, mast cells and immature dendritic cells19, thereby serving as a bridge between innate and adaptive immunity.

Defensins contain 6–8 conserved cysteine residues forming 3–4 intramolecular disulfide bonds. Based on the positions of the cysteine residues and the pairing of the disulfide bridges, they are categorized into five groups: plant (C1–C8, C2–C5, C3–C6 and C4–C7), invertebrate (C1–C4, C2–C5 and C3–C6), α-defensins (C1–C6, C2–C4 and C3–C5), β-defensins (C1–C5, C2–C4 and C3–C6) and θ-defensins (C1–C6, C2–C5 and C3–C4), with the latter three unique to vertebrates15,16,20,21. β-defensins are found in a majority of vertebrates12,16,22,23, while α-defensins are found only in mammals, including marsupials, but are absent in some Laurasiatheria species, such as dogs and cattle24,25. The cyclic θ-defensins, derived from the posttranslational ligation of two truncated α-defensin-like precursor peptides21, have only been found in Old World monkeys and some apes and are defunct in gorillas, chimpanzees and humans26. Based on their prevalence, the major β-defensin subfamilies are believed to have arisen before the bird–mammal split, followed by the appearance of α-defensins after the divergence of mammals and finally θ-defensins in the primate lineage26,27. In the human genome, at least 6 α-defensins13 and 40 potential β-defensins were found within 5 clusters scattered on chromosomes 6, 8 and 20, and the number of β-defensins may be much higher due to rapid duplication28. Typically, the human β-defensins contain two exons: exon 1 corresponds to the 5′ untranslated region (UTR) and signal peptide and exon 2 corresponds to the short anionic propiece followed by the mature peptide and 3′UTR28. Post-translational modifications include proteolytic cleavage of the signal peptide and propiece from the prepeptide to yield the mature peptide28,29.
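
The disulfide connectivities listed above can be written down as a small lookup table. The following Python sketch is only an illustration of that classification scheme: the names DEFENSIN_PAIRINGS and classify_by_pairing are hypothetical and are not part of this study, and the pairings are taken directly from the text.

```python
# Disulfide connectivity of each defensin group, as listed in the text.
# A pair (i, j) means that cysteine Ci is bonded to cysteine Cj.
DEFENSIN_PAIRINGS = {
    "plant": {(1, 8), (2, 5), (3, 6), (4, 7)},
    "invertebrate": {(1, 4), (2, 5), (3, 6)},
    "alpha": {(1, 6), (2, 4), (3, 5)},
    "beta": {(1, 5), (2, 4), (3, 6)},
    "theta": {(1, 6), (2, 5), (3, 4)},
}


def classify_by_pairing(pairs):
    """Return the defensin group whose connectivity matches `pairs`, or None."""
    # Sort each pair so that (5, 1) and (1, 5) are treated as the same bond.
    normalized = {tuple(sorted(pair)) for pair in pairs}
    for group, pattern in DEFENSIN_PAIRINGS.items():
        if normalized == pattern:
            return group
    return None


if __name__ == "__main__":
    print(classify_by_pairing([(1, 5), (2, 4), (3, 6)]))
```

Because the five patterns are mutually distinct, an exact set comparison is enough; a peptide whose connectivity is only partly known would need a looser match than this sketch provides.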

The first avian defensin genes were isolated from chicken and turkey leukocytes in the mid-1990s30. Since then, defensins have been increasingly studied in other avian species31,32, but comprehensive genomic information on defensin clusters has only been reported in chicken (14 genes)27,33, duck (19 genes including a pseudogene)34 and zebra finch (22 genes)35, all suggesting that only β-defensins are present in birds and revealing various gene duplications. Phylogenetic analysis based on the three birds revealed that the avian β-defensins (AvBDs) could be classified into 12 subfamilies34: nine subfamilies (AvBD2, AvBD4–5 and AvBD8–13) maintain one-to-one orthologous genes in all species; two subfamilies including three members (AvBD1/AvBD3 and AvBD7) show different lineage-specific duplications among species; and one subfamily (AvBD14) is absent in zebra finch. Accordingly, most avian defensins evolved before the bird species split from each other. In contrast to the human β-defensins, the avian β-defensins typically contain four exons: exon 1 for the first part of the 5′UTR, exon 2 for the rest of the 5′UTR and the signal peptide, exon 3 for the short propiece (sometimes absent) and the majority of the mature peptide and exon 4 for the C-terminus of the mature peptide and the 3′UTR12,27.

In this work, we constructed genomic libraries for the crested ibis using BACs, including a routine 4D-PCR library and a simplified reverse-4D library. Based on the target BAC clones screened from the libraries, we characterized the content and organization of the defensin gene cluster, which is the first to be reported in Pelecaniformes. The genomic and evolutionary information about defensins revealed by this work will increase our understanding of the immunity of this endangered species and the historic events in avian defensin gene evolution.

Results

Construction and characterization of BAC libraries

Because the crested ibis is an endangered species, we first constructed a routine BAC genomic DNA library, a 4D-PCR BAC library36, to store its genetic resources. The library contains 129312 clones stored in 1347 96-well plates and was arrayed into 27 superpools (49 × 96 clones). The average insert size of the library was 86.5 kb, with an empty vector rate of 6.5% (Figure 1a). By comparison with the C-values of other Threskiornithidae species available in the Animal Genome Size Database (http://www.genomesize.com/), we estimated the crested ibis genome size at about 1.35 Gb. Thus, this library covers about 7.8 equivalents of the crested ibis genome.

Figure 1

Quality and usability of BAC libraries.

(a) Distribution of insert sizes for randomly selected BAC library clones. (b) BAC overlapping cluster of defensin genes. H46-1, H1-10, 375B9, H68-1 and 780E4 correspond to the five positive BACs. Grey arrows: end sequences of BACs; black arrow: location of primers (AvBD7) used to screen BAC libraries. (c) Chromosome localization of the crested ibis β-defensin cluster. The H46-1 probe was mapped to crested ibis chromosome 3 (green signals; numbering described in Methods). Chromosome scales are at the upper right corner.

Considering the relatively small insert size in the 4D-PCR library, we decided to construct a second BAC library with greater coverage. However, 4D-PCR BAC library construction is time-consuming and labor-intensive. Recently, a new type of simply arrayed BAC library, a reverse-4D BAC library, was developed for birds37; this method can significantly reduce the time investment and is efficient enough to acquire positive clones covering the target genomic region. Therefore, we adjusted library construction conditions (Table 1) and built a reverse-4D BAC library for the crested ibis consisting of 1300 sub-libraries, each with an average of 400 clones. Randomly selected clones had average insert sizes of 100.9 kb and an empty vector rate of 5.6% (Figure 1a). The reverse-4D library therefore provides at least 35-fold coverage of the crested ibis genome.
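
The coverage figures for both libraries can be checked with simple arithmetic. The sketch below assumes the usual estimate (insert-bearing clones × mean insert size ÷ genome size); the text does not state the exact formula used, and the function name genome_coverage is hypothetical. With the values reported here it gives roughly 7.7 genome equivalents for the 4D-PCR library and about 37 for the reverse-4D library, in line with the reported "about 7.8" and "at least 35-fold".

```python
def genome_coverage(n_clones, mean_insert_kb, empty_rate, genome_size_gb):
    """Genome equivalents = insert-bearing clones x mean insert size / genome size."""
    genome_size_kb = genome_size_gb * 1_000_000  # 1 Gb = 1,000,000 kb
    effective_clones = n_clones * (1.0 - empty_rate)  # drop empty-vector clones
    return effective_clones * mean_insert_kb / genome_size_kb


# 4D-PCR library: 1347 plates x 96 wells, 86.5 kb inserts, 6.5% empty vectors.
four_d_pcr = genome_coverage(1347 * 96, 86.5, 0.065, 1.35)

# Reverse-4D library: 1300 sub-libraries x ~400 clones, 100.9 kb inserts, 5.6% empty.
reverse_4d = genome_coverage(1300 * 400, 100.9, 0.056, 1.35)

if __name__ == "__main__":
    print(f"4D-PCR: {four_d_pcr:.1f}x, reverse-4D: {reverse_4d:.1f}x")
```

The result depends directly on the 1.35 Gb genome size estimate, so a revised genome size would rescale both figures proportionally.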

Table 1 Conditions for constructing the 4D-PCR and reverse-4D libraries

Validation of the usability of the BAC libraries

We obtained two pieces of evidence to verify the usability of the crested ibis BAC libraries. First, we used the primer pair for the defensin gene AvBD7 to screen the BAC libraries and obtained five positive BACs (Figure 1b), of which 375B9 and 780E4 are from the routine 4D-PCR library while the other three (H46-1, H1-10 and H68-1) are from the simplified reverse-4D library. The overlap among BACs shows that the largest BAC clones are all from the reverse-4D library, confirming the large inserts in this library. The largest BAC, H46-1, contains a 150 kb insert spanning the target genomic region and covers all defensin genes of the crested ibis, as determined from a BLAST search against the relevant genomic region of chicken defensins, providing an opportunity to characterize the crested ibis defensin cluster in detail.

Second, we used the plasmid DNA of BAC H46-1 as a probe to perform fluorescence in situ hybridization (FISH). The result shows that the crested ibis defensins physically map to chromosome 3, the same chromosome that contains the defensin clusters in the chicken and zebra finch genomes27,35, without any non-specific binding (Figure 1c). This means that, in contrast to the human and mouse defensins, which are scattered on several different chromosomes29, the avian defensins are probably restricted to only one chromosome.

Characterization of the crested ibis defensin genes

Fourteen β-defensin loci clustering within a region of about 129 kb were identified in the clone H46-1 (Table 2). We also obtained the full-length cDNA sequences of these defensin genes except AvBD11 and AvBD14, which may not be expressed in the selected organs or may be expressed only when induced by certain pathogens. The cDNA sequences obtained range from 378 bp to 813 bp. By aligning the deduced amino acid sequences of the 14 crested ibis defensins, we detected high sequence similarity in the signal peptides but a very low level of conservation in the remaining regions, including the propiece and mature peptides (Figure 2a). This pattern was also observed in many other vertebrates23,27,35 and the high sequence divergence concentrated in the mature peptide among different defensin genes was suggested to be driven by accelerated adaptive selection after successive rounds of gene duplication23. Nevertheless, all the defensin genes in crested ibis exhibit six conserved cysteine residues located in the mature peptides (Figure 2a). Moreover, we noticed splicing signal variations in intron 2 (AG/CT) of AvBD12 and in intron 3 (both GT/CA) of AvBD2 and AvBD1α, which coincidentally correspond to the two hypervariable regions of AvBDs (i.e., the propiece and mature C-terminus, respectively) (Figure 2b–2c). Interestingly, AvBD11 showed two tandem copies of the six-cysteine motif in the C-terminal segment with a low level of sequence identity (Figure 2a). Similar patterns were also observed for chicken AvBD11 (ref. 27) and the lizard Anolis carolinensis AcBD14 (ref. 22). Presumably, these defensins have six, instead of three, intramolecular disulfide bridges, but the functional significance remains to be studied.
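
The splicing signal variations described above are deviations from the canonical GT/AG dinucleotides at the two ends of an intron. The following Python sketch shows one way such deviations could be flagged; it is an illustration only, the function name splice_signal is hypothetical and the example introns are made-up toy sequences, not sequences from the crested ibis.

```python
CANONICAL = ("GT", "AG")  # donor / acceptor dinucleotides of a typical intron


def splice_signal(intron_seq):
    """Return (donor, acceptor, is_canonical) for an intron sequence."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to carry both splice signals")
    donor, acceptor = seq[:2], seq[-2:]  # first and last two bases of the intron
    return donor, acceptor, (donor, acceptor) == CANONICAL


if __name__ == "__main__":
    # Toy introns (made-up sequences), one canonical and one with a GT/CA signal.
    print(splice_signal("GTaagtcttccag"))
    print(splice_signal("GTaagtcttccca"))
```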

Table 2 β-defensin genes in the crested ibis
Figure 2

Amino acid sequence alignment and structural variation of mRNA/coding sequences.

(a) Dots and dashes represent identity with and deletion relative to AvBD13, respectively. Regions corresponding to signal, propiece and mature peptides are indicated. Light-grey: conserved residues (frequency ≥50%); dark grey: six highly conserved cysteine motifs; boxes: six additional cysteines in AvBD11. (b) Multiform gene organizations ranging from two to four exons are distinguished by different colors. Two-exon organizations contain fusions of the first and last two exons; three-exon organizations contain fusion of the first two exons (Table 2). “Undetermined” indicates prediction of coding sequences without UTRs (AvBD11 and AvBD14). Gene regions are drawn to scale (lower right corner). Coding regions correspond to sequence alignments in (a). AvBD5 has two splice variants (5-1 and 5-2) with the same coding sequence. (c) Classical four-exon structure of avian β-defensins. Footnote a: intron 2 of AvBD12 shows splicing signal variation (GT/AG to AG/CT). Footnote b: intron 3 of AvBD2 and AvBD1α exhibits splicing signal variation (GT/AG to GT/CA).

Notably, we also observed extensive fusion of adjacent exons in the crested ibis β-defensins (Table 2), creating three different gene organization patterns (including the numbers and locations of introns/exons) ranging from two exons (typical for human β-defensins) to four exons (typical for the avian β-defensins), as illustrated in Figure 2b. Classical AvBD genes consist of four exons (four-exon organization; Figure 2c), in which exon 1 corresponds to the 5′UTR, the two internal exons encode the signal, propiece (absent in some genes) and mature peptides and exon 4 includes the C-terminal sequence of the mature peptide and the 3′UTR. In some defensins (AvBD5, 9, 10), the first two exons are fused, leaving only three exons (three-exon organization). Moreover, for AvBD12 and 13, not only the first two but also the last two exons are fused, resulting in a “two-exon organization,” which closely resembles that of typical human β-defensins28. Additionally, we found one alternative mRNA splicing event for AvBD5 (Table 2 and Figure 2b). Compared with the transcript AvBD5-1, the other transcript AvBD5-2 lacks the corresponding exon 1; instead, its exon 2 extends 29 bp into intron 1, a segment that is spliced out in AvBD5-1. Nevertheless, the two splice variants differ only in the 5′UTR and thus share the same translated sequence.

Comparative genomic analyses

The genomic organizations of whole β-defensin gene clusters in the chicken27, duck34, zebra finch35 and crested ibis are shown in Figure 3a–3d and dot plot comparisons between the crested ibis and other birds are shown in Figure 4a–4c. First, compared to the duck (19 genes within 103 kb, 0.184 genes/kb)34, zebra finch (22 genes within 127 kb, 0.173 genes/kb)35 and chicken (14 genes within 82 kb, 0.171 genes/kb)27, the gene density within the β-defensin cluster is remarkably low in crested ibis (14 genes within 128 kb, 0.109 genes/kb).
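
The gene densities quoted above follow directly from the gene counts and cluster lengths. The short Python sketch below reproduces that arithmetic from the numbers given in the text; the variable names are illustrative only.

```python
# species: (number of β-defensin genes, cluster length in kb), as given in the text
clusters = {
    "duck": (19, 103),
    "zebra finch": (22, 127),
    "chicken": (14, 82),
    "crested ibis": (14, 128),
}

# Gene density in genes per kb for each cluster.
density = {species: genes / kb for species, (genes, kb) in clusters.items()}

if __name__ == "__main__":
    # List species from the densest to the sparsest cluster.
    for species, value in sorted(density.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{species}: {value:.3f} genes/kb")
```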

Figure 3

Organization of β-defensin gene clusters.

β-defensin clusters from (a) chicken, (b) duck, (c) zebra finch and (d) crested ibis. Vertical bars: gene locations; lengths drawn to scale (upper right corner). Paralogous genes for AvBD1 (orange) and AvBD3 (blue) and chicken-specific AvBD6 (green) are shown. Red: tRNA clusters. Small triangles: transcriptional orientations of defensin genes. GC contents of gene clusters are shown above the genomic maps. Duplicated regions are bracketed by double-sided arrows and dotted lines.

Figure 4

Dot plot analysis of defensin clusters.

Crested ibis vs. (a) chicken, (b) duck and (c) zebra finch. Positions of the genes for each species are indicated on the corresponding axes (colors as in Figure 3). Defensin (red boxes) and tRNA cluster (purple boxes) duplications are depicted as replicated blocks between syntenic regions.

Second, a series of gene duplication events were identified in these species (replicated blocks, Figure 4a–4c), including several species-specific duplications and one common duplication (AvBD1 vs. AvBD3), also supported by the relative gene locations (Figure 3a–3d). The detection of two AvBD7 copies (AvBD6 and AvBD7) in chicken (Figure 4a), seven AvBD3 copies (AvBD3S1, AvBD3A–3F) in duck (Figure 4b) and three AvBD1 copies (AvBD123–AvBD125) and nine AvBD3 copies (AvBD115–AvBD122) in zebra finch (Figure 4c) was in accordance with previous reports27,34,35. Similar to the zebra finch, duplication of AvBD1 was also detected in the crested ibis (Figure 4a–4c), leading to two closely linked gene copies (AvBD1α and AvBD1β) between AvBD2 and AvBD3. Besides these gene duplications unique to each species, the dot plots revealed, for the first time, a common duplication between AvBD1 and AvBD3 in all four birds. This result indicates that a historical duplication event preceding the speciation of these species gave rise to the two ancestral AvBD lineages with adjacent locations, which can be dated back to 87–135 Mya38.

Third, AvBD14, which is absent from zebra finch35, was identified in the chicken27 and duck34 as well as the crested ibis. In fact, we found that the hwamei (Passeriformes) has also lost AvBD14 (unpublished data). Furthermore, the phylogenetic analysis in the next section shows that AvBD14 is a single-copy orthologous gene among these three birds (Figure 5). Thus, we predict that the absence of this gene may be exclusive to Passeriformes. Collectively, this study reveals that gene gain and loss have occurred widely during the evolution of the avian defensin gene clusters. Apart from the different gene duplications found in the four birds, β-defensin gene clusters are generally conserved, with nearly identical gene order and orientation ranging from AvBD13 to AvBD14 on the gene map (Figure 3a–3d) and considerable sequence homology (unbroken diagonal lines, Figure 4a–4c).

Figure 5

Phylogenetic relationship of avian β-defensins.

AvBD genes from chicken (Gallus gallus, Gaga), zebra finch (Taeniopygia guttata, Tagu), duck (Anas platyrhynchos, Anpl) and crested ibis (Nipponia nippon, Nini; grey). Numbers beside nodes indicate maximum-likelihood bootstrap supports/Bayesian posterior probabilities (values > 50 only). The AvBD12 lineage branch is truncated by a dash. Green lizard (Anolis carolinensis, Anca) AcBD15 is included as the outgroup.

Interestingly, we noticed that the GC content of the regions corresponding to the subfamily AvBD1/AvBD3 in all species and the subfamily AvBD7 in chicken, where gene duplication events occurred, is generally greater than in non-duplicated areas of each species, especially in duck (Figure 3b) and zebra finch (Figure 3c). We further compared the average GC content between the duplicated and non-duplicated regions in each bird and found significant differences (P < 0.005) (Supplementary Table S1). Earlier studies reported a positive correlation between duplication/recombination and GC content39,40, which implies that the frequent duplication events in certain avian defensins, especially in the AvBD1/AvBD3 subfamily, might be associated with the high GC content in these regions. The AvBD5 region shows the highest GC content in chicken, zebra finch and crested ibis but has a markedly low GC percentage in duck, suggesting that this segment has species-specific GC content variation. Besides the duplicated defensin gene regions, we also detected significantly higher (P < 0.005) GC content in the tRNA clusters located between AvBD10 and AvBD11 in the four species (Figure 3a–3d) (Supplementary Table S1), within which extensive duplications were also detected by dot plots (Figure 4a–4c). According to the predicted results, these tRNA genes (4 in chicken, 8 in duck, 5 in zebra finch and 7 in crested ibis) are mainly responsible for transferring tryptophan (hydrophobic) and arginine (cationic), which are typical for defensins12,20, suggesting possible coevolution of defensin and tRNA genes in physical proximity.

Evolutionary analyses

In Bayesian and maximum-likelihood (ML) phylogenetic analyses (Figure 5), genes from the same subfamilies were clustered with relatively high posterior probabilities (or bootstrap values), most of which were >90% (bootstrap values > 90). Meanwhile, except for 1:1:1:1 orthologs in the AvBD2, 4, 5, 8, 9, 10, 11, 12 and 13 clusters, the other clades exhibit single-copy orthologs among 2–3 birds and/or extra paralogs in the remaining species (Figure 5). Based on the strong orthologous signals and the frequent gene gain and loss, we propose that the avian defensin genes evolved by the birth-and-death process, which was suggested to be an important evolutionary mode for multigene families, especially for immune system genes41 and has been reported for the mammalian42 and reptilian β-defensins22.

In both phylogenetic trees, the AvBD1 and AvBD3 lineages are grouped into a large subfamily, which further supports the historical gene duplication event detected in Figure 4a–4c. For AvBD1, chicken and duck present one copy each while crested ibis and zebra finch show two and three copies, respectively (Figure 5). Nonetheless, the three zebra finch AvBD1s are grouped into one intra-species cluster, while the crested ibis AvBD1α and AvBD1β are separated, with AvBD1β forming a basal branch of the whole AvBD1 clade and AvBD1α grouping with the chicken and duck AvBD1s (Figure 5), suggesting one ancient duplication in the crested ibis AvBD1 and two recent duplications in the zebra finch AvBD1. For AvBD3, we noticed a sharp contrast in gene distribution pattern among species; chicken and crested ibis have only one copy of AvBD3 while duck and zebra finch have many duplicated AvBD3s (Figure 5). In addition, AvBD6 is unique to chicken and shows a high degree of pairing with Gaga-AvBD7 (Figure 5), indicating that the AvBD6 gene was duplicated from AvBD7 recently and exclusively in Galliformes35. As a result, gene duplication further enlarged and diversified the repertoire of β-defensins in different birds.

For orthologous genes among the four birds, the pattern of purifying selection was supported by a mean ω (overall dN/dS ratio) < 1 in all defensin loci and further supported by all the pairwise calculated ω < 1, with the exception of the comparison of AvBD2 between the crested ibis and zebra finch (Table 3). This implies that the β-defensins may have mainly evolved under relaxed purifying selection in Aves35. Furthermore, the signal peptide coding regions generally have higher ω values than the mature peptides, indicating that the mature peptides have relatively greater sequence conservation within each gene lineage, which may be critical for maintaining specific structure or function for each orthologous defensin gene.
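
The interpretation of ω used above follows the usual convention: ω < 1 suggests purifying selection, ω ≈ 1 neutrality and ω > 1 positive selection. The Python sketch below illustrates this reading only; the function names and the dN and dS values are hypothetical and are not taken from Table 3.

```python
def omega(dn, ds):
    """Return the dN/dS ratio, or None when dS is zero and the ratio is undefined."""
    return None if ds == 0 else dn / ds


def selection_regime(w, tolerance=0.05):
    """Map an omega value onto the conventional interpretation."""
    if w is None:
        return "undefined"
    if w < 1 - tolerance:
        return "purifying"
    if w > 1 + tolerance:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical dN and dS values for one pairwise comparison.
    w = omega(0.12, 0.40)
    print(w, selection_regime(w))
```

The tolerance around 1 is an arbitrary choice made for this sketch; a formal analysis would test whether ω differs significantly from 1 rather than apply a fixed cutoff.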

Table 3 dN/dS values calculated for orthologous β-defensin genes in chicken, zebra finch, duck and crested ibis

Discussion

The present study constructed a 4D-PCR library with a relatively small insert size and a reverse-4D library with an elevated average insert size, which we attribute to our optimization of the construction conditions (Table 1). First, we decreased the number of cells embedded in the low melting point agarose in order to reduce the DNA concentration and increase separation efficiency for small DNA fragments by pulsed-field gel electrophoresis (PFGE), thus decreasing the amount of short fragments and facilitating inclusion of large DNA fragments into vectors. Second, compared to the PFGE process in the construction of the 4D-PCR library, we performed an additional PFGE step for 8 hours (3–5 s pulses at 5 V/cm), which further eliminated small DNA fragments. Third, the electrical resistance was increased to 200 Ω, thereby lengthening the pulse time for electrotransformation, which may promote the uptake of large BACs.

Due to the difficulty of acquiring samples from endangered and rare species, it is important to store genetic resources in genomic DNA libraries. Thus, this study provides optimized conditions of library construction and increases the convenience of building large-insert BAC libraries for other species. Because very few molecular or genomic studies have been performed in the crested ibis, the BAC libraries presented here are not only key for this work but also provide a solid foundation for future species protection based on genomic features of the crested ibis.

In this study, we observed extensive exon fusion in the crested ibis β-defensins (Figure 2b and Table 2), with gene organizations ranging from the classical four-exon structure (Figure 2c) to a mammal-like two-exon organization. In chicken, only AvBD12 presents fusion of the last two exons (resultant three-exon structure), while all the other defensin genes consist of four exons27. In the green lizard, at least seven different gene organization patterns also ranging from two to four exons were reported22, including the exact “four-exon organization” (AcBD7, 12 and 26), “three-exon organization” (most AcBDs) and “two-exon organization” (AcBD17, 21, 22, 31 and 30b). Such variability of gene organization has also been reported in many invertebrates and was shown to be a result of exon-shuffling during evolution14. Exon-shuffling by intronic recombination plays an important role in rapid construction of new genes encoding multidomain proteins during evolution43. It was previously speculated that the two-exon human β-defensins may have evolved through exon-shuffling and that this organization is presumably adaptive since it allows β-defensins to mobilize faster to efficiently defend against pathogens27.

We also observed two splice variants for the crested ibis AvBD5 (Figure 2a–2b). Such alternative splice patterns have been revealed in the green lizard (AcBD13a/b and AcBD18a/b) and the authors suggested that the first exons in both variants probably associate with different proximal promoters, which may contribute to tissue- or time-specific gene expression regulation22. In fact, four different alternative splice patterns were found for the green lizard β-defensins, which is likely an important mechanism for generating new molecular diversity in β-defensin genes. Thus, exon fusion and alternative splicing might play special roles in defensin evolution and innate immunity of the crested ibis.

The Bayesian and ML trees show that the crested ibis AvBD1α has a closer phylogenetic relationship to the AvBD1 genes of the other three birds, while AvBD1β forms a basal primitive branch of the whole AvBD1 lineage (Figure 5). This implies that after the duplication between ancestral AvBD1 and AvBD3 lineages, AvBD1 was further duplicated before the four birds diverged, which generated the two copies AvBD1α and AvBD1β. However, AvBD1β was later lost during subsequent avian evolutionary history and has so far only been identified in the crested ibis genome. Consequently, the crested ibis defensins provide insight into ancestral avian defensin structure; in addition to the previously suggested 13 ancestral avian β-defensins including AvBD1–5 and AvBD7–14 (ref. 34), we propose that the ancient Neognathae may also have possessed AvBD1β, which needs verification in other avian orders.

The crested ibis β-defensins, first characterized in this study, will be helpful for protecting this endangered bird in the following two respects. In recent years, there has been an increasing number of reports regarding infectious diseases in the crested ibis, including septicemia6, Newcastle disease2, Salmonella infection8, Escherichia coli infection5, avian influenza7 and others, which implies that this rare bird is at increasingly high risk under variable pathogen pressures. Identification of defensin genes responsive to certain pathogens has been conducted in many avian studies12,34 but never in the crested ibis. This study determines crested ibis β-defensin genes in detail and thus makes it possible to conduct pathogen-associated analysis and further help this endangered bird better cope with pathogen invasion.

Although the number of crested ibis individuals has expanded greatly, the severe threat of high mortality rates during the pre- and post-hatch periods, largely caused by pathogen infection, has emerged due to extreme inbreeding1,4,5. This could be mainly attributed to the immature immune system of the developing embryos and newly hatched chicks, since components of the adaptive response are in a naïve state44. Many avian studies have demonstrated that avian defensins can protect embryos and nestlings from potential pathogenic assault and promote the early transition from an innate immune response to an adaptive one11,12,45. Thus, some studies utilized defensin genes as genetic markers to select for resistance to certain infections and thereby enhance the innate immune response via selective breeding in poultry45,46, while some others adopted artificially regulated defensin expression via maternal diet, for example, to control the selective colonization of birds and thereby suppress pathological microflora11,47. These findings all provide conservation implications for scientists and managers, e.g., methods for using defensin-based genetic management and dietary regulation to reduce the early mortality rates of crested ibis, especially in captive populations.

Methods

Ethics statement

The samples used in this study, including blood, liver and spleen, were collected from two captive individuals with permission from the Crested Ibis Breeding Centers (CIBC) and the Department of Wildlife Conservation, State Forestry Administration. The blood sample was obtained from the wing vein of a female individual bred in CIBC, Louguantai County, Shaanxi Province, China; sampling was performed with utmost care in cooperation with technical staff from the center. The liver and spleen tissues were acquired from a dead nestling provided by CIBC, Deqing County, Zhejiang Province, China, which was dissected immediately after its death, probably caused by a birth defect. All experiments were approved by the ethics committees of these two CIBCs and were carried out in accordance with the approved guidelines.

Construction of the BAC library

A routine BAC genomic DNA library was generated with DNA from the blood sample. The construction of this library was based on a previous protocol36 with a few modifications using a pCC1BAC vector (Epicentre) and MboI restriction enzyme. The library was arrayed for rapid screening by 4D-PCR36. A reverse-4D BAC library was generated and characterized following a previous protocol37, with some conditions optimized (Table 1). Three different restriction enzymes (MboI, EcoRI and HindIII) were used separately to form DNA inserts with different cohesive ends in order to maximize the complementation of sublibraries.

BAC screening and sequencing

Nine pairs of universal primers48 were used to perform PCRs with the crested ibis genomic DNA as template. Agarose gel electrophoresis of the PCR products indicated that four primer pairs (primers for AvBD4, 7, 11 and 12) yielded products, among which the primers for AvBD7 (5′ to 3′ sequences, F: TTGATCACTTTCATGGTGAATGAC, R: TCACCTTTTGCAGCAAGAATA) produced a single band of the expected size. To further confirm this identification, the product amplified by AvBD7 primers was processed by (1) isolation by agarose gel electrophoresis and purification with a DNA gel extraction kit (Axygen), (2) ligation into pMD18-T vector (TaKaRa) and transformation into E. coli DH5α competent cells (TaKaRa) and (3) sequencing of the cloned products. Sequences were identified by BLAST as AvBD7 from the crested ibis.

BAC clones containing β-defensin loci were subsequently screened from both BAC libraries using the AvBD7 primers above. The PCR screening processes were performed as previously described36,37. Insert sizes of positive clones were estimated by electrophoresis and BAC end sequencing36,37 was performed. The end sequences of these β-defensin-containing clones were aligned with the available chicken β-defensin genomic sequences from NCBI to preliminarily determine whether the BAC clone contained the entire defensin cluster. Finally, the BAC was shotgun subcloned, sequenced and assembled by a commercial sequencing service (Majorbio). The full BAC sequence was deposited in GenBank (accession number: KM098134).

Chromosome localization of the crested ibis β-defensins

Metaphase chromosomes were prepared from the peripheral blood of the crested ibis according to the method described by Moorhead et al.49 and the BAC clone H46-1 was used as a probe. FISH was performed following the previously reported protocol50. One hundred well-spread and distinct metaphases were karyotyped by screening 15 slides and the chromosome sizes were measured using ImageJ51, including the total length of the macrochromosomes and the arm ratio (long arm/short arm, q/p). Chromosome numbering was performed according to the reported karyotype ideogram of crested ibis52 and the criteria recommended by Levan et al.53.

Sequence analyses

According to the order and orientation as well as the clustering results of phylogenetic analysis described below, the crested ibis defensin genes were named following the nomenclature proposed previously33. The mRNA was extracted from the liver and spleen and the full-length sequences of β-defensin cDNA were obtained using the GeneRacer™ Kit (Invitrogen) with the primers listed in Supplementary Table S2. All the cDNA sequences were used for gene annotation and deposited in GenBank (accession numbers: KM272304–KM272316). The defensin genes undetected in the liver and spleen tissues were analyzed with BLAST based on homologies to known chicken genes and the coding regions were predicted by GeneWise54.

The chicken (chromosome 3: 107,022,000–107,104,000 bp) and zebra finch (chromosome 3: 110,733,400–110,860,340 bp) defensin cluster sequences were both obtained from Ensembl (http://asia.ensembl.org/index.html), while the duck defensin sequences were obtained from NCBI (NW_004678274.1, NW_004683518.1, KB742693.1 and NW_004716371.1). The several gaps in and between these four duck defensin scaffolds were filled by applying long-range PCR (Supplementary Table S3), cloning and sequencing. We predicted tRNA genes in defensin clusters for four birds by tRNAscan-SE55, analyzed GC content with 200 bp windows using Isochore56 (http://www.ebi.ac.uk/Tools/seqstats/emboss_isochore/) and further conducted Pearson's chi-squared test in SPSS 20.0 (SPSS Inc., Chicago, IL, US) to identify any significant differences between each duplicated region and the non-duplicated region. Finally, we drew dot plots using PipMaker57 to perform syntenic analysis between the crested ibis and other bird defensin clusters.
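
The windowed GC analysis can be illustrated with a few lines of Python. This is a simplified sketch, not the Isochore implementation: it assumes consecutive, non-overlapping 200 bp windows, and the function name gc_windows and the toy sequence are hypothetical.

```python
def gc_windows(seq, window=200):
    """GC fraction of consecutive, non-overlapping windows along `seq`."""
    seq = seq.upper()
    fractions = []
    # A trailing stretch shorter than one full window is ignored.
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


if __name__ == "__main__":
    # Toy sequence: a GC-only window followed by an AT-only window.
    toy = "GC" * 100 + "AT" * 100
    print(gc_windows(toy))
```

Per-window values of this kind for duplicated and non-duplicated regions are the sort of input that a subsequent statistical comparison would use.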

Evolutionary analyses

Amino acid sequences were aligned by MUSCLE in MEGA5.0 (ref. 58). According to the amino acid sequence alignment, nucleotide sequences were aligned and subsequently used for phylogenetic analysis with chicken, zebra finch and duck β-defensin genes. Nucleotide sequences encoding the signal and mature peptides were used for phylogenetic reconstruction in MrBayes 3.2.2 (ref. 59) with the green lizard AcBD15 (GenBank accession number: FR850158) as the outgroup. Two independent runs of 6,000,000 generations with a sample frequency of 100 were performed and the first 25% of the samples were discarded as “burn-in.” An ML tree was also constructed using the PhyML web service Phylogeny.fr60 with default settings. The rates of synonymous and non-synonymous substitutions (dS and dN, respectively) were calculated in MEGA5.0. Since the saturated non-synonymous sites in paralogous genes could bias the comparisons35, the dN/dS ratios (ω) were calculated pairwise for orthologous genes among the four birds (AvBD14 genes were only compared among the chicken, duck and crested ibis). The overall dN/dS ratios were calculated separately for signal peptide and mature peptide coding regions.