Introduction

“Candidatus Liberibacter asiaticus” (CLas), a phloem-limited α-proteobacterium, is associated with citrus Huanglongbing (HLB, yellow shoot disease, also known as citrus greening disease), which is devastating citrus production worldwide1,2. No effective cure for HLB is currently available. Management of HLB depends on excluding CLas from citrus-producing regions through the use of regional quarantines, pathogen-free nursery stocks, removal of infected trees, and control of vectors, e.g. the Asian citrus psyllid (ACP, Diaphorina citri). Knowledge of CLas biology is critical for the development of novel, effective HLB control strategies. Yet study of this bacterium has been difficult because it cannot be cultured in vitro.

Recent developments in bacterial whole-genome sequencing through next-generation sequencing (NGS) technology have opened a new avenue for research on non-culturable plant pathogenic bacteria. We recently sequenced the whole genome of CLas strain A4 from Guangdong, China, where HLB was first described3,4. Analyses of the A4 genome sequence have led to the discovery of a CRISPR/cas system and a dominant single-prophage phenomenon in CLas strains in China5. We also observed several large (>300 bp) DNA duplications in the strain A4 chromosome. One of them was identified as the ribonucleotide reductase (RNR) β-subunit gene, nrdB. RNR is a key enzyme that converts ribonucleotides to deoxyribonucleotides, the precursors for DNA synthesis and repair, and is under strict regulation during cell proliferation6,7,8. RNR is also an important target for the development of antibacterial drugs8. RNR and its genes have been studied extensively in model bacteria6,7,8, and a database dedicated to RNR research has been established9. Currently, no information about CLas RNR has been published, except for a brief mention of a partial RNR gene sequence in PCR detection10.

Detection of CLas relies mainly on PCR technologies that use specifically designed primer sets based on genomic DNA sequences, mostly the 16S rRNA gene. Examples are primer set OI1/OI2c for standard PCR11 and primer set HLBas/HLBp/HLBr for TaqMan real-time PCR12. The chromosome of CLas has three copies of the 16S rRNA gene13. One strategy for further improving PCR detection is to identify and target genes with >3 copies. Proof of concept has recently been achieved in PCR detection of Spiroplasma citri, the cause of citrus stubborn disease, by targeting multi-copy phage genes14. In CLas, a phage-based primer set (LJ900f/LJ900r) has been developed and tested15. However, recent investigation showed that CLas prophages and their sequences are highly variable, and that some strains lack prophages altogether5,16, which could impair detection reliability or accuracy. The high-copy-number nrdB provides an ideal target for sensitive detection of CLas.

The aims of this research were: (1) to characterize nrdB in CLas based on available RNR information and bacterial genome sequences, and to predict its possible biological role; (2) to elucidate the phylogenetic relationships of CLas among eubacteria based on nrdB DNA and amino acid sequences; and (3) to evaluate a nrdB-based primer set for improving CLas detection, in comparison with existing PCR primers such as the 16S rRNA gene-based primer set HLBas/HLBr and the prophage sequence-based primer set LJ900f/LJ900r.

Results

Identification of multiple-copy regions in A4 genome

As shown in Fig. 1, ten repeat regions were detected in the A4 genome by Dot Matrix analysis. Examination of the retrieved sequences revealed that regions 3, 4, and 6 were identical DNA sequences of 5,769 bp, each containing the genes for 16S, 23S, and 5S rRNAs, i.e. the rrn operon (Supplementary Table S1). The other seven regions were sequences of three different sizes: 1,881 bp for regions 1 and 10, 1,059 bp for regions 2, 5, and 9, and 1,491 bp for regions 7 and 8. Sequence alignments showed that regions 1, 2, 5, 9, and 10 contained a common 390-bp sequence (red in Fig. 1); regions 1, 7, 8, and 10 contained a common 1,492 bp sequence (green in Fig. 1); and regions 2, 5, and 9 contained a common 769 bp sequence (purple in Fig. 1). Genes or open reading frames (ORFs) corresponding to each region are listed in Supplementary Table S1.

Figure 1

Visualization of repeat regions in the genome sequence of “Candidatus Liberibacter asiaticus” (CLas) strain A4 (CP010804).

The dot-matrix map was created by self-comparison using the BLAST program available at the National Center for Biotechnology Information. Genome length is marked on both the X- and Y-axes, with the prophage region identified. The upper-left half (in blue shadow) carries the same information as the bottom-right half. Examination of one half (e.g. the bottom-right) reveals ten repeat regions, which are numbered accordingly on the diagonal line. Sequences sharing >99.9% similarity (repeats) among the ten regions are marked with the same color. The red sequence (390 bp) has the highest copy number, five (regions 1, 2, 5, 9, and 10). Regions 3, 4, and 6, in blue, are the rrn operon.

Characterization of CLas nrdB

Since the 390 bp sequence was repeated five times (the highest copy number) in the CLas genome, the sequences containing it, i.e. regions 1, 2, 5, 9, and 10 (Fig. 1), were selected for further study. In regions 1 and 10, 378 of the 390 bp formed ORFs CD16_00035 and CD16_04445, respectively (Supplementary Table S1). In regions 2, 5, and 9, the whole 1,059 bp formed ORFs CD16_00300, CD16_03625, and CD16_04230, respectively (Supplementary Table S1). All five sequences were annotated as nrdB, encoding the β-subunit of RNR Class Ia (EC 1.17.4.1): two (CD16_00035 and CD16_04445) in short form (nrdBS, 125 amino acids) and three (CD16_00300, CD16_03625, and CD16_04230) in long form (nrdBL, 352 amino acids) (Table 1). Note that 12 bp at the 5′ end of the 390-bp sequence were not part of nrdBS (Fig. 2). nrdBS1 and nrdBS2 differed by a SNP at position 389, which lies within synonymous stop codons. Five SNPs were found among nrdBL1, nrdBL2, and nrdBL3, none of which caused frame shifts (Fig. 2). Conserved domain analysis indicated that the long nrdBL protein (352 aa) contained a diiron center (ion binding site), the tyrosyl radical, a putative radical transfer pathway, and a dimer interface (polypeptide binding site) (Fig. 3). No iron binding site was identified in the short nrdBS protein (125 aa), as shown in the predicted 3-D structures (Fig. 3).

Table 1 General information of ribonucleotide reductase genes in “Candidatus Liberibacter asiaticus” A4 genome.
Figure 2

Alignment of five ribonucleotide reductase β-subunit gene (nrdB) related sequences in “Candidatus Liberibacter asiaticus” strain A4 (CP010804), showing single nucleotide polymorphisms (SNPs) and TaqMan PCR primer and probe designs.

Position numbers are listed above or under the sequence; SNPs are identified in red, with the corresponding codon underlined and the amino acids indicated above or under in blue. Sequences of the TaqMan primers (RNRf/RNRr) and probe (RNRp) are underlined; “~~” represents omitted identical nucleotides; # indicates the initial position of nrdBS or nrdBL.

Figure 3

Predicted 3-D structures of nrdBL1 (long form) and nrdBS1 (short form) of ribonucleotide reductase β-subunit.

(a) nrdBL and (b) nrdBS. The iron binding residues are shown in pink, centered on a purple dot (binding site); the tyrosyl radical is in red and the putative radical transfer pathway in green. The regions targeted by primer set RNRf/RNRr are highlighted in cyan. Conserved residues and models were generated using the Phyre server29. Final refinement of all 3-D structure figures was done with the Pymol Molecular Graphics System (v1.7.6).

BLASTn searches against all published CLas genome sequences revealed that all CLas strains had the same number of nearly identical nrdB genes (both nrdBS and nrdBL) (Table 2), except for strain SGCA5. This exception could be an artifact of de novo assembly17, which dropped repeat sequences, because reassembly using the A4 sequence as a reference recovered the same five nrdB genes (unpublished data). The copy number of nrdB in CLas (five) was much higher than in all non-CLas Liberibacters and in the other bacterial species (Table 2). Phylogenetic trees of selected representative bacteria based on the 16S rRNA gene, the amino acid sequence of nrdB, and the DNA sequence of nrdB are shown in Fig. 4. In all three trees, Liberibacters clustered together. Within Liberibacters, CLas strains clustered together, demonstrating that CLas forms a monophyletic lineage based on nrdB, as it does based on the 16S rRNA gene. It is noted, however, that Agrobacterium was closely related to Liberibacters in the 16S rRNA gene tree but not in the nrdB gene trees.

Table 2 Selected bacterial whole genome sequences and their numbers of class Ia ribonucleotide reductase genes, α-subunit nrdA and β-subunit nrdB.
Figure 4

Phylogenetic trees of “Candidatus Liberibacter asiaticus” related to other bacteria.

(a) Based on the DNA sequence of 16S rRNA genes. (b) Based on the amino acid sequence of ribonucleotide reductase gene β-subunit nrdB. (c) Based on the DNA sequence of nrdB. The CLas cluster was labeled by the right brace.

Specificity of RNR primer set

Primer set RNRf/RNRr was designed based on the 390 bp repeats in the CLas genome (Fig. 2; Table 3). A BLASTn search (word size = 16) using the RNRf/RNRr primer sequences as queries against the GenBank nr/nt database, which contained >1,000 bacterial genome sequences, returned hits strictly to the RNR gene of CLas. DNA samples extracted from two healthy citrus plants and one CLas-free psyllid reared in our laboratory showed no amplification with primer set RNRf/RNRr in SYBR Green real-time PCR. The melting temperature of the RNRf/RNRr amplicon was 81.50 °C.

Table 3 General information of PCR primers in this study.

Evaluations among RNRf/RNRr, HLBas/HLBr, and LJ900f/LJ900r

A total of 57 CLas samples collected from China and USA were selected for primer set evaluations (Fig. 5). Sensitivity comparisons were performed in parallel in the SYBR Green real-time PCR format (all three primer sets) and the TaqMan real-time PCR format (RNRf/RNRr and HLBas/HLBr). As shown in Fig. 5, mean Ct values were 20.05 for RNRf/RNRr, 21.71 for HLBas/HLBr, and 23.33 for LJ900f/LJ900r. Standard deviations for RNRf/RNRr (2.22) and HLBas/HLBr (2.37) were smaller than that for LJ900f/LJ900r (4.91), suggesting higher sequence variation in CLas prophages than in the conserved 16S rRNA gene and nrdB. Mean Ct differences between RNRf/RNRr and HLBas/HLBr were significant (P < 0.001) in both the SYBR Green and TaqMan PCR formats, with ΔCt being −1.68 ± 0.18 for SYBR Green PCR and −1.77 ± 0.18 for TaqMan PCR. These values represent a >3-fold increase in sensitivity based on the ΔCt method18. Differences between RNRf/RNRr and LJ900f/LJ900r and between HLBas/HLBr and LJ900f/LJ900r were also significant at the P < 0.05 level.

Figure 5

Comparisons of PCR detection sensitivities on “Candidatus Liberibacter asiaticus” using 57 samples from China (34) and USA (23) among primer sets RNRf/RNRr (nrdB-based), HLBas/HLBr (16S rRNA gene-based), and LJ900f/LJ900r (prophage-based).

(a) SYBR Green real-time PCR. (b) TaqMan real-time PCR. Numbers within each box are mean Ct values with standard deviations. P values were calculated with the independent-sample t-test. All qPCR assays were performed on the ABI real-time PCR system with the same reagent kit (Universal PCR Master Mix, Applied Biosystems).

Evaluation on RNRf/RNRr with field samples from China and USA

A total of 262 DNA samples extracted from CLas-infected plants and psyllids in seven provinces in China and three states in USA were tested in the SYBR Green real-time PCR format (Table 4). Overall, there was a significant difference between the Ct values of RNRf/RNRr and HLBas/HLBr (P < 0.0001), although variations existed from location to location in both countries. The largest P value in China was from Guangxi Province and the largest P value in USA was from Florida. However, in all cases, P values were <0.05 and ΔCt values were negative, ranging from −1.36 to −1.75 (Table 4). In addition, RNRf/RNRr qPCR assays on three different qPCR systems (ABI, MJ, and CFX) showed the robustness of RNRf/RNRr for detection of CLas (Supplementary Table S2).

Table 4 Evaluation of primer sets RNRf/RNRr (nrdB-based) and HLBas/HLBr (16S rRNA gene-based) on detection of “Candidatus Liberibacter asiaticus” using field samples collected from China and USA.

Discussion

The inability to culture CLas in vitro limits the use of traditional culture-based methodologies to study its biology. Genome sequence analyses in this study provided the first insight into an RNR gene of CLas and revealed previously unknown properties of the bacterium. According to model studies, RNRs are divided into three classes (Classes I, II, and III), largely based on their interaction with oxygen and the way in which they generate their tyrosyl radical19. The CLas nrdB described in this study belongs to Class Ia, which is exclusively oxygen-dependent8, implying an aerobic lifestyle for CLas. This is the first report on the oxygen usage of CLas, and it should benefit future efforts at in vitro cultivation of the bacterium.

Typically, bacterial RNR genes are arranged in an operon. Class Ia RNR genes form nrdAB, where nrdA encodes the RNR α-subunit and nrdB encodes the RNR β-subunit. This does not seem to be the case in CLas, where nrdA and nrdB are dispersed separately in the bacterial genome (Table 1). Examination of the regions neighboring each nrdB gene revealed no RNR gene homologs, with the exception of nrdBL2 (Supplementary Table S3). Upstream of nrdBL2 is RibF (Supplementary Table S3), encoding a riboflavin biosynthesis protein similar to that of nrdI in the Class Ib operon nrdHIEF, where nrdH encodes a glutaredoxin-like protein, nrdI encodes a flavoprotein, nrdE encodes the RNR α-subunit, and nrdF encodes the RNR β-subunit20. It was also reported that in Mycobacterium tuberculosis, RNR subunit genes are not arranged in an operon21. Interestingly, both CLas and M. tuberculosis are nutritionally fastidious intracellular pathogens: the HLB-associated CLas is not cultivable, and the slow-growing M. tuberculosis causes human tuberculosis.

The most intriguing finding from this study is that CLas has five copies of nrdB, three in a long form designated nrdBL and two in a short form designated nrdBS, along with a single nrdA (Table 1). As shown in Table 2, among the known Liberibacter genomes, only CLas has multiple copies of RNR genes. Although it is common to find multiple RNR classes within a single bacterial species8, only a few cases of direct nrd gene duplication have been reported. For example, M. tuberculosis has a second class Ib-like subunit gene21 and Streptococcus pyogenes has two clusters of class Ib genes, nrdHEF and nrdF*I*E*22. In both cases, the duplicated genes show significant variation at the DNA sequence level (<71% identity). In this study, the sequences of the three nrdBL genes are almost identical and those of the two nrdBS genes are nearly identical. The regions common to nrdBL and nrdBS are also identical. These observations indicate that the nrdB gene duplication events are recent.

Duplication of RNR genes has been shown to be important for bacterial proliferation. In M. tuberculosis and S. pyogenes, the two different nrd genes allowed bacterial growth under different growth environments21,22. Along these lines, the nrdB duplication in CLas could be related to its environmental adaptation, likely through increased functional dosage23. Although more evidence is needed, it will be of interest to study whether this possible dosage effect is linked to the current dominance of CLas in HLB. In Brazil, both CLas and CLam were reported to be associated with HLB24. However, as observation continued, the population of CLas increased whereas the population of CLam decreased25,26.

It is noted that nrdBS has no active site (Fig. 3). Its biological role(s) could be an interesting topic. In early research, a strain of Escherichia coli (C600) was found to have two forms of the RNR β-subunit: one was a full-length, functional β-polypeptide and the other was a truncated, non-functional β’-polypeptide27. In a model RNR structure of α2β2, there could be two possible homodimeric β-subunits (ββ and β’β’) and one heterodimeric β-subunit (ββ’). The heterodimeric β-subunit was found to conform to half-site reactivity, which might be involved in the regulation of enzyme activity. In this regard, we speculate that the non-functional short form nrdBS could be used at the transcriptional level to generate a heterodimer as part of RNR regulation during CLas proliferation.

While in silico genome sequence analyses of RNR genes provide information only for understanding CLas biology, the high copy number and conserved nature of nrdB were also explored for CLas detection. The use of primer set HLBas/HLBr along with a hybridization probe (TaqMan PCR) has been regarded as a standard protocol for CLas detection. However, problems arise when high Ct values, e.g. Ct = 30 or higher, are encountered. This situation is common when testing citrus trees for the presence of CLas, especially with symptomless samples or samples showing atypical symptoms. The RNRf/RNRr PCR detection system provides a remedy. First, like HLBas/HLBr, RNRf/RNRr is based on a highly conserved gene. This ensures reliable CLas detection, in contrast to the prophage-based primer set LJ900f/LJ900r (Fig. 5). In fact, because of its universal presence, the RNR gene has been recommended as a key target for phylogenetic research on viruses that lack ribosomal RNA genes28. Second, the RNRf/RNRr locus has five copies, more than the three copies of the 16S rRNA gene. This means that more initial targets are available for PCR, leading to increased detection sensitivity. As demonstrated in Fig. 5, RNRf/RNRr PCR is at least three times more sensitive than HLBas/HLBr PCR in both SYBR Green and TaqMan formats. In this study, the robustness of the RNRf/RNRr qPCR assays was also confirmed on three different real-time PCR systems, although the greater sensitivity of the RNR primers was shown on the ABI and MJ systems rather than on the CFX system (Supplementary Table S2).

In summary, through genome sequence analyses, we discovered that CLas has five copies of the RNR β-subunit gene nrdB. CLas nrdB has both long and short forms, which could play a role in RNR regulation during bacterial proliferation. Phylogenetically, all CLas nrdB genes clustered together, forming a stable evolutionary lineage like that of the 16S rRNA gene. The high copy number and conserved nature of nrdB provide a foundation for sensitive and reliable detection of CLas. Primer set RNRf/RNRr has been developed and tested. The detection system is recommended for resolving CLas detection issues when primer set HLBas/HLBr yields borderline Ct values that are difficult to interpret.

Materials and Methods

Bacterial genome sequences and strains

The whole genome sequence of CLas strain A4, which originated from an HLB-affected citrus tree in Guangdong, China (CP010804)3, was used for DNA/gene copy evaluation. All bacterial genome sequences were downloaded from the GenBank database (v211.0) hosted by the National Center for Biotechnology Information (NCBI) (Table 2). Field strains were collected for population study. A CLas strain was represented by DNA extracted from an infected leaf sample of citrus (Citrus sp.) or periwinkle (Catharanthus roseus), or from an individual ACP. Samples were from seven provinces (Guangdong, Guangxi, Yunnan, Fujian, Jiangxi, Zhejiang, and Hainan) in China and three states (Florida, Texas, and California) in USA (Table 4). DNA was extracted with the E.Z.N.A. HP Plant DNA Kit (OMEGA Bio-Tek Co., Guangdong, China) or DNeasy Plant Kits (Qiagen Inc., Valencia, CA) for plant samples, and with the TIANamp Genomic DNA Kit (Tiangen Biotech Co., Beijing, China) or the DNeasy Blood & Tissue Kit (Qiagen Inc., Valencia, CA) for individual psyllids.

Identification of nrdB and in silico characterization

The CLas strain A4 genome sequence was compared with itself using the BLASTn program on the NCBI web service, with the word size set at 128 bp. The result was visualized with the Dot-Matrix option. DNA sequence regions with the highest number of repeats were retrieved. The genetic nature of the DNA sequences was characterized according to the genome annotation, assisted by BLAST searches against the NCBI conserved domain database (CDD, v3.14). Since the identified DNA sequences were longer than the annotated genes, only the gene sequences were downloaded and used for analyses. Protein structure analyses were initially carried out with the Phyre server (http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index) using a profile-profile alignment algorithm29. Final 3-D structures were rendered using the Pymol Molecular Graphics System (v1.7.6, Schrödinger LLC).
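The idea behind the self-comparison can be illustrated with a much simpler stand-in. The sketch below is not the BLASTn dot-matrix procedure used in this study: it only indexes exact k-mers on the forward strand and reports those that occur more than once, whereas BLASTn also tolerates mismatches and searches both strands. The function name `find_repeated_kmers` and the toy sequence are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def find_repeated_kmers(sequence, k):
    """Return {kmer: [start positions]} for every exact k-mer seen more than once.

    Simplified illustration only: forward strand, exact matches, 0-based positions.
    """
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


# Hypothetical toy "genome": one 8-base motif embedded three times.
motif = "ACGTTGCA"
toy_genome = "TT" + motif + "GGG" + motif + "CCC" + motif + "TT"

repeats = find_repeated_kmers(toy_genome, k=len(motif))
for kmer, starts in repeats.items():
    print(kmer, "occurs", len(starts), "times at positions", starts)
```

On a real genome, a large k (the study used a BLASTn word size of 128 bp) keeps short, chance repeats out of the result.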

For phylogenetic studies, all published CLas genomes and selected bacterial species representing major bacterial groups were used (Table 2). DNA and amino acid sequences of nrdB were retrieved according to the genome annotation or from the ribonucleotide reductase database (v0.901)9. The total number of nrdB genes in each genome was counted directly from the genome annotation and further confirmed by similarity searches of the bacterial genome with the corresponding nrdB sequence. DNA sequences of 16S rRNA genes were downloaded from the NCBI GenBank nucleotide database (GenBank version 211.0). Phylogenetic trees were constructed using the Neighbor-joining method with MEGA 6.030.
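As a rough illustration of confirming a gene copy number by searching the genome with the gene sequence, the sketch below counts exact occurrences of a query on both strands. It is a simplification under stated assumptions: the study used similarity searching, which tolerates SNPs such as those found among the nrdB copies, while this sketch requires exact matches. The names `reverse_complement` and `count_copies` and the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_copies(genome, query):
    """Count exact copies of query on the forward and reverse strands."""
    forward = count_occurrences(genome, query)
    rc_query = reverse_complement(query)
    # A palindromic query would otherwise be counted twice.
    reverse = 0 if rc_query == query else count_occurrences(genome, rc_query)
    return forward + reverse


# Hypothetical toy data: two forward copies and one reverse-strand copy.
query = "ATGGCC"
toy_genome = "AA" + query + "TTT" + reverse_complement(query) + "CC" + query
print(count_copies(toy_genome, query))
```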

Primer/probe designs and PCR experiments

CLas nrdB sequences were aligned through the Clustal Omega software31. Common regions across all nrdB sequences were identified and used to design PCR primers and TaqMan probe sequences with Primer 3 software32 (Table 3). Primer and probe sequence specificity were checked through BLASTn against the GenBank nucleotide database (Genbank version 211.0). The TaqMan probe was synthesized by labeling the 5′-terminal nucleotide with 6-carboxy-fluorescein (FAM) reporter dye and the 3′-terminal nucleotide with Black Hole Quencher (BHQ)-1 (Table 3) through a commercial source. Primers of HLBas/HLBr and HLBp and LJ900f/LJ900r were synthesized according to the original publication12,15.

Both SYBR Green and TaqMan real-time PCR formats were used in this study. The SYBR Green real-time PCR assays were performed on three different real-time PCR systems. In the USA, the MJ Research DNA Engine Opticon 2 system (MJ; MJ Research Inc) and the Applied Biosystems StepOnePlus™ Real-Time PCR System (ABI; Applied Biosystems, Foster City, CA, US) were used. In China, the CFX Connect Real-Time System (Bio-Rad, Hercules, CA, USA) was used. The TaqMan real-time PCR assays were performed only on the Applied Biosystems StepOnePlus™ Real-Time PCR System.

Real-time PCR procedures essentially followed those of Li et al.12. For SYBR Green real-time PCR, the reaction mixture contained 10 μl of iQ™ SYBR® Green Supermix (Bio-Rad) or Fast SYBR® Green Master Mix (Applied Biosystems) or Bestar® qPCR Master Mix (DBI® Bioscience), 1 μl of DNA template (~25 ng), and 0.5 μl each of forward and reverse primer (10 μM) in a final volume of 20 μl. The cycling program was: 95 °C for 3 min (MJ and CFX) or 95 °C for 20 s (ABI), followed by 40 cycles at 95 °C for 10 s (MJ) or 95 °C for 3 s (ABI) or 95 °C for 10 s (CFX, Bio-Rad) and 60 °C for 30 s (MJ and CFX) or 60 °C for 3 s (ABI). The fluorescence signal was captured at the end of each 60 °C step, followed by a melting point analysis.

For TaqMan® real-time PCR, the reaction mixture contained 10 μl of TaqMan® Fast Universal PCR Master Mix (2X) (Applied Biosystems), 1 μl of DNA template (~25 ng), 0.2 μl of TaqMan® probe (5 μM), 0.4 μl of each forward and reverse primer (10 μM) in a final volume of 20 μl with the following procedure: 50 °C for 2 min, then 95 °C for 20 s, followed by 40 cycles at 95 °C for 3 s and 60 °C for 30 s. The fluorescence signal was captured at the end of each 60 °C step. The data were analyzed using Opticon Monitor™ software (MJ Research), StepOnePlus™ Software v2.3 (Applied Biosystems) and Bio-Rad CFX Manager 2.1 software with automated baseline settings and a manually set threshold at 0.1. Amplicons were quantified using standard curves established based on ten-fold serial dilutions of the CLas-infected citrus plant total DNA in triplicate.
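The reaction recipes above can be checked with a short calculation. The sketch below computes the final primer and probe concentrations from the stated stock concentrations and volumes, and the volume left over to reach 20 μl. What fills that remaining volume (for example, nuclease-free water) is an assumption, since the text does not state it; the function names are hypothetical.

```python
TOTAL_VOLUME_UL = 20.0  # final reaction volume stated for both formats


def final_concentration_um(stock_um, volume_ul, total_ul=TOTAL_VOLUME_UL):
    """Final concentration (μM) after diluting a stock into the reaction."""
    return stock_um * volume_ul / total_ul


def remaining_volume_ul(component_volumes_ul, total_ul=TOTAL_VOLUME_UL):
    """Volume (μl) still needed to reach the final volume (assumed to be water)."""
    return total_ul - sum(component_volumes_ul)


# SYBR Green: 10 μl master mix, 1 μl template, 0.5 μl of each 10 μM primer.
sybr_primer_um = final_concentration_um(10.0, 0.5)
sybr_fill_ul = remaining_volume_ul([10.0, 1.0, 0.5, 0.5])

# TaqMan: 10 μl master mix, 1 μl template, 0.2 μl of 5 μM probe,
# 0.4 μl of each 10 μM primer.
taqman_primer_um = final_concentration_um(10.0, 0.4)
taqman_probe_um = final_concentration_um(5.0, 0.2)
taqman_fill_ul = remaining_volume_ul([10.0, 1.0, 0.2, 0.4, 0.4])

print(f"SYBR Green: each primer {sybr_primer_um:.2f} μM, fill {sybr_fill_ul:.1f} μl")
print(f"TaqMan: each primer {taqman_primer_um:.2f} μM, "
      f"probe {taqman_probe_um:.2f} μM, fill {taqman_fill_ul:.1f} μl")
```

Under these stated volumes, each primer is at 0.25 μM in the SYBR Green reaction and 0.20 μM in the TaqMan reaction, and the probe is at 0.05 μM.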

For evaluation of differences among primer sets RNRf/RNRr, HLBas/HLBr, and LJ900f/LJ900r, 34 CLas samples from China and 10 CLas samples from USA were used (Table 3). The SYBR Green real-time PCR format was used for primer set evaluations. Since HLBas/HLBr-HLBp (TaqMan real-time PCR format) is widely used, RNRf/RNRr-RNRp was also tested in this format. To substantiate the evaluation results, a total of 262 CLas samples collected from China and USA (Table 4) were tested in the SYBR Green format.

Statistical analysis

PCR results (Ct values) among different primer sets were evaluated by the independent-sample t-test. All tests were performed using the SPSS Statistics package (v19.0, IBM, Armonk, New York, U.S.). The sensitivity increase (R) between RNRf/RNRr and HLBas/HLBr was calculated through the ΔCt method18, i.e. $R = 2^{-\Delta Ct}$, where $\Delta Ct = Ct_{\text{RNRf/RNRr}} - Ct_{\text{HLBas/HLBr}}$.
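The ΔCt calculation can be worked through with the mean ΔCt values reported in the Results (−1.68 for SYBR Green PCR and −1.77 for TaqMan PCR). The sketch below is a minimal illustration; the base of 2 in the formula corresponds to one doubling of product per cycle, and the function name `sensitivity_fold` is hypothetical.

```python
def sensitivity_fold(delta_ct):
    """Sensitivity increase R = 2^(-ΔCt).

    ΔCt = Ct(RNRf/RNRr) - Ct(HLBas/HLBr); a negative ΔCt means RNRf/RNRr
    reached the threshold earlier and therefore gives R > 1.
    """
    return 2.0 ** (-delta_ct)


# Mean ΔCt values reported in the Results section.
reported_delta_ct = {"SYBR Green": -1.68, "TaqMan": -1.77}

for pcr_format, delta_ct in reported_delta_ct.items():
    fold = sensitivity_fold(delta_ct)
    print(f"{pcr_format}: ΔCt = {delta_ct:+.2f} -> R = {fold:.2f}-fold")
```

Both values give R slightly above 3, consistent with the >3-fold increase stated in the Results.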

Additional Information

How to cite this article: Zheng, Z. et al. Unusual Five Copies and Dual Forms of nrdB in “Candidatus Liberibacter asiaticus”: Biological Implications and PCR Detection Application. Sci. Rep. 6, 39020; doi: 10.1038/srep39020 (2016).
