Detection of avian influenza virus: a comparative study of the in silico and in vitro performances of current RT-qPCR assays

Avian influenza viruses (AIV) are negative-sense RNA viruses that pose a major threat to the poultry industry worldwide and have the potential to spread to mammals, including humans; hence, accurate and rapid AIV diagnosis is essential. To date, AIV detection relies on molecular methods, mainly RT-qPCR targeting the AIV M gene segment. The evolution of AIV is a relevant issue in diagnostic RT-qPCR because mispriming and/or probe-binding failures can produce false negative results. Consequently, RT-qPCR assays for AIV detection should be periodically re-assessed both in silico and in vitro. To this end, a specific workflow was developed to evaluate in silico the complementarity of primers and probes of four published RT-qPCR protocols to their target regions. The four assays and one commercially available kit for AIV detection were evaluated both for their analytical sensitivity, using eight different viral dilution panels, and for their diagnostic performances against clinical specimens of known infectious status. Differences were observed among the tests under evaluation, both in terms of analytical sensitivity and of diagnostic performances. This finding confirms the importance of continuously monitoring the complementarity of primers and probes to their binding regions.

Influenza A viruses (IAV) are enveloped negative-strand RNA viruses belonging to the family Orthomyxoviridae. IAVs are important veterinary and human health pathogens, infecting many different avian and mammalian species worldwide 1 . Viruses of the Influenza virus A genus cause avian influenza (AI), a disease of great importance for animal health, both for the high mortality rate caused by some viral strains in domestic and wild birds and for public health implications due to their zoonotic potential. Based on the antigenic differences between the two surface glycoproteins hemagglutinin (HA) and neuraminidase (NA), AI viruses can be classified into 16 HA subtypes (H1-H16) and 9 NA subtypes (N1-N9) 2 . Remarkably, all have been isolated from avian species in most possible combinations. Influenza A viruses infecting poultry can be grouped based on their pathogenicity: highly pathogenic avian influenza (HPAI) viruses, which can cause flock mortality as high as 100%, and low pathogenic avian influenza (LPAI) viruses, which usually cause a milder or inapparent disease 2 . To date, only viruses typed as H5 and H7 have proved to be highly pathogenic in naturally infected poultry 1 .
Several diagnostic methodologies are currently available for the detection of AI infection, with virus isolation (VI) in eggs or in cell cultures universally recognized as the reference diagnostic standard. However, the application of such methods is limited mainly because they cannot be scaled up quickly in response to a sudden increase in demand, are not cost-effective, require high biosafety standards and often involve a long processing time. For these reasons, in the recent past there has been a significant increase in the development and application of testing procedures for the detection of AI viral RNA. With the advent of molecular biology, several RT-PCR and RT-qPCR protocols have been developed for AIV detection and typing, proving to be rapid, specific and sensitive [3][4][5][6] . The approach to AIV diagnosis using molecular methods adopted in most laboratories has been based on the initial generic detection of AIV in clinical specimens, primarily by targeting the matrix (M) gene segment, followed by specific RT-qPCR tests for H5 and H7 subtype viruses 1 . The rationale for targeting the M gene segment is that it contains regions sufficiently conserved among influenza A viruses of various species, including avian ones, and hence suitable for primer and probe selection. Although several RT-PCR and RT-qPCR assays for generic AIV detection have been routinely and successfully used worldwide, some caveats apply. AIV, with its single-stranded negative-sense RNA genome arranged into eight genomic segments, shows an intrinsic genetic instability 2 . This is mainly due to the error-prone nature of the virus replication machinery and to re-assortment during infection of a single host cell with two or more distinct AIV types, resulting in considerable genetic heterogeneity and evolutionary diversity 2 .
Therefore, any molecular biology method should be periodically re-evaluated, on the ground that the sequence complementarity within the primers and probes binding regions might have changed, affecting the performances of the methods 7 .
In the present study, we compared the analytical sensitivity and the diagnostic performances of four published protocols for AIV M gene segment detection 6,8-10 and a commercially available diagnostic kit.
The four published protocols were chosen on the grounds that they are routinely used in international and national avian influenza reference laboratories, while the commercial kit was included because it was the first PCR-based commercial kit for the detection of avian influenza licensed by the U.S. Department of Agriculture (USDA); furthermore, the kit was also evaluated by the former European Union AI-ND Reference Laboratory (Animal and Plant Health Agency, Weybridge, UK), showing promising results 11 .
Prior to the in vitro evaluation, we performed a comprehensive in silico analysis to assess the level of identity between primers and probes and their target regions on a dataset based on AIV M gene segment sequences deposited from 2014 onwards.

Materials and Methods
Sequences download and multiple sequence alignment. AIV M gene segment sequences deposited from January 2014 onwards were downloaded from the Global Initiative on Sharing All Influenza Data (GISAID) webserver (https://platform.gisaid.org/). A total of 4088 sequences were aligned against primers and probes of the four published assays 6,8-10 using MAFFT version 7 (https://mafft.cbrc.jp/alignment/software/). A single multiple sequence alignment (MSA) analysis was performed for the assays developed by Spackman et al. (2002), Heine et al. (2015) and Hoffmann et al. (2016), the latter two being improved versions of the first via the introduction of degenerated bases in the primers' sequences (Table 1). An independent MSA analysis was performed for Nagy's protocol, since its primers and probe target a different gene region. Of note, Nagy's protocol used probe number 104 (UPL104), from the 165 Universal ProbeLibrary (Merck KGaA, Darmstadt, Germany), containing locked nucleic acids (LNA) to increase probe binding; the probe sequence is publicly available 7 , although the manufacturer did not disclose the LNA positions, hence it was not possible to evaluate the genetic variability with respect to each LNA. No MSA analysis was performed for the commercial kit, as the manufacturer did not disclose any primer or probe sequences.
In silico evaluation of nucleotide variability and genetic diversity. Using BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html), the multiple sequence alignments were trimmed to include solely the sequence amplified by the assays, and the nucleotide variability within the amplicons was assessed using a web-based Shannon Entropy calculator (https://www.hiv.lanl.gov/content/sequence/ENTROPY/entropy_one.html). Sequences were further trimmed and concatenated, so that the resulting datasets contained only the nucleotides complementary to primers and probes. The datasets were subsequently processed with the cd-hit-est tool of the CD-HIT Suite (http://weizhong-lab.ucsd.edu/cdhit_suite/cgi-bin/index.cgi?cmd=cd-hit-est) 12 to cluster sequences that shared 100% identity, such that each cluster represents a unique primers-probe motif. A prototype sequence within each cluster was selected, and each cluster was then expanded to the original number of sequences in order to evaluate its relevance. Only clusters containing more than 30 sequences, i.e. with an incidence >0.75%, were considered significant for assessing the inclusivity of the assays, and are hereafter referred to as major clusters. Figure 1 depicts the workflow used for the evaluation of nucleotide variability and genetic diversity.
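The two computational steps of this workflow, per-position Shannon entropy and grouping of identical primers-probe motifs, can be illustrated with a minimal Python sketch. It is not the web tools used in the study: the function names, the use of the natural logarithm and the toy sequences are assumptions made for illustration only, and exact string matching stands in for cd-hit-est clustering at 100% identity.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (natural log, an assumption) of one alignment column; gaps are ignored."""
    counts = Counter(base for base in column.upper() if base != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log(n / total) for n in counts.values())


def alignment_entropy(aligned):
    """Per-position entropy for a list of equal-length aligned sequences."""
    return [column_entropy("".join(col)) for col in zip(*aligned)]


def major_clusters(motifs, min_size=31):
    """Group identical motifs and keep clusters with more than 30 members."""
    clusters = Counter(m.upper() for m in motifs)
    return {motif: n for motif, n in clusters.items() if n >= min_size}


# Toy data (hypothetical sequences, not taken from the study).
toy_alignment = ["ACGT", "ACGT", "ACAT", "ACAT"]
print(alignment_entropy(toy_alignment))

toy_motifs = ["ACGT"] * 40 + ["ACAT"] * 5
print(major_clusters(toy_motifs))
```

In this toy case only the third position is variable, and only the motif with 40 members passes the size filter; real data would be the trimmed and concatenated primer and probe binding regions.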
Virus isolates and clinical samples. Eight viruses were selected according to the results of the in silico evaluation and used to assess the analytical sensitivity of the five assays ( [...] titrated by inoculation into the allantoic cavity of 9-11-day-old specific pathogen free (SPF) embryonated chicken eggs (ECEs). ECEs were candled daily up to 16 days of age and ECEs with dead embryos were transferred to 4 °C for 16-24 h prior to harvesting of the allantoic fluid. Titres are expressed as the 50% embryonic infectious dose (EID50/100 µl), as determined according to Reed and Muench 13 . A total of 152 clinical samples of known infectious status were used for the comparison of the diagnostic performances of the five assays. More specifically, 79 were AIV positive field samples of European (n = 52), African (n = 22) and Asian (n = 5) origin sent to our laboratory for diagnostic purposes as national, European and OIE/FAO reference laboratory (pre-typed either by classical methods after virus isolation or by sequencing) and 73 were AIV negative samples. The latter represented isolates of other avian pathogens, both viruses and bacteria, as well as true negative samples obtained from SPF chickens. True negative samples obtained from SPF chickens represent specimens collected during prior animal experiments conducted in our institution; animals' manipulation [...]
Table 1. List of the assays tested in the present study. For each assay the sequences of primers and probe (5′→3′), the target regions and the amplicon size are reported. The degenerated bases introduced in the primers sequences are highlighted in bold.
Nucleic acids isolation. Total nucleic acids were extracted using the QIAsymphony DSP Virus/Pathogen Midi kit (Qiagen, Hilden, Germany), in combination with the automated system QIAsymphony SP (Qiagen, Hilden, Germany). Isolation of the nucleic acids was performed following the manufacturer's recommendations, and an internal process control (Intype IC-RNA, Qiagen, Hilden, Germany) was added to each sample.
RT-qPCR assays. Protocol 1 - Spackman et al., 2002. The assay, hereafter referred to as protocol 1 for the sake of clarity, was carried out using the OneStep RT-PCR kit (Qiagen, Hilden, Germany) in a final volume of 25 µl containing 0.3 µM of each primer, 0.1 µM of the probe, 2 µl of IC mix and 5 µl of nucleic acid. The following thermoprofile was used: an initial step at 50 °C for 30 min, 95 °C for 15 min, then 40 cycles of 94 °C for 45 sec and 60 °C for 45 sec.
RT-qPCRs were carried out on CFX 96 Real-Time PCR Detection Systems (Biorad, Munich, Germany). The reaction mix of protocols 1 and 4 was prepared following the recommendations of the former Avian Influenza Community Reference Laboratory (Animal and Plant Health Agency, UK) (https://science.vla.gov.uk/flu-lab-net/docs/pub-protocol-ai-vi493.pdf), while for the remaining assays the authors' or the manufacturer's recommendations were followed. The only deviation from the recommended protocols was the addition of 2 µl IC mix to each reaction.
Assays performances and limit of detection study. The analytical sensitivity of each RT-qPCR assay was assessed using 10-fold serial dilutions of eight titrated AI viruses selected as representative of the M gene segment clusters identified through the in silico evaluation (Table 2). Each dilution was prepared in triplicate and tested by each assay on the same day. The limit of detection (LoD) was defined as the highest dilution at which all replicates tested positive (cycle threshold < 36).
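The LoD rule stated above can be written down as a short Python sketch. The data layout (a dictionary keyed by the dilution exponent, with `None` for a replicate that did not amplify), the function name and the Ct values are hypothetical; only the Ct < 36 cut-off and the "all replicates positive" criterion come from the text.

```python
CT_CUTOFF = 36.0  # positivity cut-off stated in the text


def limit_of_detection(ct_by_dilution, cutoff=CT_CUTOFF):
    """Return the highest dilution exponent at which every replicate has Ct < cutoff.

    ct_by_dilution maps an exponent n (meaning a 10^-n dilution) to the list of
    replicate Ct values; None marks a replicate with no amplification.
    Returns None if no dilution satisfies the criterion.
    """
    fully_positive = [
        exponent
        for exponent, cts in ct_by_dilution.items()
        if cts and all(ct is not None and ct < cutoff for ct in cts)
    ]
    return max(fully_positive) if fully_positive else None


# Hypothetical triplicate Ct values for one dilution panel.
panel = {
    3: [27.1, 27.4, 27.0],
    4: [30.9, 31.2, 31.0],
    5: [34.5, 35.1, 34.8],
    6: [35.9, None, 37.2],
}
print(limit_of_detection(panel))
```

The sketch applies the definition literally and does not check that all lower dilutions were also positive; a stricter implementation might add that condition.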

Diagnostic sensitivity and specificity study. A total of 152 clinical samples of known infectious status were tested in duplicate by each of the five assays to assess their diagnostic performances. A sample was considered positive when both replicates produced a sigmoidal amplification curve (Ct < 36). Sample size, together with diagnostic sensitivity (DSe) and specificity (DSp) of all the assays under evaluation, were established as recommended in the OIE Terrestrial Manual 2018 - Principles and methods of validation of diagnostic assays for infectious diseases 14 .
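As an illustration of how DSe and DSp follow from the sample calls, the sketch below applies the "both replicates with Ct < 36" rule and then computes the two proportions. The function names, the input layout and the toy Ct values are assumptions; the check on the sigmoidal shape of the curve is not modelled here.

```python
def is_positive(ct_replicates, cutoff=36.0):
    """A sample is called positive only if every replicate has Ct below the cutoff."""
    return bool(ct_replicates) and all(
        ct is not None and ct < cutoff for ct in ct_replicates
    )


def diagnostic_performance(samples, cutoff=36.0):
    """Compute (DSe, DSp) in percent from (known_status, ct_replicates) pairs.

    known_status is True for a known AIV positive sample. A value of None is
    returned for a measure whose reference class is empty.
    """
    tp = fn = tn = fp = 0
    for known_positive, cts in samples:
        called = is_positive(cts, cutoff)
        if known_positive:
            tp += called
            fn += not called
        else:
            fp += called
            tn += not called
    dse = 100.0 * tp / (tp + fn) if (tp + fn) else None
    dsp = 100.0 * tn / (tn + fp) if (tn + fp) else None
    return dse, dsp


# Hypothetical duplicate Ct values; None means no amplification.
toy_samples = [
    (True, [25.0, 25.4]),
    (True, [35.2, None]),   # one replicate failed: called negative
    (False, [None, None]),
    (False, [None, None]),
]
print(diagnostic_performance(toy_samples))
```

With 79 known positive samples, a DSe of 89.87% would correspond to 71 detected samples (71/79), which is one way to cross-check reported percentages against the sample counts.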
Statistical analysis. A two-way analysis of variance (ANOVA) with Tukey's HSD post hoc test was performed to assess whether the analytical sensitivity differed significantly between the assays.

Results
Evaluation of nucleotide and genetic variability between and within primers and probes binding regions.
First, the sequences were trimmed to span the amplicons of each assay; entropy plots summarize the nucleotide variability at each position of the amplicons. The entropy plot of the amplicons of protocols 1, 2 and 3 (Fig. 2A) revealed high variability at three nucleotide positions within the sequence targeted by their reverse primers; the analysis showed two transitions at positions 80 and 87 of the amplicons, G to A and A to G respectively, while at position 93 a combination of all four bases with similar frequency was observed. The degenerated bases in the reverse primers of protocols 2 and 3 match the nucleotide diversity at positions 80 and 93, and 87 and 93, respectively. The binding regions of primers and probe of protocol 4 showed almost no variability (Fig. 2B).
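Whether a degenerated base "matches" the diversity at a position can be checked by expanding each IUPAC code into the set of bases it covers. The sketch below is a generic illustration: the function name is an assumption and the oligonucleotide and target strings are hypothetical, not the primers of the assays discussed here. Both strings are assumed to be given in the same orientation and with the same length.

```python
# Standard IUPAC nucleotide codes and the bases each one covers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(oligo, target):
    """Return 0-based positions where the target base is not covered by the oligo code.

    Ambiguous or gap characters in the target are counted as mismatches.
    """
    if len(oligo) != len(target):
        raise ValueError("oligo and target must have the same length")
    positions = []
    for i, (code, base) in enumerate(zip(oligo.upper(), target.upper())):
        base = "T" if base == "U" else base
        if base not in IUPAC[code]:
            positions.append(i)
    return positions


# Hypothetical 6-mer with two degenerated positions (R = A/G, N = any base).
print(mismatch_positions("ACRTNG", "ACGTAG"))  # both variants covered
print(mismatch_positions("ACRTNG", "ACCTAG"))  # C is not covered by R
```

Applied to each prototype motif from the clustering step, a count of uncovered positions per primer or probe gives a simple measure of complementarity.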
The alignments were further trimmed in order to consider only the regions complementary to primers and probes; the resulting datasets were clustered using the CD-HIT tool, with the aim of identifying unique primers-probe motifs within the AIV M gene segment genetic diversity. The datasets for protocols 1, 2 and 3 were characterized by eleven major clusters (>30 sequences) accounting for 3591 AIV M gene segment sequences, i.e. 87.8% of AIV genetic diversity within the primers and probes binding regions. The dataset obtained for protocol 4 contains only one major cluster, showing 100% identity within the primers and probes binding regions and accounting for 3918 AIV M gene segment sequences (95.8%).
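The percentages quoted above follow directly from the sequence counts and the dataset size of 4088 sequences. A small Python check, with a hypothetical helper name, makes the arithmetic explicit, including the incidence of the smallest cluster that can still qualify as major (31 sequences).

```python
TOTAL_SEQUENCES = 4088  # size of the GISAID-derived dataset reported in the Methods


def coverage_percent(n_sequences, total=TOTAL_SEQUENCES):
    """Share of the dataset, in percent, represented by n_sequences."""
    return 100.0 * n_sequences / total


print(round(coverage_percent(3591), 1))  # eleven major clusters, protocols 1-3
print(round(coverage_percent(3918), 1))  # single major cluster, protocol 4
print(round(coverage_percent(31), 2))    # smallest cluster with more than 30 sequences
```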
Based on these datasets, it was possible to identify the eight AIV strains used in the limit of detection study, which were representative of the genetic diversity of 62% of the AIVs circulating worldwide since 2014 within the target region of the primers and probes of the four published protocols.

Limit of detection of the RT-qPCR assays.
A dilution panel for each virus was tested in triplicate by the five assays. The LoD was defined as the highest dilution at which all replicates tested positive (Ct < 36). The LoD of the assays varied among the dilution panels; however, protocols 2 and 3 and the commercial kit proved to be consistently the most sensitive assays (Table 3). Protocol 3 and the commercial kit showed the best analytical sensitivity in seven out of eight dilution panels, with the exception of dilution panels D and G, while protocol 2 was second best only when tested against dilution panels G and M. Protocol 4 matched the best analytical sensitivity in four out of eight dilution panels. The sensitivity of protocol 1 was significantly the lowest across all the dilution panels, with the exception of dilution panel B (Table 3).
Statistical analysis of Ct values at the same viral titre, at optimal baseline and threshold settings for all the assays, showed a significant difference (p < 0.05) between the Ct values of protocol 1 and those of the remaining assays across all the dilution panels, with the exception of dilution panel D (Table 4).

Diagnostic sensitivity and specificity of the RT-qPCR assays.
To assess and compare the diagnostic performances of the five assays, 152 samples of known infectious status were tested in parallel on the same RNA extracts. Details of the diagnostic performances of the five assays are reported in Table 5. The best performances for AIV M gene segment detection from clinical specimens were observed for the commercial kit (DSe = 100%), followed by protocol 3 (DSe = 98.78%), protocol 2 (DSe = 97.47%) and protocol 4 (DSe = 92.41%), with protocol 1 performing worst (DSe = 89.87%). The diagnostic sensitivity of the commercial kit differed (p < 0.05) from those of the remaining assays (Table 4), with the exception of protocol 3, which confirms that these two assays yielded the best performances for AIV detection from clinical specimens. Similarly, the diagnostic sensitivity of protocol 1 was statistically different (p < 0.05) from that of the other assays (Table 4), with the exception of protocol 4, a finding that confirms the lower diagnostic sensitivity of the assay.
Diagnostic specificity (DSp) ranged from 97.2% for protocol 4 to 100% for the remaining assays; no statistical difference (p > 0.05) in terms of diagnostic specificity was observed between the assays (Table 4).
Kappa (κ) values were calculated as a measure of overall agreement among the assays, which proved to be almost perfect (κ > 0.80) (Table 4).
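For readers unfamiliar with the statistic, the sketch below computes Cohen's kappa for the positive/negative calls of two assays on the same samples. The text does not state which kappa variant or software was used, so the pairwise Cohen formulation, the function name and the toy calls are assumptions.

```python
def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two equally long lists of boolean calls (True = positive)."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("both assays need calls for the same, non-empty sample set")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    pos_a = sum(calls_a) / n
    pos_b = sum(calls_b) / n
    # Agreement expected by chance from the marginal positive/negative rates.
    expected = pos_a * pos_b + (1 - pos_a) * (1 - pos_b)
    if expected == 1.0:
        return 1.0  # both assays gave the same single call for every sample
    return (observed - expected) / (1 - expected)


# Hypothetical calls of two assays on four samples.
assay_x = [True, True, False, False]
assay_y = [True, False, False, False]
print(cohen_kappa(assay_x, assay_y))
```

Values above 0.80 are conventionally read as almost perfect agreement, which is the interpretation used in the text.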

Discussion
Avian influenza viruses exhibit a significant degree of genetic variability; this might lead to diagnostic failures of molecular tests when applied to mutated or newly emerging viruses, meaning that constant monitoring of the efficacy of the available molecular protocols is of the utmost importance, even when they are directed towards parts of the viral genome conventionally considered more stable. To our knowledge, this is the first large-scale in silico and in vitro evaluation of the RT-qPCR assays for the detection of AIV from different avian specimens. With the purpose of obtaining useful data on the performances of the assays in use in national and international reference laboratories, we compared the analytical sensitivity and the diagnostic performances of four published protocols 6,8-10 and one licensed commercial kit (VetMAX-Gold AIV Detection Kit, Thermo Fisher Scientific).
When tested against a panel of 152 clinical samples of known infectious status, the five assays yielded comparable results (κ > 0.80); however, despite the overall satisfactory level of agreement, discrepancies were observed that deserve further discussion. Protocols 1 and 4 showed lower diagnostic sensitivity (DSe = 89.87% and DSe = 92.41%, respectively), giving false negative results when analysing samples which produced a positive signal at late amplification cycles (Ct > 30) when tested with the other protocols. The results observed for protocol 1 are consistent with the low analytical sensitivity detected throughout all the tested dilution panels. It is important to note that this protocol was developed in the early 2000s; hence, primers and probe were designed in accordance with the M gene segment sequences of the viruses circulating at the time. Not surprisingly, the in silico analyses performed on AIVs circulating from 2014 onwards showed a certain lack of identity within their binding regions. The lower complementarity is likely responsible for the limited performances of this assay in comparison to the others, which benefit from an up-to-date design of primers and probes; in support of this, degenerated bases matching the high variability observed within their target region were introduced in the reverse primers of protocols 2 and 3. As a whole, the lower analytical sensitivity and the poor performances observed in the diagnostic setting suggest that protocol 1, due to the continuous evolution of the virus, might no longer represent the ideal assay for AIV detection.
Remarkably, a lack of identity of the reverse primer of protocol 1 was previously observed with respect to the pandemic (H1N1) 2009 influenza virus and swine influenza A viruses (SIVs), leading to the development of an improved version of protocol 1 via the employment of a reverse primer specifically designed to match the genetic diversity of the pandemic and swine influenza viruses 16 . The employment of this new reverse primer might improve the diagnostic performance of protocol 1 towards AIV detection; however, in silico analysis using the dataset and the workflow presented in this study shows that these two reverse primers, even when used in combination, match only partially the genetic diversity of recent AIVs (Supplementary Table 1). On the contrary, in silico analysis confirmed the superior complementarity between primers and probe of protocol 4 and their target regions, suggesting that its lower DSe must be ascribed to other factors. Protocol 4 amplifies a gene portion almost double in size in comparison to the other assays, possibly negatively affecting its efficiency and causing the low DSe observed. This hypothesis seems to be corroborated by the data gained during the limit of detection study, as protocol 4 showed an overall lower sensitivity in comparison to the best performing assays, in accordance with a previous observation 11 . Another factor influencing the sensitivity of RT-qPCR assays is RNA integrity; considering the large gene portion amplified by protocol 4, we speculate that the diagnostic performances of this assay might be more affected by low quality RNA than the other protocols.
Table 5. Detection results of RT-qPCR assays using 152 samples of known infectious status. Samples were grouped according to origin, HA subtype and clade (if applicable). Diagnostic sensitivity and diagnostic specificity are reported.
The assays of recent development, protocols 2, 3 and the commercial kit, showed the best analytical and diagnostic performances, underlining the importance of monitoring AIV M gene segment evolution and the need to update primers and probes sequences in relation to their binding regions. This need has already been proved for protocols aiming to type AIV 5,8,17 , i.e. protocols targeting the HA gene segment, which is known to evolve rapidly. By comparing the performance of five different RT-qPCR assays, our study clearly demonstrates that the same applies to the molecular assays for AIV detection, i.e. to protocols targeting the conserved M gene segment 18,19 . To this end, the in silico evaluation workflow proposed here represents a useful and user-friendly tool for the assessment of the complementarity of primers and probes to their binding regions. Unfortunately, the unavailability of primers and probes sequences of the commercial kit implies having to rely on the manufacturer for such monitoring activity. To some extent, the same applies to protocol 4, as the manufacturer does not disclose the positions of the locked nucleic acids in the UPL104 probe, limiting the monitoring activity. One further concern about the UPL104 probe used in protocol 4 is related to its stability, as some level of degradation leading to false positive results was observed while performing this study (data not shown). Implementation of good laboratory practice (e.g. aliquoting UPL104) should be sufficient to avoid UPL104 degradation; however, the authors recommend extra care in the storage and use of the probe.
The development of assays based on multiple targets is likely to reduce the risk of yielding false negative results; the in silico workflow described in this study, if applied to the whole M gene segment and/or other AIV genes, may lead to the identification of other conserved regions suitable for the implementation of such assays for generic AIV detection.
To conclude, our study confirms the importance of continuously monitoring the performances of the assays for AIV detection, both in silico and in vitro, as the emergence of new strains containing mutations within primers and probes binding regions might strongly affect the positive outcome of the test.

Data availability
The datasets generated and analysed during the current study are available from the corresponding authors on reasonable request.