Introduction

Carnivore protoparvovirus 1 is a species within the genus Protoparvovirus in the family Parvoviridae that includes several viruses of remarkable epidemiological relevance for wild and domestic carnivores1. Canine parvovirus (CPV) is one of the most relevant and well-studied members of this group because of its clinical relevance for companion animals. As other carnivore parvoviruses, it is characterized by a relatively simple, single stranded, negative linear DNA genome of about 5.12 kb, featured by two main Open Reading Frames (ORFs) translated in 4 proteins through alternative splicing2,3. The first ORF encodes non-structural proteins NS1 and NS2, involved in viral replication, while the second ORF encodes the viral capsid constituents VP1 and VP2. An additional protein (VP3) originates after VP2 cleavage mediated by host proteases4.

After its origin, likely due to the new host adaptation of a feline parvovirus-like of wild carnivores5, CPV rapidly spread worldwide in the dog population6,7, causing severe disease outbreaks and high mortality8. Since CPV totally relies on the host cell machinery, viral replication requires actively proliferating cells. This feature largely explains the viral cell tropism and pathogenesis. In fact, CPV recognizes intestinal crypts and lymphoid organs as main tissue targets9. Therefore, the most common clinical picture is characterized by vomit and diarrhea in association with anorexia, depression and fever. Fluid and protein losses through the gastrointestinal tract can result in severe dehydration and hypovolemic shock. Gastrointestinal signs are accompanied by severe immunosuppression characterized by lymphopenia and eventually panleukopenia, which, together with the breakdown of the intestinal barrier, facilitate the passage of bacteria and/or endotoxins in the blood stream10. Septicemia, endotoxemia, systemic inflammation, coagulation disorders and septic shock are therefore commonly present in CPV infected animals and contribute significantly to the disease severity and lethality11,12,13.

Additionally, CPV infects and replicates in myocardiocytes of pups from immunologically naive bitches, causing a typically fatal myocarditis. However, the high population immunity due to extensive vaccination protocols, has substantially decreased the incidence of this sign14.

As other ssDNA viruses, CPV displays a high evolutionary rate, which has led to a remarkable variability at both nucleotide and amino acid level15. In fact, the original CPV (CPV-2) was rapidly replaced all around the world by new antigenic variants like CPV-2a and CPV-2b that emerged, respectively, in 1979 and 198416. More recently, a “major” antigenic variant, CPV-2c, has been identified and proven widely distributed5,17,18. CPV heterogeneity has led to the speculation about the presence of differential immunological and virulence features among strains. However, up to date and except for anecdotal reports, no study has consistently demonstrated an association between antigenic variants and pathogenicity/virulence5,19,20,21.

The current classification of CPV at sub-species level is mainly based on the presence of certain amino acid residues in specific VP2 positions5,22. However, this approach has led to the proposal of a multitude of new variants apart from the aforementioned major ones, complicating CPV nomenclature and the understanding of its epidemiology. Additionally, more extensive analyses have demonstrated the limited consistency of associating certain amino acid positions with virulence23, as incongruities between these phenotypic markers and phylogeny have frequently been reported18. Consequently, the focus on specific amino acid positions could not mirror the actual relationships among strains, hindering the identification of an association between particular viral clades and pathological/virulence features.

The present study attempts to overcome this issue by using different statistical approaches to evaluate the association between CPV strain phylogeny - reconstructed using the whole VP2 gene - and an extensive database of anamnestic, clinical and hemato-biochemical parameters.

Results

Animal and viral population features

Thirty-four subjects, 14 females and 20 males, were enrolled in the study provided the availability of 1) a complete clinical record, 2) a complete, high quality VP2 sequence. Half of the animals were properly vaccinated against CPV, while the remaining did not complete the standard vaccination protocol or were not vaccinated at all. Among the vaccinated dogs, 11.7% (2 out of 17) died compared to 29.4% (5 out 17) of the unvaccinated animals. However, no statistically significant association was found between vaccination status and death (p = 0.398). Similarly, no association was detected between animal gender and death occurrence (p = 1). A more detailed description of the included subject features is provided in Supplementary Table 1.

The obtained VP2 sequences showed a nucleotide average raw genetic distance of 0.5% (interval 0–1.31%) and amino acid average raw distance of 0.5% (interval 0–2.1%). Sampled strains were representative of all major antigenic variants (CPV-2 (n = 1), CPV-2a (n = 23), CPV-2b (n = 7) and CPV-2c (n = 3)) based on VP2 amino acid residues 87 and 426 (Supplementary Fig 1). The amino acid frequency for each VP2 position is provided in Supplementary Fig. 2.

Phylogenesis-trait association

The first principal component axis of pPCA explained 13.16% of the underlying variability. Although it could seem a modest value, this first component significantly outperformed the other ones in explaining the “global” variability along the tree and it was consequently selected for the following analyses.

PC1 loading evaluation showed the higher contribution to the major global axis of variables ascribable to: immune system functionality (e.g. neutrophils, lymphocytes, monocytes, etc.), coagulation (e.g. D-dimers and fibrinogen), tissue and hepatic enzymes (e.g. serum activity of AST, ALT, LDH), markers of acute phase response (APP) (e.g. α and β globulin, haptoglobin, albumin/globulin ratio (A/G)) and more generally, markers of metabolic derangement (e.g. plasma lactate, cortisol/creatine, cholesterol and triglyceride levels) (Fig. 1).

Figure 1
figure 1

Loadings of the first global component. For interpretation easiness, only the labels of the variables with loadings in the higher 75 percentile (blue dashed lines) are displayed.

Abouheif’s test performed on variables mostly contributing to the first global axis demonstrated a significant phylogenetic signal for: α2 globulin, β2 globulin, monocytes, neutrophils, A/G, serum AST, serum haptoglobin, serum triglycerides and WBC (Table 1 and Fig. 2). However, after controlling for False Discovery Rate (FDR) only neutrophils and WBC remained statistically significant. No statistical association was observed between tree topology and categorical variables like animal gender, vaccination status and disease outcome.

Table 1 Summary of statistically associated variables with phylogenetic structure.
Figure 2
figure 2

Barplot reporting the standardized values of the hemato-biochemical parameters proven correlated with the viral topology (left insert). Bars of the deceased dogs are colored in red while not properly vaccinated animals are highlighted with red labels. Labels of parameters statistically significant after controlling for False Discovery Rate (FDR) have been marked in red.

No statistical association was observed between tree topology and categorical variables like animal gender, vaccination status and disease outcome. The locus by locus AMOVA results indicate that the nucleotide frequencies in positions 303, 375, 537, 1509 and 1701 differ significantly between the groups defined for each variable. However, all the identified nucleotide variants code for synonymous mutations and no amino acid differences were observed between groups for the considered variables.

Discussion

The clinical development of infection is an extremely complex phenomenon involving an intricate network of interactions among different concomitant conditions, which are traditionally classified as environmental, host-related and pathogen-related24. Among those, the pathogen virulence clearly plays a major role in determining the disease outcome. For several viral species, remarkably different pathological effects and clinical signs have been associated to particular isolates. Differently from more complex organisms, phenotypic features, including virulence, are almost completely genetically determined in microbial pathogens, especially in viruses. Therefore, forms of inheritance from common ancestors, and thus phylogenetic clustering, can be expected. Such association has been speculated for CPV too, and a different virulence among antigenic variants has been hypothesized19.

Such association has been speculated for CPV too, and a different virulence among antigenic variants has been hypothesized19. Spibey et al.20 suggested the occurrence of more severe clinical signs in animals infected with CPV-2c. However, these results contrast with previous reports21. For instance, Moon et al. (2008)19 evidenced a higher pathogenicity of CPV-2a compared to CPV-2b Taiwanese strains. Interestingly, the two CPV-2a strains did not show relevant differences in terms of virulence, supporting an actual antigenic variant peculiar virulence. Nevertheless, it is important to note that at the amino acid level the two tested CPV-2a strains (classified based on residue 426 only) differed more within-variant (4 amino acid differences) than between CPV-2a and CPV-2b strains (2 amino acid differences)19. Consequently, a proper linkage between viral clustering and clinical consequences is hard to establish. More generally, increasing evidences demonstrate the limits and incongruence of a classification based on phenotypic markers18.

Therefore, the present study was designed to investigate the distribution of clinical-pathological data (i.e. outcome of viral infection) along the phylogenetic tree in order to study the CPV genetics-virulence association and consequently its inheritance.

The results of the pPCA indicated that about 13% of the data variability can be explained by first global component, reflecting a relevant tendency of some clinical features to be related to the phylogenetic structure. While the overall variability explained by this axis seems low, it must be stressed that the performed analyses lie on the implicit assumption of considering clinical outcomes as viral features. The actual scenario is clearly much more complex and influenced by other environmental and host factors, as well as the presence of other undetected co-infections and co-occurring diseases, which could not be accounted for in the present study. Consequently, the evidence of a non-negligible pattern between clinical parameters and viral genetics, in spite of this unavoidable “noise”, highlights the relevance of the observed results.

Remarkably, the evaluation of the first global component loadings evidenced that the variables showing a relationship with the viral phylogeny are involved in CPV pathogenesis and associated systemic diseases, leading to multi-organ suffering or failure. A more formal statistical analysis confirmed these results, highlighting the presence of a significant phylogeny signal for several variables that could be globally ascribable to the host response and considered as markers of disease severity (Table 1). Nevertheless, it must be stressed that, after accounting for FDR, only neutrophils and WBC remained statistically significant. Accordingly, a decrease in immune system cell count is a frequent hallmark of CPV infection because of the destruction of myelopoietic cells and several authors have related the lower leukocyte count to a poor prognosis12,25,26. While a certain risk of type I error cannot be denied for the remaining parameters (Table 1), they are still worth mentioning because of the concordance of different statistical approaches and the biological plausibility.

Alterations in coagulation system and factor concentrations (e.g. fibrinogen) have been documented in CPV-infected animals27, likely because systemic inflammation results in thrombin generation mediated by tissue factor, downregulation of physiological anticoagulant mechanisms, and inhibition of fibrinolysis28,29,30. The hyperfibrinogenemia could also contribute to the pro-coagulative state.

In addition to fibrinogen, other protein levels were identified to be associated with viral genetic features, especially the α2 and β2 globulin components, which comprise elements involved in the acute phase response (i.e. acute phase proteins; APP), like haptoglobin, ceruloplasmin and C-reactive protein. Accordingly, a similar dependency with viral topology was observed in the albumin concentration (negative APP) and A/G ratio. These evidences are particularly suggestive since APPs develop as a precocious response to many pathological stimuli31, including infectious agents, leading to amplification of the inflammation and systemic repercussions. An association between APP concentration and disease severity was evidenced for several diseases, including CPV infection31,32. Particularly, high serum CRP levels have been positively associated with mortality33,34.

Finally, several alterations in hemato-biochemical parameters ascribable to hepatic or renal suffering or variations in the metabolism and acid-base disturbances had an influence on the first global component, albeit the correlation was not always statistically significant. Although these variations are likely not directly caused by CPV, they can be considered a proxy for hypovolemia, shock and hypoperfusion. Alterations in lipid metabolism were also linked to viral phylogeny. Previous studies showed that lipoproteins play a central role during sepsis, binding and neutralizing bacterial lipopolysaccharides, and cholesterol and triglyceride levels were associated with mortality in human patients with sepsis35,36. A similar association was also evidenced in dogs with parvoviral enteritis37.

All the detected parameters, with few exceptions, can hardly be justified by a direct CPV effect on a particular target, and appear to represent more realistically the overall picture of the animal response.

Unfortunately, the limited sample size and the need to preliminary explore a large number of variables impose a trade-off between type I and II error that hinders and complicates the identification of an association between viral genotype and specific phenotypic consequences on the host, being neutrophils and WBC the most noticeable exception.

Nevertheless, taken as a whole, these results demonstrate a certain relation between viral phylogeny and disease severity, suggesting that CPV virulence is an inherited trait that, originating from an ancestral strain, can characterize descendant clusters.

Quite surprisingly, this evidence was not mirrored by a corresponding clustering of the fatality events in the phylogenetic tree (Figure 2). Likely, many non-viral factors could have affected the disease outcome. Among the others, it must be stressed that, while the clinical and hemato-biochemical parameters were collected at admission, setting a baseline as standard as possible, the following therapeutic approach was modulated case-by-case according to the animal clinical situation and owners’ economic constraints and sympathy. While unavoidable from a practical perspective, this choice surely complicated the interpretation of the relations between viral genetic features and clinical outcome.

The AMOVA analysis evidenced the presence of some nucleotide positions potentially associated with alteration of hemato-biochemical parameters. The absence of corresponding amino acid substitutions could suggest a role of the genome itself in affecting viral biological properties (e.g. secondary structures, codon bias, etc.)38 or a linkage of these positions with non-synonymous mutations in other genome regions (i.e. non-structural protein).

Unfortunately, the small sample size due to the difficulty in enrolling a high animal number with associated accurate and standardized hemato-biochemical profiles and reliable viral genetic data limited the statistical power of the study, hindering the detection of the potential association.

Larger studies based on standardized protocols and sharing of the related information among different research groups would surely provide more consistent confirmation of the linkage between CPV phylogeny and virulence by both increasing the study power and exploring a broader viral genetic heterogeneity.

Despite these limitations, the present study suggests, for the first time, the presence of an association between CPV genetic clusters and disease severity, highlighting the inheritableness of this feature. The likely existence of clades with different virulence emphasizes once more the relevance of intensive epidemiological monitoring and research on CPV evolution to better understand the virulence determinants, their epidemiology and arrange adequate countermeasures.

Material and Methods

Animals

Archive serum samples collected over the period 2008–2015 from 34  dogs, naturally infected with CPV, were included in the study. Samples originated from San Marco Veterinary Clinic (Padua, Italy) routine clinical activity and were “a posteriori” selected based on the clinical record evaluation. Particularly, stored sera were further analyzed if the corresponding animals showed typical signs of canine parvoviral enteritis and diagnosis was confirmed by CPV specific real-time PCR18. Only animals reported to have experienced acute and recent episodes of diarrhea were considered. To limit the effect of other concurrent diseases and co-infections, subjects were included in the study only if no previous and unrelated clinical conditions were reported and blood and/or feces samples tested negative to other common enteric viral, bacterial and parasitic pathogens.

At admission, anamnestic data had been obtained and physical examination performed. Blood samples had been collected and analyzed within the day to investigate several hematological and biochemical parameters (a full list of the investigated variables is provided in Appendix). All animals had been treated with standard supportive therapy until recovery or death. Because of the acute disease nature, the negative outcome (i.e. death) was attributed to CPV infection when occurring within 20 days from hospitalization. Variations from the standard therapy were present due to specific animal clinical conditions (CPV infection course and concomitant diseases) and owners’ compliance. All considered analyses and medical procedures were performed in the context of routine diagnostic and clinical activity and no experimental treatments or additional assays were implemented during the study. Therefore, no ethical approval was required to use specimens collected for diagnostic purpose.

CPV VP2 characterization and phylogenetic analysis

Archive serum samples were stored at −20 °C from collection to processing. Two-hundred microliters of serum were extracted using NucleoSpin®Blood extraction kit (MACHEREY-NAGEL) and the full VP2 region was amplified with Thermo Scientific™ Phusion™ Hot Start II High-Fidelity DNA Polymerase kit (Life Technologies Inc., Carlsbad, California) with the primers VP-F 5-TTGGCGTTACTCACAAAGACGTGC-3 and VP-R 5-ACCAACCACCCACACCATAACAAC-3, described by Perez et al.39. Amplification and specificity of the bands were verified by electrophoresis on a SYBR Safe stained agarose gel. PCR products were purified by enzymatic method using Applied Biosystems™ CleanSweep™ PCR Purification Reagent kit (Life Technologies Inc., Carlsbad, California) following the manufacturer’s instructions and Sanger sequenced using 4 overlapping primer pairs as described by Tucciarone et al. (2018)18.

The quality of the obtained chromatograms was visually evaluated using FinchTV (2004–2006 Geospiza Inc) and consensus sequences were produced using ChromasPro (ChromasPro Version 1.5; Technelysium Pty Ltd, South Brisbane, Australia; http://technelysium.com.au/wp/chromaspro/).

Finally, the sequences were aligned at codon level using TranslatorX40 and the region corresponding to the full VP2 was trimmed. The presence of adequate phylogenetic information was evaluated by Likelihood mapping analysis using IQ-Tree41. A Maximum Likelihood phylogenetic tree was reconstructed using PhyML42 selecting as substitution model the one with the lowest Akaike Information Criterion (AIC) calculated using Jmodeltest43. The branch support was calculated by performing 1000 bootstrap replicates.

Association between clinical data and phylogenesis

Clinical and hemato-biochemical continuous variables were matched to the phylogenetic tree tips using the phylobase R library44.

Because of the high number of likely correlated variables, an overall preliminary investigation was performed to identify those potentially associated with the phylogenetic structure. In general, phylogenetic autocorrelation is said to occur whenever the values taken by a set of taxa for a given biological trait are not independent from the phylogeny. This non-independence can be positive (i.e. a higher similarity in related taxa than expected by chance, leading to a so called “global structure”) or negative (a higher dissimilarity in related taxa, leading to a “local structure”)45. Since our aim was to investigate the presence of cluster-related virulence features, a particular focus was reserved to the analysis of global structures.

To this purpose, a phylogenetic Principal Component Analysis (pPCA) was performed on centered and scaled variables using adephylo46, selecting the phylogenetic proximity matrix calculated with Abouheif’s method. This method allows to summarize several traits in a lower number of variables (i.e. Principal Components; PC) exhibiting global or local structures. PCA eigenvalues were evaluated to quantify the amount of variance and phylogenetic correlation expressed by each PC. The first global component was selected and the respective loadings were inspected to evaluate how each trait contributed to the PC. Variables with loading values in the higher quartile (i.e. most contributing variables) were selected for further analyses. The presence of phylogenetic signal was individually evaluated for each of these variables computing the Abouheif’s Cmean using the phyloSignal package47,48. The statistical significance of the obtained index was assessed by comparison with a null hypothesis distribution (i.e. absence of phylogenetic signal) obtained performing 1000 repetitions of the test with label randomization. Because of the limited sample number, the statistical significance level was set to p < 0.1. Multiple comparisons were accounted for using the Benjamini-Hochberg procedure to control the False Discovery Rate (FDR) (Benjamini and Hochberg, 1995).

The association between quantitative traits and tree topology was tested calculating different statistics using BaTS: Parsimony score (PI), Association Index (AI) and MC size49. To account for the phylogenetic uncertainness, a Bayesian Markov Chain Monte Carlo (MCMC) approach was implemented. A one million generation MrBayes50 analysis was performed on CPV alignment sampling model parameters and phylogenetic trees every 5 thousand generations, thus obtaining a tree posterior distribution (after discarding the first 20% of the trees as burn-in). The aforementioned statistics were calculated across these trees, obtaining their posterior distribution. The observed median (μ0bs) value was selected as final outcome. The distribution under the null hypothesis (i.e. no trait-phylogenesis association) was obtained by randomizing without replacement the tip-trait association one thousand times for each tree of the posterior distribution. Each randomized dataset was used to calculate the statistics medians (μnull), which formed the null distribution. This distribution was used to achieve a p-value by simply evaluating the proportion of simulated values more extreme than the observed one.

Locus by locus AMOVA analysis

A locus by locus analysis of molecular variance (AMOVA) was performed using the program Arlequin v3.551. For each of the selected variables, the individuals were categorized in two groups based on the sign (positive or negative) of the standardized values. Then, the average frequencies of each nucleotide per position and group were compared. The positions presenting a Fct value (fixation index among groups) larger than 0.05 were considered significant.