Metagenomic characterization of swine slurry in a North American swine farm operation

Modern large-scale, high-density farming environments are inherently susceptible to viral outbreaks, inadvertently creating conditions that favor increased pathogen transmission and potential zoonotic spread. Metagenomic sequencing has proven to be a useful tool for characterizing the microbial burden in people, livestock, and environmental samples. International efforts have been successful at characterizing pathogens in commercial farming environments, especially swine farms; however, it is unclear whether the full extent of microbial agents has been adequately captured or is representative of farms elsewhere. To augment international efforts, we performed metagenomic next-generation sequencing on nine swine slurry and three environmental samples from a United States of America (U.S.A.) farm operation, characterized the microbial composition of slurry, and identified novel viruses. We assembled a total of 1792 viral genomes, of which 554 were novel/divergent. We assembled 1671 Picobirnavirus genome segments, of which 538 are novel. In addition, we discovered 10 new viruses belonging to a novel taxon, porcine Statoviruses; Statoviruses have previously been reported only in humans, macaques, mice, and cows. We assembled 3 divergent Posaviruses and 3 swine Picornaviruses. In addition to the viruses described, we found other eukaryotic genera such as Entamoeba and Blastocystis, and bacterial genera such as Listeria, Treponema, Peptoclostridium and Bordetella in the slurry. Of these, two species known to cause human disease, Entamoeba histolytica and Listeria monocytogenes, were detected. Further, antimicrobial resistance genes, including those conferring resistance to tetracycline and MLS (macrolide, lincosamide, streptogramin) antibiotics, were also identified. Metagenomic surveillance in swine fecal slurry has great potential for detecting novel and antimicrobial-resistant pathogens.

While the need is obvious, engaging agribusinesses in surveillance for emerging or reemerging pathogens is challenging 22 . Major industry objections that must be overcome include the biosecurity risks of permitting researchers to enter farms, the harm that specimen collection may cause the animals, and the often unspoken concern that surveillance may reveal occupational hazards that could damage revenue 23 . Some of these objections can be met using noninvasive metagenomic surveillance techniques 24,25 .
In this study, we build upon previous efforts using metagenomic sequencing methods to examine swine fecal slurry samples from a U.S.A. farm for molecular evidence of pathogens. Our primary goal was to determine if indirect and noninvasive swine slurry sampling could yield robust microbial detections in order to support pathogen surveillance at the human-animal interface in North America. We focus our results on the viral families identified, given that they account for the highest number of pathogens in the priority swine (and zoonotic) pathogen list, and global economic impact 8,13,26 . Here, we report that even limited sampling can reveal a rich, highly informative pathogen landscape, replete with both known and novel viral entities.

Results
Overview of mNGS results. Nine  A total of 270 microbial genera meeting our criteria were found in the slurry samples. Figure 1A shows a breakdown of the microbial composition of the pig slurry. Bacteria were the most prevalent kingdom (54.81%), followed by Eukaryota (38.52%), Viruses (4.04%) and Archaea (2.59%). Figure 1B shows the 25 most abundant genera identified within each kingdom. The most prevalent bacterial genera were Oscillibacter and Treponema. Among viruses, Posavirus, Picobirnavirus and Mamastrovirus were the most prevalent genera. As mentioned in the introduction, we have focused our results mainly on the viruses identified in the swine slurries, and offer only a brief description of the microbial composition of the other kingdoms. Supplemental Table 1 provides a detailed breakdown of all the microbial genera found across all nine slurry and three farm environment (environment and aerosol) samples.  Table 1). In addition to these, other notable bacterial genera identified included Peptoclostridium (predominantly P. acidaminophilum) and Bordetella (predominantly B. bronchialis), found in 75% and 50% of the samples, respectively.

Bacteria and antimicrobial resistance.
In addition to bacteria, we detected a diversity of antimicrobial resistance genes. The most prevalent classes of antibiotic resistance genes detected in the nine slurry (and three environmental) samples were tetracycline and MLS (macrolide, lincosamide, streptogramin), both of which were detected in 100% of the samples using a 10% allele coverage level (Table 1).

Viruses.
Picobirnavirus. Picobirnavirus was the most prevalent genus in the slurry samples (Fig. 1B). A total of 638 RNA-dependent RNA polymerase (RdRP) segments and 1033 capsid/ORF segments greater than 1 KB were assembled across all nine slurry samples; no viral genomes could be assembled from the farm environment samples, although viral reads mapping to Picobirnavirus were identified in two of the three environment samples. Phylogenetic analysis of all complete RdRP segments in this study (354/638) and all complete Picobirnavirus genomes from NCBI indicates that this genus was highly diverse, with members belonging to genogroups I and II. The porcine picobirnaviruses were not limited by host range (Supplemental Fig. 1).
Statovirus. Ten porcine Statoviruses, members of a novel taxon described previously only in humans, macaques, mice, and cows, were identified across five slurry samples (Fig. 2B). Of these, 5/10 were > 99% similar to each other at the NT level and 2/10 were 90% similar to each other at the NT level, with the other 3/10 genomes sharing between 75 and 85% identity.
Other viral genera. Other viral genera of interest identified in the pig slurry samples included Sapovirus, Torovirus, Husavirus, and Rotavirus. Of these, there was sufficient sequencing coverage to assemble one Sapovirus and three Torovirus genomes. Finally, a large number of reads (63,043. 443.0213 NT rPM) were labelled as "uncategorized" (i.e., taxa with neither family nor genus classification); these were found in all samples, including the water control (23,680.98 rPM). Due to our exclusion criteria, microbial genera identified in the water control were not reported in the samples. Further analyses of these "uncategorized" reads are likely to identify additional, previously uncharacterized microbes.

Discussion
Modern high-density swine farming practices facilitate novel pathogen emergence and horizontal transmission, and increase the risk of zoonotic spillover to other species, including humans 27 . This is due to multiple factors, including crowded livestock conditions, the frequent introduction of immunologically naïve young animals, the frequent movement of animals 28 , and environmental transmission pathways. As most high-density animal operations process animal waste in large lagoons, these pools pose an additional transmission risk to neighboring farms and communities through groundwater seepage and large rainwater runoffs. GPS mapping of animal production locations in North Carolina demonstrates that the proximity of animal facilities can pose a public health risk of viral spread from one farm to another through routine activities such as feed delivery 29-31 . Communication between farmers regarding biosecurity and emerging viruses is essential to the health of animals and of humans working in production environments, as well as surrounding communities.
Our pilot metagenomic study of swine slurry captures the virome, both known and novel, as well as microbes across the tree of life, revealing tremendous diversity and complexity. Given that viruses account for the largest number of swine pathogens, and given their global economic impact, we have focused the results of our study on this kingdom 8,13 . In this study alone, we report 554 novel viruses of unknown pathogenicity and zoonotic potential. Further, Entamoeba histolytica and Listeria monocytogenes, known human pathogens of concern, were identified. We also demonstrated how fecal slurry can be used to examine antimicrobial resistance genes in bacteria prevalent in swine.
In this study, Picobirnaviruses represented a substantial fraction (30%, 1671 genomes) of the viral sequences. Picobirnaviruses are non-enveloped double-stranded RNA viruses, typically consisting of two segments, and are often found in the fecal matter of a wide range of species, sometimes associated with diarrhea 32 . A large number of diverse Picobirnaviruses (n = 1236) have also been previously reported in camel feces 33 . This study adds 1671 genomes to the existing 2540 Picobirnavirus genomes currently publicly available in NCBI, greatly increasing our sequence knowledge of this group of viruses within the Durnavirales order. The true host of Picobirnaviruses, whether eukaryotic (including fungal) or prokaryotic, remains unresolved 34 . Whether Picobirnaviruses are causes or indirect markers of enteric disease also remains controversial.
Posaviruses (Porcine stool-associated RNA viruses) are highly diverse members of Picornavirales that were first discovered in, and are widely observed in, fecal samples from pigs 35-37 . In addition to previously known Posavirus types, we identified three highly divergent Posaviruses. Further, we report the first porcine Statovirus (Stool Associated Tombus-like virus); to date these viruses have only been reported in the gastrointestinal tract of four other mammals: humans, macaques, cows and mice 38 . As is the case with Picobirnaviruses, the true host and pathogenicity of these viruses are unclear.
Unlike the above-mentioned viral genera, there is evidence of porcine host infection by Astroviruses, another group of positive-sense, single-stranded, non-enveloped viruses that has been studied extensively 35,39,40 . Aside from enteric and respiratory infection, porcine astrovirus type 3 in particular has been associated with encephalomyelitis in pigs 41 . However, in this study, we only detected porcine astrovirus types 1, 2, 4 and 5.
In addition to the viral families discussed above, other microbial species that are known to be pathogenic to humans were detected. E. histolytica, a human pathogen associated with intestinal and extraintestinal infections and responsible for 5 million infections annually, was identified in 41.66% of the samples 42 . Further, Listeria monocytogenes, the cause of a serious foodborne illness with high mortality and hospitalization rates, was identified in all slurry samples studied 43 . We also detected antimicrobial resistance genes in these samples. The genes identified have been implicated in both human disease and food safety 27,44 ; however, due to a lack of understanding of how genomic detection of these genes compares to the gold standard, the significance of these findings is unclear 45 .
Our exploratory pilot study has important limitations. First, only a small number (nine) of slurry samples were collected from two farms. Second, these samples were collected over a limited time period of 6 months; larger and wider sampling is likely to reveal temporal dynamics of the microbial species. Finally, we have focused our analyses and discussion only on viral families, and further analyses of the other microbial families (including "uncategorized" reads) are likely to yield a more complex and diverse microbial portrait. Nonetheless, from two moderately sized farms with an average of 3800 head of swine, we identified a rich and diverse microbial landscape, including 554 novel viruses, using our small sample set. More extensive and longitudinal studies of fecal slurry from a large US farm (20,000 head) or a megafarm in China (> 84,000 head) 46 would likely yield even greater diversity. Just as studies of human sewage have assisted public health officials in understanding poliovirus 47 and SARS-CoV-2 transmission 48 , our results suggest that periodic assessments of swine farm fecal slurry might be an effective noninvasive approach to novel pathogen surveillance at industrialized swine farms. Such surveillance should be attractive to the swine industry as an early warning method for swine pathogen incursions 49 and could additionally assist public health officials in assessing possible public health threats from swine zoonoses 50 .

Materials and methods
Site enrollment. During the fall of 2018, a North Carolina, USA, swine farm with two geographic locations was identified to participate in this preliminary study. Each barn had 12 pens with a center hallway and a fully slatted concrete floor over a deep pit to hold feces, urine, and waste water.at Location #1 and slightly larger at location #2. This farm held on average 3800 head of swine during a year. Pits were emptied up to three times per year and recharged with recycled water. Farm personnel collected up to two slurry samples (6 oz. syringes) per week from the pit of each production location.
This study was granted exemption from review status by the Institutional Animal Care and Use Committee at Duke University on the grounds that the research did not include direct sample collection from animals.
Sample collection and processing. Slurry is defined as the feces and urine from pigs and the waste water used to remove the urine and feces from the pig pens 51 . Nine slurry samples from two swine barns containing finishing pigs were collected from approximately 5-10 cm below the surface of pits. Three environmental samples (two aerosol and one surface swab) were also collected. Samples were frozen at −20 °C until shipped to our laboratory (a maximum of 24 h). Aerosol samples and environmental swabs were collected as previously described 52 .
Frozen samples and completed surveys were transported overnight to the Duke One Health Research Laboratory. Dates and pre-assigned sample numbers were used for sample tracking. Slurry samples were diluted by methods previously described 25 and genomic DNA was extracted using the Zymo Research Quick-DNA Miniprep Kit (Cat. No. D3024). Extracted samples were shipped to the Chan-Zuckerberg Biohub (San Francisco, California) and stored at −80 °C until processed by molecular methods.
Library preparation and metagenomic next-generation sequencing (mNGS). RNA [54][55][56] . Only microbial genera present at over 10 rPM at both the nucleotide (NT) and protein (NR) levels were reported. Additionally, to control for potential background contaminants, all microbial genera identified in the water control were excluded.
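The reporting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `GenusHit` record, the function names, and the example genera and values are hypothetical and are not taken from the pipeline used in this study; only the 10 rPM cutoff and the water-control exclusion come from the text.

```python
from dataclasses import dataclass

# From the text: genera must be present at over 10 rPM at both NT and NR levels.
RPM_THRESHOLD = 10.0


@dataclass(frozen=True)
class GenusHit:
    """Hypothetical per-sample record for one genus."""
    genus: str
    nt_rpm: float  # reads per million against the nucleotide (NT) database
    nr_rpm: float  # reads per million against the protein (NR) database


def reads_per_million(read_count: int, total_reads: int) -> float:
    """Normalize a raw read count to reads per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return read_count * 1_000_000 / total_reads


def reportable_genera(sample_hits, water_control_genera, threshold=RPM_THRESHOLD):
    """Keep genera above the threshold at both NT and NR levels and
    drop any genus that was also seen in the water control."""
    return sorted(
        hit.genus
        for hit in sample_hits
        if hit.nt_rpm > threshold
        and hit.nr_rpm > threshold
        and hit.genus not in water_control_genera
    )


# Illustrative values only; these are not measurements from the study.
example_hits = [
    GenusHit("Treponema", nt_rpm=850.0, nr_rpm=420.0),
    GenusHit("Posavirus", nt_rpm=95.0, nr_rpm=8.0),     # fails the NR cutoff
    GenusHit("Ralstonia", nt_rpm=300.0, nr_rpm=250.0),  # present in water control
]
example_water_control = {"Ralstonia"}
```

Requiring both database levels to pass makes the filter conservative: a genus supported only by nucleotide matches, or only by protein matches, is not reported.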
To more comprehensively characterize the genomes of viruses in the metagenomic dataset, genomes of identified microbes were assembled de novo using the St. Petersburg genome assembler (SPAdes) 57 and annotated using Geneious v10.3.2. Assembled contigs that were at least 50% of the length of the reference genome were included in the downstream analysis. For the Picobirnavirus genus, contigs greater than 1 KB were included in the analysis. For a given viral species, the assembled genomes were then aligned using the default settings in MUSCLE v3.8.1551. ModelTest-NG v0.1.5 was used to identify the best-fitting evolutionary model for each viral species. We reconstructed a maximum-likelihood phylogeny using RAxML-NG v0.6.0 with default settings (bootstrap = 200).
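As a minimal sketch, the contig inclusion criteria described above might be expressed as follows. Two assumptions are made here and should be read as such: the 50% criterion is interpreted as contig length relative to reference genome length, and "1 KB" is taken as 1000 nucleotides. The function and constant names are hypothetical.

```python
# Assumption: "greater than 1 KB" is taken here as more than 1000 nucleotides.
PICOBIRNAVIRUS_MIN_LENGTH = 1000
# Assumption: the 50% criterion refers to contig length relative to the reference.
MIN_REFERENCE_FRACTION = 0.5


def keep_contig(genus: str, contig_length: int, reference_length=None) -> bool:
    """Decide whether an assembled contig enters downstream analysis."""
    if genus == "Picobirnavirus":
        # Picobirnavirus segments are filtered on absolute length only.
        return contig_length > PICOBIRNAVIRUS_MIN_LENGTH
    if reference_length is None or reference_length <= 0:
        raise ValueError("a positive reference_length is required for this genus")
    return contig_length >= MIN_REFERENCE_FRACTION * reference_length
```

For example, `keep_contig("Sapovirus", 3700, 7400)` evaluates to `True` under these assumptions, while a 3000-nucleotide contig against the same hypothetical 7400-nucleotide reference would be excluded.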
Criteria for identification of novel or divergent virus. Picobirnaviruses were categorized as novel/divergent if they were less than 75% identical to their closest relative in GenBank (BLASTN and BLASTX) 58 . Posaviruses were classified as divergent if their BLASTN and BLASTX identities were below 50% 35 . Novel swine picornavirus species were identified based on ICTV guidelines 59 .
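The identity cutoffs above lend themselves to a small decision function. The sketch below assumes that both the BLASTN and the BLASTX identity must fall below the cutoff, which is one reading of the criteria; the names are hypothetical, and picornaviruses are deliberately left out because their classification follows ICTV guidelines rather than a single identity threshold.

```python
# Percent-identity cutoffs stated in the text, keyed by genus.
IDENTITY_CUTOFFS = {
    "Picobirnavirus": 75.0,
    "Posavirus": 50.0,
}


def is_novel_or_divergent(genus: str, blastn_identity: float, blastx_identity: float) -> bool:
    """Return True when the closest GenBank relative is below the genus cutoff.

    Assumption: both the nucleotide (BLASTN) and protein (BLASTX) identities
    must be below the cutoff for a genome to count as novel/divergent.
    """
    try:
        cutoff = IDENTITY_CUTOFFS[genus]
    except KeyError:
        raise ValueError(f"no identity cutoff defined for genus {genus!r}") from None
    return blastn_identity < cutoff and blastx_identity < cutoff
```

Under this reading, a Picobirnavirus segment with 68% BLASTN and 72% BLASTX identity to its closest relative would be flagged, whereas one with 68% and 80% would not.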
Antimicrobial resistance (AMR) assessment. To identify antimicrobial resistance genes present in the metagenomic dataset, SRST2 was used 60 . AMR genes with at least 10% allele coverage were considered positive.
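To illustrate how the 10% allele coverage rule translates into per-class prevalence across samples, a short sketch follows. The input layout (sample name mapped to a list of drug class and coverage pairs) is a hypothetical simplification and does not reproduce the SRST2 output format; the example values are invented for illustration.

```python
from collections import Counter

# From the text: AMR genes with at least 10% allele coverage are considered positive.
MIN_ALLELE_COVERAGE = 10.0


def positive_classes(hits):
    """Return the set of drug classes with at least one allele at or above
    the coverage threshold. `hits` is a list of (drug_class, coverage_percent)."""
    return {drug_class for drug_class, coverage in hits if coverage >= MIN_ALLELE_COVERAGE}


def class_prevalence(samples):
    """Percentage of samples positive for each drug class.
    `samples` maps a sample name to its list of hits."""
    counts = Counter()
    for hits in samples.values():
        counts.update(positive_classes(hits))
    n_samples = len(samples)
    return {drug_class: 100.0 * k / n_samples for drug_class, k in counts.items()}


# Illustrative input only; not data from this study.
example_samples = {
    "slurry_1": [("Tet", 98.5), ("MLS", 40.0), ("AGly", 6.0)],
    "slurry_2": [("Tet", 12.0), ("MLS", 9.0)],
}
```

With this example input, the tetracycline class is positive in both samples, MLS in one of two, and the low-coverage aminoglycoside hit is ignored.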
Ethics approval. This study was granted exemption from review status by the Institutional Animal Care and Use Committee at Duke University on the grounds that the research did not include direct sample collection from animals.

Data availability
All of the raw data generated for this study are available under NCBI BioProject accession PRJNA683083. Assembled genomes can be found under GenBank accessions MW977024-MW977661, MW977662-MW978694, and MW504477-MW504597.
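As a quick consistency check, the number of records implied by each accession range can be counted. The sketch assumes each range is contiguous with one record per accession number; the helper name is hypothetical.

```python
import re

# An accession here is a letter prefix followed by digits, e.g. "MW977024".
ACCESSION_PATTERN = re.compile(r"^([A-Z]+)(\d+)$")


def accession_range_size(first: str, last: str) -> int:
    """Count accessions in an inclusive range that shares one prefix."""
    m_first = ACCESSION_PATTERN.match(first)
    m_last = ACCESSION_PATTERN.match(last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same letter prefix")
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if end < start:
        raise ValueError("range end precedes range start")
    return end - start + 1


ACCESSION_RANGES = [
    ("MW977024", "MW977661"),
    ("MW977662", "MW978694"),
    ("MW504477", "MW504597"),
]

range_sizes = [accession_range_size(first, last) for first, last in ACCESSION_RANGES]
total_records = sum(range_sizes)
```

Under the contiguity assumption, the three ranges contain 638, 1033, and 121 accessions. The first two figures match the 638 RdRP and 1033 capsid/ORF Picobirnavirus segments reported in the Results, and the total of 1792 matches the number of assembled viral genomes stated in the abstract.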