The first clawed lobster virus Homarus gammarus nudivirus (HgNV n. sp.) expands the diversity of the Nudiviridae

Viral diseases of crustaceans are increasingly recognised as challenges to shellfish farms and fisheries. Here we describe the first naturally-occurring virus reported in any clawed lobster species. Hypertrophied nuclei with emarginated chromatin, characteristic histopathological lesions of DNA virus infection, were observed within the hepatopancreatic epithelial cells of juvenile European lobsters (Homarus gammarus). Transmission electron microscopy revealed infection with a bacilliform virus containing a rod shaped nucleocapsid enveloped in an elliptical membrane. Assembly of PCR-free shotgun metagenomic sequencing produced a circular genome of 107,063 bp containing 97 open reading frames, the majority of which share sequence similarity with a virus infecting the black tiger shrimp: Penaeus monodon nudivirus (PmNV). Multiple phylogenetic analyses confirm the new virus to be a novel member of the Nudiviridae: Homarus gammarus nudivirus (HgNV). Evidence of occlusion body formation, characteristic of PmNV and its closest relatives, was not observed, questioning the horizontal transmission strategy of HgNV outside of the host. We discuss the potential impacts of HgNV on juvenile lobster growth and mortality and present HgNV-specific primers to serve as a diagnostic tool for monitoring the virus in wild and farmed lobster stocks.

European lobster (H. gammarus) 8 as well as numerous other decapod crustacean taxa 9 . However, WSSV has not been detected in wild or cultured Nephropidae. Viral diseases have led to substantial bottlenecks to shrimp aquaculture production. Monodon baculovirus (MBV), the causative agent of spherical baculovirosis, was the first virus reported in penaeid shrimp 10 . Phylogenetic analyses and genomic reconstruction have since suggested that MBV be reclassified as Penaeus monodon nudivirus (PmNV) and be reassigned to the Nudiviridae 11,12 , a family of dsDNA viruses which to that point was exclusively comprised of viruses infecting insects. Although initially named to reflect a lack of occlusion body formation (large protein lattices which protect the bacilliform-shaped virions and facilitate transmission outside of the host), there are now multiple examples within the Nudiviridae where occlusion bodies have been observed, or where sequence and structural homologs of the polyhedrin gene have been found within the genome 12-14 . Seven fully sequenced virus species have been characterised as nudiviruses: Penaeus monodon nudivirus (PmNV) 12 ; Gryllus bimaculatus nudivirus (GbNV), infecting the nymph and adult stages of several cricket species 15 ; Heliothis zea nudivirus-1 (HzNV-1), a persistent pathogen of insect cell lines 14 ; Helicoverpa (syn. Heliothis) zea nudivirus-2 (HzNV-2), the sexually transmitted corn earworm moth virus which can cause sterility in the host 16 ; Oryctes rhinoceros nudivirus (OrNV), a biological control agent used to manage palm rhinoceros beetle populations 17 ; Tipula oleracea nudivirus (ToNV), a causative agent of nucleopolyhedrosis in crane fly larvae 13 ; and Drosophila innubila nudivirus (DiNV) 18,19 , which causes significant reductions to fecundity and lifespan 18 .
Three further viruses isolated from metagenomic sequencing of Drosophila melanogaster (Kallithea virus 20 , Tomelloso virus and Esparto virus) have also been described as nudiviruses (Table 1). However, the genomes of these three Drosophila viruses have yet to be analysed with respect to their phylogenetic position. There is also evidence of ancestral nudivirus integration into the host genome (Nilaparvata lugens endogenous nudivirus (NleNV)) 21 and a sister group of the nudiviruses, the bracoviruses, associated with Braconid wasp hosts, where viral genes are also integrated into the host genome 22 . Finally, a large DNA virus infecting the hepatopancreas of the European brown shrimp, Crangon crangon has also been proposed as a putative member of the Nudiviridae albeit based upon limited genomic information 23,24 .
As part of a large UK-based lobster rearing study assessing the growth of hatchery-reared European lobsters in novel sea-based container culture (SBCC) systems (Lobster Grower, www.lobstergrower.co.uk), we conducted a histology-led health screening of a large cohort of individuals (n = 1,698), sampled at several time points throughout a multi-year production cycle. We observed a distinctive histopathology of the hepatopancreas of juvenile lobsters in both hatchery and sea container phases of production. Intranuclear inclusions appeared within the hepatopancreatocytes of affected individuals; later confirmed by transmission electron microscopy (TEM) as of viral aetiology. Genome assembly of PCR-free shotgun metagenomic sequences confirmed the presence of a novel member of the Nudiviridae; hereby named Homarus gammarus nudivirus, the first virus described infecting any clawed lobster genus. Here, we present the fully annotated genome of HgNV, comprising a single contiguous sequence, together with diagnostic primers and reference histology and ultrastructure to aid in future identification in natural and aquaculture settings. HgNV is now the second confirmed aquatic nudivirus.

Results
Histological sectioning reveals virus-associated pathology. Lobsters did not appear to display any clinical signs of infection with HgNV. Histopathology of the virus infection was apparently limited to the tubule epithelial cells of the hepatopancreas (HP), observed in fibrillar (F) and reserve (R) cells. Infected cells contained hypertrophic nuclei occupied by a single, large eosinophilic inclusion. This inclusion displaced the host chromatin resulting in the latter's emargination against the nuclear envelope (Fig. 1A,B). In some cases, this margination of the chromatin causes the formation of septa leading to the appearance of intranuclear compartmentalisation. Viral infection occurred either within the nuclei of isolated cells, within the closely opposing cells of a single tubule, within numerous cells of several closely opposed tubules, or generally throughout the tubules of the

Infection prevalence in hatchery and sea-based juvenile lobsters. Intranuclear inclusions were observed in 12.72% of all samples processed for histology (145/1140) across the two-year sampling period. In sea-based lobsters the prevalence of intranuclear inclusions was highest at 39 weeks post deployment (17%) (Fig. 2). At this time point, the percentage of individuals displaying histological signs of viral aetiology was the same in both the hatchery and sea-based populations. However, whereas intranuclear inclusions were not evident in sea-based lobsters at 104 weeks (0%), prevalence had peaked in hatchery reared lobsters (53%) at this time (Fig. 2). Prevalence was generally observed to be higher in hatchery-based individuals compared to those retained in SBCC systems. Three of the 150 (52 week) sea-based lobsters tested were PCR positive for HgNV, all of which were histologically positive.
An additional four sea-based samples, displaying histopathological signs of intranuclear inclusions, did not amplify with HgNV-specific primers. Of the 12 hatchery-based animals tested, five were PCR positive for HgNV, three of which were histologically positive. An additional hatchery-based sample showed histological signs of nuclear infection but did not test positive with PCR. In-situ hybridisation of the HgNV-specific amplicon probe confirms detection and demonstrates localisation of the target HgNV DNA polymerase gene in infected tissues (Fig. 1C,D).

Figure 1 (partial caption; panels A and B not recovered). (C,D) HgNV-specific DNA polymerase probe hybridised to infected nuclei (arrows) within epithelial cells of the hepatopancreas. In-situ hybridisation. Scale bar = 100 µm, 50 µm respectively. (E) Nucleus from a HgNV infected cell containing rod-shaped virions. Virions accumulate at the periphery of the nuclear membrane (arrow), TEM. Scale bar = 500 nm. (F) Longitudinal (white arrow) and transverse sections (black arrow) of HgNV virions within the nucleus. Virions possess an electron dense nucleocapsid surrounded by a trilaminar membrane (envelope). The rod shaped nucleocapsid appears to bend within the envelope forming a "u" or "v" shape in some cases (line arrows). TEM. Scale bar = 500 nm.
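The PCR and histology counts reported above can be cross-tabulated to make the discrepancy between the two methods explicit. The following is a minimal sketch, not part of the original analysis. The function names are hypothetical, and it assumes that every tested animal not mentioned as positive was negative by both methods (6 of the 12 hatchery animals and 143 of the 150 sea-based animals).

```python
def prevalence(positive: int, total: int) -> float:
    """Percentage of positive animals in a sample."""
    return 100.0 * positive / total


def percent_agreement(both_pos: int, pcr_only: int, hist_only: int, both_neg: int) -> float:
    """Percentage of animals for which PCR and histology give the same call."""
    total = both_pos + pcr_only + hist_only + both_neg
    return 100.0 * (both_pos + both_neg) / total


# Histological prevalence across the sampling period (145 of 1140 samples).
overall = prevalence(145, 1140)

# 2 x 2 counts derived from the text; the "both_neg" cells are an assumption.
hatchery = {"both_pos": 3, "pcr_only": 2, "hist_only": 1, "both_neg": 6}
sea_based = {"both_pos": 3, "pcr_only": 0, "hist_only": 4, "both_neg": 143}

hatchery_agreement = percent_agreement(**hatchery)
sea_agreement = percent_agreement(**sea_based)

print(f"overall histological prevalence: {overall:.2f}%")
print(f"hatchery PCR/histology agreement: {hatchery_agreement:.1f}%")
print(f"sea-based PCR/histology agreement: {sea_agreement:.1f}%")
```

Simple percent agreement is used here only for illustration; it does not correct for chance agreement, which matters when most animals are negative by both methods.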

Transmission electron microscopy (TEM) confirms the presence of viral infection. Transmission electron microscopy revealed the presence of masses of enveloped virions accumulated at the nuclear membrane and surrounding the virogenic stroma (Fig. 1E,F). Virions exhibited an electron-dense nucleocapsid showing a bacilliform morphology and were contained within an elliptical membrane. In some cases, the rod-shaped nucleocapsids appeared "u" or "v" shaped within the envelope (Fig. 1F). The mean length of the enveloped virions was 180.43 ± 16.9 nm, with a mean diameter of 136.07 ± 11.28 nm (n = 20). The mean length of the nucleocapsids was 154 ± 20 nm, with a mean diameter of 36 ± 4 nm; mean envelope width was 5.2 ± 0.2 nm (n = 20).
Complete genome assembly of candidate virus. The alignment of multiple independent assemblies produced a full genome consensus sequence of 107,063 bp (Accession: MK439999). Reassembling the concatenated reads from all samples, after mapping to the candidate consensus sequence, increased coverage to an average of 400.50× (SD: 65.16). The assembled contig of 107,063 bp is concordant with the size of other known nudivirus genomes, as is the estimated GC content of 35.34% (Table 1). REAPR detected no errors or breaks in the assembled genome. PCR confirmation and sequencing of reduced coverage areas revealed the presence of repeating units, which sometimes varied in copy number between independent samples. Sanger reads sequenced from three separate samples confirmed correct assembled sequence.
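For readers wishing to reproduce the GC estimate from the deposited sequence, a minimal sketch is given below. It is not the tool used in this study. The function name is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt


# Toy sequences only; the real input would be the 107,063 bp genome (MK439999).
example = gc_content("ATGCATGCAA")
with_ambiguity = gc_content("ATGCNN")
print(example, with_ambiguity)
```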
Tandem repeats associated with viral replication. The HgNV genome does not contain any A/T-rich, palindromic, homologous regions (hrs) that are known to support the origin of replication in baculoviruses and play important roles in viral transcription 25,26 . However, seven direct repeats (drs), ranging from 58.8 to 188 bp, were detected (Table 2), two of which fall within protein coding regions. EcoRI centres or significant palindromic regions, both typical of hrs, were not detected within these repeating regions. However, dr1-dr4 are clustered within 3.3% of the entire genome; a region of 3,531 bp (Fig. 3). A cluster of drs also appears within the PmNV genome 12 .
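The two hr-like features screened for above, EcoRI recognition sites (GAATTC) and reverse-complement palindromes, can be illustrated with a short sketch. This is not the MEME-based search used in this study. The function names are hypothetical, and the brute-force scan is an assumption suited only to short repeat units.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_ecori_sites(seq: str) -> int:
    """Count non-overlapping EcoRI recognition sites (GAATTC)."""
    return seq.upper().count("GAATTC")


def longest_palindrome(seq: str, min_len: int = 20) -> str:
    """Longest substring equal to its own reverse complement, or '' if none."""
    seq = seq.upper()
    best = ""
    for i in range(len(seq)):
        for j in range(i + min_len, len(seq) + 1):
            sub = seq[i:j]
            if len(sub) > len(best) and sub == revcomp(sub):
                best = sub
    return best


# Toy repeat unit containing one EcoRI-centred palindrome (not HgNV sequence).
toy = "GGGTTGAATTCAAGGG"
sites = count_ecori_sites(toy)
palindrome = longest_palindrome(toy, min_len=6)
print(sites, palindrome)
```

The default minimum length of 20 bp mirrors the threshold stated in the Materials and Methods; the toy call lowers it so that a short example can be shown.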
Open reading frame (ORF) prediction and genome annotation. Prokka predicted 101 protein coding regions in the HgNV genome. FGenesV0 and GeneMarkS predicted 103 and 89 ORFs, respectively. Ninety-seven ORFs were supported by two or more programs and were distributed evenly across both strands (Fig. 3, Table 3); 49 on the plus strand and 48 on the minus. The gene density of the HgNV genome was estimated to be 1.10 per kb and 69% of ORFs aligned most closely with predicted genes from the PmNV genome.
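The summary figures in this paragraph follow from simple arithmetic on the genome length and ORF count. The sketch below computes the density both as ORFs per kb and as kb per ORF. The reported value of 1.10 coincides with the second quantity; whether that is the convention intended here is an assumption, and the variable names are illustrative.

```python
GENOME_BP = 107_063      # assembled genome length reported in the text
N_ORFS = 97              # ORFs supported by two or more programs
PLUS, MINUS = 49, 48     # strand distribution reported in the text

genome_kb = GENOME_BP / 1000
orfs_per_kb = N_ORFS / genome_kb   # ORFs per kilobase
kb_per_orf = genome_kb / N_ORFS    # mean kilobases per ORF

# Approximate number of ORFs whose best match is a PmNV gene (69% of 97).
pmnv_like = round(0.69 * N_ORFS)

print(f"{orfs_per_kb:.2f} ORFs per kb, {kb_per_orf:.2f} kb per ORF, ~{pmnv_like} PmNV-like ORFs")
```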
The exact number of genes conserved across all the nudiviruses is somewhat unclear. However, re-analyses of all sequenced nudivirus genomes revealed a set of 21 core genes conserved between baculoviruses and nudiviruses 13 . The core genes were typically grouped into one of five functional groups: DNA processing, RNA transcription, per os infectivity, package and assembly and conserved genes of unknown function. The HgNV genome contained 7 genes involved in DNA processing: DNA polymerase, helicase, two copies of helicase2, integrase, fen-1 and ligase. Gene predictions similar to three of the four thymidine kinase (tk) genes involved in nucleotide metabolism were also found. All five core baculovirus/nudivirus genes involved in RNA transcription were found: p47, lef-4 (late expression factor), lef-5, lef-8 and lef-9. As were 8 genes involved in per os infectivity: pif-0 (p74), pif-1, pif-2, pif-3, pif-4 (19 k/odv-e28), pif-5 (odv-e56), pif-6 (ac68) and pif-8 (vp91/p95). The 11K-like gene was also found 27 . HgNV contained less than half of the genes encoding packaging, assembly and release processes conserved amongst the Baculoviridae. These include a 38 k gene, p6.9, two copies of vlf-1, vp39, p33 (ac92) and ac-81. Similar to PmNV, HgNV also possessed 2 copies of the iap genes involved in apoptosis inhibition. Furthermore, HgNV encoded sequences similar to PmNVorf99 and PmNVorf62, reported to be common in nudiviruses (Fig. 4). Other genes typically common to baculoviruses, including methyltransferase and two neighbouring copies of odv-e66, were also found in the HgNV genome.

Table 2. Direct repeat predictions within the HgNV genome. dr = direct repeat. Tandem repeat alignment score of >100.

coding regions; HgNV_ORF23, exhibiting a protein kinase structural domain and HgNV_ORF63 coding for the p74 gene.
A combination of early and late promoters were predicted to precede 47 potential coding regions.

Discussion
Here we provide the first description of a naturally-occurring virus infection of nephropid lobsters. The virus, Homarus gammarus nudivirus (HgNV), is a new species within the family Nudiviridae; a group of dsDNA viruses that infect arthropod (mainly insect) hosts. Histopathology and ultrastructure of HgNV are similar to those of numerous other bacilliform viruses described to infect Crustacea, wherein viral replication within the host nucleus displaces host chromatin and results in aberrant, hypertrophied nuclei, visible in routine histological preparations. In many cases, infected epithelial cells are sloughed off the basement membrane of the tubule into the lumen, for excretion via the faeces. It is important to consider that intranuclear inclusions may also be indicative of other pathogens as this may explain discrepancies in prevalence when comparing PCR and histology data. Furthermore, digestive tissues are known to contain inhibitors which can impact PCR success 28,29 . However, in-situ hybridisation confirms that HgNV is inducing this pathology in infected cells (Fig. 1C,D).
Comprehensive genome analysis of infected lobsters revealed that HgNV is most closely related to PmNV, a virus infecting the black tiger shrimp, Penaeus monodon. However, despite a high degree of conservation in gene order, the percentage identity of HgNV gene predictions to known annotations was fairly low, averaging just over 38%. Our Maximum Likelihood phylogenies were concordant with previously published trees, which indicate that PmNV may belong to a separate genus within the Nudiviridae; the Gammanudiviruses 12 . Our phylogenetic analyses show that HgNV also belongs to this clade and, together with PmNV, could represent a radiation of nudiviruses infecting diverse aquatic crustacean taxa. Based on the long branch lengths of the neighbouring lineages, it is likely that ToNV and both HzNV-1 and HzNV-2 belong to separate genera; provisionally referred to as Deltanudivirus and Betanudivirus respectively (Fig. 5). We also show that the newly sequenced Drosophila nudiviruses belong to the Alphanudivirus clade and present the most substantial nudivirus phylogeny to date. The multigene phylogeny of the late expression factors provides bootstrap support of 100% in all but one node (98%) (Fig. 5C).

Table fragment (rows for DNA processing genes; column headers not recovered). helicase: 84, 34, 88, 104, 38, 94, 118, 11. helicase2: 66, 108, 46, 60, 76, 76, 105, 83. helicase2**: 69, 108, 46, 60, 76, 79, 105, 83. The rows for integrase, fen-1 and ligase, and one unlabelled leading row, are incomplete.

Nudiviruses contain a distinct repertoire of genes involved in DNA processing, compared to the baculoviruses. Two lef genes and an alk-exo gene are absent in HgNV. Lef-1 has been shown to be associated with DNA primase activity which aids in polymerisation, and lef-2 is thought to stabilize the binding of lef-1 to the DNA molecule 30 , amplifying replication 31 .
HgNV contains two copies of the helicase-2 gene which are also found in the PmNV genome (HgNV_ORF66, HgNV_ORF69). Both genes are predicted to contain features characteristic of helicase activity. An integrase gene is also common to all sequenced nudiviruses and is represented by HgNV_ORF42, which contains the Phage_integrase domain involved in the integration of viral DNA into the host genome. This is noted to facilitate persistent infection of HzNV-1 in its host 32 .
Of the five core genes involved in RNA transcription, P47 (HgNV_ORF08) encodes a viral transcription regulator, involved in late stage infections, reported to make up one of the four subunits of RNA polymerase, whereas the four remaining lef genes are thought to regulate late and very late gene expression, and are named to reflect their synthesis during infection. In contrast, early gene expression is instead initiated by host-derived RNA polymerase 33 .
Baculovirus life cycles are typically split between occlusion-derived virus (ODV) and budded virus (BV) stages. Per os infectivity genes, conserved within the HgNV genome, are required for the infectivity of ODVs that facilitate the transmission of viral particles from one host to the next, whereas BVs spread virions to neighbouring cells. Pif-0, 1 and 2 encode envelope proteins vital for oral infection and are thought to bind virions to the midgut cells 34 . However, pif-3 does not affect midgut binding. Pif-3 is instead hypothesised to interact with the viral cytoskeleton and play a role in translocation of the capsid 35 . It is believed that pif-1, 2, 3 and 4 form a multimolecular protein complex that is vital for oral infectivity, with other pif genes associating with the core complex at a lower affinity 36 . Pif-7, originally described as the ODV-envelope protein Ac110, also associates with the complex but was not found in the HgNV genome. However, Pif-8, previously described as the structural protein vp91/p95, was detected (HgNV_ORF05) and is predicted to contain chitin-binding peritrophin-A domains. The peritrophic membrane surrounds the food bolus and lines the gut of most crustaceans; it serves to separate large particulate matter from the epithelial cells and limits the penetration of microbes 37 . HgNV also encodes a homolog to an 11 K protein noted to enhance oral infection. These 11 K proteins typically contain binding motifs common to mucins, peritrophins and chitinases and could facilitate midgut binding, typically occurring after the alkaline dissolution of the occlusion lattice 38 .
The reduction in genes encoding packaging, assembly and release is likely a reflection of the lack of occlusion bodies in the transmission strategy adopted by the Nudiviridae. HgNV_ORF47 encodes a 38K-like gene which mediates the dephosphorylation of the C terminus of the p6.9 gene; a gene responsible for the encapsulation of the viral genome. Although not detected through BLAST alignment, likely a result of its highly repetitive sequence, HgNV_ORF40 was identified as the p6.9 gene after alignment with other annotated sequences. Furthermore, alignment of the hypothesised PmNV coding region of the p6.9 gene to the HgNV genome corresponds to a region within the HgNV_ORF40 predicted ORF. Similarly to PmNV, HgNV shares two separate sequence homologs to the vlf-1 gene (HgNV_ORF43, HgNV_ORF80), responsible for very late gene expression and proper formation of the nucleocapsid 39 . HgNV also encodes a second major capsid protein: p33/ac92 (HgNV_ORF04). HgNV_ORF04 reports an Erv1_Alr feature, belonging to a family of sulfhydryl oxidases. Prior analyses and purification of ac92 suggest it is a flavin adenine dinucleotide (FAD) containing sulfhydryl oxidase 40 . The major viral capsid protein vp39 is thought to be a core baculovirus/nudivirus gene and was reportedly mislabelled as a 31K-like structural protein in PmNV (ORF_022). HgNV_ORF15 shares 33% identity across 99% of the PmNV_022 sequence and 21% identity across 91% of HzNV-1 ORF89 and HzNV-2 ORF52, also annotated in GenBank as 31K-like proteins. However, the similarity of HgNV_ORF15 with the vp39 genes of other nudiviruses is much lower. Protein domains could not be predicted to aid in its clarification.
The PmNV_099-like coding region is also shared amongst the nudiviruses and HgNV_ORF88 indeed shares 40% sequence identity with PmNV_099, which is described as 'microtubule-associated-like' 12,13 . This gene could play a role in the rearrangement of the host nucleus during viral replication, whereby host chromatin is translocated to the inner nuclear membrane, a process thought to be dependent on viral interaction with host tubulin 35 . Much like ac92, HgNV_ORF88 is also predicted to contain the ERV/ALR sulfhydryl oxidase feature which can play a role in virion assembly by catalysing disulphide bond formation between cysteine residues 41 . In regard to HgNV gene predictions found in the baculoviruses, HgNV_ORF02 shares 35.86% identity to a methyltransferase annotated in the PmNV genome, hypothesised to be involved in viral RNA capping 42 . As is the case with PmNV, HgNV encodes two neighbouring odv-e66 predictions responsible for the trafficking of viral proteins during infection 43 . Odv-e66 was also reported as the first chondroitin lyase 44 . Chondroitin is an extracellular matrix polysaccharide and its degradation by pathogenic bacteria facilitates access to the target cell 44 . Chondroitin AC/ alginate lyase Interpro features were also identified in HgNV_ORF24 and HgNV_ORF25.
There are 37 core genes reported to be conserved amongst the baculoviruses 45 . Assuming the 31K-like gene is in fact a vp39 homolog, HgNV encodes all 21 core baculovirus genes proposed to be conserved across the nudiviruses (Fig. 4). Nearly half of HgNV predicted coding regions were preceded by both early and late promoter regions (Table 3), suggesting plasticity in the way HgNV can regulate gene expression. However, as stated by Bezier et al. 13 , gene expression chronology should not be inferred from promoter motifs alone. Transcriptome analysis of the baculovirus AcMNPV failed to associate reliable sequence motifs with gene expression patterns 46 .
We did not observe evidence of occlusion body formation within our histological or TEM analyses. Similarly, we did not detect sequence homologs of the poly/gran gene, which encodes the protein that forms the structural lattice. Ingestion of occlusion bodies allows passage to the digestive tract, where alkalinity of the gut causes the proteinaceous lattice to dissolve, releasing the virions within and initiating infection 33 . Much like the baculoviruses, the nudiviruses surrounding HgNV (HzNV-2, ToNV and PmNV) can rely on occlusion bodies to facilitate transmission outside of the host. As it would seem that HgNV does not form occlusion bodies, this raises the question of how viral particles remain viable during horizontal transmission. An alternative infection strategy would be that HgNV persists as a latent virus within the host and its evolution has favoured the maintenance of low virulence, which subsequently translates to an increase in transmission through longer lasting infections, as infection does not incapacitate the host. Viruses infecting cells of the digestive tract sloughed out of the animal may remain viable until the degradation of the excreted cell. The ingestion of faeces may therefore serve as a possible route of transmission for HgNV 47 . Latency within the host is a strategy shared by several other shrimp viruses and is supported by field data relating to the infection of the marine shrimp Crangon crangon by a putative nudivirus, where prevalence can reach 100% in wild populations 23,48,49 . Alternatively, HgNV may persist in reservoirs outside of its currently known host.
Due to the short life-cycle and seasonal development of their host, insect viruses, like the baculoviruses, are unable to rely on either latency or reservoir strategies 35 . Therefore, resistant occlusion bodies would ensure viability outside of the host and facilitate transmission to the next. However, compared to penaeid shrimp, lobsters have very long life-cycles (decades). As such, a virus infecting these animals can rely on latency and is not required to survive long periods within the environment. In further support of this theory, occlusion-derived viruses infecting the insect midgut rely on occlusion body-associated enhancins, or similar factors, that digest the chitin lining of the midgut and facilitate entry 35 . However, the hepatopancreas of the lobster is not chitinous 50 . Therefore, HgNV would not depend on OB-associated proteases to gain entry into hepatopancreatic cells. Slack and Arif (2007) hypothesise that baculovirus ancestors were not occluded and instead relied on alkaline proteolytic activation during infection. It is hypothesised that contrasting ecological niches occupied by the insect host life cycle limit baculovirus infection to larval stages 15 . Therefore, occlusion body-facilitated horizontal transmission is vital for its longevity within the environment. The non-occluded nudiviruses, however, have demonstrated their ability to infect adult life stages. Therefore, evolutionary maintenance of occlusion body transmission offers little benefit over vertical transmission or latency within the aging host 15 .
The expanding diversity of the Nudiviridae suggests that lack of occlusion alone is not a distinguishing characteristic of these viruses; several occlude prior to horizontal transmission whereas others do not. It is therefore likely that other characteristics of the genome underlie the separation of the group from the baculoviruses. Little is known about the nudivirus lifecycle and so this, and the means by which they gain entry into the host cell and cause infection, may also serve as discernible features of the proposed genus.
We did not observe any accompanying clinical signs in HgNV-infected individuals. Evidence suggests a persistent asymptomatic virus may even offer benefit to the individuals within an infected population. Invertebrates lack a typical adaptive immune system; however, host cells infected with latent Hz-1 virus (HzNV-1) are resistant to a more virulent infection of the same virus via homologous interference 32 . Nevertheless, despite widespread latency within the Nudiviridae, many cause delayed development and eventually death 13 . Whether HgNV has an effect on growth, development or mortality of the European lobster remains to be shown. Furthermore, environmental and/or physiological stimuli can result in massive viral amplification, which gives even viruses of low virulence the potential to cause mass mortalities within a population 49 . This may be of particular importance as invertebrate aquaculture grows in popularity. The increased prevalence of HgNV in hatchery vs SBCC lobsters suggests either that conditions within SBCC are not conducive to high prevalence (e.g. lower transmission potential) or that lobsters infected with HgNV have higher mortality during early deployment and thus are not present at later stage sampling points. However, in relation to the latter, given that early mortality in SBCC and hatchery populations did not differ (data not shown), HgNV as a driver of mortality in SBCCs appears unlikely. It should be noted that recirculating systems likely serve as drivers for increased prevalence in older hatchery-reared stocks (52-104 weeks post deployment controls) and juvenile lobsters are not typically on-grown in hatchery environments for such extended periods. Further work on the role of HgNV in early life stage growth and mortality is now required.

Materials and Methods
Experimental design and sample collection. Over the period of July 2016 to April 2017, 14,507 hatchery-reared juvenile lobsters were deployed in SBCCs anchored off the coast of Cornwall (St. Austell Bay 50° 18.956 N, 4° 44.063 W). The majority of those deployments (10,987 animals), including those used in the current study, occurred in the summer of 2016. Routine sampling (3, 6, 28, 39, 52 and 104 weeks post deployment) was carried out to monitor the incidence of disease in SBCC populations. In total, 1,698 animals were sampled over the 2-year period. Another set of lobsters were retained within the National Lobster Hatchery, Padstow UK, and sampled at the same time points, over this period. Carapace length and survival were measured at each time point. Upon sampling, larger animals (39-104 weeks post deployment) were anaesthetised on ice for several minutes prior to bisection through the dorsal line. One half was fixed in Davidson's Seawater fixative for histological processing, the other fixed in molecular grade ethanol for sequence analysis. A small piece of hepatopancreas was removed and fixed for transmission electron microscopy. Smaller animals (0-28 weeks post deployment) were fixed whole and underwent separate analyses.
Six juvenile lobsters displaying pronounced histopathology associated with viral infection were selected from hatchery and SBCC settings, allowing for comparative molecular and transmission electron microscopy analyses. Five of the six animals had spent one to two years growing in controlled hatchery raceways. The remaining individual had spent 52 weeks in SBCC in the open sea.
Histopathology. Bisected lobsters were placed into histological cassettes and fixed in Davidson's Seawater Fixative for 24-48 h before transfer to 70% industrial denatured alcohol (IDA). Cassettes were processed using a Leica Peloris Rapid Tissue Processor and subsequently embedded in paraffin wax. Histological sections were cut using a rotary microtome set at 3 µm thickness, adhered to glass slides and stained using a standard haematoxylin and eosin (H&E) protocol. Slides were examined using a Nikon Eclipse light microscope and NIS imaging software at the International Centre of Excellence for Aquatic Animal Health at the Cefas Laboratory, Weymouth, UK.

(2019) 9:10086 | https://doi.org/10.1038/s41598-019-46008-y www.nature.com/scientificreports

Transmission electron microscopy. Hepatopancreas samples were fixed in 2.5% glutaraldehyde in 0.1 M sodium cacodylate buffer (pH 7.4) and later rinsed in 0.1 M sodium cacodylate buffer prior to processing. Post-fixation was carried out in 1% osmium tetroxide/0.1 M sodium cacodylate buffer for 1 h. Tissues were washed in three changes of 0.01 M sodium cacodylate buffer and were subsequently dehydrated through a graded acetone series before embedding in Agar 100 epoxy resin (Agar Scientific, Agar 100 pre-mix kit medium). Embedded tissues were polymerised overnight at 60 °C. Semi-thin (1-2 µm) sections were cut and stained with Toluidine blue for viewing with a light microscope to identify suitable target areas. Ultra-thin sections (70-90 nm) of targeted areas were mounted on uncoated copper grids and stained with 2% aqueous uranyl acetate and Reynolds' lead citrate 51 . Grids were examined using a JEOL JEM1400 transmission electron microscope and digital images captured using an AMT XR80 camera and AMT V602 software.
DNA extraction and sequencing. DNA for genomic reconstruction was extracted using the CTAB/phenol:chloroform extraction protocol as described in Holt et al. 52 . DNA for HgNV screens of HP tissue was extracted using the EZ1 Advanced XL and DNA Tissue Kit (Qiagen). Extracted DNA was cleaned with polyethylene-glycol (PEG) precipitation and submitted to the sequencing service at the University of Exeter, UK for shotgun library preparation using the TruSeq DNA PCR-Free Library Prep Kit. Pooled libraries underwent high-throughput sequencing using an Illumina MiSeq (2 × 300 bp).
Sequence analysis. The raw Illumina paired-end sequence reads were quality-checked using FastQC v0.11.4 53 and subsequently trimmed to remove adapter sequences and low-quality bases using Trimmomatic v0.36 54 . Sequence reads were error-corrected and digitally normalised using BBNorm (part of BBMap v38.22) 55 and the reads of each sample were assembled individually with Unicycler v0.4.7 (using default parameters and --no_correct) 56 . Quality-trimmed paired reads from individual samples were also assembled de novo using the A5-miseq assembly pipeline 57 . Contigs representing putative HgNV were aligned using progressiveMauve (build date Jun 26 2018) 58 to obtain a consensus sequence. To identify viral contigs, Prokka 59 was used to identify protein-coding regions spanning the assembled contigs, and these were subsequently annotated using the BLASTp algorithm of Diamond v0.7.9 60 and the full NCBI non-redundant (nr) protein database (20170515). Sequences representing dsDNA viruses were identified by visualising the Diamond output in MEGAN6 Community Edition v6.5.5 61 and the corresponding contig sequences were extracted. Paired reads from all samples were subsequently mapped to the candidate genome contigs using BWA-MEM 0.7.12-r1039 62 and visualised with the Integrative Genomics Viewer (IGV) v2.3.68 63 . Assembly quality and accuracy were assessed with QualiMap v2.0 64 and REAPR (version 1.0.18) 65 . Open reading frames (ORFs) were predicted using three different tools: Prokka, FGENESV0 (softberry.com) and GeneMarkS 66 (amino acid size of 50, circular genome). ORFs that were supported by two or more programs were analysed further. In cases where multiple ORFs were predicted to overlap, the largest sequence was chosen. Supported ORFs were annotated using NCBI BLASTp and the full NCBI nr protein sequence database (20180803).
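The ORF selection rule described above (support from at least two of the three predictors, with the largest candidate retained where predictions overlap) can be illustrated with a minimal Python sketch. It is not the procedure actually used here: it assumes, for illustration only, that two predictions describe the same ORF when they share a strand and a stop-codon position, and it ignores ORFs that wrap around the origin of the circular genome. The `Orf` record, the `stop_key` and `consensus_orfs` helpers and the example coordinates are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Orf(NamedTuple):
    start: int   # 1-based, lowest genome coordinate
    end: int     # 1-based, highest genome coordinate
    strand: str  # "+" or "-"


def stop_key(orf):
    # Assumption: predictions sharing strand and stop codon are the same ORF.
    # The stop codon lies at the high coordinate on "+" and the low one on "-".
    return (orf.strand, orf.end if orf.strand == "+" else orf.start)


def consensus_orfs(predictions, min_support=2):
    """Keep ORFs called by >= min_support tools; prefer the longest candidate."""
    calls = defaultdict(list)
    for tool, orfs in predictions.items():
        for orf in orfs:
            calls[stop_key(orf)].append((tool, orf))

    kept = []
    for group in calls.values():
        tools = {tool for tool, _ in group}
        if len(tools) >= min_support:
            # Overlapping candidates for one ORF: retain the largest.
            kept.append(max((orf for _, orf in group),
                            key=lambda o: o.end - o.start))
    return sorted(kept)


# Invented coordinates, for illustration only.
example = {
    "prokka": [Orf(100, 400, "+"), Orf(900, 1200, "-")],
    "fgenesv0": [Orf(40, 400, "+")],
    "genemarks": [Orf(2000, 2300, "+")],
}
print(consensus_orfs(example))
```

In this toy input only the ORF ending at position 400 is supported by two tools, and the longer of its two candidate starts (position 40) is retained.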
Tandem repeats within the final assembled genome were identified using Tandem Repeats Finder with default parameters 67 . Repetitive regions with an alignment score of 100 or more were further analysed for palindromic sequences using the MEME program and a minimal size of 20 bp 68 . Promoter sequences were located within 300 nucleotides upstream of predicted ORF start codons using the Geneious software package v.11.1.4 69 . A circular map of the HgNV genome was plotted using shinyCircos 70 . The assembled HgNV genome and corresponding ORF predictions are deposited in GenBank under the genome accession number MK439999.
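Because the genome is circular, the 300 nucleotide window upstream of a start codon may cross the coordinate origin. The following Python sketch shows one way such a window could be extracted with wrap-around. It does not describe how Geneious performs the search; the function name `upstream_region`, its 1-based coordinate convention and the toy sequence are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, start, strand, length=300):
    """Return `length` nt upstream of a start codon on a circular genome.

    `start` is the 1-based coordinate of the first base of the start codon
    (the highest ORF coordinate for a "-" strand ORF). The result is read
    5'->3' on the coding strand.
    """
    n = len(genome)
    if strand == "+":
        # Bases immediately before the start codon, wrapping past the origin.
        idx = [(start - 1 - length + i) % n for i in range(length)]
        return "".join(genome[i] for i in idx)
    # On the "-" strand, upstream lies at higher forward-strand coordinates.
    idx = [(start + i) % n for i in range(length)]
    return reverse_complement("".join(genome[i] for i in idx))


# Toy 12 nt "genome", for illustration only.
toy = "AAACCCGGGTTT"
print(upstream_region(toy, 2, "+", length=4))
print(upstream_region(toy, 5, "-", length=3))
```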

Molecular confirmation of genome assembly.
To resolve ambiguous regions of the genome assembly, primers were designed to span areas of lower coverage and queried INDELs. PCR amplification was performed in 50 µL volumes using 10 µL of Promega 5X Green GoTaq Flexi Buffer, 5 µL of MgCl2, 0.5 µL of each primer (final concentration 1 µM), 0.5 µL of dNTPs, 0.25 µL of GoTaq DNA Polymerase, 30.75 µL of molecular grade water and 2.5 µL of template DNA. Initial denaturation was carried out at 94 °C for 2 min. This was followed by 30 PCR cycles of denaturation at 94 °C for 1 min, annealing at 60 °C for 1 min and extension at 72 °C for 1 min, followed by a final extension at 72 °C for 5 min. Sequenced amplicons were aligned to the candidate genome using the multiple sequence alignment program MAFFT Version 7 71 and the assembly was assessed across the query regions.
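The reaction recipe above can be checked arithmetically. The short Python sketch below confirms that the listed components sum to the stated 50 µL and derives the primer stock concentration implied by 0.5 µL giving 1 µM in 50 µL. The stock concentration is inferred from these numbers and is not stated in the protocol. The `master_mix` helper and its 10% pipetting overage are illustrative assumptions, not part of the published method.

```python
# Per-reaction volumes in µL, as listed in the protocol.
REACTION_UL = {
    "5X Green GoTaq Flexi Buffer": 10.0,
    "MgCl2": 5.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "dNTPs": 0.5,
    "GoTaq DNA Polymerase": 0.25,
    "molecular grade water": 30.75,
    "template DNA": 2.5,
}

TOTAL_UL = sum(REACTION_UL.values())


def stock_concentration_um(volume_ul, final_um, total_ul=TOTAL_UL):
    """C1 = C2 * V2 / V1: stock concentration implied by a final concentration."""
    return final_um * total_ul / volume_ul


def master_mix(n_reactions, overage=0.10):
    """Scale all components except template (hypothetical 10% overage)."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in REACTION_UL.items()
            if name != "template DNA"}


print(TOTAL_UL)
print(stock_concentration_um(REACTION_UL["forward primer"], final_um=1.0))
print(master_mix(10))
```

Template DNA is left out of the scaled mix because it is added to each tube individually.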
Diagnostic primers were designed from an alignment of DNA polymerase gene sequences. HgNV_DNAPol_F1: 5′ ACTTGAAGCTGTGCGTGACT 3′ and HgNV_DNAPol_R1: 5′ TGTATGTCTTGCGGCCCATT 3′ produce an amplicon of 383 bp and anneal only to HgNV when queried with Primer-BLAST against the nr database. PCR conditions were as above. Amplicons were cleaned with the GeneJET PCR Purification Kit (Thermo, US) and sequenced via the Eurofins TubeSeq service. HgNV_DNAPol primers were tested on 150 SBCC and 12 hatchery lobsters (sampled at 52 weeks post-deployment). Shrimp tissues infected with WSSV and a putative nudivirus were tested as negative controls and did not amplify.
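The expected product size of a primer pair can be sanity-checked in silico by locating the forward primer and the reverse complement of the reverse primer on a template. The Python sketch below uses the two published primer sequences, but the template is synthetic, built so that the product is 383 bp by construction, and is not the HgNV DNA polymerase gene. The helper names are hypothetical, only exact matches on a linear template are considered, and the sketch is no substitute for the Primer-BLAST specificity check.

```python
FORWARD = "ACTTGAAGCTGTGCGTGACT"   # HgNV_DNAPol_F1
REVERSE = "TGTATGTCTTGCGGCCCATT"   # HgNV_DNAPol_R1

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def amplicon_length(template, forward, reverse):
    """Product length for exact primer matches on a linear template, else None."""
    fwd_at = template.find(forward)
    if fwd_at == -1:
        return None
    # The reverse primer anneals to the opposite strand, so search the
    # template for its reverse complement downstream of the forward primer.
    rev_site = reverse_complement(reverse)
    rev_at = template.find(rev_site, fwd_at + len(forward))
    if rev_at == -1:
        return None
    return rev_at + len(rev_site) - fwd_at


# Synthetic template: flank + forward primer + 343 nt spacer + reverse site + flank.
synthetic = "GG" + FORWARD + "C" * 343 + reverse_complement(REVERSE) + "TT"

print(gc_fraction(FORWARD), gc_fraction(REVERSE))
print(amplicon_length(synthetic, FORWARD, REVERSE))
```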
Phylogenetic tree construction. Homologous target genes were aligned using the multiple sequence alignment program MAFFT Version 7 71 with the E-INS-i iterative refinement method. Multigene alignments were constructed by concatenating gene sequences prior to alignment. Maximum likelihood phylogenetic trees were inferred using RAxML-HPC BlackBox version 8 72 on the CIPRES Science Gateway 73 using a generalised time-reversible (GTR) model with CAT approximation (all parameters estimated from the data).
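The concatenation step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the script used for the analysis: sequences are joined per taxon in a fixed gene order, and taxa missing any gene are dropped, which is an assumption made here and not a detail given in the text. The function name, gene labels, taxon labels and sequences are all invented for the example.

```python
def concatenate_genes(genes, order):
    """Join each taxon's sequences in a fixed gene order.

    `genes` maps gene name -> {taxon: unaligned sequence}. Only taxa with a
    sequence for every gene in `order` are kept (an illustrative assumption).
    """
    shared = set.intersection(*(set(genes[g]) for g in order))
    return {taxon: "".join(genes[g][taxon] for g in order)
            for taxon in sorted(shared)}


# Invented toy data, for illustration only.
toy_genes = {
    "geneA": {"taxon1": "MKT", "taxon2": "MRT", "taxon3": "MKS"},
    "geneB": {"taxon1": "LLV", "taxon2": "LIV"},
}

for taxon, seq in concatenate_genes(toy_genes, ["geneA", "geneB"]).items():
    # FASTA-style output that could then be passed to an aligner.
    print(f">{taxon}\n{seq}")
```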
In-situ hybridisation. An extended HgNV-specific DNA polymerase probe, which spanned the HgNV_DNAPol amplicon sequence, was designed to optimise the hybridisation protocol. HgNV_DNAPol_ISH_1838f: 5′ AGATTGAGCAGAGTGTAGCCC 3′ and HgNV_DNAPol_ISH_2799R: 5′ ACCTTCCGATGATAGTTCTTCC 3′ produce an amplicon of 961 bp. In-situ hybridisation of the extended HgNV probe was carried out following the protocol described by Bojko et al. 2019 74 using a 2X washing buffer (20X SSC, 0.2% BSA, 6 M Urea). However, NBT/BCIP incubation was limited to 15 min and slides were instead counterstained with 0.1% Fast Green solution.

Data Availability
Sequences have been deposited in GenBank under the BioProject PRJNA516791.