CRISPR/Cas-9 mediated knock-in by homology dependent repair in the West Nile Virus vector Culex quinquefasciatus Say

Culex quinquefasciatus Say is a mosquito distributed in both tropical and subtropical regions of the world. It is a night-active, opportunistic blood-feeder and vectors many animal and human diseases, including West Nile Virus and avian malaria. Current vector control methods (e.g. physical/chemical) are increasingly ineffective; use of insecticides also imposes hazards to both human and ecosystem health. Advances in genome editing have allowed the development of genetic insect control methods, which are species-specific and, theoretically, highly effective. CRISPR/Cas9 is a bacteria-derived programmable gene editing tool that is functional in a range of species. We describe the first successful germline gene knock-in by homology dependent repair in C. quinquefasciatus. Using CRISPR/Cas9, we integrated an sgRNA expression cassette and marker gene encoding a fluorescent protein fluorophore (Hr5/IE1-DsRed, Cq7SK-sgRNA) into the kynurenine 3-monooxygenase (kmo) gene. We achieved a minimum transformation rate of 2.8%, similar to rates in other mosquito species. Precise knock-in at the intended locus was confirmed. Insertion homozygotes displayed a white eye phenotype in early-mid larvae and a recessive lethal phenotype by pupation. This work provides an efficient method for engineering C. quinquefasciatus, providing a new tool for developing genetic control tools for this vector.

Culex quinquefasciatus Say, also known as the southern house mosquito, is part of the Culex pipiens complex. The female mosquito largely feeds on birds, and is a major vector of many veterinary diseases. These include avian malaria, which has been identified as a key factor in a number of extinctions of avian species, and a significant pressure on currently endangered ones 1,2. However, these opportunistic blood-feeders will also target mammals, acting as a bridge for disease transmission between avian species and mammalian hosts, thus posing a significant threat to human health, as seen with some West Nile virus (WNV) outbreaks 3. In the USA alone, since the introduction of WNV in 1999, there have been 48,000 reported human cases of WNV, for which the mosquito acts as an efficient vector 4. Since humans are thought to be a dead-end host for WNV, each of these likely represents mosquito-vectored avian-to-human transmission. Horses are also vulnerable to WNV, with about 25,000 cases in the US (1999-2015) and a case fatality rate of about 33% 5. In addition, C. quinquefasciatus is a competent vector of the St. Louis encephalitis virus and eastern equine encephalitis virus 6.
The major human health impact of C. quinquefasciatus globally is as a vector of lymphatic filariasis (LF). LF is a neglected tropical disease caused by a nematode, which can live up to 6-8 years inside a human host, causing disruption and permanent damage to the lymphatic system; despite extensive control efforts there are still an estimated 50 million cases worldwide 7,8 . Eradication of this disease does not currently appear likely despite great efforts with mass drug administration programs, leading to increasing focus on complementary vector control strategies 9 .
Vector control methods currently used for other mosquito species, such as removing or chemically treating larval habitats, are of limited effectiveness in large scale implementation. Moreover, some breeding sites may be impractical or difficult to locate 10. In addition, most current control strategies are heavily dependent on the use of insecticides, which can have major impacts on non-target species supporting local ecosystems 11. Furthermore, insecticide use is under threat from rising resistance [12][13][14]. New strategies and targets for vector control are therefore urgently required. Genetic approaches potentially provide this, and may be more suitable for large scale implementation, in addition to having a lesser impact on non-target species 11,15. The ability to edit genes allows the characterization of new targets, but also opens the door to the implementation of genetic manipulation of key vector populations. This might aim to suppress the vector populations, or insert genes which reduce vector competence. Introgressing such traits into wild vector populations might be through mass-release, or by using gene drive systems to amplify the effect of relatively small initial releases 15,16. The recent availability of efficient gene-editing tools such as clustered regularly interspaced short palindromic repeats-associated protein 9 (CRISPR/Cas9) has made many of these approaches more feasible. The Cas9 endonuclease is typically paired with one or more synthetic single guide RNAs (sgRNAs); each sgRNA sequence is complementary to a target site in the genome. When the sgRNA binds its target DNA, the Cas9 protein induces a double stranded break in the DNA. This can be repaired either by non-homologous end joining (NHEJ) or homology directed repair (HDR) 17.
Repair through the NHEJ pathway typically results in small indel (insertion/deletion) mutations which can be useful in assessing the function of target genes, if they result in a frame-shift or loss of an important protein function (knock-out). HDR-based repair can be utilised to 'knock-in' an exogenous DNA sequence, if such a DNA sequence is provided with flanking regions homologous to the endogenous break site, for example by co-injection of a 'repair-template' plasmid alongside CRISPR components. Alternatively, sgRNA and Cas9 components can be integrated into the germline of a target species whereby the two repair pathways can be harnessed as mechanisms for gene drive. For example, HDR can be used for homing-based drives [18][19][20] while NHEJ repair of essential genes can form the basis for 'break and repair' based systems such as ClvR or TARE 21,22 .
Regarding mosquitoes, CRISPR/Cas9 has been used successfully to generate germline and somatic knock-in mutations in Aedes and Anopheles species through the HDR repair mechanism 18,23-27 . However, to date, only use of the NHEJ pathway to generate 'knock-out' mutations has been reported in C. quinquefasciatus [28][29][30][31] . Only a few early studies successfully generated transgenics using transposon-mediated transformation in this species, using a Hermes-based vector 32,33 . Our own attempts to generate transgenics with piggyBac-based vectors did not recover any transgenics using both plasmid 34 and in vitro transcribed mRNA 35 as transposase sources.
In this study, using the kynurenine 3-monooxygenase gene (kmo, also known as kynurenine hydroxylase or kh) as a target, we demonstrate for the first time the ability of the CRISPR/Cas9 system to generate knock-in mutations in C. quinquefasciatus. As well as a successful proof of concept for this technology, our chosen integrated components (an RNA polymerase III promoter expressing a sgRNA, itself targeting the knock-in site), if shown to be functional in vivo, will be of use in assessing the potential of a CRISPR/Cas9 homing drive-based approach in this globally important mosquito pest.

CRISPR/Cas9 based site specific insertion into kmo.
[Figure 1 caption, partial] In addition to a fluorescent marker, the knock-in cassette contains the C. quinquefasciatus 7SK promoter, an RNA Pol III promoter which expresses the same sgRNA used to integrate the cassette into kmo. This would allow an integration of this cassette to be combined with future cas9 expressing lines to test for homing activity.
Our earlier work identified a region homologous to the Aedes aegypti kmo gene which yielded a white-eyed phenotype when disrupted by CRISPR/Cas9 28. The target site of the most active sgRNA (LA935) was selected for knock-in experiments. Two independent rounds of embryonic microinjections were performed using a CRISPR/Cas9 HDR donor template (Fig. 1). In the initial round of injections, 384 embryos were injected. Of these, 170 hatched as first instar larvae of which 57 (14.8%) survived to adulthood. In total six pools were generated from these injection survivors (up to 10 G0 survivors per pool) and G1 larvae were collected and screened from four oviposition cycles (Table 1). No fluorescent larvae were identified from this series of injections, from a total of 5313 G1 larvae screened. A subsequent series of injections was performed on 601 embryos and of these 141 larvae hatched and 71 survived to adulthood (11.8%). In total 4441 G1 larvae were screened from three ovipositions of these additional 8 pools (~10 G0 adults per pool) and we identified larvae with DsRed fluorescence in two pools (K and L). A total of 60 positive individuals were identified from pool K and 48 from pool L. The HDR integration rate for the second round of injections was calculated to be a minimum of 2.8%; combining both experiments, to avoid artificially discounting null results, would indicate a minimum transformation rate of 1.6%. "Transformation rate" is typically defined as the number of independent integration events identified per fertile G0 adult injection survivor.
Pooling G0 adults, required here for efficient recovery of G1 progeny, means that we do not know what proportion of G0 adults were fertile; the efficiency calculation therefore uses total G0 adults instead. The calculated rate also assumes that all positive G1 larvae recovered from a single pool were the result of a single integration event, here assuming that the 60 positives identified from pool K represent one integration event and the 48 from pool L another. This likely underestimates the true rate, so we refer to such estimates as "minimum estimates".
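The arithmetic behind these minimum estimates can be reproduced with a short sketch. It uses only the counts reported above (embryos injected, G0 adults, and one assumed integration event per positive pool); the function and variable names are illustrative and not part of the original analysis.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Counts reported in the text. "events" is the number of positive pools,
# each conservatively counted as a single independent integration event.
round1 = {"embryos": 384, "g0_adults": 57, "events": 0}
round2 = {"embryos": 601, "g0_adults": 71, "events": 2}  # pools K and L

# Survival of injected embryos to adulthood.
survival_round1 = percent(round1["g0_adults"], round1["embryos"])
survival_round2 = percent(round2["g0_adults"], round2["embryos"])

# Minimum transformation rate: events per G0 adult (fertility unknown,
# so all G0 adults are used as the denominator).
rate_round2 = percent(round2["events"], round2["g0_adults"])
rate_combined = percent(
    round1["events"] + round2["events"],
    round1["g0_adults"] + round2["g0_adults"],
)

print(f"Round 1 survival:  {survival_round1:.1f}%")
print(f"Round 2 survival:  {survival_round2:.1f}%")
print(f"Round 2 min. rate: {rate_round2:.1f}%")
print(f"Combined min. rate: {rate_combined:.1f}%")
```

Because the denominator includes G0 adults of unknown fertility and each positive pool is counted once, both rates are lower bounds, consistent with the "minimum estimate" wording.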

Molecular confirmation of kmo insertion.
Successful knock-in of the HDR construct at the kmo locus was confirmed by PCR (Fig. 2). Representative fluorescent individuals from both pools K and L produced amplicons of expected size. This PCR assay produced multiple non-specific amplification products in WT samples, so putative integrations were confirmed by Sanger sequencing (Fig. S2). A second diagnostic PCR with the two

Generation of homozygous lines.
To assess the viability of homozygous kmo insertions, G1 fluorescent individuals from pools K and L were sibling crossed (within pool crosses) and also G1 fluorescent males from pool L were crossed to fluorescent females from pool K (between pool crosses). The inheritance pattern in the offspring (G2) was expected to be 25% homozygous for kmo knock-in (red fluorescent and white-eyed), 50% heterozygous (red fluorescent and WT eyed) and 25% wildtype (non-fluorescent, WT eyed) and was assessed at the larval stages in these crosses. The phenotypes of individuals of these three genotypes are shown in Fig. 3. Larvae which were homozygous for the kmo knock-in displayed brighter fluorescence when compared to the heterozygotes (Fig. 3a), presumably due to having two copies of the transgene. Loss of eye pigmentation (white-eyed phenotype) and bright fluorescence were observed in 19.9% of the G2 larvae in pool K within pool cross, 15.8% in pool L within pool cross and 19.8% in offspring of the L × K (between pool) cross (Table 2), identifying these individuals as homozygous for the insertion. Heterozygotes (56% in pool K, 51.5% in pool L and 50% in pool L × K) and wild-type (24.1% in pool K, 32.7% in pool L and 30.6% in pool L × K) were observed at approximately Mendelian rates (Table 2). The observation of white eyes only when associated with bright, fluorescent individuals provides additional evidence to suggest that the HDR integration sites are within the kmo locus.
The decrease from the expected 25% frequency, as well as an observed slow growth of homozygous kmo knock-in individuals, suggests that significant recessive fitness costs were associated with these insertions. This was statistically assessed using a one-tailed Chi-square goodness of fit test and an expected ratio of 1:2:1 (kmo knock-in homozygotes: kmo knock-in heterozygotes: WT). For cross 1 (within pool K), no significant deviation was observed (χ² = 3.44, d.f. = 2, p = 0.18). For the within pool L and between pool K × L crosses, however, significant deviations from the expected ratio were observed (χ² = 15.8, d.f. = 2, p = 0.000372; χ² = 13.5, d.f. = 2, p = 0.00116, respectively). In all three crosses, numbers of kmo knock-in homozygotes were less than expected and it is possible that pool K would show a significant deviation at increased sample sizes, in line with the other two crosses. Furthermore, we observed that no homozygous transgenic larvae survived to pupation in any of the crosses conducted, indicating recessive lethality, while wild-type and heterozygous larvae pupated normally. As these effects were similar for the L × K offspring as in the within pool crosses, this suggests that the insertion itself is responsible, though in a laboratory strain of presumed limited genetic diversity it is theoretically possible that each independent insertion shares the same tightly-linked recessive lethal allele.
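As a minimal sketch of the test described above, the snippet below computes a chi-square goodness-of-fit statistic against a 1:2:1 ratio and converts a statistic to a p-value. It relies on the fact that, for exactly two degrees of freedom, the upper-tail probability of the chi-square distribution is exp(−χ²/2). The raw larval counts are in Table 2 and are not reproduced in this text, so the counts in the example are hypothetical; only the three χ² values are taken from the paper.

```python
import math


def chi_square_gof(observed, expected_ratio=(1, 2, 1)):
    """Chi-square goodness-of-fit statistic for counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(chi2):
    """Upper-tail p-value for a chi-square statistic with exactly 2 d.f."""
    return math.exp(-chi2 / 2.0)


# Reported statistics (homozygote : heterozygote : WT, d.f. = 2).
reported_chi2 = {"within pool K": 3.44, "within pool L": 15.8, "L x K": 13.5}
p_values = {cross: p_value_df2(x) for cross, x in reported_chi2.items()}

for cross, p in p_values.items():
    print(f"{cross}: chi2 = {reported_chi2[cross]}, p = {p:.3g}")

# HYPOTHETICAL counts, for illustration only (not the Table 2 data).
example_counts = [20, 50, 30]
example_chi2 = chi_square_gof(example_counts)
print(f"example: chi2 = {example_chi2:.2f}, p = {p_value_df2(example_chi2):.3f}")
```

The p-values recovered this way agree with those reported to within the rounding of the published χ² values. For other degrees of freedom a general survival function (for example from a statistics library) would be needed instead of the closed form used here.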

Discussion
Development of next-generation genetics-based control strategies such as gene drives requires an efficient and precise method for integrating transgenic sequences into the germline of a target organism. Previous efforts to utilise the piggyBac transposase system for this purpose in C. quinquefasciatus were unsuccessful, despite multiple different forms of transposase being utilised 36. This was surprising, given the extremely broad range of species, across multiple phyla, in which piggyBac has been shown to be an efficient tool for germline integration 37. Whilst this poses interesting fundamental questions, the potential damage caused by this mosquito species, and its rapid spread into new habitats around the globe, necessitates the rapid development of orthogonal technologies to act as the building blocks for novel control systems such as gene drives. Previous work demonstrated that the Hermes transposase system was functional in C. quinquefasciatus, but resulted in the non-canonical integration of plasmid backbone sequences alongside those 'desired' components within the Hermes flanks. This is undesirable for the development of genetic control tools designed to be released into the wild, as the presence of these sequences would likely complicate the regulatory process for such lines. As with the piggyBac system, Hermes also operates through a semi-random integration method, making it inappropriate for many of the more powerful gene drive designs, which require precise integration of transgenic components into target loci.
Here we provide a solution to both these issues by demonstrating the functionality of CRISPR/Cas9-based knock-in in C. quinquefasciatus. Assessment of the flanking regions of the two established lines showed that integration was precise at the target cut site with no integration of undesirable backbone components. Additionally, our overall 1.6% transformation rate (2.8% for the second round of injections) suggests a relatively efficient process and certainly one which is efficient enough for the testing of different exogenous components. As an example, our knock-in cassette was built to include an RNA Polymerase III promoter previously found to be highly active in C. quinquefasciatus Hsu cells 38 which drove in vivo expression of the same sgRNA used to integrate the cassette. This represents the first step towards testing of the homing-drive concept in C. quinquefasciatus through a 'split-drive' design 20 . Further work will be required to assess the functionality of the chosen 7SK promoter to express the sgRNA in suitable germline tissues and to develop other lines capable of expressing Cas9 with compatible spatial and temporal characteristics.
Interestingly, during our experiments we observed a severe recessive fitness cost associated with the kmo transgene integration, resulting in death of all homozygous individuals prior to pupation. This was surprising for two reasons. Firstly, our previous work generating a frame-shift deletion at this locus in C. quinquefasciatus, using the same sgRNA as utilised to specify the integration site here, did not result in such a lethal phenotype, although a significant sub-lethal fitness cost was observed 28. Secondly, our unpublished work generating a similar knock-in in the homologous exon of the kmo gene of another culicine mosquito, Aedes aegypti, did not result in a recessive lethal phenotype. The situation in C. quinquefasciatus appears more similar to that of the relatively distantly related Anopheles stephensi, where severe fitness costs associated with kmo knock-in/kmo knockout homozygotes resulted in high levels of adult female lethality post blood-feeding, and significant reductions in egg-laying for surviving females 25, though with little apparent effect on males, whereas both sexes were affected in our C. quinquefasciatus knock-in lines. This fitness cost was harnessed as a resistance management mechanism in next-generation A. stephensi gene drives, where the drive provides some rescue against these costs, theoretically reducing the fitness benefit of potentially arising resistance alleles 39. This provides a framework for similar future designs in C. quinquefasciatus. The basis for the observed differences between homozygous viability for our C. quinquefasciatus knock-in and knock-out lines is unclear. It may be that the original mutant lines retain some activity, i.e. are hypomorphic, even though sequencing identified them as frame-shift mutants; possibly there are alternative splicing variants, or a truncated protein retains some activity.
It is also possible that the observed lethality is associated with a closely linked background mutation, though it seems unlikely that this would be present in both knock-ins but not the original knock-outs, all of which were generated from the same wild-type colony. Further research is required to explore this and other potential explanations.
It is hoped that this work will provide a springboard for those researchers interested in developing homing-based and other gene drive strategies in this pernicious global pest.

Materials and methods
Mosquito rearing. Both wild type TPRI (Tropical Pesticides Research Institute) C. quinquefasciatus 28 and the kmo HDR knock-in lines (AGG2069) were maintained at 28 °C, 70% humidity and 12 h day-night cycle in an insectary as previously described 28. Egg rafts were collected from adult cages in a 150 ml plastic container filled with horse hay infused water. Mosquito larvae were fed with pelleted pond fish food. Adult mosquitoes were fed ad libitum with 10% sucrose solution. A Hemotek system (Hemotek, Blackburn, UK) was used to provide defibrinated horse blood (TCS Biosciences, Buckingham, UK) through sausage casing and a layer of Parafilm.
Plasmid design and cloning. The kynurenine 3-monooxygenase (kmo) gene (CPIJ07147) of C. quinquefasciatus was identified and sequence confirmed as previously described 28. Approximately 2 kb upstream and downstream of the precise sgRNA cut site were used as homology arms and synthesised by Twist Bioscience (San Diego, California, USA). A cassette containing the Cq7SK promoter 38 was PCR amplified from the plasmid AGG1127 and the sgRNA sequence was added using the oligos designed for the PCR. The Hr5/IE1-DsRed-p10 3′UTR cassette was amplified by PCR from another plasmid (AGG1906). All the fragments were gel purified and assembled using the NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs, Ipswich, MA, USA). Complete AGG2069 plasmid sequence has been deposited to NCBI (Accession number: MW417419).
LA935 5′-GAA ATT AAT ACG ACT CAC TAT AGG ACA GTG CGG TCC GCA AGG GTT TTA GAG CTA GAA A-3′ and LA137 5′-AAA AGC ACC GAC TCG GTG CCA CTT TTT CAA GTT GAT AAC GGA CTA GCC TTA TTT TAA CTT GCT ATT TCT AGC TCT AAA AC-3′) containing the T7 promoter using the MEGAscript T7 Transcription kit (ThermoFisher Scientific, Waltham, MA, USA). The reaction mix was incubated at 37 °C for 16 h and purified using the MEGAclear Transcription Clean-Up kit (ThermoFisher Scientific, Waltham, MA, USA). Injection mix (20 µl) was made with the following components: AGG2069 (HDR donor plasmid, 800 ng/µl), LA935 sgRNA (40 ng/µl), Cas9 protein (300 ng/µl), 10× Injection Buffer 40 (2 µl), DEPC water (up to 20 µl). The assembled injection mix was incubated at 37 °C for 20 min to pre-complex the Cas9 and sgRNA, then centrifuged at 11,000×g and 4 °C for at least 10 min. Mix was maintained on ice throughout injection.
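The two oligonucleotides listed above can be checked for 3′ complementarity with a short sketch. This assumes, since the sentence describing template preparation is incomplete in this text, that LA935 and LA137 are meant to anneal at their 3′ ends to form a T7-promoter-containing sgRNA template; it also assumes that the hyphens and irregular spacing inside the printed sequences are typesetting artefacts and can be removed. Function names are illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def three_prime_overlap(forward: str, reverse: str) -> str:
    """Longest suffix of `forward` that pairs with the 3' end of `reverse`."""
    reverse_rc = reverse_complement(reverse)
    for k in range(min(len(forward), len(reverse_rc)), 0, -1):
        if forward.endswith(reverse_rc[:k]):
            return reverse_rc[:k]
    return ""


# Sequences as printed in the text, with spaces and hyphens removed
# (assumed to be line-break artefacts).
LA935 = "GAAATTAATACGACTCACTATAGGACAGTGCGGTCCGCAAGGGTTTTAGAGCTAGAAA"
LA137 = ("AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTT"
         "GCTATTTCTAGCTCTAAAAC")

# Minimal T7 promoter sequence, used here to confirm its presence in LA935.
T7_PROMOTER = "TAATACGACTCACTATAG"

overlap = three_prime_overlap(LA935, LA137)
has_t7 = T7_PROMOTER in LA935

print(f"3' overlap ({len(overlap)} nt): {overlap}")
print(f"T7 promoter present in LA935: {has_t7}")
```

Under these assumptions the oligos share a short complementary region at their 3′ ends, and LA935 carries the T7 promoter, which is consistent with the statement that the template contains the T7 promoter.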
Embryonic microinjections. Culex embryo microinjections were performed as previously described 28. A microscopic injection station equipped with FemtoJet 4× microinjector (Eppendorf, Hamburg, DE) was used for injections. Injections were carried out using quartz capillaries (0.7 mm internal diameter and 1.0 mm external diameter) pulled into needles using a Sutter P2000 laser based micro-pipette needle puller (Sutter Instruments, Novato, CA, USA) and the following program: HEAT = 729, FIL = 4, VEL = 40, DEL = 128, PUL = 134, Line = 1. A clear plastic cup containing approximately 100 ml of hay infused water was placed into the adult cages on the 5th day after a blood meal. Cages were placed in the dark to encourage egg laying and females were allowed to lay for 45-60 min. The egg rafts were disaggregated and aligned horizontally on a piece of moistened chromatography paper and against a nitrocellulose membrane (GE Healthcare, Amersham, UK). Lines of embryos were then transferred to a piece of Scotch double-sided tape 665 (3M, USA) on a plastic coverslip. Prepared eggs were covered with Halocarbon oil 27 (Sigma Aldrich, Gillingham, UK) to prevent desiccation and injected. Injected eggs were washed with distilled water to remove as much oil as possible and (still on the coverslip) submerged egg side down into larval rearing trays and allowed to hatch. Surviving larvae were transferred to a new tray with hay infused water and maintained at the standard rearing conditions.
Crosses and screening. Both male and female adult injection survivors (G0) were mated to the parental wildtype strain (TPRI). Male G0 individuals were crossed to three wild type females, and after 7 days these were pooled into groups of ten males and 30 females. G0 females were mated in pools of ten to approximately 20 wildtype males. Pools were blood fed and after 5 days, eggs were collected as described above.
Four ovipositions were collected for all pools and screened under the Leica MZ165FC microscope (Leica Biosystems, Milton-Keynes, UK) and images were taken using a Leica DFC camera and the settings: Brightness 82%, Saturation 0 and Gamma 0.71 (for white light images) and Brightness 82%, Saturation 0.146 and Gamma 0.40 (mCherry-red fluorescence). Around 20 fluorescent marker positive G1 males and females from the relevant lines were crossed to provide larvae for each viability assay.
PCR confirmation of HDR insertions. Genomic DNA from wildtype and DsRed individuals was extracted using the NucleoSpin Tissue kit (Macherey Nagel, Düren, Germany). The 5′ and 3′ junctions of the kmo integration were PCR confirmed using two internal primers, LA2196 (5′-CCA GTT CGG TTA TGA GCC GT-3′) and LA323 (5′-ACC AAA TCT GCC AGC GTC AAT AG-3′), that bind within the inserted cassette and two external primers, LA6087 (5′-TTC GGT TTG CCC AAA GAA GC-3′) and LA6088 (5′-AAA TGT TCG TCT CCG ACC CC-3′), that bind to the genome external to the homology arms. An additional PCR was also performed with only the external primer set: LA6087 and LA6088 (Fig. S1). Q5 High-fidelity 2× Master Mix (New England Biolabs, Ipswich, MA, USA) was used with the following cycling conditions: initial denaturation 98 °C for 30 s, 35 cycles of denaturation 98 °C for 10 s, annealing temperature 67 °C for 10 s and extension 72 °C for 4 min, final extension 72 °C for 10 min, followed by hold at 4 °C. The PCR amplicons were electrophoresed in a 1% agarose gel with SYBR Safe (ThermoFisher Scientific, Waltham, MA, USA). The amplicons of expected size were excised, gel purified using NucleoSpin Gel and PCR Clean-up kit (Macherey Nagel, Düren, Germany) and subjected to Sanger sequencing.