Genetic tool development in marine protists: emerging model organisms for experimental cell biology

Diverse microbial ecosystems underpin life in the sea. Among these microbes are many unicellular eukaryotes that span the diversity of the eukaryotic tree of life. However, genetic tractability has been limited to a few species, which do not represent eukaryotic diversity or environmentally relevant taxa. Here, we report on the development of genetic tools in a range of protists primarily from marine environments. We present evidence for foreign DNA delivery and expression in 13 species never before transformed and for advancement of tools for eight other species, as well as potential reasons for why transformation of yet another 17 species tested was not achieved. Our resource in genetic manipulation will provide insights into the ancestral eukaryotic lifeforms, general eukaryote cell biology, protein diversification and the evolution of cellular pathways. This Resource describes genetic tools for microbial eukaryotes, providing a roadmap for developing genetically tractable organisms.

The ocean represents the largest continuous planetary ecosystem, hosting an enormous variety of organisms, which include microscopic biota such as unicellular eukaryotes (protists). Despite their small size, protists play key roles in marine biogeochemical cycles and harbor tremendous evolutionary diversity 1,2. Notwithstanding their significance for understanding the evolution of life on Earth and their role in marine food webs, as well as driving biogeochemical cycles to maintain habitability, little is known about their cell biology including reproduction, metabolism and signaling 3. Most of the biological knowledge available is based on comparison of proteins from cultured species to homologs in genetically tractable model taxa [4][5][6][7]. A main impediment to understanding the cell biology of these diverse eukaryotes is that protocols for genetic modification are only available for a small number of species 8,9 that represent neither the most ecologically relevant protists nor the breadth of eukaryotic diversity.
The development of genetic tools requires reliable information about gene organization and regulation of the emergent model species. Over the last decade, genome [4][5][6] and transcriptome sequencing initiatives 7 have resulted in nearly 120 million unigenes being identified in protists 10, which facilitates the development of genetic tools used for model species. Insights from these studies enabled the phylogenetically informed approach 7 for selecting and developing key marine protists into model systems in the Environmental Model Systems (EMS) Project presented herein. Forty-one research groups took part in the EMS Project, a collaborative effort resulting in the development of genetic tools that significantly expand the number of eukaryotic lineages that can be manipulated, and that encompass multiple ecologically important marine protists.
Here, we summarize detailed methodological achievements and analyze results to provide a synthetic 'transformation roadmap' for creating new microeukaryotic model systems. Although the organisms reported here are diverse, the paths to overcome difficulties share similarities, highlighting the importance of building a well-connected community to overcome technical challenges and accelerate the development of genetic tools. The 13 emerging model species presented herein, and the collective set of genetic tools from the overall collaborative project, will not only extend our knowledge of marine cell biology, evolution and functional biodiversity, but also serve as platforms to advance protistan biotechnology.

Results
Overview of taxa in the EMS initiative. Taxa were selected from multiple eukaryotic supergroups 1,7 to maximize the potential of cellular biology and to evaluate the numerous unigenes with unknown functions found in marine protists (Fig. 1). Before the EMS initiative, reproducible transformation of marine protists was limited to only a few species such as Thalassiosira pseudonana, Phaeodactylum tricornutum and Ostreococcus tauri (Supplementary Table 1). The EMS initiative included 39 species, specifically, 6 archaeplastids, 2 haptophytes, 2 rhizarians, 9 stramenopiles, 12 alveolates, 4 discobans and 4 opisthokonts (Fig. 1). Most of these taxa were isolated from coastal habitats, the focus area of several culture collections 7 . More than 50% of the selected species are considered photoautotrophs, with another 35% divided between heterotrophic osmotrophs and phagotrophs, the remainder being predatory mixotrophs. Almost 20% of the chosen species are symbionts and/or parasites of marine plants or animals, 5% are associated with detritus and several are responsible for harmful algal blooms (Supplementary Table 2).
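As a quick consistency check on the composition described above, the per-supergroup counts can be tallied. The following is a minimal Python sketch; the dictionary and variable names are illustrative and are not part of the EMS initiative's materials, and the counts are simply those stated in the text.

```python
# Species counts per supergroup, as stated in the text above.
# The variable names are illustrative only.
ems_species = {
    "archaeplastids": 6,
    "haptophytes": 2,
    "rhizarians": 2,
    "stramenopiles": 9,
    "alveolates": 12,
    "discobans": 4,
    "opisthokonts": 4,
}

total = sum(ems_species.values())

# Share of each supergroup among all species in the initiative.
shares = {group: 100.0 * n / total for group, n in ems_species.items()}

for group, n in ems_species.items():
    print(f"{group:15s} {n:2d} species ({shares[group]:.1f}%)")
print(f"{'total':15s} {total:2d} species")
```

The counts add up to the 39 species named in the text, with alveolates and stramenopiles together making up just over half of the taxa.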
While some transformation systems for protists have been developed in the past 8,9,11, the challenge for this initiative was to develop genetic tools for species that not only require different cultivation conditions but are also phenotypically diverse. It should be noted that not all main lineages were explored. For example, amoebozoans did not feature in this aquatic-focused initiative, in part because they tend to be most important in soils, at least based on current knowledge, and manipulation systems exist for members of this eukaryotic supergroup, such as Dictyostelium discoideum 12. The overall EMS initiative outcomes are summarized in Fig. 1 and Table 1. We provide detailed protocols for 13 taxa for which no transformation systems have been previously reported (category A) and eight taxa for which existing protocols 9,11,[13][14][15][16][17][18][19][20][21] were advanced (category B; Figs. 2, 3 and 4, Table 1, Supplementary Tables 1-5 and Methods). We also review an already published EMS transformation protocol 22 in one species (category C), and we discuss unsuccessful transformation attempts for 17 additional taxa (Fig. 1 and Methods). Finally, we synthesize our findings in a roadmap for the development of transformation systems in protists (Fig. 5).
Archaeplastids. Prasinophytes are important marine green algae distributed from polar to tropical regions 23. They form a sister group to chlorophyte algae, and together, these two groups branch adjacent to land plants, collectively comprising the Viridiplantae, which are part of the Archaeplastida 1,23 (Fig. 1). Genome sequences are available for the picoprasinophytes (<3 µm cell diameter) tested herein, specifically, Micromonas commoda, M. pusilla, Ostreococcus lucimarinus and Bathycoccus prasinos. As part of the EMS initiative, we report on genetic tools for Bathycoccus, a scaled, nonmotile genus, and Micromonas, a motile, naked genus with larger genomes than Bathycoccus and Ostreococcus 22. We also report on genetic tools for Tetraselmis striata and O. lucimarinus. The latter was transformed based on an adapted homologous recombination system for O. tauri 24,25.
O. lucimarinus (RCC802) and B. prasinos (RCC4222) were transformed using protocols adapted from O. tauri 24,25. Briefly, using electroporation for transfer of exogenous genes, O. lucimarinus was transformed using a DNA fragment encoding the O. tauri high-affinity phosphate transporter (HAPT) gene fused to a luciferase gene and a kanamycin selection marker (Table 1 and Supplementary Table 3), which resulted in transient luciferase expression 24 h after electroporation (Table 1 and Fig. 3a). After 2 weeks of growth in low-melting agarose plates containing G418 (1 mg ml⁻¹), 480 colonies were obtained, picked and grown in artificial seawater with the antibiotic neomycin. Of these, 76 displayed luminescence ≥2.5-fold above background (80 relative luminescence units (RLU)), with widely variable levels (200-31,020 RLU), likely reflecting variations in the site of integration and/or the number of integrated genes (Fig. 3a, Supplementary Fig. 1 and Methods).
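The luminescence screen above amounts to a simple fold-over-background cutoff. The sketch below illustrates it in Python, under the assumption that each colony yields a single RLU reading. The function names and the example readings are hypothetical, while the background (80 RLU), the 2.5-fold cutoff and the colony counts (76 of 480) come from the text.

```python
BACKGROUND_RLU = 80.0   # background luminescence reported in the text
FOLD_CUTOFF = 2.5       # colonies at or above this fold are scored positive


def rlu_threshold(background=BACKGROUND_RLU, fold=FOLD_CUTOFF):
    """Lowest RLU reading that counts as luminescent."""
    return background * fold


def is_luminescent(rlu, background=BACKGROUND_RLU, fold=FOLD_CUTOFF):
    """True if a colony's reading is at least fold x background."""
    return rlu >= rlu_threshold(background, fold)


def fraction_positive(n_positive, n_screened):
    """Fraction of screened colonies that were scored positive."""
    return n_positive / n_screened


# Hypothetical readings for illustration only (not measured data).
example_readings = [75.0, 190.0, 200.0, 4500.0, 31020.0]
calls = [is_luminescent(r) for r in example_readings]

print("cutoff (RLU):", rlu_threshold())
print("calls:", calls)
print("percent positive:", round(100 * fraction_positive(76, 480), 1))
```

With the reported values the cutoff works out to 200 RLU, which matches the lower end of the reported range (200-31,020 RLU), and 76 of 480 colonies corresponds to roughly 16% of the colonies picked.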
The O. tauri construct did not work in B. prasinos, while the use of the B. prasinos histone H4 and HAPT sequences in an otherwise identical construct and conditions was successful. Although luciferase expression was not detected 24 h after electroporation, 48 G418-resistant colonies were obtained 2 weeks later, 20 of which were luminescent when grown in liquid medium. Analysis of 14 resistant transformants revealed that the luciferase sequence was integrated into the genome of five luminescent clones and one nonluminescent clone (Fig. 3b and Methods), suggesting that the chromatin context at integration sites in the latter was not favorable to luciferase expression.
Although transformation methods successful for Bathycoccus and Ostreococcus failed in Micromonas, Lonza nucleofection was successful with M. commoda (CCMP2709) (Table 1 and Fig. 3c) using two different codon-optimized plasmids, one encoding the luciferase gene (NanoLuc, Promega) flanked by an exogenous promoter and terminator sequence from the 5′ and 3′ untranslated regions (UTRs) of histone H3 in Micromonas polaris (CCMP2099), and the other encoding an enhanced green fluorescent protein (eGFP) gene flanked by endogenous promoter and terminator sequences from ribosomal protein S9 (Supplementary Table 5). Sensitivities to antibiotics were established (Supplementary Table 3). Constructs did not include a selectable marker, as we aimed to introduce and express foreign DNA while developing conditions suitable for transfection that supported robust growth in this cell wall-lacking protist (Table 1). Transformants showed a significantly higher level of eGFP fluorescence than wild-type cells, with 1.3% of the population showing fluorescence per cell 45-fold higher than both the nontransformed portion of the culture and the wild-type cells (Fig. 3c and Methods). Additionally, the RLU was 1,500-fold higher than controls when using the luciferase-bearing construct, such that multiple experiments with both plasmids confirmed expression of exogenous genes in M. commoda.
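The two numbers reported for eGFP expression (the share of fluorescent cells and their fold increase over wild type) can be derived from per-cell fluorescence values once a gate is chosen. The Python sketch below shows one way to do this. It is not the authors' analysis pipeline: the gate rule, the function name and the data are assumptions, and the synthetic values were chosen only so that the output mirrors the reported 1.3% and 45-fold figures.

```python
from statistics import mean


def summarize_expression(transfected, wild_type, gate):
    """Return (percent of cells above gate, fold change of gated cells over wild type)."""
    positive = [x for x in transfected if x > gate]
    percent_positive = 100.0 * len(positive) / len(transfected)
    fold = mean(positive) / mean(wild_type) if positive else float("nan")
    return percent_positive, fold


# Synthetic per-cell fluorescence values (arbitrary units), for illustration only.
wild_type = [10.0] * 1000
transfected = [10.0] * 987 + [450.0] * 13

# Hypothetical gate: twice the brightest wild-type cell.
gate = 2 * max(wild_type)

percent_positive, fold = summarize_expression(transfected, wild_type, gate)
print(f"{percent_positive:.1f}% of cells above gate, {fold:.0f}-fold over wild type")
```

Because no selectable marker was used, a per-cell readout of this kind is what separates the small expressing subpopulation from the nontransformed majority of the culture.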
T. striata (KAS-836) was transformed using microprojectile bombardment (Supplementary Fig. 2a). Two selectable marker genes were tested, each consisting of a putative promoter and 5′ UTR sequences from the T. striata actin gene and either the coding sequences of the Streptoalloteichus hindustanus bleomycin gene (conferring resistance to zeocin) or the Streptomyces hygroscopicus bar gene (conferring resistance to glufosinate) (Table 1, Supplementary Fig. 2a and Methods). The terminator sequence was obtained from the T. striata glyceraldehyde-3-phosphate dehydrogenase gene. Linearized plasmids were coated on gold particles and introduced into T. striata cells by using the PDS-1000/He Particle Delivery System (Bio-Rad). Transformants were successfully selected on half-strength f/2 agar plates at 50% salinity containing either 150 μg ml⁻¹ zeocin or 150 μg ml⁻¹ glufosinate.

Haptophytes (incertae sedis). Haptophytes are a group of photosynthetic protists that are abundant in marine environments and include the principal calcifying lineage, the coccolithophores. Genome sequences are available for Emiliania huxleyi 6 and Chrysochromulina tobin 26, and there is one report of nuclear transformation of a calcifying coccolithophore species 27, but transformation of E. huxleyi, the most prominent coccolithophore, has not been achieved yet 27. Here, as part of the EMS initiative, a stable nuclear transformation system was developed for Isochrysis galbana, a species that lacks coccoliths, but represents an important feedstock for shellfish aquaculture 28.
I. galbana (CCMP1323) was transformed by biolistic bombardment with the pIgNAT vector, which contains the nourseothricin (NTC) N-acetyltransferase gene (NAT, conferring NTC resistance) driven by the promoter and terminator of Hsp70 from E. huxleyi (CCMP1516). Twenty-four hours after bombardment, cells were transferred to liquid f/2 medium at 50% salinity containing 80 µg ml⁻¹ NTC and left to grow for 2-3 weeks to select for transformants (Table 1). The presence of NAT in NTC-resistant cells was verified by PCR and PCR with reverse transcription (RT-PCR) (Fig. 4a, Supplementary Fig. 2b and Methods) and the sequence was verified. To confirm that NTC resistance was a stable phenotype, cells were subcultured every 2-4 weeks at progressively higher NTC concentrations (up to 150 µg ml⁻¹) in the above-mentioned medium. Cells remained resistant to NTC for approximately 6 months, as confirmed by PCR screening for the NAT gene.
Rhizarians. Rhizarians include diverse nonphotosynthetic protists, as well as the photosynthetic chlorarachniophytes that acquired a plastid via secondary endosymbiosis of a green alga 4. Uniquely, they represent an intermediate stage of the endosymbiotic process, since their plastids still harbor a relict nucleus (nucleomorph). Here, we report on an advanced transformation protocol for the chlorarachniophyte Amorphochlora (Lotharella) amoebiformis, for which low-efficiency transient transformation has previously been achieved using particle bombardment 14.
A. amoebiformis (CCMP2058) cells were resuspended in 100 µl of Gene Pulse Electroporation Buffer (Bio-Rad) with 20-50 µg of the reporter plasmid encoding eGFP-RubisCO fusion protein under the control of the native rbcS1 promoter and subjected to electroporation (Table 1). Cells were immediately transferred to fresh ESM medium and incubated for 24 h. Transformation efficiency, estimated by microscopy as the fraction of cells expressing eGFP, was 0.03-0.1%, up to 1,000-fold higher than in the previous study 14 (Table 1). Stable transformants were generated by manual isolation using a micropipette, and a transformed line has maintained eGFP fluorescence for at least 10 months without antibiotic selection (Figs. 2 and 4b and Methods).
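The efficiency figures above are simple ratios, and working them through makes the scale of the improvement concrete. The Python sketch below is illustrative only: the function names and the cell counts are hypothetical, and the implied earlier efficiency is an order-of-magnitude reading of "up to 1,000-fold", not a value reported in the text.

```python
def percent_efficiency(n_expressing, n_total):
    """Transformation efficiency as the percentage of cells expressing the reporter."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_expressing / n_total


def implied_baseline(percent_now, fold_improvement):
    """Earlier efficiency implied by a stated fold improvement."""
    return percent_now / fold_improvement


# Hypothetical counts: 10 eGFP-positive cells among 10,000 cells scored.
eff = percent_efficiency(10, 10_000)

# Upper reported efficiency (0.1%) read against the stated 1,000-fold upper bound.
baseline = implied_baseline(0.1, 1000)

print(f"efficiency: {eff:.2f}%")
print(f"implied earlier efficiency: about {baseline:.0e}%")
```

Read this way, an efficiency of 0.1% that is up to 1,000-fold higher implies an earlier efficiency on the order of 0.0001%, that is, roughly one transformed cell per million.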
Stramenopiles. Stramenopiles are a diverse lineage harboring important photoautotrophic, mixotrophic (combining photosynthetic and phagotrophic nutrition) and heterotrophic taxa. As the most studied class in this lineage, diatoms (Bacillariophyceae) were early targets for the development of reverse genetics tools 11,29. Diatoms are estimated to contribute approximately 20% of annual carbon fixation 30 and, like several other algal lineages, are used in bioengineering applications and biofuels 31. Although other cold-adapted eukaryotes have, to our knowledge, yet to be transformed, here we present a protocol for the Antarctic diatom Fragilariopsis cylindrus 32.

A transformation protocol has also been developed herein for Pseudo-nitzschia multiseries, a toxin-producing diatom 33 . Here we also present work for nondiatom stramenopiles, including a transformation protocol for the eustigmatophyte Nannochloropsis oceanica, and an alternative protocol for the labyrinthulomycete Aurantiochytrium limacinum 20 , both of which are used for biotechnological applications. Furthermore, we report on advances for CRISPR/Cas-driven gene knockouts in Phaeodactylum tricornutum 8,13 and a more efficient bacterial conjugation system for Thalassiosira pseudonana 13 .
Microparticle bombardment was used on F. cylindrus (CCMP1102) that was grown, processed and maintained at 4 °C in 24 h light. Exponential phase cells were harvested onto a 1.2 µm membrane filter that was then placed on a 1.5% agar Aquil plate for bombardment with beads coated with a plasmid containing zeocin resistance and eGFP, both controlled by an endogenous fucoxanthin chlorophyll a/c binding protein (FCP) promoter and terminator (Table 1, Supplementary Table 3 and Methods) 34. Transformation was performed using 0.7 µm tungsten particles and the biolistic particle delivery system PDS-1000/He (Bio-Rad). Rupture discs for 1,350 and 1,550 pounds per square inch (psi) gave the highest colony numbers with efficiencies of 20.7 colony forming units (c.f.u.) per 10⁸ cells and 30 c.f.u. per 10⁸ cells, respectively. Following bombardment, the filter was turned upside down and left to recover for 24 h on the plate, then cells were rinsed from the plate/filter and spread across five 0.8% agar Aquil plates with 100 µg ml⁻¹ zeocin. Colonies appeared 3-5 weeks later. PCR on genomic DNA showed that 100 and 60% of colonies screened positive for the bleomycin gene (ShBle) for zeocin resistance and the gene encoding eGFP, respectively. As confirmed by fluorescence-activated cell sorting (FACS) and microscopy, eGFP was localized to the cytosol and was distinguishable from plastid autofluorescence (Fig. 2). Additional confirmation by PCR and RT-PCR (Fig. 4c) revealed that the ShBle and eGFP genes were present in the genomes of transformants after multiple transfers (>10) 2 years later, indicating long-term stability.
Bacterial conjugation methods were improved in T. pseudonana (CCMP1335) using the silaffin precursor TpSil3p (Table 1 and Methods) as the target gene. TpSil3p was fused to eGFP flanked by an FCP promoter and terminator, cloned into a pTpPuc3 episomal backbone and transformed into mobilization plasmid-containing EPI300 E. coli cells (Lucigen). The donor cells were grown in super optimal broth with catabolite repression (SOC) medium at 37 °C until an OD₆₀₀ of 0.3-0.4, centrifuged and resuspended in 267 μl SOC medium. Next, 200 μl of donor cells were mixed with T. pseudonana cells, cocultured on predried 1% agar plates, dark incubated at 30 °C for 90 min, then at 18 °C in constant light for 4 h, followed by selection in 0.25% agar plates containing 100 µg ml⁻¹ NTC. Colonies were observed after 2 weeks, inoculated into 300 μl L1 medium and supplemented with 200 µg ml⁻¹ NTC to reduce the number of false positives. Positive transformants were identified by colony PCR screening (Supplementary Fig. 3) and epifluorescence microscopy (Fig. 2).
The diatom P. multiseries (15093C) and other members of this genus form buoyant linear chains with overlapping cell tips during active growth, and are not conducive to punctate colony formation on agar, where their growth is generally poor. To address this challenge, a low-gelation-temperature agarose seawater medium (LGTA) was developed to facilitate growth, antibiotic selection and cell recovery. P. multiseries exhibited growth inhibition at relatively low concentrations under NTC, formaldehyde and zeocin (Supplementary Table 3). Biolistic transformation of two other Pseudo-nitzschia species had been demonstrated at low efficiency 35. To complement this approach and explore potentially higher efficiency methods for transformation with diatom episomal plasmids, we modified the existing conjugation-based method 13. The published conjugation protocol was modified to enhance P. multiseries postconjugation viability by reducing SOC content. An episomal version of the Pm_actP_egfp_actT expression cassette was transformed into E. coli EPI300+pTAMOB and used for conjugation (Table 1 and Methods). After 48 h in L1 medium, cells were plated in LGTA and eGFP-positive cells were observed 7 d later (Fig. 2). PCR revealed the presence of plasmids in all eGFP-positive colonies (Supplementary Fig. 4). Similarly, conjugation with the episome pPtPUC3 (bleomycin selection marker)-containing bacterial donors was followed under zeocin selection (200 μg ml⁻¹). After 7 d, only viable cells (based on bright chlorophyll fluorescence) contained the episome, as confirmed by PCR. Propagation of transformants after the first medium transfer (under selection) has so far been unsuccessful.
Stable transformation of A. limacinum (ATCC MYA-1381) was achieved by knock-in of a resistance cassette composed of ShBle driven by 1.3 kb promoter and 1.0 kb terminator regions of the endogenous glyceraldehyde-3-phosphate dehydrogenase gene carried in a pUC19-based plasmid (18GZG) along with the native 18S ribosomal RNA gene, and by knock-in of a similar construct containing an eGFP:ShBle fusion (Supplementary Fig. 5). Approximately 1 × 10⁸ cells were electroporated, adapting the electroporation protocol used for Schizochytrium 36. The highest transformation efficiency was achieved using 1 µg of linearized 18GZG plasmid with two pulses, resulting in a time constant of ~5 ms (Table 1 and Methods). Expression of the fusion protein was confirmed by both the zeocin-resistance phenotype and the detection of eGFP (Fig. 2). Six 18GZG transformants derived from uncut and linearized plasmids were examined in detail. All maintained antibiotic resistance throughout 13 serial transfers, first in selective, then subsequently in nonselective media and then again in selective medium. Integration of the plasmid into the genome was confirmed by PCR as well as by Southern blots using a digoxigenin-labeled ShBle gene probe, showing that four transformants had integrations by single homologous recombination, while in two transformants additional copies of the antibiotic resistance cassette were integrated by nonhomologous recombination elsewhere in the genome (Supplementary Fig. 5).
Electroporation of N. oceanica (CCMP1779) was optimized based on observation of cells treated with fluorescein-conjugated 2,000 kDa dextran and subsequent survival (Table 1 and Methods). A sorbitol concentration of 800 mM and electroporation at between 5 and 9 kV cm⁻¹ resulted in highest cell recovery. These conditions were used during introduction of plasmids containing the gene for the blue fluorescent reporter mTagBFP2 under the control of cytomegalovirus (CMV), the cauliflower mosaic virus 35S, or the VCP1 promoter previously described from Nannochloropsis sp. 37. Transient expression of blue fluorescence (compared to cells electroporated simultaneously under the same conditions without plasmid) appeared within 2 h, lasted for at least 24 h and disappeared by 48 h in subsets of cells electroporated with mTagBFP2 under the control of CMV (Supplementary Fig. 6). The transient transformation was more effective when a linearized plasmid was used compared to a circular plasmid (Table 1). VCP1 did not induce blue fluorescence with a circular plasmid, while 35S gave inconsistent results with either circularized or linearized plasmids.
For P. tricornutum (CCAP1055/1), we adapted the CRISPR/Cas9 system 8 for multiplexed targeted mutagenesis. Bacterial conjugation 13 was used to deliver an episome that contained a Cas9 cassette and two single-guide RNA (sgRNA) expression cassettes designed to excise a 38 base-pair-long domain from the coding region of a nuclear-encoded, chloroplastic glutamate synthase (Phatr3_J24739) and introduce an in-frame stop codon after strand ligation (Table 1 and Methods). Golden Gate assembly was used to clone two expression cassettes carrying sgRNAs into a P. tricornutum episome that contained a Cas9-2A-ShBle expression cassette and the centromeric region CenArsHis (Supplementary Fig. 7). After their addition to a P. tricornutum culture, plates were incubated in a growth chamber under standard growth conditions for 2 d and transformed P. tricornutum colonies began to appear after 2 weeks. Only colonies maintaining Cas9-2A-ShBle sequence on the delivered episome were able to grow on selection plates because Cas9 and ShBle were transcriptionally fused by the 2A peptide 38 (Supplementary Fig. 7). Gel electrophoresis migration and sequencing of the genomic target loci confirmed the 38 bp-long excision and premature stop codon (Fig. 4d).
Alveolates. This species-rich and diverse group comprises ciliates, apicomplexans and dinoflagellates (Fig. 1). As a link between apicomplexan parasites and dinoflagellate algae, perkinsids are key for understanding the evolution of parasitism, and also have potential biomedical applications 17. Techniques currently exist for transformation of only a small number of ciliates, perkinsids and apicomplexans 39. Here, we present a transformation protocol for Karlodinium veneficum (CCMP1975), a phagotrophic mixotroph that produces fish-killing karlotoxins 40. Experiments were also performed on Oxyrrhis marina (CCMP 1788/CCMP 1795), a basal-branching phagotroph that lacks photosynthetic plastids, and Crypthecodinium cohnii (CCMP 316), a heterotroph used in food supplements. For both of these taxa, evidence of DNA delivery was achieved (Table 1, Supplementary Results, Supplementary Fig. 15 and Methods), a goal recently achieved for C. cohnii using electroporation 19. Additionally, we report on improved transformation systems for Perkinsus marinus (PRA240) and Amphidinium carterae (CCMP1314) chloroplast, published recently as part of the EMS initiative 15.
K. veneficum (CCMP1975) was transformed by electroporation, with the selectable marker gene aminoglycoside 3′-phosphotransferase (nptII/neo; note that nptII/neo is used synonymously with amino 3′-glycosyl phosphotransferase gene conferring resistance to kanamycin, neomycin, paromomycin, ribostamycin, butirosin and gentamicin B) cloned into the backbone of the dinoflagellate-specific expression vector DinoIII-neo 41, which confers resistance to neomycin and kanamycin (Table 1). In brief, DinoIII-neo was linearized and electroporated using the Nucleofector optimization pulse codes, buffer SF/Solution I (Lonza), and 2 μg μl⁻¹ of linearized DinoIII-neo. Electroporated cells were selected under 150 μg ml⁻¹ kanamycin 3 d postelectroporation. Fresh seawater with kanamycin was added every 2 weeks to the cultures and new subcultures were inoculated monthly. After 3 months, DNA and RNA were isolated from the resistant cultures as previously reported 42 and cDNA was synthesized using random hexamers. Out of 16 transformations, two cell lines (CA-137, DS-138) showed stable growth under kanamycin selection. CA-137 developed dense cultures after 3 months, and the resistance gene was detected in both DNA and RNA by nested PCR and RT-PCR, respectively (Fig. 4e, Supplementary Fig. 8 and Methods).
We improved the transformation protocol 16,17 of P. marinus, a pathogen of marine mollusks, fish and amphibians 43 (Supplementary Table 5). We coexpressed two genes and efficiently selected transient and stable transformants using FACS (Supplementary Fig. 9 and Methods). In addition, we established the integration profile of ectopic DNA once introduced into the P. marinus genome. We did not see evidence of integration through homologous recombination and observed a propensity for plasmid fragmentation and integration within transposable element sites. An optimized alternative protocol for transformation using glass bead abrasion was also developed. Two versions of the previously published Moe gene promoter 16 were tested. Whereas the 1.0 kb promoter version induced expression after 2 or 3 d, the truncated version (0.5 kb) took 7 d for expression to be detected. Resistance genes to zeocin, blasticidin and puromycin have all been shown to confer resistance to transformed P. marinus; however, selection regimes are still relatively slow and inefficient, indicating further room for improvement 17.
We also report a vector for the transformation of the chloroplast of A. carterae, a photosynthetic dinoflagellate. A. carterae, like other dinoflagellates with a peridinin-containing chloroplast, contains a fragmented chloroplast genome made up of multiple plasmid-like minicircles 40. The previous transformation protocols made use of this to introduce two vectors based on the psbA minicircle 15.
Here, we show that other minicircles are also suitable for use as vectors. We created an artificial minicircle, using the atpB minicircle as a backbone, but replacing the atpB gene with a codon-optimized chloramphenicol acetyltransferase (Table 1 and Methods). This circular vector was introduced by biolistics to A. carterae (Supplementary Fig. 10a). Following selection with chloramphenicol, we were able to detect transcription of the chloramphenicol acetyltransferase gene via RT-PCR (Fig. 4g). This result suggests that all 20 or so minicircles in the dinoflagellate chloroplast genome would be suitable for use as artificial minicircles, thus providing a large pool of potential vectors.
Discobans. This diverse group, recently split into Discoba and Metamonada 44 , includes heterotrophs, photoautotrophs and predatory mixotrophs, as well as parasites. Discobans include parasitic kinetoplastids with clinical significance, such as Trypanosoma brucei, T. cruzi and Leishmania spp., for which efficient transformation protocols are available 45 . However, such protocols are missing for aquatic species. Here, we describe available transformation protocols for the kinetoplastid Bodo saltans and the heterolobosean Naegleria gruberi. The former was isolated from a lake, but identical 18S rRNA gene sequences have been reported from marine environments 46 . The latter is a freshwater protist that represents a model organism for closely related marine heterolobosean amoebas. Furthermore, we provide advanced methods that build on previous EMS results 18 for the diplonemid Diplonema papillatum.
B. saltans (ATCC 30904) was transformed with a plasmid containing a cassette designed to fuse an endogenous EF1-α gene with eGFP for C-terminal tagging. This cassette includes, downstream of eGFP, a B. saltans tubulin intergenic region followed by the selectable marker nptII/neo gene, conferring resistance to neomycin. EF1-α genes exist in tandem repeats. The homologous regions that flank the cassette were chosen as targets for inducing homology-directed repair; however, they target only one copy of the gene. As transcription in B. saltans is polycistronic 46, insertion of the tubulin intergenic region into the plasmid is essential for polyadenylation of the EF1-α/GFP fusion and trans-splicing of the nptII/neo gene (Supplementary Table 5). Selection of transfected cells began with 2 µg ml⁻¹ of neomycin added 24 h after electroporation, and this concentration was gradually increased over 2 weeks to 5 µg ml⁻¹ (Table 1 and Methods). Cells were washed and subcultured into fresh selection medium every 4 d, and neomycin-resistant cells emerged 7-9 d postelectroporation. The eGFP signal was detected 2 d postelectroporation, albeit with low intensity. This may be due to the inefficient translation of eGFP since it has not been codon-optimized for B. saltans (Fig. 2). Genotyping analysis 9 months posttransfection confirmed the presence of the nptII/neo gene and at least partial plasmid sequence (Fig. 4h and Supplementary Fig. 10b). However, plasmid integration into the B. saltans genome through homologous recombination is still unconfirmed. This suggests either off-target plasmid integration or episomal maintenance.
For N. gruberi (ATCC 30224) two plasmids were designed. The first one carried the hygromycin B resistance gene (hph) with an actin promoter and terminator, along with an HA-tagged eGFP driven by the ubiquitin promoter and terminator. The second plasmid carried the nptII/neo gene instead. For each individual circular plasmid, 4 μg was electroporated (Table 1 and Methods). About 48 h after electroporation, dead cells were removed from the suspension and viable cells were washed with PBS. Afterward, 300 μg ml⁻¹ of hygromycin B or 700 μg ml⁻¹ of neomycin was added to the fresh media. One to 4 weeks later, resistant clones were recovered and expression of eGFP and/or hygromycin was confirmed by western blotting (Supplementary Fig. 11). Expression of eGFP was observed by epifluorescence microscopy (Fig. 2 and Supplementary Fig. 11) with ~80% of transformants maintaining hygromycin B or neomycin resistance in addition to expressing eGFP.
D. papillatum (ATCC 50162) was transformed by electroporation using 3 μg of a SwaI-linearized fragment (cut from p57-V5+NeoR plasmid) containing the V5-tagged nptII/neo gene flanked by partial regulatory sequences derived from the hexokinase gene of the kinetoplastid Blastocrithidia (strain p57) (Table 1 and Methods) using a published protocol 18. About 18 h after electroporation, 75 μg ml−1 G418 was added to the medium and after 2 weeks, seven neomycin-resistant clones were recovered. Transcription of nptII/neo was verified in four clones by RT-PCR (Supplementary Fig. 12) and the expression of the tagged nptII/neo protein was confirmed in two clones by western blotting using the α-V5 antibody (Fig. 4i).
Opisthokonts. The opisthokont clade Holozoa includes animals and their closest unicellular relatives: choanoflagellates, filastereans, ichthyosporeans and corallochytreans. The establishment of genetic tools in nonmetazoan holozoans promises to help illuminate the cellular and genetic foundations of animal multicellularity 47. Genomic and transcriptomic data are available for multiple representatives characterized by diverse cell morphologies, some of which can even form multicellular structures 46. Until recently, only transient transformations had been achieved for some opisthokonts such as the filasterean Capsaspora owczarzaki 48, the ichthyosporean Creolimax fragrantissima 49 and the choanoflagellate Salpingoeca rosetta 21. Through the EMS initiative, we report on evidence for transient transformation of the ichthyosporean Abeoforma whisleri, isolated from the digestive tract of mussels, and review a recently published stable transformation protocol for S. rosetta achieved by using the selectable puromycin N-acetyl-transferase gene (Fig. 2) 22.
All A. whisleri life stages are highly sensitive to a variety of methods for transformation. However, we developed a 4D-nucleofection-based protocol using 16-well strips, wherein PBS-washed cells were resuspended in 20 μl of buffer P3 (Lonza) containing 40 μg of carrier plasmid (empty pUC19) and 1-5 μg of the reporter plasmid (A. whisleri H2B fused to mVenus fluorescent protein, mVFP) (Table 1 and Methods), and subjected to code EN-138 (Lonza). Immediately after the pulse, cells were recovered by adding 80 μl of marine broth (Gibco) before plating in 12-well culture plates previously filled with 1 ml marine broth. After 24 h, ~1% of the culture was transformed based on the fraction of cells expressing mVFP in the nucleus (Figs. 2 and 4j).
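The quantities in the A. whisleri nucleofection mix can be checked with a little arithmetic. The following is a minimal sketch, not part of the published protocol: the function name is hypothetical, and it assumes that the volume contributed by the DNA itself is negligible, which the text does not state.

```python
# Quantities quoted in the protocol text above.
BUFFER_UL = 20.0     # buffer P3 volume per well of the 16-well strip
CARRIER_UG = 40.0    # carrier plasmid (empty pUC19)
RECOVERY_UL = 80.0   # marine broth added immediately after the pulse
WELL_UL = 1000.0     # marine broth already present in the 12-well plate


def nucleofection_summary(reporter_ug: float) -> dict:
    """Summarize DNA load and dilution for one well (hypothetical helper).

    Assumption: the DNA adds no volume to the 20 µl of buffer.
    """
    if not 1.0 <= reporter_ug <= 5.0:
        raise ValueError("reporter dose outside the 1-5 µg range in the text")
    total_dna_ug = CARRIER_UG + reporter_ug
    return {
        "carrier_to_reporter": CARRIER_UG / reporter_ug,
        "dna_ug_per_ul": total_dna_ug / BUFFER_UL,
        "final_volume_ul": BUFFER_UL + RECOVERY_UL + WELL_UL,
    }
```

Under these assumptions the carrier plasmid outweighs the reporter by 8- to 40-fold across the stated 1-5 μg range, which illustrates how large the carrier excess is in this protocol.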
Microbial eukaryotes in natural planktonic communities. Model organisms are typically selected based on criteria such as relative ease of isolation and asexual cultivation in the laboratory; however, these attributes may not correlate with the capacity for uptake and expression of the exogenous DNA. We explored whether natural marine planktonic pico- and nanoeukaryote communities would take up DNA in a culture-independent setting. Microbial plankton from natural seawater was concentrated and electroporated with plasmids containing mTagBFP2 under the control of CMV or 35S promoters (Supplementary Results and Methods). In most trials, blue fluorescent cells were rare if detected at all (compared to control samples). However, in one natural community tested, a photosynthetic picoeukaryote population exhibited up to 50% of cells with transient expression of blue fluorescence when the CMV promoter was used (Supplementary Fig. 13). This suggests it might be possible to selectively culture eukaryotic microorganisms based on capacity to express exogenous DNA.

Discussion
The collaborative effort by the EMS initiative facilitated identification and optimization of the steps required to create new protist model systems, which culminated in the synthetic transformation roadmap (Fig. 5). Our genetic manipulation systems for aquatic (largely marine) protists will enable deeper insights into their cell biology, with potentially valuable outcomes for aquatic sciences, evolutionary studies, nanotechnology, biotechnology, medicine and pharmacology. Successes and failures with selectable markers, transformation conditions and reporters were qualitatively compared across species (Supplementary Tables 3 and 4, Table 1, Figs. 2-4 and Methods).
For some of the selected species, the first step was to identify cultivation conditions for robust growth in the laboratory to either generate high cell densities or large culture volumes for obtaining sufficient biomass required for a variety of molecular biology experiments. Unlike established microbial model species, cultivation of marine protists can be challenging, especially under axenic conditions or for predatory taxa that require cocultivation with their prey. Nevertheless, 13 out of 35 species were rendered axenic before the development of transformation protocols. For the remaining species, we were unable to remove bacteria and therefore had to make sure that transformation signals were coming from the targeted protist rather than contaminants (Supplementary Table 2). Subsequent steps included the identification of suitable antibiotics and their corresponding selectable markers (Table 1 and Supplementary Table 3).
As exemplified in the model systems provided herein (Table 1 and Figs. 2-4), a variety of methods were used to test whether exogenous DNA was integrated into the genome or maintained as a plasmid, and whether the introduced genes were expressed. Approaches to show the former included inverse PCR, Southern blotting and whole genome sequencing, whereas approaches to demonstrate the latter included various combinations of PCR, RT-PCR, western blotting, epifluorescence microscopy, FACS, antibody-based methods and/or growth assays in the presence of antibiotics to confirm transcription and translation of introduced selection and reporter genes (for example, eGFP, YFP, mCherry). For fluorescent markers, it was first ensured that the wild-type or manipulated control cells had no signals conflicting with the marker (Figs. 2 and 3c), an important step because photosynthetic protists contain chlorophyll and other autofluorescent pigments.
Overall transformation outcomes for each species were parsed into three groups according to the level of success or lack thereof (A, first transformation protocol for a given species; B, advanced protocol based on previous work; and C, published protocol based on the EMS initiative) and are discussed according to their phylogenetic position (Fig. 1).
Our studies did not result in a universally applicable protocol because transformability and a range of other key conditions, such as intrinsic features of the genome and differences in cellular structure and morphology, varied greatly across taxa and approaches. In general, electroporation proved to be the most common method for introducing exogenous DNA stably into cells. This approach was used for naked cells and protoplasts, yet frequently also worked, albeit with lower efficiency, on cells protected by cell walls. Linearized plasmids were most effective for delivery, and 5′ and 3′ UTR-containing promoters of highly expressed endogenous genes provided the strongest expression of selective reporters and markers. If successful, teams usually continued with fluorescence-based methods. Furthermore, large amounts of carrier DNA usually facilitated successful initial transformations (for example, M. commoda, A. whisleri) or improved existing protocols (S. rosetta 21). We also provide the contact details of all coauthors who are assigned to particular species (Supplementary Table 6).
Some lineages were difficult to transform, especially dinoflagellates and coccolithophores. Here, even if DNA appeared to be delivered (Supplementary Table 5), expression of the transformed genes could not be confirmed. Examples include the dinoflagellates C. cohnii and Symbiodinium microadriaticum and the coccolithophore E. huxleyi. Thus, at least these three species need concerted future efforts.
The combination of results presented herein together with previously published protocols from the EMS initiative 50 significantly expands the segment of extant eukaryotic diversity amenable to reverse genetics approaches. Out of the 39 microbial eukaryotes selected for the initiative, exogenous DNA was delivered and expressed in more than 50% of them. The transformation systems enable us to shed light on the function of species-specific genes, which likely reflect key adaptations to specific niches in dynamic ocean habitats.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41592-020-0796-x.
Table 3, the details of the transformation methods applied to this study in Supplementary Table 4 and contact details for individual laboratories in Supplementary Table 6. Full list of protists (including details of culture collection) and links to the complete step-by-step transformation protocols and published vector sequences are listed in Supplementary Table 5. The protocols.io links listed in Table 1 and Supplementary Table 5 are summarized in Supplementary Tables 7 and 8.

Methods
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The data that support the findings of this study are available from the corresponding authors as well as the other authors upon request (for the contacts see Supplementary


Data
The data that support the findings of this study are available from the corresponding authors as well as the other authors upon request (for the contacts see Suppl. Table 5).

Mycoplasma contamination
The cell lines were not tested for mycoplasma contamination.
Commonly misidentified lines (see ICLAC register)
No commonly misidentified cell lines were used.

Flow Cytometry
Plots
Confirm that:
The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.
A numerical value for number of cells or percentage (with statistics) is provided.

Cell population abundance
For experiments demonstrating transfection of Nannochloropsis and Micromonas in pure culture, cell population abundance was determined using an in-line flow meter on the sample line and by weighing the samples before and after data collection. This information on volume run was combined with counts acquired during each run to determine the abundance per ml for each population observed (non-transfected, transfected).
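The abundance calculation described above (volume run combined with event counts) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the conversion from sample mass to volume assumes a density of 1.0 g per ml by default, a value the text does not give and that should be replaced by the density of the actual medium.

```python
def volume_run_ml(mass_before_g: float, mass_after_g: float,
                  density_g_per_ml: float = 1.0) -> float:
    """Volume drawn from the sample tube, from weights before and after a run.

    The default density is an assumption, not a value from the source.
    """
    mass_used_g = mass_before_g - mass_after_g
    if mass_used_g <= 0:
        raise ValueError("sample mass must decrease during the run")
    return mass_used_g / density_g_per_ml


def abundance_per_ml(event_count: int, mass_before_g: float,
                     mass_after_g: float,
                     density_g_per_ml: float = 1.0) -> float:
    """Cells per ml for one gated population (hypothetical helper)."""
    volume = volume_run_ml(mass_before_g, mass_after_g, density_g_per_ml)
    return event_count / volume
```

The same function would be called once per gated population, for example with the transfected and non-transfected event counts from a single run, so that both abundances share one volume estimate.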

Methodology
Because of the small sample volumes and low abundance of target cells in samples of natural communities, it was not possible to re-test purity of sorted samples. Instead, immediately before sorting of the sample, the sorting efficiency was confirmed to >95% with 3 µm UR calibration beads. With the same InFlux instrument and sorting protocols, we have previously determined that sorted pico-cyanobacterial populations from natural samples are >99% pure, by re-running sorted samples using trigger based on the FSC channel and with voltage and threshold set to be able to detect even non-fluorescent particles with approx. 2x lower FSC than Prochlorococcus.
Gating strategy
For Micromonas: The trigger channel was Forward Scatter (FSC), to allow detection of pigmented cells (both Micromonas and Nannochloropsis are algae with natural chlorophyll fluorescence), dead cells, and cell detritus. No gates were applied in the data collection mode, only for post analysis. All cells (with plasmid in treatments either pulsed, or not pulsed) were visualized in a two parameter histogram of FSC vs Chlorophyll (692/40 nm bp). To compare means, the population of non-detrital particles (living cells) was gated into a second two parameter histogram of FALS vs GFP (520/35 nm bp), as well as GFP vs Chlorophyll. From there, cells with eGFP fluorescence were analyzed, as were cells with baseline eGFP fluorescence, and the geometric means of these fluorescences were compared, as was cell abundance in each population.
For Nannochloropsis: The trigger channel was Forward Scatter (FSC), to allow detection of pigmented cells, dead cells, and cell detritus. A first gate was drawn based on FSC and Side Scatter to include particles with the optical scattering characteristics (related approximately to size and cell complexity) of Nannochloropsis cells. Living Nannochloropsis cells maintain high red fluorescence from chlorophyll, so a second gate was drawn based on chlorophyll fluorescence (692 nm, excited by the 488 nm laser) and FSC. In samples electroporated either with (treatment) or without plasmid DNA (sham controls), the proportion of cell-like particles (based on FSC and SSC) which had chlorophyll fluorescence remained high (>95%) during the first two hours but had dropped to 72-85% at 24 hours, while in non-electroporated samples the proportion remained >95% for the first two hours. In this way, blue fluorescence related to mTagBFP2 was assessed only on cells which maintained the optical characteristics of healthy Nannochloropsis.
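The Micromonas analysis described above (gate living cells on scatter and chlorophyll, then compare geometric mean eGFP between expressing and baseline cells) can be sketched in a few lines. This is a simplified illustration, not the authors' analysis code: the rectangular threshold gates, the cutoff values, the function names and the example events are all assumptions made for the sketch, whereas the actual gates were drawn on two parameter histograms.

```python
from statistics import geometric_mean


def gate_living(events, min_fsc, min_chl):
    """Keep events whose scatter and chlorophyll signal look like live cells."""
    return [e for e in events if e["fsc"] >= min_fsc and e["chl"] >= min_chl]


def split_by_gfp(cells, gfp_cutoff):
    """Separate eGFP values above the cutoff from those at baseline."""
    positive = [c["gfp"] for c in cells if c["gfp"] > gfp_cutoff]
    baseline = [c["gfp"] for c in cells if c["gfp"] <= gfp_cutoff]
    return positive, baseline


def summarize(gfp_values):
    """Return (event count, geometric mean eGFP) for one population."""
    if not gfp_values:
        return 0, None
    return len(gfp_values), geometric_mean(gfp_values)


# Made-up events: forward scatter, chlorophyll and eGFP in arbitrary units.
events = [
    {"fsc": 50.0, "chl": 900.0, "gfp": 100.0},
    {"fsc": 55.0, "chl": 950.0, "gfp": 400.0},
    {"fsc": 52.0, "chl": 880.0, "gfp": 10.0},
    {"fsc": 5.0, "chl": 20.0, "gfp": 300.0},  # detritus, removed by the gate
]
living = gate_living(events, min_fsc=20.0, min_chl=500.0)
positive, baseline = split_by_gfp(living, gfp_cutoff=50.0)
```

The geometric mean is used here because fluorescence intensities are typically spread over orders of magnitude, so it is less dominated by a few very bright events than the arithmetic mean would be.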
For natural samples: The trigger channel was FSC, with voltage and threshold set to include all particles which displayed high yellow autofluorescence (at 580 nm) and high red autofluorescence (at 692 nm), which correspond to Synechococcus-like cells with yellow fluorescent phycoerythrin. In control (not electroporated) and sham-control (electroporated without plasmid), gates were drawn based on red fluorescence (due to chlorophyll, 692 nm excited by the 640 nm laser) and FSC as shown in Supplementary Fig. 6. Spherical fluorescent calibration particles (3 µm Spherotech UR) were used as guides for setting FSC parameters to define smaller picoplankton versus cells that represent larger picoplankton or nanoplankton. Three clear phytoplankton "populations" (groups of particles with tightly similar optical characteristics) were visually identified and gates were drawn around them. The gate with lowest FSC was composed of 100% or nearly 100% cells exhibiting high yellow fluorescence due to phycoerythrin, so these were assigned as "Cyanobacteria". The other two populations were therefore assumed to correspond to a type of small picophytoplankton (with FSC and pulse width well below that of 3 µm calibration spheres) and larger picophytoplankton. These three populations together showed a clear relationship between chlorophyll fluorescence and FSC. Cells/particles with chlorophyll fluorescence below this relationship did not form clear groups and are assumed to represent non-phytoplankton, some of which can exhibit low red fluorescence due to phytoplankton prey in food vacuoles. For each time point, the gate to define blue fluorescent (putative BFP-expressing) cells was drawn to exclude over 95% of cells in sham-controls which were electroporated with dextran green (to track electroporation) but without plasmid DNA. This gate was drawn diagonally on 460 nm (blue fluorescence) versus 530 nm (green fluorescence) to exclude any signal that could come from bleedover of the green fluorescence of fluorescein dextran.
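One way to picture the diagonal gate is as a threshold on blue fluorescence after subtracting a fixed fraction of the green signal, with the threshold placed so that more than 95% of sham-control events fall below it. The sketch below illustrates that idea under explicit assumptions: the linear bleedover model, the slope value, and the function names are hypothetical and are not taken from the source, where the gate was drawn manually on the 460 nm versus 530 nm plot.

```python
def corrected_blue(blue, green, slope):
    """Blue signal minus an assumed linear bleedover from the green channel."""
    return blue - slope * green


def gate_threshold(sham_events, slope, exclude_percent=95):
    """Threshold that leaves more than exclude_percent of sham events below it.

    sham_events is a list of (blue, green) pairs from sham controls.
    """
    if not sham_events:
        raise ValueError("need at least one sham-control event")
    scores = sorted(corrected_blue(b, g, slope) for b, g in sham_events)
    # Integer arithmetic keeps the index exact; index k means k + 1 events
    # are at or below the threshold, which is strictly more than the target.
    k = (exclude_percent * len(scores)) // 100
    return scores[k]


def is_blue_positive(blue, green, slope, threshold):
    """True if an event lies above the diagonal gate."""
    return corrected_blue(blue, green, slope) > threshold
```

Because the gate is a line of slope `slope` in the blue versus green plane, an event with strong green fluorescence needs proportionally more blue signal to be called positive, which is the purpose of drawing the gate diagonally.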