Saccharina genomes provide novel insight into kelp biology

Seaweeds are essential for marine ecosystems and have immense economic value. Here we present a comprehensive analysis of the draft genome of Saccharina japonica, one of the most economically important seaweeds. The 537-Mb assembled genomic sequence covers 98.5% of the estimated genome, and 18,733 protein-coding genes are predicted and annotated. Gene families related to cell wall synthesis, halogen concentration, development and defence systems are expanded. Functional diversification of the mannuronan C-5-epimerase and haloperoxidase gene families provides insight into the evolutionary adaptation of polysaccharide biosynthesis and iodine antioxidation. Additional sequencing of seven cultivars and nine wild individuals reveals that the genetic diversity within wild populations is greater than among cultivars. All of the cultivars are descendants of a wild S. japonica accession showing limited admixture with S. longissima. This study represents an important advance toward improving yields and economic traits in Saccharina and provides an invaluable resource for plant genome studies.

The analysis was performed using sequence similarity scores from pairwise alignments. The dendrogram illustrates the sequence identities between vBPOs and vIPOs.
Figure 11. Different life stages of S. japonica and iodine contents in the adult sporophyte. a) Gametophyte and iodine contents; b) juvenile sporophyte used for the vHPO gene expression investigation; c) iodine contents in basal, middle and distal blades, determined by iodometry, are shown in the light yellow boxes; iodine contents in the gametophyte, the three blade parts, the holdfast and the stipe, determined by Scanning Electron Microscopy (SEM) analyses, are shown in the light blue boxes.
Figure 14. Phylogenetic analysis of the enzymes M1PHD and M1pase involved in mannitol synthesis. Abbreviations beginning with "SJ" in the tree denote the protein IDs of S. japonica. Actinobacteria and Bacteria taxon names are represented by their NCBI protein accession numbers.
Figure 15. The KEGG distribution of proteins in S. japonica and E. siliculosus. The x-axis indicates the percentage of genes in a specific category in each species. The signal transduction, glycan biosynthesis and metabolism, membrane transport, cell communication and development pathways were enriched in S. japonica compared with E. siliculosus.

Supplementary Note 1. Carbon storage and cell wall metabolism
Analogous sets of genes involved in the polysaccharide biosynthesis and metabolism pathways for mannitol, trehalose, cellulose, laminarin, alginate, sulfated fucan and sucrose were identified in the 14 algal genomes and annotated based on KEGG and previous functional classifications 1,2,3,4 (Supplementary Figure 12).
Mannitol is one of the most widespread sugar alcohols and is found in bacteria, fungi, algae, and land plants 1 . It is known to be involved in osmoregulation and in the storage and regeneration of reducing power, and it serves as a compatible solute in both land plants and algae 2 . Genes involved in the mannitol cycle in Ectocarpus were obtained from a previous study 3 and used as reference genes to search the S. japonica and other algal genomes using BLASTp. An E-value cutoff of <10^-10 was applied to the BLAST results to obtain candidate genes. All of the identified proteins were manually curated by searching against the NCBI non-redundant protein database. As shown in Supplementary Figure 12, the complete mannitol cycle was identified only in the stramenopile algae Saccharina, Ectocarpus and Nannochloropsis. One ortholog of M1PHD (55.7% identity to bacteria) and two orthologs of M1pase (>42.0% identity to bacteria), which facilitate mannitol biosynthesis, were identified in the S. japonica genome. One ortholog of M2HD (55.6% identity to bacteria) and one ortholog of FK (50.3% identity to bacteria), which facilitate mannitol hydrolysis, were identified in the S. japonica genome. Phylogenetic trees supported the HGT origin of the mannitol cycle in brown seaweeds (Supplementary Figure 13).
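The E-value filtering step described above can be illustrated with a minimal Python sketch. It assumes the BLASTp results were saved in the standard 12-column tabular format (`-outfmt 6`); the source does not state which output format was used, and the function name `filter_hits` and the example rows are hypothetical.

```python
import csv
import io

# Column order of the standard BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

def filter_hits(handle, max_evalue=1e-10):
    """Return the rows whose E-value is strictly below max_evalue."""
    reader = csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
    return [row for row in reader if float(row["evalue"]) < max_evalue]

# Made-up rows for illustration only; they are not data from the study.
example = (
    "query_A\tsubject_1\t78.2\t350\t76\t0\t1\t350\t5\t354\t1e-150\t520\n"
    "query_A\tsubject_2\t31.0\t90\t62\t2\t10\t99\t40\t127\t2e-05\t45.1\n"
    "query_B\tsubject_3\t55.7\t410\t181\t3\t1\t408\t2\t410\t3e-120\t430\n"
)

hits = filter_hits(io.StringIO(example))
candidate_ids = [row["sseqid"] for row in hits]
print(candidate_ids)
```

Candidates retained by such a filter would still need the manual curation against the NCBI non-redundant protein database that the text describes.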
Laminarin is another carbon storage compound in brown algae 4 . Unlike land plants, brown algae do not store the carbon assimilated by photosynthesis as insoluble starch granules but instead as the soluble 1,3-β-glucan polymer (laminarin) localized to the cytosol 5 . Genes involved in the biosynthesis of laminarin were compared among all available algal genomes. Our results showed that laminarin could be synthesized in stramenopiles and green algae but not in red algae because the genes needed for the final two steps, a β-1,3-glucan synthase gene from the GT48 family and a KRE6-like gene belonging to the GH16 family, were not found in red algae. These two genes are likely involved in the synthesis of the β-1,6-linked branches of laminarin 4 .
Alginate is a major cell wall polymer of brown algae, accounting for up to 45% of the dry weight. It is an unbranched polysaccharide initially synthesized as a β-1,4-D-mannuronic acid chain (M-alginate). The precursor for M-alginate is GDP-mannuronic acid, which is believed to be derived from a four-electron oxidation of GDP-mannose by the enzyme GMD 6 . The M-alginate is later modified by MC5Es, which convert single residues or large blocks of the polymer from D-mannuronic acid into L-guluronic acid (G-alginate). MC5E genes have been reported in the brown algae Laminaria digitata and E. siliculosus. Among all 14 algal genomes in our study, MC5Es were found only in Ectocarpus and S. japonica, suggesting that other stramenopiles, red algae and green algae are unable to synthesize alginate. The 105 MC5E genes identified in S. japonica exceed the 28 identified in E. siliculosus. Genetic distance analysis of MC5Es showed that 43 of these genes were located on seven scaffolds and shared high sequence similarity (>85%), indicating that recent tandem duplication events occurred in MC5E evolution. Phylogenetic analysis of MC5Es and GDPs among the genomes of Ectocarpus, S. japonica and other algae suggests that putative HGT events from Actinobacteria introduced the chimeric pathway to brown algae.
Sulfated fucans are matrix polysaccharides from the cell wall of marine fucalean brown algae, consisting of an α-L-fucose backbone substituted by sulfate ester groups and branched with other monosaccharide residues 7 . The E. siliculosus genome possesses two candidate pathways for producing GDP-fucose, the fucan precursor. One pathway is catalyzed by GDP-mannose 4,6-dehydratase and GDP-L-fucose synthetase, and the other is an alternative salvage pathway catalyzed by L-fucokinase (FK) and GDP-fucose pyrophosphorylase. Fucosyltransferases (FTs) from the glycosyltransferase (GT) families (e.g., GT10, GT23, and GT65) can be involved in the polymerization of GDP-fucose into elongating fucan chains in E. siliculosus, and the polymerized polysaccharides are sulfated by specific sulfotransferases. Based on the known sulfated fucan pathway, we reconstructed the metabolic pathway for the biosynthesis and remodeling of sulfated fucans in S. japonica and the other 13 algal genomes. Red algae lack candidate genes in the two pathways to metabolize GDP-fucose, indicating the absence of sulfated fucans or the existence of an alternative pathway. In a comparison of the genes involved in the last two steps, FTs (6 in S. japonica, 5 in E. siliculosus, and more than 3 in other algae) and sulfotransferases (24 in S. japonica, 23 in E. siliculosus, and more than 10 in other algae), brown algae showed an obvious advantage in gene numbers, consistent with the sophisticated structure of their cell wall matrix.

Supplementary Note 2. Carbohydrate-active enzymes
Carbohydrate-active enzymes (CAZymes) are responsible for the breakdown, biosynthesis or modification of glycoconjugates, oligo- and polysaccharides 8 . CAZymes can be subdivided into four functional classes based on their structurally related catalytic modules or functional domains: glycoside hydrolases (GHs), glycosyltransferases (GTs), polysaccharide lyases (PLs), and carbohydrate esterases (CEs). Among them, the key enzymes for the synthesis and remodeling of oligo- and polysaccharides are GHs and GTs, which are classified into more than 200 Carbohydrate-Active enZYme (CAZy) families (http://www.cazy.org/) 9 . To identify the CAZymes from S. japonica and distinguish the different cell wall polysaccharides in other algae, we performed CAZyme screening in S. japonica and the other 13 algal genomes (Supplementary Tables 6-8). All of the putative proteins were searched against entries in the CAZy database using the dbCAN Web server 10 , in which HMMer 11 was used to query a collection of custom-made HMM profiles constructed for each CAZy family. The original output was downloaded and parsed manually with the following parameters: 1) E-value <10^-10; 2) identity >50%; and 3) alignment length >80 amino acids. All of the identified proteins were then manually curated.
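The three parsing thresholds listed above can be expressed as a single predicate. The following Python sketch is illustrative only: the `Hit` record and its field names are hypothetical and do not reproduce the actual dbCAN output layout, and the example values are made up.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    """One protein-to-family hit; field names are illustrative, not dbCAN's."""
    protein_id: str
    family: str
    evalue: float
    identity: float   # percent identity
    aln_length: int   # aligned amino acids

def passes_thresholds(hit, max_evalue=1e-10, min_identity=50.0, min_length=80):
    """Apply the three criteria: E-value < 1e-10, identity > 50%, length > 80 aa."""
    return (
        hit.evalue < max_evalue
        and hit.identity > min_identity
        and hit.aln_length > min_length
    )

def count_by_family(hits):
    """Count the retained hits per CAZy family."""
    return Counter(h.family for h in hits if passes_thresholds(h))

# Made-up hits for illustration only.
example_hits = [
    Hit("prot_1", "GT2", 1e-40, 62.0, 310),   # passes all three criteria
    Hit("prot_2", "GT2", 1e-08, 70.0, 200),   # fails the E-value criterion
    Hit("prot_3", "GH81", 1e-30, 45.0, 150),  # fails the identity criterion
    Hit("prot_4", "GH81", 1e-25, 58.0, 80),   # fails the length criterion (not > 80)
    Hit("prot_5", "GH16", 1e-55, 81.5, 240),  # passes all three criteria
]

family_counts = count_by_family(example_hits)
print(dict(family_counts))
```

All comparisons are strict, matching the "<" and ">" signs in the stated parameters, so a hit with an alignment length of exactly 80 amino acids is excluded.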
A total of 213 putative CAZymes were identified in S. japonica (Supplementary Table 6) using the CAZy annotation pipeline. The genome of S. japonica encodes 82 genes from 17 GH families and 131 genes from 30 GT families, which represents a higher absolute number of genes than in Ectocarpus but fewer gene families (54 genes from 18 GH families and 92 genes from 32 GT families). Gene expansion of the GT families was found in GT2, GT23, GT47 and GT77, which contain 24, 17, 14 and 11 genes, respectively, in S. japonica, compared with 12, 7, 8 and 1 genes, respectively, in Ectocarpus. These four families are related to cellulose and alginate biosynthesis in brown algae, providing additional evidence for the morphological enhancement of S. japonica. Furthermore, compared with Ectocarpus, S. japonica has gained several new GT families, namely GT27, GT28, GT31, GT68, GT90, and GT92, which are described as acetylglucosaminyltransferases or O-α-fucosyltransferases, but lacks other GT families, such as GT15, GT24, GT33, GT54, GT59, GT65, GT66, which each contain a single gene in each family in S. japonica.
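As a quick arithmetic check of the figures in this paragraph, the following Python sketch sums the GH and GT gene totals and computes the S. japonica to Ectocarpus gene-count ratio for the four expanded GT families. The numbers are taken from the text above; the ratio is a simple illustrative measure and not a statistic reported by the authors.

```python
# Gene totals for S. japonica as stated in the text.
sj_totals = {"GH": 82, "GT": 131}

# (S. japonica, Ectocarpus) gene counts for the expanded GT families.
gt_counts = {
    "GT2": (24, 12),
    "GT23": (17, 7),
    "GT47": (14, 8),
    "GT77": (11, 1),
}

def expansion_ratios(counts):
    """Return the S. japonica / Ectocarpus gene-count ratio per family."""
    return {family: sj / es for family, (sj, es) in counts.items()}

total_cazymes = sum(sj_totals.values())
ratios = expansion_ratios(gt_counts)

print(f"GH + GT genes in S. japonica: {total_cazymes}")
for family, ratio in ratios.items():
    print(f"{family}: {ratio:.2f}-fold")
```

The sum of 82 GH genes and 131 GT genes matches the reported total of 213 putative CAZymes.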
Compared with Ectocarpus, four GH families (GH13, GH18, GH114, GH128) were newly gained and four others (GH31, GH36, GH63, GH95) were lost in S. japonica. Among them, 7 genes in GH114 were found exclusively in S. japonica. Gene expansion among the GH families was found in GH81 (mainly endo-1,3-β-glucanases). The comparison of gene numbers and families within the GH and GT groups between the two brown algae provides additional insight into the evolution and diversification of cell wall-related polysaccharides in S. japonica.