A reversible haploid mouse embryonic stem cell biobank resource for functional genomics

Journal name: Nature
DOI: doi:10.1038/nature24027

The ability to directly uncover the contributions of genes to a given phenotype is fundamental for biology research. However, ostensibly homogeneous cell populations exhibit large clonal variance1, 2 that can confound analyses and undermine reproducibility3. Here we used genome-saturated mutagenesis to create a biobank of over 100,000 individual haploid mouse embryonic stem (mES) cell lines targeting 16,970 genes with genetically barcoded, conditional and reversible mutations. This Haplobank is, to our knowledge, the largest resource of hemi/homozygous mutant mES cells to date and is available to all researchers. Reversible mutagenesis overcomes clonal variance by permitting functional annotation of the genome directly in sister cells. We use the Haplobank in reverse genetic screens to investigate the temporal resolution of essential genes in mES cells, and to identify novel genes that control sprouting angiogenesis and lineage specification of blood vessels. Furthermore, a genome-wide forward screen with Haplobank identified PLA2G16 as a host factor that is required for cytotoxicity by rhinoviruses, which cause the common cold. Therefore, clones from the Haplobank combined with the use of reversible technologies enable high-throughput, reproducible, functional annotation of the genome.

At a glance

Figures

  1. A repairable mutant mES cell library.
    Figure 1: A repairable mutant mES cell library.

    a, Schematic representation of insertional mutagenesis vectors. Splice acceptor sites (SA) are reversible using non-compatible loxP/lox5171 and FRT/F3 sites (triangles). G418 resistance is conferred by β-Geo (bgeo) transcribed from the reversible cassette (gene-trap vectors, GT) or Neo independently from a PGK promoter (polyA trap), stabilized by a splice donor (SD). Six osteopontin-enhancer (OPE) elements (enhanced gene trap; Lenti-EGT, Retro-EGT, and Tol2-EGT vectors) enhance expression of β-Geo through Oct4 (also known as Pou5f1) binding. RetroRS carries a spacer sequence between loxP sites and lacks OPEs. Purple diamonds indicate internal barcodes (BC). LTR, long terminal repeats; L200/R175 and LITR/RITR, terminal repeats of Tol2 and SB. b, Heat map representing numbers of integrations per gene per one million integrations. Gene expression levels (left column) are shown (blue, highly expressed; white, not expressed). Colour code (remaining columns) shows the numbers of integrations per one million integrations. c, Saturation of mutagenesis systems compared to random in silico mutagenesis. y axis, total numbers of insertions versus the percentage of genes with integrations. d, Schematic representation of splice acceptor inversions. e, Loss of mES cell adhesion in clones with integrations in intron 1 of Ctnna1. Inversion of the gene trap restores cell adhesion; subsequent reversion again disrupts adhesion. Phalloidin is used to visualize polymerized actin; DAPI visualizes nuclei. Scale bars, 10 μm. One representative experiment out of two biological replicates is shown.

  2. Essential genes for mES cell and common cold virus infections.
    Figure 2: Essential genes for mES cell and common cold virus infections.

    a, b, Functional annotation of essential mES cell genes. a, Competitive growth assays of antisense (GFP+) and Cre-reverted sense (mCherry+) sister cells with integrations in the indicated genes. Cell populations were analysed at the indicated days after Cre addition using flow cytometry. Data are mean ± s.d. of biological triplicates of a single experiment. b, Fluorescence-activated cell sorting (FACS) plots for the essential gene Psmd1 showing the depletion of mCherry+ cells. c, Integration sites of top scoring genes in our haploid mES cell survival screen of human rhinovirus RV-A1a infections. Loss-of-function score for integrations into the Ldlr locus P = 2.9 × 10^−12 and P = 1.4 × 10^−11 for Pla2g16. Sense integrations, red triangles; antisense integrations, green triangles; exons, blue boxes. Transcriptional start sites are marked. d, Growth advantage of sense versus respective antisense sister mES cells with integrations in Pla2g16 or Ldlr upon infection with RV-A1a. In uninfected cells, mutation of these genes did not confer growth advantages; arbitrarily set to one. e, HEK293T cells were transduced with four different sgRNAs against PLA2G16 and LDLR in biological triplicates, mixed with control GFP+ HEK293T cells at a ratio of 1:3. Ratios of control to mutated HEK293T cells were evaluated on day 13 after infection using FACS. Data are mean ± s.d. (d, e) with individual data points (diamonds) normalized to uninfected cells. *P < 0.05, **P < 0.01, ***P < 0.001; one-tailed Student’s t-test. f, Targeting of the C-terminus of mouse Pla2g16 using CRISPR–Cas9. Upon selection of haploid cells to ensure hemizygous editing, cells were split and maintained in the presence and absence of RV-A1a.

  3. Regulators of angiogenesis.
    Figure 3: Regulators of angiogenesis.

    a, Generation of sprouting vasculature from haploid mES cells, differentiated into embryoid bodies and cultured with VEGFA (30 ng ml^−1). b, CD31+ (green) endothelial cells and filopodia, indicative of tip cells in blood-vessel sprouts. Luminal structures and collagen IV+ basement membranes are shown on the right. Scale bars, 200 μm or as indicated. c, Schematic outline for functional validation of candidate genes in sprouting angiogenesis. The clones from the Haplobank were infected with GFP or mCherry-Cre viruses to generate disruptive sense and antisense sister cells. d, Representative images of hypo- and hypersprouting sense and antisense sister clones. IB4 marks endothelial cells. Scale bar, 500 μm. e, Quantification of IB4+ blood vessel sprouts. Data were normalized to the respective antisense sister clones. Data are mean ± s.e.m. from a minimum of n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001; two-tailed Student’s t-test.

  4. In vivo angiogenesis and cell specifications.
    Figure 4: In vivo angiogenesis and cell specifications.

    a, Quantification of the ability of indicated sense and antisense targeted clones to form blood vessels in teratomas. Data are mean ± s.e.m. of n = 3 independent teratomas and n = 5 AN3-12 control. b, Gja1-sense (GFP+) and Gja1-antisense (mCherry+) sister mES cells were mixed 1:1 to form chimaeric embryoid bodies and were subsequently injected into immunocompromised nu/nu mice. Representative sections to identify IB4-, GFP- and mCherry-expressing cells are shown. Scale bar, 50 μm. c, Notch1 antisense and mCherry-Cre+ Notch1-sense sister mES cells were mixed (1:1 ratio) to generate mosaic blood vessels and analysed for mCherry+ (red) or GFP+ (green) cells at the tip position. Scale bar, 200 μm. d, Relative tip cell position of sister cells with sense and antisense integrations in the indicated genes, determined in chimaeric 3D sprouts. Data are mean ± s.e.m. of a minimum of n = 3 independent experiments. a, d, *P < 0.05, **P < 0.01, ***P < 0.001; two-tailed Student’s t-test. e, Representative image of mosaic blood-vessel sprouts from Gja1 sense and antisense sister clones. Scale bars, 500 μm and 100 μm (insert). f, Intravenous injection of a Gja1 inhibitory peptide (GAP26) into neonatal mice abrogated retinal angiogenesis. At day 5 after birth, retinas were isolated and stained for IB4+ blood vessels. Scale bars, 500 μm (top) and 100 μm (bottom).

  5. Stem cell properties of the haploid subclone AN3-12.
    Extended Data Fig. 1: Stem cell properties of the haploid subclone AN3-12.

    a, b, Various parthenogenic cell lines derived from independent embryos from an outcross of 129/Sv and C57BL/6 and thus containing different genomic backgrounds for different chromosomes were allowed to form embryoid bodies by placing 1,000 cells per hanging drop. We observed downregulation of pluripotency marker genes (a) and upregulation of markers from all three germ layers (b) in all cell lines assayed on day 0 (d0), day 5 (d5) and day 12 (d12). The HMSc2 subclone AN3-12 was chosen for further study based on its growth properties in serum/LIF and absence of feeders. Data are shown as individual data points of n = 2 technical replicates together with the mean ± s.d. of one representative experiment. c, Growth curve of AN3-12 in the presence and absence of LIF. Data are shown as individual data points and mean values (lines) of three biological replicates. d, FACS analysis of chromosome content of AN3-12 cells (in LIF, same experiment as shown in c) shows the decrease in haploid (1n) cells from 35.5% to 24.9% during the seven-day culture period. e, AN3-12 cells, cultured as in c, maintain a robust haploid population when analysed on day 17 in ESCM despite rapid proliferation. f, Differentiation of AN3-12 cells into keratinocytes resulted in a near-complete loss of haploid cells among the keratin 14 (K14)-positive population; mES cells stained with anti-K14 are shown as a negative specificity control (grey curve in the K14 histogram). g, Immunostaining of AN3-12 cells cultured in ESCM as well as time course of removal of LIF with addition of 500 nM retinoic acid (analysed on the indicated days) shows downregulation of pluripotency markers Oct4, Nanog and Sox2. DAPI is shown as a nuclear counterstain. Scale bars, 50 μm. h, Histological examination of teratomas analysed 25 days after subcutaneous injection of 10^6 cells. All three germ layers were present in six analysed teratomas; representative H&E images are shown. Magnifications are indicated in each panel.

  6. Analysis of genome integrity.
    Extended Data Fig. 2: Analysis of genome integrity.

    a, M-FISH karyotypic analysis was performed on parental mouse haploid cells (AN3-12) to evaluate genomic stability. Randomly selected metaphases were karyotyped and examined by M-FISH and DAPI banding. Approximately 200 metaphases from AN3-12 were counted for the diploid versus haploid frequency and 10 well-spread metaphases were fully karyotyped by M-FISH and DAPI-banding pattern. Images of normal female diploid and haploid karyotypes (19, X) are shown. Images were captured on a Zeiss AxioImager D1 fluorescent microscope equipped with narrow band-pass filters for DAPI, DEAC, FITC, CY3, Texas red and CY5. b, CNV analysis of haploid AN3-12 cells by genome sequencing using Illumina HiSeq2500. Mapped reads were analysed relative to male genomes of parental mouse strains C57BL/6J and 129/Sv, respectively; the quotient relative to the closer parental strain is shown. As expected, the X chromosome is overrepresented whereas the Y chromosome is absent. Regions of detected variation are highlighted with red boxes and are shown in c. Chromosome numbers are indicated. c, In AN3-12 haploid mES cells, three very small deletions (on chromosomes 2, 10 and 12) and one duplication (on chromosome 13) were detectable, as highlighted. d, Chromosomal distribution of SNP densities for in-house 129/Sv and AN3-12 mES cells relative to in-house C57BL/6 are shown. Numbers of SNPs were calculated for all non-overlapping 100-kb windows across the mm10 C57BL/6J mouse reference genome. SNP density in AN3-12 shows regions of high and low number of SNPs relative to the C57BL/6J genome, as expected for a haploid cell line derived from an F1 female between 129/Sv and C57BL/6.

  7. Molecular characterization of mutagenesis vectors.
    Extended Data Fig. 3: Molecular characterization of mutagenesis vectors.

    a, Schematic illustration of the universal NGS strategy. Optimized primer-binding sites compatible with Illumina sequencing and two restriction enzymes with four base-pair recognition sites were placed adjacent to the terminal elements (LTR, TR). An internal barcode of 32 bases with alternating weak and strong bases was inserted in a parallel cloning step. b, For mapping of integration sites, genomic DNA was amplified by iPCR to introduce adaptor sequences and the experimental index for NGS. Paired-end sequencing maps the genomic integration in the first read using a custom primer, the experimental index as well as the internal barcode using standard Illumina primers binding to the integrated complementary sequence. Barcode (BC) PCR was performed on genomic DNA. c, Meta-analysis of mutagen integrations around transcriptional start sites (TSS) (excluding the precise TSS). In particular, Tol2 and retrovirus show a preference to integrate in proximity to the TSS. Retroviruses also frequently integrate into the promoter regions, whereas lentiviral integrations are typically located within the entire gene body. IPKM, insertions per kilobase per million. The vectors used are described in the legend of Fig. 1. d, Distribution of integration sites. Binning the number of integrations in genic and 2-kb upstream regions per 10-kb windows illustrates pronounced cold spots of mutagenesis using retroviral mutagenesis, where one can observe bins devoid of integrations. e, Genomic region surrounding the Gapdh locus exemplifying the distributions of integrations. While retroviral integrations strongly cluster, Tol2 displays a more uniform distribution of integration sites. Tracks are + strand (top) and − strand (bottom) integration sites. Bar lengths indicate NGS read numbers, subsequent to iPCR. f, Heat map illustrating overlap of epigenetic histone marks with integrations of the indicated mutagens, normalized to peak size. Only retrovirus and Tol2 integrations strongly correlate with DNA accessibility determined by ATAC-seq and active marks such as H3K4me3 and H3K27ac. In silico mutagenesis is shown as a control.

  8. Insertional preferences and generation of the mutant mES cell library.
    Extended Data Fig. 4: Insertional preferences and generation of the mutant mES cell library.

    a, Correlation between integration probabilities (IPKM, insertions per kilobase per million) and expression level (mean log2(FPKM)). Strongest correlation is seen for lentiviral constructs as well as Retro-GT without osteopontin-enhancer elements. All mutagenesis vectors are described in Fig. 1 and Methods. b, 5′ RACE on a set of pooled clones with confirmed antisense integration sites revealed multiple spurious transcription initiation sites in the intronic part of the gene-trap vector around the lox site, but we failed to detect spliced transcripts. Transcriptional initiation within the lox5171 site is highlighted. Red-labelled sequence marks the polyGs used for 5′ tailing. c, Intersection of integration sites of the indicated mutagenesis vectors (see Fig. 1) with genomic features. Coding sequences (CDS), 5′ and 3′ untranslated regions (5′ UTR and 3′ UTR), 1st intron, all other introns, excluding the first intron (intron), non-coding exons (ncExon), upstream regions (defined as 2-kb upstream of TSS) and intergenic regions are indicated. Mutagenesis by piggyBac transposons as well as in silico random mutagenesis and ATAC-seq results are shown for comparison. d, Schematic workflow for generation of the mutant haploid mES cell library. Single-cell-derived clones were manually picked 10 or 11 days after seeding, expanded in 96-well plates, and either frozen in quadruplicates or further processed for mapping of the integration sites. e, Schematic illustration of the first step of 4D pooling. Each plate was pooled into the respective slice tray as well as a master plate, uniting identical well coordinates of all plates. f, Schematic illustration of the second step of 4D pooling. Each master plate was pooled into a master tower pool, a plate with lamella uniting columns, and a plate with lamella uniting rows, thereby generating pools for rows and columns over all samples. g, 4-Dimensional pooling of 9,600 clones in 8 rows, 12 columns, 10 slices and 10 towers resulting in 40 pools. After iPCR to introduce experimental indices, pools were combined and deep sequenced. Amplification of internal barcodes confirmed clonal identity and mapping in 4 dimensions. All mapped clones were deposited in the Haplobank (https://www.haplobank.at/).

  9. Numbers of independent gene trap clones and intragenic distribution.
    Extended Data Fig. 5: Numbers of independent gene trap clones and intragenic distribution.

    a, Numbers of independent available cell lines, carrying a single integration per cell, per gene. For about 37% (RefSeq) to 38% (Ensembl) of genes targeted, there is one gene-trap clone available (5′ UTR, intron or coding sequence), whereas about 18% of genes are targeted in two independent clones, and for around 43% of genes three or more independent clones are available. b, 24.8% (RefSeq) to 26.8% (Ensembl) of genes are represented by a single cell line if one takes all clones into account and about 40% of genes are hit in three or more clones. c, Separation of all gene traps combined into biotypes in single-integration clones of the Haplobank. Antisense and intergenic insertions are observed in all systems, in particular for enhanced gene-trap vectors. d, To map the integration sites of our clones from the Haplobank to the open reading frames (ORFs) of the respective genes, we dissected ORFs into 5% intervals and annotated integration sites in introns and exons relative to the position within the ORF. All mutagenesis systems (see Fig. 1) show a strong bias towards transcript truncation proximal to the 5′ end of the ORFs and are thus predicted to result in loss-of-function alleles. We defined integrations in the first 50% of the coding sequence (green bars) as optimal for a gene-trap allele; these clones are highlighted by a yellow star on the Haplobank homepage.

  10. Interaction of Pla2g16 with Cox inhibitors in mES cells.
    Extended Data Fig. 6: Interaction of Pla2g16 with Cox inhibitors in mES cells.

    a, Titration series of the indicated Cox inhibitors in the presence and absence of rhinovirus (RV-A1a) in mES cells. No protective effect of inhibition of prostaglandin biosynthesis was detected at non-toxic concentrations. Because mES cells do not generate infectious RV-A1a efficiently, conditioned supernatant containing RV-A1a was added daily. Data are shown as individual data points and mean values. b, mES cell clones from the Haplobank that had mutations in Ldlr and Pla2g16, respectively, were mixed as sister cells in sense (red) and antisense (green) orientation labelled by GFP and mCherry. Subsequently, cells were cultured in the presence and absence of rhinovirus RV-A1a for four days and ratios of red and green cells were then quantified using FACS. Selection pressure for loss of Ldlr and Pla2g16 was not affected by inhibition of Cox. Data are shown as mean ± s.d. of three biological replicates.

  11. Interactions of PLA2G16 with Cox inhibitors in human HEK293T cells and domain mapping.
    Extended Data Fig. 7: Interactions of PLA2G16 with Cox inhibitors in human HEK293T cells and domain mapping.

    a, RV-A1a exposure causes cell death in HEK293T cells in a dose-dependent manner. Cell viability was quantified three days after infection using Alamar blue. Data are shown as individual data points of five biological replicates and mean values. b, Titration series for ibuprofen and indomethacin treatment in the presence and absence of rhinovirus (RV-A1a) in HEK293T cells. Protective effects of ibuprofen and indomethacin were detected at a high drug concentration. Cell viability was quantified 2.5 days after infection using Alamar blue. Data are shown as individual data points of 4 biological replicates and mean values. c, Competitive growth assays in HEK293T cells. Cells containing sgRNAs targeting PLA2G16 did not show a growth difference in the absence of RV-A1a or when treated with indomethacin at 100 μM, but were significantly enriched when challenged with RV-A1a, indicating preferential survival. By contrast, control-guide-treated cells did not show growth advantages at any experimental condition. Data are shown as individual data points and mean ± s.d. of biological triplicates analysed on days 0, 3, 6 and 10 after RV-A1a exposure. d, Scheme of Pla2g16 domains. The enzymatic centre of Pla2g16 is located in the cytoplasm (green); an α helix in the transmembrane domain (yellow) connects the core to a short vesicular domain (blue), located in endosomes49. e, Design of CRISPR sgRNAs targeting the mRNA regions encoding the vesicular domain of seven amino acids (sgRNA1) and the 3′ UTR (sgRNA2) in haploid mES cells to test essentiality of these domains in RV-A1a infections. f, Cells carrying sgRNA1 showed editing in only 1 out of 12 cases, but upon selection with RV-A1a were enriched for deletions within the vesicular domain. For sgRNA2, all mapped deletions in control cells only affected the 3′ UTR, where the expected Cas9 cuts occur; upon RV-A1a exposure, the majority of observed deletions affected the transmembrane domain, the vesicular domain and, in some cases, even extended into the cytoplasmic region. Colour codes: grey, deletion; red, alternative reading frame and insertions.

  12. Blood-vessel sprouting in Notch1 mutant mES cells and candidate tip cell genes.
    Extended Data Fig. 8: Blood-vessel sprouting in Notch1 mutant mES cells and candidate tip cell genes.

    a, Assessment of four independent Notch1-targeted clones from Haplobank. The locations of the integrations are shown: two antisense (AS) clones marked by green triangles, one sense (S) clone marked by the red triangle, and one clone with an upstream (AS-up) integration (blue triangle). Flipping of the gene traps upon Cre infection is shown by PCR in the left panel. Loss of Notch1 protein (intracellular domain, ICD) expression (clones A4, H7), and re-expression (clone D5) upon Cre recombination are shown by western blot (right panel). β-Actin is shown as a loading control. WT, parental clone without any gene trap integration. Uncropped blots are shown in Supplementary Fig. 1. b, Notch1 inactivation leads to a hypersprouting phenotype. Note the advanced progression and increased density of the vascular networks upon Notch1 deletion (sense clone) compared to antisense sister cells (top, bright field images; bottom, IB4 immunostaining to mark endothelial cells). Scale bars, 500 μm. c, Angiogenic sprouting is not affected when the gene trap is located 1,500 bp upstream of the Notch1 gene (A2 clone). GFP+ and Cre-reverted mCherry+ sister cells were analysed in 3D blood-vessel organoid cultures. Bright field images are shown. Scale bars, 500 μm. d, Differentially expressed genes in endothelial tip cells versus stalk cells from two published datasets30, 31 in the mouse retina were filtered for genes that have also been associated (Ingenuity Pathway Analysis) with candidate genes/pathways for vascular diseases in humans. Scatterplot showing the frequency of independent associations of tip cell genes with various human vascular diseases. Genes available in the Haplobank at the beginning of the project were chosen for functional analysis in the 3D organoids. For most of the listed candidate genes, there were no functional vascular data available. e, Quantification of IB4-positive vascular structures from the indicated sister clones carrying sense and repaired antisense integrations. Clones were classified according to their sprouting capacity from low (hyposprouting) to high (hypersprouting). f, Different clones with independent integrations in the same gene showed reproducible phenotypes in sprouting angiogenesis. Vascular outgrowths were stained for endothelial cell-specific IB4 expression, number of vessels counted and normalized to the respective antisense sister clones. e, f, Data are shown as individual data points from a minimum of n = 3 independent experiments for each sense/antisense sister clone combination together with the mean ± s.e.m. *P < 0.05; **P < 0.01; ***P < 0.001; two-tailed Student’s t-test.

  13. Sprouting angiogenesis in reversible sister clones.
    Extended Data Fig. 9: Sprouting angiogenesis in reversible sister clones.

    Representative images of the indicated sense (S) and antisense (AS) sister clones. IB4 was used to mark endothelial cells. GFP or mCherry expression indicates the respective flipped gene traps. Note that some sense clones are GFP+ whereas others are mCherry+; this is owing to the original orientation of the integration in sense or antisense, which was then reverted by the mCherry-Cre-expressing virus. Scale bar, 500 μm. For quantification of data see Fig. 3e.

  14. Generation of a chimaeric vasculature in vivo and gap junction α-1 protein localization to tip cells.
    Extended Data Fig. 10: Generation of a chimaeric vasculature in vivo and gap junction α-1 protein localization to tip cells.

    a, Representative fluorescence image of a haploid mES-cell-derived teratoma stained for endothelial cell-specific IB4. Endothelial cells arising from haploid mES cells are positive for mCherry and IB4 (yellow), whereas host endothelial cells are only positive for IB4 and appear green. Scale bars, 50 μm. b, Representative FACS analysis of teratomas following injection of chimaeric embryoid bodies into immunocompromised mice. Myst3 antisense (mCherry+) and sense (GFP+) sister clones were mixed at a 1:1 ratio. VE-cadherin-negative non-endothelial cells were also determined within the teratomas. c, Parental haploid mES cells stably expressing GFP or mCherry-Cre were assessed for their ability to generate IB4+ vascular structures in the presence of VEGFA. The number and ratios of IB4+ vessels per organoid were not apparently different between GFP-expressing and mCherry-Cre-expressing cells. Scale bars, 500 μm. Data are shown as mean ± s.e.m. and individual data points from n = 4 independent experiments. P = 0.207; two-tailed Student’s t-test. d, GFP-expressing and mCherry-Cre-expressing parental haploid mES cells contribute equally to tip cells (49.2% GFP+; 50.8% mCherry-Cre+) in 1:1 mixed mosaic cultures. Data are shown as mean ± s.e.m. and individual data points from n = 4 independent experiments. P = 0.823; two-tailed Student’s t-test. e, Localization of gap junction α-1 protein in the mouse retina at postnatal day 6 (P6). Endothelial cells are marked by IB4 staining. At the angiogenic front, gap junction α-1 protein expression is found in endothelial cells, primarily localized at tip cells (arrows). Scale bars, 50 μm. f, Retinas were stained for gap junction α-1 protein expression and the endothelial marker IB4 to visualize the vascular networks on P6. Note the punctate pattern of gap junction α-1 protein adjacent to the IB4+ vessels, suggestive of gap junction α-1 protein expression in perivascular cells. Scale bar, 50 μm (left) and 20 μm (right). g, Gap junction α-1 protein predominantly localizes to the tip cells (arrows) in the 3D blood vessels. Vessels are marked by CD31 immunostaining and counterstained by DAPI. Bar graph indicates percentages of vessels with the highest gap junction α-1 protein expression in the tip cell. Data are shown as individual data points of eight independent embryoid bodies and mean ± s.d. of vessels. Scale bars, 20 μm (right) and 10 μm (left).

Main

Approaches to functionally analyse the mammalian genome include ENU (N-ethyl-nitrosourea) mutagenesis4, gene targeting5, RNA interference6, 7 and CRISPR-mediated genome editing8. Although powerful, these approaches have various caveats, such as poor knockdown efficiency and off-target effects9, 10, 11. Additionally, clonal variability within populations can compromise comparisons and reproducibility3, 12, 13, 14. Therefore, reversible mutations that enable the direct comparison of phenotypes within a single clone are essential for the study of genetic dependencies.

To generate a conditional mutagenesis system at a genome-wide scale and at the clonal level, we applied insertional mutagenesis using genetically barcoded, lentivirus-based, retrovirus-based15, 16 and transposon-based (Tol2 (ref. 17) and Sleeping Beauty (SB)18) vectors in haploid mES cells (Fig. 1a) that enable recessive genetics. We analysed several parthenogenetic haploid mES cell lines (129/Sv and C57BL/6 background) (Extended Data Fig. 1a, b) and chose to use AN3-12 cells, which grow in feeder-free conditions and maintain a stable haploid genome across many generations in an undifferentiated state. These cells express pluripotency markers and differentiate into all germ layers in vivo (Extended Data Fig. 1c–h). Notably, AN3-12 cells display only minor genomic duplications and deletions that potentially affect the genes Cdh4, Taf4 (also known as Taf4a), Agmo and Cox7c (Extended Data Fig. 2). Insertional mutagenesis permits the integration of invertible splice acceptors, resulting in conditional alleles, as well as high-throughput direct identification of integration sites. To map insertion sites and complex internal barcodes (>10^7) by inverse PCR, an optimized universal sequencing strategy was established (Extended Data Fig. 3a, b). A combination of strategies was used to avoid genomic biases of the insertional mutagenesis systems (Fig. 1a, b and Extended Data Figs 3c–f, 4a; reviewed in refs 19, 20), yielding genome-wide mutagenesis (Fig. 1c). Of note, Tol2 outperformed the classical viral delivery systems and even in silico mutagenesis at less than one million integrations (Fig. 1c). The mutagenesis systems also generated antisense and intergenic integrations at a high frequency owing to cryptic transcriptional start sites (Extended Data Fig. 4b, c); however, this is not expected to affect disruption of transcription in sense orientation. Therefore, we used various delivery systems to obtain unbiased, genome-saturated and conditional mutagenesis.
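The comparison with random in silico mutagenesis (Fig. 1c) can be illustrated with a minimal sketch. It assumes a toy genome in which every insertion lands in a gene with probability proportional to gene length; the gene lengths, the seed and the function name `saturation_curve` are hypothetical and are not taken from the study, whose actual in silico procedure may differ.

```python
import random


def saturation_curve(gene_lengths, n_insertions, checkpoints, seed=0):
    """Simulate random insertions and report the fraction of genes hit.

    Assumption (not from the source): the chance that an insertion lands in
    a gene is proportional to that gene's length.
    """
    rng = random.Random(seed)
    genes = list(range(len(gene_lengths)))
    # Draw all insertion targets at once, weighted by gene length.
    targets = rng.choices(genes, weights=gene_lengths, k=n_insertions)
    wanted = set(checkpoints)
    hit_genes = set()
    curve = {}
    for count, gene in enumerate(targets, start=1):
        hit_genes.add(gene)
        if count in wanted:
            curve[count] = len(hit_genes) / len(genes)
    return curve


if __name__ == "__main__":
    # Hypothetical toy genome: 2,000 genes of 1-200 kb.
    length_rng = random.Random(1)
    lengths = [length_rng.randint(1_000, 200_000) for _ in range(2_000)]
    curve = saturation_curve(lengths, 50_000, [100, 1_000, 10_000, 50_000])
    for n, fraction in curve.items():
        print(f"{n:>6} insertions: {fraction:.1%} of genes hit")
```

A mutagen with strong hot spots would be modelled by more skewed weights, and would be expected to approach saturation more slowly than this length-weighted baseline.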

Next, we developed a high-throughput pipeline to create a biobank of reversible mutations in AN3-12 haploid mES cells. Starting from haploid mES cell pools that contained between 10^7 and 10^8 distinct mutations for each mutagen, we arrayed, processed, banked and mapped over 100,000 individual mES cell clones (Extended Data Fig. 4d–g). In total, we generated sense and antisense clones that target 16,970 out of approximately 24,000 annotated mouse genes (genome release mm10), covering over 70% of the protein-coding genome (Extended Data Fig. 5a–c and Supplementary Table 1). Integrations display a 5′ bias in genes and in the coding sequence (Extended Data Fig. 5d), resulting in truncations that are likely to generate loss-of-function alleles. All pools of mutated cells are available at https://www.haplobank.at. This resource represents a comprehensive library of mES cell clones carrying hemi/homozygous, twice reversible, barcoded integrations, combining the power of stem cells with tunable mutagenesis.
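The 4D pooling used for mapping (Extended Data Fig. 4e–g) places 9,600 clones on a grid of 8 rows, 12 columns, 10 slices and 10 towers, giving 40 pools. The following minimal sketch shows why four pool memberships are enough to locate a clone. The pool labels, the zero-based indices and the function names `pools_for` and `decode` are illustrative assumptions, not the pipeline used to build the Haplobank.

```python
from itertools import product

# Grid dimensions as stated in Extended Data Fig. 4g.
ROWS, COLS, SLICES, TOWERS = 8, 12, 10, 10


def pools_for(row, col, slice_, tower):
    """Return the four pool labels containing the clone at this coordinate.

    The label scheme (R/C/S/T plus a zero-based index) is hypothetical.
    """
    return {f"R{row}", f"C{col}", f"S{slice_}", f"T{tower}"}


def decode(pool_hits):
    """Recover (row, col, slice, tower) from the pools where an integration
    site was sequenced.

    Returns None if any dimension has zero or several hits, for example when
    the same integration is present in more than one clone.
    """
    coordinate = []
    for prefix in "RCST":
        indices = [int(p[1:]) for p in pool_hits if p.startswith(prefix)]
        if len(indices) != 1:
            return None
        coordinate.append(indices[0])
    return tuple(coordinate)


all_coords = list(product(range(ROWS), range(COLS), range(SLICES), range(TOWERS)))
n_pools = ROWS + COLS + SLICES + TOWERS
print(len(all_coords), "clones in", n_pools, "pools")  # 9600 clones in 40 pools
print(decode(pools_for(3, 7, 2, 9)))                   # (3, 7, 2, 9)
```

In this scheme, sequencing 40 pools replaces sequencing 9,600 individual wells; integrations that cannot be decoded unambiguously are the cases where an independent identifier such as the internal barcode becomes useful.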

A key advantage of the Haplobank is the ability to compare each mES cell clone with its sister clone carrying the conditionally inverted splice acceptor. As a proof-of-principle, we analysed two mES cell clones containing sense integrations within Ctnna1, which encodes α-E-catenin and is essential for mES cell adhesion (ref. 21). mES cells with sense integrations showed reduced Ctnna1 expression and impaired cell adhesion, which were both restored by FlpO-mediated reversal of the integrated mutagenesis vectors. Furthermore, Cre-mediated reversal back into the sense orientation again disrupted α-E-catenin-mediated cell adhesion (Fig. 1d, e). Additionally, we evaluated mES cell lines carrying non-disruptive, antisense integrations in presumed essential genes. We infected these mES cell lines with a pool of retroviruses that encode either Cre and mCherry, or GFP only. If a gene is essential, Cre-mediated reversion of the integration to the disruptive sense orientation should specifically deplete mCherry+ cells from the mixed mCherry+ and GFP+ cell pool over time, as detected by flow cytometry. We confirmed the essential role of several genes for mES cell survival (Fig. 2a, b). Therefore, our system enables direct, functional annotation of gene essentiality, instead of screening for the absence of mutations; moreover, one can directly examine the penetrance and timing of lethal phenotypes.
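The competitive growth readout described above can be illustrated with a minimal sketch. It assumes that flow cytometry yields raw counts of mCherry+ and GFP+ cells at each time point and that depletion is expressed relative to the first time point; the function names and the example counts are hypothetical and are not taken from the study.

```python
def mcherry_fraction(mcherry_count, gfp_count):
    """Fraction of labelled cells that are mCherry+ (Cre-reverted, sense)."""
    total = mcherry_count + gfp_count
    if total == 0:
        raise ValueError("no labelled cells counted")
    return mcherry_count / total


def relative_depletion(timecourse):
    """Normalize the mCherry+ fraction at each day to the first time point.

    timecourse: list of (day, mcherry_count, gfp_count) tuples,
    with the baseline measurement first.
    """
    _, base_m, base_g = timecourse[0]
    baseline = mcherry_fraction(base_m, base_g)
    return [(day, mcherry_fraction(m, g) / baseline) for day, m, g in timecourse]


# Hypothetical counts: a value well below 1 at later days would be
# consistent with an essential gene.
example = [(0, 500, 500), (4, 250, 750)]
print(relative_depletion(example))
```

A non-essential gene would be expected to give values close to 1 across the time course under these assumptions.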

Figure 2: Essential genes for mES cell and common cold virus infections.

a, b, Functional annotation of essential mES cell genes. a, Competitive growth assays of antisense (GFP+) and Cre-reverted sense (mCherry+) sister cells with integrations in the indicated genes. Cell populations were analysed at the indicated days after Cre addition using flow cytometry. Data are mean ± s.d. of biological triplicates of a single experiment. b, Fluorescence-activated cell sorting (FACS) plots for the essential gene Psmd1 showing the depletion of mCherry+ cells. c, Integration sites of top scoring genes in our haploid mES cell survival screen of human rhinovirus RV-A1a infections. Loss-of-function score for integrations into the Ldlr locus P = 2.9 × 10⁻¹² and P = 1.4 × 10⁻¹¹ for Pla2g16. Sense integrations, red triangles; antisense integrations, green triangles; exons, blue boxes. Transcriptional start sites are marked. d, Growth advantage of sense versus respective antisense sister mES cells with integrations in Pla2g16 or Ldlr upon infection with RV-A1a. In uninfected cells, mutation of these genes did not confer growth advantages; arbitrarily set to one. e, HEK293T cells were transduced with four different sgRNAs against PLA2G16 and LDLR in biological triplicates, mixed with control GFP+ HEK293T cells at a ratio of 1:3. Ratios of control to mutated HEK293T cells were evaluated on day 13 after infection using FACS. Data are mean ± s.d. (d, e) with individual data points (diamonds) normalized to uninfected cells. *P < 0.05, **P < 0.01, ***P < 0.001; one-tailed Student’s t-test. f, Targeting of the C-terminus of mouse Pla2g16 using CRISPR–Cas9. Upon selection of haploid cells to ensure hemizygous editing, cells were split and maintained in the presence and absence of RV-A1a.

We subsequently performed a genome-wide screen to uncover novel genes whose loss confers resistance to infection with a common cold virus. We chose the rhinovirus serotype RV-A1a, which replicates in mouse cells (ref. 22). A pool of mES cells carrying gene-trap insertions was exposed to rhinovirus every other day for three weeks. The surviving, virus-resistant cells displayed an enrichment for multiple disruptive insertions in the low-density lipoprotein receptor (Ldlr), a known entry portal for this virus (P = 2.9 × 10⁻¹²), and in the phospholipase Pla2g16 (P = 1.4 × 10⁻¹¹) (Fig. 2c). We confirmed that these genes are required for virus-mediated cell killing using three different sister clones with reversible integrations (Fig. 2d). Next, we used CRISPR–Cas9 technology to disrupt these genes in human embryonic kidney (HEK293T) cells and monitored competitive proliferation with and without RV-A1a infection. In this assay, LDLR was not required for RV-A1a-mediated killing (Fig. 2e), presumably because the virus can enter via other receptors in the absence of functional LDLR in HEK293T cells (ref. 23). Notably, we confirmed that inactivation of PLA2G16 with different single-guide RNAs (sgRNAs) confers a selective survival advantage to HEK293T cells exposed to RV-A1a (Fig. 2e). Therefore, the Haplobank enables genome-saturated forward screening and validation to discover novel genes underlying specific phenotypes.

PLA2G16 can catalyse the rate-limiting step of arachidonic acid synthesis, and therefore couples to cyclooxygenases (Cox) and prostaglandin synthesis (refs 24, 25). However, different Cox inhibitory drugs did not block RV-A1a-mediated cell death of control or repaired Pla2g16 sister mES cells, nor did arachidonic acid enhance RV-A1a toxicity (Extended Data Fig. 6a). Selective survival of Pla2g16- or Ldlr-mutant mES cells was not affected by Cox inhibitors (Extended Data Fig. 6b). Of note, the Cox inhibitory drugs ibuprofen and indomethacin conferred partial resistance to RV-A1a in HEK293T cells (Extended Data Fig. 7a–c), albeit at concentrations that also affect other pathways (ref. 26). Structurally, the short C-terminal vesicular domain of PLA2G16 extends into the endosomal lumen where the virion is located before releasing its RNA (ref. 27) (Fig. 2f). To test the relevance of this domain, we edited the Pla2g16-coding region in the endosomal C terminus and 3′ untranslated region (UTR) using CRISPR–Cas9 and selected for RV-A1a-resistant cells. Mutations conferring resistance to RV-A1a were enriched in the transmembrane domain and the vesicular domain (Fig. 2f and Extended Data Fig. 7d–f), consistent with a recent, independent haploid screen that identified PLA2G16 in Picornaviridae infection (ref. 28). Our results identify the C-terminal domain of Pla2g16 as a target to block rhinoviral infections.

As a third application of the Haplobank, we investigated pathways required for angiogenesis (ref. 29). Multiple candidate angiogenesis genes have been proposed, but few have been functionally validated (refs 30, 31). We adapted a blood-vessel sprouting assay in embryoid bodies, which recapitulates key features of in vivo angiogenesis (ref. 32), to our haploid mES cells. Sprouts were positive for the endothelial marker CD31 and the basement membrane protein collagen IV, and formed lumens (Fig. 3b). Similar to in vivo blood-vessel formation, the cells at the periphery of the vascular structures exhibited characteristic features of tip cells, such as CD31-positive filopodia protrusions, followed by stalk cells (Fig. 3b). Tip cells express delta-like ligand 4 (Dll4), which activates the Notch1 pathway on stalk cells to suppress their conversion into tip cells (ref. 33). We used Notch1 antisense, non-disruptive clones from the Haplobank and, using Cre recombination, created stable sense sister clones with knockout of Notch1 expression (Extended Data Fig. 8a). Embryoid bodies were derived from multiple sense, knockout sister clones and displayed significantly increased vessel density (Extended Data Fig. 8b and data not shown). Gene-trap integration upstream of the Notch1 gene did not alter vessel density (Extended Data Fig. 8c). Therefore, the sprouting assay of embryoid bodies recapitulates normal blood-vessel development.

Figure 3: Regulators of angiogenesis.

a, Generation of sprouting vasculature from haploid mES cells, differentiated into embryoid bodies and cultured with VEGFA (30 ng ml−1). b, CD31+ (green) endothelial cells and filopodia, indicative of tip cells in blood-vessel sprouts. Luminal structures and collagen IV+ basement membranes are shown on the right. Scale bars, 200 μm or as indicated. c, Schematic outline for functional validation of candidate genes in sprouting angiogenesis. The clones from the Haplobank were infected with GFP or mCherry-Cre viruses to generate disruptive sense and antisense sister cells. d, Representative images of hypo- and hypersprouting sense and antisense sister clones. IB4 marks endothelial cells. Scale bar, 500 μm. e, Quantification of IB4+ blood vessel sprouts. Data were normalized to the respective antisense sister clones. Data are mean ± s.e.m. from a minimum of n = 3 independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001; two-tailed Student’s t-test.

To identify new genes, we selected candidate genes that were more highly expressed in tip versus stalk cells (refs 30, 31) and were associated with human vascular disease (Extended Data Fig. 8d). We focused on 32 genes that were represented in the Haplobank. To ensure that differences in angiogenesis are directly linked to inactivation of the respective target genes, and not genetic background noise or clonal effects, we performed colour tracing. We infected selected mES cells with GFP-expressing or mCherry-Cre-expressing retroviruses to generate sister clones with reverted orientation of the integration (Fig. 3c). We observed large variability between independent clones (Extended Data Fig. 8e, f), highlighting the importance of comparing a mutant to genetically repaired sister clones for each gene. Genetic inactivation of Myst3 (also known as Kat6a), Mecom, Gja1, Gabrb3, Tnfrsf1a and Dlg2 reduced sprouting angiogenesis and decreased vessel formation at least twofold compared to antisense clones; gene trapping of Enpp3, Smarca1, Ndufs4, Plcb1 or Hck promoted blood vessel growth (Fig. 3d, e and Extended Data Fig. 9). Therefore, the Haplobank enabled rapid, functional and reproducible validation of candidate angiogenesis genes in engineered blood vessels.

To assess the in vivo role of these angiogenesis genes, we generated embryoid bodies from mES cells stably expressing mCherry and injected them into immunocompromised mice. The resulting teratomas were assessed for mES-cell-derived mCherry+IB4+ blood vessels (Fig. 4a and Extended Data Fig. 10a). To control for teratoma growth rates, we injected 1:1 mixed mosaic embryoid bodies of sense (mutant, GFP+) and their respective antisense clones (repaired, mCherry+). The contribution of GFP- or mCherry-expressing cells to the endothelial lineage and non-endothelial tissues was assessed by cytometry (Extended Data Fig. 10b). Mutated clones of Myst3, Gja1 and Grin2b showed a decreased contribution to IB4+ vasculature (Fig. 4a and Extended Data Fig. 10b). By contrast, mutant clones that showed increased vessel density in vitro exhibited a greater capacity to form blood vessels in vivo than their antisense sister clones (Fig. 4a). These data were confirmed using in situ blood-vessel analysis of mixed teratomas (Fig. 4b). To test whether the identified genes modulate angiogenesis via specification of tip cell fate, we performed mosaic tip cell competition assays (Extended Data Fig. 10c, d). As a positive control, we assessed genetic modulation of the Notch1 pathway. As expected, mCherry+ (sense knockout) Notch1 mutant cells preferentially localized to the tip position compared to GFP+ antisense Notch1-expressing sister clones (Fig. 4c, d). Most clones carrying mutations that increased sprouting activity were significantly enriched at the tip cell position, and vice versa (Fig. 4c–e). Therefore, the identified angiogenesis genes can control the tip cell fate.

Figure 4: In vivo angiogenesis and cell specifications.

a, Quantification of the ability of indicated sense and antisense targeted clones to form blood vessels in teratomas. Data are mean ± s.e.m. of n = 3 independent teratomas and n = 5 AN3-12 control. b, Gja1-sense (GFP+) and Gja1-antisense (mCherry+) sister mES cells were mixed 1:1 to form chimaeric embryoid bodies and were subsequently injected into immunocompromised nu/nu mice. Representative sections to identify IB4-, GFP- and mCherry-expressing cells are shown. Scale bar, 50 μm. c, Notch1 antisense and mCherry-Cre+ Notch1-sense sister mES cells were mixed (1:1 ratio) to generate mosaic blood vessels and analysed for mCherry+ (red) or GFP+ (green) cells at the tip position. Scale bar, 200 μm. d, Relative tip cell position of sister cells with sense and antisense integrations in the indicated genes, determined in chimaeric 3D sprouts. Data are mean ± s.e.m. of a minimum of n = 3 independent experiments. a, d, *P < 0.05, **P < 0.01, ***P < 0.001; two-tailed Student’s t-test. e, Representative image of mosaic blood-vessel sprouts from Gja1 sense and antisense sister clones. Scale bars, 500 μm and 100 μm (insert). f, Intravenous injection of a Gja1 inhibitory peptide (GAP26) into neonatal mice abrogated retinal angiogenesis. At day 5 after birth, retinas were isolated and stained for IB4+ blood vessels. Scale bars, 500 μm (top) and 100 μm (bottom).

We tested whether one of our angiogenesis genes, Gja1, which encodes the gap junction α-1 protein (ref. 34), is involved in physiological vascularization of the mouse retina, which begins at birth and progresses until postnatal day 7 (ref. 35). At postnatal day 6, we observed high expression at the angiogenic front where gap junction α-1 protein localized to endothelial junctions, with the highest intensity in tip cells (Extended Data Fig. 10e). At the vascular plexus, gap junction α-1 protein expression was predominantly detected in perivascular cells, but not in endothelial cells (Extended Data Fig. 10f). Moreover, gap junction α-1 protein was primarily detected at the tip of the developing vascular sprout in 3D blood vessels (Extended Data Fig. 10g). Newborn mice injected intravenously with a gap junction α-1 protein-blocking peptide displayed a delay in vascular network progression and complexity in the retina compared to those injected with a scrambled control peptide (ref. 36) (Fig. 4f). The number of tip cells at the angiogenic front was decreased, and the vascular plexus contained fewer branch points (Fig. 4f). Therefore, gap junction α-1 protein is a key regulator of tip cell fate and physiological angiogenesis in vivo.

In summary, the Haplobank resource contains over 100,000 individually mutagenized and barcoded mES cell lines targeting 16,970 protein-coding genes. The Haplobank complements a collection of 3,396 reversibly targeted genes in a near-haploid human leukaemia cell line (ref. 37). Our proof-of-principle experiments uncovering genes that are required for rhinovirus infection and angiogenesis show the power of the Haplobank in forward and reverse genetic screens, respectively. The strong variability between independent clones revealed the importance of assessing mutant and repaired clones side by side, an approach that addresses an increased demand for rigor and reproducibility (ref. 38). Therefore, clones from the Haplobank combined with the use of reversible technologies enable high-throughput, reproducible, functional annotation of the genome.

Methods

Haploid mES cell cultures

The haploid mES cells used for this study are a feeder-independent clonal derivative of HMSc2 termed AN3-12. Haploid mES cells were grown in standard ES cell medium with serum and LIF. Cell-culture-grade dishes were from Greiner (15-cm Cellstar cell culture dishes, 639160) and NUNC (all other formats, for example, 10-cm dish Nunclon ∆ Surface, 150350; six-well Nunclon ∆ Surface, 140675). The following ES cell medium (ESCM) was used: 450 ml DMEM (Sigma-Aldrich, D1152), 75 ml FCS (Invitrogen), 5.5 ml penicillin–streptomycin (Sigma-Aldrich, P0781), 5.5 ml NEAA (Sigma-Aldrich, M7145), 5.5 ml L-glutamine (Sigma-Aldrich, G7513), 5.5 ml NaPyr (Sigma-Aldrich, S8636), 0.55 ml βME (Merck, 805740, 10 μl βME was diluted in 2.85 ml PBS for a 1,000× stock) and ESGRO (Millipore ESG1107; used according to the manufacturer’s instructions). For freezing, mES cell clones were expanded and collected from a confluent 10-cm dish by trypsinization; the reaction was stopped with ESCM, and the cells were subsequently centrifuged at 1,200 r.p.m. for 5 min. Supernatants were discarded and cell pellets resuspended in 1 ml PBS. Subsequently, 20 μl of the cell suspension was removed and used to prepare DNA for barcode and integration site PCR. Finally, 7 ml freezing medium was added (one volume ESCM, one volume FCS, 11% DMSO (Sigma-Aldrich, 41648)) and cells were frozen in quadruplicates in 2-ml cryovials. Enzymatic mycoplasma tests were performed on a weekly basis. For purification and maintenance, haploid cells were trypsinized, incubated in 10 μg ml−1 Hoechst33342 (Sigma-Aldrich, B2261) for 30–40 min at 37 °C and sorted by FACS using a BD AriaIII equipped with a near-UV laser. For flow cytometric analysis, cells were trypsinized and analysed using a FACS BD LSRFortessa (BD Biosciences) equipped with a high-throughput sampler. Data analysis was performed using FACS Diva and FlowJo. To generate single-cell-derived colonies, 200 cells per 10-cm dish were plated and grown for 10–11 days.
Cells were washed once with PBS, and clones were manually picked in 20 μl PBS under a binocular into 96U-well plates (NUNC). The picked colonies were dispersed into single cells by incubation with 5 μl 0.25% trypsin in 20 μl PBS for 6 min at 37 °C. The reaction was stopped with 175 μl ESCM, and the cells were then split into 96F-well plates, or directly into a 24-well dish for further analysis. For robot-assisted picking, colonies were manually transferred in 20 μl PBS or DMEM into U-well plates. After picking, cells were dissociated in a Hamilton Laboratory Star using trypsin as above and plated in 5 replicas using the 96-head pipetting block. Clones were expanded in 96-well plates for three days and frozen in matrix plates with 2D barcodes or in cryogenic vials (Thermo Scientific) and transferred into liquid nitrogen. A step-by-step protocol describing haploid cell culture can be found in ref. 39.
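The ESCM recipe given in this section can be checked with a short calculation. The sketch below uses only the volumes stated in the text; ESGRO is left out because no volume is given for it, and the dictionary keys and function names are illustrative choices rather than part of the protocol.

```python
# Volumes (ml) of the ESCM components as listed in the text.
# ESGRO (LIF) is omitted because the text does not state its volume.
ESCM_ML = {
    "DMEM": 450.0,
    "FCS": 75.0,
    "penicillin-streptomycin": 5.5,
    "NEAA": 5.5,
    "L-glutamine": 5.5,
    "NaPyr": 5.5,
    "bME stock": 0.55,
}


def total_volume_ml(recipe):
    """Sum of all listed component volumes."""
    return sum(recipe.values())


def fraction(recipe, component):
    """Volume fraction of one component in the final medium."""
    return recipe[component] / total_volume_ml(recipe)


def dilution_factor(recipe, component):
    """Fold dilution of a stock solution in the final medium."""
    return total_volume_ml(recipe) / recipe[component]


print(f"total volume: {total_volume_ml(ESCM_ML):.2f} ml")
print(f"FCS: {100 * fraction(ESCM_ML, 'FCS'):.1f}% (v/v)")
print(f"bME stock dilution: about {dilution_factor(ESCM_ML, 'bME stock'):.0f}-fold")
```

Under these assumptions the βME stock is diluted roughly 1,000-fold, which agrees with its description as a 1,000× stock, and FCS makes up a little under 14% of the medium by volume.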

Viral and transposon vectors for haploid mES cell mutagenesis

For retro- and lentiviral library generation, gene-trap viruses carrying a neomycin-resistance cassette were packaged in PlatinumE (Cell Biolabs, used only at early passage after obtaining from vendor, tested for cell identity by antibiotics resistance and capability to package virus) or LentiX (Clontech, used only at early passage after obtaining from vendor, further identified by capability to package virus) cells, respectively. For some experiments, virus was concentrated by centrifugation (25,000 r.p.m., 4 °C, 4 h) and FACS-sorted haploid mES cells were then infected for 8 h in the presence of 2 μg polybrene per ml; 50 μM JQ1 was used to increase infection efficiency. For transposon-based library generation, FACS-sorted haploid mES cells were nucleofected (Amaxa Nucleofector, program A13) using transposons containing gene-trap cassettes carrying a splice acceptor, a RPB1 promoter and a neomycin-resistance gene. After infection for 30 h, selection for gene-trap insertions was carried out using G418 (Gibco) at 0.2 mg ml−1. To estimate the numbers of integrations and therefore library complexity, 10,000 cells (for viruses) or 50,000 cells (for transposons) were plated on 15-cm dishes and selected using G418. For comparison, 1,000 cells were plated but not exposed to G418 selection. On day 10, colonies were counted.
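One way to turn the colony counts described above into a complexity estimate is sketched below. The text does not give the formula that was used, so the calculation is an assumption: the unselected plate provides the plating efficiency, the selected plate provides the fraction of plated cells that form G418-resistant colonies, and their ratio is scaled to the number of mutagenized cells. All names and example numbers are hypothetical.

```python
def estimate_library_complexity(
    selected_colonies,
    selected_cells_plated,
    unselected_colonies,
    unselected_cells_plated,
    total_cells_mutagenized,
):
    """Estimate the number of independent integrations in a library.

    Assumed model (not stated in the source):
    resistant fraction = (selected colonies / cells plated under G418)
                         / plating efficiency without selection
    """
    if unselected_colonies == 0:
        raise ValueError("plating efficiency cannot be estimated from zero colonies")
    plating_efficiency = unselected_colonies / unselected_cells_plated
    resistant_fraction = (selected_colonies / selected_cells_plated) / plating_efficiency
    return resistant_fraction * total_cells_mutagenized


# Hypothetical example: 200 colonies from 10,000 selected cells,
# 500 colonies from 1,000 unselected cells, 1e8 cells mutagenized.
print(estimate_library_complexity(200, 10_000, 500, 1_000, 100_000_000))
```

Correcting for plating efficiency matters because not every plated cell forms a colony even without selection; ignoring it would underestimate the integration frequency.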

4D pooling

In brief, up to one-hundred 96-well plates containing lysed cells were stacked into 10 towers of 10 plates (that is, slices). In the first step, lysates were pooled into master tower plates containing pools of all plates within one tower and a fraction of lysate was separated into jars containing pools of all wells of each plate termed slice. While processing further towers, identical slices were united with previous ones and new master tower plates were generated. In a second run, master tower plates were pooled along rows and columns of 96-well plates representing all wells within one tower. Finally, this resulted in 40 pools representative of 12 columns, 8 rows, 10 slices and 10 towers. Lysates of each well were thus present in four coordinates. Upon DNA purification, the 40 individual pools were processed for inverse PCR (iPCR) adding individual indices to each pool in the PCR step. Subsequently, the pool of all coordinates was sequenced within one lane of an Illumina sequencer. Two independent strategies of iPCR based on different restriction enzymes were used to identify mutations and coordinates. In addition, direct genomic PCR amplifying the internal barcodes of the mutagens was performed with the same experimental indices. For details see below.
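The coordinate scheme can be made concrete with a small sketch. It assumes that plates are numbered 0 to 99 and that plate number maps to tower and slice by integer division, which is an illustrative convention rather than the numbering used in the pipeline; the pool names are likewise hypothetical.

```python
ROWS = tuple("ABCDEFGH")     # 8 rows of a 96-well plate
COLUMNS = tuple(range(1, 13))  # 12 columns of a 96-well plate
TOWERS = 10
PLATES_PER_TOWER = 10        # each plate position within a tower is a "slice"


def pools_for_well(plate, row, column):
    """Return the four pools (tower, slice, row, column) that contain a well."""
    if not 0 <= plate < TOWERS * PLATES_PER_TOWER:
        raise ValueError("plate must be in the range 0..99")
    if row not in ROWS or column not in COLUMNS:
        raise ValueError("row must be A..H and column 1..12")
    tower, slice_ = divmod(plate, PLATES_PER_TOWER)
    return (f"tower_{tower}", f"slice_{slice_}", f"row_{row}", f"column_{column}")


def all_pools():
    """List every pool name: 10 towers + 10 slices + 8 rows + 12 columns."""
    return (
        [f"tower_{t}" for t in range(TOWERS)]
        + [f"slice_{s}" for s in range(PLATES_PER_TOWER)]
        + [f"row_{r}" for r in ROWS]
        + [f"column_{c}" for c in COLUMNS]
    )


print(len(all_pools()))             # 40 pools in total
print(pools_for_well(37, "C", 5))   # the four pools for one example well
```

Because each of the 9,600 wells (100 plates of 96 wells) has a distinct combination of four pools, sequencing only 40 indexed pools is in principle sufficient to assign an integration to a single well.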

Parallel cloning of internal barcodes

All mutagenesis vectors were designed with highly complex barcodes for confirmation of the generated clones. The complex barcode sequences were obtained as ultramers from Integrated DNA Technologies as PAGE Ultramer DNA Oligos and contained 32-bp barcode sequences flanked by BspEI and BstEII restriction sites. Two versions with either strong–weak or weak–strong alternating sequences were cloned into each frame; barcode sequences were amplified by PCR and purified over columns (Qiagen). The gene-trap vector plasmids (GT-MCS, frame 0/1/2) were digested with AgeI and BstEII and dephosphorylated. Fragments were separated by agarose gel electrophoresis, phenol extracted and precipitated with ethanol. Amplified barcode oligos were digested with BspEI and BstEII. Backbones and inserts were ligated with T4 DNA ligase (NEB) and electroporated (BioRad, Gene Pulser II) into appropriate electrocompetent cells. Bacteria were plated and grown overnight and further expanded in liquid culture for 6 h. Plasmid DNA extraction was performed using established protocols (Qiagen). The following barcode PCR primers were used: forward, GTTGATCTGAGCTACTCATCAACGGT; reverse, AAGTTCCTTCTGGTTCTGGCTCTGCT. For each PCR reaction, 20 μl was analysed on a 2% agarose gel and PCR products were purified for sequencing using Illustra ExoStar 1-step kit (GE Healthcare). For Sanger sequencing, the barcode PCR-reverse primer was used.
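To illustrate what a strong–weak alternating barcode could look like, the sketch below generates and classifies 32-nt sequences. It assumes that 'strong' means G or C and 'weak' means A or T (the standard IUPAC S and W classes) and that the alternation is per nucleotide; the text does not spell out either point, and the function names are hypothetical.

```python
import random

STRONG = "GC"  # IUPAC code S
WEAK = "AT"    # IUPAC code W


def random_barcode(length=32, start_strong=True, rng=None):
    """Draw a barcode that alternates strong and weak bases."""
    rng = rng or random.Random()
    bases = []
    for i in range(length):
        use_strong = (i % 2 == 0) == start_strong
        bases.append(rng.choice(STRONG if use_strong else WEAK))
    return "".join(bases)


def barcode_pattern(barcode):
    """Return 'SW' or 'WS' for an alternating barcode, otherwise None."""
    if not barcode:
        return None
    classes = "".join(
        "S" if b in STRONG else "W" if b in WEAK else "?" for b in barcode.upper()
    )
    for name in ("SW", "WS"):
        if classes == (name * len(classes))[: len(classes)]:
            return name
    return None


example = random_barcode(rng=random.Random(0))
print(example, barcode_pattern(example))
```

With two possible bases at each of 32 positions, each version of such a design would allow 2^32 (about 4.3 billion) distinct barcodes, and a read that breaks the alternation can be flagged as a likely sequencing or synthesis error.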

Mapping of genomic integrations by next-generation sequencing

Enzyme 1 (E1) was used to fragment the genome. Because the recognition sequence for E1 is also present adjacent to the barcode of the gene-trap vector, it is possible to retrieve the exact integration site of the gene-trap cassette within the genome by circularizing E1-digested genomic DNA (gDNA) and subsequent amplification of the genomic region by iPCR using primers ‘US’ and ‘DS’. To improve iPCR efficiency, a linearization step using enzyme 2 (E2) was introduced, which re-opens the rings generated previously. Two mapping strategies, each using a different E1 enzyme, were implemented for each mutagenesis system. For protocol details and sequence information, please also see https://www.haplobank.at/. For retroviruses, E1 were NlaIII and MseI; E2 was SbfI; and the mapping strategy was 5′. For Tol2, E1 were NlaIII and TaqI; E2 was PacI; and the mapping strategy was 3′. For lentiviruses, E1 were NlaIII and TaqI; E2 was PacI; and the mapping strategy was 5′. The following Illumina iPCR primer sequences were used: DS, AATGATACGGCGACCACCGAGATCTACACGAGCCAGAACCAGAAGGAACTTGAC; US, CAAGCAGAAGACGGCATACGAGATINDEXGTGACTGGAGTTCAGACGTGTGCTCTTC; INDEX indicates the custom barcode of 4–8 bp.
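The INDEX placeholder in the US primer can be filled in programmatically when ordering one primer per pool. The sketch below takes the US sequence from the text and substitutes a 4–8-bp index; the example index and the function name are hypothetical, and the orientation in which the index is read during sequencing is not addressed.

```python
# US primer as given in the text, with INDEX marking the custom barcode.
US_TEMPLATE = "CAAGCAGAAGACGGCATACGAGATINDEXGTGACTGGAGTTCAGACGTGTGCTCTTC"


def indexed_us_primer(index):
    """Insert a 4-8 bp index (A/C/G/T only) into the US primer template."""
    index = index.upper()
    if not 4 <= len(index) <= 8:
        raise ValueError("index must be 4-8 bp long")
    if set(index) - set("ACGT"):
        raise ValueError("index must contain only A, C, G and T")
    return US_TEMPLATE.replace("INDEX", index)


# Hypothetical 6-bp index for one pool.
print(indexed_us_primer("ACGTAC"))
```

Validating length and alphabet up front avoids ordering primers from a mistyped index list.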

For genomic DNA preparation, cell pellets were lysed in lysis buffer (10 mM Tris-HCl pH 8.0, 5 mM EDTA, 100 mM NaCl, 1% SDS, 1 mg ml−1 proteinase K) and incubated at 55 °C overnight. RNase A (Qiagen, 100 mg ml−1) was added at a ratio of 1:1,000. After incubation for 1 h at 37 °C, two phenol:chloroform:isoamyl alcohol extractions and a chloroform:isoamyl alcohol extraction, using phase-lock tubes (5Prime), were performed. Samples were then incubated overnight at the digestion-enzyme-specific temperatures. Restriction digests were purified using a PCR Purification Kit (Qiagen) and ring-ligated using T4 DNA ligase (Roche Applied Science). The ring-ligation reaction was incubated overnight at 16 °C, heat-inactivated at 65 °C for 15 min and linearized by adding 1 μl E2 for 2 h at 37 °C. The digest was again purified using a PCR Purification Kit (Qiagen). The eluate was then used for iPCR. A 20 μl sample was analysed on an agarose gel and the remaining PCR products were purified using a Gel Extraction Kit (Qiagen). Next-generation sequencing (NGS) was performed on an Illumina HiSeq2500 sequencer according to the manufacturers’ protocols. Sequencing primers used for the first read as well as the experimental indices in barcode–PCR were custom-made, other primers were standard Illumina primers: retrovirus 1:1 mix (owing to alternative processing of viral long terminal repeats (LTRs)), GAGTGATTGACTACCCGTCAGCGGGGGTCTTTCA and TGAGTGATTGACTACCCACGACGGGGGTCTTTCA; Tol2, CACTTGAGTAAAATTTTTGAGTACTTTTTACACCTCTG; lentivirus, CAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCA.

To additionally confirm the genomic integration site of the gene-trap cassettes, gene-specific primers flanking the mapped locus were paired with mutagen-specific primers to confirm integration and the absence of the wild-type allele. Integration-site-specific primers were designed using Clone Manager 9 software (Sci-Ed Software). For each integration site, four PCR reactions with respective primer combinations (integration-site-specific forward and reverse primer and integration-site-specific forward or reverse with mutagen-specific forward or reverse primer) were performed. The following mutagen specific primers were used: retrovirus-INT-fwd, CCAGAACCAGAAGGAACTTGC; retrovirus-INT-rev, TACAGACGCAGGCGCATAACAC; lentivirus-INT-fwd, GCCAGAACCAGAAGGAACTTGC; lentivirus-INT-rev, AGAGCTCCTCTGGTTTCCCTTTC; Tol2-INT-fwd, GAGCCAGAACCAGAAGGAACTG; Tol2-INT-rev, CCGGGCAATGGATTGATATTGG.

4D deconvolution of NGS reads

On the basis of the restriction enzyme used for genome digestion, the first read was cut at the first restriction site if it was present within 50 bp of the mapped sequence. A minimal length of 15 nt was required. The sequence was aligned with Bowtie against the mm10 genome (ref. 40). A maximum of 1 mismatch (-v 1) was allowed. Only unique alignments (-m 1) in the best strata were reported. The second read reading out the internal barcode was trimmed after 32 nt. The alignment/internal barcode pairs were assigned to a defined well via the experimental indices. A valid assignment required a minimal count of three per coordinate and a minimal fraction of 0.75 over alternative coordinates. The internal barcodes were merged with 4D pooling data if their count was higher than 20. This strategy was also used to verify the existing coordinates and uncover missing coordinates and/or additional integrations. Mapped insertions were intersected with the annotation (Ensembl) of all gene features, including the intergenic region, using BEDtools (ref. 41). In silico insertions were randomly sampled from the mouse genome and aligned back to the genome.
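The assignment rule stated above (at least three reads per coordinate and a fraction of at least 0.75 over alternative coordinates) can be sketched as follows. The interpretation is an assumption: within each of the four dimensions, the best-supported coordinate must reach both thresholds relative to all coordinates observed in that dimension. The data layout and function names are hypothetical.

```python
MIN_COUNT = 3        # minimal read count per coordinate (from the text)
MIN_FRACTION = 0.75  # minimal fraction over alternative coordinates (from the text)
DIMENSIONS = ("tower", "slice", "row", "column")


def call_coordinate(counts):
    """Pick the coordinate within one dimension, or None if support is too weak.

    counts: dict mapping coordinate name -> read count for one integration.
    """
    if not counts:
        return None
    best, best_count = max(counts.items(), key=lambda item: item[1])
    total = sum(counts.values())
    if best_count < MIN_COUNT or best_count / total < MIN_FRACTION:
        return None
    return best


def assign_well(counts_by_dimension):
    """Assign an integration to a well only if all four dimensions are resolved."""
    well = {}
    for dimension in DIMENSIONS:
        coordinate = call_coordinate(counts_by_dimension.get(dimension, {}))
        if coordinate is None:
            return None
        well[dimension] = coordinate
    return well


# Hypothetical read counts for one integration/barcode pair.
example = {
    "tower": {"T3": 12, "T4": 1},
    "slice": {"S7": 9},
    "row": {"C": 15, "D": 2},
    "column": {"5": 8, "6": 1},
}
print(assign_well(example))
```

Requiring all four dimensions to pass means that an integration present in several wells, or one with low read support, stays unassigned rather than being placed in the wrong well.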

Chromosome spreads and FISH

Metaphase chromosomes were collected according to standard protocols. For multiplex-fluorescence in situ hybridization (M-FISH), a mouse 21-colour painting probe was generated following the pooling strategy2. The M-FISH probe was denatured at 65 °C for 10 min before being applied onto the denatured slides. The hybridization area was sealed with a 22 × 22-mm coverslip and rubber cement. Hybridization was carried out in a 37 °C incubator over two nights. The post-hybridization washes included a 5-min stringent wash in 0.5× SSC at 75 °C, followed by a 5-min rinse in 2× SSC containing 0.05% Tween20 (VWR) and a 2-min rinse in 1× PBS, both at room temperature. Finally, slides were mounted with SlowFade Gold mounting solution containing 4′6-diamidino-2-phenylindole (DAPI, Invitrogen). Images were visualized on a Zeiss AxioImager D1 fluorescent microscope equipped with narrow band-pass filters for DAPI, DEAC, FITC, CY3, Texas red and CY5 fluorescence and an ORCA-EA CCD camera (Hamamatsu). M-FISH digital images were captured using the SmartCapture software (Digital Scientific UK) and processed using the SmartType Karyotyper software (Digital Scientific UK). Approximately 200 metaphases from AN3-12 were counted for the diploid versus haploid frequency and 10 well-spread metaphases were fully karyotyped by M-FISH and DAPI-banding pattern42.

Copy number variation (CNV) analysis

Discriminative coverage analysis was performed to identify differences between genotypes in the number of copies of a genomic region as previously described (PMID: 22136931). In brief, reads from 129, AN3-12 and C57BL/6J samples were aligned against the mm10 C57BL/6J mouse reference genome using Bowtie v.1.2 with parameters -v 3 -m 1 --best --strata. Coverage in non-overlapping 10-kb windows over the whole genome was calculated using Bedtools (v.2.26.0) makewindows and multicov (PMID: 20110278). Read counts were scaled by library size, and for each sample windows and extreme outlier counts were removed.
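The windowed coverage calculation can be illustrated in plain Python. This is a simplified sketch and not the Bedtools-based procedure used in the study: it assigns each read to a window by its start position only, whereas Bedtools multicov counts reads overlapping each interval, and it expresses the library-size scaling as counts per million reads, which is an assumption.

```python
from collections import Counter

WINDOW_BP = 10_000  # non-overlapping 10-kb windows, as in the text


def window_counts(read_starts, window_bp=WINDOW_BP):
    """Count reads per (chromosome, window index).

    read_starts: iterable of (chromosome, 0-based start position).
    """
    return Counter((chrom, pos // window_bp) for chrom, pos in read_starts)


def scale_by_library_size(counts, per=1_000_000):
    """Scale raw window counts to counts per `per` reads in the library."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no reads")
    return {window: count * per / total for window, count in counts.items()}


# Hypothetical read start positions.
reads = [("chr1", 5), ("chr1", 9_999), ("chr1", 10_000), ("chr2", 25_000)]
raw = window_counts(reads)
print(raw)
print(scale_by_library_size(raw))
```

Scaling by library size makes window counts comparable between samples sequenced to different depths, which is needed before copy-number differences between genotypes can be read off.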

SNP mapping

Variants were called with GATK v.3.7-0 (PMID: 20644199) following established best practices. In brief, reads were aligned to the mm10 C57BL/6J mouse reference genome using BWA mem (BWA v.0.7.12), duplicate reads were marked using Picard v.2.6.0, reads were locally realigned around indels, base quality recalibration was performed using mm10 dbSNP as a training set, variants were called using HaplotypeCaller with ploidy set to one for AN3-12 and two for C57BL/6J and 129 samples, and variant filtration was applied according to GATK guidelines. For each sample, homozygous SNPs were counted in non-overlapping 10-kb windows and counts were plotted to estimate the distribution of SNP densities on the mm10 reference genome.

RNA sequencing

Reads were screened for ribosomal RNA by aligning with BWA (v.0.6.1; ref. 43) against known rRNA sequences (RefSeq). The rRNA-subtracted reads were aligned with TopHat (v.1.4.1; ref. 44) against the Mus musculus genome (mm10); a maximum of six mismatches were allowed. Maximum multihits were set to 1, and InDels as well as Microexon-search were enabled. Additionally, a gene model was provided as GTF (UCSC RefSeq mm10). rRNA loci were masked on the genome for downstream analysis. Aligned reads were subjected to FPKM measurements with Cufflinks (v.1.3.0; refs 37, 45). Furthermore, only those fragments compatible with UCSC RefSeq annotation (mm10) of genes with at least one protein-coding transcript were allowed and counted towards the number of mapped hits used in the FPKM denominator. The mean FPKM was calculated over the replicates.
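For readers unfamiliar with the unit, the sketch below shows the basic FPKM definition (fragments per kilobase of transcript per million mapped fragments) and the averaging over replicates. It is a simplification: Cufflinks estimates abundances with a statistical model and effective transcript lengths, so its values will differ from this direct calculation. The function names and example numbers are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Basic FPKM: fragments * 1e9 / (gene length in bp * total mapped fragments)."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and total mapped fragments must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def mean_fpkm(replicates):
    """Average FPKM over replicates.

    replicates: list of (fragments, gene_length_bp, total_mapped_fragments).
    """
    if not replicates:
        raise ValueError("at least one replicate is required")
    values = [fpkm(*replicate) for replicate in replicates]
    return sum(values) / len(values)


# Hypothetical gene of 2 kb in two libraries of 10 million mapped fragments.
print(mean_fpkm([(200, 2_000, 10_000_000), (400, 2_000, 10_000_000)]))
```

In the analysis described above, the denominator would count only fragments compatible with protein-coding gene annotation, so `total_mapped_fragments` here stands for that restricted total.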

Chromatin immunoprecipitation

ChIP was essentially performed as described previously (ref. 46). In brief, after trypsinization, 25 million haploid mES cells were washed once in PBS before fixation for 7 min at room temperature by addition of formaldehyde to a final concentration of 1%. Crosslinking was quenched by addition of 2.5 M glycine (0.125 M final concentration) and cells were then incubated on ice. Crosslinked cells were spun at 600g for 5 min and nuclei prepared by consecutive washes with rinse 1 buffer (final: 50 mM HEPES pH 8.0, 140 mM NaCl, 10% glycerol, 0.5% NP40, 0.25% Triton X-100, 1 mM EDTA) followed by rinse 2 buffer (final: 10 mM Tris pH 8.0, 1 mM EDTA, 0.5 mM EGTA, 200 mM NaCl). Nuclei were resuspended in 2 ml total volume of sonication buffer (0.1% SDS, 1 mM EDTA pH 8, 10 mM Tris HCl, pH 8 with protease inhibitor complete mini (Roche)) and then sonicated with a Covaris E22 sonicator (Covaris settings: 5% duty cycle, PIP 140, 200 cycles per burst, 15 min). ChIP was performed at 4 °C in ChIP buffer (final: 50 mM HEPES/KOH pH 7.5, 300 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% DOC, 0.1% SDS) with the indicated antibody. Precipitated chromatin was purified using Dynabeads Protein G (ThermoFisher, 10003D).

ATAC-seq

ATAC-seq was performed as reported previously (ref. 47). In brief, 50,000 cells were washed in cold PBS before gentle resuspension in 50 μl of cold lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Igepal CA-630). For the transposition reaction, nuclei were collected by immediate centrifugation for 15 min at 4 °C. The supernatant was removed and nuclei were suspended in 50 μl of transposition reaction mixture (25 μl 2× TD buffer (Nextera kit), 2.5 μl TDE1 (Nextera Tn5 Transposase), 22.5 μl nuclease-free water). The transposition reaction was performed at 37 °C for 30 min followed by DNA purification using the Qiagen MinElute Kit.

5′ RACE

TRIzol was used to isolate 10 μg of total RNA from mES cells (ThermoFisher Scientific), followed by additional purification and DNase treatment with the Qiagen RNeasy mini kit. PolyA mRNA was enriched with dT(25)-Dynabeads (ThermoFisher Scientific) according to the manufacturer’s recommendations. Purified mRNA was reverse transcribed with SuperScript II polymerase (Invitrogen) in the presence of random hexamer primers using the following incubation protocol: 25 °C for 10 min, 42 °C for 50 min, 50 °C for 10 min, 70 °C for 15 min. cDNA from the reaction was purified with the MinElute reaction clean-up kit (Qiagen) and tailed with dCTPs using Terminal Transferase (NEB). One-tenth of the reaction was used for the first round of PCR with dG-anchor-primer/g.s.-outer-primer: 96 °C for 2 min, 23 rounds of (96 °C for 15 s; 60 °C for 15 s; 68 °C for 3 min), 72 °C for 10 min. Reaction products were diluted 10× and used for the second round of PCR with dG-sec-anchor-primer/g.s.-inner-primer with the following incubation protocol: 96 °C for 2 min, 23 rounds of (96 °C for 15 s; 63 °C for 15 s; 72 °C for 3 min), 72 °C for 10 min. Reaction products were purified with the MinElute clean-up kit and cloned into a pJet1.2 vector (ThermoFisher Scientific). Bacterial colonies were picked and sequenced with β-geo_R4 primer. The following primers were used: g.s.-inner-primer, GTATCGGCCTCAGGAAGATCG; g.s.-outer-primer, GCATCGTAACCGTGCATCTG; dG-anchor-primer, CTACTACTACTAGGCCACGCGTCGACTAGTACGGGGGGGGGG; dG-sec-anchor-primer, CTACTACTACTAGGCCACGCGTCGACTAGTAC; β-geo_R4, TCAGGCTGCGCAACTGTTGG.

Differentiation of parthenogenetic mES cell clones

mES cells were dissociated using trypsin and embryoid body formation was induced using the hanging drop technique (1,000 cells per 20 μl drop) in ESCM without LIF. After three days, the embryoid bodies were collected and transferred to untreated Petri dishes for further differentiation. On day 5 and day 12, embryoid bodies were collected and RNA was isolated using TRIzol (Invitrogen) according to the manufacturer’s protocol. Subsequently, 1 μg RNA was reverse transcribed into cDNA using the iScript cDNA Synthesis Kit (Bio-Rad) and qRT–PCR was performed with the GoTaq qPCR Master Mix (Promega, A6001) using a Bio-Rad CFX384 Connect Real-Time PCR Detection System. The following primers (5′ to 3′) were used for qRT–PCR: Oct4, fwd CCTACAGCAGATCACTCACATCGCC, rev CCTGTAGCCTCATACTCTTCTCGTTGG; Klf4, fwd GTGCCCCGACTAACCGTTG, rev GTCGTTGAACTCCTCGGTCT; Nanog, fwd CAGGAGTTTGAGGGTAGCTC, rev CGGTTCATCATGGTACAGTC; keratin 18, fwd TTGCCGCCGATGACTTTAGA, rev GGATGTCGCTCTCCACAGAC; Nkx2-5, fwd CCCAAGTGCTCTCCTGCTTT, rev TTTTTATCCGCCCGAGGGTC; Sox17, fwd GAATCCAACCAGCCCACTGA, rev TAGGGAAGACCCATCTCGGG.

To assess protein expression of pluripotency genes during differentiation, the haploid mES cells were cultured in ESCM without LIF in the presence of 500 nM all-trans-retinoic acid (Sigma-Aldrich, R2625) and the medium was replaced every other day. At indicated time points the cells were fixed in 4% PFA and stained using antibodies against Oct4 (BD Bioscience, 611202), Nanog (R&D Systems, AF2729) and Sox2 (R&D Systems, AF2018), counterstained with DAPI and imaged with a Zeiss LSM 780.

DNA content analysis and growth curves

Haploid AN3-12 cells were routinely passaged and fixed at indicated time points in ice-cold 85% ethanol and stored at −20 °C. For analysis of DNA content, cells were stained with 10 μg ml−1 Hoechst33342 (Sigma-Aldrich, B2261) for 30 min on ice, washed and analysed on a BD Fortessa. Viable cells were counted using a Countess II FL (ThermoFisher Scientific) according to the manufacturer’s protocol.

Keratinocyte differentiation

Embryoid bodies from 2,000 cells were generated using the hanging drop technique. After three days, 30–50 embryoid bodies were transferred to an ultra-low-attachment six-well plate and stimulated with 1 μM retinoic acid for three more days. Next, embryoid bodies were transferred onto collagen-IV-coated dishes in ESCM without LIF medium supplemented with 25 ng ml−1 hBMP-4 (R&D Systems). After three more days, the cells were incubated with CnT-07 keratinocyte medium (CELLnTEC) and cultured for six more days. The keratin14-positive cell population was enriched by using the rapid adherence to Collagen IV: trypsinized cells (0.25% trypsin-EDTA) were plated onto Collagen-IV-coated wells and non-adherent cells were washed off after a 15-min incubation at room temperature.

Reversion of gene-trap cassettes

To reverse gene-trap cassettes, mES cells were either stably infected or transiently transfected with a Cre- or Flpo-expressing plasmid also containing a fluorescent marker, for example, eGFP or DsRed, using Lipofectamine 2000 (Invitrogen), Amaxa nucleofector (program A-13) or viral delivery as outlined above. If single-cell-derived clones were needed, cells were FACS-sorted for mCherry+ or GFP+ cells 24–48 h after lipofection/nucleofection and 1,000 cells were then seeded on 15-cm tissue culture dishes. On day 10 after sorting, colonies were picked and used for further analysis. For stable integration of Cre and the colour marker upon infection, PlatinumE cells (Cell Biolabs) were transfected with the respective plasmids and mES cells were subsequently infected. After 30 h of infection, cells were puromycin selected for a further 72 h. As controls, GFP-puro-infected cells were selected and used. Inversion of the splice acceptor was determined by PCR on genomic DNA. The PCR was performed with three primers on mES cell lysates. Depending on the orientation, either a fragment binding the first forward primer or a fragment binding the inverse forward primer was amplified. Upon first inversion, PCR fragments are larger in length. The second inversion results in a size reduction below the original orientation. The following specific PCR primers were used for Retro-, Lenti-, Tol2-EGT: EGT first fwd, CGACCTCGAGTACCACCACACT; EGT inv fwd, AAACGACGGGATCCGCCATGTCA; EGT com rev, TATCCAGCCCTCACTCCTTCTCT. Expected bands for Retro-, Lenti-, Tol2-EGT are: EGT first fwd/EGT com rev: 343 bp; EGT inv fwd/EGT com rev: 443 bp. The PCR primers for Tol2-pA were: Tol2 first fwd, TGGGTTCAAGCGATTCTCCTGCCTCA; Tol2 inv fwd, AGATAGGCACCCAGGGTGATGCAAGCTC; Tol2 com rev, CCGATCCATCCATCGCATATTTGGGA. Expected bands for Tol2-pA: polyA trap first fwd/Tol2 com rev: 326 bp; polyA trap inv fwd/Tol2 com rev: 439 bp.
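As an illustration of how the three-primer PCR readout can be interpreted, the following Python sketch maps an observed band size to the splice-acceptor orientation using the expected amplicon sizes quoted above. The function name, the dictionary layout and the sizing tolerance are hypothetical. The size of the product after a second inversion is not stated in the text, so such bands are simply reported as unclassified here.

```python
# Expected amplicon sizes (bp) quoted in the text for the three-primer PCR.
EXPECTED_BANDS = {
    "EGT": {"original": 343, "inverted": 443},
    "Tol2-pA": {"original": 326, "inverted": 439},
}


def call_orientation(vector, band_bp, tolerance_bp=20):
    """Classify a gel band as the original or inverted orientation.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    """
    for orientation, expected in EXPECTED_BANDS[vector].items():
        if abs(band_bp - expected) <= tolerance_bp:
            return orientation
    return "unclassified"


print(call_orientation("EGT", 440))      # inverted
print(call_orientation("Tol2-pA", 330))  # original
print(call_orientation("EGT", 250))      # unclassified
```

Because the original and inverted products differ by roughly 100 bp for both vector families, a tolerance of a few tens of base pairs is enough to separate them on a standard agarose gel.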

RV-A1a infections

For virus production, HeLa cells at a confluency of 70–80% were washed with PBS and infected with RV-A1a in infection medium (DMEM, 2% heat-inactivated FCS, 1% penicillin–streptomycin, 1% l-glutamine, 30 mM MgCl2). After an incubation for 24 h, the virus was released from the cells by three freeze/thawing cycles and subsequently cellular debris was removed by centrifugation and filtration (4,000 r.p.m. for 10 min at 4 °C). For the genome-wide screen, retrovirally mutagenized haploid mES cells were seeded in 15-cm dishes (1 × 10⁷ cells per dish), and 5 h after seeding, cells were infected with RV-A1a at a multiplicity of infection (MOI) of 5 in a 1:1 mixture of HeLa infection medium:ESCM (MgCl2 and LIF were additionally added to balance the concentrations). For control mock infections, the HeLa infection medium/ESCM mixture was used. On day 21 after infection, cells were pooled and further processed for isolation of genomic DNA or frozen. Validation in mES cells was done using clones from the Haplobank (Ldlr: clone 1 is 238F02, clone 2 is 374A09, clone 3 is 1031A04; Pla2g16: clone 1 is 371H01, clone 2 is 588G11, clone 3 is 917E06). HEK293T cells from an institute resource were tested for neomycin resistance to confirm cell identity. They were infected with GFP, empty guide and Cas9-containing control, or puro-selection marker, guide and Cas9-containing control vectors. Subsequent to selection, CRISPR–Cas9-edited cells were mixed with GFP-positive cells. These cell mixtures were monitored in the presence and absence of RV-A1a for shifts in ratios of GFP-expressing cells using FACS (BD LSRFortessa and FlowJo). Arachidonic acid, acetylsalicylic acid, ibuprofen, indomethacin, diclofenac and celecoxib (all from Sigma-Aldrich) in DMSO were added to the cells 2 h before RV-A1a infection at the indicated concentrations. 
For interaction studies with Pla2g16 and Ldlr in mES cells, the following drugs were used: 1 mM acetylsalicylic acid, 10 μM arachidonic acid, 910 nM celecoxib, 380 nM diclofenac, 0.1% DMSO (vehicle), 72 μM ibuprofen, 10 μM indomethacin.

The following guide sequences were used for mutagenesis: GFP sg, GAGCTGGACGGCGACGTAAA; LDLR sg1, TCAGACCGGGACTGCTTGGA; LDLR sg2, CTGTTGCACTGGAAGCTGGC; LDLR sg3, GCTGTTGCACTGGAAGCTGG; LDLR sg4, GGAGCTGTTGCACTGGAAGC; PLA2G16 sg1, GAAGGAATTGCTGTATGATG; PLA2G16 sg2, CCTGCAGCAAAATCATCCAG; PLA2G16 sg3, CTATGTTGGCGATGGATATG; PLA2G16 sg4, CGCTGGATGATTTTGCTGCA; Pla2g16 sgRNA1, TTGCTTCTGTTTCTTGTTTC; Pla2g16 sgRNA2, CTGAATGACTGCCCAGTTTT.

Blood-vessel sprouting assays

mES cells were trypsinized and seeded at 9,600 cells per embryoid body in a 96-well low-attachment plate (Sumitomo Bakelite, Prime Surface-U). After incubation for 14 days in ESCM without LIF in the presence of puromycin, the embryoid bodies were washed in ESCM without LIF, embedded in 3D collagen I gels and stimulated with 50 ng ml−1 VEGFA (in-house production) as previously described32. The first vascular outgrowth was observed after around seven days and capillary-like networks were analysed two weeks after initial embedding of the embryoid bodies into the collagen I matrix. For tip cell competition assays, mosaic embryoid bodies were generated from GFP-positive and mCherry-Cre-positive haploid mES cells. In brief, haploid mES cells were infected with viral supernatants collected from Platinum-E cells expressing GFP-Puro or mCherry-Cre-Puro. 48 h after infection the cells were treated with 10 μg ml−1 puromycin (Invivogen) to select for infected cells. Infected cells were FACS-sorted for the 5% highest GFP/mCherry-expressing cells, constantly kept under puromycin selection pressure to avoid reporter silencing, and then mixed at a 1:1 ratio to generate mosaic embryoid bodies. The following clones from the Haplobank were used to generate blood vessel organoids: Myst3 00235|A10; Myst3 00564|C06; Prkcz 00355|E11; Syt16 00160|B12; Tesc 00279|B08; Gfod1 00284|F11; Enpp3 00455|E06; Enpp3 00584|E08; Enpp3 00535|F07; Grin2b 00335|G12; Grin2b 00858|G05; Plcb1 00362|H01; Plcb1 00367|H09; Ndufs4 00318|D02; Ndufs4 10132|G01; Gja1 00345|A03; Tfpi 00281|H03; F2r 00351|E05; Pcsk6 00200|D09; Abcg2 00385|H12; Prkch 00102|C03; Igfr1 00535|B11; Entpd1 00588|B04; F3 00451|B01; Igf1 00455|D06; Plaur 00582|C11; Hckl 00449|E02; Mapt 00569|D06; Dlg2 00339|A01; Tnfrsf1a 00112|H07; Smarca1 00182|D04; Cdh13 00255|G04; Ets2 00341|E06; Tgfbr2 00583|D04; Gabrb3 00578|B04; Abcb1a 00500|A06; Bag3 00338|G02.

Immunofluorescence

To visualize the vascular outgrowths, blood vessels were fixed with 4% PFA for 20 min at room temperature and blocked with 3% FBS, 1% BSA, 0.5% Triton X-100 and 0.5% Tween-20 for 2 h. Biotinylated GSL Isolectin B4 (IB4; Vectorlabs 1201), anti-CD31 (BD Pharmingen AB9498) and anti-collagen IV (Millipore AB765P) were applied overnight at 4 °C. After washing with PBS-T (0.05% Tween-20), the organoids were incubated with streptavidin-Alexa488, anti-rat-Alexa488 or anti-rabbit-Alexa647 secondary antibodies. Before mounting with DAKO mounting medium and analysis under a stereomicroscope, organoids were stained with 1 μg ml−1 DAPI solution to image nuclei. A Zeiss LSM 780 was used to image the entire vascular outgrowth of every single organoid using the tile-scanning option and recording of z stacks.

Confirmation of Notch1 gene-trap flipping by PCR and western blot

DNA of stably infected GFP or mCherry-Cre cells was isolated using phenol:chloroform:isoamyl alcohol (25:24:1) (Sigma-Aldrich) and Phase Lock Gel Heavy tubes (5Prime). The following primer sequences were used for PCR: F1st, CTTCTGAGACGGAAAGAACCAGC; Finv, AAACGACGGGATCCGCCATGTCA; Rev, TATCCAGCCCTCACTCCTTCTCT. The pair F1st+Rev amplifies a fragment of the original, unflipped gene trap and Finv+Rev amplifies the Cre-recombined, flipped gene-trap fragment. Western blotting was performed using standard protocols. Blotted membranes were blocked in 3% BSA and subsequently incubated with antibodies to detect cleaved Notch1 (CST D3B8) and β-actin (Sigma-Aldrich A5316) overnight. After washing in TBS-T (0.1% Tween-20) and incubating with HRP-linked secondary antibodies, blots were incubated in ECL solution (Pierce) for visualization.

Teratoma assays

mES cells were dissociated and resuspended in growth-factor-reduced Matrigel (Corning, 356231) and kept on ice. Subsequently, 10⁶ cells were injected into both flanks of 8–12-week-old MF1 nu/nu mice. Both female and male mice were used. For generation of a chimaeric vasculature, 2 × 10⁶ mES cells were cultured in the presence of puromycin to avoid reporter silencing, in 15-cm Petri dishes and mosaic embryoid bodies (GFP:mCherry 1:1) were generated. After eight days of culture, the embryoid bodies were washed, resuspended in 200 μl growth-factor-reduced Matrigel (BD 356231), and injected subcutaneously into the flanks of immunocompromised MF1 nu/nu mice using an 18G needle. Teratomas were analysed 3–4 weeks after injection. To this end, teratomas were either processed for immunohistochemistry or cells were isolated for FACS analysis. H&E slides were reviewed with a Zeiss Axioskop 2 MOT microscope (Carl Zeiss Microscopy) and subsequently digitized with the Pannoramic Flash III (Adimec-Q-12-A-180Fc camera) automated slide scanner (3D Histech). For immunohistochemistry, teratomas were fixed with 4% PFA overnight and subsequently incubated in 20% sucrose for around 24 h before embedding into OCT and processing the tissue to 60–80-μm thick sections using a cryostat. For visualization, sections were incubated with anti-GFP (Abcam, AB111258), anti-mCherry (Abcam, AB125096) and biotinylated-IB4 (Vectorlabs, 1201) overnight. After washing and incubation with anti-goat-Alexa488, anti-mouse-Alexa555 and streptavidin-Alexa633, respectively, the specimens were imaged using a Zeiss LSM780 microscope. For FACS analysis, teratomas were cut into 1 mm pieces and subjected to enzymatic digestion (collagenase type IV 300 U ml−1, Worthington; dispase 0.25 U ml−1, Gibco; DNase 7.5 μg ml−1, Qiagen). Following a 1-h incubation at 37 °C on a shaker, the cell suspension was pipetted through a 70-μm cell strainer, washed once with DMEM before centrifugation at 400g for 10 min. 
Single cells were incubated for 45 min with an anti-mouse VE-Cadherin-APC antibody (eBioscience 17-1441-80) on ice. A 10-min DAPI wash was performed to exclude dead cells from the FACS analysis. All animal studies were approved according to Austrian and EU legislation.

Retina analyses

Retinas were collected and prepared from mice as previously described35. The gap junction α-1 protein inhibitory peptide (GAP26, VCYDKSFPISHVR) and the scrambled control peptide (sGAP26, YSIVCKPHVFDRS) were synthesized by PSL GmbH and injected intravenously in neonatal C57BL/6 mice as previously described36, 48 at 2 μg per mouse. Whole-mount retinas were immunostained for IB4 (Vectorlabs, 1201) or gap junction α-1 protein (Abcam, AB11370) expression, incubated with the appropriate secondary antibodies (see above) and imaged using a LSM 780 microscope.

Statistics

All values are mean ± s.e.m. or mean ± s.d. Comparisons between groups were made by Student’s t-test or two-way ANOVA using GraphPad Prism (GraphPad Software) or R statistical software. No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. P < 0.05 was accepted as statistically significant.
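For readers who want to reproduce the summary statistics outside Prism or R, the following Python sketch shows how mean, s.d., s.e.m. and a two-sample Student's t statistic are computed. It is an illustration under stated assumptions rather than the authors' analysis code: the function names and the example values are hypothetical, the pooled-variance (equal variance) form of the test is assumed, and converting the t statistic into a P value still requires a t distribution, for example from the packages named above.

```python
from math import sqrt
from statistics import mean, stdev, variance


def summarize(values):
    """Return (mean, sample s.d., s.e.m.) for one group of measurements."""
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return mean(values), sd, sd / sqrt(len(values))


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / dof
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, dof


# Hypothetical measurements from two groups.
control = [1, 2, 3, 4, 5]
treated = [2, 3, 4, 5, 6]
print(summarize(control))
print(student_t(control, treated))
```

Reporting s.e.m. versus s.d. changes the error bars by a factor of the square root of n, so the choice should be stated for each figure.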

Data availability

All genomic data have been deposited in the NCBI Gene Expression Omnibus and are accessible through GEO accession number GSE84090. All material, protocols and cell lines can be obtained from the website https://www.haplobank.at/ as well as via the Protocol Exchange39.

References

  1. Cahan, P. & Daley, G. Q. Origins and implications of pluripotent stem cell variability and heterogeneity. Nat. Rev. Mol. Cell Biol. 14, 357–368 (2013)
  2. Hou, Y. et al. Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas. Cell Res. 26, 304–319 (2016)
  3. Begley, C. G. & Ellis, L. M. Drug development: raise standards for preclinical cancer research. Nature 483, 531–533 (2012)
  4. Justice, M. J., Noveroske, J. K., Weber, J. S., Zheng, B. & Bradley, A. Mouse ENU mutagenesis. Hum. Mol. Genet. 8, 1955–1963 (1999)
  5. Robertson, E., Bradley, A., Kuehn, M. & Evans, M. Germ-line transmission of genes introduced into cultured pluripotential cells by retroviral vector. Nature 323, 445–448 (1986)
  6. Elbashir, S. M. et al. Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411, 494–498 (2001)
  7. Brummelkamp, T. R., Bernards, R. & Agami, R. A system for stable expression of short interfering RNAs in mammalian cells. Science 296, 550–553 (2002)
  8. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012)
  9. Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR–Cas nucleases in human cells. Nat. Biotechnol. 31, 822–826 (2013)
  10. Morgens, D. W., Deans, R. M., Li, A. & Bassik, M. C. Systematic comparison of CRISPR/Cas9 and RNAi screens for essential genes. Nat. Biotechnol. 34, 634–636 (2016)
  11. Evers, B. et al. CRISPR knockout screening outperforms shRNA and CRISPRi in identifying essential genes. Nat. Biotechnol. 34, 631–633 (2016)
  12. Rouhani, F. et al. Genetic background drives transcriptional variation in human induced pluripotent stem cells. PLoS Genet. 10, e1004432 (2014)
  13. Fedorov, L. M., Haegel-Kronenberger, H. & Hirchenhain, J. A comparison of the germline potential of differently aged ES cell lines and their transfected descendants. Transgenic Res. 6, 223–231 (1997)
  14. Echeverri, C. J. et al. Minimizing the risk of reporting false positives in large-scale RNAi screens. Nat. Methods 3, 777–779 (2006)
  15. Schnütgen, F. et al. Genomewide production of multipurpose alleles for the functional analysis of the mouse genome. Proc. Natl Acad. Sci. USA 102, 7221–7226 (2005)
  16. Schnütgen, F. et al. Enhanced gene trapping in mouse embryonic stem cells. Nucleic Acids Res. 36, e133 (2008)
  17. Mayasari, N. I. et al. Mixture of differentially tagged Tol2 transposons accelerates conditional disruption of a broad spectrum of genes in mouse embryonic stem cells. Nucleic Acids Res. 40, e97 (2012)
  18. Ivics, Z., Hackett, P. B., Plasterk, R. H. & Izsvák, Z. Molecular reconstruction of sleeping beauty, a Tc1-like transposon from fish, and its transposition in human cells. Cell 91, 501–510 (1997)
  19. Bellen, H. J. et al. The Drosophila gene disruption project: progress using transposons with distinctive site specificities. Genetics 188, 731–743 (2011)
  20. Ivics, Z. et al. Transposon-mediated genome manipulation in vertebrates. Nat. Methods 6, 415–422 (2009)
  21. Torres, M. et al. An α-E-catenin gene trap mutation defines its function in preimplantation development. Proc. Natl Acad. Sci. USA 94, 901–906 (1997)
  22. Reithmayer, M., Reischl, A., Snyers, L. & Blaas, D. Species-specific receptor recognition by a minor-group human rhinovirus (HRV): HRV serotype 1A distinguishes between the murine and the human low-density lipoprotein receptor. J. Virol. 76, 6957–6965 (2002)
  23. Hofer, F. et al. Members of the low density lipoprotein receptor family mediate cell entry of a minor-group common cold virus. Proc. Natl Acad. Sci. USA 91, 1839–1842 (1994)
  24. Jaworski, K. et al. AdPLA ablation increases lipolysis and prevents obesity induced by high-fat feeding or leptin deficiency. Nat. Med. 15, 159–168 (2009)
  25. Duncan, R. E., Sarkadi-Nagy, E., Jaworski, K., Ahmadian, M. & Sul, H. S. Identification and functional characterization of adipose-specific phospholipase A2 (AdPLA). J. Biol. Chem. 283, 25428–25436 (2008)
  26. Tsutsumi, S. et al. Endoplasmic reticulum stress response is involved in nonsteroidal anti-inflammatory drug-induced apoptosis. Cell Death Differ. 11, 1009–1016 (2004)
  27. Uyama, T. et al. Interaction of phospholipase A/acyltransferase-3 with Pex19p: a possible involvement in the down-regulation of peroxisomes. J. Biol. Chem. 290, 17520–17534 (2015)
  28. Staring, J. et al. PLA2G16 represents a switch between entry and clearance of Picornaviridae. Nature 541, 412–416 (2017)
  29. Potente, M., Gerhardt, H. & Carmeliet, P. Basic and therapeutic aspects of angiogenesis. Cell 146, 873–887 (2011)
  30. del Toro, R. et al. Identification and functional analysis of endothelial tip cell-enriched genes. Blood 116, 4025–4033 (2010)
  31. Strasser, G. A., Kaminker, J. S. & Tessier-Lavigne, M. Microarray analysis of retinal endothelial tip cells identifies CXCR4 as a mediator of tip cell morphology and branching. Blood 115, 5102–5110 (2010)
  32. Jakobsson, L. et al. Endothelial cells dynamically compete for the tip cell position during angiogenic sprouting. Nat. Cell Biol. 12, 943–953 (2010)
  33. Hellström, M. et al. Dll4 signalling through Notch1 regulates formation of tip cells during angiogenesis. Nature 445, 776–780 (2007)
  34. Valdimarsson, G., De Sousa, P. A., Beyer, E. C., Paul, D. L. & Kidder, G. M. Zygotic expression of the connexin43 gene supplies subunits for gap junction assembly during mouse preimplantation development. Mol. Reprod. Dev. 30, 18–26 (1991)
  35. Pitulescu, M. E., Schmidt, I., Benedito, R. & Adams, R. H. Inducible gene targeting in the neonatal vasculature and analysis of retinal angiogenesis in mice. Nat. Protoc. 5, 1518–1534 (2010)
  36. Chaytor, A. T., Martin, P. E., Edwards, D. H. & Griffith, T. M. Gap junctional communication underpins EDHF-type relaxations evoked by ACh in the rat hepatic artery. Am. J. Physiol. Heart Circ. Physiol. 280, H2441–H2450 (2001)
  37. Trapnell, C. et al. Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010)
  38. Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016)
  39. Elling, U. et al. Haplobank methods collection. Protoc. Exch. http://dx.doi.org/10.1038/protex.2017.104 (2017)
  40. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009)
  41. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010)
  42. Jentsch, I., Adler, I. D., Carter, N. P. & Speicher, M. R. Karyotyping mouse chromosomes by multiplex-FISH (M-FISH). Chromosome Res. 9, 211–214 (2001)
  43. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
  44. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009)
  45. Roberts, A., Trapnell, C., Donaghey, J., Rinn, J. L. & Pachter, L. Improving RNA-seq expression estimates by correcting for fragment bias. Genome Biol. 12, R22 (2011)
  46. Mohn, F. et al. Lineage-specific polycomb targets and de novo DNA methylation define restriction and potential of neuronal progenitors. Mol. Cell 30, 755–766 (2008)
  47. Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21.29.1–21.29.9 (2015)
  48. Gombash Lampe, S. E., Kaspar, B. K. & Foust, K. D. Intravenous injections in neonatal mice. J. Vis. Exp. 93, e52037 (2014)
  49. Golczak, M. et al. Structural basis for the acyltransferase activity of lecithin:retinol acyltransferase-like proteins. J. Biol. Chem. 287, 23790–23807 (2012)

Acknowledgements

We thank all members of our laboratories, IMBA/IMP and VBCF services for support and Life Science Editors for assistance; B. Knapp, I. Filipuzzi and T. Aust for clone picking, N. R. Movva and T. Bouwmeester (NIBR) for support, and K. Handler for the differentiation protocols. The Haplobank is funded by the Austrian National Bank (OeNB), an Advanced ERC grant and Era of Hope/National Coalition against Breast Cancer/DoD (to J.M.P.). U.E. is a Wittgenstein Prize fellow. D.B. is supported by FWF P23308-B13. A.S. is supported by an ERC Consolidator Grant, Boehringer Ingelheim and FFG.

Author information

  1. These authors contributed equally to this work.

    • Ulrich Elling &
    • Reiner A. Wimmer

Affiliations

  1. Institute of Molecular Biotechnology of the Austrian Academy of Science (IMBA), Vienna Biocenter (VBC), Dr. Bohr Gasse 3, Vienna, Austria

    • Ulrich Elling,
    • Reiner A. Wimmer,
    • Andreas Leibbrandt,
    • Thomas Burkard,
    • Georg Michlits,
    • Alexandra Leopoldi,
    • Dana Abdeen,
    • Sergei Zhuk,
    • Cornelia Handl,
    • Julia Liebergesell,
    • Maria Hubmann,
    • Anna-Maria Husa,
    • Manuela Kinzer,
    • Nicole Schuller,
    • Ellen Wetzel,
    • Nina van de Loo,
    • Jorge Arturo Zepeda Martinez,
    • Chukwuma A. Agu,
    • Oliver Bell &
    • Josef M. Penninger
  2. Vienna Biocenter Core Facilities, Vienna Biocenter (VBC), Dr. Bohr Gasse 3, Vienna, Austria

    • Thomas Micheler
  3. MRC Laboratory for Molecular Cell Biology and Institute for the Physics of Living Systems, University College London, London, UK

    • Irene M. Aspalter
  4. Novartis Institutes for BioMedical Research, Basel, Switzerland

    • David Estoppey,
    • Ralph Riedl &
    • Dominic Hoepfner
  5. Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK

    • Fengtang Yang &
    • Beiyuan Fu
  6. Max F. Perutz Laboratories, Medical University of Vienna, Dr. Bohr Gasse 9, Vienna, Austria

    • Thomas Dechat &
    • Dieter Blaas
  7. Paul Ehrlich Institut, Paul Ehrlich Strasse 51–59, 63225 Langen, Germany

    • Zoltán Ivics
  8. Max-Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany

    • Holger Gerhardt
  9. German Center for Cardiovascular Research, Berlin, Germany

    • Holger Gerhardt
  10. Berlin Institute of Health, Berlin, Germany

    • Holger Gerhardt
  11. Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Dr. Bohr Gasse 7, 1030 Vienna, Austria

    • Alexander Stark

Contributions

U.E. generated the haploid library with technical support from A.Lei., C.H., J.L., M.H., A.-M.H., M.K., N.S., E.W., N.v.d.L., D.H., R.R. and D.E. U.E., R.A.W. and A.Leo. characterized cell lines. A.Lei., G.M., U.E., D.B. and T.D. performed rhinovirus work. A.S., T.B. and T.M. wrote the bioinformatics algorithms and set up the Haplobank website. S.Z. performed RACE experiments, F.Y. and B.F. performed karyotyping experiments and C.A.A. supported standardization. J.A.Z.M. and O.B. performed ATAC-sequencing. Z.I. advised on mutagenesis vectors. R.A.W., I.M.A., D.A., A.Leo. and H.G. performed blood vessel experiments. U.E. and J.M.P. coordinated the project.

Competing financial interests

The authors declare no competing financial interests.

Reviewer Information Nature thanks S. Narumiya and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Figures

  1. Extended Data Figure 1: Stem cell properties of the haploid subclone AN3-12.

    a, b, Various parthenogenetic cell lines derived from independent embryos from an outcross of 129/Sv and C57BL/6 and thus containing different genomic backgrounds for different chromosomes were allowed to form embryoid bodies by placing 1,000 cells per hanging drop. We observed downregulation of pluripotency marker genes (a) and upregulation of markers from all three germ layers (b) in all cell lines assayed on day 0 (d0), day 5 (d5) and day 12 (d12). The HMSc2 subclone AN3-12 was chosen for further study based on its growth properties in serum/LIF and absence of feeders. Data are shown as individual data points of n = 2 technical replicates together with the mean ± s.d. of one representative experiment. c, Growth curve of AN3-12 in the presence and absence of LIF. Data are shown as individual data points and mean values (lines) of three biological replicates. d, FACS analysis of chromosome content of AN3-12 cells (in LIF, same experiment as shown in c) shows the decrease in haploid (1n) cells from 35.5% to 24.9% during the seven-day culture period. e, AN3-12 cells, cultured as in c, maintain a robust haploid population when analysed on day 17 in ESCM despite rapid proliferation. f, Differentiation of AN3-12 cells into keratinocytes resulted in a near-complete loss of haploid cells among the keratin 14 (K14)-positive population; mES cells stained with anti-K14 are shown as a negative specificity control (grey curve in the K14 histogram). g, Immunostaining of AN3-12 cells cultured in ESCM as well as time course of removal of LIF with addition of 500 nM retinoic acid (analysed on the indicated days) shows downregulation of pluripotency markers Oct4, Nanog and Sox2. DAPI is shown as a nuclear counterstain. Scale bars, 50 μm. h, Histological examination of teratomas analysed 25 days after subcutaneous injection of 10⁶ cells. All three germ layers were present in six analysed teratomas; representative H&E images are shown. Magnifications are indicated in each panel.

  2. Extended Data Figure 2: Analysis of genome integrity.

    a, M-FISH karyotypic analysis was performed on parental mouse haploid cells (AN3-12) to evaluate genomic stability. Randomly selected metaphases were karyotyped and examined by M-FISH and DAPI banding. Approximately 200 metaphases from AN3-12 were counted for the diploid versus haploid frequency and 10 well-spread metaphases were fully karyotyped by M-FISH and DAPI-banding pattern. Images of normal female diploid and haploid karyotypes (19, X) are shown. Images were captured on a Zeiss AxioImager D1 fluorescent microscope equipped with narrow band-pass filters for DAPI, DEAC, FITC, CY3, Texas red and CY5. b, CNV analysis of haploid AN3-12 cells by genome sequencing using Illumina HiSeq2500. Mapped reads were analysed relative to male genomes of parental mouse strains C57BL/6J and 129/Sv, respectively; the quotient relative to the closer parental strain is shown. As expected, the X chromosome is overrepresented whereas the Y chromosome is absent. Regions of detected variation are highlighted with red boxes and are shown in c. Chromosome numbers are indicated. c, In AN3-12 haploid mES cells three very small deletions (on chromosomes 2, 10 and 12) and one duplication (on chromosome 13) were detectable as highlighted. d, Chromosomal distribution of SNP densities for in-house 129/Sv and AN3-12 mES cells relative to in-house C57BL/6 are shown. Numbers of SNPs were calculated for all non-overlapping 100-kb windows across the mm10 C57BL/6J mouse reference genome. SNP density in AN3-12 shows regions of high and low number of SNPs relative to the C57BL/6J genome, as expected for a haploid cell line derived from an F1 female between 129/Sv and C57BL/6.
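The windowed SNP count described in panel d can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used for the figure: the function name, the tuple input format and the example positions are hypothetical, and positions are assumed to be 1-based.

```python
from collections import Counter

WINDOW_BP = 100_000  # non-overlapping 100-kb windows, as in the legend


def snp_density(snps, window_bp=WINDOW_BP):
    """Count SNPs per (chromosome, window index).

    snps: iterable of (chromosome, 1-based position) tuples.
    Window 0 covers positions 1..window_bp, window 1 the next block, and so on.
    """
    counts = Counter()
    for chrom, pos in snps:
        counts[(chrom, (pos - 1) // window_bp)] += 1
    return counts


# Hypothetical SNP calls relative to the reference genome.
calls = [("chr1", 1), ("chr1", 100_000), ("chr1", 100_001), ("chr2", 250_000)]
density = snp_density(calls)
print(density[("chr1", 0)], density[("chr1", 1)], density[("chr2", 2)])  # 2 1 1
```

In a cell line derived from an F1 animal, windows inherited from the reference strain should show counts near zero, whereas windows inherited from the other parent should show high counts, which is the pattern the legend describes.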

  3. Extended Data Figure 3: Molecular characterization of mutagenesis vectors.

    a, Schematic illustration of the universal NGS strategy. Optimized primer-binding sites compatible with Illumina sequencing and two restriction enzymes with four base-pair recognition sites were placed adjacent to the terminal elements (LTR, TR). An internal barcode of 32 bases with alternating weak and strong bases was inserted in a parallel cloning step. b, For mapping of integration sites, genomic DNA was amplified by iPCR to introduce adaptor sequences and the experimental index for NGS. Paired-end sequencing maps the genomic integration in the first read using a custom primer, the experimental index as well as the internal barcode using standard Illumina primers binding to the integrated complementary sequence. Barcode (BC) PCR was performed on genomic DNA. c, Meta-analysis of mutagen integrations around transcriptional start sites (TSS) (excluding the precise TSS). In particular Tol2 and retrovirus show a preference to integrate in proximity to the TSS. Retroviruses also frequently integrate into the promoter regions, whereas lentiviral integrations are typically located within the entire gene body. IPKM, insertions per kilobase per million. The vectors used are described in the legend of Fig. 1. d, Distribution of integration sites. Binning the number of integrations in genic and 2-kb upstream regions per 10-kb windows illustrates pronounced cold spots of mutagenesis using retroviral mutagenesis, where one can observe bins devoid of integrations. e, Genomic region surrounding the Gapdh locus exemplifying the distributions of integrations. While retroviral integrations strongly cluster, Tol2 displays a more uniform distribution of integration sites. Tracks are + strand (top) and − strand (bottom) integration sites. Bar lengths indicate NGS read numbers, subsequent to iPCR. f, Heat map illustrating overlap of epigenetic histone marks with integrations of the indicated mutagens, normalized to peak size. 
Only retrovirus and Tol2 integrations strongly correlate with DNA accessibility determined by ATAC-seq and active marks such as H3K4me3 and H3K27ac. In silico mutagenesis is shown as a control.
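
The IPKM unit used in panel c can be illustrated with a minimal sketch. The legend only expands the abbreviation; the formula below assumes IPKM is computed by analogy to RPKM (insertions in a region, divided by region length in kilobases and by total mapped insertions in millions). The function name and the example numbers are hypothetical and are not taken from the paper.

```python
def ipkm(insertions_in_gene: int, gene_length_bp: int, total_insertions: int) -> float:
    """Insertions per kilobase per million mapped insertions.

    Assumption: an RPKM-style normalization; the source legend does not
    spell out the exact formula.
    """
    if gene_length_bp <= 0 or total_insertions <= 0:
        raise ValueError("gene length and total insertions must be positive")
    per_kilobase = insertions_in_gene / (gene_length_bp / 1_000)
    return per_kilobase / (total_insertions / 1_000_000)


# Hypothetical example: 50 insertions in a 20-kb gene, 2 million insertions overall.
example_ipkm = ipkm(insertions_in_gene=50, gene_length_bp=20_000, total_insertions=2_000_000)
```

Normalizing by both length and library size is what makes values comparable across genes of different sizes and across mutagenesis systems with different numbers of mapped insertions.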

  4. Extended Data Figure 4: Insertional preferences and generation of the mutant mES cell library. (689 KB)

    a, Correlation between integration probabilities (IPKM, insertions per kilobase per million) and expression level (mean log2(FPKM)). Strongest correlation is seen for lentiviral constructs as well as Retro-GT without osteopontin-enhancer elements. All mutagenesis vectors are described in Fig. 1 and Methods. b, 5′ RACE on a set of pooled clones with confirmed antisense integration sites revealed multiple spurious transcription initiation sites in the intronic part of the gene-trap vector around the lox site, but we failed to detect spliced transcripts. Transcriptional initiation within the lox5171 site is highlighted. The red-labelled sequence marks the polyG stretch used for 5′ tailing. c, Intersection of integration sites of the indicated mutagenesis vectors (see Fig. 1) with genomic features. Coding sequences (CDS), 5′ and 3′ untranslated regions (5′ UTR and 3′ UTR), 1st intron, all other introns, excluding the first intron (intron), non-coding exons (ncExon), upstream regions (defined as 2-kb upstream of TSS) and intergenic regions are indicated. Mutagenesis by piggyBac transposons as well as in silico random mutagenesis and ATAC-seq results are shown for comparison. d, Schematic workflow for generation of the mutant haploid mES cell library. Single-cell-derived clones were manually picked 10 or 11 days after seeding, expanded in 96-well plates, and either frozen in quadruplicate or further processed for mapping of the integration sites. e, Schematic illustration of the first step of 4D pooling. Each plate was pooled into the respective slice tray as well as a master plate, uniting identical well coordinates of all plates. f, Schematic illustration of the second step of 4D pooling. Each master plate was pooled into a master tower pool, a plate with lamella uniting columns, and a plate with lamella uniting rows, thereby generating pools for rows and columns over all samples. 
g, 4-Dimensional pooling of 9,600 clones in 8 rows, 12 columns, 10 slices and 10 towers resulting in 40 pools. After iPCR to introduce experimental indices, pools were combined and deep sequenced. Amplification of internal barcodes confirmed clonal identity and mapping in 4 dimensions. All mapped clones were deposited in the Haplobank (https://www.haplobank.at/).
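
The arithmetic behind the 4D pooling in panels e to g can be sketched as follows. This is an illustrative model only: it assumes each clone has a unique address (row, column, slice, tower) and that its barcode is detected in exactly one pool per dimension. The pool naming scheme and function names are hypothetical and do not describe the authors' actual pipeline.

```python
# Pool sizes per dimension, as stated in the legend (8 rows, 12 columns, 10 slices, 10 towers).
DIMS = {"row": 8, "column": 12, "slice": 10, "tower": 10}


def pools_for_clone(address: dict) -> set:
    """Return the four pool names (one per dimension) that contain this clone."""
    return {f"{dim}{address[dim]}" for dim in DIMS}


def decode(pool_hits: set) -> dict:
    """Recover a clone address from the pools in which its barcode was detected."""
    address = {}
    for dim, size in DIMS.items():
        hits = [i for i in range(size) if f"{dim}{i}" in pool_hits]
        if len(hits) != 1:
            # Zero or several hits in one dimension means the barcode cannot be placed uniquely.
            raise ValueError(f"ambiguous or missing {dim} pool: {hits}")
        address[dim] = hits[0]
    return address


n_pools = sum(DIMS.values())  # pools to sequence
n_clones = 1
for size in DIMS.values():
    n_clones *= size  # addressable clones

clone = {"row": 3, "column": 11, "slice": 0, "tower": 9}
recovered = decode(pools_for_clone(clone))
```

The point of the design is that the number of pools grows with the sum of the dimension sizes while the number of addressable clones grows with their product, so 40 sequenced pools suffice to place 9,600 clones, provided that each barcode is unique to one clone.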

  5. Extended Data Figure 5: Numbers of independent gene trap clones and intragenic distribution. (343 KB)

    a, Numbers of independent available cell lines, carrying a single integration per cell, per gene. For about 37% (RefSeq) to 38% (Ensembl) of genes targeted, there is one gene-trap clone available (5′ UTR, intron or coding sequence), whereas about 18% of genes are targeted in two independent clones, and for around 43% of genes three or more independent clones are available. b, 24.8% (RefSeq) to 26.8% (Ensembl) of genes are represented by a single cell line if one takes all clones into account and about 40% of genes are hit in three or more clones. c, Separation of all gene traps combined into biotypes in single-integration clones of the Haplobank. Antisense and intergenic insertions are observed in all systems, in particular for enhanced gene-trap vectors. d, To map the integration sites of our clones from the Haplobank to the open reading frames (ORFs) of the respective genes, we dissected ORFs into 5% intervals and annotated integration sites in introns and exons relative to the position within the ORF. All mutagenesis systems (see Fig. 1) show a strong bias towards transcript truncation proximal to the 5′ end of the ORFs and are thus predicted to result in loss-of-function alleles. We defined integrations in the first 50% of the coding sequence (green bars) as optimal for a gene-trap allele; these clones are highlighted by a yellow star on the Haplobank homepage.
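
The 5% interval annotation in panel d can be sketched as below. This is a minimal illustration under stated assumptions: the ORF is given as a half-open genomic interval [orf_start, orf_end) that includes introns, and position is measured from the 5′ end on the gene's strand. The function names and coordinates are hypothetical; the source does not describe the exact implementation.

```python
def orf_bin(insertion_pos: int, orf_start: int, orf_end: int, strand: str = "+") -> int:
    """Return the 5% interval (0 to 19) of the ORF in which an insertion falls."""
    length = orf_end - orf_start
    if length <= 0 or not (orf_start <= insertion_pos < orf_end):
        raise ValueError("insertion must lie within a non-empty ORF")
    # Distance from the 5' end of the ORF, respecting gene orientation.
    if strand == "+":
        offset = insertion_pos - orf_start
    else:
        offset = orf_end - 1 - insertion_pos
    return offset * 20 // length


def is_optimal(bin_index: int) -> bool:
    """Integrations in the first 50% of the ORF (bins 0 to 9) count as optimal."""
    return bin_index < 10


early = orf_bin(1_100, 1_000, 3_000)            # 5% into the ORF on the + strand
late = orf_bin(2_000, 1_000, 3_000)             # exactly halfway
minus_start = orf_bin(2_999, 1_000, 3_000, "-") # first base of a - strand ORF
```

Under these assumptions, an insertion exactly halfway through the ORF falls into bin 10 and is therefore not classified as optimal.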

  6. Extended Data Figure 6: Interaction of Pla2g16 with Cox inhibitors in mES cells. (529 KB)

    a, Titration series of the indicated Cox inhibitors in the presence and absence of rhinovirus (RV-A1a) in mES cells. No protective effect of inhibition of prostaglandin biosynthesis was detected at non-toxic concentrations. Because mES cells do not generate infectious RV-A1a efficiently, conditioned supernatant containing RV-A1a was added daily. Data are shown as individual data points and mean values. b, mES cell clones from the Haplobank that had mutations in Ldlr and Pla2g16, respectively, were mixed as sister cells in sense (red) and antisense (green) orientation labelled by GFP and mCherry. Subsequently, cells were cultured in the presence and absence of rhinovirus RV-A1a for four days and ratios of red and green cells were then quantified using FACS. Selection pressure for loss of Ldlr and Pla2g16 was not affected by inhibition of Cox. Data are shown as mean ± s.d. of three biological replicates.

  7. Extended Data Figure 7: Interactions of PLA2G16 with Cox inhibitors in human HEK293T cells and domain mapping. (770 KB)

    a, RV-A1a exposure causes cell death in HEK293T cells in a dose-dependent manner. Cell viability was quantified three days after infection using Alamar blue. Data are shown as individual data points of five biological replicates and mean values. b, Titration series for ibuprofen and indomethacin treatment in the presence and absence of rhinovirus (RV-A1a) in HEK293T cells. Protective effects of ibuprofen and indomethacin were detected at a high drug concentration. Cell viability was quantified 2.5 days after infection using Alamar blue. Data are shown as individual data points of 4 biological replicates and mean values. c, Competitive growth assays in HEK293T cells. Cells containing sgRNAs targeting PLA2G16 did not show a growth difference in the absence of RV-A1a or when treated with indomethacin at 100 μM, but were significantly enriched when challenged with RV-A1a, indicating preferential survival. By contrast, control-guide-treated cells did not show growth advantages at any experimental condition. Data are shown as individual data points and mean ± s.d. of biological triplicates analysed on days 0, 3, 6 and 10 after RV-A1a exposure. d, Scheme of Pla2g16 domains. The enzymatic centre of Pla2g16 is located in the cytoplasm (green); an α helix in the transmembrane domain (yellow) connects the core to a short vesicular domain (blue), located in endosomes49. e, Design of CRISPR sgRNAs targeting the mRNA regions encoding the vesicular domain of seven amino acids (sgRNA1) and the 3′ UTR (sgRNA2) in haploid mES cells to test essentiality of these domains in RV-A1a infections. f, Cells carrying sgRNA1 showed editing in only 1 out of 12 cases, but upon selection with RV-A1a were enriched for deletions within the vesicular domain. 
For sgRNA2, all mapped deletions in control cells only affected the 3′ UTR, where the expected Cas9 cuts occur; upon RV-A1a exposure, the majority of observed deletions affected the transmembrane domain, the vesicular domain and, in some cases, even extended into the cytoplasmic region. Colour codes: grey, deletion; red, alternative reading frame and insertions.

  8. Extended Data Figure 8: Blood-vessel sprouting in Notch1 mutant mES cells and candidate tip cell genes. (682 KB)

    a, Assessment of four independent Notch1-targeted clones from Haplobank. The locations of the integrations are shown: two antisense (AS) clones marked by green triangles, one sense (S) clone marked by the red triangle, and one clone with an upstream (AS-up) integration (blue triangle). Flipping of the gene traps upon Cre infection is shown by PCR in the left panel. Loss of Notch1 protein (intracellular domain, ICD) expression (clones A4, H7), and re-expression (clone D5) upon Cre recombination are shown by western blot (right panel). β-Actin is shown as a loading control. WT, parental clone without any gene trap integration. Uncropped blots are shown in Supplementary Fig. 1. b, Notch1 inactivation leads to a hypersprouting phenotype. Note the advanced progression and increased density of the vascular networks upon Notch1 deletion (sense clone) compared to antisense sister cells (top, bright field images; bottom, IB4 immunostaining to mark endothelial cells). Scale bars, 500 μm. c, Angiogenic sprouting is not affected when the gene trap is located 1,500 bp upstream of the Notch1 gene (A2 clone). GFP+ and Cre-reverted mCherry+ sister cells were analysed in 3D blood-vessel organoid cultures. Bright field images are shown. Scale bars, 500 μm. d, Differentially expressed genes in endothelial tip cells versus stalk cells from two published datasets30, 31 in the mouse retina were filtered for genes that have also been associated (ingenuity pathway analysis) with candidate genes/pathways for vascular diseases in humans. Scatterplot showing the frequency of independent associations of tip cell genes with various human vascular diseases. Genes available in the Haplobank at the beginning of the project were chosen for functional analysis in the 3D organoids. For most of the listed candidate genes, there were no functional vascular data available. 
e, Quantification of IB4-positive vascular structures from the indicated sister clones carrying sense and repaired antisense integrations. Clones were classified according to their sprouting capacity from low (hyposprouting) to high (hypersprouting). f, Different clones with independent integrations in the same gene showed reproducible phenotypes in sprouting angiogenesis. Vascular outgrowths were stained for endothelial cell-specific IB4 expression, and the number of vessels was counted and normalized to the respective antisense sister clones. e, f, Data are shown as individual data points from a minimum of n = 3 independent experiments for each sense/antisense sister clone combination together with the mean ± s.e.m. *P < 0.05; **P < 0.01; ***P < 0.001; two-tailed Student’s t-test.

  9. Extended Data Figure 9: Sprouting angiogenesis in reversible sister clones. (1,064 KB)

    Representative images of the indicated sense (S) and antisense (AS) sister clones. IB4 was used to mark endothelial cells. GFP or mCherry expression indicates the respective flipped gene traps. Note that some sense clones are GFP+ whereas others are mCherry+; this is owing to the original orientation of the integration in sense or antisense, which was then reverted by the mCherry-Cre-expressing virus. Scale bar, 500 μm. For quantification of data see Fig. 3e.

  10. Extended Data Figure 10: Generation of a chimaeric vasculature in vivo and gap junction α-1 protein localization to tip cells. (903 KB)

    a, Representative fluorescence image of a haploid mES-cell-derived teratoma stained for endothelial cell-specific IB4. Endothelial cells arising from haploid mES cells are positive for mCherry and IB4 (yellow), whereas host endothelial cells are only positive for IB4 and appear green. Scale bars, 50 μm. b, Representative FACS analysis of teratomas following injection of chimaeric embryoid bodies into immunocompromised mice. Myst3 antisense (mCherry+) and sense (GFP+) sister clones were mixed at a 1:1 ratio. VE-cadherin-negative non-endothelial cells were also determined within the teratomas. c, Parental haploid mES cells stably expressing GFP or mCherry-Cre were assessed for their ability to generate IB4+ vascular structures in the presence of VEGFA. The number and ratios of IB4+ vessels per organoid were not apparently different between GFP-expressing and mCherry-Cre-expressing cells. Scale bars, 500 μm. Data are shown as mean ± s.e.m. and individual data points from n = 4 independent experiments. P = 0.207; two-tailed Student’s t-test. d, GFP-expressing and mCherry-Cre-expressing parental haploid mES cells contribute equally to tip cells (49.2% GFP+; 50.8% mCherry-Cre+) in 1:1 mixed mosaic cultures. Data are shown as mean ± s.e.m. and individual data points from n = 4 independent experiments. P = 0.823; two-tailed Student’s t-test. e, Localization of gap junction α-1 protein in the mouse retina at postnatal day 6 (P6). Endothelial cells are marked by IB4 staining. At the angiogenic front, gap junction α-1 protein expression is found in endothelial cells, primarily localized at tip cells (arrows). Scale bars, 50 μm. f, Retinas were stained for gap junction α-1 protein expression and the endothelial marker IB4 to visualize the vascular networks on P6. Note the punctate pattern of gap junction α-1 protein adjacent to the IB4+ vessels, suggestive of gap junction α-1 protein expression in perivascular cells. Scale bars, 50 μm (left) and 20 μm (right). 
g, Gap junction α-1 protein predominantly localizes to the tip cells (arrows) in the 3D blood vessels. Vessels are marked by CD31 immunostaining and counterstained by DAPI. Bar graph indicates percentages of vessels with the highest gap junction α-1 protein expression in the tip cell. Data are shown as individual data points of eight independent embryoid bodies and mean ± s.d. of vessels. Scale bars, 20 μm (right) and 10 μm (left).

Supplementary information

PDF files

  1. Reporting Summary (80 KB)
  2. Supplementary Figures (2 MB)

    This file contains full uncropped scans of DNA gels and Western blots used in Extended Data Figure 8.

  3. Supplementary Table (490 KB)

    This table shows the numbers of clones available with respect to different mutagens, orientation of the inserted gene trap to gene transcription, as well as the number of different genes hit. A gene is defined as the genomic region between the transcriptional start and stop sites. www.haplobank.at

Additional data