Introduction

Both forward and reverse genetics approaches are required to understand the precise etiology of human genetic disorders. Forward genetics identifies the causative gene mutation in patients, while reverse genetics dissects the molecular functions of the causative gene product to uncover the pathological mechanism. Owing to developments in deep-sequencing technology using next-generation sequencers, numerous nucleotide variations associated with human genetic diseases have been revealed and analyzed, allowing efficient identification of the causative mutations [1, 2]. In contrast, reverse genetics in human cells for evaluating the molecular functions of these causative genes was limited before genome editing technology was developed. As human cultured cells generally show low homologous recombination activity, it was too difficult to generate disease-model cultured cells using the conventional method [3]. Engineered endonucleases (EENs), including ZFN, TALEN, and CRISPR/Cas9, increase the efficacy of genome editing through the site-specific activation of DNA repair, making reverse genetics feasible in human cultured cells. Currently, it is practical for most researchers to dissect the molecular functions of causative gene products using edited cultured cell clones. As a therapeutic concept, if genome editing technology-mediated conversion of a mutation in patient cells to the reference sequence rescues disease phenotypes, this has potential implications for the development of associated therapies (Fig. 1a). Further perspectives on genome editing technology-mediated gene therapy have also been reviewed recently [4,5,6].

Fig. 1

Strategies of genome editing technology in human cultured cells. a The available procedures in genome editing technology in medicine. Introduction of a gene mutation of interest into normal cultured cells is used to identify the molecular and cellular pathology for disease modeling. Another potential approach is to convert a mutation to the reference sequence in patient cells for gene therapy. b Error-prone NHEJ-mediated gene knockout. A target sequence is disrupted by an insertion or deletion due to NHEJ. c HDR-mediated introduction of a selectable drug resistance cassette (Drug R) into a target locus for gene knockout. d NHEJ-mediated knock-in of a drug cassette carrying the same Cas9 recognition sequence as the endogenous target. When the Cas9 recognition sequence is in the same orientation in the endogenous target and drug cassette vector, the ends of the targeted vectors contain uncontrolled indels. In contrast, when the orientations of the Cas9 recognition sequence in the endogenous target and drug cassette vector are opposed, the ends are programmable. e A simple SNV knock-in. A CRISPR/Cas9-induced DSB enhances HDR with a 100-nt ssODN repair template to introduce an SNV (asterisk) into a target site. f Site-specific cytidine deamination. Catalytically dead Cas9 protein recruits a cytidine deaminase, AID (or APOBEC1), to specific sites, leading to conversion of C to T. g CORRECT method for SNV knock-in. In the first step, an SNV (asterisk) is introduced along with the second-site mutations (closed circles) required to prevent recleavage by the Cas9 enzyme. A subsequent second step of editing in a similar manner corrects the secondary mutations to leave only the SNV. h Scarless SNV knock-in. A drug-selectable marker cassette is introduced into a target locus along with the SNV, and is subsequently excised by a further round of HDR or by piggyBac transposase.

To obtain a precise understanding of the genetic basis of human diseases, it may be essential to use reverse genetics in human cultured cells. Genome editing technology enables engineering of the variant allele associated with a specific disease in human cultured cells with a uniform genetic background [7, 8]. Phenotypic comparison of such edited cells can then demonstrate how a variant of interest can affect the cellular events that are relevant to a specific pathological condition [9, 10]. If a knock-in variant reproduces disease-associated phenotypes, the causality of the candidate variant is confirmed (Fig. 1a). Here, we address the utility of employing genome editing technology in human cultured cells and discuss to what extent this technology can be applied in the field of human genetics.

Which human cultured cells are optimal for genome editing technology?

As a critical step of genome editing technology in human cultured cells, DNA double-stranded breaks (DSBs) induced by engineered endonucleases are repaired mainly by error-prone nonhomologous end joining (NHEJ) or error-free homology-directed repair (HDR) [11,12,13,14,15,16]. NHEJ, which is active throughout the cell cycle, causes insertions or deletions (indels) of various lengths that can lead to frameshift mutations and, consequently, gene knockout [17]. In contrast, HDR, which works in late S and G2 phases, induces a precise recombination event between a homologous DNA donor and the DSB site, resulting in accurate introduction of the DNA donor into the target site and, consequently, gene knock-in [18]. Subsequently, the edited cells should be isolated and expanded for further functional evaluations. To distinguish genuine effects of the edit from clonal variation and editing artifacts in the edited cell clones, it is necessary to generate several independent edited clones, perform complementation analysis using exogenous expression of wild-type cDNA in the edited cells, or sequence the predicted off-target sites. Taking these points together, it is important to determine the DNA repair and proliferative capacities of the target cultured cells (Table 1).

Table 1 Cellular properties associated with genome editing technology

Primary cells

A variety of human cultured cells have been used for the modeling of human diseases in vitro. Primary cells derived from patients, such as skin fibroblasts and peripheral lymphocytes, are informative for understanding the cellular etiology of genetic disorders. However, the limited proliferative capacity of primary cells is a major obstacle in genome editing technology-mediated reverse genetics [19]. Thus, primary cells are currently not used for genome editing technology-mediated disease modeling. On the other hand, for the application of genome editing technology to gene therapy, primary cells are an essential target. Recently, Howden et al. [20] demonstrated the simultaneous reprogramming and genome editing of primary skin fibroblasts. However, further studies are needed to establish an experimental workflow for genome editing technology in primary cells.

Cancer cell lines

Immortalized human cultured cell lines are useful for genome editing technology. Cancer cell lines are typically spontaneously immortalized cells [21,22,23]. As cancer cells have acquired unlimited proliferative ability via mutations in oncogenes and tumor suppressor genes during carcinogenesis, it is possible to generate immortalized cell lines from various cancer tissues [24]. However, most cancer cell lines show genomic instability, defined as either chromosomal instability (CIN) or microsatellite instability (MIN) [25]. CIN is characterized by aneuploidy due to chromosomal mis-segregation during mitosis [26, 27]. U2OS cells and HeLa cells, which are derived from osteosarcoma and cervical cancer, respectively, are typical examples of CIN [21, 28]. For genome editing, it is somewhat difficult to handle CIN cells because these cells have multiple copies of a target locus. In contrast, MIN is defined by repetitive DNA expansions without aneuploidy [29, 30]. A typical MIN cell line, HCT116, derived from colorectal carcinoma, has a stable karyotype of 45 chromosomes and relatively high activity of homologous recombination [31, 32]. Thus, HCT116 cells are often used for genome editing technology.

Immortalized normal cell lines

Cancer cell lines lose some morphological and biochemical characteristics observed in normal cells. For example, most cancer cell lines have lost extracellular signaling sensor structures, namely, primary cilia, which are hair-like, microtubule-based organelles present on the surface of most normal cells in the quiescent G0 phase [33]. To immortalize normal cells, exogenous expression of adenovirus type 5 E1 gene, simian virus 40 T-antigen, and/or human telomerase (hTERT) is used [34,35,36]. HEK293 cells, which are widely used in genome editing research, are human embryonic kidney cells immortalized by the adenovirus type 5 E1 gene [37]. Although the genomes of HEK293 cells can be readily edited because of the high efficiency with which transgenes can be introduced into them and their relaxed chromatin state, these cells have lost their diploid karyotype and morphological features such as primary cilia and cell-to-cell junctions. In contrast, hTERT-RPE1 cells derived from normal human retinal pigment epithelium retain their original phenotypes of a diploid karyotype and primary cilium formation [38]. However, they also retain the low HDR activity observed in the original tissue [39]. For genome editing in hTERT-immortalized normal cells, we should choose an NHEJ-dependent editing strategy to improve the efficacy of isolation of the edited cell clones [40]. Taking these factors together, when establishing an experimental design, it is important to consider the DNA repair activity of the immortalized normal cells.

Induced pluripotent stem cells

Human pluripotent stem cells (hPSCs) are also important for genome editing technology, since they can continue to divide to form identical cell clones [7, 41, 42]. They can also produce specialized types of cells through differentiation. These properties of hPSCs confer a multitude of possibilities for the modeling of human diseases and the development of unique therapies. Human embryonic stem cells (hESCs), derived from the inner cell mass of the blastocyst during human embryogenesis, were originally referred to as hPSCs [43]. However, the generation of hESCs is inherently associated with challenges in accessing fertilized eggs and ethical issues regarding their use.

To overcome these issues, in 2006, iPS technology emerged to provide a robust approach for generating pluripotent stem cells without the use of embryos [44]. In iPS technology, adult somatic cells into which the four Yamanaka factors, namely, Oct3/4, Sox2, c-Myc, and Klf4, have been introduced can be reprogrammed to acquire stem cell characteristics [44, 45]. Induced pluripotent stem cells (iPSCs) generated from normal individuals have been verified to have a diploid karyotype, hESC-like DNA methylation pattern, and potential to develop into all three germ layers [45]. The intrinsically low single-cell survival rate of iPSCs is a challenge for reverse genetics. The suppression of anoikis by the Rho-kinase inhibitor Y-27632 during the disaggregation of hESC colonies was found to dramatically improve single-cell survival of hESCs, so it can be applied to single-cell cloning of iPSCs [46, 47]. Genome editing technology has been combined with iPSCs to generate knockout or knock-in clones, correct causative mutations, or insert reporter genes [48,49,50,51].

Which genome editing method is used for human genetics studies?

There are numerous strategies for genome editing in human cultured cells. The choice of such a strategy depends on the specific issue that is being addressed (knockout or knock-in) and multiple experimental factors including the types of human cultured cells, EENs, and transfection methods [52]. Therefore, it is necessary to optimize the experimental flow in view of the specific aim. Here, we summarize successful examples of genome editing technology-mediated strategies applied to human genetics studies (Fig. 1b–h, Table 2).

Table 2 Cases of genome editing-mediated disease modeling in human cultured cells

Gene knockout in human cultured cells

Gene knockout is a simple but important approach for modeling a disease. EEN-mediated NHEJ introduces indel mutations into a target locus (Fig. 1b). These indel mutations in protein-coding regions cause frameshifts to generate a null-mutant cell. To date, many disease-model cells generated by this strategy have been reported [53,54,55,56,57]. In the field of human genetics, these null-mutant cells are also used in complementation tests for the candidate variants underlying a genetic disorder that have been identified by forward genetics approaches.
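The relationship between indel length and the reading frame can be illustrated with a short check. The following Python sketch is illustrative only: the function name `classify_indel` and the example alleles are hypothetical, and it assumes that the indel lies entirely within a coding exon and away from splice sites. An indel whose net length change is not a multiple of three shifts the reading frame, whereas an in-frame indel only adds or removes codons.

```python
def classify_indel(ref_allele: str, edited_allele: str) -> str:
    """Classify an edited coding allele by its net length change.

    Assumes both strings cover the same coding interval (hypothetical input).
    """
    delta = len(edited_allele) - len(ref_allele)
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    # A net change that is not a multiple of 3 shifts the reading frame.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{frame} {kind} ({abs(delta)} bp)"


# Hypothetical 21-nt coding interval around a cut site.
REF = "ATGGCTGAAGCACTGCACGCC"
print(classify_indel(REF, REF[:10] + REF[11:]))         # 1-bp deletion
print(classify_indel(REF, REF[:9] + REF[12:]))          # 3-bp deletion
print(classify_indel(REF, REF[:10] + "TT" + REF[10:]))  # 2-bp insertion
```

Clones carrying in-frame indels may retain partial protein function, so in this simplified view only the frameshift class would be a candidate null allele.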

Using a homology arm-tagged targeting vector and EENs, HDR leads to the replacement of a protein-coding exon with a drug resistance cassette for the efficient isolation of a knockout cell (Fig. 1c) [39, 58, 59]. When targeting arms containing flanking constitutive exons with recombinase sites such as loxP sites are used in this strategy, cells with conditional knockout in the genes essential for cell survival can be generated [60, 61]. Moreover, EEN-mediated incorporation of a targeting vector possessing the desired variant in the homology arms into the target locus enables modeling or correction of disease-associated phenotypes in vitro [62, 63].

NHEJ-mediated knock-in using CRISPR-ObLiGaRe or HITI

Genome editing technology has mainly been applied to human cell lines with intrinsic HDR activity that is sufficient for the efficient isolation of genome-edited cells. However, HDR-dependent genome editing is not practical in normal-tissue-derived cell lines and post-mitotic cells including neurons and muscle cells since their HDR activity is inefficient or deficient [64, 65]. For example, we previously generated a microcephaly-associated KIF2A gene knockout hTERT-RPE1 cell line using TALEN and a drug-resistant gene cassette contained in a targeting vector, but the efficacy of isolation of the targeted clones was low, at approximately 1% of drug-resistant clones [39]. Maresca et al. added the ZFN site located in the genome into a drug-resistant gene cassette vector, and cointroduced the ZFN and the targeting vector into human cultured cells to isolate the targeted clones with high efficacy through NHEJ activity (Fig. 1d) [66]. They named this method ObLiGaRe (obligate ligation-gated recombination), based on the Latin verb obligare (“to join to”) [66]. In the ObLiGaRe method, highly efficient transgene knock-in occurs in most human cultured cells, but the orientation of the transgene is not controlled and precise adjustment of junction sequences between a target locus and the transgene is not possible. Recently, we combined the CRISPR/Cas9 system and the ObLiGaRe method to efficiently generate ataxia-telangiectasia-causative ATM gene knockout hTERT-RPE1 cell clones [67]. In this method, clones with biallelic insertion of the targeting vector, which correspond to knockout cells, were rare, at around 5% of the drug-resistant clones, while clones with monoallelic insertion were dominant, at more than 70%. As almost all monoallelic inserted clones carried NHEJ-mediated insertions or deletions at the target locus in the second, uninserted allele, >70% of the drug-resistant clones were indeed knockout cell clones. Thus, CRISPR-ObLiGaRe is an efficient and useful method for generating knockout cell clones.

Notably, a novel NHEJ-mediated site-specific transgene insertion method named homology-independent targeted integration (HITI) has been reported [68]. In CRISPR-ObLiGaRe, a guide RNA (gRNA) target sequence located in the genome is added in the same orientation into the targeting vector [66, 67]. In contrast, the targeting vector for HITI contains the gRNA target sequence in the opposite orientation to the genome [68]. HITI-mediated transgene knock-in occurs preferentially in the forward rather than the reverse direction because the forward-directed transgene knock-in alters the genomic gRNA target sequence to prevent additional CRISPR/Cas9 cutting. Suzuki et al. demonstrated that HITI worked in HEK293 cells and post-mitotic neurons. As a proof of concept of its potential use for gene correction therapy, they also showed that HITI-mediated introduction of the wild-type exon rescued visual function in a rat model of retinitis pigmentosa [68].
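The effect of the orientation of the gRNA target sequence in the donor can be followed with a simple string model. The Python sketch below is not taken from the cited studies. It assumes an SpCas9 cut 3 bp upstream of an NGG PAM, a circular donor carrying a single target site, perfect ligation without indels, and exact-match recognition of the target. All sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
CUT_OFFSET = 17  # assumed SpCas9 cut: after nt 17 of a 20-nt protospacer + 3-nt PAM


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def linearize_donor(target: str, payload: str, same_orientation: bool) -> str:
    """Cut a circular donor (one target site + payload) at the Cas9 cut site."""
    left, right = target[:CUT_OFFSET], target[CUT_OFFSET:]
    if same_orientation:  # site oriented as in the genome (ObLiGaRe-like layout)
        return right + payload + left
    # site reverse-complemented relative to the genome (HITI-like layout)
    return revcomp(left) + payload + revcomp(right)


def integrate(flank5: str, flank3: str, target: str, fragment: str, forward: bool) -> str:
    """Ligate the linear fragment into the cut genomic target in either direction."""
    genome_left = flank5 + target[:CUT_OFFSET]
    genome_right = target[CUT_OFFSET:] + flank3
    insert = fragment if forward else revcomp(fragment)
    return genome_left + insert + genome_right


def recleavable(product: str, target: str) -> bool:
    """True if an intact target site (either strand) remains after integration."""
    return target in product or target in revcomp(product)


# Hypothetical sequences for illustration only.
TARGET = "GCTGAAGCACTGCACGCCGT" + "AGG"  # 20-nt protospacer + NGG PAM
FLANK5, FLANK3 = "TTGACCGATTACGA", "CCATTGAGTCAGTA"
PAYLOAD = "ATGGCCAAGCCTTTGTCTCAAGAAGAATCC"

for same in (True, False):
    fragment = linearize_donor(TARGET, PAYLOAD, same_orientation=same)
    for forward in (True, False):
        product = integrate(FLANK5, FLANK3, TARGET, fragment, forward)
        print(f"same_orientation={same} forward={forward} "
              f"recleavable={recleavable(product, TARGET)}")
```

In this model, the opposite-orientation donor leaves no intact target site after forward insertion, whereas reverse insertion regenerates the site at both junctions and would remain a substrate for further cutting. With the same-orientation donor the situation is reversed, which is consistent with forward insertions being stabilized only after indels have destroyed the regenerated sites.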

ssODN-mediated single nucleotide variation (SNV) knock-in

Numerous SNVs have been identified from the screening of causative mutations in genetic disorders and from genome-wide association studies (GWAS) [1, 2, 69, 70]. To validate the causality of these SNVs, EEN-driven HDR is used to introduce single-stranded DNA oligonucleotides (ssODNs) of 100–200 nt carrying the SNV into the target locus (Fig. 1e) [48, 71]. Generally, the distance of the SNV from the DSB should be minimized for efficient SNV incorporation [72, 73]. This method is routinely performed in early embryos of many animal species [74,75,76]. Although this method has been applied to mutation correction in some patient-derived iPSCs [50, 77, 78], it is not efficient in human cultured cells because of their low HDR activity. Therefore, it is necessary to design experimental procedures for the enrichment of ssODN knock-in cells. To date, sib-selection and transient drug selection methods to achieve this enrichment have been reported [52, 79], but further improvements are needed for practical use.
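Choosing a guide whose cut site lies close to the SNV can be sketched as follows. This Python example is a minimal illustration under stated assumptions: SpCas9 with an NGG PAM and a cut 3 bp upstream of the PAM, exact sequence matching, and no scoring of off-target sites or guide activity. The names `candidate_guides`, `snv_to_cut_distance`, `rank_guides`, and `build_ssodn`, as well as the example sequence, are hypothetical.

```python
import re
from typing import NamedTuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


class Guide(NamedTuple):
    strand: str
    protospacer: str
    cut: int  # the DSB lies between plus-strand indices cut-1 and cut


def candidate_guides(seq: str) -> list[Guide]:
    """List 20-nt protospacers with an NGG PAM on either strand."""
    guides = []
    # Plus strand: protospacer followed by NGG (lookahead permits overlaps).
    for m in re.finditer(r"(?=([ACGT]{20})[ACGT]GG)", seq):
        guides.append(Guide("+", m.group(1), m.start() + 17))
    # Minus strand: appears on the plus strand as CCN followed by 20 nt.
    for m in re.finditer(r"(?=CC[ACGT]([ACGT]{20}))", seq):
        guides.append(Guide("-", revcomp(m.group(1)), m.start() + 6))
    return guides


def snv_to_cut_distance(snv_index: int, cut: int) -> int:
    """Bases separating the SNV from the DSB (0 = directly adjacent)."""
    return cut - snv_index - 1 if snv_index < cut else snv_index - cut


def rank_guides(seq: str, snv_index: int) -> list[Guide]:
    """Order candidate guides by the distance between their cut and the SNV."""
    return sorted(candidate_guides(seq),
                  key=lambda g: snv_to_cut_distance(snv_index, g.cut))


def build_ssodn(seq: str, snv_index: int, alt_base: str, cut: int, length: int = 100) -> str:
    """Plus-strand ssODN of up to `length` nt, roughly centered on the cut."""
    edited = seq[:snv_index] + alt_base + seq[snv_index + 1:]
    start = max(0, cut - length // 2)
    return edited[start:start + length]


# Hypothetical locus: one protospacer with a TGG PAM between AT-rich flanks.
LOCUS = "AT" * 15 + "GATTACAGATTACAGATTAC" + "TGG" + "AT" * 15
best = rank_guides(LOCUS, snv_index=45)[0]
print(best, snv_to_cut_distance(45, best.cut))
print(build_ssodn(LOCUS, 45, "G", best.cut))
```

The sketch ranks guides by distance alone. A real design would also weigh predicted guide activity, off-target sites, and the strand of the ssODN, none of which are modeled here.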

Site-specific cytidine deamination as a scarless SNV knock-in

The D10A and H840A missense mutations in the Cas9 protein together inactivate its nuclease activity while retaining its ability to bind to specific DNA sequences [80]. Conjugation of such catalytically dead Cas9 (dCas9) with an alternative enzymatic domain allows the recruitment of specific enzymatic activities to the target site in the genome [81,82,83,84,85,86]. This has been applied to the direct conversion of one base to another in the target genome. Activation-induced cytidine deaminase (AID) and dCas9 were fused to form a synthetic complex named Target-AID that converts C to T (or G to A) at the specific base (Fig. 1f) [87]. Another cytidine deaminase, APOBEC1, is also available for the programmable base editing mediated by dCas9 technology (Fig. 1f) [88]. For example, in a cancer cell line, HCC1954, carrying a Tyr163Cys mutation in the tumor suppressor gene p53, dCas9-APOBEC1 corrected the mutation by a specific C-to-T transition at a rate of 3.3%–7.6% [89]. Since this approach does not depend on DSB repair by HDR, it can be applied to a broad range of cell types. However, as in the case of ssODN knock-in cells, it requires a procedure for enriching single-base-pair-substituted cells. In addition, since only C-to-T transition is currently available in this approach, its application is somewhat limited.
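Because a deaminase tethered to dCas9 can act on more than one cytidine near the target base, it can help to list every C that falls within the editing window before choosing a guide. The Python sketch below is a simplified illustration. The window of protospacer positions 4 to 8 (counting the PAM-distal end as position 1) is an assumption that differs between editors and is not stated in the text above, and the function names and the example protospacer are hypothetical.

```python
def editable_cytidines(protospacer: str, window: tuple[int, int] = (4, 8)) -> list[int]:
    """1-based positions (PAM-distal end = 1) of C bases inside the assumed window."""
    first, last = window
    return [pos for pos, base in enumerate(protospacer, start=1)
            if base == "C" and first <= pos <= last]


def apply_c_to_t(protospacer: str, positions: list[int]) -> str:
    """Return the protospacer with C converted to T at the given 1-based positions."""
    bases = list(protospacer)
    for pos in positions:
        if bases[pos - 1] != "C":
            raise ValueError(f"position {pos} is not a cytidine")
        bases[pos - 1] = "T"
    return "".join(bases)


# Hypothetical protospacer with three cytidines in the assumed window.
PROTOSPACER = "GATCACCGTAGCTAGGATCA"
in_window = editable_cytidines(PROTOSPACER)
print(in_window)                             # candidate target and bystander positions
print(apply_c_to_t(PROTOSPACER, in_window))  # outcome if every C in the window is edited
```

If more than one position is returned, the additional cytidines are potential bystander edits, and a guide that places only the intended C in the window would be preferable in this simplified view.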

CORRECT method for scarless SNV knock-in

A two-step genome editing strategy named “CORRECT” for scarless SNV knock-in has been reported (Fig. 1g) [90, 91]. In the first step of this approach, the intended mutation is introduced into human cultured cells using an ssODN template carrying both the intended mutation and secondary silent mutations, which prevent recutting by CRISPR/Cas9 and the subsequent introduction of unwanted indels. In the second step, only the secondary mutations are removed using a redesigned guide RNA, which targets the 20-bp sequence containing the introduced CRISPR/Cas9-blocking mutations, together with a modified repair ssODN template. Alternatively, a Cas9 variant such as VRER-Cas9, which recognizes the modified PAM sequence introduced as a blocking mutation in the first step, can also be used in the second step [90].
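The logic of the blocking mutation can be traced through both steps with a toy model of cleavability. The Python sketch below covers only the variant of the second step that uses a Cas9 variant with an altered PAM preference. It assumes exact protospacer matching on one strand, an NGG PAM for wild-type SpCas9, and an NGCG PAM for VRER-Cas9. The alleles are hypothetical and do not model whether the blocking mutation is silent at the protein level.

```python
import re

SPCAS9_PAM = "[ACGT]GG"  # NGG
VRER_PAM = "[ACGT]GCG"   # NGCG, assumed PAM preference of the VRER variant


def cleavable(allele: str, protospacer: str, pam: str) -> bool:
    """Exact-match model: protospacer immediately followed by a permitted PAM."""
    return re.search(re.escape(protospacer) + pam, allele) is not None


# Hypothetical alleles; the SNV (C to T) lies just downstream of the PAM.
PROTO = "GATTACAGATTACAGATTAC"
WILD_TYPE = "ATATAT" + PROTO + "TGGG" + "ATCATA"
STEP1 = "ATATAT" + PROTO + "TGCG" + "ATTATA"  # SNV plus PAM-blocking mutation
STEP2 = "ATATAT" + PROTO + "TGGG" + "ATTATA"  # blocking mutation reverted, SNV kept

print(cleavable(WILD_TYPE, PROTO, SPCAS9_PAM))  # editable in step 1
print(cleavable(STEP1, PROTO, SPCAS9_PAM))      # protected from recutting
print(cleavable(STEP1, PROTO, VRER_PAM))        # editable in step 2
print(cleavable(STEP2, PROTO, VRER_PAM))        # final allele protected in step 2
```

In this model, each step converts an allele that the nuclease in use can cut into one that it cannot, which is what limits recutting and secondary indels. Real Cas9 tolerates some mismatches, so the exact-match rule used here is more stringent than the enzyme.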

Using this approach, early-onset Alzheimer’s disease-causative mutations in amyloid precursor protein (APPSwe) and presenilin1 (PSEN1M146V) were precisely introduced into HEK293 cells and iPSCs [90]. The edited iPSC-derived cortical neurons displayed the disease-specific biochemical phenotypes of amyloid-β (A-β) peptide generation [90]. Thus, Alzheimer’s disease-associated phenotypes in neurons can be faithfully modeled in vitro using genome editing technology in human iPSCs. However, this approach requires at least two rounds of clonal selection, taking ~3 months to generate the intended mutant in iPSCs. To minimize clonal selection and the occurrence of indels at each step, HDR improvement strategies such as novel NHEJ inhibitors and HDR enhancers should be applied in this approach.

Drug-selectable scarless SNV knock-in

For efficient SNV introduction in human cultured cells, we previously reported a TALEN-mediated two-step single-base-pair editing strategy (Fig. 1h) [92]. The first step included TALEN-mediated insertion of a drug-selectable marker cassette into an SNV-flanking region. The targeting vector carried a neomycin-resistance gene and a herpes simplex virus thymidine kinase (hsvTK) gene separated by a 2A peptide sequence, allowing expression of the discrete protein products from a single open reading frame. The drug-selectable marker cassette knock-in cell clones were positively selected using neomycin. The second step involved the removal of the drug-selectable marker cassette from the targeted alleles, and introduction of the single-nucleotide substitution in an HDR-activity-dependent manner. Single-nucleotide-edited clones were negatively selected using ganciclovir treatment. Compared with the CORRECT method, the TALEN-mediated two-step single-base-pair editing strategy enables more efficient isolation of the edited clones, since it uses an antibiotic resistance marker for screening. Using this approach, we identified a causal mutation of a cancer-prone genetic disorder, premature chromatid separation with mosaic variegated aneuploidy [PCS (MVA)] syndrome [92,93,94]. Both biallelic and monoallelic mutations of the BUB1B gene encoding BubR1 have been reported in patients [95, 96]. Monoallelic mutations in the exons of BUB1B were identified in seven Japanese families with this syndrome. No second mutation in exons of the BUB1B gene was found in the opposite allele, although a conserved BUB1B haplotype within a 200-kb interval among the Japanese patients and a reduced transcript level were identified [96]. We thus determined the nucleotide sequence of the 200-kb region in a patient with this syndrome using deep-sequencing analysis and found that a unique SNV in an intergenic region 44 kb upstream of the BUB1B transcription start site was linked to the disease [92]. We used TALEN-mediated single-base-pair editing technology to biallelically introduce this substitution into HCT116 cells. The genome-edited clones showed reduced BUB1B transcript levels and PCS (MVA) syndrome-specific chromosomal instability, demonstrating that the identified SNV was indeed the causal mutation [92]. The single-base-pair editing technique is thus useful to evaluate nucleotide variants with unknown functional relevance.

In this approach, TALEN can be replaced by other EENs including CRISPR/Cas9. Moreover, the drug-selectable cassette marker can be excised either by site-specific recombinases or by the piggyBac transposase (Fig. 1h), which has recently been demonstrated to be effective in human iPSCs [51, 97,98,99,100]. However, the targetable loci are limited since editing with piggyBac transposase requires a TTAA sequence near the target site [98]. These strategies are useful for efficient SNV knock-in in human cultured cells, but they require at least two rounds of clone selection, which is time-consuming and can produce off-target mutations. Further investigations are thus needed to achieve safe scarless SNV-knock-in systems.
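Whether a locus is accessible to piggyBac-based excision can be screened by searching for TTAA motifs near the intended SNV. The Python sketch below is illustrative only: the function name `ttaa_sites_near`, the distance cutoff of 50 bp, and the example sequence are hypothetical, and the appropriate cutoff depends on the design of the targeting vector. Because TTAA is its own reverse complement, scanning one strand is sufficient.

```python
def ttaa_sites_near(seq: str, target_index: int, max_distance: int = 50) -> list[tuple[int, int]]:
    """Return (site_start, distance) for TTAA motifs close to target_index.

    Distance is measured from the first base of the motif to the target index,
    and the result is sorted from the nearest site to the farthest.
    """
    sites = []
    start = seq.find("TTAA")
    while start != -1:
        distance = abs(start - target_index)
        if distance <= max_distance:
            sites.append((start, distance))
        start = seq.find("TTAA", start + 1)
    return sorted(sites, key=lambda site: site[1])


# Hypothetical locus with TTAA motifs at indices 20 and 44.
LOCUS = "GC" * 10 + "TTAA" + "GC" * 10 + "TTAA" + "GC" * 40
print(ttaa_sites_near(LOCUS, target_index=30, max_distance=12))
print(ttaa_sites_near(LOCUS, target_index=30))
```

An empty result for a given cutoff would suggest that the locus is a poor candidate for transposase-mediated excision under the assumed design, in which case recombinase-based or HDR-based removal of the cassette could be considered instead.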

Conclusions

Genome editing technology that takes the characteristics of each cell type into account will undoubtedly provide major insights into the pathological mechanisms and therapeutic targets of genetic disorders. However, several important challenges remain. The most critical limitation of this technology derives from a general property of human cultured cells, namely, that DSBs are repaired predominantly by the NHEJ pathway rather than by HDR. Therefore, an NHEJ-mediated gene knockout strategy is mainly used to evaluate loss-of-function effects and to remove disease-causing sequences. In the case of Duchenne muscular dystrophy (DMD) patient-derived iPSCs, a 725-kb genomic region containing a premature stop codon in the disease-causing DMD gene was deleted by CRISPR/Cas9 to rescue the open reading frame and ensure partial protein function [101]. In addition, to treat HIV-infected patients, the CCR5 gene of T cells, which encodes a chemokine co-receptor required for HIV infection but not for survival, was disrupted by ZFN, and clinical trials of this approach are underway [102]. However, for most clinical treatments or precise modeling of diseases in vitro, it is essential to achieve high-frequency knock-in of the repair template with the desired variant. Notably, since microhomology-mediated end joining (MMEJ)-assisted gene knock-in named PITCh (Precise Integration into Target Chromosome) [103, 104] and NHEJ-mediated site-specific gene insertion, HITI [68], are both HDR-independent precise knock-in methods, these strategies may increase the utility of genome editing in human cultured cells. Several techniques to shift the DSB repair balance from NHEJ to HDR in human cultured cells have also been developed, including inhibition of NHEJ with small compounds and Cas9 protein accumulation in an S- and G2-phase-dependent manner [105,106,107,108,109]. In addition, a class 2 CRISPR effector, Cpf1, which cuts target DNA further away from the PAM sequence to generate a single-stranded overhang, may favor HDR over NHEJ [110]. Against this background, further advances of genome editing technology in human cultured cells are required to understand and correct genetic disorders.