Insights on autophagosome–lysosome tethering from structural and biochemical characterization of human autophagy factor EPG5

Pivotal to the maintenance of cellular homeostasis, macroautophagy (hereafter autophagy) is an evolutionarily conserved degradation system that involves sequestration of cytoplasmic material into the double-membrane autophagosome and targeting of this transport vesicle to the lysosome/late endosome for degradation. EPG5 is a large metazoan protein proposed to serve as a tethering factor that enforces autophagosome–lysosome/late endosome fusion specificity, and its deficiency causes a severe multisystem disorder known as Vici syndrome. Here, we show that human EPG5 (hEPG5) adopts an extended “shepherd’s staff” architecture. We find that hEPG5 binds preferentially to members of the GABARAP subfamily of human ATG8 proteins, which are critical to autophagosome–lysosome fusion. The hEPG5–GABARAPs interaction, which is mediated by tandem LIR motifs that exhibit differential affinities, is required for hEPG5 recruitment to mitochondria during PINK1/Parkin-dependent mitophagy. Lastly, we find that the Vici syndrome mutation Gln336Arg affects neither the overall stability of hEPG5 nor its ability to interact with the GABARAPs. Collectively, our results reveal new insights into how hEPG5 recognizes the mature autophagosome and establish a platform for examining the molecular effects of Vici syndrome disease mutations on hEPG5.

Macroautophagy (also known as autophagy) is the main pathway for degrading long-lived cytoplasmic macromolecules and full-sized organelles and represents a key component of the cellular homeostatic program. Under favorable growth conditions, basal autophagy serves as a quality control mechanism to selectively remove misfolded/aggregated proteins and dysfunctional organelles in the cytoplasm 1,2. When cells encounter stress conditions, such as starvation, autophagy is upregulated to promote nonselective bulk degradation, generating basic building blocks to power essential metabolic reactions and to produce energy [3][4][5]. Because of autophagy's important roles in guarding normal cellular physiology, dysregulation of this degradation pathway is linked to many human pathologies ranging from neurodegeneration and cancer to infectious diseases 6,7. An improved understanding of autophagy at the molecular level will generate insights into the basis of different human diseases and may reveal additional avenues for therapeutic intervention.
Autophagic degradation begins with the formation of a membrane precursor known as the phagophore. The phagophore expands in size, sequesters cytoplasmic materials, and self-seals to form a double-membrane transport vesicle called the autophagosome. The cargo-laden autophagosome is then transported to, and fuses with, the lysosome or the late endosome, where the contents of the autophagosome are ultimately digested by hydrolytic enzymes inside the lysosome [3][4][5]8. The discovery of the ATG (autophagy related) genes by yeast genetic screening and the identification of the core autophagy machinery composed of 18 mostly conserved Atg proteins generated a framework for investigating the molecular mechanism of this multistep degradation pathway. Subsequent characterization of the core autophagy machinery consisting of five key functional groups (Atg1 kinase complex, autophagy-specific phosphatidylinositol 3-kinase/PI3K complex, Atg8 conjugation system, Atg12 conjugation system, Atg9 and Atg2-Atg18 complex) yielded mechanistic insights into autophagy initiation and autophagosome biogenesis, primarily in the yeast model system 3,9,10. However, the mechanisms underlying the later steps of autophagy, including how the autophagosome engages and ultimately fuses with the lysosome, remain less well understood.
Recent studies in higher eukaryotes have begun to unravel these mysteries. The identification of syntaxin 17 (STX17) and YKT6 as the autophagosomal SNAREs that bind lysosomal VAMP8-SNAP29 and STX7-SNAP29, respectively, to form trans-SNARE complexes offered insights into the autophagosome-lysosome membrane fusion process 11,12. Systematic gene deletion studies of the six homologs of human ATG8 (LC3A, LC3B, LC3C, GABARAP, GABARAPL1, and GABARAPL2) revealed that the three members of the GABARAP subfamily play critical roles in autophagosome-lysosome fusion 13. Furthermore, the discovery of new non-ATG autophagy regulators, including the conserved multi-subunit HOPS complex and the metazoan-specific proteins TECPR1, PLEKHM1, BRUCE, GRASP55, and EPG5, generated a growing list of additional proteins/protein complexes that participate in the terminal stage of autophagy 11,[14][15][16][17][18][19][20]. However, the precise physiological functions of these newly identified autophagy factors, and exactly how they coordinate with one another and the GABARAP proteins to mediate autophagosome-lysosome fusion, are not fully understood.
Originally discovered in a Caenorhabditis elegans genetic screen for metazoan autophagy genes, EPG5 (ectopic P-granules autophagy protein 5) is a large (~292 kDa) protein proposed to regulate fusion specificity between autophagosomes and lysosomes. Notably, epg-5 deficiency in C. elegans causes nonspecific fusion of autophagosomes with other endocytic vesicles and the formation of abnormally large non-degradative vesicles 20,21. Subsequent studies in C. elegans and human cell lines showed that EPG5 is recruited to the lysosome/late endosome by the small GTPase RAB7, and that EPG5 can bind the autophagosome surface protein and ATG8 homolog human LC3B or C. elegans LGG-1 via two LC3-interacting region (LIR) motifs composed of a conserved sequence [(W/F/Y)-X1-X2-(I/L/V)] 20. Together with the finding that C. elegans EPG-5 is capable of stabilizing and facilitating assembly of the STX17-SNAP29-VAMP8 trans-SNARE complex in vitro, these new data led to the model that EPG5 functions as an autophagy tethering factor that mediates initial interaction between the autophagosome and the lysosome 20.
At around the time EPG5's role in autophagy was uncovered, clinical genetics analysis revealed that recessive mutations of the gene encoding human EPG5 (hEPG5) cause Vici syndrome, a rare but severe multisystem disorder characterized by agenesis of the corpus callosum, cataracts, cardiomyopathy, hypopigmentation, and combined immunodeficiency [22][23][24]. Approximately 100 cases of Vici syndrome have been reported to date, with a median survival time of 24 months [22][23][24]. Analyses of primary cells isolated from patients showed an accumulation of autophagosomes attributed to deficiency in autophagosome-lysosome fusion 24,25. Interestingly, epg5 −/− knockout mice exhibit neurodegenerative features resembling human amyotrophic lateral sclerosis 26. Although many Vici syndrome mutations have been mapped, the effects of these mutations on EPG5's structure and function are not known.
By developing a method to produce recombinant full-length hEPG5, we were able to comprehensively characterize the structural and biochemical properties of this large putative autophagy tethering factor. We found that hEPG5 adopts an extended architecture reminiscent of tethering factors found in other membrane trafficking pathways. We also found that hEPG5 shows preferential binding to the GABARAP subfamily of ATG8 proteins, and that this interaction involves a complex interplay between two LIR motifs exhibiting differential binding affinities. We further showed that the hEPG5-GABARAP interaction is required for hEPG5 recruitment to mitochondria during PINK1/Parkin-mediated mitophagy. Lastly, the common recurrent Vici syndrome mutation Q336R did not affect the overall architecture, stability, or GABARAP-binding ability of hEPG5.

Results
hEPG5 adopts an extended overall architecture. With 2579 amino acid residues and an overall molecular mass of ~290 kDa, hEPG5 is one of the largest regulators in the autophagy pathway identified to date. Due in part to technical challenges associated with purification of this large protein, nothing is currently known about the structural properties of hEPG5. To overcome this barrier, we developed a baculovirus-insect cell-based system to overexpress recombinant full-length hEPG5, and purified the recombinant protein by anti-FLAG affinity chromatography coupled with glycerol density gradient ultracentrifugation.
This procedure enabled us to obtain highly purified hEPG5 suitable for biochemical and structural characterization (Fig. 1a). Analytical gel filtration chromatography of purified hEPG5 showed that it elutes at a volume corresponding to a predicted molecular weight higher than its calculated mass (Fig. 1b). This suggests that hEPG5 either exists as an obligate oligomer or adopts a non-globular overall shape. We next examined hEPG5 by negative stain single-particle electron microscopy (EM). Raw images not only revealed highly elongated particles with a distinct curvature at one end, but also showed that hEPG5 is monomeric (Fig. 1c). Two-dimensional (2D) analysis emphasized that hEPG5 has an overall architecture resembling a "shepherd's staff", composed of a rigid round "hook" connected to an extended and more flexible "shaft" (Fig. 1d). A "thumb"-shaped protrusion is also present between the hook and the shaft. The length of hEPG5 is estimated to be ~375 Å, a value consistent with the maximum dimensions observed for different tethering complexes found in conventional membrane trafficking pathways 27,28. A three-dimensional (3D) reconstruction calculated from the negative stain EM data showed that hEPG5 is nonplanar, with its two "ends" projecting in opposite directions (Fig. 1g). To determine which regions of hEPG5 adopt the two prominent substructures, we generated and purified a truncated version of hEPG5 lacking the C-terminal 500 residues (designated hEPG5 Δ2079-2579). 2D negative stain EM analysis showed that this hEPG5 truncation mutant, while adopting an overall architecture reminiscent of full-length hEPG5, contains a shorter shaft, indicating that the C-terminus of hEPG5 is located at the tip of the shaft (Fig. 1e). We also generated a fusion construct with maltose-binding protein (MBP) fused to the C-terminus of hEPG5 and purified this fusion protein for negative stain EM. In agreement with our termini assignment from deletion analysis, 2D EM analysis showed an extra density projecting from the tip of the shaft (Fig. 1f).
Fig. 1 Overall architecture of hEPG5. a SDS-PAGE of His-FLAG-hEPG5 glycerol gradient ultracentrifugation fractions, stained with Coomassie Blue. M and I represent the protein marker and input, respectively. b Analytical gel filtration chromatography of His-FLAG-hEPG5. Elution volume of hEPG5 (~300 kDa including the tags) is indicated by black dashed line and the molecular weight standard ferritin (440 kDa) is indicated by blue dashed line. c A representative raw image of negatively stained hEPG5 (Scale bar: 100 nm). d Representative 2D class averages of wild-type hEPG5, with the location of N- and C-termini indicated in yellow. e Representative 2D class averages of hEPG5 Δ2079-2579. f Representative 2D class averages of C-terminal maltose-binding protein (MBP)-tagged hEPG5. MBP density is shown by asterisk (yellow). g 3D reconstruction of hEPG5 revealing the hook and shaft regions (Scale bar: 5 nm).
hEPG5 interacts preferentially with the GABARAP subfamily of ATG8 proteins. hEPG5's extended architecture seems well suited to its proposed role in tethering the autophagosome to the lysosome prior to fusion. Tethering factors mediate longer range interaction between the transport vesicle and its target organelle 28,29. The recent finding that hEPG5 is capable of binding LC3B indicated that hEPG5 likely recognizes the autophagosome via this human ATG8 protein, which localizes to both the inner and outer membrane of the autophagosome 20. However, humans have six ATG8 homologs (LC3A, LC3B, LC3C, GABARAP, GABARAPL1, and GABARAPL2) 30,31, and it is unclear if hEPG5 is capable of binding other members of the ATG8 family. We therefore subjected purified FLAG-tagged full-length hEPG5 to a systematic GST (glutathione S-transferase) pull-down analysis involving all six GST-tagged human ATG8 homologs. Our fluorescence-based western blotting showed that the three members of the GABARAP subfamily (GABARAP, GABARAPL1, and GABARAPL2) precipitated 2.5 times more hEPG5 than the three members of the LC3 subfamily (LC3A, LC3B, and LC3C), indicating that hEPG5 binds preferentially to the GABARAPs (Fig. 2a, b). Interestingly, hEPG5 has different affinities toward the three LC3 subfamily members, with the strongest interaction with LC3C and the weakest with LC3B.
hEPG5 binds GABARAP and other ATG8 proteins via a tandem LIR motif. ATG8 proteins typically bind the so-called LIR motifs of their cognate binding partners 31,32. Although previous studies have shown that two tandemly arranged motifs between residues 550 and 570 of hEPG5 (550 WTLV 553 and 567 WILL 570) are essential for interaction with LC3B, other putative LIR motifs are also predicted along the entire length of hEPG5 20. To find out which of these putative LIRs of hEPG5 is/are responsible for mediating the high-affinity interaction with the GABARAP proteins, we first applied a deletion mapping approach that involves purifying hEPG5 truncation mutants and assessing their abilities to bind GABARAP by GST pulldown. Out of the series of truncation mutant constructs we designed and generated, only three could be purified at sufficient levels for biochemical analyses (Supplementary Fig. 1a, b). These include hEPG5 Δ1-548, which is devoid of the region between the N-terminus and the two previously characterized LIR motifs; hEPG5 Δ1-1198, which excludes the entire N-terminal region; and hEPG5 Δ1770-2579, which excludes the entire C-terminal region but shares a high degree of sequence identity with C. elegans EPG-5, a protein of 1599 amino acid residues that is approximately half the size of hEPG5. Our pull-down results showed that hEPG5 Δ1770-2579 binds GABARAP as well as wild-type hEPG5 does, suggesting that the entire C-terminal region of hEPG5 is dispensable for GABARAP interaction (Fig. 2c). The observation that hEPG5 Δ1-548, but not hEPG5 Δ1-1198, was pulled down by GABARAP suggested that the GABARAP-binding site is located between residues 548 and 1198 of hEPG5. We next mixed recombinant hEPG5 with GST-tagged GABARAP, purified the hEPG5-GABARAP complex, and examined the purified complex by negative stain EM. Our 2D analysis revealed an extra density present along the N-terminal hook-shaped structure and supported by the protruding "thumb" (Fig. 2d), confirming the general location of the GABARAP-binding domain from our deletion mapping experiment. Interestingly, we found that hEPG5 Δ1-548 was precipitated by GABARAP at a higher level than wild-type hEPG5. This observation could be attributed to increased accessibility of one or more LIR motifs upon removal of structural elements located in the N-terminal region of hEPG5 (Fig. 2c).
To determine which of the three putative LIR motifs between residues 548 and 1198 (550 WTLV 553, 567 WILL 570, and 794 FIKI 797) are required for GABARAP interaction, we first generated three hEPG5 mutants in which the key first aromatic residue of each of the three LIR motifs was replaced by an alanine (Supplementary Fig. 1c-e), and then assessed the ability of these mutants to bind GABARAP by GST pulldown. We found that mutations to the first two LIR motifs in this region (W550A and W567A) severely or mildly diminish hEPG5's interaction with GABARAP, respectively, underscoring their importance in binding GABARAP (Fig. 2e and Supplementary Fig. 2a, b). By contrast, hEPG5 F794A, which contains a mutation to the third LIR motif, binds GABARAP as strongly as wild-type hEPG5, suggesting that this LIR motif is not required for GABARAP interaction (Supplementary Fig. 2c). We also generated the W550A/W567A double mutant (Supplementary Fig. 1f) and showed that it completely abolished GABARAP binding (Fig. 2e and Supplementary Fig. 2d). Collectively, these results indicated that the tandem LIR motifs, previously shown to bind LC3B 20, mediate high-affinity interaction with GABARAP. Furthermore, the first LIR (hereafter denoted LIR1) appears to play a more dominant role than the second LIR (hereafter denoted LIR2) in this interaction.
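As an illustration of how the core LIR consensus (W/F/Y)-X1-X2-(I/L/V) picks out the motifs discussed here, the following minimal Python sketch scans a sequence fragment for candidate motifs and reports them in residue numbering. It is not the prediction method used in this study; the function name `find_lir_candidates` and its `offset` parameter are hypothetical, and the test fragments are the LIR1 and LIR2 peptide sequences and the FIKI motif quoted in this paper. A match to this four-residue pattern is only a candidate, since, as the F794A result shows, a sequence match does not guarantee a functional LIR.

```python
import re

# Core LIR consensus quoted in the text: (W/F/Y)-X1-X2-(I/L/V).
# A lookahead is used so that overlapping candidates are all reported.
LIR_CORE = re.compile(r"(?=([WFY][A-Z]{2}[ILV]))")


def find_lir_candidates(sequence, offset=1):
    """Return (start, end, motif) tuples for every core-LIR match.

    `offset` is the residue number of the first character of `sequence`
    (hypothetical helper parameter, so fragments can be numbered as in
    the full-length protein).
    """
    seq = sequence.upper()
    return [
        (m.start() + offset, m.start() + offset + 3, m.group(1))
        for m in LIR_CORE.finditer(seq)
    ]


if __name__ == "__main__":
    # Peptide fragments and numbering as given in the paper.
    fragments = {
        "LIR1 peptide": ("GSGTWTLVDEG", 546),
        "LIR2 peptide": ("DEDPETSWILLN", 560),
        "third motif": ("FIKI", 794),
    }
    for name, (seq, first_residue) in fragments.items():
        print(name, find_lir_candidates(seq, offset=first_residue))
```

Replacing the leading aromatic residue with alanine (for example, W550A in the LIR1 fragment) removes the match, which mirrors the logic of the point mutants used in the pull-down experiments.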
LIR2 peptide shows higher binding affinity toward GABARAP proteins than LIR1 peptide. To better understand how LIR1 and LIR2 work in conjunction with one another to mediate high-affinity interaction with GABARAP, we first decoupled the two LIR motifs by synthesizing peptides corresponding to LIR1 (546 GSGTWTLVDEG 556) or LIR2 (560 DEDPETSWILLN 571), and used isothermal titration calorimetry (ITC) to measure the binding constants of these two peptides with GABARAP and other human ATG8 proteins. For LIR1, we found that, in agreement with our pull-down data with full-length hEPG5, this peptide shows higher affinity toward all three GABARAP subfamily members than toward the three LC3 subfamily members (Fig. 3a and Supplementary Table 1). Notably, the strongest binding was observed for GABARAP (Kd of 7.47 µM), followed by GABARAPL1 and GABARAPL2 (8.54 µM and 11.79 µM, respectively). By contrast, the three LC3 subfamily members bind weakly, and we could only accurately determine the Kd of LIR1 with LC3A, which is three times higher than that for LIR1-GABARAP.
For LIR2, our ITC experiments revealed that although this peptide retains a strong preference for the three GABARAP subfamily members, it exhibits substantially higher affinity toward all six ATG8 homologs (Fig. 3b and Supplementary Table 1). More specifically, the Kd values of LIR2 with GABARAP (0.16 µM), GABARAPL1 (0.09 µM), and GABARAPL2 (0.68 µM) are ~45-fold, 100-fold, and 15-fold lower than those measured for LIR1, respectively. Similarly, the Kd values for LIR2 with LC3A and LC3B (1.75 µM and 4.07 µM, respectively) are approximately ten times lower than those determined for LIR1. The observation that LIR2 binds more tightly to GABARAP and other ATG8 homologs than LIR1 was unexpected, given that our pull-down analysis on the full-length hEPG5 LIR mutants indicated that LIR1 plays a more dominant role in this interaction. This discrepancy could be explained by the relative inaccessibility of LIR2 in the context of full-length hEPG5 until LIR1 contacts GABARAP, possibly causing a local conformational change.
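The fold differences quoted above follow directly from the ratio of the two dissociation constants for each GABARAP subfamily member. The short Python sketch below reproduces that arithmetic from the Kd values stated in the text; the dictionary and function names are hypothetical and chosen only for illustration, and the calculation ignores the experimental uncertainties reported in Supplementary Table 1.

```python
# Kd values (in µM) for the isolated LIR peptides, as quoted in the text.
KD_LIR1_UM = {"GABARAP": 7.47, "GABARAPL1": 8.54, "GABARAPL2": 11.79}
KD_LIR2_UM = {"GABARAP": 0.16, "GABARAPL1": 0.09, "GABARAPL2": 0.68}


def fold_lower(kd_reference_um, kd_tighter_um):
    """Return how many times lower kd_tighter_um is than kd_reference_um."""
    if kd_reference_um <= 0 or kd_tighter_um <= 0:
        raise ValueError("Kd values must be positive")
    return kd_reference_um / kd_tighter_um


def lir2_versus_lir1():
    """Fold decrease in Kd (LIR1 Kd divided by LIR2 Kd) for each protein."""
    return {
        protein: fold_lower(KD_LIR1_UM[protein], KD_LIR2_UM[protein])
        for protein in KD_LIR1_UM
    }


if __name__ == "__main__":
    for protein, fold in lir2_versus_lir1().items():
        print(f"{protein}: LIR2 Kd is about {fold:.0f}-fold lower than LIR1 Kd")
```

The computed ratios (roughly 47, 95, and 17) agree with the rounded ~45-fold, 100-fold, and 15-fold figures given in the text.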
LIR2 binds canonical binding site on GABARAPL1. We next examined how LIR2 mediates high-affinity interaction with the GABARAP subfamily proteins by co-crystallizing LIR2 in complex with GABARAPL1 and determining the crystal structure of this complex at 1.91 Å resolution (Table 1). There are two copies of LIR2-GABARAPL1 in the asymmetric unit, and their overall structures are essentially identical to one another (Fig. 4a and Supplementary Fig. 3a, b). Our crystal structure revealed that LIR2 binds GABARAPL1 at the canonical LIR-binding site through a network of hydrophobic and electrostatic interactions. The critical aromatic residue W567 hEPG5 is inserted into hydrophobic pocket 1 (HP1) of GABARAPL1, whereas the hydrophobic residue L570 hEPG5 is inserted into hydrophobic pocket 2 (HP2; Fig. 4b). Within HP1, the side chains of residues P30, L50, and F104 of GABARAPL1 form hydrophobic interactions with W567 hEPG5 (Fig. 4c). On the other hand, the side chains of the residues lining HP2 (Y49, V51, F60, L63, and I64) are engaged in hydrophobic contacts with L570 hEPG5 (Fig. 4f). In addition, W567 hEPG5 forms an electrostatic interaction with the carboxyl group on the E17 GABARAPL1 side chain at HP1, and the main chain carbonyl oxygen and NH group of L570 hEPG5 form hydrogen bonds with the guanidinium group of R28 GABARAPL1 and the carbonyl oxygen of L50 GABARAPL1 at HP2 (Fig. 4c, f). The central residues of LIR2 also contribute to GABARAPL1 binding. The I568 hEPG5 side chain forms a hydrophobic interaction with the aromatic side chain of Y49 GABARAPL1, and the NH group and carbonyl oxygen of I568 hEPG5 form hydrogen bonds with the main chains of K48 GABARAPL1 and L50 GABARAPL1 (Fig. 4d). L569 hEPG5 forms hydrophobic interactions with the side chains of Y25 GABARAPL1 and L50 GABARAPL1 (Fig. 4e). hEPG5-LIR2 interacts with GABARAPL1 in a very similar fashion to other GABARAPL1 binding partners (PDB:5DPT 33; PDB:5LXI 34; PDB:5YIP 35; PDB:6HOL 36; and PDB:6HOI 36), with the root-mean-square deviation (r.m.s.d.) of these structures ranging from 0.65 to 1.00 Å (Supplementary Fig. 3c).
We then compared our LIR2-GABARAPL1 crystal structure with the previously reported apo-GABARAPL1 crystal structure (PDB:2R2Q). We found that the side chain of K46 GABARAPL1 undergoes a conformational rearrangement upon LIR2 binding (Supplementary Fig. 3d-f). This lysine conformational rearrangement has previously been shown to be important for LIR motif binding in LC3 subfamily proteins, as well as GABARAP and GABARAPL2, suggesting that this mechanism is conserved amongst mammalian LC3/GABARAP proteins (K49 for LC3A/B, K55 for LC3C, and K46 for GABARAP/L1/L2) [46][47][48]. Within apo-GABARAPL1, the side chain of K46 forms a hydrogen bond with the main chain of K48, as well as a hydrophobic interaction with the aromatic ring of Y49. Upon LIR2 peptide binding, these interactions are disrupted, and the side chain of K46 GABARAPL1 shifts outward by 8.0 Å. This creates space for I568 hEPG5 of LIR2 to bind and engage in hydrophobic interactions with K46 GABARAPL1 and Y49 GABARAPL1, as well as hydrogen bonding with the main chain of K48 GABARAPL1, as described above (Fig. 4d).
Fig. 4 Crystal structure of the LIR2-GABARAPL1 complex. a Surface representation of complex between LIR2 peptide (magenta) and GABARAPL1 (gray) at 1.91 Å. b Close-up view of LIR2 peptide (sticks representation in pink) binding to canonical binding site on GABARAPL1 (ribbon and transparent surface representation in gray). Residues Trp567 and Leu570 of LIR2 are inserted into hydrophobic pocket 1 (HP1 in cyan) and hydrophobic pocket 2 (HP2 in orange), respectively, through hydrophobic interaction. c-f Close-up view of the interactions between each of the LIR2 motif residues (magenta) and residues on GABARAPL1 (dim gray). Black and orange dotted lines represent hydrophobic and electrostatic interaction, respectively; black solid lines represent hydrogen bonds. c Side chain of Trp567 interacts with side chain of Pro30, Leu50, and Phe104 through hydrophobic interaction, as well as the carboxyl group on Glu17 side chain through electrostatic interaction. d Side chain of Ile568 interacts with side chain of Lys46 and Tyr49 through hydrophobic interaction; main chain of Ile568 forms hydrogen bonds with main chain of Lys48 and Leu50. e Side chain of Leu569 interacts with side chain of Tyr25 and Leu50 through hydrophobic interaction. f Side chain of Leu570 interacts with side chain of Tyr49, Val51, Phe60, Leu63, and Ile64 through hydrophobic interaction; main chain of Leu570 interacts with main chain of Leu50 and guanidinium group on Arg28 side chain. g, h Close-up view of the interactions between each of the LIR2 N- and C-terminal residues (magenta) and residues on GABARAPL1 (dim gray). Solid lines represent hydrogen bonds. g LIR2 N-terminal residues Glu564 side chain, Thr565 main chain, and Ser566 side chain form hydrogen bonds with side chain of Tyr25, Lys46, and Lys48, respectively. h LIR2 C-terminal residue Asn571 side chain forms hydrogen bonds with guanidinium group on Arg28 side chain.
hEPG5 requires GABARAP to localize to mitochondria during PINK1/Parkin-mediated mitophagy.
place. Although previous studies showed that hEPG5 localizes to the perinuclear region, as well as diffusely in the cytoplasm in basal conditions 20, it is unclear if the GABARAPs have a role in this localization pattern. We therefore transfected hEPG5-GFP into four different HeLa cell lines: wild type, LC3-TKO which contains deletion of all three genes encoding the LC3 subfamily members, GABARAP-TKO which contains deletion of all three genes encoding the three GABARAP subfamily members, and ATG8-hexaKO which contains deletion of all six genes encoding the six human ATG8 homologs, and examined hEPG5's localization by confocal microscopy. We found that hEPG5-GFP shows the same localization pattern in all four cell lines (Fig. 5a), indicating that this tethering factor traffics to the lysosome/late endosome independently of the GABARAPs. To further delineate the role of the hEPG5-GABARAP interaction in autophagy, we next analyzed hEPG5's localization under mitophagy-inducing conditions. We used the well-established approach of activating PINK1/Parkin-dependent mitophagy by treating cells with oligomycin and antimycin A 49. Upon induction of mitophagy, hEPG5-GFP localizes to punctate structures on or next to mitochondria in WT and LC3-TKO cells (Fig. 5b). Examination of hEPG5-GFP in GABARAP-TKO and ATG8-hexaKO cells showed that the absence of the GABARAPs appears to prevent the formation of these structures. These results indicate that hEPG5 requires the GABARAPs for recruitment to mitochondria during PINK1/Parkin-dependent mitophagy, and that hEPG5 functions downstream of the GABARAPs to drive autophagosome-lysosome/late endosome fusion.
Vici syndrome mutation Q336R does not affect structural integrity and stability of hEPG5. With a robust system to produce recombinant hEPG5 in place, we utilized this platform to more thoroughly examine the effects of Vici syndrome mutations on the structural and biochemical properties of hEPG5. A common missense mutation discovered from studies on two large cohorts of Vici syndrome is a nucleotide mutation at position 1007 of the EPG5 gene. This mutation results in a single-residue change (Gln336Arg; Q336R) in the hEPG5 protein 22,24,[50][51][52]. We decided to first focus on this disease mutation and examine its effect on the hEPG5 protein. We were able to express and purify hEPG5 Q336R at a yield similar to that of wild-type hEPG5 (Fig. 1a and Supplementary Fig. 1g). Negative stain EM analysis of hEPG5 Q336R showed no change in overall architecture or subunit stoichiometry compared to wild-type hEPG5 (Fig. 6a). We next used the thermal shift assay to assess the stability of the mutant protein and found that hEPG5 Q336R is slightly more stable than wild-type hEPG5, with an estimated 1.5°C higher melting temperature (Fig. 6b, c). Lastly, a GST pull-down assay showed that the Q336R mutation did not affect hEPG5's ability to bind GABARAP and other human ATG8 proteins (Fig. 6d).

Discussion
For all intracellular trafficking pathways, including autophagy, a transport vesicle must fuse specifically with its target organelle to ensure each cargo can reach and be delivered to its correct destination 53,54. Tethering factors are a diverse family of proteins and protein complexes that play critical roles in defining and enforcing this specificity by mediating initial engagement between a transport vesicle and its target and facilitating the fusion event 54,55. Recent studies on C. elegans EPG-5 by the Zhang group led to the proposal that this large protein serves as an autophagy tethering factor, as it possesses two features found in tethering factors of other membrane trafficking pathways: (1) the ability to bind the transport vesicle (autophagosome via LC3B) and the target organelle (late endosome/lysosome via RAB7), and (2) the ability to facilitate the formation of the STX17-SNAP29-VAMP7/8 trans-SNARE complex that mediates membrane fusion 20. The first structural information on full-length hEPG5 reported here further substantiates this hypothesis by showing that hEPG5 adopts a relatively elongated overall shape, an architecture reminiscent of the "appendage" substructures found in multi-subunit tethering complexes, including COG 56, TRAPPII 57, exocyst 58, and HOPS 59. While our 2D and 3D EM analyses suggested that hEPG5 is relatively rigid, its C-terminal "shaft" exhibits conformational flexibility, as has been observed for most tethering factors characterized to date. Lastly, the overall length of hEPG5 closely matches the longest dimension of the tethering complexes, which are thought to have evolved to mediate interactions at distances beyond the reach of the SNARE fusogen 28,29,60,61. hEPG5 is predicted to be composed of predominantly helical structures based on several different secondary structure prediction algorithms. Future high-resolution structural analysis of hEPG5 will determine if the extended substructures of hEPG5 are constructed from helical bundles arranged in a fashion similar to those observed in high-resolution crystal structures of tethering complex subunits.
Previous multigene deletion studies of the six human ATG8 homologs demonstrated that the three GABARAP subfamily members play crucial roles in autophagosome-lysosome/late endosome fusion 13. In agreement with this earlier finding, we observed that full-length hEPG5 shows strong preferential binding to the three GABARAP proteins. We also demonstrated the importance of the hEPG5-GABARAP interaction in autophagy by showing that hEPG5 requires the GABARAPs to localize to mitochondria in PINK1/Parkin-dependent mitophagy. Although multiple putative LIR motifs are predicted along the entire length of hEPG5, only the two tandemly arranged LIR motifs previously shown to mediate LC3B interaction are directly involved in binding GABARAP. Interestingly, the sequences of these two LIR motifs do not resemble the recently characterized GABARAP interaction motif or GIM ([W/F]-[V/I]-X2-V) 33, indicating that other structural or biochemical features of hEPG5 contribute to its preference for GABARAP.
Our finding that the tandem LIR motifs (LIR1 and LIR2) are both essential for optimal binding to GABARAP raises the question of why two motifs evolved. In an attempt to understand the functional relationship between the two LIR motifs, we observed that although LIR1 clearly plays a more dominant role than LIR2 in mediating GABARAP interaction, the isolated LIR2 binds GABARAP with substantially higher affinity. These seemingly contradictory results, though initially perplexing, suggested that a more complex relationship exists between the two LIR motifs. Our observation that deletion of the N-terminal region of hEPG5 can alleviate potential inhibitory effects on GABARAP binding indicates that one or both of these LIR motifs might be inaccessible in the context of full-length hEPG5. Based on this result, we propose a "two-factor authentication" step-wise binding model, in which LIR1 serves as an anchoring motif that makes initial contact with GABARAP, possibly at a noncanonical site. This binding would trigger a local conformational change that makes LIR2 accessible for binding to the canonical site on GABARAP (Fig. 7a). The ability of tandem LIR motifs to bind simultaneously to a single ATG8 protein has been previously observed for RavZ, a Legionella pneumophila effector protein that inhibits xenophagy by cleaving lipidated ATG8 proteins, such as LC3B. Crystallographic analysis of the N-terminal tandem LIR motifs of RavZ in complex with LC3B revealed that the tandem motifs adopt a novel beta-sheet conformation with the second LIR binding in a noncanonical fashion 45. As we were unable to obtain well-ordered crystals of hEPG5-LIR1 in complex with GABARAP proteins, likely due to the ability of isolated LIR1 to bind both the noncanonical and canonical sites in the absence of LIR2, validation of this model will likely require high-resolution cryo-EM analysis of full-length hEPG5 in complex with GABARAP or crystallization of GABARAP in complex with the tandem motifs. Finally, our studies here demonstrate that the biochemical properties of isolated LIR peptides may not reflect their true properties in the context of the full-length protein or protein complexes, owing to factors such as accessibility.
Based on our EM data, the two LIR motifs are located near the junction between the hook and the shaft, close to the center of the protein. This suggests that hEPG5 likely binds GABARAP-decorated autophagosomes with its slightly concave shaft facing the autophagosome surface. In such a configuration, one could envisage that hEPG5 would tether the autophagosome to the lysosome by binding to RAB7 or other lysosomal proteins on its "back" (Fig. 7b). Alternatively, hEPG5 might exert its tethering function by working in conjunction with other factors, such as the HOPS complex, at the interface between the autophagosome and lysosome/late endosome 8,[62][63][64][65] .
Although an almost complete catalog of Vici syndrome mutations has been compiled from numerous clinical genetics studies, how these disease mutations affect the structural and biochemical properties of hEPG5 is not known. The system we have built for producing and biochemically and structurally characterizing recombinant hEPG5 can potentially fill a critical gap in investigating the molecular basis of Vici syndrome. We completed a proof-of-concept study by examining hEPG5 encoding the c. 1007 A > G, p. Q336R mutation, which is the most common of four recurrent Vici syndrome mutations reported to date. Patients carrying this mutation show milder symptoms than patients with other mutations, including the absence of cardiac malfunction and immunodeficiency 22,50,52 . Our findings that the Q336R missense mutation does not disrupt the overall architecture, thermal stability, or GABARAP-binding capability of hEPG5 appear consistent with these clinical observations. Recent mRNA analysis of a Vici syndrome patient carrying the c. 1007 A > G, p. Q336R mutation revealed that while alternative splicing caused by this mutation leads 75% of the transcribed mRNA to contain premature codon truncation and in-frame shift deletion, 25% of the transcribed mRNA is normally spliced product that would lead to the synthesis of full-length hEPG5 protein 50,52 . Further understanding of how the Q336R mutation causes Vici syndrome will require more in-depth investigation of how this mutation affects hEPG5 interaction with RAB7 and the SNARE complex, as well as the autophagosome-lysosome fusion event.
His-FLAG-hEPG5 expression and purification. Baculovirus containing the wild-type and mutant His-FLAG-hEPG5 constructs, generated using the Baculovirus Expression Vector System, were transfected into Sf9 cells at a density of 1.5-2.2 × 10⁶ cells/mL. Cells were harvested ~72 h after infection and stored at −70°C until use.
For purification, Sf9 cell pellets expressing wild-type or mutant His-FLAG-hEPG5 were resuspended in buffer A (50 mM Tris pH 8.0, 150 mM NaCl, 0.1% CHAPS, 1 mM phenylmethylsulfonyl fluoride [PMSF], and cOmplete ethylenediaminetetraacetic acid [EDTA]-free protease inhibitor). The cells were sonicated using a Branson Sonicator 450 for four cycles consisting of 20 s sonication followed by 40 s cooling on ice, with duty cycle set to 40% and output control at 4. The resulting cell lysate was centrifuged at 110,200 × g for 30 min at 4°C. The supernatant was then applied to an Anti-FLAG M2 affinity gel (Sigma-Aldrich) for batch binding and eluted with buffer B (50 mM Tris pH 8.0, 150 mM NaCl, and 0.01% CHAPS) containing 3× FLAG peptide. The eluate containing His-FLAG-hEPG5 was applied to the top of a 15-30% glycerol gradient in buffer B, prepared using a Gradient Station (BioComp, Fredericton). After ultracentrifugation at 38,500 r.p.m. for 17.5 h using a Beckman SW55 Ti rotor, fractionation was carried out. The fraction containing His-FLAG-hEPG5 was confirmed by SDS-PAGE and used to prepare negative stain grids.
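For readers comparing the ultracentrifugation speed (given in r.p.m.) with the × g values quoted for the other spins, the standard relation RCF = 1.118 × 10⁻⁵ × r × N² (r in cm, N in r.p.m.) converts between the two. The following is a minimal sketch only: the function names are hypothetical, and the 10 cm radius in the example is an arbitrary illustrative value, not the specification of the SW55 Ti rotor, which should be taken from the manufacturer's documentation.

```python
import math


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given rotor speed and radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be >= 0 and radius_cm must be > 0")
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (r.p.m.) needed to reach a given RCF at a given radius."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be >= 0 and radius_cm must be > 0")
    return math.sqrt(rcf / (1.118e-5 * radius_cm))


if __name__ == "__main__":
    # Assumed radius of 10 cm, chosen purely for illustration.
    print(round(rcf_from_rpm(38_500, 10.0)))
```

Because RCF scales with the square of the speed and linearly with the radius, the same r.p.m. gives noticeably different forces at the top and bottom of a swinging-bucket tube, which is why the radius used should always be stated.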
His-FLAG-hEPG5 and GST-GABARAP co-purification. His-FLAG-hEPG5 was purified using the above protocol but with two different buffers: lysis buffer C (50 mM sodium phosphate pH 7.4, 150 mM NaCl, 0.05% TWEEN 20, 5% glycerol, 1 mM PMSF, and cOmplete EDTA-free protease inhibitor) and elution buffer D, which has the same components as lysis buffer C except that it contains 0.01% TWEEN 20. The eluate containing His-FLAG-hEPG5 was applied to HisPur™ Ni-NTA resin (Thermo Scientific) and washed with buffer D. GST-GABARAP in buffer D was applied to the His-FLAG-hEPG5-bound Ni-NTA resin and gently agitated for 30 min. Unbound GST-GABARAP was washed away with buffer D. Following a 35 mM imidazole wash, GST-GABARAP-bound His-FLAG-hEPG5 was eluted with three applications of 150 mM imidazole in buffer D followed by two applications of 250 mM imidazole in buffer D.
LC3/GABARAP protein expression and purification. For GST pull-down analysis, N-terminally GST-tagged LC3 and GABARAP proteins were expressed in Escherichia coli (T7 Express) cells. The cells were induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 4 h at 25°C. The cell pellets were resuspended in 25 mL buffer E (50 mM Tris pH 8.0, 150 mM NaCl, and 1 mM PMSF). Cells were then sonicated for four cycles consisting of 1 min sonication followed by 2 min cooling on ice, with duty cycle set to 50% and output control at 5. Cell lysate was then centrifuged at 20,950 × g for 40 min at 4°C. Supernatant was incubated for 1 h at 4°C with glutathione resin (50% slurry, GenScript) pre-equilibrated with buffer E, with gentle inversion. Resin was returned to the column and washed with buffer E. Proteins were eluted with 10 mM reduced glutathione (GoldBio) in buffer F (50 mM Tris pH 8.0, 150 mM NaCl). Free glutathione was removed by dialysis in buffer F and protein concentration was measured by spectrophotometry (NanoDrop, Thermo Scientific). Glycerol was added to a final concentration of 5%, then solutions were aliquoted and stored at −70°C.
For ITC studies, N-terminally His-tagged LC3 and GABARAP proteins were expressed in E. coli (T7 Express) cells. The cells were induced with 1 mM IPTG for 4 h at 25°C. The cell pellets were resuspended, sonicated, and centrifuged, as described above with buffer G (50 mM Tris pH 7.0, 150 mM NaCl, and 2 mM PMSF). The supernatant was incubated with HisPur™ Ni-NTA resin (Thermo Scientific) pre-equilibrated with buffer G at 4°C for 1 h. The resin was washed five times with buffer H (50 mM Tris pH 7.0, 150 mM NaCl) and subsequently ten times with buffer H containing 30 mM imidazole. The bound proteins were eluted with buffer H containing 100 mM imidazole once, then 300 mM imidazole four times, and 500 mM imidazole once. Eluted proteins were loaded onto a HiPrep Q FF 16/10 column (GE Healthcare) in buffer I (50 mM Tris pH 7.0) with a 1% gradient of buffer J (50 mM Tris pH 7.0, 2 M NaCl). Target protein in the flow-through was collected, concentrated, and further purified by size-exclusion chromatography using a Sephacryl S-200 column (GE Healthcare) with buffer H.
For crystallography, N-terminally GST-tagged GABARAPL1 was expressed in E. coli (T7 Express) cells. The cells were induced with 1 mM IPTG for 4 h at 25°C. The cell pellets were resuspended and sonicated as described above in buffer K (40 mM HEPES pH 7.4, 150 mM NaCl, 2 mM PMSF, and 0.1% Triton X-100), followed by centrifugation at 30,966 × g for 40 min. The supernatant was incubated with glutathione resin (GenScript) pre-equilibrated with buffer K at 4°C for 1 h. The resin was washed six times with buffer L (40 mM HEPES pH 7.4, 150 mM NaCl). On-column GST-tag cleavage with PreScission protease in buffer L with 1 mM dithiothreitol (DTT) and 1 mM EDTA pH 8.0 was performed at room temperature for 2 h. Target protein in the flow-through and two washes was collected and further purified by reverse glutathione resin chromatography. Target protein was then concentrated and further purified by size-exclusion chromatography using a Sephacryl S-200 column (GE Healthcare), equilibrated and run in buffer F.
Fig. 7 Proposed model of hEPG5-mediated tethering in the terminal stage of autophagy. a Schematic diagram of the hEPG5 tandem LIR motifs binding to the GABARAP subfamily in a "two-factor authentication" step-wise binding model. LIR1 is initially recognized by GABARAP, possibly at a noncanonical site, with a subsequent local conformational change allowing LIR2 to become accessible for binding at the canonical site on GABARAP. b hEPG5 interacts with the autophagosome by binding to GABARAP on the concave side of its hook. The convex side of the hook and the flexible stalk may interact with the HOPS complex, SNARE proteins, RAB proteins, as well as the late endosome and lysosome to facilitate autophagosome-lysosome fusion.
Crystallization and data processing. hEPG5-LIR2 peptide was dissolved in buffer F. GABARAPL1 (10 mg/mL) was incubated with dissolved hEPG5-LIR2 peptide (4.4 mg/mL) at 4°C for 1 h prior to all crystallization trials. Crystals grew in a condition containing 0.2 M ammonium sulfate, 0.1 M MES pH 5.5, and 29% (w/v) PEG4000 in 1:3 protein to liquid ratio. Crystals were harvested and frozen in liquid nitrogen directly prior to data collection.
X-ray diffraction data were collected on Beamline 5.0.2 at the Advanced Light Source (ALS) and processed using DIALS 66 . The structure was solved by molecular replacement using PHASER 67 with the search model 2R2Q. Model building and refinement were performed using PHENIX 68 , CCP4 69 , and Coot 70 . Refer to Table 1 for data collection and refinement statistics, and Supplementary Fig. 3a, b for structural figure with probability ellipsoids.
Negative stain electron microscopy and image processing. Negative stain specimens were prepared as previously described 71 . In brief, each protein sample from the peak fraction of the glycerol gradient or from the imidazole elution was adsorbed onto a carbon-coated grid and stained with uranyl formate. Micrographs were collected at a nominal magnification of 49,000× and a defocus of 1-1.5 µm on a Tecnai Spirit transmission electron microscope (FEI) operating at an accelerating voltage of 120 kV and equipped with an FEI Eagle 4K charge-coupled device camera. For image processing of the five datasets (hEPG5, hEPG5-MBP, EPG5 Δ2079-2579 , hEPG5 in complex with GABARAP, hEPG5 Q336R; Supplementary Table 2), images were binned twice and particles were subsequently selected using Boxer 72 with a box size of 128 × 128 pixels. 2D classification of the selected particles was then carried out using Relion 1.4 73 . The 3D reconstruction of full-length wild-type hEPG5 was determined using the ab initio model function and further refined using cryoSPARC v2 (ref. 74 ). A final resolution of 21 Å was calculated using the gold standard method (Supplementary Fig. 4).
Pull-down assay. Pull-downs were performed with a 50 µL slurry of glutathione resin (50% slurry, GenScript). The resin was equilibrated with buffer N (50 mM Tris pH 8.0, 150 mM NaCl, 5% glycerol, 0.01% Tween 20, and 0.5 mM DTT) and the beads were incubated with equal amounts (200-240 µg) of purified GST, GST-LC3A/B/C, and GST-GABARAP/L1/L2 bait for 10 min at 4°C with gentle inversion. The tubes were centrifuged at 500 × g for 2 min, followed by removal of excess bait protein in the supernatant. The resin was then incubated with equal amounts (25-30 µg) of purified His-FLAG-hEPG5, truncated hEPG5, or its corresponding mutants (W550A, W567A, W550A/W567A, F794A, or Q336R). The tubes were once again centrifuged at 500 × g for 2 min, followed by removal of the supernatant. The resin was washed five times with 1 mL buffer N, with centrifugation at 500 × g for 2 min and removal of the excess buffer after each wash. After the final wash and removal of the supernatant, 50 µL 2× SDS loading dye was added to each tube. The tubes were then heated at 65°C for 10 min, loaded onto two 6-15% gradient SDS-PAGE gels, and stained with Coomassie Blue. For western blots, samples were transferred from a 6-15% gradient SDS-PAGE gel to a nitrocellulose membrane and blocked with Odyssey Blocking Buffer (LI-COR). Mouse anti-FLAG was used as the primary antibody (1:2000; Sigma-Aldrich) and donkey anti-mouse IRDye 680LT as the secondary antibody (1:7500; LI-COR) for visualization on a ChemiDoc Imager (Bio-Rad). Experiments were performed in triplicate.
Images were obtained in 3D by optical sectioning using an inverted Leica SP8 confocal laser scanning microscope equipped with a 63×/1.40 NA objective (oil immersion, HC PLAPO, CS2; Leica Microsystems). Imaging was conducted at ambient room temperature using a Leica HyD Hybrid Detector (Leica Microsystems) and the Leica Application Suite X (LASX v2.0.1) with a minimum z-stack range of 1.8 µm and a maximum voxel size of 90 nm laterally (x,y) and 300 nm axially (z). Representative images are displayed as z-stack maximum projections.
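A maximum projection collapses a z-stack into a single 2D image by keeping, for every (x, y) position, the brightest value found along z. The sketch below illustrates the operation on a toy stack using only the Python standard library; the function name and the nested-list layout `stack[z][y][x]` are assumptions made for illustration and do not describe the acquisition software used here.

```python
from typing import List, Sequence


def max_projection(stack: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Collapse stack[z][y][x] into image[y][x] by taking the maximum over z."""
    if not stack:
        raise ValueError("stack must contain at least one optical section")
    # zip(*stack) groups the rows that share a y index across all sections;
    # zip(*rows) then groups the pixels that share an x index within those rows.
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*stack)]


if __name__ == "__main__":
    # Toy stack: two optical sections of 2 x 2 pixels (hypothetical intensities).
    toy_stack = [
        [[1, 5], [0, 2]],
        [[3, 4], [7, 1]],
    ]
    print(max_projection(toy_stack))
```

A maximum projection favors bright punctate signal over dim diffuse signal, so it is suited to display rather than to intensity quantification.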
Differential scanning fluorimetry. Purified wild-type His-FLAG-hEPG5 and His-FLAG-hEPG5 Q336R at 0.4 mg/mL were mixed with 10× Sypro Orange dye in buffer F in a final volume of 25 µL. The fluorescence was measured in triplicate using the MiniOpticon Real-Time PCR system. Melting temperature was calculated from the maximum of the first derivative using Prism 7 (GraphPad).
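The first-derivative approach to estimating a melting temperature can be sketched as follows. This is an illustration of the general technique, not a reproduction of the Prism 7 analysis: the function names are hypothetical, the derivative is approximated by central differences without smoothing, and the sigmoidal curve is synthetic data generated only to exercise the code.

```python
import math
from typing import List, Sequence


def simulate_melt_curve(tm: float, temps: Sequence[float], width: float = 2.0) -> List[float]:
    """Hypothetical sigmoidal unfolding curve centered on tm (synthetic data)."""
    return [1.0 / (1.0 + math.exp(-(t - tm) / width)) for t in temps]


def melting_temperature(temps: Sequence[float], fluorescence: Sequence[float]) -> float:
    """Return the temperature at which dF/dT (central difference) is largest."""
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    best_temp, best_slope = temps[1], -math.inf
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


if __name__ == "__main__":
    temps = [25.0 + 0.5 * i for i in range(141)]  # 25 to 95 degrees C in 0.5 degree steps
    curve = simulate_melt_curve(50.0, temps)
    print(melting_temperature(temps, curve))
```

With real fluorescence traces, noise usually makes some smoothing or curve fitting necessary before differentiation, and the estimate is limited by the temperature step of the instrument.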
Statistics and reproducibility. All experiments were performed in triplicate. ITC data represent means or means ± SEM. Intensity of His-FLAG-hEPG5, GST (control), and GST-LC3/GABARAP subfamily proteins in pull-down assays was quantified using Bio-Rad Image Lab Software v6.0. Statistics were performed using Prism 7 (GraphPad).
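For reference, the mean ± SEM of a set of replicates is the arithmetic mean together with the sample standard deviation divided by the square root of the number of replicates. The short sketch below shows the calculation for a triplicate; the function name and the input values are hypothetical and are not data from this study.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate the SEM")
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical triplicate measurement, for illustration only.
    mean, sem = mean_sem([1.0, 2.0, 3.0])
    print(f"{mean:.2f} ± {sem:.2f}")
```

With only three replicates the SEM is itself a rough estimate, so it is best read as an indication of spread rather than a precise confidence bound.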