Introduction

Pancreatic cancer is well known to be a highly fatal disease1,2,3. Despite active basic and clinical research, no effective treatment is yet in sight. Recently, several solute carrier (SLC) transporters have been found to be involved in six crucial processes in pancreatic cancer that contribute to its very poor prognosis: (i) cancer cell proliferation, (ii) programmed cell death, (iii) invasion and metastasis, (iv) angiogenesis, (v) cellular metabolism and (vi) chemo-sensitivity. These SLC transporters move a broad range of key substrates into cells, including (a) amino acids, (b) nucleotides, (c) essential metal ions including zinc, magnesium, sodium, lithium, copper, other organic ions, NH4+ and HCO3−, (d) short chain fatty acids, (e) co-factors and vitamins, and (f) other organic ions and compounds. Some of these SLCs are involved in pancreatic cancer metastasis and angiogenesis; others are involved in proliferation and programmed cell death, as well as cellular metabolism and chemo-sensitivity (Table 1)1,2,3. Given these insights, these SLCs should be not only the focus of intense research but also targets for new therapeutic strategies to treat pancreatic cancer and perhaps other metastatic cancers1,2,3,4.

Table 1 The solute carrier transporters relevant in pancreatic cancer.

There are 46 distinct gene families of SLC transporters comprising 384 genes in the human genome2,5. Some members of these families have recently been found to be involved in several key aspects of cancer, including fatal pancreatic cancer (Table 1). One of the key characteristics of cancer is deregulation of cellular energetics, namely an insatiable demand for energy, including sugars and nutrients, met by upregulating transporters1,3,6. By effectively blocking the key transport logistics of cancer cells, we may be able to add another tool to treat cancers more effectively.

Some SLC transporters are involved in multiple cellular pathways in pancreatic cancer, and others in a single pathway. For example, several are involved in metastasis, programmed cell death, proliferation and chemo-sensitivity (Table 1), including SLC2A1 (also called GLUT1)7,8,9, SLC7A1110, SLC4A411, SLC1A512, SLC7A513,14 and SLC39A615,16. On the other hand, others are involved in only a single pathway (Table 1), including SLC39A317,18,19, SLC9A120,21,22, SLC6A1423, SLC4A724, SLC5A825,26 and SLC41A127. The motivation of this study is to help discover new drugs and generate therapeutic monoclonal antibodies by targeting the transporters that are involved in multiple pathways.

SLC transporters have multiple transmembrane (TM) helices. Most have 10TM-12TM helices depending on their transporting substrates with the longest having 13TM (SLC5A8 for short chain fatty acids). Several of these SLC transporters have large N-termini and extracellular loops (Figs. 1, 2) of ~ 30–60 amino acids. Those with large extracellular loops are more likely to be better targets for generating therapeutic monoclonal antibodies since these extracellular loops may stimulate robust immune responses during immunizations.

Figure 1

Protein alignments of six solute carrier transporters with their QTY variants. The symbols | and * indicate identical and different amino acids, respectively. Please note the Q, T and Y amino acid replacements (red). The alpha-helices (blue) are shown above the protein sequences. The loop color codes are: internal (yellow) and external (red). Features of natural and QTY variants are: molecular weight, pI, total variation % and transmembrane variation %. The alignments are: (a) SLC2A1 versus SLC2A1QTY, (b) SLC7A11 versus SLC7A11QTY, (c) SLC4A4 versus SLC4A4QTY, (d) SLC1A5 versus SLC1A5QTY, (e) SLC7A5 versus SLC7A5QTY, (f) SLC29A1 versus SLC29A1QTY. Although the TM alpha-helices carry substantial QTY substitutions (> 45% for SLC29A1, > 50% for SLC7A11), the changes in molecular weight and pI are insignificant (Table 2).

Figure 2

Protein alignment of seven native solute carrier transporters with their water-soluble QTY variants. All characteristics and color codes are as described in Fig. 1. The alignments are: (a) SLC39A6 versus SLC39A6QTY, (b) SLC39A3 versus SLC39A3QTY, (c) SLC9A1 versus SLC9A1QTY, (d) SLC6A14 versus SLC6A14QTY, (e) SLC4A7 versus SLC4A7QTY, (f) SLC5A8 versus SLC5A8QTY, (g) SLC41A1 versus SLC41A1QTY (see Tables 2, 3).

On 15 July 2021, Google DeepMind announced AlphaFold2, and at the same time David Baker’s lab introduced RoseTTAFold, both revolutionary machine learning tools for highly accurate prediction of protein structures28,29,30. AlphaFold2 and, to a lesser extent, RoseTTAFold have already made an enormous impact on our understanding of 350,000 protein structures. On 28 July 2022, DeepMind released 214 million predicted protein structures, covering nearly all known proteins. AlphaFold2 predicts the structures with very high accuracy for 35% of all protein structures (~ 75 million), and with high confidence for another 45% (~ 96.3 million). AlphaFold2 has truly started a new era of digital biology. Nevertheless, academic investigators and pharmaceutical and biotech companies must still ultimately study the physical structures of proteins, including SLC membrane transporters, since the structures are vital to understanding how substrates including sugars, amino acids, essential ions, organic molecules, drugs, or other essential nutrients are transported across highly regulated and controlled cell membranes.

We previously applied the QTY (Glutamine, Threonine, Tyrosine) code to design several detergent-free transmembrane (TM) protein chemokine receptors and cytokine receptors for various uses using conventional computing programs to simulate several G protein-coupled receptors. Each took ~ 5 weeks to complete the simulation33,34,35. The expressed and purified water-soluble proteins exhibited predicted characteristics and retained ligand-binding activity31,32,33,34,35. In July 2021, we prepared QTY variant protein structure predictions using AlphaFold2, achieving better results in 1–2 hours36,37, rather than ~ 5 weeks for each molecular simulation using GOMoDo, AMBER and YASARA programs31,32,33. We also produced a program and website for generating the membrane protein water-soluble QTY variants38.

Here, we report using AlphaFold2 to design water-soluble QTY variants of the 13 solute carrier transporters, and to directly compare with their counterpart native structures. In addition to targeting the key nutrients and ion uptake activity of cancer cells, these QTY variant water-soluble SLC transporters can prospectively find many additional applications. Working with water-soluble QTY variants may substantially accelerate the discovery and development of therapeutic and diagnostic biologicals.

Results and discussions

Protein sequence alignments and other characteristics

We aligned the native SLC transporters with their QTY variants. Despite significant QTY replacement of hydrophobic residues in the transmembrane domains (~ 44–55%) in the SLC transporters, the isoelectric point (pI) and molecular weight remain rather similar (Figs. 1 and 2, Table 2). This is because the Q, T and Y amino acids do not introduce any charges; they only introduce water-soluble side chains. The Q (glutamine) side chain can form 4 hydrogen bonds with water, 2 as donor through –NH2 and 2 as acceptor through the oxygen of –C=O; the side-chain –OH of T (threonine) and Y (tyrosine) can form 3 hydrogen bonds with water, 1 as donor from H (hydrogen) and 2 as acceptor from O (oxygen).

Table 2 Characteristics of native solute carrier transporters and their water-soluble QTY variants.

Since the electron density maps of leucine (L) and glutamine (Q); of isoleucine (I), valine (V) and threonine (T); and of phenylalanine (F) and tyrosine (Y) share remarkable structural similarities, the QTY code selects 3 neutral polar amino acids, glutamine, threonine and tyrosine, to replace 4 hydrophobic amino acids: leucine, isoleucine, valine and phenylalanine. After applying the QTY code, the hydrophobic amino acids in the transmembrane segments are replaced by Q, T and Y, so the transmembrane segments have significantly reduced hydrophobicity. For example, SLC39A3 and SLC29A1 differ by > 54% and > 45%, respectively, from their water-soluble QTY variants in their transmembrane alpha-helical segments (Figs. 1, 2, Table 2).
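The substitution rule described above can be illustrated with a minimal Python sketch. The mapping (L to Q, I and V to T, F to Y) is taken from the text; the function names, the toy sequence and the transmembrane boundaries are hypothetical, and the sketch assumes every L, I, V and F inside a given segment is replaced, which may be more aggressive than the design actually used for the 13 transporters.

```python
# QTY mapping from the text: L -> Q, I -> T, V -> T, F -> Y.
QTY_MAP = str.maketrans({"L": "Q", "I": "T", "V": "T", "F": "Y"})


def apply_qty(sequence, tm_segments):
    """Apply the QTY mapping inside transmembrane segments only.

    tm_segments is a list of (start, end) pairs, 0-based with end exclusive.
    The boundaries are an input here (an assumption of this sketch); residues
    outside the segments are left unchanged.
    """
    residues = list(sequence.upper())
    for start, end in tm_segments:
        residues[start:end] = "".join(residues[start:end]).translate(QTY_MAP)
    return "".join(residues)


def percent_changed(native, variant, segments=None):
    """Percentage of differing positions, over the whole chain or within segments."""
    if len(native) != len(variant):
        raise ValueError("sequences must have the same length")
    if segments is None:
        segments = [(0, len(native))]
    positions = [i for start, end in segments for i in range(start, end)]
    changed = sum(native[i] != variant[i] for i in positions)
    return 100.0 * changed / len(positions)


# Toy example (not a real transporter): one 10-residue "TM helix" at 3..13.
toy_native = "MKSLIVFAGLLIVDEK"
toy_tm = [(3, 13)]
toy_variant = apply_qty(toy_native, toy_tm)
```

For the toy sequence, `percent_changed(toy_native, toy_variant, toy_tm)` gives the transmembrane variation and `percent_changed(toy_native, toy_variant)` the total variation, the two percentages reported per transporter in Table 2.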

Other characteristics are also notable. The pIs (isoelectric points) vary, some in the acidic and some in the basic range. For example, native SLC7A11 has a basic pI of 9.29. On the other hand, SLC1A5 has an acidic pI of 5.34. Others, including SLC9A1, have a near neutral pI of 6.74 (Table 2). Notably, the pIs are identical for the native and QTY variants of SLC4A4 (pI 6.35), SLC39A3 (pI 6.39), SLC4A7 (pI 6.26), and SLC41A1 (pI 5.11) despite the large number of QTY substitutions. The reason is that glutamine, threonine and tyrosine (Q, T, Y) carry neither positive nor negative charges at neutral pH. Therefore, substitutions with Q, T and Y do not change the pIs. This is significant because altered pIs could cause non-specific interactions.

Moreover, while there are between > 45% and > 54% QTY substitutions in the transmembrane helices, the molecular weights of the native and QTY variants differ by only a few hundred Daltons. This is due to (a) the substitution of CH3- groups on Leu and Val by -OH groups in Gln (Q) and Thr (T), and (b) the addition of an -OH group on Tyr (Y). These changes increase the protein molecular weights (Figs. 1, 2, Table 2).
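The size of the molecular weight change can be checked with simple arithmetic. The sketch below uses standard average residue masses (in Da) rather than values from this study, and the function name is hypothetical. Under these values, L to Q, V to T and F to Y each add mass while I to T removes about 12 Da, so the net shift for a given protein depends on its composition and stays small relative to the total mass.

```python
# Standard average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "L": 113.1594, "I": 113.1594, "V": 99.1326, "F": 147.1766,
    "Q": 128.1307, "T": 101.1051, "Y": 163.1760,
}

# QTY substitutions from the text.
QTY_PAIRS = {"L": "Q", "I": "T", "V": "T", "F": "Y"}

# Mass change in Da for each single substitution.
DELTA = {old: RESIDUE_MASS[new] - RESIDUE_MASS[old] for old, new in QTY_PAIRS.items()}


def mass_shift(native, variant):
    """Total mass change (Da) between a native sequence and its QTY variant.

    Only QTY substitutions are accepted; any other difference raises ValueError.
    """
    if len(native) != len(variant):
        raise ValueError("sequences must have the same length")
    total = 0.0
    for old, new in zip(native, variant):
        if old == new:
            continue
        if QTY_PAIRS.get(old) != new:
            raise ValueError(f"not a QTY substitution: {old} -> {new}")
        total += DELTA[old]
    return total
```

For instance, a protein with 100 L to Q, 40 I to T, 40 V to T and 30 F to Y substitutions would shift by roughly 100 × 15.0 − 40 × 12.1 + 40 × 2.0 + 30 × 16.0 ≈ 1574 Da under these assumptions; the counts are illustrative only.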

Superposition of native transporters and their water-soluble QTY variants

In our current study, the native SLC structures determined by X-ray crystallography or CryoEM were superposed on and compared with their QTY variants. The molecular structures of native SLC transporters are already available for SLC2A1 (PDB: 6THA)39, SLC7A11 (PDB: 7PV9)40, SLC4A4 (PDB: 6CAA)41, SLC1A5 (PDB: 5LMM)42, SLC7A5 (PDB: 6IRS)43, SLC29A1 (PDB: 6OB6)44. Superpositions were performed for: SLC2A1Crystal versus SLC2A1QTY, SLC7A11CryoEM versus SLC7A11QTY, SLC4A4CryoEM versus SLC4A4QTY, SLC1A5Crystal versus SLC1A5QTY, SLC7A5CryoEM versus SLC7A5QTY and SLC29A1Crystal versus SLC29A1QTY (Tables 2 and 3).

Table 3 RMSD between native solute carrier transporters, their water-soluble QTY variants, and crystal structures.

The experimentally determined native structures and their in silico water-soluble QTY variants superposed within a few Å. Their RMSDs are as follows: SLC2A1 versus SLC2A1QTY (2.281 Å); SLC7A11 versus SLC7A11QTY (0.933 Å); SLC4A4 versus SLC4A4QTY (0.445 Å); SLC1A5 versus SLC1A5QTY (0.965 Å); SLC7A5 versus SLC7A5QTY (1.815 Å); and SLC29A1 versus SLC29A1QTY (1.512 Å) (Fig. 3, Table 2). As Fig. 3 shows, the experimentally determined and AlphaFold2-predicted structures visibly superpose very well. These results show that despite > 45% QTY substitutions in the transmembrane alpha-helices, the water-soluble QTY variants share rather similar 3-dimensional folds with the native structures. The close superpositions also suggest that the AlphaFold2 predictions are highly accurate, since the predicted structures are directly superposed with the experimentally determined structures. Our AlphaFold2-predicted structures thus show significant structural similarity between the native SLC transporters and their water-soluble QTY variants.
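For reference, the RMSD (root-mean-square deviation) quoted for each pair is the square root of the mean squared distance between matched atoms. The sketch below is a minimal illustration, not the procedure used in this study: it assumes the two structures have already been superposed (for example in PyMOL) and that matched atom coordinates, such as C-alpha positions, are supplied as equal-length lists. The function name and inputs are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. angstroms).

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples for
    matched atoms. No superposition is performed here; the coordinates are
    assumed to be in the same frame already.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))
```

Because the value is an average over all matched atoms, a low RMSD such as 0.445 Å indicates that the backbone traces nearly coincide, whereas a value above 2 Å can still correspond to the same overall fold with local deviations.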

Figure 3

Superposed 6 solute carrier transporter crystal or CryoEM structures with QTY variants predicted by AlphaFold2. The X-ray crystal or CryoEM structures of the native transporters are obtained from the Protein Data Bank (PDB). The crystal or CryoEM structures (magenta) are superposed with QTY variants (cyan) predicted by AlphaFold2. The RMSD (Å) for each structure is in Table 2. (a) SLC2A1Crystal versus SLC2A1QTY (2.281 Å), (b) SLC7A11CryoEM versus SLC7A11QTY (0.933 Å), (c) SLC4A4CryoEM versus SLC4A4QTY (0.445 Å), (d) SLC1A5Crystal versus SLC1A5QTY (0.965 Å), (e) SLC7A5CryoEM versus SLC7A5QTY (1.815 Å), (f) SLC29A1Crystal versus SLC29A1QTY (1.512 Å). These superposed structures display that the crystal or CryoEM structures and their QTY variants have very similar molecular structures (Tables 2, 3). For clarity of direct comparisons, the N-terminus and C-terminus are deleted.

Because the X-ray crystal or CryoEM structures of the other 7 native SLC transporters, SLC39A6, SLC39A3, SLC9A1, SLC6A14, SLC4A7, SLC5A8 and SLC41A1, are not yet available, AlphaFold2 was used for their structural predictions. The RMSDs (root-mean-square deviations, in Å) for these superposed structures are shown (Fig. 4, Table 2): SLC39A6 versus SLC39A6QTY (0.931 Å); SLC39A3 versus SLC39A3QTY (1.567 Å); SLC9A1 versus SLC9A1QTY (0.905 Å); SLC6A14 versus SLC6A14QTY (0.521 Å); SLC4A7 versus SLC4A7QTY (0.426 Å); SLC5A8 versus SLC5A8QTY (1.593 Å); and SLC41A1 versus SLC41A1QTY (1.914 Å). The AlphaFold2-predicted structures of the native SLCs and their water-soluble QTY variants superpose very well, implying that they share comparable structures despite the significant substitutions in the transmembrane alpha-helices (45–54%).

Figure 4

Superposed 7 native solute carrier transporters and their QTY variants, both predicted by AlphaFold2. The native structures are shown in green and their water-soluble QTY variants in cyan. For the superposed structures, the RMSD in Å is given in parentheses. (a) SLC39A6 versus SLC39A6QTY (0.931 Å), (b) SLC39A3 versus SLC39A3QTY (1.567 Å), (c) SLC9A1 versus SLC9A1QTY (0.905 Å), (d) SLC6A14 versus SLC6A14QTY (0.521 Å), (e) SLC4A7 versus SLC4A7QTY (0.426 Å), (f) SLC5A8 versus SLC5A8QTY (1.593 Å), (g) SLC41A1 versus SLC41A1QTY (1.914 Å). Please see Tables 2 and 3. For clarity, the large N- and C-termini and large loops are deleted.

Analysis of the hydrophobic surface of native transporters and the water-soluble QTY variants

It is known that native SLC transporters have a high content of hydrophobic residues, especially in the transmembrane alpha-helical segments. They are inherently insoluble in water and require surfactants to solubilize and stabilize them. Without carefully selected surfactants, these natural transporters quickly self-associate, form unstructured aggregates, precipitate, and lose biological function.

In the native SLC transporters, the 6TM-12TM alpha-helices are directly embedded in the hydrophobic lipid bilayer. The hydrophobic side chains of phenylalanine, isoleucine, leucine, and valine interact with the hydrophobic lipid bilayers. Thus, the 6TM-12TM alpha-helices exhibit water-repelling hydrophobic surfaces (Figs. 5, 6).

Figure 5

Hydrophobic surface of 6 crystal structures of solute carrier transporters and the designed QTY variants. The native solute carrier transporters have many hydrophobic residues L, I, V and F in the transmembrane helices. After Q, T, and Y replacement of the L, I, V, F, the surfaces are much more hydrophilic. The hydrophobic surface (brownish) of the native transporters becomes more cyan: (a) SLC2A1 versus SLC2A1QTY, (b) SLC7A11 versus SLC7A11QTY, (c) SLC4A4 versus SLC4A4QTY, (d) SLC1A5 versus SLC1A5QTY, (e) SLC7A5 versus SLC7A5QTY, (f) SLC29A1 versus SLC29A1QTY. The hydrophobic surface is largely reduced on the transmembrane helices of the QTY variants, which are thereby converted to a water-soluble form. For clarity, the N-terminus and C-terminus are deleted.

Figure 6

Hydrophobic surface of seven AlphaFold2-predicted native solute carrier transporters and their QTY variants. Each pair shows the AlphaFold2-predicted native transporter with its hydrophobic surface (brownish color) and the QTY variant transporter (cyan color). (a) SLC39A6 versus SLC39A6QTY, (b) SLC39A3 versus SLC39A3QTY, (c) SLC9A1 versus SLC9A1QTY, (d) SLC6A14 versus SLC6A14QTY, (e) SLC4A7 versus SLC4A7QTY, (f) SLC5A8 versus SLC5A8QTY, (g) SLC41A1 versus SLC41A1QTY. For clarity, the large N-terminus and C-terminus are deleted.

After QTY substitution of the hydrophobic amino acids L, I, V and F with the hydrophilic amino acids Q, T and Y, the hydrophobic surfaces are decreased (Figs. 5, 6). The QTY code changes the hydrophobic 6TM-12TM helices into hydrophilic 6TM-12TM helices without significantly changing the alpha-helical molecular structures, as shown in Fig. 3. Analogous, reproducible experimental results were reported for chemokine and cytokine receptors in our previous publications31,32,33,34,35. Those experimental results showed that the structural integrity, stability, and ligand-binding activities were retained in the water-soluble QTY-variant chemokine receptors and cytokine receptors31,32,33,34,35.
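The drop in hydrophobicity can also be expressed numerically. The following sketch computes the mean Kyte-Doolittle hydropathy of a sequence before and after QTY substitution. The Kyte-Doolittle scale is used here only as a widely known illustrative choice and is an assumption of this sketch, as are the function name and the toy segment; the surface renderings in Figs. 5 and 6 were produced with UCSF Chimera.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average Kyte-Doolittle hydropathy over all residues of a sequence."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence.upper()) / len(sequence)


# Toy transmembrane stretch and its QTY counterpart (not a real transporter).
native_tm = "LIVFAGLLIV"
qty_tm = "QTTYAGQQTT"
native_score = mean_hydropathy(native_tm)
qty_score = mean_hydropathy(qty_tm)
```

On this scale each of the four substitutions lowers the hydropathy value (for example L at 3.8 to Q at −3.5), so a segment rich in L, I, V and F moves from a strongly positive mean to a negative one, consistent with the color shift from brownish to cyan in the surface figures.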

There are three chemically distinct alpha-helix types. Type I: the water-soluble hydrophilic alpha-helix, common in water-soluble enzymes in the cellular cytosol and in extracellular circulating proteins including antibodies, protein and peptide hormones and more; Type II: the water-insoluble hydrophobic alpha-helix, common in transmembrane proteins including hormone receptors, transporters, ion channels, G protein-coupled receptors and photosynthesis systems; and Type III: the amphiphilic alpha-helix, which, like Janus, has a hydrophobic face on one side and a hydrophilic face on the other. These three chemically distinct alpha-helical types have very similar structures, regardless of their hydrophobicity or hydrophilicity45,46,47,48,49. This is the molecular foundation of the QTY code.

AlphaFold2 predictions

For over 65 years, scientists have made great efforts to predict protein folding. It was the dream of structural biologists and protein scientists to predict protein folding rapidly and accurately. With the advent of AlphaFold2 through machine learning, a tool is now available to all scientists, almost free of charge, to predict protein structures. We can now study previously unattainable protein structures, particularly membrane-embedded transmembrane proteins.

Systematic bioinformatic studies revealed that in most organisms, ~ 20–30% of genes code for membrane proteins50. It is known that ~ 24% of the human genome codes for membrane proteins51. But structural determination of a single transmembrane protein is an extremely difficult process, traditionally requiring years or even decades of endeavor. There are many barriers, from gene expression, protein production, detergent selection, purification and detergent exchange to maintaining long-term stability, integrity and functionality while avoiding irreversible aggregation. The number of integral transmembrane protein structures lags far behind that of water-soluble proteins. Recently, several groups have systematically analyzed the known structures, covering at least 16 different folds among ~ 400 members in 65 families of solute carrier transporters52,53. These studies provide further insights into the molecular structures and functions of these transporters.

Applying AlphaFold2 accurate protein structure predictions, we can directly compare the native structure with an AlphaFold2 predicted water-soluble QTY variant. With the QTY variant, expressed in various cells, it becomes possible to overcome the high barriers of studying membrane embedded transmembrane proteins.

We previously reported36,37 use of the AlphaFold2 tool to predict structures of the water-soluble variants of G protein coupled receptor chemokine receptors and glucose transporters, and compared them to the known experimentally-determined crystal or CryoEM structures.

One question concerns the utility of these water-soluble transporters, which can no longer transport molecules across the membrane because the water-soluble QTY variants can no longer insert themselves into the lipid bilayer. It is plausible that after these transporters are rendered water-soluble, they can be used (1) as water-soluble antigens to generate useful monoclonal antibodies in animals, since they have many extracellular loops including a few large loops, and (2) to produce anti-SLC antibodies that could be useful as research reagents in assay systems to study the transporters in tissue cell cultures and in vivo. These specific monoclonal antibodies could perhaps also be useful (a) as therapeutics to treat diseases including pancreatic cancer, (b) as diagnostic reagents to monitor cancer treatments and perhaps (c) for early pancreatic cancer detection.

Our current study using AlphaFold2 demonstrates that the water-soluble QTY-variant structures of SLC transporters are substantially similar to the native structures. AlphaFold2 is a very useful tool for predicting the structures of other membrane-embedded transmembrane proteins. QTY is a useful approach for working with difficult-to-study hydrophobic proteins. The water-soluble QTY variants of SLC transporters could be used not only for designs of molecular machines, but also as water-soluble antigens for generating therapeutic monoclonal antibodies and for accelerating drug discovery.

Methods

Protein sequence alignments and other characteristics

The native protein sequences for SLC transporters and their QTY-variant sequences are aligned using the same methods previously described33,34. The website Expasy (https://web.expasy.org/compute_pi/) was used to calculate the molecular weights (MW) and pI values of the proteins.

AlphaFold2 predictions

The AlphaFold2 program28,30 (https://github.com/sokrypton/ColabFold) was used for the structure predictions of the QTY variants, following the instructions at the website, on 2 × 20 Intel Xeon Gold 6248 cores, 384 GB RAM, and an Nvidia Volta V100 GPU. The European Bioinformatics Institute (EBI, https://alphafold.ebi.ac.uk) hosts all AlphaFold2-predicted structures. The UniProt website (https://www.uniprot.org) provides each protein ID, entry name, description, and FASTA sequence. The data were retrieved from UniProt using custom Python code. The QTY method website (https://pss.sjtu.edu.cn/) can convert the FASTA protein sequences into their water-soluble versions. These steps were optimized using Python libraries for web applications such as requests and splinter.
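As an illustration of the sequence retrieval step, the sketch below downloads a FASTA record from the UniProt REST service and parses it. It is not the custom code used in this study: it substitutes the standard-library urllib for the requests library mentioned above, and the function names are hypothetical. The URL pattern is assumed to follow the current UniProt REST layout, which may change.

```python
from urllib.request import urlopen

# Assumed UniProt REST pattern for a single-entry FASTA download.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fetch_fasta(accession, timeout=30):
    """Download the FASTA text for one UniProt accession (needs network access)."""
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA string."""
    header = None
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # stop at the start of a second record
            header = line[1:]
        elif header is not None:
            parts.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(parts)
```

A call such as `parse_fasta(fetch_fasta("P11166"))`, where P11166 is the UniProt accession for human SLC2A1 (GLUT1), would return the header and the native sequence, which could then be submitted to the QTY design website or to a structure prediction run.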

Superposed structures

The molecular structures were taken from the PDB (https://www.rcsb.org). They include SLC2A1 (PDB: 6THA)39, SLC7A11 (PDB: 7P9U)40, SLC4A4 (PDB: 6CAA)41, SLC1A5 (PDB: 5LMM)42, SLC7A5 (PDB: 6IRS)43, SLC29A1 (PDB: 6OB6)44. AlphaFold2 predictions of the 7 remaining native SLC transporters and their QTY variants were carried out using the AlphaFold2 program at https://github.com/sokrypton/ColabFold. UniProt (https://www.uniprot.org) is the source for all 13 SLC transporter protein sequences, and AlphaFold2 was used to predict the QTY variant structures. These structures were superposed using PyMOL (https://pymol.org/2/).

Structure visualization

Two key programs were used for structure visualization: PyMOL (https://pymol.org/2/) and UCSF Chimera (https://www.rbvi.ucsf.edu/chimera/). PyMOL was used for the superposed models, while the hydrophobicity models were rendered using Chimera.

Ethical approval

(1) All methods were carried out in accordance with relevant guidelines and regulations. (2) All experimental protocols were approved by a named institutional and licensing committee. (3) Neither human biological samples, nor human subjects were used in the study. This is a completely digital structural biology study using the publicly available AlphaFold2 machine learning program.