Introduction

GPCRs represent the largest superfamily and most diverse group of mammalian transmembrane proteins. The defining feature of these proteins is a common seven-transmembrane (7TM) configuration. GPCRs have attracted a great deal of interest owing to their numerous physiological and pathological roles: on binding a broad range of ligands, including proteins1, peptides2, organic compounds3, 4, and eicosanoids5, they activate heterotrimeric G proteins and thereby transduce extracellular signals into intracellular effector pathways. This makes GPCRs and their signal transduction pathways important specific targets for a variety of physiological functions and therapeutic approaches, ranging from the control of blood pressure, allergic response, kidney function, hormonal disorders, and neurological diseases to the progression of cancer6. Owing to the features of GPCR structure and function, approximately 36% of currently marketed drugs target human GPCRs7. GPCRs therefore have huge potential in biomedical research and drug development.

Human GPCRs can be divided into five main families on the basis of phylogenetic criteria: Glutamate, Rhodopsin, Adhesion, Frizzled/Taste2, and Secretin8. Among the five GPCR families, Rhodopsin is the most studied and comprises the largest group of GPCRs. Notably, in recent years, members of the leucine-rich repeat-containing G-protein coupled receptor (LGR) subfamily, part of the Rhodopsin family, have been shown in knockout mouse studies to have important physiological functions, especially LGR4 and LGR5. Olfactory receptors are also members of the Rhodopsin family of GPCRs and are mainly expressed in sensory neurons of the olfactory system. These form a multigene family. The PSGR subfamily belongs to the olfactory receptor group. This subfamily has restricted expression in human prostate tissues and is upregulated in prostate cancer. The second largest GPCR family, with 33 members, is the Adhesion family. This family is distinctive because of its members' structures, which feature long N-termini containing adhesion domains8. Limited studies have shown that Adhesion GPCRs are involved in the signaling of cell adhesion, motility, embryonic development, and the immune system. There are still GPCRs for which the natural ligands remain to be identified. These are called orphan GPCRs.

LGRs and PSGRs belong to the Rhodopsin family and are classical GPCRs in structure and signal transduction. Adhesion GPCRs, on the other hand, are novel, and their structures and signal transduction are distinct from those of the classical GPCRs. In this review, we focus our discussion on the LGR subfamily, the PSGR subfamily, and the Adhesion GPCR family. We also discuss current screening systems for the deorphanization and characterization of orphan GPCRs.

Orphan GPCRs

The first GPCR to be identified was rhodopsin in 1878. It was later proven that rhodopsin consists of the GPCR protein opsin and a reversibly covalently bound cofactor, retinal9, 10. After completion of the human genome sequence in 200411, 12, the number of human GPCRs increased to about 800 on the basis of screening approaches such as low-stringency hybridization13, PCR-derived methods14, and bioinformatic analyses15. Apart from the olfactory receptor family, more than 140 GPCRs have not yet been linked to endogenous ligands. These are the so-called orphan GPCRs (Figure 1)16.

Figure 1

Percentage of orphan GPCRs in the GPCR superfamily. GPCRs constitute a large transmembrane family of more than 800 members. Among them, 6% are utilized as drug targets in clinical applications, and 30% are natural ligand receptors. However, 49% are olfactory receptors (most of which are orphan GPCRs), and 15% are orphan GPCRs. (Data were summarized from a review paper122)
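To give a sense of scale, the percentages in the Figure 1 caption can be converted into approximate receptor counts. The following is a minimal sketch only: the caption states "more than 800" members, so the total of 800 used here is an assumed round lower bound, and the names `FIGURE1_SHARES`, `ASSUMED_TOTAL`, and `approximate_counts` are illustrative and do not come from the source.

```python
# Shares of the GPCR superfamily as stated in the Figure 1 caption (percent).
FIGURE1_SHARES = {
    "drug targets": 6,
    "natural ligand receptors": 30,
    "olfactory receptors": 49,
    "orphan GPCRs": 15,
}

# Assumption: the caption says "more than 800" members, so 800 is used
# here only as a round lower bound for the superfamily size.
ASSUMED_TOTAL = 800


def approximate_counts(shares, total):
    """Convert percentage shares into approximate receptor counts."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


if __name__ == "__main__":
    # The four categories in the caption should account for the whole family.
    assert sum(FIGURE1_SHARES.values()) == 100
    for name, count in approximate_counts(FIGURE1_SHARES, ASSUMED_TOTAL).items():
        print(f"{name}: ~{count}")
```

Because the true total exceeds 800, these counts should be read as rough lower-bound estimates rather than exact figures.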

Biological functions of the LGR subfamily

LGRs 4–8 are members of the Rhodopsin GPCR family and can be divided into two groups, LGRs 4, 5, and 6 and LGRs 7 and 8, in terms of their natural ligands. R-spondins have recently been identified as the ligands for LGRs 4, 5, and 617. LGRs 7 and 8 are relaxin family peptide (RXFP) receptors18. In terms of sequence similarity, LGRs 4, 5, and 6 are closely related to each other, showing almost 50% identity. The three orphan receptors have a substantially large N-terminal extracellular domain (ECD) composed of 17 leucine-rich repeats (LRR) (Figure 2)19. Lgr4, also known as Gpr48, has been shown through the generation of knockout mice to have many physiological functions. The loss of Lgr4 results in developmental defects in many areas, including intrauterine growth retardation associated with embryonic and perinatal lethality20, abnormal renal development21, defective postnatal development of the male reproductive tract22, ocular anterior segment dysgenesis23, bone formation and remodeling dysfunction6, impaired hair placode formation24, and defective development of the gall bladder and cystic ducts25. Lgr5 has been proven to be a marker of gastrointestinal tract and hair follicle stem cells26, 27. Knockout of Lgr5 in mice leads to total neonatal lethality accompanied by ankyloglossia and gastrointestinal distension28. Lgr6 has also been shown to be a stem cell marker in hair follicles, and Lgr6-positive stem cells have been found to produce all cell lineages of the skin29. LGR4 and LGR5 are also highly expressed in several types of cancers. LGR5 is up-regulated in human colon and ovarian tumors and promotes cell proliferation and tumor formation in basal cell carcinoma30, 31. Overexpression of LGR4 enhances cervical and colon cancer cell invasiveness and metastasis32. 
However, despite their critical function in development and cancer, LGR4 and LGR5 will still be considered orphan receptors until the R-spondins, which have been reported to function as their natural ligands, are proven to regulate the Wnt/β-catenin signaling pathway through them. Some observations in Lgr4 and Lgr5 knockout mice are strongly relevant to Wnt/β-catenin signaling33, 34. This suggests that LGR4 and LGR5 could be involved in the Wnt pathway. One group reported that R-spondin-Lgr4 induced signal transduction in a manner independent of G proteins17. However, two independent groups have reported, by generating constitutively active forms of Lgr4/Gpr48, that Lgr4/Gpr48 is associated with the Gαs-cAMP pathway23, 35. Therefore, whether endogenous ligands exist that activate classical G-protein coupled signaling pathways through Lgr4/Gpr48 is still an open question. LGR7 and LGR8 share 54% identity. In addition to 10 LRR motifs, LGR7 and LGR8 also have an LDL class A (LDLa) motif at the N-terminus, which is an important domain for signal transduction (Figure 2). Traditionally, relaxin/LGR7 has been thought to be a hormone-receptor pair for pregnancy and parturition18. Recently, it has been reported that relaxin/LGR7 also has significant functions in non-reproductive organs, such as the heart, and even plays a role in cancer growth and metastasis36. Insulin-like peptide 3 (INSL3), which is a ligand of LGR8, is highly expressed in the Leydig cells of the testis, and knocking out Insl3 in mice generates a cryptorchid phenotype. However, reports have been conflicting with respect to LGR8 mutations related to human cryptorchidism18. The role of INSL3 in adult human males is still not clear.

Figure 2

LGR subfamily GPCRs. The Type A LGRs include the follicle-stimulating hormone receptor (FSHR), the luteinizing hormone receptor (LHR), and the thyroid-stimulating hormone receptor (TSHR). The Type B LGRs comprise three members, Gpr48/LGR4, LGR5, and LGR6, which remain orphan GPCRs at the present time. By contrast, Type C LGRs have only two members, LGR7 and LGR8, which have been demonstrated to be the relaxin family receptors. Type A contains 9 LRRs in the ectodomain, whereas Type B contains 17 LRRs. Type C has an N-terminal LDL receptor-like cysteine-rich domain not found in other LGRs. 7TM, seven-transmembrane; LDL, low-density lipoprotein; LRR, leucine-rich repeat; LGR, leucine-rich repeat-containing G-protein-coupled receptor; FSHR, follicle-stimulating hormone receptor; LHR, luteinizing hormone receptor; TSHR, thyroid-stimulating hormone receptor.

PSGR subfamily in prostate cancer

Mammalian olfactory receptors, which are members of the Rhodopsin family of GPCRs and are mainly expressed in sensory neurons of the olfactory epithelium in the nose, are used to sense the chemical environment37. Recently, some olfactory receptors have also been found in other organs. For example, MOR23 is expressed both in the olfactory epithelium and in sperm and functions as a chemosensing receptor during sperm-egg communication, thereby modulating fertilization in the reproductive system38. The new olfactory receptor family members PSGR1 and PSGR2 have been found to have restricted expression in human prostate tissues, as shown by Northern blot and real-time PCR analysis of over 20 different human tissue types39, 40, 41. PSGR subfamily expression increases significantly in the epithelial cells of patients with prostatic intraepithelial neoplasia (PIN) and of prostate cancer patients relative to non-cancerous controls and benign prostatic hyperplasia tissues, suggesting that the PSGR subfamily may play an important role in early prostate cancer development42. The PSGR subfamily has been shown to be strongly associated with clinical parameters (clinical stage, Gleason score, recurrence status, and metastasis), and its members could serve as biomarkers for prostate cancer42, 43. PSGR subfamily transcripts can even be used as diagnostic markers in urine44. It has also been reported that detection of PSGR expression together with the well-known prostate cancer markers prostate-specific antigen (PSA), prostate cancer gene 3 (PCA3), and α-methylacyl-CoA racemase (AMACR) can increase diagnostic specificity in the detection of prostate cancer43, 44, 45. Recently, Neuhaus EM et al reported that, by measuring intracellular Ca2+ flux while screening a bank of steroid hormones and odorant-related compounds, certain steroids and β-ionone were identified as active ligands for PSGR46. 
PSGR-induced Ca2+ signaling was found to require the involvement of endogenous Ca2+-selective transient receptor potential vanilloid type 6 (TRPV6) channels47. Incubation of prostate cancer cells with β-ionone inhibits cell proliferation46. This suggests that PSGR signaling is also involved in prostate cancer cell progression.

Adhesion GPCR family

GPCRs in the Adhesion family have a relatively long N-terminal domain, which contains many so-called adhesion domains (Figure 3). These adhesion domains are otherwise found only in some adhesion molecules, such as integrins, cadherins, and selectins, and are thought to have adhesive properties. Another striking characteristic of all the Adhesion GPCRs is a GPS (GPCR proteolytic site) domain linking the 7TM region to the extracellular domain, which acts as an autocatalytic site48, 49. Because this is a novel GPCR family, most of its members are orphans, and natural ligands and functions have been identified for only a few of them.

Figure 3

Schematic diagram of the extracellular N-terminal domain within the Adhesion GPCRs. The extracellular N-terminal domains of 33 Adhesion GPCRs were predicted by RPS-BLAST against the Conserved Domain Database (CDD). CA, cadherin domain; calx-beta, domain found in Na+–Ca2+ exchangers; CUB, resembles the structure of immunoglobulins; EAR, epilepsy-associated repeat; EGF-Lam, laminin EGF-like domain; EGF, epidermal growth factor domain; HBD, hormone-binding domain; herpes-gp2, resembles the equine herpes virus glycoprotein gp2 structure; GBL, galactose-binding lectin domain; Ig, immunoglobulin domain; OLF, olfactomedin domain; LamG, laminin G domain; LRR, leucine-rich repeat domain; PTX, pentraxin domain; Puf, displays structural similarity to RNA-binding protein from the Puf family; SEA, domain found in sea-urchin sperm protein; SIN, resembles the primary structure of the SIN component of the histone deacetylase complex; TSP1, thrombospondin domain; C-type lectin, similar to the C-type lectin or carbohydrate-recognition domain; GPS, GPCR proteolytic site domain.

Adhesion GPCRs in immunology

The immune response is coordinated by an assortment of membrane receptors found on leukocytes, including TLRs, integrins, lectins, the Ig superfamily, selectins, and GPCRs10, 50, 51. The first Adhesion GPCR to be discovered, epidermal growth factor-like module containing mucin-like receptor protein 1 (EMR1, F4/80 receptor), which is an epidermal growth factor (EGF)-seven transmembrane (TM7) receptor, has a predominantly leukocyte-restricted expression pattern52. Although the expression of Emr1 is restricted, the function of this receptor remained unknown until the generation of Emr1 knock-out mice. The mouse model indicates that Emr1 is critical to the induction of CD8+ regulatory T-cells in peripheral tolerance53. Besides EMR1, the EGF-TM7 subfamily includes EMR2, EMR3, EMR4, and CD97, all of which belong to the Adhesion GPCR family. Unlike the highly specific expression of EMR1, the other EGF-TM7 receptors are expressed largely in myeloid cells (monocytes, macrophages, neutrophils, and dendritic cells) and in some lymphoid cells (T and B cells)54. Chondroitin sulfate has recently been identified as the ligand for EMR2 and CD97, which mediate cell attachment55. CD97, the leukocyte activation antigen, has also been shown to bind to the complement regulatory protein DAF/CD55 (decay accelerating factor), and the longest splice variant of CD97 has the highest capacity to bind to CD55-expressing cells. Although CD97 and EMR2 differ by only 3 amino acids in the EGF domain, the binding of EMR2 to CD55 is significantly weaker56, 57. The precise function of the CD97-CD55 interaction is still not fully understood. Using knock-out mice and x-ray crystallography, Abbott RJ et al demonstrated that the T cell and complement regulatory activities of CD55 occur on opposite faces of the molecule, suggesting that the CD97-CD55 complex might simultaneously regulate both the innate and adaptive immune responses58, 59. 
EMR3 has been reported as a marker for mature granulocytes, and it can interact with a ligand expressed on the surface of monocyte-derived macrophages and activated human neutrophils60, 61. EMR4 has been reported to interact with a cell surface protein as a ligand on A20 B-lymphoma cells62.

Adhesion GPCRs in development

The most extensively studied Adhesion GPCRs in embryonic development are the so-called 7TM-cadherin subfamily (Celsr/Flamingo/Starry night). All the members of this subfamily possess extracellular domains containing nine atypical cadherin repeats linked to a combination of EGF-like and laminin G-like domains63. The 7TM-cadherins are an evolutionarily conserved gene subfamily with homologues discovered from ascidians to mammals63. In mammals, the subfamily comprises 3 genes, Celsr1, Celsr2, and Celsr3. There are 4 genes (fmi1a, fmi1b, fmi2, and fmi3) in zebrafish and only one homologue, called flamingo or starry night, in Drosophila. Drosophila studies provide a distinct functional view of Flamingo/Starry night as a core planar polarity protein64. Its functions include regulating dendrite extension from sensory neurons65, 66, modulating target selection by photoreceptor axons67, accelerating axon advance from sensory and motor neurons68, and limiting ectopic neuromuscular junction formation and maintaining motor axon terminals69. Gene knockout and knockdown of 7TM-cadherins have confirmed these observations in vertebrates. 7TM-cadherins regulate morphogenetic movements, neural tube closure, orientation of sensory hair cells in the inner ear, and hair follicle patterning63, 70, 71, 72, 73. Recently, the Adhesion GPCRs Gpr124 and Gpr126, which are not 7TM-cadherins, have been shown to regulate the development of different tissues in mice. Gpr124 affects CNS-specific angiogenesis, and Gpr126 acts in Schwann cells to initiate myelination74, 75, 76, 77. This suggests that more members of this family may be involved in development and that this may be due to the adhesive or other properties of their N-terminal domains.

Adhesion GPCRs in cancers

Because cell adhesion molecules have a vital role in cancer progression, it is reasonable to speculate that Adhesion GPCRs also play important roles in cancer progression and metastasis. The leukocyte Adhesion GPCR EMR2 has been shown to be overexpressed in human breast cancer and to be associated with patient survival78. CD97 is involved in tumor-environment interactions and mediates tumor invasion79. It has been reported that the 7TM-cadherin receptors may also be involved in human cancers, such as gastric cancer, lung cancer, and melanoma80. Interestingly, unlike other Adhesion GPCRs, GPR56 has been shown to suppress the growth and metastasis of some cancer cells by interacting with tissue transglutaminase (TG2)81, 82.

Signal transduction mediated by Adhesion GPCRs

Most Adhesion GPCRs are orphan receptors, which is the main reason it remains unclear whether Adhesion GPCRs are involved in G protein signaling. In addition, the complicated structure of Adhesion GPCRs, comprising both a large ECD and 7TM domains, makes it possible for Adhesion GPCRs to signal in a G-protein-independent manner83. For example, Gpr124 regulates angiogenic sprouting into neural tissues through the TGF-beta pathway in mice76. BAI1 can function as an engulfment receptor in response to the “eat me” signal phosphatidylserine, which leads BAI1 to directly bind and activate the ELMO/DOCK180/RAC module84. It has been reported that GPR124 and GPR125 can interact with several viral oncoproteins through their cytoplasmic PDZ domain. The rat Ig-Hepta (GPR116) has been shown to form a homodimer linked by disulphide bonds. Moreover, this receptor undergoes two proteolytic cleavages, and the cleaved product in the SEA domain might act as a ligand that binds to GPR11685, 86, 87. Therefore, these 7TM receptors may mediate G-protein-independent signaling pathways in cellular functions.

Although some Adhesion GPCRs signal through G-protein-independent pathways, others have been shown to signal through the classic G-protein-dependent pathway. Lectomedin receptor-1 was co-purified with Gαo88. GPR56 has also been shown to form a complex with Gq/11 and G12/13 in neural progenitor cells89, 90. Gpr126 modulates Schwann cells, initiating myelination through the classic cAMP pathway74. Latrophilin, which is activated by the ligand LTX, can transduce the intracellular Ca2+ signaling pathway. These observations indicate that this family can transmit signals through both classical G-protein-dependent and G-protein-independent mechanisms.

Deorphanization strategy

GPCRs are the most prominent family of pharmacological targets in biomedicine91. The deorphanization of orphan GPCRs is one of the most important missions in orphan GPCR research. Deorphanization is the process of identifying ligands that are highly selective for orphan GPCRs. In general, the standard assays are radioligand binding, calcium flux, GTPγS binding, and modulation of cAMP levels92, 93, 94, 95, 96, 97, 98.

With the development of molecular technology, several approaches have been used for deorphanization. In the first, based on sequence and functional similarity, ligands of already identified receptors are tested against GPCRs with similar sequences or domains. This sequence similarity strategy resulted in the identification of the ligands of Edg3 and Edg5, whose sequences are similar to that of the S1P receptors, with >50% amino acid identity99, 100, 101. The functional similarity strategy led to the identification of R-spondins, which stimulate the growth of intestinal stem cells, as the ligands of Lgr5 homologues17. However, this approach must be carefully evaluated because its predictions are not always accurate. For example, alkyl imidazoles function as dual histamine H3/H4 receptor ligands, even though histamine H3/H4 receptors share very little sequence identity102, 103. Conversely, although the EGF domains of CD97 and EMR2 share 97% identity, only CD97, and not EMR2, shows high affinity for CD55. The second strategy used to identify natural ligands works by determining the relationship between the expression profiles of the receptor and the putative ligand. This technique led to the identification of RDC7 and RDC8 as adenosine A1 and A2A receptors, all of which are highly transcribed in the brain cortex, thyroid follicular cells, and testis104, 105. The third technique is used for GPCRs that have specific expression profiles and distinct cytoplasmic signaling pathways. This method uses extracts of tissues that contain potential ligands to screen with GPCR-mediated signaling assays. Some hormone proteins, such as nociceptin, orexins, apelin, prolactin, and ghrelin, were successfully identified using this strategy106, 107, 108, 109. The fourth strategy has been used successfully to deorphanize Adhesion GPCRs. 
It involves engineering recombinant soluble extracellular regions of Adhesion GPCRs with an Fc fragment at the N-terminus and a biotinylation signal at the C-terminus. This construct acts as a probe to screen extracellular matrix components. This led to the identification of certain ligands for myeloid cell Adhesion GPCRs51, 110. In recent years, the so-called reverse pharmacology strategy has also been used to identify the ligands of orphan GPCRs98. This is carried out by expressing the orphan GPCRs in eukaryotic cells by DNA transfection and then exposing them to candidate ligands to examine the binding affinity between the cells and the ligands111, 112. With this approach, many peptide hormones, including ghrelin, which stimulates hunger; kisspeptin and metastin, which are involved in puberty development and cancer metastasis; and orexin and hypocretin, which mediate food intake and induce wakefulness and energy expenditure, have been discovered within the last decade113. However, the successful application of the reverse pharmacology method depends on three major elements: sufficient orphan receptor expression, high-quality ligands, and robust screening assays to detect receptor activation114, 115. Fortunately, with the development of membrane protein expression and purification techniques, neuropeptides and synthetic ligands have been applied to large-scale screening116. Of the three elements outlined above, choosing an appropriate detection assay is the most problematic.
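The sequence-similarity strategy described above rests on a percent-identity figure between a known receptor and an orphan candidate (for example, the >50% amino acid identity cited for Edg3 and Edg5). The review does not state how identity was computed, so the following is only a minimal sketch of one common definition: identical residues divided by aligned columns in which neither sequence has a gap. The function names, the 50% default threshold, and the toy sequences are hypothetical, and the input is assumed to be already aligned by some external tool.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are excluded from the
    denominator (one of several conventions in use; an assumption here).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def is_similarity_candidate(aligned_a: str, aligned_b: str,
                            threshold: float = 50.0) -> bool:
    """Flag a receptor pair whose identity exceeds a chosen threshold."""
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    # Toy, made-up peptide fragments; not real receptor sequences.
    known_receptor = "MKTAYIAK"
    orphan_receptor = "MKSAYLAK"
    print(percent_identity(known_receptor, orphan_receptor))
    print(is_similarity_candidate(known_receptor, orphan_receptor))
```

As the H3/H4 and CD97/EMR2 examples in the text show, a high or low identity score is only a prioritization heuristic; a flagged pair still needs a functional assay to confirm shared ligand binding.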

The rate of GPCR deorphanization decreased drastically at the turn of the century, suggesting that gaps exist in the process. Herein, we discuss several factors that may account for the problem. First, the greatest challenge in deorphanization of the receptors is the limited knowledge about them, especially with respect to their physiological functions and their roles as transmitters of signal pathways. Thus, experimental design is rendered difficult by the lack of signal transduction assays and positive controls113. Second, the majority of approaches to deorphanization rely on monitoring changes at the second messenger level, which is regulated by G proteins. However, GPCRs can transduce signals through diverse pathways, sometimes even beyond G proteins. In this case, identifying the relevant signaling pathway is the key to deorphanization. For example, some orphan GPCRs require accessory proteins for their activity. This working model has been shown in calcitonin GPCRs, which require RAMPs (receptor activity-modifying proteins) for their activation. To identify the ligand of this kind of GPCR, new screening assays for specific accessory proteins must be set up117, 118. Third, there is a possibility that some transmitters are only expressed at a particular time during the life span or under certain specific conditions9. Although it is risky and challenging, it is necessary to find more effective transmitters for deorphanization and put them to use. Lastly, some orphan GPCRs can form heterodimers with other GPCRs and function in a ligand-independent manner, so a search for the ligands of this kind of orphan GPCR yields no result. For example, GABABR1 and GABABR2 form a well-known heterodimeric receptor in which GABABR1 is involved in ligand binding, whereas GABABR2 only acts as the signaling unit. GABABR2 is an orphan receptor in the heterodimer complex without any known ligand119, 120, 121.

Perspectives in the research of orphan GPCRs

In recent years, the number of new orphan GPCRs has increased and several members have been relatively well characterized. However, the progress of research on orphan GPCR function has been hampered by the lack of identified ligands and by the unique structures of the GPCRs themselves. Further investigation of their signaling pathways is valuable for understanding the physiological and pathological roles of these new orphan GPCRs. The development of orphan GPCR knockout mice has also been shown to be a successful method for the characterization of their physiological and pathological functions. The knockout approach is essential for our understanding of the functions of these receptors and their potential pathways. Functional and specific antibodies can serve as probes not only for the ligands, but also for developing therapies for tumors and genetic disorders in which orphan GPCRs are involved. Although progress is very difficult, the search for the ligands of orphan GPCRs and the identification of their physiological functions will continue. With recent discoveries of more and more orphan GPCR signaling pathways, understanding of their particular physiological functions and deorphanization for therapeutic purposes should accelerate in the coming years.