Characterization of a putative orexin receptor in Ciona intestinalis sheds light on the evolution of the orexin/hypocretin system in chordates

Tunicates are evolutionary model organisms bridging the gap between vertebrates and invertebrates. A genomic sequence in Ciona intestinalis (CiOX) shows high similarity to vertebrate orexin receptors and protostome allatotropin receptors (ATR). Here, molecular phylogeny suggested that CiOX is divergent from ATRs and human orexin receptors (hOX1/2). However, CiOX appears closer to hOX1/2 than to ATR both in terms of sequence percent identity and in its modelled binding cavity, as suggested by molecular modelling. CiOX was heterologously expressed in a recombinant HEK293 cell system. Human orexins weakly but concentration-dependently activated its Gq signalling (Ca2+ elevation), and the responses were inhibited by the non-selective orexin receptor antagonists TCS 1102 and almorexant, but only weakly by the OX1-selective antagonist SB-334867. Furthermore, the 5-/6-carboxytetramethylrhodamine (TAMRA)-labelled human orexin-A was able to bind to CiOX. Database mining was used to predict a potential endogenous C. intestinalis orexin peptide (Ci-orexin-A). Ci-orexin-A was able to displace TAMRA-orexin-A, but not to induce any calcium response at the CiOX. Consequently, we suggested that the orexin signalling system is conserved in Ciona intestinalis, although the relevant peptide-receptor interaction was not fully elucidated.

CiOX has been automatically annotated as an orexin receptor in databases, and it is in fact slightly closer to hOX1/2 than to protostome ATRs in terms of amino acid sequence: in the predicted transmembrane segments (TMs), the sequence identity between CiOX and hOX1 or hOX2 is 40-42% and between CiOX and ATRs 33-36% (Supplementary Fig. S1). Similar trends are seen when comparing the complete amino acid sequences between CiOX and hOX1 or hOX2 (32-33%) and between CiOX and ATRs (26-32%) (Supplementary Fig. S2). Beyond the TMs, the third intracellular loops of hOX1/2 and CiOX in particular differ markedly in length: the loop of hOX1 contains 34 amino acids while the loop of CiOX contains 123 amino acids. Interestingly, the putative C. savignyi orexin receptor has 38 amino acids in its third intracellular loop.
The hOX1/2 are more similar to ATRs than to CiOX in terms of sequence identity (48-54% vs. 40-42%; TMs only) as well as in the phylogenetic analysis (Fig. 1). The sequence alignment used for the phylogenetic analysis appeared to be of good quality, with gaps located in the regions structurally corresponding to loops and not within the TM regions. We constructed trees both with full sequences and with the TM regions only. The trees constructed with only the TM regions should be the least affected by the sequence alignment methods; furthermore, TM regions represent the most evolutionarily conserved regions of the receptors and are thus the most relevant for phylogenetic studies. Altogether, the six trees present a consistent branching for four clades: vertebrate orexin receptors (green), cephalochordate orexin receptors (turquoise), and echinoderm/hemichordate orexin receptors (purple). The branching of the CiOX (yellow) and ATR (pink) clades is more variable. Nonetheless, in five out of six phylogenetic trees, CiOX branches away from protostome allatotropin receptors and other chordate orexin receptors.
To clarify the issue, we compared the putative ligand binding cavities of hOX2, CiOX, and M. sexta ATR (Fig. 2). This was achieved through construction of homology models of CiOX and ATR and their comparison to the available X-ray structure of hOX2 (Protein Data Bank (PDB) code 5WQC 22). This early study was conducted using the inactive form of the receptor, when no active structure was available; however, the amino acids lining the binding cavity are the same in both active and inactive conformations (see Modelling of the peptide-CiOX complexes). Furthermore, the models are robust to the different construction methods: the homology models presented in this manuscript are highly similar in their transmembrane regions to those produced using AlphaFold 23 (Supplementary Fig. S3). As discussed below, only a minor part of the peptide, which is extremely conserved across vertebrate orexins, binds deep into this cavity.
The modelled binding site of CiOX resembles the binding site of hOX2 more than that of ATR. Out of the 26 residues that are located deep within the receptor binding pocket, there are 16 conserved positions between hOX2 and CiOX, but only 10 conserved positions between hOX2 and ATR. The analysis highlights two hydrogen bond networks and four salt bridges lining the binding cavity (Fig. 2). Of those, both networks and three salt bridges are conserved or conservatively substituted when comparing the hOX2 and the CiOX binding sites. In contrast, comparison of hOX2 and ATR revealed only one conserved hydrogen bond network and one conserved salt bridge. Two of the salt bridges involve TM2 and TM7: D 2.65 -H 7.39 is centrally located and present in all three receptors, while E 2.68 -R 7.28 is present in the extracellular domain of hOX2 and CiOX (Fig. 2a,b) but not ATR (Fig. 2c). hOX2 also features two additional salt bridges (D ECL2.51 -R 6.59 and E ECL2.52 -H 5.39) connecting extracellular loop 2 (ECL2) and TMs 5 and 6 (Fig. 2a). The E ECL2.52 -H 5.39 salt bridge is conserved between hOX2 and CiOX, but not between hOX2 and ATR. Furthermore, the binding cavity features side chains (N/D 6.55 and F/Y 5.42) conservatively substituted between hOX2 and CiOX, which could extend the network towards the intracellular surface of the binding cavity. In hOX2, N 6.55 has been described to form a hydrogen bond with orexin-B 24 and suvorexant 25. The D ECL2.51 -R 6.59 salt bridge found in hOX2 is not conserved, but it is typical for distantly related GPCRs to be more divergent in their extracellular domains 26.
Another important network of interacting side chains, T 2.61 -Y 7.43 -Q 3.32, is present in all three receptors. This hydrogen bond network stabilizes the inactive state of hOX2. In the active state, the network is broken as Q 3.32 adopts an upward-facing conformation in order to form a hydrogen bond with the agonist orexin-B 24. Position 3.32 is well known to be involved in receptor activation, in particular in amine GPCRs 27.
In hOX2, position F 5.42 is packed together with T 3.33 and V 3.36 (Fig. 2a). F 5.42 has been described as essential for orexin-A and small molecule binding 24,25,28. Position 5.42 is also well known to be involved in the activation of amine GPCRs 27. A similar network can be formed in CiOX (Y 5.42, M 3.33 and T 3.36; a Y 5.42 -T 3.36 hydrogen bond is seen in our models) (Fig. 2b) but not in ATR (K 5.42, S 3.33, V 3.36; no stabilizing interactions) (Fig. 2c), indicating that a similar subpocket exists in hOX2 and CiOX. Additional potentially disruptive changes that might affect orexin peptide binding are also found: in the TM2 of CiOX, P 2.60 replaces A 2.60 of ATR and hOX2; and in the TM3 of ATR, L 3.29 replaces P 3.29 of CiOX and hOX2. These changes have the potential to be significant as they may shift the exposed regions of α-helices. Across hOX2, CiOX and ATR, the cavities are overall of similar sizes, with a few notable differences: T/M 3.33 (hOX2/CiOX) are bulkier residues than S 3.33 (ATR), while W 2.64 in ATR is bulkier than the valines found in CiOX and hOX2.
We further compared the binding site conservation by constructing a vertebrate orexin receptor consensus sequence and an allatotropin receptor consensus sequence for the binding site residues (Supplementary Fig. S4). The binding site residues of CiOX share 62% sequence identity with the consensus of the vertebrate orexin receptor binding site, while identity with the consensus of the ATR binding site is only 42%. Thus, at the binding site, CiOX is more similar to the vertebrate orexin receptors than to ATRs.
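The identity figures quoted in this section are all of the same kind: the fraction of aligned positions at which two sequences carry the same residue. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `percent_identity` and both toy strings are hypothetical and do not come from the supplementary alignments.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences have a residue.
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


# Toy binding-site strings (hypothetical, for illustration only).
toy_receptor = "DHTYQFVNRE"
toy_consensus = "DHTYQYTDRE"
toy_identity = percent_identity(toy_receptor, toy_consensus)
```

How gapped columns are treated (excluded here) changes the denominator, so the convention should be stated whenever such percentages are compared across tools.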

Discovery of the candidate orexin peptide from C. intestinalis
The C. intestinalis putative prepro-orexin (CiPPO) is an orphan open reading frame named XM_026835150 (NCBI; reading frame 2). The region encoding Ci-orexin-A shares 22.2% sequence identity and 31.1% sequence similarity with human orexin-A (EMBOSS Needle, default parameters). Across vertebrate species, orexin-A and orexin-B peptides share 17-56% identity with one another and are located in tandem in a single gene, which suggests that they have resulted from an internal gene duplication (Fig. 3, more comprehensive in Supplementary Fig. S5). Ci-orexin-A was named according to the vertebrate orexin-A since this segment of CiPPO also contains four cysteines with potential to form disulphide bridges, whereas vertebrate orexin-B contains no cysteines (Supplementary Fig. S6). When comparing the putative orexin-A and the following region (corresponding to orexin-B in vertebrates) outside vertebrates, the peptides do not share more identity than random sequences (≤ 10%; Fig. 3). A "cryptic peptide" 14 has been identified in the genes coding for invertebrate allatotropin and the orexin-like peptide of S. kowalevskii, but no such peptide is found in other (putative) orexin peptide sequences (Fig. 3).
In tetrapods, disulphide bridges are formed between C6-C12 and C7-C14 of orexin-A. In ray-finned fishes, orexin-A peptides also have four cysteines in their N-terminus and thus should be able to form two disulphide bridges (Fig. 3). However, Xu and Volkoff 30 have suggested, based on molecular modelling, that only the cysteine bridge C7-C14 is formed in ray-finned fishes, while C6 is unbridged. Ci-orexin-A has cysteines in positions 6, 7, 18 and 25. If both disulphide bridges are formed in the same way as suggested for tetrapod orexin-A, a remarkably long loop and a shorter one would be present in the N-terminal domain of Ci-orexin-A, yet these loops would not impact the deep binding region of the peptide (Fig. 4).
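The loop sizes implied by this pairing follow directly from the cysteine positions given above. As a small sketch, assume the tetrapod pattern (first cysteine bridged to the third, second to the fourth) also applies to Ci-orexin-A. That pairing is the hypothesis discussed in the text, not an experimentally determined connectivity, and the helper name `bridge_spans` is hypothetical.

```python
def bridge_spans(cysteines, pairing=((0, 2), (1, 3))):
    """Return (start, end, residues_enclosed) for each bridge.

    `pairing` indexes into the sorted cysteine positions; the default
    (1st-3rd, 2nd-4th) mirrors the C6-C12 / C7-C14 pattern of tetrapod orexin-A.
    """
    ordered = sorted(cysteines)
    return [(ordered[i], ordered[j], ordered[j] - ordered[i] - 1) for i, j in pairing]


tetrapod_loops = bridge_spans([6, 7, 12, 14])  # positions from the text
ciona_loops = bridge_spans([6, 7, 18, 25])     # positions from the text
```

Under this assumption, the Ci-orexin-A bridges would enclose 11 and 17 residues, compared with 5 and 6 in tetrapod orexin-A.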

CiOX receptor binds both TAMRA-orexin-A and Ci-orexin-A
Both hOX1-eGFP and CiOX-eGFP were well expressed on the surface of the stable clones of Flp-In T-REx 293 cells (green fluorescence in Supplementary Fig. S7 and S8). TAMRA-orexin-A (30 nM) bound to the plasma membranes of OX1 cells quickly after the addition (red fluorescence in Supplementary Fig. S7). We then tested the same for CiOX-eGFP-expressing cells: TAMRA-orexin-A bound to the plasma membranes of the CiOX receptor-expressing cells as quickly as to those of the OX1 receptor-expressing cells, and the intensity remained stable for at least 30 min (red fluorescence in Supplementary Fig. S8). In contrast, no green fluorescence (eGFP channel in Supplementary Fig. S9) was observed in wild-type Flp-In T-REx 293 cells and TAMRA-orexin-A did not bind to them (TAMRA channel in Supplementary Fig. S9).
Four Ci-orexin-A variants were obtained through custom synthesis. All four peptides were C-terminally amidated. Two were full-length peptides of 43 amino acids, either non-acetylated or acetylated at the N-terminus (Ci-orexin-A and Ac-Ci-orexin-A, respectively), and two were short variants containing the 18 C-terminal amino acids, likewise either non-acetylated or acetylated at the N-terminus (Ci-orexin-A 26-43 and Ac-Ci-orexin-A 26-43, respectively). No disulphide bridges were introduced during the synthesis of these peptides, and the potential spontaneous formation of disulphide bridges in the peptides was not assessed. The reduction of disulphide bridges in human orexin-A decreases its potency on hOX1/2, but the peptide remains active 31 (Rinne & Kukkonen, unpublished data). Additionally, there are several possible combinations of different disulphide bridges; testing all possible variants would thus have been extremely costly.
For the hOX 1 receptor-expressing cells, we also observed intracellular red fluorescence suggesting internalization of TAMRA-orexin-A probably together with the receptors (Supplementary Fig. S7).In contrast, we did not observe this in CiOX receptor-expressing cells (Supplementary Fig. S8).

Human orexin peptides but not Ci-orexin-A activate CiOX
Figure 3. Prepro-orexin peptides from different species. For the species codes (Supplementary Table S1), the sequences used and the alignment (Supplementary Fig. S6), see Supplementary material 1. Left, a schematic presentation of the phylogenetic tree of the taxa shown. Boxes, coding/mature regions of the peptides (both orexin-A and -B in vertebrates and a single peptide in invertebrates and early vertebrates). Black lettering indicates conserved motifs, ~ denotes a region of length not specified here, x denotes a region with a length specified here (number of x's = number of aa). Colour coding as in Fig. 1. Parallel horizontal double line, signal peptide; vertical end of the line (prepro-peptide), stop. Right, percent identity (ID) and similarity (SIM) between orexin-A and orexin-B (vertebrates), orexin-A and the sequence directly downstream that would correspond to orexin-B (tunicates, cephalochordates, echinoderms), orexin-A and "cryptic peptide" (hemichordates), or allatotropin and "cryptic peptide" (protostomes) within the individual species, calculated by EMBOSS Needle, BLOSUM62 global alignment. Abbreviations: aa, amino acids.
Orexin-A and orexin-B, endogenous ligands of hOX1/2, and ATP (an endogenous ligand for endogenous P2 receptors) induced strong, concentration-dependent intracellular Ca2+ elevations in hOX1 and hOX2 receptor-expressing CHO-K1 and HEK293 cells (see, e.g. 32) and we have seen the same in Flp-In T-REx 293 cells (Rinne & Kukkonen, unpublished; see also Supplementary Fig. S11). We tested here the effect of the hOX1/2 agonists […])) weakly elevated Ca2+ in the wild-type cells (Fig. 6a) via either endogenous orexin receptors or some other target; however, the maximum elevations were not significant when compared to basal (P value > 0.01). Importantly, the Ca2+ elevation was clearly larger in CiOX-eGFP-expressing cells (Fig. 6b). The specific response (derived by subtracting the response in wild-type cells from that in CiOX cells) amounted to about 20% of the response to 100 µM ATP (Fig. 6c). EC50 values were as follows: orexin-A (224.7 ± 8.1 nM), orexin-B (250.3 ± 6.6 nM), and [A11, d-L15]orexin-B (715.8 ± 20.5 nM); no saturation was reached with orexin-A 15-33, and thus its EC50 value could not be determined.
In contrast, none of the four Ci-orexin-A peptides induced any Ca2+ elevation in CiOX receptor-expressing cells (not shown). We additionally tested the potent and efficacious small molecule agonist of human orexin receptors, Nag 26 33,34, and the M. sexta allatotropin, but these induced no Ca2+ elevation in either CiOX or wild-type cells.
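The specific response and the EC50 values reported in this section can be illustrated with a short sketch. The subtraction and normalization follow the description in the text; the Hill equation is a standard concentration-response model used here as an assumption, since the fitting procedure is not described in this section. The function names and all numeric inputs other than the orexin-A EC50 are hypothetical.

```python
def specific_response(ciox_signal, wild_type_signal, atp_reference):
    """Wild-type-subtracted response, expressed as % of the ATP reference response."""
    return 100.0 * (ciox_signal - wild_type_signal) / atp_reference


def hill(conc_nM, ec50_nM, top=1.0, slope=1.0):
    """Hill equation (assumed model): response at a given agonist concentration."""
    return top * conc_nM**slope / (ec50_nM**slope + conc_nM**slope)


# Toy signals in arbitrary units (hypothetical values, not data from the study).
toy_specific = specific_response(ciox_signal=0.30, wild_type_signal=0.10, atp_reference=1.0)

# By definition, the response at the EC50 is half of the maximum.
half_max = hill(conc_nM=224.7, ec50_nM=224.7)
```

An EC50 can only be estimated when the curve approaches its plateau, which is why no value is reported for orexin-A 15-33.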

Modelling of the peptide-CiOX complexes
To gain a structural understanding of the binding of orexin-A and Ci-orexin-A to CiOX, we used a cryo-electron microscopy (cryo-EM) structure of hOX2 in complex with orexin-B and a G protein to build homology models of human orexin-A-CiOX and Ci-orexin-A-CiOX complexes. Orexin-A and orexin-B differ only at the terminal residue (L33 in orexin-A, M28 in orexin-B) in the segment of orexin-B that is resolved in the cryo-EM structure, which suggests that binding interactions in this segment are conserved between the two peptides.
In the orexin-B-hOX2 structure, residues N20-M28 of the C-terminal part of orexin-B are resolved in an extended conformation, with the C-terminus inserted deep in the core of the receptor (Fig. 8a). Residues G24, I25, L26 and M28 of orexin-B provide high shape-complementarity with the binding cavity, while key polar interactions are formed by backbone oxygen atoms and the side chain hydroxyl of T27. In detail, the main chain α-carbon of G24 is packed against V ECL2.49 in ECL2 of hOX2, the side chain of I25 is sandwiched between the aromatic rings of Y 7.32 and F 7.35 in TM7, while the side chain of L26 occupies a hydrophobic pocket between TM2 and TM3 formed by P 3.29, L 3.28, W ECL1.50 and A 2.60. In the bottom of the binding cavity, M28 is surrounded by the side chains of F 5.42, I 6.51 and V 3.36. In addition to these hydrophobic contacts, hydrogen bonds are formed to K 6.58, H 7.39, Q 3.32 and N 6.55 of hOX2.
Analysis of the homology models shows that the interactions in the binding cavity would be mostly preserved between the ligand-receptor pairs of orexin-B-hOX2 and orexin-A-CiOX. In the orexin-A-CiOX model (Fig. 8b), G29, I30, and L31 of orexin-A form hydrophobic contacts to ECL2, TM7, and the TM2-TM3 pocket, respectively, while hydrogen bonds are formed to R 6.58, H 7.39, Q 3.32 and D 6.55 of CiOX, which are equivalent to the positions in hOX2 (Fig. 8a). The main difference in the binding interactions is found in the bottom of the cavity, where F 5.42 and V 3.36 of hOX2 (Fig. 8a) are substituted by Y 5.42 and T 3.36 in CiOX (Fig. 8b), introducing a slightly more polar environment in the vicinity of L33 of orexin-A. Importantly, in the modelled active CiOX conformation, the distance between the hydroxyl groups of Y 5.42 and T 3.36 (> 5 Å) does not permit a hydrogen bond interaction.
In the Ci-orexin-A-CiOX model, Ci-orexin-A retains the hydrophobic contacts to ECL2 and TM7 with the conserved G-I motif (Fig. 4) and occupies the TM2-TM3 pocket with M41. The hydrogen bonds formed between CiOX and the peptide backbone oxygen atoms are equivalent to those in the orexin-A-CiOX and orexin-B-hOX2 complexes, but the hydrogen bond to D 6.55 is absent in Ci-orexin-A-CiOX as Ci-orexin-A contains A42 (Fig. 8c). The corresponding position in the human orexin peptides contains a threonine (connecting to N 6.55 in hOX2 or D 6.55 in CiOX; Fig. 8a,b, respectively). Ci-orexin-A has T43 as the C-terminal residue, which is well positioned to form a hydrogen bond with Y 5.42 in the bottom of the CiOX binding cavity (Fig. 8c).

Discussion
In this study, we expressed and characterized a putative C. intestinalis orexin receptor in human cells. CiOX is slightly closer to hOX1/2 than to ATRs in terms of sequence conservation. In contrast, in the phylogenetic analyses, the orexin receptors of vertebrates and cephalochordates are positioned closer to one another than to CiOX from tunicates in five out of six trees. This is unexpected because vertebrates and tunicates are believed to be evolutionarily closer than vertebrates and cephalochordates 35. Importantly, some of the bootstrap values are low (< 65%), indicating a lack of robustness in the position of the corresponding branches, and the CiOX clade is characterized by the longest branches, indicating a high degree of divergence. This would suggest that tunicate receptors have diverged at a faster rate, which would explain why they branch further away from their expected positions. This could be explained by the high amino acid substitution rate in C. intestinalis 9.
The data presented in this study strongly support the notion that this orphan C. intestinalis receptor, here called CiOX, is an orexin-like receptor. Despite the sequence divergence, the (proposed) binding cavities of hOX2 and CiOX are highly similar. The homology models of CiOX suggest that the amino acid residues that are crucial for binding are conserved throughout the species, and located spatially in a way compatible with a key role of the peptides' C-terminus. Experimentally, we demonstrate that the receptor is well expressed on the surface of human cells and binds TAMRA-labelled human orexin-A. The receptor couples to some degree to human Gq, as demonstrated by the sensitivity of the Ca2+ responses triggered by the human orexin peptide variants orexin-A, orexin-B, [A11, d-L15]orexin-B and orexin-A 15-33 to the Gq/11/14 inhibitor UBO-QIC. These Ca2+ responses were also inhibited by the mammalian orexin receptor antagonists TCS 1102, almorexant and SB-334867. In contrast, the small molecule hOX agonist Nag 26 did not induce any calcium response. The data further support the concept of the similarity between the binding sites of CiOX and hOX1/2. Very few compounds have been reported to have been tested on both orexin and allatotropin receptors, and, to our knowledge, testing mammalian orexin peptides on allatotropin receptors has not been reported in the literature. However, the allatotropin receptor of Tribolium castaneum was not inhibited by almorexant 36. We also aimed at identifying the endogenous C. intestinalis prepro-orexin peptide; to the best of our knowledge, a candidate orexin peptide in tunicates has not been proposed before. The identified sequence (CiPPO) harboured a potential peptide that we named Ci-orexin-A based on the presence of a cysteine-containing N-terminus; neither an orexin-B equivalent nor a cryptic peptide, as in the PPO of S. kowalevskii and in the allatotropin propeptides, was found. We could see that the full-length synthetic Ci-orexin-A(1) competed with TAMRA-orexin-A for the binding to CiOX, suggesting that it binds to CiOX. We tested the four synthetic variants of this putative orexin-A sequence (Ci-orexin-A variants), but no Ca2+ elevation was seen upon stimulation of the CiOX receptor with any of the peptides. A reasonable doubt remains as to whether the correct endogenous peptide has been identified, including its cleavage sites (particularly in the N-terminus) and post-translational modifications, as well as whether the formation of disulphide bridges should have been controlled. It should also be noted that other candidate peptides, if they exist, may be identified in the future using data mining techniques tailored for retrieving short peptides.
Another explanation for the lack of efficacy at CiOX may lie in potentially different proteins needed for orexin receptor signalling in humans and tunicates. It is thus possible that the signal transduction machinery of mammalian cells is less than optimal for CiOX. Finally, we must consider the possibility that CiOX, unlike its mammalian orthologs, may signal via pathways other than Gq, as we did not investigate other known GPCR signal transduction pathways. Although human orexin receptors couple to Ca2+ elevation, presumably via Gq, in basically every cell type tested 1, it does not follow that their distant relative, CiOX, couples in the same manner. However, when expressed alone or together with the promiscuous G16 protein, insect ATRs couple to Ca2+ elevation in mammalian cells 37. Further studies of the CiOX signalling are thus required 37-39.
The evolutionary events that led to the emergence of the ATR, the CiOX, and the vertebrate OX1 and OX2 receptors are unclear. The data presented in this study, together with the compelling evidence for two rounds of whole genome duplication in the lineage leading to vertebrates 38-40, can be reconciled into a simple evolutionary hypothesis: a single orthologous gene would encode the allatotropin receptors in protostomes and the orexin-like receptors in tunicates. The ability to be activated by the long orexin-like peptides instead of allatotropin peptides would have emerged through point mutations before the split between tunicates and vertebrates.
As shown in this study, about 16 point mutations at the binding site may explain the binding specificity for allatotropin and orexin ligands, at least in the deep-binding region. Following this split, the CiOX would have diverged at a high rate, and the OX1 and OX2 receptors would have emerged from the vertebrate whole-genome duplication. Two hypotheses are plausible: (a) either one of the ancestral OX genes was inactivated after the first round, or (b) two of the four duplicated orexin receptors were lost after the second round 38,41. There is no evidence of whole genome duplications in the C. intestinalis genome, which harbours approximately 16,000 predicted genes, including 169 suggested GPCR genes 42. While this hypothesis is parsimonious, it does not exclude other scenarios 21,41. More studies, in particular gene synteny analysis, could give further insights into the evolutionary relationships of genes between different organisms. It should, however, be noted that C. intestinalis has reduced gene synteny conservation 9, which makes this analysis challenging or even unfeasible.
The orexin system has a central role in many physiological functions in mammals (see Introduction). The allatotropin system, first identified in M. sexta, regulates the production of juvenile hormones, modulates the circadian clock and the myotropic activity, and also has a role in feeding 37,43,44. Thus, there is evidence for similar roles of the orexin and allatotropin systems, at least in the regulation of circadian activities and feeding. The function of the potential orexin system in tunicates is unknown, but it is not unreasonable to assume a function along similar lines.
Altogether, these data demonstrate that CiOX is an orexin receptor-like receptor, an evolutionarily distant relative of both the OX1 and OX2 receptors. Furthermore, we propose that Ci-orexin-A binds to hOX1 and CiOX and affects CiOX, although its actual sequence and post-translational modifications are uncertain.

Molecular phylogeny and three-dimensional structure modelling
The CiOX is encoded by the Ensembl gene ENSCING00000007467 17 (accessed 17th July 2023). Two transcripts may be produced from this gene, X1 and X2, which differ in their C-terminus. The X2 transcript (NCBI transcript XM_002127151.3; NCBI protein XP_002127187.1 46 and Uniprot F6YII5_CIOIN 45; accessed 17th July 2023) was obtained from the mRNA of animal tissue (a strong indication that this variant actually exists in nature) and cloned to the RIKEN cDNA library. Only the X2 transcript was known when this study started.
The phylogenetic relationships of CiOX with vertebrate orexin receptors and ATRs were investigated using the MEGA X package 47. The analysis included retrieval of orexin receptor and ATR amino acid sequences for organisms of different orders from the NCBI peptide database. This data set was supplemented with four human neuropeptide receptor sequences to root the tree (NPFF1 neuropeptide FF receptor, GAL2 galanin receptor, QRFP pyroglutamylated RFamide peptide receptor, and ETB endothelin receptor). The initial alignment produced by MEGA X was manually revised to align GPCR family-specific amino acids and to avoid gaps in the TMs (sequences and alignments are in Supplementary material 1-3). The phylogenetic analyses were conducted either with full sequences (599 positions, out of 798 possible positions in the alignment that did not contain many gaps) or restricted to the regions corresponding to TMs (231 positions out of 249). Three different tree construction methods were compared: minimum evolution (Poisson model), maximum likelihood (Jones-Taylor-Thornton model) and neighbour-joining (Poisson model). Five hundred bootstrap replicates were conducted in all cases.
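The bootstrap procedure mentioned above can be sketched in a few lines. This illustrates the general principle (resampling alignment columns with replacement, then reporting how often a clade recurs) and is not the MEGA X implementation; tree building itself is omitted, and the function names and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate).

    `alignment` maps sequence names to equal-length aligned strings; the same
    sampled columns are applied to every sequence.
    """
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def support_percent(replicates_with_clade, n_replicates=500):
    """Bootstrap support: % of replicate trees in which a given clade appears."""
    return 100.0 * replicates_with_clade / n_replicates


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = {"a": "ACDEFGHIKL", "b": "ACDEFGHIKL", "c": "LKIHGFEDCA"}
toy_replicate = bootstrap_alignment(toy_alignment, random.Random(0))
```

With 500 replicates, the 65% threshold mentioned in the Discussion corresponds to a clade being recovered in 325 replicate trees.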
Homology modelling of the CiOX receptor was conducted using the high-resolution crystal structure of the hOX2 receptor (PDB code: 5WQC 22). Regions absent from the template structure, such as the termini and intracellular loop 3, were not modelled. Five models were constructed using the MODELLER software v9.22 with default settings 48, and the model with the best discrete optimized protein energy (DOPE) score was selected for further analysis. The well-defined binding cavity of the crystal structure of the hOX2-suvorexant complex (PDB code 4S0V 25) was used for the definition of the CiOX binding pocket: residues with atoms within 5 Å of suvorexant, complemented with residues forming the four salt bridges in the binding site of hOX2.
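The distance-based part of the pocket definition above can be sketched as follows. This is a minimal illustration of a 5 Å cutoff selection, not the software used in the study; the function name, the residue labels and all coordinates are hypothetical, and treating the cutoff as inclusive is an assumption.

```python
import math


def pocket_residues(receptor_atoms, ligand_atoms, cutoff=5.0):
    """Residues with at least one atom within `cutoff` Å of any ligand atom.

    `receptor_atoms` is an iterable of (residue_label, (x, y, z)) tuples and
    `ligand_atoms` a list of (x, y, z) tuples, all in Å.
    """
    selected = set()
    for label, xyz in receptor_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            selected.add(label)
    return sorted(selected)


# Toy coordinates (hypothetical, for illustration only).
toy_ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
toy_receptor = [
    ("Q3.32", (3.0, 0.0, 0.0)),   # 1.5 Å from the second ligand atom
    ("Q3.32", (9.0, 0.0, 0.0)),   # far atom of the same residue
    ("H7.39", (0.0, 4.9, 0.0)),   # just inside the cutoff
    ("K6.58", (0.0, 0.0, 12.0)),  # outside the cutoff
]
toy_pocket = pocket_residues(toy_receptor, toy_ligand)
```

A residue is selected as soon as any one of its atoms falls within the cutoff, which is why salt-bridge partners lying slightly further away have to be added separately, as described in the text.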
Discovering a hit for the CiPPO proved to be challenging. We aimed to find an unassigned open reading frame for a PPO peptide containing more than 100 amino acids, the cysteine cap (pattern CCx n1 Cx n2 C), and the cleavage site (ILTL/GKR) characteristic of the vertebrate orexin-A peptide. Database searches were conducted using BLAST to query the Ensembl database (release 95) 49. Several vertebrate and invertebrate PPO segments coding the signal peptide, orexin-A and the cleavage site were queried against C. intestinalis and Ciona savignyi cDNA using TBLASTN (BLAST of a protein sequence on a nucleotide database; Blosum45, distant homologies, gap penalties 10 for opening and 3 for extension, low complexity regions filtered out). The initial hit was identified in C. savignyi cDNA using a zebrafish (Danio rerio) PPO segment as a query. The initial hit was further queried against C. intestinalis cDNA and the final hit was identified as an uncharacterized protein with mRNA evidence in NCBI; code: XM_026835150, reading frame 2, location in LOC113474376, chromosome 7 (accessed 17th July 2023).
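The three search criteria listed above (length, cysteine cap, cleavage site) can be expressed as a simple sequence filter. The sketch below is only an illustration of such a filter and is not the BLAST-based search that was actually performed. The spacer ranges for n1 and n2 are assumptions chosen to cover the cysteine spacings mentioned in the text, only the GKR part of the cleavage site is matched, and the toy sequence is hypothetical.

```python
import re

# Cysteine cap CC-x(n1)-C-x(n2)-C; spacer ranges are assumed, not from the source.
CYS_CAP = re.compile(r"CC[^C]{1,15}C[^C]{1,10}C")
# Basic cleavage signal expected downstream of the peptide.
CLEAVAGE = re.compile(r"GKR")


def is_candidate(orf_protein, min_length=101):
    """True if a translated ORF has >100 residues, a cysteine cap, and a later GKR site."""
    if len(orf_protein) < min_length:
        return False
    cap = CYS_CAP.search(orf_protein)
    if cap is None:
        return False
    # The cleavage site must lie downstream of the cysteine cap.
    return CLEAVAGE.search(orf_protein, cap.end()) is not None


# Toy ORF (hypothetical; the embedded segment only mimics the described pattern).
toy_orf = "M" + "A" * 60 + "PLPDCCRQKTCSCRLYELLHGAGNHAAGILTLGKR" + "A" * 20
toy_hit = is_candidate(toy_orf)
```

A filter of this kind is only as good as its assumed spacers: a cap with four cysteines at positions 6, 7, 18 and 25 requires n1 = 10 and n2 = 6, which a pattern tuned to the tetrapod spacing alone would miss.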
Following discovery of a suitable open reading frame, we analysed its putative 3D structure by homology modelling. A three-dimensional model of the Ci-orexin-A was constructed using the human orexin-A peptide (PDB code: 1WSO 50) as the template structure. The presence of a potential signal peptide was predicted with the SignalP-5.0 online tool (likelihood 0.9995) 51. One hundred homology models were constructed in order to cover multiple conformations, and the one with the highest DOPE score was selected for further analysis.
To model the structures of orexin-A and Ci-orexin-A in complex with CiOX, we used the active-state cryo-EM structure of hOX 2 bound to orexin-B (PDB ID: 7L1U 24 ) as a template.This structure was solved while the study

Plasmids and cloning
The cDNA of CiOX (transcript variant X2) in the pBluescriptII SK(-) vector (https://dna.brc.riken.jp/en/cloneseten/ciona_est_en) was obtained from RIKEN BioResource Research Center (Tsukuba, Japan). The stop codon was removed by polymerase chain reaction and the enhanced green fluorescent protein (eGFP) coding sequence was fused to the 3'-end of the receptor coding sequence. hOX1 was similarly fused to eGFP as described previously 54. The CiOX-eGFP and hOX1-eGFP constructs were transferred into the mammalian expression vector pcDNA5/FRT/TO (Invitrogen/ThermoFisher Scientific, Waltham, MA, USA). These constructs allow stable tetracycline-inducible mammalian expression of CiOX and hOX1 receptors in Flp-In T-REx 293 cells; the C-terminal eGFP enables visualization of the expression levels and the subcellular localizations of the receptors. Plasmid modifications were planned with the software SerialCloner2.6 55. The C. intestinalis coding sequence was not optimized for human cells as the receptor was well expressed as such in human Flp-In T-REx 293 cells (see later in Results).

Generation of the stable cell lines
The stable cell lines were based on the Flp-In T-REx 293 cells (ThermoFisher), which allow flippase-mediated recombination with a single integrated flippase recognition target (FRT) site. Recombination with the insert in pcDNA5/FRT/TO generates stable cell lines with tetracycline-inducible expression. The host cells were separately transfected with the receptor plasmids pcDNA5/FRT/TO-CiOX-eGFP and pcDNA5/FRT/TO-hOX1-eGFP (+ the accessory plasmid pOG44) using GeneJuice transfection reagent (Merck, Darmstadt, Germany) and cultured and selected according to the manufacturer's protocol. Four foci were picked for each cell type and tested for doxycycline-induced gene expression and zeocin sensitivity. One focus was selected for each cell type for further studies. The doxycycline-induced gene expression was further assessed by fluorescence microscopy (eGFP) 24 h and 48 h after treatment with different concentrations of doxycycline: 0, 0.1, 1, 10, 100 and 1000 ng/mL. A concentration of 100 ng/mL was chosen for the induction of the expression in further studies, to ensure comparable expression in both cell lines. Differences in the apparent expression level were not observed between the 24 h and 48 h timepoints.

Binding of TAMRA-orexin-A
We have previously observed that the N-terminus of orexin-A can tolerate additions without apparent loss of activity on orexin receptors (Kukkonen, unpublished), and it has been reported that fluorescently labelled orexin-A can be used as a probe for hOX1/2 56. We thus ordered custom synthesis of TAMRA-orexin-A. This peptide shows red fluorescence (555/580 nm) and could thus be used in microscopy to show binding to the membranes and co-localization with the green-fluorescent, eGFP-fused hOX1 and CiOX receptors.

Figure 1. Phylogenetic analysis of putative Ciona orexin receptors based on amino acid sequences. For the species codes, see Supplementary Table S1; for the sequences used, see Supplementary Information 1; and for the alignments, see Supplementary Information 2-3. The CiOX gene encodes two transcripts (X1 and X2), of which the X2 transcript isolated from C. intestinalis is used in this study. Trees were constructed based on either full sequences (left) or TMs only (right), with three different phylogenetic construction methods as indicated. Robustness was assessed by bootstrapping with 500 replicates (values in % at the intersections of the branches). Green, vertebrate orexin receptors; turquoise, cephalochordate orexin receptors; pink, protostome ATRs; purple, echinoderm/hemichordate orexin receptors; and yellow, tunicate orexin receptors. The tree is rooted (grey) with human sequences of NPFF1 (NPFFR), GAL2 (GALR2), QFR (QRFPR), and ETB (ENDRB) receptors.

Figure 2. Comparison of the receptor binding sites. (a) The determined binding site of hOX2 (PDB code 5WQC), (b) a homology model of CiOX and (c) a homology model of M. sexta ATR, each shown from two points of view. The top row presents mainly TM2 and TM7, and the bottom row mainly TM3, ECL2, TM5 and TM6. Residues conserved between hOX2, CiOX and ATR are shown as uncoloured sticks; non-conserved residues are shown in green (hOX2), light brown (CiOX) and coral (ATR). Yellow dashes indicate hydrogen bonds and salt bridges, while red dashes indicate long distance (4.0 Å) between the heavy atoms. Numbering is according to the Ballesteros-Weinstein convention 29. (d) Aligned protein sequences of hOX2, CiOX and ATR. Binding site residues are in black boxes, conservation is shown as a blue gradient, residues involved in salt bridges are in red letters, and salt bridge connections are in red numbers. Regions of the sequences not shown are indicated by blue vertical lines.

Figure 4. Comparison of human and C. intestinalis orexin peptides. Left, an NMR solution structure of human orexin-A (dark green; PDB code: 1WSO); right, a homology model of Ci-orexin-A (orange). Below, a sequence alignment of human orexin-B, orexin-A and Ci-orexin-A. Disulphide bridges are shown as red sticks in the models and red lines in the sequence alignment. Identical residues are represented as blue sticks and blue boxes. The alignment was manually revised to match the cysteines in orexin-A and Ci-orexin-A.

Figure 5. Binding of TAMRA-orexin-A to the plasma membrane and displacement by Ci-orexin-A(1) in CiOX-eGFP cells.Left, the cells before additions; middle, the cells after a 10 min incubation with 10 nM TAMRA-orexin-A; right, the cells after a further 10 min incubation with 1 µM Ci-orexin-A(1).Scale bar = 50 µm.

Figure 6. Ca2+ responses to receptor stimulation. Concentration-response curves for orexin-A, orexin-B, [A11, d-L15]orexin-B and orexin-A 15-33 in (a) wild-type and (b) CiOX-expressing cells. (c) The CiOX-specific response, obtained by subtracting the response in wild-type cells from the response in CiOX-expressing cells. Responses are normalized to the maximum ATP response (100 µM; 100%) separately for each independent sample to allow comparison between cell types. Data are presented as mean ± S.E.M. N = 3.
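The normalization and subtraction described in this caption can be sketched as follows. This is only an illustration under stated assumptions, not the authors' analysis procedure: the function names and all numbers are hypothetical, responses are assumed to be baseline-corrected peak values, and the wild-type curve is assumed to be subtracted at the level of per-concentration means (the caption does not say whether subtraction was done on means or on individual samples).

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_atp(response, atp_max):
    """Express one response as % of the maximal ATP response of the same sample."""
    if atp_max <= 0:
        raise ValueError("ATP maximum must be positive")
    return 100.0 * response / atp_max


def mean_sem(values):
    """Return (mean, S.E.M.) over independent samples; needs at least two values."""
    return mean(values), stdev(values) / sqrt(len(values))


def receptor_specific(ciox_means, wt_means):
    """Subtract the wild-type mean curve from the CiOX mean curve, point by point."""
    if len(ciox_means) != len(wt_means):
        raise ValueError("curves must cover the same concentrations")
    return [c - w for c, w in zip(ciox_means, wt_means)]


# Hypothetical (response, ATP maximum) pairs at one peptide concentration, N = 3.
ciox_pct = [percent_of_atp(r, a) for r, a in [(30, 100), (60, 150), (40, 80)]]
wt_pct = [percent_of_atp(r, a) for r, a in [(10, 100), (12, 120), (9, 90)]]

ciox_mean, ciox_sem = mean_sem(ciox_pct)
wt_mean, wt_sem = mean_sem(wt_pct)
specific = receptor_specific([ciox_mean], [wt_mean])
print(ciox_mean, ciox_sem, wt_mean, specific)
```

Normalizing each sample to its own ATP maximum before averaging removes sample-to-sample differences in cell number and dye loading, which is what makes the two cell types comparable.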

Figure 7. Sensitivity of the Ca2+ responses to the endogenous human orexin peptides (1 µM) to inhibitors (1 µM) in CiOX-expressing cells. The responses were normalized to the basal + inhibitor response (0%) and the control agonist peptide response (in the absence of inhibitor; 100%) separately for each independent experiment before averaging. Data are presented as mean ± S.E.M. N = 3-4. The significances are given in relation to the agonist control.
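The two-point normalization described in this caption amounts to a linear rescaling within each experiment. A minimal sketch, with a hypothetical function name and hypothetical values, and assuming one raw readout per condition:

```python
def percent_of_control(response, basal_with_inhibitor, agonist_control):
    """Rescale a response so that basal + inhibitor = 0% and agonist alone = 100%."""
    span = agonist_control - basal_with_inhibitor
    if span == 0:
        raise ValueError("control and basal responses must differ")
    return 100.0 * (response - basal_with_inhibitor) / span


# Hypothetical raw readouts from one experiment (arbitrary fluorescence units).
basal = 20          # inhibitor alone, no agonist
control = 100       # agonist alone, no inhibitor
inhibited = 60      # agonist in the presence of inhibitor

remaining = percent_of_control(inhibited, basal, control)
print(remaining)
```

Values from independent experiments would be rescaled in this way first and only then averaged, as the caption states.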

Figure 8. Comparison of the binding interactions of human and C. intestinalis orexin peptides. (a) Cryo-EM structure of human orexin-B (dark green) bound to hOX2 (light green) (PDB ID: 7L1U), (b) homology model of the putative binding mode of human orexin-A (dark green) to CiOX (pale yellow), and (c) homology model of the putative binding mode of Ci-orexin-A (orange) to CiOX (pale yellow). Binding site side chains conserved between hOX2 and CiOX are shown as uncoloured sticks. Non-conserved residues are shown in light green (hOX2) and pale yellow (CiOX). Residues conserved between human and C. intestinalis orexin peptides are shown as blue sticks.