## Main

The olfactory system faces a unique challenge amongst sensory modalities owing to the inordinate complexity of the chemical world. Whereas light waves vary continuously in amplitude and frequency, odorants differ discretely along an enormous number of dimensions in their molecular structure and physicochemical properties. Consequently, just three photoreceptors are sufficient to sense the entire spectrum of visible light, but large repertoires of olfactory receptors appear to be necessary to detect and discriminate amongst the diversity of chemicals in the environment1,2,3. In mammals, odour detection is mediated by G-protein-coupled receptors that signal through canonical second-messenger cascades5,6. By contrast, insects detect volatile chemicals using a unique class of odorant-gated ion channels7,8 consisting of two subunits: a conserved co-receptor (Orco) subunit9,10 and a highly divergent odorant receptor (OR) subunit that contains the odorant-binding site and confers chemical sensitivity to the heteromeric complex11.

Although mammals and insects rely on distinct molecular mechanisms for odour detection, they share a common neural logic for olfactory perception based on the combinatorial activation of distinct ensembles of olfactory receptors and associated sensory neurons1,3,12. Central to this sensory coding strategy is that most individual ORs detect a variety of structurally and chemically diverse odorants11,13,17,25. However, in the absence of a structural model, how such flexible chemical recognition is achieved has remained unknown. Whether the broad chemical tuning of ORs reflects the presence of multiple odorant-binding sites that differ in their chemical specificity or a single promiscuous binding pocket is not known. Furthermore, which structural or chemical features of odorants are recognized by a receptor remains unclear. In this study, we leveraged the evolutionary diversification of insect ORs to elucidate the structures of a homomeric receptor from a basal insect species bound to different ligands. We reveal how a single receptor can detect a wide array of odorants through a single promiscuous binding site that recognizes the overall physicochemical properties of each odorant rather than being tuned to any of their specific structural or molecular features, suggesting a structural basis for flexible chemical recognition.

## MhOR5 is a broadly tuned receptor

Although neopteran insects each express a repertoire of highly divergent ORs along with a single, almost invariant Orco2, recent genomic analyses have revealed that some basal insects, such as the jumping bristletail Machilis hrabei, possess only a small number of OR genes and lack an apparent Orco orthologue4 (Fig. 1a). The M. hrabei ORs (MhORs) have been proposed to represent the most ancestral members of the insect olfactory receptor family, arising before the emergence of Orco4,14. Although little is known about chemosensory detection in the jumping bristletail, we reasoned that MhORs might function as homomeric olfactory receptors. We heterologously expressed each MhOR in HEK293 cells and found that, indeed, MhOR1 and MhOR5 migrated as tetramers on non-denaturing native gels (Extended Data Fig. 1a, b). To assess whether these homomeric complexes function as chemoreceptors, we used a high-throughput fluorescence assay10 in which we co-expressed MhOR1 or MhOR5 with the indicator GCaMP6s and measured calcium influx in response to a panel of 54 small molecules over a range of concentrations. We found that MhOR5 was activated by many volatile odorants but not tastants, consistent with a role for this receptor in olfactory detection (Fig. 1b, Extended Data Fig. 2a–d). MhOR5 was also activated by the insect repellent DEET and inhibited by the synthetic Orco agonist VUAA115. To quantitatively capture the complexity of odorant-evoked responses16 (Extended Data Fig. 2a–d), we defined an activity index for each odorant (−log(EC50) × max ΔF/F; in which EC50 is the concentration of ligand at which the response reaches half its maximal value) that reflects both the apparent affinity and maximal efficacy of an odorant. MhOR5 was activated by over 65% of odorants, resembling the broad molecular receptive fields of many insect and mammalian ORs11,13,17 (Extended Data Fig. 1d). By contrast, MhOR1 exhibited far more selective tuning, responding to only eight odorants from the same chemical panel (Extended Data Fig. 1e). Both MhOR1 and MhOR5 were activated by ligands that spanned multiple chemical classes and a range of physicochemical properties (Extended Data Fig. 1e, f), exemplifying the complex chemical logic of odorant detection.
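The activity index defined in this section can be illustrated with a minimal Python sketch. It assumes a base-10 logarithm and an EC50 expressed in molar units, neither of which is specified in the text. The function name and the example values are hypothetical and are not taken from the dataset.

```python
import math


def activity_index(ec50_molar: float, max_dff: float) -> float:
    """Return -log10(EC50) * max dF/F for one odorant.

    Assumptions (not stated in the text): the logarithm is base 10
    and EC50 is given in mol/l.
    """
    if ec50_molar <= 0:
        raise ValueError("EC50 must be positive")
    return -math.log10(ec50_molar) * max_dff


# Hypothetical values for illustration only: an odorant with an
# EC50 of 100 uM and a maximal dF/F of 2.0.
example = activity_index(1e-4, 2.0)
print(f"activity index = {example:.2f}")
```

Under these assumptions, a lower EC50 (higher apparent affinity) or a larger maximal response both raise the index, which is why a single number can rank odorants that differ in either property.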

Whole-cell recordings of HEK cells expressing MhOR5 showed that the odorant eugenol elicited slowly activating inward currents that reversed at 0 mV, consistent with its function as a cation-selective ion channel (Fig. 1c). In outside-out patches, eugenol evoked small-conductance single-channel activity that rapidly flickered between the closed and open states, resembling canonical heteromeric insect olfactory receptors7,10 (Fig. 1c, Extended Data Fig. 2e). MhORs thus autonomously assemble as homotetrameric odorant-gated ion channels and display the divergent chemical tuning profiles typical of this receptor family. Given its sensitivity to a broad array of structurally diverse odorants, we focused on MhOR5 to investigate the molecular basis of promiscuous chemical recognition.

## Structure of the MhOR5 homotetramer

We used single-particle cryo-electron microscopy (cryo-EM) to elucidate the structure of the MhOR5 tetramer. We obtained a density map at 3.3 Å resolution (Fig. 1d, Extended Data Figs. 3, 4, Extended Data Table 1), which allowed us to unambiguously build a model for the majority of the protein, with the exception of several extra-membranous loops and the short intracellular N terminus and extracellular C terminus (Extended Data Figs. 4c, 5c). A comparison of the structure of MhOR5 with the previously elucidated structure of Orco from the wasp Apocrypta bakeri10 showed that these two receptors, despite sharing only about 18% amino acid conservation, display notable similarity, both in the fold of each heptahelical subunit and in the tetrameric organization of the subunits within the membrane plane (Extended Data Fig. 5a, b). As in Orco, each MhOR5 subunit contributes a single helix (S7b) to the central ion conduction pathway, and their S0–S6 helices form a loosely packed domain that projects radially away from the pore axis (Fig. 1d). Within the membrane, the contacts between MhOR5 subunits are minimal and confined to the pore, whereas about 75% of the residues that form inter-subunit interactions reside within the intracellular ‘anchor’ domain, formed from the intertwined S4–S7 helices of all four subunits (Extended Data Fig. 5d). Analogous to the Orco structure, the tightly packed anchor domain of MhOR5 exhibited the highest local resolution (Extended Data Fig. 4c), consistent with a structural role in stabilizing the loosely assembled S0–S6 transmembrane domains within the lipid bilayer. The limited sequence conservation across neopteran ORs and Orcos maps to residues predominantly within the pore and anchor domain10, further underscoring how the architecture of this receptor family can accommodate a high degree of sequence diversification while maintaining the same overall fold, a feature that is likely to have facilitated the rapid evolution of ORs.

## Odorant binding leads to pore opening

To explore the structural determinants of odorant gating, we determined a 2.9 Å-resolution structure of MhOR5 in complex with its highest activity ligand, eugenol. Three-dimensional reconstruction of the bound structure immediately yielded higher resolution, as was apparent from early stages of data processing (Extended Data Figs. 3, 4, Extended Data Table 1). The MhOR5 pore displays the same distinct quadrivial architecture as the Orco homotetramer10, in which a single extracellular pathway opens into a large aqueous vestibule near the intracellular surface of the membrane and then diverges into four lateral conduits formed at the interfaces between subunits (Fig. 2a, b). In the apo structure, the S7b helices coalesce to form the narrowest portion of the ion conduction pathway. In particular, Val468 protrudes into the channel lumen, generating a hydrophobic constriction measuring about 5.3 Å in diameter, and thus serves as a gate to impede the flow of hydrated ions through the quadrivial pore (Fig. 2a, b, d). In the presence of eugenol, the extracellular aperture of the pore is dilated as a result of movement of the S7b helices away from the central pore axis (Fig. 2b–d), which rotates Val468 out of the pore lumen to face the lipid bilayer, while Gln467 rotates in to face the ion pathway. As a consequence of this rearrangement, the chemical environment of the pore is transformed from a narrow hydrophobic constriction to a wide hydrophilic ring, 9.2 Å in diameter, that can readily accommodate the passage of hydrated cations. Notably, the remainder of the quadrivial pore remains essentially unaltered with the addition of eugenol (Fig. 2a–c), confirming that the tightly packed anchor domain forms a relatively stationary structural element10. 
The dilation of the S7b helices thus appears to be sufficient to gate the ion conduction pathway. Such a small conformational change would present a low energetic barrier to gating, consistent with the low affinity of most odorants11,17 and with functional evidence that MhOR5 channels, as with many insect olfactory receptors, open spontaneously even in the absence of ligand7,11 (Extended Data Fig. 1c).

Gln467 is highly conserved across Orcos and ORs from M. hrabei and other basal insect species14 and was previously identified as a component of the only signature sequence motif (TYhhhhhQF, in which h is any hydrophobic amino acid) that is diagnostic of the highly divergent insect chemosensory receptor superfamily18. Mutation of Gln467 in MhOR5 to either the smaller residue alanine or the positively charged residue arginine strongly impaired receptor function, whereas a more conservative mutation to asparagine had little effect on activity (Fig. 2e). Replacement of the neighbouring residue Val468 with either alanine or glutamine resulted in minimal changes to odorant activation (Fig. 2e), highlighting that movement of the pore helices, rather than simply the presence of a pore-lining glutamine, is necessary to gate the channel. In the closed structure of the Orco homotetramer10, the homologous residue, Gln472, points into the lipid membrane, similar to its position in the closed conformation of MhOR5. Mutation of Gln472 to alanine in Orco yielded non-functional homomeric channels (Fig. 2f). Gln472 is thus one of the few S7b residues in Orco that is intolerant to such a perturbation10, consistent with a conserved and critical role for this residue in gating and/or ion permeation across this receptor family. Notably, the Q472A Orco mutant could be partially rescued by co-expression with an OR from Anopheles gambiae (Fig. 2g), indicating that this mutant can fold and function in the context of the heteromeric assembly and underscoring the intrinsic robustness of the Orco–OR complex, where both subunits contribute to a shared ion conduction pathway10,19.
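As an illustration of how the signature motif TYhhhhhQF could be located in a protein sequence, the sketch below uses a regular expression. The set of residues treated as hydrophobic (`HYDROPHOBIC`) is an assumption made for this example, since the text only says that h is any hydrophobic amino acid. The test sequences are invented and are not receptor sequences.

```python
import re

# Assumption: one common choice of hydrophobic residues (one-letter codes).
# The text does not define which residues count as "h".
HYDROPHOBIC = "AVILMFWC"

# T, Y, five hydrophobic residues, then Q and F.
MOTIF = re.compile(rf"TY[{HYDROPHOBIC}]{{5}}QF")


def find_motif(sequence: str) -> list[int]:
    """Return 0-based start positions of TYhhhhhQF matches."""
    return [m.start() for m in MOTIF.finditer(sequence.upper())]


# Invented toy sequences, for illustration only.
print(find_motif("MKTYLLVIAQFGG"))  # motif present
print(find_motif("MKTYLLSIAQFGG"))  # serine breaks the hydrophobic run
```

Changing `HYDROPHOBIC` is the only adjustment needed to test a stricter or looser definition of the hydrophobic positions.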

## Architecture of the odorant-binding site

In the transmembrane domain of each MhOR5 subunit, the S2, S3, S4 and S6 helices splay apart to form a 15 Å-deep pocket within the extracellular leaflet of the bilayer (Fig. 3a, Extended Data Figs. 6, 7). Clearly defined density consistent with the size and shape of eugenol lies at the base of this pocket, enclosed within a hydrophobic box constructed from several large aromatic and hydrophobic residues, with Trp158 forming the lid, Tyr91 and Tyr383 forming its base, and flanked by Tyr380 on one side and by Met209, Ile213 and Phe92 on the other (Fig. 3b, Extended Data Figs. 6, 9d). In the apo structure, the density for some of these amino acids was diffuse (Extended Data Fig. 6b), which could be attributed to the overall lower resolution of this structure or to conformational flexibility when no odorant is bound. The lower resolution of the apo pocket precluded us from defining the path that eugenol takes to enter the pocket, as in the bound structure the pocket is not obviously accessible to solvent (Extended Data Figs. 6b, 7a), or from determining whether the cavity is partially occupied in the absence of an added ligand. Binding of eugenol, however, stabilized the residues that line the pocket, allowing unambiguous mapping of the side chains that form the binding site.

To explore the potential binding modes of eugenol, we used computational docking methods20 and performed a broad grid search spanning the majority of the transmembrane domain. This analysis identified a series of closely related eugenol poses with uniformly favourable docking scores that fit into the experimental density well (Extended Data Fig. 8a). At this resolution, differentiating between these poses is challenging given that eugenol, as with most odorants, is a small molecule with few distinguishing structural features to orient it within the density. Nevertheless, eugenol was predicted to bind through comparable interactions across all the top poses, but these interactions could be mediated by different hydrophobic or aromatic residues within the pocket. For example, the benzene ring of eugenol was stabilized through π-stacking interactions, but these could be mediated by Trp158, Tyr91, or Tyr380, which lie on opposing faces of the binding pocket. In every pose, eugenol also formed extensive hydrophobic interactions with an overlapping complement of aliphatic and aromatic side chains. Moreover, although eugenol’s hydroxyl was consistently oriented towards the only polar amino acid lining the pocket (Ser151), none of the predicted poses adopted a geometry that allowed them to form hydrogen bonds with the surrounding residues. Therefore, recognition of eugenol appears to rely on non-directional hydrophobic interactions formed with a distributed array of binding pocket residues. Although only one of these poses might be energetically favoured, structural studies of odorant binding proteins21,22 that ferry hydrophobic ligands through the sensory neuron lymph have revealed that an individual odorant can bind in different poses within the same hydrophobic binding cavity; thus, it is possible that eugenol might likewise sample from multiple energetically degenerate binding modes in MhOR5.

To functionally corroborate the eugenol binding site, we identified ten amino acids with side chains that were in close proximity to the ligand density—Val88, Tyr91, Phe92, Ser151, Gly154, Trp158, Met209, Ile213, Tyr380 and Tyr383—and found that mutation of any of these residues to alanine strongly altered eugenol signalling (Fig. 3c, Extended Data Fig. 9a–c). Several of these mutants also displayed increased baseline activity (Extended Data Fig. 9a, e), suggesting that these residues stabilize the closed conformation. Mutation of adjacent residues that project away from the binding site—Thr87 and Leu379—had minimal effect on activation by eugenol (Fig. 3c, Extended Data Figs. 6a, 9a), underscoring the specificity of these perturbations to odorant-dependent gating.

A comparison of the apo and eugenol-bound structures indicates that, in addition to the dilation of the pore, smaller conformational changes appear to be distributed throughout the transmembrane portion of the S0–S6 helices (Extended Data Fig. 10a, b and Supplementary Videos 1, 2). Although the delocalized nature of these small rearrangements makes it challenging to delineate how odorant binding is transduced to pore opening, one potential route is through the S5 helix, which runs parallel to the S7b helix that lines the pore and anti-parallel to the S6 helix that contributes key residues to the odorant-binding pocket. Upon eugenol binding, these three helices move together away from the central axis of the channel, displacing the S7b helices outwards to gate the ion conduction pathway (Extended Data Fig. 10a, b). Close to the extracellular surface of the membrane, the S5 and S7 helices interact through Tyr362 and Leu465, which are highly conserved as hydrophobic amino acids and evolutionarily coupled23, pointing to a coordinated role in receptor function. These residues remain tightly packed as the S7b helix moves into an open configuration (Extended Data Fig. 10b), suggesting that they might couple conformational rearrangements within the odorant-binding pocket to the dilation of the pore. Mutation of either Tyr362 or Leu465 to alanine impaired eugenol activation, whereas mutation of Tyr362 to phenylalanine had no effect (Extended Data Fig. 10c), supporting a model in which hydrophobic interactions at this position contribute to gating.

## Structural basis of odorant specificity

To investigate the diversity of binding modes used by different ligands, we determined the 2.9 Å structure of MhOR5 in complex with the insect repellent DEET (Extended Data Table 1). The S7b helices in the DEET-bound structure were dilated to a diameter of 8.7 Å (Extended Data Fig. 10d–f), indicating that different ligands elicit a common conformational change to gate the pore. Density corresponding to DEET localized to the same binding pocket as eugenol, encased within the same box-like configuration of aromatic and aliphatic side chains (Fig. 4a, b, Extended Data Figs. 6b, 9d). As with eugenol, computational docking of DEET yielded multiple poses with comparable docking scores that fit the experimental density well (Extended Data Fig. 8a). Whereas each of the top poses was predicted to adopt a distinct orientation, all were stabilized through a similar complement of hydrophobic and/or π-stacking interactions. Although we cannot determine whether DEET adopts only one or multiple conformations within the binding pocket, these observations reinforce how non-directional hydrophobic interactions may contribute to flexible chemical recognition, allowing different ligands to bind to the same structural locus or potentially enabling a single odorant to sample from multiple poses within the binding cavity.

To investigate whether the broader panel of MhOR5 ligands is recognized through a similar structural logic, we examined how their physicochemical descriptors correlated with receptor activity. Multiple regression analysis revealed that although no single metric was strongly predictive of agonism, the descriptors that best accounted for MhOR5 activity were low polar surface area, low water solubility, and low potential for forming hydrogen bonds (Extended Data Table 2), consistent with our structural observations of a geometrically simple binding site in which diffuse hydrophobic interactions dominate. MhOR1 agonism was less correlated with these descriptors, suggesting that they have a heterogeneous role in shaping the tuning of different receptors (Extended Data Table 2). Furthermore, the top 31 MhOR5 agonists identified in our panel could be docked within this same binding site with favourable scores, stabilized predominantly through hydrophobic interactions (Extended Data Fig. 8), suggesting that diverse odorants can be recognized through distributed and non-directional interactions with an overlapping subset of residues in the MhOR5 binding pocket.
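The kind of multiple regression described here can be sketched in plain Python as an ordinary least-squares fit of an activity value against several descriptors. This is not the analysis pipeline used in the study, which is not described in this section. The descriptor values and coefficients below are synthetic toy numbers chosen only to show the mechanics, and `ols_fit` is a hypothetical helper.

```python
def ols_fit(descriptors, activity):
    """Least-squares fit of activity = b0 + b1*x1 + ... via normal equations.

    descriptors: list of tuples, one tuple of descriptor values per odorant.
    Returns [b0, b1, ...], intercept first.
    """
    rows = [[1.0] + [float(v) for v in d] for d in descriptors]
    p = len(rows[0])
    # Augmented normal-equation matrix [X^T X | X^T y].
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * y for r, y in zip(rows, activity))]
        for i in range(p)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, p):
            factor = m[r][col] / m[col][col]
            for k in range(col, p + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    beta = [0.0] * p
    for i in range(p - 1, -1, -1):
        rest = sum(m[i][k] * beta[k] for k in range(i + 1, p))
        beta[i] = (m[i][p] - rest) / m[i][i]
    return beta


# Synthetic toy data: (polar surface area, hydrogen-bond count) per odorant.
toy_x = [(20, 0), (40, 1), (60, 1), (80, 2), (100, 3)]
# Toy activity built to fall with both descriptors; not measured values.
toy_y = [6.0 - 0.04 * psa - 0.5 * hb for psa, hb in toy_x]
print(ols_fit(toy_x, toy_y))
```

In this toy case the negative coefficients mirror the qualitative trend reported above, in which lower polar surface area and lower hydrogen-bonding potential accompany higher activity.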

A comparison of the eugenol and DEET-bound structures reveals how the MhOR5 binding pocket might accommodate such diverse ligands. The constellation of amino acids lining the binding pocket retains the same overall geometry in both structures, leaving the architecture of the hydrophobic box largely unchanged. However, a small displacement of the S4 helix results in expansion of the pocket, probably to accommodate the longer aliphatic moiety of DEET and avoid a steric clash with the side chain of Met209 (Fig. 4c, Extended Data Fig. 6b). Functional data support these structural observations. Mutation of Met209 to smaller hydrophobic amino acids (valine or alanine) enhanced the affinity of DEET (Fig. 4c, Extended Data Fig. 9b). The same mutations attenuated eugenol sensitivity, suggesting that this smaller odorant occupies the binding pocket less optimally in the absence of the bulky methionine side chain. Conversely, mutation of Ile213, another aliphatic S4 residue that lies in close proximity to DEET, to the larger residue methionine abolished DEET sensitivity but marginally altered eugenol signalling (Fig. 4c, Extended Data Fig. 9c). Structure-guided mutagenesis therefore differentially altered the sensitivity of MhOR5 to these two ligands. Furthermore, the I213M and M209V mutations broadly reconfigured the tuning of MhOR5 to a larger panel of 40 odorants (Fig. 4d), supporting a model in which diverse chemicals are recognized by shared structural elements within a common binding pocket. Changes in odorant tuning, however, did not adhere to a simple logic, consistent with the complexity of physicochemical properties that define MhOR5 agonism (Extended Data Table 2) and with the proposal that both the global geometry and local chemical environment of the binding pocket contribute to its chemical sensitivity.

To assess whether MhOR5 can serve as a structural model for chemical recognition in other ORs, we used sequence homology to identify ten residues predicted to line the binding pocket in the more narrowly tuned MhOR1 and examined their contribution to odorant tuning (Extended Data Fig. 11a–c). For all but one of these residues, mutation to alanine impaired MhOR1 activation by its ligands, 1-octanol and eugenol, indicating that the odorant binding pocket is a conserved structural feature of this family, even between divergent receptors that display distinct chemical tuning. Furthermore, mutation of Met231 in MhOR1 to the corresponding residue in MhOR5, isoleucine, enhanced the sensitivity of MhOR1 to a panel of odorants (Extended Data Fig. 11d). Thus, whereas the I213M mutation narrows the chemical tuning of MhOR5, the reciprocal M231I mutation broadens the molecular receptive range of MhOR1, shifts in sensitivity that could be attributed to alterations in the size of the binding pocket. Odorant recognition in different insect olfactory receptors appears therefore to rely on a conserved binding site that can be readily retuned to detect different regions of chemical space.

## Discussion

The broad tuning of olfactory receptors is central to the detection and discrimination of the vast chemical world. Here we show that MhOR5 detects a wide array of odorants through a single promiscuous binding site, offering structural insight into how such flexible chemical recognition is achieved. Notably, odorant binding relies predominantly on hydrophobic interactions, which lack the strict geometric constraints inherent to other intermolecular associations (such as hydrogen bonds) that frequently mediate ligand recognition. The distributed arrangement of hydrophobic and aromatic residues across multiple surfaces of the binding pocket further relaxes orientational constraints by allowing odorants to form comparable interactions with many of its faces. Moreover, the simple geometry of the binding site imposes minimal restriction on the shape of odorants that can bind, accommodating both eugenol and DEET with little structural rearrangement. Computational docking analyses support these structural observations and suggest that the same logic underlies the sensitivity of MhOR5 to structurally and chemically diverse ligands. The prevalence of comparatively weak intermolecular interactions is compatible with the low affinity of most odorants11,13,17,24 and the small conformational change required to gate the channel. Olfactory receptor tuning thus depends on the stereochemistry of its ligands25,26, but does not adhere to the classic lock-and-key mechanism that governs many receptor–ligand interactions.

Residues that have been implicated in odorant specificity in different neopteran receptors map to the binding pocket of MhOR510,27,28,29,30, indicating that it represents a conserved and canonical locus for odorant detection across this highly divergent family. Binding of DEET to the same site offers structural corroboration that this insect repellent might exploit the promiscuity of diverse ORs and serve as a molecular ‘confusant’ by scrambling the olfactory code31. Other modulators of olfactory receptors, such as VUAA1 (which inhibits MhOR5), cannot favourably dock within this binding pocket owing to their much larger size, suggesting that insect olfactory receptors might possess additional points of allosteric modulation that expand their signalling mechanisms.

Several important implications arise from our observation that diverse odorants share the same structural determinants for binding. Notably, even single conservative mutations within the binding pocket can broadly reconfigure the chemical tuning of the receptor, a feature that is likely to have facilitated the rapid evolution of receptors with distinct ligand specificity2,27,28,29. However, such extensive retuning also poses a substantial evolutionary constraint, as individual binding-site mutations are likely to have a pleiotropic effect on the representation of multiple odorants, potentially serving to broadly reconfigure the odour code. The promiscuous and arbitrary nature of odorant recognition is likely to impose substantial selective pressures on the structure and function of olfactory circuits, driving the evolution of synaptic and circuit mechanisms that can decorrelate, decode, and impose meaning onto combinatorial patterns of receptor activity32. Odour discrimination is thus transformed from a biochemical problem at the receptor level to a neural coding problem within the brain.

Although the structure of a mammalian olfactory receptor has yet to be elucidated, odorant detection in mammals has been proposed to also rely on distributed hydrophobic and non-directional interactions within a deep transmembrane pocket33,34,35. Structurally and mechanistically distinct receptor families therefore appear to rely on similar principles for their broad chemical tuning, pointing to common constraints in how diverse hydrophobic molecules can be recognized. Additional mechanisms for odorant recognition certainly exist, in particular for receptors that are selectively tuned to ethologically relevant chemical classes, such as pheromones36, the perceptual meaning of which is singular and invariant. Whether stricter odorant specificity relies on distinct intermolecular binding modes, variations in the geometry of the binding pocket, or both, remains to be determined.

Finally, our work sheds light on the evolution of the insect olfactory system. We demonstrate that MhORs can function as homomeric odorant-gated channels, supporting the proposal that they lie at the ancestral origin of the insect olfactory receptor family4,14, which expanded massively across insect lineages to emerge as possibly the largest and most divergent class of ion channels in nature2. Why neopteran ORs became obligate heteromers with Orco remains unclear, but presumably reflects the fact that Orco confers structural stability on the complex, thereby relaxing evolutionary constraints on the ORs and allowing them to further diversify, to ultimately support the flexible detection and discrimination of an enormous and ever-changing chemical world.

## Methods

### Expression and purification of MhOR5

The coding sequence of M. hrabei OR5 (MhOR5) was synthesized as a gene fragment (Twist). Residues Lys2 to Pro474 were cloned into a pEG BacMam vector37 containing N-terminal tags of Strep II, superfolder GFP38, and an HRV 3C protease site for cleavage (N-CACCatg-ST2-SGR-sfGFP-PPX-AscI-MhOR5-taa-NotI-C). The AscI/NotI restriction enzyme sites enable efficient cloning of different OR sequences. Sf9 cells (ATCC CRL-1711) were used to produce baculovirus containing the MhOR5 construct, and the virus, after three rounds of amplification, was used to infect HEK293S GnTI− cells (ATCC CRL-3022)37. Cell lines were not authenticated except as performed by the vendor. HEK293S GnTI− cells were grown at 37 °C with 8% carbon dioxide in Freestyle 293 medium (Gibco) supplemented with 2% (v/v) fetal bovine serum (Gibco). Cells were grown to 3 × 10⁶ cells/ml and infected at a multiplicity of infection of about 1. After 8–12 h, 10 mM sodium butyrate (Sigma-Aldrich) was added to the cells and the temperature was dropped from 37 °C to 30 °C for the remainder of the incubation. Seventy-two hours after initial infection, cells were collected by centrifugation, washed with phosphate-buffered saline (pH 7.5; Gibco), weighed and flash frozen in liquid nitrogen. Pellets were stored at −80 °C until they were thawed for purification.

For purification, cell pellets were thawed on ice and resuspended in 20 ml lysis buffer per gram of cells. Lysis buffer was composed of 50 mM HEPES/NaOH (pH 7.5), 375 mM NaCl, 1 μg/ml leupeptin, 1 μg/ml aprotinin, 1 μg/ml pepstatin A, 1 mM phenylmethylsulfonyl fluoride (PMSF; all from Sigma-Aldrich) and about 3 mg DNase I (Roche). MhOR5 was extracted using 0.5% (w/v) n-dodecyl β-d-maltoside (DDM; Anatrace) with 0.1% (w/v) cholesterol hemisuccinate (CHS; Sigma-Aldrich) for 2 h at 4 °C. The mixture was clarified by centrifugation at 90,000g and the supernatant was added to 0.1 ml StrepTactin Sepharose resin (GE Healthcare) per gram of cells and rotated at 4 °C for 2 h. The resin was collected and washed with 10 column volumes (CV) of 20 mM HEPES/NaOH, 150 mM NaCl with 0.025% (w/v) DDM and 0.005% (w/v) CHS (together, SEC buffer). MhOR5 was eluted by adding 2.5 mM desthiobiotin (DTB) and cleaved overnight at 4 °C with HRV 3C Protease (EMD Millipore). The sample was concentrated to about 5 mg/ml and injected onto a Superose 6 Increase column (GE Healthcare) equilibrated in SEC buffer. Peak fractions containing MhOR5 were concentrated until the absorbance at 280 nm reached 5–6 (approximately 5 mg/ml) and immediately used for grid preparation and data acquisition. For the eugenol-bound structure, peak fractions were pooled, and eugenol (Sigma-Aldrich, CAS#97-53-0) dissolved in dimethylsulfoxide (DMSO; both Sigma-Aldrich) was added for a final odorant concentration of 0.5 mM, and the complex was incubated at 4 °C for 1 h. The maximum DMSO concentration was kept below 0.07%. The complex was then concentrated to approximately 5 mg/ml and used for grid preparation. For the DEET-bound structure, the sample from the overnight cleavage step was concentrated to about 5 mg/ml and injected onto the Superose 6 Increase column equilibrated in SEC buffer with 1 mM DEET (Sigma-Aldrich, CAS#134-62-3). Peak fractions were concentrated to about 5 mg/ml and used immediately for grid preparation.
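The per-gram quantities in this protocol scale linearly with pellet mass, which the following sketch makes explicit. It also shows the arithmetic linking the 0.5 mM final eugenol concentration to the DMSO limit, assuming that the 0.07% figure is volume per volume and that the volume of the odorant itself is negligible. Both are assumptions rather than statements from the protocol. The function names and the 12 g example pellet are hypothetical.

```python
def prep_volumes(pellet_g: float,
                 lysis_ml_per_g: float = 20.0,
                 resin_ml_per_g: float = 0.1) -> dict:
    """Scale lysis buffer and affinity resin volumes to the pellet mass."""
    return {
        "lysis_buffer_ml": pellet_g * lysis_ml_per_g,
        "resin_ml": pellet_g * resin_ml_per_g,
    }


def min_stock_mM(final_mM: float, max_solvent_fraction: float) -> float:
    """Lowest stock concentration that reaches final_mM without
    exceeding the allowed solvent fraction (assumed v/v)."""
    return final_mM / max_solvent_fraction


# Hypothetical 12 g pellet, for illustration only.
print(prep_volumes(12.0))
# 0.5 mM eugenol with DMSO capped at 0.07% (treated here as 0.0007 v/v).
print(round(min_stock_mM(0.5, 0.0007), 1))
```

Under these assumptions, the eugenol stock in DMSO would need to be roughly 0.7 M or more concentrated to stay within the stated solvent limit.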

### Cryo-EM sample preparation and data acquisition

Cryo-EM grids were frozen using a Vitrobot Mark IV (FEI) as follows: 3 μl of the concentrated sample was applied to a glow-discharged Quantifoil R1.2/1.3 holey carbon 400 mesh gold grid, blotted for 3–4 s in >90% humidity at room temperature, and plunge frozen in liquid ethane cooled by liquid nitrogen.

Cryo-EM data were recorded on a Titan Krios (FEI) operated at 300 kV, equipped with a Gatan K2 Summit camera. SerialEM39 was used for automated data collection. Movies were collected at a nominal magnification of 29,000× in super-resolution mode resulting in a calibrated pixel size of 0.51 Å/pixel, with a defocus range of approximately −1.0 to −3.0 μm. Fifty frames were recorded over 10 s of exposure at a dose rate of 1.22 electrons per Å² per frame.

Movie frames were aligned and binned over 2 × 2 pixels using MotionCor240 implemented in RELION 3.041, and the contrast transfer function parameters for each motion-corrected image were estimated using CTFFIND442.
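The acquisition parameters above imply a few derived quantities that are not stated explicitly. The sketch below computes the total dose per movie, the exposure time per frame, the pixel size after 2 × 2 binning and the corresponding Nyquist limit. The function name is hypothetical, and the default values are the ones given in this section.

```python
def acquisition_summary(n_frames: int = 50,
                        exposure_s: float = 10.0,
                        dose_per_frame: float = 1.22,
                        super_res_pixel_A: float = 0.51,
                        bin_factor: int = 2) -> dict:
    """Derive per-movie quantities from the stated acquisition settings."""
    binned_pixel = super_res_pixel_A * bin_factor
    return {
        # electrons per square angstrom accumulated over the whole movie
        "total_dose_e_per_A2": n_frames * dose_per_frame,
        # seconds of exposure contributing to each frame
        "frame_time_s": exposure_s / n_frames,
        # angstrom per pixel after binning the super-resolution frames
        "binned_pixel_A": binned_pixel,
        # finest resolvable spacing at this sampling (2 x pixel size)
        "nyquist_A": 2 * binned_pixel,
    }


for key, value in acquisition_summary().items():
    print(f"{key}: {value:.2f}")
```

With the stated settings, the total dose is about 61 electrons per Å² and the binned pixel size is about 1.02 Å, which places the Nyquist limit near 2.04 Å, comfortably beyond the 2.9 to 3.3 Å resolutions reported for the maps.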

#### Apo structure

Two datasets were collected with 4,050 micrographs in dataset A and 3,748 micrographs in dataset B. Each dataset was processed independently as follows: particles were picked using a 3D template derived from an initial model built from 5,000 manually picked particles. A total of 562,794 (dataset A) and 536,145 (dataset B) particles were subjected to 2D classification using RELION 3.041. Particles from the best 2D classes (210,833 for dataset A, 183,061 for dataset B) were selected and subjected to 3D classification imposing C4 symmetry and adding a soft mask to exclude the detergent micelle after 25 iterations. One class from each dataset containing 44,884 (dataset A) and 43,788 (dataset B) particles was clearly superior in completeness and definition of the transmembrane domains. These particles were subjected to 3D refinement with C4 symmetry, followed by Bayesian polishing and CTF refinement. The polished particles from both datasets were exported to cryoSPARC v243 and processing continued with the joined dataset of 88,672 particles. In cryoSPARC, further heterogeneous refinement resulted in a single class with 49,832 particles that were subjected to particle subtraction with a micelle mask. Non-uniform refinement of the subtracted particles imposing C4 symmetry yielded the final map with an overall resolution of 3.3 Å as estimated by cryoSPARC with a cutoff for the Fourier shell correlation (FSC) of 0.14344.
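For readers unfamiliar with the FSC criterion, the following is a minimal sketch of how a resolution estimate can be read from two half-maps at the 0.143 cutoff. It is not the cryoSPARC implementation, which applies masking and further corrections; the function names and the synthetic random maps are assumptions made for illustration.

```python
import numpy as np

def fsc_curve(half_map_1, half_map_2):
    """Fourier shell correlation between two cubic half-maps of equal shape."""
    f1 = np.fft.fftshift(np.fft.fftn(half_map_1))
    f2 = np.fft.fftshift(np.fft.fftn(half_map_2))
    n = half_map_1.shape[0]
    # Shell index = rounded distance of each Fourier voxel from the origin.
    grid = np.indices(half_map_1.shape) - n // 2
    shell = np.sqrt((grid ** 2).sum(axis=0)).round().astype(int)
    keep = shell < n // 2                       # drop corners beyond Nyquist
    idx = shell[keep]
    cross = np.bincount(idx, weights=(f1 * np.conj(f2)).real[keep])
    power1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[keep])
    power2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[keep])
    return cross / np.sqrt(power1 * power2)

def resolution_at_threshold(fsc, box_size, pixel_size, threshold=0.143):
    """Resolution (angstroms) of the first shell whose FSC drops below the threshold."""
    below = np.nonzero(fsc < threshold)[0]
    if below.size == 0:
        return 2 * pixel_size                   # never crosses: report Nyquist
    first_shell = max(int(below[0]), 1)         # guard against division by zero
    return box_size * pixel_size / first_shell

# Synthetic demonstration with random maps (not experimental data).
rng = np.random.default_rng(0)
signal = rng.normal(size=(16, 16, 16))
identical = fsc_curve(signal, signal)
unrelated = fsc_curve(signal, rng.normal(size=(16, 16, 16)))
print(resolution_at_threshold(identical, box_size=16, pixel_size=1.02))
```

A map correlated with itself gives an FSC of 1 in every shell, so the estimate falls back to the Nyquist limit, whereas two unrelated noise maps give values scattered around 0.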

#### Ligand-bound structures

The eugenol-bound and DEET-bound structures were processed through the following pipeline: 4,410 (eugenol) and 4,365 (DEET) micrographs were collected and used to pick 461,254 (eugenol) and 787,448 (DEET) particles that were extracted, unbinned and exported into cryoSPARC v2. In cryoSPARC, several rounds of 2D classification resulted in 221,339 (eugenol) and 180,874 (DEET) particles that were used to generate an initial model with four classes with no imposed symmetry. These models were used as templates for a heterogeneous refinement with no imposed symmetry, from which one (eugenol) and two (DEET) final classes were selected containing 129,031 (eugenol) and 121,441 (DEET) particles. These particles were refined and exported to RELION 3.0, where they were subjected to a round of 3D classification with no imposed symmetry. The best class from this 3D classification contained 54,900 (eugenol) and 56,191 (DEET) particles that were subjected to Bayesian polishing and CTF refinement. Polished particles were then imported into cryoSPARC v2 and subjected to particle subtraction. Final non-uniform refinement with C4 symmetry imposed resulted in the final maps with an overall resolution of 2.9 Å in both cases, estimated with a cutoff for the FSC of 0.143. In all cases, the four-fold symmetry of the channel was evident from the initial 2D classes without imposed symmetry, and refinements without imposed symmetry produced four-fold symmetric maps.

### Model building

The cryo-EM structure of Orco (Protein Data Bank (PDB) accession 6C70) was used as a template for homology modelling of MhOR5 using Modeller45, followed by manual building in Coot46. The 3.3 Å density map of the apo structure was of sufficient quality to build the majority of the protein, with the exception of the S3–S4 and S4–S5 loops, the 13 N-terminal residues and the 5 C-terminal residues. The models were refined using real-space refinement implemented in PHENIX47 for five macro-cycles with four-fold non-crystallographic symmetry and secondary structure restraints applied. The eugenol- and DEET-bound models were refined including the ligands, which were initially placed within the corresponding density in a pose obtained through docking methods (described below), with restraints obtained with the electronic Ligand Builder and Optimization Workbench58 (eLBOW) implemented in PHENIX. Model statistics were obtained using MolProbity. Models were validated by randomly displacing the atoms in the original model by 0.5 Å, and refining the resulting model against half maps and full map48. Model–map correlations were determined using phenix.mtriage. Images of the maps were created using UCSF ChimeraX49. Images of the model were created using PyMOL50 and UCSF ChimeraX49.

### Docking analysis

All compounds were docked using Glide20,51 implemented in Maestro (Schrödinger, suite 2020). In brief, the model was imported into Maestro and prepared for docking. A 20 Å³ cubical search grid was built centred on the region of observed ligand density. Ligand structures were imported into Maestro by their SMILES unique identifiers and prepared using Epik52 to generate their possible tautomeric and ionization states, all optimized at pH 7.0 ± 2. All ligands were docked within the built grid, and the top poses that best fit the density are presented in Extended Data Fig. 8. The top activators scored with values between −7.4 and −4. While all activators docked with negative scores, some non-activators also docked with favourable scores. For example, caffeine docked favourably despite not activating the channel in our functional experiments. As docking does not incorporate the dynamics of the receptor, docking scores are not expected to correlate uniformly or monotonically with the experimentally determined activity of ligands; at most, a qualitative agreement can be expected.

### Structure analysis

Residues at subunit interfaces were identified using PyMOL as any residue within 5 Å of a neighbouring subunit (Extended Data Fig. 5d). The pore diameters along the central axis and lateral conduits were calculated using the program HOLE53, which models atoms as solid spheres of van der Waals radius (Fig. 2a–c, Extended Data Fig. 10d, e). Two calculations were performed for each structure: one along the central four-fold axis (central pore) and another between subunits near the cytosolic membrane interface (lateral conduits). The plots in Fig. 2b and Extended Data Fig. 10e show the diameter along the central axis of the main conduit and the lateral conduit. The measurements in Fig. 2d and Extended Data Fig. 10f between residues lining the pore are taken from atom centres using PyMOL. Electrostatic surface representations were generated using ChimeraX v1.1, with coulombic estimation and default parameters (Extended Data Fig. 7). Morph videos were created in ChimeraX v1.1 with direct interpolation between states.

### Electrophysiology

HEK293 cells were maintained in high-glucose Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% (v/v) fetal bovine serum (FBS) and 1% (v/v) GlutaMAX (all Gibco) at 37 °C with 5% (v/v) carbon dioxide. Cells were plated on 35-mm tissue-culture-treated Petri dishes 48–72 h before recording, and infected with the same pEG BacMam GFP-tagged MhOR5 construct used for expression 24–48 h before recording. Electrodes were drawn from borosilicate patch glass (Sutter Instruments) and polished (MF-83, Narishige Co.) to a resistance of 3–6 MΩ when filled with pipette solution. Analogue signals were digitized at 20 kHz (Digidata 1440A, Molecular Devices) and filtered at 1 kHz (whole-cell) or 2 kHz (patch recordings) using the built-in four-pole Bessel filter of a Multiclamp 700B patch-clamp amplifier (Molecular Devices) in whole-cell or patch mode. Whole-cell recordings were baseline-subtracted offline. Patch signals were further resampled offline for representations.

Whole-cell and single-channel recordings in Fig. 1c and Extended Data Fig. 2e were performed using an extracellular (bath) solution composed of 135 mM NaCl, 5 mM KCl, 2 mM MgCl2, 2 mM CaCl2, 10 mM glucose, 10 mM HEPES-Na/HCl (pH 7.3, 310 mOsm/kg) and an intracellular (pipette) solution composed of 150 mM KCl, 10 mM NaCl, 1 mM EDTA-Na, 10 mM HEPES-Na/HCl (pH 7.45, 310 mOsm/kg). Single-channel recordings were done in excised outside-out mode. Stock eugenol solution was prepared by dissolving in DMSO at 150 mM, and working solutions were prepared by diluting stocks to 3 μM in extracellular solution. Solutions were locally perfused using a microperfusion system (ALA Scientific Instruments).

### Cell-based GCaMP fluorescence calcium flux assay

All DNA constructs used in this assay were cloned into a modified pME18s vector with no fluorescent marker, flanked by AscI/NotI restriction enzyme sites for efficient cloning. Each transfection condition contained 0.5 μg of a plasmid encoding GCaMP6s (Addgene #40753) and 1.5 μg of the plasmid encoding the appropriate olfactory receptor, diluted in 250 μl OptiMEM (Gibco). In experiments with heteromeric olfactory receptors, the total amount of DNA was 1.5 μg, in a 1:1 ratio of Orco:OR. These were diluted in a solution containing 7 μl Lipofectamine 2000 (Invitrogen) and 250 μl OptiMEM, followed by a 20-min incubation at room temperature. HEK293 cells were maintained in high-glucose DMEM supplemented with 10% (v/v) FBS and 1% (v/v) GlutaMAX at 37 °C with 5% (v/v) carbon dioxide. Cells were detached using trypsin and resuspended to a final concentration of 1 × 10⁶ cells/ml. Cells were added to each transfection condition, mixed and added to 2 × 16 wells in a 384-well plate (Greiner CELLSTAR). Four to six hours later, a 16-port vacuum manifold on low vacuum was used to remove the transfection medium, which was replaced by fresh FluoroBrite DMEM (Gibco) supplemented with 10% (v/v) FBS and 1% (v/v) GlutaMAX. Twenty-four hours later, this medium was replaced with 20 μl reading buffer (20 mM HEPES/NaOH (pH 7.4), 1× HBSS (Gibco), 3 mM Na2CO3, 1 mM MgSO4, and 2 or 5 mM CaCl2) in each well. The calcium concentration was optimized for each receptor to account for their differences in baseline activity: for experiments with MhOR5 and MhOR5 mutants, reading buffer contained 2 mM CaCl2, while 5 mM CaCl2 was used for MhOR1, Orco and Orco–AgOR28 heteromers. The fluorescence emission at 527 nm, with excitation at 480 nm, was continuously read by a Hamamatsu FDSS plate reader. After 30 s of baseline recording, an optimized amount of odorant solution (10 μl for all MhOR-containing experiments or 20 μl for all Orco-containing experiments) was added to the cells and read for 2 min. All solutions were warmed to 37 °C before beginning.

Seven ligand concentrations were used for each transfection condition in threefold serial dilutions, alongside a control well of only reading buffer. Ligands were dissolved in DMSO to 150 mM, then diluted with reading buffer to a highest final-well concentration of 0.5 mM (DMSO never exceeded 0.5%). Water-soluble ligands (arabinose, caffeine, denatonium, glucose, MSG, sucrose) were dissolved directly into reading buffer. If experimental data indicated a more sensitive response than this range, the concentration was adjusted accordingly. Ligand concentrations for mutants were the same as for the corresponding wild-type OR. Each plate contained a negative control of GCaMP6s transfected alone and exposed to eugenol for MhOR5 and VUAA1 for Orco experiments. Additionally, each plate included the corresponding wild-type OR with its cognate ligand (MhOR5 and MhOR1 with eugenol, Orco with VUAA1, and Orco–AgOR28 with acetophenone) as a positive control to account for plate-to-plate variation in transfection efficiency and cell count. A control of DMSO alone was also tested to ensure that no activity effects were due to the solvent. Each concentration of ligand was applied to four technical replicates, which were averaged and considered a single biological replicate.
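As a worked example of the concentration series, the sketch below assumes a threefold serial dilution starting from the 0.5 mM top concentration, and estimates the DMSO carried over from a 150 mM stock under the assumption of simple volumetric dilution of a stock made in neat DMSO. The function name is illustrative.

```python
def dilution_series(top_conc, fold=3.0, n_points=7):
    """Concentrations of a serial dilution, highest first."""
    return [top_conc / fold ** i for i in range(n_points)]

stock_mM = 150.0      # ligand stock in DMSO
top_mM = 0.5          # highest final-well concentration

series_mM = dilution_series(top_mM)
# Fraction of the well volume that is stock solution, expressed as % (v/v) DMSO.
dmso_percent = 100.0 * top_mM / stock_mM

for conc in series_mM:
    print(f"{conc * 1000:.2f} uM")
print(f"DMSO at top concentration: {dmso_percent:.2f}% (v/v)")
```

Under these assumptions the series spans roughly 500 μM down to about 0.7 μM, and the DMSO content at the top concentration is about 0.33%, consistent with the stated 0.5% ceiling.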

The baseline fluorescence (F) was calculated as the average fluorescence of the 30 s before odour was added to the plate. Within each well, ΔF was calculated as the difference between the average of the last 10 s of fluorescence and the baseline F. ΔF/F was then calculated as ΔF divided by the baseline fluorescence (F). Finally, to account for inevitable variations in transfection efficiency and cell counts across different plates, the ΔF/F for each concentration was normalized to the maximum ΔF/F value of the corresponding positive control present on each plate (MhOR5 and MhOR1 with eugenol, Orco with VUAA1, and Orco–AgOR28 with acetophenone). The normalized ΔF/F averaged across all experiments for a given condition is the value used to construct the dose–response curves in all plots (Figs. 1b, 2e–g, Extended Data Figs. 2d, 9a–c, 10c, 11b). All wild-type curves come from the same plates as the experimental data in the same plot. Baseline values for wild-type and mutant channels were found by normalizing each F value by the negative GCaMP6s-only control on the same plate (Extended Data Figs. 1c, 9a, e).
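The ΔF/F calculation and plate normalization described above can be sketched as follows. The function names, the 1-s sampling interval and the synthetic trace are assumptions for illustration; the plate reader's actual sampling rate is not stated here.

```python
from statistics import mean

def delta_f_over_f(times_s, fluorescence, odour_time_s=30.0, tail_s=10.0):
    """dF/F for one well: (mean of the last `tail_s` seconds - baseline) / baseline."""
    baseline = mean(f for t, f in zip(times_s, fluorescence) if t < odour_time_s)
    t_end = times_s[-1]
    tail = mean(f for t, f in zip(times_s, fluorescence) if t > t_end - tail_s)
    return (tail - baseline) / baseline

def normalize_to_control(dff_values, control_dff_values):
    """Scale dF/F values by the maximum dF/F of the positive control on the same plate."""
    control_max = max(control_dff_values)
    return [value / control_max for value in dff_values]

# Synthetic trace: 30 s of baseline followed by 120 s after odour addition,
# sampled once per second (assumed sampling interval).
times = [float(t) for t in range(150)]
trace = [100.0 if t < 30 else 150.0 for t in times]

dff = delta_f_over_f(times, trace)
print(dff)
print(normalize_to_control([0.1, 0.25, 0.5], control_dff_values=[0.2, 0.5]))
```

Dividing by the positive control measured on the same plate expresses each response as a fraction of that plate's reference response, which is what makes curves from different plates comparable.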

For all experiments, GraphPad Prism 8 was used to fit the dose–response curves to the Hill equation, from which the EC50 of the curve was extracted. Three metrics were used to characterize the dose–response curve for each ligand: activity index, log(EC50) and max ΔF/F. For conditions where the EC50 was too high for the dose–response curve to reach saturation, and the curve therefore could not be fitted to a Hill equation, a value of −2 was assigned to log(EC50), which corresponds to an EC50 more than an order of magnitude higher than the highest concentration used. Max ΔF/F is the maximum response achieved at the highest concentration. Activity index is defined as the negative product of log(EC50) and max ΔF/F, as follows:

$${\rm{Activity}}\,{\rm{index}}=-\log ({{\rm{EC}}}_{50})\times {\rm{max}}\,\Delta F/F$$
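A minimal sketch of the activity index follows. It assumes a base-10 logarithm and EC50 expressed in molar units, which is the reading under which a placeholder log(EC50) of −2 (10 mM) lies above the 0.5 mM top concentration. The three-parameter Hill form and the function names are likewise assumptions; the fits themselves were done in GraphPad Prism, and this code does not perform any fitting.

```python
import math

def hill_response(conc_M, ec50_M, top, hill_slope=1.0):
    """Hill equation with zero baseline: response at a given ligand concentration."""
    return top * conc_M ** hill_slope / (ec50_M ** hill_slope + conc_M ** hill_slope)

def activity_index(max_dff, ec50_M=None, unfit_log_ec50=-2.0):
    """Activity index = -log10(EC50) x max dF/F.

    Pass ec50_M=None for curves that could not be fitted; log10(EC50) is then
    set to the placeholder value of -2.
    """
    log_ec50 = unfit_log_ec50 if ec50_M is None else math.log10(ec50_M)
    return -log_ec50 * max_dff

# Hypothetical ligands (values chosen only to illustrate the metric).
print(hill_response(1e-5, ec50_M=1e-5, top=1.0))   # half-maximal response at EC50
print(activity_index(max_dff=1.0, ec50_M=1e-5))    # potent, full agonist
print(activity_index(max_dff=0.1))                 # weak ligand with unfit curve
```

Because the index multiplies potency by efficacy, a ligand scores highly only if it both activates at low concentration and evokes a large response.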

### Gels and small-scale transfections

For western blots and fluorescence-detection size-exclusion chromatography (FSEC) traces (Extended Data Figs. 1a, b, 9g), HEK293 cells were maintained in high-glucose DMEM supplemented with 10% (v/v) FBS and 1% (v/v) GlutaMAX at 37 °C with 5% (v/v) carbon dioxide. Cells were detached using trypsin and plated in six-well plates at a concentration of 0.4 × 10⁶ cells per well. Twenty-four hours later, cells were transfected with 2 μg of DNA in the same superfolder GFP-containing pEG BacMam vector used for large-scale purification and 9 μl Lipofectamine 2000 (Invitrogen) diluted in 700 μl OptiMEM and added dropwise to the cells after a 5-min incubation. Twenty-four hours later, cells were checked for GFP fluorescence, rinsed with phosphate-buffered saline, and collected by centrifugation. Cells were either frozen at −20 °C or used immediately.

Cell pellets were rapidly thawed and resuspended in 200 μl lysis buffer containing 50 mM HEPES/NaOH (pH 7.5), 375 mM NaCl, an EDTA-free protease inhibitor cocktail (Roche), and 1 mM PMSF. After 10 s of sonication in a water bath, the protein was extracted for 2 h at 4 °C by adding 0.5% (w/v) DDM with 0.1% (w/v) CHS. This mixture was then clarified by centrifugation and filtered. The supernatant was added to a Shimadzu autosampler connected to a Superose 6 Increase column equilibrated in SEC buffer. An aliquot of the supernatant was also used to run SDS–PAGE (Bio-Rad, 12% Mini-PROTEAN TGX) and Blue Native (BN)-PAGE (Invitrogen, 3–12% Bis-Tris) gels. Gels were transferred using Trans-Blot Turbo Transfer Pack (Bio-Rad) and blocked overnight. The following day, blots were probed with rabbit anti-GFP polyclonal antibody (Life Technologies; 1:20,000), washed, incubated with anti-rabbit secondary antibody (1:10,000), and imaged with ImageLab.

The lifetime sparseness54,55 measure in Extended Data Fig. 1d was used to quantify olfactory receptor tuning breadth and calculated as follows:

$${\rm{Lifetime}}\,{\rm{sparseness}}=\,\left(\frac{1}{1-\frac{1}{n}}\right)\times \left(1-\frac{{\left({\sum }_{i=1}^{n}\frac{{{\rm{res}}}_{i}}{n}\right)}^{2}}{{\sum }_{i=1}^{n}\frac{{{\rm{res}}}_{i}^{2}}{n}}\right),$$

in which n is the number of ligands in the set, and res_i is the receptor’s response to a given ligand i. All inhibitory responses (values below 0) were set to 0 before the calculation54,55. The Drosophila melanogaster OR dataset comes from the DoOR database56.
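The lifetime sparseness formula can be written directly as a short function. The sketch below follows the equation as given, including clipping inhibitory responses to 0; the function name and the example response vectors are illustrative, and the handling of an all-zero response set (raising an error) is an assumption, as that case is not addressed here.

```python
def lifetime_sparseness(responses):
    """Lifetime sparseness of one receptor across a ligand panel.

    Returns 0 for equal responses to every ligand and 1 for a response
    to a single ligand only.
    """
    clipped = [max(r, 0.0) for r in responses]   # inhibitory responses set to 0
    n = len(clipped)
    if n < 2:
        raise ValueError("at least two ligands are required")
    mean_of_squares = sum(r * r for r in clipped) / n
    if mean_of_squares == 0:
        raise ValueError("sparseness is undefined when all responses are zero")
    square_of_mean = (sum(clipped) / n) ** 2
    return (1.0 / (1.0 - 1.0 / n)) * (1.0 - square_of_mean / mean_of_squares)

# Hypothetical response vectors.
print(lifetime_sparseness([1.0, 0.0, 0.0, 0.0]))   # narrowly tuned
print(lifetime_sparseness([2.0, 2.0, 2.0]))        # broadly tuned
```

A receptor that responds to a single ligand scores 1 and one that responds equally to all ligands scores 0, so lower values indicate broader tuning.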

### Multiple regression analysis

A set of 11 molecular descriptors was compiled for all 54 ligands tested from PubChem, Sigma-Aldrich, ChemSpider, EPA, and The Good Scents Company; the values used are in Supplementary Table 9. A multiple regression analysis using the scikit-learn Linear Regression module was used to assess the accuracy with which the receptor activity could be predicted by individual descriptors (1-dimensional analysis) or combinations of two descriptors (2-dimensional analysis) (Extended Data Table 2). Owing to the absence of reported metrics for some ligands (acetic acid, citric acid, MSG, sucrose, denatonium, and VUAA1), the analysis was performed on the remaining 48 ligands. For the 1-dimensional analysis, a single-variable linear regression was performed for each descriptor independently. The analysis sought to fit a linear model with coefficients w₁, …, wₙ₊₁, in which n is the dimension of the input data. The optimal coefficient set was determined by minimizing the residual sum of squares between the observed activity index values and those predicted by the linear model. This process was repeated for the 2-dimensional case, using every unique combination of two descriptors across the 11-dimensional space. As a means of assessing the predictive power of a given combination, the R² value, reflecting the square of the correlation coefficient between observed and modelled values of the activity index, was calculated for each linear model and reported in Extended Data Table 2. This allowed ranking of descriptor sets based on accuracy of prediction.
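The descriptor screen can be sketched as an ordinary least-squares fit with an intercept for every single descriptor and every unordered pair. The analysis itself used scikit-learn; the version below substitutes NumPy's least-squares solver to stay self-contained, and the function names and synthetic data (48 ligands, 11 descriptors) are illustrative rather than the values in Supplementary Table 9.

```python
from itertools import combinations
import numpy as np

def r_squared(X, y):
    """R^2 of a least-squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    centred = y - y.mean()
    return 1.0 - (residual @ residual) / (centred @ centred)

def rank_descriptor_sets(descriptors, activity, size):
    """Rank every unordered set of `size` descriptor columns by R^2, best first."""
    n_descriptors = descriptors.shape[1]
    scores = {cols: r_squared(descriptors[:, list(cols)], activity)
              for cols in combinations(range(n_descriptors), size)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Synthetic data: activity depends only on descriptors 0 and 3.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(48, 11))
activity = 2.0 * descriptors[:, 0] - 3.0 * descriptors[:, 3] + 1.0

singles = rank_descriptor_sets(descriptors, activity, size=1)
pairs = rank_descriptor_sets(descriptors, activity, size=2)
print(len(singles), len(pairs))
print(pairs[0])
```

With 11 descriptors there are 11 single-descriptor models and 55 two-descriptor models; the order of descriptors within a pair does not change the fit, so unordered combinations suffice.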

### Sequence alignments

For Extended Data Fig. 11a, the alignment between the sequences of MhOR1 and MhOR5 was done using MAFFT implemented in JalView57 with minimal manual adjustment based on the structure of MhOR5. For Extended Data Fig. 5a, the sequence alignment between A. bakeri Orco and MhOR5 was done by aligning the published structure of A. bakeri Orco (PDB 6C70) and the structure of MhOR5 in PyMOL. All sequence alignments were visualized and plotted using JalView57.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.