Bifunctional CYP81AA proteins catalyse identical hydroxylations but alternative regioselective phenol couplings in plant xanthone biosynthesis

Xanthones are natural products present in plants and microorganisms. In plants, their biosynthesis starts with regioselective cyclization of 2,3′,4,6-tetrahydroxybenzophenone to either 1,3,5- or 1,3,7-trihydroxyxanthones, catalysed by cytochrome P450 (CYP) enzymes. Here we isolate and express CYP81AA-coding sequences from Hypericum calycinum and H. perforatum in yeast. Microsomes catalyse two consecutive reactions, that is, 3′-hydroxylation of 2,4,6-trihydroxybenzophenone and C–O phenol coupling of the resulting 2,3′,4,6-tetrahydroxybenzophenone. Relative to the inserted 3′-hydroxyl, the orthologues Hc/HpCYP81AA1 cyclize via the para position to form 1,3,7-trihydroxyxanthone, whereas the paralogue HpCYP81AA2 directs cyclization to the ortho position, yielding the isomeric 1,3,5-trihydroxyxanthone. Homology modelling and reciprocal mutagenesis reveal the impact of S375, L378 and A483 on controlling the regioselectivity of HpCYP81AA2, which is converted into HpCYP81AA1 by sextuple mutation. However, the reciprocal mutations in HpCYP81AA1 barely affect its regiospecificity. Product docking rationalizes the alternative C–O phenol coupling reactions. Our results help understand the machinery of bifunctional CYPs.

Cytochrome P450 (CYP) enzymes constitute a large superfamily of haem-thiolate proteins present in almost all living organisms 1. They catalyse regio- and stereoselective oxidative attack of non-activated carbons at physiological conditions. In plants, CYPs are encoded by ~1% of the protein-coding genes, the catalytic functions of the majority of the gene products remaining unknown 1. Assigning functions to new CYPs is a challenge due to the large number of proteins, the lack of reliable sequence-function correlations, and the structural similarities between family members 1. Commonly, CYPs in plants catalyse hydroxylation (mono-oxygenation) reactions in both the general and the specialized metabolism. However, a few enzymes exhibit unusual activities 2,3, including methylenedioxy bridge formation, successive oxidation of a single carbon, rearrangement of carbon skeletons, S-heterocyclization, sterol desaturation and C-C bond cleavage, as well as phenol coupling [4][5][6][7]. Intramolecular C-C and intermolecular C-O phenol coupling reactions were studied at the gene level in benzylisoquinoline alkaloid metabolism [7][8][9]; however, intramolecular C-O phenol coupling was only established biochemically in plant xanthone biosynthesis 10.
Xanthones (Fig. 1a) embody a group of natural products found in fungi, lichens and higher plants 11 . Their distribution and substitution patterns are chemotaxonomically significant 12 . To date, more than 1,500 xanthone-based natural products are available in public databases. The major groups involve simple oxygenated, glycosylated and prenylated compounds beside bisxanthones, xanthonolignoids and miscellaneous xanthones 11,13 . Plant xanthones are defense compounds against herbivores and microorganisms 14 . In humans, xanthones exhibit an array of pharmacological activities, including anti-Alzheimer properties 11,15 .

Results
Cloning of CYP81AA1 from H. calycinum cell cultures. Yeast-extract-treated H. calycinum cell cultures accumulate the 1,3,7-triHX derivative hyperxanthone E (Supplementary Fig. 1), which is undetectable in non-treated cell cultures 20. Under these conditions, a suppression subtractive hybridization (SSH) cDNA library was constructed, which was enriched in elicitor-inducible transcripts. Subsequent bioinformatic analysis of assembled contigs encoding fragments of candidate CYPs resulted in identification of CYP71, CYP72, CYP81 and CYP706 sequences (Supplementary Table 1). Notably, CYP81 was represented by multiple contigs, two of which, 21 and 59, were supported by 22 and 33 copies, respectively, indicating high elicitor responsiveness. CYP81 belongs to the CYP71 clan comprising shikimate-modifying enzymes 21. In contrast, the CYP72 clan represented by CYP72A with a high copy number (Supplementary Table 1) is related to isoprenoid metabolism. When searched against H. perforatum transcriptomes, which are publicly available in the 'Medicinal Plant Genomics Resource' (MPGR) databank (http://medicinalplantgenomics.msu.edu), the contigs 21 and 59 shared 95 and 97.3% identity, respectively, with the predicted amino acid sequence of locus 416 and therefore appeared to be related to the same gene. To test whether the non-overlapping contigs 21 and 59 belong to the same sequence, a forward primer from the upstream contig 59 and a reverse primer from the downstream contig 21 were designed. Using 5′ RACE-ready cDNA as a template, a core fragment of 1,075 bp was amplified. The lacking 204 and 248 bp portions towards the start and stop codons, respectively, were amplified using 5′ and 3′ RACE techniques, resulting in a 1,830 bp full-length cDNA, which was reamplified by proof-reading polymerase. The 1,527 bp open reading frame (ORF) encoded a 508 amino acid protein, which was named CYP81AA1 by the CYP nomenclature committee 22. The enzyme constituted a new CYP81 subfamily.
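The fragment lengths quoted above can be cross-checked with simple arithmetic. The following Python sketch is only an illustration, not part of the original workflow; it assumes that the 204 and 248 bp portions extend the 1,075 bp core fragment exactly to the start and stop codons, and that the ORF length counts one stop codon. The function name is hypothetical.

```python
# Sanity check of the lengths reported for HcCYP81AA1 (values from the text).
CORE_BP = 1075        # core fragment amplified between contigs 59 and 21
MISSING_5P_BP = 204   # portion lacking towards the start codon
MISSING_3P_BP = 248   # portion lacking towards the stop codon
PROTEIN_AA = 508      # length of the encoded protein


def orf_length_from_protein(n_residues: int) -> int:
    """Coding length in bp: three bases per residue plus one stop codon."""
    return 3 * n_residues + 3


# Assumption: core fragment plus both missing portions spans the whole ORF.
assembled_orf_bp = CORE_BP + MISSING_5P_BP + MISSING_3P_BP

print(assembled_orf_bp, orf_length_from_protein(PROTEIN_AA))
```

Under these assumptions both routes give the same ORF length, which agrees with the 1,527 bp stated in the text.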
Cloning of NADPH-cytochrome P450 reductase. The activity of CYPs commonly depends on cooperation with a NADPH-CYP reductase (CPR) as electron-transfer partner 23. The SSH cDNA library contained a CPR cDNA core fragment, which lacked 1,707 and 244 bp toward the start and stop codons, respectively. The missing 3′ stretch was amplified by 3′ RACE using RNA from elicitor-treated H. calycinum cell cultures. For 5′-extension, a degenerate upstream primer led to elongation by 1,362 bp and 5′ RACE then yielded the remaining 345 bp up to the start codon. The 2,599 bp full-length cDNA, which was re-amplified by proofreading polymerase, contained a 2,139 bp ORF encoding a 712 amino acid protein. This enzyme shared 75, 79.1 and 79.2% identity with the CPR2 enzymes from Arabidopsis thaliana 24, Gossypium hirsutum 25 and Populus trichocarpa 26, respectively, and was therefore designated as HcCPR2. In contrast to animals, which have a single CPR 27, plants contain multiple CPR isoforms, which group in two classes distinguishable by their divergent N-termini 26,28. In Arabidopsis, cotton and centaury, expression of class II CPR is induced by wounding, light and elicitation, whereas class I CPR is constitutively regulated 24,25. Consistently, class II CPR transcripts were cloned from the SSH cDNA library of H. calycinum and isolated from cell cultures after elicitor treatment. The full-length HcCPR2 cDNA was cloned in MCS2 of the pESC-URA vector and expressed in yeast. Microsomes were isolated and the ability of HcCPR2 to transfer electrons from NADPH to cytochrome c was verified 29.
H. calycinum CYP81AA1 encodes 1,3,7-TXS. The HcCYP81AA1 and HcCPR2 coding sequences were ligated into the multicloning sites 1 and 2, respectively, of the pESC-URA vector. After co-expression in yeast, microsomes were isolated and incubated with eleven potential benzophenone and xanthone substrates in the presence of NADPH (Supplementary Table 2, Supplementary Fig. 2 and Supplementary Methods). Products formed were analysed by high-performance liquid chromatography (HPLC). Incubation with 2,3′,4,6-tetraHB resulted in formation of a single product, whose retention time (Rt 19.6 min) matched that of authentic 1,3,7-triHX (Fig. 2a). In addition, the UV and mass spectra agreed with those of the reference compound (Supplementary Figs 3 and 4). When incubated with 2,4,6-triHB, the microsomal fraction formed two products, whose Rt values (15.1 and 19.6 min) matched those of authentic 2,3′,4,6-tetraHB and 1,3,7-triHX, respectively (Fig. 2b). The identities were verified by mass spectrometry (MS/MS) and UV spectroscopy (Supplementary Figs 3 and 4). Incubation with 2,4-dihydroxybenzophenone (2,4-diHB) also yielded two products (Fig. 2c). The product with Rt 13.7 min was identified as 2,2′,4,5′-tetrahydroxybenzophenone (2,2′,4,5′-tetraHB) in comparison with authentic compound (Supplementary Figs 3 and 4). The product with Rt 17.1 min was tentatively identified as 2,3′,4-trihydroxybenzophenone (2,3′,4-triHB) using MS/MS (Supplementary Fig. 4). Incubation with 2,3′,4,4′,6-pentahydroxybenzophenone (2,3′,4,4′,6-pentaHB, maclurin) yielded a single product, which shared the Rt value (22.3 min) and the UV and mass spectra with authentic 1,3,6,7-tetrahydroxyxanthone (1,3,6,7-tetraHX, norathyriol) (Fig. 2d and Supplementary Figs 3 […] also detected in control incubations, confirming its previously reported presence as a contaminant in 2,3′,4,4′,6-pentaHB samples 30. Control assays without NADPH failed to form enzymatic products from all substrates used. Likewise, microsomes from yeast harbouring the empty plasmid did not produce any of the products. No enzymatic activity was observed with the benzophenones and xanthones illustrated in Supplementary Fig. 2. Thus, HcCYP81AA1 was identified as a bifunctional enzyme, which exhibits both B3′H and 1,3,7-TXS activities.
To correlate the expression profiles of HcCYP81AA1, HcCPR2 and HcBPS with the previously published accumulation pattern of hyperxanthone E in elicitor-treated H. calycinum cell cultures 20 , changes in the transcript levels were analysed over 48 h by reverse transcription quantitative real-time PCR (RT-qPCR) (Fig. 3). Actin and Histone H2A served as reference genes. The transcript levels of all genes started to increase 4 h post elicitation, peaked at 8 h and decreased thereafter. Their increases thus preceded the accumulation of hyperxanthone E, which started 12 h post elicitation 20 5). In contrast, transcripts encoded by locus 8128 were more abundant in flower organs.
Molecular modelling suggests residues for mutagenesis. To gain insight into the regioselectivity-determining elements of CYP81AA1 and CYP81AA2, the amino acid sequences were aligned, secondary structure elements were predicted 32, residues were numbered according to the standardized numbering scheme for class II CYPs 33, and the putative six substrate recognition sites (SRSs) were identified 34,35. The predicted secondary structure elements of the two enzymes matched, except for β3-3, β4-1 and β4-2, located in the C-terminal region (Fig. 4a and Supplementary Fig. 8). Therefore, the 55 C-terminal amino acids were reciprocally exchanged between the two enzymes to identify selectivity-determining residues. However, replacement of this portion of CYP81AA2 with that of CYP81AA1 resulted in a complete loss of enzyme activity and the chimeric protein was not subjected to further mutations.
A second round of mutations involved reciprocal site-directed mutagenesis of individual amino acids located in the SRSs. The two CYP81AA1 orthologues differed in only one residue (V/I220, standard position 189) within the six predicted SRSs. In contrast, HpCYP81AA1 and HpCYP81AA2 had a total of 145 divergent residues including 15 within the SRSs plus 7 gaps, which were expected to be responsible for the alternative regioselectivities ( Fig. 4a and Supplementary Fig 8). Therefore, changing one or more of these residues towards their counterparts in the other enzyme was likely to affect the nature of the cyclization product. To reduce the number of possible single and multiple reciprocal mutations, homology models of both enzymes were built using two independent methods, YASARA and HHpred 36 . The final HHpred models were based on the structure of the closed conformation of mammalian CYP2B4 in complex with the inhibitor ticlopidine (3kw4) (ref. 37). Upon examination of the superimposed models, 17 residues were located within a 4 Å radius of the bound inhibitor in either protein model. Five of these residues belonging to the SRSs 5 and 6 differed between the two enzymes (Supplementary Table 4 and Fig. 4b). In addition, S375 in CYP81AA1, which corresponds to L378 in CYP81AA2 (standard position 330.1), was selected as a mutagenesis target. This position undergoes a change from hydrophilic to hydrophobic and was previously predicted to affect the selectivity of CYPs 35 .
Wild-type CYP81AA1 showed absolute specificity towards the production of 1,3,7-triHX. Substitution of the 55 C-terminal residues by those of CYP81AA2 failed to cause any change in the regiospecificity. Similarly, reciprocal mutations in SRSs 5 and 6 had only minor effects on the regioselectivity (Supplementary Table 5 […] product binding, indicating hindrance of internal rotation in the 2,3′,4,6-tetraHB substrate (Fig. 5b). In the wild-type enzyme (1,3,5-TXS), the 3-hydroxy group of the product can form a hydrogen bond with S375, orienting it toward the haem loop. The 5-hydroxy group points toward the haem group. At the substrate level, rotation of the 3′-hydroxyphenyl ring of 2,3′,4,6-tetraHB is hindered by the side chains of the I-helix residues. In the sextuple mutant, the 3-hydroxy group of the product can form a new hydrogen bond with A483T, orienting it towards the C-terminal loop. This hydrogen bond switching can explain the observed cooperative effect of the double mutant S375A/A483T. The mutation introduced in the former H-bonded residue S375A together with the nearby mutation L378S may alter the conformation of the haem loop to cooperatively support the reorientation of the substrate. The 7-hydroxy group of the product is located between the I-helix and the haem group, which again indicates hindrance of free rotation of the 3′-hydroxyphenyl ring of 2,3′,4,6-tetraHB.

Discussion
The interface between the benzophenone and xanthone biosynthetic pathways is a ring closure reaction, which converts the diphenyl ketone scaffold to a tricyclic ring system. This cyclization step is preceded by 3′-hydroxylation, which favours subsequent substitution at the ortho and para positions of the benzophenone. We demonstrate here that both consecutive reactions are catalysed by bifunctional CYPs, two variants of which accomplish the ring closure reaction in a regioselective manner. Hydroxylation by the oxygen rebound mechanism is the classical CYP-catalysed reaction, inserting one atom of molecular oxygen into the substrate and reducing the second to water 38. The major oxidant intermediate in CYP reactions is the so-called compound I, a highly reactive iron-oxo radical species 38,39. However, benzophenone cyclization was proposed to be an oxidative phenol coupling, reducing both atoms of molecular oxygen to water 2,3. The TXSs catalyse both the hydroxylation and the coupling reactions and thus function as mono-oxygenase and oxidase (Supplementary Fig. 9). The previously proposed coupling mechanism involves two one-electron oxidations 10. Formation of an intermediate biradical, as offered for phenol couplings in morphine, corytuberine and cyclodipeptide biosyntheses 8,40,41, is energetically unfavourable and the underlying mechanisms have recently been revised 3,7,10,42. Homology modelling and product docking revealed that CYP81AA1 and CYP81AA2 accommodate distinct conformers of 2,3′,4,6-tetraHB. The 3′-hydroxyphenyl ring lacks free rotation in the active site cavities, resulting in cyclization to 1,3,7- and 1,3,5-triHXs, respectively. At the xanthone level, no hydroxylation by the two CYPs takes place, as indicated by the non-acceptance of 1,3-dihydroxyxanthone and derivatives.
Notably, 2,4-diHB as substrate was 3′-hydroxylated by both CYP81AA1 and CYP81AA2; however, the identical product 2,3′,4-triHB lacked cyclization because of the absence of the 6-hydroxy group. Instead, it underwent 2′-hydroxylation, yielding 2,2′,4,5′- and 2,2′,3,4′-tetraHBs, respectively. Subsequent loss of water, as previously proposed for ortho-ortho′-dihydroxylated benzophenones 43, was hindered by hydrogen bonding between the 2-hydroxy and the carbonyl groups. However, 2′-hydroxylation and water elimination may be an alternative reaction mechanism for the physiological substrate 2,4,6-triHB (ref. 43). With 2,3′,4,4′,6-pentaHB, ring closure took place but lacked regioselectivity and both enzymes formed 1,3,6,7-tetraHX. This conversion was the only reaction catalysed by CYP81AA3, the physiological function of which remains open. None of the CYPs studied here converted monohydroxylated benzophenones, such as 2- and 4-hydroxybenzophenones.
Identification of genes encoding TXSs with alternative regioselectivities offered the chance of exploring their selectivity determinants. Sequence comparison, homology modelling, and reciprocal site-directed mutagenesis demonstrated the impact of individual SRS residues on the selection of the alternative substrate conformers. CYP81AA2 (1,3,5-TXS) was almost completely converted into CYP81AA1 (1,3,7-TXS). Three amino acids in close proximity to the haem group (S375, L378 and A483; equivalent to the standard positions 328, 330.1 and 437, respectively 33,44) were most influential in orienting the coupling reaction ortho to the 3′-hydroxy group. S375 and L378 are located within SRS-5, which spans the loop between the EXXR motif and the β1-4 strand, whereas A483 belongs to SRS-6. S375 (standard position 328) occupies position 5 behind the EXXR motif, its side chain pointing towards the haem group in the majority of structurally studied CYPs. The position is a mutation hotspot 35,44 and preferentially occupied by a hydrophobic amino acid 35. However, CYP81AA2 contains a polar serine residue, which may be required for the interaction with the polyhydroxylated substrates. In a number of CYPs, mutations at this standard position 328 resulted in dramatic changes in the regio- and stereoselectivities 35,44,45. The point mutation F363I entirely switched the regiospecificity of spearmint limonene 6-hydroxylase to that of peppermint limonene 3-hydroxylase 45. In CYP81AA2, however, the point mutation S375A caused only a partial change (~20%) towards 1,3,7-triHX formation.
In addition to standard position 328, a second protruding residue in SRS-5 is expected to occupy either position 8 or 9 after the EXXR motif (corresponding to standard positions 331 and 332, respectively) 35 . Although filling position 8 behind the EXXR motif, the L378 residue was assigned the standard position 330.1, for which surprisingly no mutations were reported. Thus, the involvement of 330.1 provides new combinations of regioselectivity determinants. In the model of the sextuple mutant, 328 and 330.1 change the orientation of the haem loop, thereby affecting the size of the active site.
SRS-6 is formed by the turn in β-sheet 4 between β4-1 and β4-2 towards the C-terminus. A483 (standard position 437) is located at the tip of this turn and its side chain protrudes into the active site (Fig. 4b). Mutations at this position affected the regiospecificities of multiple CYPs 44, such as CYP94A2, in which the F494L substitution shifted the regioselectivity of fatty acid hydroxylation from ω to ω-1 (ref. 46). In CYP81AA2, L378S and A483T caused less than 8% change each but induced ~20 and 29% shifts, respectively, in combination with S375A, indicating synergistic interaction in controlling the regioselectivity. The triple mutant S375A/L378S/A483T caused 80.7% of the total product to be released as 1,3,7-triHX. Investigation of further CYP81AA proteins for sequence/structure-function relationships, with an emphasis on SRS-5 and SRS-6, will clarify whether the positions are generally key mediators of regioselectivity. Another 10.4% shift in selectivity was contributed by the triple mutation M486L/K488R/N489K in SRS-6. However, no combination of mutations was capable of achieving the absolute regiospecificity of CYP81AA1. In case of the sextuple mutant (mut6), a portion of 6.9% of the total product was still 1,3,5-triHX (besides 2% unidentified product).
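The product shares quoted for the CYP81AA2 mutants can be checked for internal consistency. The Python sketch below is only illustrative and rests on two assumptions that the text does not state explicitly: that the 10.4% shift of the SRS-6 triple mutation adds directly to the 80.7% of the S375A/L378S/A483T mutant, and that the "2%" unidentified product is taken as exactly 2.0%.

```python
# Shares of total product (percent) quoted in the text for CYP81AA2 mutants.
TRIPLE_SRS5_6 = 80.7     # 1,3,7-triHX formed by S375A/L378S/A483T
EXTRA_SRS6_SHIFT = 10.4  # further shift from M486L/K488R/N489K
RESIDUAL_135 = 6.9       # 1,3,5-triHX still formed by the sextuple mutant
UNIDENTIFIED = 2.0       # unidentified product (assumed exactly 2.0%)

# Assumption: the two shifts are additive in the sextuple mutant.
share_137_from_shifts = round(TRIPLE_SRS5_6 + EXTRA_SRS6_SHIFT, 1)

# Independent route: whatever is neither 1,3,5-triHX nor unidentified.
share_137_from_remainder = round(100.0 - RESIDUAL_135 - UNIDENTIFIED, 1)

print(share_137_from_shifts, share_137_from_remainder)
```

Under these assumptions both routes give the same 1,3,7-triHX share for mut6, so the figures in the paragraph are mutually consistent.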
Reciprocal site-directed mutagenesis was also applied to CYP81AA1 (1,3,7-TXS); however, the maximum shift in the regioselectivity achieved was 6.8%, indicating that global changes in the backbone rather than subtle substitutions in the SRSs are needed to convert 1,3,7-TXS to 1,3,5-TXS. Similarly, the I364F modification in the peppermint limonene 3-hydroxylase, which is the reciprocal mutation to the abovementioned substitution, did not achieve the regiospecificity of the spearmint limonene 6-hydroxylase; quite the contrary, it afforded an inactive, although properly folded, enzyme 45 . Crystallization of the membrane anchor-freed TXSs is necessary to gain deeper insight into the structural conditions and the reaction mechanisms underlying the bifunctionality. Notably, none of the enzyme mutants generated exhibited uncoupling of the consecutive hydroxylation and coupling reactions.
Besides intramolecular C-O phenol coupling studied here, two other types of phenol coupling were detected previously in plant benzylisoquinoline alkaloid biosynthesis (Supplementary Fig. 10). CYP80G2 and CYP719B1 catalyse stereoselective intramolecular C-C phenol coupling, which converts (S)- and (R)-reticuline to (S)-corytuberine and salutaridine, respectively 7,8. CYP80A1 accomplishes intermolecular C-O phenol coupling between (R)- and (S)-N-methylcoclaurine to form the bisbenzylisoquinoline alkaloid berbamunine 9. None of these enzymes exhibited multifunctionality. In contrast to these CYP80 and CYP719 families, which are represented by a limited number of members in a restricted quantity of plant taxa 47, CYP81 is a large enzyme family consisting of multiple members in all plant genomes sequenced to date 48. However, only a few members of the CYP81 family were functionally characterized, linking them to specialized metabolism in response to biotic and abiotic stress 49. CYP81E members catalyse regiospecific hydroxylations on ring B of the isoflavone skeleton in either the 2′- or 3′-positions, yielding products for pathogen defense and insect-induced responses, respectively 50. CYP81Q members catalyse stepwise formation of two methylenedioxy bridges in the biosynthesis of the lignan (+)-sesamin, which widely occurs in vascular plants 5. In microorganisms, four CYPs (OxyA-C and E) catalyse intramolecular C-O and C-C phenol couplings in the biosynthesis of glycopeptide antibiotics, such as vancomycin and teicoplanin. However, these soluble bacterial CYPs do not act on the free substrates but the intermediate heptapeptides are covalently bound to a peptidyl carrier protein domain of the nonribosomal peptide synthase, the X-domain of which recruits the CYPs 51.
CYPs are probably nature's most versatile enzymes in terms of substrate range and reaction type 38. This vast repertoire is here extended by characterization of CYP81AA1 and CYP81AA2. Their investigation provides deeper insight into CYP-catalysed oxidative phenol coupling reactions, which cannot be easily rationalized in context of the traditional catalytic cycle of CYPs 42,52. However, more work has to be done to completely understand the selectivity determinants of CYP81AA1 and CYP81AA2, as pointed out by the challenge to convert 1,3,7- into 1,3,5-TXS. As ring closure catalysts, the enzymes may be attractive for engineering approaches, such as selective production of either 1,3,5- or 1,3,7-triHXs.

Methods
RNA extraction. H. calycinum cell cultures growing in the dark at 25°C in liquid LS medium with shaking at 120 r.p.m. were treated with 3 g l−1 yeast extract and cells were collected 7 h post elicitation. Total RNA was extracted using the RNeasy Plant Mini Kit (Qiagen). Middle-aged leaves of H. perforatum growing in the medicinal plants garden of the Institute of Pharmaceutical Biology, Technische Universität Braunschweig, were collected during the flowering period and total RNA was extracted using the GeneJET Plant RNA Purification Mini Kit (Thermo Scientific), following the manufacturer's protocol for polyphenol-rich samples.
Bioinformatic analysis of H. calycinum cDNA library. An SSH library was constructed using control vs yeast-extract-treated H. calycinum cell cultures 20 . The library comprised 2,005 clones, which were assembled into 277 contigs using the CAP3 program 53 . The contigs were functionally annotated using Blast2GO (ref. 54). Contigs that were annotated to encode CYPs were individually blasted against the non-redundant protein database of NCBI (NR). For the contigs having E-values lower than 1.0 E-6 with one or more CYP sequences in the database, the subfamily of the closest classified CYP and the number of copies per contig (retrieved from the CAP3 output) were examined (Supplementary Table 1). In addition, a fragment encoding CPR was also identified.
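The selection step described above (E-value cutoff, then inspection of subfamily and copy number) can be sketched in a few lines of Python. This is a hedged illustration only: the record layout, the function name and all E-values are hypothetical, and contig 999 is an invented placeholder; only the contig numbers 21 and 59, their copy numbers and the 1.0 E-6 cutoff come from the text.

```python
from dataclasses import dataclass


@dataclass
class ContigHit:
    contig_id: int
    subfamily: str  # subfamily of the closest classified CYP in NR
    evalue: float   # best BLAST E-value (hypothetical values below)
    copies: int     # copies per contig, e.g. read from the CAP3 output


def select_cyp_contigs(hits, evalue_cutoff=1.0e-6):
    """Keep hits below the E-value cutoff, most abundant contigs first."""
    kept = [h for h in hits if h.evalue < evalue_cutoff]
    return sorted(kept, key=lambda h: h.copies, reverse=True)


# Contigs 21 and 59 with 22 and 33 copies are from the text; the E-values
# and contig 999 are made up purely to exercise the filter.
hits = [
    ContigHit(21, "CYP81", 1e-40, 22),
    ContigHit(59, "CYP81", 1e-55, 33),
    ContigHit(999, "CYP71", 1e-3, 1),
]

selected = select_cyp_contigs(hits)
for h in selected:
    print(h.contig_id, h.subfamily, h.copies)
```

Sorting by copy number mirrors the reasoning in the Results, where highly represented contigs were taken as indicators of elicitor responsiveness.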
Extension of the CPR core fragment in 5′ direction. Multiple sequence alignment of the nucleotide sequences of 20 plant CPRs from the NCBI database, whose accession numbers are listed in Supplementary Table 6, was created using ClustalW (Supplementary Fig. 11). An upstream forward degenerate primer for CPRs was designed based on the sequence of the conserved FMN binding domain. The CPR core fragment identified by bioinformatical analysis of the subtracted cDNA library lacked 1,707 bp towards the start codon. Standard PCR using the designed degenerate primer (CPR-dpF-1) and a reverse gene-specific primer (GSP) (CPR-R1; Supplementary Table 7) together with 5′-RACE-ready cDNA as a template led to extension of the core fragment by 1,362 bp in the 5′ direction. 5′-RACE primers were derived based on the sequence of the extended part.
3′ and 5′ rapid amplification of cDNA ends (RACE). The SMART RACE cDNA Amplification Kit (Clontech) was used. First-strand cDNA was synthesized from 5 µg RNA, primed with the adaptor-linked oligo(dT) primer 3′-CDS, as described in the instructions for the synthesis of 3′-RACE-ready cDNA. In case of 5′-RACE, the reverse GSPs used to synthesize the first-strand 5′-RACE-ready cDNA of the respective genes were CYP81-5RACE1 and CPR-5RACE1 (Supplementary Table 7). Touchdown PCR was employed for the subsequent amplification step. The 25 µl reaction mixture contained 1 µl each of the cDNA samples in 1× reaction buffer Y, 0.4 mM dNTPs, 0.4 µM primers and 1.25 U peqGOLD Taq-DNA-Polymerase (Peqlab). The PCR conditions were as follows: initial denaturation at 94°C for 2 min; 10 cycles with 94°C for 30 s, annealing at the Tm of the respective GSPs for 45 s with ΔT −0.5°C per cycle, 72°C for 90 s; 30 cycles with 94°C for 30 s, annealing at the Tm of the respective GSPs −5°C for 45 s, 72°C for 2 min; and final extension at 72°C for 15 min. The RACE products were run on an agarose gel stained with Midori Green (Nippon Genetics) and the bands at the expected size were excised and purified using the innuPREP DOUBLEpure Kit (Analytic Jena). The purified products were cloned into the pGEM-T Easy vector (Promega) and sequenced. The primers used for the RACE reactions are listed in Supplementary Table 7.
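The touchdown annealing profile described above can be written out cycle by cycle. The following Python sketch is illustrative only; it assumes that the first touchdown cycle anneals at the Tm itself and that each later cycle is 0.5°C lower, and the Tm of 65.0°C used in the example is a hypothetical value, since the primer Tm values are not given here. The function name is also hypothetical.

```python
def touchdown_annealing_temps(tm_c, touchdown_cycles=10, step_c=0.5,
                              plateau_cycles=30, plateau_offset_c=5.0):
    """Return the annealing temperature (in deg C) for every PCR cycle.

    Assumption: cycle 1 anneals at tm_c and each following touchdown
    cycle is step_c lower; the plateau cycles run at tm_c - plateau_offset_c.
    """
    ramp = [tm_c - step_c * i for i in range(touchdown_cycles)]
    plateau = [tm_c - plateau_offset_c] * plateau_cycles
    return ramp + plateau


# Hypothetical primer Tm of 65.0 deg C, chosen only for illustration.
temps = touchdown_annealing_temps(65.0)
print(temps[0], temps[9], temps[10], len(temps))
```

With these defaults the last touchdown cycle anneals 4.5°C below the Tm, so the 30 plateau cycles at Tm −5°C continue the ramp by one further 0.5°C step.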
Construction of the expression plasmids. HcCPR2 was amplified with the primers HcCPR2-BamHI-F and HcCPR2-HindIII-R (Supplementary Table 7) using Phusion Hot Start II High-Fidelity DNA Polymerase (Thermo Scientific), digested with BamHI/ HindIII and cloned into MCS2 of the pESC-URA yeast expression vector to generate the pESC-URA:HcCPR2 plasmid. The CYP genes were subsequently amplified with the respective primers listed in Supplementary Table 7, digested with the appropriate restriction enzymes and cloned into MCS1 of the pESC-URA:HcCPR2 plasmid to generate the pESC-URA:HcCYP81AA1/HcCPR2, pESC-URA:HpCYP81AA1/HcCPR2, pESC-URA:HpCYP81AA2/HcCPR2 and pESC-URA:HpCYP81AA3/HcCPR2 expression plasmids.
Gene expression analysis by reverse transcription quantitative PCR. Total RNA was extracted from non-treated (control) and yeast-extract-treated H. calycinum cell cultures at 4, 8, 12, 16, 20, 24, 36 and 48 h post elicitation using the RNeasy Plant Mini Kit (Qiagen). On-column digestion was applied to each sample using the RNase-Free DNase Set (Qiagen) to remove any genomic DNA contamination. The RNA quality was checked on a gel. Concentrations and 260/280 ratios were determined using the SimpliNano Spectrophotometer (GE Lifesciences). For each time point, cDNA was synthesized from 1 µg RNA using iScript Reverse Transcription Supermix for RT-qPCR (Bio-Rad). Subsequent measurements were performed on the CFX Connect Real-Time PCR Detection System (Bio-Rad) using iTaq Universal SYBR Green Supermix (Bio-Rad), following the manufacturer's protocol. The 20 µl reaction contained 1 µl cDNA, 10 µl (2×) supermix and 0.5 µM of each primer (Supplementary Table 7). Samples were initially denatured at 95°C for 30 s, then run for 40 cycles at 95°C for 5 s and 59°C for 30 s. Data were recorded after the annealing/extension step. The specificity of the amplification product was verified by melt curves and by running the amplification products on an agarose gel. Pooled cDNA of all time points was used in a serial dilution to determine the efficiency of amplification for each primer pair. On the basis of the Cq values obtained during the efficiency tests, 1 µl of a 1:50 dilution of the original cDNA of each time point was used as a template for the subsequent expression analysis. Actin and Histone H2A served as reference genes. The non-treated sample (0 h) served as a calibrator. Amplification efficiency as well as normalized expression relative to the calibrator were determined using the Bio-Rad CFX Manager software (version 3.1, Bio-Rad). The presented data are the means of three technical replicates, error bars representing s.e.m.
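For readers unfamiliar with calibrator-based normalization, the calculation can be sketched as follows. This Python example shows one common efficiency-corrected approach (relative quantity per gene, divided by the geometric mean of the reference genes); it is not claimed to reproduce the internal algorithm of the CFX Manager software, and all Cq values and efficiencies in the example are hypothetical.

```python
import math


def relative_quantity(cq_sample, cq_calibrator, efficiency=2.0):
    """Quantity relative to the calibrator; efficiency 2.0 means doubling per cycle."""
    return efficiency ** (cq_calibrator - cq_sample)


def normalized_expression(target, references):
    """Normalize a target gene to the geometric mean of reference genes.

    target and each entry of references are tuples of
    (cq_sample, cq_calibrator, efficiency).
    """
    rq_target = relative_quantity(*target)
    rq_refs = [relative_quantity(*ref) for ref in references]
    geo_mean = math.exp(sum(math.log(q) for q in rq_refs) / len(rq_refs))
    return rq_target / geo_mean


# Hypothetical Cq values: the target Cq drops by 3 cycles after elicitation,
# while both reference genes stay unchanged relative to the 0 h calibrator.
fold_change = normalized_expression(
    target=(22.0, 25.0, 2.0),
    references=[(20.0, 20.0, 2.0), (18.0, 18.0, 2.0)],
)
print(fold_change)
```

Using the geometric rather than the arithmetic mean of the two reference genes keeps the normalization factor symmetric with respect to fold changes in either direction.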
C-terminal exchanges. The reciprocal exchange of the terminal 55 amino acids between HpCYP81AA1 and HpCYP81AA2 was performed using the one-pot fusion PCR protocol as previously described 55. CYP81AA1 with the C-terminus of CYP81AA2 (HpCYP81AA1-AA2) was generated in a 50 µl PCR reaction, which contained 500 ng of each template plasmid (pESC-URA:HpCYP81AA1/HcCPR2 and pESC-URA:HpCYP81AA2/HcCPR2) in 1× Phusion HF buffer, 0.2 µM primers (Hp81AA1-SpeI-F and Hp81AA2-PacI-R) (Supplementary Table 7 […] 35 cycles at 98°C for 10 s, 60°C for 30 s and 72°C for 90 s, followed by a final extension at 72°C for 10 min. The PCR product with the expected size (~1,500 bp) was excised from an agarose gel stained with Midori Green (Nippon Genetics), purified using the innuPREP DOUBLEpure Kit (Analytic Jena), digested with SpeI and PacI restriction enzymes (Thermo Scientific), ligated to the pESC-URA:HcCPR2 plasmid previously digested with the same restriction enzymes, transformed into DH5α and sequenced. Similarly, CYP81AA2 with the C-terminus of CYP81AA1 (HpCYP81AA2-AA1) was generated using the same reaction conditions, except for using the primers Hp81AA2-SpeI-F, Hp81AA1-PacI-R and AA2-AA1-ex (Supplementary Tables 7 and 8).
Site-directed mutagenesis. Partially overlapping complementary primers were designed according to the method described previously 56. The 50 µl PCR reactions were set up with 25-50 ng template plasmid in 1× Phusion HF buffer, 0.2 mM dNTPs, 0.5 µM primers (Supplementary Table 8) and 1 U Phusion Hot Start II High-Fidelity DNA Polymerase (Thermo Scientific). After initial denaturation at 98°C for 30 s, samples were run for 25 cycles at 98°C for 10 s, 70°C for 30 s and 72°C for 6:30 min, followed by a final extension at 72°C for 10 min. DpnI (10 U) was added to the product and incubated at 37°C for 3 h to digest the template plasmid. An aliquot of 5 µl of the digestion mixture was directly used to transform DH5α. Plasmids were isolated and sequenced to confirm the introduction of the designed mutations. The templates and primers used to generate the individual mutants are listed in Supplementary Table 9.
CYP expression and preparation of yeast microsomes. The constructed plasmids and the plasmids obtained after mutagenesis were transferred to the S. cerevisiae strain INVSc1 using the S.c. EasyComp Kit (Invitrogen), following the manufacturer's protocol. The yeast cells were grown and gene expression was performed as previously described 57. The transformation mixture was spread on solid synthetic dextrose (SD) minimal medium (0.67% yeast nitrogen base, 2% D-dextrose, supplemented with amino acids but omitting uracil) for auxotrophic selection and incubated at 30°C for 48 h. A single colony was inoculated in a 5 ml SD tube and incubated at 30°C with shaking at 250 r.p.m. for 24 h. An aliquot (1 ml) of the resulting culture was transferred to 150 ml YPGE medium (1% yeast extract, 1% peptone, 0.5% glucose and 3% ethanol) and grown for 28-30 h until OD600 reached 1.5-1.7. Expression was induced by the addition of 2% galactose and the culture was further incubated for 14-16 h. Cells were harvested by centrifugation, washed twice with 10 and 5 ml of TEK buffer (50 mM Tris-HCl pH 7.4, 1 mM EDTA and 0.1 M KCl), manually broken with 3 g of 0.45 mm glass beads in 3 ml of TES-B buffer (50 mM Tris-HCl pH 7.5, 1 mM EDTA and 600 mM sorbitol) and centrifuged at 3,840 g for 5 min. The supernatant was collected and the pellet was further washed twice with 3 ml of TES-B buffer each. The combined homogenate was centrifuged for 10 min at 25,000 g to pellet the mitochondria and nuclei. The supernatant was centrifuged for 1 h at 100,000 g to pellet the microsomal membranes. Microsomes were resuspended in 1 ml of TEG buffer (50 mM Tris-HCl pH 7.4, 1 mM EDTA and 20% glycerol) and stored at −80°C. All steps for microsomal preparation were performed at 0-4°C.
Enzyme assays. Enzymatic activities were determined in a standard 200 µl assay containing 50 µg microsomal protein, 1 mM NADPH and 0.2 mM substrate in 100 mM sodium phosphate buffer (pH 7.0). The reaction was initiated by the addition of NADPH, incubated at 30°C for 1 h and stopped by the addition of 20 µl 1 N HCl. After two extractions with 250 µl ethyl acetate each, the combined organic phase was evaporated to dryness and the residue was dissolved in 60 µl methanol, 25 µl of which were analysed by HPLC.
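The absolute amounts implied by the standard assay follow from simple unit arithmetic. The sketch below assumes that the assay and extract volumes are in microlitres and the protein amount is in micrograms; all names are hypothetical, and the last figure additionally assumes complete recovery during extraction, which the text does not state.

```python
# Amounts implied by the standard assay composition.
# ASSUMPTIONS: volumes are microlitres, protein is micrograms, and (for the
# on-column estimate only) extraction recovery is 100%.

ASSAY_VOLUME_UL = 200.0
NADPH_MM = 1.0
SUBSTRATE_MM = 0.2
PROTEIN_UG = 50.0
REDISSOLVED_UL = 60.0   # methanol used to redissolve the dried extract
INJECTED_UL = 25.0      # portion analysed by HPLC


def nmol(conc_mm: float, volume_ul: float) -> float:
    """Convert mM x µl to nmol (1 mM equals 1 nmol per µl)."""
    return conc_mm * volume_ul


substrate_nmol = nmol(SUBSTRATE_MM, ASSAY_VOLUME_UL)
nadph_nmol = nmol(NADPH_MM, ASSAY_VOLUME_UL)
protein_mg_per_ml = PROTEIN_UG / ASSAY_VOLUME_UL      # µg/µl equals mg/ml
injected_fraction = INJECTED_UL / REDISSOLVED_UL
max_on_column_nmol = substrate_nmol * injected_fraction

if __name__ == "__main__":
    print(f"substrate per assay : {substrate_nmol:.1f} nmol")
    print(f"NADPH per assay     : {nadph_nmol:.1f} nmol")
    print(f"protein             : {protein_mg_per_ml:.2f} mg/ml")
    print(f"fraction injected   : {injected_fraction:.1%}")
    print(f"max. on column      : {max_on_column_nmol:.1f} nmol")
```

Under these assumptions NADPH is present in fivefold molar excess over the substrate, and a little under half of each extract is injected.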
LC-MS analysis. Products of 30 enzymatic incubations were purified by HPLC and the corresponding peaks were collected as individual fractions. These fractions were directly infused into the mass spectrometer (3200 QTrap mass spectrometer; Applied Biosystems/MDS SCIEX, Darmstadt, Germany), equipped with an electrospray ionization interface (Turbo V), using the integrated syringe pump of the 3200 QTrap instrument (Syringe; 1,000 µl, i.d. 2.3 mm; Hamilton, Nevada, USA) at a flow rate of 10 µl min−1. The MS/MS was operated in the positive mode with a source voltage and declustering potential of 5.5 kV and 76 V, respectively. Nitrogen gas was used for nebulization, with the curtain gas, gas 1 and gas 2 settings at 10, 14 and 0, respectively. Parameters were optimized for benzophenones using standard 2,4,6-trihydroxybenzophenone and for xanthones using 1,3,7-trihydroxyxanthone in methanol at 10 µg ml−1. The molecular ion peaks [M+H]+ of the products were further analysed by MS/MS experiments in the enhanced product ion (EPI) mode of the instrument using nitrogen gas for collision-induced dissociation at the high-level setting. The collision energy was 30-50 V. Data acquisition and processing were performed using the Analyst software (version 1.4.2; Applied Biosystems/MDS SCIEX).
Secondary structure prediction and standard numbering. The deduced amino acid sequences of Hc/HpCYP81AA1 and HpCYP81AA2 were aligned with ClustalW2 using the Gonnet protein weight matrix, a gap opening penalty of 10 and a gap extension penalty of 0.2. The secondary structures of HpCYP81AA1 and HpCYP81AA2 were predicted using the CYP modules prediction tool in the Cytochrome P450 Engineering Database (CYPED) 32 . The standard residue positions were obtained by BLAST searching the sequences on the CYPED website (https://cyped.biocatnet.de/workbench/numbering) 44 .
Homology modelling. The HHpred server 36 was used to identify homologous CYP X-ray crystal structures as three-dimensional templates for comparative modelling with the software MODELLER 58 . All identified template structures fell in the low sequence identity range (20-25%). Therefore, YASARA Structure 14.7.17 was used as a second, independent method to identify the most suitable template structures 59,60 . Structures from two matching CYP families were identified by sequence search using Psi-BLAST 61 and Psi-Pred secondary structure prediction employing position-specific patterns 62 . The generated homology models were refined using MD simulations with the YAMBER 63 force field and ranked according to knowledge-based structural descriptors as implemented in YASARA. The best-scoring structures were obtained using the X-ray crystal structure of eukaryotic CYP2B4 37 in the closed conformation with the bound inhibitor ticlopidine (PDB: 3kw4) as template. In this alignment, 400 of the 508 target residues of HpCYP81AA1 (78.7%) were aligned to template residues. Among these aligned residues, the sequence identity was 30.8% and the sequence similarity was 48.0% ('similar' means that the BLOSUM62 score is >0). A second reference structure showing good scores was human CYP17A1 (steroid 17α-hydroxylase) in the closed conformation with a bound substrate mimetic. In this alignment, 378 of the 508 target residues of HpCYP81AA1 (74.4%) were aligned to template residues. Among these aligned residues, the sequence identity was 27.8% and the sequence similarity was 47.4% ('similar' means that the BLOSUM62 score is >0). The model based on CYP17A1 showed a different orientation of the C-terminal β-loop (SRS-6) and of the adjacent α-helix containing SRS-2, and the template was crystallized in the dimeric form.
The final three-dimensional models of HpCYP81AA1 and HpCYP81AA2 were generated from a hybrid model using the HHpred server and the best-matching template structures: 3kw4 (CYP2B4), 3swz (CYP17A1), 3qz1 (steroid 21-hydroxylase) and 2hi4 (CYP1A2). The best-scoring model was built on eukaryotic CYP2B4 (ref. 37) in the closed conformation with the bound inhibitor ticlopidine (PDB: 3kw4) as template.
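The alignment statistics quoted for the CYP2B4 and CYP17A1 templates (coverage, identity and similarity) can be reproduced from a pairwise alignment with a few lines of code. The following is a minimal sketch with hypothetical function names; it is not the procedure implemented in YASARA. The similarity function takes a user-supplied scoring callable, because a BLOSUM62 lookup is not bundled here, and the short sequences in the demo are toy data, not the CYP sequences.

```python
from typing import Callable, List, Tuple

GAP = "-"


def coverage_percent(aligned_residues: int, target_length: int) -> float:
    """Percentage of target residues that are aligned to a template residue."""
    return 100.0 * aligned_residues / target_length


def aligned_pairs(target: str, template: str) -> List[Tuple[str, str]]:
    """Return the alignment columns in which neither sequence has a gap."""
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")
    return [(a, b) for a, b in zip(target, template) if GAP not in (a, b)]


def identity_percent(target: str, template: str) -> float:
    """Percentage of identical residues among the aligned pairs."""
    pairs = aligned_pairs(target, template)
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def similarity_percent(target: str, template: str,
                       score: Callable[[str, str], int]) -> float:
    """Percentage of aligned pairs whose substitution score is > 0."""
    pairs = aligned_pairs(target, template)
    return 100.0 * sum(score(a, b) > 0 for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Coverage values as quoted in the text (508 target residues).
    print(f"CYP2B4 template : {coverage_percent(400, 508):.1f}%")
    print(f"CYP17A1 template: {coverage_percent(378, 508):.1f}%")
    # Toy alignment, for illustration only.
    print(f"toy identity    : {identity_percent('MKT-LLA', 'MRTALL-'):.1f}%")
```

Passing a BLOSUM62 lookup as `score` would give the similarity measure as defined in the text (score > 0).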
Product docking. Molecular docking was carried out using AutoDock Vina 64 . Twenty-five and 100 docking runs were performed for rigid protein side chains and flexible side chains in the binding pocket, respectively (Supplementary Data 1). The docking solutions were clustered by applying an RMSD cutoff of 5 Å. The standard settings within the provided YASARA dock run and dock ensemble macros were used. The overlaid structures and the results are depicted in Supplementary Fig. 12. Product docking identified several binding modes indicating 3′-hydroxylation; however, no catalytic binding orientations for the subsequent cyclization (cyclized bond <6 Å from the catalytic haem iron) were obtained when automated docking with a rigid protein backbone was employed. Therefore, the product was built manually in a reactive pose, guided by the mechanism, which allowed for full relaxation of the backbone and the side chains of the substrate-binding loops in the pocket, based on the staggered minimization protocol described below. This procedure consisted of steepest descent and simulated annealing minimization steps using fully flexible AMBER molecular dynamics and including several heating/freezing cycles from 298 to 10 K. To this end, the model structures of HpCYP81AA1 and HpCYP81AA2 were overlaid by MUSTANG 65 with the haem- and ticlopidine inhibitor-bound structure (PDB: 3kw4) using YASARA Structure version 14.7.17. The initial binding modes of the respective products (1,3,5-triHX and 1,3,7-triHX) were introduced into the apo-model structures using catalytic constraints, so that the cyclized bond and the respective carbon that undergoes H-abstraction lie within 6 Å of the catalytic haem iron. On the basis of this initial product-bound model, the binding mode was manually built and energy minimized using YASARA 66 . To remove bumps and correct the covalent geometry, a short steepest descent minimization was performed first.
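Clustering of docking solutions by an RMSD cutoff can be illustrated with a simple greedy scheme. The sketch below is an assumption for illustration and not necessarily the algorithm used by the YASARA docking macros: poses are taken to be ranked best first, to share the receptor frame (so no superposition is applied) and to list atoms in the same order. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Pose = Sequence[Coord]


def rmsd(a: Pose, b: Pose) -> float:
    """Root-mean-square deviation between two poses with matching atom order."""
    if len(a) != len(b) or not a:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def cluster_poses(poses: List[Pose], cutoff: float = 5.0) -> List[List[int]]:
    """Greedy clustering: each pose joins the first cluster whose
    representative (its first member) lies within `cutoff` Å, otherwise
    it founds a new cluster. Returns lists of pose indices."""
    clusters: List[List[int]] = []
    for index, pose in enumerate(poses):
        for members in clusters:
            if rmsd(pose, poses[members[0]]) <= cutoff:
                members.append(index)
                break
        else:
            clusters.append([index])
    return clusters


if __name__ == "__main__":
    # Three toy two-atom poses (coordinates in Å), illustrative only.
    toy = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],     # 1 Å from the first pose
        [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)],   # 10 Å from the first pose
    ]
    print(cluster_poses(toy, cutoff=5.0))
```

With a cutoff as generous as 5 Å, poses that differ substantially in orientation can still fall into the same cluster, so cluster representatives are best inspected individually.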
After removal of conformational stress, the procedure was continued by simulated annealing (timestep 2 fs, atom velocities scaled down by 0.9 every 10th step) until convergence was reached, that is, until the energy improved by less than 0.05 kJ mol−1 per atom during 200 steps. The AMBER03 (ref. 67) force field was used for protein residues, together with the general AMBER force field GAFF (ref. 68). AM1BCC-calculated partial charges (ref. 69), a force cutoff of 0.786 Å and particle mesh Ewald treatment of long-range electrostatics under periodic boundary conditions were employed. The same procedure was applied to the sextuple active-site mutant S375A/L378S/A483T/M486L/K488R/N489K (mut6).
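The convergence criterion stated above (an energy improvement of less than 0.05 kJ mol−1 per atom over 200 steps) can be expressed as a short check on an energy trace. This is a hedged sketch with hypothetical names and a synthetic trace; it only illustrates the criterion and does not reproduce YASARA's internal implementation.

```python
from typing import Optional, Sequence


def convergence_step(
    energies_kj_mol: Sequence[float],
    n_atoms: int,
    window: int = 200,
    threshold_kj_mol_per_atom: float = 0.05,
) -> Optional[int]:
    """Return the first step at which the total energy has improved
    (decreased) by less than the threshold per atom over the preceding
    `window` steps, or None if the trace never converges."""
    for step in range(window, len(energies_kj_mol)):
        improvement = (energies_kj_mol[step - window] - energies_kj_mol[step]) / n_atoms
        if improvement < threshold_kj_mol_per_atom:
            return step
    return None


if __name__ == "__main__":
    # Synthetic trace for a 1,000-atom system: the energy drops by
    # 1 kJ/mol per step for 300 steps and is flat afterwards.
    trace = [-float(min(step, 300)) for step in range(1000)]
    print(convergence_step(trace, n_atoms=1000))
```

Because the criterion is normalized per atom, the same threshold can be applied to the wild-type models and to mut6 regardless of small differences in atom count.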