Nitrogen heterocycles form peptide nucleic acid precursors in complex prebiotic mixtures

The ability to store information is believed to have been crucial for the origin and evolution of life; however, little is known about the genetic polymers relevant to abiogenesis. Nitrogen heterocycles (N-heterocycles) are plausible components of such polymers as they may have been readily available on early Earth and are the means by which the extant genetic macromolecules RNA and DNA store information. Here, we report the reactivity of numerous N-heterocycles in highly complex mixtures, which were generated using a Miller-Urey spark discharge apparatus with either a reducing or neutral atmosphere, to investigate how N-heterocycles are modified under plausible prebiotic conditions. High throughput mass spectrometry was used to identify N-heterocycle adducts. Additionally, tandem mass spectrometry and nuclear magnetic resonance spectroscopy were used to elucidate reaction pathways for select reactions. Remarkably, we found that the majority of N-heterocycles, including the canonical nucleobases, gain short carbonyl side chains in our complex mixtures via a Strecker-like synthesis or Michael addition. These types of N-heterocycle adducts are subunits of the proposed RNA precursor, peptide nucleic acids (PNAs). The ease with which these carbonylated heterocycles form under both reducing and neutral atmospheres is suggestive that PNAs could be prebiotically feasible on early Earth.


Method Rationale and Limitations
The primary goals of this study were to (1) elucidate the adducts that N-heterocycles commonly form in mixtures simulating the chemical complexity expected on early Earth and (2) evaluate plausible pre-RNA molecules compatible with these more complex structures. To this end, we studied the reactivity of 53 N-heterocycles representing five classes (pyridines, pyrimidines, triazines, purines, and pteridines) with complex mixtures produced under both reducing and neutral atmospheres to elucidate chemical trends, which resulted in hundreds of samples. Given the high number of samples and the complexity of each reaction mixture, it was not feasible to pursue product purification and structure elucidation via liquid chromatography-mass spectrometry, which precluded us from estimating product yields and determining which conformational isomer formed in any particular case. Instead we chose to analyze samples using high-resolution mass spectrometry (HRMS) with a direct analysis in real-time (DART) ion source to enable high sample throughput. The high mass resolution (>30,000 at m/z 400) and mass accuracy (typically <5 ppm) of the instrument permitted the assignment of molecular formulas, which enabled the recognition and identification of N-heterocycle adducts. DART was also selected for ionization because it readily protonates nitrogen-containing compounds, is less sensitive to matrix effects, does not usually produce salt adducts, and we found Penning ionization to be minimal, enabling us to rapidly analyze complex mixtures with minimal sample preparation 26 .
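As an illustration of what the stated mass accuracy and resolution mean in practice, the following minimal Python sketch converts a ppm tolerance into an absolute m/z window and compares two measured values. The function names are hypothetical and are not part of the workflow used in this study; the two m/z values are those reported for the uracil adduct in the Results, and the full-width-at-half-maximum definition of resolution is an assumption.

```python
def ppm_error(measured_mz: float, reference_mz: float) -> float:
    """Signed difference between two m/z values in parts per million."""
    return (measured_mz - reference_mz) / reference_mz * 1e6


def ppm_window(mz: float, tol_ppm: float = 5.0) -> float:
    """Half-width (in m/z units) of a +/- tol_ppm window centred on mz."""
    return mz * tol_ppm / 1e6


def peak_width(mz: float, resolution: float) -> float:
    """Peak width implied by R = m / delta_m (FWHM definition assumed)."""
    return mz / resolution


if __name__ == "__main__":
    spark_adduct = 185.0556     # uracil adduct observed in the spark mixture
    isolated_adduct = 185.0555  # uracil + acrylic acid, isolated reaction
    print(f"difference: {ppm_error(spark_adduct, isolated_adduct):.2f} ppm")
    print(f"5 ppm window at m/z 185: +/-{ppm_window(spark_adduct):.5f}")
    print(f"peak width at m/z 400, R = 30,000: {peak_width(400, 30000):.4f}")
```

Under these assumptions, a 5 ppm tolerance at m/z 185 corresponds to a window of less than one thousandth of a mass unit, which is why a molecular formula can be proposed from the measured mass alone.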

Results and Discussion
Reactivity trends in Miller-Urey mixtures. Miller-Urey spark discharge mixtures generate a plethora of organic compounds including aldehydes, nitriles, hydroxy acids, and amino acids 23,27-29; these organics hold the potential to generate sugar, alcohol, electrophilic carbonyl, or nucleophilic side chains on N-heterocycles. The resulting side chain would impact the subsequent reactivity of the N-heterocycles and hence influence the types of molecular polymers they may form. For example, sugar and alcohol side chains can be readily phosphorylated 10 or glyoxylated 14, forming monomers akin to RNA or gaRNA; alternatively, hydroxymethylated N-heterocycles can spontaneously oligomerize 30. Meanwhile, carboxylic acid and aldehyde side chains could serve as precursors for PNAs should their carbonyls be attacked by amino acids. Lastly, nucleophilic side chains could facilitate the formation of organometallic clusters (e.g. with iron), as good nucleophiles often make decent ligands. To investigate whether N-heterocycles generate adducts that would favor the formation of a specific type of genetic polymer, we characterized the reactivity of N-heterocycles (Supplementary Table S1) in Miller-Urey spark discharge mixtures by searching for adducts with sugar, alcohol, carbonyl (i.e. electrophilic), and nucleophilic side chains (Supplementary Table S2). Figure 1(a-d) summarizes the strategy used to characterize select products observed in heterocycle-spark reaction mixtures, using as an example a uracil adduct (m/z 185.0556 = uracil-C3H5O2, Fig. 1a) that was formed under a reducing atmosphere (N2-CO2-CH4-H2). Tandem mass spectrometry (MS/MS) was used to fragment the uracil adduct (Fig. 1b). When uracil was incubated with acrylic acid in a separate reaction, we observed a product with the same mass (m/z = 185.0555) as the uracil-spark adduct. Comparison of the MS/MS spectra in Fig. 1(b,c) reveals that the fragmentation patterns of the uracil-spark and uracil-acrylic acid adducts match. The fragmentation pattern of a commercial reference standard of uracil-N1-propanoic acid (Fig. 1d) and 15N-NMR spectra (Fig. 1g,h) of uracil incubated with acrylic acid indicate that the reaction proceeded at the N1 position in the spark reaction mixture. Thus, the N-heterocycle adduct at m/z 185.0556 in the heterocycle-spark reaction mixture was identified as uracil-N1-propanoic acid.
Using these methods, we confirmed that specific organics in the spark discharge mixture react readily with a range of N-heterocycles (Supplementary Tables S3-S6; Supplementary Figs S3-S25); from this information we inferred that products having side chains with identical chemical compositions and thus the same Δm/z (i.e. the difference in mass between the N-heterocycle adduct and parent N-heterocycle) were generated via the same mechanisms.
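The Δm/z reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the short candidate list are hypothetical, the adducts are assumed to be detected as singly protonated ions, and the monoisotopic masses are standard values rounded to eight decimals. It does not reproduce the search actually performed for this study.

```python
import re

# Monoisotopic masses (u) for the elements needed here, plus the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647

# Hypothetical candidate list: net formula gained by the parent heterocycle.
CANDIDATE_SIDE_CHAINS = {
    "propanoic acid side chain (acrylic acid addition)": "C3H4O2",
    "acetic acid side chain": "C2H2O2",
}


def formula_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C4H4N2O2'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def protonated_mz(formula: str) -> float:
    """m/z of the singly protonated molecule [M+H]+."""
    return formula_mass(formula) + PROTON


def match_side_chains(parent_formula: str, adduct_mz: float, tol_ppm: float = 5.0):
    """Names of candidates whose predicted adduct m/z lies within tol_ppm."""
    parent_mz = protonated_mz(parent_formula)
    hits = []
    for name, gained in CANDIDATE_SIDE_CHAINS.items():
        expected = parent_mz + formula_mass(gained)
        if abs(adduct_mz - expected) / expected * 1e6 <= tol_ppm:
            hits.append(name)
    return hits


if __name__ == "__main__":
    uracil = "C4H4N2O2"
    observed = 185.0556  # uracil adduct reported in this study
    print(f"delta m/z = {observed - protonated_mz(uracil):.4f}")
    print(match_side_chains(uracil, observed))
```

Because the comparison is made on the mass difference, the same candidate list can be reused for any parent heterocycle, which mirrors the inference that identical Δm/z values point to the same reaction.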
Reducing atmosphere. N-heterocycles were incubated with spark discharge mixtures generated under a strongly reducing atmosphere because such atmospheres were originally perceived to be more conducive to organic synthesis than neutral gases 31. Although Earth's early atmosphere was likely near neutral (N2, CO2) with trace amounts of H2, CO, and H2S 32, volcanic outgassing and bolide impacts could have generated large but transient amounts of reduced gases 33. Figure 2 summarizes the reactivity of 53 N-heterocycles in spark discharge mixtures produced from a 1 bar atmosphere containing 0.4 N2, 0.1 CO2, 0.25 H2, and 0.25 CH4 (i.e. reducing spark mixture) over a 0.2 M phosphate buffer solution (adduct details listed in Supplementary Table S3). Notably, with the exception of 4-pyridinecarboxylic acid [9], N-heterocycles with cyano or carboxylic acid groups did not produce detectable adducts. Conversely, the majority of N-heterocycles without these electron withdrawing groups (EWGs), including the canonical nucleobases, formed at least one adduct containing a carbonyl carbon. The side chains of these carbonylated heterocycles were 1-3 carbons in length, sometimes methylated (at either the α- or β-carbon of the side chain), and contained a terminal aldehyde, nitrile, amide, or carboxylic acid functional group.
Neutral atmosphere. Given that Earth's early atmosphere was likely near neutral based on current models 32, we wanted to see whether N-heterocycles would be carbonylated in a mixture generated under a N2-CO2 atmosphere lacking any reduced gases. Previous work has shown that neutral gas mixtures (1:1 N2:CO2) over an aqueous buffer form amino acids in higher yields 34 than previously thought 28, albeit at levels lower than those of reducing atmospheres. It is predicted that neutral atmospheres are not as conducive to amino acid synthesis partially because they produce less HCN and NH3 34, which incidentally are likely precursors of the reactants that carbonylated N-heterocycles in our reducing spark mixture. Moreover, neutral atmospheres generate higher levels of nitrates and nitrites, the latter of which can go on to form nitrosamines from amino acids 34 and N-heterocycles 35.
Twenty-one N-heterocycles were incubated with a spark discharge mixture formed under a 1 bar atmosphere of 0.5 N2 and 0.5 CO2 gases (i.e. neutral spark mixture). Heterocycles were selected because of either their reactivity in the reducing spark mixture (Fig. 2 Reducing Atmosphere) or their relevance to previous pre-RNA studies 20,21,36-38 (see Supplementary Table S1). In the majority of cases, the number of adducts formed was the same as or higher than that obtained with spark reaction mixtures from a reducing atmosphere. In addition, we identified at least one carbonylated adduct for almost every N-heterocycle tested (Fig. 2 Neutral Atmosphere), with many forming the same products as when they were incubated with the reducing spark mixture (see Supplementary Table S4 for a complete list of adducts). It should be noted that in addition to forming carbonylated adducts, several N-heterocycles (including the canonical purines) formed products with masses corresponding to various sugar-like side chains, including 4-, 5-, and 6-carbon sugars or sugar alcohols (see Supplementary Tables S4 and S6). However, these adducts were uncommon (i.e. each only formed by 2-5 N-heterocycles) and additional work is needed to verify the identity of the side chains.
Figure 1 (panels e-h). 15N-NMR results confirm that acrylic acid adds preferentially to the N1 position of uracil. (e) 15N2-Uracil produces two doublets corresponding to N1 (δ132.00) and N3 (δ159.47). (f) Uracil-N1-propanoic acid standard shows that addition at the N1 position shifts the N1 peak downfield (δ135.16) and the N3 peak upfield (δ158.71). (g) The four peaks produced from 15N2-Uracil incubated with acrylic acid (100 °C for 3 h in D6-dimethyl sulfoxide (D6-DMSO)) correspond to uracil and uracil-N1-propanoic acid. (h) INEPT spectrum of the reaction mixture shows peaks only for nitrogens that have protons directly attached; the N1 peak (δ135.18) was not observed, confirming an addition at the N1 position. Note that the presence of two additional peaks (δ158.26 and δ131.22) suggests that the C5 adduct is a minor product in this reaction. The tops of the spectra have been truncated due to the peak height of uracil. All NMR spectra were obtained in D6-DMSO.
Given that almost all the N-heterocycles with carboxylic acid groups were unreactive under a reducing atmosphere, we were intrigued to find that all three pyridine carboxylic acids tested [7, 8, 9] were reactive in the neutral spark mixture, with [8] and [9] forming carbonylated heterocycles; 2-pyridinecarboxylic acid [7] formed a single adduct containing an alcohol side chain. One possible explanation for the reactivity of these compounds in the neutral but not the reducing spark mixture may be that the former produces considerably fewer nucleophiles like HCN and ammonia 39 that would otherwise sequester the carbonylating reactants 23,40. This suggests that heterocycles with carboxylic acids can act as nucleophiles to form carbonylated heterocycles under certain favorable reaction conditions; indeed, previous studies have shown that such heterocycles undergo carbonylation in isolated reactions 41,42.
Reactivity trends of N-heterocycles. We surmised that most of the N-heterocycle adducts were formed by the N-heterocycles attacking electrophiles; thus, those that were unreactive (and ionized by DART) can be considered poor nucleophiles. To some extent pKa can be used to predict whether heterocycles will be unreactive, as those having an exceptionally low pKa are strong acids but very poor nucleophiles. Being strong acids, these heterocycles hold on to their electrons tightly, hindering their interaction with electrophiles. This value is especially important for heterocycles whose conjugate base is neutral (e.g. pyridine as the conjugate base for pyridinium cations) and thus significantly less nucleophilic than conjugate bases that are negatively charged (e.g. deprotonated uracil). In accordance with this pattern, the reported pKa values of the conjugate acids (i.e. pKaH) for 2-, 3-, and 4-cyanopyridines (all of which were unreactive in the reducing spark mixture) are very low (−0.26, 1.36, and 1.90, respectively). The pKaH of the ring nitrogen in the cyanopyridines reflects how the cyano group pulls electrons from the ring nitrogen via resonance effects, decreasing its nucleophilicity and deactivating the molecule towards reaction with electrophiles. In addition to decreasing the electron density surrounding the ring nitrogen of pyridine, cyano groups also decrease the nucleophilicity of the ring N via steric hindrance, as the bulky cyano group partially blocks the ring N from electrophilic attack.
Although carboxylic acids (R-COOH) are EWGs, once they deprotonate and form carboxylates (R-COO−) they are only slightly deactivating. In our reaction mixtures (pH 8), all of our carboxylic acid heterocycles exist as carboxylates. In consequence, the pKaH of the ring nitrogen is only slightly less than that of pyridine (~5 for pyridine carboxylic acids). Despite what their pKaH indicates, the pyridine carboxylic acids remain relatively weak nucleophiles; therefore, their low reactivity is probably due to steric hindrance from the bulky side chain near the ring nitrogen. This would also explain why 4-pyridinecarboxylic acid [9] is the only reactive pyridine carboxylic acid in the reducing spark mixture and why 2-pyridinecarboxylic acid [7] is the least reactive of the three in the neutral spark mixture (Supplementary Table S4). This trend is also consistent with 4- and 2-hydroxymethylpyridine [5, 6] being unreactive in the reducing spark mixture despite the ring nitrogen being a decent base (pKaH ~9); 3-hydroxymethylpyridine [4] is slightly reactive, forming two detectable adducts, only one of which is from electrophilic attack. In comparison, 4-aminopyridine [2], which has a similar pKaH (9.2) to 4-hydroxymethylpyridine [5] (pKaH 8.9) but is not sterically hindered, generated 6 detectable adducts while [5] was unreactive.

Robust Reactions of N-Heterocycles in Miller-Urey Mixtures.
We investigated whether N-heterocycles gain sugar, alcohol, carbonyl, or nucleophilic side chains in prebiotically plausible complex mixtures. Our results show that in both reducing and neutral spark mixtures the majority of adducts contained carbonyl side chains (see Supplementary Table S6); in comparison, the formation of nucleophilic, sugar, and alcohol side chains (with the exception of acetaldehyde and cyanamide adducts in the neutral spark mixture) was exceptionally low (see Supplementary Table S6). Figure 3 summarizes the most common reactions of N-heterocycles observed in spark discharge mixtures (only the hydrolysis product of each adduct (i.e. acids) will be referred to hereafter). We assigned reactants based on their availability in spark mixtures (i.e. compounds that have been reported in the literature), demonstrated reactivity with N-heterocycles based on isolated reactions resulting in the same N-heterocycle adduct m/z (Table 1, Supplementary Table S5), and (for adducts with sufficient abundance) MS/MS spectra (Supplementary Figs S7-S25); in select cases we also compared the MS/MS of the spark adduct with that from an isolated reaction, and in every case the fragmentation pattern matched. Notably, the only reactions that were major in both reducing and neutral spark mixtures were those that generated carboxylic acid side chains (Fig. 3).
HCN/Cyanamide adducts. In the aqueous buffer (0.2 M phosphate buffer, pH 8), roughly 6% of HCN is deprotonated (pKa 9.2); the resulting anion is an excellent nucleophile in reactions with N-heterocycles whose rings are sufficiently electron deficient. Accordingly, we found that the majority of cyanide adducts in reducing atmospheres were formed by pteridines and triazines that possessed a strongly electrophilic carbon susceptible to nucleophilic attack (e.g. [26], [31], and [33]). Thus, it was surprising to find that in the neutral spark mixture, pyrimidines and triazines containing two or more electron donating amine groups (e.g. melamine [28] and TAP [16]) formed cyano adducts, especially considering that Cleaves and coworkers 34 measured significantly less HCN in spark discharge mixtures generated under an identical atmospheric ratio (1:1 N2:CO2). Given this, we propose that these cyano adducts are formed via an intermediate that was present in the neutral but not the reducing spark mixture. One such possibility is nitrosamines formed via nitrous acid generated from the spark discharge through a N2-CO2 atmosphere 34. In water, nitrous acid readily forms nitrosonium ions (Fig. 4a) that then react with nucleophilic amine groups, forming nitrosamines (Fig. 4b). For aromatic compounds like the N-heterocycles, loss of water from the nitrosamine gives aryl diazonium cations (Fig. 4c), whose diazonium group is an excellent leaving group that lowers the activation energy for nucleophilic attack. Subsequent attack of the aryl diazonium cations by nucleophilic cyanamide would generate an adduct with identical mass to that expected from the parent heterocycle undergoing HCN nucleophilic substitution (Fig. 4d).
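The deprotonated fraction of HCN mentioned at the start of this paragraph follows from the Henderson-Hasselbalch relation. The minimal sketch below assumes ideal behaviour (activities equal to concentrations) and uses the pH and pKa stated in the text; the function name is hypothetical. With those inputs the relation returns a value of a few percent (about 0.06), so only a small share of the HCN is present as cyanide at pH 8.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of a monoprotic acid present as its conjugate base.

    Derived from pH = pKa + log10([A-]/[HA]), assuming ideal solution behaviour.
    """
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


if __name__ == "__main__":
    buffer_pH = 8.0  # phosphate buffer used for the spark mixtures
    hcn_pKa = 9.2    # value quoted in the text
    print(f"CN- fraction at pH 8: {fraction_deprotonated(buffer_pH, hcn_pKa):.3f}")
```

The same function shows how steeply the cyanide fraction depends on pH: at a pH equal to the pKa it is exactly one half.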
Glycolonitrile adducts. Intriguingly, N-heterocycles with an acetic acid group, our most consistently observed adduct, are the "nucleoside" subunit of the PNA shown to readily base pair with RNA 16 (i.e. PNAs whose heterocycles are linked to a N-(2-aminoethyl)glycine (AEG) backbone by an acetamide bridge: Het-CH2-CO-AEG), hereafter referred to as aegPNA. These carbonylated heterocycles formed via a Strecker-like synthesis on the N-heterocycle with glycolonitrile (or alternatively, formaldehyde, followed by HCN). Notably, when we tested the reactivity of glycolamide with guanazole we observed neither the corresponding acetic amide nor the acetic acid adduct (a detailed discussion of the reactivity of glycolonitrile with guanazole can be found in the Supplementary text, section 2.2 and Supplementary Fig. S5). These results coincide with previous observations that glycolonitrile, and by extension its precursors HCN and formaldehyde, are abundant products of spark discharges under reducing atmospheres 39. Our results also show that N-heterocycles form the heterocycle-acetic acid molecules even in mixtures produced under neutral atmospheres where the concentrations of HCN are suspected to be relatively low 31,34. While concentrations of HCN decrease under increasingly neutral atmospheres, the proportional amounts of formaldehyde and glycolonitrile may increase. Accordingly, it was found that glycolonitrile yields were highest under a N2-H2-CO2 atmosphere (compared to the same atmosphere but with CO or CH4 instead of CO2); notably, the highest yields were found when the H2:CO2 ratio was 3 (compared to 0.5) 39.
Table 1 footnotes. [...] Fig. 2 and Supplementary Table S1 for structures and names, respectively).
* or † Indicates that the reactant was not directly detected, but that its formation in spark mixtures is inferred based on the identification of: * the product of the predicted reactant with other spark compounds (mechanism shown in parentheses) or † compounds that combine to form the predicted reactant.
‡ Methacrylonitrile/amide/acid and 3-butyn-2-one are equally plausible for generating the observed methylated carbonyl side chain as crotonitrile/amide/acid and 2-methylpropiolaldehyde, respectively. Note that Wolman and colleagues 24 previously detected β-aminobutyric acid (from crotonitrile) and β-aminoisobutyric acid (from methacrylonitrile) in approximately equal amounts in spark discharge mixtures generated under a reducing atmosphere (CH4, N2, H2O, with trace NH3). Although the reactivity of methacrylonitrile and 3-butyn-2-one was not investigated, they are likely more reactive than their respective isomers, crotonitrile and 2-methylpropiolaldehyde, as they are methylated at the α-carbon and are thus not sterically hindered at the β-carbon, which is the site of nucleophilic attack during the Michael addition.
§ The reactivity of the predicted reactant can be inferred from that of a similar organic (e.g. those with different terminal functional groups or a double instead of a triple bond). It has been shown that the most reactive Michael acceptors are those with aldehydes (vs nitriles/esters/acids) 42,77 and triple bonds (e.g. cyanoacetylene) rather than double bonds (e.g. acrylonitrile) 78. See Supplementary text, section 2.4 for details.
¶ Organics with a methyl group attached to the β-carbon of an acrylic compound; the reactivity of these organics with N-heterocycles suggests that the methyl group of crotonitrile would not inhibit its ability to behave as a Michael acceptor in a reaction with N-heterocycles.
Figure 3 (caption fragment). [...] Supplementary Table S6 for details). This grouping revealed that the majority contained alkyl and acrylic side chains 1-3 carbons in length with terminal aldehyde or CN/CONH2/COOH groups. Of these groups, only the individual adducts that were formed by at least 10 N-heterocycles were deemed major and included in this figure. Blue, red, and purple arrows indicate that the reaction was robust when N-heterocycles were incubated with mixtures formed under a reducing atmosphere, neutral atmosphere, and both atmospheres, respectively. For clarity, reactants are shown as their nitrile precursors and adducts as their final hydrolysis product (carboxylic acids and aldehydes). * Indicates that it is equally plausible that a structural isomer of the reactant (methacrylonitrile and 3-butyn-2-one for crotonitrile and methylpropiolaldehyde, respectively) attacked the N-heterocycle, forming a structural isomer of the structure shown (see Supplementary Table S2 for [...]
Michael adducts. The majority of the reactions depicted in Fig. 3 are Michael additions of N-heterocycles with 3-carbon-long α,β-conjugated compounds with a terminal carbonyl group (derived from methylcyanoacetylene, cyanoacetylene, crotonitrile, methylpropiolaldehyde, propiolaldehyde, acrolein, and acrylonitrile). Previous work has repeatedly shown that these types of reactions are regioselective, with carbonylation being thermodynamically favored at the N1 and N9 positions of pyrimidines and purines, respectively 42-45, which are the sites where ribose attaches to the canonical nucleobases in RNA. In accordance with this, we found that the Michael acceptors propiolic acid and acrylic acid preferentially add to the N1 position of uracil (Fig. 1; Supplementary Fig. S6).
Reactions unique to either reducing or neutral atmospheres. We identified several reactions that were common in either the reducing or the neutral spark mixtures, but not in both. Additions of the Michael acceptors acrolein, propiolaldehyde, and methylpropiolaldehyde were frequently observed only in reactions with a reducing atmosphere, which is consistent with the fact that all three are easily generated via methane 46,47 (Table 1). However, it should be noted that for the methylpropiolaldehyde and acrolein adducts this observation may be misleading, as the majority of the N-heterocycles that formed these adducts in the reducing spark mixture were not studied in the neutral spark mixture.
On the other hand, the greater presence of acetaldehyde, acetic acid, and formic acid adducts under a N2-CO2 atmosphere is consistent with measurements of excess formaldehyde and proportionally less HCN in neutral spark mixtures 39. Since HCN sequesters aldehydes, low HCN yields correspond to higher acetaldehyde concentrations 48 (for a discussion of the lack of observed formaldehyde adducts see Supplementary text, section 2.1). The paucity of free HCN may have been due to the efficient formation of HCN products (such as glycolonitrile) by cyanohydrin reactions, as deduced from our results here and as seen in prior work 34; alternatively, hydrolysis of HCN to formic acid could have been exacerbated by nitric and nitrous acids present in the neutral spark mixture 34. Presumably, relatively high formic acid-to-HCN (and in parallel acetic acid-to-acetonitrile) ratios account for the robust production of formyl and acetyl adducts in the neutral mixtures.

Discussion of Other Prebiotic Reactions Relevant to Our Results.
Remarkably, the majority of the organics that carbonylated N-heterocycles were Michael acceptors that can also act as N-heterocycle precursors (e.g. urea + cyanoacetylene, propiolic acid, or acrylonitrile → cytosine/uracil) 46,49-51. In addition, it has been shown that carbonylated nucleobases can be generated via purine and pyrimidine precursors that reacted with the carbonylating organics (e.g. hydantoic acid (from urea + glycolonitrile + 2 H2O) + cyanoacetaldehyde → cytosine/uracil-acetic acid) 52. Overall, these previous studies, together with our results, suggest that there are multiple chemical pathways by which carbonylated heterocycles may have formed on early Earth. For an in-depth discussion of trends of nucleobase carbonylation and the possibility of a one-pot synthesis of carbonylated nucleobases in Miller-Urey-type reaction mixtures (based on results from previous publications and our observations of the spark mixtures generated for this study) see Supplementary text, sections 2.4 and 2.5; see Table S7 for yields of carbonylated nucleobases in isolated reactions obtained by previous studies.
Under both reducing and neutral atmospheres, the most common adducts were those containing carbonyl side chains 1 to 3 carbons long. The carbonyl carbon of these side chains, being electron deficient, is susceptible to nucleophilic attack by electron-rich molecules (e.g. amino acids); the exception is the carbonyl resulting from nitrosation and cyanamide (see Supplementary text, section 2.3). Previous work has shown that amino acids readily react with N-heterocycles containing aldehyde side chains in aqueous solution, forming nucleobase-peptide molecules connected via an imine bond 41. The formation of amide bonds may be facilitated by subjecting solutions containing N-heterocycle adducts, amino acids, and hydroxy acids to wet-dry cycles, as hydroxy acids promote peptide bond formation 53. Intriguingly, both amino acids and hydroxy acids are generated in relatively high yields from Miller-Urey reactions 23,27. In fact, AEG, the backbone of aegPNA, has been identified in spark discharges of CH4, N2, NH3, and H2O atmospheres 52, albeit in low yields and in the presence of numerous other amino acids 23. Given this, a diverse set of PNA monomers containing AEG and various other amino acids is likely to be generated from carbonylated heterocycles within complex mixtures.
The resulting PNA monomers with aldehyde, acrylic, and carboxylic acid side chains could subsequently polymerize via Knoevenagel condensations 41 , free radical-induced polymerization 54 , or continuous wet-dry cycles (akin to amino acid polymerization 53 ), respectively. As a diverse set of PNA monomers would be available during polymerization, the resulting macromolecule would likely be composed of a hetero-peptide backbone rather than a uniform backbone like aegPNA. Alternatively, carbonylated heterocycles can be attacked by an existing nucleophilic backbone such as poly-AEG 55 or HCN polymer. Moreover, the carbonyl side chains, being hydrophilic, help solubilize N-heterocycles which in turn may facilitate the supramolecular assembly of carbonylated heterocycles and peptide-nucleobase monomers that base pair in solution 56 . The diversity of these polymerization mechanisms suggests that a facile transition of carbonylated heterocycles to complex polymers is prebiotically plausible under a broad range of conditions.
Our results suggest that formylation of N-heterocycles, including the aminopyrimidines studied here, is more favorable under a neutral atmosphere. Intriguingly, formylated 5,6-diaminopyrimidines can become N9 purine nucleosides in formose reaction mixtures subjected to drying 7. As neutral atmospheres generate more free aldehydes than reducing atmospheres 39, they may also be more conducive than reducing atmospheres to sugar synthesis and hence to the conversion of formylated 5,6-diaminopyrimidines to purine glycosides. Therefore, formylation of N-heterocycles under neutral atmospheres could serve as a first step towards PNA monomers as well as TNA and RNA purine nucleosides.

Conclusions
Here we report plausible prebiotic pathways for the robust formation of carbonylated N-heterocycles in highly complex organic mixtures, which are expected to have been present on early Earth (Fig. 5). The majority of these carbonylated heterocycles are formed by a wide range of N-heterocycles via both a Strecker-like synthesis with glycolonitrile and Michael additions with 3-carbon acrylic and propiolic derivatives. Intriguingly, previous work has shown that Michael additions tend to occur primarily at the N1 and N9 positions of pyrimidines and purines, respectively 42-45 (Fig. 1, Supplementary Fig. S6), hence generating pyrimidine and purine carbonylated heterocycles capable of forming Watson-Crick base pairs. Notably, strongly acidic N-heterocycles (e.g. cyanopyridines) were unreactive and those with bulky side chains conjugated to the ring (e.g. carboxylates, hydroxymethyl, etc.) were mostly inert in spark discharge mixtures; thus, once formed, these compounds are not as likely as other N-heterocycles to be incorporated into a primitive genetic polymer. Conversely, the majority of N-heterocycles which react readily with sugars were also reactive with a range of electrophiles in both reducing and neutral spark mixtures. Thus, although these heterocycles could have readily formed glycosides via ribose, their versatile reactivity with carbonylating organics may have impeded their incorporation into ribonucleoside polymers.
While it has been argued that a reducing atmosphere would have been more favorable for the origins of life on Earth 39, we have demonstrated here that neutral atmospheres are just as conducive to producing carbonylated heterocycles. These results are in accordance with the consensus that the early Earth had a near neutral atmosphere 32 and demonstrate how carbonylated heterocycles could have been produced from N-heterocycles on the early Earth under a broad range of atmospheric conditions.
Although we did not determine yields, the reproducibility of products in replicates (see Supplementary text, section 2.6) and with a large number of N-heterocycles in both heated and unheated reactions (see Supplementary Tables S3-S5) confirms that the formation of such adducts is robust. Furthermore, as the very organics that form N-heterocycles may also carbonylate them, the co-formation of N-heterocycles and their corresponding PNA precursors is prebiotically feasible. Given that there are many chemical pathways by which PNA precursors may form, and that these reactions occur under a broad range of conditions, it is possible that the carbonylated heterocycles identified here would have been readily available for the formation of PNAs on early Earth. PNAs are particularly interesting with regard to the origins of life, as those with a backbone composed of AEG have been shown to form double helices with themselves and RNA and, importantly, are capable of auto-catalytic and cross-catalytic template-based replication 57,58, characteristics that are important for any viable genetic polymer. Additionally, as PNAs are composed of N-heterocycles and amino acids, they provide a plausible avenue for the co-evolution of proteins and nucleic acids.
Figure caption (partial): … 4,19. Alternatively, electric discharges through either a reducing or neutral atmosphere produce (b) a complex mixture of organics that can combine to form (c) N-heterocycles 2,3,17,46. Strongly acidic N-heterocycles (as indicated by low pKaH), such as those with cyano and amide groups conjugated to the ring, deactivate the N-heterocycle for electrophilic attack. Similarly, bulky side chains such as carboxylates also decrease the nucleophilicity of the ring nitrogen, and thus the heterocycle's reactivity in complex mixtures. (d) Organics formed in electric discharges readily react with a wide range of N-heterocycles to form adducts with a carbonyl side chain, which can serve as precursors for PNA monomers.
Intriguingly, the organics generated from spark discharges, including amino and hydroxy acids in addition to N-heterocycles and the derivatives of organics that carbonylate them (e.g. β-amino acids and glycine, which are amine derivatives of Michael acceptors and glycolonitrile, respectively), have been identified in meteorites 23,24 and in experiments modeling organic synthesis at hydrothermal vents 25, Titan's atmosphere 59, and the hydrolysis of tholins expected on Titan and Triton 60,61. In fact, glycolonitrile, acrylonitrile, cyanoacetylene, acrolein, and propiolaldehyde have all been detected in the interstellar medium 62-65; some of these organics have also been identified in interstellar ice analogs (i.e. acrolein, glycolonitrile) 66,67, comets (i.e. cyanoacetylene, propiolaldehyde) 68,69, and Titan's atmosphere (i.e. acrylonitrile and cyanoacetylene) 70,71. Furthermore, AEG and diamino acids, both of which could serve as the backbone for PNAs, have been detected in interstellar ice analogs 72; diamino acids have also been identified in the Murchison meteorite 73.
These observations illustrate that the reactants that produce carbonylated heterocycles, and potentially PNA monomers, form in a wide variety of environments. Consequently, the chemical evolution of N-heterocycles to carbonylated adducts in Miller-Urey mixtures, as described here, may be a common phenomenon in the Solar System, potentially extending the formation of PNA precursors from early Earth environments to meteorites and other planetary bodies.

Materials and Methods
All glassware was thoroughly washed with ultrapure water, wrapped in aluminum foil, and heated at 480 °C for 8 hours to ensure removal of organic compounds. All solutions were stored in anoxic vials sealed with gas-tight blue butyl rubber stoppers; solutions were prepared with 0.2 M N2-purged potassium phosphate buffer (pH 8.0) containing sodium sulfide nonahydrate (0.006 g/L) to act as an oxygen scrub.
Spark discharge experiments. A 1000 mL round bottom flask attached to an Electrotechnics BD50E Tesla coil and two tungsten electrodes (tips 1 cm apart) was used to conduct spark discharge experiments (Supplementary Fig. S1). The flask was filled with 250 mL of a 0.2 M phosphate buffer solution (pH 8.0) and injected with either a reducing (0.4 bar N2, 0.1 bar CO2, 0.25 bar H2, 0.25 bar CH4) or neutral (0.5 bar N2, 0.5 bar CO2) gas mixture. The flask was placed inside a water bath (~5 °C) and sparked at 40 kV for 72 hours; the water bath was employed to maintain a constant temperature in the flask. To evaluate the reactivity of N-heterocycles in complex mixtures, 1 mL of a heterocycle stock solution was reacted with 2 mL aliquots of fresh spark product (resulting in an initial heterocycle concentration of 1 mM). The reaction mixture was either incubated at 80 °C for 1-7 days or frozen immediately and kept at −80 °C; the latter was done to determine whether (1) adducts formed in the absence of heating and (2) heating facilitated their formation. Reactions were stopped by flash freezing with dry ice, and samples were stored at −80 °C until analysis.
Sample analysis. Samples were analyzed using a Thermo Scientific LTQ Orbitrap XL hybrid mass spectrometer equipped with an IonSense DART ion source (IonSense, Saugus, MA, USA). The resolving power of the Orbitrap was set to 30,000 (at m/z 400) and mass spectra were collected over a range of m/z 50-500. The mass spectrometer was operated in positive ion mode and the DART ion source was operated at 350 °C with He gas; guanine was analyzed at 450 °C. N-heterocycles that did not ionize were re-analyzed in negative mode using a DART ion source temperature of 450 °C. Due to the complexity of the reaction mixtures, a targeted approach was used to identify adducts (see Supplementary Table S2).
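As a rough illustration of what a resolving power of 30,000 at m/z 400 implies for peak separation, the following minimal Python sketch estimates the peak width (FWHM) across the scanned range. It assumes the usual Orbitrap behavior in which resolving power scales as 1/sqrt(m/z) relative to the reference setting; the function names and the example m/z values are hypothetical and are not taken from this study.

```python
import math


def fwhm_at(mz, ref_resolving_power=30_000, ref_mz=400.0):
    """Approximate peak width (FWHM, in m/z units) at a given m/z.

    Assumption: Orbitrap-style scaling, i.e. resolving power varies as
    1/sqrt(m/z) relative to the reference setting (30,000 at m/z 400).
    """
    resolving_power = ref_resolving_power * math.sqrt(ref_mz / mz)
    return mz / resolving_power


def resolvable(mz_a, mz_b, **kwargs):
    """True if two ions are separated by at least one FWHM at their mean m/z."""
    mean_mz = (mz_a + mz_b) / 2
    return abs(mz_a - mz_b) >= fwhm_at(mean_mz, **kwargs)


if __name__ == "__main__":
    # Illustrative m/z values only (not measured data).
    for mz in (50.0, 136.0, 400.0, 500.0):
        print(f"m/z {mz:6.1f}: FWHM ~ {fwhm_at(mz):.4f}")
    print(resolvable(400.000, 400.005))
```

Under these assumptions, peaks narrow toward the low end of the m/z 50-500 window, where most small N-heterocycle adducts would be expected to appear.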
Heterocycle adducts were identified using molecular formulae as determined by accurate mass measurements (typically <5 ppm mass error) and comparison to controls that were processed in parallel. Some adducts were confirmed by incubating N-heterocycles (1 mM) with the predicted reactant (1 mM) and matching product ion spectra (MS/MS) along with comparison to reference standards when available. NMR spectra were obtained using a Bruker Avance III HD 500 spectrometer at 50.7 MHz and 298 K. Additional experimental details are described in the SI text (see Supplementary text, section 1.0).
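The formula-based matching described above can be sketched as follows. This is a minimal Python illustration, not the authors' actual workflow: it computes the theoretical m/z of a protonated species from a molecular formula and checks whether a measured value falls within a 5 ppm tolerance. The helper names, the restriction to C, H, N, and O, and the example measured values are assumptions made for illustration; adenine (C5H5N5) is used only as a familiar test formula.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647  # mass of H+ (hydrogen atom minus one electron)


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C5H5N5'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def protonated_mz(formula):
    """Theoretical m/z of the singly protonated ion [M+H]+."""
    return monoisotopic_mass(formula) + PROTON


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def matches(measured_mz, formula, tolerance_ppm=5.0):
    """True if the measured m/z agrees with [M+H]+ of the formula within tolerance."""
    return abs(ppm_error(measured_mz, protonated_mz(formula))) <= tolerance_ppm


if __name__ == "__main__":
    # Hypothetical measured value, for illustration only.
    print(f"{protonated_mz('C5H5N5'):.4f}")
    print(matches(136.0620, "C5H5N5"))
```

In a targeted approach, each predicted adduct formula would be checked this way against the peaks in both the sample and the parallel control; a formula match alone does not distinguish isomers, which is why MS/MS and reference standards are needed for confirmation.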

Data Availability
Individual mass spectra will be deposited at the public Penn State data commons (http://www.datacommons.psu.edu) upon publication. All other data generated or analyzed during this study are included in this published article (and its Supplementary Information files).