RNA oxidation in chromatin modification and DNA-damage response following exposure to formaldehyde

Formaldehyde is an environmental and occupational chemical carcinogen implicated in the damage of proteins and nucleic acids. However, whether formaldehyde provokes modifications of RNAs such as 8-oxo-7,8-dihydroguanine (8-oxoG) and the role that these modifications play on conferring long-term adverse health effects remains unexplored. Here, we profile 8-oxoG modifications using RNA-immunoprecipitation and RNA sequencing (8-oxoG RIP-seq) to identify 343 RNA transcripts heavily enriched in oxidations in human bronchial epithelial BEAS-2B cell cultures exposed to 1 ppm formaldehyde for 2 h. RNA oxidation altered expression of many transcripts involved in chromatin modification and p53-mediated DNA-damage responses, two pathways that play key roles in sustaining genome integrity and typically deregulated in tumorigenesis. Given that these observations were identified in normal cells exhibiting minimal cell stress and death phenotypes (for example, lack of nuclear shrinkage, F-actin alterations or increased LDH activity); we hypothesize that oxidative modification of specific RNA transcripts following formaldehyde exposure denotes an early process occurring in carcinogenesis analogous to the oxidative events surfacing at early stages of neurodegenerative diseases. As such, we provide initial investigations of RNA oxidation as a potentially novel mechanism underlying formaldehyde-induced tumorigenesis.

Cytotoxicity assay. After six hours of recovery from exposure in fresh BGEM media, the basolateral medium from each well was collected and frozen at − 80 °C until the day of analysis. Cellular membrane damage was measured by detection of lactase dehydrogenase (LDH) in the cellular medium using a colorimetric assay (LDH Cytotoxicity Detection Kit, Takara Bio, Japan). LDH is an enzyme released into media after plasma membrane damage and is proposed to increase proportionally to the number of dead cells 51 . Absorbance of the assay was measured at 491 nm for 30 min at 25 °C using a Cytation3 plate reader (Biotek, Winooski, VT).

Confocal microscopy.
Confocal microscopy images of BEAS-2B cells adhered to hydrophilic PTFE membranes were acquired as previously described 39 . Prior to treatment of the cells, the insert's membrane was cut off from the plastic insert by making an incision around the edge of the membrane. Each membrane was then placed onto a microscope slide mounted in a petri dish with cells facing upward. Cells were fixed in 1 ml of 3.7% formaldehyde solution in phosphate buffer solution pH 7.4 (PBS, Thermo Fisher Scientific, Waltham, MA) for 15 min at 37 °C. After fixation, the formaldehyde solution was discarded, and the membrane was washed three times with 1 ml of PBS pre-warmed to 37 °C. Then, 1 ml of 0.1% Triton-X-100 (Sigma) in PBS was placed onto the membrane for 4 min and washed with 1 ml PBS three times. The membrane was then pre-incubated with 1 ml of 1% bovine serum albumin (BSA) in PBS for 20 min, prior to adding the phallotoxin staining solution.
To stain F-actin in the cells, 10 µl of Alexa Fluor 594 Phalloidin solution (Thermo Fisher Scientific, Waltham, MA) was diluted into 400 µL of PBS with 1% BSA solution. The staining solution was placed onto the membrane for 20 min at room temperature and protected from light to prevent photobleaching. The fluorescent media was aspirated and washed three times with PBS. Once each membrane was stained, a drop of ProLong Gold Antifade Mountant with DAPI (Thermo Fisher Scientific, Waltham, MA) was placed onto the membrane. A coverslip was positioned on top of the membrane, and then the edges of each coverslip were sealed with clear nail polish and left to dry. Specimens were stored in the dark at 4 °C until the day of analysis. Confocal microscopy for analysis was performed using a Zeiss LSM 710 Confocal Microscope. Five or more images were acquired in random locations and captured using Zen Pro software with a 63 × oil objective and filters for DAPI and Alexa 594. Image analysis. The cytosolic area was estimated from the phalloidin Alexa 594-stained F-actin surface and the nucleic area from the DAPI-stained DNA. The area was quantified in Fiji/ImageJ by drawing the outline of the cell with the free hand pencil tool in at least 5 cells in 3 confocal images (63 × magnification) selected for each biological replicate and condition. The F-actin organization around the nucleus and plasma membrane was quantified using Fibriltool plugin in Fiji according to the described protocol 52 . This analysis was conducted in 3 confocal images (63 × magnification) selected for each biological replicate and condition. The anisotropic score was computed on 5 or more cells per image by drawing an area of interest of approximately 5 µm by 10 µm. RNA preparation. Following exposure, the apical side of each membrane was treated with 1 ml of TRIzol (Invitrogen, Carlsbad, CA) and gently mixed to ensure thorough lysis of cell culture. Lysate was collected and Scientific RepoRtS | (2020) 10:16545 | https://doi.org/10.1038/s41598-020-73376-7 www.nature.com/scientificreports/ frozen until the day of the extraction. TRIzol RNA extraction was conducted following manufacturer instructions with freshly prepared ethanol (200 Proof, OmniPur, EMD Millipore, Burlington, MA), isopropanol (molecular biology grade, IBI Scientific, Dubuque, IA) and nuclease-free water (Ambion, Austin, TX) purged of oxygen with ultra-high purity N 2 . Briefly, TRIzol aliquots were thawed on ice and 1 ml of chloroform (HPLC grade, J.T.Baker, Phillipsburg, NJ) was added to induce phase separation. Soluble RNA in the aqueous phase was precipitated in 0.5 ml isopropanol overnight at − 20 °C with 1.6 µl glycogen (GlycoBlue, Thermo Fisher Scientific, Waltham, MA) as a carrier. Following precipitation, the pellet was washed twice with 1 ml 95% ethanol and air-dried. The purified RNA was incubated with DNAse I (New England Biolabs, Ipswich, MA) following the manufacturer's protocol. RNA was then re-extracted with 200 µl of 25:24:1 mixture of phenol/chloroform/ isoamyl alcohol (Fisher BioReagents, Hampton, NH) followed by a chloroform extraction and an isopropanol precipitation as described above. After DNase I treatment of RNA, ribosomal RNA (rRNA) was depleted using Ribo-Zero Gold rRNA Removal Kit (Illumina, San Diego, CA) as described by the manufacturer's protocol to produce the formaldehyde air and clean air input RNA samples (FA IN and CA IN ). Depletion of rRNA was validated by Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA) and all samples surpassed an RNA Integrity Number (RIN) threshold of 7.00. This protocol has been adapted from other published work by our group [53][54][55] .
Immunoprecipitations of 8-oxoG-containing RNA transcripts were performed in biological duplicates for clean air (CA) and formaldehyde (FA) conditions, following a protocol previously established in our group 18,39 . All buffers were prepared fresh from concentrated stocks on the day of pulldown experiments and purged of O 2 as described above. A portion of the input RNA was incubated with 12.5 µg of 8-oxo-7,8-dihydroguanosine (8-oxoG) monoclonal antibody (0.5 mg/ml, clone 15A3, Trevigen, Gaithersburg, MD) in immunoprecipitation (IP) buffer (10 mM Tris pH 7.4, 150 mM NaCl, 0.1% IGEPAL, and 200 units/ml SUPERaseIn RNA inhibitor [Invitrogen, Carlsbad, CA]) in a 1 ml reaction volume for 2 h at 4 °C with rotation. The 8-oxoG antibody binds specifically to 8-oxoG-containing transcripts directly without mediation through a RNA-binding protein 56 . SureBeads Protein A magnetic beads (Biorad, Hercules, CA) were washed according to manufacturer's protocol and blocked in IP buffer supplemented with 0.5 mg/ml bovine serum albumin (BSA) for two hours at room temperature. After washing beads twice in IP buffer, the beads were resuspended in IP buffer, mixed with the RNA-antibody reaction and incubated for 2 h at 4 °C with rotation. Next, the beads were washed three more times in IP buffer before two rounds of competitive elution were performed. Each elution consisted of incubation of the beads with 108 µg of 8-oxo-dG (Cayman Chemical, Ann Arbor, MI) in IP buffer for 1 h at 4 °C with rotation. The elution volume was then cleaned up using the RNA Clean and Concentrator-5 kit (Zymo Research, Irvine, CA) to isolate Clean Air-immunoprecipitated oxidized RNA (CA IP ) and Formaldehyde-immunoprecipitated oxidized RNA (FA IP ). RNA sequencing. Libraries for CA Input , FA Input , CA IP , and FA IP were prepared using the NEBNext Small RNA kit (NEB, Ipswich, MA) by the Genomic Sequencing and Analysis Facility (GSAF) at the University of Texas at Austin. Sequencing was performed on an Illumina HiSeq 4000 to yield 75-bp reads with an average read depth of 31 M reads for pull-downs and 56 M reads for total RNA samples. Raw sequencing results are available in the NCBI GEO database (Accession number: GSE148377). Data analysis. Raw sequencing data was acquired from the GSAF and high run quality was visually confirmed using FastQC (https ://www.bioin forma tics.babra ham.ac.uk/index .html). Runs were processed with Cutadapt to remove primer and adaptor sequences 57,58 . After trimming, reads were re-assessed with FastQC for read quality and the removal of repetitive sequences was confirmed. Trimmed reads were then aligned with Spliced Transcripts Alignment to a Reference (STAR) aligner 59 . STAR was chosen over other mapping programs such as Tophat2, HISAT, bwa, and bowtie for its ability to identify novel transcript isoforms via a two-pass mapping approach.
The full description and code used to analyze transcriptomic data is described in Supplemental Methods. Briefly, a STAR genome index was constructed utilizing the ENSEBL GRCh38.p12 primary genome assembly (ftp://ftp.ensem bl.org/pub/relea se-94/fasta /homo_sapie ns/dna/Homo_sapie ns.GRCh3 8.dna_sm.prima ry_assem bly.fa.gz) with the corresponding annotations (ftp://ftp.ensem bl.org/pub/relea se-94/gtf/homo_sapie ns/Homo_ sapie ns.GRCh3 8.94.gtf.gz). The genome index was used as a reference for first pass mapping of the trimmed reads to identify and annotate novel splice junctions. The novel splice junction database was then used in conjunction with the genome index for second pass mapping of trimmed reads to create an Aligned.to.Transcriptome.bam output file. Read alignments were visually inspected for proper alignment of transcripts to annotated genes by Integrative Genomics Viewer and the number of reads collected for each splice variant was calculated using RSEM [60][61][62] . RSEM was chosen for read counting because it uses the SAM/BAM Aligned.toTranscriptome output file from the STAR aligner as input to account for novel isoforms generated during the two-pass mapping approach. The RSEM reference file was prepared using ENSEBL GRCh38.p12 and its corresponding annotation described above to calculate expression of each splice variant from the STAR output bam file.
Annotation and functional analysis. The tximport package was used to import the RSEM results file into R, allowing assessment of each transcript generated by STAR 63,64 . Statistical analysis of differential expression and 8-oxoG enrichment was performed with DESeq2 in R version 3.5 using modified steps in the DESeq2 manual and help page 65  www.nature.com/scientificreports/ method to control for type I error due to multiple comparisons. A p adj cutoff of less than 0.05 and a fold change greater than 4 (log 2 FC >|2|) was chosen so that only relevant genes were included the downstream functional network analyses. Comparisons of log 2 (fold change) values between FA IP and FA IN (referred to as FA FC ) as well as CA IP and CA IN (referred to as CA FC ) were calculated to identify transcripts that were differentially oxidized relative to their input RNA. By normalizing each oxidized transcript isolated by immunoprecipitation relative to the expression of its corresponding transcript abundance in the input pool, relative enrichment of oxidation for individual transcripts could be calculated 66 . Biological replicates reduced noise due to nonspecific binding of antibodies to transcripts and minimized bias within sequencing reactions 67 . The use of proportional enrichment of transcripts in formaldehyde exposures relative to clean air controls helped to discriminate formaldehyde-induced oxidations from background oxidations 66 . For this reason, a comparison between CA IP and FA IP was not performed because IP requires input RNA as a frame of reference for enrichment of individual transcripts relative to expression of the transcript from the input RNA pool.
To identify transcripts enriched in oxidation resulting from oxidative stress associated with the formaldehyde exposure, candidate transcripts (p adj < 0.05) resulting from the DESeq2 analysis between immunoprecipitated and input RNA pools were filtered for further analysis. Differences in log 2 fold changes between these transcripts in the formaldehyde treatment and their corresponding transcripts in the clean air condition were calculated by subtracting the DESeq2-generated log 2 FC value of clean air controls from that of formaldehyde exposed samples for each transcript (∆log 2 FC = FA log2FC − CA log2FC ), similar to that performed by Soetanto et al. 68 . Log 2 FC values of 0.00 were raised to 0.01 to enable log calculations without influencing count data. The difference between FA log2FC and CA log2FC was used to calculate the relative magnitude of oxidation for each transcript between clean air and formaldehyde air exposures. A cutoff value of ∆log 2 FC > |2| was used to identify differentially oxidized transcripts associated with formaldehyde exposure and were used for further functional analyses.
Raw counts from RSEM were manually inspected to ensure that the major drivers of differentiation were not due to noise from variation in low transcript counts (minimum estimated counts above 0 were 22.99 and 16.32 for differential expression and oxidation enrichment analyses, respectively). Protein name information for each transcript was retrieved using the BiomaRt R package with the Ensembl market database setting 69 . Transcripts identified as differentially expressed or enriched in oxidations were then used for downstream analyses of protein interactions, cellular/biological gene ontology, and functional pathways.
Strong clustering of network associations have been used to infer functional relationships amongst proteins, as groups of strongly interacting genes can be indicative of ongoing cellular processes 70,71 . To elucidate potential functional interactions among transcripts identified by the differential expression and oxidative enrichment analyses, STRING database was used to identify known interactions amongst transcripts identified by the ∆log 2 FC filtering steps outlined above 72,73 . Protein-protein interaction (PPI) networks were constructed in Cytoscape StringApp 74,75 . Furthermore, STRING enrichment within Cytoscape was used to assess association with potential functional roles of transcripts in response to formaldehyde exposure. Databases with relevant information for gene ontology (GO Biological Processes 2018, GO Molecular Function 2018, GO Cellular Component 2018) and molecular pathways (Reactome 2016) were compiled and filtered to remove redundant terms and to determine statistical significance of association with the genes assessed (p adj < 0.05). The pathways and GO terms identified were further investigated through literature review for relatedness and experimental relevance.
Statistics and reproducibility. We conducted all described measurements at least twice using independent and biological replicates. The experiments were not randomized, and the researchers were not blinded during experiments. All data was presented as the mean ± one standard deviation. Statistical analysis between groups was performed in Rstudio software (V 1.2.5042) using the function stat_compare_means and determined by t-test (two-tail homoscedastic).

Results
Minimal phenotypic changes in BEAS-2B cell line exposed to 1 ppm formaldehyde exposure. We exposed biological triplicates of human bronchial epithelial BEAS-2B cell cultures to formaldehyde following a previously described experimental design 18,39 . Cells were grown in collagen-coated inserts with a seeding density of 200,000 cells and incubated for 24 h to reach 70-80% confluence. Two hours before exposure, the medium from the apical cell surface was fully removed and the medium from the basolateral surface was renewed with fresh complete medium. Cells were then put into an exposure chamber and subjected to humidified clean air (control condition) or humidified air containing 1 ppm formaldehyde for 2 h at 2 L/min. An airliquid interface (ALI) system was used to allow direct exposure of the lung cells to the desired concentration of toxin ( Figure S1). For this study, we chose 1 ppm formaldehyde exposures over the course of 2 h to reflect occupational exposure conditions from individuals in the garment (0.9-2.7 ppm), furniture (0.4-5.4 ppm), and utensil (0.5-2.6 ppm) manufacturing industries 11,76 . Following exposure, cells were allowed to recover with fresh media for 6 h.
We examined morphological features indicative of cellular stress such as nuclear shrinking 77 and alterations in actin filaments 78 using confocal microscopy (Fig. 1A). We analyzed the cytosol and nuclear area of exposed cells demonstrating no significant alterations in size relative to control samples (Fig. 1B). We further quantified the heterogeneity of F-actin orientations (or F-actin anisotropy) in cortical actin filaments using FibrilTool 52 . As seen in Fig. 1C, our results show no significant alterations to cellular cytoskeletal structures. Moreover, we analyzed cytotoxicity effects of formaldehyde exposure in BEAS-2B cells with lactase dehydrogenase (LDH) activity assays (Fig. 1D). We observed no significant differences in LDH activity in the cellular media relative to www.nature.com/scientificreports/ in F-actin organization in exposed cells. An anisotropy score of 0 is given for no order (purely isotropic fibrils), and 1 is given for perfectly parallel fibrils (purely anisotropic arrays). This analysis was conducted in 10 µm × 5 µm regions on 65 cells per treatment.
(D) LDH activity levels assayed after two hours of exposure and six hours recovery show no significant differences in exposed cells. Statistical difference was computed by t-test analysis and error bars are expressed as one standard deviation. www.nature.com/scientificreports/ control samples. Taken together, these data illustrate that minimal phenotypic alterations were observed in cells treated with 1 ppm formaldehyde.

Scientific RepoRtS
Profiling 8-oxoG modifications shows oxidation of chromatin modification and DNA-damage response transcripts following 1 ppm formaldehyde exposure. We profiled 8-oxoG-modified transcripts using an RNA-sequencing approach coupled with 8-oxoG immunoprecipitation that was recently established by our lab 18,39 . Two biological replicates were used for clean air (CA) and 1 ppm formaldehyde (FA) exposures ( Fig. 2A). After RNA extraction and ribosomal RNA (rRNA) depletion, the resulting RNA pool for each treatment was then split into two fractions. One fraction for each treatment-referred in Fig. 2B as CA IN and FA IN -was used to quantify total RNA abundance of transcripts and served as a reference to measure relative enrichment following immunoprecipitation. The second fraction for each treatment-CA IP and FA IP , respectively -was used to isolate 8-oxoG-containing transcripts via immunoprecipitation with an anti-8-oxoG antibody (clone 15A3). The clone 15A3 can recognize 8-oxoG in DNA and RNA 28,[79][80][81] , and it has been applied and validated for 8-oxoG immunoprecipitation of miRNA and mRNAs 28,81 . These samples were used for library preparation and sequencing (Fig. 2C) (Fig. 2D). Full experimental details are described in the material and methods section.
To identify 8-oxoG transcripts that were significantly oxidized in the formaldehyde exposure, we first applied a threshold of p adj < 0.05 enrichment compared to input in at least one condition (i.e., CA IP with CA IN or FA IP with FA IN ). Subsequently, we applied a cutoff of fold-change (FC) difference between the resulting FA pool (FA FC ) and CA pool (CA FC ) greater than 4 (i.e., FA FC /CA FC > 4, see Fig. 2D). By applying these thresholds, we removed any potential unintended drivers of differentiation (e.g., non-specific interactions, antibody cross-reactivity and artifactual 8-oxoG formation during library preparation) 18,39 . Nine transcripts detected in the FA pool but not in the CA pool were also included, since detection in the formaldehyde treatment alone demonstrated enrichment. This analysis identified 343 transcripts as enriched in oxidation from formaldehyde-exposed cultures relative to the same transcripts from clean air samples (full list of transcripts listed in Data S1). Over 70% of the 343 oxidation enriched transcripts were protein-encoding, and ten genes had more than one transcript isoform in the oxidation enriched set (e.g. ACTB, BRCA1, DDX3X, FN1, LPP, MLLT10, SMARCA4, SNORD3B-2, SNU13 and ZBTB14).
Principal component analysis of 8-oxoG RIP-seq data (i.e., CA IP with CA IN and FA IP with FA IN ) showed that the major drivers of variance were the 8-oxoG immunoprecipitation procedure and the treatment condition ( Figure S2A and S2B). Moreover, the volcano plots show unbiased distribution of immunoprecipitated transcripts ( Figure S2C and S2D) consistent with previous 8-oxoG RIP-seq datasets 39 , further validating the effective execution of our 8-oxoG RIP-seq experiments.
The identification of transcript enrichment in 8-oxoG immunoprecipitated fractions relative to the total RNA pool enables novel insight about cellular processes that may be potentially impacted by nucleotide modifications imposed on existing RNA transcripts. We performed pathway and gene ontology (GO) enrichment analyses using the StringApp in Cytoscape 72,74,75 . Significant associations below the cutoff (p adj < 0.05) were ranked by their p adj to provide context of relevant gene ontologies for terms potentially impacted by RNA oxidation (Fig. 3A). The patterns do not appear to be driven by enrichment of single genes associated with multiple terms and pathways; rather, diverse suites of transcripts contribute to enrichment of terms that include few general members. Specifically, the resulting functional assessment and GO analysis indicates oxidative enrichment of transcripts associated with chromatin regulation, cell signaling, cell adhesion and DNA processes such as strand break repair and DNA recombination. Full details of significant GO terms and pathways are provided in Data S2.
To explore relationships between transcripts identified as enriched in oxidations from the formaldehyde treated cells relative to the clean air controls, we performed a network analysis in Cytoscape 75 . Importantly, this analysis identified significantly more interactions than expected due to chance (p = 5.20 × 10 -11 , Fig. 3B), suggesting that functional cellular processes could be differentially affected by transcript oxidation 71,72 . Closer inspection of the most densely connected nodes highlight transcripts such as Actin, cytoplasmic 1 (ACTB) with 31 connections, Breast cancer type 1 susceptibility protein (BRCA1) with 31 connections, Catenin beta 1 (CTNNB1) with 30 connections, Transcription activator BRG1 (SMARCA4) with 30 connections, and NADdependent protein deacetylase sirtuin-1 (SIRT1) with 28 connections (Data S3). These transcripts are mainly associated with chromatin modifying enzymes and DNA damage response pathways. Importantly, the subset of oxidation enriched transcripts that constitute these pathways share stronger interconnectivity than expected due to chance (p < 1 × 10 -16 ) (Fig. 3C).
Taken together, our data provides evidence of oxidative modification in 343 transcripts (out of > 140,000 total transcripts) in BEAS-2B cells after formaldehyde exposure. Importantly, we show that multiple functional associations affected by oxidations become apparent in relatively few functional categories, supporting recent work that proposes that RNA oxidation is selective, rather than randomly distributed across the transcriptome 28,34,83 . Specifically, we demonstrate that formaldehyde induces RNA oxidation in cellular processes specific to the role of chromatin regulation and DNA-damage response. Oxidative marks and the resulting defects in transcripts associated with these pathways may make them early markers for cell function and fate, as previously suggested by Shan et al 31 .

8-oxoG-modified transcripts in the chromatin regulation and DNA-damage response by p53
pathways are differentially downregulated. Our assessment of differential expression between input RNA samples (CA IN and FA IN ), applying p adj < 0.05 and log 2 FC > |2| cutoffs, identified 256 differentially expressed transcripts from cultures exposed to formaldehyde relative to clean air samples ( Figure S3). Of these differentially expressed transcripts, 127 were upregulated (Data S4) and 129 were downregulated (Data S5). www.nature.com/scientificreports/ Oxidation and differential expression are considered to be mutually exclusive from each other because oxidative stress by formaldehyde is presumed to impact the suite of transcripts present in the cell at the time of insult, regardless of their regulation following the event. Interestingly, 52 of the 343 transcripts from the oxidative enrichment analysis were identified in the differential expression analysis as downregulated (Data S6); none of the 343 oxidative enriched transcripts were identified as significantly upregulated. Of these 52 downregulated and oxidation enriched transcripts, 13 are involved in the chromatin modifying enzymes pathway and four are   www.nature.com/scientificreports/ highly connected gene products, two oxidation enriched transcripts of ACTB and BRCA1 and one oxidation enriched transcript of SMARCA4 (out of two oxidation enriched transcripts for this gene product) were differentially downregulated. It is important to note that replacement of oxidized transcripts by the production of non-oxidized transcripts in response to formaldehyde exposure (upregulation) would likely supersede the signal from oxidized transcripts generated from the stress event. Downregulated transcripts likely appear to harbor more extensive oxidation because they are not readily replaced and potentially impact biological function, since the results show significant oxidative enrichment of transcripts that code for proteins associated with oxidative stress response pathways. The remainder of the 343 oxidation enriched transcripts (291) were not identified as significantly up-or down-regulated. The latter observation suggests that downregulated, oxidation-harboring transcripts may persist longer in the cell relative to those that are replaced by newly synthesized transcripts. It is worth noting that our relative quantitative analysis of transcripts was constrained as following the selected 6-h recovery period post formaldehyde exposures. As such, by focusing on those transcripts enriched in oxidation, we preferentially identify transcripts that are not efficiently turned over or replaced and are therefore more likely to contribute to sustained influences on cellular processes and cell survival; this is justified by the establishment of RNA oxidations as early indicators of cell mortality occurring hours to days after exposure 31 .

Discussion
In this study, we demonstrate oxidation enrichment of 343 RNA transcripts in cultures exposed to 1 ppm formaldehyde exposure relative to clean air conditions. Although RNA oxidation altered expression levels of many transcripts, this was particularly pronounced on transcripts involved in chromatin modification and DNAdamage response. Given that exposure to 1 ppm formaldehyde caused negligible nucleolar and cytosolic alterations, minimal cytoskeletal defects, and minimal cytotoxicity in BEAS-2B cells, we considered the observed Table 1. Fold-change in 8-oxoG enrichment and differential expression (DE) in the chromatin modification and DNA-damage response pathways after formaldehyde exposure. *Differentially expressed transcripts with p adj < 0.1 are shown in bold and with p adj < 0.05 are underlined. www.nature.com/scientificreports/ alterations in transcription expression patterns and networks indicative of early cellular responses rather than consequences of apoptosis and necrosis. The ability of formaldehyde to induce specific 8-oxoG modification of RNA molecules and to subsequently compromise the stability of these modified transcripts and their associated pathways represents a potentially novel mechanism for the detrimental and carcinogenic effects of formaldehyde. Formaldehyde is an environmental and occupational carcinogen broadly used in industrial and consumer products; therefore, it can be found in several domestic products and construction materials 1 . Due to its abundant utilization, formaldehyde is a common indoor pollutant. On average, children and adults spend 90% of the time in indoor environments 84 , suggesting regular exposures to formaldehyde in daily life. Indoor concentrations of formaldehyde typically vary from 5 to 220 µg/m 3 (4.1 ppb-0.18 ppm) 85 , although these concentrations are mainly reported in industrialized countries with more stringent environmental regulations. Higher levels of formaldehyde are expected in locations with more permissible laws or in places where exposures go unnoticed. For example, mean levels of 670 µg/m 3 (0.54 ppm) were detected in households using wood for cooking in India 86 . Moreover, levels exceeding 2 ppm have been detected in hair salons during heavy use of hair products containing formaldehyde 87 (e.g. hair-smoothing products disproportionally used by different demographics).
Given that formaldehyde exposure mainly occurs by inhalation, we used human bronchial epithelial BEAS-2B cells exposed at the air-liquid interface as a model for this study. We found minimal morphological alterations and LDH activity in the media following 1 ppm formaldehyde treatment for 2 h (2 L/min). It is worth noting that similar exposure conditions used in other reports have shown varied responses depending on the cell type, dose, and exposure technique. For example, human alveolar epithelia cell line A549 (a cancerous line) exposed to 1 ppm formaldehyde for 4 h at the ALI (1.0 L/min) rendered a substantial 6.7-fold increase in LDH activity as compared to control samples 44 . Likewise, human tracheal fibroblast Hs 680.Tr cells treated with media containing 100 µM (3 ppm) formaldehyde for 4 h caused ~ 50% reduced viability (determined by MTT colorimetric assay) 88 . Conversely, studies in A549 cells detect a negligible reduction in viability after 0.5 ppm formaldehyde exposure for 72 h using the ALI, and in BEAS-2B cells show over 90% viability over a period of 6 h with exposure concentrations up to 15 ppm 42,89 . The latter observation, also based on BEAS-2B cells, is most consistent with our results, albeit using much higher levels of formaldehyde exposures.
Growing evidence suggests that RNA oxidation might be involved in the pathogenesis of many neurodegenerative diseases (e.g. Alzheimer's and Parkinson's disease) given that substantial levels of 8-oxoG-modified RNAs had been detected at early stages of degeneration of vulnerable neurons 90 . Although most research on RNA oxidation has focused on brain disorders, oxidative stress-which is a primary source of RNA oxidation-has been characterized in a broad spectrum of health conditions 91 . Therefore, the association of RNA oxidation in other diseases such as cancers remains to be explored. Environmental factors such as solar UVB radiations and cigarette smoke that have been recognized as risk factors of cancers 92 can oxidatively damage RNA in the form of 8-oxoG 93,94 . Importantly, observations in the literature argue that due the ability of 8-oxoG to base pair with adenosine, mRNA oxidation can introduce amino acid point mutations 95-97 -the most typical genetic trace in tumors 98 . Moreover, supporting the potential connection of cancer and RNA oxidation, increased free 8-oxoG excretion has been detected in urine of non-small-cell lung cancer patients 99 .
RNA oxidation has been previously profiled using 8-oxoG RIP-seq to identify 8-oxoG accumulation in the cholesterol synthesis transcript FDFT1 in BEAS-2B cells exposed to air pollution mixtures, leading to changes within the cholesterol pathway that result in morphological alterations associated with lung inflammation and fibrosis 39 . Moreover, 8-oxoG profiling in micro RNAs (miRNAs) with 8-oxoG immunoprecipitations coupled to microarrays revealed that oxidized miR-184 erroneously targets transcripts encoding the anti-apoptotic proteins Bcl-xL and Bcl-w for degradation, subsequently inducing initiation of apoptosis in rat heart H9c2 cells during myocardial hypoxia 81 . In this current study, we use 8-oxoG RIP-seq to demonstrate that formaldehyde exposure heavily induces oxidation of transcripts that heavily cluster in the DNA-damage response and signal transduction by p53 class mediator pathways. Moreover, given the impact of 8-oxoG modification on stability of RNA transcripts, several of these transcripts are significantly downregulated. Tumor protein p53 functions as a sequence-specific transcription factor, activated in response to DNA damage generated by diverse environmental stresses such as ionizing radiation, UV light, and methyl methanesulfonate 100 . It promotes genomic stability by regulating many genes responsible in part of G1 phase cell cycle progress 101,102 and, consequently, p53 prevents tumor progression. Previous works have elucidated that formaldehyde causes point mutations of the p53 gene in nasal squamous cell carcinomas in rats (induced with > 6 ppm formaldehyde), which provoke identical mutations observed in human cancers 103 . Furthermore, formaldehyde exposures experienced at sufficiently high concentrations can compromise genetic stability by DNA fragmentation 43 and formation of DNA adducts such as N(2)hydroxymethyl-dG 16 . One of the most functionally connected oxidized transcripts in biological networks found in our Cytoscape network analysis is Breast cancer type 1 susceptibility protein (BRCA1), which plays a central role in DNA repair by facilitating DNA double-strand break repair 104 . BRCA1 mutations confer a high risk of carcinogenesis linked to loss of the ability for cells to maintain genetic integrity 105 . Our indication that formaldehyde induces oxidation of transcripts responsible for DNA-damage responses evidences the early impact of formaldehyde in precluding cells from removing DNA lesions. In fact, loss of p53 signaling and activity is a major driver of cancer because in its absence cells can no longer adequately sustain genome integrity. Taken together, this study presents initial investigations into the role of RNA oxidation in formaldehyde-induced carcinogenesis.
Disruption of chromatin regulation and assembly can contribute to the development of cancer because it induces defects in the DNA repair machinery, essential for genome stability 106 . Mutations in genes involved in chromatin organization and regulation have been identified in over 50% of cancers 107 . Our findings of 21 transcripts in the chromatin modifying enzymes pathway that are extensively oxidized during formaldehyde exposure (13 of these are significantly downregulated) aligns with previous studies that argue the negative effect of formaldehyde in chromatin function. Specifically, formaldehyde exposure dramatically affects acetylation of N-terminal tails of histones 89  www.nature.com/scientificreports/ acetyltransferase complex, was found among the oxidation enriched transcripts following formaldehyde exposure. This protein complex controls transcriptional activation of several oncogenes (e.g. MYC) by acetylation of histones H4 and H2A 108 . Supporting the connection of formaldehyde-driven deregulation of acetylation of histones and RNA oxidation, we additionally found 8-oxoG accumulation in two transcripts that regulate histones deacetylation, NAD-dependent protein deacetylase sirtuin-1 (SIRT1) and Histone deacetylase 8 (HDAC8). Importantly, SIRT1 deacetylates histone H4 lysine 16 (H4-K16Ac) and histone H3 lysine 9 (H3-K9Ac) and previous studies have linked its upregulation with cellular protection to formaldehyde exposure 109 . HDAC8 deacetylates core histone proteins (e.g., H2A, H2B, H3, and H4) as well as non-histone proteins (e.g., p53) 110 ; an activity that is downregulated in several cancers 111 . Besides alteration in acetylation patterns, formaldehyde exposure can also affect inhibition of chromatin assembly 89 . Consistent with this observation, we found that Histone H2A type 1-C (HIST1H2AC), a core component of the DNA packing complex nucleosome, is highly susceptible to oxidation following formaldehyde exposure. Histone-lysine N-methyltransferase SET2D and Histone-lysine N-methyltransferase EHMT2 were also found heavily oxidized by formaldehyde exposure. Notably, SET2D is the main enzyme tri-methylating histone H3 lysine 36 (H3K36me3), a modification involved in DNA repair signaling through homologous recombination and nonhomologous end-joining 112 . The susceptibility to 8-oxoG modification by formaldehyde exposure in transcripts involved in chromatin regulation is likely to reduce genome stability, reflecting RNA oxidation as a novel mechanism implicated in the carcinogenic effects of formaldehyde in cells. Using 8-oxoG profiling, we also identified many transcripts associated with cancer pathways, including nonsmall-cell lung cancer, acute myeloid leukemia, and prostate cancer ( Table 2). Among these oxidized transcripts, Signal transducer and activator of transcription 3 (STAT3) and E3 ubiquitin-protein ligase MDM2 are important regulators of carcinogenesis 113,114 . Although neither of these transcripts were significantly regulated, the detection of 8-oxoG modifications in cancer-related transcripts indicates that together with more general processes in carcinogenesis such as DNA damage response and chromatin regulation, oxidation of guanine occurs disproportionately on transcripts associated with specific pathways in different types of cancers. Overall, we provide a first approximation to understanding post-transcriptional effects of formaldehyde in key pathways and processes that are linked to cancer development. Given that these observations are identified in normal cells without exhibiting cell death phenotypes (i.e. nuclear shrinkage and fragmentation or increased LDH activity), we hypothesize that oxidative damage of specific RNA transcripts following sublethal formaldehyde exposure could represent an early process occurring in carcinogenesis, analogous to the events of significant increase of 8-oxoG-modified transcripts at early stages of neurodegenerative diseases 31,115 . The validation of the formation of 8-oxoG in specific transcripts reported in this study requires investigation in further work, using the workflows described previously 28,81 . Moreover, it is important to confirm whether similar responses to formaldehyde exposure can be observed in other models of respiratory epithelial cells. Nevertheless, our studies provide initial insights into a new mechanism underlying formaldehyde carcinogenesis and toxicity in cells.

Conclusions
The results presented herein support the growing body of evidence that specific transcripts are more prone to RNA oxidation than others and that transcript oxidation is not a random event. Our study identified significant enrichment of RNA oxidation on transcripts associated with specific pathways, including those previously www.nature.com/scientificreports/ indicated in the cellular response to formaldehyde exposure, suggestive of a functional role. At biologically relevant conditions of the formaldehyde exposure tested here (1 ppm for 2 h) followed by 6 h of recovery, we identified enrichment of oxidized transcripts associated with pathways including chromatin modification and DNA-damage response by p53, demonstrating a sustained presence of these oxidized transcripts. Considering that previous research has demonstrated translational mutations resulting from RNA oxidation, altered proteins produced by these transcripts could be indicative of early pathway dysregulation and disease onset. Our work strengthens previous studies pointing to RNA transcripts as candidates for biomarkers of oxidative stress or early onset of disease 39,50,116 given their prolonged enrichment of oxidation. Based on the functional relevance of oxidation enriched transcripts on interacting cellular processes, we propose RNA oxidation as an additional driver of cell physiology, health, and disease that remains undetected by standard transcriptomics approaches. As technology progresses in this field, we anticipate that the identification of 8-oxoG modifications at nucleotide resolution will allow more insight to the biological implications of oxidations on specific transcripts. In the context of environmental toxins, utilization of molecular reporters, like oxidized RNAs, that can report early alterations to functional cellular pathways will be key in protecting consumers, employees, and health care workers from the deleterious effects of harmful chemicals (like formaldehyde) ubiquitously present in manufactured goods. www.nature.com/scientificreports/