The Healthy Infant Nasal Transcriptome: A Benchmark Study

Chu, Chin-Yi; Qiu, Xing; Wang, Lu; Bhattacharya, Soumyaroop; Lofthus, Gerry; Corbett, Anthony; Holden-Wiltse, Jeanne; Grier, Alex; Tesini, Brenda; Gill, Steven R.; Falsey, Ann R.; Caserta, Mary T.; Walsh, Edward E.; Mariani, Thomas J.

doi:10.1038/srep33994

Download PDF

Article
Open access
Published: 23 September 2016

The Healthy Infant Nasal Transcriptome: A Benchmark Study

Chin-Yi Chu¹^na1,
Xing Qiu²^na1,
Lu Wang²^na1,
Soumyaroop Bhattacharya¹^na1,
Gerry Lofthus³^na1,
Anthony Corbett²^na1,
Jeanne Holden-Wiltse²^na1,
Alex Grier⁴^na1,
Brenda Tesini⁵^na1,
Steven R. Gill⁴^na1,
Ann R. Falsey⁶^na1,
Mary T. Caserta⁵^na1,
Edward E. Walsh⁶^na1 &
…
Thomas J. Mariani^1,3^na1

Scientific Reports volume 6, Article number: 33994 (2016) Cite this article

2944 Accesses
19 Citations
55 Altmetric
Metrics details

Subjects

Abstract

Responses by resident cells are likely to play a key role in determining the severity of respiratory disease. However, sampling of the airways poses a significant challenge, particularly in infants and children. Here, we report a reliable method for obtaining nasal epithelial cell RNA from infants for genome-wide transcriptomic analysis and describe baseline expression characteristics in an asymptomatic cohort. Nasal epithelial cells were collected by brushing of the inferior turbinates and gene expression was interrogated by RNA-seq analysis. Reliable recovery of RNA occurred in the absence of adverse events. We observed high expression of epithelial cell markers and similarity to the transcriptome for intrapulmonary airway epithelial cells. We identified genes displaying low and high expression variability, both inherently and in response to environmental exposures. The greatest gene expression differences in this asymptomatic cohort were associated with the presence of known pathogenic viruses and/or bacteria. Robust bacteria-associated gene expression patterns were significantly associated with the presence of Moraxella. In summary, we have developed a reliable method for interrogating the infant airway transcriptome by sampling the nasal epithelium. Our data demonstrates both the fidelity and feasibility of our methodology and describes normal gene expression and variation within a healthy infant cohort.

Regulation of immune responses by the airway epithelial cell landscape

Article 13 January 2021

Genomic attributes of airway commensal bacteria and mucosa

Article Open access 12 February 2024

Nasal airway transcriptome-wide association study of asthma reveals genetically driven mucus pathobiology

Article Open access 28 March 2022

Introduction

Sampling of the airway for research or diagnostic biospecimens in diseased infants is highly challenging. As compared to the tremendous progress made over the last decades in our understanding of adult lung diseases such as COPD and lung fibrosis^1,2,3, this limitation has restricted progress in identifying the molecular mechanisms contributing to pediatric respiratory diseases. Many contemporary studies sample the upper airway, using aspirates of tracheal secretions in intubated subjects to collect intra-airway inflammatory cells and sloughed resident cells of the airway. The availability of these samples is limited to intubated subjects, introducing significant selection bias. Furthermore, the ability of these samples to reliably capture the cellular and molecular state of the lung is unclear. Other studies have analyzed nasal washings, which can be performed in a more broad population. Again, these samples are of unclear relevance to the pathophysiology of the lung.

Brushings of nasal epithelial cells have been used for decades to diagnose primary ciliary dyskenesis (PCD), indicating that recovery of epithelial cells from the nose of non-acutely ill children and infants could provide molecular information that reflects abnormalities of physiological relevance in the distal airway. Furthermore, nasal brushings in adults and older pediatric patients have previously been used for interrogation of the transcriptome. These studies have generally shown that nasal brushings broadly capture gene expression with relevance to the airways⁴ and can detect disease-associated changes in gene expression⁵.

We undertook the present study to determine if the recovery of nasal epithelial cells for transcriptomic analysis was feasible and whether these data could provide molecular information of relevance to the more distal airways. We present a reliable procedure for sampling resident cells of the airway in healthy young infants, which is applicable to infants experiencing respiratory infectious disease (T. Mariani, unpublished observations) and likely to older ill and healthy pediatric subjects. Furthermore, we provide a benchmark healthy infant nasal transcriptome (HINT) data set, describing baseline expression and natural variation, for asymptomatic subjects. Our data demonstrates that it is technically feasible to quantify genome-wide RNA expression from young infant’s airways, which may be helpful in the evaluation of host responses to diseased states.

Results

Subject Demographics and RNA-Seq Data Generation

Fifty-three healthy, full-term infants were recruited at birth. A majority of the subjects were female, predominantly Caucasian and had a mean gestational age of 39.3 ± 1.0 (mean ± SD) weeks, with a mean birth weight of 3.3 ± 0.4 kg (Table 1). Nasal brushing samples were obtained from the inferior nasal turbinate at approximately one month of age (see Methods). All samples were collected when infants were free of illness, specifically signs of respiratory infection or fever.

Table 1 Characteristics of the Infant Cohort.

Full size table

RNA was purified for high throughput sequence analysis from the recovered nasal cells as outlined in Fig. 1. RNA yields from the nasal brushings were variable with a mean recovery of 278.5 ± 436.7 ng (Fig. 2). Due to variability in total RNA recovery, cDNA was generated from approximately 1 ng input RNA. As shown in Fig. 2, the samples generated an average of 27 ± 6 million total reads, with 66 ± 14% displaying unique mapping to the human genome. After excluding genes whose expression was not robustly and reliably detected across subjects (see Methods), 13,978 genes or 63 ± 7% of the genome was transcriptionally represented in these samples.

Cell-Specific Marker Gene Analysis

We first interrogated the expression of cell type-specific marker gene expression in this data set, in order to assess the nature and composition of the samples (Fig. 3). We noted high levels of expression (>1,000 normalized counts) for transcripts for PLUNC, the definitive marker of nasal epithelial cells⁶. BPIFA1 (SPLUNC) and BPIFB1 (LPLUNC) were most highly expressed, as compared to their alternate isoforms. We also observed a high level of expression for other general epithelial cell markers, including CDH1 and EPCAM and diverse expression of epithelial-specific mucins (MUC1/2/4/16/20).

We noted expression of some non-epithelial genes, but at lower expression levels (<100 normalized counts), so we thoroughly assessed the data set for evidence of non-resident cells. First, we assessed globin gene transcripts as a measure of red blood cells. The total burden of transcripts encoding any of the globin genes was <0.01%, with 29 (55%) of the subjects having zero reads mapping to any globin gene. These data indicate that bleeding in general, or contamination of samples with blood-derived cells, was not confounding our gene expression assessments. We next assessed the expression of selected lymphocyte and other immune cell markers. Most of these genes (e.g., CD3/8/19, MPO) were expressed at low levels in most subjects. However, the expression of some (e.g., CD4) were reliably detected in many or most subjects, albeit at a relatively low level. Finally, we assessed the expression of a set of immune cell marker genes⁷. These data suggested the presence of a low level of immune cells, primarily granulocytes, mast cells and macrophages (Fig. 3). However, it is clear that some subjects displayed higher levels of expression for leukocyte markers than others.

The Infant Nasal Transcriptome

To comprehensively define the transcriptome of the healthy infant nasal airway, we considered the 13,978 genes detected in our data set and disregarded genes displaying ubiquitous expression across other tissues and samples⁸. This resulted in a set of 7,081 genes whose expression we propose defines the healthy infant nasal transcriptome (HINT). We compared the HINT to the nasal transcriptome previously reported for non-asthmatic control teenage subjects⁴. There was a high degree of similarity between these two transcriptomes (Fig. 4A), with the older subjects displaying more diversity than the infants. Since we believe that interrogation of the nasal transcriptome may provide insight into homeostatic and disease processes occurring in the lung proper, we compared the HINT to previously published data sets describing gene expression in the pulmonary upper⁹ or lower¹⁰ airways. Among the 7,081 expressed genes defining the HINT, 5840 (82%) were expressed in all samples, while 86% were expressed in “teenage” nasal airway and 88% were consistently expressed in the adult (both upper and lower) airway.

We next assessed global HINT expression in an unbiased manner. We found the nasal and upper airway specific epithelial cell markers BPIFA1/SPLUNC1 and BPIFB1/LPLUNC1, were the 11^th and 12^th most highly expressed transcripts based upon mean expression level, more highly expressed than ACTB (rank 20) and GAPDH (rank 92). Among the other most highly expressed transcripts in the HINT were for MUC-16 (rank 35), −1 (rank 48), −4 (rank 73), −20 (rank 76) and two epithelial-specific keratins (KRT 19/rank 53 and KRT8/rank 129). Other highly expressed transcripts included B2M (rank 2) which encodes the antimicrobial b-2-microglobulin, SAT1 (rank 3) which encodes a regulator of intracellular polyamines concentration, FTL (rank 4) which encodes the ferritin light chain and PIGR (rank 5) which encodes the polymeric Ig receptor. Interestingly, MALAT1, which is a long non-coding RNA, was by far (~2-fold greater than B2M) the most highly expressed HINT transcript.

HINT Expression Diversity

We sought to identify genes whose expression displayed particularly low or high variability in the HINT. Shown in Fig. 4B, are the 10 genes with the highest and lowest robust CV. Some of the genes with the highest robust CV had identifiable immune- and/or infection-related functions (e.g., IL1B, CXCR1), while others did not. We performed pathway analysis using the 200 genes identified as having the highest variation and the significant canonical pathways identified are reported in Supplemental Fig. 2. Most prominent among the pathways represented by genes with variability in expression were those related to stress response (EIF2 signaling, Unfolded protein response and Protein ubiquitination) and mitochondrial function (Oxidative phosphorylation and Mitochondrial dysfunction).

Genes with expression mean or variance differences associated with intrinsic (gender, race, etc.) and extrinsic (delivery method, environmental tobacco exposure, etc.) variables were then identified. We detected significant changes in mean expression associated with race, gender, feeding method and delivery method (Table 2). Differences in gene expression variance were also noted for many of these variables (Table 2). For instance, 11 genes demonstrated expression variance associated with environmental tobacco smoke exposure (ETS). Interestingly, all of these 11 genes showed higher variance in subjects exposed to ETS (Fig. 4C).

Table 2 Genes displaying mean expression differences or expression variance differences associated with demographic and subject variables.

Full size table

We also tested nasal turbinates for the presence of viruses or potentially pathogenic bacteria. One or more of these microbes was detected by PCR in 16 of 53 subjects; 11 with M. catarrhalis, 1 with S. pneumoniae, 1 with H. Influenzae, 7 with rhinovirus and 2 with coronavirus. The presence of one of these known pathogens was the variable most profoundly associated with differences in HINT gene expression (Table 2). For instance, many genes displayed significant changes in expression variance associated with the presence of either bacteria (n = 16 genes) or virus (n = 123 genes). Notably, the mean expression of 130 genes was significantly different in subjects with a pathogen when compared to subjects where these pathogens were undetectable. Interestingly, these gene expression changes were more highly associated with the presence of viral pathogens (9 subjects, 282 genes) than bacterial pathogens (12 subjects, 1 gene). Pathway analysis indicated that genes associated with the presence of viral pathogens were involved in tight junction, PI3K/AKT, apoptosis, CD27, lymphotoxin B receptor and GADD45 signaling pathways, as well as DNA double-strand break repair (Supplemental Fig. 3).

These data suggested that nasal gene expression responses to the presence of different pathogenic bacteria (such as H. influenzae or S. pneumoniae) were more diverse than responses to different common respiratory viruses (such as rhinovirus and coronavirus). To explore this issue further, we comprehensively assessed the relationship between the HINT and the presence of all colonizing bacteria in the nasal turbinates. The abundance of various taxonomical units representing pathogenic species and their related genera, was quantified by 16S ribosomal sequencing and correlated with nasal gene expression patterns (Fig. 5). While gene expression patterns were generally not significantly associated with the abundance of Streptococcus (1 gene) or Haemophilus (0 genes), the expression of 1927 genes was significantly associated with the abundance of any Moraxellaceae family. Moreover, the abundance of the pathogenic genus of Moraxella was almost solely associated with nasal gene expression (739 genes), while other non-pathogenic genera were almost never associated with nasal gene expression. Pathway analysis indicated that Moraxella responsive genes were involved in axonal guidance, A- and B- adrenergic, sphongosine−1-phosphate, IL1, CCR5 and IL17F signaling, retinol biosynthesis, regulation of cytokine production and DNA double-strand break repair (Supplemental Fig. 4).

Finally, in an effort to identify the most robust gene expression patterns, we completed a multivariate analysis using linear regression to control for potential confounding factors. This approach had limited power, given the large number of variables available. Therefore, we chose to include only variables strongly associated with gene expression based upon our marginal (univariate) analyses, including gender and the presence of a known pathogenic virus or Moraxella abundance. We also considered two-way interactions among these variables. Following a Benjamini-Hochberg multiple testing procedure to control FDR at a level of 0.05, we identified a total of 49 genes whose expression was significantly associated with any variable (Supplemental Table 1). As might be anticipated, a large proportion of these genes were associated with gender and are located on the sex chromosomes. However, we also identified a number of genes significantly associated with the presence of a known pathogen (rhinovirus, coronavirus or Moraxella), in combination with each other, or in combination with gender.

Gene Expression Validation

We used quantitative real-time PCR (qPCR) to validate RNA-seq-based expression estimates for selected genes displaying significant expression changes. We found a very high degree of correlation between RNA-seq- and qPCR-based estimates for LPLUNC (p < 7 × 10E-10) and SPLUNC (p < 2 × 10E-23). Of 13 additional genes selected for variable-specific expression validation, 11 (85%) demonstrated significant correlation between qPCR and RNA-seq-based estimates. We attempted validation for 5 genes with significant mean differences based upon the presence of any known infant nasal pathogen. Significant differences (p < 0.05) in expression were validated for three of these genes including DNAI2, DNAH5 and CES1, while a fourth showed a trend for validation (OAS1, p = 0.06; Table 3). We also validated 4 of 7 genes (GPI, KRT8, TYROBP and CES1, p < 0.05) correlated with the abundance of Moraxella (Table 4). Finally, we explored 3 genes with expression variation differences associated with the presence of any virus. qPCR data for all three of these genes clearly displayed substantive differences in variance associated with the presence of viral pathogens (Fig. 6).

Table 3 qPCR was used to validate the expression of genes displaying differences in expression between subjects with detectable virus (n = 9) and those without (n = 41).

Full size table

Table 4 qPCR was used to validate genes displaying expression patterns associated with the abundance of genus Moraxella.

Full size table

Discussion

Responses by resident cells within the respiratory system are likely to play a key role in determining the severity of disease, including illnesses related to infectious pathogens. Others have used an assessment of the nasal transcriptome to assess disease-associated changes in airway gene expression in older pediatric or adult subjects^4,5. Although it is highly challenging to sample the lung or lower airways in young subjects, our data demonstrate successful identification of the healthy infant nasal transcriptome (HINT). We also found a high degree of similarity with published results of the nasal and intra-pulmonary airways. Particularly, we found a very high proportion of HINT non-housekeeping genes are also expressed in the upper and lower airways of adults^9,10. Thus, we suggest that this approach is useful to estimate responses that occur in the lower airways of infants.

Our data are consistent with a recent report of the nasal transcriptome in an older population⁴. A direct comparison of our HINT with gene expression from control subjects from that study demonstrated a high degree of similarity of the nasal transcriptome irrespective of age. It should be noted that this comparison of similarity does not take into account variation in expression levels across subjects and airway regions. Clearly, significant differences in gene expression exist between nasal and lower airway epithelial cells. However, our data do support the study of the nasal transcriptome in infants and children as a reasonable surrogate for assessing changes in airway gene expression during diseased states.

We directly assessed variation in detected nasal expression patterns from healthy infants and found sets of genes with either high or low variance. Genes with low variance (EIF5, CANX, TAF7, SEPT7) may represent those under highest selective pressure and may represent targets of key regulatory factors. Genes with high variance (PROK2, IL1B, CXCR1) could implicate factors that may be associated with heterogeneity in resilience versus susceptibility to disease. Interestingly, many genes with inherently high variance were associated with immunity and infection responses, which may be associated with changes in cellular composition. Differences in expression variance were also identified in association with many demographic and host variables, particularly the presence of known respiratory pathogens in these asymptomatic infants. By comparison, most demographic variables were associated with modest differences in mean expression levels.

Comparing gene expression changes in subjects with detectable virus or potentially pathogenic bacteria in the nares, we found greater effect sizes associated with the presence of rhinovirus or coronavirus when compared to any pathogen or bacterial pathogens alone. We suspected these data were likely due to greater variability in gene expression associated with distinct bacteria. Using 16S microbial sequencing of nasal samples from these subjects, we found a strong association between gene expression and the presence of pathogenic Moraxella genera, but not with non-pathogenic Moraxella or other pathogenic bacteria. Furthermore, multivariate analyses identified gene expression patterns significantly associated with the interaction of pathogen and gender. Our results suggest that the presence of bacterial or virus are strongly associated with nasal gene expression, even in asymptomatic infants, but that responses are not identical in males and females.

We recognize some limitations of the current study. Although we were able to recover RNA from a vast majority of the subjects using our detailed protocol (see Supplement), we saw significant variability in yield across subjects. Therefore, we chose to use a commercially available “low input” protocol for RNA sequencing, which enabled capture of data from all subjects. Although it is likely that this approach sacrifices sensitivity for detection of genes expressed at a low level, we did observe very high technical replicate correlation. We also noted sample-wide correlations which were consistent, but somewhat lower than expected (r = 0.730 ± 0.148). We believe this reflects the heterogeneity of individual samples, which are likely representative of multiple (epithelial, as well as some non-epithelial) cell types. Unfortunately, in the current study, we were unable to use cytometry to directly assess the diversity of cells within individual subjects, or among subjects across the cohort. However, sample composition was partially assessed using cell-specific marker gene expression. High levels of nasal-specific and general epithelial cell marker expression were identified. These data also suggested a low level (but not absence) of immune/inflammatory cells, which may be greater without washing of the nares prior to cell recovery. In addition, the consistently absent or low detection of globin transcripts indicated no significant injury from the procedure as well as little or no red blood cell confounding of expression data. These results strongly suggest that this method predominantly assesses nasal epithelial cell gene expression. It is also worth noting that we found differences in sample recovery efficiency based upon the swab used for brushing the nasal turbinates. After some trial and error and consideration of the relationship between aggressive cell recovery and injury to the subjects, we were successful in consistently recovering sufficient RNA for complete transcriptome analysis and expression validation. Alternate brush choices, allowing more aggressive brushing, may be better for older subjects. Finally, our recovery procedures were affected by apparent changes in buoyancy of cells stored in RNA stabilizer. In order to overcome some observed variability in centrifugation-based cell recovery of cells suspended in RNA stabilizer, we ensured 100% recovery by simple centrifugation through a membrane. To overcome potential technical impediments, we strongly recommend following the specific protocol provided, particularly the choice of swab.

Regardless of the limitations, the data reported here demonstrate both the fidelity and feasibility of our methodology. These data also serve as a benchmark for healthy infant airway gene expression, describing normal expression and variation within a healthy infant cohort. Future application of these procedures may facilitate the identification of gene expression patterns that distinguish diseased and normal states. These methods could also provide novel information about potential disease mechanisms and efficacy of therapeutic interventions.

Methods

For complete methods, please refer to Supplemental material.

Cohort Recruitment and Sampling

All procedures were reviewed and approved by the University of Rochester Research Subjects Review Board and the Rochester General Hospital Clinical Investigation Committee and approved by the NIAID Division of Microbiology and Infectious Disease. All studies were carried out in accordance with these approved guidelines. Written informed consent was obtained from all subjects (and their families).

Healthy, full term newborns (≥37 weeks gestation) were recruited at approximately one month. Nasal samples were obtained after washing, essentially as previously described^5,11. For RNA recovery, the swab was immediately placed in RNA stabilizer (RNAprotect, Qiagen, Germantown, MD) and stored at 4 °C. Cells were recovered by filtering through a 0.45 uM membrane filter. Cells were lysed, homogenized by passing through a 28 g needle and total RNA was recovered (AbsolutelyRNA Miniprep kit, Agilent, Santa Clara, CA), according to manufacturer’s instructions.

Library Preparation and Sequencing

Approximately 1ng of total RNA was amplified using the SMARter Ultra Low amplification kit (Clontech, Mountain View, CA). Libraries were constructed using the NexteraXT library kit (Illumina, San Diego, CA). Libraries were sequenced on the Illumina HiSeq2500 to generate ~20 million 1 × 100-bp single end reads per sample.

Read Mapping, Data Normalization and Filtering

Sequences were aligned against human genome version of hg19 using TopHat¹², counted with HTSeq¹³ and normalized for total counts (reads per million, RPM). We used a non-specific filtering strategy to remove genes with low expression values indistinguishable from background signal. These data area available in dbGaP; accession number phs001201.v1.p1.

Gene Significance Analyses

We calculated a robust measure of expression variance for each gene, based upon the relationship between inter-quartile range and median values. For categorical variables such as gender and delivery method, we used SAMseq to identify genes with significant differences in mean expression (FDR < 0.05). For continuous variables such as gestational age and microbiome derived variables, we used both Pearson and Spearman correlation tests, with the Benjamini-Hochberg correction¹⁴, to select significant genes (adjusted p-value < 0.05). To better understand the association between gene expression and clinical/demographic variables, we also used Levene’s test for equal variance¹⁵, with the Benjamini-Hochberg correction, to select genes that have significantly different variances between groups (adjusted p-value < 0.05). We performed canonical pathway analyses using Ingenuity Pathway Analysis software (Qiagen) to identify significant pathways (p-value < 0.05).

Multivariate Regression Analyses

We conducted a multivariate regression analysis between gene expression and sex, race, gestational age, birth weight, environmental smoke exposure, birth delivery method, breast feeding and asymptomatic presence of virus or bacterial pathogen. Multiple linear regression, based on least square fitting criteria, was used to analyze the linear association between clinical/demographic factors (covariates) and gene expression levels (response variable). Goodness-of-fit F-test was applied to test the significance of overall linear association, with FDR controlled at 0.05 level by the Benjamini-Hochberg procedure. If a gene was selected as significant, we performed regression t-test to identify significant covariates (p-value < 0.05) for this gene.

Detection of Pathogenic Virus and Bacteria

TaqMan Array Card (TAC) technology was used to detect for the presence of respiratory viral pathogens and several potentially pathogenic bacteria in the nasal washings of subjects, as described^16,17. The TAC detects respiratory syncytial virus (RSV), rhinovirus, coronavirus (1–4), Haemophilus influenzae and Streptococcus pneumoniae, among others. In addition, the presence of Moraxella catarrhalis was assessed by real-time PCR using published primers and probes, essentially as previously described^18,19.

Microbiome Analysis

Matched nasal swab specimens were collected. V3-V4 16S rRNA was amplified from total genomic DNA and sequenced on an Illumina MiSeq (Illumina, San Diego, CA). 16S rRNA bacterial sequence reads were assessed for quality and analyzed using phylogenetic and Operational Taxonomic Unit (OTU) methods in the Quantitative Insights into Microbial Ecology (QIIME) software, version 1.9 ²⁰. For the purposes of OTU relative abundance analysis, the raw OTU table was normalized using the cumulative sum stabilization method from the metagenomicSeq R package²¹.

Gene Expression Validation

Quantitative real-time polymerase chain reaction (qPCR) was performed using gene-specific primer sets and Taqman chemistry on a ViiA 7 Real-Time PCR System (Life Technologies). Gene expression levels were calculated relative to PPIA (cyclophilin A) using the ddCT method.

Additional Information

How to cite this article: Chu, C.-Y. et al. The Healthy Infant Nasal Transcriptome: A Benchmark Study. Sci. Rep. 6, 33994; doi: 10.1038/srep33994 (2016).

References

Agusti, A., Sobradillo, P. & Celli, B. Addressing the complexity of chronic obstructive pulmonary disease: from phenotypes and biomarkers to scale-free networks, systems biology and P4 medicine. American journal of respiratory and critical care medicine 183, 1129–1137 (2011).
Article Google Scholar
Gauthier, M., Ray, A. & Wenzel, S. E. Evolving Concepts of Asthma. American journal of respiratory and critical care medicine 192, 660–668 (2015).
Article CAS Google Scholar
Woodruff, P. G., Agusti, A., Roche, N., Singh, D. & Martinez, F. J. Current concepts in targeting chronic obstructive pulmonary disease pharmacotherapy: making progress towards personalised management. Lancet 385, 1789–1798 (2015).
Article CAS Google Scholar
Poole, A. et al. Dissecting childhood asthma with nasal transcriptomics distinguishes subphenotypes of disease. The Journal of allergy and clinical immunology 133 (2014).
Zhang, X. et al. Similarities and differences between smoking-related gene expression in nasal and bronchial epithelium. Physiological genomics 41, 1–8 (2010).
Article Google Scholar
Bingle, L. & Bingle, C. D. Distribution of human PLUNC/BPI fold-containing (BPIF) proteins. Biochemical Society transactions 39 (2011).
Murray, P. J. & Wynn, T. A. Protective and pathogenic functions of macrophage subsets. Nat Rev Immunol 11, 723–737 (2011).
Article CAS Google Scholar
Ramskold, D., Wang, E. T., Burge, C. B. & Sandberg, R. An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data. Plos computational biology 5 (2009).
Beane, J. et al. Characterizing the impact of smoking and lung cancer on the airway transcriptome using RNA-Seq. Cancer prevention research 4, 803–817 (2011).
Article CAS ADS Google Scholar
Hackett, N. R. et al. RNA-Seq quantification of the human small airway epithelium transcriptome. BMC genomics 13, 82 (2012).
Article CAS Google Scholar
Hall, C. B. & Douglas, R. G., Jr. Clinically useful method for the isolation of respiratory syncytial virus. The Journal of infectious diseases 131, 1–5 (1975).
Article CAS Google Scholar
Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome biology 14, R36 (2013).
Article Google Scholar
Anders, S., Pyl, P. T. & Huber, W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
Article CAS Google Scholar
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 289–300 (1995).
Levene, H. Robust tests for equality of variances1. Contributions to probability and statistics: Essays in honor of Harold Hotelling 2, 278–292 (1960).
Google Scholar
Harvey, J. J. et al. Comparative analytical evaluation of the respiratory TaqMan Array Card with real-time PCR and commercial multi-pathogen assays. Journal of Virological Methods 228, 151–157 (2016).
Article CAS Google Scholar
Kodani, M. et al. Application of TaqMan low-density arrays for simultaneous detection of multiple respiratory pathogens. Journal of clinical microbiology 49, 2175–2182 (2011).
Article Google Scholar
Desai, H. et al. Bacterial colonization increases daily symptoms in patients with chronic obstructive pulmonary disease. Annals of the American Thoracic Society 11, 303–309 (2014).
Article Google Scholar
Greiner, O., Day, P. J., Altwegg, M. & Nadal, D. Quantitative detection of Moraxella catarrhalis in nasopharyngeal secretions by real-time PCR. Journal of clinical microbiology 41, 1386–1390 (2003).
Article CAS Google Scholar
Caporaso, J. G. et al. QIIME allows analysis of high-throughput community sequencing data. Nature methods 7, 335–336 (2010).
Article CAS Google Scholar
Paulson, J. N., Stine, O. C., Bravo, H. C. & Pop, M. Differential abundance analysis for microbial marker-gene surveys. Nature methods 10, 1200–1202 (2013).
Article CAS Google Scholar

Download references

Acknowledgements

We thank the following individuals for their expert contributions to the studies described in this manuscript; Christopher Slaunwhite, Jami McGrath, Mary Criddle, Lisa Denmark, Lynne Shelley, John Ashton, Ann Gill, Amy Murphy, Melissa Bowman, Mary Ann Formica, Ken Schnabel, Jennifer Carnahan, Valerie Lunger and Ashley Lopez. We also wish to acknowledge the URMC Genomics Research Center for their expert support. This project has been funded in whole or in part with Federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN272201200005C.

Author information

Chu Chin-Yi and Qiu Xing contributed equally to this work.

Authors and Affiliations

Division of Neonatology and Pediatric Molecular and Personalized Medicine Program, University of Rochester Medical Center, Rochester, NY, USA
Chin-Yi Chu, Soumyaroop Bhattacharya & Thomas J. Mariani
Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
Xing Qiu, Lu Wang, Anthony Corbett & Jeanne Holden-Wiltse
Department of Medicine, University of Rochester Medical Center, Rochester, NY, USA
Gerry Lofthus & Thomas J. Mariani
Department of Microbiology and Immunology and University of Rochester Medical Center, Rochester, NY, USA
Alex Grier & Steven R. Gill
Division of Pediatric Infectious Diseases, University of Rochester Medical Center, Rochester, NY, USA
Brenda Tesini & Mary T. Caserta
Department of Medicine, Rochester General Hospital University of Rochester Medical Center, Rochester, NY, USA
Ann R. Falsey & Edward E. Walsh

Authors

Chin-Yi Chu
View author publications
You can also search for this author in PubMed Google Scholar
Xing Qiu
View author publications
You can also search for this author in PubMed Google Scholar
Lu Wang
View author publications
You can also search for this author in PubMed Google Scholar
Soumyaroop Bhattacharya
View author publications
You can also search for this author in PubMed Google Scholar
Gerry Lofthus
View author publications
You can also search for this author in PubMed Google Scholar
Anthony Corbett
View author publications
You can also search for this author in PubMed Google Scholar
Jeanne Holden-Wiltse
View author publications
You can also search for this author in PubMed Google Scholar
Alex Grier
View author publications
You can also search for this author in PubMed Google Scholar
Brenda Tesini
View author publications
You can also search for this author in PubMed Google Scholar
Steven R. Gill
View author publications
You can also search for this author in PubMed Google Scholar
Ann R. Falsey
View author publications
You can also search for this author in PubMed Google Scholar
Mary T. Caserta
View author publications
You can also search for this author in PubMed Google Scholar
Edward E. Walsh
View author publications
You can also search for this author in PubMed Google Scholar
Thomas J. Mariani
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

T.J.M., E.E.W. and M.T.C. conceptualized the study. T.J.M., E.E.W., M.T.C., S.R.G. and C.-Y.C. designed the experiments. E.E.W., M.T.C., A.F., G.L. and B.T. developed the cohort and collected the specimens and clinical data. J.H.-W. and A.C. facilitated data organization, management and analysis. T.J.M., M.T.C., E.E.W., X.Q., S.R.G., C.-Y.C., L.W., J.H.W., S.B., A.C. and A.G. generated, analyzed and interpreted the data. T.M., E.E.W., M.T.C., X.Q., C.-Y.C., S.R.G., A.G. and J.H.-W. wrote and/or revised the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Chu, CY., Qiu, X., Wang, L. et al. The Healthy Infant Nasal Transcriptome: A Benchmark Study. Sci Rep 6, 33994 (2016). https://doi.org/10.1038/srep33994

Download citation

Received: 17 May 2016
Accepted: 06 September 2016
Published: 23 September 2016
DOI: https://doi.org/10.1038/srep33994

This article is cited by

Airway gene-expression classifiers for respiratory syncytial virus (RSV) disease severity in infants
- Lu Wang
- Chin-Yi Chu
- Xing Qiu
BMC Medical Genomics (2021)
A novel in vitro model of primary human pediatric lung epithelial cells
- Qian Wang
- Soumyaroop Bhattacharya
- Thomas J. Mariani
Pediatric Research (2020)
CX3CR1 as a respiratory syncytial virus receptor in pediatric human lung
- Christopher S. Anderson
- Chin-Yi Chu
- Thomas J. Mariani
Pediatric Research (2020)
A method of processing nasopharyngeal swabs to enable multiple testing
- Santiago M. C. Lopez
- Judith M. Martin
- Nader Shaikh
Pediatric Research (2019)
Microbiome-Transcriptome Interactions Related to Severity of Respiratory Syncytial Virus Infection
- Abhijeet R. Sonawane
- Liang Tian
- Yang-Yu Liu
Scientific Reports (2019)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Subject Demographics and RNA-Seq Data Generation

Cell-Specific Marker Gene Analysis

The Infant Nasal Transcriptome

HINT Expression Diversity

Gene Expression Validation

Discussion

Methods

Cohort Recruitment and Sampling

Library Preparation and Sequencing

Read Mapping, Data Normalization and Filtering

Gene Significance Analyses

Multivariate Regression Analyses

Detection of Pathogenic Virus and Bacteria

Microbiome Analysis

Gene Expression Validation

Additional Information

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Ethics declarations

Competing interests

Electronic supplementary material

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links