Main

Pulmonary surfactant is a phospholipid and protein complex that is synthesized by alveolar type II cells and maintains alveolar expansion at end expiration. Developmentally regulated deficiency of surfactant due to immaturity of alveolar type II cells in prematurely born infants disrupts fetal-neonatal pulmonary transition by causing alveolar collapse at end expiration and neonatal respiratory distress syndrome (RDS) (1). Neonatal RDS may also result from genetic disruption of pulmonary surfactant metabolism as suggested by heritability estimates (0.2–0.8 in twin studies), by rare, lethal mutations in genes encoding surfactant protein-B (SFTPB, gene ID 6439, MIM 178640), surfactant protein-C (SFTPC, gene ID 6440, MIM 178620), and the ATP-binding cassette, subfamily A, member 3 (ABCA3, gene ID 27410, MIM 601615), and by genotyping studies of common, nonsynonymous variants in SFTPB and SFTPC (2–8). Furthermore, recent studies report an increased risk of neonatal RDS likely attributable to surfactant deficiency in late preterm infants (≥34 weeks gestation) (9–11).

SFTPC encodes surfactant protein-C (SP-C), a lung-specific, extremely hydrophobic peptide that spans the phospholipid bilayer of pulmonary surfactant and contributes to maintenance of alveolar expansion at end expiration (12). SFTPC is genomically small (3.5 kb with a 3.7-kb promoter), located on human chromosome 8, and directs the synthesis of an alternatively spliced 191 or 197 amino acid proprotein (proSP-C) that undergoes sequential proteolytic cleavages to yield the 35-amino acid mature SP-C peptide (12,13). Studies performed on fetal human lung tissue demonstrate developmental regulation of SFTPC, with mRNA detected by 13–15 weeks gestation (14,15) and proSP-C protein by 12–16 weeks (16); SP-C expression increases with advancing gestational age to approximately 15% of adult levels by 24 weeks gestation (15).

Dominantly expressed, rare, exonic mutations in SFTPC cause respiratory dysfunction of varying severity and age of onset among infants, children, and adults, which is thought to result from aggregation of misfolded or misrouted proSP-C peptides that exceed the capacity of cell stress response pathways to maintain cellular homeostasis (6,17–20). Furthermore, two common nonsynonymous variants that encode an asparagine for threonine substitution at codon 138 (p.T138N, rs4715) and an asparagine for serine substitution at codon 186 (p.S186N, rs1124) have been statistically associated with RDS among premature infants <34 weeks gestation (21). However, no comprehensive sequence analyses of the contribution of rare variants in SFTPC to RDS have been performed.

To investigate the contribution of rare and/or noncoding region variants in SFTPC to neonatal RDS in term or late preterm infants, we used complete resequencing or genotyping of SFTPC in newborns ≥34 weeks gestation with and without RDS to identify variants statistically associated with neonatal RDS, in silico evaluation of transcriptional function, and transfection of a murine pulmonary epithelial cell line to confirm functional significance of statistically associated variants.

MATERIALS AND METHODS

Case-control study.

We recruited 184 consecutive term and late preterm newborn infants (≥34 weeks gestation) with RDS and, separately, a control group of 354 infants without respiratory symptoms who were referred to the Division of Newborn Medicine at St. Louis Children's Hospital for clinical care (8) (Table 1). To standardize case phenotype, RDS was defined as the need for supplemental oxygen (FiO2 ≥0.3) for >24 h, a chest radiograph consistent with neonatal RDS, and the need for continuous positive airway pressure or mechanical ventilation within the first 48 h of life. Controls (CON) were term or late preterm infants without respiratory symptoms. Gestational age for each infant was assigned based on best obstetrical estimate. We excluded infants with congenital anomalies that could contribute to respiratory distress, known genetic causes of respiratory insufficiency (e.g. SFTPB or ABCA3 deficiency), culture positive sepsis, chromosomal anomalies, late onset RDS (>48 h), and transient tachypnea of the newborn (TTN) that resolved within 24 h of life. A single twin from each pair of monochorionic twins was included. The designation of RDS or control status was assigned before any genetic studies and without knowledge of SFTPC genotype. We obtained informed consent from parents of all infants.

Table 1 Characteristics of case-control group (n = 538)

We performed complete SFTPC resequencing, including the promoter and intervening introns, for 269 infants (92 RDS, 177 CON). Interim analyses revealed overrepresentation in the cases of the minor allele at three SFTPC promoter sites, so we genotyped these three variants in an additional 269 infants (92 RDS, 177 CON) using 5′ nuclease assays. This study was approved by the Human Research Protection Office of Washington University School of Medicine.

DNA isolation, sequencing, and genotyping.

We isolated DNA from blood using the Puregene DNA isolation kit (Qiagen). We bidirectionally sequenced all translated exons, introns, and the promoter region of SFTPC (6 kb total) from 269 infants as previously described (22). We used Phred, Phrap, PolyPhred, and Consed (http://www.phrap.org/phredphrapconsed.html) to identify and annotate single nucleotide polymorphisms (SNPs) in sequencing chromatograms and Prettybase (http://pga.mbt.washington.edu) to extract a final file with genotypes. We obtained the human SFTPC reference sequence from Alamut (version 1.5; Interactive Biosoftware, Inc., Rouen, France). We used 5′ nuclease assays (Taqman, Applied Biosystems) and the ABI 7500 FAST Real Time PCR system to genotype the three promoter sites identified in interim analyses for an additional 269 infants.

Statistical methods.

After anonymously linking individual genotype and phenotype information, we used SAS (version 9.1.3; SAS Institute, Cary, NC) to perform interim analyses using race-stratified Fisher's exact test to determine the association of SFTPC variants with risk of RDS. For our final analyses, we performed race-stratified logistic regression controlling for the effects of estimated gestational age and gender (23) to determine the association of the identified promoter variants with risk for neonatal RDS. To control for inflated type 1 error because of multiple association tests required for each of the SFTPC variants (n = 80), we used a modified false discovery rate (FDR) approach (24) based on the number of effective tests. Specifically, we considered the gene variants within one haplotype block as one effective test, and we determined an adjusted p value using the formula: p = 0.05/(1 + 1/2 + 1/3 + 1/4 + ··· + 1/n), where n is the number of effective tests. Based on this calculation, we determined a p value of 0.01 as a criterion for statistical significance.
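The adjusted threshold above is the nominal alpha of 0.05 divided by a harmonic sum over the number of effective tests. The following is a minimal Python sketch of that arithmetic only; the function names are illustrative, and the value of n used in the usage line is hypothetical because the number of effective tests is not stated in the text.

```python
from fractions import Fraction


def harmonic_sum(n_effective_tests: int) -> Fraction:
    """Return 1 + 1/2 + 1/3 + ... + 1/n as an exact fraction."""
    if n_effective_tests < 1:
        raise ValueError("n_effective_tests must be at least 1")
    return sum(Fraction(1, i) for i in range(1, n_effective_tests + 1))


def adjusted_threshold(n_effective_tests: int, alpha: float = 0.05) -> float:
    """Return alpha / (1 + 1/2 + ... + 1/n), the adjusted p value criterion."""
    return alpha / float(harmonic_sum(n_effective_tests))


if __name__ == "__main__":
    # Hypothetical n, chosen only to illustrate the calculation.
    n = 10
    print(f"n = {n}: adjusted threshold = {adjusted_threshold(n):.4f}")
```

Because the harmonic sum grows only logarithmically with n, the threshold falls slowly as the number of effective tests increases, which makes this correction less severe than dividing alpha by n.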

In silico functional analysis of SNPs.

To identify transcription factor binding motifs and determine the predicted functional significance of promoter region variants, we used TFSEARCH95 (www.cbrc.jp), an application that predicts transcription factor consensus sites based on a weighted matrix.

In vitro functional analysis of SNPs.

We created the reporter plasmid SFTPC_WT_luc by subcloning the 3.7-kb human SFTPC promoter (kind gift of J. Whitsett) into a HindIII site of the firefly luciferase reporter vector pGL4.10 (luc2; Promega, Madison, WI). We generated the three SFTPC promoter variants associated with RDS (SFTPC_1167G_luc, SFTPC_1647A_luc, and SFTPC_2385T_luc) and a combination of variants in high linkage disequilibrium (LD; SFTPC_1647_2385) by site-directed mutagenesis (Stratagene, La Jolla, CA) and confirmed all construct sequences by DNA sequencing. To evaluate the effect of each promoter variant or combination on transcription, we used Fugene 6 (Roche Applied Science, Indianapolis, IN) to transiently transfect MLE-15 cells (kind gift of J. Whitsett), an immortalized mouse lung epithelial cell line with functional and morphologic characteristics similar to pulmonary type II epithelial cells (25,26). Twenty-four hours after seeding MLE-15 cells (1.5 × 10⁵ cells/well in 12-well plates) in HITES medium (RPMI 1640 medium with 2% FBS, supplemented with 5 μg/mL insulin, 10 μg/mL transferrin, 5 ng/mL sodium selenite, 10 nM β-estradiol, 10 nM hydrocortisone, 2 mM L-glutamine, 10 mM HEPES, 100 U/mL penicillin, and 100 μg/mL streptomycin), we cotransfected each SFTPC reporter construct with pGL4.74 (hRluc-TK) vector (Promega) with six replicates for each construct. After 48 h, we lysed the cells and measured luciferase activities using the dual-luciferase reporter assay system (Promega). We normalized relative light units of firefly luciferase activity to Renilla luciferase activity and performed all transfections three times. We compared normalized luciferase activity between the wild-type promoter and each construct using paired t tests.
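The normalization and comparison steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the readings are invented, the function names are hypothetical, and pairing wild-type and variant wells by replicate index is an assumption about how the paired t test was set up. Only the t statistic is computed; obtaining a p value requires the t distribution, for example from a statistics library routine such as `scipy.stats.ttest_rel`.

```python
from math import sqrt
from statistics import mean, stdev


def normalize(firefly, renilla):
    """Divide each firefly reading by the Renilla reading from the same well."""
    return [f / r for f, r in zip(firefly, renilla)]


def percent_reduction(wild_type_norm, variant_norm):
    """Percent decrease of the mean variant signal relative to wild type."""
    return 100.0 * (1.0 - mean(variant_norm) / mean(wild_type_norm))


def paired_t_statistic(a, b):
    """Paired t statistic: mean difference over its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


if __name__ == "__main__":
    # Invented relative light units for six replicate wells per construct.
    wt_firefly = [9800, 10100, 10400, 9900, 10250, 10050]
    wt_renilla = [5000, 5100, 5200, 4950, 5150, 5050]
    var_firefly = [4100, 4300, 4250, 4050, 4400, 4200]
    var_renilla = [5050, 5150, 5100, 4900, 5200, 5000]

    wt = normalize(wt_firefly, wt_renilla)
    var = normalize(var_firefly, var_renilla)
    print(f"reduction: {percent_reduction(wt, var):.1f}%")
    print(f"paired t (df = {len(wt) - 1}): {paired_t_statistic(wt, var):.2f}")
```

Dividing by the cotransfected Renilla signal corrects each well for differences in transfection efficiency and cell number before constructs are compared.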

RESULTS

SFTPC variants associated with neonatal RDS.

We identified 80 variants within SFTPC and its promoter with complete resequencing of 269 infants (Table S1, http://links.lww.com/PDR/A61). Analyses using Fisher's exact test revealed overrepresentation in cases of the minor alleles at genomic positions g.−1647(A) and g.−2385(T) among infants of European descent with RDS and g.−1167(G) among infants of African descent with RDS (Table 2). The alleles at g.−1647 and g.−2385 are in strong LD (r² = 0.75 for African descent infants, r² = 0.84 for European descent infants). Race-stratified logistic regression models that included gestational age and gender (23) performed on the final cohort of 538 infants revealed overrepresentation of the minor alleles of all three promoter variants (g.−1167, g.−1647, and g.−2385) among infants of European descent with RDS (Table 3). There was a trend toward overrepresentation of the minor allele of g.−1167(G) among infants of African descent with RDS; however, this trend did not achieve statistical significance. The variants at g.−1647 and g.−2385 were not associated with neonatal RDS in the African descent infants. In addition, no deleterious variants in SFTPB or ABCA3 were identified in any of the RDS infants.
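For readers unfamiliar with the LD statistic reported above, the squared correlation between alleles at two biallelic sites can be computed from haplotype counts as sketched below in Python. This is a generic textbook calculation, not the method used in the study: it assumes phased haplotype counts are available, whereas unphased genotype data would first require haplotype frequency estimation. The function name and the example counts are hypothetical.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """Squared allelic correlation between two biallelic sites.

    Arguments are counts of the four haplotypes, where A/a are the alleles
    at the first site and B/b are the alleles at the second site.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    d = p_AB - p_A * p_B  # disequilibrium coefficient D
    return d * d / denominator


if __name__ == "__main__":
    # Invented counts: the two minor alleles mostly travel together.
    print(f"r^2 = {ld_r_squared(n_AB=18, n_Ab=2, n_aB=3, n_ab=377):.2f}")
```

Values near 1 indicate that the two minor alleles are almost always inherited together, which is why association tests at the two sites are not independent.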

Table 2 Fisher's exact test: genotype association of SFTPC variants with RDS
Table 3 Logistic regression: genotype association of SFTPC variants with RDS

Two promoter region variants are within transcription factor binding sites.

By using TFSEARCH95, we found that the minor allele at g.−1167 disrupts a SOX (SRY-related high mobility group box) consensus motif and introduces a GATA-1 site, at g.−2385 removes a myeloid zinc finger (MZF-1) binding site, and at g.−1647 removes a potential methylation site. These findings suggest that two of the three statistically identified SFTPC promoter variants could regulate transcription by altering transcription factor binding and the third by altering methylation.

Promoter region variants decrease transcription.

Each of the three SFTPC variants significantly decreased transcription of the SFTPC promoter as measured by decreased luciferase reporter activity relative to the wild type construct (Fig. 1). The construct for g.−1167 reduced SFTPC promoter transcription by approximately 59% (p = 1.7 × 10⁻⁶), g.−1647 by approximately 19% (p = 0.005), and g.−2385 by approximately 13% (p = 0.03). The construct with the two variants in high LD (g.−1647 + g.−2385) reduced transcription by approximately 56% (p = 0.003). We transfected each promoter construct three times with similar findings. These findings suggest that the minor alleles at each of the three statistically identified SFTPC promoter variants and the combination of g.−1647 and g.−2385 reduce transcription.

Figure 1

Three individual promoter variants (SFTPC_1167G, SFTPC_1647A, and SFTPC_2385T) and the combination, SFTPC_1647A_2385T, decrease transcriptional activity of the human SFTPC promoter relative to the wild-type promoter. Data are mean ± SD from one representative experiment (n = 6 replicates per construct). The experiments were performed three times with similar results. *p = 1.7 × 10⁻⁶; **p = 0.0045; †p = 0.030; ‡p = 0.0026.

DISCUSSION

We found that three SFTPC promoter SNPs were statistically overrepresented among infants of European descent with RDS, were predicted by in silico testing to impact transcription, and functionally decreased transcription in a murine alveolar epithelial cell line. In contrast to exonic SFTPC mutations that result in pulmonary disease in children and adults due to protein misfolding (6,17,18), these findings suggest that a proportion of the risk for neonatal RDS is attributable to transcriptional regulation of SFTPC. Population-based analyses of these variants are underway to determine the attributable risk of these variants to neonatal RDS.

In the absence of an accompanying change in protein structure, mutations in the regulatory regions of genes may be overlooked as a mechanism for phenotypic variation and generally require functional analysis of gene expression, such as transient transfection. The phenotypic variation associated with these cis-regulatory mutations is presumed to be caused by changes in transcript abundance. In a review of 107 human genes with identified cis-regulatory polymorphisms, 63% demonstrated allelic differences in expression of 2-fold or greater with transient transfection (27). For example, two common variants within the promoter of the SFTPB gene have been associated with decreased gene transcription. The g.−18 C>A variant is associated with decreased SP-1 binding and lower concentrations of SP-B in the bronchoalveolar lavage fluid of healthy adult volunteers (28). The g.−384 G>A variant impairs binding of TTF (thyroid transcription factor)-1 and reduces transcriptional activity in H441 cells (29).

Similar to exonic SFTPC mutations, these promoter variants appear to act in a dominant fashion, as the presence of a single copy of the minor or risk allele increases the risk for RDS among European descent infants and was sufficient to decrease gene expression in vitro. A single infant was homozygous for the rare variant at g.−2385; otherwise, all individuals were heterozygous. Cis-regulatory mutations may act in a codominant fashion due to allele-specific transcript abundance, whereas mutations in the coding sequence are more likely to act in a recessive fashion. Because of the nature of transcriptional regulation, the effect of cis-regulatory mutations may be limited temporally or spatially depending on the specific regulatory site affected (30). For these three promoter variants in SFTPC, the context-dependent regulation of gene expression may have important implications for the developmentally susceptible infant who requires a threshold of SFTPC expression for successful fetal-neonatal pulmonary transition. Although full-term mice with genetically abrogated SFTPC expression are viable at birth, have normal pulmonary function, exhibit normal lung morphology, and have similar rates of surfactant synthesis, secretion, and pool size, their pulmonary surfactant is functionally deficient due to reduced hysteresivity and instability at low bubble volumes (31). When these mice are subjected to a significant pulmonary insult (e.g. intratracheal bleomycin), they exhibit greater inflammation, increased collagen accumulation, and delayed recovery when compared with wild-type mice (32). Surfactant from SP-B+/− SP-C+/− mice exhibits microbubble stability by bubble surfactometer when compared with surfactant from SP-B+/− SP-C−/− mice, indicating that SP-C plays a role in stabilization of the phospholipid film (33).

To our knowledge, the only other study to evaluate a statistical association of SFTPC variants and neonatal RDS used genotyping strategies and found that two common nonsynonymous SNPs, c.413C>A (p.T138N, rs4715, minor allele frequency approximately 0.20) and c.557G>A (p.S186N, rs1124, minor allele frequency approximately 0.22) were overrepresented among Finnish preterm infants with RDS (21). We did not find these associations in our study, perhaps because of the differences in mean gestational age [38.2 weeks (United States) versus 28.6 weeks (Finnish)] or differences in ethnicities between the two populations. However, the range of gestational ages in our cohort was more limited than that of the Finnish cohort, and these polymorphisms may be more influential during earlier periods of development.

When we performed race-stratified analyses, both gender and gestational age were incorporated as covariates into the final regression model. Gender was a significant predictor for RDS risk among African descent infants but not for European descent infants in our study; estimated gestational age was an independent predictor of RDS risk for both races. We also considered the possibility of an inflated type I error rate and used an FDR approach to correct for multiple tests. The promoter variants that we found to be statistically associated with RDS among infants of European descent approximated the FDR p value and were further validated in a transient transfection system to decrease transcription of a reporter gene. It is also possible that the African descent cohort may have been underpowered to detect associations with the three promoter SNPs identified in European descent infants because of the limited number of African descent infants with RDS. This disparity reflects what we observe clinically as there are fewer infants of African descent with RDS (34), and our study enrolled infants consecutively. Finally, the frequencies of the three transcriptionally active promoter SNPs were too low to provide statistical power to assess associations with disease severity.

Taken together, these data suggest that SFTPC promoter variants increase the risk for neonatal RDS in late preterm and term infants by reducing SFTPC transcription. This combined statistical, in silico, and in vitro approach suggests that reduced SFTPC transcription contributes to the genetic risk for neonatal RDS in developmentally susceptible infants.