Single cell RNA analysis identifies cellular heterogeneity and adaptive responses of the lung at birth

The respiratory system undergoes a diversity of structural, biochemical, and functional changes necessary for adaptation to air breathing at birth. To identify the heterogeneity of pulmonary cell types and dynamic changes in gene expression mediating adaptation to respiration, here we perform single cell RNA analyses of mouse lung on postnatal day 1. Using an iterative cell type identification strategy we unbiasedly identify the heterogeneity of murine pulmonary cell types. We identify distinct populations of epithelial, endothelial, mesenchymal, and immune cells, each containing distinct subpopulations. Furthermore we compare temporal changes in RNA expression patterns before and after birth to identify signaling pathways selectively activated in specific pulmonary cell types, including activation of cell stress and the unfolded protein response during perinatal adaptation of the lung. The present data provide a single cell view of the adaptation to air breathing after birth.

conclusions ends often with speculations without really demonstrating them clearly. While the amount of information produced is vaste, the demonstration falls always short, some information are always missing or not completely convincing (due to lack of validation). With improved validations and clearer integration of the single cell data and time course bulk data, this work could be a very useful resource in the field.
Specific comments: Figure 1, 2: single cell analysis appears appropriate (Figures and supplemental figures 2-7). Results looks really good. Nice and clear analysis. Figure 2: It would be nice to see the pseudotemporal ordering of the single cells along each differentiation lineage (AT1, AT2 and ciliated, club). This would complement Figure 2a, 2b, 2c and 2d. It would be also nice to show immunofluorescence (or FISH or FAC S) for the Sox2hi, ciliated and club cell on section (similar to figure 2e). It would validate the markers identified (see Villani et al, Science, 2017 for validation of newly identified cell populations). Figure 3: Validation of the identified markers using immunofluorescence or FISH would strengthen the findings and confirmed the methods for this tissue. Figure 4: Validation of the identified markers and cell populations (MatrixFB-1 and MatrixFB-2) using immune fluorescence or FISH or FAC S experiments would strengthen the findings and confirmed the methods for this tissue. The authors state an hypothesis but, unfortunately, do not try to demonstrate that Tbox genes regulates matrixFB-1 differentiation. Similarly, IGF-1 is predicted to drive the differentiation of MatrixFB-2 cells. To confirm these predictions and the validity of the method, a demonstration would be required. The Pericyte-1 and Pericyte-2 cell populations were not validated by an orthogonal method. Immunofluorescence and/or FAC S analysis methods should be used to validate the level of expression Acta2 and/or validate Pdgrfrb, Notch3, Mcam, C spg4 and Acta2 markers.
Smooth muscle/myofibroblasts cells were also identified without orthogonal validation using immunofluorescence (or FISH or FAC S) on the identified markers. The authors identified 5 different sub-population of immune cells. Again, it would be necessary to to validate the findings with an orthogonal method. For example, markers of proliferation would demonstrate the production of proliferative lymphocytes. C ross-validation and comparison of Drop-Seq and Fluidigm C 1: The authors demonstrate that C 1 data and Drop-Seq data are consistent and the results are reproducible independently of the platform used. This also validate the method used by the authors and is not really an orthogonal method to validate the pathways, gene and cell populations discovered. They conclude the paragraph with a known trade-offs between these 2 platforms. I don't think that the data presented in Figure 5 is really driving the conclusion led by the authors on the cell number/number of genes detected trade-off. More thorough studies have been done in the past to really compare the single cell platforms (see for example: Ziegenhain et al, Mol C ell, 2017;Svensson et al., 2017). This piece of information could easily be relegated to a supplementary figure. Dynamic regulation of genes and pathways at birth ( Figure 6): it is not clear what point this figure is trying to make. It would have been interesting to compare the pseudo-bulk from the single cell experiment PND1 and the bulk RNA-seq at the same time to show any concordance. In addition, the authors do not go back to the single cell data to show that these genes are seen in particular cell population (and associated function). It would have been interesting to combine the single cell results in the context of a broader time course that the bulk experiment brings. The next section seemed to be build on the integration of the single cell and the time course bulk data but it is not very clearly explained and the figures are not showing the integration clearly. Activation of the unfolded portion response pathway at birth (Figure 7, 8): Figure7a is more information than real data and it would probably be better placed in supplemental. It would also be useful to highlight in the figure the studied gene (ATF4/6, C HOP, SYVN1, … for example). Figure  7b makes an excellent and clear point. Would it possible to add C HOP in this figure since the authors are studying it the Figure 8. In figure 8a, the western blot appears to be slightly overexposed. Where the quantification done on this blot? The authors would benefit to measure the band intensity on low exposure blot (i.e. not saturated signal) to prevent signal saturation and maximize the dynamic range. It would be helpful to have an immunofluorescence to show the co-localization of markers of specific cell population on section to better demonstrate that ATF6, and not ATF4, is important for ER-stress response. In Figure 7b, ATF4 and ATF6 have very similar pattern. It would be interesting to see ATF4 as well on the western blot. The pattern 47 (RNA-seq) only suggest that ATF6 is upregulated at PND1. Next, the authors look at C HOP and Synv1. It would add to their demonstration to show on section the co-expression of ATF6, C HOP and Synv1 on sections. Pdia3 is also regulated at PDN1 and shown to be expressed in Lymphatic endothelial cells, ciliated, broncho-pulmonary epithelial cells. The immunofluorescence shows that Pdia3 is co-localized with AT2 marker Abca3. Yet, the single cell result shows that Pdia3 is not expressed in AT2 cells only in AT1/AT2 cells. C ould the authors reconcile these 2 results? Surfactant production was shown to be up-regulated at PDN1 in the bulk RNA-seq results. It would be interesting to show also the change in expression at the protein level of different surfactant proteins. It has been shown that the ER control is important for the biosynthesis of surfactant proteins. One can wonder whether the activation of ERAD would have a direct effect on the proper biosynthesis of surfactant proteins? ERAD activation would be needed to respond to the increase production of surfactant proteins?
C omments for the methods section: In results, the cell preparation for Drop-seq and Fluidigm C 1 experiments are not described in details. These methods are significantly different in preparation and would need to be explained in better details such that it explains the dissociation methods and final concentration used to load the single cell instrument . In addition, Drop-seq sometimes required adjustment of settings and it would be very valuable to details the setting used in the experiments such as flow rates used for this experiment.
Reviewer #4: Remarks to the Author: This manuscript provides the first data set of single cell RNA Seq of postnatal lung cell diversity. This is an important resource for developmental biologists, as well as those interested in regeneration and disease.
1. The authors correlate their findings with current literature to provide helpful comparisons. They point out how their data are consistent with recent findings. How are the data contrasting? A few examples of how postnatal lung is not the same as adult or as stem cells would provide more perspective, not just knowing how things all match up.
2. AT2/AT1 progenitors are described. What about possible "BASC s" or cells that express markers of C lub cells and alveolar cells?
3. It is noted in the discussion that the data showed potential pathways/genes that control differentiation (of endothelial cells). This can't be revealed with only the single age analyzed. More caution should be used.
4. Is all the data from single cell (Drop or Fluidgm) or is some from bulk RNA seq? 5. The authors created a public website for others to use. When I tried the site, it wa not possible to access it.

Reviewer #1 (Remarks to the Author):
MAJOR COMMENTS: 1) In some cases, molecularly distinct sub-classes are proposed to be functionally distinct subtypes (e.g., FB1 and FB2 or Per1 and Per2) while in other cases they are proposed to be the same cell type at distinct stages of maturation (e.g., BP, Sox2-Hi, MyoFB). While these interpretations have been well-established by prior work for BPs (and somewhat for FBs), they are less wellsupported by prior knowledge for the other 3 instances. The authors should provide further evidence from the literature or their own data to support the presence of two distinct pericyte sub-types. Similarly, the rationale for why they believe the MyoFB classes to represent early versus late (rather than 2 distinct sub-types) is not clearly explained. The most surprising is the proposal that Sox2-Hi cells are the progenitors for Club and ciliated cells. This Sox2-Hi population expressed markers of both Club and ciliated cells, yet prior work in the field suggests that ciliated cells differentiate much earlier in development, and at least one lineage study using CCSP-Cre has reported that marked cells exclusively give rise to Club cells. (ciliated --> club?) Furthermore, Sox2+ airway cells that are neither Club nor ciliated cells exist in mature mouse airways and generate basal cell-like cells that become active following H1N1 influenza. In light of these previous studies, the authors' reliance on co-expression of Club and ciliated markers along with entropy analysis does not seem sufficient to conclude that the Sox2-Hi class are bipotent progenitors of Club and ciliated cells.

Author's Responses:
a. Validate pericyte-1 and pericyte-2 We applied an unsupervised and unbiased approach to identify major and sub-clusters. Cell type mapping is based on the significance and overlaps of sub-type representative signature genes with RNAs selectively expressed in a given cell type based on experimental or other highthroughput screens. In the case of pericyte 1 & 2, both expressed multiple pericyte selective markers such as Pdgfrb, Notch3, Mcam, and Cspg4. The distinction between Pericyte-1 and Pericyte-2 is that one expresses high levels of Acta2, another pericyte marker (also a marker for smooth muscle), while the other one expresses low levels of Acta2 RNA. In addition, Pericyte-1 cells selectively expressed Map3k7cl and Mustn1, while Pericyte-2 cells selectively expressed Agtr1a, Vsnl1, and Art3. As suggested, we performed immunofluorescence staining for CSPG4, PDGFR and ACTA2 and demonstrated the existence of distinct PDGFR + /CSPG4 + and PDGFR + /CSPG4 + /ACTA2 + pericyte cells in mouse lung at PND1 (now added to Supplementary Fig. 14).
b. Why the MyoFB classes represent early versus late (rather than 2 distinct sub-types)?
We appreciate the reviewer's comment. We termed "MyoFB early" and "MyoFB late" since both express MyoFB representative genes. "MyoFB early" also enriched in stem cell markers such as Pdgfra and Ednrb. Since we do not have clear evidence that these are distinct "cell types" or related cells that are in a transitional stage, we now changed subtype terms as MyoFB-1 and MyoFB-2.
c. Are Sox2-Hi cells the progenitors for club and ciliated cells?
We cannot prove this without lineage tracing. Nevertheless, scRNA-seq is increasingly used to predict lineage and developmental relationships among heterogeneous cell populations and their differentiation states. In the present study, we applied entropy based differentiation state analysis. Our in silicon model predicted the potential role of Sox2-hi as progenitors for club and ciliated cells. Our in silicon lineage prediction is supported by previous studies. Several lines of evidence demonstrated that Sox2 plays multiple roles in both the developing and adult airways: required for maintenance and differentiation of bronchiolar club, ciliated, and goblet cells. Sox2 induces proliferation and partially reprograms alveolar epithelial cells into cells with characteristics of the conducting airways including club and ciliated cells. Conditionally deleting Sox2 leads to a reduction in the number of basal, ciliated and club cells during development and in the adult following injury 38-40 .
Experimental Validation: To better address reviewer's comments, we performed immunofluorescence staining for SOX2, identifying SOX2 hi (solid white outline) and SOX2 lo (dotted white outline) subpopulations. Co-expression of SCGB1A1 or FOXJ1 with SOX2 was observed in the SOX2 hi . The result has been added to Supplementary Fig. 8.
2) The sc-RNAseq results indicate Lyve-1 is expressed in both lymphatic and vascular endothelial cells. However, Lyve-1 is believed to be specific for lymphatic endothelial cells based on prior work in the field. How do the authors reconcile this result? Is it possible what they call the vascular endothelial population may represent a different lymphatic endothelial cell type or progenitor?
Author's Response: Single cell approaches have enhanced resolution of cell selective markers based mRNA expression. We observed Lyve1 RNA in both lymphatic and vascular endothelial cells in multiple single cell datasets including present data. These observations are consistent during lung development and in both human and mouse. Our single cell study suggested that Lyve1 is essential but not sufficient to identify lymphatic endothelial cells. We performed immunofluorescence staining and identified co-staining of LYVE1 with EMCN and/or SOX17, the latter two are selective markers for vascular endothelial cells. Publications (Kretschmer et al., 2013 andGordon et al., 2008) also suggested that LYVE-1 is not restricted to the lymphatic vasculature but is also expressed on embryonic blood vessels. Our present study supports the use of Lyve-1 in combination with additional markers, such as Thy1 or Prox1, to specifically identify lymphatic endothelial cells. We have added the immunofluorescence data in Fig. 3c.
3) For every named cell class and sub-class, the authors should specify which exact marker(s) they use to identify the named class. For instance, what defines and distinguishes what the authors call myofibroblasts from fibroblasts, pericytes, and smooth muscle cells? To some degree, these classes express the same transcripts, though at different levels. The authors should be very clear about this to allow for comparison of their dataset and conclusions with future studies by different groups.
Author's Response: We agree with the reviewer that these mesenchymal cells share many genes; our subtype distinction is primarily based on gradient differences of selective markers. As suggested, we have added Supplementary Table 9 listing candidate markers for each subtype that may be useful to the field.
MINOR POINTS 1) In Figure 9, the wrong color is indicated for the AGER stain 2) There are numerous instances of spelling errors throughout the manuscript Author's Response: Yes. Thank you. We had reported to the editor regarding the mislabeling of AGER in the original Figure 9 and sent the correct version on 4/10/2018. As suggested, we have more carefully edited the manuscript to correct syntax and spelling errors.

Reviewer #2 (Remarks to the Author):
I found this a very interesting description of the transcriptional profile defining individual cell types in the mouse lung using a Drop-seq technique. The types of cell types identified and their gene expression match very well with previous knowledge but this study adds further knowledge and complexity to the variety of cell types in the lung. RNA Seq is then used to analyse temporal changes in gene expression through embryonic development and after birth, with the transcriptome data being inferred back on the single cell data to explore which cells are likely contributing to key changes in gene expression around birth. Whilst it would have been better to have single cell data at each time point that would have been a heroic effort. The key signatures of adaptation to breathing/birth include responses to oxidative and ER stress, and increased lipid metabolism which are well explained in the discussion. Protein studies are used to confirm these changes, with for example a very convincing enhancement in ATF6 at PND1 in epithelial cells.
To further validate the Pdia3 expression in AT1 and AT2 cells, we repeated immunofluorescence staining for PDIA3 and ABCA3 (an AT2 cell marker) and PDPN (another AT1 marker) on frozen lung sections at PND1. We observed clear co-localization of PDIA3 with ABCA3, and lack of co-staining with AGER (original Fig. 9, now Fig. 7 in the revised manuscript) or PDPN (Supplementary Fig. 20). Previous proteomics study of cultured AT2 cells revealed the induction of PDIA3 protein expression during AT2 to AT1 transdifferentiation and the role of PDIA3 is mediated by WNT signaling (PMID: 26035385), consisting with the present finding that both Pdia3 and Axin2 are more enriched in AT1/AT2 cells than other epithelial cell types. The lack of co-expression with AT1 markers is likely relative to the plasticity of alveolar AT1 and AT2 cells at PND1 and immunofluorescence staining fails to capture this transition stage of cells.
Reviewer #3: Surfactant production was shown to be up-regulated at PDN1 in the bulk RNA-seq results. It would be interesting to show also the change in expression at the protein level of different surfactant proteins. It has been shown that the ER control is important for the biosynthesis of surfactant proteins. One can wonder whether the activation of ERAD would have a direct effect on the proper biosynthesis of surfactant proteins? ERAD activation would be needed to respond to the increase production of surfactant proteins?