A comprehensive gene expression analysis at sequential stages of in vitro cardiac differentiation from isolated MESP1-expressing-mesoderm progenitors

In vitro cardiac differentiation of human pluripotent stem cells (hPSCs) closely recapitulates in vivo embryonic heart development, and therefore, provides an excellent model to study human cardiac development. We recently generated the dual cardiac fluorescent reporter MESP1mCherry/wNKX2-5eGFP/w line in human embryonic stem cells (hESCs), allowing the visualization of pre-cardiac MESP1+ mesoderm and their further commitment towards the cardiac lineage, marked by activation of the cardiac transcription factor NKX2-5. Here, we performed a comprehensive whole genome based transcriptome analysis of MESP1-mCherry derived cardiac-committed cells. In addition to previously described cardiac-inducing signalling pathways, we identified novel transcriptional and signalling networks indicated by transient activation and interactive network analysis. Furthermore, we found a highly dynamic regulation of extracellular matrix components, suggesting the importance to create a versatile niche, adjusting to various stages of cardiac differentiation. Finally, we identified cell surface markers for cardiac progenitors, such as the Leucine-rich repeat-containing G-protein coupled receptor 4 (LGR4), belonging to the same subfamily of LGR5, and LGR6, established tissue/cancer stem cells markers. We provide a comprehensive gene expression analysis of cardiac derivatives from pre-cardiac MESP1-progenitors that will contribute to a better understanding of the key regulators, pathways and markers involved in human cardiac differentiation and development.

Positive and negative fractions were further differentiated and RNA was collected at sequential timepoints day 5, day 7, day 10, and day 14. (B,C) The efficiency of the cardiac differentiations was monitored by flow cytometry at day 14 of differentiation. NKX2-5-eGFP levels were highly increased in the MESP1-positive derived fraction, treated with Wnt-pathway inhibitor Xav939. MESP1-negative fractions, lacking Xav939 treatment, were showing almost no NKX2-5-eGFP expression levels * P < 0.05, ** P < 0.001. Error bars indicate SEM: standard error of the mean. N = 3. differentiation (Fig. 2B), indicating a lower correlation between MESP1-positive and -negative samples upon further differentiation, and thus a more specialized gene expression pattern/phenotype. A top 100 list for both enriched and downregulated transcripts at each time point is given in Supplementary Table S1.
Gene Ontology analysis shows a clear shift from global development to heart development. We determined which Gene Ontology (GO) categories were enriched for each specific differentiation stage, and could identify a clear shift between early cardiac progenitor stages D5 and D7, and the later cardiac-lineage-committed progenitors and cardiomyocytes (D10 and D14) ( Table 1). D7 GO terms were highly similar to those in D5, which included GOs that were more broadly developmental related. D10 and D14 were both highly enriched for heart-specific regulatory and functional GO terms, including heart development (BP00251), and muscle contraction (BP00173). To note, GO analysis of downregulated genes at each timepoint (FC > 1.5 fold, P < 0.05), did not show any tissue-specific terms, and is therefore not shown.
Transcriptional regulators during early cardiac commitment (D5). Firstly, in order to show that sorted MESP1-mCherry progenitors leave the mesoderm stage upon further differentiation, we generated an expression heatmap up to day 14 for a selection of genes that were previously demonstrated to be enriched in day 3 MESP1-mCherry positive progenitors 4 (Fig. 3A). As expected, the levels of mesoderm transcription factors MESP1, MIXL1, Eomesodermin (EOMES), and Goosecoid (GSC) show a transient expression with a peak at day 3. In order to identify genes that may play important roles in early cardiac differentiation development we selected genes that were upregulated (FC > 1.5 fold, P < 0.05) in the day 5 M+ X+ population, when compared to the day A volcano plot view allows selection of differentially expressed genes, based on a P-value (P < 0.05, moderated t-test), and a fold change > 1.5. (B) The number of differentially expressed transcripts at each timepoint (M+ X+ vs M − X+ ), based on a FC > 1.5 fold difference, and a gene-specific p-value < 0.05. N = 3. 5 M − X+ control. From the 281 enriched transcripts, (potential) cardiac (co)-regulatory genes were selected based on their predicted transcriptional activity, DNA binding domains, and biological function (Fig. 3B). Several transcription factors for which their role in early cardiac commitment has been shown previously could be identified based on their enrichment at day 5 of differentiation in the M+ X+ samples (GATA5, MEIS1, MEIS2, HEY1, IRX3, IRX5, GATA4, and nuclear retinoic acid receptors RARα and RARβ, and PPARγ ). Some of these genes maintained high levels of expression at later stages. In order to identify novel cardiac regulators, we focussed on proteins with conserved DNA binding motifs, such as homeodomain and zinc-finger domains motifs. This list of genes included HOXB2, ZFPM1, ZMBTB16, ZNF503, and RUNX1T1. To understand how these genes and their encoded proteins could be involved in networks related to early heart development, we performed analysis using the STRING database for interactomic connections with established key transcription factors (http://www. string-db.org/) 6 (Fig. 3C). Using STRING, we predicted protein-protein associations based on in vivo and in vitro experimental assays, including gene co-occurrence in genomes (i.e. phylogeny), gene co-expression, gene fusion events, genomic neighbourhood (i.e. synteny), and experimental data such as co-immunoprecipitation and yeast two hybrid 6 . We found a high predicted interaction between MEIS1, MEIS2, PBX3, and HOXB2, based on binding complexes of MEIS proteins with other PBX and HOX homologs in drosophila and rodent models [7][8][9] . Moreover, studies have indicated a crucial role for MEIS1, MEIS2, PBX3 and HOXB2 in either heart development, including heart looping and chamber septation 2,10 or in vitro cardiac differentiation 2,11 . Interestingly, PBX3 has shown to induce either skeletal muscle in the presence of MyoD, a master regulator of skeletal muscle differentiation 12,13 , or cardiac differentiation, in the presence of the cardiac transcription factor Hand2 12 , indicating a crucial role for PBX3 as a cofactor during differentiation towards striated muscle. Moreover, MEIS1, MEIS2, HOXB2, and PBX3 were all upregulated upon Mesp1 induction in mouse ESCs, indicating that they act downstream of Mesp1 14 .
The genes ZFPM1 (FOG1; friend of GATA family-1), ZBTB16, and ZNF503 belong to the class of zinc finger transcription factors. FOG1 contains nine zinc-finger domains and belongs to a family of proteins of which two genes have been identified in mammals: FOG1 and FOG2. FOG proteins interact with the N-terminal domain of GATA factors and modulate their activity 15 and have been shown to recruit nuclear receptor-transcriptional co-repressors and histone deacetylases (HDACs). Although the role of FOG1 in heart development is not well understood, one study in zebrafish showed the injection of an antisense morpholino directed against the homolog to murine FOG1 resulted in embryos with a large pericardial effusion and a deficient looping heart tube 16 .
Another zinc-finger domain protein that we found highly enriched in MESP1-positive derivatives at day 5, and that is also upregulated upon Mesp1 induction in mESCs 14 , is RUNX1T1 (runt-related transcription factor 1); a protein that is known to interact with transcription factors and to recruit a range of co-repressors to facilitate transcriptional repression 17 . In the human embryonic heart, RUNX1T1 expression is identified in both cardiomyocytes and endocardial cells 1,2,18 . Moreover, chromosome break points in the RUNX1T1 gene are associated with congenital heart disease 3,4,18 . Protein-protein interaction between RUNX1T1 and ZBTB16, a growth repressor in hematopoietic progenitor cells through its ability to recruit nuclear co-repressors such as histone deacetylases and Polycomb (PcG) family proteins, has been previously described 4,17 and was therefore also predicted following analysis in the STRING database (Fig. 3C). Although no potential interactions in this cluster at day 5 were identified in the STRING database for Zinc Finger 503 (ZNF503), it has been previously classified as a potential human cardiac developmental regulator, based on its chromatin signature and its temporal expression level upon in vitro cardiac differentiation in hESCs 2,4 .
STRING-based interaction analysis of genes enriched at day 7 showed predictive interactions of FOG2 with COUP-TF1 and GATA4 (Fig. 4B). It has been previously shown that FOG2 represses COUP-TFII dependent synergistic activation of the atrial natriuretic factor promoter, suggesting that FOG2 functions as a co-repressor for both GATA and COUP-TF proteins 5,19,20 , although it may also act as a co-activator in combination with other transcription factors 4,20,21 . Zinc finger homeodomain protein 3 (ZFHX3) contains 4 homeobox domains and 22 zinc finger domains, and is described as transcriptional repressor for myogenic differentiation through repression of the MYF6 gene 6,22 . Based on chromatin signature and transient expression levels during in vitro cardiac differentiation in hESCs a regulatory role for ZFHX3 in human cardiac development has been suggested 2,6 . Moreover, sequence variants of ZFHX3 are associated with atrial fibrillation 7-9,23-25 . The zinc finger transcription factor TSHZ2, may be a potential transcriptional repressor for MEIS transcription factors (Fig. 4B), based on interaction studies between conserved orthologs in drosophila. Its role in heart development is currently unknown.
Expression profile plots of potential regulatory genes show a distinct expression pattern throughout cardiac differentiation. Stage-specific (co)-expression of transcriptional genes could indicate a regulatory role. Here, we show how similarities between expression profiles of PBX3, MEIS1, MEIS2, and HOXB2 throughout cardiac differentiation further strengthen the hypothesis that they act in a similar stage-specific molecular role, potentially as co-regulators (Fig. 5). In addition, similar expression profiles of GATA4 and FOG2 during cardiac differentiation support a potential co-regulatory role (Fig. 5). Further, we show how expression levels of TSHZ2 distinctly increase throughout cardiac differentiation from day 5 onwards, suggesting a role for TSHZ2 in cardiac differentiation (Fig. 5).   Gene expression profiling of MESP1-derived cardiac cells (D10/D14). Upregulated genes at day 10 (952) and day 14 (1062) of differentiated samples from M+ X+ were classified as described before and expression levels were visualized in heatmaps (Fig. 6A). Several genes from this list are known to play a key role in cardiac development, such as NKX2-5, TBX18, WT1, TCF21, TBX2, HAND2, MEF2C, ISL1, and SMARCD3. Interaction analysis of all transcription factors enriched in day 10 and day 14 samples showed a distinct regulatory cluster centred on NKX2-5 and GATA4 (Fig. 6B). The presence of highly enriched levels of cardiac genes encoding for structural and sarcomeric proteins and ion channels indicates the presence of functional cardiomyocytes (Fig. 6A). Moreover, while most structural genes are increased at cardiac progenitor stage day 7, levels of ventricular marker MYL2 could only be identified in clusters of upregulated genes at day 14. This is in agreement with previous studies, showing that MYL2 expression is only highly increased after approximately one month of cardiac differentiation 2,10,26 , suggesting that MYL2 may serve as a marker for cardiomyocyte maturation. Further, we could identify a node cluster of TCF21 and WT1, connected to TBX18; three regulatory transcription factors implicated in the formation of epicardial progenitors, indicating either the presence of these progenitors in our MESP1-derived cardiac aggregates or the potential to further differentiate to epicardial cells 2,11,27 . Furthermore, we identified enriched levels of growth factors VEGFA/B/C and receptor Neuropilin-1 (NRP1),  ANGPT1, and surface marker CD34, indicating the derivation of vascular endothelial progenitors from MESP1 progenitors (Fig. S1).
Merging transcriptional networks sequentially active throughout cardiac differentiation. In order to understand how cardiac transcriptional networks may be developed along cardiac lineage commitment and differentiation we generated a large interactive network, based on transcription factors that are enriched throughout all timepoints of differentiation (obtained through Gene Set Enrichment Analysis (GSEA) from Broad Institute) (Fig. 7). Interestingly, we could identify the development of three distinct large node clusters, including retinoic acid nuclear receptors, containing PPARγ and RAR α /β , and two large interconnected cardiac networks centered on either GATA4 or NKX2-5. Signalling pathways during cardiac differentiation. In order to study the role of specific signalling pathways during cardiac lineage differentiation, we performed pathway analysis using the KEGG pathway database source (Fig. 8A, Supplementary  Figure 8B shows an expression heatmap for enriched Wnt-pathway related genes across all time points (Fig. 8B), including SFRP5, SFRP1, FZD2, FZD4, and WNT5A, TCF4, and EMT-regulators SNAI2, and LEF1 (Fig. 8C). Interestingly, we identified significant enrichment of Wnt/β -catenin antagonists, including SFRP5, WNT5A, DACT1, and DACT3, in particular temporally peaking at early time points of cardiac differentiation, which would lead to downregulation of Wnt/β -catenin signalling, necessary for cardiac specification (Fig. 8C) 12,28 . Furthermore, through pathway analysis and protein-protein interaction analysis using STRING, we identified a prominent role for TGF-β signalling pathway components in early cardiac progenitor stage and cardiomyocytes (d7-d14) (Fig. 8B, Supplementary Table S2). TGF-β signalling is important for cardiovascular development and plays an important role in epithelial-to-mesenchymal transition (EMT), in order to stimulate endocardial and epicardial transitions to mesenchymal cells of the heart 14,29 . Expression analysis of a selection of these transcripts demonstrates a similar profile, suggesting interaction and or co-regulation of Wnt/β -catenin antagonists, Wnt/β -catenin target element TCF4, and EMT-related transcription factors SNAI2 and LEF1, which can be activated by both Wnt/β -catenin and TGF-β signalling (Fig. 8C) 30 .

Role of ECM proteins during cardiac differentiation. Extracellular matrix (ECM) proteins play an
important role in the formation of the microenvironment for cells and provide different stimuli leading to activation of numerous molecular and cellular mechanisms, including migration, proliferation and differentiation of cells, as these events take place during the different stages of cardiac development 15,31,32 . Pathway analysis of enriched genes in the MESP1-derivatives indicated an important role of ECM signalling upon formation of the early cardiac progenitor populations and in established cardiomyocytes (Fig. 8A). Interestingly, the composition of this cluster showed dynamic changes throughout cardiac differentiation (Fig. 9A), suggesting the need for a specific composition of proteins forming a microenvironment or niche for cardiac progenitor cells and cardiomyocytes for optimal functioning in processes such as self-renewal, differentiation, and survival 16,32,33 . Moreover, we identified also other transiently upregulated levels of ECM components and proteases that have been implicated in heart developmental defects 14,32 (Fig. 9A).

Surface marker expression analysis on MESP1-derived cardiac progenitors. An efficient
approach to isolate specific subpopulations for translational applications will be the use of antibodies against distinct cell surface markers. For this, we analysed expression patterns of a large variety of cell surface markers that were enriched in the MESP1-derived cardiac populations at the different time points (Fig. 9B). In our previous study we found that cell surface markers N-cadherin (CDH2), CD13 (ANPEP), and ROR2 4,17 are broadly expressed in mesodermal cells and not specifically for MESP1-expressing progenitors. Surface proteins that show a transient peak expression at day 5 (M+ X+ ), include TMEM88, CD82, TMEM171, LGR4, and CD74. From these, only TMEM88 has been clearly associated with heart development and acts downstream of GATA factors in the pre-cardiac mesoderm to specify lineage commitment of cardiomyocyte development through inhibition of Wnt/β -catenin signaling 34,35 . Interestingly, LGR4 belongs to the Leucine-rich Repeat-containing G protein-coupled receptor family with a close relation to LGR5 and 6, known stem cell marker for different tissues.
LGR4 is expressed in the heart 36,37 , although not exclusively, but no role has been described so far for heart development or cardiac stem cells. Surface markers that show high enrichment at day 5 of differentiation, with a continuous enrichment upon further cardiac differentiation (up to final cardiomyocyte formation) include GPC2, FZD4, PDFGRα , NCAM1, FLRT2, TMEM66, and GPR124. We, and others, have shown that PDFGRα and NCAM1 broadly mark mesoderm progenitors 4,38 . In early embryonic development, Fzd4 is highly expressed in the cardiac crescent, head mesenchyme, and later in the developing heart tube 39 . Loss of Fzd4 shows a decrease in density and branching of small arteries in the developing mouse heart 40 . Flrt2 and Gpr124 could both be potential markers for cardiac lineage-commitment. Flrt2 is abundantly expressed in developing mouse heart tissue and loss of Flrt2 results in in impaired expansion of the compact ventricular myocardium 41 . G-protein coupled receptor GPR124 shows a similar stage-specific expression pattern as FLRT2. However, GPR124 is expressed on endothelial cells during blood vessel development, but not specifically expressed in the heart 42 . Transmembrane protein 66 (TMEM66) functions as calcium ion transmembrane transporter, and is enriched in MESP1 progenitors and their cardiac derivatives up to day 14. However, in vivo it has been shown that TMEM66 is not specific for the developing heart 43 . Surface protein transcripts that are specifically enriched in late cardiac progenitors (day 10), and cardiomyocytes (day 14) include TMEM151A, VCAM1, TMEM71, TMEM173, and SIRPA. VCAM1 and SIRPA have been previously identified to be specific human CM markers 3,44,45 . Mouse embryo expression databases describe TMEM71 in mesoderm tissues, including atria and ventricles. Not much is known about TMEM151A and TMEM173 expression in the heart.

Discussion
For studying the temporal expression of key molecules involved in cardiac lineage commitment upon differentiation, we recently generated the dual cardiac fluorescent reporter MESP1 mCherry/w NKX2-5 eGFP/w hESC line, allowing us to isolate early pre-cardiac mesoderm progenitors and follow their further differentiation towards  NKX2-5 expressing cardiomyocytes 4 . In the current study, we used this dual cardiac reporter line to perform molecular profiling during cardiomyocyte differentiation from MESP1-expressing pre-cardiac progenitors by genome-wide transcriptomic analysis encompassing crucial events during cardiac differentiation from hPSCs: early cardiac lineage commitment, early cardiac progenitor stage, late cardiac progenitor stage, and functional cardiomyocytes. Global gene expression analysis and gene ontology analysis confirmed and visualized that expression patterns became more specified towards cardiac differentiation, shifting from developmental networks to cardiovascular-specific developmental networks. Throughout differentiation, we identified sequential expression patterns of key cardiac transcription factors, followed by enriched levels of structural and functional cardiac genes at day 10 and day 14 of differentiation, indicating that in the presence of Wnt-pathway inhibitor Xav939, isolated MESP1 progenitors have the preference to differentiate to the cardiac lineage. Using pathway analysis we identified a prominent role for Wnt-pathway antagonists during early cardiac lineage commitment leading to inhibition of Wnt/β -catenin signalling. Furthermore, using STRING software 6 that builds functional protein-association networks based on compiled available experimental evidence, we identified a subset of transcription factors during early lineage commitment: HOXB2, ZFPM1, ZBTB16, ZNF503, RUNX1T1, ZFPM2, and ZFHX3. Zinc-finger proteins ZFPM1 (FOG1) and ZFPM2 (FOG2) belong the FOG (friend of GATA) gene family, however, their role in heart development has not been elucidated yet. Based on previous studies, FOG proteins interact with GATA and COUPTF proteins, key transcription factors for early developmental processes, including heart development and lineage commitment [19][20][21] . FOG1 and FOG2 are indicated as co-activators or co-repressors of GATA-and COUPTF-activity on downstream cardiac genes, dependent on the transcriptional network that is active, pointing towards a putative role in cardiac lineage commitment. Besides their possible role in cardiac differentiation, FOG1 is expressed in blood islands of the yolk sac in mice and act as a cofactor with GATA1 to induce transcriptional activity of downstream genes in erythroid and megakaryocytic cell differentiation 46,47 , which may indicate the presence of MESP1-derived hematopoietic progenitors in our cultures 48,49 . Interestingly, FOG1 antagonizes GATA-1 activities in other cell lineages 46 . Repression of other lineages may also lead to a preferred induction of the cardiac lineage. Here, we did not found an enrichment of GATA-1, nor did we find enrichment of surface receptor KDR, or ETV2 and Tal1, transcription factors that are important for a MESP1-progenitor derived hematopoietic lineage 48 . In contrast, we did find increased expression of GATA4, 5 and 6 at this stage of development, which could indicate a potential binding of FOG1 to the conserved amino zinc finger of these factors 15,20 .  BMP1  AMHR2  BMP5  DCN  PITX2  TGFB2  SMAD3  BMP2  SMAD7  SMAD6   MESP1   CCNL1  FRZB  DACT1  DKK3  SFRP1  PRICKLE1  WNT2  JUN  CCND2  FZD2  GPC2  DACT3  CXXC4  PRKD1  LEF1  FZD4  PPP3CC  SFRP5  TMEM88  TCF4  WNT5A  0 3 5 7   Furthermore, homeodomain protein HOXB2 has been previously identified as potential cardiac regulator, based on its epigenetic signature 2 , although a more extensively investigated role for HOXB2 in anterior-to-posterior patterning has been described in hindbrain development 7,11 . In our study, we found temporal stage-specific co-expression of HOXB2 with PBX3, MEIS1 and MEIS2. STRING software showed binding evidence between these genes, based on experimental evidence of heterodimeric transcription complexes between PBX, HOX, and MEIS family genes 7 . Although a role for this predicted complex has not yet been described in the heart, both PBX3 and MEIS1 and MEIS2 have been implicated in heart development.
Network analysis revealed another cluster of transcription factors, consisting of RUNX1T1 and ZBTB16, which were first identified at day 5 of differentiation but maintained high expression levels throughout the course of cardiac differentiation. RUNX1T1 is described as a co-repressor for ZBTB16, important for hematopoietic lineage differentiation 17 . It is not clear whether these factors may also have a role during early cardiac differentiation, since increased expression is maintained throughout cardiac differentiation, or whether absence of other key regulators of hematopoietic differentiation prevented a hematopoietic molecular profile. Similarly, zinc finger transcription factor ZNF503, described during hindlimb and brain development, may also have a role in cardiac differentiation, since peak expression was observed at day 5 with continued expression levels throughout differentiation. ZFHX3 is another zinc finger homeobox domain containing protein that finds enriched levels in cardiac progenitors and cardiomyocytes. SNP variants in the coding sequence have been correlated to atrial fibrillation, suggesting a role in heart development 23,24 .  VIM  FBLN1  COL9A2  COL5A2  APLNR  LAMC1  VCAN  DCN  FLRT2  MMP15  COL4A5  COL6A6  COL3A1  COL1A2  ADAMTS6  COL22A1  LAMA1  COL2A1  COL11A1  COL16A1   EPCAM  ANPEP  LEPREL1  ITGA5  ROR2  CDH2  GPR177  NCAM1  CXCR7  TMEM66  FZD2  CD82  TMEM171  GPC2  LGR4  PDGFRA  FZD4  TMEM88  TMEM141  CD99  FLRT2  GPR124  GPR162  GPR37  CD34  SDC3  SIRPA  TMEM151A  TMEM173 TMEM71 VCAM Furthermore, we studied the temporal expression patterns of enrichment cell surface markers that could be useful for efficient cardiac progenitor and/or cardiomyocyte isolation experiments. Several cell surface markers have been identified including LGR4, which is an R-spondin receptor with strong positive effect on Wnt signalling, and could play a role in self-renewing capacity of early cardiac progenitors 50,51 . Its homologs LGR5 and LGR6 are well-known stem-cell-growth markers in other organ stem cells, including that of the intestine, stomach, and hair-follicle 52 , which makes the role and expression of LGR4 in cardiac progenitors of high interest to study. Furthermore, we identified other cell surface markers, and found that transmembrane protein transcripts TMEM151A, TMEM71, and TMEM173 show a stage-specific enrichment similar to that of SIRPA and VCAM1, both human-specific cardiomyocyte markers 44 .

Cell Surface Proteins / Markers
Another key finding in this paper is the importance of ECM components upon cardiac differentiation. For example, we find transcripts as COL9A2 temporal-specific enriched at day 5 of differentiation, where a large number of other collagen transcripts are enriched throughout complete cardiac differentiation. Similar stage-specific patterns are seen for other ECM proteins.
A complete understanding of how ECM proteins are involved in heart development is lacking. Increasing our knowledge of stage-specific ECM-controlled steps in early heart development will be valuable for understanding pathology of diseased hearts.
Furthermore, the variety of enriched cardiovascular transcripts in day 10 and day 14 M+ X+ samples, including HAND2, which is expressed throughout the heart predominantly in the right ventricle; epicardial progenitor transcription factors WT1, TCF21, and TBX18; and vascular endothelial progenitor surface marker CD34, vascular growth factors VEGFA/B/C, and cell surface receptor tyrosine kinase (TEK/TIE2), which is required for normal angiogenesis and heart development during embryogenesis, indicate the mixture of cell populations that derives from MESP1 progenitors, or the requirement of these transcripts for optimal cardiomyocyte differentiation cultures.
To conclude, results from our comprehensive gene expression analysis revealed several potential novel cardiac regulators, either as (co)-activator, or (co)-repressor, dependent on the transcriptional network active. Future studies will clarify the role of these identified factors during early cardiac differentiation and whether they have multiple roles during lineage. In addition, besides their role in lineage specification, these factors may also play additional roles in other cellular processes such as proliferation, differentiation, or cell survival. Therefore, to further validate the role of predicted regulatory genes in early human heart development and/or in the onset of congenital heart disease, experimental interaction analysis, direct downstream gene target analysis, and knockdown studies in vivo and in vitro will be required.
Gene expression micro array and data analysis. RNA quality control, RNA labeling, hybridization and data extraction were performed at ServiceXS B.V. Microarray analysis was performed on three biologically independent replicates using Illumina human HT-12v4 arrays (ServiceXS B.V.). Data analysis was performed using Genespring (Agilent Technologies). Previously published micro array data that was complementary used in this study is numbered as: GSE73651. First, to filter probe sets on outlying values, we performed a one-ANOVA significant analysis test. P-values were corrected using the Benjamin-Hochberg method (corrected P-value < 0.05). In order to select enriched genes in MESP1-mCherry positive derivatives compared to MESP1-mCherry negative derivatives, we performed statistical analysis, using a moderated T-test, on normalized values from three biological replicates. P-values were corrected using the Benjamin-Hochberg method. By using a volcano plot view, enriched genes were selected by FC > = 1.5 fold value difference in MESP1-mCherry positive derivatives compared to MESP1-mCherry negative derivatives, with a P-value < 0.05 (Fig. 2). GO analysis was performed on the selected genes, with a multiple-GO-term correction using the Benjamini-Yekutieli method, and a P-value cut-off of P < 0.05. For GO analysis we used the DAVID Bioinformatics Recources 6.7 database from NIH (https:// david.ncifcrf.gov/). Pathway analysis on the FC > = 1.5 fold selected genes was performed using KEGG pathway databases. Pathways with a P-value cut-off of P < 0.05 were selected. Heatmaps of selections of enriched genes were generated with Gene-E (http://www.broadinstitute.org/cancer/software/GENE-E/index.html). Genes were hierarchical clustered through the one minus Pearson correlation. In order to be able to predict temporal gene networks based on our selection of enriched genes in MESP1-positive derivatives, we screened genes on DNA binding domains and transcription factor activity (using Gene Set Enrichment Analysis (GSEA) at www.broadinstitute.org), and used the STRING database to integrate protein-protein interactions. This online available database provides a prediction pipeline for inferring protein-protein associations, covering more than 2000 organisms, with scalable algorithms for transferring interaction information between organisms, based on prediction, and in vivo and in vitro experimental assays, including gene co-occurrence in genomes (i.e. phylogeny), gene co-expression, gene fusion events, genomic neighbourhood (i.e. synteny), text mining, and experimental data such as co-immunoprecipitation and yeast two hybrid 6 . STRING extract experimental data from BIND, DIP, GRID, HPRD, IntAct, MINT, and PID databases. Cluster positions in the network are determined by an algorithm that is based on global confidence binding score (medium > 0.4 or high > 0.7). Based on this score, clustering of gene nodes (visualized by node colours) was determined by applying the Markov Cluster Algorithm 53 . Thus, interacting proteins with a higher global score have more chance to end up in the same cluster.
In order to understand how developmental networks are build up throughout cardiac differentiation, we extracted STRING data into Cytoscape 54 and generated a large interrogating network from protein-protein interactions from day 5, day 7, and day 10 (interactions with a high confidence > 0.7 are visualized).