Massively parallel reporter perturbation assays uncover temporal regulatory architecture during neural differentiation

Gene regulatory elements play a key role in orchestrating gene expression during cellular differentiation, but what determines their function over time remains largely unknown. Here, we perform perturbation-based massively parallel reporter assays at seven early time points of neural differentiation to systematically characterize how regulatory elements and motifs within them guide cellular differentiation. By perturbing over 2,000 putative DNA binding motifs in active regulatory regions, we delineate four categories of functional elements, and observe that activity direction is mostly determined by the sequence itself, while the magnitude of effect depends on the cellular environment. We also find that fine-tuning transcription rates is often achieved by a combined activity of adjacent activating and repressing elements. Our work provides a blueprint for the sequence components needed to induce different transcriptional patterns in general and specifically during neural differentiation.


Supplementary Notes
Supplementary Note 1: Determinants of time-point specific regulatory activity.
One of the open questions in biology is which regulatory elements play a role under different cellular conditions and what determines this cell-type specificity. The answers to these questions can guide the design of synthetic enhancers for cell therapy and can help refine drug design. In our model system, we examined whether we could gain a better understanding of the determinants of time-point specific regulatory activity. Cooperative binding of pluripotency and neural factors (POU, SOX, NANOG) is known to play an important role. In particular, SOX2 and the class V POU factor POU5F1 function as pioneer factors in ESCs, while other SOX and POU family members (e.g., SOX1/2 and POU3F1) play important roles in NPCs. However, their function depends on genomic context and on the molecular function of other factors (e.g., OTX2) that play roles in neural induction, and these dependencies remain largely unknown.
When we perturb an essential motif, by definition, the enhancer is no longer functional at any of the time points, suggesting that these motifs are required for transcription but not necessarily for determining a specific temporal pattern of transcription. We were therefore intrigued to see whether we could find condition-specific binding motifs in our data. To that end, we looked for FRSs with motifs in (i) late response WT regions (WT alpha cluster 3; Supplementary Fig. 12a) that exhibit their highest perturbation effect at the later time points (Log2FC cluster 3; Supplementary Fig. 12a) and, specifically, show significant perturbation effects in the 72hr NPC state but not in the 0hr embryonic stem cell (ESC) state; and (ii) early response WT regions (WT alpha cluster 1; Supplementary Fig. 12a) that exhibit their highest perturbation effect at the early time points (Log2FC cluster 1; Supplementary Fig. 12a) and, specifically, show significant perturbation effects at 0hr but not at 72hr. We find 37 sequences that are candidates for driving the NPC state (i) and 7 sequences that are candidates for driving the ESC state (ii) (Supplementary Dataset 4). Our premise was that if such motifs are condition (time point) specific, they should determine whether the regulatory activity of a genomic sequence is in the ESC or the NPC state. We thus looked for enrichment (hypergeometric test, FDR<0.05) of such motifs across all 591 WT regions, in either ESC WT regions (cluster 1) or NPC WT regions (cluster 3). Bearing in mind that this is an underpowered test, we find only the following motifs enriched in NPC WT regions: RELA_M4497_1.02 (GGGGATTTCCA), RELB_M6448_1.02 (GGGGGATTTCCA) and SP8_1 (GCCACGGCCACT). No motifs were found to be enriched in ESC WT regions.
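For readers who want to see the shape of such an enrichment test, the following is a minimal sketch in Python using only the standard library. It is not the analysis code used here. The one-sided (upper-tail) form of the hypergeometric test, the use of Benjamini-Hochberg as the FDR procedure, the cluster size and the motif counts are all assumptions made for illustration; only the background of 591 WT regions and the FDR<0.05 threshold come from the text above.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper-tail probability P(X >= k) for a hypergeometric variable.

    N: background size (all WT regions)
    K: background regions that contain the motif
    n: size of the tested cluster
    k: cluster regions that contain the motif
    """
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


N_REGIONS = 591      # total WT regions, as stated in the text
CLUSTER_SIZE = 150   # hypothetical cluster size, for illustration only

# Hypothetical counts: motif -> (regions with motif overall, of those in cluster)
toy_counts = {
    "motif_A": (40, 25),
    "motif_B": (30, 8),
    "motif_C": (12, 3),
}

names = list(toy_counts)
pvals = [
    hypergeom_sf(in_cluster, N_REGIONS, overall, CLUSTER_SIZE)
    for overall, in_cluster in toy_counts.values()
]
qvals = benjamini_hochberg(pvals)
enriched = [name for name, q in zip(names, qvals) if q < 0.05]
```

With only a handful of candidate sequences per state, few motifs can reach significance after multiple-testing correction, which is consistent with the test being described as underpowered.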
Moreover, we find instances where the same motif sequence, e.g., the GGGGATTTCCA motif for RELA_M4497_1.02 in region chr1:33808004-33808175, shows functional activity in both the NPC (72hr) and ESC (0hr) states. These results suggest that the motif sequence alone is unlikely to determine temporality without the context of the surrounding region and the other bound factors that influence it.
Interestingly, we observe instances where a similar motif appears more than once in the same region and has different condition-specific effects: the CTTTGGATGACAAAGG motif does not show an NPC-specific effect, whereas the TTTGGATGACAAAGG and TTGGATGACAAAGG motifs do (SOX1 example; Supplementary Fig. 12b). This suggests that there might be specific cases where, within the same region, we can dissect the specific bases that affect temporality.
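The relationship between the three SOX1 motif sequences can be made explicit with a short string comparison. The sketch below, in Python, only compares the sequences as text. It shows that they share a common 3' core and differ by one or two 5' bases. Whether the three instances overlap at the same genomic coordinates is not stated here, so no positional claim is made; the helper name `shared_suffix` is hypothetical.

```python
# Motif sequences quoted in the text (SOX1 example).
motifs = ["CTTTGGATGACAAAGG", "TTTGGATGACAAAGG", "TTGGATGACAAAGG"]


def shared_suffix(seqs):
    """Return the longest suffix common to all sequences."""
    common = []
    # Compare characters from the 3' end inwards.
    for chars in zip(*(s[::-1] for s in seqs)):
        if len(set(chars)) != 1:
            break
        common.append(chars[0])
    return "".join(common)[::-1]


core = shared_suffix(motifs)

# 5' bases that each motif carries in addition to the shared core.
extra_5prime = {m: m[: len(m) - len(core)] for m in motifs}

for motif, extra in extra_5prime.items():
    print(f"{motif}: extra 5' bases = {extra or '(none)'}")
```

Listing the extra bases per motif is one simple way to nominate candidate positions for follow-up, under the assumption that the difference in NPC-specific effect tracks these bases rather than other features of the perturbation.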