Deciphering the regulatory logic of enhancer elements that control tissue-specific gene expression is a long-standing goal in gene regulation and synthetic biology. Although specific DNA motifs such as transcription-factor binding sites are known, the arrangement required for spatiotemporal control (their spacing, orientation and combination) has remained elusive. Taskiran et al. and de Almeida et al. have now published computational strategies in Nature for understanding and designing such regulatory sequences, and both validated their designs in vivo.
Taskiran et al. used in silico evolution guided by deep neural networks, optimizing random sequences stepwise to identify the features required for enhancer function. They designed fully synthetic enhancers, more than 75% of which were active in the neural cells of transgenic flies. They also used their approach to generate enhancers that range from versions containing only the minimal required elements to more complex versions that are active in different cell types. By defining a code for the interaction of transcription factors, the authors designed human enhancers that function in vitro. In a similar strategy, de Almeida et al. combined deep learning with transfer learning: their models were trained on data from the single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq) and then refined with in vivo activity data from enhancer assays. Using these sequence-to-activity models, they designed 40 synthetic enhancers for a variety of Drosophila tissues. Remarkably, 78% of these enhancers were active and 68% were capable of driving tissue-specific expression.
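To make the idea of in silico evolution concrete, the following is a minimal Python sketch of a greedy, stepwise optimization of a random DNA sequence. It is not the authors' implementation: the scoring function `toy_score`, the motif `TAATTA`, the sequence length and the step limit are all hypothetical stand-ins chosen for illustration. In the published work the score would come from a trained deep neural network rather than from motif matching.

```python
import random

BASES = "ACGT"
MOTIF = "TAATTA"  # hypothetical motif, chosen only for illustration


def toy_score(seq: str) -> float:
    """Stand-in for a trained sequence-to-activity model (an assumption).

    Returns the best fraction of positions matching MOTIF over all windows.
    """
    k = len(MOTIF)
    return max(
        sum(a == b for a, b in zip(seq[i:i + k], MOTIF)) / k
        for i in range(len(seq) - k + 1)
    )


def evolve(seq, score_fn, max_steps=20):
    """Greedy in silico evolution: apply the best single-base change per step."""
    trace = [score_fn(seq)]
    for _ in range(max_steps):
        best_seq, best_score = seq, trace[-1]
        # Score every possible single-nucleotide substitution.
        for pos in range(len(seq)):
            for base in BASES:
                if base == seq[pos]:
                    continue
                cand = seq[:pos] + base + seq[pos + 1:]
                s = score_fn(cand)
                if s > best_score:
                    best_seq, best_score = cand, s
        if best_seq == seq:  # no single mutation improves the score
            break
        seq = best_seq
        trace.append(best_score)
    return seq, trace


rng = random.Random(0)  # fixed seed so the run is reproducible
start = "".join(rng.choice(BASES) for _ in range(50))
final, trace = evolve(start, toy_score)
```

Here `trace` records the score after each accepted mutation, so inspecting it shows which substitutions moved the sequence toward a higher predicted activity. With a real model, the same loop would reveal which sequence features the model rewards at each step.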