Similarity regression predicts evolution of transcription factor sequence specificity

Lambert, Samuel A.; Yang, Ally W. H.; Sasse, Alexander; Cowley, Gwendolyn; Albu, Mihai; Caddick, Mark X.; Morris, Quaid D.; Weirauch, Matthew T.; Hughes, Timothy R.

doi:10.1038/s41588-019-0411-1

Article
Published: 27 May 2019

Nature Genetics volume 51, pages 981–989 (2019)

Abstract

Transcription factor (TF) binding specificities (motifs) are essential for the analysis of gene regulation. Accurate prediction of TF motifs is critical, because it is infeasible to assay all TFs in all sequenced eukaryotic genomes. Controversy persists over the degree of motif diversification among related species, in part because of uncertainty in motif prediction methods. Here we describe similarity regression, a significantly improved method for predicting motifs, which we use to update and expand the Cis-BP database. Similarity regression inherently quantifies TF motif evolution, and shows that previous claims of near-complete conservation of motifs between human and Drosophila are inflated, with nearly half of the motifs in each species absent from the other, largely due to extensive divergence in C2H2 zinc finger proteins. We conclude that diversification in DNA-binding motifs is pervasive, and present a new tool and updated resource to study TF diversity and gene regulation across eukaryotes.

**Fig. 1: Overview of the similarity regression method.**

**Fig. 2: Similarity regression classification of TFs for highly similar or dissimilar sequence specificities.**

**Fig. 3: PBM data from the plant *C. sativa* and model fungi *A. nidulans* and *N. crassa* for TFs with conserved and dissimilar motifs.**

**Fig. 4: Conservation of TF motifs within major eukaryotic kingdoms.**

**Fig. 5: Motif divergence of TF families in metazoans and plants.**

**Fig. 6: TF motif conservation between human and *Drosophila melanogaster*.**

Data availability

New PBM data and motifs are deposited in GEO (accession number GSE121420) and the Cis-BP database (v.2.0; http://cisbp.ccbr.utoronto.ca/).

Code availability

The Similarity Regression code, and examples, are available on GitHub (https://github.com/smlmbrt/SimilarityRegression).

References

Stormo, G. D. DNA binding sites: representation and discovery. Bioinformatics 16, 16–23 (2000).
Mathelier, A. et al. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 44, D110–D115 (2016).
Weirauch, M. T. et al. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158, 1431–1443 (2014).
Pelossof, R. et al. Affinity regression predicts the recognition code of nucleic acid-binding proteins. Nat. Biotechnol. 33, 1242–1249 (2015).
Christensen, R. G. et al. Recognition models to predict DNA-binding specificities of homeodomain proteins. Bioinformatics 28, i84–i89 (2012).
Persikov, A. V. et al. A systematic survey of the Cys2His2 zinc finger DNA-binding landscape. Nucleic Acids Res. 43, 1965–1984 (2015).
Najafabadi, H. S. et al. C2H2 zinc finger proteins greatly expand the human regulatory lexicon. Nat. Biotechnol. 33, 555–562 (2015).
Nitta, K. R. et al. Conservation of transcription factor binding specificities across 600 million years of bilateria evolution. eLife 4, e04837 (2015).
Liu, H., Chang, L. H., Sun, Y., Lu, X. & Stubbs, L. Deep vertebrate roots for mammalian zinc finger transcription factor subfamilies. Genome Biol. Evol. 6, 510–525 (2014).
Nadimpalli, S., Persikov, A. V. & Singh, M. Pervasive variation of transcription factor orthologs contributes to regulatory network evolution. PLoS Genet. 11, e1005011 (2015).
Lynch, V. J. & Wagner, G. P. Resurrecting the role of transcription factor change in developmental evolution. Evolution 62, 2131–2154 (2008).
Baker, C. R., Tuch, B. B. & Johnson, A. D. Extensive DNA-binding specificity divergence of a conserved transcription regulator. Proc. Natl Acad. Sci. USA 108, 7493–7498 (2011).
Sayou, C. et al. A promiscuous intermediate underlies the evolution of LEAFY DNA binding specificity. Science 343, 645–648 (2014).
Morgunova, E. et al. Structural insights into the DNA-binding specificity of E2F family transcription factors. Nat. Commun. 6, 10050 (2015).
McKeown, A. N. et al. Evolution of DNA specificity in a transcription factor family produced a new gene regulatory module. Cell 159, 58–68 (2014).
Najafabadi, H. S. et al. Non-base-contacting residues enable kaleidoscopic evolution of metazoan C2H2 zinc finger DNA binding. Genome Biol. 18, 167 (2017).
Berger, M. F. et al. Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities. Nat. Biotechnol. 24, 1429–1435 (2006).
Weirauch, M. T. et al. Evaluation of methods for modeling transcription factor sequence specificity. Nat. Biotechnol. 31, 126–134 (2013).
Love, J. J. et al. Structural basis for DNA bending by the architectural transcription factor LEF-1. Nature 376, 791–795 (1995).
Marmorstein, R., Carey, M., Ptashne, M. & Harrison, S. C. DNA recognition by GAL4: structure of a protein–DNA complex. Nature 356, 408–414 (1992).
King, D. A., Zhang, L., Guarente, L. & Marmorstein, R. Structure of a HAP1–DNA complex reveals dramatically asymmetric DNA binding by a homodimeric protein. Nat. Struct. Biol. 6, 64–71 (1999).
Persikov, A. V. & Singh, M. De novo prediction of DNA-binding specificities for Cys2His2 zinc finger proteins. Nucleic Acids Res. 42, 97–108 (2014).
Gupta, A. et al. An improved predictive recognition model for Cys2-His2 zinc finger proteins. Nucleic Acids Res. 42, 4800–4812 (2014).
de Mendoza, A. et al. Transcription factor evolution in eukaryotes and the assembly of the regulatory toolkit in multicellular lineages. Proc. Natl Acad. Sci. USA 110, E4858–E4866 (2013).
Narasimhan, K. et al. Mapping and analysis of Caenorhabditis elegans transcription factor sequence specificities. eLife 4, e06967 (2015).
Robinson-Rechavi, M., Maina, C. V., Gissendanner, C. R., Laudet, V. & Sluder, A. Explosive lineage-specific expansion of the orphan nuclear receptor HNF4 in nematodes. J. Mol. Evol. 60, 577–586 (2005).
Stracke, R., Werber, M. & Weisshaar, B. The R2R3-MYB gene family in Arabidopsis thaliana. Curr. Opin. Plant Biol. 4, 447–456 (2001).
Grove, C. A. et al. A multiparameter network reveals extensive divergence between C. elegans bHLH transcription factors. Cell 138, 314–327 (2009).
Reinke, A. W., Baek, J., Ashenberg, O. & Keating, A. E. Networks of bZIP protein–protein interactions diversified over a billion years of evolution. Science 340, 730–734 (2013).
Jolma, A. et al. Multiplexed massively parallel SELEX for characterization of human transcription factor binding specificities. Genome Res. 20, 861–873 (2010).
Noyes, M. B. et al. A systematic characterization of factors that regulate Drosophila segmentation via a bacterial one-hybrid system. Nucleic Acids Res. 36, 2547–2560 (2008).
Zhu, L. J. et al. FlyFactorSurvey: a database of Drosophila transcription factor binding specificities determined using the bacterial one-hybrid system. Nucleic Acids Res. 39, D111–D117 (2011).
MacPherson, S., Larochelle, M. & Turcotte, B. A fungal family of transcriptional regulators: the zinc cluster proteins. Microbiol. Mol. Biol. Rev. 70, 583–604 (2006).
Lambert, S. A. et al. The human transcription factors. Cell 175, 598–599 (2018).
Ecco, G., Imbeault, M. & Trono, D. KRAB zinc finger proteins. Development 144, 2719–2729 (2017).
Schmitges, F. W. et al. Multiparameter functional diversity of human C2H2 zinc finger proteins. Genome Res. 26, 1742–1752 (2016).
Noyes, M. B. et al. Analysis of homeodomain specificities allows the family-wide prediction of preferred recognition sites. Cell 133, 1277–1289 (2008).
Wilkinson, S. P. aphid: an R package for analysis with profile hidden Markov models. Bioinformatics https://doi.org/10.1093/bioinformatics/btz159 (2019).
Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci. USA 89, 10915–10919 (1992).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2013); http://www.R-project.org/
Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010).
Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 28, 1–26 (2008).
Gorodkin, J. Comparing two K-category assignments by a K-category correlation coefficient. Comput. Biol. Chem. 28, 367–374 (2004).
Sagendorf, J. M., Berman, H. M. & Rohs, R. DNAproDB: an interactive tool for structural analysis of DNA–protein complexes. Nucleic Acids Res. 45, W89–W97 (2017).
Berman, H. M. et al. The protein data bank. Nucleic Acids Res. 28, 235–242 (2000).
HMMER: biosequence analysis using profile hidden Markov models (Howard Hughes Medical Institute, 2015); http://hmmer.org/
Finn, R. D. et al. The Pfam protein families database. Nucleic Acids Res. 38, D211–D222 (2010).
Lambert, S. A., Albu, M., Hughes, T. R. & Najafabadi, H. S. Motif comparison based on similarity of binding affinity profiles. Bioinformatics 32, 3504–3506 (2016).
Yin, Y. et al. Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356, eaaj2239 (2017).
O’Malley, R. C. et al. Cistrome and epicistrome features shape the regulatory DNA landscape. Cell 165, 1280–1292 (2016).
Barazandeh, M., Lambert, S. A., Albu, M. & Hughes, T. R. Comparison of ChIP-seq data and a reference motif set for human KRAB C2H2 zinc finger proteins. G3 (Bethesda) 8, 219–229 (2018).
Hume, M. A., Barrera, L. A., Gisselbrecht, S. S. & Bulyk, M. L. UniPROBE, update 2015: new tools and content for the online database of protein-binding microarray data on protein–DNA interactions. Nucleic Acids Res. 43, D117–D122 (2015).
Matys, V. et al. TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 34, D108–D110 (2006).
Khan, A. et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. 46, D1284 (2018).
Kulakovskiy, I. V. et al. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-seq analysis. Nucleic Acids Res. 46, D252–D259 (2018).
Sigrist, C. J. et al. PROSITE: a documented database using patterns and profiles as motif descriptors. Brief. Bioinform. 3, 265–274 (2002).
Kumar, S., Stecher, G., Suleski, M. & Hedges, S. B. Timetree: a resource for timelines, timetrees, and divergence times. Mol. Biol. Evol. 34, 1812–1819 (2017).
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
Lam, K. N., van Bakel, H., Cote, A. G., van der Ven, A. & Hughes, T. R. Sequence specificity is obtained from the majority of modular C2H2 zinc-finger arrays. Nucleic Acids Res. 39, 4680–4690 (2011).
Zhao, Y. & Stormo, G. D. Quantitative analysis demonstrates most transcription factors require only simple models of specificity. Nat. Biotechnol. 29, 480–483 (2011).

Acknowledgements

We thank Xiaoting Chen and Mario Pujato for computational support. S.A.L. was funded by a Natural Sciences and Engineering Research Council of Canada Doctoral Fellowship. T.R.H. holds the Billes Chair of Medical Research at the University of Toronto. This work was supported by a Canadian Institutes of Health Research grant (FDN-148403) and a Natural Sciences and Engineering Research Council of Canada grant (RPGIN-2016-05643) to T.R.H., National Institutes of Health (NIH) grants R01 AR073228, R01 NS099068 and R01 GM055479, Lupus Research Alliance ‘Novel Approaches’, CCRF Endowed Scholar and CCHMC CpG Award 53553 to M.T.W. and a Canadian Institutes of Health Research Operating grant (MOP-125894) to Q.D.M. and T.R.H.

Author information

Authors and Affiliations

Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
Samuel A. Lambert, Alexander Sasse, Quaid D. Morris & Timothy R. Hughes
Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Ally W. H. Yang, Mihai Albu, Quaid D. Morris & Timothy R. Hughes
Institute of Integrative Biology, University of Liverpool, Liverpool, UK
Gwendolyn Cowley & Mark X. Caddick
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
Quaid D. Morris
Canadian Institutes For Advanced Research (CIFAR) Artificial Intelligence Chair, Vector Institute, Toronto, Ontario, Canada
Quaid D. Morris
Ontario Institute of Cancer Research, Toronto, Ontario, Canada
Quaid D. Morris
Divisions of Biomedical Informatics and Developmental Biology, Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
Matthew T. Weirauch
Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
Matthew T. Weirauch
CIFAR, Toronto, Ontario, Canada
Timothy R. Hughes

Contributions

S.A.L., M.T.W. and T.R.H. conceived the study and oversaw it to completion. S.A.L. analyzed the data, made the figures and performed all computational analyses except for experiments for which A.S. reimplemented the affinity regression pipeline and applied it to new data. Q.D.M. guided the computational and statistical analyses. M.A., S.A.L. and M.T.W. maintained and updated the Cis-BP database. G.C. and M.X.C. produced the clones for Aspergillus PBM experiments. A.W.H.Y. produced the remainder of the clones and performed all PBM experiments. S.A.L. and T.R.H. wrote the manuscript with feedback and approval from all authors.

Corresponding author

Correspondence to Timothy R. Hughes.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Additional similarity regression model building and selection details.

Four similarity regression (SR) models are made for each TF family, and compared to alignment percent identity to identify the best similarity regression model. The best model is selected after cross-validation, and threshold selection by Matthews correlation coefficient. This figure uses homeodomains as an example; this family has 465 PBM constructs, yielding 13,832 highly similar, 22,714 ambiguous and 71,793 dissimilar pairs.
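
The threshold-selection step can be illustrated with a minimal Python sketch. It assumes a K-category generalization of the Matthews correlation coefficient computed from a confusion matrix, and a simple grid search over two score thresholds that split TF pairs into highly similar, ambiguous and dissimilar classes. The function names, the grid search and the toy scores are illustrative assumptions, not the authors' implementation, which is available in the GitHub repository named under Code availability.

```python
from math import sqrt

# Class names are illustrative labels for the three categories in the caption.
CLASSES = ("highly_similar", "ambiguous", "dissimilar")


def classify(score, lower, upper):
    """Assign a TF pair to a class from its score and two thresholds."""
    if score >= upper:
        return "highly_similar"
    if score < lower:
        return "dissimilar"
    return "ambiguous"


def multiclass_mcc(confusion):
    """K-category Matthews correlation from a square confusion matrix.

    confusion[i][j] counts pairs whose true class is i and predicted class is j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    true_counts = [sum(confusion[i]) for i in range(k)]
    pred_counts = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    numerator = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    return numerator / denominator if denominator else 0.0


def best_thresholds(true_labels, scores, candidates):
    """Grid-search (lower, upper) pairs and keep the first pair with the highest MCC."""
    best = (None, None, -1.0)
    for lower in candidates:
        for upper in candidates:
            if lower > upper:
                continue
            confusion = [[0] * len(CLASSES) for _ in CLASSES]
            for label, score in zip(true_labels, scores):
                predicted = classify(score, lower, upper)
                confusion[CLASSES.index(label)][CLASSES.index(predicted)] += 1
            mcc = multiclass_mcc(confusion)
            if mcc > best[2]:
                best = (lower, upper, mcc)
    return best


# Toy data (hypothetical scores, not values from the paper).
labels = ["highly_similar", "highly_similar", "ambiguous",
          "ambiguous", "dissimilar", "dissimilar"]
scores = [0.9, 0.8, 0.5, 0.4, 0.1, 0.05]
print(best_thresholds(labels, scores, [0.2, 0.45, 0.6]))
```

In this toy case the pair of thresholds that separates all three classes cleanly is preferred because it is the only candidate with a perfect correlation.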

Supplementary Figure 2 Application of similarity regression to TFs with an array of DBDs.

(a) DBDs are first aligned to find the best (maximizing amino acid identity) ungapped and internal alignment. Examples of permissible and non-permissible alignment configurations are shown. (b) Alignments are then scored by calculating positional protein similarity features in each finger of a DBD array (for example, C2H2 ZFs), and combined into a single representation by averaging the features by the length of the longest DBD array.
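
The two steps in this caption can be sketched in Python as follows. The sketch assumes that "internal" means the shorter array must sit entirely within the longer one with no overhang, that each finger is already aligned to a common length, and that the positional feature is simple identity (1 or 0). All three are simplifying assumptions made for illustration, and the function names are hypothetical.

```python
def finger_identity(a, b):
    """Per-position identity (1.0 or 0.0) between two equal-length aligned fingers."""
    return [1.0 if x == y and x != "-" else 0.0 for x, y in zip(a, b)]


def best_internal_offset(short, long_):
    """Slide the shorter array along the longer one without gaps or overhang
    and return the offset with the most identical residues."""
    best_offset, best_score = 0, -1.0
    for offset in range(len(long_) - len(short) + 1):
        score = sum(
            sum(finger_identity(finger, long_[offset + i]))
            for i, finger in enumerate(short)
        )
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


def array_features(array_a, array_b):
    """Sum per-position features over aligned fingers, then divide by the
    number of fingers in the longer array."""
    short, long_ = sorted((array_a, array_b), key=len)
    offset = best_internal_offset(short, long_)
    totals = [0.0] * len(long_[0])
    for i, finger in enumerate(short):
        for pos, value in enumerate(finger_identity(finger, long_[offset + i])):
            totals[pos] += value
    return offset, [t / len(long_) for t in totals]


# Toy four-residue "fingers" (hypothetical sequences, not real C2H2 ZFs).
two_fingers = ["ACDE", "FGHI"]
three_fingers = ["KKKK", "ACDE", "FGHK"]
offset, features = array_features(two_fingers, three_fingers)
print(offset, features)
```

Dividing by the length of the longer array means that unmatched fingers pull the averaged features down, so a short array that matches part of a long array perfectly is still scored as only partly similar.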

Supplementary Figure 3 Comparison of similarity regression weights to known DNA-contacting residues.

(a) Homeodomain, or (b) C2H2 ZF similarity regression (SR) weights are compared to DNAproDB contact frequencies for DNA backbone, major and minor groove contacts, using partial Pearson correlations. TF amino acid sequence diversity (for the similarity regression model training sequences) is displayed, for reference (above). Figures comparing contact frequencies with similarity regression weights are provided for all similarity regression models in Supplementary Data 1. (c) Partial correlations for 25 TF families with similarity regression models and structural information in DNAproDB (Sagendorf, J.M., et al., Nucleic Acids Res. 45, W89-W97, 2017) are displayed and coloured according to the statistical significance, as -log₁₀(p-value).
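
For readers unfamiliar with partial Pearson correlation, a minimal first-order version is sketched below in Python. The caption does not state which covariate is controlled for, so `z` here is a hypothetical placeholder, and the toy vectors are not data from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_pearson(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy example: x could stand for per-position model weights, y for contact
# frequencies, and z for a covariate (all values hypothetical).
weights = [1, 2, 3, 4]
contacts = [2, 4, 6, 8]
covariate = [1, -1, -1, 1]
print(partial_pearson(weights, contacts, covariate))
```

When the covariate is uncorrelated with both variables, as in this toy case, the partial correlation equals the ordinary Pearson correlation.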

Supplementary Figure 4 Comparison of similarity regression to percent identity at predicting TF pairs with dissimilar specificities.

(a) Scatter plot comparing the fraction of all dissimilar TF pairs captured by the 95% NPV threshold (specificity) for 17 TF families that have dissimilar TFs. (b) Scatter plot showing Matthews correlation coefficient, which summarizes multi-class (for highly similar, ambiguous, and dissimilar TF sequence specificity) classification accuracy for 29 TF families. In both panels, points are sized according to the number of PBM experiments used for training, and coloured according to the AA features used in each model.
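
The quantity in panel a can be made concrete with a small Python sketch. It assumes that pairs scoring at or below a threshold are called dissimilar, that negative predictive value (NPV) is the fraction of those calls that are truly dissimilar, and that the threshold is the largest candidate whose NPV meets the target. These choices, the function name and the toy values are illustrative assumptions rather than the authors' procedure.

```python
def npv_threshold(scores, is_dissimilar, target_npv=0.95):
    """Return (threshold, specificity) for the largest score threshold whose
    NPV is at least target_npv, or None if no threshold qualifies.

    Pairs with score <= threshold are called dissimilar. Specificity is the
    fraction of all truly dissimilar pairs captured by those calls.
    """
    total_dissimilar = sum(is_dissimilar)
    if total_dissimilar == 0:
        return None
    best = None
    for threshold in sorted(set(scores)):
        called = [d for s, d in zip(scores, is_dissimilar) if s <= threshold]
        true_negatives = sum(called)
        if true_negatives / len(called) >= target_npv:
            best = (threshold, true_negatives / total_dissimilar)
    return best


# Toy data (hypothetical): the fourth pair is similar despite a low score.
pair_scores = [0.1, 0.2, 0.3, 0.4, 0.5]
pair_is_dissimilar = [True, True, True, False, True]
print(npv_threshold(pair_scores, pair_is_dissimilar))
```

Here the threshold stops at 0.3 because including the pair at 0.4 would drop the NPV to 75%, so three of the four dissimilar pairs are captured.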

Supplementary Figure 5 Comparison of similarity regression scores with experimentally determined similarity in DNA sequence specificity, for new PBM data.

Predicted TF similarity (similarity regression (SR) score) and actual DNA-binding similarity (PBM E-score overlap) are plotted for 275 PBM experiments, vs the most similar (by similarity regression score) TF in the training set. Results are displayed for each TF family with more than three TFs. Linear fit is shown, with corresponding R² value. Points are coloured by their actual TF similarity based on family-specific E-score overlap thresholds.

Supplementary Figure 6 Comparison of predicted Z-score profiles for similarity regression, affinity regression and percent identity.

(a) Individual points show the Pearson correlation coefficient of predicted vs. actual Z-score profiles for 315 TFs (those among the 340 that have Similarity Regression models), for the reconstruction methods (similarity regression (SR), affinity regression (AR) and percent identity) tested. Reconstruction methods are grouped by whether they are a mixture of one NN, or multiple (Z-score reconstructions) TF profiles, as indicated by grey bars above. Points are coloured by TF family (see legend). (b-d) Individual results for the three most abundant TF families in the test set are plotted separately: (b) C2H2 ZFs (n = 34), (c) Homeodomain (n = 17), and (d) zinc cluster (n = 107). Boxplots are defined with center line, median; box limits, upper and lower quartiles; whiskers, smallest or largest data point within 1.5× interquartile range.

Supplementary Figure 7 Comparison of similarity regression predicted motifs with C2H2 and homeodomain-recognition code predictions.

(a) Individual points show the motif similarity of predicted vs. actual PFMs, for the Homeodomain motif prediction methods tested (recognition codes, similarity regression (SR), and percent identity). Boxplots summarizing the predictions for 17 TFs are coloured by the motif prediction method and defined with center line, median; box limits, upper and lower quartiles; whiskers, smallest or largest data point within 1.5× interquartile range. (b) Motif similarity of predicted vs. actual PFMs for C2H2 ZF prediction methods is displayed as in panel a, for 34 TFs.

Supplementary Figure 8 Increase in percentage of TFs with a predicted motif in CIS-BP (Similarity Regression compared to percent identity).

(a) The percentage of TFs with a ‘Direct’ (that is, experimentally determined) (black bars), or predicted (grey bars) motif is plotted for the 50 largest TF families in CIS-BP. Increase in percentage due to similarity regression (SR) models is shown by red bars. Total number of TFs encompassed is shown at right. (b) Motif coverage in well-studied eukaryotes, plotted as in panel a. Relationships between the species are represented by divergence time (million years ago) obtained from the TimeTree database (Kumar, S., et al., Mol Biol Evol. 34, 1812-1819, 2017). The major clades of fungi, metazoans, and plants are coloured in red, blue, and green respectively.

Supplementary Figure 9 Motif divergence of TF families in fungi.

Classifications of motif similarity are shown as in Fig. 5. The outer ring of each nested pie chart represents the similarities of Saccharomyces cerevisiae TFs with respect to the species being compared (displayed along the phylogeny). The inner ring represents the similarities of the compared species' TFs with respect to S. cerevisiae. Branch length is the divergence time between species (millions of years).

Supplementary Figure 10 Motif similarity between corresponding Drosophila and human TFs (highest scoring BLASTP hits with Drosophila as query).

Motif similarity was calculated between 322 fly (1128 PWMs) and 251 human TFs (with 2177 PWMs) with experimentally determined motifs, using MoSBAT energy scores (Lambert, S.A., et al., Bioinformatics. 32, 3504–3506, 2016). (a) The maximum motif similarity for all pairs of human and fly TFs (that is, considering that there are often multiple motifs per TF) is displayed as a boxplot, according to the Similarity Regression-predicted TF similarity for each NN pair. (b) Similar plot to panel a, but only HT-SELEX data (306 fly PWMs and 410 human PWMs) are used in the analysis. Boxplots are defined with: center line, median; box limits, upper and lower quartiles; whiskers, smallest or largest data point within 1.5× interquartile range; points, outliers.

Supplementary information

Supplementary Information

Supplementary Figs. 1–10

Reporting Summary

Supplementary Table 1

Supplementary Data 1

About this article

Cite this article

Lambert, S.A., Yang, A.W.H., Sasse, A. et al. Similarity regression predicts evolution of transcription factor sequence specificity. Nat Genet 51, 981–989 (2019). https://doi.org/10.1038/s41588-019-0411-1

Received: 13 November 2018
Accepted: 04 April 2019
Published: 27 May 2019
Issue Date: June 2019
DOI: https://doi.org/10.1038/s41588-019-0411-1
