  • Matters Arising

Reusability report: Compressing regulatory networks to vectors for interpreting gene expression and genetic variants

The Original Article was published on 27 July 2020

Fig. 1: Modified DeepExpression framework with GEEK embedding vectors.
Fig. 2: Identifying key regions around EPAS1 in HUVEC cells for high-altitude adaptation by attention score.

Data availability

The data used in our K562 and HUVEC studies with the retrained GEEK model are available at https://zenodo.org/record/4797001#.YK3HLS21FN0 (ref. 11). All the GEEK data (ref. 2) are available at http://yiplab.cse.cuhk.edu.hk/geek/, https://zenodo.org/record/3040059, http://www.ncbi.nlm.nih.gov/geo/ (accession no. GSE145774) and the Genome Sequence Archive (project no. CRA002025).

Code availability

GEEK is freely available at https://codeocean.com/capsule/3404879/tree/v1 (ref. 13). Modified DeepExpression for reproduction is freely available at https://github.com/wanwenzeng/DeepExpression (ref. 14). vPECA is freely available at https://github.com/jxxin22/vPECA (ref. 15). Details of the methods are available in refs. 2, 3 and 4.

References

  1. Dong, Y., Chawla, N. V. & Swami, A. metapath2vec: scalable representation learning for heterogeneous networks. In Proc. 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 135–144 (ACM, 2017).

  2. Cao, Q. et al. A unified framework for integrative study of heterogeneous gene regulatory mechanisms. Nat. Mach. Intell. 2, 447–456 (2020).

  3. Zeng, W., Wang, Y. & Jiang, R. Integrating distal and proximal information to predict gene expression via a densely connected convolutional neural network. Bioinformatics 36, 496–503 (2020).

  4. Xin, J. et al. Chromatin accessibility landscape and regulatory network of high-altitude hypoxia adaptation. Nat. Commun. 11, 4928 (2020).

  5. Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems Vol. 30, 5998–6008 (NIPS, 2017).

  6. Lu, D. et al. Ancestral origins and genetic history of Tibetan Highlanders. Am. J. Hum. Genet. 99, 580–594 (2016).

  7. Weir, B. S. & Cockerham, C. C. Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370 (1984).

  8. Peng, Y. et al. Genetic variations in Tibetan populations and high-altitude adaptation at the Himalayas. Mol. Biol. Evol. 28, 1075–1081 (2011).

  9. Simonson, T. S. et al. Genetic evidence for high-altitude adaptation in Tibet. Science 329, 72–75 (2010).

  10. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  11. Zeng, W., Xin, J., Jiang, R. & Wang, Y. Compressing regulatory networks to vectors for interpreting gene expression and genetic variants (Zenodo, 2021); https://doi.org/10.5281/zenodo.4797001

  12. Li, X. et al. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat. Genet. 52, 969–983 (2020).

  13. Cao, Q. et al. GEEK (Gene Expression Embedding frameworK) demo (GM12878, chromosome 1) (Code Ocean, 2020); https://doi.org/10.24433/CO.1518993.V1

  14. Zeng, W. wanwenzeng/DeepExpression: DeepExpression (Zenodo, 2021); https://doi.org/10.5281/zenodo.4798333

  15. Xin, J. vPECA (Zenodo, 2021); https://doi.org/10.5281/zenodo.4797172

Acknowledgements

We acknowledge funding from the National Key Research and Development Program of China (grants 2018YFC0910404 and 2020YFA0712402), the National Natural Science Foundation of China (grants 11688101, 12025107, 11871463, 61621003, 61873141, 61721003, 61573207 and 62003178) and a grant from the Guoqiang Institute, Tsinghua University.

Author information

Contributions

Y.W. and R.J. conceived and supervised the project. W.Z. and J.X. designed the experimental/analytical approach and performed numerical experiments and data analysis. All authors wrote, revised and contributed to the final manuscript.

Corresponding authors

Correspondence to Rui Jiang or Yong Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature Machine Intelligence thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zeng, W., Xin, J., Jiang, R. et al. Reusability report: Compressing regulatory networks to vectors for interpreting gene expression and genetic variants. Nat Mach Intell 3, 576–580 (2021). https://doi.org/10.1038/s42256-021-00371-6

  • DOI: https://doi.org/10.1038/s42256-021-00371-6
