Comment | Published:

DNA.Land is a framework to collect genomes and phenomes in the era of abundant genetic information

Nature Genetics volume 50, pages 160–165 (2018)

Creating large genome/phenome collections can require consortium-scale resources. DNA.Land is a digital biobank that collects genetic data from individuals tested by consumer genomic companies using a fraction of the resources of traditional studies.





Y.E. holds a Career Award at the Scientific Interface from the Burroughs Wellcome Fund. This study was supported by a generous gift from Andria and Paul Heafy to the Erlich Laboratory, funding from the National Breast Cancer Coalition, and support from Amazon Web Services’ Education Grants. J.Y. is supported by the Columbia University Integrative Graduate Education and Research Traineeship (IGERT), funded by NSF research grant number 1144854. We thank the tens of thousands of DNA.Land participants—especially our early adopters, whose feedback was integral in our efforts to improve the site—and genetic genealogist C. Moore for her valuable advice. We welcome inquiries by researchers who are interested in collecting genotype and phenotype information with our resource.

Author information

Author notes

  1. Jie Yuan and Assaf Gordon contributed equally to this work.


  1. New York Genome Center, New York, NY, USA

    Jie Yuan, Assaf Gordon, Daniel Speyer, Richard Aufrichtig, Dina Zielinski, Joseph Pickrell & Yaniv Erlich

  2. Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY, USA

    Jie Yuan, Daniel Speyer & Yaniv Erlich

  3. Center for Computational Biology and Bioinformatics (C2B2), Department of Systems Biology, Columbia University, New York, NY, USA

    Joseph Pickrell




J.Y., A.G., D.S., D.Z., J.P., and Y.E. coded the DNA.Land website. A.G. is the chief architect of DNA.Land. J.Y., A.G., and Y.E. wrote the manuscript and analyzed the data. Y.E. conceived the website. R.A. provided technical assistance. J.P. and Y.E. supervised the study.

Competing interests

Y.E. is the Chief Science Officer of J.P. is the CEO and co-founder of Gencove.

Corresponding author

Correspondence to Yaniv Erlich.

Integrated Supplementary Information

  1. Supplementary Figure 1 User activity statistics of DNA.Land.

    a, The number of page visits per week to each type of report on DNA.Land: Ancestry (Red), Relative Matching (Light Blue), Relatives of Relatives (Green), and Trait Report pages (Dark Blue). b, The distribution of new user registrations to DNA.Land by day of the week. c, The percentage of page visits by DNA.Land users by day of the week.

  2. Supplementary Figure 2 Participation rates in DNA.Land’s trait survey feature.

    a, Per-week and cumulative numbers of total surveys completed by users. b, Per-week and cumulative numbers of total questions answered by users. c, The distribution of time required by users to complete each type of survey. The surveyed traits are as follows: Chronotype (Orange), Coffee Consumption (Blue), Myopia (Red), Eye Color (Green), Neuroticism (Pink), Educational Attainment (Purple), and Height (Yellow).

  3. Supplementary Figure 3 Relative matching statistics among DNA.Land users.

    a, The distribution of the number of inferred relatives among DNA.Land users based on matching IBD segments. Only 10.5% of DNA.Land users have no detected relatives. b, The distribution of degrees of relatedness among matching pairs of DNA.Land users, as calculated by the ERSA algorithm. A degree of 0 indicates either an identical twin or a duplicate genotype file.

  4. Supplementary Figure 4 Demographics of DNA.Land users.

    a, Self-reported age distribution in DNA.Land. b, Ancestry composition of DNA.Land users with aggregated ancestry categories: Northern European (Red), Northeast European (Orange), Other European (Light Orange), Ashkenazi (Yellow), African (Yellow-Green), South Asian (Light Green), East Asian (Turquoise), Native American (Blue). Each column represents a single user, and stacked bars on each column indicate the distribution of ancestry groups for a given user. Users are sorted by decreasing percentage of their largest ancestry group. c, Geographic location of DNA.Land users, as determined by IP address.

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–4, Supplementary Tables 1–3 and Supplementary Note.



