Data mining

Definition

Data mining is the process of extracting potentially useful information from data sets. It uses a suite of methods to organise, examine and combine large data sets, including machine learning, visualisation methods and statistical analyses. Data mining is used in computational biology and bioinformatics to detect trends or patterns without knowledge of the meaning of the data.

News and Comment

  • Editorial

    Citation of prior publications is essential both to claim that knowledge is needed in your area of research and to establish that you have indeed advanced understanding substantially in that area. The journal deplores and will decline to consider manuscripts that fail to identify the key findings of published articles and that—deliberately or inadvertently—omit the reason the prior work is cited.

  • News and Views

    The Uncultivated Bacteria and Archaea dataset is a foundational collection of 7,903 genomes from uncultivated microorganisms. It highlights how microbial diversity is readily recovered using current tools and existing metagenomic datasets to help piece together the tree of life.

    Lindsey M. Solden & Kelly C. Wrighton
    Nature Microbiology 2, 1458–1459
  • News and Views

    An innovative study analyzing genetic association across tree-structured routine healthcare data in the UK Biobank represents a new branch on a tree that is poised to grow rapidly. This tree is likely to offer new kinds of insights into how genome variation relates to human health and disease, and into the very nature of human disease.

    Nancy J Cox
    Nature Genetics 49, 1295–1296
  • Comments and Opinion

    Yasset Perez-Riverol, Mingze Bai, Felipe da Veiga Leprevost, Silvano Squizzato, Young Mi Park, Kenneth Haug, Adam J Carroll, Dylan Spalding, Justin Paschall, Mingxun Wang, Noemi del-Toro, Tobias Ternent, Peng Zhang, Nicola Buso, Nuno Bandeira, Eric W Deutsch, David S Campbell, Ronald C Beavis, Reza M Salek, Ugis Sarkans, Robert Petryszak, Maria Keays, Eoin Fahy, Manish Sud, Shankar Subramaniam, Ariana Barbera, Rafael C Jiménez, Alexey I Nesvizhskii, Susanna-Assunta Sansone, Christoph Steinbeck, Rodrigo Lopez, Juan A Vizcaíno, Peipei Ping & Henning Hermjakob
    Nature Biotechnology 35, 406–409
  • Comments and Opinion

    John Vivian, Arjun Arkal Rao, Frank Austin Nothaft, Christopher Ketchum, Joel Armstrong, Adam Novak, Jacob Pfeil, Jake Narkizian, Alden D Deran, Audrey Musselman-Brown, Hannes Schmidt, Peter Amstutz, Brian Craft, Mary Goldman, Kate Rosenbloom, Melissa Cline, Brian O'Connor, Megan Hanna, Chet Birger, W James Kent, David A Patterson, Anthony D Joseph, Jingchun Zhu, Sasha Zaranek, Gad Getz, David Haussler & Benedict Paten
    Nature Biotechnology 35, 314–316