Data mining

Definition

Data mining is the process of extracting potentially useful information from data sets. It uses a suite of methods to organise, examine and combine large data sets, including machine learning, visualisation methods and statistical analyses. Data mining is used in computational biology and bioinformatics to detect trends or patterns without knowledge of the meaning of the data.

Latest Research and Reviews

News and Comment

  • Comments and Opinion |

    Biomedical ‘big data’ has opened opportunities for data repurposing to reveal new insights into complex diseases. Public data on IBD have been repurposed for novel diagnostics and therapeutics, and these datasets continue to grow. Here, we discuss the practicalities and implications of open data informatics for IBD.

    • Vivek A. Rudrapatna
    •  & Atul J. Butte
  • News |

    A regulatory vocabulary for synthetic biology and why baby diapers matter.

    • Vivien Marx
  • Editorial |

    Citation of prior publications is essential both to claim that knowledge is needed in your area of research and to establish that you have indeed advanced understanding substantially in that area. The journal deplores and will decline to consider manuscripts that fail to identify the key findings of published articles and that—deliberately or inadvertently—omit the reason the prior work is cited.

  • News and Views |

    The Uncultivated Bacteria and Archaea dataset is a foundational collection of 7,903 genomes from uncultivated microorganisms. It highlights how microbial diversity is readily recovered using current tools and existing metagenomic datasets to help piece together the tree of life.

    • Lindsey M. Solden
    •  & Kelly C. Wrighton
    Nature Microbiology 2, 1458–1459