Data mining
Data mining is the process of extracting potentially useful information from data sets. It uses a suite of methods, including machine learning, visualisation and statistical analysis, to organise, examine and combine large data sets. Data mining is used in computational biology and bioinformatics to detect trends or patterns without prior knowledge of the meaning of the data.
Latest Research and Reviews
Research | open
mTFkb: a knowledgebase for fundamental annotation of mouse transcription factors
Scientific Reports 7, 3023
Research | open
Systematic discovery of mutation-specific synthetic lethals by mining pan-cancer human primary tumor data
There are no robust methods for systematically identifying mutation-specific synthetic lethal (SL) partners in cancer. Here, the authors develop a computational algorithm that uses pan-cancer data to detect mutation- and cancer-specific SL partners, and they validate a novel SL interaction between mutant IDH and loss of ACACA in leukaemia.
Nature Communications 8, 15580
Research | open
UROPA: a tool for Universal RObust Peak Annotation
Scientific Reports 7, 2594
Research
Recognition of EGF-like domains by the Notch-modifying O-fucosyltransferase POFUT1
X-ray crystallographic analysis of the mouse protein-O-fucosyltransferase POFUT1, combined with average structural map analysis, demonstrates that POFUT1 specifically recognizes only one of the four EGF-like domain types found in nature.
News and Comment
Comments and Opinion
Discovering and linking public omics data sets using the Omics Discovery Index
Nature Biotechnology 35, 406–409
Comments and Opinion
Toil enables reproducible, open source, big biomedical data analyses
Nature Biotechnology 35, 314–316
Editorial
European Open Science Cloud
A recent recommendation that a large number of professional data stewards be trained and employed in all data-rich research projects raises the exciting prospect that they will conduct research on data-intensive research itself. It also focuses us on questions about the role of all scientists in data quality and accessibility, as well as how best to measure the value of good data stewardship to science and society.
Nature Genetics 48, 821
Comments and Opinion | open
A crowdsourcing approach for reusing and meta-analyzing gene expression data
Nature Biotechnology 34, 803–806
Editorial
FAIR principles for data stewardship
The FAIR data principles are simple guidelines for ensuring that machines can find and use data, supporting data reuse by individuals. More, and better, research can be generated by designing data and algorithms to be findable, accessible, interoperable and reusable, together with the tools and workflows that led to these data.
Nature Genetics 48, 343
News and Views
Superheroes of disease resistance
Individuals who remain healthy despite carrying a disease-causing mutation are identified in an analysis of more than half a million people.
Nature Biotechnology 34, 512–513