Article
| Open Access
A comprehensive transformer-based approach for high-accuracy gas adsorption predictions in metal-organic frameworks
Three-dimensional representation learning is efficient in materials science. Here, the authors propose a transformer-based framework for multi-purpose gas adsorption prediction. Predicted values agree with the outcomes of adsorption experiments.
- Jingqi Wang
- , Jiapeng Liu
- & Diannan Lu
-
Article
| Open Access
Structured information extraction from scientific text with large language models
Extracting scientific data from published research is a complex task that requires specialised tools. Here the authors present a scheme based on large language models to automate the retrieval of information from text in a flexible and accessible manner.
- John Dagdelen
- , Alexander Dunn
- & Anubhav Jain
-
Article
| Open Access
β-Variational autoencoders and transformers for reduced-order modelling of fluid flows
Reduced-order models provide a better understanding of the complex spatio-temporal dynamics of fluid flows with high numbers of degrees of freedom and non-linear interactions. The authors propose a variational autoencoder and transformer framework for learning the temporal dynamics of the nonlinear reduced-order models relevant for fluid dynamics, weather forecasting, and biomedical engineering.
- Alberto Solera-Rico
- , Carlos Sanmiguel Vila
- & Ricardo Vinuesa
-
Article
| Open Access
High monoclonal neutralization titers reduced breakthrough HIV-1 viral loads in the Antibody Mediated Prevention trials
Antibody Mediated Prevention (AMP) trials showed that the broadly neutralizing antibody VRC01 could prevent some HIV-1 acquisitions. Here the authors use VRC01 levels and the sensitivity of each acquired HIV virus to predict viral loads in the AMP studies and show that VRC01 influenced viral loads, though potency was lower in vivo than expected.
- Daniel B. Reeves
- , Bryan T. Mayer
- & Srilatha Edupuganti
-
Article
| Open Access
Exploiting redundancy in large materials datasets for efficient machine learning with less data
Big data is crucial for machine learning, but the redundancies in the datasets are rarely studied. Here the authors reveal significant redundancy in large materials datasets, showing that up to 95% of data can be removed without impacting prediction accuracy.
- Kangming Li
- , Daniel Persaud
- & Jason Hattrick-Simpers
-
Article
| Open Access
Hyper-cores promote localization and efficient seeding in higher-order processes
Networks with higher-order interactions provide a better description of social and biological systems; however, tools to analyze their function still need to be developed. Here the authors introduce a decomposition of networks into hyper-cores that gives a better understanding of spreading processes and can be applied to fingerprint real-world datasets.
- Marco Mancastroppa
- , Iacopo Iacopini
- & Alain Barrat
-
Review Article
| Open Access
The promise of data science for health research in Africa
In this Review article, the authors discuss emerging efforts to build ethical governance frameworks for data science health research in Africa and the opportunities to advance these through investments by African governments and institutions, international funding organizations and collaborations for research and capacity development.
- Clement A. Adebamowo
- , Shawneequa Callier
- & Sally N. Adebamowo
-
Article
| Open Access
A robust normalized local filter to estimate compositional heterogeneity directly from cryo-EM maps
Heterogeneity in structural biology data includes potentially valuable information about binding and dynamics. Here, the authors devise, validate and demonstrate a method to quantify local heterogeneity in 3D reconstructions.
- Björn O. Forsberg
- , Pranav N. M. Shah
- & Alister Burt
-
Article
| Open Access
Heterogeneity in M. tuberculosis β-lactamase inhibition by Sulbactam
Here, the reaction of the suicide inhibitor sulbactam with the M. tuberculosis β-lactamase (BlaC) is investigated with time-resolved crystallography. Singular Value Decomposition is implemented to extract kinetic information despite changes in unit cell parameters during the time-course of the reaction.
- Tek Narsingh Malla
- , Kara Zielinski
- & Marius Schmidt
-
Article
| Open Access
Widespread global disparities between modelled and observed mid-depth ocean currents
Analysis of big Argo data reveals that model representation of global ocean circulation near 1000-m depth is substantially compromised by inaccuracies. Only 3.8% of the mid-depth ocean circulation can be considered accurately modelled.
- Fenzhen Su
- , Rong Fan
- & Fei Chai
-
Article
| Open Access
Topological identification and interpretation for single-cell gene regulation elucidation across multiple platforms using scMGCA
Major challenges in analyzing scRNA-seq data arise from dimensionality and the prevalence of dropout events. Here the authors develop a deep graph learning method called scMGCA, based on a graph-embedding autoencoder, that simultaneously learns cell-cell topology representation and cluster assignments, outperforming other state-of-the-art models across multiple platforms.
- Zhuohan Yu
- , Yanchi Su
- & Xiangtao Li
-
Article
| Open Access
Impact of the Euro 2020 championship on the spread of COVID-19
In this Bayesian inference study, the authors aim to quantify the impact of the men’s 2020 UEFA Euro Football Championship on COVID-19 spread in twelve participating countries. They estimate that 0.84 million cases and 1,700 deaths were attributable to the championship, with most impacts in England and Scotland.
- Jonas Dehning
- , Sebastian B. Mohr
- & Viola Priesemann
-
Article
| Open Access
Mapping global dynamics of benchmark creation and saturation in artificial intelligence
Recent studies raised concerns over the state of AI benchmarking, reporting issues such as benchmark overfitting, benchmark saturation and increasing centralization of benchmark dataset creation. To facilitate monitoring of the health of the AI benchmarking ecosystem, the authors introduce methodologies for creating condensed maps of the global dynamics of benchmark creation and saturation.
- Simon Ott
- , Adriano Barbosa-Silva
- & Matthias Samwald
-
Article
| Open Access
Digitally-enhanced lubricant evaluation scheme for hot stamping applications
The digital transformation and Industry 4.0 technologies are rapidly shaping the future of manufacturing. Here, the authors use reliable big data to quantitatively evaluate lubricant performance and select desirable candidates for application in target manufacturing processes.
- Xiao Yang
- , Heli Liu
- & Liliang Wang
-
Article
| Open Access
Impedance-based forecasting of lithium-ion battery performance amid uneven usage
Accurate forecasts of lithium-ion battery performance will ease concerns about the reliability of electric vehicles. Here, the authors leverage electrochemical impedance spectroscopy and machine learning to show that future capacity can be predicted amid uneven use, with no historical data requirement.
- Penelope K. Jones
- , Ulrich Stimming
- & Alpha A. Lee
-
Article
| Open Access
Data-driven capacity estimation of commercial lithium-ion batteries from voltage relaxation
Accurate capacity estimation is crucial for lithium-ion batteries' reliable and safe operation. Here, the authors propose an approach exploiting features from the relaxation voltage curve for battery capacity estimation without requiring other previous cycling information.
- Jiangong Zhu
- , Yixiu Wang
- & Helmut Ehrenberg
-
Article
| Open Access
Data-driven modeling and prediction of non-linearizable dynamics via spectral submanifolds
Current data-driven modelling techniques perform reliably on linear systems or on those that can be linearized. Cenedese et al. develop a data-based reduced modeling method for non-linear, high-dimensional physical systems. Their models reconstruct and predict the dynamics of the full physical system.
- Mattia Cenedese
- , Joar Axås
- & George Haller
-
Article
| Open Access
The Tharsis mantle source of depleted shergottites revealed by 90 million impact craters
The ejection sites of the martian meteorites are still unknown. Here, the authors build a database of 90 million craters and show that the Tharsis region is the most likely source of depleted shergottites ejected 1.1 Ma ago, thus confirming that some portions of the mantle were recently anomalously hot.
- A. Lagain
- , G. K. Benedix
- & K. Miljković
-
Article
| Open Access
Physics-informed learning of governing equations from scarce data
Recovering the underlying governing laws or equations that describe the evolution of complex systems from data can be challenging if the dataset is damaged or incomplete. The authors propose a learning approach that enables the discovery of governing partial differential equations from scarce and noisy data.
- Zhao Chen
- , Yang Liu
- & Hao Sun
-
Article
| Open Access
Network community structure of substorms using SuperMAG magnetometers
During geomagnetic substorms, the energy accumulated from the solar wind is abruptly transported to the ionosphere. Here, the authors apply community detection to the time-varying networks constructed from all magnetometers collaborating with the SuperMAG initiative.
- L. Orr
- , S. C. Chapman
- & W. Guo
-
Article
| Open Access
A model for the fragmentation kinetics of crumpled thin sheets
The process of thin sheet crumpling is characterized by high complexity due to an infinite number of possible configurations. Andrejevic et al. show that ordered behavior can emerge in crumpled sheets, and uncover the correspondence between crumpling and fragmentation processes.
- Jovana Andrejevic
- , Lisa M. Lee
- & Chris H. Rycroft
-
Article
| Open Access
Bayesian data analysis reveals no preference for cardinal Tafel slopes in CO2 reduction electrocatalysis
The Tafel slope in electrochemical catalysis is usually determined from experimental data and remains error-prone. Here, the authors develop a Bayesian approach for Tafel slope quantification, and apply it to study the prevalence of certain "cardinal" Tafel slopes in the electrochemical CO2 reduction literature.
- Aditya M. Limaye
- , Joy S. Zeng
- & Karthish Manthiram
-
Article
| Open Access
Non-invasive single-cell morphometry in living bacterial biofilms
Accurate cell detection in dense bacterial biofilms is challenging. Here, the authors report an image analysis pipeline that is able to accurately segment and classify single bacterial cells in 3D fluorescence images: Bacterial Cell Morphometry 3D (BCM3D).
- Mingxing Zhang
- , Ji Zhang
- & Andreas Gahlmann
-
Article
| Open Access
Designing accurate emulators for scientific processes using calibration-driven deep models
The success of machine learning for scientific discovery normally depends on how well the inherent assumptions match the problem at hand. Here, Thiagarajan et al. alleviate this constraint by allowing the optimization criterion to change in a data-driven approach to emulating complex scientific processes.
- Jayaraman J. Thiagarajan
- , Bindya Venkatesh
- & Brian Spears
-
Article
| Open Access
Uncovering temporal changes in Europe’s population density patterns using a data fusion approach
Official data on the distribution of human population often ignore the changing spatio-temporal densities resulting from mobility. Here, the authors apply an approach combining official statistics and geospatial data to assess intraday and monthly population variations at continental scale at 1 km² resolution.
- Filipe Batista e Silva
- , Sérgio Freire
- & Carlo Lavalle
-
Article
| Open Access
Data-driven analysis and forecasting of highway traffic dynamics
The demands on transportation systems continue to grow while the methods for analyzing and forecasting traffic conditions remain limited. Here the authors show a parameter-independent approach for an accurate description, identification and forecasting of spatio-temporal traffic patterns directly from data.
- A. M. Avila
- & I. Mezić
-
Article
| Open Access
Identifying degradation patterns of lithium ion batteries from impedance spectroscopy using machine learning
Forecasting the state of health and remaining useful life of batteries is a challenge that limits technologies such as electric vehicles. Here, the authors build an accurate battery performance forecasting system using machine learning.
- Yunwei Zhang
- , Qiaochu Tang
- & Alpha A. Lee
-
Article
| Open Access
The METLIN small molecule dataset for machine learning-based retention time prediction
Using machine learning to identify small molecules through retention time prediction has been challenging so far. Here the authors combine a large database of liquid chromatography retention times with a deep learning approach to enable accurate metabolite identification.
- Xavier Domingo-Almenara
- , Carlos Guijas
- & Gary Siuzdak
-
Article
| Open Access
A meta-analysis of catalytic literature data reveals property-performance correlations for the OCM reaction
The incomplete nature and undefined structure of the existing catalysis research data have prevented comprehensive knowledge extraction. Here, the authors report a novel meta-analysis method that identifies correlations between a catalyst’s physico-chemical properties and its performance in a particular reaction.
- Roman Schmack
- , Alexandra Friedrich
- & Ralph Kraehnert
-
Article
| Open Access
Establishing the effects of mesoporous silica nanoparticle properties on in vivo disposition using imaging-based pharmacokinetics
Nanoparticle applications are limited by insufficient understanding of how physicochemical properties affect in vivo disposition. Here, the authors explore the influence of size, surface chemistry and administration on the biodisposition of mesoporous silica nanoparticles using image-based pharmacokinetics.
- Prashant Dogra
- , Natalie L. Adolphi
- & C. Jeffrey Brinker
-
Article
| Open Access
Bayesian model selection for complex dynamic systems
Systematic changes in stock market prices or in the migration behaviour of cancer cells may be hidden behind random fluctuations. Here, Mark et al. describe an empirical approach to identify when and how such real-world systems undergo systematic changes.
- Christoph Mark
- , Claus Metzner
- & Ben Fabry
-
Article
| Open Access
An autonomous organic reaction search engine for chemical reactivity
While automated reaction systems typically work for the synthesis of pre-defined molecules, automated systems to discover reactivity are more challenging. Here the authors report an autonomous organic reaction search engine that allows discovery of the most reactive pathways in a multi-reagent, multistep reaction system.
- Vincenza Dragone
- , Victor Sans
- & Leroy Cronin
-
Article
| Open Access
Chaos as an intermittently forced linear system
The huge amount of data generated in fields like neuroscience or finance calls for effective strategies that mine data to reveal underlying dynamics. Here Brunton et al. develop a data-driven technique to analyze chaotic systems and predict their dynamics in terms of a forced linear model.
- Steven L. Brunton
- , Bingni W. Brunton
- & J. Nathan Kutz
-
Article
| Open Access
Local dimensionality determines imaging speed in localization microscopy
Localisation microscopy enables nanometre-scale imaging of biological samples, but the method is too slow to use on dynamic systems. Here, the authors develop a mathematical model that optimises the number of frames required and estimates the maximum speed for super-resolution imaging.
- Patrick Fox-Roberts
- , Richard Marsh
- & Susan Cox
-
Article
| Open Access
Quantum-chemical insights from deep tensor neural networks
Machine learning is an increasingly popular approach to analyse data and make predictions. Here the authors develop a ‘deep learning’ framework for quantitative predictions and qualitative understanding of quantum-mechanical observables of chemical systems, beyond properties trivially contained in the training data.
- Kristof T. Schütt
- , Farhad Arbabzadah
- & Alexandre Tkatchenko