If it seems dispiriting that artificial intelligence can defeat human intellect at chess and Go, how much more so if scientific discovery itself were to be done more effectively by machines? We don’t have to cede victory to AI just yet, but two papers have now shown that machine learning has the potential to point materials prospecting in interesting directions1,2.

The approach of Zhou et al.1 is particularly suggestive. Several previous efforts (for example, refs 3,4) have used conventional descriptors of the elemental ingredients as the parameters deployed in computational searches. In effect, the assumption is that we already know what factors are important to materials properties, and just need to find the right combinatorial blends. But Zhou et al. start without such preconceptions. They simply supply their machine-learning algorithm with existing experimental data on around 60,000 inorganic two-, three- and four-component compounds, and let the algorithm figure out for itself what the relevant attributes of each kind of atom are in a range of environments — that is, within the context of other atoms.

The result is a vast, multidimensional but rather sparse matrix of atom–environment pairs, encoding similarities in composition between the compounds formed by different types of atom. In effect, each atom has an associated vector whose dimensions are abstract quantities learnt from scratch by the algorithm. Some of these vector dimensions loosely correlate with known properties of the elements concerned — one, for example, strongly predicts non-metallic behaviour, another metallic behaviour. But the algorithm decides which vector components to heed, rather than being instructed to be guided by, say, valency.
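
To give a concrete flavour of the idea, here is a toy sketch (emphatically not the authors' code, with an invented ten-compound dataset standing in for their 60,000): each atom's 'environment' is simply the compound with that atom removed, and a low-rank factorization of the atom–environment count matrix assigns every atom a learnt vector.

```python
# A toy sketch in the spirit of ref. 1, not the authors' code.
# Each atom's 'environment' is the compound with that atom removed;
# a low-rank factorisation of the atom-environment count matrix then
# assigns every atom a learnt vector. All data here are illustrative.
import re
from collections import defaultdict
import numpy as np

compounds = ["NaCl", "KCl", "NaBr", "KBr", "LiF", "NaF",
             "MgO", "CaO", "MgS", "CaS"]  # toy stand-in for ~60,000 compounds

def elements(formula):
    # Toy parser: extracts element symbols, ignoring stoichiometry.
    return re.findall(r"[A-Z][a-z]?", formula)

counts = defaultdict(int)
for f in compounds:
    elems = elements(f)
    for i, atom in enumerate(elems):
        env = tuple(elems[:i] + elems[i + 1:])  # the rest of the compound
        counts[(atom, env)] += 1

atoms = sorted({a for a, _ in counts})
envs = sorted({e for _, e in counts})
M = np.zeros((len(atoms), len(envs)))  # sparse in realistic settings
for (a, e), n in counts.items():
    M[atoms.index(a), envs.index(e)] = n

# Singular-value decomposition: row i of U gives atom i's vector, whose
# abstract dimensions are learnt from composition data alone.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
k = 3  # illustrative embedding size
atom_vectors = {a: U[i, :k] for i, a in enumerate(atoms)}
print(atom_vectors["Na"], atom_vectors["K"])  # similar atoms should get similar vectors
```

The singular-value decomposition here is just one convenient way to extract low-dimensional structure; the essential point is that the vector dimensions are never named in advance.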

Nonetheless, the algorithm accurately identifies the family groupings familiar from the periodic table: the kinship of halogens or alkali metals, say. It discovers this periodicity by itself. What’s more, the learnt atom properties turn out to be effective predictors of the characteristics of compounds, such as whether they are metals or insulators — and they achieve lower mean errors than either empirical methods or machine-learning techniques that assume some model of what matters about the atoms in question.
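
Purely as an illustration of this downstream step (the models and data of ref. 1 are far more sophisticated), the learnt vectors can feed any off-the-shelf classifier. The fragment below continues the toy sketch above, reusing its atom_vectors and elements; the labels are invented.

```python
# Continuing the toy sketch above (reuses np, elements and atom_vectors).
# A hypothetical downstream classifier, not the model of ref. 1, predicts
# a compound-level property from the learnt atom vectors alone.
from sklearn.linear_model import LogisticRegression

def featurise(formula):
    # Order-free composition feature: the mean of the compound's atom vectors.
    return np.mean([atom_vectors[a] for a in elements(formula)], axis=0)

# Binary labels invented purely for illustration (1 = 'metal', 0 = 'insulator').
labelled = {"NaCl": 0, "KBr": 0, "MgO": 1, "CaS": 1}
X = np.array([featurise(f) for f in labelled])
y = np.array(list(labelled.values()))

clf = LogisticRegression().fit(X, y)
print(clf.predict([featurise("LiF")]))  # toy prediction for an unseen compound
```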

Typically, such AI-based methods look only for equilibrium phases. But some enticing states of matter are out of equilibrium. One such class is made up of phases characterized by many-body localization (MBL), in which disorder frustrates equilibration. This can lead to unusual many-body effects, such as discrete-time-crystal behaviour5. Machine learning has already been applied to these phases6, but when dealing with relatively unfamiliar states there is no guarantee that we know a priori what to look at — what are, say, the relevant order parameters that define the phase boundaries?

Again, the machine-learning algorithm used by Venderley et al.2 makes no prior assumptions about that. Its only input is the ‘entanglement spectrum’ of the quantum-mechanical states of an idealized, disordered chain of spins (the canonical model for MBL phases): an essentially neutral fingerprint of the quantum correlations across a cut through the chain, one that builds in no notion of an order parameter. From this alone, the algorithm outperforms conventional metrics at finding sharply defined phase boundaries.
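
To show the shape of such a calculation, here is another toy sketch. The tiny system size, the disorder strengths, the choice of a mid-spectrum eigenstate and the use of simple k-means clustering as the learner are all illustrative assumptions, standing in for the actual machinery of ref. 2: diagonalize a small random-field Heisenberg chain, extract the entanglement spectrum of an eigenstate (the squared Schmidt values for a cut through the middle of the chain), and hand those spectra, unlabelled, to an off-the-shelf method.

```python
# A toy sketch only: entanglement spectra of a random-field Heisenberg chain
# (the canonical MBL model), handed unlabelled to an off-the-shelf learner.
# The system size, disorder strengths, eigenstate choice and the use of
# k-means as the learner are illustrative assumptions, not ref. 2's method.
import numpy as np
from sklearn.cluster import KMeans

L = 8  # a tiny chain, for illustration
sz = np.diag([0.5, -0.5])
sp = np.array([[0.0, 1.0], [0.0, 0.0]])  # spin-raising operator S+
sm = sp.T                                # spin-lowering operator S-

def site_op(op, i):
    # Embed a single-site operator at site i of the L-site chain.
    mats = [np.eye(2)] * L
    mats[i] = op
    full = mats[0]
    for m in mats[1:]:
        full = np.kron(full, m)
    return full

def hamiltonian(h):
    # Nearest-neighbour Heisenberg couplings plus random on-site fields h_i.
    H = sum(h[i] * site_op(sz, i) for i in range(L))
    for i in range(L - 1):
        H += site_op(sz, i) @ site_op(sz, i + 1)
        H += 0.5 * (site_op(sp, i) @ site_op(sm, i + 1)
                    + site_op(sm, i) @ site_op(sp, i + 1))
    return H

def entanglement_spectrum(psi):
    # Cut the chain in half; the spectrum is the squared Schmidt values.
    s = np.linalg.svd(psi.reshape(2 ** (L // 2), -1), compute_uv=False)
    return np.sort(s ** 2)[::-1]

rng = np.random.default_rng(0)
spectra, disorder = [], []
for W in (0.5, 8.0):  # weak (thermalizing) versus strong (MBL-like) disorder
    for _ in range(20):
        vals, vecs = np.linalg.eigh(hamiltonian(rng.uniform(-W, W, L)))
        psi = vecs[:, len(vals) // 2]  # a mid-spectrum eigenstate
        spectra.append(entanglement_spectrum(psi))
        disorder.append(W)

clusters = KMeans(n_clusters=2, n_init=10).fit_predict(np.array(spectra))
print(list(zip(disorder, clusters)))  # the two regimes should largely separate
```

The point of the exercise is that nothing in the input names an order parameter; whatever structure the learner finds, it finds for itself.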

While these results bode well for attempts to locate interesting and potentially useful new states of matter, they might also reasonably provoke some anxiety: when dealing with the complexities of multicomponent materials governed by subtle many-body effects, can we be sure that our intuitive, physically transparent descriptors are going to be the ones that nature actually recognizes?