Stefan Knapp's results made him feel like he was riding a roller coaster. Knapp, a pharmaceutical scientist at Goethe University Frankfurt in Germany, and his colleagues had identified a protein important for cancer growth. They had also found a drug-like molecule that inhibited the protein. They bought more of that compound to run further experiments — but this time it was inactive. When they then bought the same compound from another vendor, it was even more active than in the first set of experiments.

The Novartis chemical library in Basel, Switzerland, contains roughly 3 million molecules. Credit: Novartis AG

It turned out that the molecule in question exists in the form of enantiomers — chemical structures that are mirror images of each other but still distinct, like right-handed and left-handed gloves. Knapp worked out that vendors were selling mixtures with varying proportions of the two enantiomers, and only the left-handed form was active in the team's assays1. (Ironically, the right-handed version is the active form of the lung-cancer drug crizotinib, which acts through a different mechanism.)

The team's detective work speaks to just one of the many ways in which chemical reagents can thwart biological experiments, says Giulio Superti-Furga, a systems biologist at the Research Center for Molecular Medicine in Vienna, and a leader of the work. In some cases, scientists don't know what chemicals they have in their hands. In others, molecules' effects are less specific than experimentalists imagine. “These two problems, together, reduce the wonderful potential impact of using chemistry to interrogate biology,” says Superti-Furga.

Kim Janda, an immunologist and chemist at the Scripps Research Institute in La Jolla, California, puts it more bluntly: “If you don't pay attention to the chemistry, the chemistry will bite you in the ass.”

Bite marks

Researchers rely on chemical reagents across all areas of cellular biology. One application is as tool compounds or chemical probes that dissect a protein's function. Using a small molecule to inhibit a specific enzyme, for instance, can offer subtler clues to the protein's biology than using genetic techniques to keep it from being made altogether. Compounds are also sometimes collected in chemical libraries, where they are screened en masse in the hope of finding useful reagents and pharmaceuticals.

In both situations, mix-ups, impurities and unanticipated activity can send unsuspecting scientists on wild goose chases. Scientists have sounded alarms about chemical-based artefacts in assays over the years2,3,4, but recognition of these problems is still not widespread. Some online resources can help. The expert-curated Chemical Probes Portal (www.chemicalprobes.org) assesses more than 100 individual tool compounds. Probe Miner (http://probeminer.icr.ac.uk/) and the Probes and Drugs Portal (www.probes-drugs.org) aggregate publicly available information to help researchers select which chemicals to use.

Actually assessing the quality of the chemicals that researchers buy isn't easy. Reagents often contain by-products of synthesis, or impurities formed when the reagent degrades. Chemist Josh Bittker, who heads a high-throughput screening group at the Broad Institute in Cambridge, Massachusetts, had a chance to find out just how common those contaminants can be when he and his team assembled a library of compounds that had already been tested in clinical trials.

To his surprise, nearly 29% of the 8,584 molecules the team tested failed quality control, with impurities making up 15% or more of some batches of reagents5. Often, material from another manufacturing lot from the same vendor passed the check, especially if it was supplied as a dry powder rather than as a solution. Solutions are convenient for experiments but prone to degradation.

The problem could be even worse. Bittker's screens assessed compounds by molecular weight and so would not have detected whether samples contained multiple isomers, molecules with the same chemical formula but a different arrangement of atoms.
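To see why a mass-based check cannot separate isomers, consider a minimal sketch in Python. It uses a textbook pair of isomers, ethanol and dimethyl ether, rather than any compound from Bittker's library, and the helper names `parse_formula` and `molecular_weight` are illustrative assumptions, not part of any screening software.

```python
import re
from collections import Counter

# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple condensed formula such as 'CH3CH2OH'.

    Hypothetical helper: handles element symbols followed by optional
    counts, with no brackets or charges.
    """
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

def molecular_weight(formula: str) -> float:
    """Sum atomic masses over every atom in the formula."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())

ethanol = "CH3CH2OH"        # an alcohol
dimethyl_ether = "CH3OCH3"  # an ether: same atoms, different connectivity

# Both molecules contain 2 C, 6 H and 1 O, so their masses are identical.
print(parse_formula(ethanol) == parse_formula(dimethyl_ether))  # True
print(round(molecular_weight(ethanol), 3))                      # 46.069
print(round(molecular_weight(dimethyl_ether), 3))               # 46.069
```

Because the two formulas reduce to the same atom counts, any test that reports only molecular weight gives the same answer for both, which is why methods that probe structure, such as NMR, are needed to tell isomers apart.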

Chemicals galore

Medicinal chemists like to say that there are more possible structures for drug-sized molecules than there are stars in the Universe. Between them, commercial vendors probably sell more than 10 million different compounds. Researchers looking for a particular molecule might have to track down specialized vendors, get their supplies from other interested researchers or have compounds synthesized to order. However, if a molecule's biological activity has been reported in a high-profile paper, and especially if it has been tested in clinical trials, it might be sold by a dozen vendors or more. Pharmaceutical companies sometimes license vendors to produce or distribute reagent-grade versions of their drugs; this results in more reliable reagents, but also higher prices.

Source: Chem. Eng. News 90(21), 34–35 (2012)

In one infamous case, nearly 20 vendors offering an approved cancer drug called bosutinib were found to actually be selling a related structure in which chemical groups were misarranged (see 'Spot the difference' and go.nature.com/2w3dz0a). Both bosutinib and the second structure bind a suite of cell-signalling proteins, but with different potencies, so the mix-up potentially calls into question dozens of papers.

Dimitrios Tzalis, chief executive of the contract-research organization Taros in Dortmund, Germany, recommends buying from reputable vendors rather than trying to save money with untested sources. “Cheap can be very expensive,” he says.

When it comes to buying reagents, “you have to know what you're doing and what to look for”, Bittker agrees. “If a deal is too good to be true, it probably is.” One thing to watch out for, he says, is a vendor that is unwilling to discuss or supply quality-control data. Many vendors buy compounds from third parties to resell to scientists, which can make for variable quality. “If there is not a chain of information back to someone who experimentally confirmed the sample, it is not to be trusted,” Bittker says.

Sometimes confusion stems from the literature. When chemical biologist Kilian Huber at the University of Oxford, UK, read reports that an enzyme inhibitor called SCR7 could boost the efficiency of the CRISPR–Cas9 gene-editing technique, he decided to try it in his lab. He ended up abandoning the project when his graduate student could not synthesize a compound matching the reported structure. The situation exemplifies the main uncertainties of using chemical tools: the chemical-reagent company Tocris in Bristol, UK, later reported that multiple SCR7 vendors were in fact selling a structure related to, but not the same as, the molecule (see go.nature.com/2vapstf), and proposed that the originally reported form was inactive. Separately, other researchers questioned whether any of these molecules acted by inhibiting the specific enzyme described6. The original discoverer of the inhibitor tells Nature that his 2012 paper7 reports inhibition of the correct enzyme, but that the structure of SCR7 that it describes is unstable. Instead, he says, SCR7 converts to a cyclic form not described by that paper or by Tocris.

Janda also encountered a mix-up when he decided to do some follow-up studies on a molecule described8 as boosting expression of a tumour-suppressor protein. The molecule turned out to be inactive, and at first, Janda doubted his postdoc's organic-synthesis skills. Then he realized that other studies had not synthesized the compound themselves, but had used reagents supplied by distributors, including the US National Cancer Institute. Those distributors had perpetuated a mistake in the original publication and supplied the wrong molecule. In fact, a company pursuing clinical trials with the compound had originally licensed intellectual property documenting the incorrect structure. That patent was eventually reissued, but not before Janda caused an uproar by filing a patent application for the correct version.

It is impossible to know how common these kinds of mix-up are, Janda says, but certainly many go undetected and unreported.

Scepticism is key, says Nick Levinson, who spent months as a postdoc trying to work out how bosutinib bound its protein target before showing conclusively that he was not working with bosutinib at all — and neither were many other researchers studying the compound9. “The main thing I do differently in my lab is, I am more suspicious with results. If we get any result that seems wrong or is surprising, the first thing that pops into our head is that maybe the compound isn't right,” says Levinson, now a medicinal chemist at the University of Minnesota in Minneapolis. Detecting the problem isn't always easy. For bosutinib, the process required a relatively specialized form of nuclear magnetic resonance (NMR) spectroscopy and was aided immensely by a published crystal structure of a protein binding the isomer, he says. But researchers frequently use reagents without doing any authentication.

Impure products

Whenever using a new chemical probe, researchers should use at least two kinds of assay to verify its identity, or find colleagues who can, advises Paul Workman, head of the Institute of Cancer Research in London. “We are trying to educate the biological community to be more respectful of the need for chemical knowledge.” A good biologist would never work with an antibody before doing some control experiments, or use a cloned gene from a colleague without sequencing it first, he says. They should be just as careful about using a chemical probe without testing it first by mass spectrometry, NMR or other methods (see 'Six tips for better chemistry').

Even if the chemical supplied contains the correct compound, the results obtained with it might still be wrong. As a research scientist at Roche in Nutley, New Jersey, Johannes Hermann led a drug-discovery effort that identified a series of molecules that seemed to inhibit an exciting enzyme10. As expected for such molecules, the higher their concentration, the greater the inhibition, a signal that these results were not mere artefacts.

Then, the medicinal-chemistry efforts stopped making sense: a tiny tweak to a molecule's structure might make a big difference to its activity, whereas a larger one had no effect. Finally, Hermann and his colleagues realized that the active compounds had something in common: they had been synthesized in the presence of zinc. When the researchers added zinc alone to their assays, they got the same results as they had seen with the supposedly active compounds. Just to be sure, they added metal-binding molecules known as chelators along with the compounds, and found no activity. It was a “painful experience”, says Hermann, who is now a data scientist at Johnson & Johnson in Raritan, New Jersey. He recommends that no one invest time or resources following up on a hit before they have explored whether a chelator changes the results.

Hermann's experience is unusual for industry only in that he could take the time to write it up. Other pharmaceutical chemists describe similar frustrations, but with copper and palladium as the culprits. Small, inorganic compounds used in synthesis, such as hydrazine, can also foil experiments — by inhibiting the target enzyme or by altering how its activity is assessed.

Often, problems with a reagent start after the vial is opened. Repeated freezing and thawing can degrade a compound, and some chemicals aren't stable to freezing at all, says Heather Holement, head of life-science reagents at Merck KGaA in Darmstadt, Germany. Chemicals dissolved in organic solvents can also sometimes crash out of solution when added to water-based environments, such as those in cell and protein assays. Researchers should consider the quality of stocks, and always run control experiments with solvent alone, she advises. They can also rule out some false signals by running assays without cells or proteins.

Sometimes a compound's apparent activity is actually due to the 'vehicle' that carries it. Jake Shortt, a haematologist at Monash University in Melbourne, Australia, followed a common protocol of dissolving anticancer compounds in a solvent known as NMP (N-methyl-2-pyrrolidone) before injecting them into mice, and was surprised to see the tumours respond even in control groups that received the solvent but none of the test compounds11. Those data suggest that NMP itself has anticancer activity, and Shortt is now starting to test it in the clinic. But others still use NMP as a supposedly inert liquid to dissolve drug compounds, he says.

For the types of screening experiment that Bittker runs, the time for the most intensive diligence comes when hits are first identified. It is not uncommon for stocks in chemical libraries to sit around for months or even years, which allows plenty of time for degradation and simple mix-ups. The first step in validating results, he says, is to run the screen again, using the same material; screening assays can be noisy. Next, researchers should try the same compound from a different source, ideally synthesized in-house or by collaborators. If it turns out that a hit is not the expected molecule, Bittker suggests not trying to pin down the active component. “It's a rabbit hole,” he says. “If you have a contaminant, just let it go.”
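The sequence Bittker describes can be written down as a small decision procedure. The Python sketch below is only one reading of that advice: the class `HitEvidence`, the function `next_step`, the wording of each recommendation and the point at which identity is checked are all assumptions made for illustration, not a protocol from the Broad Institute.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HitEvidence:
    """What is known about a screening hit so far (None = not yet tested)."""
    repeat_screen_active: Optional[bool] = None  # same material, screened again
    resupply_active: Optional[bool] = None       # same compound, different source
    identity_confirmed: Optional[bool] = None    # e.g. by mass spectrometry or NMR

def next_step(evidence: HitEvidence) -> str:
    """Return a suggested next action for a hit (hypothetical helper)."""
    # Step 1: screening assays can be noisy, so repeat with the same material.
    if evidence.repeat_screen_active is None:
        return "rerun the screen with the same material"
    if not evidence.repeat_screen_active:
        return "drop the hit: not reproduced on rescreening"
    # If the sample is not the expected molecule, do not chase the contaminant.
    if evidence.identity_confirmed is False:
        return "drop the hit: sample is not the expected molecule"
    # Step 2: test the same compound obtained from an independent source.
    if evidence.resupply_active is None:
        return "retest the compound from a different source"
    if not evidence.resupply_active:
        return "drop the hit: independent material is inactive"
    return "follow up on the hit"

print(next_step(HitEvidence()))
print(next_step(HitEvidence(repeat_screen_active=True, identity_confirmed=False)))
print(next_step(HitEvidence(repeat_screen_active=True, resupply_active=True)))
```

Writing the steps this way makes the ordering explicit: a hit earns further investment only after it survives a repeat screen and a test with independently sourced material.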

Such advice may need to be turned on its head for researchers who are screening natural products. Extracts of plants and similar sources are almost always mixtures rather than pure reagents, says Guido Pauli, a natural-products specialist at the University of Illinois at Chicago. In many cases, he says, the purer the extract, the lower its activity, which means that some effects attributed to the most common ingredient are in fact caused by other components12.

Even when natural products are relatively well characterized, mixtures can be a problem. For example, gentamicin is an antibiotic produced industrially by isolating it from bacteria grown in fermentation tanks. The bacteria produce several closely related molecules that have different activities and toxicities, but commercial preparations contain varying proportions of each form and different water content, says Robert Greenhouse, a medicinal-chemistry consultant for Stanford University. As a result, researchers who are not careful will not know exactly how much or what forms of antibiotic they are using in their experiments.

Even if a molecule is correctly identified, it may exert effects without actually binding to a specific protein in a specific way. For example, 'aggregators' cover other molecules in a soap-like coating. These artefacts can often be revealed by adding detergent to assays.

“Sometimes you have the right compound, but it's just a lousy compound,” says Workman.

Resources and education are crucial, he says, but the onus is on individual scientists. “It really is the responsibility of every researcher using any small-molecule compound that the material they are using is correct and fit for purpose.” When it comes to using chemical reagents, a simple maxim is more important than any hard-and-fast rules, says Workman. “Caveat emptor — let the buyer beware.”