Biologists excel at cataloging natural genetic sequence variation and are improving ways to associate this variation with meaningful function. What is missing, according to Jay Shendure of the University of Washington, Seattle, are empirical ways to test for functional sequence variation on a large scale.

Two recent reports help fill this need by using deep sequencing to monitor the activity of many synthetic enhancer variants that drive expression of uniquely tagged RNA reporters. Researchers in the Shendure laboratory originally used the idea to interrogate the function of residues in core promoters in vitro. In one of the recent reports, a team led by Shendure, Len Pennacchio at Lawrence Berkeley National Laboratory and Nadav Ahituv at the University of California San Francisco modified the assay to accommodate longer enhancers and in vivo testing (Patwardhan et al., 2012).

The group first engineered enhancer variants by stitching together oligo mixtures 'doped' with random substitutions using an overlapping PCR strategy. They tested variants with changes at 2–3% of positions in two human enhancers and one mouse enhancer, each up to 620 base pairs long. They placed each enhancer haplotype in front of a reporter with a sequence tag in a plasmid vector and sequenced the library to characterize each enhancer and associate it with a tag.
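The library design described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' protocol: the function names, the 20-base tag length and the uniform choice among the three alternative bases are hypothetical, and the 3% substitution rate is taken from the upper end of the range quoted above.

```python
import random

BASES = "ACGT"

def dope_sequence(seq, rate, rng):
    """Return a copy of seq in which each position is substituted with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Assumption: the three alternative bases are equally likely.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def build_library(seq, n_variants, rate=0.03, tag_length=20, seed=0):
    """Pair each doped enhancer variant with a unique random tag.

    Returns a dict mapping tag -> enhancer variant, mimicking the
    tag-to-enhancer association obtained by sequencing the plasmid library.
    The tag length is a hypothetical value chosen for illustration.
    """
    rng = random.Random(seed)
    library = {}
    while len(library) < n_variants:
        tag = "".join(rng.choice(BASES) for _ in range(tag_length))
        if tag not in library:  # keep tags unique
            library[tag] = dope_sequence(seq, rate, rng)
    return library

def substitution_fraction(reference, library):
    """Fraction of positions across the library that differ from the reference."""
    mismatches = sum(
        sum(1 for a, b in zip(reference, variant) if a != b)
        for variant in library.values()
    )
    return mismatches / (len(reference) * len(library))
```

Calling `build_library` on a reference sequence and then `substitution_fraction` on the result gives a quick check that the simulated library carries roughly the intended mutation load.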

The resulting mixtures were highly complex, representing every possible variant at every position. The researchers delivered each enhancer library in a single injection into mouse tail veins and then sequenced tags from mouse liver RNA. “People are always saying, 'this mutation caused a 1.5-fold effect, a twofold effect [on the likelihood of observing a specific trait],' but what does that mean functionally?” asks Shendure. Their data allowed them to directly model quantitative effects of sequence variation on gene regulation.

In the second report, Tarjei Mikkelsen at the Broad Institute and colleagues designed enhancer sequence variants on custom microarrays (Melnikov et al., 2012). This limited the number and length of the elements but provided better control over the composition of the libraries. They transfected plasmid pools into human cells and sequenced RNA and DNA tags, using the ratio to determine expression associated with the enhancers.
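The RNA-to-DNA tag ratio described above can be sketched in a few lines. This is a minimal illustration rather than the published analysis: the function name, the depth normalization, the pseudocount and the log2 scale are all assumptions made for the example.

```python
import math
from collections import Counter

def tag_activity(rna_tags, dna_tags, pseudocount=1.0):
    """Estimate expression per tag as log2 of the RNA/DNA tag ratio.

    rna_tags, dna_tags: iterables of tag sequences read from the RNA and
    plasmid DNA libraries. Counts are converted to fractions of each
    library so that differences in sequencing depth cancel out. The
    pseudocount (an assumption) avoids taking the log of zero when a tag
    is absent from the RNA reads.
    """
    rna = Counter(rna_tags)
    dna = Counter(dna_tags)
    rna_total = sum(rna.values())
    dna_total = sum(dna.values())
    activity = {}
    for tag, dna_count in dna.items():
        rna_frac = (rna.get(tag, 0) + pseudocount) / rna_total
        dna_frac = (dna_count + pseudocount) / dna_total
        activity[tag] = math.log2(rna_frac / dna_frac)
    return activity
```

A tag that is over-represented in RNA relative to DNA gets a positive score, and an under-represented tag gets a negative one. Only tags seen in the DNA library are scored, because a tag without a measured plasmid abundance has no denominator.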

For the two enhancers the group tested, results from a single experiment agreed with those from painstaking experiments carried out over decades. The group tested linear, nonlinear and thermodynamic quantitative models of sequence-activity relationships and found that relatively simple linear models describe the elements surprisingly well. Their use of inducible enhancers enabled the testing of multiple cell states in the same system.
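To make the idea of a linear sequence-activity model concrete, the sketch below treats activity as a baseline plus an independent contribution from the base at each position. It is an illustration only: the function names are hypothetical, and the estimator (the mean activity of variants carrying a base minus the overall mean) is a simplification that matches a least-squares fit only when the variants are balanced. It is not the fitting procedure used in the study.

```python
from collections import defaultdict

def fit_additive_model(variants):
    """Fit baseline + per-position base effects from (sequence, activity) pairs.

    Each effect is the mean activity of variants carrying that base at that
    position, minus the overall mean. No interactions between positions
    are modeled.
    """
    baseline = sum(activity for _, activity in variants) / len(variants)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, activity in variants:
        for pos, base in enumerate(seq):
            sums[(pos, base)] += activity
            counts[(pos, base)] += 1
    effects = {key: sums[key] / counts[key] - baseline for key in sums}
    return baseline, effects

def predict(seq, baseline, effects):
    """Predict activity as the baseline plus the summed per-position effects."""
    return baseline + sum(
        effects.get((pos, base), 0.0) for pos, base in enumerate(seq)
    )
```

If measured activities deviate strongly from such predictions, that points to interactions between positions, which is the kind of signal a nonlinear or thermodynamic model would be needed to capture.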

The two studies shared several conclusions: in both, the researchers uncovered new binding sites in well-characterized enhancers and found that most sequence changes had small but significant effects on gene expression. Neither study found obvious signs of strong interactions between sequence positions.

The assays can be used to test the functional effects of variants from candidate disease-associated regulatory regions implicated in genetic studies and to better engineer cell state–specific enhancers for synthetic biology or gene therapy applications. Mikkelsen points out that their approach helped to optimize enhancer characteristics that could be difficult to address with traditional directed-evolution approaches. “Mutations that increase the activity of an enhancer in the target cell state will often decrease its specificity for that state. We can assay large numbers of mutations in every relevant cell state and then combine the data in silico to find the optimal trade-off,” he says.

Shendure sees a larger recent trend of applying mutagenesis to massively parallel functional assays. “I see a [new] field bubbling up,” he says.