Bridging non-overlapping reads illuminates high-order epistasis between distal protein sites in a GPCR

Epistasis emerges when the effects of an amino acid depend on the identities of interacting residues. This phenomenon shapes fitness landscapes, which have the power to reveal evolutionary paths and inform evolution of desired functions. However, there is a need for easily implemented, high-throughput methods to capture epistasis, particularly at distal sites. Here, we combine deep mutational scanning (DMS) with a straightforward data processing step to bridge reads in distal sites within genes (BRIDGE). We use BRIDGE, which matches non-overlapping reads to their cognate templates, to uncover prevalent epistasis within the binding pocket of a human G protein-coupled receptor (GPCR), yielding variants with 4-fold greater affinity to a target ligand. The greatest functional improvements in our screen result from distal substitutions and substitutions that are deleterious alone. Our results corroborate findings of mutational tolerance in GPCRs, even in conserved motifs, but reveal inherent constraints restricting tolerated substitutions due to epistasis.


Supplementary Note 1: Detailed explanation of BRIDGE methodology
Below, we provide a detailed explanation of the BRIDGE methodology. Illumina next-generation sequencing (NGS) platforms are almost exclusively used for deep mutational scanning (DMS) studies due to their high accuracy and throughput 6. Illumina NGS platforms employ sequencing-by-synthesis (SBS) technology wherein an engineered DNA polymerase replicates a DNA strand in the presence of fluorescently-labeled dNTPs with modified 3' hydroxyl groups, which enable step-wise addition and identification of dNTPs 7. Upon incorporation of a labeled dNTP, an optical system detects the identity of the nucleotide; the fluorophore is then released and the 3' modification removed, allowing the next labeled dNTP to be incorporated. The resulting sequence is called a read, which can reach up to 300 bases in length using current technology. Parallelization is achieved through the use of a sequencing chip surface modified with two unique DNA adapters, which bind DNA strands carrying complementary, flanking sequences that can be appended to any strand through PCR (Supplementary Figure 14).
In a process referred to as bridge amplification, the distal ends of bound DNA strands approach the chip surface, facilitating interaction between the complementary sequences in the template and on the surface. The DNA template is replicated from the second adapter, generating a covalently-bound DNA strand. Accordingly, millions of unique DNA strands adhere to the surface and are clonally amplified through repeated rounds of bridge amplification to generate clusters comprising thousands of identical, proximal copies, significantly improving the signal-to-noise ratio 8. Bridge amplification also enables paired-end sequencing, wherein a single DNA strand is sequenced from both ends, providing additional sequence information or improving accuracy if the paired-end reads overlap (Fig. 1b). However, due to limitations in read length (i.e. ≤ 300 bases), paired-end reads will only overlap if the 5'-ends of each read are separated by < 600 bases. Thus, distal mutations separated by ≥ 600 bases preclude sequencing using overlapping paired-end reads.
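The overlap condition above is simple arithmetic and can be checked for any template and read length. The sketch below is illustrative only: the function name is hypothetical, and it assumes both reads start at the template ends, with adapters excluded from the template length.

```python
def paired_end_overlap(template_len: int, read_len: int = 300) -> int:
    """Return the number of template bases covered by both paired-end reads.

    Assumes each read starts at one end of the template and runs inward.
    A result of 0 means the reads do not overlap.
    """
    if template_len <= 0 or read_len <= 0:
        raise ValueError("template_len and read_len must be positive")
    # A read cannot extend past the far end of the template.
    effective = min(read_len, template_len)
    return max(0, 2 * effective - template_len)
```

With 300-base reads, a 599-base template gives a 1-base overlap, whereas templates of 600 bases or more give none, which is the case the coordinate-based matching is meant to address.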
In addition to reducing sequencing accuracy, non-overlapping paired-end reads preclude the use of available DMS software to match reads arising from the same DNA strand. Identification of all mutations arising in a single gene is crucial due to the poorly predicted effects of epistasis. We provide a straightforward method to extend DMS to protein libraries containing distal mutations by leveraging the proximity with which paired-end reads generate fluorescent signals upon incorporating nucleotides into the synthesized read (Fig. 1b). For a given DNA strand, the fluorescent signals emitted from each paired-end read share an (x, y) coordinate on the sequencing chip surface. As spatial coordinates are provided for each read in the output FASTQ files, we developed code to parse through the forward and reverse FASTQ read files and match reads that share an (x, y) coordinate.
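A minimal sketch of coordinate-based matching is given here for illustration. It is not the append_reads.py code; all function names are hypothetical. It assumes the Illumina sequence identifier layout `@instrument:run:flowcell:lane:tile:x:y read:filter:control:index`, and it includes lane and tile in the key on the assumption that (x, y) values are only unique within a tile.

```python
from typing import Dict, Iterable, Iterator, Tuple

ClusterKey = Tuple[int, int, int, int]  # (lane, tile, x, y)


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from FASTQ lines (4 lines per record)."""
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if len(rows) % 4 != 0:
        raise ValueError("FASTQ input is not a multiple of 4 lines")
    for i in range(0, len(rows), 4):
        if not rows[i].startswith("@"):
            raise ValueError(f"malformed FASTQ header: {rows[i]!r}")
        yield rows[i], rows[i + 1], rows[i + 3]


def cluster_key(header: str) -> ClusterKey:
    """Extract (lane, tile, x, y) from an Illumina sequence identifier."""
    fields = header[1:].split()[0].split(":")
    if len(fields) < 7:
        raise ValueError(f"unexpected identifier layout: {header!r}")
    lane, tile, x, y = fields[3:7]
    return int(lane), int(tile), int(x), int(y)


def match_reads(forward: Iterable[str],
                reverse: Iterable[str]) -> Dict[ClusterKey, Tuple[str, str]]:
    """Pair forward and reverse sequences that share a cluster coordinate."""
    reverse_by_key = {cluster_key(h): seq for h, seq, _ in read_fastq(reverse)}
    matched: Dict[ClusterKey, Tuple[str, str]] = {}
    for header, seq, _ in read_fastq(forward):
        key = cluster_key(header)
        mate = reverse_by_key.get(key)
        if mate is not None:  # forward reads without a mate are skipped
            matched[key] = (seq, mate)
    return matched
```

Because the lookup uses a dictionary keyed on the coordinate, the two files do not need to list reads in the same order. Open file handles can be passed directly as the `forward` and `reverse` arguments.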
Following cluster generation, fluorescent signals originating from distinct clusters may overlap on the chip surface reducing the accuracy of matching paired-end reads emerging from the same DNA strand. The occurrence of these polyclonal clusters can be reduced by optimizing clustering density, which can be controlled by the concentration of DNA loaded into the instrument flow cell. However, raw sequencing data passes through a "chastity" or "purity" filtering step in which instrument software automatically discards reads with overlapping fluorescent signals. Accordingly, the sequences contained in FASTQ files are likely to represent those originating from monoclonal clusters.
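As an optional sanity check, the filter flag recorded in each sequence identifier can be inspected. The sketch below is an assumption-laden illustration rather than part of the BRIDGE code: the function name is hypothetical, and it assumes identifiers of the form `@... read:filter:control:index`, where the filter field is `N` for reads that passed filtering and `Y` for reads that were flagged.

```python
def passed_filter(header: str) -> bool:
    """Return True if the identifier's filter flag is 'N' (read was not filtered out)."""
    parts = header.split()
    if len(parts) < 2:
        raise ValueError("identifier has no read:filter:control:index field")
    flags = parts[1].split(":")
    if len(flags) < 2:
        raise ValueError("identifier flag field is incomplete")
    return flags[1] == "N"
```

If the FASTQ files were generated with default settings, every read would be expected to return `True`; any `False` would suggest that filtered reads were retained during conversion.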

Supplementary Note 2: Application of BRIDGE to libraries containing >2 distal loci
Although our library was designed to group all mutations in two distal loci, BRIDGE is not limited by positional constraints within individual reads. The mutations may be located anywhere within the length of either read; however, these locations should be chosen with the practical considerations noted in Supplementary Note 3 in mind. As an example, we illustrate below the application of BRIDGE to a protein library wherein mutations are placed in three separate loci (Supplementary Figure 15).
Using the Illumina NextSeq sequencer, read lengths reach up to 150 bases, enabling use of a reverse read that captures mutations imparted at L249 6.51, H250 6.52, and S281 7.46, the last of which resides on transmembrane helix 7 and has been reported to alter ligand binding affinity upon mutagenesis 9,10. Standard protocols provided by the manufacturer can be used to prepare and sequence the NGS library. Upon obtaining forward and reverse FASTQ read files, BRIDGE can be applied to match non-overlapping paired-end reads generated from the same DNA strand.
The code utilized to match non-overlapping paired-end reads (i.e. append_reads.py) does not have to be modified as it relies solely on flow cell coordinates encoded within the FASTQ sequence identifier and is sequence-agnostic. To trim and quality-filter reads using code applied in this manuscript, the trim-qc() function can be modified as noted in the README.md file. Upon matching and appending paired-end reads, analysis of enrichment rates may be performed using available software (e.g. Enrich2), or the code described in this manuscript can be modified accordingly. Specifically, appended reads can be translated to their corresponding amino acid sequences by modifying and applying the translate_reads() function.
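To illustrate the appending and translation steps, a minimal sketch follows. It is not the translate_reads() function from the repository; the function names and codon offsets are hypothetical. It assumes the reverse read must be reverse-complemented to match the coding strand, that the mutated codons sit at known offsets in the appended sequence, and that the standard genetic code applies.

```python
from typing import Sequence

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order ('*' marks a stop codon).
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def append_reads(forward: str, reverse: str) -> str:
    """Join a forward read with the reverse complement of its mate."""
    return forward.upper() + reverse_complement(reverse)


def translate_codons(seq: str, codon_starts: Sequence[int]) -> str:
    """Translate the codons beginning at the given 0-based offsets.

    Codons that are incomplete or contain ambiguous bases are reported as 'X'.
    """
    return "".join(
        CODON_TABLE.get(seq[start:start + 3].upper(), "X")
        for start in codon_starts
    )
```

For a three-locus library, only the list of codon offsets would need to change, since the appended sequence simply contains more positions of interest.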

Supplementary Note 3: Practical considerations in applying BRIDGE
Upon attachment to the sequencing chip surface, DNA templates undergo bridge amplification to facilitate cluster generation and paired-end sequencing (Supplementary Figure 14). Thus, amplification efficiency, which is generally greater for shorter templates 18, is important for successful bridge amplification and sequencing of paired-end reads. Accordingly, paired-end sequencing is often suggested for templates with length ≤ 1 kb 19. However, different libraries may yield varying results. While we predict most templates with length ≤ 1 kb will be compatible with BRIDGE, this methodology can likely be applied to longer templates. Provided bridge amplification is successful, BRIDGE can be applied to match paired-end reads produced from the ends of the DNA template (i.e. loci separated by ≤ 1 kb).
Prior to analysis of any template, it is prudent to determine optimal cluster density for the specific DNA template. As the density of clusters on the sequencing chip surface increases, the likelihood of exceeding the optimal cluster density (i.e. overclustering) also increases. Overclustering results in a reduced fluorescence signal-to-noise ratio and increases the difficulty of generating and resolving distinct clusters. Although achieving a cluster density that is less than optimal (i.e. underclustering) retains high data quality, it reduces the amount of data that is obtained from an NGS run. Optimal cluster densities for various Illumina NGS platforms are provided by the manufacturer. The concentration of library DNA leading to optimal cluster density can be determined empirically through instructions provided by the manufacturer.
Note that the efficiency of cluster generation can also be impacted by other variables, including nucleotide diversity in the initial cycles. Nucleotide diversity and the efficiency of cluster generation are often increased through the addition of PhiX or 1-3 nucleotides in the primers used to generate the NGS library. Manufacturer instructions should be noted in optimizing cluster density.
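One way to gauge nucleotide diversity in the initial cycles is to tabulate base composition per cycle across a sample of reads. The sketch below is a generic diagnostic under stated assumptions, not a manufacturer tool or part of the BRIDGE code; the function name and the choice of five cycles are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def base_composition(reads: Iterable[str], n_cycles: int = 5) -> List[Dict[str, float]]:
    """Return the fraction of A, C, G and T observed at each of the first n_cycles.

    Reads shorter than a given cycle are ignored for that cycle, and bases
    other than A, C, G and T (e.g. N) are excluded from the totals.
    """
    reads = [r.upper() for r in reads]
    composition = []
    for cycle in range(n_cycles):
        counts = Counter(r[cycle] for r in reads if len(r) > cycle and r[cycle] in "ACGT")
        total = sum(counts.values())
        if total == 0:
            composition.append({})
        else:
            composition.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return composition
```

A cycle in which a single base accounts for nearly all reads indicates low diversity at that position, which is the situation that PhiX or staggered primers are intended to relieve.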
As noted in Supplementary Note 1, raw sequencing data passes through a "chastity" or "purity" filtering step in which instrument software automatically discards reads with overlapping fluorescent signals. Therefore, the sequences contained in FASTQ files are likely to represent those originating from monoclonal clusters.

Supplementary Figure 1: Expression of A2aR from a multisite integrating vector (pITy) improves specific fluorescent ligand binding signal compared to expression from a conventional centromeric backbone (pYC). A fluorescent ligand (FITC-APEC) binding assay demonstrates up to an 11-fold improvement in specific signal obtained in yeast producing A2aR using a multisite integrating vector compared to the less mitotically stable, low-copy centromeric vector. Data represent the mean of three biological replicates, and error bars represent their standard deviation. Source Data are provided as a Source Data file.

Supplementary Figure 2: Growth with raffinose prior to induction of PGAL1-driven A2aR expression improves phenotypic homogeneity upon incubation with fluorescent ligand. (a) Expression of A2aR under the control of the PGAL1 promoter is repressed by dextrose, induced by galactose, and unaffected by raffinose. (b, c) Flow cytometric analysis of yeast serially cultured in dextrose followed by galactose reveals phenotypic heterogeneity, where 10% of the population exhibits fluorescence intensities comparable to autofluorescence in wildtype yeast (i.e. yeast not transformed with pITy A2aR). In contrast, cells serially cultured in dextrose, raffinose, and raffinose combined with galactose are phenotypically homogeneous with 99% of the population exhibiting fluorescence intensities above background. (d) The addition of a raffinose growth step marginally improves mean fluorescence intensity (MFI) over background. A2aR expression in this experiment was performed using a pITy A2aR construct. In panel b, each histogram corresponds to a representative sample. Data represent the mean of three biological replicates, and error bars represent their standard deviation. Statistical significance was determined through calculation of p values using a paired, two-sided Student's t-test. Source Data are provided as a Source Data file.

Supplementary Figure 3: Optimization of FACS parameters enables complete enrichment of yeast producing active A2aR diluted into a pool of cells producing an inactive A2aR variant. To assess the sorting stringency achieved using our GPCR expression and fluorescent ligand binding protocols, we performed a pilot enrichment experiment. Since our goal is to isolate GPCR variants based on their ligand binding properties, we screened a mixed population comprising wildtype A2aR and an inactive A2aR variant (C28A/C82A/C128A/C185A/C245S/C254A/C394S) that does not bind ligand. (a) Flow cytometric analysis of pure populations incubated with 10 µM FITC-APEC demonstrates the expected phenotypes and allows us to position the sort gate to isolate cells producing active A2aR. (b) Yeast cultures producing wildtype A2aR and the inactive variant were incubated with fluorescent ligand, mixed in varying dilution rates, and sorted using FACS. Only 2 rounds of sorting are required to fully enrich cells producing active A2aR from a pool diluted 1:1000 with cells producing the inactive variant. In panel a, each histogram corresponds to a representative sample. In panel b, data represent a single experiment with one sample for each dilution rate.

Supplementary Figure 4: Optimized GPCR expression and FACS screen discriminates between A2aR variants based on ligand binding affinity. A saturation fluorescent ligand binding experiment was performed using yeast producing wildtype (WT) A2aR and a variant (Q89A) reported to exhibit improved binding affinity to NECA, an adenosine receptor agonist structurally similar to APEC. Specific signal reflects the improved binding affinity anticipated in yeast producing A2aR (Q89A) compared to cells producing the WT receptor. In this experiment, A2aR genes were expressed from the low-copy, non-integrating pYC backbone, which is used to express the A2aR library. Data represent the mean of three biological replicates, and error bars represent their standard deviation. Source Data are provided as a Source Data file.

Supplementary Figure 5: Structure of the adenosine A2a receptor bound to NECA. The crystal structure of A2aR bound to NECA (PDB code 2YDV 8) is shown including stick representation of side chains belonging to residues investigated in this study (i.e. T88 3.36, Q89 3.37, W246 6.48, L249 6.51, H250 6.52). The shown structure was generated using a thermostabilized variant, which contains a C-terminal truncation at residue 316 and 4 substitutions including a Q89A mutation. Red lines connect atoms between the ligand and side chains whose van der Waals radii are separated by ≤ 0.4 Å.

Supplementary Figure 6: Illustration of high- and low-stringency FACS sorting strategies. All library populations were initially gated to sort cells and singlets. In the first two rounds of sorting, all cells above background fluorescence intensity were collected. The background fluorescence intensity was determined for each round of sorting using an empty vector (pYC) negative control. In the third round of sorting, the Post Sort 2 population was sorted using both high- and low-stringency gating. In the high-stringency gating strategy, we collected the top 5% of cells with respect to mean fluorescence intensity. In the low-stringency gating strategy, we collected all cells displaying mean fluorescence intensity above background. In the fourth round of sorting, either high- or low-stringency gating was repeated for the respective population.

[…] A2aR variants containing at least 1 stop codon are rapidly depleted from […] approximately 3 orders […] (count) of unique variants […] As expected, both the […] and the number (n) of unique inactive variants decrease rapidly during […] specific carryover rate […] containing reads compared […] 0.7 for each round […] percentile of each population. The whiskers in the box plot represent 1.5 […] points falling outside 1.5xIQR […]

Supplementary Figure 9: The 100 most highly enriched and depleted A2aR variants do not correlate with codon adaptation index indicating codon bias does not influence enrichment. The log2(enrichment rate) for each of the 100 most highly enriched and depleted A2aR variants shows no correlation with the variant's codon adaptation index (CAI). CAI is a measure of the similarity of codon usage compared to a user-defined reference, where 0 indicates dissimilarity and 1 indicates complete similarity. Here, the codon usage reference is the one given in Sharp et al. 9, which was composed based on the codon usage in Saccharomyces cerevisiae genes that are subject to selection via translational efficiency.

Supplementary Figure 10: Amino acid sequence alignment of A2aR transmembrane helices 3 and 6 reveals high conservation of mutated residues among orthologs. The positions mutated in the A2aR library (T88 3.36, Q89 3.37, W246 6.48, L249 6.51, and H250 6.52) are highly conserved across A2aR orthologs, suggesting important roles in receptor function. Ballesteros-Weinstein notation is used in superscript to denote the relative position of a residue with respect to the most evolutionarily conserved residue in a transmembrane helix. For example, the most conserved residue in helix 3 is 3.50. Sequence alignments were generated using Jalview with sequences obtained through the GPCRdb.

Supplementary Figure 11: Frequency logo plots for enriched PS4 variants reveal varying tolerance to substitutions among mutated sites. The amino acid (a.a.) frequency at each mutated position is plotted for variants in PS4 where the vertical scaling of each a.a. reflects its relative frequency at a specific position. The a.a. color reflects a residue's chemical properties where polar residues (G, S, T, Y, C, Q, and N) are green, basic residues (K, R, and H) are blue, acidic residues (D and E) are red, and hydrophobic residues (A, V, L, I, P, W, F, and M) are black. (a) Variants with PS4/naïve enrichment rates greater than 0.30, equal to the wildtype receptor rate, strongly favor T/S at T88 and L at L249. In contrast, sites Q89, W246, and H250 are permissive and sample a range of residues with varying chemical properties. (b) Variants with PS4/naïve enrichment rates greater than 61, equivalent to the 95th percentile, exhibit similar preferences for sites Q89, W246, and L249. However, variants within this population more strongly prefer the wildtype residue at T88 and G/H at H250.

Supplementary Figure 12: Force-directed graph of PS4 variants with enrichment rates ≥ 1 reveals significant connectivity between modestly-enriched variants. Identical to Figure 3a, each node represents a unique variant, the node's radius scales with the variant's enrichment rate, each edge represents a difference in 1 amino acid, and the color of each edge is that of the variant with lower enrichment. Nodes are attracted to other nodes with which they share an edge and repulsed from disconnected nodes. As a result, highly-interconnected nodes are clustered. These clusters largely consist of modestly-enriched (purple) nodes representing sequences more likely to be sampled through Darwinian evolution. Larger, isolated nodes can be imagined as taller peaks in a sequence-function landscape, which reflects epistatic interactions between residues of those variants. This epistasis contributes to the rarity of accessing isolated nodes such as SLNIG in favor of more highly-connected nodes such as TSWLH.

Supplementary Figure 13: Digestion of yeast DNA extract improves amplification of the targeted sequencing region. Preparation for next-generation sequencing was initially hindered by poor amplification of the target region within A2aR from yeast total DNA extract (Undigested DNA). Digestion of DNA extract with EcoRI and HindIII prior to PCR significantly improves the yield of amplicon (Digested DNA) with the expected size (Pure Plasmid). In the PCRs using extracted DNA as template, 2 ng/µL DNA was used while 1 ng/µL plasmid (pYC A2aR) was used in the PCR with pure plasmid. The total DNA extract and pure plasmid were run on separate gels, which were processed in parallel. Each gel contains the same mass of Ladder to facilitate comparison of DNA quantity across gels. These gel images represent the results of a single experiment. Source data are provided as a Source Data file.

Supplementary Figure 14: Illustration of DNA strand attachment and bridge amplification on Illumina sequencing chip surface. (a) Illumina sequencing platforms feature two unique DNA adapters, which complement flanking sequences appended to a DNA strand (depicted in red and blue). […] interaction of complementary sequences in the strand and on the surface, the distal end of the […] facilitating binding to the second adapter. This second interaction enables […] strand is used as a template for DNA synthesis from the second adapter. (b) Repeated cycles of bridge amplification produce clusters of identical DNA template strands covalently linked to the sequencing chip surface.