ProbeDealer is a convenient tool for designing probes for highly multiplexed fluorescence in situ hybridization

Fluorescence in situ hybridization (FISH) is a powerful method to visualize the spatial positions of specific genomic loci and RNA species. Recent technological advances have leveraged FISH to visualize these features in a highly multiplexed manner. Notable examples include chromatin tracing, RNA multiplexed error-robust FISH (MERFISH), multiplexed imaging of nucleome architectures (MINA), and sequential single-molecule RNA FISH. However, one obstacle to the broad adoption of these methods is the complexity of the multiplexed FISH probe design. In this paper, we introduce an easy-to-use, versatile, and all-in-one application called ProbeDealer to design probes for a variety of multiplexed FISH techniques and their combinations. ProbeDealer offers a one-stop shop for multiplexed FISH design needs of the research community.

Fluorescence in situ hybridization (FISH) can be used to visualize the spatial locations of DNA regions and RNA molecules in a sequence-specific manner. Recently, several methods have multiplexed FISH to profile chromatin folding patterns and the abundance of numerous transcripts. These include multiplexed sequential DNA FISH 1-6 (termed chromatin tracing), multiplexed error-robust FISH (MERFISH) 7-10 and similar methods 11,12, and sequential single-molecule RNA FISH (sequentially imaging individual RNA species without combinatorial barcoding) 4,9. Chromatin tracing has been combined with single-molecule RNA FISH to study the association between gene expression regulation and chromatin folding 4,5. Recently, our group reported multiplexed imaging of nucleome architectures (MINA) 13, a method that combines chromatin tracing, RNA MERFISH and protein labeling. We used MERFISH and cell boundary labeling to distinguish different cell types in a highly heterogeneous mammalian tissue, and used chromatin tracing and co-immunofluorescence to profile three-dimensional genomic architectures in single nuclei across different length scales and in relation to other nuclear components 13. These FISH-based methods have greatly advanced the characterization of spatial-omics and their physiological relevance. However, one technical obstacle to the broad adoption of these highly multiplexed techniques is their complex probe design procedure. Here, we introduce ProbeDealer, an easy-to-use application that facilitates probe design for chromatin tracing, RNA MERFISH and sequential single-molecule RNA FISH of individual transcript species.
Both chromatin tracing and RNA MERFISH use a two-stage hybridization procedure. In stage one, a library of oligonucleotide probes, termed primary probes, targeting all genomic loci or RNA species of interest is simultaneously hybridized to the targets. Each primary probe contains a targeting region that hybridizes to the target, and one or more overhanging readout regions. In stage two, dye-labeled secondary probes with different sequences complementing the readout regions on the primary probes are sequentially hybridized to the sample, imaged, and then bleached or removed. To design the primary probes with high hybridization efficiency and specificity, several requirements need to be met, including proper melting temperature (Tm), GC content, minimal secondary structure, minimal cross-hybridization between probes, and lack of long consecutive repeats of the same nucleotide 14. These requirements were often fulfilled using OligoArray 2.1 15 in previous works 1,7,13, which requires familiarity with UNIX. Unfortunately, OligoArray 2.1 is no longer available for download. Thus, users without a previously installed copy of the software will not be able to use our published workflow. Even with OligoArray 2.1, probe design using our previous workflow can be challenging: output probes must be filtered through additional rounds of BLAST 16 to select specific probes, and must be extended with combinations of readout regions and priming sequences, which requires additional programming. The probe filtering and sequence extension algorithms depend on the probe type. Users need to make multiple changes to our previous MATLAB codes to adapt them for their needs. ProbeDealer simplifies this process by integrating probe design, BLAST, and other modifications into one program, and allows versatile probe design for several multiplexed FISH methods. Its graphical user interface improves user experience and eliminates the need for coding expertise.
Scientific Reports | (2020) 10:22031 | https://doi.org/10.1038/s41598-020-76439-x | www.nature.com/scientificreports/

Results and discussion
Generate primary probe sequences with customizable physical properties. The entire workflow of ProbeDealer is illustrated in Fig. 1 and includes three main steps: generate primary probes and filter them by their physical properties, filter primary probes by specificity, and generate outputs. To generate primary probes from target sequences (targeted genomic regions or RNA sequences), ProbeDealer utilizes the following algorithm implemented in MATLAB. Each target sequence is scanned with a sliding window of customizable length (default is 30 nucleotides). For simplicity, the scanned sequences within the window, henceforth referred to as "oligos", are on the same strand as the input sequences, and thus need to be reverse-complemented in a later step to generate probes that hybridize to the input sequences. Oligos are first filtered by GC content and repetitive nucleotides. Tm is calculated using the oligoprop function in the MATLAB Bioinformatics Toolbox. To avoid probes containing stable secondary structures, we apply the rnafold function to identify stem-loop structures in oligos. To reject probes that cross-hybridize with other probes, we perform local sequence alignment between each new candidate oligo and the currently accepted oligos using the swalign function. Of note, to facilitate the probe design process, we provide a set of default probe design parameters in an input Excel spreadsheet (oligoparameter.xlsx), which have been tested in our previous experiments 13. Users may provide their customized probe design parameters by editing the Excel spreadsheet.
ProbeDealer then BLASTs accepted oligos against specific genomes and transcriptomes, depending on their intended application. For chromatin tracing, oligos are BLASTed against the whole genome in both directions to find oligos that have only one alignment in the genome. Because chromatin tracing is routinely performed with RNase treatment 1, RNA will be eliminated and will not bind probes targeting the same sequences in the genome.
To accommodate users who may need to retain RNAs, such as for MINA 13 , ProbeDealer offers an additional feature to design chromatin tracing probes targeting only the antisense strands of gene regions, and therefore even when transcripts are present, the probes will not hybridize to them. Towards this end, users may enable an "only target antisense strand" option in ProbeDealer to BLAST the oligos that passed the whole-genome BLAST further against the unspliced transcriptome in the plus/plus orientation. If an oligo (or its reverse complement) has no hits in the unspliced transcriptome, the oligo (or its reverse complement) is retained. If it is aligned to one or more entries of the unspliced transcriptome in the plus/plus direction, the probe (reverse complement of the oligo) will bind to the unspliced transcript(s), and thus this oligo is rejected.
In addition, to avoid chromatin tracing probes cross-hybridizing with RNA FISH probes applied to the same sample, ProbeDealer can optionally avoid exon regions of genes. This feature is especially important for MINA, in which chromatin tracing and RNA FISH are combined and may target the DNA and RNA from the same genomic region. To achieve this, all chromatin tracing oligos are further BLASTed against the spliced transcriptome. Oligos with alignment(s) to the spliced transcriptome are rejected.
To design probes for multiplexed RNA FISH methods, ProbeDealer BLASTs oligos against the spliced transcriptome, and retains probes that only bind to transcript isoforms of the same gene. For both chromatin tracing and RNA FISH probes, users may choose to output all probe sequences, or define how many probes they want for each input target sequence.
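The RNA FISH specificity rule above can be illustrated with a minimal Python sketch. It is not ProbeDealer's MATLAB code: the function name `select_gene_specific` and the input layout (each candidate oligo paired with the gene names of its transcriptome BLAST hits) are hypothetical, and parsing of real BLAST output is left out.

```python
def select_gene_specific(candidates, target_gene):
    """Keep oligos whose transcriptome hits all belong to the target gene.

    candidates: list of (oligo, hit_genes) pairs, where hit_genes lists the
    gene name of every transcript the oligo aligned to (hypothetical layout).
    """
    kept = []
    for oligo, hit_genes in candidates:
        # Hits to several isoforms of the target gene are fine;
        # a hit to any other gene rejects the oligo.
        if set(hit_genes) <= {target_gene}:
            kept.append(oligo)
    return kept


if __name__ == "__main__":
    candidates = [
        ("oligo_1", ["Actb", "Actb"]),   # two isoforms of the same gene
        ("oligo_2", ["Actb", "Actg1"]),  # also matches another gene
    ]
    print(select_gene_specific(candidates, "Actb"))
```

In this toy input only `oligo_1` survives, because `oligo_2` also aligns to a transcript of a different gene.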
Export probe design outputs. After filtering oligos according to their physical properties and genomic and/or transcriptomic specificity, ProbeDealer appends sequences of the secondary probes and additional priming regions to the oligos to generate the final template oligos for probe synthesis. ProbeDealer provides 50 published default secondary probe sequences 13 for chromatin tracing probes, and 16 published default secondary probe sequences 8 for MERFISH and sequential single-molecule RNA FISH. Users may modify or add secondary probe sequences in the provided Excel spreadsheet of the ProbeDealer package.
We offer two choices of output format: (1) a template oligo library, which may be ordered as an oligo pool for probe synthesis and amplification according to a previously established protocol 1,13 (note that the final primary probes are reverse complements of the template oligo sequences); (2) primary probe sequences, which can be ordered individually and used directly in primary hybridization. In the first option, three pairs of primer sequences are provided according to previous publications 1,13, with one pair per probe type (chromatin tracing, RNA MERFISH, or sequential single-molecule RNA FISH), so users may combine multiple template oligo libraries from separate ProbeDealer outputs as sub-libraries in one oligo pool order, and selectively synthesize and amplify each sub-library. In the second option, priming regions are not added, and the primary probe sequences are computationally reverse-complemented from the template oligo sequences. This second option is particularly useful for designing small libraries of sequential single-molecule RNA FISH probes for individual targets.
Comparison with other tools. Given the stringent probe specificity consideration in ProbeDealer, we set out to test whether ProbeDealer can yield a sufficient number of probes, which is essential for a good signal-to-background ratio, especially in RNA FISH. We performed test runs of designing RNA MERFISH probes targeting 136 mouse transcripts probed in our previous work 13 using three packages: ProbeDealer, OligoArray 2.1 15 and OligoMiner 17, and compared the number of probes generated by the packages. The probe count per kb of transcript is 29.3 for ProbeDealer, 18.7 for OligoArray 2.1 and 29.1 for OligoMiner. Therefore, ProbeDealer generates a number of probes comparable to OligoMiner and higher than OligoArray 2.1. Thus, ProbeDealer and OligoMiner are more suitable for designing probes for shorter transcripts, which may not yield enough probes when OligoArray 2.1 is used. A detailed comparison of probe counts among ProbeDealer, OligoArray 2.1 and OligoMiner can be found in Supplementary Table S1.
We further compared these probe design tools in terms of their software purposes, computational resource requirements and time costs, probe design criteria, and user-friendliness in Supplementary Note S3. In summary, ProbeDealer is specialized for designing probes for multiple versions of multiplexed sequential FISH experiments and is easy to adapt among them, and it generates numbers of probes comparable to other tools in a relatively short time (Supplementary Note S3; Supplementary Table S1). With a graphical user interface on a local computer, ProbeDealer is also more user-friendly and reliable than command-line-based probe design packages or online applications.
We hope ProbeDealer will simplify probe design for chromatin tracing, RNA MERFISH, sequential single-molecule RNA FISH and MINA, so these methods may be more easily utilized across the scientific community.
Generate oligos from input sequences. Default probe design uses parameters reported in Ref. 13, and requires the primary targeting region of probes to be 30 nucleotides (nt) in length, with melting temperature (Tm) not lower than 66 °C, and GC content between 30 and 90%. Oligos containing more than five identical consecutive nucleotides (e.g. GGGGGG) are excluded. If probes form secondary structures, the Tm of the concatenated stem sequence should not exceed 76 °C. If they cross-hybridize with other probes, the Tm of the concatenated hybridized region should not exceed 72 °C. As recommended in OligoArray 2.0 15, Tm is calculated via the nearest neighbor method 19 under the assumption of 1 mol/liter salt concentration and 1e-6 mol/liter probe concentration. Input sequences are scanned with a 30-nt window from the 5′ end. If the current test oligo is accepted, the window moves 30 nt towards the 3′ end to evaluate the next adjacent oligo; otherwise, the window moves 1 nt towards the 3′ end for a new test oligo. These parameters can be customized by editing oligoparameter.xlsx in the ProbeDealer package.
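The window-stepping rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the MATLAB implementation: only the GC-content and repeat filters are implemented, while the Tm, secondary-structure and cross-hybridization checks are represented by a hypothetical `extra_checks` hook that the caller would have to supply. All function names are made up for this example.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C bases in an uppercase DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_long_run(seq, max_run=5):
    """True if any base occurs more than max_run times in a row."""
    return re.search(r"(.)\1{%d,}" % max_run, seq) is not None


def scan_target(target, length=30, gc_min=0.30, gc_max=0.90,
                max_run=5, extra_checks=None):
    """Scan target from the 5' end and return accepted (position, oligo) pairs.

    extra_checks(oligo, accepted) is a placeholder for the Tm, stem-loop and
    cross-hybridization filters, which are not implemented in this sketch.
    """
    if extra_checks is None:
        extra_checks = lambda oligo, accepted: True
    accepted = []
    pos = 0
    while pos + length <= len(target):
        oligo = target[pos:pos + length]
        ok = (gc_min <= gc_fraction(oligo) <= gc_max
              and not has_long_run(oligo, max_run)
              and extra_checks(oligo, accepted))
        if ok:
            accepted.append((pos, oligo))
            pos += length  # accepted: jump to the next adjacent window
        else:
            pos += 1       # rejected: slide by one nucleotide
    return accepted


if __name__ == "__main__":
    # Short toy window so the stepping behaviour is easy to follow.
    print(scan_target("AAAAACGTA", length=4))
```

Because accepted oligos advance the window by its full length, accepted oligos never overlap, whereas rejected positions are retried one nucleotide further along.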
BLAST oligos. For chromatin tracing probes, oligos are first BLASTed against the genome, and only those with a unique alignment in the genome are retained. If the "only target antisense of gene" feature is selected, retained oligos are additionally BLASTed against the unspliced transcriptome. Oligos without alignment to the unspliced transcriptome in the plus/plus direction are retained. If an oligo is aligned to the unspliced transcriptome in the plus/plus direction but not in the plus/minus direction, its reverse complement is retained as a qualified oligo. If the "avoid exon regions" feature is selected, qualified oligos are BLASTed once more against the spliced transcriptome. Probes with alignment to the spliced transcriptome are rejected. For RNA FISH probes, including those for RNA MERFISH and sequential single-molecule RNA FISH, oligos are BLASTed against the transcriptome, and oligos matching the transcripts of multiple genes are rejected.
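The decision logic for chromatin tracing oligos can be summarized in a short Python sketch. It assumes that BLAST has already been run and that its results have been reduced to hit counts per orientation; the function `filter_chromatin_oligo` and its arguments are hypothetical names chosen for this example and do not correspond to ProbeDealer's MATLAB code.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def filter_chromatin_oligo(oligo, genome_hits,
                           plus_plus_hits=0, plus_minus_hits=0,
                           spliced_hits=0,
                           antisense_only=False, avoid_exons=False):
    """Return the oligo to keep (possibly reverse-complemented) or None.

    genome_hits:      number of alignments in the whole genome
    plus_plus_hits:   alignments to the unspliced transcriptome, plus/plus
    plus_minus_hits:  alignments to the unspliced transcriptome, plus/minus
    spliced_hits:     alignments to the spliced transcriptome
    """
    if genome_hits != 1:
        return None  # not unique in the genome
    kept = oligo
    if antisense_only:
        if plus_plus_hits == 0:
            kept = oligo                      # oligo itself qualifies
        elif plus_minus_hits == 0:
            kept = reverse_complement(oligo)  # flipped oligo qualifies
        else:
            return None                       # both orientations match transcripts
    if avoid_exons and spliced_hits > 0:
        return None  # overlaps an exon
    return kept


if __name__ == "__main__":
    print(filter_chromatin_oligo("AACG", genome_hits=1,
                                 plus_plus_hits=1, antisense_only=True))
```

Keeping the orientation test separate from the exon test mirrors the text, in which the two optional features can be enabled independently.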
Append secondary sequences and/or priming regions. For chromatin tracing and sequential single-molecule RNA FISH probes, 50 and 16 default secondary sequences that were previously validated 8,13 are provided, respectively. These sequences are appended upstream and downstream of the primary targeting region of oligos so that oligos targeting the same input sequence share the same secondary sequence. If the number of input sequences for chromatin tracing or sequential single-molecule RNA FISH exceeds the number of provided secondary sequences, users should provide additional secondary sequences in the secondary sequence Excel spreadsheets (DNA secondaries.xlsx for chromatin tracing probes, RNA secondaries.xlsx for sequential single-molecule RNA FISH probes) in the ProbeDealer package.
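As an illustration of this assignment, the Python sketch below gives each input target one secondary sequence and flanks every oligo for that target with it on both sides. The function name `append_secondaries`, the dictionary layout, and the choice to assign secondaries in input order are assumptions made for this example; strand orientation of the appended sequences is not modeled.

```python
def append_secondaries(oligos_by_target, secondaries):
    """Flank each oligo with the secondary sequence assigned to its target.

    oligos_by_target: dict mapping target name -> list of oligo sequences
    secondaries:      list of secondary sequences, one consumed per target
    """
    if len(oligos_by_target) > len(secondaries):
        raise ValueError(
            "more targets than secondary sequences; add more secondaries")
    extended = {}
    # Assumption: the i-th target (in input order) receives the i-th secondary.
    for (target, oligos), sec in zip(oligos_by_target.items(), secondaries):
        extended[target] = [sec + oligo + sec for oligo in oligos]
    return extended


if __name__ == "__main__":
    toy = {"TAD_1": ["AAA", "CCC"], "TAD_2": ["GGG"]}
    print(append_secondaries(toy, ["TT", "GA"]))
```

The explicit error corresponds to the case described above, in which users need to supply additional secondary sequences in the spreadsheet.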
For MERFISH probes, ProbeDealer provides the Modified-Hamming-Distance-4 (MHD4) 7 coding scheme, which accepts up to 140 input sequences. The MHD4 codes are rearranged according to bulk gene expression values to avoid bit sharing among highly expressed genes. Each of the probes targeting one transcript carries three out of four secondary sequences that are assigned to that transcript, with one secondary sequence upstream of the primary targeting region of the oligo and two secondary sequences downstream.
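The three-out-of-four layout can be sketched in Python as below. The sketch assumes 16-bit codewords with exactly four "on" bits, consistent with the 16 default secondary sequences, and it cycles through the four possible triples from probe to probe. That cycling rule is an assumption made for illustration; the text only states that each probe carries three of the four assigned secondary sequences, one upstream and two downstream. Names such as `merfish_probe_layouts` are hypothetical.

```python
from itertools import combinations


def on_bits(code):
    """Indices of the '1' bits in a codeword given as a string of 0/1."""
    return [i for i, bit in enumerate(code) if bit == "1"]


def merfish_probe_layouts(code, readouts, oligos):
    """Attach three of the four assigned readout sequences to each oligo.

    code:     codeword string, e.g. 16 characters with exactly four '1's
    readouts: list of readout (secondary) sequences, indexed by bit
    oligos:   targeting sequences for one transcript
    """
    bits = on_bits(code)
    if len(bits) != 4:
        raise ValueError("expected exactly four 'on' bits per codeword")
    triples = list(combinations(bits, 3))  # the four ways to drop one bit
    layouts = []
    for i, oligo in enumerate(oligos):
        up, down1, down2 = triples[i % len(triples)]  # assumed cycling rule
        # One readout upstream of the targeting region, two downstream.
        layouts.append(readouts[up] + oligo + readouts[down1] + readouts[down2])
    return layouts


if __name__ == "__main__":
    readouts = ["<R%d>" % i for i in range(16)]  # placeholder readout labels
    print(merfish_probe_layouts("1100000000000011", readouts, ["AAA", "CCC"]))
```

Cycling through the triples spreads the four readout sequences roughly evenly across the probes of a transcript, which is one plausible way to balance signal across the four imaging rounds.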
If users want to order template oligos as a pooled library for primary probe synthesis and amplification 1,13 , priming regions are appended to both ends of the oligo sequences. By default, chromatin tracing probes, sequential single-molecule RNA FISH probes and MERFISH probes each have one distinct pair of priming sequences. The default priming sequences are stored in Primers.xlsx in the ProbeDealer package and can be customized by the users. The primary probe synthesis and amplification procedure with the template oligo library and the default primers will reverse complement the template oligo sequences to generate the primary probes 1,13 . If users require only the primary probe sequences and plan to order them directly, priming regions will not be added and oligo sequences (including the secondary regions) will be computationally reverse-complemented in the output files.
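The two output formats can be contrasted with a minimal Python sketch. It assumes the oligo already carries its secondary regions and that the two priming sequences are simply concatenated to its ends as given; the orientation in which the primers are stored in Primers.xlsx is not modeled. The function `build_outputs` is a hypothetical name, not part of ProbeDealer.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_outputs(oligo_with_secondaries, primer_5p, primer_3p):
    """Return (template_oligo, primary_probe) for one designed oligo.

    template_oligo: priming regions added at both ends, for pooled synthesis
    primary_probe:  no priming regions, reverse-complemented for direct ordering
    """
    template_oligo = primer_5p + oligo_with_secondaries + primer_3p
    primary_probe = reverse_complement(oligo_with_secondaries)
    return template_oligo, primary_probe


if __name__ == "__main__":
    template, probe = build_outputs("AACG", "GG", "TT")
    print(template)
    print(probe)
```

The reverse complement is taken only in the direct-order format because, in the pooled format, the synthesis and amplification procedure itself produces the reverse-complemented primary probes.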
Perform test runs with ProbeDealer. The following test runs in ProbeDealer were performed with a Dell Precision Tower 3630 workstation: five chromatin tracing test runs, which targeted (1) 50 TADs on mouse chr19, (2) 19 consecutive 5-kb regions upstream of the Scd2 gene on mouse chr19, (3) 30 TADs along human chr20, (4) 34 TADs on human chr21, and (5) 40 TADs on human chrX; and one RNA MERFISH test run, which targeted 136 mouse transcripts. The workstation contained an Intel® Core™ i7-8700K CPU with 6 cores and 12 logical processors, and 32 GB RAM. One processor was used in all test runs.
For chromatin tracing test runs, the coordinates of the targeted genomic regions of mouse chr19 TADs and of the sequences upstream of the Scd2 gene locus were downloaded from Ref. 13, and converted from mm9 to mm10 by the LiftOver tool in the UCSC Genome Browser with default settings. The mouse mm10 genome assembly was used for both test runs. The genomic coordinates of human chr20, chr21 and chrX TADs were downloaded from Ref. 1, and the central 100-kb regions were used for probe design. For the three test runs targeting the human genome, the human hg18 genome assembly was used, instead of the default hg38 genome assembly. For all chromatin tracing test runs, we chose the chromatin tracing design scheme without additional features, and all probes were exported.
For the MERFISH test run, 137 mouse transcript IDs and RNA-seq FPKM values were acquired from Ref. 13, and transcript IDs were converted from Gencode vM24 to Gencode vM25, with one failed transcript conversion, leaving 136 transcripts. We chose the RNA MERFISH design scheme and exported all probes.
Perform MERFISH test run with OligoArray 2.1. The 136 converted mouse transcripts were first divided into 1-kb fragments, and then supplied to OligoArray 2.1 with the following settings: probe length of 30 nucleotides, GC content between 30 and 90%, melting temperature above 66 °C, no 6 or more identical consecutive bases, no stable secondary structure above 76 °C, no cross-hybridization above 72 °C, and no overlaps