Main

Advances in comparative genomics have made it possible to predict potential regulatory elements, because these essential sequences tend to be conserved through evolution (1,2). Furthermore, whole-genome comparison studies among several species revealed that conserved noncoding elements (CNE), i.e. noncoding genomic sequences that are well conserved through evolution, are clustered in the vicinity of genes encoding known transcription factors that play critical roles in embryogenesis (3). Hence, CNE represent reasonable candidates for potential regulatory elements.

In a whole-genome comparative study using the pufferfish Fugu rubripes, the transcription factor gene SALL1 was found to possess a relatively large number of CNE in proximity to the coding sequence (3). However, the in vivo function of these CNE has not been examined. Here, we performed a systematic functional survey of the CNE belonging to the SALL1 gene, which encodes an important developmental regulatory molecule. In the present study, we specifically studied the noncoding region between the putative transcription start site and the stop codon. SALL1, a zinc-finger transcription factor encoded on 16q12.1 in the human genome, has been identified as one of the four human homologues of the Drosophila region-specific homeotic gene spalt (sal) (4). Heterozygous mutations in the human SALL1 gene cause Townes-Brocks syndrome, characterized by anal defects, upper limb preaxial defects, first and second arch defects including the ears and jaw, kidney malformations, and occasionally mental retardation (5,6). Concordant with the distribution of these malformations, SALL1 is expressed in a unique spatio-temporal pattern in various species. In mice, Sall1 is expressed in the limbs, ear, anus, heart, kidney, spinal cord and brain (7,8). csal1, the chicken ortholog of SALL1, is expressed in the heart, the pharynx, involuting mesoderm and the neural plate with subsequent strong expression in the neural tube of the early embryo (9). In later developmental stages, csal1 expression is also found in the tail bud and developing limb buds (10).

To identify the cis-acting regulatory elements for SALL1, we first performed a comparative genomic scan that yielded five CNE. We next took advantage of the chick model system. As a vertebrate species, the chick undergoes early embryogenesis that is remarkably similar to that of humans in morphologic change as well as in the expression patterns of developmental transcription factors. Furthermore, the size of the chicken genome is only 40% of that of the human genome without considerable differences in the number of genes, and this compact nature of the chicken genome facilitates the identification of putative enhancer elements (1). In addition, electroporation of the chick embryo has simplified gene transfer, facilitating rapid genetically based studies (11). Using this system, we have identified one CNE that specifically directs expression to the forebrain, one of the regions where SALL1 is expressed.

MATERIALS AND METHODS

Identification of CNE in the vicinity of SALL1.

We performed a bioinformatic analysis using a human genome sequence of roughly 115 kb covering SALL1. The base position coordinates of the target region on chromosome 16 from 16pter were 49677829–49792653, as determined according to the May 2004 human reference sequence (NCBI Build 35) in the UC Santa Cruz genome browser (http://genome.ucsc.edu/). The chicken genome sequence, a Gallus gallus chromosome 11 genomic contig (GenBank accession number: NW_060474.1) covering the chicken SALL1 ortholog, was obtained using the NCBI map viewer (http://www.ncbi.nlm.nih.gov/mapview/). These two sequences were compared to identify CNE using the Mulan website (http://mulan.dcode.org/) (12). In the present study, we defined CNE as stretches at least 100 bases in length with a similarity of at least 70% between the human and chicken sequences. We refer to the corresponding CNE sequences from the chicken genome as chicken CNE and to those from the human genome as human CNE. Transcription factor binding sites were predicted using rVISTA2.0 software (http://rvista.dcode.org/) (13).
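As a rough illustration of the CNE criteria stated above (at least 100 bases in length and at least 70% similarity), the following Python sketch scans a gapped pairwise alignment with a fixed-size window. It is not the Mulan algorithm: the function names, the toy sequences, and the choice to count gap columns as mismatches are assumptions made only for illustration.

```python
def window_identity(aln_a, aln_b, start, size):
    """Fraction of identical, non-gap columns within one alignment window."""
    matches = sum(
        1
        for x, y in zip(aln_a[start:start + size], aln_b[start:start + size])
        if x == y and x != "-"
    )
    return matches / size


def find_conserved_windows(aln_a, aln_b, min_len=100, min_identity=0.70):
    """Return merged (start, end) alignment intervals, end exclusive, that are
    covered by windows of min_len columns with identity >= min_identity."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    intervals = []
    for start in range(len(aln_a) - min_len + 1):
        if window_identity(aln_a, aln_b, start, min_len) >= min_identity:
            end = start + min_len
            if intervals and start <= intervals[-1][1]:
                # Overlapping or adjacent window: extend the previous interval.
                intervals[-1] = (intervals[-1][0], max(end, intervals[-1][1]))
            else:
                intervals.append((start, end))
    return intervals


# Toy alignment (made up): 120 identical columns followed by 80 mismatches.
toy_human = "A" * 120 + "C" * 80
toy_chicken = "A" * 120 + "G" * 80
print(find_conserved_windows(toy_human, toy_chicken))  # [(0, 150)]
```

Because qualifying windows are merged, the reported interval can extend into flanking columns of lower identity; a real tool would typically trim or score the merged region again.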

Construction of reporter eGFP expression cassettes.

Genomic fragments corresponding to each CNE were prepared using PCR and inserted into reporter cassettes consisting of eGFP cDNA and a minimal promoter. The PCR primers were designed for each CNE using the Primer 3 website (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi). The primer sequences are listed in supplemental Table 1A (material online at www.pedresearch.org). The PCR products were treated with PWO DNA polymerase (Roche Molecular Biochemicals, Indianapolis, IN) and T4 polynucleotide kinase (Takara, Kyoto, Japan) to facilitate cloning. One of the CNE (CNE3) was divided into four blocks: subregion I (100 bp), subregion II (157 bp), subregion III (92 bp), and subregion IV (110 bp). Deletion constructs lacking each subregion were created using the Splicing by Overlap Extension PCR (SOEing PCR) technique (Fig. 1) (14). The region 5′ to the segment to be deleted and the region 3′ to the segment were generated independently using PCR. These two PCR products were joined using the SOE method and four PCR primers: an outer forward primer, an inner reverse primer, an inner forward primer, and an outer reverse primer (supplemental Table 1B, material online at www.pedresearch.org). The 5′-half of the inner reverse primer was designed to be complementary to the inner forward primer, and the 5′-half of the inner forward primer was designed to be complementary to the inner reverse primer. This complementarity between the inner primers allowed the two PCR products to overlap when they were mixed, denatured, and reannealed. Amplification of the overlapped molecules with the outer forward and outer reverse primers and DNA polymerase led to the extension of the overlap, thereby “splicing” the original PCR products. Using T4 DNA ligase (Takara), the PCR products were inserted into the polylinker SmaI site of ptkEGFP.
The plasmid ptkEGFP was constructed by inserting the Herpes simplex virus thymidine kinase promoter into the polylinker HindIII site of the pCAT3-basic vector (Promega, Madison, WI), and by replacing the CAT gene with the EGFP gene (CLONTECH, Palo Alto, CA) (11). The plasmid was cleaved with SmaI, treated with CIAP (Invitrogen, Carlsbad, CA), and ligated with the PCR products. The plasmids were transformed into competent DH5α Escherichia coli (Toyobo, Osaka, Japan) using a standard protocol and plated on LB agar containing ampicillin (100 μg/mL) as a selective antibiotic. Colonies were selected and expanded in LB medium containing ampicillin (100 μg/mL) in a rotary shaker at 37°C. The plasmids were harvested using the EndoFree plasmid Maxi kit (QIAGEN, Valencia, CA) and dissolved in DNA solution. The sequences of the cloned CNE elements were confirmed using an ABI PRISM 3100 sequence analyzer. After CNE insertion, the plasmid construct was transfected into early chick embryos. The study protocols were approved by the Animal Care Committee of Keio University School of Medicine.
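The overlap design of the inner SOEing primers described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the primer design actually used: the sequences are made up, the half-length of 10 bases is arbitrary, and the symmetric layout (each inner primer spans the junction between the two flanking regions) is only one way to satisfy the complementarity described in the text. The actual primers are those listed in supplemental Table 1B.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_inner_primers(region_a, region_c, half=10):
    """Design inner primers that span the junction between region A (5' flank)
    and region C (3' flank), so that the deleted segment B is skipped."""
    if len(region_a) < half or len(region_c) < half:
        raise ValueError("flanking regions are shorter than the primer half")
    junction = region_a[-half:] + region_c[:half]
    # Inner forward primer: 5' half matches the end of A, 3' half primes on C.
    inner_forward = junction
    # Inner reverse primer: 5' half is complementary to the start of C,
    # 3' half primes on the end of A.
    inner_reverse = reverse_complement(junction)
    return inner_forward, inner_reverse


def spliced_product(region_a, region_c):
    """Expected sequence after overlap extension: A joined directly to C."""
    return region_a + region_c


# Hypothetical flanking sequences, for illustration only.
region_a = "ATGCATGCATGGCCTTAAGG"
region_c = "TTGACCGGTAACGTAGCTAG"
fwd, rev = design_inner_primers(region_a, region_c)
print(fwd)
print(rev)
print(fwd in spliced_product(region_a, region_c))
```

In this layout the two inner primers are exact reverse complements of each other, which is what lets the two first-round PCR products anneal at the junction before the outer primers amplify the fused molecule.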

Figure 1

Schematic diagram of SOEing PCR. Two regions (A, C) flanking the region to be deleted (B) were fused by SOEing PCR. The fusion is mediated by an overlap of the two strands of PCR products that were created with the use of primers (outer forward primer, inner reverse primer, inner forward primer, outer reverse primer). The 5′ region of the inner reverse primer used in the amplification of region A is complementary to a segment of region C. The 5′ region of the inner forward primer used in the amplification of region C is complementary to a segment of region A. A recombinant product is formed when this overlap is extended in a subsequent reaction, and this recombined product is amplified with the outer primers.

Ex ovo electroporation.

The plasmid construct was transfected using ex ovo electroporation as described previously (1,11). Briefly, after 20–24 h in an incubator, HH stage 5 chick embryos (15) were harvested. The thick albumen was partially removed, a ring of filter paper was placed on the vitelline membrane, and the membrane was cut at the circumference of the filter ring. The embryo anchored to the filter ring through the vitelline membrane was transferred into warmed Hank's solution. The embryo was then placed, with the vitelline membrane side downwards, onto an electrode (CUY700P2E; NEPA GENE, Chiba, Japan). A DNA solution containing the reporter plasmid (2 μg/μL) and the marker plasmid [pDsRed1-N1 (CLONTECH), 1 μg/μL] was injected between the blastoderm and the vitelline membrane using a glass capillary. Glass capillary needles were pulled using a Narishige PN-3 horizontal puller (Narishige Scientific Instrument Lab, Tokyo, Japan) from borosilicate glass capillary tubing (GD-1, 90 mm in length, 1 mm outside diameter × 0.6 mm inside diameter; Narishige Scientific Instrument Lab); the needles were then beveled to a diameter of 0.05 mm.

An anodal electrode (CUY700P2L; NEPA GENE) was placed on the hypoblast side of the embryo. Electroporation was performed using an OVODYNE electroporator (Intracel, Herts, UK) with five pulses of 8–10 V for a duration of 50 ms and with intervals of 100 ms. The embryo was incubated at 38°C and 100% humidity. The anatomical region of successful electroporation was confirmed by the co-electroporation of a DsRed vector carrying a universal promoter. eGFP or DsRed signals were observed under LED lights. For eGFP excitation, light from 12 blue LEDs (peak wavelength: 475 nm) was intensified using a concave lens and projected onto the embryonic specimen through a band-pass filter (BI0060; Asahi Spectra, Tokyo, Japan). Emissions from the eGFP fluorescence were observed through a long-pass filter (FF01-520/35-25; Bright Line, Semrock, Rochester, NY). For the excitation of DsRed, light from 12 green LEDs (peak wavelength: 520 nm) was projected through a band-pass filter (FF01-562/40-25; Bright Line). Emissions from DsRed were observed through a long-pass filter (610LP; Thin Film Imaging Technologies, North Mankato, MN). The results are presented as superimposed eGFP and bright-field images.

In ovo electroporation.

After about 40 h of incubation (HH stage 10), chick embryos were electroporated using a slight modification of the original method (16). For in ovo electroporation, an EP21 current amplifier (Intracel) was used with an OVODYNE electroporator (Intracel). An anode (CUY610P4; NEPA GENE) was placed beneath the embryo, and a cathode (CUY610P4; NEPA GENE) was placed onto the cephalic region. A DNA solution was injected into the neural tube and electroporated with five pulses of 26 V for a duration of 50 ms and at intervals of 100 ms. After the electroporation, the embryos were incubated for another 24–48 h. The anatomical region of successful electroporation was confirmed by the co-electroporation of a DsRed vector. eGFP/DsRed signals were observed as described above.

Histologic analysis using plastic-embedded sections.

To preserve eGFP signals for histologic analysis, the electroporated embryos were embedded in plastic resin (Technovit 8100, Heraeus Kulzer) according to the manufacturer's instructions. Briefly, the embryos were fixed in 4% paraformaldehyde, washed overnight in PBS containing 6.8% sucrose at 4°C, dehydrated in 100% acetone, and embedded in Technovit 8100 using Histoform Q (Heraeus Kulzer) as a mold. The polymerized resin was fixed onto a block (Histobloc, Heraeus Kulzer) with Technovit 3040 (Heraeus Kulzer), and sections (5- to 6-μm-thick) were cut using a microtome. eGFP fluorescence was observed using a laser scanning confocal microscope (FLUOVIEW FV300; Olympus, Tokyo, Japan). The results were visualized as superimposed eGFP and Nomarski images. Nomarski images were obtained using differential interference contrast microscopy, which is an optical microscopy illumination technique used to enhance contrast in unstained samples.

In situ hybridization analysis.

We performed the in situ hybridization analysis using a previously described protocol (17). Chick EST clones corresponding to the chicken Sall1 (csal1) sequence (GenBank accession number, NM_204707) were identified using the chick EST database available on the Biotechnology and Biological Sciences Research Council's website (http://www.chick.umist.ac.uk/) (18). Among the 19 EST clones identified by a database search, ChEST818l4, which originates from the cDNA of chicken stage 22 limbs, was used as a template for riboprobe synthesis in this study.

RESULTS

Identification of conserved noncoding elements.

A comparison of the human and chicken genomic sequences revealed eight conserved elements within the SALL1 gene. However, three of the eight elements corresponded to the coding region and untranslated region of SALL1; thus, five elements were identified as CNE (Fig. 2). The percentage similarities of the CNE ranged from 75.77% to 91.93%, and the lengths of the CNE varied from 210 bp to 443 bp (Table 1). The five CNE were examined using ex ovo electroporation, and only one CNE, a 443-bp CNE in intron 1, exhibited tissue-specific enhancer activity. We named this enhancer sequence in intron 1 CNE3. The similarity between the human CNE3 genomic sequence (base position coordinates on chromosome 16 from 16pter according to the May 2004 human reference sequence NCBI Build 35: 49734498–49734940) and the chicken CNE3 sequence was 81.33%. The chicken CNE3 sequence that was cloned into the plasmid vector and injected is presented in the supplemental data (material online at www.pedresearch.org). The other four CNE did not show any enhancer activity in early-stage embryos.
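The interval lengths implied by the coordinates quoted in the text can be checked with simple arithmetic. The sketch below assumes that the quoted base positions are 1-based and inclusive, a convention the text does not state explicitly; the helper name is hypothetical.

```python
def interval_length(start, end):
    """Length in bp of a 1-based, inclusive interval (assumed convention)."""
    return end - start + 1


# Human CNE3 coordinates on chromosome 16 (NCBI Build 35), as quoted above.
print(interval_length(49734498, 49734940))  # 443

# Target region covering SALL1, as quoted in Materials and Methods.
print(interval_length(49677829, 49792653))  # 114825
```

Under this assumption the human CNE3 interval is 443 bp, matching the stated length, and the target region is about 115 kb.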

Figure 2

Genomic structure of SALL1. The closed boxes represent exons, the open box represents the UTR, and the shaded boxes represent CNEs.

Table 1 Characteristics of five CNE

Enhancer activity of CNE3 during the neurula stage.

When the chicken CNE3 plasmid construct was electroporated into the embryo at the gastrula stage (HH stage 5), the eGFP signal was visualized in the prosencephalon at HH stages 8–12 (Fig. 3). The presence of a DsRed signal covering the entire chick embryo indicated that ex ovo electroporation was successful (Fig. 3A). In HH stage 8 embryos, eGFP fluorescence, reflecting the enhancer activity, appeared in the neural tube around the anterior neuropore (before the completion of neural tube closure) (Fig. 3B). In HH stage 10 embryos, eGFP signals were present in the prosencephalon (Fig. 3C). In HH stage 12 embryos, eGFP signals were present throughout the entire prosencephalon (Fig. 3D). As the head-folding process progressed (HH stage 14), the eGFP signals became localized in the telencephalon (Fig. 3E). Insertion of the CNE3 sequence in the reverse direction did not alter eGFP expression at HH stages 8–12 (data not shown). Electroporation of the human CNE3 produced comparable enhancer activity at HH stages 9–10 (Fig. 4), but the eGFP signal became less prominent as development progressed. No eGFP signal was observed at HH stage 12 or later with human CNE3. Serial axial sections of HH stage 9 embryos embedded in plastic resin revealed that eGFP signals driven by chicken CNE3 were present predominantly in the ventral neuroepithelium between the levels of the optic vesicles and the foregut (Fig. 5). The maximum signal intensity was observed in the forebrain vesicles at the level of the optic vesicles.

Figure 3

eGFP signals driven by chicken CNE3 at the neurula stage. (A) Anatomical region of successful electroporation visualized using a DsRed signal. (B) HH stage 8 embryo. (C) HH stage 10 embryo. (D) HH stage 12 embryo. (E) HH stage 14 embryo. p, prosencephalon; s, somites; c, cardiac tube; ov, optic vesicle; e, eye; t, telencephalon. Scale bar: 1 mm.

Figure 4

Comparison of enhancer activities of human and chicken CNE3. The upper panels show superimposed eGFP and bright-field images. The lower panels show eGFP/DsRed images. Each pair of upper and lower panels shows the same embryo. (A) HH stage 10 embryo carrying human CNE3. (B) HH stage 10 embryo carrying chicken CNE3. p, prosencephalon. Scale bar: 0.5 mm.

Figure 5

Histologic analysis of eGFP signals driven by chicken CNE3 in HH stage 9 embryos. (A) Optic vesicle level. (B) Foregut level. The solid lines represent the level of each section. p, prosencephalon; s, somites; ne, neuroepithelium; ov, optic vesicle; f, foregut. Scale bar: 0.25 mm.

Enhancer activity in pharyngula-stage embryos.

The enhancer activity of chicken CNE3 in the later developmental stages of the CNS was investigated after electroporating HH stage 10 embryos in ovo. The presence of a DsRed signal covering the cephalic region of chick embryos indicated that in ovo electroporation was successful (Fig. 6, A and B). At the pharyngula stage (HH stage 15), the eGFP signal was confined within the anterior neural ridge, which represents one morphogenetic center known to participate in patterning the anterior neural plate (Fig. 6A) (19). In HH stage 20 embryos, the eGFP signal was detected on the ventral side of the telencephalic vesicle (Fig. 6B). Serial axial sections of HH stage 15 embryos revealed that eGFP signals were present only in the neuroepithelium of the ventral telencephalon (Fig. 7).

Figure 6

eGFP signals driven by chicken CNE3 at the pharyngula stage. The upper panels show superimposed eGFP and bright-field images. The lower panels show eGFP/DsRed images. Each pair of upper and lower panels shows the same embryo. (A) HH stage 15 embryo. (B) HH stage 20 embryo. t, telencephalon; e, eye; d, diencephalon; m, mesencephalon; op, otic placode. Scale bar: 1 mm.

Figure 7

Histologic analysis of eGFP signals driven by chicken CNE3 in HH stage 15 embryos. The solid line represents the level of the section. t, telencephalon; e, eye; d, diencephalon; m, mesencephalon; h, heart; oc, optic cup; ne, neuroepithelium. Scale bar: 0.5 mm.

In situ hybridization analysis.

Whole-mount in situ hybridization studies of chick embryos revealed csal1 expression in the developing brain, limb buds, and tail bud, a pattern that recapitulates that previously reported by others (9,10). Histologic studies revealed that csal1 was expressed along the entire neuroepithelium at the level of the optic vesicles at stage 11 (Fig. 8A) and on the neuroepithelium of the telencephalic vesicle at stage 20 (Fig. 8B). Hence, the csal1 gene expression pattern overlapped that of eGFP reporter expression.

Figure 8

csal1 in situ hybridization analysis. (A) HH stage 11 embryo. ov, optic vesicle; ne, neuroepithelium. (B) HH stage 20 embryo. oc, optic cup; t, telencephalon. Scale bar: 0.1 mm (A); 0.5 mm (B).

Deletion constructs.

Reporter eGFP expression cassettes containing chicken CNE3 with various deleted subregions (I–IV) were constructed and electroporated into embryos at the gastrula stage (HH stage 5). The four deletion constructs lacking 92–157 bp are depicted in Figure 9A. The length of the deleted segment was 110 bp (subregion IV) in CNE3a, 92 bp (subregion III) in CNE3b, 157 bp (subregion II) in CNE3c, and 100 bp (subregion I) in CNE3d. No eGFP expression was observed with the CNE3a- and CNE3b-reporter constructs at the neurula stages, whereas the CNE3c- and CNE3d-reporter constructs showed expression comparable to that of the original CNE3 reporter construct (Fig. 9B). To refine the minimal region required for enhancer activity, we next designed the construct CNE3e, which contained a larger deletion than CNE3c and CNE3d. Although the region retained in CNE3e was also preserved in both CNE3c and CNE3d, CNE3e did not possess any enhancer activity. Further subdivision of CNE3e into CNE3f and CNE3g also eliminated enhancer activity (data not shown).
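The logic of the single-subregion deletion analysis can be summarized as a small lookup: a subregion is considered necessary if removing it alone abolishes the reporter signal. The sketch below encodes only the four single deletions and the outcomes reported in this section; the dictionary layout and the function name are illustrative assumptions.

```python
# Deleted subregion and observed neurula-stage eGFP signal for each
# single-deletion construct, as reported in the text.
SINGLE_DELETIONS = {
    "CNE3a": {"deleted": "IV", "egfp": False},
    "CNE3b": {"deleted": "III", "egfp": False},
    "CNE3c": {"deleted": "II", "egfp": True},
    "CNE3d": {"deleted": "I", "egfp": True},
}


def necessary_subregions(deletions):
    """Subregions whose individual removal abolished the eGFP signal."""
    return sorted(d["deleted"] for d in deletions.values() if not d["egfp"])


print(necessary_subregions(SINGLE_DELETIONS))  # ['III', 'IV']
```

Necessity in this sense does not imply sufficiency: CNE3e, which carried a larger deletion, showed no enhancer activity.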

Figure 9

Analysis of chicken CNE3 containing partial deletions. (A) Design of various CNE3 mutants containing partial deletions. The partial deletions are shown by the broken lines. The shaded boxes represent the remaining genomic region of CNE3. The solid lines indicate the boundaries between each subregion (SR). (B) Enhancer activities of chicken CNE3 containing partial deletions. Scale bar: 1 mm.

Prediction of transcription factor binding sites.

Transcription factor binding sites found in both the chicken and human CNE3 sequences were predicted using rVISTA2.0 software. The numbers of putative transcription factor binding sites conserved in the chicken and human genomes were 2 in subregion I, 8 in subregion II, 17 in subregion III, and 10 in subregion IV (Fig. 10). One cluster of transcription factor binding sites was found in each of subregions III and IV. Potential binding sites for SOX9, ZIC1, ZIC2, and ZIC3 were clustered in subregion III.

Figure 10

Transcription factor binding sites in CNE3 predicted using rVISTA2.0 software. The solid lines indicate the boundaries between each subregion.

DISCUSSION

We defined a control element, CNE3, within the SALL1 gene that directs spatio-temporal regulation in the forebrain at the neurula and pharyngula stages. The spatial distribution of the enhancer activity at the neurula stage recapitulated the SALL1 expression pattern previously demonstrated using in situ hybridization (7–10). The CNE3 element exhibited activity in both the sense and antisense directions, thereby fulfilling the requirements of an enhancer element (20). CNE3 is the first SALL1 enhancer to be identified. To better understand the molecular mechanism regulating CNE3, we isolated the essential genomic region of CNE3 required to maintain enhancer activity using CNE3 elements containing four types of deletions. Subregions III and IV can be regarded as essential for the enhancer activity of CNE3, because the CNE3a and CNE3b elements, which are devoid of subregions IV and III, respectively, did not exhibit the original CNE3 enhancer activity.

A database search for potential transcription factor binding sites using a pattern-matching approach revealed that the ZIC family protein binding motif was present in subregion III, which is essential for CNE3 enhancer function. The ZIC family protein binding motif is shared among three Zic proteins: Zic1, Zic2, and Zic3. Among these three proteins, ZIC2 is known to play an important role in forebrain morphogenesis (21). Hence, ZIC2 represents a candidate protein that may act as an upstream regulator of CNE3. Subregion III of CNE3 also included a potential binding site for another transcription factor, SOX9. SOX9 is expressed in the forebrain; therefore, SOX9 might be an upstream molecule of SALL1 CNE3 (22,23). The discussion above relies on computer-based prediction and not on actual experiments. An in vitro binding assay is required to test whether ZIC family or SOX9 proteins indeed bind to CNE3, because such putative binding sites occur by chance at a high frequency in genomic DNA. Indeed, a total of three SOX9 binding sites, three ZIC1 binding sites, three ZIC2 binding sites, and five ZIC3 binding sites were present in intron 1 of the human SALL1 genomic sequence, and one SOX9 binding site was present in intron 2 (data not shown).

CNE3 activity was present only in the ventral aspect of the forebrain, suggesting that CNE3 activity is under the control of ventralizing signaling in the developing forebrain. Shh may represent one potential upstream signaling molecule. SALL1 is regulated by Shh signaling in the chick limb bud and mouse embryonic kidney. Farrell et al. (10) documented that forced Shh expression using bead implantation into limb buds resulted in the up-regulation of SALL1 in chicken. Hu et al. (24), using Shh knockout mice, revealed that Shh deficiency decreased the expression of SALL1 in the embryonic kidney. In the developing CNS and paraxial mesoderm of fish (medaka), Koster et al. (25) documented that ectopic Shh activity expanded the Sal expression domain.

The observation that the CNE3e element, which includes subregions III and IV, did not exert enhancer activity indicates that subregions III and IV alone are not sufficient for enhancer activity. We hypothesize that unidentified critical transcription factor binding site(s) may be present within subregions I and II, given that the current database of consensus binding sequences for transcription factors remains incomplete.

We successfully identified a forebrain-specific enhancer element using a comparative genomic approach, reconfirming the notion that genome comparisons are helpful for identifying regulatory elements (1,2). We investigated five CNE and found that one of the five elements conferred enhancer activity. Hence, the sensitivity of our enhancer screening was 20%. This figure is considerably lower than those reported for enhancer screening experiments in other species (i.e. 40–90%) (3,26). The most likely explanation for this seemingly low sensitivity is a technical limitation of our assay system, which is based on cultured chick embryos and does not allow us to observe enhancer activity later than the neurula stages. The four apparently “negative” conserved noncoding elements might act as enhancer elements exclusively at developmental stages later than neurulation. Alternatively, these elements may function as repressors, rather than enhancers.

We have not identified critical enhancers that define expression in the limbs, caudal end, and branchial arches, the primordia of the organs affected in Townes-Brocks syndrome. Because we did not evaluate any CNE outside the gene (i.e. in the regions 5′ and 3′ to the SALL1 coding region), we cannot comment on whether some of these CNE may regulate SALL1 in the limbs, caudal end, and branchial arches.

In the present study, we have illustrated that a relatively simple screen using the principle of comparative genomics and electroporation into chick embryos can identify tissue-specific enhancers. From a basic science research standpoint, the systematic discovery of tissue-specific enhancer elements would advance our understanding of the complex network of developmental regulatory genes. From a translational research standpoint, the identification of tissue-specific enhancer elements will facilitate the design of safer gene transfer strategies that minimize adverse events.