Psychiatric disorders are thought to be multifactorial complex traits with partial predisposition driven by genetic variation altering expression or protein function. While genetic variants have been linked to certain psychiatric conditions and gene expression approaches have identified expression alterations, few studies have combined genetic and expression alterations in neuropsychiatric conditions.
Exon arrays have been proposed for the study of gene- and exon-level expression alterations, such as alternative splicing alterations. For this purpose, the Affymetrix Human Exon 1.0 ST array, containing over 5.5 million probes grouped into 1.4 million probe sets, can be used to interrogate expression of the entire human transcriptome at the individual exon level. Here, we propose a method to also use exon arrays for the study of genetic variation in coding exons, taking advantage of the presence of a large number of probes in regions of the transcriptome that contain single nucleotide polymorphisms (SNPs) (1.99 million SNPs in more than 5.5 million total probes).
We explored patterns of gene expression in 10 male subjects from Costa Rica (five schizophrenia probands and five gender-matched relatives) using lymphoblastoid cell line (LCL) RNA and Affymetrix Human Exon 1.0 ST arrays. Although many brain-specific genes are not expressed in LCLs (Vawter MP, unpublished result) and it is not possible to model in LCLs the complex processes and anatomical interactions taking place in a human brain, LCLs have several advantages over tissue such as brain. Viable lymphocytes can be readily obtained from large numbers of cases and controls and, unlike human brain samples from autopsies or biopsies, can be experimentally manipulated. Drug effects and other confounds can be controlled in culture to reduce their direct effects on gene expression.
Gene- and exon-level changes between affected subjects and unaffected controls were observed and confirmed by quantitative polymerase chain reaction (Q-PCR). These expression changes often did not involve probes with coding SNPs. The entire microarray experiment was repeated using independently grown LCLs from the same subjects and yielded similar expression profiles.
Interestingly, high exon-level individual variation (110–32 000 fold) was observed on 8151 probe sets, suggestive of alternative splicing differences; however, at least one-half of these probes contained an SNP. Some subjects exhibited background-level expression for one exon within a gene, while other exons showed normal expression levels. Such variation might be due to alternative splicing of exons or perhaps to hybridization effects caused by coding SNPs. Exon-level analysis of the DPM2 (dolichyl-phosphate mannosyltransferase polypeptide 2, regulatory subunit) gene is presented here as an example (Figure 1a). The fourth exon of this gene showed a 100-fold variation in expression among subjects (subject effect P=10⁻¹¹). A marked reduction in expression was also observed for the four probes (322624.1–322624.4) composing the probe set (data not shown). Two SNPs (rs7997 and rs6781) were found to be present in the DPM2 exon 4 probe set target sequence, and their impact was further explored by allele-specific Q-PCR. Different sets of primers were designed to detect expression from the C (C-primer) and the G (G-primer) alleles of the rs7997 SNP, which leads to a Ser/Thr missense mutation. A third set of primers was designed to detect expression levels irrespective of the genotype for normalization purposes.
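As an illustration of how the between-subject variation quoted above can be expressed, the following minimal sketch computes the largest fold difference between any two subjects for one probe set. It assumes log2-scale intensities with one value per subject; the function name and the example values are hypothetical and are not taken from the study data.

```python
def fold_range(log2_signals):
    """Largest fold difference between any two subjects for one probe set.

    `log2_signals` is assumed to hold one log2-scale intensity per subject.
    """
    if len(log2_signals) < 2:
        raise ValueError("need at least two subjects")
    # A difference of d on the log2 scale corresponds to a 2**d fold change.
    return 2.0 ** (max(log2_signals) - min(log2_signals))


# Made-up log2 intensities for five subjects on a single exon probe set:
# two subjects sit near background while the other three are well expressed.
example = [10.1, 9.8, 3.4, 10.0, 3.6]
print(round(fold_range(example), 1))
```

A large value from such a calculation does not by itself distinguish alternative splicing from an SNP-driven hybridization effect; it only flags probe sets that merit probe-level follow-up.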
A genotype-specific pattern of expression was observed in the Q-PCR experiment with the allele-specific primers for DPM2 exon 4 (Figure 1b). This indicated that the observed expression pattern was due to the presence of an SNP in the probe. When the proper allele-specific primers were used, normal expression levels were in fact observed for all subjects (Figure 1b). The absence of alternative splicing was also confirmed with another set of primers designed to detect expression in the DPM2 exon 4 irrespective of the genotype. Each subject's DNA was also resequenced to confirm the genotypes detected with allele-specific PCR of cDNA. We also confirmed that the same DPM2 SNP affected microarray hybridization in an independent sample of control subjects of northern European ancestry and validated the same effect by allele-specific Q-PCR.
This experiment illustrates the power and specificity of exon arrays to detect not only exon-specific expression but also allele-specific expression within an exon (Figure 1) and underlines the utility and importance of characterizing expression more comprehensively by also taking genetic information into account. Although the Affymetrix exon array has been used to study tissue-specific alternative splicing (http://www.affymetrix.com/support/technical/technotes/id_altsplicingevents_technote.pdf), we observed that multiple apparent instances of alternative splicing might be more consistent with coding SNPs influencing probe hybridization in one tissue type. Further, because many researchers use algorithms that average individual probes into probe set values, such summaries will inadvertently include probes with hybridization artifacts.
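The effect of averaging on probe set values can be sketched as follows. This is a simplified illustration, not the summarization algorithm of any particular software package: it assumes log2 probe intensities and a hypothetical per-probe flag marking probes that overlap a known SNP, and it compares a plain mean with a mean that excludes flagged probes.

```python
from statistics import mean, median


def summarize_probe_set(log2_probes, snp_flags=None, robust=False):
    """Collapse probe-level log2 intensities into one probe set value.

    snp_flags (hypothetical annotation): one bool per probe, True when the
    probe overlaps a known SNP. Flagged probes are left out of the summary.
    """
    values = list(log2_probes)
    if snp_flags is not None:
        if len(snp_flags) != len(values):
            raise ValueError("snp_flags must match log2_probes in length")
        kept = [x for x, flagged in zip(values, snp_flags) if not flagged]
        # If every probe is flagged there is nothing left to average, so
        # fall back to all probes rather than return no value.
        if kept:
            values = kept
    return median(values) if robust else mean(values)


# Made-up probe set: three well-hybridizing probes and one SNP-affected probe
probes = [10.0, 10.0, 10.0, 2.0]
flags = [False, False, False, True]
print(summarize_probe_set(probes))         # plain mean, pulled down by the SNP probe
print(summarize_probe_set(probes, flags))  # mean of the unflagged probes only
```

Masking helps only when some probes in the set are free of SNPs. When all probes of a probe set are affected, as was observed for the four DPM2 exon 4 probes, the fallback branch applies and the probe set value remains confounded.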
There is currently no commercial high-throughput platform to capture the effects of both genetics and gene expression jointly, although results using both cDNA and nuclear DNA on separate platforms have been recently published.1 It has been suggested in the past that probe sets with SNPs can confound expression analysis using Affymetrix expression microarrays.2 We leveraged this phenomenon to study genetic variation of coding exons in psychiatric disorders. This could be particularly useful considering that gene expression has been demonstrated to be under genetic control and is therefore highly dependent on ethnic background.3 So far, gene expression studies investigating neuropsychiatric disorders using postmortem tissue have in general not taken into consideration the influence of ethnic background on gene expression variation. Two sources of variation could explain ethnic gene expression differences: a population-specific effect due to the presence of a relatively new SNP, or stratification of a common SNP by population. Thus, gene expression studies using subjects with mixed genetic backgrounds can be affected by genetic variation that alters probe hybridization on microarray platforms. We caution that the ethnic gene expression differences recently reported by Spielman et al.3 may be accounted for by ethnic-specific probes causing differential hybridization to cDNA, since ∼40% of the Affymetrix probes contain SNPs.
In conclusion, we propose using exon array probe-level analysis to obtain genotypic information for coding SNPs by comparing probes that contain SNPs to probes that do not contain SNPs. Of the 1.99 million Affymetrix probes on the exon array that contain an SNP, it is estimated that the strongest disruption of hybridization will occur when the SNP is located within the middle 15 bp of the probe sequence, which occurs 127 404 times on the exon array with a minor-allele frequency of at least 5%. This means that there are at least 127 404 common exonic SNPs with a maximal hybridization effect on the chip. This analysis can provide genotype-specific expression in cDNA on a single array. The combination of expression and genotypic data from the same subjects using the same platform should allow a very efficient study of cis-regulatory variation.
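The proposed probe-level comparison can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the analysis pipeline used in this study: it assumes 25-mer probes (the standard Affymetrix probe length), a 0-based SNP offset within the probe, log2 intensities, and an arbitrary illustrative threshold. All function names are hypothetical.

```python
from statistics import mean

PROBE_LENGTH = 25    # assumption: standard Affymetrix 25-mer probes
CENTRAL_WINDOW = 15  # middle 15 bp, where disruption is expected to be strongest


def snp_in_central_window(snp_offset, probe_length=PROBE_LENGTH,
                          window=CENTRAL_WINDOW):
    """True if a SNP at 0-based `snp_offset` lies in the central window."""
    if not 0 <= snp_offset < probe_length:
        raise ValueError("snp_offset must fall inside the probe")
    flank = (probe_length - window) // 2  # 5 bases on each side for 25/15
    return flank <= snp_offset < probe_length - flank


def snp_probe_log_ratio(snp_probe_log2, reference_probes_log2):
    """log2 signal of a SNP probe minus the mean of SNP-free probes."""
    return snp_probe_log2 - mean(reference_probes_log2)


def crude_hybridization_call(log_ratio, threshold=-2.0):
    """Label a probe by its signal drop; the threshold is purely illustrative."""
    return "reduced" if log_ratio <= threshold else "reference-like"


# Made-up example: one SNP probe compared with three SNP-free probes
# from the same gene in the same subject.
ratio = snp_probe_log_ratio(4.0, [10.0, 9.5, 10.5])
print(snp_in_central_window(12), ratio, crude_hybridization_call(ratio))
```

A reduced signal relative to SNP-free probes is compatible with the subject carrying an allele that mismatches the probe, but it could equally reflect a genuine exon-level expression difference, so calls of this kind would still need confirmation, for example by allele-specific Q-PCR or resequencing as described above.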