Cell-type-specific expression quantitative trait loci associated with Alzheimer disease in blood and brain tissue

Because regulation of gene expression is heritable and context-dependent, we investigated AD-related gene expression patterns in cell types in blood and brain. Cis-expression quantitative trait locus (eQTL) mapping was performed genome-wide in blood from 5257 Framingham Heart Study (FHS) participants and in brain donated by 475 Religious Orders Study/Memory & Aging Project (ROSMAP) participants. The association of gene expression with genotypes for all cis SNPs within 1 Mb of genes was evaluated using linear regression models for unrelated subjects and linear-mixed models for related subjects. Cell-type-specific eQTL (ct-eQTL) models included an interaction term for the expression of “proxy” genes that discriminate particular cell type. Ct-eQTL analysis identified 11,649 and 2533 additional significant gene-SNP eQTL pairs in brain and blood, respectively, that were not detected in generic eQTL analysis. Of note, 386 unique target eGenes of significant eQTLs shared between blood and brain were enriched in apoptosis and Wnt signaling pathways. Five of these shared genes are established AD loci. The potential importance and relevance to AD of significant results in myeloid cell types is supported by the observation that a large portion of GWS ct-eQTLs map within 1 Mb of established AD loci and 58% (23/40) of the most significant eGenes in these eQTLs have previously been implicated in AD. This study identified cell-type-specific expression patterns for established and potentially novel AD genes, found additional evidence for the role of myeloid cells in AD risk, and discovered potential novel blood and brain AD biomarkers that highlight the importance of cell-type-specific analysis.


Introduction
Recent expression quantitative trait locus (eQTL) analysis studies suggest that changes in gene expression have a role in the pathogenesis of AD 1,2 . However, regulation of gene expression, as well as many biological functions, has been shown to be context-specific (e.g., tissue and cell types, developmental time point, sex, disease status, and response to treatment or stimulus) [3][4][5][6] . One study of 500 healthy subjects found over-representation of T cellspecific eQTLs in susceptibility alleles for autoimmune disease and AD risk alleles polarized for monocytespecific eQTL effects 7 . In addition, disease and traitassociated cis-eQTLs were more cell-type-specific than average cis-eQTLs 7 . Another study classified 12% of more than 23,000 eQTLs in blood as cell-type-specific 4 . Large eQTL studies across multiple human tissues have been conducted by the GTEx consortium, with a study on genetic effects on gene expression levels across 44 human tissues collected from the same donors characterizing patterns of tissue specificity recently published 8 .
Microglia, monocytes, and macrophages share a similar developmental lineage and are all considered to be myeloid cells 9 . It is believed that a large proportion of AD genetic risk can be explained by genes expressed in myeloid cells and not other cell types 10 . Several established AD genes are highly expressed in microglia 9,11 , and a variant in the AD-associated gene CELF1 has been associated with lower expression of SPI1 in monocytes and macrophages 10 . AD risk alleles have been shown to be enriched in myeloid-specific epigenomic annotations and in active enhancers of monocytes, macrophages, and microglia 12 , and to be polarized for cis-eQTL effects in monocytes 7 . These findings suggest that a cell-type-specific analysis in blood and brain tissue may identify novel and more precise AD associations that may help elucidate regulatory networks. In this study, we performed a genome-wide cis ct-eQTL analysis in blood and brain, respectively, then compared eQTLs and celltype-specific eQTLs (ct-eQTLs) between brain and blood with a focus on genes, loci, and cell types previously implicated in AD risk by genetic approaches.

Study cohorts Framingham Heart Study (FHS)
The FHS is a multigenerational study of health and disease in a prospectively followed community-based and primarily non-Hispanic white sample. Procedures for assessing dementia and determining AD status in this cohort are described elsewhere 13 . Clinical, demographic, and pedigree information, as well as 1000 Genomes Project Phase 1 imputed SNP genotypes and Affymetrix Human Exon 1.0 ST array gene expression data from whole blood, were obtained from dbGaP (https://www.ncbi.nlm.nih.gov/projects/gap/cgibin/study.cgi?study_id=phs000007.v31.p12). Requisite information for this study was available for 5257 participants. Characteristics of these subjects are provided in Supplementary Table S1.

Religious Orders Study (ROS)/Memory and Aging Project (MAP)
ROS enrolled older nuns and priests from across the US, without known dementia for longitudinal clinical analysis and brain donation and MAP enrolled older subjects without dementia from retirement homes who agreed to brain donation at the time of death 14 (http:// www.eurekaselect.com/99959/article). RNA-sequencing brain gene expression and whole-genome sequencing (WGS) genotype data were obtained from the AMP-AD knowledge portal (https://www.synapse.org/#!Synapse: syn3219045) (https://www.synapse.org/).

Data processing
Generation, initial quality control (QC), and preprocessing procedures of the FHS GWAS and expression data are described elsewhere 13 . Briefly, the Robust Multichip Average (RMA) method 15,16 was used for background adjustment and normalization of gene expression levels and further adjusted for the first principal component of ancestry. ROSMAP gene expression data were log-normalized and adjusted for known and hidden variables detected by surrogate variable analysis (SVA) 17 in order to determine which of these variables should be included as covariates in analysis models for eQTL discovery. Additional filtering steps of FHS and ROSMAP GWAS and gene expression data included eliminating subjects with missing data, restricting gene expression data to protein-coding genes, and retaining common variants (MAF ≥ 0.05) with good imputation quality (R 2 ≥ 0.3).

Cis-eQTL mapping
Cis-eQTL mapping was performed using a genomewide design ( Supplementary Fig. S1). The association of gene expression with SNP genotypes for all cis SNPs within 1 Mb of protein-coding genes was evaluated using linear-mixed models adjusting for family structure in FHS and linear regression models for unrelated individuals in ROSMAP. In FHS, lmekin function in the R kinship package (version 1.1.3) 18 was applied assuming an additive genetic model with covariates for age and sex, and family structure modeled as a random-effects term for kinship-a matrix of kinship coefficients calculated from pedigree structures. The linear model for analysis of FHS can data be expressed as follows: where Y i is the expression value for gene i, G j is the genotype dosage for cis SNP j, Aij and S ij are the covariates for age and sex respectively, U ij is the random effect for family structure, and β 1 , β 2 , and β 3 are regression coefficients.
ROSMAP data were analyzed using the lm function in the base stats package in R (http://www.R-project.org/). The regression model, which included covariates for age, sex, postmortem interval (PMI), study (ROS or MAP), and a term for a surrogate variable (SV1) derived from analysis of high dimensional data, can be expressed as: where Y i is the expression value for gene i, G j is the genotype dosage for cis SNP j, Aij, S ij , PM ij , S2 ij , and SV1 ij are the covariates for age, sex, PMI, study, and SV1, respectively, ɛ ij is the residual error, and the βs are regression coefficients.

Cis ct-eQTL mapping
Models testing associations with cell-type-specific eQTLs (ct-eQTLs) included an interaction term for expression levels of "proxy" genes that represent cell types. Proxy genes representing ten cell types in whole blood 4 and five cell types in brain [19][20][21] were incorporated in cell-type-specific models (Supplementary Table S2). These proxy genes for cell types in blood were established previously using BLUEPRINT expression data to validate cell-type-specific expression in each cell-type module 4 and the proxy genes for brain cell types have been incorporated in several studies [19][20][21] . Cell-type-specific expression analyses in blood of FHS participants were conducted using the following model: where in each eQTL ij pair, Y i is the eQTL expression value for gene i, G j is the genotype dosage for cis SNP j, P is the proxy gene, P * G j is the interaction term representing the effect of genotype in a particular cell type, Aij and S ij are covariates for age and sex, respectively, U ij is the random effect for family structure, and βs are regression coefficients. Models with significant interaction terms indicate cell-type-specific eQTLs.
The following model was used to evaluate cell-typespecific expression in the brain in ROSMAP: where in each eQTL ij pair, variables Y i , G j , P, Aij, S ij , P ij , ɛ ij , and βs are as described above, and PM ij , S2 ij , and SV1 ij are covariates for PMI, study, and SV1, respectively. A Bonferroni correction was applied to determine the significance threshold for each analysis (Supplementary  Table S3).
We assessed the relevance of the significant findings more directly to AD in two ways. In one approach, AD status was included as a covariate in the eQTL and ct-eQTL analysis models. In addition, the significant eQTLs and ct-eQTLs were evaluated separately in AD cases and controls separately in the ROSMAP brain expression dataset, but not in the FHS blood expression dataset due to the paucity of AD cases (2%) in that sample.

Selection of eQTLs in AD loci and gene-set pathway enrichment analysis
AD loci were determined based on the review of published GWAS and linkage studies of AD and AD-related traits, and this list was augmented with genes that are well recognized as functionally related to AD by experimental approaches (Supplementary Table S4). AD genes identified by GWAS met genome or study-wide significance thresholds and some of these were annotated as the closest gene to an intergenic association signal. eGenes (genes whose expression levels are associated with variation at a particular eSNP) included 88 genes and 80 eSNPs (no SNPs that significantly influence gene expression) which include genome-wide significant "peak" SNPs (i.e., top-ranked SNP within an association signal) for AD. Gene-set enrichment analysis was performed using the PANTHER (Protein ANalysis THrough Evolutionary Relationships) software tool 22 to determine if the unique genes in the significant eQTL/ct-eQTL pairs shared by both brain and blood datasets are associated with a specific biological process or molecular function. The significance of the pathways was determined by the Fisher's Exact test with false discovery rate (FDR) multiple test correction.

Colocalization analyses
Assessment of causal variants shared by adjacent GWAS and eQTL signals was performed using a Bayesian colocalization approach implemented in the R package coloc 23 . This analysis incorporated SNP summary statistics from a recent large AD GWAS 24 and eQTL analyses described above. For the purpose of this study, a peak SNP refers to the most significantly associated AD-SNPs under a particular GWAS signal and a lead eQTL variant is defined as the eSNP showing the strongest association with gene expression. Following recommended guidelines, the variants were deemed to be colocalized by a high posterior probability that a single shared variant is responsible for both signals (PP4 > 0.8) 23,25 . A lower threshold for statistical significance with a false discovery rate (FDR) < 0.05 for eQTL significant results was applied to maximize detection of colocalized pairs. Regional plots were constructed with LocusZoom 26 .

Differential expression analysis of potential AD biomarker genes
The 386 distinct eGenes in shared eQTL pairs in significant blood and brain results were further examined for differentially expressed genes (DEG) between AD cases and controls in the AD enriched ROSMAP RNA-Seq dataset. After filtering, 308 of the total 386 genes were tested in the DEG analysis. The differences in expression among the groups were computed using the log2 transformation of the fold-change (log2FC). The differential analysis was performed using a linear model to identify DE genes between AD cases and controls implemented in R package limma (Linear Model for Microarray Data) version 3.32.7 (http://www.R-project.org/). The P values were adjusted for multiple testing to control the False Discovery Rate (FDR), with the gene considered DE when the adjusted P value was ≤0.05.
This study was approved by the Boston University Institutional Review Board.

Results
A total of 173,857 eQTLs and 51,098 ct-eQTLs in the brain, and 847,429 eQTLs and 30,405 ct-eQTLs in blood were significant after Bonferroni correction (Supplementary Table S3 and Supplemental Resources). Additional significant gene-SNP eQTLs pairs in the brain (n = 11,649) and blood (n = 2533) were observed in ct-eQTL analysis that were not detected in eQTL analysis (Fig. 1A).
eQTLs and ct-eQTLs common to blood and brain Of note, 24,028 significant gene-SNP eQTL pairs were shared between blood and brain. The 386 distinct eGenes among these shared eQTL pairs (Supplementary Table  S5) are most enriched in the apoptosis signaling (P = 0.023) and Wnt signaling (P = 0.036) pathways (Supplementary Table S6). Five of these eGenes (HLA-DRB5, HLA-DRB1, ECHDC3, CR1, and WWOX) were previously associated with AD 24,27 . Three eSNPs in eQTLs involving HLA-DRB1/HLA-DRB5 (rs9271058) and ARL17A/ LRRC37A2 (rs2732703 and rs113986870, which are near KANSL1 and MAPT) were previously associated with AD risk at the genome-wide significance level 24,28 (Table 1). eQTLs involving CR1, ECHDC3, and WWOX were much more significant in the brain than blood, whereas HLA-DRB5 and HLA-DRB1 were more significant in blood when comparing the effect sizes. ECHDC3 was a significant eGene in blood and brain eQTLs (specifically in neurons). HLA-DRB5 and HLA-DRB1 were the only eGenes ascribed to significant ct-eQTLs in both blood and brain noting that of the ten distinct lead eSNPs, five are unique to each tissue (Table 1). Although the eQTLs involving these genes with the largest effect were observed in blood across multiple cell types, the total number of significant eSNP-eGene combinations was far greater in brain (particularly in microglia and neurons). The only instance in which the lead eSNP is also associated with AD risk at the GWS level was observed in the blood eQTL pair of HLA-DRB1 with eSNP rs9271058 (Table 1). Among the AD-associated SNPs at the GWS level, rs9271058 is a significant eSNP for HLA-DRB1 in both blood and brain cell types (the most significant association by P value was observed in antibacterial cells and microglia) and rs9271192 is a significant ct-eQTL for the gene in multiple brain cell types ( Table 1). Both of these SNPs are also eSNPs for HLA-DB5 in the brain in neurons only.
There were 657 gene-SNP eQTL pairs comprising 16 unique eGenes that were significant in blood and brain overall as well as in specific cell types in both blood and brain (Supplementary Table S7). None of these eGenes were observed in significant pathways enriched for AD genes, however, they included AD-associated genes HLA-DRB1 and HLA-DRB5. Fig. 1 Significant gene-SNP eQTLs and ct-eQTLs in blood and brain tissue genome-wide. A Venn diagram shows the number of overlapping eQTLs and ct-eQTLs in blood and brain. Gold color indicates significant eQTLs that are cell-type-specific. Orange color indicates significant eQTLs that are shared between blood and brain. B Cell-type distributions of significant genome-wide ct-eQTL results in blood and brain. Table 1 eQTLs and ct-eQTLs in established AD loci appearing in both blood and brain.
(A) eQTLs and ct-eQTLS in established AD genes in both blood and brain. eGene

eQTLs and ct-eQTLs among previously established AD loci
Slightly more than half (42/80 = 52.5%) of the established AD associations (Supplementary Table S3) are eGene targets for significant eQTLs in blood (Supplementary Table S8). By comparison, only seven established AD loci were eGene targets for significant eQTLs in the brain, among which OARD1 was significant in endothelial cells only (Supplementary Table S8). Many GWS SNPs for AD risk are eSNPs affecting the expression of the nearest gene, which is usually recognized as the causative gene, but several GWS SNPs target other genes (Supplementary Table S9). For example, AD-associated eSNPs rs113986870 and rs2732703 in the MAPT/KANSL1 region target ARL17A in blood, but are paired in seven of eight eQTLs and ct-eQTLs with LRRC37A2 in the brain (Supplementary Table S9). HLA-DRB1 is the only AD gene with a significant ct-eQTL in blood, whereas many AD genes have significant blood eQTLs. In the brain, only four AD loci (CR1, HLA-DRB1/DRB5, IQCK, and MAPT/ KANSL1) have significant brain eQTLs of which HLA-DRB1/DRB5 and MAPT/KANSL1 are the only brain ct-eQTLs, noting that all are significant in microglia, neurons, and endothelial cells.
Next, we evaluated whether the most significant eSNPs and SNPs genome-wide significantly associated with AD status (i.e., AD-SNPs) co-localize and thus to identify a single shared variant responsible for both signals (posterior probability of shared signals (PP4) > 0.8). This analysis revealed eight eQTL/ct-eQTL signals that colocalized with seven AD GWAS signals and half of the colocalized signals involved a ct-eQTL (Table 2 and Supplementary Fig. S2). Two different eSNPs for CD2AP, rs4711880 (eQTL P = 1.4 × 10 −104 ) and rs13201473 (NK/ CD8 + T cell ct-eQTL P = 1.47 × 10 −9 ), flank CD2AP GWAS SNP rs10948363 which is also the second most significant eQTL (P = 2.32 × 10 −104 ) and the second most significant ct-eQTL in NK cells/CD8 + T cells (P = 2.66 × 10 −9 ). These three SNPs span a 9.0-kb region in intron 2 and are in complete linkage disequilibrium (LD, r 2 = 1.0), indicating that any one or more of them could affect the function of target gene CD2AP. Rs6557994 is the most significant eSNP for and located in PTK2B (blood interferon ct-eQTL P = 2.58 × 10 −9 ) and is moderately correlated with the PTK2B GWAS SNP (rs28834970, r 2 = 0.78, P = 1.58 × 10 −9 ). Thus, it is not surprising that rs6557994 is also significantly associated with AD risk (P = 8.19 × 10 −7 ). Rs6557994 is also correlated with a GWAS SNP in CLU, located approximately 150 kb from PTK2B, that is not significantly associated with the expression of any gene. Because PTK2B and CLU are independent AD risk loci 27 , it is possible that this eSNP has an effect on AD pathogenesis through independent pathways (Supplementary Fig. S2). The most significant eSNP in MADD (rs35233100, P = 2.88 × 10 −10 ) was predicted to have functional consequences because it is a stop-gained mutation. This brain eQTL is colocalized (PP4 = 0.95) and weakly correlated with a GWAS SNP (P = 1.91 × 10 −5 ) in CELF1 rs10838725 (r 2 = 0.12).

ct-eQTLs genome-wide
Examination of the distribution of the significant ct-eQTL results genome-wide showed that nearly two-thirds of the ct-eQTLs in blood occurred in interferon response/ antibacterial cells which are defined as type I interferon viral response cells in upregulated genes and type II interferon antibacterial inflammatory response cells in downregulated genes 4 , whereas brain ct-eQTLs are highly represented in endothelial cells, neurons, and microglia ( Fig. 1B and Supplementary Table S10). Further examination of significant results within myeloid cell lineages (i.e., microglia and monocytes/macrophages) which account for a large proportion of the genetic risk for late-onset AD 10 revealed that 3234 or 10.6% of all significant ct-eQTLs in blood were in monocytes/macrophages. This subset includes 128 unique eGenes which are significantly enriched in the AD amyloid secretase pathway (FDR P = 0.013, Supplementary Table S11). A total of 974 or 30.1% of ct-eQTLs including 4 of the 20 most significant eGenes in monocytes/macrophages are located within 1 Mb of established AD loci. One of the eGenes in this top-ranked group (HLA-DRB5) is an established AD gene, and three others that are near established AD loci (DLG2 near PICALM 29 , C4BPA near CR1 30 , and MYO1E near ADAM10 31 ) are reasonable AD gene candidates based on evidence using non-genetic approaches (Table 3). Microglia accounted for 15,560 (30.5%) of significant ct-eQTLs in the brain (Supplementary Table S10) which involved 304 unique eGenes. Approximately 52% of significant ct-eQTLs in microglia are located in AD regions including five of the 20 most significant ct-eQTLs in this group ( Table 3). One of these five eGenes is an established AD gene (HLA-DRB1) and two others (ALCC 32 and WNT3 33 ) have been linked to AD in previous studies.

Overlap of eQTLs and ct-eQTLs among myeloid cell types
Considering significant eGene-eSNP pairs in myeloid cell types, 251 pairs including five distinct eGenes (BTNL3, FAM118A, HLA-DOB, HLA-DRB1, and HLA-DRB5) are shared between microglia and monocytes/ macrophages (Table 4A and Fig. 2A). Three of these pairs involving eSNPs rs3763355, rs3763354, and rs1183595100 have the same target gene HLA-DOB and occur only in microglia and monocytes/macrophages (Table 4B). Among the significant ct-eQTLs in the brain, the cell types with the largest proportion that were also significant in monocytes/macrophages were microglia (1.6%) and neurons (1.3%) ( Table 4). Conversely, among the significant ct-eQTLs in blood, the cell types with the largest proportion that were also significant in microglia were NK/CD + T cells (12.9%) and monocytes/macrophages (7.8%). Among ct-eQTLs which are significant only for one cell-type each in blood and one in the brain, monocytes/macrophages shared three ct-eQTLs with microglia but with no other brain cell types (Fig. 2B and Table 4C). By comparison, microglia shared 63 ct-eQTLs with interferons/antibacterial cells, but with no other blood cell types. The proportions of overlap of ct-eQTLs between blood and brain across ten paired cell types are significantly different (Fisher's Exact test χ 2 9 = 789.8, P = 2.2 × 10 −16 ). The much larger number of ct-eQTLs in microglia that were common with interferons/bacterial cells than monocytes/macrophages may reflect the substantially greater proportion of significant eQTLs in blood involving interferons/antibacterial cells (64%) than monocytes/macrophages (10.6%) (Supplementary Table  S10). The only other ct-eQTLs that were unique to a pair of cell types in brain and blood cell type involved neurons paired with neutrophils (n = 3) and with interferons/ antibacterial cells (n = 65) (Fig. 2B).

Effect of AD status on significant eQTLs and ct-eQTLs
None of the significant eQTLs and ct-eQTLs observed in the brain (Table 1) were influenced by the inclusion of AD status in the analysis models. Stratified analyses revealed that the top findings involving eSNPs that were previously associated with AD at the genome-wide significant level were evident in both AD cases and controls (Supplementary Table 12A). Although most of the findings were more significant in AD cases than controls (noting that the ROSMAP brain sample of AD cases was 44% larger than the control sample), the effect size for most eSNP-eGene pairs was similar. However, patterns among AD cases and controls differed when focusing on the most significant eQTLs and ct-eQTLs in established AD genes. For example, eQTLs observed in undifferentiated brain cells involving CR1 paired with rs6656401 (P = 7.85 × 10 −22 ), in endothelial cells involving HLA-DRB1 paired with rs73399473 (P = 2.5 × 10 −10 ) and HLA-DRB5 paired with rs1064697 (P = 2.18 × 10 −14 ), in microglia involving HLA-DRB1 paired with rs72847627 (P = 4.43 × 10 −51 ), and in neurons involving ECHDC3 paired with rs866770710 (P = 5.79 × 10 −13 ) were significant only in AD cases (Supplementary Table 12B). Other eQTLs observed in multiple cell types involving these same genes (HLA-DRB1: rs111976080, P = 1.68 × 10 −25 ; HLA-DRB5: rs2395517, P = 8.64 × 10 −12 , rs9271184, P = 5.42 × 10 −41 , and rs80141235, P = 3.94 × 10 −9 ) were significant only in controls. Several other eQTLs and ct-eQTLs in CR1, HLA-DRB1, and HLA-DRB5 were highly significant in one group but showing a much   Table 4 Overlap of ct-eQTLs in myeloid cell types in brain and blood. Among the 386 eGenes that were significant in both blood and brain (Supplementary Table S5), 87 were differentially expressed between AD cases and controls (Supplementary Table S13). This includes WWOX (P adj = 1.02 × 10 −4 ) and LRRC2 (P adj = 2.38 × 10 −3 ) which have been associated with AD risk by GWAS 24,34 .

Discussion
We identified several novel AD-related eQTLs that highlight the importance of cell-type-dependent context. It is noteworthy that there were more significant ct-eQTLs in the brain (n = 51,098) than blood (n = 30,405) even though the dataset containing expression data from blood (FHS) is several times larger than the brain expression dataset (ROSMAP). This could be due to greater cell-type heterogeneity in the brain, the enrichment of AD cases in the ROSMAP dataset who may show different patterns of gene expression compared to persons without AD, or highly variable gene expression across cell types in the nervous system 35 . Because expression studies in the brain are often constrained by the small number of specimens compared to studies in other tissues, postmortem changes that may affect gene expression in the brain 36 , and the growing recognition that AD is a systemic disease 37-39 , incorporating expression data from multiple tissues can enhance discovery of additional genetic influences on AD risk and pathogenesis.
Although most significant findings were tissue-specific, the 386 distinct eGenes among more than 24,000 significant gene-SNP eQTL pairs that were shared between blood and brain were enriched in the apoptosis signaling pathway which has been suggested to contribute to the underlying pathology associated with AD 40,41 . Six established AD genes (CR1, ECHDC3, HLA-DRB1, HLA-DRB5, LRRC2, and WWOX 24,27,34 ) were shared eGenes in the brain and blood. They were also involved in eQTLs and ct-eQTLs that showed different patterns of association in cases versus controls (i.e., CR1, HLA-DRB1, HLA-DRB5, and ECHD3) or differentially expressed in AD cases versus controls (i.e., WWOX and LRRC2).
The complement receptor 1 (CR1) gene encodes a transmembrane glycoprotein functioning in the innate immune system by promoting phagocytosis of immune complexes, cellular debris, and Aβ 42 . CR1 is an eGene for several eSNPs, including AD GWAS peak SNP rs6656401 located within the gene, in brain and blood eQTLs and the effects on CR1 expression are opposite in blood and brain. There are multiple possible explanations for the effect direction differences across tissues. The effect of eSNP rs6656401 on CR1 expression may be developmental, noting that the average age of the FHS subjects (a group with expression data in blood) is more than 30 years Table 4 continued younger than the ROSMAP subjects (group with expression data in the brain). The difference between brain and blood may also reflect postmortem changes in the brain that are not indicative of expression in vivo. Alternatively, these effects may be related to AD because few FHS subjects were AD cases at the time of blood draw, whereas 60% of subjects in the ROSMAP sample are AD cases. This idea is supported by the observation of a larger and positive effect of rs6656401 on CR1 expression in AD (β = 0.020) compared to control brains (β = −0.0086).
Opposite effect directions of expression in brain and blood from AD patients compared to controls have been observed for several ribosomal genes 43 . GWS variants located in the region spanning ECHDC3 and USP6NL have previously been associated with AD 44 . Altered ECHDC3 expression in AD brains 45 supports the idea that this gene has a role in AD. Knockout of WWOX in mice leads to aggregation of amyloid-β (Aβ) and Tau, and subsequent cell death 46,47 . LRRC2 is located in a region including GWS variants that modify the inverse relationship between education attainment and AD 34 . A recent study showed that the expression of a LRRC2 long noncoding RNA (LCCR2-AS1) is significantly increased in children with autism spectrum disorder compared to children with normal development 48 . The human leukocyte antigen (HLA) region is the key susceptibility gene in many immunological diseases and many associations have been reported between Fig. 2 Intersection of significant gene-SNP eQTL pairs between cell types in blood and brain tissue. A Venn diagram showing overlap of ct-eQTL pairs in myeloid cell types (microglia and monocytes/macrophages). B Number of significant eQTLs unique to and that overlap cell types in blood and brain. The bar chart on the left side indicates the number of significant eQTLs involving each cell type and the bar chart above the matrix indicates the number of significant eQTLs that are unique to each cell type and set of cell types. Pink colored bar indicates the number of eQTLs pairs that are unique to microglia and monocytes/macrophages. neurodegenerative diseases and HLA haplotypes 49 . In addition, the most widely used marker to examine activated microglia in normal and diseased human brains is HLA-DR and microglia activation increases with the progression of AD 50,51 . HLA-DRB5 and HLA-DRB1 have been implicated in numerous GWAS studies as significantly associated with AD risk 24,27 and appeared frequently among significant results in blood and brain in this study. Rs9271058, which is located approximately 17.8 kb upstream of HLA-DRB1, is significantly associated with AD risk (P = 5.1 × 10 −8 ) 24 . and when paired with HLA-DRB1 is a significant eQTL and ct-eQTL in multiple cell types in blood and brain including myeloid lineage cells (i.e., monocytes/macrophages and microglia). This eSNP is also a significant eQTL in the brain and specifically in neurons when paired with HLA-DRB5. Rs9271192, which is adjacent to rs9271058 and also significantly associated with AD risk (P = 2.9 × 10 −12 ) 27 , is a significant eQTL and ct-QTL with multiple cell types in brain but not blood when paired with HLA-DRB5 and HLA-DRB1.
Significant associations for AD have been reported with variants spanning a large portion of the major histocompatibility (MHC) region in HLA class I, II, and III loci 49,52,53 . While the strongest statistical evidence for association in this region is with variants in HLA-DRB1 24 , fine mapping in this region suggests that a class I haplotype (spanning the HLA-A and HLA-B loci) and a class II haplotype (including variants in HLA-DRB1, HLA-DQA1, and HLA-DQB1) are more precise markers of AD risk. Given the complexity of the MHC region and extensive LD, further work is needed to confirm whether this is a true eQTL or a signal generated from a specific HLA allele or haplotype. Although functional studies may be required to discern which HLA variants have AD relevant consequences and HLA polymorphisms methods would be required to detect differential gene expression between the HLA alleles, our findings support a role for the immune system in AD 37,54 and the hypothesis that a large proportion of AD risk can be explained by genes expressed in myeloid cells 10 .
The potential importance and relevance to AD of eQTLs and ct-eQTLs in myeloid cell types are supported by the observation that a large portion of GWS ct-eQTLs we identified map within 1 Mb of established AD loci, and 58% (12/20 in monocytes/macrophages and 11/20 in microglia) of the most significant eGenes have been previously implicated in AD. DLG2 encodes a synaptic protein whose expression was previously reported as downregulated in an AD proteome and transcriptome network 55 and inversely associated with AD Braak stage 29 . Genome-wide significant associations of AD risk with PTPRG was observed in a family-based GWAS 56 and with CLNK in a recent large GWAS for which the evidence was derived almost entirely with a proxy AD phenotype in the UK Biobank 57 . NFXL1 is a novel putative substrate for BACE1, an important AD therapeutic target 58 . FCRL5 may interact with the APOE*E2 allele and also modifies AD age of onset 59 . C4BPA was shown to be consistently downregulated in MCI and AD patients, and the protein encoded by this gene accumulates in Aβ plaques in AD brains 30,60 . Lower levels of the PAM have been observed in the brains and CSF of AD patients compared to healthy controls 61 and MYO1E is expressed by anti-inflammatory disease-associated microglia 31 . As a calcium channel protein, CACNB2 may affect AD risk by altering calcium levels which could cause mitochondrial damage and then induce apoptosis 62,63 .
Likewise, several eGenes of top-ranked ct-eQTLs in microglia that are not established AD loci may have a role in the disease. It was shown that copy number variants (CNVs) near HNRNPCL1 overlapped the coding portion of the gene in AD cases but not controls 64 . A region of epigenetic variation in ALLC was associated with AD neuropathology 32 . FAM21B, a retromer gene in the endosome-to-Golgi retrieval pathway, was associated with AD in a candidate gene study 65 . Vacuolar sorting proteins genes in this pathway including SORL1 have been functionally linked to AD through the trafficking of Aβ 66 . One study demonstrated that WNT3 expression in the hippocampus was increased by exercise and alleviated ADassociated memory loss by increasing neurogenesis 33 . Expression of RPL9 is downregulated in severe AD 67 and significantly differs by sex among persons with the APOE ɛ4 allele 68 . Significant evidence of association with a TRIM49B SNP was found in a genome-wide pleiotropy GWAS of AD and major depressive disorder (MDD) 69 .
HLA-DOB, which is one of the five distinct eGenes (BTNL3, FAM118A, HLA-DOB, HLA-DRB1, and HLA-DRB5) for significant ct-eQTLs shared between microglia and monocytes/macrophages, and is the target gene for three eSNPs (rs3763355, rs3763354, and rs1183595100) that were evident only in these myeloid cell types. These eSNPs have similar eQTL p-values in both cell types, but have slightly larger effect sizes in monocytes (Fig. 2). The effect of rs3763355 on expression is in opposite directions in monocytes and microglia which suggests HLA-DOB may be acting in different immune capacitates in AD in blood and brain. Though the functions of the genes BTNL3 and FAM118A are unknown, a BTNL8-BTNL3 deletion has been correlated with TNF and ERK1/AKT pathways, which have an important role in immune regulation inducing inflammation, apoptosis, and proliferation, suggesting the deletion could be correlated to inflammatory disease 70 . This suggests that the majority of the shared myeloid cell-type genes-the HLA genes and possibly BTNL3-are all immune-related. Ct-eQTLs involving microglia and monocytes/macrophages had a larger proportion of total intersection, an isolated set interaction and a statistically significant overlap (P < 1.0 × 10 −314 ), demonstrating a stronger connection than other brain/blood cell types in this study and thus providing further evidence for the importance of the immune system in AD.
The proportions of significant ct-eQTLs in NK cells/ CD8 + T cells, monocytes/macrophages, and eosinophils are comparable to those observed in reference blood tissue (https://www.miltenyibiotec.com/US-en/ resources/macs-handbook/human-cells-and-organs/ human-cell-sources/blood-human.html#gref), (https:// www.stemcell.com/media/files/wallchart/WA10006-Frequencies_Cell_Types_Human_Peripheral_Blood.pdf). Similarly, significant eQTL distributions in endothelial cells, neurons, and glia are consistent with reference brain tissue 71 . The majority of significant blood eQTLs were type I interferon response cells which cross-regulate with pro-inflammatory cytokines that drive the pathogenesis of autoimmune diseases including AD and certain heart diseases [72][73][74] and the enrichment of interferon ct-eQTLs in this study could possibly be due to the high proportion of subjects these diseases in the FHS dataset. In contrast, the proportion of significant ct-eQTLs among glial cells is much lower in astrocytes and oligodendrocytes and threefold higher in microglia than in reference brain tissue (https://www.stemcell.com/media/files/wallchart/WA10006-Frequencies_Cell_Types_Human_Peripheral_Blood.pdf). Because many AD risk genes are expressed in myeloid cells including microglia 10 , the large number of microglia ct-eQTLs is consistent with the high proportion of AD subjects in the ROSMAP dataset.
Several SNPs previously reported to be associated with AD at the GWS level were associated with eGenes that differ from genes ascribed to AD loci and thus may have a role in AD. Karch et al. observed that the expression of PILRB, which is involved in immune response and is the activator receptor to its inhibitory counterpart PILRA, an established AD gene 75,76 , was highest in microglia 11 . CNN2, the eGene for eSNP rs4147929 located near the end of ABCA7, significantly alters extracellular Aβ levels in human induced pluripotent stem cell-derived neurons and astrocytes 77 . Rs4147929 also targeted HMHA1 which plays several roles in the immune system in an HLAdependent manner 78 . The eSNP/GWAS SNP rs3740688 located in SPI1 also affects expression of MYBPC3 and MADD. MYBPC3 was recently identified as a target gene for eSNPs located in CELF1 and MS64A6A in a study of eQTLs in blood for GWS AD loci 79 . MADD is expressed in neurons 11 , is involved in neuronal cell death in the hippocampus 80 , and was shown to be a tau toxicity modulator 81 . Although eSNP rs113986870 in KANSL1 when paired with the nearby eGene LRRC37A2 was a significant brain eQTL and ct-eQTL, LRRC37A2 encodes a leucine-rich repeat protein that is expressed primarily in testis and has no apparent connection to AD. However, rs113986870 also significantly influenced the expression of another gene in this region, ARL17A, that was previously linked to progressive supranuclear palsy by analysis of gene expression and methylation 82 .
The aim of this study was to identify context-dependent (i.e., cell-type-specific) eQTLs in blood and brain among older individuals including AD cases using a genome-wide approach. Previous studies have evaluated ct-eQTLs using purified cells, but they focused on only one or two cell types 7,10 . Other studies examined multiple cell types but using expression data generated from individuals who were on average much younger than the FHS and ROS-MAP participants 4,83,84 . With the exception of a recent report by Patrick et al. who applied a deconvolution approach to estimate cell-type proportions from in cortical tissue obtained from ROSMAP participants but did not examine ct-eQTLs 85 , our study is one of the first to study eQTLs and ct-eQTLs in a sample enriched for AD cases and link these findings to established AD genes and AD risk.
Our study has several noteworthy limitations. The proxy genes individually or collectively may not accurately depict cell-type-specific context. In addition, the comparisons of gene expression in blood and brain may yield false results because they are based on independent groups ascertained from a community-based longitudinal study of health (FHS-blood) and multiple sources for studies of AD (ROSMAP-brain) which were not matched for age, sex, ethnicity and other factors which may affect gene expression. Moreover, the FHS cohort contains many elderly but relatively few AD cases, whereas 60% of the ROSMAP participants in this autopsy sample are AD cases. Although the dataset for eQTL analysis in blood was much larger than the dataset derived from the brain, the effect sizes associated with many of the eQTLs common to both tissues were similar. Also, findings in the brain may reflect postmortem changes unrelated to disease or cell-type different expression 36 . Another limitation of our findings is that some cell types are vastly underrepresented compared to others. Because myeloid cell types are represented in only a small proportion of the total cell populations in the brain and blood (generallỹ 10%), it is difficult to identify myeloid-specific signals 12 . Despite this limitation, many of the most significant and noteworthy results of this study involved monocytes/ macrophages and microglia.

Conclusion
Our observation of cell-type-specific expression patterns for established and potentially novel AD genes, finding of additional evidence for the role of myeloid cells in AD risk, and discovery of potential novel blood and brain AD biomarkers highlight the importance of celltype-specific analysis. Future studies that use more robust computational approaches such as deconvolution to reliably estimate cell type proportions [83][84][85] , compare celltype-specific differential gene expression among AD cases and controls using single-cell RNA-sequencing or cell count data and conduct functional experiments are needed to validate and extend our findings.