Introduction

Infection with HIV greatly increases the risk of developing Kaposi’s sarcoma (KS),1 an AIDS-defining malignancy.2 The development of KS within HIV-1-positive populations results from the uncontrolled expression of latency genes of human herpes virus-8 (HHV-8),2 the etiologic agent of KS.3 Although all cases of KS carry HHV-8, not all individuals with HHV-8 infection develop KS, and among HIV-positive individuals, those who seroconvert to HHV-8 after HIV infection are at greater risk for KS than those who seroconvert before HIV infection.4

Genetic susceptibility to HIV-KS is poorly understood. Co-infection by HHV-8, also called KS herpes virus, and the higher prevalence of KS in the setting of HIV infection suggested an important role for host immunity in the control of HIV progression to KS.1, 5, 6, 7, 8 In particular, the observation that immunosuppressed transplant recipients have increased cumulative risks (8–30%) for classical KS9, 10 furthermore suggested that immunosuppression is an important risk factor.

The advent of highly active anti-retroviral therapy has significantly decreased the incidence of HIV-KS in developed countries. Consequently, interest in the host genetics of HIV-KS has diminished and only a few studies have been reported in the last two decades. Early reports implicated variant genotypes of IL-6 (interleukin-6) and Fc-γ receptor IIIA in the development of KS.11, 12 Further studies of HIV- and non-HIV-KS that focused on the host factors encoded in central major histocompatibility complex (MHC), essentially those of the human leukocyte antigen (HLA) system, have reported positive associations with HLA-DR genes,1, 8, 13 corroborating early findings.14, 15

As suggested by the higher incidence of classical KS among kidney transplant recipients relative to the general population, the risk associated with immunosuppression is apparently independent of HIV-1 infection. However, the highest risk of KS observed among HIV-infected individuals, notably among men who have sex with men, who turned out to have the highest rate of HHV-8 infection (40%), suggested that susceptibility to HIV-KS entails actions and/or interactions between HIV and HHV-8. It is not known which of the two aspects of immunosuppression, prolonged exposure to a low CD4+ T-cell count (CD4+ count) or the rate of decline of the CD4+ count in the years preceding the diagnosis of KS, is the major determinant of risk. Moreover, given the known influence of the chronology of viral infection on HIV-KS outcome, previous epidemiologic studies of HIV-KS may not have been adequately designed.

To overcome the limitations of previous studies, we have re-designed the initial case and control study of HIV-KS8 nested within MACS (Multicenter AIDS Cohort Study) to allow for the control of infection chronology, immunosuppression and principal component analysis-based assessment of race/ethnicity in evaluating the independent effects of HLA and non-HLA loci across central MHC. Specifically, the degree of immunosuppression in the years preceding the diagnosis of KS was determined by the trapezoidal method16 and was used together with the estimated slope of CD4+ count as time-dependent covariates.

In the present study, we evaluated the effects of the highly polymorphic class I HLA-B gene and a selection of 467 quality control-filtered single-nucleotide polymorphisms (SNPs) encompassing about 5 Mb across the central MHC region on the natural history of HIV progression to KS. We report data for HLA-B-independent associations of HIV-KS with a variant HLA-DMB and linked TAP1 gene variants.

Results

Single locus analysis

Exclusion of 29 SNPs (5.8%) that deviated from Hardy–Weinberg equilibrium in the controls led to a final set of 467 SNPs (Supplementary Table 1) available for further analyses. Based on the self-report of ancestry, 22 men were reclassified into other racial groups and re-matched based on the updated information from the principal component analysis. The majority of these individuals self-reported as Hispanic Europeans, indicating that population admixture may explain the misclassification. Re-matching of the cases and controls within this subset of 22 individuals resulted in a loss of three additional individuals because no matched cases or controls could be identified. This resulted in samples of 348 and 318 pairs of matched cases and controls with available SNP and HLA-B typing data, respectively.

Among the modeled covariates, only age at baseline (age at the time of dual seroconversion) was significantly (P=0.004) associated with KS. The degree of immunosuppression (area under the curve of CD4+ count below 300) was not a significant predictor (P>0.20); however, this covariate turned out to be marginally (0.05<P<0.10) associated with the risk for KS in models adjusted for HLA-B (see below). The rate of CD4+ count change (slope of CD4+ count from dual seroconversion) was not a significant predictor of the estimated risks (P>0.20) regardless of the HLA-B effects; this indicated that the confounding effect of the declining immunity was accounted for by our case and control matching criteria.

The observed HLA-B allele and genotype frequencies for the entire case and control samples are concordant with those published for the Caucasian populations. Only two HLA-B alleles, the risk B*1401 (odds ratio (OR)=4.2; 95% confidence interval (CI)=1.1–15.5; P=0.03) and the protection B*2705 (OR=0.37; 95% CI=0.15–0.94; P=0.04) alleles, passed the significance threshold of 5% (Supplementary Table 2).

In single SNP analyses, significant associations (P<0.05) were observed with risk alleles of several SNPs peaking essentially in two chromosomal locations, across a 120-kb-long interval centromeric to HLA-B and spanning MHC class III GPANK1, LY6G6C, MSH5-SAPCD1 and VARS genes, and across a 95-kb-long class II interval encompassing TAP1 and HLA-DMB loci (Figure 1a). Vanishing class III signal but unchanged class II signal was observed after control for the associated HLA-B alleles (Figure 1b); this indicated HLA-B-independent effects of target class II loci in contrast to those in the class III region, which appear to be confounded by the tightly linked HLA-B. Further sensitivity analyses restricted to the predominantly non-Hispanic European case and control group resulted in a similar pattern of association (Supplementary Figure 1A and Supplementary Table 1); thus false findings due to population structure were greatly minimized by the matched design and principal component analysis-based assessment of race.

Figure 1

Association of major histocompatibility complex (MHC) polymorphisms with HIV-related Kaposi’s sarcoma (HIV-KS) in men. (a) The figure shows the strength of the association expressed as the negative base-10 logarithm of the P-value (−log10(P)) obtained from univariate analyses of 467 single-nucleotide polymorphisms (SNPs) spanning about 5 Mb across central and extended MHC. The association with the KS outcome was evaluated in conditional logistic models assuming additive variance and controlling for the effects of age at baseline, degree of immunosuppression (calculated as the area under the curve of CD4+ T-cell count below 300) and the rate of CD4+ T-cell count change at 6-month intervals (slope of CD4+ T-cell count from the time of dual seroconversion to HIV-1 and HHV-8). (b) The logistic models were further controlled for the effects of two HLA-B alleles (B*2705 and B*1401) significantly (P<5%) associated with KS. The large centromeric gap corresponds to a chromosome 6p21 genomic interval where no genes of potential relevance to KS were found at the time this study was initiated.

Table 1 lists the SNPs associated with the risk of or protection from developing KS. A subset of these are non-synonymous or putative functional SNPs located in genes with relevance to cancer, including rs1116221 G>A, a missense (Glu421Lys) polymorphism in TRIM31 (OR=0.74; 95% CI=0.56–0.96; P=0.033), rs909253 A>G in the 5′-untranslated region (UTR) of LT-α (OR=0.75; 95% CI=0.58–0.96; P=0.022) and occurring in strong LD with non-synonymous SNP rs1041981 C>A (Thr60Asn) in the same gene (OR=0.75; 95% CI=0.58–0.96; P=0.022) and rs3093665 A>C in the 3′UTR of TNF-α with a stronger effect but marginally associated (OR=2.1; 95% CI=0.93–4.71; P=0.075) with the risk, possibly due to a low minor allele frequency.

Table 1 Association of gene variants in central MHC with HIV-related Kaposi’s sarcoma in HIV-positive men enrolled in the Multicenter AIDS Cohort Study

The most elevated risk was observed with the MHC class II variant rs6902982 A>G, an intronic SNP in HLA-DMB associated with a fourfold increase of risk (OR=4.09; 95% CI=1.90–8.80; P=0.0003). Within 95 kb from HLA-DMB toward HLA-DR, significant but moderate associations with risk were observed with two non-synonymous SNPs in TAP1, rs1800453 A>G (Asp697Gly) (OR=1.54; 95% CI=1.09–2.18; P=0.014) and rs4148880 A>G (Ile393Val) (OR=1.45; 95% CI=1.05–1.99; P=0.024), and with rs2071541 A>G, a SNP located in the overlapping microRNA TAPSAR1 (OR=1.60; 95% CI=1.11–2.32; P=0.012).

Significant associations were also observed with the 3′UTR SNP rs7029 A>G (OR=1.55; 95% CI=1.17–2.05; P=0.002) in GPANK1 and with the synonymous SNP rs1065356 G>A (OR=1.60; 95% CI=1.18–2.16; P=0.002) in LY6G6C located about 84 kb centromeric to TNF-α.

Evaluation of SNP effects under non-additive genetic models suggested that the target class II gene, which is most likely HLA-DMB, acts co-dominantly (compare genetic models in Figure 1b and Supplementary Figures 1b and c).

Multiple locus analysis

The conservation of effect sizes across extended regions suggested that the associated SNPs occur in long-range haplotypes. To capture possible additive or multiplicative effects of two or more candidate MHC loci, we used the haplotype trend regression (HTR) approach separately for class III and II regions to assess the risk associated with MHC haplotypes formed by the risk SNPs. HTR-estimated posterior probabilities were included as explanatory variables in stepwise conditional logistic models with control for the appropriate covariates and additionally for the effects of HLA-B*1401 and -B*2705 for class II haplotypes.

Table 2 illustrates the estimated frequency of the reconstructed haplotypes in the case and control groups together with the magnitude and strength of their association with the KS outcome. Three unique haplotypes (one four-SNP class III haplotype and two seven-SNP class II haplotypes) were significantly associated with the risk of KS. The class III G-A-G-A haplotype, formed in order by SNP221 (rs7029 A>G) in GPANK1 3′UTR, synonymous SNP224 (rs1065356 G>A) in LY6G6C, intronic SNP225 (rs3749953 A>G) in MSH5-SAPCD1 readthrough and synonymous SNP227 (rs707926 G>A) in VARS, was associated with a 50% increase of risk (OR=1.52; 95% CI=1.01–2.28; P=0.047).

Table 2 Distribution of the major histocompatibility complex haplotype at risk for HIV-related Kaposi’s sarcoma in a case and control study of HIV-positive men

The strongest effect was observed with class II G-G-G-A-A-G-G haplotype (OR=10.5; 95% CI=2.54–43.6; P=0.0012), which carries among other risk alleles, the ‘G’ allele of intronic SNP372 (rs6902982 A>G) in HLA-DMB shown to be at a fourfold increase of risk in the single locus analysis (Table 1). This high-risk haplotype was formed strictly by the risk alleles at all the composite SNPs and occurred at a frequency of 2.6% in the cases (36 individuals) and at less than the HTR frequency cutoff of 1% in the controls. The second seven-SNP class II haplotype explained a small fraction of the risk (OR=1.38; 95% CI=1.02–1.86; P=0.035) and unexpectedly involved both risk and non-risk alleles at the composite SNPs (A-A-A-A-A-A-A haplotype). With this haplotype occurring on average in 26% of the study sample, we reasoned that homozygous A-A-A-A-A-A-A diplotypes must also be common and may occur more frequently in the cases than in the controls. A closer examination of haplotype distributions showed that homozygous A-A-A-A-A-A-A diplotypes indeed occurred more frequently in the cases (n=27) than in the controls (n=16) and exclusion of these individuals resulted in a complete loss of the association with the A-A-A-A-A-A-A haplotype and a slight diminution of the association strength with the risk G-G-G-A-A-G-G haplotype (OR=9.03; 95% CI=2.17–37.7; P=0.0025).

Discussion

We reported data from an extensive investigation of the MHC determinants of the natural history of HIV progression to KS. Using a carefully designed matched case and control study nested within the MACS cohort and analytical models with appropriate time-dependent CD4+ count covariates, we reported data suggesting the implication of MHC class III and II susceptibility loci in the etiology of HIV-KS. Our most important finding indicated that an HLA-DMB variant tagged by intronic rs6902982 increased the risk for HIV-KS fourfold in HIV- and HHV-8-infected men. A significant increase of risk (adjusted OR=10.5) was associated with further carriage of non-synonymous rs1800453 (A>G) and rs4148880 (A>G) alleles encoding Asp697Gly and Ile393Val mutations in TAP1, respectively. Importantly, we have shown that the reported associations are controlled for the confounding effects of immunosuppression and are independent of HLA-B.

We also reported supportive data for a candidate class III susceptibility gene located within a 120-kb interval flanked by the proximal VARS and distal GPANK1 genes, and spanning members of the leucocyte antigen-6 (LY6) gene superfamily and other class III genes including casein kinase 2B (CSNK2B) implicated in endometrial and esophageal carcinoma and colorectal cancer,17, 18, 19 von Willebrand factor A domain containing 7 (VWA7) in lung cancer susceptibility20 and chloride intracellular channel 1 (CLIC1) in gliomas,21 gastric22 and hepatic cancer.23

The possibility that the positive association with HLA-B is an apparent association cannot be excluded in the present discovery stage of the study. HLA-B can influence the association in different ways: (i) directly as a true etiologic factor, (ii) mechanistically through confounding by linkage disequilibrium (LD) with one of the candidate class III susceptibility genes or (iii) through joint carriage of specific HLA-B alleles with the risk HLA-DMB variant (locus and allelic genetic heterogeneities). Although the protective effect of B*27 from HIV progression is well documented, that of B*14 is not known and surprisingly our E-M estimates of reconstructed HLA-B and SNP-221, -224, -225 and -227 joint haplotypes did not reveal the B*1401-G-A-G-A haplotype in either cases or controls, even at a haplotype frequency cutoff as low as 0.3% (not shown). The bulk of the risk G-A-G-A class III haplotype occurred on B*1501 and B*3501 chromosomes in the controls and additionally on B*2705 and accessorily on B*1801 in the cases. The observation that the risk G-A-G-A haplotype is carried on B*2705 is consistent with the drop of the association strength seen at these four SNPs after adjustment for B*2705 and B*1401 (Figure 1b and Supplementary Table 1). Furthermore, the drop of the association strength was explained by the control for B*2705 and not for B*1401 (not shown); this excluded the hypothesis of confounding by LD with B*1401 allele but not with B*2705. Thus, given the low frequency of B*1401, replication studies are needed to confirm or reject the association with this allele.

Based on our previous haplotyping data for the present study population, the risk G-A-G-A haplotype is most likely carried on the TNF-α superhaplotype VI (Supplementary Table 3).24

Several additional candidate gene loci were highlighted in this study, with some of them being tagged by potentially functional SNPs such as non-synonymous SNPs in TRIM31 and LTA, polymorphisms in noncoding RNA ZNRD1-AS1 and TAPSAR1, as well as polymorphisms in the 5′UTR of LTA and TAP1 or in the 3′UTR of LTA, TNF-α, GPANK1, VPS52 and ZBTB22. Most of these polymorphisms have protective effects and, depending on the HLA background, they may or may not occur in long-range LD with each other or with the risk alleles.

Our data suggested a spurious association with class II A-A-A-A-A-A-A haplotype for SNPs-349-350-352-353-355-367-372 and the default assumption of additive models in the HTR approach is consistent with this finding. Alternatively, homozygosity for this haplotype may also be considered disadvantageous if it tags a unique HLA-DMB allele, thus multiple genetic etiologies cannot be excluded.

None of the listed candidate genes have previously been implicated in the development of KS; nonetheless, after Bonferroni corrections for multiple testing, only the HLA-DMB rs6902982 SNP passed the significance threshold of 5%.

Non-classical MHC class II molecules encoded by HLA-DM have recently emerged as important molecules involved in the stabilization of classical MHC class II molecules.25 Specifically, HLA-DM molecules exert their critical role in antigen presentation by MHC class II molecules to CD4+ T-lymphocytes by accelerating the removal of class-II-associated invariant chain-derived peptide and by editing the peptide content of MHC class II molecules such that the display of high-affinity peptides is favored.26

To date, HLA-DMB and the linked TAP1 gene have not been implicated in the pathogenesis of HIV-related KS. However, the TAP1 peptide transporter and the proteasome subunit beta type 9, which are required for class I antigen presentation, were shown to be inactivated by interferon-γ-mediated down-modulation in response to an ectopic expression of LANA (HHV-8-encoded latency-associated nuclear antigen).27

In a previous investigation of candidate non-MHC determinants of HIV-KS, which we conducted on the present study sample, we reported that variants of human homologs of two latently expressed HHV-8 genes, cyclin D1 (CCND1) and interleukin-6 (IL-6), in conjunction with angiogenic gene variants (VEGF, EDN-1 and EDNRB) conferred significant risk for HIV-KS (OR=2.84–3.92; Bonferroni-adjusted P=9.9 × 10^−3 to 2.6 × 10^−4).28 Here, we ruled out possible long-range LD between markers in central MHC and the risk variants reported for the endothelin-1-encoding EDN-1 gene, which maps a few megabases away from central MHC (not shown).

It should be stressed that allelotyping of non-classical HLA-DMB and -DMA loci was not available at the time this study was conducted and only partial typing data were available for HLA-DR, the interaction partners of HLA-DM.

Our data lay the foundation for a model in which prolonged exposure to low CD4+ counts (<300) may provide favorable conditions for HHV-8 to downregulate antigen processing by class II molecules in susceptible individuals (those carrying a combination of risk alleles in target HLA-DM and TAP1 loci).

Methods

Study participants

This study uses information on patients enrolled in the MACS.29 MACS is a prospective longitudinal study of HIV-1 infection that recruited 5622 homosexual men in 1984–1985 and 1987–1990. The MACS cohort began in 1984 in four US cities: Baltimore, MD, USA; Chicago, IL, USA; Pittsburgh, PA, USA, and Los Angeles, CA, USA. To be included in the study, participants could not have a clinical AIDS diagnosis at baseline, had to be >18 years of age at baseline and had to have engaged in homosexual behavior within 5 years of study entry. The last date of follow-up for any of the study participants was 1 January 1996, when highly active anti-retroviral therapy became more widely available. Participants completed semiannual physical examinations and questionnaires that included information on treatments, medication utilization and information from routine blood draws.

Study design

This study built on an initial selection of 360 matched case and control pairs nested within the MACS cohort. Men with dual HIV-1 and HHV-8 infections who later developed KS were defined as cases (n=360). Matched controls (n=360) were chosen based on HIV/HHV-8 serostatus, race, KS-free time and CD4+ counts, and were defined as men who were free of KS. These pairs were matched for CD4+ counts at the visit within 1 year before the index diagnosis. If a control could not be matched by CD4+ count, then it was matched by CD4+ slope (±25 cells per year) up to the time of KS diagnosis. Owing to the temporal relationship between HIV-1 and HHV-8 infections and their influence on progression time to KS, cases were matched to controls within each of the four different sero-status groups. These four sero-status groups are defined as HIV-1 seroprevalent (SP)/HHV-8 SP; HIV-1 SP/HHV-8 seroconverter (SC); HIV-1-SC/HHV-8-SP and HIV-1-SC/HHV-8-SC. Follow-up time in the cases is defined as the date of KS diagnosis minus the baseline date or minus the date of seroconversion. The date of seroconversion is defined as the mid-point between the last HIV-1 or HHV-8-negative date and the first positive date.
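The two date definitions above can be illustrated with a minimal Python sketch. The function names and the 365.25-day year are assumptions made for illustration; they are not taken from the study protocol.

```python
from datetime import date

def seroconversion_date(last_negative, first_positive):
    """Mid-point between the last seronegative and the first seropositive visit."""
    if first_positive < last_negative:
        raise ValueError("first positive visit precedes last negative visit")
    # Adding a timedelta to a date ignores the sub-day part, so odd gaps round down.
    return last_negative + (first_positive - last_negative) / 2

def follow_up_years(start, ks_diagnosis):
    """Follow-up time in years from baseline (or seroconversion) to KS diagnosis.

    The 365.25-day year is an illustrative assumption.
    """
    return (ks_diagnosis - start).days / 365.25
```

For a seroprevalent participant, `start` would be the baseline date; for a seroconverter it would be the estimated seroconversion date.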

Ascertainment of seropositivity and variable definition

Sera were tested at the enrollment visit or the subsequent study visit for a baseline measurement, and the final serum measurement was made at the most recent visit at which the participant was tested. HIV-1 seropositivity was defined as an immunoblot-confirmed positive ELISA. Standardized T-cell phenotyping was performed at each follow-up visit. Specimens of peripheral blood mononuclear cells, plasma and serum from each participant have been stored in repositories. Antibodies against HHV-8 lytic antigens were determined by use of an indirect immunofluorescence assay using 12-O-tetradecanoyl phorbol 13-acetate-induced body cavity B-cell lymphoma-1 cells containing the HHV-8 genome. Known HHV-8-positive and -negative sera were assayed for each batch of serum samples tested. Serum samples were tested twice in a blinded manner and were assessed microscopically for the presence of whole-cell immunofluorescence by the same reader. Positivity in either sample defined an HHV-8-infected man and HHV-8 negativity at both visits defined an uninfected man.

Laboratory methods

SNPs were typed on a commercial genotyping platform (BeadArray, Illumina Inc., San Diego, CA, USA). A total of 496 SNPs were selected from candidate genes across 5 Mb of human chromosome 6p21 spanning central MHC and flanking regions (Figure 2).

Figure 2

Genomic map of the target central MHC region. The genomic map of central major histocompatibility complex (MHC) on human chromosome 6p21.3 targeted for single-nucleotide polymorphism association mapping is shown in the telomere (tel) to centromere (cen) orientation. Major human leukocyte antigen (HLA) and non-HLA class I, II and III landmark loci are shown in bold. A sample of MHC genes associated with HIV-related Kaposi’s sarcoma in the present study is shown above the landmark loci. The three known hot spots for recombination map between HLA-B and NFKBIL1, DR and DQ and between HLA-DMB and DPB1.

High-resolution HLA-B genotyping (four-digit) was carried out by PCR amplification followed by automated capillary electrophoresis-based sequencing with Cy5-labeled primers (Abbott Park, North Chicago, IL, USA) in an ABI Prism 3130xl DNA Analyzer (Applied Biosystems, Foster City, CA, USA).

Quality control

Reliability of the typing data was assessed with a small set of intra- and inter-plate blind duplicates. SNP calls were checked for adherence to Hardy–Weinberg equilibrium in each of the KS outcome categories and only SNPs showing no significant deviation (P>0.01) from Hardy–Weinberg equilibrium in KS-free controls were included for analyses.
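As an illustration of the Hardy–Weinberg filter, the following Python sketch computes a one-degree-of-freedom Pearson χ2 test from genotype counts at a biallelic SNP and applies the P>0.01 retention rule stated above. The function names and the genotype-count interface are hypothetical; the study's own quality control pipeline is not described at this level of detail.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are the observed counts of the three genotypes at a biallelic SNP.
    Returns (chi-square statistic, P-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def passes_hwe(n_aa, n_ab, n_bb, alpha=0.01):
    """Keep a SNP when its control genotypes show no significant deviation (P > alpha)."""
    _, p_value = hwe_chi_square(n_aa, n_ab, n_bb)
    return p_value > alpha
```

In this sketch the filter would be applied to genotype counts from the KS-free controls only, in line with the inclusion rule above.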

Statistical analysis

Covariates

Variables of interest within the demographic characteristics, clinical parameters and co-morbidities that are known to be associated with KS were compared between cases and controls using χ2 and Student’s t-test. We controlled for the effects of age (age at the time of dual seroconversion to HIV-1 and HHV-8 infections), the degree of immunosuppression and the rate of immunity decline over the exposure time (time from dual infection to index diagnosis). We applied the trapezoidal method16 to the CD4+ count to estimate the degree of immunosuppression. Copy-years were defined as the CD4+ cell count per year below the cutoff of 300 CD4+ count and integrated over the number of years from study entry (for the SPs) or 1 year after dual seroconversion (for the SCs). The slope of the CD4+ count from baseline to the study end captured the rate of immunity decline.
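A minimal Python sketch of the two CD4+ covariates follows. It assumes, for illustration only, that the degree of immunosuppression is the trapezoidal area between the 300 cutoff and the CD4+ curve (the deficit is set to zero at visits where the count is at or above the cutoff) and that the rate of decline is an ordinary least-squares slope; the exact operational definitions used in the study may differ.

```python
def cd4_deficit_auc(times_years, cd4_counts, cutoff=300.0):
    """Trapezoidal area of the CD4+ deficit below `cutoff` (cell-years).

    The deficit is clipped at visit times only, so the moment the curve
    crosses the cutoff between two visits is approximated, not solved for.
    """
    deficits = [max(0.0, cutoff - c) for c in cd4_counts]
    area = 0.0
    for i in range(1, len(times_years)):
        dt = times_years[i] - times_years[i - 1]
        area += 0.5 * (deficits[i] + deficits[i - 1]) * dt
    return area

def cd4_slope(times_years, cd4_counts):
    """Ordinary least-squares slope of CD4+ count on time (cells per year)."""
    n = len(times_years)
    mean_t = sum(times_years) / n
    mean_c = sum(cd4_counts) / n
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    sxy = sum((t - mean_t) * (c - mean_c)
              for t, c in zip(times_years, cd4_counts))
    return sxy / sxx
```

For example, semiannual counts of 400, 300, 200 and 100 at 0, 0.5, 1.0 and 1.5 years give a deficit area of 100 cell-years and a slope of −200 cells per year.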

Population structure

Principal component analysis based on the current MHC SNPs and on an additional set of 284 non-MHC SNPs was reported in a previous study that used the present case and control sample.28 Rigorously checking the population of origin is very important, especially for MHC-linked diseases, because of the differential and long-range pattern of LD across MHC in populations. The same SNP may capture different DNA variants depending on the population of origin, which usually has a distinct HLA distribution.

Single locus analysis

SNP markers were examined separately in case and control groups for adherence to Hardy–Weinberg equilibrium using Pearson’s χ2 test. Conditional logistic regression under different genetic models (additive, dominant and over-dominant) was used to assess the association between the SNP genotypes and the odds of developing KS in models with adjustment for the two time-dependent CD4+ count covariates. All statistical tests were performed in SAS and adjusted ORs, 95% CI and two-sided P-values are reported.
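The three genetic models differ only in how a genotype is coded as a regression covariate. The Python sketch below shows the standard coding and, for a 1:1 matched pair with a single covariate, the conditional log-likelihood contribution that a conditional logistic model maximizes. It is illustrative only: the analyses reported here were run in SAS, and the function names are hypothetical.

```python
import math

def encode_genotype(minor_allele_count, model):
    """Code a genotype (0, 1 or 2 copies of the minor allele) for a genetic model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":        # risk changes per allele copy
        return minor_allele_count
    if model == "dominant":        # carriers versus non-carriers
        return 1 if minor_allele_count >= 1 else 0
    if model == "over-dominant":   # heterozygotes versus both homozygotes
        return 1 if minor_allele_count == 1 else 0
    raise ValueError("unknown genetic model: " + model)

def matched_pair_log_likelihood(beta, x_case, x_control):
    """Conditional log-likelihood contribution of one 1:1 matched pair.

    log[exp(beta*x_case) / (exp(beta*x_case) + exp(beta*x_control))]
    """
    return -math.log1p(math.exp(-beta * (x_case - x_control)))
```

Under the additive coding, `exp(beta)` is the odds ratio per copy of the minor allele; pairs in which case and control share the same coded genotype contribute the constant log(1/2) and carry no information about `beta`.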

For the multi-allelic HLA-B locus, we conducted the analyses by carrier status including only alleles with a frequency greater than 1.0% in the entire study sample. Given the differentiation of MHC in highly diverse HLA-B-specific MHC haplotypes, we avoided grouping together related HLA-B alleles to increase specificity.

Multiple locus analysis

To evaluate the effects of MHC haplotypes, we estimated the haplotypes formed by the composite loci (non-HLA SNPs alone or joint HLA-B and non-HLA SNPs) that showed significant association with the outcome and assessed the overall differences in their distribution in cases and controls using the HTR approach.30 An additive model was assumed, estimating posterior probabilities for each subject for all expectation-maximization-inferred haplotypes. These posterior probabilities were treated as independent variables in the HTR model with the weights in the design matrix reflecting various alternative inferences about haplotypes. A logistic regression model containing the weighted haplotypes was used to accommodate the case–control design. Confounding by differential CD4+ count was controlled for in multivariable logistic models, with the adjusted odds ratios representing the risk increase per haplotype copy. Haplotypes with a frequency <1% were aggregated as a single term in the model. Haplotype associations were tested using the most prevalent haplotype in the controls as the referent haplotype for calculating the odds ratios.
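The construction of the HTR design matrix can be sketched as follows. The sketch assumes that each subject's phase uncertainty is available as posterior probabilities over compatible haplotype pairs (for example, from an expectation-maximization step that is not shown) and converts them to expected haplotype copies, pooling rare haplotypes and dropping the referent. The data layout and function names are hypothetical, and expected copies (0 to 2) are used here to match the per-copy interpretation of the odds ratios.

```python
from collections import defaultdict

def expected_haplotype_copies(diplotype_posteriors):
    """Expected number of copies (0-2) of each haplotype for one subject.

    diplotype_posteriors maps (hap1, hap2) tuples to posterior probabilities
    that sum to 1 over the haplotype pairs compatible with the genotypes.
    """
    copies = defaultdict(float)
    for (hap1, hap2), prob in diplotype_posteriors.items():
        copies[hap1] += prob
        copies[hap2] += prob
    return dict(copies)

def design_row(copies, common_haplotypes, referent):
    """One design-matrix row: common haplotypes minus the referent, plus a pooled rare term."""
    row = {h: copies.get(h, 0.0) for h in common_haplotypes if h != referent}
    row["rare"] = sum(v for h, v in copies.items() if h not in common_haplotypes)
    return row
```

For a double heterozygote at two SNPs with posteriors 0.8 for the pair (AA, GG) and 0.2 for (AG, GA), the expected copies are 0.8 each for AA and GG and 0.2 each for AG and GA. The resulting rows would then enter the logistic model alongside the CD4+ covariates.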

Multiple testing

Owing to the high correlation (strong LD) among the tested SNPs and the conservative nature of the conventional methods used to correct for multiple testing, we present the results with no correction and discuss them in the specific context of correlated data.