Herpes simplex labialis (‘cold sores’ or ‘fever blisters’) is a common infection of the skin caused by herpes simplex virus (HSV). The vast majority of cases are due to HSV type 1 (HSV1), although recurrent infections due to HSV type 2 have been reported. Roughly 20–40% of the population will experience labial or perioral outbreaks of vesicular herpetic lesions,1–4 out of the ~60% of the population that are HSV1 seropositive.5 The frequency of these outbreaks is variable, ranging from rare episodes every 5–10 years in some individuals to monthly or more frequent outbreaks among a small proportion of subjects.6 Herpes simplex labialis is usually mild, although uncomfortable and temporarily disfiguring for many persons. This prominent facial infection can have a significant psychological impact, particularly in young patients with frequent or severe recurrences.

Viral strain, environmental factors and genetics are believed to account for the observed differences in disease expression among those infected with HSV1. For instance, in humans, fever, wind, sunburn and surgical manipulation of the ganglion induce HSV reactivation.7–9

Our group previously identified a region of chromosome 21 that is significantly linked to herpes simplex labialis disease. This region was defined using a genome-wide, family-based linkage study involving the Utah Centre d’Etude du Polymorphisme Humain families.10 The region of nonrecombination believed to be driving the trait was refined by additional single-nucleotide polymorphism (SNP) genotyping within this familial population. The association of individual SNP genotypes with herpes simplex labialis frequency was evaluated using linkage analysis and transmission disequilibrium testing. These results indicated that human genotypes of C21orf91, which we call the ‘Cold Sore Susceptibility Gene-1’ (CSSG-1), affect susceptibility to or protection from orolabial herpes outbreaks.11

The previous study implicating CSSG-1 in herpes disease expression was performed in a familial sample. It was of interest to determine whether the effects of CSSG-1 on herpes disease expression might also apply to the general population. Therefore, a case-control study in an entirely new, unrelated human population was performed to investigate the proposed effects of the gene CSSG-1 (c21orf91) on the susceptibility to cold sores.

Materials and methods

Instruments, standard techniques and procedures

A standardized, updated ‘Cold Sore Questionnaire’ was used as the instrument to collect phenotypic data from the participants (SI1 Cold Sore Questionnaire). Six CSSG-1 SNPs (Figure 1) were genotyped at the University of Utah Core Lab using standard TaqMan (Life Technologies, Carlsbad, CA, USA) genotyping methods. The PLINK whole-genome association analysis toolset computer application was used to group the SNPs into major haplotypes (groups of SNPs or alleles).12 Blood was drawn and processed for recovery of genomic DNA. Serum was separated for type-specific HSV1 ELISA analysis (HerpeSelect, Focus Diagnostics, Cypress, CA, USA). The subjects were not asked about genital herpes lesions, nor were they serotyped for HSV2. Subjects were defined as HSV1 seropositive if the enzyme-linked immunosorbent assay (ELISA) index (EI) was >1.1 (manufacturer’s recommendation) or EI >2.5 (more stringent definition, to avoid any false positives).
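The serostatus definition above can be written as a simple thresholding rule. The following is a minimal sketch, not the software used in the study; the function name and the label ‘equivocal’ for intermediate values are assumptions for illustration, and the lower cutoff of 0.9 is the seronegative threshold reported with the serotyping results.

```python
def classify_hsv1_serostatus(ei, positive_cutoff=1.1, negative_cutoff=0.9):
    """Classify an ELISA index (EI) value.

    positive_cutoff: 1.1 (manufacturer's recommendation) or 2.5 (stringent).
    negative_cutoff: 0.9, the seronegative threshold reported in the paper.
    Values between the two cutoffs are labelled 'equivocal' in this sketch
    (an illustrative label, not a definition taken from the study).
    """
    if ei > positive_cutoff:
        return "seropositive"
    if ei < negative_cutoff:
        return "seronegative"
    return "equivocal"


# Example: the same EI can change category under the stringent cutoff.
print(classify_hsv1_serostatus(2.0))                       # standard cutoff
print(classify_hsv1_serostatus(2.0, positive_cutoff=2.5))  # stringent cutoff
```

Passing the cutoff as a parameter makes it straightforward to repeat an analysis under both definitions.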

Figure 1

Cold sore susceptibility gene-1 (CSSG-1) map. The CSSG-1 gene spans 30,300 bp of chromosome 21. It is transcribed from the minus strand (for convenience shown left to right in this figure). Arrows indicate the locations of the six single-nucleotide polymorphisms (SNPs) within the gene. Each SNP is labeled with its wild-type nucleotide, rs (reference SNP) number and location within the gene. The two non-synonymous (amino-acid changing) SNPs are shown, D136E and N115K, both within exon 3. One SNP, rs2824499, used previously, is no longer considered here.11

Subjects and exclusions

Many of the subjects were recruited from the University of Utah Health Sciences Center. Other subjects were recruited largely from the University of Utah student body, staff and faculty. The subjects all provided written informed consent, approved by the University of Utah Institutional Review Board, before participating. Prospective subjects with either a history of frequent cold sores (≥2 per year, ‘frequently affected’) or no cold sores at all (0 lifetime cold sores, ‘unaffected’) were actively recruited into the study. Subjects with 0–2 cold sores per year (‘unknown’ or ‘intermediate’) were not recruited, but neither were they excluded from participation.

To ensure that the subjects were not closely related, relatives closer than 3rd cousins of the existing participants were excluded from the study. To avoid including persons with the same HSV1 strain in the analysis, subjects with identical addresses (e.g., marital partners) were identified and one of the pair was excluded from the analysis. Participants who were residents of a campus dormitory, all of whom had the same address, were all included in the analysis unless they shared surnames because it is unlikely that they were true cohabitants and, therefore, were unlikely to have similar HSV1 strains. Among pairs with the same address, HSV1 seropositives were preferentially retained over seronegatives and low seropositives. If both members of the pair had the same serostatus, the subject enrolled first was arbitrarily excluded. The exclusions were performed blinded to the subjects' phenotype and genotype. Additional criteria for exclusion from the analysis included insufficient sample, equivocal HSV1 serology or failure of the genotyping reaction. In these cases, the remaining partner was not excluded from the analysis. Thirty-three individuals with rare haplotypes (other than H1–H6) were excluded in order to avoid estimating haplotype effects from small numbers of observations.
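The rule for resolving same-address pairs can be sketched as follows. This is an illustrative reading of the text rather than the procedure actually coded for the study: the `Subject` record and its field names are hypothetical, ‘low seropositives’ are grouped with seronegatives via a single boolean flag, and the dormitory exception is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    subject_id: str          # hypothetical identifier
    enrollment_order: int    # smaller value = enrolled earlier
    clear_seropositive: bool # False for seronegatives and low seropositives


def choose_excluded(a: Subject, b: Subject) -> Subject:
    """Return the member of a same-address pair to drop from the analysis."""
    if a.clear_seropositive != b.clear_seropositive:
        # Retain the clear seropositive; exclude the other member.
        return b if a.clear_seropositive else a
    # Same serostatus: the subject enrolled first is excluded.
    return a if a.enrollment_order < b.enrollment_order else b


pair = (Subject("S001", 1, False), Subject("S002", 2, True))
print(choose_excluded(*pair).subject_id)
```

Because the rule uses only serostatus and enrollment order, it can be applied without reference to phenotype or genotype, consistent with the blinding described above.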


The subjects were queried about their cold sore disease history using a printed questionnaire (SI1 Cold Sore Questionnaire). They were asked to distinguish ‘cold sores’ on the lips or face from ‘canker sores’ inside the mouth. The subjects estimated their number of lifetime cold sores, their annual cold sore frequency and the perceived severity of their cold sores. The phenotypes were selected to capture as much of the complex nature of the disease as possible. Whether a person had frequent cold sores or none at all (i.e., ‘affection status’) was used in the previous study where the continuous phenotype ‘annual cold sore frequency’ also added significant useful information. Two other phenotypes, ‘lifetime number of cold sores’ and ‘perceived cold sore severity’, were proposed as methods to capture a broad range of cold sore phenotypes. For instance, persons who earlier in life experienced a large number of cold sores but who now have very few would be missed by the annual frequency phenotype, but recognized by the lifetime number phenotype. The phenotype definitions used for the study are included as Table S1.

Statistical analysis

Least squares regression was used to evaluate the effects of CSSG-1 haplotypes on two binary responses and four ordinal responses in the presence of up to four covariates. The binary responses (discrete variables) were: HSV1 serostatus using the standard 1.1 EI cutoff and cold sore affection status (‘frequent cold sores’ or ‘no cold sores’). The binary variables were analyzed using logistic regression. The ordinal responses (continuous variables) were: EI, annual cold sore frequency, lifetime number of cold sores and perceived severity. The ordinal responses were analyzed using conventional linear regression.

For the three serostatus responses, we used all 622 subjects (HSV1 seropositives and seronegatives) and considered the subject's age, sex and ethnicity as covariates. For the remaining disease responses, we used only the 388 HSV1-seropositive individuals and again considered age, sex and ethnicity as covariates. The covariates were included regardless of whether they were significant predictors of the response.

Haplotypes were used as the unit of analysis, with each haplotype being allocated the response and covariates of the individual carrying it. We note that these data depart from the assumptions implicit in regression analyses: the ordinal data are generally not Gaussian and contain some extreme values; both haplotypes in an individual's observed pair were allocated the same response and covariate values; and there may be correlations between covariates. Therefore, in addition to calculating the nominal P values for non-zero effects, we calculated bootstrap s.e. and P values by randomly resampling individual records. For each analysis we took 100,000 random bootstrap samples.
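The resampling scheme can be illustrated with a short sketch for the linear-regression case. It is written under stated assumptions and is not the code used for the analysis: the function names, the integer coding of haplotypes (0 for the baseline H1) and the use of NumPy are all illustrative choices, and only bootstrap s.e. are computed because the text does not specify how the bootstrap P values were derived. The key point is that individuals, not haplotypes, are resampled, so the two haplotype records of one person always enter or leave a bootstrap sample together.

```python
import numpy as np


def haplotype_design(hap_pairs, covariates, response, n_haplotypes):
    """Expand n individual records into 2n haplotype-level rows.

    hap_pairs:  (n, 2) integer haplotype codes, 0 = baseline haplotype.
    covariates: (n, k) array (e.g. age, sex, ethnicity).
    response:   (n,) array.
    """
    n = hap_pairs.shape[0]
    haps = hap_pairs.reshape(-1)  # person 0 hap 1, person 0 hap 2, ...
    dummies = np.zeros((2 * n, n_haplotypes - 1))
    for code in range(1, n_haplotypes):
        dummies[:, code - 1] = haps == code
    # Each haplotype inherits the covariates and response of its carrier.
    cov = np.repeat(covariates, 2, axis=0)
    X = np.column_stack([np.ones(2 * n), dummies, cov])
    y = np.repeat(response, 2)
    return X, y


def fit_ols(X, y):
    """Least-squares coefficients (intercept column already in X)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def bootstrap_se(hap_pairs, covariates, response, n_haplotypes,
                 n_boot=1000, seed=0):
    """Bootstrap s.e. of all coefficients, resampling individuals."""
    rng = np.random.default_rng(seed)
    n = hap_pairs.shape[0]
    n_coef = 1 + (n_haplotypes - 1) + covariates.shape[1]
    estimates = np.empty((n_boot, n_coef))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample individuals with replacement
        X, y = haplotype_design(hap_pairs[idx], covariates[idx],
                                response[idx], n_haplotypes)
        estimates[b] = fit_ols(X, y)
    return estimates.std(axis=0, ddof=1)
```

Resampling at the level of the individual preserves the within-person duplication of response and covariate values, which nominal regression s.e. ignore.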

Effect sizes, conventional s.e. estimates, conventional P values, bootstrap s.e. estimates and bootstrap P values are reported to three decimal places. The P values are for two-sided tests in all cases. The sex effect is for males relative to the baseline for females, the ethnicity effect is for non-whites relative to the baseline for whites. Haplotype effects are relative to the baseline wild-type haplotype 1.


Results

Demographics of the study population

Seven hundred fifty-eight human subjects were recruited into the Cold Sore Study. DNA and serum samples from 718 subjects were analyzed. After exclusions of cohabitants, subjects whose genotyping or serotyping failed and subjects with minor CSSG-1 haplotypes, a total of 622 subjects were examined for genotype–phenotype associations. Most of the analyzed population was Caucasian, as expected, given the demographics of the local Utah population. Seventy-eight of the total analyzed participants (13%) were African–American, Hispanic, Asian, Native American or other/mixed race. One hundred thirty-three participants (21%) did not report any race or ethnicity.

Haplotype definition and frequency

Six major CSSG-1 haplotypes (i.e., alleles) were defined. The designations of the haplotypes and their frequencies within the study population are shown in Table 1. The locations of these SNPs within CSSG-1 are shown in Figure 1. A total of 1244 haplotypes were determined among the 622 analyzed subjects. (Each human subject has two copies of chromosome 21, and therefore two CSSG-1 genes with their respective haplotypes.) H2 was the most common haplotype (34%) followed by H1 (22%) and H4 (20%), H3 (14%), H5 (7%) and H6 (1%), as shown in Table 1. Haplotypes H5 and H6 were combined for the subsequent analysis because (a) the number of individuals with these haplotypes was low compared with the other four haplotypes, and (b) H5 and H6 are genetically similar, differing by only a single SNP in the 5′ region of the gene.
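The haplotype tally follows directly from counting two haplotypes per subject (2 × 622 = 1244). A minimal sketch of such a tally is given below; the function name, the string labels and the toy input are illustrative assumptions, and the optional `merge` mapping mirrors the pooling of H5 and H6.

```python
from collections import Counter


def haplotype_frequencies(hap_pairs, merge=None):
    """Count haplotypes across subjects and return (frequencies, total).

    hap_pairs: iterable of (haplotype, haplotype) tuples, one per subject.
    merge:     optional mapping of labels to a pooled label.
    """
    merge = merge or {}
    counts = Counter(merge.get(h, h) for pair in hap_pairs for h in pair)
    total = sum(counts.values())
    freqs = {h: c / total for h, c in counts.items()}
    return freqs, total


# Toy data for three hypothetical subjects (not study data).
pairs = [("H1", "H2"), ("H2", "H5"), ("H6", "H2")]
freqs, total = haplotype_frequencies(
    pairs, merge={"H5": "H5/6", "H6": "H5/6"}
)
print(total, freqs)
```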

Table 1 Determination of the CSSG-1 haplotypes and their frequencies

HSV1 serotyping

Among the 622 analyzable participants, 234 (38%) were HSV1 seronegative (EI <0.9) and 388 (62%) were HSV1 seropositive (EI >1.1). The HSV1-seropositive subjects were presumed to be HSV1 infected and, therefore, suitable for examining the effects of CSSG-1 haplotype on disease expression.

Serostatus (infection versus no infection with HSV1) and EI

Comparing the 388 seropositive and the 234 seronegative subjects, age had a highly significant effect on serostatus (Table 2). As expected, older subjects were more likely to be HSV1 seropositive: each year of age increased the log odds of being seropositive by 0.044 (P<0.001). Sex and ethnicity had no significant effects on serostatus. A higher EI cutoff (EI >2.5) was also used for this analysis owing to the difficulties with specificity of the assay.13,14 The CSSG-1 haplotypes had no significant effect on HSV1 serostatus by either criterion (EI >1.1 or >2.5), except for the H5/H6 haplotype, which had a marginally significant effect on HSV1 serostatus where EI >1.1 (P=0.044).
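For readers more familiar with odds ratios, the age coefficient can be converted by exponentiation. The short sketch below assumes the log odds are linear in age, as in a standard logistic model; the helper name is illustrative. A coefficient of 0.044 per year corresponds to an odds ratio of about 1.045 per year, or roughly 1.55 per decade.

```python
import math


def odds_ratio(log_odds_per_unit, units=1.0):
    """Convert a logistic-regression coefficient to an odds ratio."""
    return math.exp(log_odds_per_unit * units)


print(round(odds_ratio(0.044), 3))      # per year of age
print(round(odds_ratio(0.044, 10), 2))  # per decade of age
```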

Table 2 Summary of the regression analyses for the three cold sore serostatus phenotypes (N=622 subjects)

EI values ranged from 1.17 to 13.95 among the 388 HSV1-seropositive subjects. Age, but not sex or ethnicity, had a significant effect on the EI, with older individuals tending to have higher values. The CSSG-1 haplotypes did not significantly affect the EI values.

Affection status

Among the seropositives (N=388), logistic regression analysis revealed that sex and ethnicity, but not age, were significant predictors of affection status (Table 3). Women were more likely to be affected than men (74% vs. 60%, P=0.01), and whites were more likely to be affected than non-whites (71% vs. 52%, P=0.02). Each of the major CSSG-1 haplotypes was then examined individually for an effect on affection status, corrected for the covariates. Haplotypes 3 and 5/6 were marginally associated with the affection status (P=0.050, P=0.056, respectively), with negative effect sizes demonstrating relative protection among these two haplotypes compared with the H1 wild type (Figure SI3 Affection).

Table 3 Summary of the regression analyses for four herpes disease phenotypes (N=388 HSV1-seropositive subjects)

Annual cold sore frequency

The distribution of annual cold sore outbreaks among all the subjects, including seronegatives, is shown in Figure 2. The 388 HSV1-seropositive subjects reported between 0 and 60 cold sores annually. Regression analysis revealed that sex (P=0.001) and ethnicity (P=0.006), but not age, were correlated with the number of annual outbreaks (Table 3). Again, whites and women had the most annual cold sore outbreaks. Haplotypes 3 (P=0.006) and 5/6 (P=0.025) were significantly associated with cold sore frequency, with negative effect sizes demonstrating relative protection among these two haplotypes compared with the H1 wild type (Figure SI4 Annual Frequency).

Figure 2

Self-reported annual cold sore outbreaks among the 622 analyzed subjects. The bimodal distribution of HSV1-seropositive (EI >1.1) subjects was due to recruitment and enrollment of only clearly ‘affected’ and ‘unaffected’ individuals.

Lifetime number of cold sores

The subjects reported 0 to 960 estimated lifetime cold sores. Again, 776 haplotypes from 388 HSV1-seropositive subjects were compared. Regression analysis revealed that age and sex, but not ethnicity, were significant covariates for this phenotype (Table 3). The median number of lifetime cold sores for each haplotype is shown in Figure SI5 Lifetime. Haplotype 3 was significantly associated with lifetime cold sores (P=0.012), with a negative effect size demonstrating relative protection among this haplotype compared with the H1 wild type.

Perceived severity

The 388 HSV1-seropositive subjects reported cold sore severities between 0 (none) and 4 (severe). The median perceived cold sore severity for each haplotype is shown in Figure SI6 Severity. Regression analysis revealed that sex and ethnicity, but not age, were significant covariates. CSSG-1 haplotype H3 was significantly associated with perceived severity (P=0.008), whereas the H2 haplotype association was marginally significant (P=0.047).


Discussion

This study evaluated the effects of CSSG-1, a putative herpes susceptibility gene, on the expression of cold sores in a larger, unrelated human population. One or more CSSG-1 haplotypes significantly affected the reported number of annual cold sore outbreaks, the number of reported lifetime cold sores and perceived cold sore severity. CSSG-1 haplotype H3 had the most consistent effects on the measured cold sore phenotypes in this population. The CSSG-1 haplotypes generally did not affect HSV1 serostatus (i.e., infection with HSV1) or the EI (quantity and affinity of anti-HSV1 antibodies).

The age-, sex- and ethnicity-adjusted data here suggested significant protection from frequent and severe cold sores among those with the H3 or H5/6 haplotypes, whereas those with H1, H2 and H4 haplotypes tended to be subject to more frequent and more severe episodes. It is interesting that the H2 and H4 haplotypes associated with more frequent cold sores contain the amino-acid-changing SNPs, D136E and N115K, respectively. The D136E substitution is a conservative amino-acid change owing to the similarity between aspartic acid (D) and glutamic acid (E). The N115K substitution is predicted to have more effect on protein function than the D136E substitution due to the difference between asparagine (N, uncharged) and lysine (K, positively charged). In the previous familial study the H2 haplotype seemed more protective than the N115K-containing H4 haplotype, a finding not confirmed in the present study. However, the H2 and H4 haplotypes may be associated with a lower lifetime number of cold sores compared with the wild-type H1 haplotype.

Although age had a highly significant effect on HSV1 serostatus and EI, the CSSG-1 haplotypes generally did not. As persons with more HSV1 outbreaks tend to have higher antibody titers,9 it is possible that the effects of CSSG-1 haplotype were overshadowed by reactivation frequency. For instance, the protective H3 haplotype was shown to be associated with decreased reactivation frequency. It is possible that persons with the H3 haplotype also make more and/or higher affinity antibody against HSV1, but that this effect was not seen because these persons had fewer HSV1 outbreaks, which reduced their EI values. Therefore, the present study cannot exclude subtle effects of CSSG-1 genotypes on anti-HSV1 antibody production and affinity.

The relatively protective H3 and H5/H6 haplotypes carry SNPs in the 3′ UTR (3′ untranslated region or 3′ Tag SNP) of the gene. As these SNPs do not occur in protein coding regions, we cannot say whether these haplotypes exert their effect through alterations to the regulation or expression of CSSG-1 or through other genetic effects to nearby regions of the genome. Our data do not allow us to predict if these effects are associated with increased or decreased CSSG-1 protein production.

These results differ from those of the previous family-based study, where some data suggested that the H4 and H5 haplotypes might confer susceptibility, whereas the H2 and H3 haplotypes could be protective.11 However, in the previous study the H3 group had the lowest mean annual cold sore frequency, in agreement with the present study. The present study was designed to characterize the nature of the associations between the haplotypes and the measured phenotypes. It includes an analysis of the predictive value of genotype for phenotype, an analysis not performed in the previous study, which was focused on linkage.

This study showed only a marginally significant effect of CSSG-1 haplotypes H3 and H5/6 on the discrete variable of affection status (‘frequently affected’ versus ‘unaffected’). The other phenotypes, including annual cold sore frequency, lifetime cold sores and perceived severity, are all continuous variables, which provide a more sensitive means to detect a genetic effect. The haplotype effects on these continuous variables were significant, with the H3 effects, for instance, all at significance levels of approximately 0.01 (P=0.006, 0.012 and 0.008, respectively).

Other genetic determinants of herpetic disease include the human genes UNC93B and Toll-like receptor 2.15,16 Mutations in UNC93B, Toll-like receptor 3, STAT1 and NEMO genes were associated with an increased risk of HSV1 encephalitis in children.17 Changes in Toll-like receptor 2 increased the number of genital outbreaks and the rate of genital shedding among 128 subjects with HSV2 infection.16 Although interesting and potentially relevant, these studies involved investigating immunologic abnormalities in severe-HSV phenotypes such as encephalitis. A recent study in mice identified the calcitonin receptor as a possible susceptibility factor for herpes encephalitis.18 Another interesting new genetic study mapped the severity of HSV1-induced disease to a region on mouse chromosome 16.19 This important mouse herpes susceptibility region maps near the murine analog of CSSG-1 (c21orf91), suggesting that a single host locus may be the main genetic influence on HSV1 pathogenesis.

In conclusion, the present study confirms the activity of a previously unknown herpes susceptibility gene, CSSG-1, in a new population of unrelated humans. These new data are important because previous data from a large multigeneration linkage study suggest that CSSG-1 is the most important determinant of cold sore expression. The mechanism of CSSG-1 action is unknown.