Classifying migraine subtypes and their characteristics by latent class analysis using data of a nation-wide population-based study

Migraine neither presents with a definitive single symptom nor has a distinct biomarker; thus, its diagnosis is based on combinations of typical symptoms. We aimed to identify natural subgroups of migraine based on symptoms listed in the diagnostic criteria of the third edition of the International Classification of Headache Disorders. Latent class analysis (LCA) was applied to the data of the Korean Sleep-Headache Study, a nationwide population-based survey. We selected a three-class model based on Akaike and Bayesian information criteria and characterized the three identified classes as “mild and low frequency,” “photophobia and phonophobia,” and “severe and high frequency.” In total, 52.0% (65/125) of the participants were classified as “mild and low frequency,” showing the highest frequency of mild headache intensity but the lowest overall headache frequency. Meanwhile, “photophobia and phonophobia” involved 33.6% (42/125) of the participants, who showed the highest frequency of photophobia and phonophobia. Finally, “severe and high frequency” included 14.4% (18/125) of the participants, and they presented the highest frequency of severe headache intensity and highest headache frequency. In conclusion, LCA is useful for analyzing the heterogeneity of migraine symptoms and identifying migraine subtypes. This approach may improve our understanding of the clinical characterization of migraine.

Migraine is a common condition that affects approximately 5-15% of the general population 1 . Despite its debilitating effects, there is currently no definitive biomarker or single symptom of migraine; thus, its diagnosis is based on self-reported symptoms. The third edition of the International Classification of Headache Disorders (ICHD-3) requires fulfilling at least two of four typical headache characteristics and at least one typical combination from four accompanying symptoms for a migraine diagnosis 2 . Therefore, individuals with migraine have varying symptom combinations.
Identifying the symptom subtypes of a disorder may help to better characterize typical and atypical populations, improving precision diagnosis and treatment 3 . In this context, several approaches have been used to identify the heterogeneity of migraine symptomatology [4][5][6][7][8][9][10][11] . Statistical methodologies, including factor analysis, cluster analysis, discriminant function analysis, factor mixture modeling, latent trait analysis and latent class analysis (LCA), have been used to identify the subtypes of clinical symptoms 8,10,12 . Among them, LCA is the most commonly used methodology to identify subgroups of migraine according to their symptoms and comorbidity 11,13 . It is a subset of structural equation modeling used to find subtypes of cases in multivariate categorical data and allows detection of the presence of latent classes (the disease entities), creating patterns of association in the symptoms 14 .
Previous studies on subtyping of migraine symptoms were mostly performed during genetic analyses among individuals with headache, mostly using data of twins and their relatives [4][5][6][7][8][9] . Although these studies subtyped migraine symptoms, their participants included individuals with non-migraine headache as well as those with migraine. Further, although two studies have subtyped clinical symptoms exclusively in individuals with migraine, these studies used cohort data of health professionals 15,16 . Consequently, these studies may not properly reflect the clinical symptoms of those in the general population.
The Korean Sleep-Headache Survey (KSHS) was a nationwide population-based cross-sectional survey about headaches and sleep that provides an opportunity to evaluate and to subtype clinical symptoms of individuals with migraine in a general population. This study aimed to identify the subtypes of migraine symptoms in individuals with migraine through LCA using the data of KSHS.

Results
Participants. In total, 2501 individuals completed the KSHS in 2018. The distribution of sex, age and educational level did not differ significantly from that of the overall Korean population (Supplementary Table S1). Overall, 1186 participants reported experiencing a headache during the previous year; of them, 125 participants (4.99%, 95% confidence interval [CI] 4.14-5.85%) were diagnosed with migraine. They were aged 45.8 ± 15.0 years and included 50 males and 75 females.
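The reported prevalence interval can be checked with a short calculation. The sketch below assumes a simple Wald (normal-approximation) interval with no survey weighting, which may not be the exact method used in the survey analysis; the function name `wald_ci` is illustrative.

```python
import math

def wald_ci(cases: int, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) for a Wald normal-approximation interval."""
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a binomial proportion
    return p, p - z * se, p + z * se

# 125 participants with migraine among 2501 respondents (figures from the text)
p, lo, hi = wald_ci(125, 2501)
print(f"prevalence = {p:.2%}, 95% CI {lo:.2%} to {hi:.2%}")
```

Under these assumptions the bounds agree with the reported 4.14-5.85%. The unrounded proportion is about 4.998%, so the printed point estimate may differ from the reported 4.99% in the last digit.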
Model selection and classification of classes. Three models were proposed and compared in terms of their estimated prediction error. The Akaike information criterion (AIC) value was the lowest in the four-class model, while the Bayesian information criterion (BIC) value was the lowest in the two-class model (Table 1). Considering the discrepancy between the lowest AIC and BIC values within the models, we selected the three-class model because its BIC was lower than that of the four-class model and because it best described the different characteristics of the latent classes clinically. Both the standard errors of estimated prior probabilities of each class of the latent class membership and the class-conditional response probabilities of the indicators were within an acceptable range (Supplementary Tables S2 and S3).
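AIC and BIC can disagree in this way because BIC penalizes each extra parameter by ln(n), which is about 4.8 for n = 125, whereas AIC penalizes it by 2. The following sketch illustrates the mechanism. The log-likelihood values are hypothetical (they are not the values in Table 1), and the indicator layout (one three-level item plus seven binary items) is an assumption based on the variables listed in the Methods.

```python
import math

def n_free_params(n_classes: int, categories_per_item: list[int]) -> int:
    """Free parameters of an unrestricted latent class model."""
    mixing = n_classes - 1  # class prior probabilities sum to 1
    conditional = n_classes * sum(k - 1 for k in categories_per_item)
    return mixing + conditional

def aic(log_lik: float, k: int) -> float:
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik: float, k: int, n: int) -> float:
    return -2.0 * log_lik + k * math.log(n)

# ASSUMED layout: headache intensity (3 levels) + 7 binary symptom items
ITEMS = [3] + [2] * 7
N = 125  # participants with migraine

# HYPOTHETICAL log-likelihoods, for illustration only (not from Table 1)
hypothetical_loglik = {2: -560.0, 3: -545.0, 4: -533.0}

for r, ll in hypothetical_loglik.items():
    k = n_free_params(r, ITEMS)
    print(f"{r} classes: k={k}, AIC={aic(ll, k):.1f}, BIC={bic(ll, k, N):.1f}")
```

With these made-up inputs, AIC favors the four-class model and BIC the two-class model, the same qualitative pattern described above. In such cases the final choice typically rests on interpretability as well as fit.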
Demographic characteristics of the three classes. The model allowed sorting into Classes 1, 2 and 3, which identified 65, 42 and 18 individuals, respectively (Table 2). There were no significant differences in age, sex, residential area, education, income, body mass index and proportion of obesity among the three classes. The prevalence of comorbidities (hypertension, diabetes and dyslipidemia) was also similar.
Headache characteristics under the ICHD-3 migraine diagnostic criteria. The headache characteristics according to the ICHD-3 diagnostic criteria of migraine among the three classes are summarized in Table 3. The results of multiple comparisons are shown in Supplementary Table S4. There were significant differences in the distribution of headache intensity, unilateral pain, aggravation by routine physical activity, nausea, vomiting, photophobia and phonophobia among the three classes but not in pulsating quality. Unilateral pain and mild headache intensity were the most frequent symptoms, and aggravation by routine physical activity was the least frequent in Class 1, while photophobia and phonophobia were the most frequent in Class 2. Class 3 showed the highest frequency of nausea, aggravation by routine physical activity and severe headache intensity. The multiple comparisons revealed that the frequency of unilateral pain, aggravation by routine physical activity, nausea, photophobia and phonophobia was significantly different between Class 1 and Class 2. Meanwhile, the frequency of unilateral pain, aggravation by routine physical activity and vomiting differed significantly between Class 1 and Class 3. Headache intensity, photophobia and phonophobia were significantly different between Class 2 and Class 3.
Clinical features of migraine outside the ICHD-3 diagnostic criteria. Headache frequency per month, headache duration (in minutes), osmophobia, impact of headache, visual aura, anxiety and fatigue were significantly different among the three classes (Table 4). Multiple comparisons showed that the frequency of osmophobia, visual aura, anxiety and fatigue was significantly different between Class 1 and Class 2 (Supplementary Table S4).
Characterization of the three classes. The three classes were named to reflect the unique characteristics of each class. Class 1 was characterized by mild and low-frequency migraine and was the most common, with 52.0% (65/125) of the participants belonging to this class. We named Class 1 "mild and low frequency." Class 2, the next most common class, was characterized by intermediate headache frequency and headache intensity. Nevertheless, this class showed the highest frequency of photophobia and phonophobia and was thus named "photophobia and phonophobia." This class involved 33.6% (42/125) of the participants. Class 3 was the least common and was characterized by the highest headache intensity and headache frequency. We named this class "severe and high frequency." This class included 14.4% (18/125) of the participants.

Discussion
Migraine diagnosis is challenging because it does not present with a definitive single symptom or have a distinct biomarker. LCA in this study enabled us to classify each of the 125 participants with migraine into three classes. Each class showed distinct characteristics that can be briefly described as "mild and low frequency," "photophobia and phonophobia" and "severe and high frequency".
Several studies have attempted to subclassify migraine. Schürks et al. evaluated women with migraine and found three classes related to central nervous system (CNS) sensitization; attack frequency and pain location; and aura and visual phenomena 15 . The "CNS sensitization" and "attack frequency and pain location" classes may correspond to our "severe and high frequency" and "mild and low frequency" classes, respectively. Both the "CNS sensitization" and "severe and high frequency" classes showed high headache frequency and aggravation by routine physical activity. Meanwhile, the "attack frequency and pain location" and "mild and low frequency" classes presented with frequent unilateral pain and mild headache intensity. However, photophobia and phonophobia were significantly related in our study, but not in Schürks et al.'s study. This discrepancy may be due to differences in study population, ethnicity, sex and analysis methods. Studies on twins and their relatives included
Table 3. Characteristics of headache under the diagnostic criteria of migraine in the third edition of the International Classification of Headache Disorders. Data are presented as n (%). † Significant P-value. a Significant difference between Class 1 and Class 2 in multiple comparisons. b Significant difference between Class 1 and Class 3 in multiple comparisons. c Significant difference between Class 2 and Class 3 in multiple comparisons.
Our study yielded several noteworthy findings on the subtypes and symptoms of migraine. First, Class 2 (photophobia and phonophobia) showed the highest rate of osmophobia (i.e., intolerance of or hypersensitivity to smells), which is a common symptom of migraine and has high specificity for migraine diagnosis 17,18 . Photophobia, phonophobia and osmophobia are sensory hypersensitivity symptoms 19 . Our findings support the existence of a migraine subtype that presents with a high frequency of sensory hypersensitivity symptoms.
Second, considering that the intensity and frequency of headache are key parameters for determining migraine severity 20 , we found that each symptom is closely related to severity. Unilateral pain was most common in Class 1 (mild and infrequent) and least common in Class 3 (severe and frequent). In contrast, aggravation by routine physical activity was the most frequent in Class 3 and the least frequent in Class 1.
Photophobia and severe headache intensity were prevalent in 36.0% and 20.8% of our participants, respectively. These rates are lower than those in Western countries but are similar to those in Asian countries. Approximately 68% to 84.5% of patients with migraine in Western countries have photophobia 21,22 , whereas the rates range from 53.9 to 67.4% in Asian countries 23,24 . Similarly, the prevalence of severe headache intensity is lower in Asian countries (19.5-38.0%) than in Western countries (60-85%) 23,[25][26][27] .
LCA has been used in genetic research to capture underlying phenotypic and genetic variance in diseases with complex etiologies [7][8][9]13 . The three symptom subtypes of migraine identified in our study based on LCA may provide clues in understanding the pathophysiologic and genetic mechanisms of migraine. The presence of a subtype with a high frequency of photophobia and phonophobia suggests a shared, rather than an independent, pathophysiological mechanism in sensory hypersensitivity symptoms. In a similar context, the co-occurrence of a high headache frequency and severe headache intensity suggests that high headache frequency and severe headache intensity have overlapping pathophysiological mechanisms. In genetic studies, the identification of different subtype symptom profiles suggests the presence of different genetic loci in each phenotype.
This study had some limitations. First, the symptoms of migraine were evaluated based on the participant's report. These symptoms can vary between attacks in the same person 28 . A headache diary could have been a more accurate method for symptom evaluation; however, it is difficult to use in epidemiological studies. Second, although we used data from a large-scale epidemiological study, the sample size of this study was relatively small for LCA. The minimum sample size for LCA is not fixed, although better performance is expected with larger sample sizes 29 . To overcome this issue, we used high-quality indicators with strong theoretical bases as variables, which allowed us to use a small sample and made classification and interpretation easy 30,31 . To avoid local maxima in the expectation-maximization algorithm, the latent class model was estimated 20 times using various initial parameter values; convergence failure was not present (Supplementary Tables S2 and S3).
Table 4. Clinical features not included in the diagnostic criteria of migraine in the third edition of the International Classification of Headache Disorders. Data are presented as the mean ± standard deviation, median (interquartile range), or n (%). VARS Visual Analogue Rating Scale, HIT-6 Headache Impact Test-6, MIDAS Migraine Disability Assessment, GAD-7 General Anxiety Disorder-7, PHQ-9 Patient Health Questionnaire-9, ASC-12 The 12-item Allodynia Symptom Checklist, FSS Fatigue Severity Scale, ESS Epworth Sleepiness Scale. † Significant P-value. a Significant difference between Class 1 and Class 2 in multiple comparisons. b Significant difference between Class 1 and Class 3 in multiple comparisons. c Significant difference between Class 2 and Class 3 in multiple comparisons.
However, the present study also had several strengths.
First, our study used data from a nationwide population-based study that included a sample proportional to the population distribution of Korea. As such, the risk of selection bias was minimized. Second, we used a validated questionnaire for the diagnosis of migraine. Depression, anxiety, visual aura, fatigue, disability and impact of headache were also evaluated using validated Korean versions of the Patient Health Questionnaire-9 (PHQ-9), General Anxiety Disorder-7 (GAD-7), self-administered Visual Analogue Rating Scale (VARS), Fatigue Severity Scale (FSS), Migraine Disability Assessment (MIDAS) and Headache Impact Test-6 (HIT-6). This enabled an accurate evaluation of migraine and its clinical features. Third, we selected items of LCA based on the ICHD-3 diagnostic criteria of migraine, which comprised key characteristics of migraine. This allowed us to properly analyze and to compare findings. Other approaches to migraine subtyping, including genetic, biochemical and neuroimaging studies, are needed to verify our findings.
In this study, LCA identified three classes of migraine that showed distinct characteristics. These classes were "mild and low frequency," "photophobia and phonophobia," and "severe and high frequency," in order of prevalence. LCA can be useful to analyze the heterogeneity of migraine symptoms and to identify migraine subtypes.

Methods
Survey. The method of sampling and survey of KSHS in 2018 has been previously described in detail 32 .
Briefly, the survey used a two-stage clustered random sampling method proportional to the population distribution of all Korean territories, except for Jeju-do, based on data from the 2017 population and housing census conducted by the National Statistical Office of Korea 33 . The target sample size was 2500 adults aged ≥ 19 years, and the estimated sampling error was ≤ 1.9%. The survey was conducted between October 2018 and November 2018 using face-to-face questionnaire interviews by trained interviewers. All interviewers were employees of Gallup Korea and had experience in conducting social surveys. The questionnaire items included demographic characteristics, headache profiles, headache diagnosis, use of medical services, medical consultation, disability from headache, impact of headache, anxiety, depression and fatigue. This study was a secondary analysis of the KSHS.

Diagnosis of migraine. Migraine was diagnosed based on the diagnostic criteria for migraine without aura in ICHD-3 (code 1.1) 2 . The diagnostic validity of the questionnaire has been previously reported 32 . Participants with a positive response to the question "Did you experience headache during the previous year?" and fulfilling the ICHD-3 criteria of migraine were diagnosed with migraine.
The presence of a visual aura was assessed using the self-administered VARS, for which the Korean version has been validated, with a score of ≥ 3 defined as having a visual aura 34,35 . Participants who fulfilled the ICHD-3 diagnostic criteria of migraine without aura but reported visual aura were diagnosed with migraine with aura (code 1.2) 2 .
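The symptom-based part of this diagnostic logic can be sketched in code. The example below is a simplified illustration of criteria C and D of ICHD-3 migraine without aura together with the VARS cut-off stated above. It is not the validated survey questionnaire: criteria A, B and E are not modeled, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class HeadacheReport:
    """Hypothetical container for one participant's self-reported symptoms."""
    unilateral: bool
    pulsating: bool
    intensity: str  # "mild", "moderate" or "severe"
    aggravated_by_activity: bool
    nausea: bool
    vomiting: bool
    photophobia: bool
    phonophobia: bool

def meets_criterion_c(h: HeadacheReport) -> bool:
    """At least two of the four typical headache characteristics."""
    features = [
        h.unilateral,
        h.pulsating,
        h.intensity in ("moderate", "severe"),
        h.aggravated_by_activity,
    ]
    return sum(features) >= 2

def meets_criterion_d(h: HeadacheReport) -> bool:
    """Nausea and/or vomiting, or photophobia together with phonophobia."""
    return (h.nausea or h.vomiting) or (h.photophobia and h.phonophobia)

def has_visual_aura(vars_score: int) -> bool:
    """VARS score of >= 3 is treated as visual aura, as stated in the text."""
    return vars_score >= 3

# Example: mild, unilateral headache worsened by activity, with photo- and phonophobia
report = HeadacheReport(True, False, "mild", True, False, False, True, True)
print(meets_criterion_c(report), meets_criterion_d(report), has_visual_aura(1))
```

The example shows why a participant with mild headache intensity can still fulfill the criteria: only two of the four characteristics in criterion C are required.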
Assessment of physical and mental impact of headache. Migraine disability and the impact of headache were evaluated using the MIDAS and HIT-6 tools, respectively 36,37 . The Korean versions of the MIDAS and HIT-6 were previously validated 38,39 .
The level of anxiety was assessed using the GAD-7 tool 40 . Participants with a GAD-7 score of ≥ 8 were classified as having anxiety 41 . Depression was evaluated using the PHQ-9, with a PHQ-9 score of ≥ 10 defined as depression 42 . The Korean versions of the GAD-7 and PHQ-9 were previously validated 43,44 .
Cutaneous allodynia, fatigue, and excessive daytime sleepiness. Cutaneous allodynia was evaluated using the Allodynia Symptom Checklist-12 (ASC-12), with an ASC-12 score of ≥ 3 being indicative of cutaneous allodynia 45 . Fatigue was defined as a Fatigue Severity Scale (FSS) score of ≥ 4 46 . Excessive daytime sleepiness was defined as an Epworth Sleepiness Scale (ESS) score of ≥ 11 47 .
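The cut-offs above amount to a small lookup table. The sketch below collects them in one place and applies them to a set of scores. The dictionary and function names are hypothetical, and the example scores are made up for illustration.

```python
# Cut-off values as stated in the text (score >= cut-off means the condition is present)
CUTOFFS = {"GAD-7": 8, "PHQ-9": 10, "ASC-12": 3, "FSS": 4, "ESS": 11}
LABELS = {
    "GAD-7": "anxiety",
    "PHQ-9": "depression",
    "ASC-12": "cutaneous allodynia",
    "FSS": "fatigue",
    "ESS": "excessive daytime sleepiness",
}

def classify(scores: dict[str, float]) -> dict[str, bool]:
    """Map questionnaire scores to the binary conditions they define."""
    return {LABELS[scale]: value >= CUTOFFS[scale] for scale, value in scores.items()}

# HYPOTHETICAL scores for one participant
example = {"GAD-7": 8, "PHQ-9": 9, "ASC-12": 1, "FSS": 4.2, "ESS": 11}
for condition, present in classify(example).items():
    print(f"{condition}: {present}")
```

An unknown scale name raises a `KeyError`, which is a deliberate choice here so that a mistyped scale is not silently ignored.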
Model selection of LCA. LCA was used to differentiate the natural subgroups of migraine. It is a type of finite mixture model for multivariate categorical data that is used to examine relationships among observed variables. It stratifies the cross-classification of the observed variables by an unobserved (latent), unordered categorical variable, probabilistically grouping observations into latent classes 48,49 . In this way, observations with similar sets of responses on the manifest variables tend to cluster within the same latent classes 49 . In this study, LCA was performed using the "poLCA" package in R, which is frequently used for LCA 49 .
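For reference, the latent class model described above is commonly written as follows. The notation is an assumption taken from the usual formulation of the basic latent class model, not from this study: $R$ classes, $J$ manifest variables with $K_j$ categories each, class prior probabilities $p_r$, class-conditional response probabilities $\pi_{jrk}$, and $Y_{ijk} = 1$ if participant $i$ gives response $k$ to variable $j$ (0 otherwise).

```latex
% Probability of participant i's response pattern under the R-class model
f(Y_i \mid \pi, p) = \sum_{r=1}^{R} p_r \prod_{j=1}^{J} \prod_{k=1}^{K_j} \left(\pi_{jrk}\right)^{Y_{ijk}}

% Posterior probability that participant i belongs to class r (Bayes' rule)
\hat{P}(r \mid Y_i) =
  \frac{\hat{p}_r \prod_{j=1}^{J} \prod_{k=1}^{K_j} \left(\hat{\pi}_{jrk}\right)^{Y_{ijk}}}
       {\sum_{q=1}^{R} \hat{p}_q \prod_{j=1}^{J} \prod_{k=1}^{K_j} \left(\hat{\pi}_{jqk}\right)^{Y_{ijk}}}
```

The product over variables reflects the local independence assumption: within a class, responses to different items are treated as independent. Each participant is typically assigned to the class with the highest posterior probability.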
Given that all participants with migraine satisfied criteria A (total attack number), B (typical duration) and E (exclusion of other diagnosis), we used items in criteria C and D of migraine without aura (code 1.1) 2 as variables for LCA. Headache intensity (mild, moderate and severe), unilateral location, pulsating quality, aggravation by routine physical activity, nausea, vomiting, photophobia and phonophobia were used as categorical variables in the gathered data. The headache frequency per month was categorized into a four-category variable (< 2, ≥ 2 but < 8, ≥ 8 but < 15 and ≥ 15). We evaluated the prior probabilities of latent class membership and class-conditional probabilities to check the quality of the indicators 30 . To avoid local maxima in the expectation-maximization algorithm, the latent class model was estimated 20 times using various initial parameter values.
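The multiple-start strategy described above can be illustrated with a minimal expectation-maximization sketch. This is not the poLCA implementation and does not use the study data: it handles binary items only, the data are simulated, and all function names are hypothetical.

```python
import math
import random

def em_lca(data, n_classes, rng, n_iter=100, tol=1e-8):
    """Fit a binary-item latent class model by EM from one random start.

    Returns (log-likelihood from the last E-step, class priors, item probabilities).
    """
    n_items = len(data[0])
    priors = [1.0 / n_classes] * n_classes
    # probs[r][j] = P(item j = 1 | class r), initialized at random values
    probs = [[rng.uniform(0.2, 0.8) for _ in range(n_items)] for _ in range(n_classes)]
    prev = -math.inf
    loglik = -math.inf
    for _ in range(n_iter):
        # E-step: posterior class membership probabilities for each respondent
        post, loglik = [], 0.0
        for row in data:
            joint = []
            for r in range(n_classes):
                lik = priors[r]
                for j, y in enumerate(row):
                    lik *= probs[r][j] if y else 1.0 - probs[r][j]
                joint.append(lik)
            total = sum(joint)
            loglik += math.log(total)
            post.append([v / total for v in joint])
        if loglik - prev < tol:
            break
        prev = loglik
        # M-step: update priors and class-conditional response probabilities
        for r in range(n_classes):
            weight = sum(p[r] for p in post)
            priors[r] = weight / len(data)
            denom = max(weight, 1e-12)  # guard against an empty class
            for j in range(n_items):
                hits = sum(p[r] * row[j] for p, row in zip(post, data))
                # clamp so probabilities stay strictly inside (0, 1)
                probs[r][j] = min(max(hits / denom, 1e-6), 1.0 - 1e-6)
    return loglik, priors, probs

def fit_with_restarts(data, n_classes, n_starts=20, seed=0):
    """Run EM from several random starts and keep the highest log-likelihood."""
    rng = random.Random(seed)
    fits = [em_lca(data, n_classes, rng) for _ in range(n_starts)]
    return max(fits, key=lambda fit: fit[0]), fits

def simulate(n, rng):
    """SIMULATED two-class data with 4 binary items (not KSHS data)."""
    rows = []
    for _ in range(n):
        p = 0.85 if rng.random() < 0.4 else 0.15
        rows.append([1 if rng.random() < p else 0 for _ in range(4)])
    return rows

data = simulate(200, random.Random(1))
best, fits = fit_with_restarts(data, n_classes=2)
print("best log-likelihood:", round(best[0], 3))
print("estimated class sizes:", [round(p, 2) for p in best[1]])
```

Because EM only guarantees convergence to a local maximum, different starting values can end at different solutions. Keeping the run with the highest log-likelihood is the usual safeguard.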
The model with the best fit was selected primarily based on AIC and BIC, with lower values indicating a better-fitting model 48 . For comparison among classes, a one-way analysis of variance was used for normally distributed variables, while a Kruskal-Wallis test was used for non-normally distributed variables. A chi-square or Fisher's exact test was used to compare categorical variables, as appropriate. For the observed variables used in LCA and for variables with significant P values, normally distributed variables were compared between pairs of classes using an independent t-test, whereas Dunn's procedure was used for non-normally distributed variables 50 . Multiple comparisons were conducted with the Bonferroni method, with P values represented as adjusted P values. All statistical analyses were performed using R version 3.6.0 (R Core Team, 2019) 51 . A two-sided P-value of < 0.05 was considered significant. There were no missing data except for monthly headache frequency in one participant with migraine, which was imputed with the mean value.
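The Bonferroni adjustment mentioned above multiplies each raw P value by the number of comparisons and caps the result at 1. The sketch below applies this to the three pairwise class comparisons. The raw P values are hypothetical and do not come from Supplementary Table S4.

```python
def bonferroni(p_values: list[float]) -> list[float]:
    """Bonferroni-adjusted P values: raw value times the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# HYPOTHETICAL raw P values for the three pairwise comparisons
raw = {"Class 1 vs 2": 0.004, "Class 1 vs 3": 0.030, "Class 2 vs 3": 0.400}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))

for pair, p_adj in adjusted.items():
    flag = "significant" if p_adj < 0.05 else "not significant"
    print(f"{pair}: adjusted P = {p_adj:.3f} ({flag})")
```

In this illustration a raw P value of 0.030 is no longer significant after adjustment, which is the conservative behavior the method is intended to provide.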
Ethical approval. This study was approved by the Institutional Review Board of Severance Hospital, Yonsei University (Approval No. 2018-1269-001). Written informed consent was obtained from all participants before the survey. Prior to obtaining written informed consent, all participants were given an explanation on the objective of the study and the data to be collected by interviewers. All procedures involving human participants were in accordance with the ethical standards of the institutional and/or national research committee as well as the tenets of the 1964 Declaration of Helsinki and its later amendments, or comparable ethical standards.

Data availability
The data used in this study are available from the corresponding author on reasonable request.