Comparison of self-reported and register-based hospital medical data on comorbidities in women

Breast cancer patients commonly present with comorbidities which are known to influence treatment decisions and survival. We aimed to examine agreement between self-reported data and register-based medical records (National Patient Register [NPR]) in the ascertainment of nine conditions, using individually linked data from 64,961 women enrolled in the Swedish KARolinska MAmmography Project for Risk Prediction of Breast Cancer (KARMA) study. Agreement was assessed using observed proportion of agreement (overall agreement), expected proportion of agreement, and Cohen’s Kappa statistic. Two-stage logistic regression models taking into account chance agreement were used to identify potential predictors of overall agreement. High levels of overall agreement (i.e. ≥86.6%) were observed for all conditions. Substantial agreement (Cohen’s Kappa) was observed for myocardial infarction (0.74), diabetes (0.71) and stroke (0.64) between self-reported and NPR data. Moderate agreement was observed for preeclampsia (0.51) and hypertension (0.46). Fair agreement was observed for heart failure (0.40) and polycystic ovaries or ovarian cysts (0.27). For hyperlipidemia (0.14) and angina (0.10), slight agreement was observed. In most subgroups we observed negative specific agreement of >90%. There is no clear reference data source for ascertainment of conditions. Negative specific agreement between NPR and self-reported data is consistently high across all conditions.

reports it) (85-95% in general) but lower sensitivity (i.e. the probability of inpatient records reporting the condition given that the condition is present) for many diagnoses in the NPR 9 .
Comorbid conditions such as hypertension and adult-onset diabetes are almost exclusively managed in primary care and are therefore not well captured in the register-based hospital records 10 . Ludvigsson et al. noted that the sensitivity of inpatient records was especially low for hypertension and lipid disorders in the Swedish NPR (~10%) 9 . Other studies have highlighted the under-recording in hospital admission data of mild medical conditions that do not require hospital care 11,12 . Therefore, self-reported data from patients may bridge the gap in recording these conditions and be an important source of information 13 . Though the reliability of self-reported medical conditions varies considerably, such data have been increasingly adopted in both research and clinical settings 13 . The accuracy of self-reported medical history is affected by the individual's age, health status and formal education, with higher accuracy linked to younger age, better health and higher education [14][15][16][17] . Potentially due to increased monitoring of heart disease and related conditions, higher body mass index (BMI) may also be associated with higher accuracy of self-reported medical history 18,19 .
In practice, missed cases and under- and over-reporting are inevitable regardless of the data source used. However, the key question remains as to how and under what conditions different data resources can be used for research. Detailed knowledge of the limitations of different data sources can help to critically interpret results and draw conclusions from large-scale population-based research and clinical studies. Therefore, the aim of this study is to compare self-reported and register-based hospital medical data in the Swedish NPR on comorbidities in a large breast cancer screening cohort of women in Sweden. We focused on common comorbidities such as hypertension, hyperlipidemia, heart failure, myocardial infarction, angina, stroke and type I or II diabetes. In addition, we examined concordance between self-reported and NPR data on several women's health problems which are less studied, such as preeclampsia and polycystic ovaries or ovarian cysts.

Study population. The KARolinska MAmmography Project for Risk Prediction of Breast Cancer (KARMA) study (http://karmastudy.org/) was set up to be a well-characterized breast cancer cohort 20 . Participants of the prospective KARMA study comprise women attending mammography screening or clinical mammography at four hospitals in Sweden (Stockholm South General Hospital, Helsingborg Hospital, Skåne University Hospital, Lund, and Landskrona Hospital). Since 1994, all women in Sweden aged 40-74 years have been invited for publicly funded mammography screening every 18-24 months. Adherence to mammography screening is high: three in four eligible women attend screening regularly. All women who were invited for screening between January 2011 and March 2013 at the four hospitals were invited to participate in the KARMA study. Additionally, women who had a clinical mammography (i.e. a woman referred for a mammogram because of a symptom noticed by her and/or her doctor) at any of the participating mammography units during the recruitment period were invited. Of 210,233 women who were invited to participate in the KARMA study, 70,877 (34%) were enrolled. These women answered a detailed web questionnaire (https://karmastudy.org/wp-content/uploads/2015/07/Karma_baseline_questionnaire_eng.pdf) on background and lifestyle risk factors. Consent was obtained for the retrieval of data from medical records and national registers. Ninety-two percent (n = 65,231) of the enrolled women completed the web questionnaire. The ethical review board in Stockholm approved the study (2010/958-31/1) and all study procedures were performed in accordance with relevant guidelines and regulations.
Data sources of medical history. In KARMA, self-reported data were collected for the following conditions: high blood pressure (hypertension), high blood cholesterol (hyperlipidemia), myocardial infarction, angina, heart failure, stroke, polycystic ovaries or ovarian cysts, preeclampsia, and diabetes. Given instructions to "choose all that apply", the participants were asked: "Have you ever been diagnosed with [any of the medical conditions] by a medical doctor?" For each condition, participants marked the corresponding checkbox (i.e. yes) if they had ever been diagnosed with the specified condition; an unmarked checkbox corresponded to not having the condition (i.e. no). Women who responded "Don't know/Refuse" to the question were excluded from further analysis (n = 270).
All participating women in the KARMA study were electronically linked to the NPR (linkage date 1st October 2013) through unique personal identity numbers (i.e. personnummer) 21 . Inpatient/outpatient diagnoses (main and secondary) of the nine medical conditions studied were identified using International Classification of Diseases (ICD) diagnosis codes (see Supplementary Table 1). Diagnoses registered after the date of completion of the questionnaire were excluded.
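The exclusion rule for register diagnoses can be illustrated with a minimal Python sketch. The function name, the record layout and the code prefixes are hypothetical and chosen for illustration only; the codes actually used in the study are those listed in Supplementary Table 1. The sketch also assumes that a diagnosis registered on the questionnaire date itself is retained, which the text does not specify.

```python
from datetime import date


def has_prior_diagnosis(records, code_prefixes, questionnaire_date):
    """Return True if any register record for the condition predates the questionnaire.

    records: iterable of (diagnosis_date, icd_code) tuples for one woman.
    code_prefixes: tuple of ICD code prefixes defining the condition
        (illustrative values only, not the study's code list).
    questionnaire_date: date on which the web questionnaire was completed.
    """
    return any(
        # Assumption: same-day diagnoses are kept; later ones are excluded.
        diagnosed <= questionnaire_date and code.startswith(code_prefixes)
        for diagnosed, code in records
    )


# Hypothetical register extract for one participant.
example_records = [
    (date(2009, 5, 2), "I219"),    # ICD-10 I21: acute myocardial infarction
    (date(2012, 11, 30), "E119"),  # ICD-10 E11: type 2 diabetes
]
answered_on = date(2011, 6, 1)

mi_in_register = has_prior_diagnosis(example_records, ("I21",), answered_on)
diabetes_in_register = has_prior_diagnosis(example_records, ("E10", "E11"), answered_on)
```

In this example the myocardial infarction record counts as a register diagnosis, whereas the diabetes record does not because it was registered after the questionnaire was completed.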
Other covariates. Self-reported information on age at time of survey, education level, BMI and smoking was derived from the KARMA web questionnaire.
Statistical analysis. The count and percent of diseases for each data source (i.e. self-reported and NPR) and for each combination of medical history status from the two data sources (i.e. "Self-reported No/NPR No", "Self-reported Yes/NPR No", "Self-reported No/NPR Yes" and "Self-reported Yes/NPR Yes") were computed. In the comparison of self-reported and NPR data we report the difference in prevalence instead of the percentage difference. This was chosen to avoid the impression that either source is the gold standard. Both methods gave similar results. The "epi.kappa" function in the "epiR" package was used to compute the observed proportion of agreement (overall agreement), expected proportion of agreement, prevalence index, bias index, prevalence and bias corrected kappa statistic and Cohen's Kappa statistic in R (version 3.4.2). The prevalence index ($([y/y] - [n/n])/N$, from the cells of a standard 2 × 2 matrix, Supplementary Method), an estimate of the difference in the probability of the condition being present and absent from the study population, ranges from −1 to 1 and equates to 0 when 50% of the study population has the condition. A larger absolute prevalence index value results in larger chance agreement and smaller Kappa value 22 . A bias index ($([y/n] - [n/y])/N$), the difference in the reported proportion of the condition being present between NPR and self-reported data, ranges from −1 to 1; large absolute values indicate bias increasing κ, whereas a zero bias index indicates equal marginal proportions and no bias 22 . [...] Positive specific agreement, the proportion that tests positive under both criteria compared with the average proportion that test positive under each criterion separately ($2[y/y]/(2[y/y] + [y/n] + [n/y])$), was expressed as percentage (100 denotes perfect agreement) 24 . The positive specific agreement is the inverse transformed mean of the sensitivity and positive predictive values, i.e., sensitivity agnostic to which measure is the gold standard (Supplementary Method).
Similarly, negative specific agreement, the proportion that tests negative under both criteria compared with the average proportion that test negative under each criterion separately ($2[n/n]/(2[n/n] + [y/n] + [n/y])$), was expressed as percentage.
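For readers who wish to trace the agreement measures by hand, the following Python sketch computes them from the four cells of a 2 × 2 table using their standard textbook definitions. It is not the epiR implementation used in the study; the class and function names are hypothetical, the cell order (self-reported/NPR) and the sign convention of the bias index are assumptions, and the counts in the example are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table2x2:
    """Cell counts, written as self-reported status / NPR status."""
    yy: int  # self-reported yes, NPR yes
    yn: int  # self-reported yes, NPR no
    ny: int  # self-reported no, NPR yes
    nn: int  # self-reported no, NPR no


def agreement_measures(t: Table2x2) -> dict:
    """Standard agreement measures for two binary classifications.

    Undefined (division by zero) for degenerate tables, e.g. when every
    woman falls in a single concordant cell.
    """
    n = t.yy + t.yn + t.ny + t.nn
    p_obs = (t.yy + t.nn) / n          # observed (overall) agreement
    p_self = (t.yy + t.yn) / n         # prevalence according to self-report
    p_npr = (t.yy + t.ny) / n          # prevalence according to the register
    # Agreement expected by chance if the two sources were independent.
    p_exp = p_self * p_npr + (1 - p_self) * (1 - p_npr)
    return {
        "overall_agreement": p_obs,
        "expected_agreement": p_exp,
        "kappa": (p_obs - p_exp) / (1 - p_exp),
        "prevalence_index": (t.yy - t.nn) / n,
        "bias_index": (t.yn - t.ny) / n,  # sign convention is an assumption
        "pabak": 2 * p_obs - 1,           # prevalence and bias adjusted kappa
        "positive_specific_agreement": 2 * t.yy / (2 * t.yy + t.yn + t.ny),
        "negative_specific_agreement": 2 * t.nn / (2 * t.nn + t.yn + t.ny),
    }


# Invented counts for illustration (not study data).
measures = agreement_measures(Table2x2(yy=40, yn=5, ny=5, nn=50))
```

With these invented counts the observed agreement is 0.90, the expected agreement 0.505 and Cohen's Kappa approximately 0.80, with positive and negative specific agreement of about 89% and 91%.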
Two-stage logistic regression models for analysing agreement, which take chance agreement into account, were used to identify potential predictors of overall agreement (i.e. "Self-reported No/NPR No" and "Self-reported Yes/NPR Yes" were coded as "1", otherwise coded as "0") 25 . The offset term was determined by the variable(s) in the main model. The 95% confidence interval was obtained using the bootstrap approach, taking the 2.5th and 97.5th percentiles of 2000 iterations. The following predictors were considered: age, education level, BMI, and smoking (reported ever having smoked for one year or 100 cigarettes). Stratified analyses were carried out for each condition by age (<50, 50-59 or ≥60 years), education level (elementary, intermediate or university), BMI (<25 or ≥25 kg/m²), and smoking (Yes or No). A subset of parous women (i.e. who had at least one full-term pregnancy) was used to study preeclampsia. Statistical significance threshold was set at P < 0.05.
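The percentile bootstrap interval described above can be sketched in Python as follows. As a simplification, the sketch bootstraps Cohen's Kappa rather than the coefficients of the two-stage regression model, which is not reproduced here; the function names, the nearest-rank percentile rule, the fixed seed and the example data are all assumptions made for illustration.

```python
import math
import random


def cohen_kappa(pairs):
    """Cohen's Kappa for a list of (self_reported, npr) pairs coded 0/1."""
    n = len(pairs)
    p_obs = sum(1 for s, r in pairs if s == r) / n
    p_self = sum(s for s, _ in pairs) / n
    p_npr = sum(r for _, r in pairs) / n
    p_exp = p_self * p_npr + (1 - p_self) * (1 - p_npr)
    if p_exp == 1.0:
        return float("nan")  # Kappa is undefined when chance agreement is 1
    return (p_obs - p_exp) / (1 - p_exp)


def percentile_ci(pairs, stat=cohen_kappa, n_boot=2000, seed=0):
    """95% percentile bootstrap interval: resample women with replacement."""
    rng = random.Random(seed)
    n = len(pairs)
    estimates = []
    for _ in range(n_boot):
        resample = [pairs[rng.randrange(n)] for _ in range(n)]
        value = stat(resample)
        if not math.isnan(value):  # skip degenerate resamples
            estimates.append(value)
    estimates.sort()
    # Nearest-rank approximation to the 2.5th and 97.5th percentiles.
    lower = estimates[int(0.025 * (len(estimates) - 1))]
    upper = estimates[int(0.975 * (len(estimates) - 1))]
    return lower, upper


# Invented data for illustration: 100 women, 90 concordant pairs.
example_pairs = [(1, 1)] * 40 + [(1, 0)] * 5 + [(0, 1)] * 5 + [(0, 0)] * 50
ci_lower, ci_upper = percentile_ci(example_pairs, n_boot=500, seed=1)
```

The same resampling scheme applies to any statistic computed per bootstrap sample, including regression coefficients; only the `stat` argument would change.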

Ethics Approval And Consent To Participate.
All participants signed informed consent forms, and the ethical review board at Karolinska Institutet approved the study (2010/958-31/1). All study procedures were performed in accordance with relevant guidelines and regulations.
According to the self-reported data, the five most commonly diagnosed conditions were hypertension (19.8%), hyperlipidemia (10.8%), polycystic ovaries or ovarian cysts (9.2%), preeclampsia (5.0%) and diabetes (2.8%) (Table 1). The remaining conditions (i.e. heart failure, myocardial infarction, angina and stroke) each affected ~1% of the study population. When comparing self-reported data to the NPR, the largest differences were observed for hypertension (11.2% more in self-reported data), angina (10.8% less in self-reported data), hyperlipidemia (9.6% more in self-reported data) and polycystic ovaries or ovarian cysts (4.1% more in self-reported data) (Table 1). Differences between self-reported and NPR data were minimal (<1.0%) for heart failure, myocardial infarction, stroke, preeclampsia and diabetes. Figure 1 shows estimates of Cohen's Kappa for commonly diagnosed conditions. Relative cell counts, expected proportion of agreement, prevalence index, bias index, and Cohen's Kappa statistic are presented in Supplementary Tables 2-6. Substantial agreement (Cohen's Kappa) was observed for myocardial infarction (0.74), diabetes (0.71) and stroke (0.64). Moderate agreement was observed for preeclampsia (0.51) and hypertension (0.46). Fair agreement was observed for heart failure (0.40) and polycystic ovaries or ovarian cysts (0.27). For hyperlipidemia (0.14) and angina (0.10), slight agreement was observed between self-reported and NPR data. High levels of overall agreement (i.e. 86.6% or more) were observed for all included conditions (Fig. 2). The average agreement between self-reported and NPR data on absence of a medical condition (percent negative specific agreement, range: 92.2-99.8%) was higher than on presence (percent positive specific agreement, range: 11.3-74.4%) (Fig. 2).
Fair agreement (0.23) was observed for the number of conditions between self-reported and NPR data. As with the more common conditions (hypertension and hyperlipidemia), older age, higher BMI and lower education were associated with higher agreement in multinomial models for the number of conditions (data not shown).
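The verbal labels attached to the Kappa values above (slight, fair, moderate, substantial) are consistent with the widely used Landis and Koch bands. The following Python sketch encodes those bands under the assumption that they are the ones applied here; the function name is hypothetical, and the Kappa values are those reported in the text.

```python
def agreement_label(kappa: float) -> str:
    """Map a Kappa value to a Landis and Koch style label (assumed banding)."""
    if kappa < 0:
        return "poor"
    # Upper bounds are inclusive, so 0.40 is "fair" and 0.41 is "moderate".
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Kappa values as reported in the results.
reported_kappa = {
    "myocardial infarction": 0.74,
    "diabetes": 0.71,
    "stroke": 0.64,
    "preeclampsia": 0.51,
    "hypertension": 0.46,
    "heart failure": 0.40,
    "polycystic ovaries or ovarian cysts": 0.27,
    "hyperlipidemia": 0.14,
    "angina": 0.10,
}
labels = {condition: agreement_label(k) for condition, k in reported_kappa.items()}
```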

Discussion
Misclassification of conditions may result in confounded studies. In the study of survival, the misclassification of comorbid conditions that are associated with a higher risk of death would lead to over-emphasis of the risk of death from the disease of interest. In addition, studies looking at treatment outcomes may be confounded by comorbidities. To systematically examine the appropriateness of using self-reported and register-based hospital medical (NPR) data to identify comorbidities, we compared prevalence and agreement between these two data sources in a large population-based breast cancer cohort in Sweden. Both data sources have their respective strengths and shortcomings. However, the focus of this study is not on whether self-reported personal medical history is "more correct" than NPR records or vice versa, but rather on how closely they agree or disagree for various medical conditions. Few studies have looked at the concordance of preeclampsia between self-reported and hospital data 26,27 and, to the best of our knowledge, ours is the first study investigating the same for polycystic ovaries and ovarian cysts. The Swedish NPR has been used to identify women with preeclampsia and polycystic ovaries and ovarian cysts in epidemiological studies previously 28,29 . Self-reported data provided more cases of polycystic ovaries or ovarian cysts than the NPR did and there was fair agreement between the sources. The combined classification of polycystic ovarian syndrome and ovarian cysts may have resulted in the higher than expected self-reported occurrence in the older age group. However, the prevalence estimates for preeclampsia were found to be similar for self-reported and NPR data, with moderate agreement. The use of self-reported preeclampsia in the older generations of women may be of concern as preeclampsia may have been referred to as "toxemia of pregnancy" in the earlier period.
However, in our population we observed similar concordance across the three age groups. These conditions could be potential risk factors or confounders for breast cancer risk, for example, preeclampsia has been shown to be associated with reduced risk of breast cancer 30 . Therefore, in order to identify women with these conditions reliably, both self-reported and NPR (hospital) data should be explored, if available.
Our study showed that hypertension and hyperlipidemia were markedly under-represented in the NPR data compared with self-reported data, because primary care outpatient records are not included in the NPR. This is in agreement with previous studies which showed that medical conditions typically treated in primary care settings (not often leading to hospital admission) are not recognised or recorded in hospital admission data. For example, a concordance study of self-reported and administrative hospital data in the Australian Longitudinal Study on Women's Health showed under-recording of hypertension in the hospital data 12 .
For life-threatening conditions like heart failure, myocardial infarction and stroke, the differences in prevalence between self-reported and NPR data were minimal. In contrast to all the other comorbidities studied in this paper, angina was reported less often in the self-reported data than in the NPR (absolute difference in prevalence ~10%). This might be due to the fact that angina is not a well-defined disease and many people misclassify it because its symptoms are similar to those of other diseases (e.g. myocardial infarction) and it is perceived as a symptom, not a disease 31 . Consequently, the agreement between self-reported angina and NPR-recorded angina was poor in our study.
In spite of heterogeneous methodology and comparisons, we observed common findings among previously published studies: higher agreement for medical conditions that are widely recognized and easily diagnosed (e.g. diabetes, hypertension) or require hospital care (e.g. myocardial infarction, stroke), and lower agreement for poorly defined diseases (e.g. heart failure, angina), conditions perceived as symptoms (e.g. angina) and conditions that may not require hospitalization (e.g. hyperlipidemia). Okura et al. measured the agreement between self-reported cardiovascular disease and extensive medical records with high completeness (including hospital inpatient or outpatient care, office visits, emergency room and nursing home care, and death certificate and autopsy information) and a long archival period for ~2,000 participants from Olmsted County in Minnesota and found substantial agreement for diabetes, hypertension, myocardial infarction and stroke (Kappa values ranging from 0.71 to 0.80) 15 . Moderate agreement was observed for heart failure (Kappa 0.46) 15 32 . Huerta et al. compared self-reported diabetes, hypertension and hyperlipidemia with biometric data (levels of blood glucose and lipids and blood pressure) and found substantial (Kappa 0.78), moderate (Kappa 0.51) and fair agreement (Kappa 0.27) for the three conditions, respectively. Nonetheless, as low prevalence may result in high chance agreement and, consequently, low Kappa, caution should be exercised when interpreting statistics for less common conditions.
Overall agreement is a common measure of agreement between self-reported and hospital data 15,33 . Based on overall agreement, self-reported and NPR were concordant for 86.6% or more of the participants for all nine comorbidities studied. The high overall agreement observed in our study is mainly driven by the high negative specific agreement (>92%) for all comorbidities studied. In addition, conditions with higher proportion of positive specific agreement had higher Kappa. This may be an indication that we might be limited in identifying comorbidities when we use only one source of information. Factors associated with overall agreement tend to have similar association with positive specific agreement (Supplementary Table 7).
Previously, Ye et al. argued that the number of comorbidities increases with age, leading to lower precision between self-reported and medical records 33 . We observed fair agreement between self-reported and NPR data for the number of conditions. However, after accounting for other factors such as education, BMI, smoking and breast cancer history, we observed lower overall agreement with increased age for polycystic ovaries or ovarian cysts and preeclampsia, whereas better overall agreement with increased age was observed for hypertension, hyperlipidemia and angina. Our results suggest that the relationship between overall agreement and age is likely due to the length of time between disease diagnosis and study entry, as polycystic ovaries or ovarian cysts and preeclampsia are typically diagnosed at much younger ages than hypertension, hyperlipidemia or angina.
Similar to the work of Ye et al. 33 , we did not find education level to be a predictor of overall agreement in general, with preeclampsia being the only exception. This finding should be interpreted in the light of education in Sweden being mandatory for all children between the ages of 7 and 16. In addition, higher education is available at no cost for Swedish citizens. It is unclear why better overall agreement was observed for preeclampsia. Nonetheless, women with higher education may be privileged with higher health literacy, which in turn puts them in a better position to understand information conveyed to them by physicians.
Short et al. hypothesized that higher BMI is correlated with lower agreement between self-reported values of healthcare utilization and administrative claims 14 . It was suggested that there might be a tendency for people with higher BMI to use more healthcare services, making it less likely for them to accurately recall and report doctor visits and inpatient hospital admissions 14 . However, in our study, self-reported diagnoses for several diseases were more likely to be confirmed by NPR data in women with higher BMI. A possible explanation may be related to the high education among women in general and also the greater health consciousness of women enrolled in KARMA; they may have been more aware of the risk of chronic diseases associated with obesity. Other studies have also shown that better health status is associated with better agreement 14 . This is supported by the higher agreement observed among non-smokers for polycystic ovaries or ovarian cysts, preeclampsia and diabetes in our results.
The main strengths of our study include a women-only cohort, the large sample size and resulting statistical power. An electronic linkage with the NPR provided complete follow-up for virtually every woman in the cohort. Although our study base comprises women attending screening or clinical mammography, the publicly funded health care system in Sweden means that all residents have access to health care, and socioeconomic bias in hospital admission is very unlikely. Nonetheless, a number of limitations warrant discussion. For example, register-based diagnoses can be complemented by information from the Swedish Drug Prescription Register (e.g. beta blockers to indicate hypertension and lipid lowering agents to indicate hyperlipidemia) 34 . However, while the Swedish Drug Prescription Register contains information regarding drug utilization and expenditures for dispensed prescribed drugs in the entire Swedish population, it was established fairly recently in July 2005 (i.e. too recent to be used) 34 . There are other conditions that may be of interest; however, we were limited to those in the KARMA questionnaire. Our study consisted of a highly educated population that is well served by a mainly government-funded and decentralized health care system. It is unclear whether the results can be generalized to other populations with different health-seeking behaviour and access to healthcare. In addition, two inherent disadvantages of agreement measures must be taken into account when interpreting the results. Firstly, we acknowledge that there is no clear reference standard for the ascertainment of the medical conditions. Secondly, when the disease prevalence in the population is very high or low, the value of Cohen's Kappa may indicate poor reliability even with a high observed proportion of agreement [35][36][37] (i.e. agreement results are dependent on the disease prevalence in the study population).
We have thus reported multiple measures of agreement to take into account bias, prevalence and possible imbalance in each 2 × 2 table's marginal totals to address this paradox of the Kappa statistic.
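The dependence of Cohen's Kappa on prevalence can be seen in a small worked example. The Python sketch below compares two invented 2 × 2 tables (not study data) that share the same observed agreement of 90% but differ in prevalence; the function name is hypothetical.

```python
def kappa_and_specific_agreement(yy, yn, ny, nn):
    """Return (observed agreement, Kappa, positive and negative specific agreement)."""
    n = yy + yn + ny + nn
    p_obs = (yy + nn) / n
    p_first = (yy + yn) / n   # prevalence according to the first source
    p_second = (yy + ny) / n  # prevalence according to the second source
    p_exp = p_first * p_second + (1 - p_first) * (1 - p_second)
    kappa = (p_obs - p_exp) / (1 - p_exp)
    psa = 2 * yy / (2 * yy + yn + ny)
    nsa = 2 * nn / (2 * nn + yn + ny)
    return p_obs, kappa, psa, nsa


# Common condition: about 45% prevalence in both sources.
common = kappa_and_specific_agreement(yy=40, yn=5, ny=5, nn=50)
# Rare condition: about 7% prevalence in both sources, same 10 discordant pairs.
rare = kappa_and_specific_agreement(yy=2, yn=5, ny=5, nn=88)
```

Both tables have an observed agreement of 0.90, yet Kappa is roughly 0.80 for the common condition and roughly 0.23 for the rare one. For the rare condition, negative specific agreement remains high (about 95%) while positive specific agreement is low (about 29%), which is the pattern that motivates reporting several measures side by side.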

Conclusions
An increasing number of breast cancer cohort studies 38,39 are including self-reported comorbidities in the data collection forms, prompting an investigation into how well the data from self-reported questionnaires correspond to register-based hospital medical data such as the Swedish NPR. Our study confirmed that for stroke and myocardial infarction there is substantial overall agreement between registry data and self-reported data, regardless of age, education, and BMI. Older age was associated with better overall agreement for hypertension, hyperlipidemia and angina, but poorer overall agreement for polycystic ovaries or ovarian cysts and preeclampsia. In most subgroups, negative specific agreement between registry data and self-reported data is >90%, which suggests that both sources can confidently identify individuals without the conditions studied in these subgroups.

Data Availability
The datasets used and/or analysed during the current study are available through an application for Karma Data Access (https://karmastudy.org/data-access/) on reasonable request.