Main

The family history is an effective clinical method for assessing risks for Mendelian and multifactorial disorders. Although the family history is used routinely in clinical genetics settings, it remains underutilized in primary care settings.1–6 There is a need for tools that automate and streamline the acquisition and interpretation of family history data, to increase the use of this information to refine health risks for individuals.

Although there are a number of tools to aid the collection of family history,7–12 only a few studies have attempted to validate these instruments.7–9 To the best of our knowledge, no studies have validated a tool that is both publicly available and applicable to a variety of common conditions.

The purpose of this study was to validate a publicly available tool for self-directed collection of family history for common conditions, My Family Health Portrait (MFHP) (https://familyhistory.hhs.gov), using family history data collected by a genetics professional as the reference standard. According to the Office of the Surgeon General, the MFHP site had an average of 18,064 unique visitors per month from October 2009 through February 2010 (unpublished data). Despite this wide utilization of MFHP, no studies have sought to formally validate it. We set out to test the validity of MFHP using the ClinSeq™ cohort, which is a longitudinal translational project investigating whole-genome sequencing in a clinical research setting.13 Study participants were asked to gather their family history before their initial enrollment visit using MFHP, thus providing an opportunity to validate this tool prospectively. The study sought to address a key question for health care providers, which centers on the quality and quantity of family history data collected by MFHP, compared to that collected by a genetics professional during a face-to-face interview.

SUBJECTS AND METHODS

The data were collected using a convenience sample of 150 individuals enrolled in the ClinSeq™ study between January 2007 and November 2008. Individuals eligible for the ClinSeq™ study were between 45 and 65 years of age. Before enrollment in ClinSeq™, participants were asked to gather their family health histories using the U.S. Surgeon General's family health history tool, MFHP v 1.5. MFHP is a web-based tool for the self-directed collection of family health history. The study was approved by the National Human Genome Research Institute Institutional Review Board.

MFHP v 1.5 automatically queries users about parents and grandparents and six common complex conditions: heart disease, stroke, diabetes, and colon, breast, and ovarian cancers. Participants could enter additional family members and diseases. Participants brought their printed MFHP online pedigree to the research enrollment session; we refer to this as the “online pedigree” (Fig. 1). In the session, the disease diagnoses reported in the online pedigree were confirmed by a board-certified genetic counselor (F.F.) through conversation with the participant, and additional relatives and disease diagnoses were added by the same clinician to complete a minimum three-generation pedigree. We refer to this pedigree as the “supplemented pedigree” (Fig. 1). Standard human pedigree nomenclature was used.14,15

Fig. 1. Methodology.

To measure the accuracy and reliability of MFHP, the participants' online pedigrees were scored using the supplemented pedigrees as the reference standard. A scoring sheet was created, piloted, and revised before the initiation of this study (Figure, Supplemental Digital Content 1, http://links.lww.com/GIM/A106). For the purposes of this study, coronary artery disease (CAD) was defined as stent placement, bypass surgery, angioplasty, myocardial infarction, silent myocardial infarction, atherosclerotic heart disease, or carotid endarterectomy (which is not CAD per se but is correlated with CAD). Stroke was defined as a transient ischemic attack, cerebrovascular accident, or stroke. The online pedigrees were scored for missing and incorrect data when compared to the supplemented pedigrees. The specificities and sensitivities for first- and second-degree relatives' histories were calculated. Sensitivity was defined as the number of true positives (disease cases reported in the online pedigree and confirmed in the supplemented pedigree) divided by the sum of the true positives and the false negatives (disease cases captured in the supplemented pedigree but not reported in the online pedigree). Specificity was defined as the number of true negatives (nondisease cases reported in the online pedigree and confirmed in the supplemented pedigree) divided by the sum of the true negatives and the false positives (disease cases reported in the online pedigree but corrected to nondisease cases in the supplemented pedigree). Cases whose diagnoses remained questionable or unclear after the clinician gathered the supplemental data were excluded from the specificity and sensitivity calculations. These fell into two categories: (1) disease cases not captured by the participant and noted as questionable by the clinician acquiring the supplemental pedigree data, and (2) “heart disease” cases that remained general heart disease and were not further specified by the clinician after interviewing the proband. Third-degree relatives were not included in these calculations, because they could not be taken into account in the risk stratification portion of the study (discussed below).
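The scoring rules above can be illustrated with a minimal Python sketch. It is not the scoring sheet used in the study: the function names, the dictionary layout, and the example relatives are all hypothetical, and the handling of relatives who were omitted online but unaffected in the supplemented pedigree (skipped here) is an assumption, since the text does not specify it.

```python
from collections import Counter

def tally(online, supplemented):
    """Count TP/FN/TN/FP for one disease across a proband's relatives.

    Both arguments map a relative's label to True (affected), False
    (unaffected), or None (questionable diagnosis). The supplemented
    pedigree is the reference standard.
    """
    counts = Counter()
    for relative, reference in supplemented.items():
        if reference is None:
            continue  # questionable or unclear diagnoses are excluded
        reported = online.get(relative)  # None if omitted from the online pedigree
        if reference:
            # Affected per reference: unreported or omitted counts as a false negative.
            counts["TP" if reported else "FN"] += 1
        elif reported is not None:
            # Unaffected per reference; omitted relatives are skipped (assumption).
            counts["FP" if reported else "TN"] += 1
    return counts

def sensitivity(counts):
    return counts["TP"] / (counts["TP"] + counts["FN"])

def specificity(counts):
    return counts["TN"] / (counts["TN"] + counts["FP"])

# Hypothetical example data for a single disease.
online = {"mother": True, "father": False, "sister": True, "brother": False}
supplemented = {
    "mother": True,
    "father": False,
    "sister": False,               # corrected by the clinician: false positive
    "brother": False,
    "maternal_grandfather": True,  # omitted online: false negative
    "paternal_aunt": None,         # questionable: excluded
}

counts = tally(online, supplemented)
print(counts, sensitivity(counts), specificity(counts))
```

In this toy example the tallies are one true positive, one false negative, two true negatives, and one false positive, giving a sensitivity of 0.5 and a specificity of 2/3.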

To measure the accuracy with which the online pedigree could be used to stratify disease risk, pedigrees were entered into the Centers for Disease Control research tool, Family Healthware™ (FHW)11 (Fig. 1). FHW is an electronic tool that contains algorithms to stratify the risk of a proband for disease as weak, moderate, or strong based on family history. It provides risk assessment for the same six common diseases captured by MFHP v 1.5. Two histories were entered for each participant: one derived from the online pedigree and one from the supplemented pedigree. The supplemented pedigree was entered as it was drawn during the initial enrollment session, without any additions that may have been made subsequently through follow-up visits or contacts by either the genetics professional or the participant. To analyze the effect of the different information content of the alternative pedigrees (online vs. supplemented), we counted how many of the online/supplemented pedigree pairs yielded the same risk category, how many shifted one risk category, and how many shifted two risk categories. The denominators for the category risk counts vary because patients categorized as “moderate” can only shift one category and therefore should not be in the denominator for the two-risk category shift count. Overall agreement was analyzed using Cohen's kappa. The kappa value was interpreted using Landis-Koch guidelines,16 where a kappa of <0 equals poor agreement, 0.0–0.20 slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement, and 0.81–1.00 almost perfect agreement. A sample size of 150 pedigree pairs sufficed for the observed weighted kappa 95% interval half width to be at most 0.08.
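The category-shift counts and the kappa analysis described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' analysis code: the weighting scheme (linear weights) is assumed because the text says only "weighted kappa", the two-category denominator is assumed to exclude pairs whose supplemented category is "moderate", and all function names and example pairs are hypothetical.

```python
CATEGORIES = ["weak", "moderate", "strong"]
INDEX = {name: i for i, name in enumerate(CATEGORIES)}

def shift_counts(pairs):
    """Count same / one-category / two-category shifts.

    pairs: list of (online_category, supplemented_category).
    Assumption: a pair is eligible for a two-category shift only when the
    supplemented (reference) category is not "moderate".
    """
    result = {"same": 0, "one": 0, "two": 0, "two_eligible": 0}
    for online, supplemented in pairs:
        distance = abs(INDEX[online] - INDEX[supplemented])
        result[("same", "one", "two")[distance]] += 1
        if supplemented != "moderate":
            result["two_eligible"] += 1
    return result

def weighted_kappa(pairs):
    """Cohen's kappa with linear disagreement weights |i - j| (assumed)."""
    k, n = len(CATEGORIES), len(pairs)
    observed = [[0] * k for _ in range(k)]
    for online, supplemented in pairs:
        observed[INDEX[online]][INDEX[supplemented]] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_disagreement = sum(abs(i - j) * observed[i][j] for i, j in cells) / n
    expected_disagreement = sum(abs(i - j) * row[i] * col[j] for i, j in cells) / n**2
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1 - observed_disagreement / expected_disagreement

def landis_koch(kappa):
    """Map a kappa value to the Landis-Koch label quoted in the text."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical pedigree pairs for one disease.
pairs = [("weak", "weak"), ("weak", "moderate"),
         ("strong", "strong"), ("moderate", "moderate")]
shifts = shift_counts(pairs)
kappa = weighted_kappa(pairs)
print(shifts, round(kappa, 3), landis_koch(kappa))
```

With unweighted kappa the weights would be 0 on the diagonal and 1 elsewhere; linear weights are used here only because they penalize a two-category shift more than a one-category shift, which matches the ordinal weak/moderate/strong scale.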

RESULTS

Table 1 summarizes the socio-demographic characteristics of the entire ClinSeq™ cohort enrolled up until October 8, 2009, and the subgroup of 150 patients whose pedigrees were used for this study. Similar to the larger ClinSeq™ cohort, the majority of participants in the validation study were white, not of Hispanic or Latino ethnicity, and from higher-socioeconomic groups. The average age was 56 years and the female-to-male ratio was close to one.

Table 1 Socio-demographic characteristics of research participants

The online pedigrees reported 874 first-degree relatives and 1,153 second-degree relatives (total 2,027), whereas the supplemented pedigrees captured 888 first-degree relatives and 2,282 second-degree relatives (total 3,170). Most participants had two or more first-degree relatives with at least one of the six diseases (61.3%). According to the FHW risk assessment, 92% of participants had a moderate or strong risk for at least one of the six diseases in their pedigree. Based on data from the supplemented pedigrees, heart disease had the largest number of participants with a strong risk assessment, followed by diabetes and stroke.

The specificities and sensitivities for first- and second-degree relatives were calculated. Sensitivities were higher for first-degree relatives than for second-degree relatives, whereas specificities were similar between these two groups (Table 2). Specificities and sensitivities also varied across diseases and generally were the lowest for CAD. Further comparisons of the online pedigrees to the supplemented pedigrees showed that omitted relatives in the online pedigrees were a major component of the lower sensitivities and specificities for CAD, when compared to other disorders (Table, Supplemental Digital Content 2, http://links.lww.com/GIM/A107). The other reason was the probands' reports of other heart disease diagnoses (e.g., congestive heart failure, mitral valve prolapse) instead of CAD (Table, Supplemental Digital Content 2, http://links.lww.com/GIM/A107). Stroke was also further analyzed to elucidate the cases not captured or reported in the online pedigrees. Table, Supplemental Digital Content 3, http://links.lww.com/GIM/A108, shows cases of stroke that were not captured by the online pedigrees (omitted relatives), and the reasons for incorrect assignments of stroke in the pedigrees. One common theme that arose was the interchangeable use on the part of the participants of the terms “stroke” and “heart attack or myocardial infarction,” with a total of 13 of these cases (see Tables, Supplemental Digital Content 2 and 3, http://links.lww.com/GIM/A107 and http://links.lww.com/GIM/A108) remaining unresolved in the supplemented pedigree data set.

Table 2 Specificities and sensitivities (%) of the online pedigree compared to the supplemented pedigree

The risk assessments by FHW of the online pedigrees and the supplemented pedigrees were compared (Fig. 1). The proportion of pedigree pairs that yielded the same risk category varied among the disorders (Fig. 2). The highest agreement was for colon cancer (148/150, 99%) and the lowest was for CAD (102/150, 68%). For a one-risk category shift, the two diseases at the extremes were colon cancer, with no pedigree pairs shifting categories (0/150), and CAD, with 41/150 (27%) pedigree pairs shifting categories. For a two-risk category shift, the two diseases at the extremes were ovarian cancer, with no pedigree pairs shifting categories (0/79), and CAD, with 7/70 (10%) pedigree pairs shifting categories.

Fig. 2. Comparison of risk assessments generated by Family Healthware™ for the online pedigrees and the supplemented pedigrees. Note that only 65% (97/150) and 59% (88/150) of the sample have risk assessment data for breast and ovarian cancer, respectively. This is because Family Healthware™ does not provide risk assessment for these cancers for male probands unless the familial risk is determined to be “high.”

The overall agreement of the risk assignments for the two methods was measured by Cohen's kappa. The Cohen's kappa values for four traits were considered “almost perfect”: diabetes (0.91, 95% CI, 0.85–0.97), breast (0.95, 95% CI, 0.87–1.00), ovarian (0.91, 95% CI, 0.78–1.00), and colon cancer (0.87, 95% CI, 0.70–1.00). In contrast, the agreement was only “substantial” for stroke (0.78, 95% CI, 0.69–0.86) and “moderate” for CAD (0.58, 95% CI, 0.47–0.68). Of note, risk assessment data were not analyzed for breast and ovarian cancer in 35% and 41% of the sample, respectively. This is because FHW does not provide a risk assessment for male probands for breast and ovarian cancer familial risk, unless the risk is determined to be “high.”

DISCUSSION

Despite the advances in genetic and genomic analysis, the family history will continue to play a central role in risk assessment for disease,17 as it has the ability to capture both inherited genetic susceptibilities and shared environmental and behavioral factors. Additionally, the family history has been shown to be a feasible and generally accurate initial method for risk stratification of many preventable, common conditions,18,19 and a potentially cost-effective screening tool.20 The family history can predict risk for many disorders including heart disease,21,22 type 2 diabetes,23,24 breast,25 ovarian,26 and colorectal cancer.27 Even with such ample evidence of the benefits of the family history, a recent NIH State of the Science conference entitled “Family History and Improving Health” found that there were few studies that examined the analytical performance of family history tools applicable to primary care settings.28

The data from this study show that, when compared to family history data collected by a genetics professional, semiautomated pedigree collection using MFHP was highly accurate for gathering proband-derived family history for first- and second-degree relatives across four common conditions—diabetes and breast, ovarian, and colon cancer (Table 2). Furthermore, the high correlations (94–99% of the paired pedigrees yielded the same risk category; Cohen's kappa of >0.86) of the risk categories for the three types of cancer and diabetes derived from the online and supplemented pedigrees show that the semiautomated pedigree process collected sufficient family history data to make a reliable assessment of risk (Fig. 2). The tool did not perform as well for the collection of family history of CAD and stroke. The lower sensitivity and specificity for heart disease and stroke compared to the other diseases were associated with omission of affected relatives and mis-categorization of these diseases. Improving the specificity of diagnostic choices for heart disease and stroke might improve tool performance. This should be of particular interest given the recent conclusion that self-reported family history remains significantly associated with cardiovascular disease, in contrast to multiplex genetic markers.29

Overall, the sensitivity and specificity of the online pedigrees ranged from 67–100% and 92–100%, respectively, relative to the supplemented pedigree (Table 2). Specificity and sensitivity were also dependent on degree of relationship, being higher for first-degree relatives than for second-degree relatives. These results are consistent with previous findings that individuals report the absence of disease (specificity) more accurately than the presence of disease (sensitivity) in family members, and that family history for first-degree relatives is more accurately reported than for second- and third-degree relatives.30–32

The high accuracy of the information reported by the probands (i.e., the online pedigrees), when compared with the information supplemented during the face-to-face interview, may be related to the method of family history collection. It has been suggested that computer-based tools might generate more accurate information because individuals can enter family history over a span of time in the comfort of their homes, thus having better access to records and family members than they would have during an office visit.17 This could indeed be the case for ClinSeq™ participants, who completed their family histories before their visit to the NIH Clinical Center. Anecdotally, many of the participants reported speaking with their relatives to obtain or verify their family medical history before completing MFHP online.

Levels of agreement of 77–100% have been shown when self-collected family histories were compared to family histories gathered by a genetic counselor.7–9 However, these studies validated tools that are disease-specific and/or not publicly available. One of the strengths of the present study is that it validates a publicly available and widely used family history tool that interrogates six common heritable conditions and is continually maintained and updated. Additionally, this study was performed prospectively and the family history collection was similar to a realistic clinical scenario, in which a patient might complete a family history before meeting with a health professional, who could verify diagnoses and supplement the family history, as needed. These results suggest that MFHP could be a successful tool for the collection of family history of common heritable conditions in a primary care setting. Furthermore, if coupled with a risk assessment tool such as FHW, it could provide an initial triage system for primary care providers to decide if a patient needs referral to a specialist.

There are a number of attributes of this sample population that may have influenced the accuracy of the reported family history information. This convenience sample was composed of highly educated, white, non-Hispanic individuals from higher income categories who were willing to participate in a highly complex genetic research study (Table 1).13 These research participants may be more motivated and knowledgeable regarding their family histories than the average patient. However, educational level and gender do not appear to influence accuracy even though females and individuals with higher educational levels tend to supply more information.30 Additionally, there are no documented differences in family history accuracy reported by whites compared to that provided by African Americans.33,34

An additional limitation of the study is that the genetic counseling interview was defined as the reference standard to measure sensitivity and specificity. The ideal method of validating self-reported family history would be direct interviews with relatives and/or review of their medical records. This approach is logistically difficult, time-consuming, and expensive given the inability to link to medical records.9,30 Additionally, it has been shown that the analytic validity is high for cancer, diabetes, and coronary heart disease in studies comparing family history gathered by personal interview to medical records and/or relatives' histories.32,34–36 Taken together, these data suggest that a reported history of these conditions is generally accurate. We believe that the genetic counseling interview is a reasonable proxy for direct relative interviews or review of medical records.

Future studies should examine the performance of MFHP in a population more representative of the U.S. population. Such studies could also analyze the added value of the supplemented family history with relatives beyond second-degree relatives. Unfortunately, FHW does not use data from relatives beyond second-degree relatives. Therefore, it may be useful to use experts to evaluate the larger pedigrees. These analyses might elucidate the value of expanded content of the family history beyond second-degree relatives for risk assessment of these common heritable conditions. This study did not specifically address the ability of MFHP to effectively capture family histories of single-gene disorders; given the rarity of these conditions, a much larger sample size would be required.

In summary, the availability of a tool for the systematic and semiautomated collection of family history could help identify those at increased risk for common heritable conditions who could benefit from further evaluation. The present data do not suggest that MFHP may be used in lieu of a family history interview by a genetic specialist. However, the data do suggest that MFHP is a valid tool for the initial collection of family history information. This information might then be incorporated into automated structured risk assessment tools such as FHW to aid busy primary care clinicians interested in more efficiently identifying and managing risk of common chronic conditions. In addition, geneticists and genetic counselors may find that self-administered pedigrees with professional supplementation are an effective means to increase the efficiency of family history collection.