Using routinely collected primary care records to identify and investigate severe asthma: a scoping review

Shielding during the coronavirus pandemic has highlighted the potential of routinely collected primary care records to identify patients with ‘high-risk’ conditions, including severe asthma. We aimed to determine how previous studies have used primary care records to identify and investigate severe asthma and whether linkage to other data sources is required to fully investigate this ‘high-risk’ disease variant. A scoping review was conducted based on the Arksey and O’Malley framework. Twelve studies met all criteria for inclusion. We identified variation in how studies defined the background asthma cohort, asthma severity, control and clinical outcomes. Certain asthma outcomes could only be investigated through linkage to secondary care records. The ability of primary care records to represent the entire known asthma population is unique. However, a number of challenges need to be overcome if their full potential to accurately identify and investigate severe asthma is to be realised.


INTRODUCTION
The majority of asthma care in the United Kingdom (UK) is delivered in primary care. Severe asthma represents a subset of patients whose disease does not respond to treatment in this setting and remains uncontrolled despite confirmed adherence with maximal optimised therapy and treatment of contributory factors or that worsens when high-dose treatment is reduced 1 . This is distinct from difficult-to-treat asthma (DTA), where poor control is due to modifiable factors, such as incorrect inhaler technique, poor adherence, smoking, comorbidities or an incorrect diagnosis.
Patients with severe asthma have significantly better outcomes when identified and referred for specialist assessment 2 . There is significant variation in asthma specialist care across the UK, with unacceptable variation in prevalence, frequency of exacerbations, provision of services and health outcomes across geography, age, ethnicity and socio-economic groups 3,4 . New safe and effective management options are available for severe asthma, and it is vital that we better understand what contributes to this variation and put in place measures to reduce its effect.
Severe asthma was named as one of the 'high-risk' conditions during the coronavirus outbreak 5,6 . Searches of routinely collected primary care records were conducted in an attempt to rapidly identify these patients and advise them to isolate themselves to prevent harm from contracting the virus. This has placed a spotlight on the challenges of accurately identifying patients with severe asthma from routinely collected data, including gaining consensus on what criteria should be used to define this subgroup 7 . This process has also highlighted exciting potential opportunities such as improving our understanding of this high-risk disease, gaining accurate estimates of prevalence and disease burden and identifying potential candidates for novel therapies.
The aim of this review was to identify how previous studies have used primary care data to identify and investigate severe asthma. Given that asthma patients have healthcare records held in various other databases throughout the Health and Social Care system, we also aimed to determine the benefits and limitations of linking primary care data to other healthcare and administrative data.

RESULTS
One thousand five hundred and six records were identified from Ovid Medline, 3018 records from Embase and 855 from Web of Science (Fig. 1). Following screening, 28 full-text articles were identified from OVID Medline, 54 from Embase and 15 from Web of Science. After removal of duplicates, 71 full-text articles were included in the full review stage. Twelve studies met all the inclusion criteria. Of these, 8 studies linked General Practice (GP) data to other healthcare and/or administrative records.

Data sets
Of the 12 studies included in the review, 9 obtained primary care data from the UK 8-15 , 1 from Sweden 16 , 1 from Denmark 17 and 1 from the United States of America (USA) 18 (Table 1). The UK studies obtained primary care data from 5 sources of varying size and coverage of the UK population. The Swedish study obtained primary care data from a cluster of Swedish primary medical centres 16 . The Danish and American studies primarily obtained primary care data from Health Insurance registers.
Asthma control was defined by exacerbation frequency and short-acting beta agonist (SABA) reliever inhaler overuse (Table 3). The proportion of patients with excessive SABA use ranged from 9.1 to 13.6% when defined by inhaler prescriptions per year (≥10 or ≥13 prescriptions) 8,11,15,19 and from 22% to 23.5% when defined by SABA daily dosage (≥300 or ≥400 µg) 10,14 (Supplementary Tables 2a, b). Three articles used combined measures of asthma control 13,14,16 (Table 3). Two articles that used comparable measures of 'overall asthma control' estimated the proportion of patients with uncontrolled asthma at 26.7 and 59.3% 13,14 (Supplementary Tables 2a, b).
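The prescription-count definition of excessive SABA use described above can be illustrated with a minimal sketch. The record layout (one `(patient_id, year)` pair per SABA inhaler prescription), the function name and the example data are assumptions made for illustration only; the included studies do not publish their extraction code, and the thresholds are simply the two reported values (≥10 or ≥13 prescriptions per year).

```python
from collections import Counter


def flag_saba_overuse(prescriptions, threshold=10):
    """Return the (patient_id, year) pairs with at least `threshold` SABA prescriptions.

    `prescriptions` is a hypothetical iterable holding one (patient_id, year)
    tuple per SABA inhaler prescription.
    """
    counts = Counter(prescriptions)
    return {key for key, n in counts.items() if n >= threshold}


# Hypothetical example data: three patients in a single year.
records = [("p1", 2019)] * 12 + [("p2", 2019)] * 4 + [("p3", 2019)] * 10

overuse_at_10 = flag_saba_overuse(records, threshold=10)
overuse_at_13 = flag_saba_overuse(records, threshold=13)
```

With these invented records, two of the three patients meet the ≥10 threshold and none meets the ≥13 threshold, which shows how the choice of cut-off alone can shift the estimated proportion.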

Clinical outcomes
Asthma exacerbations were the most commonly measured asthma outcome (Table 3). Six articles used the American Thoracic Society/European Respiratory Society (ATS/ERS) definition for an asthma exacerbation (asthma-related emergency department (ED) visit, hospitalisations or OCS prescription) 8-10,13,14,16 . Three articles that used this definition of asthma exacerbations found that the proportion of patients with >2 exacerbations per year ranged from 5 to 7% 10,13,14 (Supplementary Tables 3a, b). The remaining articles used OCS prescriptions 15,19 , hospitalisations 11 or both 18 (Table 3). Two articles included GP visits for asthma 18,20 and two included death due to asthma 8,20 in their definitions of an asthma exacerbation.
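As a rough illustration of the ATS/ERS-style definition above, the sketch below counts qualifying events per patient over one year and flags those with more than two. The event labels, the input layout and the function names are hypothetical. ED visits and hospitalisations would generally have to come from linked secondary care records, and a real analysis would presumably also need a rule for events that belong to the same episode, which is not attempted here.

```python
from collections import defaultdict

# Event types counted as an exacerbation under the ATS/ERS-style definition.
# The string labels are assumptions for this sketch, not real clinical codes.
ATS_ERS_EVENT_TYPES = {"ocs_prescription", "ed_visit", "hospitalisation"}


def exacerbations_per_patient(events):
    """Count qualifying events per patient.

    `events` is a hypothetical iterable of (patient_id, event_type) tuples
    covering a single year of follow-up.
    """
    counts = defaultdict(int)
    for patient_id, event_type in events:
        if event_type in ATS_ERS_EVENT_TYPES:
            counts[patient_id] += 1
    return dict(counts)


def frequent_exacerbators(events, more_than=2):
    """Return the patients with more than `more_than` qualifying events."""
    per_patient = exacerbations_per_patient(events)
    return {pid for pid, n in per_patient.items() if n > more_than}


# Hypothetical example data.
example_events = [
    ("a", "ocs_prescription"),
    ("a", "ed_visit"),
    ("a", "hospitalisation"),
    ("b", "ocs_prescription"),
    ("b", "gp_visit"),  # not part of the ATS/ERS-style definition
]
```

Adding `"gp_visit"` or an asthma death label to `ATS_ERS_EVENT_TYPES` would mimic the broader definitions used by some of the included articles.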

Other themes
We identified four other recurring themes: how the studies characterised their cohorts, what comorbidities they took note of and analysed and if they reviewed healthcare resource utilisation (HCRU) or the quality of care provided (Table 4). We grouped parameters used to characterise the asthma population into socio-demographic factors, investigations and management. The most common socio-demographic parameters were age, sex, BMI and smoking status. Other parameters included ethnicity 11,18 and socio-economic status 8,9,12,17,19 . Clinical investigations reported included eosinophil counts 8,10,13,14 and respiratory function using peak flow 10,12-14 and spirometry 12,13,16 . A number of studies Three articles expanded this to other asthma treatments 16-18 , and one article reviewed non-asthma treatment 16 .

Reporting of studies
We reviewed the quality of reporting of studies against the REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) extension to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement 21-23 . The most poorly reported STROBE Statement areas (each reported by fewer than five articles) were explanations of efforts taken to address sources of bias 17 , how missing data were addressed 9,12,14 and the number of participants with missing data for each variable of interest 9,11,18 .
For the RECORD extension areas, four articles provided the required detail on how codes or algorithms used to identify the study population were validated 8,9,16,17 and none of the included articles provided a description of the data cleaning methods. For the articles that linked to other data sets, none provided a data linkage flow diagram or the required detail on data linkage level, methods and quality evaluation. When the quality of study reporting was compared numerically using the RECORD criteria, 5 articles covered 80% or more of the required checklist areas 9,12,14,16,24 .
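The numerical comparison of reporting quality can be sketched as a simple proportion of applicable checklist items that were reported. This is only an illustration under stated assumptions: the item names, the use of `None` to mark a not-applicable item and the function name are hypothetical, and excluding not-applicable items from the denominator is our reading of the statement that only valid fields were included in estimates.

```python
def checklist_coverage(items):
    """Return the proportion of applicable checklist items that were reported.

    `items` is a hypothetical dict mapping a checklist item name to
    True (reported), False (not reported) or None (not applicable).
    """
    applicable = [value for value in items.values() if value is not None]
    if not applicable:
        raise ValueError("no applicable checklist items")
    return sum(applicable) / len(applicable)


# Hypothetical article: 10 items, 2 not applicable, 7 of the remaining 8 reported.
example_article = {
    "title_abstract": True,
    "background": True,
    "objectives": True,
    "study_design": True,
    "setting": True,
    "participants": True,
    "code_validation": True,
    "data_cleaning": False,
    "linkage_flow_diagram": None,  # not applicable: no data linkage conducted
    "matching": None,              # not applicable: study did not involve matching
}

coverage = checklist_coverage(example_article)
meets_80_percent = coverage >= 0.8
```

Here the invented article covers 7 of 8 applicable items (87.5%) and would therefore fall in the group covering 80% or more of the checklist.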

DISCUSSION
The majority of asthma care in the UK is carried out in primary care, yet a significant proportion of previous research into this condition has taken place outside this setting. A previous review of the use of electronic health record-derived data to define asthma and assess asthma outcomes identified a number of limitations of the included studies, the majority of which used data extracted from secondary care 25 . The aim of this review was to identify how previous studies have used primary care data to identify and investigate severe asthma. We have summarised the challenges identified from the reviewed articles for each step in the identification of patients with severe asthma, outlined the potential opportunities if these patients can be accurately identified, and proposed how the identified challenges might be overcome to realise these opportunities.
Before attempting to identify patients with severe asthma, the first challenge is to agree on how to identify the background asthma population from primary care records, within which patients with severe disease can be identified. The majority of articles included in the review used asthma 'read codes' to identify patients with asthma from primary care records. Remuneration in UK primary care via the Quality and Outcomes Framework (QOF) requires the use of these specific read codes 26 . GPs are required to hold, and to keep up to date, an accurate asthma patient population. Nissen et al. reported that the UK has benefitted from higher-quality coding due to QOF, and asthma can be accurately identified from UK primary care records using specific read codes, with a high positive predictive value for asthma (86%) 24 . By contrast, using medication records from primary care to identify patients with asthma has limitations, most notably the potential to miscategorise patients with conditions that are treated with similar medications, in particular chronic obstructive pulmonary disease (COPD) 9,20 . A number of the included studies had pre-specified inclusion and exclusion criteria to identify their asthma cohorts. This has advantages if researchers want to investigate specific known or novel associations, such as with eosinophilia 8,10 . Excluding patients with COPD and other chronic respiratory disease can reduce miscategorisation 9,16 . However, these inclusion and exclusion criteria, required to address the specific objectives of each study, will reduce the generalisability of study findings to the wider asthma population in primary care and limit the ability to compare findings between studies 16 .
Once the background asthma population is identified, the next challenge is to agree on the criteria to diagnose 'severe' asthma. The majority of articles included in this review categorised the severity of their primary care asthma populations using prescribing records into the BTS 8-13 or GINA 14,16 treatment steps. The GINA 2017 guideline defines treatment Steps 4 and 5 as severe asthma 27 . Despite differences in the earlier treatment steps between guidelines, Steps 4 and 5 are essentially the same across BTS and GINA guideline versions (Supplementary Table 3) 28,29 . If we compare BTS and GINA guidelines to the definition of severe asthma used to identify patients with 'high risk' severe asthma during the coronavirus pandemic 5 , BTS and GINA Step 5 treatment would be required to meet the criteria or lower steps with admission to hospital or the Intensive Care Unit (Supplementary Table 3). Prescribing records from primary care are amenable to categorisation by these treatment steps. However, this is not without challenges. Asthma guidelines change regularly, making comparison between studies difficult. Since the included studies were published, the BTS and GINA guidelines have both been updated, making comparison with future studies even more challenging (Supplementary Table 3) 28,29 .
Under the GINA classification, for an accurate diagnosis of severe asthma the disease must remain uncontrolled despite adherence to maximal treatment and management of modifiable contributory factors 1 . Therefore, the next challenge is to determine whether a patient's asthma is poorly controlled. In the studies included in this review, asthma control was generally defined by SABA reliever inhaler use and/or frequency of exacerbations. SABA overuse is a well-recognised surrogate measure of poor asthma control. The National Review of Asthma Deaths highlighted excessive SABA prescriptions as a predictor of poor outcomes, including asthma-related death 30 . While data on high levels of SABA prescribing are readily available from primary care prescription records in the UK, it is unclear whether the patient is actually using all prescribed treatment and whether inhalers are being used with the correct technique.
The ATS/ERS Task Force defines an asthma exacerbation as an OCS prescription, ED visit or hospitalisation 31 . As for SABA overuse, short courses of OCSs can be readily extracted from primary care data. However, their accuracy as an indirect measure of asthma  10 . This raises the question of how accurate measurement of asthma control is in studies that do not link to secondary care data 13,14 . While data on asthma-related hospitalisation from secondary care admission records is thought to be reproducible between studies 11,16 , records of asthma-related ED visits are felt to be much more poorly coded 9,11 . If ED data are to be useful, the quality and consistency of coding would need to improve.
While SABA overuse and exacerbation frequency can give an indication of asthma control, full evaluation requires assessment of self-reported asthma symptom control and review of risk factors for poor asthma outcomes (Supplementary Table 4) 29 . Price et al. demonstrated that self-reported measures of asthma symptom control can be extracted from routinely collected patient-reported outcome measures 13,27,33 . The Royal College of Physicians' 3 questions are a routinely collected assessment of asthma control in UK primary care as a criterion for QOF 34,35 . Given that this measure of symptom control should be available for all patients with asthma, this raises the question of whether its inclusion would enhance the definition of severe asthma when using primary care data.
Once a patient is identified as having asthma that is treated with high-dose therapy and remains uncontrolled, the next challenge is to differentiate severe asthma from DTA, where the poor asthma control is due to a modifiable factor (e.g. incorrect inhaler technique, poor adherence, smoking, comorbidities or an incorrect diagnosis). Primary care records can provide some of this information. Smoking status is another asthma QOF criterion, which should be available from the annual asthma review 26 . Primary care records can also provide information on comorbidities (Table 5). However, while certain conditions, including asthma, have benefitted from the increased quality of clinical coding from QOF, the quality of coding for other conditions outside QOF registers will vary significantly between practitioners. Prescription record data can give an indication of adherence through frequency of prescriptions. However, they give no indication of whether the treatment is actually being taken or whether inhaler technique is correct, and compliance with treatment may vary according to different psychosocial factors 8,20 .
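The stepwise logic discussed in the preceding paragraphs (asthma code, high treatment step, poor control, then modifiable factors) could be sketched as follows. This is a hypothetical illustration, not a validated algorithm or the method of any included study: the field names, the Step 4 cut-off (taken from the GINA 2017 wording cited above) and the output labels are assumptions. Because adherence and inhaler technique cannot be confirmed from prescribing records, the labels are deliberately limited to "possible".

```python
from dataclasses import dataclass


@dataclass
class PatientSummary:
    """Hypothetical per-patient summary derived from primary care records."""
    has_asthma_code: bool             # asthma read code present
    treatment_step: int               # BTS/GINA step inferred from prescribing records
    uncontrolled: bool                # e.g. SABA overuse or frequent exacerbations
    modifiable_factor_recorded: bool  # e.g. current smoking recorded at annual review


def classify(patient: PatientSummary) -> str:
    """Apply the four identification steps in order and return a tentative label."""
    if not patient.has_asthma_code:
        return "not in asthma cohort"
    if patient.treatment_step < 4:
        return "asthma, below Step 4 treatment"
    if not patient.uncontrolled:
        return "controlled on Step 4/5 treatment"
    if patient.modifiable_factor_recorded:
        return "possible difficult-to-treat asthma"
    return "possible severe asthma"


# Hypothetical example patients.
label_severe = classify(PatientSummary(True, 5, True, False))
label_dta = classify(PatientSummary(True, 4, True, True))
```

The ordering of the checks mirrors the order in which the challenges are presented in the text; a different ordering or additional checks (such as excluding COPD) would be equally plausible design choices.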
There are a variety of potential opportunities if patients with severe asthma are more accurately identified. Patients with severe asthma have significantly better outcomes when identified in primary care and referred for specialist assessment 2 . For a subset of patients with severe asthma (severe eosinophilic asthma), new safe and effective management options are available, which can improve disease control and quality of life and reduce OCS burden. The remaining patients (severe non-eosinophilic asthma) have been shown to respond poorly to corticosteroids, and their ICS treatment can be reduced without an increase in exacerbation rates 36 . If we can identify these patients and their biomarker phenotype based on eosinophilia status, we could significantly reduce OCS burden and the associated side effect profile. In the UK, primary care data sets have near-to-complete population coverage of the background asthma population as the majority of citizens have a primary care record. The UK is uniquely placed to harness primary care records to identify patients with severe asthma at scale and reduce current inequalities in access and outcomes across geography, age, ethnicity and socio-economic groups 3,4 . The studies included in this review highlight how accurate identification of severe asthma could support research and planning, including better estimation of disease prevalence, clinical outcomes and healthcare resource utilisation.

Exacerbation frequency X
Anti-asthma prescription frequency X

Asthma control
Exacerbation frequency X X X X X
SABA use X X X X X X X
Combined measure X X X

Clinical outcomes
Asthma deaths X X

ATS/ERS American Thoracic Society/European Respiratory Society, BTS British Thoracic Society, ED emergency department, GINA Global Initiative for Asthma, GP General Practice, OCS oral corticosteroid, SABA short-acting beta agonist.

J Stewart et al.

Table 4. Recurring themes analysed in the included articles using available data.

Gayle | Price 2015 | Turner | Walsh | Yang | Nissen | Bloom | Price 2016 | Hull | Larsson | Moth | Shields
Demographic characteristics
To fully realise the potential of primary care data to identify and investigate patients with severe asthma, a number of challenges need to be overcome. International consensus is required on a standardised approach to defining asthma, asthma severity and asthma control when using these records. One of the major challenges identified throughout the study was the accuracy of primary care prescribing records. The use of OCS courses as a measure of acute exacerbations has limitations, and consideration should be given to whether incentivising better coding of acute asthma exacerbations in primary and secondary care records would give a more accurate measure of disease severity and control. Another key challenge for the identification of severe asthma identified throughout this review is the limited ability of prescription records from primary care to inform clinicians and researchers as to whether a patient is actually using prescribed treatment, and with the correct technique. Confirming adherence to maximal maintenance therapy is required to differentiate severe asthma from DTA. The extent to which SABA prescription records accurately represent poor control is largely determined by whether the number of prescribed inhalers accurately reflects a patient's symptoms. This problem cannot be solved through linkage to other data sets. Novel approaches using smart inhalers are under investigation to assess treatment adherence and technique. The RASP-UK consortium demonstrated that data on adherence and technique from smart inhalers can inform decisions on when to step up treatment in severe asthma centres and identify patients whose inadequate symptom control may be a result of nonadherence rather than failure of inhaled treatment 36 .

Table 5. Comparison of the quality of study reporting using the RECORD extension to the STROBE statement checklist.

Gayle | Price 2015 | Turner | Walsh | Yang | Nissen | Bloom | Price 2016 | Hull | Larsson | Moth | Shields
Title and abstract
Introduction

This study focused on how routinely collected primary care data have been used to identify patients with severe asthma. This is extremely topical given the use of these data to identify patients with severe asthma as 'high risk' in the coronavirus outbreak. Comparing the quantitative findings of the studies, including proportions of patients with varying levels of asthma severity and control, was challenging as the data sets, cohort inclusion and exclusion criteria and definitions varied significantly between studies. Despite this variation, studies provided similar estimates, which can provide a baseline for further studies. Comparing the quality of reporting using the proportion of RECORD checklist points covered is not a validated approach, and results should be interpreted cautiously. However, we only included valid fields in estimates, and we believe it gives an overall indication of the quality of study reporting.
Primary care data are unique in their potential to represent the entire known asthma population. The coronavirus pandemic has placed a spotlight on the potential opportunities for clinical practice and research, which could be exploited if we can accurately identify severe asthma from primary care records. This review has highlighted a number of challenges that need to be overcome for an accurate diagnosis, including gaining consensus on a standardised approach to defining asthma, asthma severity and asthma control and ensuring the data accurately represent each component of the definition.

METHODS
The methodological approach was based on the Arksey and O'Malley framework for scoping reviews, which has been refined by Levac et al. and Pham et al. 37-39 . Scoping review methodology was chosen as our aim was to identify how research has been conducted and the knowledge gaps in this area 40 .
Step 1: Identifying the research question
The research questions were (1) how has primary care data been used to identify and investigate severe asthma? (2) how does linkage to other healthcare and administrative data aid in this process? and (3) what was the quality of study reporting in articles using primary care data to identify patients with severe asthma?
Step 2: Identification of relevant studies
Initial informal literature searches were carried out to identify terms used in the literature to investigate the use of primary care data to identify the prevalence and characteristics of severe asthma. A subject specialist medical librarian within Queen's University Belfast advised on search terms required to ensure adequate coverage and retrieval of relevant studies (Supplementary Table 5). Formal literature searches were carried out in April 2020 on three databases: Embase, OVID Medline, and Web of Science. Minor adaptations in search terms were required to account for different database subject headings.

R-RECORD criteria; S-information available from supplementary materials of article; 1-Not applicable: no data linkage conducted; 2-Not applicable: inclusion criteria specified complete data for follow-up period available, therefore no loss to follow-up for final cohort; 3-Not applicable: study did not involve matching; 4-Not applicable: no categories in output data; 5-Not applicable: no risk analysis included; 6-Not applicable: inclusion criteria specified complete baseline and outcome data.
Step 3: Study selection
The study selection process is summarised as a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) chart (Fig. 1). For a paper to be included, the following had to be true: (1) the primary care data or equivalent had to be collected as part of routine patient care and not collated for the specific purposes of a study; (2) data had to be from an entire primary care population, with the smallest unit being the known asthma population of a single primary care office; (3) the study had to identify varying levels of severity of asthma; and (4) the paper had to be a full peer-reviewed article. Only studies published in English were included; abstracts, conference submissions and study protocols were excluded. Within the final list, we identified those articles that described linkage of records at an individual patient level. Studies that used linkage at aggregate level were excluded. Article abstracts were screened (by J.S.) for eligibility using the above criteria. When insufficient information was available from the abstract to determine eligibility, articles were fully reviewed. When there was any doubt about inclusion, ambiguity was resolved after consensus discussion with another team member (F.K.).
Step 4: Charting data
Data from the included articles were extracted and charted into a summary table. Data extracted from each article included characteristics of the data sets used, how asthma was defined, how asthma severity and control were defined and specific identified themes within the included articles, including characterisation of asthma cohorts, clinical outcomes, healthcare resource utilisation and quality of care. Key and recurring themes were identified in an iterative manner as each paper was reviewed. Following complete review of all articles, they were re-reviewed to ensure that all themes were captured.
Step 5: Collating, summarising and reporting results
Data from the charting table were transferred into a summary table to enable comparison between articles.

Evaluation of study reporting
We analysed the quality of reporting of each observational study against the RECORD Statement extension to the STROBE Statement checklist 21-23 .

Reporting summary
Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.