Author Correction: Frequent discussion of insomnia and weight gain with glucocorticoid therapy: an analysis of Twitter posts

[This corrects the article DOI: 10.1038/s41746-017-0007-z.].


INTRODUCTION
Glucocorticoid (GC) therapy is used widely in patients with inflammatory diseases. It is estimated that the prevalence of oral GC use in the UK population is around 1%. 1 Their powerful therapeutic benefit is, however, offset by potential adverse events acting through non-selective disruption of immunological and metabolic processes. Clinicians have long been aware of the many side effects of GC therapy, including osteoporosis, serious infection, diabetes mellitus and cardiovascular disease. 2 Despite widespread use since the 1950s and an understanding of their mechanism of action, 3 evidence on the frequency and impact of long-term glucocorticoid toxicity are still lacking. [4][5][6] Clinicians and patients are known to have differing views on what the important side effects of GC therapy are. This matters because decision-making is influenced by the value judgement of the outcome under consideration, be it a benefit or a risk, alongside its probability and its impact. 7 For example, when groups of patients and rheumatologists were asked to rank the ten "most worrisome" GC-related adverse events, patients ranked renal function, fatigue and palpitations much higher than rheumatologists. 8 Similarly, a study of patient-reported side effects of GC therapy in patients with asthma demonstrated patients often report adverse events in differing priorities and patterns to clinicians. 5 Given that clinician-patient disconnect and treatment concerns are two key factors in influencing medication adherence, 9,10 further work which focuses on patient experience and opinion is needed to develop a more complete understanding of GC safety.
Social media as a public health data source In the last two decades, increasing numbers of patients have become active on the World Wide Web; seeking information about symptoms, disease and medications, as well as sharing their health related experiences online in forums, or social media platforms such as Twitter and Facebook. A reported 72% of internet users say they looked online for health information within the past year. 11 In the 3rd quarter of 2015, Twitter and Facebook had 320 million and 1.6 billion active monthly users, respectively. 12,13 This corpus of publically-facing data could generate new knowledge regarding drugs and their benefits and harms. In recent years, text mining techniques; which aim to automatically transform free text into structured data using Natural Language Processing (NLP) techniques, have proven useful in identifying potential drug safety concerns from text within Facebook and Twitter. 14 Possible opportunities for drug safety arising from automated analysis of social media data include routine pharmacovigilance, the study of drug safety in particular populations of interest, or studying the impact of side effects on patients' lives. Traditional methods of pharmacovigilance such as the MHRA's Yellow Card scheme in the UK and the Food and Drug Administration Adverse Event Reporting System (FDA-FAERS) in the USA 15 are prone to Corrected: Author correction underreporting with an estimated 94% of adverse drug reactions (ADRs) unreported. 16,17 The aim of this study was to detect and quantify glucocorticoidrelated adverse events using a computerised system for automated suspected ADR detection from narrative text in Twitter. A secondary aim was to compare the frequency of specific ADR mentions within Twitter to the frequency and patterns of spontaneous ADR reporting to a national drug regulatory body.

RESULTS
Dataset structure A total of 159,297 tweets with mentions of prednisone or prednisolone were acquired for analysis ( Fig. 1). 45,470 tweets were manually curated to train the automated classifier. All remaining 113,827 tweets were then fed through the automated processor. 81,524 distinct tweets remained after duplicate tweets were removed. Of these, 20,206 were identified as proto-AEs as the NLP software assigned them an indicator score ≥0.7 indicating a high probability that the tweet contained an ADR. Within proto-AE tweets, 26,894 MedDRA® preferred terms (PTs) were captured. MedDRA® is the international medical terminology developed under the auspices of the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). This is the international standard for classifying adverse events in pharmacovigilance. 18 Figure 1 represents the hierarchical structure of the terminology. 19 After filtering out 17 non-medical event PTs and 41 treatment indications (Supplementary Tables 1 and 2), 20,210 PTs within 15,730 tweets remained, including 289 unique PTs. 12,132 (77%) tweets were tagged with one relevant PT. 2489 were tagged with two, 856 with three, whilst one tweet was tagged with nine PTs (Fig. 2).

Glucocorticoid AE profiles
In the primary analysis of all proto-AEs, the 5 most commonly reported PTs were 'insomnia' (8.6%), 'weight increased' (8.2%), 'non-specific reaction' (7.8%), 'increased appetite' (7.5%) and 'malaise' (4.4%) (Fig. 3). Non-specific reactions were mapped from tweets such as 'I hate the way prednisolone makes me feel'. The five and 25 most frequently mentioned PTs accounted for 25.0 and 74.6% of all AE PTs, respectively. Sensitivity analysis 23% of proto-AE tweets had more than one PT tag (Fig. 2) yet only one indicator score. After restricting analysis to proto-AE tweets containing a single PT, 22 of the top 25 PTs were also in the top 25 AE PTs in the primary analysis (Fig. 4). The top eight AE PTs were the same for the primary and sensitivity analyses albeit in a different order. 'Weight increased' overtook 'insomnia' as the most frequently detected AE PT. There was less than 10% difference between the primary and sensitivity analysis in the proportions of the top five AE PTs detected. The biggest relative differences were seen in 'condition aggravated', 'memory impairment' and 'irritability'.   Table 1. No subjects were reported to have a 'non-specific reaction', 'altered state of consciousness', 'abnormal dreams' or 'skin discomfort' within the Yellow Card system.

DISCUSSION
In this study of nearly 100,000 tweets mentioning prednisolone or prednisone, the two most common adverse events mentioned were 'insomnia' and 'weight increased'. These two outcomes were disproportionately detected in Twitter when compared to reports in the Yellow Card Scheme.
Clinicians may not be surprised by these findings as both insomnia and weight gain are commonly seen in clinical practice. Indeed, patients may opt not to take glucocorticoids on the basis of concerns about side effects, including weight gain, underlining its importance to patients. 20 Surprisingly, there is neither significant research previously undertaken to understand the frequency, progression and impact of these outcomes, nor on how best to manage them. In the electronic Medicines Compendium (eMC) patient information leaflet for prednisolone, there is no explicit mention of insomnia as a side effect (aside from mentioning 'problems sleeping' in the context of serious mental health problems), nor is there a reference to the predominance of any individual side effects. 21 Within the scientific literature, there are very few publications specifically studying the effect of (high-dose) glucocorticoids on insomnia, 22,23 despite it being one of the most commonly discussed side effects on social media.
Within our study of Tweets during the 3 year interval 2012-2015, there were 20,210 glucocorticoid-related AE PTs compared to 3022 AE PTs spontaneously reported to the UK MHRA over the same period. The frequent discussion of common but non-serious side effects may complement traditional methods that more commonly collect unexpected events. 16,17,24 Whilst not identifying new safety signals, the analysis tells us what is happening commonly, and potentially where this has impact on patients' lives. The importance of the side effects most commonly discussed in this Twitter analysis is supported by a recent survey that has found patients using glucocorticoid therapy are most concerned by weight gain and insomnia. 25 There are clear limitations in this analysis, including the representativeness of the population and thus generalisability, accurate identification of AEs and appropriate attribution of causality. We did not have access to any demographic information on the study subjects who generated the tweets as such data are not routinely collected within Twitter. It is likely that most tweets were representative of the population of interest (adult users of GC therapy), although younger people may be overrepresented. 26 We did note that a minority of tweets were written in the third person, for example parents tweeting about their children.
The type of PT detected in Twitter was different to that reported in Yellow Card. Only 'insomnia,' 'drug ineffective', 'malaise', 'weight increased', 'abdominal pain' and 'depression' were present in the top 25 AE list for both Twitter and the Yellow Card database. 'Nonspecific reaction', the third most commonly detected PT in Twitter, was never coded in Yellow Card, perhaps reflecting the low likelihood of reporting a vague symptom to the regulators. Some of the discrepancies may be accounted for by different terminologies and coding methods: it is possible, for example, that the PT 'altered state of consciousness' seen only in the Twitter top 25 may overlap with the PT 'confusional state' seen only in the MHRA top 25. Within Twitter, asymptomatic adverse events such as biochemical disturbances were understandably not reported. It was noted that a higher number of less severe AE PTs were reported in Twitter compared to that in Yellow Card. This may be due to differences in the motivation for patients to discuss their medications in Twitter vs. the motivation to report to SRSs. Furthermore, most reports within the Yellow Card database are derived from clinical staff with patients only able to submit suspected ADRs from 2005. Interestingly, within Twitter, nine of the top 25 (40%) AE PTs fell under the System Organ Class (SOC) pertaining to psychiatric disorders, in contrast to just one in the top 25 PTs in Yellow Card. This may reflect willingness for patients to discuss these matters on social media and reluctance for to report such AEs to Yellow Card. It may also represent the greater challenge in attributing drug causality within this type of disorder, something required for the Yellow Card scheme but not for tweets.
Current NLP methods are not perfect in correctly identifying all ADRs. This includes problems of sensitivity (i.e. correctly identifying true ADRs) and specificity (correctly not detecting non-ADRs). Opportunities for improving the automated identification of ADRs include analysis of sentiment, for example using context to identify benefits rather than harms ("The upside of prednisone, I am rash free"), negative modifiers ("As annoying as prednisone is, it really does help") and mapping temporal relationships ("I lost 20 pounds in a month, was put on Prednisone and gained 40 back"). 27 Others have proposed semantic approaches for identifying adverse events in social media, instead of the lexical approach used here. 28 Twitter is only one of many possible sources of social media data where patients and the public discuss medication and side effects, others being Facebook, Reddit, and dedicated online health social media platforms 23 and patient forums such as HealthUnlocked.com. Twitter offers a high volume of data from a wide demographic and prompts users to tweet their experiences. The 140 character limit per tweet means any side effect information is contained within a limited amount of text. This has the advantage of the NLP software handling a finite amount of text, with the disadvantage that experiences are limited to what can be included within this space. Other data sources such as patient forums may add further value beyond the occurrence of side effects with richer experiences, allowing the automated derivation of knowledge about the severity of ADRs, the impact of ADRs on quality of life, causality, strategies to manage side effects, as well as positive experiences with medications. Our future work will focus on optimising NLP methods to maximise validity and reliability, for determining causality and to explore quality of life within alternative data sources. A further extension of this research could include examining patient-reported discontinuation of drugs due to side effects.
Insomnia and weight gain are the most frequently reported glucocorticoid-related AEs posted on Twitter. There remains a disconnect between the frequency of these patient reported adverse events and our collective knowledge about these events. Here, using glucocorticoids as an example, we have demonstrated that Twitter can be a potentially useful, supplementary source for post marketing pharmacovigilance.

METHODS
Tweets mentioning prednisolone or prednisone were acquired and anonymised before being processed through automated NLP software, which mapped possible AEs to terms in the Medical Dictionary for Regulatory Activities (MedDRA®) dictionary. Tweets were automatically categorised based on their likelihood of containing text relating to a GCrelated ADR. Tweets with high probability of containing an ADR were further analysed to determine which AE symptoms appeared most frequently. An AE refers to a negative symptom/term (e.g. can't sleep, nausea), whilst an ADR refers to a causal association between drug and AE.

Data acquisition
Publically-facing tweets were accessed by Epidemico using Twitter's general-use streaming application programming interface (API) from 1st October 2012 to 30th June 2015. All tweets mentioning the glucocorticoid agents prednisolone and prednisone during the specified timeframe were acquired from an unknown denominator of all tweets (irrespective of content) using the search terms 'prednisolone' and 'prednisone', and misspellings, for example 'predniolone' and 'prdnisone'. Data were processed by Epidemico, Inc. to identify possible ADRs and AEs as described below and detailed elsewhere. 14,29 All data were anonymised, then transferred to the University of Manchester where tweets were deduplicated (>1 tweet with identical text and date/time posted) and underwent further analysis as below: user mentions were replaced by @user, all URLs were replaced by [link] and all named entities were replaced by 'Someone' or 'Someone-or-other' for consecutive entities.

Data processing
A subset of all Tweets (28%) mentioning prednisolone or prednisone were manually analysed by human MedDRA® trained curators at Epidemico to train their proprietary NLP automated classifier. The remaining tweets then went through this classifier, which identified possible symptoms and mapped them to the MedDRA® dictionary. Figure 5 represents the hierarchical structure of the terminology. All potential adverse events were mapped to a Preferred Term (PT). 19 Where tweets mentioned more than one possible AE, each AE was mapped to its respective PT. There were thus more PTs than there were tweets.
After possible AEs had been identified and mapped to MedDRA®, individual tweets were given an 'indicator score' representing the probability that the tweet contained an ADR. In brief, the indicator score uses a Bayesian probabilistic model developed through statistical machine learning computation following manual coding by human curators on a training set. The model generates a score ranging between 0 and 1. The closer the assigned score is to 1, the higher the probability that the tweet contained an ADR. Based on previous work, 14,30 tweets tagged with an indicator score of ≥0.7 were categorised as a 'proto-AE', meaning the tweet  was considered to represent a likely ADR. If a tweet was tagged with multiple PTs, it was assigned a single indicator score. It was noted during early exploratory analysis that the MedDRA-coded PT in Proto-AE tweets would sometimes represent the indication for treatment (e.g. rheumatoid arthritis), or a non-medical event (e.g. investigation, procedure). A clinical researcher (RP) thus manually assessed each unique PT and identified PTs deemed not to represent a medical event or which referred to a therapeutic indication (Supplement Table 1). Such PTs were excluded from the analysis. For equivocal PTs in which this was unclear (e.g. hypersensitivity), a random sample of 50 tweets tagged to each equivocal PT were manually reviewed by RP and coded as 'true AE', 'therapeutic indication' or 'unclear' (See Supplement Table 2). PTs where less than 50% of tweets were coded as 'true AE' were excluded from the analysis to minimise false positives.

Data analysis
Once tweets were acquired, anonymised, mapped to MedDRA®, deduplicated, restricted to Proto-AEs and indications and non-medical events removed, an analysis describing the frequency of PTs in prednisolone/ prednisone reported AEs was performed. Because tweets that included multiple PTs were given a single indicator score, it was possible that some of the PTs within a single tweet may not represent a suspected ADR. A sensitivity analysis was therefore performed restricting the analysis to proto-AE tweets tagged with a single PT.
The frequency of AEs by PT was compared to suspected ADRs reported in the UK MHRA Yellow Card Scheme until 15th Jan 2016. The Yellow Card scheme is the UK spontaneous reporting system (SRS) established in 1953 and which includes patient-reported data since 2005. Data held within Yellow Card is publically available in the MHRA's Drug Analysis Prints (DAPs).
A proportional reporting ratio (PRR) for each of the top 25 Twitterreported AE PTs was generated by comparing the proportional frequency of each AE reported through Twitter to the proportional frequency of that same PT in the Yellow Card database for both prednisolone and prednisone. For example, suppose 'cough' accounted for 10% of all AE PTs in Twitter, and accounted for 5% of reported AEs in the Yellow Card database, the PRR would be 10/5 = 2.