Introduction

In chronic conditions such as diabetes, hypertension and asthma, charts or graphs often form an important part of clinical monitoring and disease self-management. With increasing use of internet and mobile phone-based resources for telehealth monitoring,1–3 the visual display of data is also becoming more common. Charts may be used to confirm clinical diagnoses, to increase patient awareness of their condition, or to provide early warning of deterioration.

Graphic displays benefit from the extraordinary sophistication and efficiency of human visual perception.4,5 Charts may be particularly useful when the clinician or patient needs to detect material changes in the clinical variable (the signal) from background noise, but there are no validated numerical criteria to support clinical decisions. For example, in asthma, a wide variety of criteria has been suggested for identifying clinically important changes in peak expiratory flow (PEF) but none has been validated.6

A ‘good’ graph will draw attention to important relations and patterns in the data.7 However, variation in graphic format can substantially influence interpretation of the data, as has been seen in health risk communication.8 A large body of literature suggests ways to optimise the design of graphs,7,9–12 but it assumes that the results to be graphed are already known. Little attention has been paid to charts for health monitoring, and their format is often not standardised.

One key characteristic of a chart is its scale or aspect ratio (i.e. the ratio of the x-axis to the y-axis). The effect of manipulating the aspect ratio on the interpretation of graphs was demonstrated over 50 years ago by Huff in his classic monograph “How to Lie with Statistics”,13 and the underlying principles of visual perception have subsequently been identified.14,15 However, these observations have not so far been applied to the time-series graphs used in health monitoring where patterns may vary with time or between patients. We have previously reported that the aspect ratios of commonly available PEF charts vary widely16 and that, when PEF data from one severe exacerbation were displayed on a horizontally expanded chart (high aspect ratio), it was more difficult to recognise the change from the patient's previous state than when the same data were displayed on a more horizontally compressed chart (lower aspect ratio).16 However, to date, no study has empirically investigated the impact of the aspect ratio on the accuracy of interpretation of time-series plots of biological data.

The aim of this study was to assess whether displaying biological data on charts with different aspect ratios (i.e. more or less compressed) influences the accuracy of identification of a material deviation from a previously stable state. We also investigated whether the use of lines to join data points aided accuracy across any of the chart types. The data used in the study were PEF data from patients who had experienced clinically recognised asthma exacerbations. However, in order to evaluate the effect of chart format alone on visual perception independent of any clinical concepts, the study was deliberately conducted with lay subjects and the data were identified only as ‘biological data’.

Methods

The study was a computer-based experimental task in which lay volunteers were presented with 72 sets of unidentified biological data displayed sequentially (Figure 1) on charts with different formats.

Figure 1. Three, four, five and six weeks of peak expiratory flow data (dataset #11, Table 1), including a severe exacerbation (9.3SD, 26.2% fall) in week 5, plotted on Chart C (aspect ratio 1.1:1). Consecutive blocks of 14 data points were displayed to participants who were asked to respond by pressing a key if the data were increasing, stable or decreasing compared with the previous block.

Participants

Participants were University of Sydney undergraduate and graduate students. Eligibility criteria included normal or corrected-to-normal vision and ability to understand the task instructions. Ethics approval for the study was obtained from The University of Sydney Human Research Ethics Committee; participants provided written informed consent and received a lunch voucher for participating in the study.

Design

The study used a 3-by-2 factorial design: three charts varying in aspect ratio, each presented with or without a line joining the data points. This resulted in six chart-by-line combinations (Table 1, Figure 2). Aspect ratios were calculated as x:y, adjusted for a standard width (84 data points) and height (0–800). The three charts were selected from those previously described16 to represent a wide range of x:y aspect ratios (A, 5.2:1; B, 3.0:1; C, 1.1:1) (Table 1). Chart A was previously supplied with the Allersearch BreathAlert peak flow meter and has a similar aspect ratio to the chart supplied with the MicroPeak peak flow meter (CareFusion, Basingstoke, UK). Chart B was the FP1010, a 12-month peak flow booklet previously supplied by the National Health Service in the UK, with a similar aspect ratio to the chart currently provided with the MiniWright peak flow meter (Clement Clarke, Essex, UK). Chart C was the Woolcock peak flow chart ('prototype chart' used in the study by Reddel et al.16), available on the National Asthma Council Australia website (http://www.nationalasthma.org.au/content/view/384/506/).
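The standardisation described above can be illustrated with a minimal sketch. It assumes that "adjusted for a standard width and height" means scaling each chart's printed spacing to 84 data points on the x-axis and a 0–800 range on the y-axis; the function name, parameter names and the example dimensions are hypothetical and are not taken from the charts used in the study.

```python
def standardised_aspect_ratio(mm_per_point, mm_per_unit,
                              n_points=84, y_range=800.0):
    """Return the x:y aspect ratio of a chart on a common footing.

    mm_per_point: printed horizontal distance between consecutive readings
    mm_per_unit:  printed vertical distance per unit of the y-axis
    Both inputs are assumed measurements taken from a chart.
    """
    width_mm = n_points * mm_per_point    # width needed for 84 readings
    height_mm = y_range * mm_per_unit     # height needed for a 0-800 scale
    return width_mm / height_mm


# Illustrative dimensions only: 2 mm per reading, 0.25 mm per y-axis unit
ratio = standardised_aspect_ratio(2.0, 0.25)
print(f"{ratio:.2f}:1")
```

A ratio above 1 corresponds to a chart that is wider than it is tall for the same six weeks of data, i.e. more horizontally expanded.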

Table 1 Characteristics of individual datasets and events (exacerbations) presented in the experiment

Figure 2. Six weeks of peak expiratory flow data (dataset #7, Table 1), including a moderate clinically recognised exacerbation (6.1SD, 14.3% fall) in week 4, plotted on Chart A (aspect ratio 5.2:1), Chart B (aspect ratio 3.0:1), and Chart C (aspect ratio 1.1:1). I=baseline mean, II=exacerbation.

Materials

The study used actual biological data comprising electronic PEF recordings by 12 patients with asthma during a clinical trial.17 Each dataset contained six weeks of twice-daily PEF data (84 data points), with the first two weeks being used as baseline. Each dataset either contained one week in which the PEF fell to a maximum of 3 (±0.5), 6 (±0.5), or 10 (±1.0) standard deviations (SD) from baseline, considered arbitrarily to represent mild, moderate and severe exacerbations respectively, or had no material fall (≤2±0.5 SD) from baseline. The exacerbations were recognised clinically at the time and were retrospectively confirmed to represent a material change, based on a conventional statistical process control criterion of ≥3 SD fall from baseline mean.18 The start of each dataset was allocated so that the exacerbation nadir fell at the junction of two blocks. Minor modifications were made to the original data so that the exacerbation would appear between blocks 3 and 6.
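As a rough sketch of how a fall can be expressed in baseline SD units and mapped to the severity bands above, the following assumes that the baseline is the first 28 readings (two weeks of twice-daily data), that the sample SD is used, and that the fall is measured at the single lowest post-baseline reading. These choices, the function names and the "unclassified" label are assumptions for illustration; the source does not specify them.

```python
from statistics import mean, stdev


def max_sd_fall(pef, n_baseline=28):
    """Largest fall below the baseline mean, in baseline SD units."""
    baseline = pef[:n_baseline]
    baseline_mean = mean(baseline)
    baseline_sd = stdev(baseline)        # sample SD (assumed)
    lowest = min(pef[n_baseline:])       # lowest post-baseline reading
    return (baseline_mean - lowest) / baseline_sd


def classify_severity(sd_fall):
    """Map an SD fall to the bands described in the text."""
    if sd_fall < 2.5:                    # assumed reading of "<=2 +/- 0.5"
        return "none"
    if 2.5 <= sd_fall <= 3.5:            # 3 +/- 0.5
        return "mild"
    if 5.5 <= sd_fall <= 6.5:            # 6 +/- 0.5
        return "moderate"
    if 9.0 <= sd_fall <= 11.0:           # 10 +/- 1.0
        return "severe"
    return "unclassified"                # falls between bands (hypothetical)


# Synthetic series: a stable baseline followed by one low reading
baseline = [400, 410] * 14
series = baseline + [405 - 6 * stdev(baseline)] + [405] * 55
print(classify_severity(max_sd_fall(series)))
```

Under these assumptions, the exacerbations shown in Figure 1 (9.3 SD) and Figure 2 (6.1 SD) fall in the severe and moderate bands respectively.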

Data were presented to subjects as generic biological data and the y-axis markers were unlabelled in order to avoid variation between participants due to any existing knowledge about asthma.

Procedure

The study was conducted in a computer laboratory in which all screens had identical screen size and resolution. Data, including brief demographic information, were collected anonymously. After three practice trials with no feedback provided on performance, participants viewed 72 trials in random order, each containing six blocks of 14 data points, with trials varying by chart type (one of three aspect ratios, and line vs no line), presence or severity of exacerbation (0, 3, 6, 10 SD fall), and week of exacerbation onset (block 3–6). For each trial, after each block of data was cumulatively added to the display, participants were asked to record, by pressing an arrow key, whether the data appeared to be increasing, decreasing, or stable in comparison with the previous block; the next block of data was then displayed (Figure 1). In order to assess visual perception, participants were asked to respond quickly and accurately; they were not able to change previous responses. After pilot testing, some minor modifications to the procedure were made, including the incorporation of a short break and a motivating statement displayed on the screen after 18, 36, and 54 trials.
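The trial structure can be sketched as follows. The sketch assumes that the 72 trials arise from crossing the 12 datasets with the six chart-by-line combinations (12 × 6 = 72), which is consistent with the numbers reported but is an inference rather than a stated detail; the names used are hypothetical.

```python
import itertools
import random

CHARTS = ("A", "B", "C")      # three aspect ratios
LINES = (True, False)         # data points joined by a line, or not
DATASETS = range(1, 13)       # 12 datasets (assumed numbering)


def build_trials(seed=None):
    """Cross datasets with chart formats and shuffle into random order."""
    trials = [
        {"dataset": d, "chart": c, "lines": joined}
        for d, c, joined in itertools.product(DATASETS, CHARTS, LINES)
    ]
    random.Random(seed).shuffle(trials)
    return trials


def cumulative_blocks(series, block_size=14):
    """Yield the data visible after each block is added to the display."""
    for end in range(block_size, len(series) + 1, block_size):
        yield series[:end]


trials = build_trials(seed=0)
views = list(cumulative_blocks(list(range(84))))
print(len(trials), len(views), len(views[-1]))
```

Each of the six views adds one week (14 readings) to what is already on screen, mirroring the cumulative display described above.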

Data analysis

For blocks containing an exacerbation, a response was coded as being ‘correct’ (true positive) if the ‘down’ button was pressed and false negative if it was not. False positives were recorded if the ‘down’ button was pressed for blocks containing no exacerbation. Log-binomial models were fitted for false negative and false positive responses using generalised estimating equations with an exchangeable error structure to account for within-subject correlations. Two-way interactions between aspect ratio, presence or absence of lines, and severity of exacerbation were tested using Wald tests.
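The coding rule for responses can be written out as a small function. This is a sketch of the rule as described, not the authors' analysis code; the key labels and the "true_negative" category (the complement of a false positive, which the text does not name) are assumptions.

```python
def code_response(key, block_has_exacerbation):
    """Classify one key press against the true content of the block.

    key: "up", "down" or "stable" (assumed labels for the arrow keys)
    """
    pressed_down = key == "down"
    if block_has_exacerbation:
        # Only a 'down' response counts as detecting the exacerbation
        return "true_positive" if pressed_down else "false_negative"
    # No exacerbation in this block: 'down' is a false alarm
    return "false_positive" if pressed_down else "true_negative"


print(code_response("stable", block_has_exacerbation=True))
```

Note that a "stable" or "up" response to an exacerbation block is treated identically, since both count as a missed fall.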

Results

Eighty participants were recruited into the study, 57 of whom (71%) were women; the mean age of the sample was 28.8 years. The average time for completion of all 72 trials was 11.9 min (range 8.3–25.4). All participants recorded a response in each of the 54 blocks containing an exacerbation, with 3,636 (84.2%) of these responses being correct.

For false negative responses (missing a true exacerbation), the interactions between presence of lines and severity of exacerbation (p=0.34) and between presence of lines and aspect ratio (p=0.29) were not significant. The interaction between exacerbation severity and aspect ratio was significant (p=0.0048), indicating that the performance of charts with different aspect ratios differed by the magnitude of the fall. For mild and moderate exacerbations (3 SD and 6 SD fall), the most compressed chart (Chart C) had the highest proportion of correct responses and the lowest proportion of false negative responses (Table 2). For example, a 6 SD fall (average 19.8% fall from baseline mean) was missed by participants in 5% of trials with Chart C compared with 12% of trials with Chart B and 24% of trials with Chart A; the relative risk of a false negative response on Chart C compared with Chart A, adjusted for clustering, was 0.19 (95% CI 0.12 to 0.30). For the most severe exacerbations (10 SD, mean fall 32.4%), there was little difference in false negative responses between the charts (range 4–7%).
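For orientation, a crude (unadjusted) relative risk can be computed directly from the rounded percentages quoted above. This is simple arithmetic on the reported figures, not a reproduction of the log-binomial model; because the inputs are rounded and no adjustment for clustering is made, the result only approximates the adjusted estimate of 0.19.

```python
def relative_risk(risk_comparison, risk_reference):
    """Ratio of two proportions (crude relative risk)."""
    return risk_comparison / risk_reference


# False negative proportions for a 6 SD fall, as reported in the text
chart_c = 0.05   # most compressed chart
chart_a = 0.24   # least compressed chart

crude_rr = relative_risk(chart_c, chart_a)
print(f"crude RR = {crude_rr:.2f}")
```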

Table 2 Effect of chart format on false negative responses

For false positive responses (identification of an exacerbation when none was present), the interactions between presence of lines and severity of exacerbation (p=0.71) and between presence of lines and aspect ratio (p=0.26) were not significant. The interaction between exacerbation severity and aspect ratio was significant (p=0.027), indicating that the likelihood of false positives for non-exacerbation blocks on charts with different aspect ratios differed by the magnitude of an exacerbation occurring elsewhere in the same dataset. The number of false positive responses increased significantly with greater chart compression, but the proportions were small (2.6%, 3.6%, and 7.9% of trials respectively for Charts A–C, Table 3).

Table 3 Effect of chart format on false positive responses

The presence of lines joining the data points made no significant difference to the risk of false negative responses (Table 2). False positive responses were significantly less likely when data lines were absent (adjusted relative risk 0.85 (95% CI 0.76 to 0.95)), but the difference was small (overall 4.3% false positive without lines vs 5.1% with lines, Table 3).

Discussion

Main findings

This study demonstrates that material changes in graphical displays of biological data such as PEF are more easily recognised on charts with a low aspect ratio (more horizontally compressed) than on charts with a high aspect ratio. The chart with the lowest aspect ratio was associated with a reduction of up to 80% in false negative responses compared with the chart with the highest aspect ratio across both mild (3 SD) and moderate (6 SD) changes, with only a very small increase in false positive responses. Furthermore, the addition of interconnecting lines — rather than assisting with interpretation — led to a significant but small increase in false positive readings. While the findings have immediate implications for monitoring of lung function in asthma, they are also relevant to other graphic displays (medical and non-medical) where relatively acute changes need to be easily and reliably detected.

Strengths and limitations of this study

A strength of this study was that it used actual biological data from a clinically relevant context — namely, monitoring of PEF by patients with moderate to severe asthma. Here a primary focus is on identifying change from the patient's previous state.6 Various widely differing criteria have been used to identify exacerbations but none has been validated,6 thus increasing the reliance of clinicians on subjective interpretation of the speed and magnitude of fall in lung function. The lack of clinical input may be seen as a limitation of the study; however, by deliberately conducting this experiment as a computer-based study with lay participants and de-identified data, we were able to assess the specific effects of aspect ratio and lines on the participants' ability to detect change, independent of any clinical criteria or knowledge of asthma. As in most studies of visual perception, images were presented in rapid sequence.19 To minimise participant burden, data were presented in consecutive blocks rather than incrementally point-by-point. In clinical practice the experiment would thus correspond to a periodic review of batched data rather than to interpretation of each new value day-by-day. However, if a fall in biological data cannot be identified when it has reached its nadir, it is unlikely that, during its evolution, it would be recognised sufficiently early to prompt a change in treatment.

Interpretation of findings in relation to previously published work

The present study adds to existing literature on optimal graph design by focusing on the specific challenges of self-monitored health data.1 While there is extensive literature about the neural basis of visual perception and factors that affect visual interpretation,20 existing guidelines on optimising graphic formats assume that the data to be graphed are already known, e.g. communication about treatment outcomes10 or health risks.21,22

The general public has become aware of the impact of aspect ratio through the transition to wide-screen televisions, and there is some recognition in the scientific literature that the aspect ratio can affect visual perception.14,15 For salient objects to ‘pop out’ from the surrounding data,14,15,23 the optimal angle between graph elements is said to be 45°.14,15 Changing the aspect ratio of a graph alters this angle, but little practical advice is available as to how to determine the optimum format except the general suggestion to vary the aspect ratio and make a subjective choice visually. Such recommendations are difficult to apply in health monitoring where data are recorded prospectively and the magnitude and rate of change in data are rarely predictable. Further, in many internet- and smartphone-based health monitoring applications, the y-axis scale adjusts automatically as new data are added. This constantly changes the aspect ratio and hence the visual appearance of the data, and is likely to impede the development of pattern recognition skills by clinicians and patients. In the case of PEF, an aspect ratio of around 1:1 (standardised for 6 weeks of twice daily data and PEF 0–800 L/min) provided the best balance between sensitivity and specificity for identifying exacerbations, and is feasible for paper PEF charts.24 Further work is needed to establish the optimal aspect ratios for point-by-point interpretation or for other medical conditions.
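The way the aspect ratio alters the on-page angle of a change can be sketched geometrically. The example assumes the standardised layout used in this study (84 data points across, 0–800 on the y-axis) and a hypothetical fall of 100 L/min over 7 readings; the function name and the example fall are illustrative assumptions, not values from the study data.

```python
import math


def segment_angle_deg(dx_points, dy_units, aspect_ratio,
                      n_points=84, y_range=800.0):
    """On-page angle (degrees from horizontal) of a straight change.

    aspect_ratio is width:height, so with chart height taken as 1 the
    chart width equals aspect_ratio.
    """
    x_on_page = (dx_points / n_points) * aspect_ratio
    y_on_page = dy_units / y_range
    return math.degrees(math.atan2(y_on_page, x_on_page))


# Same hypothetical fall drawn at the three aspect ratios in the study
for chart, ratio in (("A", 5.2), ("B", 3.0), ("C", 1.1)):
    angle = segment_angle_deg(7, 100, ratio)
    print(f"Chart {chart}: {angle:.1f} degrees")
```

For this illustrative fall the angle is roughly 16° at 5.2:1, 27° at 3.0:1 and 54° at 1.1:1, showing how the same data appear steeper as the chart is horizontally compressed.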

With regard to whether data points on graphs should be joined by lines, existing literature is divided. Mathematical convention suggests that lines should be used when the main interest is in the change between consecutive data points and omitted when the focus is on the overall trend. Lines may facilitate pattern recognition by creating more structure in the data and making it easier for the eye to trace the changes through time. On the other hand, closely adjacent data points may be visually perceived as an object (a pattern or a line).9,25 In addition, the use of redundant information in graphs should be avoided — in the terminology introduced by Tufte, the ‘data-ink-ratio’ (i.e. the proportion of informative ink used to represent data) should be increased to reduce the cognitive burden.7 In the present study, the presence of lines did not have a clear advantage over the use of data points only and, if anything, increased the risk of false positive responses.

Implications for future research, policy and practice

The present findings should be confirmed in a clinical context where the effect of clinical input and of other chart features such as gridlines, which are necessary for manual completion, can also be assessed. Moreover, it should be investigated whether the findings apply to experienced clinicians who are used to viewing time series of biological data. The work should also be extended to other types of biological charts where the relative efficacy of different aspect ratios and their impact on clinical decisions may vary. In the case of asthma, the benefits of timely detection and treatment of exacerbations resulting from a substantial reduction in false negatives are likely to outweigh the potential harmful effects or costs of over-treatment from a small increase in false positive responses. However, the trade-off between benefits and harms is likely to be different for different conditions and in different clinical contexts.

There is clearly a popular community demand for graphs in telehealth monitoring with internet- and smartphone-based applications, and they are increasingly used in electronic medical records. However, better tools are needed to detect clinically important change reliably in such contexts. In the detection of signal from noise — one of the most important tasks in health monitoring — a conventional method such as statistical process control, which performs well in industrial applications, may not necessarily be as appropriate for monitoring of biological data.18,26 Visual perception is highly sophisticated,20 so graphic displays may contribute to the development of new statistical tools for identifying material deviations from normal. However, this will only be possible if data presentation is standardised. For example, in occupational asthma, visual recognition of different PEF patterns between work and rest periods on standardised charts led to the development27 and refinement of software tools.28

Conclusions

Recent reviews have highlighted the importance of methodological issues for use of technology in chronic disease, including for respiratory conditions.2,3 The present study has identified an area for quality improvement by providing strong evidence that the aspect ratio affects the visual interpretation of relatively acute changes in biological data such as during asthma exacerbations. The findings have immediate relevance to health monitoring, where the aim is to improve outcomes by early identification of deterioration. We found that use of a low aspect ratio (most compressed) resulted in a large decrease in false negative responses in comparison with the chart with the highest aspect ratio (least compressed), particularly for the mild to moderate falls that might be observed when health status is starting to deteriorate. This occurred at the expense of a small increase in false positive responses. At the very least, the present study has emphasised the importance of standardising the aspect ratios of charts used for health monitoring to facilitate the development of clinical pattern recognition skills.