Developing reliable hourly electricity demand data through screening and imputation

Electricity usage (demand) data are used by utilities, governments, and academics to model electric grids for a variety of planning (e.g., capacity expansion and system operation) purposes. The U.S. Energy Information Administration collects hourly demand data from all balancing authorities (BAs) in the contiguous United States. As of September 2019, we find 2.2% of the demand data in their database are missing. Additionally, 0.5% of reported quantities are either negative values or are otherwise identified as outliers. With the goal of attaining non-missing, continuous, and physically plausible demand data to facilitate analysis, we developed a screening process to identify anomalous values. We then applied a Multiple Imputation by Chained Equations (MICE) technique to impute replacements for missing and anomalous values. We conduct cross-validation on the MICE technique by marking subsets of plausible data as missing, and using the remaining data to predict these "missing" data. The mean absolute percentage error of imputed values is 3.5% across all BAs. The cleaned data are published and available open access (DOI: 10.5281/zenodo.3690240).
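The cross-validation protocol described above can be sketched as follows. This is a minimal illustration of the mask-impute-score loop only: for brevity, simple linear interpolation stands in for the MICE imputer, and the function name, masking fraction, and seed are hypothetical choices, not the paper's configuration.

```python
import numpy as np
import pandas as pd

def mask_and_score(demand, frac=0.05, seed=0):
    """Mark a random subset of plausible hourly values as missing,
    impute replacements, and report the mean absolute percentage
    error (MAPE) of the imputed values against the held-out truth."""
    rng = np.random.default_rng(seed)
    s = demand.copy()
    # Avoid the first/last hour so the stand-in interpolator always
    # has neighbors on both sides.
    idx = rng.choice(s.index[1:-1], size=int(len(s) * frac), replace=False)
    truth = s.loc[idx]
    s.loc[idx] = np.nan
    # Stand-in imputer: linear interpolation (the paper uses MICE here).
    imputed = s.interpolate(method="linear")
    mape = (np.abs(imputed.loc[idx] - truth) / truth).mean() * 100
    return mape
```

On smooth synthetic demand the stand-in imputer scores a low MAPE; the paper reports 3.5% for MICE on real BA data, a different and harder setting.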

Step 1

Negative or zero filter: The 'negative or zero' filter enforces d_t > 0 MW for all hours. This filter screens the non-physically-plausible negative values as well as the unlikely zero values. Nearly all of the zero values returned from the EIA database are intermittently inserted within 'normal' data. Zero appears to be a common default value for some BAs when their standard reporting procedures lapse. It is technically possible for the electricity demand of an entire BA to reach zero; however, it is very unlikely. In cases of extreme-weather-induced power outages, we expect demand to trend downwards and then slowly recover rather than plunging to zero.
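The rule d_t > 0 MW can be sketched in a few lines; this assumes the demand series is held as a pandas Series of hourly MW values, and the function name is illustrative.

```python
import pandas as pd

def negative_or_zero_filter(demand):
    """Flag hours where reported demand d_t is <= 0 MW.
    NaNs (missing reports) compare False and are left for the
    imputation stage rather than being flagged here."""
    return demand.le(0)  # boolean mask; True marks anomalous hours
```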
In the four years of analyzed data, we observe four cases with demand trending down to zero and then recovering. The two confirmed full-service-area outages are both from the NSB BA, a municipal electric utility serving approximately 26,000 customers in Florida (ref. [S.22]). No special treatment is given to these regions during screening or imputation; the imputed results are included in the cleaned data product like every other anomalous instance.
News posts from the SCL BA indicate their zero demand event was a false alarm during system reliability tests. Lastly, there is no confirmation of the observed full-service-area outage for the IID BA. Other outages are reported within a few days of the instance, leading us to believe that the data are erroneous: • 4:08 PM, 10 June 2018: "Power Outage-Salton City. Estimated restoration time is unknown; troubleshooter en route, updates to follow" (https://twitter.com/IIDatWork/status/1005949938268639232).
A total of 863 hours, or 0.0455% of the total data, are screened by the 'negative or zero' filter.
Identical run filter: There are numerous instances where days or even months of reported demand values are identical. The 'identical run' filter screens out cases where the same value is reported for three or more consecutive hours. The filter marks the third hour onwards as anomalous; i.e., filter hour t if d_{t-2} = d_{t-1} = d_t. An example is seen in Figure 2 panel (a).
Because demand values are rounded to the nearest MW, it is possible for values to be identical hour-to-hour in accurately reported data. Repeated values are observed a non-negligible fraction of the time for the smaller BAs, particularly during the peaks and troughs of the daily demand cycle. The requirements of the 'identical run' filter strike a compromise between flagging false positives in the smaller BAs and removing every single hour in long identical runs. The 'identical run' filter screens a total of 3,861 hours, or 0.204%, of the data.
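The rule d_{t-2} = d_{t-1} = d_t can be implemented with shifted comparisons; a minimal sketch, assuming a pandas Series and an illustrative function name:

```python
import pandas as pd

def identical_run_filter(demand):
    """Flag hour t when d_{t-2} == d_{t-1} == d_t, i.e. the third and
    later hours of any run of identical reported values. The first two
    hours of a run are deliberately left unflagged."""
    same_prev = demand.eq(demand.shift(1))
    return same_prev & demand.eq(demand.shift(2))
```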
Global demand filter: The 'global demand' filter screens demand values that are at least 10 times larger than the four year median demand value for each BA. The median demand value is calculated after applying the 'negative or zero' and 'identical run' filters. The threshold of ten times the median value was chosen to remove the most extreme upward outliers, with the intention that the subsequent algorithms provide a more detailed removal of the remaining upward outliers. The 'global demand' filter screens a total of 170 hours, or 0.00898% of the data.
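As a sketch, with previously screened hours already set to NaN (so they are excluded from the median, as pandas does by default):

```python
import pandas as pd

def global_demand_filter(demand, threshold=10):
    """Flag values at least `threshold` (here 10, per the text) times
    the long-term median demand. `demand` is assumed to have earlier
    anomalies replaced by NaN, which pandas skips when taking the median."""
    return demand.ge(threshold * demand.median())
```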
Global demand ± 1 hour filter: Visual inspection of the data shows that the hours preceding and following hours screened by the 'global demand' filter are often upward outliers. If hour t is screened by the 'global demand' filter, hours t − 1 and t + 1 will be screened by the 'global demand ± 1 hour' filter. A total of 285 hours, or 0.0151% of the total data, are screened by this filter.
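This is a one-hour dilation of the 'global demand' mask; a minimal sketch (function name illustrative):

```python
import pandas as pd

def global_demand_plus_minus_1(global_mask):
    """Given the boolean mask from the 'global demand' filter, flag the
    hours immediately before and after each hit. Returns only the newly
    flagged hours, mirroring how the filter's count is reported."""
    neighbors = (global_mask.shift(1, fill_value=False)
                 | global_mask.shift(-1, fill_value=False))
    return neighbors & ~global_mask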
Step 2

Step 2 begins with a significantly cleaner data set because the four filters in Step 1 have removed values that we are very confident are anomalous. At this point, we calculate variables for use in the subsequent screening algorithms.
Algorithm variables: Sub-weekly trends in the central tendency of the demand are represented using a 48 hour moving median, M_{t,48hr} (eq. S.8). Trends in M_{t,48hr} are likely related to the local weather. The longer 480 hour moving median, M_{t,480hr}, is not intended to capture synoptic-scale weather changes. The difference between d_t and M_{t,48hr} is denoted Δ(d_t, M_{t,48hr}) = d_t − M_{t,48hr} and is used to understand the structure of the daily cycle. The typical spread in demand values about M_{t,48hr} is expressed as the IQR of Δ(d_t, M_{t,48hr}) within a centered 10-day window (eq. S.10).
The IQR_{dem,t} changes from season to season, largely based on varying heating and cooling needs. The 10-day window provides enough data to estimate the IQR while remaining short enough to describe real synoptic-scale changes. First-order differences between hours, Δ(d_{t-1}, d_t) = d_{t-1} − d_t, are used to identify outliers. The IQR of Δ(d_{t-1}, d_t) quantifies the spread in values and is used to set thresholds for screening demand values. We calculate the IQR of the first-order differences using a 10-day window (eq. S.11).
We create a normalized local demand profile for the 24 hours in the daily cycle. For each hour of the day, we take the median Δ(d_t, M_{t,48hr}) value over a centered 21-day window and normalize the result by the long-term median demand (eq. S.12). h_{t,daily} is calculated over the large 21-day interval because it incorporates only a single value each day for each hour. When multiplied by M_{t,48hr}, (1 + h_{t,daily}) results in an estimate of the demand for any given hour based on local information. Demand values are compared against these estimates using eq. S.13.
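The variables above can be sketched as follows. This is a simplification, not the paper's implementation: window centering and `min_periods` handling are assumptions, and the 21-day profile window is collapsed to a per-hour-of-day median over the full record to keep the sketch short.

```python
import numpy as np
import pandas as pd

def algorithm_variables(demand):
    """Sketch of the Step 2 variables (eqs. S.8-S.12), assuming `demand`
    is an hourly pandas Series with a DatetimeIndex."""
    # M_t,48hr: 48 hour centered moving median (~eq. S.8)
    m48 = demand.rolling(48, center=True, min_periods=1).median()
    delta_dm = demand - m48                      # Δ(d_t, M_t,48hr)
    win = 240                                    # 10-day window, in hours
    # IQR_dem,t: spread of Δ about M_t,48hr in a 10-day window (~eq. S.10)
    iqr_dem = (delta_dm.rolling(win, center=True, min_periods=1).quantile(0.75)
               - delta_dm.rolling(win, center=True, min_periods=1).quantile(0.25))
    # h_t,daily: normalized daily profile; the 21-day window is simplified
    # here to a median over the full record for each hour of the day.
    h_by_hour = delta_dm.groupby(demand.index.hour).median() / demand.median()
    h_daily = pd.Series(demand.index.hour, index=demand.index).map(h_by_hour)
    # Local estimate of demand: M_t,48hr * (1 + h_t,daily)
    expected = m48 * (1 + h_daily)
    return m48, iqr_dem, h_daily, expected
```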
First-order differences between r_t values are denoted Δ(r_{t-1}, r_t) = r_{t-1} − r_t. Because r_t is calculated as a ratio with respect to the expected value, both r_t and the derived quantity Δ(r_{t-1}, r_t) are relatively consistent throughout the year. Therefore, we calculate the IQR of Δ(r_{t-1}, r_t) across the full four years of data (eq. S.14).
IQR_r = IQR over all t in 4 years of Δ(r_{t-1}, r_t)    (S.14)

Step 2 of the anomaly screening process consists of the following four screening algorithms that use the variables defined above.
Local demand filter: The 'local demand' filter screens values that deviate significantly from their expected values. The expected value is calculated as M_{t,48hr} × (1 + h_{t,daily}). Thresholds for deviations from the estimate are set in both the upward and downward directions. The thresholds are scaled by a determination of the typical spread in the daily cycle, IQR_{dem,t}. We screen demand values if either eq. S.15 or S.16 is true.
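A sketch of the two-sided threshold test follows. The multipliers `up` and `down` are placeholders; the paper's actual thresholds (eqs. S.15-S.16) are not reproduced here.

```python
import pandas as pd

def local_demand_filter(demand, expected, iqr_dem, up=3.5, down=2.5):
    """Flag hours deviating strongly from the local estimate
    M_t,48hr * (1 + h_t,daily), with separate upward and downward
    thresholds scaled by IQR_dem,t."""
    dev = demand - expected
    return dev.gt(up * iqr_dem) | dev.lt(-down * iqr_dem)
```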
The 'local demand' filter screens a total of 671 hours, or 0.0354% of the data.

Double-sided delta filter: The 'double-sided delta' filter screens values that strongly violate the continuity of the demand data. The data typically exhibit relatively smooth transitions from one hour to the next. The filter targets single-hour spike-like features. We screen demand values if either eq. S.17 or S.18 is true.
There are many instances in the data where Δ(d_{t-1}, d_t) > IQR_{Δ,t} × 2 or Δ(d_{t-1}, d_t) < IQR_{Δ,t} × (−2), such as the steeply rising shoulders of the daily cycle. These instances are not flagged by these criteria. The 'double-sided delta' filter screens a total of 1,212 hours, or 0.0640% of the data.
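The single-hour-spike idea can be sketched by requiring a large step into hour t and an opposite large step out of it, which is why the rising shoulders of the daily cycle (large steps in one direction only) pass through. The multiplier `k` and the exact form of eqs. S.17-S.18 are assumptions.

```python
import pandas as pd

def double_sided_delta_filter(demand, iqr_delta, k=5):
    """Flag single-hour spikes: hour t is screened when the step into t
    and the step out of t are both large (relative to IQR_delta,t) and
    of opposite sign."""
    step_in = demand - demand.shift(1)     # d_t - d_{t-1}
    step_out = demand.shift(-1) - demand   # d_{t+1} - d_t
    big = k * iqr_delta
    up_spike = step_in.gt(big) & step_out.lt(-big)
    down_spike = step_in.lt(-big) & step_out.gt(big)
    return up_spike | down_spike
```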
Single-sided delta filter: The 'single-sided delta' filter is designed to screen anomalous sequences of hours associated with large discontinuities, which collectively deviate significantly from their surrounding data. First-order differences in demand and r_t are used to identify discontinuities in the data. Deviations are defined based on the local values of M_{t,48hr} and M_{t,480hr}. In general, M_{t,48hr} represents the central tendency of the local data more accurately. However, when many anomalous values are present they can significantly alter M_{t,48hr}, and M_{t,480hr} can provide a helpful benchmark.
A specific challenge for the 'single-sided delta' filter is that it must determine which demand values should be screened: those preceding or those following a large discontinuity. The 'single-sided delta' filter iterates over each hour twice, once in the forward direction, then again in the reverse direction. The algorithm tracks the most recently iterated hour still labeled as 'okay', hour okay, which is denoted with an okay subscript in the following description. Hour okay is initialized to the first 'okay' hour in the time series. For each hour, t, in each direction: Step 1: Calculate the first-order differences in demand and r between hour okay and hour t: Δ(d_okay, d_t) and Δ(r_okay, r_t).
Step 2: Determine if the first-order differences are significant using IQR thresholds.
• Else: hour okay is incremented to hour t, proceed to Step 1 for hour t + 1.
Step 3: Determine which of the two hours deviates more from its expected value, using both the 48 hour and 20 day moving medians. Screen the hour with the larger maximum deviation.
• Else: hour okay is incremented to hour t, proceed to Step 1 for hour t + 1.
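One directional pass of this logic can be sketched as follows (the paper runs it forward, then again in reverse). The significance test and the deviation comparison are simplified stand-ins for the paper's IQR-threshold rules, and the multiplier `k` is an assumption.

```python
def single_sided_delta_pass(demand, iqr_delta, expected, k=4):
    """One forward pass of the 'single-sided delta' idea. `demand`,
    `iqr_delta`, and `expected` are equal-length sequences; returns a
    list of booleans where True marks a screened hour."""
    screened = [False] * len(demand)
    okay = 0  # index of the most recent hour still labeled 'okay'
    for t in range(1, len(demand)):
        # Steps 1-2: is the jump from hour `okay` to hour t significant?
        if abs(demand[t] - demand[okay]) > k * iqr_delta[t]:
            # Step 3: screen whichever hour deviates more from its estimate.
            if abs(demand[t] - expected[t]) >= abs(demand[okay] - expected[okay]):
                screened[t] = True       # hour t flagged; `okay` unchanged
            else:
                screened[okay] = True    # the reference hour was the outlier
                okay = t
        else:
            okay = t  # jump not significant: hour t becomes the reference
    return screened
```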
The 'single-sided delta' filter screens a total of 181 hours, or 0.00960% of the data.

Anomalous region filter: There are multiple occurrences of especially chaotic segments of demand data. In some instances, the previous seven filters screen the entire chaotic segment. Other times, there are residual hours left unfiltered in the middle of multiple screened hours. We designed the 'anomalous region' filter with the intention of screening these remaining hours.
The algorithm focuses on the structure of 'okay' and 'missing' hours versus otherwise screened values. A few extra variables are calculated based on this structure. The percentage of screened hours within ±24 hours of each hour t is calculated, pct_{t,screened}. If hour t is currently classified as 'okay', the number of continuous 'okay' hours preceding and following is counted, giving the total length of the 'okay' run associated with hour t, l_{t,okay}.
For each hour, t:

Step 1: Determine if the number of screened values surrounding hour t is significant.
• Else: proceed to Step 1 for hour t + 1.
Step 2: Iterate over the hours used to calculate pct_{t,screened}. For each hour t′ in (t − 24, t − 23, . . . , t + 24): • If t′ is not categorized as 'okay': continue to t′ + 1 in Step 2 (t′ is already screened or 'missing').
• If l_{t′,okay} > 24 hours: continue to t′ + 1 in Step 2 (t′ is part of a substantial continuous section of 'okay' data).
• Else: screen t′ and proceed to t′ + 1 in Step 2.
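The steps above can be sketched as a single pass over a status array. The 50% significance threshold in Step 1 is an assumption (the paper's value is not stated here), and the function name is illustrative.

```python
def anomalous_region_filter(status):
    """Sketch of the 'anomalous region' pass. `status` is a list with
    entries 'okay', 'missing', or 'screened'. 'okay' hours that sit in
    short runs (<= 24 h) inside heavily screened neighborhoods are
    re-labeled 'screened'."""
    n = len(status)
    out = list(status)
    # l_t,okay: length of the contiguous 'okay' run containing each hour.
    run_len = [0] * n
    i = 0
    while i < n:
        if status[i] == 'okay':
            j = i
            while j < n and status[j] == 'okay':
                j += 1
            for t in range(i, j):
                run_len[t] = j - i
            i = j
        else:
            i += 1
    for t in range(n):
        # pct_t,screened over a +/- 24 hour window (clipped at the edges).
        window = status[max(0, t - 24): t + 25]
        pct = sum(s == 'screened' for s in window) / len(window)
        if pct > 0.5:  # Step 1: screened neighbors are significant (assumed cut)
            # Step 2: screen short 'okay' runs inside the window.
            for t2 in range(max(0, t - 24), min(n, t + 25)):
                if status[t2] == 'okay' and run_len[t2] <= 24:
                    out[t2] = 'screened'
    return out
```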
A prime example of the 'anomalous region' filter can be seen in panel (d) of Figure 2. The 'single-sided delta' and 'negative or zero' filters screen many hours in this example. But, there are some remaining anomalies undetected by other filters. The 'anomalous region' filter screens a total of 1,773 hours, or 0.0936% of the data.

Energy system modeling details
The details of the Energy system modeling section are tabulated. Supplementary Table S.5 shows summary statistics including the mean, maximum, and minimum hourly demand for the three example data cleaning methods. Supplementary Table S.6 shows the optimized system capacities for each region, for each technology case, and for each data cleaning method.

Table S.5: Summary statistics including the mean, maximum, and minimum hourly demand for these different data cleaning techniques are shown for each geographic region for the modeled years, 2016 through 2018. A strong case for data cleaning is made when comparing the excessively large maximum demand values for Method 1 in the Western and CONUS regions. Noticeable differences are seen between Method 2 and Method 3 when comparing the minimum demand values for Texas. The EIA imputation method (Method 2) lacks the ability to fill in long consecutive demand gaps.

Table S.6: Optimized system capacities are shown for each region, for the different technology cases, and for each data cleaning method. The capacities are for systems capable of meeting 100% of the electricity demand profile for each region. The "-" entries denote technologies that were excluded from the model configuration. Natural gas is abbreviated as NG.