Abstract
Technologically driven transport systems are characterized by a networked structure connecting operation centers and by a dynamics ruled by pre-established schedules. Schedules impose serious constraints on the timing of the operations, condition the allocation of resources and define a baseline to assess system performance. Here we study the performance of an air transportation system in terms of delays. Technical, operational or meteorological issues affecting some flights give rise to primary delays. When operations continue, such delays can propagate, magnify and eventually involve a significant part of the network. We define metrics able to quantify the level of network congestion and introduce a model that reproduces the delay propagation patterns observed in the U.S. performance data. Our results indicate that there is a non-negligible risk of systemic instability even under normal operating conditions. We also identify passenger and crew connectivity as the most relevant internal factor contributing to delay spreading.
Introduction
Air transportation systems have traditionally been described as graphs with vertices representing airports and edges representing direct flights during a fixed time period1,2. These graphs are called airport networks and have been studied at different geographical resolution scales, either restricted to a single country or region (usually the U.S. (USAN)3,4,5,6, but also China7 or Europe8) or covering the whole world (WAN)1,2. These networks show high heterogeneity in the distribution of connections per airport and in the traffic sustained by each connection. A nonlinear relation between the number of connections of the airports (topology) and the number of passengers (traffic) was observed in Ref. 1 and used later for modeling9. Furthermore, airport networks are structured in clusters of highly interconnected airports that reflect the geographical areas into which the traffic is naturally divided10. The dynamics of the connections and the traffic levels have also been analyzed for the USAN4. All of these are aspects of the graphs that influence their capability to transport persons, goods and even other less desirable passengers. An example is the propagation of infectious diseases at a global scale, which occurs when infected persons travel across the network11,12,13,14,15. The modeling and forecasting of disease spreading patterns using air traffic data is a notable success story13,14,15. One can thus wonder whether this success can be extended to the propagation of other phenomena. In particular, we are interested here in flight delays and the way in which congestion can become a systemic risk.
According to the 2008 Report of the Congress Joint Economic Committee, flight delays have an economic impact in the U.S. equivalent to 40.7 billion dollars per year16, while a similar cost is expected in Europe17,18. The situation can turn even grimmer in the next decade since air traffic is expected to increase16,17,18,19. Delays damage companies' balance sheets through increased operating costs and contribute to the deterioration of their image with customers20. Passengers suffer a loss of time, even more acute in the case of missed connections, that translates into decreased productivity and missed business opportunities or leisure activities. Additionally, attempts to recover delays lead to excess fuel consumption and larger CO2 emissions. As a consequence of this challenging situation, a considerable effort has been invested in the area of Air Traffic Management to characterize the sources of initial (primary) delays21,22 and the way in which they may be transferred and amplified by subsequent operations, the so-called reactionary delays19,23,24,25,26,27. The concept of delay itself implies a time difference with respect to the baseline provided by a predefined schedule21,24. The propagation of delays thus corresponds to the spreading of a malfunction across the system. The mechanisms responsible for it reflect the complexity of air traffic operations. Apart from the structure and dynamics of the airport network, other factors contributing to delay propagation are airport congestion25, plane rotation and crew and passenger connection disruptions23,26,27. Airline schedules typically include a buffer time to deal with all these issues. However, when this time is not enough, the departure of the next flight gets delayed and can affect further operations in a cascade-like effect23. There have been several attempts to model delay spreading28,29,30,31,32,33.
These studies differ in the level of detail included, but in general they consider the effects of delays or disruptions in the operations of a few major airports (hubs). In this work, we instead take a network-wide perspective to analyze the performance of a transportation system. We define metrics able to quantify the level of spread of the delays in the network. We then apply these metrics to a database with information on the operations in the U.S. during 2010 and introduce a model that reproduces the delay propagation patterns observed in the data. The model also shows a notable capacity to evaluate the risk of development of system-wide congestion and to assess the resilience of daily schedules to service disruptions.
Results
Database
The data was downloaded from the web page of the Bureau of Transportation Statistics (BTS)35. In particular, we used the Airline On-Time Performance Data, which is built with flight statistics provided by air carriers that exceed one percent of the annual national revenue for domestic regular service. The database comprises 6,450,129 scheduled flights operated by 18 carriers connecting 305 different commercial airports. The total number of flights operated in the US in 2010, including those that do not report on-time performance data, was 8,687,800 (Ref. 36). Therefore, the database contains information accounting for 74% of them. The information per flight includes real and scheduled departure (arrival) times, origin and destination airport, an identification code (tail number) for each aircraft, airline, etc. This data enables us to represent the US airport network and, furthermore, to replicate the scheduled flights for every day of 2010. A detailed description can be found in Section 1 of the Supplementary Information. It is important to note that this schedule is based on real events, which on some occasions may differ from the original planned schedule of the companies. If a flight gets canceled, diverted or even rescheduled, the airline may introduce changes in the original schedule that are not possible to trace back. However, given that these flights represent, respectively, 0.20% and 1.75% of all flights in the database, one can expect these changes not to be of large magnitude.
Model
The modeling approach followed is agent-based at the level of aircraft and is data-driven in the sense that the daily schedules and the primary delays are obtained directly from real records in the database. This level of realism is necessary to confront the model predictions with the real unfolding of the delay events during each day. Concretely, the model dynamics simulates three main subprocesses: aircraft rotation, flight connectivity and airport congestion. The latter two are independent of each other and can be turned on or off to explore the relevance of each subprocess in leading to network-wide congestion. Aircraft rotation, on the other hand, is intrinsic to the schedule and cannot be suppressed.
The basic time unit of the simulations is one minute, and the state of every aircraft is tracked at this temporal resolution. We assume that flights are not able to recover delays in the air, so the departure delays are equal to those at arrival at the destination. Throughout a day, each aircraft follows the connections given in the schedule, the so-called plane rotations. The airports are assumed to have a capacity per hour proportional to the scheduled airport arrival rate, with a proportionality factor β. Arrivals beyond this capacity produce delays. Passengers (crew) of incoming flights have a certain probability of connecting with other flights within a time window of 3 hours from the scheduled arrival. The probability of connection is proportional, with a factor α, to the flight connectivity levels provided by the BTS for each U.S. airport. A more precise description of the model is included in Section 2 of the Supplementary Information. This model thus has two free parameters: α, controlling passenger connectivity, and β, accounting for airport capacity. In the following section, we will examine the effect of these parameters on the systemic spread of delays.
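The plane-rotation ingredient can be illustrated with a minimal sketch. It is not the authors' implementation (which is described in the Supplementary Information); it only assumes what is stated above, namely that delay is not recovered in the air and that the scheduled time on the ground acts as a buffer. The names `Leg` and `propagate_rotation` and the `min_turnaround` parameter (an assumed minimum ground time of 30 minutes) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Leg:
    sched_dep: int  # scheduled departure, minutes after midnight
    sched_arr: int  # scheduled arrival, minutes after midnight


def propagate_rotation(legs, primary_delay, min_turnaround=30):
    """Return the departure delay (minutes) of every leg flown by one aircraft.

    legs: the legs of the aircraft in chronological order.
    primary_delay: delay injected on the first leg, in minutes.
    min_turnaround: assumed minimum ground time between legs (hypothetical).
    """
    delays = []
    delay = primary_delay
    for i, leg in enumerate(legs):
        if i > 0:
            prev = legs[i - 1]
            # No recovery in the air: arrival delay equals departure delay.
            actual_arr = prev.sched_arr + delay
            earliest_dep = actual_arr + min_turnaround
            # The scheduled ground time absorbs delay down to the minimum turnaround.
            delay = max(0, earliest_dep - leg.sched_dep)
        delays.append(delay)
    return delays


# Toy rotation of three legs with 50 and 70 minutes scheduled on the ground.
legs = [Leg(480, 600), Leg(650, 770), Leg(840, 960)]
print(propagate_rotation(legs, primary_delay=45))  # [45, 25, 0]
```

In this toy rotation a 45-minute primary delay is reduced to 25 minutes by the first buffer and fully absorbed by the second, which is the cascade-with-damping behavior the buffer time is meant to provide.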
Data analysis and comparison with model predictions
Flight delays are defined as the difference between the scheduled and real departure (arrival) times21,24. Most of the flights operated in 2010 were on time, some even ahead of schedule, but 37.5% of those reporting performance arrived or departed late. Their delays do not show a characteristic value: the delay distribution displays a broad tail, as can be seen in Figure 1A. This implies that most delayed flights were late by just a few minutes, while others were hours behind schedule. The shape of the distributions is similar for arrivals and departures. The planned buffer time on the ground for each aircraft should help absorb part of the delays, especially the mildest ones, as will be discussed next, thus altering the shape of the departure delay distribution. However, this factor is not able to substantially modify the characteristics of the distributions. Interestingly, the shape of the delay distribution does not change with the season of the year either. Summer concentrates the major part of the yearly traffic, so the total delay is higher, but when the distribution of delay per flight is considered, summer and winter behave similarly (see Figure 1B). The overall distributions of delays are thus quite robust. Small differences can be observed only when one focuses on particular airports. In Figure 1C, the departure delay distribution is plotted for the Atlanta, JFK New York and Honolulu airports. While the distributions in Atlanta and New York are similar, the Honolulu airport shows a bias toward larger delays due to its isolation from the continent.
The effect of the buffer time at the airports in absorbing delays can be measured using the Turn Around Time (TAT). The TAT is the time spent by an aircraft on the ground from its arrival at the gate to its departure from it. This measure is associated with airport operational efficiency and is used to improve the planning of flight connectivity and the stability of aircraft rotational sequences34. We refer to the difference between the scheduled and real times at the gate as ΔTAT. On the one hand, a negative value of ΔTAT means that an aircraft stayed at the gate longer than expected and so fresh delay was introduced. On the other hand, a positive ΔTAT shows that the operation was quicker than scheduled and that part of the delay was recovered. In Figure 2A, we depict ΔTAT for each flight along a day at the busiest airport of the network: Hartsfield-Jackson in Atlanta (ATL). That day, March 12, happened to be one of the worst in the database in terms of average flight delay. The abundance of positive values of ΔTAT is evidence of the capacity of the airport to recover delays. The distributions of ΔTAT for all the operations in 2010, separated into positive and negative values, are displayed in Figure 2B. These distributions, like those for the delays, show long tails, which is a marker of the complex nature of delay spreading mechanisms.
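The sign convention of ΔTAT can be made explicit with a short sketch, assuming all times are given in minutes after midnight for a single turnaround; the function name `delta_tat` and the sample times are illustrative and not taken from the database. Note that, with this definition, ΔTAT equals the arrival delay minus the departure delay of the same aircraft, since (sched_dep − sched_arr) − (real_dep − real_arr) = (real_arr − sched_arr) − (real_dep − sched_dep).

```python
def delta_tat(sched_arr, sched_dep, real_arr, real_dep):
    """Scheduled minus real turnaround time, in minutes.

    Positive: the turnaround was quicker than planned (delay recovered).
    Negative: the aircraft stayed longer than planned (fresh delay added).
    """
    scheduled_tat = sched_dep - sched_arr
    real_tat = real_dep - real_arr
    return scheduled_tat - real_tat


# Aircraft arrives 30 min late but leaves only 10 min late: 20 min recovered.
print(delta_tat(sched_arr=600, sched_dep=660, real_arr=630, real_dep=670))  # 20

# Aircraft arrives on time but leaves 30 min late: 30 min of fresh delay.
print(delta_tat(sched_arr=600, sched_dep=660, real_arr=600, real_dep=690))  # -30
```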
The focus so far has been on individual flight delays. We now define a metric of congestion for the full network. To do so, the average delay of all delayed flights during the year, which amounts to 29 minutes, is taken as the baseline. An airport is considered congested whenever the average delay of all its departing flights over a certain period of time exceeds 29 minutes. Additionally, a daily airport network is built using the flights of the day to assess whether or not congested airports are organized in connected clusters. Note that being in the same cluster is a measure of spatiotemporal correlation of congestion but not necessarily a sign of a cause-effect relation. We apply the same metric in the simulations in order to compare empirical and model results. Maps with the congested airports and the connections between them are shown for different days of the database in Figures 3A–3C. As can be seen, the scenario changes dramatically from day to day: on some days a large cluster emerges covering 1/3 of all airports, while on others only one or two airports cluster together. This is confirmed when the size of the largest connected cluster is depicted as a function of the day in Figure 3D. A strong variability is thus the main characteristic of the dynamics of the size of the largest congested cluster. The cumulative distribution of the cluster size is displayed in Figure 3E and seems compatible with an exponential decay. Even if the fluctuations are large, there exists a well defined characteristic cluster size. Given the cluster variability, an important question to answer is whether the congested airports are recurrent. In panel 3F, we calculate the Jaccard index to compare the sets of airports in the largest cluster on consecutive days or for the top 20 worst and best days. This index is 1 if the clusters are equal and 0 if they share no airport.
Interestingly, the index is relatively low for days with large clusters, which implies that the same airports are not consistently part of the cluster.
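The congestion metric, the largest congested cluster and the Jaccard index described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for Figure 3: delays are averaged over all departures in the list passed in, links between congested airports are treated as undirected, and the flight records are toy values (the airport codes ORD and LAX and all delay figures are invented for the example).

```python
from collections import defaultdict, deque

THRESHOLD_MIN = 29  # baseline from the text: average delay of delayed flights


def congested_airports(flights, threshold=THRESHOLD_MIN):
    """flights: iterable of (origin, destination, departure_delay_minutes)."""
    total, count = defaultdict(float), defaultdict(int)
    for origin, _dest, delay in flights:
        total[origin] += delay
        count[origin] += 1
    return {a for a in total if total[a] / count[a] > threshold}


def largest_congested_cluster(flights, threshold=THRESHOLD_MIN):
    congested = congested_airports(flights, threshold)
    # Keep only links whose two endpoints are both congested (undirected).
    neighbours = defaultdict(set)
    for origin, dest, _delay in flights:
        if origin in congested and dest in congested and origin != dest:
            neighbours[origin].add(dest)
            neighbours[dest].add(origin)
    best, seen = set(), set()
    for start in congested:
        if start in seen:
            continue
        # Breadth-first search over one connected component.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def jaccard(a, b):
    """Size of the intersection over size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy day of operations (delays in minutes are made up).
flights = [
    ("ATL", "JFK", 40), ("ATL", "ORD", 30), ("JFK", "ATL", 50),
    ("ORD", "ATL", 10), ("HNL", "LAX", 60), ("LAX", "HNL", 5),
]
print(sorted(congested_airports(flights)))         # ['ATL', 'HNL', 'JFK']
print(sorted(largest_congested_cluster(flights)))  # ['ATL', 'JFK']
print(jaccard({"ATL", "JFK"}, {"ATL", "ORD"}))     # one shared airport out of three
```

In the toy data HNL is congested but has no congested neighbour, so it forms a cluster of size one and the largest cluster is the pair ATL and JFK.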
In order to compare empirical results and model predictions regarding the evolution of the cluster of congested airports, we run the model fixing the airport capacity parameter β = 1 and fitting the flight connectivity factor α to obtain a maximum cluster size similar to the one observed in the data. By fixing β to 1, we are assuming the same airport capacity as originally scheduled. The results for the temporal evolution of the congested cluster size hour by hour can be seen in Figure 4 for March 12 and April 19. Similar plots for other days of the year are included in Section 3 of the Supplementary Information. Note that α is fitted only to match the maximum of these curves; nevertheless, the whole evolution of the cluster size predicted by the model follows that of the real data strikingly well. In fact, almost 60% of the airports in the real cluster are correctly identified by the model, since they rank at the top when airports are ordered by probability of congestion. Furthermore, by fixing α, without any fitting, the model can predict with 66% accuracy whether or not a day will develop a large congested cluster (see Supplementary Information, Section 3 for further details). The model also allows us to explore the contributions of the three main ingredients (plane rotation, flight connectivity and airport congestion) to the propagation of delays. From Figures 4B–C, we can conclude that flight connectivity is the most important factor. One may still wonder if the picture changes when the capacity of the airports is modified. Actually, the model exhibits weak sensitivity to variations in the β coefficient, as shown in Figure S13 of the Supplementary Information. Slightly increasing the airport capacity will not ease the propagation of delays since the main cause of the spreading, flight connections, is independent of it.
Conversely, a very strong decrease in the airports' capacity, around 50%, is needed to trigger new primary delays that later on will spread in a cascading effect. This might be the case when generalized severe weather conditions or labor conflicts occur.
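The role of β can be visualized with a deliberately coarse sketch of a single airport. It aggregates arrivals by hour and carries unserved aircraft over to the next hour, which is a simplification assumed here for illustration; the actual model tracks aircraft minute by minute (Supplementary Information, Section 2). The function name `hourly_backlog` and the traffic figures are hypothetical, and the toy numbers are not meant to reproduce the sensitivity results of Figure S13.

```python
def hourly_backlog(scheduled_per_hour, arriving_per_hour, beta=1.0):
    """Return the number of aircraft left waiting at the end of each hour.

    scheduled_per_hour: scheduled arrivals for each hour of the day.
    arriving_per_hour: aircraft that actually show up in each hour.
    beta: capacity factor; capacity in an hour is beta * scheduled arrivals.
    """
    backlog = 0
    waiting = []
    for scheduled, arriving in zip(scheduled_per_hour, arriving_per_hour):
        capacity = int(beta * scheduled)  # whole aircraft served per hour
        demand = backlog + arriving
        served = min(demand, capacity)
        backlog = demand - served
        waiting.append(backlog)
    return waiting


scheduled = [10, 10, 10, 10]

# Traffic on schedule and capacity as scheduled (beta = 1): no queue forms.
print(hourly_backlog(scheduled, scheduled, beta=1.0))  # [0, 0, 0, 0]

# Capacity halved (beta = 0.5): the queue grows every hour.
print(hourly_backlog(scheduled, scheduled, beta=0.5))  # [5, 10, 15, 20]

# Late arrivals bunch into the second hour: a queue appears even with beta = 1.
print(hourly_backlog(scheduled, [6, 14, 10, 10], beta=1.0))  # [0, 4, 4, 4]
```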
The initial delays affect the outcome of the model. In the results of Figure 4, we take the primary delays for each aircraft from the data as initial conditions for the model. By introducing different initial conditions, we can assess the resilience of a day's schedule to an increase in unexpected incidents. This question is explored in Figure 5, where a fraction of randomly selected flights are delayed. The size of the largest cluster is estimated as a function of the fraction of delayed flights and of the intensity of the initial delays. For the sake of simplicity, we set all the initial delays in the simulation equal to a fixed value (delay intensity in Figure 5). The results are displayed for the schedules of two days: April 19 and March 12, which respectively show a very small and a very large cluster in the real data. In particular, the average flight delay on March 12 was the second largest in 2010. The congestion on the worst day of the year, October 27, can be explained by extreme meteorological conditions37,38, while on March 12 no major external event was reported. Therefore, the network-wide propagation of delays on that day was likely caused and driven by internal mechanisms of the system. Comparing in Figure 5 the curves for March 12 and April 19, one notices that the surface representing the largest cluster size for March 12 is displaced toward smaller values of the initial delay intensity or fraction of flights with primary delay. This shows a higher susceptibility of the schedule of this day to disruptive perturbations. Another interesting feature of the curves of Figure 5 is that, given enough primary delays, they show a non-negligible risk of systemic failure regardless of the schedule. The curves in Figure 5 for different values of α also confirm the relevance of connections and crew rotations for the spreading of delays.
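The synthetic initial conditions used for this resilience test can be generated along the following lines. This is a sketch under the assumptions stated in the text (a given fraction of flights chosen uniformly at random, all receiving the same delay); the function name, the use of integer flight identifiers and the seed are illustrative choices.

```python
import random


def random_primary_delays(flight_ids, fraction, intensity, seed=None):
    """Assign a fixed delay (minutes) to a random fraction of the flights.

    Returns a dict mapping each selected flight id to its primary delay.
    """
    rng = random.Random(seed)
    flight_ids = list(flight_ids)
    n_delayed = round(fraction * len(flight_ids))
    delayed = rng.sample(flight_ids, n_delayed)  # sampling without replacement
    return {fid: intensity for fid in delayed}


# 10% of 1000 hypothetical flights start the day 60 minutes late.
initial = random_primary_delays(range(1000), fraction=0.1, intensity=60, seed=1)
print(len(initial))           # 100
print(set(initial.values()))  # {60}
```

Sweeping `fraction` and `intensity` over a grid and feeding each resulting dictionary to the simulation would give one point of a surface like those in Figure 5.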
The primary flight delays in a day of real operations are not necessarily located at random in the network. If the causes are bad weather, technical problems or labor issues, the delays are more prone to concentrate in a few airports. In Figure 6, this issue is explored by comparing the intra-day evolution of the cumulative size of the largest congested cluster when the initial delays are introduced in the model in two different ways. The first one uses the primary delays given in the database. The second one randomly shuffles the flights affected by the primary delays: the values of the real delays in the database are maintained, but they are assigned to flights selected at random. The comparison of the curves for the two cases with the real data shows that random perturbations are far more effective at collapsing the system. While airports in general have some capacity to recover delays, the random selection of delayed flights affects a larger number of them and, in addition, concentrates a heavier burden on smaller airports, which have less capacity to react. This result shows that the method followed for schedule evaluation in Figure 5 is conservative in the sense that it considers the schedule under an unfavorable scenario for the distribution of primary delays.
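The shuffling procedure can be sketched as follows, assuming the primary delays are stored as a mapping from flight identifier to delay in minutes; the function name and the sample values are hypothetical. The multiset of delay values is preserved, and only the flights that carry them change.

```python
import random


def shuffle_primary_delays(all_flight_ids, primary_delays, seed=None):
    """Reassign the observed delay values to flights chosen at random.

    all_flight_ids: every flight scheduled on the day.
    primary_delays: dict mapping flight id to observed primary delay (minutes).
    """
    rng = random.Random(seed)
    values = list(primary_delays.values())
    # Pick as many distinct flights as there are delay values.
    new_flights = rng.sample(list(all_flight_ids), len(values))
    return dict(zip(new_flights, values))


observed = {3: 15, 7: 120, 8: 45}  # toy primary delays
shuffled = shuffle_primary_delays(range(20), observed, seed=42)
print(sorted(shuffled.values()))   # [15, 45, 120]
```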
Discussion
In summary, we analyze the spreading of delays in an air traffic network. In particular, our results focus on the US airport network in 2010, but the concepts and techniques employed can be easily extrapolated to the analysis of the performance of a generic transport system. We introduce a measure for the level of network-wide extension of the delays by defining when an airport is considered congested and studying how congested airports form connected clusters in the network. In the data, the size of the largest congested cluster displays a high variability from one day to the next. This feature is due to the restart that the system undergoes at the end of each day and points toward the relevance of the daily schedule in defining the delay propagation patterns. In addition, we introduce a data-driven model able to reproduce the delay evolution observed in the data. The model includes three main mechanisms to spread delays: plane rotation, flight connections of either passengers or crews and airport congestion. The last two processes can be modulated at will to understand the role that each one of them plays in delay propagation. Our simulations show that passenger and crew connections are the most effective single mechanism to induce network congestion. We show how the model can be used to assess the ability of a daily schedule to deal with an increase in the number of disruptive events and also study the relevance of primary delay localization for the evolution of congestion in the network. Furthermore, the model offers the possibility of evaluating the effects of interventions in the system before their real implementation.
Flight delays represent failures to meet the constraints imposed by a daily schedule. Their propagation in the network is a paradigmatic example of the way in which a distributed transport system moves toward collapse. The framework developed in this work is thus easy to extend to other systems with dynamics regulated by predefined schedules. Its translation to other airport networks is, of course, straightforward, and even though the modeling of other transportation systems may require some particular details, the applicability of the metrics defined to measure network-wide congestion based on clustering is universal.
References
Barrat, A., Barthélemy, M., Pastor-Satorras, R. & Vespignani, A. The architecture of complex weighted networks. Proc Natl Acad Sci USA 101, 3747–3752 (2004).
Guimerà, R., Mossa, S., Turtschi, A. & Amaral, L. A. N. The worldwide air transportation network: Anomalous centrality, community structure and cities' global roles. Proc Natl Acad Sci USA 102, 7794–7799 (2005).
Opsahl, T., Colizza, V., Panzarasa, P. & Ramasco, J. J. Prominence and control: The weighted rich-club effect. Physical Review Letters 101, 168702 (2008).
Gautreau, A., Barrat, A. & Barthélemy, M. Microdynamics in stationary complex networks. Proc Natl Acad Sci USA 106, 8847–8852 (2009).
Wuellner, D. R., Roy, S. & D'Souza, R. M. Resilience and rewiring of the passenger airline networks in the United States. Phys Rev E 82, 056101 (2010).
Lancichinetti, A., Radicchi, F., Ramasco, J. J. & Fortunato, S. Finding statistically significant communities in networks. PloS ONE 6, e18961 (2011).
Li, W. & Cai, X. Statistical analysis of airport network of China. Physical Review E 69, 046106 (2004).
Burghouwt, G. & de Wit, J. Temporal configurations of European airline networks. Journal of Air Transport Management 11, 185–198 (2005).
Colizza, V. & Vespignani, A. Invasion Threshold in Heterogeneous Metapopulation Networks. Physical Review Letters 99, 148701 (2007).
Sales-Pardo, M., Guimerà, R., Moreira, A. A. & Amaral, L. A. N. Extracting the hierarchical organization of complex systems. Proc Natl Acad Sci USA 104, 15224–15229 (2007).
Hufnagel, L., Brockmann, D. & Geisel, T. Forecast and control of epidemics in a globalized world. Proc Natl Acad Sci USA 101, 15124–15129 (2004).
Colizza, V., Barrat, A., Barthélemy, M. & Vespignani, A. The role of the airline transportation network in the prediction and predictability of global epidemics. Proc Natl Acad Sci USA 103, 2015–2020 (2006).
Balcan, D., Colizza, V., Gonçalves, B., Hu, H., Ramasco, J. J. & Vespignani, A. Multiscale mobility networks and the large scale spreading of infectious diseases. Proc Natl Acad Sci USA 106, 21484–21489 (2009).
Balcan, D. et al. Seasonal transmission potential and activity peaks of the new influenza A(H1N1): a Monte Carlo likelihood analysis based on human mobility. BMC Medicine 7, 45 (2009).
Tizzoni, M. et al. Real time numerical forecast of global epidemic spreading: case study of 2009 A/H1N1pdm. BMC Medicine (2012).
Joint Economic Committee of US Congress. Your flight has been delayed again: Flight delays cost passengers, airlines and the U.S. economy billions. Available online at http://www.jec.senate.gov (May 22, 2008).
ICCSAI. Fact Book on Air Transport in Europe, Available online at http://www.iccsai.eu (2007–2011).
Eurocontrol Annual report, Available online at http://www.eurocontrol.int (2008–2011).
Jetzki, M. The propagation of air transport delays in Europe, PhD Thesis, Department of Airport and Air Transportation Research, RWTH Aachen University, Aachen, Germany (2009).
Folkes, V. S., Koletsky, S. & Graham, J. L. A field study of causal inferences and consumer reaction: The view from the airport. Journal of Consumer Research 13, 534–539 (1987).
Rupp, N. G. Further investigations into the causes of flight delays, Working paper, Department of Economy, East Carolina University. Available online at http://www.ecu.edu/cs-educ/econ/upload/ecu0707.pdf (2007).
Ahmadbeygi, S., Cohn, A., Guan, Y. & Belobaba, P. Analysis of the potential for delay propagation in passenger aviation flight networks. Journal of Air Transport Management 14, 221–236 (2008).
Beatty, R., Hsu, R., Berry, L. & Rome, J. Preliminary evaluation of flight delay propagation through an airline schedule. Air Traffic Control Quarterly 7, 259–270 (1999).
Mayer, C. & Sinai, T. Network effects, congestion externalities and air traffic delays: Or why not all delays are evil. American Economic Review 93, 1194–1215 (2003).
Bonnefoy, P. A. & Hansman, R. J. Scalability and evolutionary dynamics of air transportation networks in the United States. Procs. of 7th AIAA Aviation Technology and Operations Conference (2007).
Cook, A. European air traffic management (Ashgate, Surrey U.K. 2007).
Belobaba, P., Odoni, A. & Barnhart, C. The global airline industry (John Wiley & Sons, Chichester U.K. 2009).
Schaefer, L. & Millner, D. Flights delay propagation analysis with the detailed policy assessment tool. Procs. of the 2001 IEEE international conference on systems, man and cybernetics 2, 1299–1303 (2001).
Rosenberg, J. M. et al. A stochastic model of airline operations. Transportation Science 36, 357–377 (2002).
Wang, P. T. R., Schaefer, L. A. & Wojcik, L. A. Flight connections and their impacts on delay propagation. Procs. of the IEEE Digital Avionic Systems Conference 1, 5.B.4-1-5.B.4-9 (2003).
Janić, M. Modeling the large scale disruptions of an airline network. Journal of Transportation Engineering 131, 249–260 (2005).
Bazzan, A. L. C. & Klügl, F. Multi-agent systems for traffic and transportation engineering (Information Science Reference, Hershey, PA 2009).
Lacasa, L., Cea, M. & Zanin, M. Jamming transition in air transportation networks. Physica A 388, 3948–3954 (2009).
Wu, C. & Caves, R. E. Aircraft operational costs and turnaround efficiency at airports. Journal of Air Transport Management 6, 201–208 (2000).
Bureau of Transportation Statistics of the US Government, RITA database. Available online at http://www.bts.gov.
BTS press release of March 22, 2011, Available online at http://www.rita.dot.gov/sites/default/files/rita_archives/bts_press_releases/2011/bts017_11/html/bts017_11.html. Date of access: Sept 10, 2012.
Severe weather occurred on October 27, see NOAA http://www.spc.noaa.gov/climo/reports/101027_rpts.html. Date of access: Sept 10, 2012.
Severe weather occurred on October 27, see CNN http://articles.cnn.com/2010-10-27/us/us.weather_1_tornado-damage-tornado-sightings-airport-delays?_s=PM%3AUS. Date of access: Sept 10, 2012.
Acknowledgements
PF receives support from the network Complex World within the WPE of SESAR (Eurocontrol and EU Commission). JJR acknowledges funding from the Ramón y Cajal program of the Spanish Ministry of Economy (MINECO). Partial support from MINECO and FEDER was also received through projects MODASS (FIS2011-24785) and FISICOS (FIS2007-60327) and from the EU Commission through projects EUNOIA (FP7-DG.Connect-318367) and LASAGNE (FP7-ICT-318132).
Author information
Contributions
P.F., J.J.R. & V.M.E. designed research, performed research, analyzed the data and interpreted the results. All authors wrote, reviewed and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Electronic supplementary material
Supplementary Information
Supplementary Information for Systemic delay propagation in the US airport network
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/
About this article
Cite this article
Fleurquin, P., Ramasco, J. & Eguiluz, V. Systemic delay propagation in the US airport network. Sci Rep 3, 1159 (2013). https://doi.org/10.1038/srep01159
DOI: https://doi.org/10.1038/srep01159