MAREA Mathematical Approach towards Resilience Engineering in ATM
MAREA (Mathematical Approach towards Resilience Engineering in ATM) is a SESAR WP-E Research Project. The project started in March 2011 and was completed in October 2013. The project partners are National Aerospace Laboratory NLR (Coordinator), VU University Amsterdam and University of l'Aquila.
Resilience in ATM
Resilience is important for the sociotechnical air traffic management (ATM) system, where large numbers of interacting human operators and technical systems, functioning in different organizations at a variety of locations, must control air traffic safely and efficiently in the context of uncertainty and disturbances (e.g. delays, weather, system malfunctions). Although procedures and regulations specify working processes in ATM to a considerable extent, the flexibility and system oversight of human operators are essential for efficient and safe operations in both normal and rare conditions. Recognition of the positive contributions of human operators to maintaining safety and efficiency in complex sociotechnical systems has been a main driver of the resilience engineering research field, and it explains the field's focus on the relation between human factors and safety. Resilience engineering stresses that performance variability of human operators is inevitable, because operators adjust to the demands and conditions of their working context. As such, resilience engineering emphasises the variety of ways in which human operators deal with nominal and non-nominal conditions in their effort to support safety, rather than the human-error-based thinking applied in traditional human reliability assessment and event-sequence-based accident models.
Objective of MAREA
For a sociotechnical system as complex as ATM, resilience engineering is at an early stage of development. The MAREA partners recognised that there are challenges for prospective analysis of resilience in ATM, due to the combinatorially many potential behaviours that may stem from external and internal events, and the interactions between the various entities in the ATM sociotechnical system. As a way forward, the objective of MAREA has been to develop an adequate mathematical modelling and analysis approach for prospective analysis of resilience in ATM. This approach is intended to support effective implementation of resilience engineering for ATM.
Database of hazards for the study of resilience
In assuring the safety of air transport operations, assessing the risk implications of hazards in the operation considered plays a central role. Here a ‘hazard’ means any condition, event, or circumstance which could induce an accident. A prime means of gathering hazards for safety assessments is brainstorm sessions with pilots, controllers and other experts. These hazard brainstorm sessions aim to push the boundary between functionally imaginable and functionally unimaginable hazards. Consequently, considerable parts of these sessions address human behaviour, conditions and interactions between humans and technical systems. As part of safety risk analyses conducted since 1995 for many proposed ATM changes, NLR has identified a broad range of related hazards. These hazards have been collected in a Hazard Database, which contains over 4000 hazards. Given the broad view on hazard identification and the systematic inclusion of human-related performance in it, we set out to use the hazards collected in the NLR ATM Hazard Database as a broad source of disturbances for the study of safety-related resilience in ATM. The NLR ATM Hazard Database was analysed in order to select the unique hazards and to formulate them in a generalized way. This resulted in a total of 525 generalized hazards as a basis for further study in MAREA. This set was split into two similarly sized sets: one for a model identification and development phase, and one for a validation phase.
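The split into a development set and a validation set can be sketched as follows. This is only an illustration: the text does not state how hazards were assigned to each set, so the seeded random shuffle, the function name `split_hazards` and the integer identifiers are all assumptions.

```python
import random


def split_hazards(hazard_ids, seed=0):
    """Split hazard identifiers into two similarly sized, disjoint sets.

    Assumption: a seeded random shuffle. The actual MAREA assignment
    procedure is not described in the text.
    """
    shuffled = list(hazard_ids)
    random.Random(seed).shuffle(shuffled)
    half = (len(shuffled) + 1) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical identifiers 1..525 standing in for the generalized hazards.
development_set, validation_set = split_hazards(range(1, 526))
print(len(development_set), len(validation_set))
```

With 525 identifiers the two sets differ in size by one, which matches the description "similarly sized".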
Agent-based modelling for the study of resilience
An agent-oriented perspective is useful to conceptualise processes in complex sociotechnical systems, such as ATM. Agent-based modelling considers a sociotechnical system to be composed of several interacting agents and the overall system behaviour emerges from the individual agent processes and their interactions. Agents in ATM operations (e.g. pilots, controllers, technical systems) can express a large variety of behavioural patterns and these are influenced by specific processes and characteristics of the agent considered. Especially for human agents there is a wide range of cognitive and affective aspects that influence their behaviour. Such agent-related aspects can be represented by model constructs for each agent.
Library of agent-based model constructs
In MAREA a variety of agent-based model constructs have been identified that together can model a broad set of hazards in ATM. For each hazard considered, it was analysed which model construct or combination of model constructs could represent it. This was done by performing ‘mental simulation’, i.e. qualitative reasoning by a team of analysts about the way that the models can reflect a hazard. The result of such analysis is that a hazard can be well captured, partly captured or not captured by the model constructs. As part of the analysis, argumentation was provided about the mechanisms by which the models can capture a hazard, and the aspects that are yet missing. This analysis of the hazard modelling capabilities of model constructs was done in three phases.
- In the first phase, 13 model constructs were identified that have been used in the context of multi-agent dynamic risk modelling (MA-DRM) in NLR’s safety assessment methodology TOPAZ (Traffic Organization and Perturbation AnalyZer). Using these model constructs, 58% of the hazards could be modelled well, 11% could be partially modelled and 30% could not be modelled.
- In the second phase, 11 complementary model constructs were identified through searching human performance models that have been applied by the Agent Systems research group at VU University Amsterdam. Including this extension, 80% of the hazards could be well modelled, 7% could be partially modelled, and 14% could not be modelled.
- In the third phase, 14 additional model constructs were identified in the literature for the hazards that had not yet been fully modelled. Including these additional model constructs, 92% of the hazards could be well modelled, 6% could be partially modelled, and 2% could not be modelled.
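The coverage percentages reported for each phase are tallies of per-hazard judgements (well, partly or not modelled). A minimal sketch of such a tally is given below; the label names, the `coverage` function and the toy input are assumptions for illustration and are not MAREA data.

```python
from collections import Counter

CATEGORIES = ("well", "partly", "not")


def coverage(labels):
    """Return the percentage of hazards in each coverage category."""
    total = len(labels)
    if total == 0:
        raise ValueError("no hazards to summarise")
    counts = Counter(labels)
    return {c: 100.0 * counts[c] / total for c in CATEGORIES}


# Toy input (not MAREA data): judgements for ten hypothetical hazards.
toy_labels = ["well"] * 6 + ["partly"] * 1 + ["not"] * 3
summary = coverage(toy_labels)
print({c: round(p) for c, p in summary.items()})
```

Because each percentage is rounded separately, a set of reported figures need not sum to exactly 100%, which may be why some of the triples above add up to 99% or 101%.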
Integration of agent-based model constructs
All model constructs in the library have been integrated at a conceptual level, thereby creating an overall picture of their interconnectivity. At the highest level this overview has been split into two parts: one describing the interplay of the internal model constructs of a human operator, and one describing the interplay of the model constructs that belong to the ATM environment of a human operator.
The hazards occurring in a number of historic incident and accident scenarios have been analysed as a systematic approach towards formalisation of model constructs. Based on this overview, a subset of combinations of model constructs has been selected for further formalisation. The formalisation of the integration was first described by means of visualisations and explanations about the interactions between the variables in the different model constructs, and subsequently by expressing the related mathematical formulae.
Validation of the agent-based model constructs
The analysis of model constructs against hazards in the validation set showed that 92% of these hazards are well modelled by the considered combination of model constructs, 8% are partly modelled, and 1% are not modelled. These numbers are similar to the results found for the hazard set used in the model identification and development phase (92% well modelled, 6% partly modelled, 2% not modelled). This similarity indicates that the library of model constructs is not biased towards the hazard set used in the model identification and development phase. The library of model constructs identified in MAREA is thus capable of modelling a large percentage of hazards in ATM.
As an additional means of validation, a selection of the model constructs has been used to develop a formal, executable model of a historic ATM scenario. This scenario addresses an aircraft descending below the minimum descent altitude because of impaired conditions of the flight crew members. The model consists of a formal integration of several model constructs from the library. Based on the integrated model, a number of simulations have been generated that describe ways in which hazards can evolve in this scenario.
As a third way to validate the agent-based model constructs, a series of interviews with ATM domain experts (air traffic controllers and pilots) was conducted. The main purpose of the interviews was to obtain feedback from the experts on the capability of the integrated model constructs to model hazards in ATM scenarios and to describe variations in the evolution of such scenarios. This feedback has led to a number of valuable insights. First, conclusions have been formulated for the different (clusters of) model constructs separately. In general, these conclusions confirmed that the experts judged our selection of model constructs to be plausible. In addition, the experts indicated possible modifications that would make the model constructs more plausible. Secondly, several suggestions for potential new model constructs were presented. Finally, the interviewees provided useful feedback about the research methodology in general. This feedback illustrated that the experts’ general attitude towards the proposed approach was positive. It was recognized that adequate modelling of ATM operations requires techniques that capture the dynamics, variability and interactions of the relevant agents and processes well.
Arena of Hybrid Systems for analysis of Critical Observability in ATM operations
A compositional framework termed Arena of Hybrid Systems (AHS) has been developed and applied to critical observability in ATM operations. Critical observability is a structural property of a hybrid system, which expresses the possibility to detect whether the system state is in a set of critical states, which may represent unsafe, unallowed or non-nominal situations. If a hybrid system has this property, a hybrid observer can be constructed, which detects whether the hybrid system is in a critical state or not. The approach has been applied to a particular Terminal Manoeuvring Area (TMA) T1 operation, and conclusions have been reached on the following subjects.
Arena of Hybrid Systems (AHS). This compositional framework for hybrid systems has been shown to be effective in:
- Properly addressing heterogeneity of different actors - It appears that technical devices and procedures of human agents can be well modelled by finite state machines, and the dynamics of aircraft are better modelled by differential equations. The Hybrid Systems (HS) paradigm is general enough to capture the heterogeneous dynamics in the TMA T1 scenario.
- Properly capturing interaction among actors - The AHS paradigm makes it possible to capture the exchange of both “continuous signals”-type information (e.g. position and velocity of the aircraft) and “digital signals”-type information (e.g. signals shared between the Cockpit HMI and ATM systems) among the agents involved.
- Properly capturing evolution of agents both in nominal and non-nominal operating modes - Mathematical models could be developed on the basis of model constructs in the model library and for hazards in the hazard database.
Critical Observability. When one or more agents in the TMA T1 operation are in non-nominal operating modes, safety critical situations may arise. It is therefore important to detect non-nominal operating modes, so that automatic and/or manual recovery from these situations can be initiated. Critical observability has been shown to be effective in properly capturing which safety critical situations can be detected and which cannot.
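The idea of critical observability can be illustrated with a purely discrete analogue. The sketch below is not the AHS framework or the hybrid observer from the project: it assumes a finite state machine in which each mode emits one output symbol, tracks the set of modes consistent with the observed outputs, and checks that this set is never a mix of critical and non-critical modes. The mode names, output symbols and function names are hypothetical.

```python
from collections import deque


def belief_sets(initial, transitions, output):
    """Enumerate reachable observer states, i.e. sets of modes that are
    consistent with some observed sequence of output symbols."""
    def split_by_output(modes):
        groups = {}
        for m in modes:
            groups.setdefault(output[m], set()).add(m)
        return [frozenset(g) for g in groups.values()]

    seen = set()
    queue = deque(split_by_output(initial))
    while queue:
        belief = queue.popleft()
        if belief in seen:
            continue
        seen.add(belief)
        successors = set()
        for m in belief:
            successors |= transitions.get(m, set())
        queue.extend(split_by_output(successors))
    return seen


def is_critically_observable(initial, transitions, output, critical):
    """True if every reachable belief set is entirely critical or
    entirely non-critical, so an observer can always tell which."""
    return all(b <= critical or not (b & critical)
               for b in belief_sets(initial, transitions, output))


# Hypothetical agent with three operating modes.
transitions = {
    "nominal": {"nominal", "degraded"},
    "degraded": {"degraded", "failed"},
    "failed": {"failed"},
}
critical = {"degraded", "failed"}

# Case 1: the degraded mode looks the same as the nominal mode.
hidden = is_critically_observable(
    {"nominal"}, transitions,
    {"nominal": "ok", "degraded": "ok", "failed": "alarm"}, critical)

# Case 2: the degraded mode emits its own warning symbol.
visible = is_critically_observable(
    {"nominal"}, transitions,
    {"nominal": "ok", "degraded": "warn", "failed": "alarm"}, critical)

print(hidden, visible)
```

In the first case the observer cannot separate the nominal mode from the degraded mode, so the critical situation goes undetected; giving the degraded mode a distinguishable output restores the property.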
Critical Compositional Bisimulation. Analysis of critical observability of the TMA T1 operation is complicated by the large number of variables involved, and therefore complexity reduction techniques were needed. Central to this is the notion of critical compositional bisimulation, which clusters agents that can be considered equivalent in the scenario. This procedure partitions the set of agents into subsets, each composed of equivalent agents.
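How an equivalence of this kind yields a partition can be sketched with ordinary bisimulation on a small finite transition system. This is a stand-in technique, not the critical compositional bisimulation defined in the project: it uses naive partition refinement, starting from a labelling of states as safe or critical and splitting blocks until equivalent states can reach the same blocks in one step. All state names and labels are hypothetical.

```python
def bisimulation_classes(states, transitions, label):
    """Coarsest partition in which equivalent states share a label and
    can reach the same set of blocks in one step (naive refinement)."""
    # Start from the partition induced by the labels.
    block = {s: label[s] for s in states}
    n_blocks = len(set(block.values()))
    while True:
        ids = {}
        refined = {}
        for s in states:
            succ_blocks = frozenset(block[t] for t in transitions.get(s, ()))
            signature = (block[s], succ_blocks)
            # Give each distinct signature a small integer block id.
            refined[s] = ids.setdefault(signature, len(ids))
        block = refined
        if len(ids) == n_blocks:  # no block was split: partition is stable
            break
        n_blocks = len(ids)
    groups = {}
    for s in states:
        groups.setdefault(block[s], []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical modes of two aircraft agents and one controller agent.
states = ["a1_nominal", "a1_fault", "a2_nominal", "a2_fault", "atc_nominal"]
transitions = {
    "a1_nominal": {"a1_nominal", "a1_fault"},
    "a1_fault": {"a1_fault"},
    "a2_nominal": {"a2_nominal", "a2_fault"},
    "a2_fault": {"a2_fault"},
    "atc_nominal": {"atc_nominal"},
}
label = {s: ("critical" if s.endswith("fault") else "safe") for s in states}

classes = bisimulation_classes(states, transitions, label)
print(classes)
```

In this toy system the modes of the two aircraft fall into the same classes, so one aircraft model can stand in for both, while the controller mode stays in a class of its own because it cannot reach a critical mode.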
Applying the MAREA results for analysis of safety-focused resilience
The MAREA project identified a set of 25 model constructs, which complement an existing set of 13 model constructs in TOPAZ MA-DRM. The additional model constructs include a variety of human, environmental and organization related model constructs, e.g. Operator functional state, Trust, Situation awareness with complex beliefs, Bad weather, and Formal organisation. This additional set of model constructs entails a larger variety in psychological and organisational factors, which supports the analysis of resilience in complex sociotechnical systems. By including such additional model constructs in the agent-based models of MA-DRM based safety assessments, a larger set of hazards can be represented in a direct manner. This implies that the emergent effects of the interactions between the model constructs used in the agent-based model can be directly reflected in Monte Carlo simulation results.
Agent-based modelling and simulation has considerable advantages over traditional probabilistic risk assessment (PRA) and human reliability assessment (HRA) approaches. In particular, the broad set of model constructs identified in MAREA supports direct representation of a wide variety of hazards in the ATM sociotechnical system by agent-based modelling, whereas traditional PRA/HRA can at best represent such hazards indirectly, through error/failure probabilities and error producing condition factors. Due to the detailed agent-based modelling in MA-DRM, the behaviour of the interacting agents changes in response to encountered hazards/disturbances. Analysis of the implications of such changed behaviour in the agent-based model may thus provide insight into the resilience of the ATM sociotechnical system.
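The principle of estimating the effect of interacting agents by Monte Carlo simulation can be sketched with a toy model. The code below is not TOPAZ or MA-DRM: it assumes a single kind of disturbance, a pilot agent and a controller agent who each detect it with a fixed probability per time step, and an "event" when the disturbance is missed by both agents for `max_missed` consecutive steps. Every parameter value and name is a hypothetical placeholder.

```python
import random


def simulate_run(rng, steps, p_disturbance, p_pilot, p_controller,
                 max_missed=2):
    """Simulate one run; return True if a disturbance stays undetected
    by both agents for max_missed consecutive steps."""
    active = False
    missed = 0
    for _ in range(steps):
        if not active:
            # A disturbance may arise; detection starts on the next step.
            active = rng.random() < p_disturbance
            missed = 0
            continue
        pilot_detects = rng.random() < p_pilot
        controller_detects = rng.random() < p_controller
        if pilot_detects or controller_detects:
            active = False  # either agent resolves the disturbance
        else:
            missed += 1
            if missed >= max_missed:
                return True
    return False


def estimate_event_probability(n_runs, seed=0, **params):
    """Monte Carlo estimate: fraction of runs that end in an event."""
    rng = random.Random(seed)
    events = sum(simulate_run(rng, **params) for _ in range(n_runs))
    return events / n_runs


# Hypothetical parameter values, chosen only to make the sketch run.
estimate = estimate_event_probability(
    n_runs=2000, seed=42, steps=100,
    p_disturbance=0.02, p_pilot=0.6, p_controller=0.7)
print(f"estimated event probability: {estimate:.4f}")
```

Changing one agent's detection probability changes how often the other agent's detection matters, so the estimate reflects the interaction between the agents rather than a fixed per-agent error probability.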
A follow-up question is how the MAREA improved view of agent-based hazard modelling can be exploited effectively in agent-based safety risk assessment. Since a straightforward inclusion of all 25 complementary model constructs in the MA-DRM approach would lead to a further extension of the agent-based model of the ATM operation considered, a minimal modelling approach has been proposed. In this approach, for model constructs that have similar interaction based behaviour effects, the main one is included explicitly in the agent-based model, while the effects of the remaining model constructs are taken into account through bias and uncertainty assessment.
In conclusion, the MAREA project has demonstrated that a mathematical approach towards resilience engineering provides novel methods for prospective analysis of safety implications of resilience in ATM, which are complementary to already on-going resilience engineering developments. In the light of the shown practical feasibility of MA-DRM for safety assessment of air transport operations, we expect that the MAREA enhanced set of agent-based model constructs can further support the analysis of safety-relevant relations in the ATM sociotechnical system. In this way, resilience in ATM can be supported, thereby improving the ability of the ATM sociotechnical system to adjust to disturbances/hazards and sustaining safe operations. In future research we plan to apply the enhanced model set in detailed assessments of resilience in ATM.
Publications
Stroeve SH, Everdij MHC, Blom HAP. Studying hazards for resilience modelling in ATM. First SESAR Innovation Days, Toulouse, France, 29 November - 1 December 2011
Petricone A, Pola G, Di Benedetto MD, De Santis E. Safety criticality analysis of complex air traffic management systems via compositional bisimulation. 4th IFAC Conference on Analysis and Design of Hybrid Systems, Eindhoven, The Netherlands, 6-8 June 2012
Bosse T, Sharpanskykh A, Treur J, Blom HAP, Stroeve SH. Modelling of human performance-related hazards in ATM. Air Transport and Operations Symposium (ATOS) 2012, Delft, The Netherlands, 18-20 June 2012
Stroeve SH, Blom HAP. How well are human-related hazards captured by multi-agent dynamic risk modelling? In: Landry SJ, editor. Advances in human aspects of aviation. Boca Raton (FL), USA: CRC Press; 2012. p. 462-71
Bosse T, Sharpanskykh A, Treur J, Blom HAP, Stroeve SH. Agent-based modelling of hazards in ATM. Second SESAR Innovation Days, Braunschweig, Germany, 27-29 November 2012
Blom HAP, Stroeve SH, Bosse T. Modelling of potential hazards in agent-based safety risk analysis. Tenth USA/Europe ATM R&D Seminar, Chicago, USA, 10-13 June 2013
Bosse T, Blom HAP, Stroeve SH, Sharpanskykh A. An integrated multi-agent model for modelling hazards within air traffic management. The 2013 IEEE/WIC/ACM International Conference on Intelligent Agent Technology, Atlanta, USA, 2013
De Santis E, Di Benedetto MD, Pezzuti D, Pola G, Scarciallo L, Everdij M. Safety criticality analysis of air traffic management systems: A compositional bisimulation approach. Third SESAR Innovation Days, Stockholm, Sweden, 26-28 November 2013
Stroeve SH, Bosse T, Blom HAP, Sharpanskykh A, Everdij MHC. Agent-based modelling for analysis of resilience in ATM. Third SESAR Innovation Days, Stockholm, Sweden, 26-28 November 2013
Pezzuti D, Pola G, De Santis E, Di Benedetto MD. A critical bisimulation approach to safety criticality analysis of large-scale air traffic management systems. 52nd IEEE Conference on Decision and Control, Florence, Italy, 2013
Coordinator: National Aerospace Laboratory NLR
Partners: VU University Amsterdam, University of l'Aquila.