Risk determination

Holger Laible | Günter Herkommer

The vampire effect of quantification

How to determine the risks of products and systems is a recurring subject of controversy. But which approaches actually make sense in the context of safety and security? And what approach does the new framework for safety and security - IEC TR 63069:2019 - recommend?

© Siemens

Engineers are said to want to quantify the world and to assign an exact numerical probability to every event. In functional safety, many considerations center on random dangerous failures and their calculated rates, which are treated as a supposedly exact measure of risk reduction - among other things by proving compliance with Safety Integrity Level (SIL) or Performance Level (PL) limit values with the help of common calculation tools. This can quickly shift the focus away from the actual goal and lead to a phenomenon known in the advertising industry as the 'vampire effect'. With this effect - also called the attention-absorbing effect - the customer remembers a funny scene or person (the vampire), but no longer the product being advertised.

By analogy, in safety engineering, successfully proving the failure rate with the relevant software packages - in other words, ultimately quantifying random dangerous failures - attracts attention in much the same way as the vampire. Systematic faults that can be traced back to human behavior - for example, incorrect programming of safety software or failure to observe important technical boundary conditions of safety systems - unjustifiably lose importance as a result.

With this in mind, readers are invited to recall known accident reports and to assess whether the causes lay in incorrect statistics or in systematic errors and misjudgements. One of the biggest incidents in recent history, Fukushima, could certainly not have been prevented by a statistically improved design of the emergency power supply. Rather, it was probably the systematic boundary conditions of the plant design that led to the actual catastrophe.

Of course, there are considerations in risk assessments that can be translated into figures. Traditional mechanical and plant engineering, for example, is good at considering the length of time spent in the danger zone. However, new questions arise when systems also have to cope with less controlled environments or an application is so flexible that previously fixed parameters change dynamically.

What all the number games practiced in functional safety have in common is that they rely on a known environment and physically determinable relationships, but also on assumptions or approximations that remain stable over time. Without these assumptions, it would not be possible to state the statistical probability of future dangerous failures - for example in the form of a PFH value (average frequency of a dangerous failure per hour). Attempts to quantify the probability of the Fukushima example mentioned above might even succeed; however, they would obscure the fact that the plant design was systematically unfavorable. In short, experts agree that the most frequent source of system failures - and not only in functional safety! - lies in the specification and in an insufficient understanding of system behavior (timely hazard detection), not in individual random failures, which are in any case also subjected to systematic analysis (FMEA and FMEDA).
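
To illustrate how strongly such a figure depends on its assumptions, the following minimal Python sketch maps a PFH value to the highest SIL whose quantitative target it meets, using the high-demand/continuous-mode limits of IEC 61508. The function names, the single-channel simplification (PFH taken as equal to a constant rate of dangerous undetected failures) and the example failure rates are assumptions made purely for illustration; they are not taken from the article or from any calculation tool.

```python
from typing import Optional

# Upper PFH limits (dangerous failures per hour) for high-demand or
# continuous mode according to IEC 61508, ordered from SIL 4 down to SIL 1.
PFH_UPPER_LIMITS = [
    (4, 1e-8),
    (3, 1e-7),
    (2, 1e-6),
    (1, 1e-5),
]


def pfh_single_channel(lambda_du_per_hour: float) -> float:
    """Assumption: one channel with a constant failure rate, so that the
    PFH is approximated by the dangerous undetected failure rate."""
    return lambda_du_per_hour


def highest_sil_for_pfh(pfh: float) -> Optional[int]:
    """Return the highest SIL whose quantitative PFH target is met.

    This covers only the random-failure figure. It says nothing about
    systematic faults in specification, design or operation.
    """
    for sil, upper_limit in PFH_UPPER_LIMITS:
        if pfh < upper_limit:
            return sil
    return None


# Hypothetical failure rate, valid only for the assumed environment.
assumed_rate = 6e-8
# The same component if the stable-environment assumption no longer holds
# and the rate doubles (again a purely illustrative figure).
changed_rate = 2 * assumed_rate

print(highest_sil_for_pfh(pfh_single_channel(assumed_rate)))
print(highest_sil_for_pfh(pfh_single_channel(changed_rate)))
```

In this sketch, doubling the assumed failure rate is enough to move the result from the SIL 3 band to the SIL 2 band, while a systematically unfavorable design would leave the number entirely unchanged.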

Figure 1: The relationship between the risk worlds of safety / security

© Siemens

Approaches that attempt to quantify the risk of attacks - whether intentional or unintentional - in information security (IT or cyber security) therefore border on the absurd. While estimates for component faults can be based on documented failure figures from the past, this is problematic in information security. In contrast to observable component faults in safety, successful attacks in IT security can remain undetected for a long time, which calls into question any concrete statement or forecast about the information security of a system. Ultimately, the risk of an attack depends solely on the attacker's motivation to target a system. On the one hand, very weakly protected systems can remain sufficiently secure for a long time as long as nobody is interested in attacking them. On the other hand, strongly protected systems - such as our electronic payment system - can be very attractive to attackers and are presumably vulnerable to attack.

It should also be noted that the so-called 'human factor' already accounts for the majority of risks in safety engineering. These systematic errors cannot be countered by quantification, but only by technical expertise and methods suited to addressing known human weaknesses. In IT security, intelligent, manipulative or even criminal behavior must also be taken into account. This aspect is compounded by rapid technical development in computing power and system complexity.

The diagram in Figure 1 compares the risk of a system in terms of safety and IT security and illustrates the following aspects:

  1. Risk assessments for safety and IT security are orthogonal, i.e. the risks and protective measures of one domain cannot generally be related to those of the other.
  2. The technical protective measures for safety make up the main part of risk reduction and can be determined relatively well for system operation today.
  3. In IT security, the technical measures account for only a small, hard-to-determine share that also changes over time. The main problems are the gray areas, which are evident both in the determination of risk and in the quality of the protective measures. The human factor is decisive in all phases - but, in contrast to functional safety, particularly in the application phase of the system.
  4. The special case of foreseeable misuse applies to a very small area of risk reduction in safety and should not be understood as a harmful attack on the system.

Figure 2: The Simatic controller series is developed with integrated security in accordance with IEC 62443-4-1 across all phases of the product life cycle. The safety version also meets the requirements up to SIL 3 in accordance with IEC 61508.

© Siemens

The aspects described above, and in particular the fundamental question of whether security belongs thematically to safety or safety to security, were and are the subject of highly controversial technical discussions in the relevant working groups of the standardization committees. The knowledge gained and the consensus reached resulted in the statements and proposals of IEC TR 63069 (Framework for functional safety and security). This 'Technical Report' is positioned, independently of specific industries, at the level of the IEC 61508 series of standards (basic standard for functional safety) and of IEC 62443, currently the most important series of standards for cybersecurity. Specifically, IEC TR 63069 recommends that the risk assessments for the two domains of safety and security be carried out separately, as they follow fundamentally different approaches. It should be noted that the safety-relevant effects of the IT security threat situation are already addressed in the security risk assessment; only at this point is an efficient assessment of all threats to the functional safety of a system from cyber attacks possible at all.

It would of course be conceivable to propose a common risk approach. However, such an approach would fail at the outset because of the different assessment criteria and the weighting of security relative to safety. Ultimately, the decisive factor is the realization that safety-related statements must rest on the assumption that a system is effectively and systematically protected against adverse effects of IT security attacks. This is also explicitly stated in the first guiding principle of IEC TR 63069. This stipulation may be a demanding requirement for the IT security measures of a system and will not always be achievable in practice, especially if these measures are understood to be purely preventative. However, there is no practically reasonable alternative to this approach. Instead, the dynamic nature of IT security demands greater, permanent attention and responsiveness throughout the entire life cycle.

This results in a significantly increased need to regularly scrutinize information security risk analyses and to rely on a corporate culture and functioning processes that address the 'human factor'. Particularly crucial, however, are measures that focus on the plants and systems in operation and the planning of reactive measures in the event of an incident. By contrast, it makes no sense to create new quantification efforts or expand existing ones and thus reinforce the vampire effect - with negative consequences for the necessary systematic analyses and for effective risk-minimizing measures.

Author:
Holger Laible is a Senior Safety Expert in Simatic Product Management at Siemens and a member of national and international standardization committees.
