Safety starts with the design

Risk analysis

Christoph Scherrer, Bernd Süßmilch | Günter Herkommer, 29.06.2016, 16:05

A generally established 'state of the art' has developed for the design, development and operation of safety control systems. The situation is different when it comes to IT security, where appropriate measures are often defined and implemented on an ad hoc basis. It is high time to change this.

The reciprocal relationships between safety and security are currently the subject of much debate. Obvious parallels between the two domains are just as recognizable as serious differences. The crux of the matter is that a generally accepted approach has not yet been developed and machine manufacturers and operators will find little guidance on this topic in the relevant standards.

When considering which security measures are suitable for protecting safety systems, it makes sense to bear in mind the nature of dangerous faults. In principle, safety distinguishes between stochastic errors and systematic errors. Mitigation measures for stochastic errors include, for example, redundancy and self-checking, while systematic errors are detected or avoided through verification and validation measures and the quality of the development process. The higher the required Safety Integrity Level (SIL), the more rigorous the measures. The exploitation of security gaps, on the other hand, requires a targeted approach by an attacker, who must have a certain motivation as well as certain skills and financial resources at their disposal.

The relevant safety standards define specific procedures for determining, for an identified hazard, the required Safety Integrity Level (SIL) in accordance with IEC 62061 or the Performance Level in accordance with ISO 13849. When providing proof of safety, it must be demonstrated that the system does not exceed a defined, SIL-dependent maximum tolerable statistical hazard rate.
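The proof described above can be pictured as a simple limit check. The following is a minimal sketch, assuming the hazard rate is expressed as a probability of dangerous failure per hour (PFH) and using the upper band limits tabulated for high-demand operation in IEC 61508 and IEC 62061; the function name `meets_sil` is hypothetical.

```python
# Upper PFH limits (dangerous failures per hour) for each SIL in
# high-demand or continuous mode, as tabulated in IEC 61508 / IEC 62061.
PFH_UPPER_LIMIT = {1: 1e-5, 2: 1e-6, 3: 1e-7}


def meets_sil(pfh_per_hour: float, required_sil: int) -> bool:
    """Return True if the calculated PFH stays below the limit for the SIL.

    Hypothetical helper for illustration; a real proof of safety also
    covers architectural constraints and systematic integrity.
    """
    if required_sil not in PFH_UPPER_LIMIT:
        raise ValueError(f"unsupported SIL: {required_sil}")
    if pfh_per_hour < 0:
        raise ValueError("PFH must be non-negative")
    return pfh_per_hour < PFH_UPPER_LIMIT[required_sil]
```

For instance, a safety function with a calculated PFH of 4e-8 per hour would pass the check for SIL 3, while one with 4e-7 per hour would only satisfy SIL 2.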

A statistical approach that quantifies the effectiveness of security measures, analogous to the determination of failure rates in safety systems, does not appear to make sense, because there is still no generally recognized mapping of security risks to failure probabilities. Furthermore, statistical data on the probability of an attack is often unavailable. Such data would also be of little help, since past statistics are poorly suited to predicting the frequency and sophistication of future attacks. This is a significant difference from safety considerations: in safety, past field observations are used to infer future failure behavior, and possible system architectures for achieving the required safety goals are derived from them. Another important assumption in safety systems is that the failure behavior remains constant over a certain period of time. The calculation methods specified in the relevant standards are valid only as long as this assumption holds.

Safe&secure - an integrated approach

Figure 1: Integrated development process for safety and security based on the ISA 62443-3-2 standard (working draft) with connection to the safety development process.

The ISA/IEC 62443 family of standards defines a procedure for deriving specific security measures for an Industrial Automation and Control System (IACS) based on a risk analysis. The current status of these standards only considers safety as a potential risk area; connections to the safety life cycle are not made. The procedure proposed by Bachmann Electronic is shown in Figure 1.

The safety analysis

Prior to the security risk analysis, the manufacturer of the safety system carries out a safety-oriented Failure Mode and Effects Analysis (FMEA) (SA1). In this process, all possible faults that could impair the safety integrity are identified (SA2). Once all conceivable faults have been identified, appropriate safety measures are defined in order to limit the stochastic causes of the hazards identified (SA3). Systematic errors in the hardware or software development process must be excluded by means of appropriate verification and validation measures.

Step SA4 involves analyzing which of the identified errors could be caused by a potential attacker and which of these are not already covered by the measures defined in SA3. Both the manufacturer's intended use scenarios for the safety controller and the entire life cycle of the safety system must be included in the analysis. When it comes to the information security of safety control systems, it is also necessary to define the relevant protection objectives. From the perspective of personal and system safety, the integrity of data, authenticity, availability and temporal determinism are of primary importance. Protection goals such as confidentiality, on the other hand, are of secondary importance.

The result of this analysis is a list of all possible vulnerabilities that could jeopardize the integrity of functional safety if an attacker gains access to the system.

The security analysis

In a security analysis, it is not enough to look at a single control system or production machine in isolation. Instead, the entire IT network of the system operator and the assets that depend on this network must always be included in the analysis. In the course of the analysis, this network is divided into zones with similar security requirements and IT protection measures are determined for each zone. The security analysis process is also shown in Figure 1.

The starting point is a high-level risk analysis (SE1) with the aim of determining the risk (RA) that is just acceptable to the system operator for specific attack scenarios. The safety protection objectives are included in the analysis as a risk area.

In the next step (SE2), the IT network is divided into the aforementioned zones with similar security requirements. The result of the high-level risk analysis is taken into account here. Furthermore, communication links, so-called conduits, are defined between the different zones. The preliminary version of the ISA 62443-3-2 standard sets out specific requirements for this division: among other things, not only should different zones be set up for business IT and system controls, but systems with safety requirements should also be grouped together in their own zones.
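The zone and conduit model from step SE2 can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names `Zone` and `Conduit`, the helper functions, and all asset names are hypothetical and are not taken from the standard or from any product.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """A network zone grouping assets with similar security requirements."""
    name: str
    assets: set[str] = field(default_factory=set)


@dataclass(frozen=True)
class Conduit:
    """A communication link between two zones, identified by zone name."""
    source: str
    target: str


def zones_mixing_safety(zones: list[Zone], safety_assets: set[str]) -> list[str]:
    """Return names of zones that hold both safety and non-safety assets."""
    mixed = []
    for zone in zones:
        has_safety = bool(zone.assets & safety_assets)
        has_other = bool(zone.assets - safety_assets)
        if has_safety and has_other:
            mixed.append(zone.name)
    return mixed


def dangling_conduits(zones: list[Zone], conduits: list[Conduit]) -> list[Conduit]:
    """Return conduits whose endpoints do not refer to a defined zone."""
    names = {zone.name for zone in zones}
    return [c for c in conduits if c.source not in names or c.target not in names]
```

A zone reported by `zones_mixing_safety` would be a candidate for splitting, so that systems with safety requirements end up grouped in zones of their own.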

In the detailed risk analysis (SE3), which is to be carried out for each zone with assets requiring protection, the same risk areas are to be considered as in step SE1. For this purpose, specific threat scenarios are considered in conjunction with known vulnerabilities. The impact and probability of a successful attack are qualitatively determined for each of these threat/vulnerability pairs and the resulting risk is then derived. For those zones in which safety controls are located, the results of step SA4 from the safety analysis are a fundamental part of the risk assessment.
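One common way to derive a qualitative risk from impact and probability is a risk matrix. The sketch below assumes a three-step scale and a simple 3x3 matrix built from the product of both ratings, and it takes the worst pair as the risk of the zone; none of these choices comes from the article or the standard, and all names are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def qualitative_risk(impact: Level, likelihood: Level) -> Level:
    """Combine impact and likelihood into a risk level (assumed 3x3 matrix)."""
    score = int(impact) * int(likelihood)  # possible values: 1, 2, 3, 4, 6, 9
    if score >= 6:
        return Level.HIGH
    if score >= 3:
        return Level.MEDIUM
    return Level.LOW


def zone_risk(pairs: dict[tuple[str, str], tuple[Level, Level]]) -> Level:
    """Risk of a zone, assumed to be the worst threat/vulnerability pair."""
    if not pairs:
        raise ValueError("at least one threat/vulnerability pair is required")
    return max(
        qualitative_risk(impact, likelihood)
        for impact, likelihood in pairs.values()
    )
```

Each key of `pairs` is a (threat, vulnerability) tuple and each value holds the assessed impact and likelihood. For zones containing safety controls, the pairs would include the attacker-induced faults identified in step SA4.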

As part of the detailed risk analysis, assumptions relevant to security must be defined and taken into account. Such assumptions may, for example, relate to the physical security of the system. However, protective measures for cyber security are not yet included. The risk resulting from the analysis is therefore called unmitigated risk (Ru).

Next, a so-called 'security level target' (SL-T) is determined for the zone under consideration (SE4). The security level is to be understood as a measure of the capabilities and resources that an attacker has to muster in order to circumvent the specific protective measures. The higher the security level, the more protective measures have to be implemented and the more difficult it is for an attacker to circumvent them.

The SL-T is calculated from the Ru/RA quotient, which is also known as the Cyber Risk Reduction Factor (CRRF). Once the SL-T has been determined, specific security measures can be derived on the basis of the IEC 62443-3-3 standard.
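The quotient can be illustrated with a few lines of code. This sketch assumes that both risks have been expressed on a common numeric scale; the thresholds that map the CRRF to a security level are purely illustrative placeholders and are not taken from the ISA/IEC 62443 documents, so they would have to be replaced by the operator's own calibration.

```python
from bisect import bisect_left

# ILLUSTRATIVE thresholds only (assumption, not from any standard):
# a CRRF above the n-th threshold calls for at least security level n.
EXAMPLE_THRESHOLDS = (1.0, 10.0, 100.0, 1000.0)


def crrf(unmitigated_risk: float, tolerable_risk: float) -> float:
    """Cyber Risk Reduction Factor: unmitigated risk Ru over tolerable risk RA."""
    if tolerable_risk <= 0:
        raise ValueError("tolerable risk must be positive")
    if unmitigated_risk < 0:
        raise ValueError("unmitigated risk must be non-negative")
    return unmitigated_risk / tolerable_risk


def security_level_target(factor: float, thresholds=EXAMPLE_THRESHOLDS) -> int:
    """Map a CRRF to an SL-T by counting the thresholds strictly below it."""
    return bisect_left(thresholds, factor)
```

With these placeholder thresholds, a CRRF of 50 yields an SL-T of 2, while a CRRF of 1 or less (the unmitigated risk is already tolerable) yields 0.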

Once the protective measures required by the standard have been derived in step SE4, step SE5 determines which security measures can actually be implemented, because the components used in the automation system may not support all the required measures. The intersection of the required and the actually supported security measures is called SL-C. Taking the security measures contained in the SL-C into account, the impact and probability of a successful attack are redetermined for the identified threats and vulnerabilities, and the remaining residual risk (RR) is derived.
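Because the SL-C is described as an intersection, step SE5 maps naturally onto set operations. The following is a minimal sketch; the function names and the measure identifiers used in the usage note are hypothetical and do not correspond to the requirement identifiers of any standard.

```python
def achievable_measures(required: set[str], supported: set[str]) -> set[str]:
    """SL-C as described in the text: required measures the components support."""
    return required & supported


def unsupported_measures(required: set[str], supported: set[str]) -> set[str]:
    """Required measures that the components cannot implement."""
    return required - supported
```

For example, with `required = {"user_authentication", "encrypted_transport", "audit_log"}` and `supported = {"user_authentication", "audit_log", "port_filtering"}`, the achievable set contains the authentication and audit log measures, and `unsupported_measures` reports `encrypted_transport` as the gap to be considered when the residual risk is reassessed.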

If this residual risk is accepted, the process is completed and the results can be documented (SE6). If the residual risk is not acceptable, targeted security measures from higher security levels can be used, provided they are supported by the components (SE7). If necessary, other components should be used to increase the number of security measures that can be implemented.

Consistent implementation

In principle, each individual security measure can be implemented either as a security layer in the safety system itself or via a gateway function at the zone interface between operational management and safety. If security measures are integrated into the safety system, the system architect is often confronted with contradictory requirements. For example, cryptographic procedures require resources that are often not available at the communication endpoints of a safety controller. Also, the requirement for deterministic communication sometimes conflicts with the temporal behavior of cryptographic algorithms.

In practice, the life cycles of functional safety and information security differ significantly in how often software on running systems is updated. While a well-defined procedure for installing security patches is part of every security life cycle, safety software, with its generally much higher software quality, requires far less frequent updates, which are also very costly because of the necessary certification steps. For this reason, it must be ensured that changes in the security layer can always be implemented without affecting the safety part of the control system, so that no recertification is necessary and the associated costs are kept to a minimum.

Four selectable security levels

Figure 2: The M1 automation system from Bachmann combines safety and security in an integrated approach.

Figure 2 shows the M1 automation system from Bachmann, which consistently implements the approach discussed so far. The solution comprises a standard controller for implementing the control and operational management tasks, while safety-relevant tasks are implemented by the SLC284 safety controller together with its assigned input and output modules. Both controllers are networked via the backplane.

As an example, an information technology threat to safety integrity is used to illustrate the methodology presented in Figure 1. The data traffic between the safety controller and the associated input and output modules is safety-relevant and must therefore be safeguarded from a safety perspective in accordance with the requirements for safe communication in EN 50159. In order to make data from the safety controller available for operational management, safety data packets are routed via the backplane in parallel with process data. However, this makes them vulnerable from a security perspective, and they must be protected accordingly. For cost and performance reasons, cryptographic protection of the safety data packets between the safety controller and the input/output modules is not expedient. For this reason, the gateway function of the M1 controller is used in step SE3 of the security analysis. From the perspective of the safety controller, the M1 standard controller thus serves both as a security gateway and as a router for safety data packets within the control system.

In the security architecture of the higher-level communication network, the standard controller serves as a security endpoint and implements state-of-the-art security protection mechanisms. Depending on the identified threat, four different security levels can be selected by the user. In level 4, for example, only the most essential ports are open and communication with the controller via Ethernet is only possible using cryptographic protocols. An access management system has also been implemented, which allows a fine-grained definition of access rights not only to the controller itself, but also to each individual process variable.

In short: with this approach, the safety controller is fully integrated into the operational management without combining functional safety and information security measures. This means that security updates can be carried out at any time without the need to recertify the safety part.

Authors:
Christoph Scherrer is Product Manager for Safety & Security at Bachmann Electronic;
Bernd Süßmilch is head of the test department for development tools and runtime environment at Bachmann Electronic.
