Risk analysis
Safety starts with the design
A generally established 'state of the art' has developed for the design, development and operation of safety control systems. The situation is different when it comes to IT security, where appropriate measures are often defined and implemented on an ad hoc basis. It is high time to change this.
The reciprocal relationships between safety and security are currently the subject of much debate. Obvious parallels between the two domains are just as recognizable as serious differences. The crux of the matter is that a generally accepted approach has not yet been developed and machine manufacturers and operators will find little guidance on this topic in the relevant standards.
When considering which security measures are suitable for protecting safety systems, it makes sense to bear in mind the nature of dangerous faults. In principle, safety distinguishes between stochastic errors and systematic errors. Mitigation measures for stochastic errors include, for example, redundancy and self-checking, while systematic errors are detected or avoided through verification and validation measures and the quality of the development process. The higher the required Safety Integrity Level (SIL), the more rigorous the measures. The exploitation of security gaps, on the other hand, requires a targeted approach by an attacker, who must have a certain motivation as well as certain skills and financial resources at their disposal.
The safety standards define specific procedures for determining the required Safety Integrity Level (SIL) in accordance with IEC 62061, or the Performance Level in accordance with ISO 13849, for an identified hazard. When providing proof of safety, it must be demonstrated that the system does not exceed a defined, SIL-dependent maximum tolerable statistical hazard rate.
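The quantitative proof described here can be pictured with a small check of a computed hazard rate against SIL-dependent limits. The following is a minimal sketch, assuming the bands for the probability of a dangerous failure per hour (PFH) as they are commonly tabulated for SIL 1 to 3 in high-demand operation; the names PFH_UPPER_LIMIT, meets_sil and achieved_sil are illustrative, and the applicable standard remains authoritative for limits and calculation method.

```python
# Upper PFH limits (dangerous failures per hour) per SIL.
# Assumption: commonly tabulated bands; verify against the applicable standard.
PFH_UPPER_LIMIT = {1: 1e-5, 2: 1e-6, 3: 1e-7}


def meets_sil(pfh, required_sil):
    """Return True if the computed PFH stays below the limit of the required SIL."""
    if pfh <= 0:
        raise ValueError("PFH must be positive")
    return pfh < PFH_UPPER_LIMIT[required_sil]


def achieved_sil(pfh):
    """Return the highest SIL whose limit the PFH satisfies, or 0 if none."""
    return max((sil for sil, limit in PFH_UPPER_LIMIT.items() if pfh < limit),
               default=0)
```

A call such as meets_sil(5e-8, 3) would then express the proof obligation for a single safety function.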
A statistical approach that describes the effectiveness of security measures quantitatively, analogous to the determination of failure rates in safety systems, does not appear to make sense, as there is still no generally recognized calibration of security risks to failure probabilities. Furthermore, statistical data on the probability of an attack is often not available. Such data would also be of little help, because past statistics are poorly suited to predicting the frequency and professionalism of future attacks. This is a significant difference from safety considerations: there, past field observations are used to infer future failure behavior, and possible system architectures for achieving the required safety goals are derived from this. Another important assumption in safety systems is that the failure behavior is constant over a certain period of time. The calculation methods specified in the relevant standards are only valid as long as this can be assumed.
Safe&secure - an integrated approach
Figure 1: Integrated development process for safety and security based on the ISA 62443-3-2 standard (working draft) with connection to the safety development process.
© Bachmann Electronic
The ISA/IEC 62443 family of standards defines a procedure for deriving specific security measures for an Industrial Automation and Control System (IACS) based on a risk analysis. The current status of these standards only considers safety as a potential risk area; connections to the safety life cycle are not made. The procedure proposed by Bachmann Electronic is shown in Figure 1.
The safety analysis
Prior to the security risk analysis, the manufacturer of the safety system carries out a safety-oriented Failure Mode and Effects Analysis (FMEA) (SA1). In this process, all possible faults that could impair the safety integrity are identified (SA2). Once all conceivable faults have been identified, appropriate safety measures are defined in order to limit the stochastic causes of the hazards identified (SA3). Systematic errors in the hardware or software development process must be excluded by means of appropriate verification and validation measures.
Step SA4 involves analyzing which of the identified errors could be caused by a potential attacker and which of these are not already covered by the measures defined in SA3. Both the manufacturer's intended use scenarios for the safety controller and the entire life cycle of the safety system must be included in the analysis. When it comes to the information security of safety control systems, it is also necessary to define the relevant protection objectives. From the perspective of personal and system safety, the integrity of data, authenticity, availability and temporal determinism are of primary importance. Protection goals such as confidentiality, on the other hand, are of secondary importance.
The result of this analysis is a list of all vulnerabilities that could jeopardize the integrity of functional safety if an attacker gains access to the system.
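The filtering performed in step SA4 can be sketched as a simple selection over the fault list from the FMEA. This is only an illustration: the Fault record, its two flags and the example entries are hypothetical and do not come from the procedure itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    name: str
    attacker_inducible: bool  # could a deliberate attack cause this fault?
    covered_by_sa3: bool      # do the safety measures from SA3 already handle it?


def security_relevant_faults(faults):
    """SA4: keep faults an attacker could cause that SA3 does not cover."""
    return [f for f in faults if f.attacker_inducible and not f.covered_by_sa3]


# Hypothetical FMEA entries, for illustration only.
faults = [
    Fault("bit flip in a safety telegram", True, True),
    Fault("modified safety parameter set", True, False),
    Fault("sensor wire break", False, True),
]
open_items = security_relevant_faults(faults)
```

The remaining entries in open_items correspond to the vulnerabilities that are handed over to the security analysis.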
The security analysis
In a security analysis, it is not enough to look at a single control system or production machine in isolation. Instead, the entire IT network of the system operator and the assets that depend on this network must always be included in the analysis. In the course of the analysis, this network is divided into zones with similar security requirements and IT protection measures are determined for each zone. The security analysis process is also shown in Figure 1.
The starting point is a high-level risk analysis (SE1) with the aim of determining the risk (RA) that is just acceptable to the system operator for specific attack scenarios. The safety protection objectives are included in the analysis as a risk area.
In the next step (SE2), the IT network is divided into the aforementioned zones with similar security requirements. The result of the high-level risk analysis is taken into account here. Furthermore, communication links, so-called conduits, are defined between the different zones. The preliminary version of the ISA 62443-3-2 standard sets out specific requirements for this division: among other things, not only should different zones be set up for business IT and system controls, but systems with safety requirements should also be grouped together in their own zones.
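The zone and conduit model lends itself to a small data structure. The sketch below assumes a hypothetical plant with three zones and checks that every communication link crossing a zone boundary is covered by a declared conduit; all names (Zone, Conduit, unauthorized_links, the assets) are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    name: str
    assets: set = field(default_factory=set)


@dataclass(frozen=True)
class Conduit:
    source: str  # zone name
    target: str  # zone name


def zone_of(asset, zones):
    """Return the name of the zone that contains the asset."""
    for zone in zones:
        if asset in zone.assets:
            return zone.name
    raise KeyError(asset)


def unauthorized_links(links, zones, conduits):
    """Return links that cross a zone boundary without a declared conduit."""
    allowed = {frozenset((c.source, c.target)) for c in conduits}
    bad = []
    for a, b in links:
        za, zb = zone_of(a, zones), zone_of(b, zones)
        if za != zb and frozenset((za, zb)) not in allowed:
            bad.append((a, b))
    return bad


# Hypothetical layout: business IT, system control and a separate safety zone.
zones = [
    Zone("business_it", {"erp"}),
    Zone("control", {"plc", "hmi"}),
    Zone("safety", {"safety_plc"}),
]
conduits = [Conduit("business_it", "control"), Conduit("control", "safety")]
links = [("erp", "plc"), ("plc", "safety_plc"), ("erp", "safety_plc")]
violations = unauthorized_links(links, zones, conduits)
```

In this layout, the direct link from the business IT zone into the safety zone is reported, because no conduit connects the two.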
In the detailed risk analysis (SE3), which is to be carried out for each zone with assets requiring protection, the same risk areas are to be considered as in step SE1. For this purpose, specific threat scenarios are considered in conjunction with known vulnerabilities. The impact and probability of a successful attack are qualitatively determined for each of these threat/vulnerability pairs and the resulting risk is then derived. For those zones in which safety controls are located, the results of step SA4 from the safety analysis are a fundamental part of the risk assessment.
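A qualitative assessment of threat/vulnerability pairs is often organized as a risk matrix. The following sketch assumes a three-step scale and class boundaries chosen purely for illustration; neither the scale nor the example pairs are prescribed by the procedure described here.

```python
from enum import IntEnum


class Rating(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def risk_class(impact, likelihood):
    """Combine impact and likelihood into a qualitative risk class.

    Assumption: product of the two ratings with illustrative thresholds.
    """
    score = int(impact) * int(likelihood)
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


# Hypothetical threat/vulnerability pairs for one zone.
pairs = [
    ("manipulated parameter download", "unauthenticated service", Rating.HIGH, Rating.MEDIUM),
    ("network flooding", "no rate limiting", Rating.LOW, Rating.HIGH),
    ("data readout", "unencrypted protocol", Rating.LOW, Rating.MEDIUM),
]
assessment = {threat: risk_class(impact, likelihood)
              for threat, _vuln, impact, likelihood in pairs}
```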
As part of the detailed risk analysis, assumptions relevant to security must be defined and taken into account. Such assumptions may, for example, relate to the physical security of the system. However, protective measures for cyber security are not yet included. The risk resulting from the analysis is therefore called unmitigated risk (Ru).
Next, a so-called 'security level target' (SL-T) is determined for the zone under consideration (SE4). The security level is to be understood as a measure of the capabilities and resources that an attacker has to muster in order to circumvent the specific protective measures. The higher the security level, the more protective measures have to be implemented and the more difficult it is for an attacker to circumvent them.
The SL-T is calculated from the Ru/RA quotient, which is also known as the Cyber Risk Reduction Factor (CRRF). Once the SL-T has been determined, specific security measures can be derived on the basis of the IEC 62443-3-3 standard.
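The quotient itself is simple arithmetic; how it maps to a security level is not spelled out here. The sketch below therefore computes the CRRF as Ru/RA and uses a placeholder threshold table (SL_T_THRESHOLDS) that is purely an assumption for illustration and must be replaced by the mapping agreed for the actual analysis.

```python
def crrf(unmitigated_risk, tolerable_risk):
    """Cyber Risk Reduction Factor: quotient of Ru and RA."""
    if tolerable_risk <= 0:
        raise ValueError("tolerable risk must be positive")
    return unmitigated_risk / tolerable_risk


# Assumption: hypothetical thresholds (upper CRRF bound, resulting SL-T).
SL_T_THRESHOLDS = [(1.0, 0), (2.0, 1), (4.0, 2), (8.0, 3)]
MAX_SL = 4


def sl_target(factor):
    """Map a CRRF to a target security level using the placeholder table."""
    for upper_bound, level in SL_T_THRESHOLDS:
        if factor <= upper_bound:
            return level
    return MAX_SL
```

With these placeholder thresholds, an unmitigated risk of 9 against a tolerable risk of 3 yields a CRRF of 3.0 and a target level of 2.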
Once the protective measures required by the standard have been derived in step SE4, the security measures that can actually be implemented are determined in step SE5. However, it is possible that the components used in the automation system do not support all the required security measures. The intersection of the required and actually supported security measures is called SL-C. Taking into account the security measures that can be implemented in the SL-C, the impact and probability of a successful attack for the identified threats and vulnerabilities are redetermined and the remaining residual risk (RR) is derived.
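The intersection described in step SE5 can be written directly as a set operation. The measure names below are hypothetical examples, not requirements quoted from the standard.

```python
def implementable_measures(required, supported):
    """SE5: measures that are both required and supported by the components."""
    return required & supported


def unsupported_measures(required, supported):
    """Required measures that the components cannot provide."""
    return required - supported


# Hypothetical measure names, for illustration only.
required = {"user authentication", "encrypted communication",
            "audit logging", "port hardening"}
supported = {"user authentication", "port hardening", "audit logging",
             "signed firmware"}

sl_c = implementable_measures(required, supported)
gaps = unsupported_measures(required, supported)
```

The set gaps shows which requirements have to be addressed by other means, for example by compensating measures or different components.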
If this residual risk is accepted, the process is completed and the results can be documented (SE6). If the residual risk is not acceptable, targeted security measures from higher security levels can be used, provided they are supported by the components (SE7). If necessary, other components should be used to increase the number of security measures that can be implemented.
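The decision at the end of the process can be summarized in a few lines. This is a minimal sketch of the branching described above; the function name, the numeric risk values and the return strings are illustrative.

```python
def next_step(residual_risk, tolerable_risk, higher_level_measures_supported):
    """Decide how to proceed after the residual risk (RR) has been determined."""
    if residual_risk <= tolerable_risk:
        return "SE6: document results"
    if higher_level_measures_supported:
        return "SE7: add measures from higher security levels"
    return "use other components, then repeat the assessment"
```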
Consistent implementation
In principle, each individual security measure can be implemented either as a security layer in the safety system itself or via a gateway function at the zone interface between operational management and safety. If security measures are integrated into the safety system, the system architect is often confronted with contradictory requirements. For example, cryptographic procedures require resources that are often not available at the communication endpoints of a safety controller. Also, the requirement for deterministic communication sometimes conflicts with the temporal behavior of cryptographic algorithms.
In practice, the life cycles of functional safety and information security differ significantly in how often software on running systems is updated. A well-defined procedure for installing security patches is part of every security life cycle. Safety software, with its generally much higher software quality, needs updates far less often, and each update is very costly because of the necessary certification steps. For this reason, it must be ensured that changes in the security layer can always be implemented without retroactive effects on the safety part of the control system, so that no recertification is necessary and the associated costs are kept to a minimum.
Four selectable security levels
Figure 2: The M1 automation system from Bachmann combines safety and security in an integrated approach.
© Bachmann Electronic
Figure 2 shows the M1 automation system from Bachmann, which consistently implements the approach discussed so far. The solution comprises a standard controller for implementing the control and operational management tasks, while safety-relevant tasks are implemented by the SLC284 safety controller together with its assigned input and output modules. Both controllers are networked via the backplane.
As an example, an information technology threat to safety integrity is used to illustrate the methodology presented in Figure 1. The data traffic between the safety controller and the associated input and output modules is safety-relevant and must therefore be safeguarded from a safety perspective in accordance with the requirements for safe communication in EN 50159. In order to make data from the safety controller available for operational management, safety data packets are routed in parallel to process data via the backplane. However, this makes them vulnerable from a security perspective, so they must be protected accordingly. For cost and performance reasons, cryptographic protection of the safety data packets between the safety controller and input/output modules is not expedient. For this reason, the gateway function of the M1 controller is used in step SE3 of the security analysis. From the perspective of the safety controller, the M1 standard controller thus serves as a security gateway on the one hand and as a router for safety data packets within the control system on the other.
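Why safety-oriented protection of a telegram does not automatically provide security can be shown with a toy frame format. The sketch below assumes a telegram with a sequence number and a CRC-32 check code, which is an illustration of typical safe-communication measures and not the protocol used on the M1 backplane. Random corruption is detected, but a deliberately forged frame with a recomputed check code is accepted.

```python
import struct
import zlib


def pack_frame(seq, payload):
    """Build a toy safety telegram: 16-bit sequence number, payload, CRC-32."""
    body = struct.pack(">H", seq) + payload
    return body + struct.pack(">I", zlib.crc32(body))


def unpack_frame(frame, expected_seq):
    """Check CRC and sequence number, then return the payload."""
    body = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch")
    (seq,) = struct.unpack(">H", body[:2])
    if seq != expected_seq:
        raise ValueError("unexpected sequence number")
    return body[2:]


genuine = pack_frame(7, b"\x01")                 # e.g. "guard door closed"
corrupted = genuine[:2] + b"\x00" + genuine[3:]  # random fault: payload byte changed
forged = pack_frame(7, b"\x00")                  # attacker recomputes the check code
```

Calling unpack_frame(corrupted, 7) raises an error, whereas unpack_frame(forged, 7) succeeds. An unkeyed check code guards against random faults, not against a targeted attacker, which is why additional protection at the zone boundary is needed.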
In the security architecture of the higher-level communication network, the standard controller serves as a security endpoint and implements state-of-the-art security protection mechanisms. Depending on the identified threat, four different security levels can be selected by the user. In level 4, for example, only the most essential ports are open and communication with the controller via Ethernet is only possible using cryptographic protocols. An access management system has also been implemented, which allows a fine-grained definition of access rights not only to the controller itself, but also to each individual process variable.
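The two mechanisms mentioned here, a restrictive admission rule at the highest security level and access rights per process variable, can be pictured as follows. This is a hypothetical sketch and not the M1 configuration interface: the port number, roles, variable names and function names are assumptions, and only the behavior of level 4 is taken from the description above.

```python
# Assumption: a single "essential" port, chosen for illustration only.
ESSENTIAL_PORTS = {443}


def accept_connection(level, port, encrypted):
    """Admission rule for security level 4 as described in the text."""
    if level != 4:
        # Levels 1 to 3 are not detailed in the article, so they are left open.
        raise NotImplementedError("only level 4 is sketched")
    return port in ESSENTIAL_PORTS and encrypted


# Hypothetical access rights per role and process variable.
ACCESS_RIGHTS = {
    ("operator", "line_speed"): {"read", "write"},
    ("operator", "estop_state"): {"read"},
    ("guest", "line_speed"): {"read"},
}


def may_access(role, variable, action):
    """Return True if the role holds the requested right on the variable."""
    return action in ACCESS_RIGHTS.get((role, variable), set())
```

Rights that are not listed are denied by default, so a guest has no access at all to estop_state in this example.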
In short: with this approach, the safety controller is fully integrated into the operational management without combining functional safety and information security measures. This means that security updates can be carried out at any time without the need to recertify the safety part.
Authors:
Christoph Scherrer is Product Manager for Safety & Security at Bachmann Electronic.
Bernd Süßmilch is head of the test department for development tools and runtime environment at Bachmann Electronic.















