Machine Learning

Karl Leidl, Andreas Grzemba | Günter Herkommer

How AI can improve information security

The protective measures used in traditional IT in terms of information security are generally not suitable for OT environments. Machine learning (ML)-based intrusion detection systems (IDS) provide a remedy.

© Fotolia, Elnur

Increasing networking and intelligent systems are the driving forces behind the Industrial Internet of Things (IIoT) and Industry 4.0. The high degree of networking of IIoT devices makes it possible to collect, process and evaluate more and more data. This data forms the basis for IIoT applications such as smart manufacturing. Yet IT security is still badly neglected when new systems for smart production are planned or existing ones are modernized, and the attack surface keeps growing as a result.

Recent incidents show how serious the consequences can be. Well-known industrial companies such as the machine manufacturer Krauss Maffei, the aluminum producer Norsk Hydro and the construction group Porr have all been victims of cyber attacks. Control components that are connected to the internet but not sufficiently secured often serve as an entry point into the company, or an employee's remote maintenance access is misused to gain initial access to the network. These are just two of the ways an attacker can infiltrate a company network and thereby endanger other systems in it. Importantly, hardening measures familiar from classic IT cannot simply be transferred to OT systems.

In order to detect security incidents as quickly as possible, measures are required that go beyond purely signature-based procedures. With rule-based intrusion detection systems, security incidents can only be detected if the attack vectors are already known. In addition, the network structure and the network traffic within it must be known. It must also be ensured that the rules are always up to date. This is particularly difficult in the OT area, as there are usually only short maintenance windows and no fixed times for updates and patches.

Hierarchical IDS architecture: Decentralized sensors allow data to be recorded and evaluated close to the process, which saves bandwidth on the one hand and enables faster reactions on the other.

© TH Deggendorf

Many recent publications investigate machine learning algorithms for anomaly detection using complex multi-stage procedures that usually run centrally and chain several algorithms together to achieve better results. This has two drawbacks: chaining multiple algorithms makes the computation more expensive than with single-stage methods, and all relevant data must be sent to the central instance. Such approaches are therefore unsuitable for comprehensive detection of security incidents in domains dominated by resource-constrained systems, such as industrial networks.

A better solution is a distributed intrusion detection system (IDS) architecture, which can be used to quickly detect attacks on industrial networks and generate corresponding warnings. Such an IDS has been developed at the ProtectIT Institute at Deggendorf Institute of Technology. The core concept is that sensors - implemented either as (cost-effective) embedded systems or as agents on the industrial components themselves - collect data decentrally and pre-process it. Particular attention is paid to efficient feature pre-processing. The extracted features are then compressed and sent to more powerful nodes that have enough capacity to compute anomaly detection models from the data. These models are distributed back to the sensors, which can then efficiently detect incidents on site, close to the process, and raise suitable alarms. The main features of the IDS architecture are explained in more detail below.
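The hand-off from sensor to modeling node can be illustrated with a minimal sketch. It assumes JSON serialization and zlib compression, neither of which is specified in the article; the function names, the sensor ID and the record fields are hypothetical.

```python
import json
import zlib

def pack_features(sensor_id, records):
    """Sensor side: serialise feature records and compress them for transport."""
    payload = json.dumps(
        {"sensor": sensor_id, "records": records}, separators=(",", ":")
    ).encode("utf-8")
    return zlib.compress(payload, 9)  # 9 = strongest compression level

def unpack_features(blob):
    """Node side: restore the records sent by a sensor."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Hypothetical feature records; field names and values are illustrative only.
records = [{"src": "10.0.0.5", "dst": "10.0.0.9", "func": 3, "len": 12}] * 200

blob = pack_features("sensor-01", records)
restored = unpack_features(blob)
raw_size = len(json.dumps(records).encode("utf-8"))
print(f"{raw_size} bytes raw, {len(blob)} bytes compressed")
```

Because deterministic industrial traffic produces highly repetitive feature records, even a general-purpose compressor should reduce the volume noticeably; a real deployment would also need authentication and transport security, which this sketch leaves out.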

ML for the detection of security incidents

Feature generation from the data: The pre-processing of data and selection of features plays a central role in forming suitable models for detecting security incidents.

© TH Deggendorf

Thanks to a distributed IDS architecture in combination with unsupervised machine learning algorithms, the system can also quickly detect previously unknown attacks. This is because unsupervised machine learning does not require the data to carry any labels (normal or abnormal). One example is the Isolation Forest algorithm, which belongs to the group of outlier detection methods. It can detect outliers without knowing the exact structure of the data beforehand. To do so, it tries to isolate each data point from the rest by repeatedly partitioning the data at random. The main assumption is that data points that differ significantly from the others (anomalies) are easier to separate, that is, they need fewer partitions, than normal data. Only small subsets of the data are required to generate a model, for example a recording of network traffic. Each of these subsets is used to build a binary tree (iTree). Several of these binary trees together form the model (iForest) that is used for classification or anomaly detection.
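The isolation idea can be sketched in a few lines of pure Python. This is a simplified illustration of the published Isolation Forest algorithm, not the implementation used in the IDS presented here; the function names, the tree and sample counts and the synthetic two-feature "traffic" are all assumptions. It expects at least two training points.

```python
import math
import random

def c(n):
    """Average path length of an unsuccessful search in a binary tree of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def build_itree(points, depth, max_depth, rng):
    """Split recursively on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_itree(left, depth + 1, max_depth, rng),
            "right": build_itree(right, depth + 1, max_depth, rng)}

def path_length(tree, x, depth=0):
    """Number of splits needed to isolate x, corrected for unsplit leaves."""
    if "size" in tree:
        return depth + c(tree["size"])
    branch = "left" if x[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)

def fit_iforest(data, n_trees=100, sample_size=64, seed=0):
    """Build the iForest: each iTree is grown from a small random subset."""
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(size))
    trees = [build_itree(rng.sample(data, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size

def anomaly_score(forest, x):
    """Score in (0, 1); short average paths give scores closer to 1."""
    trees, size = forest
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c(size))

# Synthetic "normal" traffic: (packet length, inter-arrival time in ms).
data_rng = random.Random(42)
normal = [(data_rng.gauss(60, 3), data_rng.gauss(10, 1)) for _ in range(500)]

forest = fit_iforest(normal)
typical = anomaly_score(forest, (60.0, 10.0))
outlier = anomaly_score(forest, (1500.0, 0.1))
print(f"typical: {typical:.2f}, outlier: {outlier:.2f}")
```

Training needs no labels, and scoring a point only requires walking down each tree, which is why the method suits resource-constrained sensors. A threshold on the score then decides whether an alarm is raised; where to set it depends on the deployment.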

In the architecture presented, the machine learning process is divided into two phases. In the first, the training phase, models are formed from existing data. To do this, sensors collect relevant network data. The placement of the sensors is an important basis for the effective monitoring of a network. The solution presented offers three options for this:

  1. As a TAP device in front of the respective network components or segments
  2. On a mirror port of a managed switch
  3. As an agent on the industrial components themselves (e.g. PLC, HMI)

Relevant features can be extracted through appropriate pre-processing. In addition to common properties such as MAC and IP addresses, these features include protocol-specific information (Modbus/TCP, Profinet, etc.). As a result, the determinism normally prevalent in industrial processes is also learned, which results in an improved detection rate and also reduces the number of false alarms. The efficiency of the algorithm depends heavily on the features used. The best detection rates are not necessarily achieved when all available features are used. It is important to incorporate appropriate knowledge of network protocols or Industrial Control Systems (ICS) into the design. This means that comparatively good results can be achieved during operation, even with lightweight algorithms such as Isolation Forest.
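As an illustration of protocol-specific pre-processing, the following sketch parses the header of a Modbus/TCP frame with Python's `struct` module. The header layout (7-byte MBAP header followed by the function code) comes from the Modbus specification; which fields end up as features is an assumption here, and the function name and dictionary keys are hypothetical.

```python
import struct

def modbus_features(tcp_payload: bytes):
    """Extract a few candidate features from a Modbus/TCP frame, or None."""
    if len(tcp_payload) < 8:
        return None  # too short for MBAP header plus function code
    # Big-endian: transaction id, protocol id, length, unit id, function code
    transaction_id, protocol_id, length, unit_id, function_code = struct.unpack(
        ">HHHBB", tcp_payload[:8]
    )
    if protocol_id != 0:
        return None  # Modbus always uses protocol identifier 0
    return {
        "unit_id": unit_id,
        "function_code": function_code & 0x7F,
        "is_exception": bool(function_code & 0x80),  # high bit marks an exception response
        "pdu_length": length - 1,  # the length field counts unit id + PDU
    }

# Read Holding Registers request: unit 1, start address 0, 10 registers.
frame = bytes.fromhex("00010000000601030000000A")
features = modbus_features(frame)
print(features)
```

In a strictly cyclic process, the same few combinations of unit ID and function code recur constantly, so a write command from an unexpected source or an unusual function code stands out in such features.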

In order to achieve an ideal detection rate with as few false positives as possible, it makes sense to optimize the selection of features used. One possible approach to finding the best feature set is to try out all possible combinations. However, the number of feature sets to be tested increases exponentially with the number of available features. A method for deriving a feature set based on the results of current tests during operation is therefore preferable.
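To make the growth concrete: n features yield 2^n - 1 non-empty feature sets, so 20 features already mean more than a million candidates. The sketch below counts them and contrasts exhaustive search with greedy forward selection, a common heuristic that is swapped in here purely for illustration and is not the method the authors describe. The scoring function and feature names are hypothetical stand-ins for a real evaluation of detection rate and false positives.

```python
def count_feature_sets(n):
    """Number of non-empty subsets of n features."""
    return 2 ** n - 1

def greedy_forward_selection(features, score):
    """Add one feature at a time for as long as the score keeps improving."""
    selected, best = [], float("-inf")
    remaining = list(features)
    while remaining:
        candidate, cand_score = max(
            ((f, score(selected + [f])) for f in remaining),
            key=lambda pair: pair[1],
        )
        if cand_score <= best:
            break  # no remaining feature improves the result
        selected.append(candidate)
        remaining.remove(candidate)
        best = cand_score
    return selected, best

# Hypothetical scoring: two features carry signal, every feature has a cost.
USEFUL = {"function_code": 0.5, "unit_id": 0.3}

def toy_score(subset):
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.05 * len(subset)

features = ["src_mac", "src_ip", "function_code", "unit_id", "pdu_length"]
selected, best = greedy_forward_selection(features, toy_score)
print(count_feature_sets(len(features)), count_feature_sets(20))
print(selected)
```

Greedy selection evaluates at most n(n + 1)/2 feature sets instead of 2^n - 1, at the price of possibly missing the best combination when features are only useful together.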

In the next step, the generated models are distributed back to the sensors. As the machine learning methods used, such as the Isolation Forest algorithm, are particularly economical in terms of computing power and memory, far fewer resources are needed in the second phase - the test phase - than for training. Checking for anomalies can therefore be carried out by the sensors themselves, which are mainly implemented on embedded systems. A crucial consequence is that the IDS architecture scales well and is therefore well suited to large networks.

The advantages of decentralized IDS architecture

Data processing, modeling and distribution: Dividing the training and test phase into different levels or systems enables cost-efficient implementation of IDS in industrial networks.

© TH Deggendorf

A significant advantage of the IDS architecture presented here is that one node can receive data from several sensors for modeling, resulting in more comprehensive models. Systemic integration thus creates an image of the entire network. This enables holistic anomaly detection. As a compromise, it is also conceivable to integrate sensors only at critical connection nodes. This would drastically reduce the number of sensors - but at the expense of monitoring density and therefore security. Here, an individual risk assessment can help with the cost-benefit analysis to determine an acceptable security level and, as a result, derive a suitable implementation strategy.

Another key factor is that the internal architecture of the sensors is modular and extensible. Individual modules can be replaced or modified with little effort, so new machine learning methods are straightforward to integrate. This also avoids dependencies - for example on the communication framework used or the attack detection system employed by the sensors.

Authors:
Karl Leidl is a research associate at the Deggendorf Institute of Technology (THD);
Andreas Grzemba is Vice President for Research and Technology Transfer at Deggendorf Institute of Technology.
