Fraunhofer ESK

Günter Herkommer | Tiffany Dinges

Shedding light on the message 'chaos'

More and more components in automation technology are sending messages to the network 'unsolicited' on the basis of pub/sub mechanisms - in IoT protocols, but also in OPC UA. With the help of the DANA framework, light can be shed on a seemingly chaotic exchange of messages.

Any component on the shop floor - from individual sensors to entire machines - should be able to communicate with any other via DANA, as well as with the corporate IT level.

© Fraunhofer ESK

Industry 4.0 entails a change in communication structures in production. The RAMI 4.0 architecture (Reference Architecture Model Industry 4.0) makes it possible to dissolve the previously ordered, pyramid-shaped communication structure, which consists of technologically different segments built on classic master/slave or client/server approaches. The concept envisages that any component on the shop floor - from individual sensors to entire machines - can communicate with any other component as well as with the corporate IT level (ERP applications). This should enable every component in a company's network to both provide and retrieve information - with as few technological hurdles as possible.

One example of this is the life cycle management of a machine component using the asset administration shell. So-called publish/subscribe protocols have the potential to optimally support this concept. Pub/sub protocols such as Message Queuing Telemetry Transport (MQTT), Advanced Message Queuing Protocol (AMQP) or Constrained Application Protocol (CoAP) were originally developed for IoT applications in order to transport sensor data to the cloud and process it there. Recently, however, these protocols have also been increasingly used in M2M (machine-to-machine) applications. And since 2018, pub/sub has also been integrated into the OPC UA standard.

Pub/Sub - solution and problem at the same time

Pub/Sub offers many advantages such as high scalability or the establishment of ad hoc connections. However, with very high meshing, i.e. a very high proportion of peer-to-peer connections as is the case with M2M applications, new problems arise: each participant must ensure that it publishes its information correctly in the form of messages. If a message is missing or is sent several times, this can have an impact on the entire distributed system. In addition, the more dependencies there are between the participants, the more unstable the distributed application can become. If a participant communicates with many other participants, this may result in a level of complexity that makes it difficult to understand the exchange of messages and their correct sequence in a distributed system. This may be unavoidable in applications that have grown over time; in a new design, however, the aim should be a lean architecture with as few dependencies as possible between the participants (avoidance of coupled star structures).


Decoupled communication

Example of a state model for three controllers/stations communicating with each other


The problems described become all the more apparent when properties of a structured logical network are required, for example a defined time behavior in a distributed application with defined states. If there are also chained temporal dependencies, the implementation of such a system with the help of pub/sub protocols can pose a major challenge.

Publish/subscribe describes a mechanism that enables participants in a networked application to exchange messages with each other without the participants having to coordinate in advance. This is also known as decoupled communication. Pub/sub protocols were originally developed to enable the exchange of messages between participants who are connected to each other via unreliable communication channels - for example those with dropouts or changing bandwidths. It should also be possible to communicate with participants who are in operation only sporadically, or at least not permanently.

This is implemented by so-called message brokers. A participant who wants to send a message (publisher) registers with a broker and defines a so-called topic for this message. Other participants who wish to receive this message (subscribers) subscribe to the topic; the message broker then forwards the publisher's messages to them as soon as the publisher sends a new message to the broker.

The advantage and, at the same time, the disadvantage of decoupled communication is that the recipient and sender do not know each other. Neither knows whether the other is currently in operation, and the sender does not know how many recipients have subscribed to the topic or received the message. There is therefore no handshake between the communication partners, which would actually be necessary for reliable communication between the participants.
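
The broker mechanism described above can be sketched in a few lines of Python. This is a toy in-memory model, not a real MQTT or AMQP broker; the class `MiniBroker`, the topic name and the payloads are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the topic and the payload of a delivered message.
Handler = Callable[[str, str], None]


class MiniBroker:
    """Toy in-memory broker: routes each message to the subscribers of its topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver the payload to all current subscribers of the topic.

        The returned count is for illustration only: with a real broker
        the publisher does not learn how many recipients there were.
        """
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = MiniBroker()
received: List[str] = []

# Nobody has subscribed yet, so this message reaches no one.
lost = broker.publish("station1/state", "READY")

broker.subscribe("station1/state", lambda topic, payload: received.append(payload))
delivered = broker.publish("station1/state", "RUNNING")
```

Here `lost` is 0 and `received` contains only `"RUNNING"`: the first message disappears without the publisher noticing, which is exactly the missing handshake described above.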

Critical point: time specifications

What makes sense for distributed stateless applications can, however, lead to problems for distributed applications with time dependencies, for example:

- Race conditions: a message arrives at one participant earlier than another message, which can lead to undesirable effects and is also difficult to track.
- Individual participants are overloaded due to insufficient resources and can only process incoming messages with a delay.
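
Both effects in the list above can be illustrated with a small simulation. The sketch below assumes two hypothetical publishers whose messages experience different delays (for instance because one path runs through an overloaded participant); all names and numbers are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    sender: str
    text: str
    sent_at: float  # seconds, time of publishing
    delay: float    # seconds, network plus processing delay


def send_order(messages: List[Message]) -> List[str]:
    """Order in which the messages were published."""
    return [m.text for m in sorted(messages, key=lambda m: m.sent_at)]


def arrival_order(messages: List[Message]) -> List[str]:
    """Order in which the receiving participant actually sees the messages."""
    return [m.text for m in sorted(messages, key=lambda m: m.sent_at + m.delay)]


messages = [
    Message("A", "part clamped", sent_at=0.00, delay=0.30),   # slow path
    Message("B", "start drilling", sent_at=0.10, delay=0.05),  # fast path
]

published = send_order(messages)
seen = arrival_order(messages)
```

Although "part clamped" was published first, the receiver sees "start drilling" first. A receiver whose behavior depends on the order of these two messages would react incorrectly.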

Key components of the DANA platform


In order to take account of the fact that pub/sub protocols are increasingly being used in M2M applications, the Quality of Service (QoS) mechanisms have been extended in the current version 5 of the MQTT protocol. Previously, these consisted exclusively of three QoS levels, which defined the delivery guarantees for messages between publisher and broker and between broker and subscriber, but did not take temporal behavior into account. With version 5, a request/response function has been introduced to control the end-to-end transmission of messages between publisher and subscriber. A message can also be given a validity period (message expiry interval).
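
The effect of a validity period can be sketched without any MQTT library. The following is a simplified model of a broker-side queue, assuming a subscriber that reconnects after an outage; the names `QueuedMessage`, `still_valid` and `deliverable` are hypothetical and do not correspond to any client API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QueuedMessage:
    topic: str
    payload: str
    published_at: float                       # seconds
    expiry_interval: Optional[float] = None   # seconds; None means no expiry


def still_valid(message: QueuedMessage, now: float) -> bool:
    """A message without an expiry interval stays valid indefinitely."""
    if message.expiry_interval is None:
        return True
    return now - message.published_at < message.expiry_interval


def deliverable(queue: List[QueuedMessage], now: float) -> List[QueuedMessage]:
    """Messages a reconnecting subscriber would still receive at time `now`."""
    return [m for m in queue if still_valid(m, now)]


queue = [
    QueuedMessage("line1/setpoint", "42", published_at=0.0, expiry_interval=5.0),
    QueuedMessage("line1/config", "v2", published_at=0.0),
]

# The subscriber reconnects 8 seconds after both messages were published.
after_reconnect = [m.topic for m in deliverable(queue, now=8.0)]
```

Only `line1/config` is still delivered. The outdated setpoint is discarded instead of reaching the subscriber late, which is the point of attaching a validity period to time-critical messages.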

While protocols such as MQTT or AMQP usually use brokers, the DDS protocol does not require a broker. DDS uses the so-called RTPS discovery mechanism. The RTPS (Real Time Publish Subscribe) protocol is based on UDP and enables the participants in a network to locate the desired other participants.

OPC UA also provides for both a broker-based and a brokerless variant. Whether a broker is used depends on the application: if a message from a publisher is to be sent to a large number of subscribers (one-to-many) or, conversely, a subscriber is to receive messages from a large number of different publishers (many-to-one), the communication effort of the sending participant can be greatly reduced with an intermediate broker. If Pub/Sub is used in a highly meshed application, a direct connection between the participants without a broker is more efficient, as a broker would become the bottleneck of the entire communication.
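
A rough count of transmissions makes this trade-off tangible. The sketch below uses a deliberately simplified model in which every transmission costs the same and protocol overhead is ignored; the function names and the participant numbers are assumptions made for illustration.

```python
def publisher_sends(subscribers: int, use_broker: bool) -> int:
    """Transmissions the publisher itself makes to reach all subscribers once."""
    return 1 if use_broker else subscribers


def broker_transmissions(participants: int) -> int:
    """Full mesh via a broker: every participant publishes one message,
    and the broker forwards it to each of the other participants."""
    inbound = participants
    outbound = participants * (participants - 1)
    return inbound + outbound


def direct_transmissions_per_participant(participants: int) -> int:
    """Full mesh without a broker: each participant sends to and receives
    from every other participant directly."""
    return 2 * (participants - 1)


# One publisher, 50 subscribers: the broker relieves the publisher.
fan_out_direct = publisher_sends(subscribers=50, use_broker=False)
fan_out_brokered = publisher_sends(subscribers=50, use_broker=True)

# Ten fully meshed participants: the broker has to carry everything.
mesh_broker_load = broker_transmissions(participants=10)
mesh_direct_load = direct_transmissions_per_participant(participants=10)
```

In the fan-out case the publisher sends once instead of 50 times. In the meshed case the broker handles 100 transmissions per round, while each participant would handle only 18 with direct connections.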

OPC UA takes the path of abstraction and first defines a pub/sub-capable 'environment', which can then be mapped to a 'message-oriented middleware' or to specific pub/sub protocols such as MQTT or AMQP. However, UDP-based (User Datagram Protocol) or pure Ethernet-based communication in the sense of brokerless communication is also supported here.

DANA checks distributed applications

Michael Stiller is a research associate at the Fraunhofer Institute for Embedded Systems and Communication Technology ESK.


Looking at the described pub/sub mechanisms and their characteristics in the various protocols, it becomes obvious that complex distributed applications with defined time behavior based on them can only be managed with the help of new software tools. These must be able to monitor and analyze the behavior of these applications - or more precisely, their communication with each other - in real time and help to resolve problems. Fraunhofer ESK has developed such a solution with the software tool DANA (Description and Analysis of Networked Applications). The tool platform, which is based on Eclipse and can therefore run under Linux and Windows, enables the analysis and verification of interaction behavior and communication protocols in networked systems.

DANA assumes that a networked application consists of a network of distributed functions. In addition to verifying each individual function, the interaction between the functions must also be checked. Conformance tests can be used to ensure compatibility between the different communication stacks of the participants. These use special tests to check whether a system implements the respective pub/sub protocol correctly. If a system is implemented with sufficient conformance, it can function with other standard-compliant systems.

However, creating good conformance tests is very time-consuming. Accordingly, they are only available for basic, frequently used communication protocols. The applications built on these protocols, by contrast, are so individual that no general standards or test suites exist for them. This is why it is particularly important to verify the various functions as they interact. Above all, the dynamic behavior, i.e. the exact sequence in which functions communicate with each other, must be defined. DANA uses a model-driven approach for this.

DANA learns as it goes

DANA's models describe the valid communication behavior of software components. A comparison of observed system behavior with the models makes it possible to detect deviations through so-called monitoring. The system can be passively observed without the need to develop separate tests. Special synchronization mechanisms allow DANA to continue monitoring the system after an error, i.e. a difference between the model and the actual behavior.

Various learning methods are used to reduce the high manual effort required to design and maintain such models. This allows reference models to be derived semi-automatically from existing systems. For this purpose, the real behavior is learned as a state machine based on the recorded communication. This procedure offers the advantage that the learned automata can be adapted afterwards, either to correct unintended deviations or to add behavior that was not recorded.
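
The idea of deriving a reference model from recorded communication can be sketched as follows. This is a strongly simplified stand-in, not DANA's learning method: it assumes that the state is fully determined by the last message seen, and the function `learn_model` and the message labels are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, List, Set

START = "<start>"  # artificial state before the first message


def learn_model(traces: Iterable[List[str]]) -> Dict[str, Set[str]]:
    """Learn which message may follow which from recorded traces.

    The result maps each state (the last message seen) to the set of
    messages that were observed directly after it.
    """
    model: DefaultDict[str, Set[str]] = defaultdict(set)
    for trace in traces:
        previous = START
        for message in trace:
            model[previous].add(message)
            previous = message
    return dict(model)


recorded = [
    ["A:ready", "B:ready", "A:start", "B:done"],
    ["A:ready", "B:ready", "A:start", "B:error", "A:reset"],
]

model = learn_model(recorded)

# Manual adaptation after learning: a restart directly after an error was
# never recorded but is valid behavior, so it is added by hand.
model["B:error"].add("A:start")
```

After learning, `model["A:start"]` contains both `"B:done"` and `"B:error"`, because both continuations were observed. The manual edit in the last line corresponds to the subsequent adaptation of the learned automaton described above.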

Monitoring the distributed application

In order to be able to monitor a distributed application that communicates using pub/sub protocols with DANA, a general description of the structure of the protocol interfaces and a concrete description of the message format are required. Franca IDL is used as the description language, where IDL stands for Interface Description Language. In addition, the target behavior of the distributed application must be described using an interaction model.

If a broker is used, the DANA message feeder can subscribe to all relevant topics of the distributed application or otherwise take on the role of a subscriber. If no broker is used, the DANA feeder must either be integrated directly into the application or receive the messages exchanged by the participants via a protocol sniffer connected to the network. The messages are then sorted by time and publisher and forwarded to DANA Runtime, which compares the incoming messages with the target behavior.

If the pub/sub-based application does not run according to the target behavior, DANA recognizes this deviation and displays it. As the deviation has not been modeled, the new state of the system is initially unknown. This would normally lead to the analysis process being aborted. However, thanks to special algorithms, DANA is able to automatically resume the check in real time without the application having to be brought into a specific state beforehand.
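
The monitoring loop can be illustrated with a minimal sketch. The source does not describe DANA's algorithms, so the resynchronization used here is a naive assumption: after a deviation, checking simply continues from the last known message. The model format (each state maps to its allowed next messages), the function `monitor` and all message labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

START = "<start>"
Event = Tuple[float, str, str]  # (timestamp in seconds, publisher, message)


def monitor(events: List[Event], model: Dict[str, Set[str]]) -> List[Tuple[float, str]]:
    """Compare observed messages with the target behavior.

    Returns the deviations as (timestamp, label). Checking continues
    after a deviation instead of being aborted.
    """
    known: Set[str] = set().union(*model.values())
    deviations: List[Tuple[float, str]] = []
    state = START
    # Sorting the tuples orders the events by time, then by publisher.
    for timestamp, publisher, message in sorted(events):
        label = f"{publisher}:{message}"
        if label not in model.get(state, set()):
            deviations.append((timestamp, label))
        # Naive resynchronization: any message the model knows becomes the
        # new state; unknown messages leave the state unchanged.
        if label in known:
            state = label
    return deviations


target = {
    START: {"A:ready"},
    "A:ready": {"B:ready"},
    "B:ready": {"A:start"},
    "A:start": {"B:done"},
}

# Events arrive unsorted, and the message "B:ready" is missing.
observed = [(0.3, "A", "start"), (0.1, "A", "ready"), (0.4, "B", "done")]

deviations = monitor(observed, target)
```

The monitor reports a single deviation at `A:start`, which arrived without the preceding `B:ready`, and then accepts `B:done` again because checking resumed from the new state.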

Example application and outlook

The DANA framework has shown in various industrial applications that monitoring with integrated anomaly detection can significantly improve the quality of distributed applications both in the development phase and during operation. For example, starting up a machine with distributed controllers can lead to problems in timing behavior. If one of the controllers reacts differently than usual or if messages are not exchanged between the controllers within the defined time window, the error can be quickly localized and analyzed with DANA.
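
A time-window check of this kind can be sketched as follows. The sketch assumes that each controller reports once during start-up and that consecutive reports must not be further apart than a fixed limit; the controller names, timestamps and the 0.5 second window are invented for illustration.

```python
from typing import List, Tuple

Report = Tuple[float, str]  # (timestamp in seconds, controller)


def late_reports(reports: List[Report], max_gap: float) -> List[Tuple[str, str, float]]:
    """Find consecutive reports that are further apart than max_gap seconds.

    Returns (previous controller, late controller, gap in seconds).
    """
    ordered = sorted(reports)
    violations: List[Tuple[str, str, float]] = []
    for (t_prev, who_prev), (t_cur, who_cur) in zip(ordered, ordered[1:]):
        gap = t_cur - t_prev
        if gap > max_gap:
            violations.append((who_prev, who_cur, round(gap, 3)))
    return violations


startup = [(0.00, "controller1"), (0.20, "controller2"), (1.50, "controller3")]
violations = late_reports(startup, max_gap=0.5)
```

The result names `controller3` as the participant that reported 1.3 seconds after `controller2`, which localizes the timing problem to one step of the start-up sequence.
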
However, Fraunhofer ESK is not only working on methods for analyzing distributed applications, but also on simulation environments that can be used to plan pub/sub-based applications for optimum performance and scalability before they are implemented. To this end, the entire network infrastructure and the software components that run in the individual nodes and communicate via pub/sub protocols are modeled in order to predict the expected temporal behavior (with or without a message broker) with the help of a simulation. This enables seamless engineering from the design to the operation of complex pub/sub-based applications.
