Ethernet
The redundant protocols
The Ethernet protocols familiar from the office world are increasingly finding their way into factory and process automation. But which of the office-compatible redundancy protocols are also suitable for the industrial environment?
Redundancy of communication paths is a major issue in production: if a cable route or an active participant within a communication chain fails, the result in the worst case is system downtime. To prevent this, Ethernet comes with redundancy protocols as standard. However, these have their history and raison d'être in office IT, whose requirements differ fundamentally from those of a manufacturing environment. The following protocols are frequently used in Europe:
- RSTP
- MRP
- TurboRing
- PRP/HSR
Functionality of the protocols
Figure 1: Preventing switching loops: To prevent a loop from occurring, route 3 is deactivated. This results in a mixed topology of lines, stars and trees.
© Moxa Europe
RSTP, MRP and TurboRing each form a ring. However, this is a ring only from the cabling perspective. To prevent switching loops, the master in the ring deactivates one of the routes. A switching loop exists when two ports of one and the same switch are directly connected to each other, or when there are two or more active connections between two switches. In such a loop, packets are duplicated constantly, which can completely overload the network. Switching loops must therefore be avoided, usually by deactivating a route. In Figure 1, the deactivated route is route 3, which results in a mixed topology of lines, stars and trees. If one of the active routes fails, route 3 is reactivated and communication can continue.
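The blocking and reactivation of a route can be sketched in a few lines of Python. This is a simplified model, not an implementation of RSTP, MRP or TurboRing: the four-switch ring, the switch names A to D and the route numbering are assumptions made for illustration and do not reproduce the exact layout of Figure 1.

```python
from collections import deque

# Hypothetical ring: route number -> the two switches it connects.
ROUTES = {1: ("A", "B"), 2: ("B", "C"), 3: ("C", "D"), 4: ("D", "A")}


def has_loop(links):
    """Union-find check: a link joining two already connected switches closes a loop."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            node = parent[node]
        return node

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True
        parent[root_a] = root_b
    return False


def reachable(links, start, goal):
    """Breadth-first search over the currently active links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


class Ring:
    """Ring with one standby route that is blocked during normal operation."""

    def __init__(self, routes, standby):
        self.routes = dict(routes)
        self.standby = standby
        self.blocked = {standby}
        self.failed = set()

    def active_links(self):
        return [ends for number, ends in self.routes.items()
                if number not in self.blocked and number not in self.failed]

    def fail(self, route):
        # An active route has failed: the standby route is reactivated.
        self.failed.add(route)
        if route != self.standby:
            self.blocked.discard(self.standby)


ring = Ring(ROUTES, standby=3)
print(has_loop(list(ROUTES.values())))            # all routes active: loop
print(has_loop(ring.active_links()))              # route 3 blocked: no loop
ring.fail(1)
print(reachable(ring.active_links(), "A", "B"))   # still connected via route 3
```

The model ignores timing entirely; how long the switchover takes is exactly what distinguishes the protocols.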
The PRP/HSR protocol works very differently. As can be seen in Fig. 2, all routes are active in this case; packets are duplicated and the bandwidth is halved. For example, if subscriber A wants to send packet A to subscriber B, packet A first goes to switch A. Switch A duplicates packet A and sends the copies via parallel paths to switch B and switch C. These each forward their copy of packet A to switch D. Switch D accepts both copies but forwards only one of them to subscriber B. If, for example, route 3 were to fail, switch D would still receive packet A from switch C, deliver it to subscriber B and at the same time signal a fault on route 3 via the redundancy manager.
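The duplicate-and-discard principle can be illustrated with a minimal Python sketch. It assumes that each frame carries a sequence number by which the receiving side recognizes the second copy; the class and function names are hypothetical, and real devices also age out old entries, which is omitted here.

```python
class DuplicateDiscard:
    """Receiving side (switch D in the example): forwards the first copy only."""

    def __init__(self):
        self.seen = set()

    def receive(self, source, seq, payload):
        key = (source, seq)
        if key in self.seen:
            return None          # second copy: discard
        self.seen.add(key)
        return payload           # first copy: forward to the subscriber


def send(receiver, source, seq, payload, path_up):
    """Send one copy per path; path_up maps a path name to True (intact) or False."""
    delivered = []
    for path, up in path_up.items():
        if not up:
            continue             # copy lost on the broken path
        result = receiver.receive(source, seq, payload)
        if result is not None:
            delivered.append(result)
    return delivered


switch_d = DuplicateDiscard()
# Both paths intact: two copies arrive, one is forwarded.
print(send(switch_d, "A", 1, "packet A", {"via B": True, "via C": True}))
# Path via switch B broken: the copy via switch C still arrives, with no switchover.
print(send(switch_d, "A", 2, "packet A", {"via B": False, "via C": True}))
```

Because the second copy is already in flight when a path fails, there is nothing to reconfigure, which is why no recovery time arises.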
In order to provide a decision-making aid, it is important to define the other characteristics and prioritize them for the respective application; these are described in the following table:
The most important attribute of a redundancy protocol is the recovery or reconfiguration time. It is the time from the moment the failed communication path is detected until a new communication path has been successfully established.
Not without recovery time
Figure 3: The RSTP, PRP and TurboRing protocols have very different reconfiguration times. The dependency of the individual participants on each other must be taken into account depending on the application.
© Moxa Europe
This can be described in concrete terms using Figure 3. Imagine subscriber A and subscriber B are connected to each other via a redundant network. Several switches sit between the two subscribers and are connected to each other as a ring. Now a fault occurs: a cable on the connection path, in this case route 1, is damaged just as subscriber A has sent a packet to subscriber B. Due to the cable break, the packet cannot be delivered. Subscriber A receives no confirmation of receipt and keeps trying to send the packet. Meanwhile, the switches recognize the cable break, mark the relevant route as faulty and activate the alternative route 3, which was previously blocked. This entire process takes a different length of time for each redundancy protocol. With an RSTP network, for example, it can take up to five seconds. Only then is the packet delivered and subscriber A receives confirmation of receipt from subscriber B.
To take the recovery time into account properly in an application, it is important to understand how the participants depend on each other and to know the time-critical requirements.
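What a recovery time means for a cyclic application can be estimated with simple arithmetic. The following Python sketch counts how many communication cycles fall into the outage; the 5 s figure is the RSTP worst case mentioned above, while the cycle times are assumed example values and the function name is hypothetical.

```python
import math


def missed_cycles(recovery_ms, cycle_ms):
    """Number of communication cycles that fall into the recovery time."""
    return math.ceil(recovery_ms / cycle_ms)


RSTP_WORST_CASE_MS = 5000   # "up to five seconds"

for cycle_ms in (1, 10, 50):   # assumed cycle times of the application
    print(f"cycle {cycle_ms} ms: {missed_cycles(RSTP_WORST_CASE_MS, cycle_ms)} cycles missed")
```

Whether a participant tolerates that many missed cycles depends on its monitoring settings, which have to be checked for each device.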
Maximum number of participants
The second important attribute of redundancy protocols is the maximum number of participants and how a participant is defined. The redundancy types differ here as follows:
- RSTP: With RSTP, only the active RSTP network components - such as switches - are participants in the redundancy network. The regular subscribers do not count towards the limit. For example, if you use 25 switches with 16 subscribers each, there are a total of 425 network devices (400 subscribers plus 25 switches), but only the 25 switches are active participants in the RSTP network.
- MRP: MRP is a redundancy protocol that is primarily used by Profinet end devices, as can be seen in Figure 4. Accordingly, many PN controllers and PN devices have an integrated 2-port switch as standard. This architecture can be used to set up an MRP ring with a maximum of 50 MRP participants. In this analysis, it is primarily the end devices that are included in the calculation.
- TurboRing: The same calculation applies for TurboRing V2 as for RSTP. The only difference is that the maximum number of participants is ten times greater. A redundant network could consist of 250 switches with 16 subscribers each, which corresponds to a total of 4250 network devices (4000 subscribers plus 250 switches).
- PRP/HSR: PRP/HSR supports a maximum of 512 active participants in the redundant network. However, in how participants are counted, PRP/HSR is more similar to an MRP network than to an RSTP network. This is because PRP/HSR does not only involve switches that are equipped with the technology. Corresponding plug-in cards for PCs or IEDs are also available on the market. This results in the most flexible topologies with PRP/HSR despite the ring structure.
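The counting rule for RSTP and TurboRing, where only the switches count as ring participants, can be reproduced with a short calculation. The Python sketch below uses the figures from the list above; the function name and the dictionary keys are chosen for illustration only.

```python
def network_size(switches, subscribers_per_switch):
    """Split a network into ring participants (switches) and attached subscribers."""
    subscribers = switches * subscribers_per_switch
    return {
        "ring_participants": switches,      # only these count towards the limit
        "subscribers": subscribers,
        "total_devices": switches + subscribers,
    }


print(network_size(25, 16))    # RSTP example: 425 devices in total
print(network_size(250, 16))   # TurboRing example: 4250 devices in total
```

For MRP and PRP/HSR the calculation is different, because end devices themselves can be ring participants and therefore count towards the limit.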
The possible topology
The last attribute of the redundancy protocols is the possible topology. Each of the redundancy protocols mentioned must be set up in a ring - the only exception is RSTP, which can also be set up as a mesh network. However, this results in a strongly fluctuating recovery time of 500 msec to 5 sec. Ring structures are mainly used to optimize the recovery time.
Design recommendation
Before redundancy protocols are used more or less indiscriminately, it is advisable to first become aware of the critical dependencies within a system. As a practical example, let's look at an application in a distributed plant.
The production plant is distributed over an area of 100 m × 100 m. There is a central control cabinet with the Profinet controller and 100 other Profinet devices distributed across the area in a total of 20 smaller control cabinets. The very fast connection of the Profinet devices and controllers is essential for the application. Accordingly, the distances to be bridged between the central control cabinet and the 20 smaller control cabinets are the most important. Why? Because they are the most susceptible to a cable break. The risk of a cable breaking within the central control cabinet or in the smaller control boxes is much smaller.
RSTP could theoretically provide redundancy in the application. However, the regular cycle times for Profinet controllers are in the range of 1 to 50 msec. The PLC would not tolerate an interruption of up to 5 sec without problems. Instead, the system would be stopped and would have to be put back into operation at great expense. The only plus point is that there would be no need to pull a new cable or laboriously identify the faulty cable.
A similar problem would arise when using MRP. Although its recovery time of less than 200 msec is much shorter than 5 sec, it is still many times longer than the cycle time of the PLC. Therefore, the system would also stop in this case. Another point would be that the maximum number of participants would be far exceeded, as only 50 participants can take part in the ring.
TurboRing can provide redundancy in this application. It has the capacity for enough participants and would also have sufficient reserves for extensions. The recovery time of less than 20 msec can even ensure that the PLC can continue to operate undisturbed and production does not come to a standstill. In this case, however, the cycle times of the nodes should be known exactly and the possible error case should be tested in combination with the recovery time. The advantage clearly lies in the fact that the system continues to produce without interruption. The faulty section can still be repaired after the end of production.
PRP/HSR is the most technologically advanced option. With a recovery time of 0 msec, the system will also continue to run in this case. The maximum number of participants is also no problem. The only drawback of the technology is undoubtedly the price. However, if this is compared to a loss due to production downtime, then the price may be justified.
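The reasoning of this design example can be condensed into a small Python sketch. The recovery times and participant limits are the figures quoted in this article. Several values are assumptions for illustration: the tolerated outage of 50 msec (the upper end of the cycle-time range), one switch per control cabinet (21 switches), and the missing limit for RSTP, for which no explicit figure is given here. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Protocol:
    name: str
    recovery_ms: float                 # worst-case figure quoted in the article
    max_participants: Optional[int]    # None: no explicit limit stated here
    counts_end_devices: bool           # True if end devices are ring participants


PROTOCOLS = [
    Protocol("RSTP", 5000, None, False),
    Protocol("MRP", 200, 50, True),
    Protocol("TurboRing", 20, 250, False),
    Protocol("PRP/HSR", 0, 512, True),
]


def suitable(protocol, switches, end_devices, tolerated_outage_ms):
    """Check the participant limit and the recovery time against the application."""
    counted = end_devices if protocol.counts_end_devices else switches
    fits = protocol.max_participants is None or counted <= protocol.max_participants
    fast_enough = protocol.recovery_ms <= tolerated_outage_ms
    return fits and fast_enough


# Example plant: 1 controller plus 100 devices; 21 switches is an assumption.
candidates = [
    p.name
    for p in PROTOCOLS
    if suitable(p, switches=21, end_devices=101, tolerated_outage_ms=50)
]
print(candidates)   # ['TurboRing', 'PRP/HSR']
```

The tolerated outage is the decisive parameter: it has to be derived from the real cycle times and monitoring settings of the participants and verified by testing the fault case, and price is not modelled at all.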
PRP/HSR and TurboRing stand out
PRP/HSR is clearly superior to other technologies in terms of speed. This is precisely why the technology is primarily used in critical infrastructures such as energy supply and distribution.
So far, however, the technology has rarely been used in the infrastructure of manufacturing plants and machines. This is where TurboRing scores particularly well. It can meet many of the redundancy requirements without triggering too large an investment.
Moxa uses TurboRing technology in a large number of its devices, such as the EDS-400A series of DIN rail switches with management functions or in the IKS-6726/6728 series of 19-inch switches and the PT-7700 series of IEC 61850-3 switches for use in the energy industry.
Author:
Philipp Jauch is Strategic Account Manager Industrial Automation at Moxa.

















