Cloudera
Machine learning makes it necessary
Industrial IoT and edge computing play a key role in the digitalization of production. But why is machine learning so important in this context?
One discipline of artificial intelligence (AI) plays a special role in the digital networking of companies: machine learning. It enables IT systems to identify regularities independently by analyzing existing data sets and situations, and to develop solutions on that basis. To do this, the system collects relevant data, then extracts and summarizes the information it contains. Predictions are then made on this basis.
The system also calculates the probability of each prediction it makes. At the same time, it continually adapts to new developments and optimizes its processes based on the patterns it has identified.
Three main fields of application
Machine learning can be implemented in production in three main areas: in the machines, in the process and in the product itself. In machines and systems, machine learning is used to monitor the condition of a production device, identify potential errors and diagnose the cause of failures in order to increase system safety and efficiency. It is also used in predictive maintenance and for machine control. In this way, robots could perform their tasks autonomously, working toward targets rather than following fixed instructions, and could even develop solutions jointly if necessary (self-learning machines).
Machine learning is also used to improve production processes. It then optimizes the design of processes and sequences in series production - before the start and during operation. It can also predict future demand trends or point out problems that could arise in the supply chain. Machine learning also improves scheduling and route planning in logistics, optimizes the use of resources and forecasts throughput times. Companies use it in process control and optimization, for example to predict process parameters or product quality.
The introduction of machine learning can be incorporated into the planning of a new production plant. In this case, the production process is not restricted or influenced. However, machine learning can also be implemented during ongoing operations. In this case, the big data system must first be set up separately from production operations and tested in a simulation of the actual conditions. It is then transferred to real operations step by step. For example, the first machines can be integrated into predictive maintenance in this way.
Everyone has to learn...
However, for a system with artificial intelligence to be able to make decisions independently, a learning phase is required.
This is correspondingly more complex than entering fixed rules, but opens up considerably more possibilities as soon as the system is implemented.
The learning process can take place in different ways. In supervised learning, people provide labeled examples in advance that show the algorithm which group each piece of information belongs to. In contrast, in unsupervised learning, the system forms the necessary groups itself from patterns it has recognized in the data.
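The difference between the two learning modes can be illustrated with a minimal sketch. It is not taken from any particular product: the temperature values, the labels "ok" and "overheated", and the function names are all hypothetical. A nearest-centroid classifier stands in for supervised learning and a simple one-dimensional k-means for unsupervised learning.

```python
from statistics import mean


def train_nearest_centroid(samples, labels):
    """Supervised: compute one center per label supplied by a person."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in groups.items()}


def predict(centroids, x):
    """Assign x to the label whose center is closest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def kmeans_1d(samples, k=2, iterations=20):
    """Unsupervised: find k group centers without any labels."""
    if k < 2:
        raise ValueError("k must be at least 2")
    ordered = sorted(samples)
    # Deterministic start (an assumption for reproducibility):
    # spread the initial centers evenly across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # Keep the old center if a cluster received no points.
        centers = [mean(c) if c else centers[j] for j, c in enumerate(clusters)]
    return sorted(centers)


# Hypothetical bearing temperatures in degrees Celsius
temperatures = [60.0, 62.0, 61.0, 90.0, 92.0, 91.0]
labels = ["ok", "ok", "ok", "overheated", "overheated", "overheated"]

centroids = train_nearest_centroid(temperatures, labels)
prediction = predict(centroids, 88.0)    # uses the labels given in advance
discovered = kmeans_1d(temperatures)     # finds the two groups on its own
```

In the supervised case the group names come from the labels; in the unsupervised case the system only reports that two groups exist and where their centers lie, and a person still has to interpret them.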
A hybrid of both models is semi-supervised learning, which combines a small number of labeled examples with a larger pool of unlabeled data. In reinforcement learning, by contrast, the system learns independently but is guided in the right direction by 'reward' and 'punishment'. This is similar to natural human learning. Integrating new information into an already known context, and thus being able to assess unknown data, is referred to as transfer learning.
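Learning guided by 'reward' and 'punishment' can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: the three "actions" (for instance, gripper settings), the reward function and the fixed exploration schedule are hypothetical, and real systems typically explore at random rather than on a fixed schedule.

```python
def learn_by_reward(reward_fn, n_actions, steps=100, explore_every=10):
    """Estimate the value of each action from rewards received over time."""
    values = [0.0] * n_actions   # running average reward per action
    counts = [0] * n_actions     # how often each action was tried
    for step in range(steps):
        if step % explore_every == 0:
            # Occasionally try an action regardless of its current estimate.
            action = (step // explore_every) % n_actions
        else:
            # Otherwise pick the action that currently looks best.
            action = max(range(n_actions), key=lambda a: values[a])
        reward = reward_fn(action)
        counts[action] += 1
        # Incremental update of the running average.
        values[action] += (reward - values[action]) / counts[action]
    return values


def reward_fn(action):
    """Hypothetical environment: only action 2 is rewarded, others are punished."""
    return 1.0 if action == 2 else -1.0


values = learn_by_reward(reward_fn, n_actions=3)
best_action = max(range(3), key=lambda a: values[a])
```

The system is never told which action is correct. It only receives feedback after acting and gradually shifts toward the action with the highest average reward.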
Machine learning models improve gradually because of the way they accumulate knowledge. The system generates a model that describes inputs, recognized categories and correlations, and it is this model that makes predictions possible in the first place. A clustering process divides the data into different categories based on typical patterns. The artificial intelligence uses these categories to create classifiers independently. One method is the EM algorithm (Expectation-Maximization algorithm or Estimation-Maximization algorithm). It starts with a random model and then alternates between two steps: it updates the assignment of the data to the individual parts of the model, and it updates the model parameters to fit the current assignment. In this way, it iteratively determines the parameters of the model so that the model explains the observed data well.
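The alternation between assigning data and updating parameters can be sketched for the simplest case, a mixture of two one-dimensional Gaussian distributions. This is a minimal illustration, not a production implementation: the measurement values are hypothetical, and the starting model is fixed to the smallest and largest value instead of being random so that the result is reproducible.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, iterations=50):
    # Starting model (assumption: deterministic instead of random).
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: compute how strongly each point belongs to each component.
        resp = []
        for x in data:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: refit each component to the points assigned to it.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsed component
            weights[k] = nk / len(data)
    return means, variances, weights


# Hypothetical measurements drawn from two operating states
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
means, variances, weights = em_two_gaussians(data)
```

For this data the two estimated means settle close to 1 and 5, the centers of the two groups of measurements. With strongly overlapping groups or an unlucky starting model, EM can converge to a poorer solution, which is why it is often run several times from different starting points.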
In addition to the EM algorithm, principal component analysis is also used, for example. It dispenses with categorization and instead translates the recorded data into a simpler representation that still reproduces it fairly accurately, even though the amount of underlying information is greatly reduced.
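The idea of a simpler representation can be shown with two correlated measurements reduced to a single value per data point. The sketch below assumes hypothetical sensor pairs and uses the closed-form solution for the principal axis of a 2x2 covariance matrix; for data with more dimensions, a linear algebra library would normally be used instead.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal axis and reconstruct them."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Entries of the sample covariance matrix
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Direction of largest variance (closed form for the 2x2 case)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))
    # One number per point instead of two
    scores = [x * axis[0] + y * axis[1] for x, y in centered]
    # Approximate reconstruction from the reduced representation
    reconstructed = [(mx + s * axis[0], my + s * axis[1]) for s in scores]
    return axis, scores, reconstructed


# Hypothetical pairs of (motor load, power draw) that rise together
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
axis, scores, reconstructed = pca_2d_to_1d(points)
max_error = max(math.dist(p, q) for p, q in zip(points, reconstructed))
```

Each data point is now described by one score instead of two measurements, and the reconstruction error stays small because the two original values were strongly correlated.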
Big data is essential
The basis for a sufficiently broad collection of information is a big data platform that combines data from various sources and creates a coherent overall picture even from unstructured data. Big data refers to data volumes that are so large, complex, fast-moving or insufficiently structured that conventional data processing methods are inadequate. Without big data, machine learning is inconceivable, because a model only becomes useful once it has been trained on a sufficient amount of data. Last but not least, a big data platform is particularly important because it can collect and analyze data from a wide variety of production fields.
In the production environment, this is particularly evident in the integration of sensor data or decentralized information, because intelligent production methods always include the Internet of Things (IoT). It is important to avoid data silos, that is, data stocks in different locations to which, for example, only individual departments have access. When choosing a big data platform, companies should also select an open system to avoid vendor lock-in, i.e. being tied to just one manufacturer.
Decentralized data acquisition
When the Internet of Things comes into play, the edge computing model is often pursued. Here, data is collected decentrally in edge node computers and consolidated there before it is sent on to a central structure, such as a main data center. The advantage here is that this method saves bandwidth and only the information that is actually required to control the manufacturing process is transmitted. Mass data can also be generated here, which flows into a big data platform. To ensure end-to-end data processing and analysis, IT managers should choose a platform that is particularly flexible here - and can grow dynamically in line with demand.
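Consolidation on an edge node can be pictured as follows. The sketch is an assumption about what such a step might look like, not a description of a specific platform: the function name, the field names and the temperature limits are hypothetical.

```python
from statistics import mean


def consolidate(readings, low, high):
    """Summarize one window of raw sensor readings on the edge node."""
    return {
        "count": len(readings),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
        # Only values outside the permitted range are forwarded individually.
        "out_of_range": [r for r in readings if r < low or r > high],
    }


# Hypothetical temperature readings collected locally within one time window
window = [70.1, 70.3, 69.9, 85.2, 70.0, 70.2]
summary = consolidate(window, low=60.0, high=80.0)
```

Instead of six raw values, the central data center receives one compact summary plus the single reading that needs attention. With thousands of sensors and high sampling rates, this is where the bandwidth saving comes from.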
To use big data in production particularly efficiently, manufacturing companies need to clarify how the relevant production steps should be arranged, not only in terms of processes but also in terms of organizational allocation. Which data is required where? Which production steps are interlinked and which merely follow one another?
The role of suppliers also needs to be examined. Especially in just-in-time production, the supply chains must be mapped appropriately. Traditional solutions often only look at individual processes. They do not support, for example, correlating current measured values with historical data, or adjusting threshold values afterwards, whether manually or in response to an event.
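What relating current measured values to historical data, combined with an event-driven threshold adjustment, could look like is sketched below. The class, the rule "mean plus three standard deviations", the pressure values and the triggering event are all hypothetical choices for illustration.

```python
from statistics import mean, stdev


class AdaptiveThreshold:
    """Alarm limit derived from historical data instead of a fixed value."""

    def __init__(self, history, k=3.0):
        self.k = k
        self.on_event(history)

    def on_event(self, new_history):
        """Recompute the limit, e.g. after a tool change or a new product variant."""
        self.history = list(new_history)
        self.limit = mean(self.history) + self.k * stdev(self.history)

    def exceeds(self, value):
        """Compare a current measured value against the historical limit."""
        return value > self.limit


# Hypothetical pressure values recorded during normal operation
threshold = AdaptiveThreshold([10.0, 10.2, 9.8, 10.1, 9.9])
alarm_before = threshold.exceeds(11.0)

# Event: a different tool is installed and the normal level shifts
threshold.on_event([12.0, 12.2, 11.8, 12.1, 11.9])
alarm_after = threshold.exceeds(11.0)
```

The same reading of 11.0 triggers an alarm under the old operating conditions but not under the new ones, because the limit follows the historical data rather than a value entered once by hand.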
The implementation of an open, powerful and company-wide data platform therefore supports the production process in a variety of ways, makes it easier to plan and therefore also ensures higher productivity throughout the entire value creation process.














