Fraunhofer IKS
Framework for data and AI lifecycle
Fraunhofer IKS is developing an open, interoperable and technology-neutral framework that supports and optimizes the data and AI lifecycle. The aim is to significantly increase the value created with AI.
Industry 4.0 describes the ongoing digitalization of production - machines and processes are intelligently networked, allowing more and more data to be generated. Artificial intelligence (AI) has the potential to generate information from this data in order to improve production and services. Possible application scenarios include predictive maintenance, process optimization and automation as well as quality assurance.
However, this potential cannot yet be fully exploited, as there are several technological barriers that restrict the generation and processing of information and therefore make the use of AI more difficult.
| Artificial Intelligence Forum |
|---|
| The author Hoai My Van will be speaking at the Artificial Intelligence Forum on May 17, 2022 at the Science Congress Center Munich. Her presentation will go into more detail about the framework presented here. Take a look at the rest of the program and register for the conference with the VIP code FKI2022LV! |
The first obstacle is the multi-vendor landscape in today's production facilities: machines from different manufacturers and technology generations, with different communication interfaces, protocols and operating systems, exist side by side. This heterogeneity prevents standardized data access. Instead, there are many technology-specific isolated solutions, each of which requires domain knowledge. Although standards such as the Asset Administration Shell, OPC UA and MQTT already exist, many proprietary solutions are still in use. As a result, standardized communication and simple AI integration are not yet possible.
The second obstacle is the lack of support for data scientists. They usually lack domain knowledge, which is why they need support when obtaining real-time or historical data: automated data queries are not possible, there is no overview of the topology and the technologies used, and metadata is sometimes missing. As a result, data preparation is often laborious, lengthy and manual, and it requires a lot of coordination.
The third obstacle is inflexible AI operation. AI applications are often operated rigidly in the cloud or on a local server, so they cannot make optimum use of the available resources. In addition, AI models need to be updated so that they can react appropriately to changes in production, and no general solution for this exists yet.
Framework based on service-oriented architectures
Figure 1: Example of a data and AI lifecycle consisting of six AI services (hexagons): Data acquisition, data processing, data aggregation, AI training, data analysis using AI, AI monitoring.
© Fraunhofer IKS
In order to simplify the use of AI in Industry 4.0, researchers at the Fraunhofer Institute for Cognitive Systems IKS are developing an open, interoperable and technology-neutral framework in the project "REMORA - Multi-Stage Automated Continuous Delivery for AI-based Software & Services Development in Industry 4.0". The framework supports parts of the data and AI lifecycle (Figure 1). In this context, the data lifecycle covers the steps from data creation, data aggregation and data processing through to data analysis with the support of AI. The AI lifecycle covers the steps from AI training, AI deployment and AI operation through to AI monitoring and AI updating. The aims of the framework are:
- to support data scientists by simplifying access to the required data,
- to integrate AI flexibly from the component level through to the cloud,
- to automate the AI lifecycle.
The framework is based on service-oriented architectures in which modular AI services can be flexibly distributed, exchanged and updated. An AI service implements parts of the data or AI lifecycle. The form of an AI service can be diverse: an AI model integrated directly into the component level, a virtual machine on the edge, a container in the cloud, etc.
The developed framework provides a library of different functionalities for these AI services, which supports and simplifies the development, integration, operation and updating of AI applications.
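The idea of modular, exchangeable AI services can be illustrated with a small sketch. It assumes that each service declares which data it requires and which data it provides, so that the data flow between services can be derived automatically. The names `ServiceDescriptor` and `derive_data_flow`, the fields and the example services are hypothetical and are not taken from the REMORA framework.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceDescriptor:
    """Hypothetical description of one AI service (illustrative fields only)."""
    name: str
    requires: set[str] = field(default_factory=set)   # data the service consumes
    provides: set[str] = field(default_factory=set)   # data the service produces
    deployment: str = "container"                      # e.g. "container", "vm", "component"


def derive_data_flow(descriptors: list[ServiceDescriptor]):
    """Match required data to providers.

    Returns a list of (producer, data, consumer) edges and a list of
    (consumer, data) pairs for which no provider was declared.
    """
    producers = {}
    for descriptor in descriptors:
        for item in descriptor.provides:
            producers[item] = descriptor.name

    edges, missing = [], []
    for descriptor in descriptors:
        for item in sorted(descriptor.requires):
            if item in producers:
                edges.append((producers[item], item, descriptor.name))
            else:
                missing.append((descriptor.name, item))
    return edges, missing


services = [
    ServiceDescriptor("data-acquisition", provides={"images"}, deployment="component"),
    ServiceDescriptor("ai-training", requires={"images"}, provides={"model"}),
    ServiceDescriptor("ai-analysis", requires={"images", "model"}, provides={"defect-report"}),
    ServiceDescriptor("ai-monitoring", requires={"defect-report", "ground-truth"}),
]

edges, missing = derive_data_flow(services)
for producer, data, consumer in edges:
    print(f"{producer} --{data}--> {consumer}")
for consumer, data in missing:
    print(f"unmet requirement: {consumer} needs {data}")
```

Because services are only coupled through the data they declare, one service can be replaced by another with the same declaration without touching the rest. The unmet requirement in the example shows how such a description could also be used to detect gaps before deployment.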
A framework for the stages of production automation
Depending on the desired level of automation in production, the framework can be used for different purposes, e.g. for uniform data access, flexible data access or to support the AI lifecycle.
Data is a key aspect of Industry 4.0: it is aggregated, processed and analyzed in order to gain information and added value. This data is provided by various technologies and communication interfaces. Fraunhofer IKS aims to provide technology-independent data access via a standardized interface.
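A minimal sketch of what technology-independent data access behind one interface could look like, assuming one adapter per protocol. All names here (`ProtocolAdapter`, `DataInterface`, `InMemoryAdapter`, the `opcua://` and `mqtt://` address prefixes) are hypothetical and not part of the framework; the in-memory adapter merely stands in for real protocol clients.

```python
from abc import ABC, abstractmethod


class ProtocolAdapter(ABC):
    """Hypothetical adapter contract: one implementation per protocol."""

    @abstractmethod
    def read(self, address: str) -> float:
        """Return the current value stored at a protocol-specific address."""


class InMemoryAdapter(ProtocolAdapter):
    """Stand-in for a real client (e.g. OPC UA or MQTT); serves fixed values."""

    def __init__(self, values: dict[str, float]) -> None:
        self._values = values

    def read(self, address: str) -> float:
        return self._values[address]


class DataInterface:
    """Routes a read request to the adapter registered for the address scheme."""

    def __init__(self) -> None:
        self._adapters: dict[str, ProtocolAdapter] = {}

    def register(self, scheme: str, adapter: ProtocolAdapter) -> None:
        self._adapters[scheme] = adapter

    def read(self, url: str) -> float:
        scheme, _, address = url.partition("://")
        if scheme not in self._adapters:
            raise KeyError(f"no adapter registered for scheme '{scheme}'")
        return self._adapters[scheme].read(address)


interface = DataInterface()
interface.register("opcua", InMemoryAdapter({"station1/temperature": 41.5}))
interface.register("mqtt", InMemoryAdapter({"line/camera/brightness": 0.8}))

# The caller uses the same call regardless of the underlying protocol.
temperature = interface.read("opcua://station1/temperature")
brightness = interface.read("mqtt://line/camera/brightness")
print(temperature, brightness)
```

Supporting a further protocol then means adding one adapter class; code that requests data stays unchanged.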
In addition, data access can be further standardized and simplified. A standardized information model for describing individual assets, e.g. systems, machines, sensors, products and their existing data, should make it possible to exchange individual assets without having to change data requests.
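The following sketch illustrates how a data request could stay unchanged when an asset is exchanged, assuming that requests are addressed to a logical role rather than to a concrete device. The `Asset` and `AssetCatalog` classes, the role name and the addresses are hypothetical simplifications and do not reproduce the Asset Administration Shell metamodel.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical, simplified asset description."""
    asset_id: str
    role: str  # logical function, e.g. "end-of-line-camera"
    data_points: dict[str, str] = field(default_factory=dict)  # name -> address


class AssetCatalog:
    """Resolves (role, data point name) to the address of the current asset."""

    def __init__(self) -> None:
        self._by_role: dict[str, Asset] = {}

    def install(self, asset: Asset) -> None:
        # Installing an asset for an occupied role replaces the previous one.
        self._by_role[asset.role] = asset

    def resolve(self, role: str, data_point: str) -> str:
        return self._by_role[role].data_points[data_point]


catalog = AssetCatalog()
catalog.install(Asset("cam-A", "end-of-line-camera", {"image": "vendorA://cam/frame"}))
before = catalog.resolve("end-of-line-camera", "image")

# The camera is replaced by a device from another vendor; the request is unchanged.
catalog.install(Asset("cam-B", "end-of-line-camera", {"image": "vendorB://device7/img"}))
after = catalog.resolve("end-of-line-camera", "image")

print(before)
print(after)
```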
Additional automation is possible with the help of AI. To this end, the framework allows the distribution and management of AI services, e.g. automated retraining, and thus enables continuous, constantly improving data analysis.
In the long term, the aim is to create a framework that can automatically adapt to the infrastructure and situation. Such a system could react to the data analysis, e.g. by automatically adjusting the machine parameters. Services would be able to continuously "migrate" in order to constantly improve resource utilization, and the "optimal" data source could be selected from several candidates, e.g. cameras. The data analysis itself could be optimized, for example by automatically adjusting the sampling frequency. The framework would also be able to reconfigure itself after changes to the system, e.g. after a retrofit or the replacement of a machine. In short, the possibilities of this framework would be manifold.
Four components, three goals
Figure 2: The four basic components of the framework: data interface, AI management, service management, metadata management. AI services (left) use functionalities of the framework to realize a data and AI lifecycle.
© Fraunhofer IKS
The basic concept of the framework is based on four components (Figure 2):
- The data interface is the interface for a data scientist or the AI service. It can be used to request data and to train and operate AI models. On the one hand, it enables technology-independent data exchange via different protocols; on the other hand, it enables standardized training, validation and operation of AI models on different AI platforms. The framework supports common AI platforms such as TensorFlow, PyTorch and scikit-learn. AutoML is also supported so that the framework can be used by AI beginners as well.
  - For data exchange, Fraunhofer IKS relies on existing Industry 4.0 technologies such as Apache PLC4X and the BaSyx Virtual Automation Bus, for which adapters to the data interface are being developed. Separate adapters are being developed for other protocols such as MQTT, ROS and DDS.
- Metadata management covers two things. First, it manages ML metadata for tracking the AI pipeline and AI models from creation to use. Second, it provides data that helps in understanding the problem, e.g. metadata on individual machines or an overview of the topology. This metadata is also needed to make data analysis flexible. Metadata management is based on the Asset Administration Shell, which is primarily used for documenting assets. However, it is also conceivable to carry out data pre-processing or even data analysis using an AI model directly in the Asset Administration Shell.
- AI management is responsible for automating the AI lifecycle. It enables automated retraining and automated distribution of the AI models, as well as versioning and management of the AI models themselves.
- Finally, service management is responsible for the services themselves. It automates the creation and distribution of services. In addition, the services are coordinated to ensure the flow of data. Fraunhofer IKS relies on the container solution Docker to implement services and service-oriented architectures. Alternatives, e.g. virtual machines, virtual (Python) environments and the integration of AI models directly into the component level, will also be supported.
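The versioning and management of AI models mentioned for AI management can be pictured with a small in-memory model repository. This is a sketch under assumptions: the class names, the integer version scheme and the rollback operation are hypothetical and only illustrate the bookkeeping, not how the framework stores models.

```python
from dataclasses import dataclass


@dataclass
class ModelVersion:
    name: str
    version: int
    metrics: dict  # e.g. {"accuracy": 0.93}, as reported by the training service


class ModelRepository:
    """Hypothetical repository that keeps every published version of a model."""

    def __init__(self) -> None:
        self._versions: dict[str, list[ModelVersion]] = {}

    def publish(self, name: str, metrics: dict) -> ModelVersion:
        history = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(history) + 1, metrics)
        history.append(entry)
        return entry

    def latest(self, name: str) -> ModelVersion:
        return self._versions[name][-1]

    def rollback(self, name: str) -> ModelVersion:
        """Discard the newest version and return the one that is now current."""
        history = self._versions[name]
        if len(history) < 2:
            raise ValueError(f"'{name}' has no earlier version to roll back to")
        history.pop()
        return history[-1]


repo = ModelRepository()
repo.publish("defect-detector", {"accuracy": 0.91})
repo.publish("defect-detector", {"accuracy": 0.95})
current = repo.latest("defect-detector")

# If the retrained model misbehaves in production, fall back to the previous one.
restored = repo.rollback("defect-detector")
print(current.version, restored.version)
```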
The following goals can be achieved with the described components of the framework:
- Support a data scientist by simplifying access to the required data: The data interface is the interface for a data scientist. Real-time data and historical data can be queried via this interface, as well as the metadata managed by metadata management and the topology overview. The data interface can also be used to access service management, which is responsible for creating and distributing AI services.
- Flexible integration of AI from component level through to the cloud: Based on the resource information from metadata management, service management can distribute the services in such a way that the resources are used optimally. AI management distributes the AI models; the data interface of the AI services ensures the data flow, i.e. communication with other AI services or data sources.
- Automation of the AI lifecycle: Service management coordinates the AI services. For example, an AI training service can be informed that another AI service requires a newly trained AI model. AI management implements the retraining process and distributes the newly trained AI model.
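How services might be distributed on the basis of resource information can be sketched with a simple best-fit heuristic: place the largest services first and put each one on the node that it fills most tightly. This is an illustrative assumption, not the placement strategy of the framework, and a greedy heuristic does not guarantee an optimal result. The `Node` and `ServiceRequest` types and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: float       # free CPU cores
    free_memory_mb: int   # free memory in MB


@dataclass
class ServiceRequest:
    name: str
    cpu: float
    memory_mb: int


def place(services: list[ServiceRequest], nodes: list[Node]) -> dict[str, str]:
    """Best-fit greedy placement: largest services first, tightest fitting node."""
    placement: dict[str, str] = {}
    for service in sorted(services, key=lambda s: (s.cpu, s.memory_mb), reverse=True):
        candidates = [
            node for node in nodes
            if node.free_cpu >= service.cpu and node.free_memory_mb >= service.memory_mb
        ]
        if not candidates:
            raise RuntimeError(f"no node can host '{service.name}'")
        # Prefer the node with the least CPU (then memory) left over after placement.
        target = min(
            candidates,
            key=lambda n: (n.free_cpu - service.cpu, n.free_memory_mb - service.memory_mb),
        )
        target.free_cpu -= service.cpu
        target.free_memory_mb -= service.memory_mb
        placement[service.name] = target.name
    return placement


nodes = [Node("edge-device", 2.0, 2048), Node("cloud", 16.0, 65536)]
services = [
    ServiceRequest("ai-training", 8.0, 16384),
    ServiceRequest("ai-analysis", 1.0, 1024),
    ServiceRequest("ai-monitoring", 0.5, 512),
]
placement = place(services, nodes)
print(placement)
```

In this example the training service only fits in the cloud, while the lightweight analysis and monitoring services end up on the edge device close to the data source.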
Workflow with AI support
This is what a workflow using the framework could look like:
1. The domain expert provides the framework with metadata of the assets - including the available data - as well as their relationship to each other. In certain situations, it is also conceivable to automatically recognize the relationship between the individual assets.
2. In the production systems, all relevant data is fed into a database. Metadata is also available. A data scientist can access this data and a topology overview via the data interface. Based on this, an AI model and the associated AI training service are developed. A data scientist implements the service via the same data interface.
3. The relevant information is defined for the distribution and operation of the services, e.g. required and provided data, required resources and optimization targets.
4. The AI training service can then be distributed flexibly, e.g. on an edge device or in the cloud. The service trains an AI model based on the historical data and then makes it available in a model repository.
5. The AI user creates an AI analysis service that can also be distributed flexibly, e.g. close to the data source. This service uses the trained AI model to analyze real-time data.
6. An additional AI monitoring service can monitor the quality of the analysis - and therefore the AI model - in parallel. A signal for the retraining process is transmitted in the event of poor model quality, but also in the event of retooling or new data. The AI model is then automatically retrained and distributed.
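The monitoring step can be sketched as a rolling quality check that signals when retraining should be requested. The sketch covers only the "poor model quality" trigger and assumes that feedback on whether each analysis result was correct is available, for example from manual inspection. The `QualityMonitor` class, the window size and the threshold are hypothetical.

```python
from collections import deque


class QualityMonitor:
    """Tracks a rolling window of analysis outcomes and flags model degradation."""

    def __init__(self, window: int, min_accuracy: float) -> None:
        self._outcomes = deque(maxlen=window)
        self._min_accuracy = min_accuracy

    def record(self, correct: bool) -> bool:
        """Store one outcome; return True if retraining should be requested."""
        self._outcomes.append(correct)
        window_full = len(self._outcomes) == self._outcomes.maxlen
        accuracy = sum(self._outcomes) / len(self._outcomes)
        # Only signal once enough outcomes have been collected to judge quality.
        return window_full and accuracy < self._min_accuracy


monitor = QualityMonitor(window=4, min_accuracy=0.75)
outcomes = [True, True, True, True, False, True, False, False]
signals = [monitor.record(outcome) for outcome in outcomes]
print(signals)
```

Waiting until the window is full avoids requesting retraining on the basis of one or two early misclassifications; in a real deployment the signal would be passed on to the training service rather than printed.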
Model application demonstrates how it works
Figure 3: Demonstrator setup (left): Four stations with PLC control that assemble a product. A camera captures an image of the end product. This is analyzed for errors using AI (right). A training service provides the analysis service with a trained AI model. The analysis service can trigger retraining in the training service.
© Fraunhofer IKS
To test the framework, Fraunhofer IKS is developing a demonstrator consisting of several stations with PLCs that assemble a product from various individual parts (Figure 3).
AI is now to be integrated into this demonstrator for quality assurance and error prediction. To this end, an AI model is being developed that analyzes images of the end product for defects. The framework manages this AI model and the required AI services for training and operation, for example to enable retraining of the model. Furthermore, data from the individual PLCs is to be used to predict errors. In addition to the real-time data, a topology overview and metadata should also be available.
The framework is geared towards Industry 4.0. However, it can also be used in other domains, especially where a large amount of data (machine data, process data) is generated and needs to be analyzed in real time using AI. Because it uses flexible AI services, the framework adapts to the existing infrastructure so that resources can be used optimally.
The framework then uses the existing data to generate information and thus added value. Networking production is necessary in order to make as much data as possible available and to enable the distribution of AI services. The framework supports various communication protocols for data transfer.
Framework facilitates AI access
The framework provides a solid basis for AI integration. It supports and optimizes the AI lifecycle by offering flexibility, adaptivity and decentralized, service-oriented management of AI services, as well as the (semi-)automation of AI. It is also open, interoperable and technology-neutral. The framework enables existing solutions such as containers and component integration to be combined and used flexibly. Both Industry 4.0 standards and proprietary technologies are supported, making brownfield and greenfield approaches possible. The standardized data interface facilitates access to the required data regardless of the technology used. An overview of the topology and metadata helps to understand the problem to be analyzed.
The author: Hoai My Van is a research associate at Fraunhofer IKS. Her research focuses on self-adaptive systems for and with AI.
© Fraunhofer IKS
What's more, you don't need to be an AI expert to use the framework: because it supports AutoML, AI models can be created without additional AI expertise. In addition, training, operation, etc. are platform-independent thanks to a standardized programming interface.
| Funding information |
|---|
| As part of the project "REMORA - Multi-Stage Automated Continuous Delivery for AI-based Software & Services Development in Industry 4.0", a data interface for uniform and flexible data access and a framework for the data and AI lifecycle are being developed, and a reference implementation for the framework is being created. This project is funded by the Bavarian State Ministry of Economic Affairs, Regional Development and Energy and supported by Bayern Innovativ - Bayerische Gesellschaft für Innovation und Wissenstransfer mbH. |

















