MVTec
It is not creative
To what extent can deep learning meet today's user expectations? It all comes down to combining deep learning with traditional image processing, say Dr. Olaf Munkelt and Christian Eckstein from MVTec.
The heat map for deep learning products from MVTec marks the areas in the image that were decisive for the deep learning network's decision.
© MVTec
Mr Eckstein, we encounter deep learning on a daily basis - from search engine queries and automatic translations to the individualized filtering of content on social media. We regularly use artificial intelligence to make decisions for us. The use of AI is also increasing in an industrial context. How does the technology work compared to traditional approaches to image processing?
Christian Eckstein, Business Developer & Partner Manager at MVTec: "Contrary to what the media sometimes suggest with the term AI, learning systems can only solve narrowly defined problems in a highly specialized way."
© MVTec
Christian Eckstein: When we talk about artificial intelligence in image processing, we usually mean learning systems, especially deep learning. In most methods, this learning works much as it does at school, where a pupil is given a problem to solve. In a traditional approach, the pupil would follow a predetermined solution path; in a self-learning method, they would find a solution themselves by working iteratively through a large number of tasks and their solutions. It follows that with deep learning, in contrast to traditional image processing, a developer does not explicitly describe and process the properties of an image. Instead, neural networks are trained on large image data sets and learn from them to identify and evaluate the relevant image properties. This approach can overcome previously unsolvable problems and has led to astounding successes in recent years.
Dr. Munkelt, you claim that expectations of deep learning generally exceed what reality can currently achieve. Why do you think this is the case?
Dr. Olaf Munkelt: Expectations are based on the one hand on the media and advertising messages from some manufacturers, and on the other hand on the direct experiences of people who come into contact with this technology in their private lives. And there are amazing new possibilities in many areas, such as digital assistants or self-driving cars.
For industry, however, deep learning as a new technology must first prove that - and in which areas - it is actually the better option. This must be done in the context of relevant industrial applications. For example, the technology must prove itself in terms of investment costs, speed, quality, consistency, maintainability and traceability.
In which of these areas is there still some catching up to do in terms of deep learning?
Dr. Olaf Munkelt: Let's take the example of investment costs and speed: deep learning is technically a good alternative for many applications, but performance and memory requirements are often very high. Learning systems cannot produce creative solutions. They fail to find shortcuts.
Instead, they try to solve the problem with data volume and computing power. Back to the school analogy: a self-learning student would patiently add up all the numbers from 1 to 100; a rule-based approach, like the young Gauss at school, would instead apply the Gaussian summation formula and solve the same problem in a fraction of the time. Efficiency and consistency, however, are of great importance in industrial applications - for example, the strict cycle times of industrial plants require short and constant calculation cycles. In addition, many deep learning applications require high-performance hardware with high energy consumption and acquisition costs. Deep learning is therefore still too slow for high-speed applications in some cases. In practice, highly differentiated systems with traditional image processing are used here.
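The Gauss analogy can be made concrete with a short sketch. The two function names below are chosen for illustration only; the first adds the numbers one by one, the second applies the closed-form summation formula n(n + 1)/2. Both give the same result, but the second needs a single calculation regardless of how large n is.

```python
def sum_by_iteration(n: int) -> int:
    """Add up 1 + 2 + ... + n one term at a time (n additions)."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_by_gauss(n: int) -> int:
    """Gaussian summation formula: n * (n + 1) / 2, a constant amount of work."""
    return n * (n + 1) // 2


if __name__ == "__main__":
    print(sum_by_iteration(100))  # 5050
    print(sum_by_gauss(100))      # 5050
```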
What are the challenges in practical use and maintenance?
Christian Eckstein: Industrial plants are subject to continuous change. These changes often also affect the inspected objects or the environment, which means that the image processing component has to be adapted accordingly. With traditional systems, this is usually solved by adjusting the relevant control parameters. In a deep learning application, such parameters exist only to a limited extent: they can only be adjusted in post-processing, which is then carried out using traditional image processing. Contrary to what the media sometimes suggest with the term AI, learning systems can only solve narrowly defined problems in a highly specialized way. Generalizability, i.e. the transfer of knowledge to a similar problem, is largely unresolved. As a result, the deep learning model usually has to be retrained with new data after the change. Simply continuing to train an existing network with new, additional images is not a way out either, because this is also fundamentally an unsolved problem: current approaches result in a 'catastrophic forgetting' of the old knowledge, or they are merely workarounds.
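The effect of 'catastrophic forgetting' can be illustrated with a deliberately tiny toy model. The sketch below is an assumption-laden illustration, not an MVTec method: the model is a single weight w in y = w * x, and the two "tasks" are invented data sets. The model is first trained on task A, then training simply continues on task B alone. Afterwards the error on task A is large again, because the gradient updates for the new data have overwritten what was learned from the old data.

```python
def mse(w, data):
    """Mean squared error of the one-weight model y = w * x on a data set."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(w, data, lr=0.05, epochs=200):
    """Plain gradient descent on the mean squared error, starting from weight w."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


# Two invented tasks (hypothetical data): task A follows y = 2x, task B follows y = -x.
xs = [0.5, 1.0, 1.5, 2.0]
task_a = [(x, 2.0 * x) for x in xs]
task_b = [(x, -1.0 * x) for x in xs]

w_after_a = train(0.0, task_a)            # learn task A
loss_a_before = mse(w_after_a, task_a)    # error on A is close to zero

w_after_b = train(w_after_a, task_b)      # continue training on task B only
loss_a_after = mse(w_after_b, task_a)     # error on A has grown: A was "forgotten"

if __name__ == "__main__":
    print(f"error on task A after training on A: {loss_a_before:.6f}")
    print(f"error on task A after further training on B: {loss_a_after:.3f}")
```

Real networks have many more parameters than this toy model, and retraining on the old and new images together avoids the effect here, which mirrors the full retraining described above.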
The smart labeling of MVTec's deep learning tool automatically provides label suggestions. The user only labels a circumscribing rectangle.
© MVTec
This susceptibility to changing environmental conditions and the occasionally high cost of acquiring new images and subsequent retraining prove to be a challenge in practice. To minimize the dependency on environmental influences, deep learning is usually combined with classic image processing for pre- and post-processing. This is the only way to exploit the full potential.
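How such a combination might look in principle is sketched below. Everything in it is hypothetical: the function names, the parameters threshold and min_pixels, and above all dummy_model, which merely stands in for a trained network and is not an MVTec or HALCON API. The point is the structure: a classic pre-processing step reduces the influence of lighting changes, and a classic post-processing step exposes parameters that can be adjusted without retraining the network.

```python
from typing import Callable, List

Image = List[List[float]]


def normalize(image: Image) -> Image:
    """Classic pre-processing: stretch gray values to the range [0, 1]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a uniform image
    return [[(v - lo) / span for v in row] for row in image]


def postprocess(score_map: Image, threshold: float, min_pixels: int) -> bool:
    """Classic post-processing: report a defect only if enough pixels score high."""
    hits = sum(1 for row in score_map for v in row if v >= threshold)
    return hits >= min_pixels


def inspect(image: Image, model: Callable[[Image], Image],
            threshold: float = 0.8, min_pixels: int = 2) -> bool:
    """Pre-processing, then the learned model, then adjustable post-processing."""
    return postprocess(model(normalize(image)), threshold, min_pixels)


def dummy_model(image: Image) -> Image:
    """Hypothetical stand-in for a trained network: returns the input as a score map."""
    return image


if __name__ == "__main__":
    bright = [[10, 10, 10], [10, 200, 200], [10, 10, 10]]
    dim = [[5, 5, 5], [5, 100, 100], [5, 5, 5]]  # same part, half the illumination
    print(inspect(bright, dummy_model))                 # True
    print(inspect(dim, dummy_model))                    # True
    print(inspect(bright, dummy_model, min_pixels=3))   # False
```

In this sketch, a change in the plant such as a stricter defect size limit is handled by changing min_pixels, while the model itself stays untouched.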
Many users are still hesitant about using deep learning, as AI solutions are supposedly less comprehensible than rule-based systems. Is this skepticism justified?
Dr. Olaf Munkelt: The lack of traceability of decisions made by deep learning-based systems is currently still an obstacle to the use of the technology in some cases.
Imagine the following horror scenario for a quality engineer: A car manufacturer launches a recall campaign. The cause is a component for whose quality and testing this engineer was responsible. He now needs documentation of the traceable criteria on the basis of which he approved the system. The answer to the question of why the system found the parts to be good cannot simply be that the neural network classified them as good parts.
In contrast to deep learning, traditional image processing usually explicitly describes the image properties on the basis of which decisions are made - this simplifies traceability. MVTec, for example, offers various technologies to open up the 'black box' of a neural network and simplify traceability, especially for industrial applications.
Where is deep learning research heading? Do you think that future technologies will be able to address the previously unsolved problems?
Christian Eckstein: Deep learning is enjoying massive investment in research, both from institutional and private sources. This is leading to progress in this area, such as the development of new network architectures.
However, we are also seeing diminishing returns on these investments, meaning that the amazing growth of the past cannot be repeated. We should not hope that future improvements alone will solve the unsolved cases, even if this is currently the prevailing view in research. Instead, in practice we are seeing an increasing focus on the quality of the data sets on which the networks are based. The approaches in research and practice differ greatly here.
To what extent do the problems in research and practice relating to deep learning differ?
Christian Eckstein: In the field of deep learning, research starts with an existing data set. The researchers compete against each other in an attempt to develop the best network architectures. The aim is to achieve the best result on the existing data. Using a lot of computing power and complex mathematics, they outperform each other by a few percent or fractions thereof.
In industrial practice, the procedure is reversed: here, the user does not start with an existing data set, but has to record and annotate the data in the first step.
Users in industry often take a similar approach to research, for example by building on the latest open source experiments. However, they usually get no improvement in the result, or only a marginal one, despite complex and performance-hungry models. The reason for this is that the data volumes in practice are usually significantly smaller than in research. As a result, errors in the annotation or underrepresented classes are not so easy to identify. Smaller amounts of data require creative, intelligent solutions that are often programmed by an image processing specialist. In this way, the best result can be achieved with fewer resources.
What significantly increases performance is therefore the focus on a clean, well-described data set and the combination of deep learning with traditional image processing. The corresponding data can be generated through good project management and the use of suitable tools - such as the MVTec Deep Learning Tool. Deep learning and traditional image processing are best combined when both technologies are seamlessly integrated in a solution such as MVTec Halcon.
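One simple check on the way to a clean data set is to look for underrepresented classes before any training takes place. The following is a minimal sketch under stated assumptions: the function name, the min_share parameter and the example labels are invented for illustration and are not part of any MVTec tool.

```python
from collections import Counter
from typing import Iterable, List


def underrepresented_classes(labels: Iterable[str], min_share: float = 0.1) -> List[str]:
    """Return the classes whose share of all annotations is below min_share."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sorted(name for name, n in counts.items() if n / total < min_share)


if __name__ == "__main__":
    # Hypothetical annotation list: 90 good parts, 8 scratches, 2 dents.
    labels = ["good"] * 90 + ["scratch"] * 8 + ["dent"] * 2
    print(underrepresented_classes(labels))  # ['dent', 'scratch']
```

A result like this would suggest recording and annotating more images of the rare defect types before investing in a more complex model.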