MVTec
The 'eye of production'
Machine vision makes a significant contribution to the end-to-end automation and optimization of production chains. Process efficiency can be further increased by using artificial intelligence methods such as deep learning.
Machine vision plays an important role in the end-to-end automation of value chains in production companies. In line with Industry 4.0, machine vision acts as an important accompanying technology and is effectively the 'eye of production'. Image acquisition devices such as cameras, scanners or 3D sensors, positioned at various points, continuously monitor the production and intralogistics processes. The recorded digital image data is collected and processed by integrated machine vision software. The data can then be used for a wide range of applications that optimize the entire production process.
For example, a wide variety of objects can be reliably recognized and assigned based on their visual appearance. In this way, finished products can be identified and tracked along the entire process chain - on the basis of external features as well as printed data codes or character combinations. In addition, the entire handling of objects can be optimized and automated: workpieces are precisely localized and optimally positioned for processing, and robots can recognize objects precisely and grip them in a targeted manner. The interaction between machines and people also becomes safer and more efficient, as dangerous collisions between the two can be avoided thanks to seamless machine vision monitoring.
High error detection rates thanks to AI
Another important field of application for industrial image processing is defect inspection. The technology reliably detects defects of all kinds by comparing the actual condition with the target condition. Defective goods can then be sorted out accurately and automatically, which minimizes the production of rejects. In this way, industrial image processing helps to take quality assurance to a new level. The benefit is even greater when methods based on artificial intelligence (AI) are used, in particular deep learning based on convolutional neural networks (CNNs). If corresponding algorithms are integrated into the machine vision software, even higher defect detection rates can be achieved.
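The comparison of actual and target condition described above can be illustrated with a minimal sketch. It assumes small grayscale images given as nested lists of equal size that are already aligned, plus a hypothetical tolerance `max_diff`; the function names and the threshold are illustrative and not taken from any machine vision product.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, values 0..255


def find_deviations(actual: Image, target: Image, max_diff: int = 30) -> List[Tuple[int, int]]:
    """Return (row, col) positions where actual differs from target by more than max_diff."""
    same_shape = len(actual) == len(target) and all(
        len(a) == len(t) for a, t in zip(actual, target)
    )
    if not same_shape:
        raise ValueError("actual and target images must have the same shape")
    return [
        (r, c)
        for r, (row_a, row_t) in enumerate(zip(actual, target))
        for c, (a, t) in enumerate(zip(row_a, row_t))
        if abs(a - t) > max_diff
    ]


def is_defective(actual: Image, target: Image, max_diff: int = 30, max_pixels: int = 0) -> bool:
    """Flag the part when more than max_pixels pixels deviate from the target image."""
    return len(find_deviations(actual, target, max_diff)) > max_pixels


# Hypothetical 3x3 example: the centre pixel is much darker than in the target image.
target = [[200, 200, 200], [200, 200, 200], [200, 200, 200]]
actual = [[198, 201, 200], [200, 90, 200], [200, 200, 203]]
deviations = find_deviations(actual, target)
```

A fixed pixel tolerance like this is the kind of manually defined rule that becomes hard to maintain once lighting, position or defect types vary.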
Deep learning uses large amounts of digital image information generated by the image acquisition devices. The technology analyzes the data in depth as part of a comprehensive training process. In this process, the software learns the characteristics that are typical of a specific object class, so that new image data can be precisely assigned to that class. The objects and defects visible in the images are thus classified automatically, which significantly improves their detection and localization.
In addition to AI-based solutions, rule-based systems can also be used for inspection tasks. However, the latter require very complex programming: developers have to manually define rules in order to extract the information relevant for defect detection from the image data. If an unmanageably large number of possible defects is to be expected, the effort of developing a rule-based solution quickly becomes disproportionate. AI technologies such as deep learning are therefore the better choice in such cases. The algorithms learn independently through training, identify relevant features automatically and can therefore clearly distinguish individual object classes from one another. This allows a large number of heterogeneous defects to be detected more effectively.
Training preparation in several steps
With deep learning, both 'bad images' and 'good images' can be used for fault inspection.
© MVTec Software
However, the use of deep learning methods also requires a certain amount of effort. The training must be well prepared: the first step is to generate, collect and provide the relevant image data. The objects with the defects to be recognized must be visible in these images (so-called 'bad images'). The images are then labeled, which means that they are given a digital label that clearly assigns them to a specific object or defect class. A label might be 'crack', 'tear' or 'notch', for example. Only then is the underlying CNN trained with the selected images.
In order to reduce the deep learning effort, some providers already make pre-trained networks available. Modern machine vision standard software such as 'Halcon' from MVTec already includes corresponding functions. The integrated deep learning networks have been pre-trained with an enormous number of images. These pre-trained networks are optimized for applications in an industrial environment and are free of third-party rights, which eliminates licensing concerns. This means that users only need a few images for application-specific training and save time, effort and costs. In this way, companies can increase the profitability of their deep learning-based inspection processes.
The challenge: generating enough 'bad images'
The recognition process using AI consists of the actual training and the execution of the test procedure (inference).
© MVTec Software
However, a certain amount of work remains for labeling your own images. Depending on the individual application, a minimum of 150 to 300 training images per defect type is required. These are 'bad images', i.e. images that show the corresponding objects with the defects to be detected in various forms. In most cases, however, companies do not have such a large number of 'bad images' available. In addition, the specific nature of the possible defects and the sources of error are usually not known in advance. Because of this uncertainty, procuring and labeling such images would involve a disproportionate amount of effort, which is generally not economical for companies.
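Whether a labeled data set meets such a minimum can be checked before training starts. The following sketch assumes a hypothetical list of (file name, label) pairs and uses 150 images, the lower end of the range mentioned above, as the threshold; all names and file paths are illustrative.

```python
from collections import Counter

MIN_IMAGES_PER_CLASS = 150  # assumption: lower end of the 150-300 range


def count_labels(labeled_images):
    """Count how many images carry each defect label."""
    return Counter(label for _path, label in labeled_images)


def underrepresented(labeled_images, minimum=MIN_IMAGES_PER_CLASS):
    """Return {label: number of missing images} for classes below the minimum."""
    counts = count_labels(labeled_images)
    return {label: minimum - n for label, n in counts.items() if n < minimum}


# Hypothetical data set: plenty of 'crack' images, too few 'notch' images.
dataset = (
    [(f"crack_{i:03d}.png", "crack") for i in range(180)]
    + [(f"notch_{i:03d}.png", "notch") for i in range(40)]
)
missing = underrepresented(dataset)
```

In this made-up data set, `missing` reports that the 'notch' class is 110 images short, which illustrates how quickly rare defect types become the bottleneck.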
A practicable solution to this problem is offered by a technology that is part of the standard machine vision software 'MVTec Halcon': the 'Anomaly Detection' feature requires only so-called 'good images' that depict the respective object in perfect condition. These are much easier to obtain than 'bad images'. In addition, the training data does not need to be explicitly labeled, as no defects are visible in the images. Between 20 and 100 training images are sufficient to achieve robust recognition rates. In test series with version 20.05 of 'MVTec Halcon', training with 20 images took only around six minutes.
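The idea of learning only from 'good images' can be illustrated with a deliberately simple statistical baseline. This is not the method used in Halcon, which is based on deep learning; it is a sketch that assumes aligned grayscale images as nested lists and models each pixel by its mean and standard deviation across the good images. The function names and the smoothing term `eps` are assumptions made for this example.

```python
from statistics import mean, pstdev


def fit_good_model(good_images):
    """Compute the per-pixel mean and standard deviation over defect-free images."""
    rows, cols = len(good_images[0]), len(good_images[0][0])
    means = [[mean([img[r][c] for img in good_images]) for c in range(cols)]
             for r in range(rows)]
    stds = [[pstdev([img[r][c] for img in good_images]) for c in range(cols)]
            for r in range(rows)]
    return means, stds


def score_image(image, model, eps=1.0):
    """Return a per-pixel score: the distance from the 'good' mean in units of (std + eps)."""
    means, stds = model
    return [
        [abs(px - m) / (s + eps) for px, m, s in zip(row, mean_row, std_row)]
        for row, mean_row, std_row in zip(image, means, stds)
    ]


# Hypothetical 2x2 'good images' with slight natural variation, no labels needed.
good = [
    [[100, 100], [100, 100]],
    [[102, 98], [100, 101]],
    [[98, 102], [100, 99]],
]
model = fit_good_model(good)

# A test image whose bottom-right pixel deviates strongly from every good image.
scores = score_image([[101, 100], [100, 160]], model)
```

The bottom-right pixel receives by far the highest score, while pixels within the normal variation stay close to zero. No 'bad image' and no label was needed to fit the model.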
Once the training is complete, the system creates an 'anomaly map' during the execution of the test step (inference). The map uses gray values to highlight regions in which there is a high probability of an anomaly. This segmentation allows deviations to be detected, localized and measured with pixel precision.
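How an anomaly map can be turned into localized, measured regions is sketched below. The example assumes the map is a nested list of scores between 0 and 1 and uses a hypothetical threshold. It groups neighboring pixels above the threshold into regions with a simple flood fill and reports the area and bounding box of each region. It does not use or reproduce the Halcon API.

```python
from collections import deque


def segment_anomalies(anomaly_map, threshold):
    """Group 4-connected pixels with score >= threshold into regions.

    Returns a list of dicts with the pixel count ('area') and the bounding
    box as (min_row, min_col, max_row, max_col).
    """
    rows, cols = len(anomaly_map), len(anomaly_map[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or anomaly_map[r][c] < threshold:
                continue
            # Breadth-first flood fill starting from an unvisited anomalous pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and anomaly_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            regions.append({"area": len(pixels),
                            "bbox": (min(ys), min(xs), max(ys), max(xs))})
    return regions


# Hypothetical 4x4 anomaly map with one three-pixel region and one single pixel.
anomaly_map = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.95],
]
regions = segment_anomalies(anomaly_map, threshold=0.5)
```

The choice of threshold determines how sensitive the inspection is: a lower value finds smaller deviations but also raises the risk of false alarms.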