Machine Learning

Armin Erich | Günter Herkommer

Artificial intelligence requires computing power

The integration of AI in industrial applications is increasing the need for computing solutions that can cope with the complexity of the tasks involved. In addition to the CPU, processors specialized in AI algorithms and large amounts of data are becoming increasingly important.


Industrial applications in the field of artificial intelligence have significantly higher performance requirements than previous 'standard' applications. In addition, these applications are increasingly being moved closer to the sensors or directly onto the machine. The reason is that the large, high-bandwidth volumes of data they generate can hardly be transported over a poorly controllable public data network while still complying with the real-time requirements of an industrial environment - not to mention the issue of data security.

However, AI on the machine requires powerful and robust hardware. How must this hardware be put together, how do the technologies available on the market differ in their functionality, and in which scenarios can each be used sensibly? To answer these questions, let us first take a look at the terms commonly used in the context of AI.

Definitions and delimitation of terms

Artificial intelligence generally involves the execution of tasks using human intelligence strategies and human behaviour, such as evaluation, reasoning, solution-oriented thinking and subsequent optimization based on positive or negative results (so-called experience-based learning).

Machine learning is an approach to implementing artificial intelligence using computer technology. Self-optimizing methods ensure that a machine solves a task better and faster over time, because each result is fed back into the method. More data generally leads to better results, as there is more underlying information for decision-making.


The terms are often confused. Deep learning is a sub-area of machine learning, which in turn is a sub-area of artificial intelligence.

© Inonet

Deep learning describes the approach of replicating the functioning of the human brain with its complex neural networks using computer technology. Deep learning is therefore a sub-area of machine learning that uses a neural network structure to solve tasks automatically. The basis for deep learning is currently 'big data', i.e. the analysis of huge amounts of data. During a training phase, a machine learns to use abstraction to evaluate properties and make (correct) decisions or classifications - based on image data evaluation, structure-borne sound detection or speech analysis, for example. The data required for the training process is collected by sensors for various physical variables.

As part of a learning process, the first step is to generate a model from a large amount of data, trained specifically for the task at hand. For the optimization (i.e. the learning process) to work, external feedback must always be provided. The result of this compute-intensive process is a (software) model that can then be transferred to a significantly less powerful machine, which handles precisely this task. Applying the trained model in this way is called inference, and the corresponding devices are referred to as inference computers.

As already mentioned, deep learning works with a simulation of neural networks. The neurons are arranged in a varying number of individual layers (the depth of the network), and the neurons of one layer are connected to the neurons of the previous and subsequent layers. Each neuron performs several operations: each input x is multiplied by a weighting value y, the products are summed, an offset value n is added, and the result is passed on to the neurons of the subsequent layer. If the input values are a matrix instead of a vector, this is referred to as a 'convolutional neural network'. In mathematical terms, all of this comes down to 'multiply-accumulate' operations (multiplication of two values with subsequent addition of a further value) and matrix multiplications, which in turn consist of a large number of such operations. These operations require an extremely large amount of computing power, which can be provided by various types of special hardware (coprocessor concept).
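The arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `neuron_output`, `dense_layer` and `mac_count` are invented for this example and do not come from any framework, the symbols x, y and n follow the text, and no activation function is applied because the text does not mention one.

```python
from typing import List


def neuron_output(x: List[float], y: List[float], n: float) -> float:
    """Weighted sum of the inputs x with weights y, plus the offset n."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    acc = n  # start the accumulator with the offset value
    for xi, yi in zip(x, y):
        acc += xi * yi  # one multiply-accumulate (MAC) step
    return acc


def dense_layer(x: List[float], weights: List[List[float]],
                offsets: List[float]) -> List[float]:
    """One layer in which every neuron sees all inputs (a matrix-vector product)."""
    return [neuron_output(x, y_row, n) for y_row, n in zip(weights, offsets)]


def mac_count(num_inputs: int, num_neurons: int) -> int:
    """Number of MAC steps needed to evaluate one fully connected layer once."""
    return num_inputs * num_neurons


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    weights = [[0.5, 0.0, 1.0], [1.0, 1.0, 1.0]]  # illustrative values only
    offsets = [0.5, 0.0]
    print(dense_layer(x, weights, offsets))
    print(mac_count(1000, 1000))
```

Even a modest layer with 1,000 inputs and 1,000 neurons needs one million MAC steps per evaluation. The steps belonging to different neurons do not depend on each other, which is the property that parallel hardware exploits.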

To clarify the difference between training and inference: during the training phase, the aforementioned parameters y and n are changed (optimized) for all 'neurons' and their connections on the basis of the respective results until the network performs the task correctly with a predetermined accuracy. At that point the optimization of the network for the respective task is complete, i.e. all y- and n-values are fixed, and the model can be transferred to an inference computer. No further learning takes place on the latter.
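The split between training and inference can be illustrated with a single 'neuron'. The sketch below is an assumption-laden toy, not the method used by any particular framework: it uses plain gradient descent on the mean squared error (the text does not name an optimization method), the names `train` and `predict` are hypothetical, and JSON merely stands in for whatever model format is actually transferred.

```python
import json
from typing import List, Tuple


def predict(x: float, y: float, n: float) -> float:
    """Inference: apply the fixed parameters y and n; nothing is learned here."""
    return x * y + n


def train(samples: List[Tuple[float, float]], rate: float = 0.05,
          tolerance: float = 1e-6, max_epochs: int = 10_000) -> Tuple[float, float]:
    """Training: adjust y and n until the mean squared error drops below tolerance."""
    y, n = 0.0, 0.0
    for _ in range(max_epochs):
        grad_y = grad_n = error = 0.0
        for x, target in samples:
            diff = predict(x, y, n) - target  # feedback on the result
            error += diff * diff
            grad_y += 2 * diff * x
            grad_n += 2 * diff
        if error / len(samples) < tolerance:
            break  # predetermined accuracy reached, parameters are now fixed
        y -= rate * grad_y / len(samples)
        n -= rate * grad_n / len(samples)
    return y, n


if __name__ == "__main__":
    # Illustrative training data following target = 2 * x + 1
    data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
    y, n = train(data)                      # runs on the powerful training machine
    model = json.dumps({"y": y, "n": n})    # the frozen model that is transferred
    frozen = json.loads(model)              # loaded on the inference computer
    print(predict(4.0, frozen["y"], frozen["n"]))
```

The loop in `train` is the expensive part and touches every sample many times; `predict` is a single multiply-accumulate. That asymmetry is why the inference computer can be much smaller than the training machine.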

Special hardware for training processes and inferences?

In view of the mathematical peculiarities described above, there are various hardware approaches whose architecture makes them particularly suitable for executing these operations quickly and efficiently.

The CPU is the heart of every computer and is characterized by a complex hardware architecture and a universal instruction set. This predestines the CPU for very flexible processing of a wide variety of algorithms with different objectives. This universality comes at the cost of sub-optimal performance for dedicated tasks, meaning that its suitability for AI use is initially limited. Nevertheless, modern high-performance server CPUs with many cores and multithreading can achieve very good performance for AI tasks and can also be used for model generation in data centers. For an inference scenario, i.e. the application of an already trained model with low to medium performance requirements, less powerful CPUs can generally be used without any problems.

Brain cells are linked by synapses and communicate bidirectionally. The conversion of information into action takes place at so-called motor neurons. In biology, the whole thing is called a neural network.

© Inonet

Regardless of this, there is hardware that is much better suited to deep learning scenarios, such as graphics processors, or GPUs (Graphics Processing Units) for short. The GPU can be part of the CPU, sit as a separate chip on the mainboard, or be connected to the mainboard as a plug-in card (usually via PCIe). Compared to a CPU, computing power is increased immensely by parallelizing computing tasks across a large number of computing units. Both consumer and professional graphics cards can be used for this purpose; the former are cheaper to purchase initially, while the latter have a significantly longer service life.

VPUs (Vision Processing Units) have recently become increasingly popular in the inference environment for deep learning scenarios based on still and moving image data. This hardware is designed for industrial use, is more durable and tolerates an extended ambient temperature range. Manufacturers of VPU modules include, for example, Nvidia with the Jetson TX2 module and Intel (Movidius) with the Myriad X. In addition to being industrial-grade, VPU modules deliver medium to high performance for inference machines, for example for the simple analysis of image data at up to 9 FPS (frames per second), with a relatively low power consumption of just 4 W in the case of a Myriad 2 VPU. VPUs are usually offered on plug-in modules whose performance scales from one to currently eight VPUs and which have standard interfaces such as PCIe, mPCIe, M.2 or USB. Due to their compactness (the USB units are only the size of standard USB sticks), these modules are easy to integrate into industrial PCs and can therefore be used for edge computing.

FPGAs (Field Programmable Gate Arrays) are digital components whose hardware structure (logic circuits) is itself programmable. FPGA cards have a very dynamic power consumption, as the individual configuration options of the hardware allow them to be adapted directly to the application, so they can deliver maximum efficiency and performance for the respective (AI) application. Implementing hardware structures that work in parallel increases performance even further. On the other hand, the individual development effort is high and is generally only worthwhile for applications with larger quantities.

On site or in the cloud?

In general, it is possible to carry out both model generation (training) and the subsequent processing of a task (inference) either on site, for example on the shop floor (edge intelligence), or remotely, for example in a data center (cloud). In edge computing the computing power is located locally, which usually ensures real-time capability without significant delays, whereas the cloud approach generally does not have real-time capability and requires a high bandwidth for transmitting the data. In contrast to edge computing, however, cloud computing offers easily scalable computing power - for example through virtualization - with worldwide access. In addition, unlike the hardware used on the shop floor, the hardware in the data center does not have to meet industrial requirements and is therefore significantly cheaper to purchase.
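A back-of-the-envelope calculation shows why bandwidth matters for the cloud approach. The following sketch is purely illustrative: the camera resolution, frame rate and uplink speed are hypothetical figures chosen for the example and do not appear in the article, and the calculation ignores compression, latency and protocol overhead.

```python
def raw_stream_mbit_per_s(width: int, height: int,
                          bytes_per_pixel: int, fps: float) -> float:
    """Uncompressed video data rate in megabits per second."""
    bits_per_frame = width * height * bytes_per_pixel * 8
    return bits_per_frame * fps / 1_000_000


def transfer_time_ms(payload_bytes: int, link_mbit_per_s: float) -> float:
    """Time to push one payload through a link, ignoring latency and overhead."""
    return payload_bytes * 8 / (link_mbit_per_s * 1_000_000) * 1000


if __name__ == "__main__":
    # Hypothetical camera: 1920 x 1080 pixels, 3 bytes per pixel, 30 frames per second
    print(raw_stream_mbit_per_s(1920, 1080, 3, 30))
    # One such frame over a hypothetical 100 Mbit/s uplink
    print(transfer_time_ms(1920 * 1080 * 3, 100))
```

Under these assumptions a single uncompressed camera produces roughly 1.5 Gbit/s, and one frame needs about half a second on a 100 Mbit/s link, far longer than the 33 ms between two frames. Compression changes the numbers considerably, but the example shows why evaluating the data next to the sensor is attractive.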

In inference applications, much more compact device classes can be used than in training applications to implement the previously learned processes.

© Inonet

Today, AI systems are typically defined, created and trained by specialists in dedicated data centers. These experts often specialize in a specific AI framework; MXNet, TensorFlow and Caffe2 are just three examples of many. For some applications, the potential inference hardware is already specified during AI system design and training.

However, there is also a trend towards adding already trained inference systems as building blocks to an individual AI application. This is always possible if the AI application can be defined for standardized tasks. Examples of this would be the visual recognition of traffic signs or the recognition and reading of text in photos. Here, the inference system can be selected as a software module from a number of providers, qualified and integrated into the company's own product. It is essential that the AI system is not too strongly geared towards a specific target system during the system design phase and that the user still has enough flexibility to implement the requirements of their application.

This flexible approach is supported, for example, by toolkits such as Intel's OpenVINO. It claims to be able to read in already defined inference systems from other frameworks, optimize them and deploy them on different target hardware without a great deal of programming effort. On the input side, a large number of well-known AI frameworks are already supported, and Intel advertises that it is expanding this selection on a monthly basis. On the output side, the target systems already mentioned are supported in the form of CPUs, GPUs, VPUs and FPGAs. This means that the target system on which the inference application runs can be flexibly scaled without having to reconfigure the application. In contrast to the AI training platforms in the data centers, all of these hardware platforms are industrial-grade and therefore suitable for integration into edge computing applications.
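The idea of keeping the application independent of the target hardware can be sketched generically. The snippet below deliberately does not use the OpenVINO API; `select_target`, `run_inference` and the backend dictionary are hypothetical names invented for this illustration, and the 'CPU' backend is a stand-in function rather than a real model.

```python
from typing import Callable, Dict, List, Sequence

# An inference backend maps an input vector to an output vector.
Backend = Callable[[List[float]], List[float]]


def select_target(preferred: Sequence[str], available: Dict[str, Backend]) -> str:
    """Return the first preferred target that is present on this machine."""
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError("none of the preferred targets is available")


def run_inference(inputs: List[float], preferred: Sequence[str],
                  available: Dict[str, Backend]) -> List[float]:
    """Application code: identical regardless of which target is chosen."""
    return available[select_target(preferred, available)](inputs)


if __name__ == "__main__":
    # Hypothetical stand-in for a trained model: doubles every input value.
    def cpu_backend(values: List[float]) -> List[float]:
        return [2 * v for v in values]

    installed = {"CPU": cpu_backend}  # this machine has no VPU or GPU
    print(select_target(["VPU", "GPU", "CPU"], installed))
    print(run_inference([1.0, 2.0], ["VPU", "GPU", "CPU"], installed))
```

The point of the design is that only the contents of the backend dictionary change when a VPU module or GPU is added to the industrial PC; the call to `run_inference` stays the same.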

Author: Armin Erich is Head of Development at Inonet Computer.
