Edge AI + Vision Alliance
Quo vadis image processing?
Innovations in the fields of artificial intelligence and embedded vision as well as special processors and development environments for image processing open up a broader spectrum for the technology. An overview.
Deep learning enables tasks that were previously not feasible, partly because conventional algorithms were not precise enough. © Allied Vision
Deep learning as an AI discipline is now the most important factor driving change in the practical use of image processing. This technology represents a fundamentally different way of extracting features from images, videos and many other types of data. Deep learning is not the right solution for every problem, but it allows us to recognize a wider range of features with greater accuracy than before. In just a few years, deep neural networks have evolved from a technology that was virtually unknown in commercial products to an almost universal tool for image processing.
As a result, tasks can now be solved that were previously impossible, partly because conventional algorithms were not precise enough or the cost of developing these algorithms was economically prohibitive. Deep learning algorithms or neural network architectures are generally reused and retrained with different training data for different applications. This makes many previously uneconomical applications profitable.
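The reuse described above can be illustrated with a minimal sketch. It rests on loud assumptions: a single logistic unit stands in for a deep network, and the four-pixel "images" and both tasks are invented for this example. The only point is that one unchanged training routine yields two different detectors when it is fed different training data.

```python
import math
import random


def train(samples, labels, epochs=200, lr=0.5, seed=0):
    """Fit one logistic unit by stochastic gradient descent.

    Stand-in for "the architecture": the same code is reused for every task.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = sum(w * x for w, x in zip(weights, samples[i])) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - labels[i]  # gradient of the log loss with respect to z
            weights = [w - lr * err * x for w, x in zip(weights, samples[i])]
            bias -= lr * err
    return weights, bias


def predict(model, x):
    """Return 1 if the unit fires for input x, else 0."""
    weights, bias = model
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0


# Hypothetical 2x2 images, flattened as [top-left, top-right, bottom-left, bottom-right].
# Task A: is the image bright (1) or dark (0)?
brightness_data = [[0.9, 0.8, 0.9, 1.0], [0.7, 0.9, 0.8, 0.8],
                   [0.1, 0.2, 0.0, 0.1], [0.2, 0.1, 0.3, 0.2]]
brightness_labels = [1, 1, 0, 0]

# Task B: is the left half (1) or the right half (0) bright?
side_data = [[0.9, 0.1, 0.8, 0.2], [1.0, 0.0, 0.9, 0.1],
             [0.1, 0.9, 0.2, 0.8], [0.0, 1.0, 0.1, 0.9]]
side_labels = [1, 1, 0, 0]

# Same routine, different training data, different application.
brightness_model = train(brightness_data, brightness_labels)
side_model = train(side_data, side_labels)
```

In a real project the reused part would be a pretrained deep network rather than a single unit, but the economics are the same: the engineering effort goes into collecting training data, not into writing a new algorithm.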
Specialized processors
One of the disadvantages of deep learning algorithms is that they are extremely computationally intensive. Their use usually requires processors that can deliver enormous computing power and are compatible with embedded systems in terms of cost and power consumption. Fortunately, more and more of these processors are becoming available. There has been a recent surge in innovation, particularly in special processors with integrated deep learning capabilities. A key reason for this development is that there are far fewer algorithms in the world of deep learning than in traditional image processing. When algorithms are similar from application to application, as in deep learning, it is much easier to develop a single processor that can serve a range of applications. As a result, huge investments are currently being made in the development of processors that specialize in deep neural networks.
From 2013 to 2016, there was a steady decline in investment in semiconductor start-ups. Some experts concluded that chip start-ups would never be funded again because it seemed too expensive, too risky and too difficult to find the necessary mass applications to make the investments profitable. There was even talk of the end of 'Silicon' in Silicon Valley. In 2017, however, this trend reversed and the financing of semiconductor start-ups grew rapidly once again. If the current trend continues, venture capital investments in semiconductor start-ups will reach around USD 3 billion this year, which corresponds to an increase by a factor of around 10 compared to 2016. Most of this investment will flow into the production of AI chips. This boom is mainly due to the introduction of deep learning. Investment in this technology is not limited to chips, but also extends to other levels such as algorithms and software tools.
It is estimated that around 75 companies worldwide are currently developing processors for deep learning, ranging from start-ups to large chip manufacturers and silicon IP providers such as MediaTek or Synopsys. While the majority of current machine vision systems still use a conventional CPU, around 40% of respondents to a survey of machine vision developers (Computer Vision Developer Survey of the Edge AI and Vision Alliance, formerly the Embedded Vision Alliance, January 2020) use dedicated machine vision or deep learning processors in their systems. Such processors were not even available five years ago. The survey result confirms how rapidly the field is changing, much as it did when deep neural networks were first introduced.
Development environments simplified
Investment in semiconductors fell steadily from 2013 to 2016, but this trend has largely reversed thanks to significant investment in semiconductors with AI (source: Woodside Capital Partners, 2019). © Edge AI + Vision Alliance
In addition to the huge investments in processors, major investments and important innovations are currently being made in development tools and other infrastructure for application development. One example of this is the 'OpenVINO' development environment from Intel. One of its most interesting features is that the tool suite is aimed at a variety of processor types. The background to this is that Intel offers very different processors in its various product lines. In contrast, the development tools from silicon manufacturers are often only tailored to a single processor type. So if you want to change the basis of your image processing system from a GPU to an FPGA, for example, you have to start all over again with a different tool suite.
Another interesting aspect of 'OpenVINO' is that the development environment is specifically designed for image processing and deep learning applications. By focusing on these tasks, Intel can make the tools more efficient. For example, if a compiler only has to deal with a specific, well-defined range of algorithms, it can perform deeper and more specialized optimizations than a tool that has to be able to process code of any functionality.
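One well-known optimization of this kind is folding a batch-normalization step into the linear layer that precedes it, so that two operations become one at inference time. The sketch below illustrates that general technique only. It is not taken from OpenVINO, and all function names and numbers are hypothetical.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization with fixed statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Return weights and bias of a single dense layer that is
    equivalent to dense followed by batch_norm."""
    folded_w, folded_b = [], []
    for row, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        folded_w.append([w * scale for w in row])
        folded_b.append((b - m) * scale + bt)
    return folded_w, folded_b


# Hypothetical layer with 3 inputs and 2 outputs.
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
B = [0.1, -0.2]
GAMMA, BETA = [1.2, 0.8], [0.05, -0.1]
MEAN, VAR = [0.3, -0.4], [2.0, 0.5]
X = [1.0, 2.0, -1.0]

two_step = batch_norm(dense(X, W, B), GAMMA, BETA, MEAN, VAR)
W_FOLDED, B_FOLDED = fold_batch_norm(W, B, GAMMA, BETA, MEAN, VAR)
one_step = dense(X, W_FOLDED, B_FOLDED)  # same result, one operation fewer
```

A general-purpose compiler cannot assume that a multiply-add is followed by a fixed scale and shift. A tool restricted to neural network graphs can recognize the pattern and rewrite it ahead of time.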
Cloud computing is also currently having an increasing impact on the simplification of algorithm and application development. Training deep neural networks requires a lot of computing power and data. A single training run for such a complex network can take a week, even on a powerful GPU-enabled workstation. If the results are not satisfactory afterwards and require further iteration steps, the entire process takes a lot of time. With cloud computing and powerful GPU computing nodes, however, the training effort can be parallelized and significantly accelerated.
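How such parallelization can work is sketched below for the common data-parallel scheme: the training batch is split into shards, each computing node calculates the gradient for its shard, and the averaged gradient drives one shared update. This is a minimal illustration under stated assumptions, not a description of any particular cloud service. The model is a one-parameter-pair linear fit, the data are synthetic, and the "workers" run one after another in a plain loop.

```python
def gradient(w, b, batch):
    """Gradient of the mean squared error of y = w*x + b over one batch."""
    n = len(batch)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return grad_w, grad_b


def data_parallel_step(w, b, shards, lr):
    """One synchronous training step across all shards.

    On real hardware each gradient() call would run on its own node.
    """
    grads = [gradient(w, b, shard) for shard in shards]
    grad_w = sum(g[0] for g in grads) / len(grads)
    grad_b = sum(g[1] for g in grads) / len(grads)
    return w - lr * grad_w, b - lr * grad_b


# Synthetic data following y = 2x + 1 (hypothetical values).
full_batch = [(float(x), 2.0 * x + 1.0) for x in range(8)]

# Four equally sized shards, one per hypothetical worker.
shards = [full_batch[i:i + 2] for i in range(0, len(full_batch), 2)]

w, b = 0.0, 0.0
for _ in range(2000):
    w, b = data_parallel_step(w, b, shards, lr=0.01)
```

With equally sized shards, the averaged gradient is identical to the gradient of the full batch, so the result matches single-machine training while the per-step work is divided among the nodes.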
If the priority is to bring a product to market as quickly as possible, it may make sense to use simple standard cameras and stream video to the cloud. Algorithms can be iterated and updated there quickly and easily without having to deal with issues such as transferring firmware updates to devices in the field. This approach will not be the right solution for every application, but in certain situations it is useful to be able to launch a product quickly and then improve it iteratively. For applications where developers are running their deep neural networks in the cloud (almost 40% according to the survey), there is a growing range of APIs and coprocessor hardware available in the public cloud. APIs simplify the development of these applications and increase the performance and efficiency of such algorithms. Xilinx's AI platform is an example of such cloud hardware; it uses the same basic architecture in the cloud as at the edge.
Increase in 3D applications
Author: Jeff Bier is the founder of the Edge AI + Vision Alliance (formerly Embedded Vision Alliance) in Walnut Creek, USA. © Edge AI + Vision Alliance
In many image processing applications, three-dimensional information is very valuable or even essential. For example, self-driving cars or robotic vacuum cleaners usually implement VSLAM (Visual Simultaneous Localization and Mapping) algorithms to create a 3D map and determine their own position and orientation in space. In other cases, such as face recognition, 3D data enables systems that perform better than 2D-based ones. Apple has implemented 3D recognition in the latest generation of iPhones for precisely this reason: it enables more reliable face recognition. In certain cases, 3D data can be generated from 2D sensors. However, it is usually best to capture depth data using a depth sensor and then combine this with regular 2D information to obtain an accurate 3D image of the environment or object being inspected. The share of developers who already use 3D image processing methods or plan to do so within the next year has risen by almost 20% to 60%.
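Combining a depth map with a regular 2D image can be sketched with the standard pinhole camera model: a pixel (u, v) with measured depth Z maps to X = (u - cx) * Z / fx and Y = (v - cy) * Z / fy. The example below is a minimal illustration. The intrinsic parameters (fx, fy, cx, cy), the tiny images, and the convention that a depth of zero means "no measurement" are all assumptions made for this sketch.

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with depth in metres to a 3D point in the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def fuse(depth, intensity, fx, fy, cx, cy):
    """Build a list of (x, y, z, intensity) points from aligned depth and 2D images.

    Pixels with depth 0 are treated as missing measurements and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                x, y, _ = backproject(u, v, z, fx, fy, cx, cy)
                points.append((x, y, z, intensity[v][u]))
    return points


# Hypothetical 2x2 sensor: depth in metres and matching grey values.
depth_image = [[2.0, 0.0],
               [1.5, 1.0]]
grey_image = [[120, 200],
              [90, 30]]

cloud = fuse(depth_image, grey_image, fx=500.0, fy=500.0, cx=0.5, cy=0.5)
```

The sketch assumes the depth and 2D images are already registered pixel by pixel. With separate sensors, an additional calibration step is needed to align them first.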
In the past, a major obstacle to the use of 3D systems was that the required cameras were too expensive, too large or too power-hungry for many applications. However, 3D sensors have developed enormously since the debut of the Microsoft Kinect, much like processors over the last ten years. Mobile telephony acts as a market accelerator here: processors are often developed first for this very high-volume market before the same chips or derivatives are used in other applications. The costs of image sensors and 3D camera modules are also falling due to the large quantities involved. One example of such a sensor is the Infineon IRS2381C, which enables the implementation of 3D functionality for between 20 and 30 US dollars per system in large quantities. This order of magnitude is feasible even for cost-sensitive products with sales prices of a few hundred dollars.
The Edge AI + Vision Alliance
The Edge AI + Vision Alliance (formerly the Embedded Vision Alliance) is a global association of nearly 100 companies dedicated to accelerating the adoption of edge AI and vision technology. The association inspires and empowers product developers to integrate artificial intelligence and vision into their products. It also promotes the development of an active AI and vision ecosystem that brings together suppliers, end product developers and partners. The Edge AI + Vision Alliance organizes the annual Embedded Vision Summit for innovators who equip products with machine vision and AI technology. The 2020 conference will take place May 18-21 in Santa Clara, California.















