3D data directly from the camera

IDS Imaging Development Systems

Inka Krischke, 03.07.2020, 08:39

Shifting computationally intensive processes away from the industrial PC and into the camera improves both the speed and accuracy of data processing. But what are the benefits of integrated data processing in detail?

If large volumes or multiple object views are to be inspected automatically using 3D cameras - for example on continuously running production lines in the automotive industry - high-resolution 3D result data must be generated and processed quickly to meet the specified cycle times. Stereo camera systems with large 5 MP sensors and variable baselines provide the ideal output data, but also produce enormous amounts of data. In such high-performance 3D applications, interfaces and CPU performance quickly become the bottleneck. The challenge is therefore to reduce both the data rates and the performance requirements of system components without compromising data quality. At the same time, the systems need to be space-saving and efficient.

In machine vision applications with 3D cameras that work according to the principle of spatial vision (stereo vision), camera images are processed at a high resolution and frame rate in order to provide the downstream processes with result data as quickly as possible. Calculating the three-dimensional data - so-called 'point clouds' - from the stereo camera images requires several complex process steps. Until now, this task has been performed by powerful industrial PCs (IPCs). To meet increasing demands on the quality and speed of the resulting data, modern 3D stereo cameras use high-resolution 2D cameras with a Gigabit Ethernet interface. However, transmitting the 2D output data to the processing IPC requires optimal utilization of the network bandwidth in order to avoid time delays or data loss. In addition, the processing performance of the IPC hardware must keep growing so that it does not restrict the overall system.

By shifting the computationally intensive processes of stereo vision pre-processing to the camera itself, the high-resolution 2D raw data no longer needs to be transferred to a host PC. This reduces the network load and at the same time the demands on the computing power of the peripheral components.
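
The scale of the raw data load can be illustrated with a rough back-of-the-envelope estimate. The following sketch is not taken from IDS documentation. It assumes 8-bit monochrome pixels (1 byte each), a hypothetical rate of 10 stereo pairs per second, and a single nominal Gigabit Ethernet link of 125 MB/s with protocol overhead ignored. The function name is purely illustrative.

```python
def raw_stereo_rate_mb_per_s(pixels_per_image, bytes_per_pixel, pairs_per_second):
    """Data rate in MB/s when both raw images of every stereo pair are transferred."""
    bytes_per_pair = 2 * pixels_per_image * bytes_per_pixel
    return bytes_per_pair * pairs_per_second / 1e6


# Nominal Gigabit Ethernet rate: 1000 Mbit/s = 125 MB/s (overhead ignored, assumption).
GIGE_MB_PER_S = 1000 / 8

# Assumed values: 5 MP sensors, 1 byte per pixel, 10 stereo pairs per second.
rate = raw_stereo_rate_mb_per_s(5_000_000, 1, 10)
print(rate)                  # 100.0
print(rate / GIGE_MB_PER_S)  # 0.8
```

Under these assumptions, the raw images alone would already occupy about 80 percent of one nominal Gigabit link, before any protocol overhead is taken into account.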

The performance of such 3D camera systems can be enhanced even further by using high-quality components. Thanks to interchangeable 2D cameras, the flexible design of the 'Ensenso X' series from IDS, for example, is not tied to specific data interfaces and sensor solutions and can continue to grow in line with the requirements for speed, object sizes and quality. However, high-resolution, fast GigE cameras, specially shielded cables, high-performance network technology and powerful PC hardware are simply too expensive for some applications. Sufficient space must also be available for these peripherals.

IDS is therefore taking a different approach with its 'Ensenso XR' embedded 3D cameras with integrated data processing. According to the principle of the Internet of Things (IoT), each individual component in a distributed system has a specific task and generates results that are used directly by other systems. In the case of a 3D camera, these results are the three-dimensional coordinates of points on a real object.

Onboard 3D processing

Thanks to an SoC (System-on-Chip) integrated into the projector unit of the 'Ensenso XR', the camera carries out the 3D processes itself, including the stereo analysis. After the lens distortion has been corrected, the 2D output images are converted into an axis-parallel stereo system by a virtual rotation of the cameras (rectification), which greatly simplifies all subsequent analyses. The matching algorithms for still and moving scenes then search the recorded image pairs for corresponding pixels. Because the cameras view the scene from different perspectives, these pixels are shifted horizontally in the image plane by different amounts; this shift is referred to as 'disparity'. Due to the geometric relationships in the parallel stereo system, the disparity can be converted into the spatial depth of a 3D point in millimetres by applying the intercept theorem together with known system parameters, for example focal lengths, pixel sizes and the base length of the stereo system. These time-consuming and computationally intensive pixel operations are carried out in parallel by a supporting FPGA in the camera.
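
The relationship between disparity and depth described above can be sketched in a few lines. This is the textbook relation for a rectified stereo pair, Z = f * B / (d * p), and not the implementation used in the camera. The function name and the example values are hypothetical and are not Ensenso specifications.

```python
def depth_from_disparity_mm(disparity_px, focal_length_mm, pixel_size_mm, baseline_mm):
    """Depth Z of a 3D point in a rectified (axis-parallel) stereo system.

    Similar triangles give Z = f * B / (d * p), with focal length f,
    base length B, disparity d in pixels and pixel size p.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    focal_length_px = focal_length_mm / pixel_size_mm  # focal length in pixel units
    return focal_length_px * baseline_mm / disparity_px


# Hypothetical example values (assumptions for illustration only).
depth = depth_from_disparity_mm(
    disparity_px=200, focal_length_mm=8.0, pixel_size_mm=0.004, baseline_mm=100.0
)
print(round(depth))  # 1000 (mm)
```

Because depth is inversely proportional to disparity, a fixed disparity error causes a larger depth error for distant points than for near ones.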

In conjunction with the 'FlexView2' technology (described under 'Higher 3D resolution thanks to movable pattern projection' below), models of the 'XR36' series can process up to 16 image acquisitions in rapid succession for the 3D data set of a stationary scene without any additional time delay due to the transfer of the raw data to the host PC. The shifting of the projector pattern by 'FlexView2' results in different 3D points with each image pair, which contribute to a high-resolution 3D representation.

To reduce the data rate, the camera only transmits the 'disparity map'. This 16-bit single-channel image is considerably smaller than a complete 'point cloud', which with color overlay is a 32-bit RGB image. The simple conversion can be carried out by the 'Ensenso SDK' on the industrial PC without much computing load.
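
To give an idea of why converting a disparity map into a point cloud is computationally light, here is a minimal sketch based on a simple pinhole model. It does not use the 'Ensenso SDK' and does not reflect its API. It assumes disparities that have already been decoded to pixel units, a value of 0 marking invalid pixels, and a hypothetical principal point (cx, cy) and focal length in pixels.

```python
def disparity_map_to_points(disparity_map, focal_length_px, baseline_mm, cx, cy):
    """Convert a rectified disparity map (rows of pixel disparities) to 3D points in mm."""
    points = []
    for v, row in enumerate(disparity_map):
        for u, d in enumerate(row):
            if d <= 0:  # assumption: non-positive values mark pixels without a match
                continue
            z = focal_length_px * baseline_mm / d   # depth from disparity
            x = (u - cx) * z / focal_length_px      # back-project the column
            y = (v - cy) * z / focal_length_px      # back-project the row
            points.append((x, y, z))
    return points


# Tiny hypothetical 2x2 disparity map with two valid pixels.
cloud = disparity_map_to_points([[0, 100], [200, 0]], 1000, 100, cx=0, cy=0)
print(cloud)  # [(1.0, 0.0, 1000.0), (0.0, 0.5, 500.0)]
```

Each valid pixel needs only one division and a few multiplications, which is consistent with the low computing load mentioned above.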

New independence

Heiko Seitz is a technical editor at IDS Imaging Development Systems in Obersulm.

The greater autonomy of the 'Ensenso XR' stereo camera compared to other 3D cameras is therefore an important selection criterion for 3D applications, and not just in terms of speed. The reduced performance requirements for network peripherals and IPC hardware simplify the entire setup of a 3D application and reduce costs, especially in multi-camera systems. Furthermore, an additional WLAN interface alongside the wired Gigabit Ethernet connection enables temporary access to data and parameters during setup and maintenance. The projector unit has an integrated front light to help calibrate the working environment or improve the image quality of 2D camera images when ambient light is insufficient or no external lighting is available. Overall, the integrated data processing of the 'Ensenso XR' series opens up interesting new fields of application for 3D camera technology.

Higher 3D resolution thanks to movable pattern projection

Ensenso cameras work according to the principle of spatial vision, which is based on human vision: two cameras view a scene from different positions. Although the content of the two camera images appears identical, the objects being viewed appear at different positions in each image. In an image comparison, special algorithms search for corresponding pixels and record their displacement (disparity) in a map of all the differences found (disparity map). Since the distance and viewing angle of the cameras and the focal length of the optics are known, the 'Ensenso' software can convert these displacements into metric lengths using triangulation methods and thus determine the 3D coordinates of the object point for each individual image pixel.
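
The search for corresponding pixels can be illustrated with a deliberately simple block matcher. The sketch below compares a small window along one row of a rectified image pair using the sum of absolute differences (SAD). This is a generic textbook technique and not the matching algorithm used by the 'Ensenso' software. All names and values are hypothetical.

```python
def match_row_sad(left_row, right_row, x_left, half_window, max_disparity):
    """Return the disparity in pixels with the lowest SAD cost for one left-image pixel."""
    if x_left - half_window < 0 or x_left + half_window >= len(left_row):
        raise ValueError("window around x_left must lie inside the row")
    best_d, best_cost = None, None
    for d in range(max_disparity + 1):
        x_right = x_left - d  # the point appears further left in the right image
        if x_right - half_window < 0:
            break  # window would leave the image
        cost = sum(
            abs(left_row[x_left + k] - right_row[x_right + k])
            for k in range(-half_window, half_window + 1)
        )
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Hypothetical gray values: the right row is the left row shifted by 2 pixels.
left_row = [10, 10, 10, 80, 200, 90, 10, 10, 10, 10]
right_row = [10, 80, 200, 90, 10, 10, 10, 10, 10, 10]
print(match_row_sad(left_row, right_row, x_left=4, half_window=1, max_disparity=3))  # 2
```

In uniform regions such as the flat runs of value 10, every candidate window looks the same and the match becomes ambiguous.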

As the classic method is directly dependent on lighting conditions and the textures of the objects in the scene, poorly textured or reflective surfaces have a direct impact on the quality of the resulting 3D point cloud: if only a few prominent image points can be recognized, compared and localized, the depth information of the scene remains incomplete.

The cameras use special techniques to improve this classic stereo vision process. Even in difficult lighting conditions, a high-intensity projector uses a pattern mask with a random dot pattern to project a high-contrast texture onto the object to be imaged, thus supplementing structures that are absent or only weakly present on its surface. Using these auxiliary structures on the object surface, the algorithms recognize a much higher number of corresponding pixels, along with their change in position, when comparing images, resulting in more complete, homogeneous depth information of the scene.

With a low-wear piezo mechanism, the position of the pattern mask in the light beam can also be moved linearly in very small steps. As a result, the projected texture on the surface of the scene objects also shifts, generating additional, varying information on shiny, dark or volume-scattering surfaces. For static scenes, this 'FlexView' technique allows several image pairs with different textures to be recorded, which when combined result in a higher number of pixels. The higher resolution enables the calculation of more detailed disparity images and point clouds, which is reflected in a higher robustness of the 3D data on difficult surfaces.
With just three to five image pairs, the X, Y and Z resolution can be doubled. The further improved 'FlexView2' uses a slightly modified projection pattern. Here, the random dot pattern is interrupted at regular intervals by stripes with gray gradients. These stripes enable more precise sub-pixel interpolation during stereo matching. When using eight or more image pairs, the X, Y and Z resolution can be doubled again compared to 'FlexView1'.
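
For readers unfamiliar with sub-pixel interpolation, the following sketch shows one widely used generic approach: fitting a parabola through the matching costs at the best integer disparity and its two neighbours. Whether 'FlexView2' works this way is not stated in the source, so this is an assumption for illustration only, and the cost values are hypothetical.

```python
def subpixel_offset(cost_prev, cost_best, cost_next):
    """Offset of the parabola minimum through the costs at d-1, d and d+1.

    Returns a value between -0.5 and 0.5 when cost_best is the smallest
    of the three costs.
    """
    denominator = cost_prev - 2 * cost_best + cost_next
    if denominator <= 0:
        return 0.0  # flat or non-convex cost curve: keep the integer disparity
    return (cost_prev - cost_next) / (2 * denominator)


# Hypothetical costs around an integer disparity of 12 pixels.
refined = 12 + subpixel_offset(4.0, 1.0, 2.0)
print(refined)  # 12.25
```

The refined value moves towards the neighbour with the lower cost, which is why smooth intensity gradients in the projected pattern can help: they make the cost curve around the minimum better behaved.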
