Preview of embedded world 2024

Inference Can Be as Compute-Intensive as Training

With CPUs, GPUs and FPGAs, AMD has all architectures that are used for AI. embedded world Keynote Speaker Dr. Salil Raje shares his view on AI.

Dr. Salil Raje is Vice President at AMD and will give the keynote on the first day of the embedded world Conference 2024 on April 9.

© Advanced Micro Devices

AI brings many functions that improve the user experience, or productivity, like generating images, creating documents, or interpreting voice commands. But in which sectors is AI being adopted to address technical and social challenges?

AI is transforming the fabric of everyday life across many sectors, and it's not just about training and inferencing in the cloud. AI is also happening at the edge and at endpoints.

In healthcare, AI can lead to breakthroughs in drug discovery and medical research, and even improve medical diagnoses and treatment. AMD adaptive computing customers, Clarius and Topcon, for example, are already using AI to help doctors diagnose physical injuries and eye disease, respectively, and Japan's Hiroshima University uses AMD-powered AI to help doctors diagnose certain types of cancer. AI can also help speed up drug testing and make drugs safer by modeling their effects on the human body.

In automotive, AI is helping drive advanced safety systems by enabling cars to recognize various types of hazards and guiding drivers to safety. AI is also used for driver monitoring and passenger detection systems.

AI will also have a profound impact on the manufacturing sector. With Industry 5.0 automation, products can be made more cost effectively using AI-powered robots, with a reduced risk of human injury. Additionally, AI can help streamline product testing to enable even faster time-to-market for companies to scale products for mass production.

And on some of the world's fastest, exascale-level supercomputers like Frontier, LUMI, and the upcoming El Capitan, AI will enable researchers to study climate change, conduct medical research, and explore potential new sources of clean energy.

Dr. Salil Raje
As the leader of AMD's Adaptive and Embedded Computing Group (AECG), Salil Raje is responsible for all aspects of strategy, business management, engineering, and sales for FPGAs, adaptive SoCs, embedded processors, and core markets. Raje joined AMD in 2022 from Xilinx, as part of the largest acquisition in semiconductor history. Raje holds a Bachelor of Technology in Electrical Engineering from the Indian Institute of Technology, Madras, and Master of Science and Doctorate degrees in Computer Science from Northwestern University. He holds eight patents in electronic design tools, ASIC, and FPGA designs, and has written more than 15 industry-recognized research papers.

How will AI increase the computing power and energy requirements of intelligent systems? Is training more demanding than inferencing?

Generative AI inferencing can be as compute-intensive as AI training. Both demand high levels of compute and energy, depending on the use case and where it is deployed. Most new technology advances are bound by how much compute you can fit within the constraints of the platform (size, cost, thermal budget, etc.). Over the years, the industry has done a good job of finding ways to mitigate these challenges so they do not become showstoppers, including creating more power-efficient technologies that meet the unique requirements of AI at the edge. We are looking at a number of solutions, spanning hardware, software, data types, and fine-tuning of models, to help developers innovate with AI at the edge.

Listen to Salil's Keynote at embedded world 2024
embedded world 2024 takes place from April 9 to 11 in Nuremberg.
Dr. Salil Raje will give his keynote as part of the embedded world Conference on 9 April at 10 a.m.
For the complete conference program and registration see www.embedded-world.eu

Is it fair to say that GPUs are the best choice for processing in the cloud, while dedicated AI accelerators or FPGAs provide the best performance at the edge?

We believe in a heterogeneous processing approach that targets the right tasks to the right processor (e.g., GPU, CPU, adaptive SoC) to optimize for compute efficiency, energy efficiency, and memory bandwidth. GPUs are often a good choice for the cloud, but some service providers may choose a dedicated or specialized AI accelerator for certain tasks that are done repetitively and at huge scale, like search.

Edge workloads usually come with constraints: low latency for real-time inferencing, smaller form factors, and lower power. Adaptive SoCs based on a heterogeneous mix of programmable logic, AI engines, CPUs, and GPUs are well suited to these requirements and provide a mix of resources optimized for processing at the edge.

In which cases does inferencing in the cloud make sense? Examples?

We envision a seamless, hybrid computing approach across cloud, edge, and endpoints that brings together the benefits of cloud computing with power-efficient, real-time inferencing. In many cases, an adaptable and scalable AI processing approach with both cloud and local processing working together can provide optimized experiences for each workload.

Inferencing in the cloud makes sense in cases where system devices are either so compact or so restricted in terms of battery power or cost that the better strategy is to send data to the cloud to do the inferencing. One case in point could be a security system. The system's sensors must be small, cost-effective, and power-efficient, so processing captured video, images, and forced-entry data in the cloud makes sense. In agriculture, drones used for monitoring crops could extend battery life by streaming sensor data to the cloud for processing and inferencing, rather than performing these tasks locally on the drone.

Workloads that are decidedly better processed at the edge include latency-sensitive workloads and applications such as healthcare, where restricting the storage or transmission of information to the edge device can help address some information privacy and security concerns.

Microcontrollers and processors are universal devices that are well known by developers. What does it mean for them if they need special hardware for certain tasks? (This applies not only to AI, but also to security, video encoding, etc.) Please address the problem of ever-growing complexity.

At AMD, we are addressing the challenges of increasing design complexity with hardware abstraction via a unified software stack and open-source community and ecosystem. What this means is that developers can use the languages and tools they are familiar with to target different device types (GPUs, CPUs, adaptive SoCs) with the same or similar code. And, when necessary, developers have the option to optimize portions of the code to improve compute and power efficiency.

Our vision is to build an open, proven, and ready ecosystem to help accelerate software development and reduce hardware complexity. We are driving toward a unified AI stack across the AMD portfolio where whatever you develop in the cloud can seamlessly move to the edge via heterogeneous deployment. While there are benefits to building with specialized architectures, the downside is the learning curve. We can help minimize this through integration with open frameworks and an open and broad ecosystem.

Does the trend towards heterogeneous computing also have to do with the fact that advances in semiconductor manufacturing no longer bring as much improvement in performance and energy savings?

The trend towards heterogeneous computing exists because no single processor is optimized to do all types of processing. For instance, there have always been certain processors that are more efficient at particular arithmetic functions than others. Heterogeneous computing allows developers to target specific workloads to the processors that deliver the best performance and efficiency for that task.

As workloads get more domain-specific, we can create significant improvements at each generation with architectural changes. Additionally, scalable adaptive architectures like AMD's Versal architecture deliver performance and energy-efficiency improvements that allow us to run multiple AI workloads simultaneously in real time. Another advantage is our advanced packaging and chiplet innovations, which are enabling us to drive significant improvements in power efficiency with each new generation.

