Qualcomm is taking another stab at the server market, throwing its hat into what is rapidly becoming a crowded ring: AI inference processing for the data center.
Qualcomm said at an event here Tuesday that it plans to begin sampling a 7-nm AI inference accelerator for the cloud that emphasizes power efficiency and boasts peak performance of more than 350 trillion operations per second (TOPS). The device is expected to be in production next year.
Details about the chip — known as the Qualcomm Cloud AI 100 — were limited. But Keith Kressin, Qualcomm’s senior vice president of product management, said that it would have more than 10 times the performance per watt of anything deployed today.
While providing few specifics, Qualcomm executives stressed the power efficiency of the Cloud AI 100, noting that the company’s heritage as a smartphone chip designer has ingrained low-power design into its engineers’ DNA.
“I really think the way to get the most power-efficient processor is that you have to start with a mobile mindset,” Kressin said.
The market for inference accelerators is currently dominated by Nvidia GPUs, though more specialized solutions such as Google’s Tensor Processing Unit (TPU) and Amazon’s Inferentia are available. In all, more than 30 companies are known to be developing chips purpose-built for AI inferencing, ranging from heavyweights such as Intel and Xilinx to a host of startups led by the likes of Graphcore, Wave Computing, and Mythic.
“Qualcomm will certainly have to do something impressive to differentiate from these other vendors,” said Linley Gwennap, president and principal analyst at the Linley Group. “Qualcomm executives like to wave their hand and say ‘power efficiency.’ But what you can do in a smartphone is very different from what you can do in a server.”
Gwennap added that the lack of detail shared by Qualcomm makes it difficult to gauge the Cloud AI 100’s prospects for success. He noted that the chip is not likely to be in production before late 2020 and that the rapidly evolving AI inference market could look much different by then.
But Patrick Moorhead, president and principal analyst at consulting firm Moor Insights and Strategy, called the introduction of the Cloud AI 100 “promising” and “a good start.” He added that the device’s target power envelope, which he estimated at 2 W to 5 W, is likely to differentiate its target market from that of the high-octane servers that use Nvidia’s high-end Tesla T4 GPUs for inferencing, limiting overlap between the two.