Artificial Intelligence (AI) is rapidly being incorporated into diverse applications in the Cloud and at the network’s edge. The pace at which the technology is being adopted depends on the extent to which it is incorporated into commodity chipsets. To be ready for widespread adoption, AI’s algorithmic smarts need to be miniaturized into low-cost, reliable, high-performance chips for robust crunching of locally acquired sensor data.
The next generation of commodity AI-optimized chipsets will gain mass-market edge deployment. By year-end 2018, the dominant AI chipmakers will all have introduced new generations of chipsets that densely pack tensor-processing components on low-cost, low-power systems on a chip.
Over the past several years, hardware manufacturers have introduced an impressive range of chip architectures—encompassing graphics processing units (GPUs), tensor processing units (TPUs), field programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs)—that address these requirements. These chipsets are optimized to execute layered deep neural network algorithms—especially convolutional and recurrent networks—that detect patterns in high-dimensional data objects. A growing number of embedded, mobile, and IoT platforms incorporate these hardware components to perform local inferencing and drive varying degrees of AI-infused autonomous operation.
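To make "local inferencing" concrete, here is a minimal sketch of how an edge application might run a pre-trained convolutional model against locally acquired sensor data using TensorFlow Lite, a runtime commonly used on such embedded platforms. The model file name and the camera-style input shape are hypothetical placeholders for illustration, not references to any specific product.

```python
# Minimal sketch of on-device inference with TensorFlow Lite.
# "edge_model.tflite" and the 224x224 RGB input are hypothetical placeholders.
import numpy as np
import tensorflow as tf

# Load a pre-trained model from the device's local storage.
interpreter = tf.lite.Interpreter(model_path="edge_model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Stand-in for a frame of locally acquired sensor data (e.g., a camera image).
frame = np.random.rand(1, 224, 224, 3).astype(np.float32)

# Run inference entirely on the device, with no round trip to the Cloud.
interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()
scores = interpreter.get_tensor(output_details[0]["index"])
print("Predicted class:", int(np.argmax(scores)))
```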
In 2018, a new generation of AI-optimized commodity chipsets will emerge to accelerate this technology's adoption in edge devices. Currently, most AI—both in the Cloud and in edge devices—executes on GPUs, but other approaches are taking shape and are in various stages of commercialization across the industry. What emerges from this ferment will be innovative approaches that combine GPUs with CPUs, FPGAs, and a new generation of densely packed tensor-core processing units, exemplified by Google's Tensor Processing Unit (one of many such architectures in development or already on the market). Over the next several years, AI hardware manufacturers will achieve year-over-year 10-100x boosts in the price-performance, scalability, and power efficiency of their chipset architectures. Every chipset that comes to market will be optimized for the core Deep Learning and Machine Learning algorithms in AI apps, especially convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and generative adversarial networks (GANs).
In 2018, most of the AI chipset startups that have received funding in the past two years will come to market. The pace of mergers and acquisitions in this segment will increase as the incumbent AI solution providers (especially Google, AWS, Microsoft, and IBM) deepen their technology portfolios and the incumbent AI chip manufacturers—especially NVIDIA and Intel—defend their positions in what's sure to be the fastest-growing chip segment of the next several years.
Beyond 2018, Wikibon predicts that the AI chipset market will reach a tipping point in 2022, when the price drops below $25 per chip and an open-source ecosystem for real-time Linux on Deep Learning systems-on-chip (SoCs) emerges. This will trigger mass adoption of AI edge-client chipsets for embedding in mass-market mobile devices and PCs/laptops, supporting a growing range of sensors, streaming modalities, and inference apps.
In 2018 and beyond, GPU architectures will continue to improve to enable highly efficient inference at the edge, defending against further encroachments from CPUs, FPGAs, ASICs, and other chipset architectures for these workloads. Over the next several years, the principal AI client chipset architectures will shift away from all-GPU designs toward hybrid architectures in which low-cost, low-power alternatives are optimized primarily for edge-based inference and only secondarily for edge-based training. Embedded AI-inferencing co-processors (low cost, low power, high performance) will be standard components of all computing, communications, and consumer electronics devices by 2020.
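As an illustration of what "optimized for edge-based inference" means in practice, models are often shrunk to low-precision arithmetic before being deployed to these co-processors. The sketch below uses TensorFlow Lite's post-training quantization; the saved-model path and output file name are hypothetical, and this is one common technique rather than the method of any particular chipmaker.

```python
# Illustrative post-training quantization with the TensorFlow Lite converter.
# "saved_model_dir" is a hypothetical path to an already-trained model.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")

# Quantize weights (e.g., float32 down to int8 where possible), trading a
# little accuracy for a smaller, lower-power inference footprint.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()

# The resulting flat buffer can be deployed to an embedded inference co-processor.
with open("edge_model.tflite", "wb") as f:
    f.write(tflite_model)
```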
By 2022, 75 percent of mass-market edge devices will be equipped with AI chipsets. Local inferencing for the following core applications will be standard on AI-enabled edge devices: multifactor authentication, face recognition, speech recognition, Natural Language Processing, chatbots, virtual assistants, computer vision, mixed reality, and generative image manipulation. By that year, however, only 10 percent of the AI workloads performed at the edge will involve local training, so chipsets will primarily be optimized for fast, low-power, low-cost inferencing.
Until the middle of the next decade, consumer applications for AI chipsets will far outpace business, industry, government, and scientific applications. Reflecting this consumer orientation, the principal AI-edge chipset verticals in 2025, in descending order of likely worldwide volumes and revenues, will be: mobiles, smart cameras, smart appliances, medical devices, robotics, and drones.
Well into the next decade, GPUs will continue to dominate Deep Learning training in public Clouds, private Clouds, and Data Centers, especially as NVIDIA and its major cloud partners (AWS, Microsoft, Google, IBM, Baidu) and systems partners (Dell EMC, HPE, IBM, and Supermicro) roll out GPU cluster services and supercomputers for compute-intensive AI/DL workloads.