New Front Opens in Chip War: Training vs. Inference
The 2025 Consumer Electronics Show (CES) has emerged as a pivotal event, showcasing the escalating influence of artificial intelligence (AI). The rise of AI agents and the robust interest in AI hardware illustrate a significant transition in technology applications, permeating various industries and marking a new chapter in the realms of large models and computing power.
As the dynamics of AI evolve, there is a noticeable shift from training to inference. If training is viewed as the foundational phase of AI model development, inference represents the critical stage where these models transition into commercial viability. High-profile examples like OpenAI's o1, Gemini 2.0 Flash Thinking, and DeepSeek R1-Lite-Preview are enhancing inference capabilities, which are crucial for application development across diverse sectors.
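The distinction can be made concrete with a minimal sketch (purely illustrative, not any vendor's API): training repeatedly updates a model's parameters against data, while inference is a single forward pass with those parameters frozen.

```python
# Illustrative sketch: training adjusts parameters via gradient descent;
# inference only evaluates the frozen model.

def forward(w, b, x):
    return w * x + b

def train(data, lr=0.01, steps=1000):
    """Training phase: iteratively fit parameters to the data."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        for x, y in data:
            err = forward(w, b, x) - y
            # Gradient descent on squared error
            w -= lr * err * x
            b -= lr * err
    return w, b

def infer(params, x):
    """Inference phase: one forward pass, no parameter updates."""
    w, b = params
    return forward(w, b, x)

# Fit y = 2x + 1 once (training), then serve predictions (inference)
params = train([(0, 1), (1, 3), (2, 5)])
print(round(infer(params, 10), 1))  # ≈ 21.0
```

Training is compute-heavy but done once; inference runs every time a user queries the deployed model, which is why its aggregate demand can dwarf training's.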
This pivotal moment in AI inference is drawing increasing attention to the underlying computational infrastructure.
According to a recent report by Barclays, the demand for AI inference computation is set to soar, potentially accounting for over 70% of the total computational needs of general artificial intelligence. Notably, the demand for inference could exceed that for training by as much as 4.5 times, indicating an impending shift in how resources are allocated within the AI landscape.
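The two figures are consistent with each other, as a quick sanity check shows (our arithmetic, not from the report):

```python
# If inference demand is 4.5x training demand, inference's share of the
# combined total is 4.5 / (4.5 + 1).
ratio = 4.5
share = ratio / (ratio + 1)
print(f"{share:.0%}")  # → 82%, i.e. comfortably "over 70%"
```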
In this new era of AI inference, chip manufacturers are pivoting their strategies. NVIDIA, for example, has adopted an aggressive approach at CES, unveiling a supercomputer project named Project DIGITS, designed to empower individual users by facilitating model inference and AI application development directly from personal devices.
Project DIGITS represents a novel frontier in terminal computation. In a competitive cloud landscape where NVIDIA and AMD are vying for dominance, various cloud service providers and startups are exploring sustainable pathways for inference.
The focus on AI inference signifies not only the emergence of new technologies but also the democratization of powerful computational resources.
Focusing on the terminal side, NVIDIA's Project DIGITS is equipped with the innovative GB10 super chip, heralded as the world's smallest AI supercomputer, capable of running models with an impressive 200 billion parameters. As generative AI applications are on the brink of explosion, NVIDIA aims to extend its computational network further, making generative AI tools readily available on developers' desktops.
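A back-of-envelope calculation illustrates what the 200-billion-parameter figure implies for a desktop-class device (our arithmetic, not NVIDIA's specification; it counts weight storage only, ignoring activations and the KV cache):

```python
# Weight memory for a 200B-parameter model at common precisions.
PARAMS = 200e9

def weights_gb(bytes_per_param):
    """Weight storage in decimal gigabytes."""
    return PARAMS * bytes_per_param / 1e9

for name, nbytes in [("FP16", 2), ("FP8", 1), ("INT4", 0.5)]:
    print(f"{name}: {weights_gb(nbytes):.0f} GB")
# FP16: 400 GB, FP8: 200 GB, INT4: 100 GB
```

The gap between hundreds of gigabytes at full precision and roughly 100 GB at 4-bit precision suggests why aggressive quantization is central to fitting models of this scale onto a single compact machine.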
This push by NVIDIA to redefine the AI PC landscape showcases a vision for the future of personal computing. Although DIGITS is primarily aimed at researchers and developers, it introduces a remarkable increase in personal computing power, suggesting new possibilities for the evolution of edge AI.
Such advancements not only provide developers with more efficient and accessible tools but also lower the barriers to AI computing applications.
NVIDIA is keen to shift generative AI from cloud-based exclusivity towards a broader audience, creating an AI landscape that is more inclusive and widespread.
AMD, Qualcomm, and Intel are also active players in the edge AI domain, showcasing their innovations at CES. AMD, for instance, has launched the Ryzen AI Max series mobile processors, which utilize a new generation of integrated neural processing units (NPUs), paving the way for enhanced performance in Windows laptops. The Ryzen AI 300 series, built on the "Zen 5" architecture, boosts multitasking abilities and battery life, illustrating AMD's strategy of penetrating various market segments.
Simultaneously, Intel has introduced a series of CPUs, including the Ultra 200V series, catering to applications ranging from high-performance to entry-level devices. Qualcomm, through its entry-level Snapdragon X processors, aims to democratize AI technology in more affordable laptops, allowing OEMs to market Copilot+ computers in the $600 range.
When comparing the approach toward AI PCs across AMD, Intel, and Qualcomm, it becomes evident that the latter companies focus primarily on incremental improvements at the chip level.
In contrast, NVIDIA is exploring integrated hardware-software solutions, carving out a unique position in the AI PC arena.
NVIDIA's strategy aligns closely with a consumer-oriented vision, hinting at the company's intention to capture the consumer market once more, particularly in the AI PC segment. With the integration of Arm architecture, GPU, and CPU advancements, NVIDIA is poised to leverage its technological strengths and launch new product lines that cater to evolving consumer demands.
In today's landscape, many consumers ultimately view a computer as a vehicle to acquire a high-performance graphics card, allowing NVIDIA to expand its product assortment further. As we contemplate the future of the AI PC market, it becomes clear that brand loyalty to traditional names like HP or Dell is waning, while NVIDIA's influence among PC users firmly establishes it as a formidable competitor.
While Intel and rival companies continue to strive for innovation across different technological pathways, industry experts assert that surpassing NVIDIA will require a new strategic direction, one that abandons outdated methodologies.
The collective movement in the AI sector heralds a clear strategy from NVIDIA: continuously enhancing its hardware through software integration and now bringing a consumer-facing product to the market.
Regardless of competitive dynamics, the rise of edge AI signifies a transformative shift in computational paradigms. From sophisticated data centers to mainstream consumer desktops, the future of AI is rapidly becoming more tangible.
NVIDIA's Project DIGITS exemplifies this shift of innovation toward the terminal edge of AI. Despite the predominance of cloud-based growth, demand for inference computing is growing more acute, intensifying competition in the market. While NVIDIA holds a staggering ninety percent of the AI training market, inference is opening exciting opportunities for emerging players.
Aiming to carve out its niche, NVIDIA's executives have recently pointed to the market potential of inference at AI roadshows.
They emphasize that the AI industry is still in its nascent stages, with the introduction of models like OpenAI's o1 marking a trend towards addressing increasingly complex inference challenges, thus boosting the need for diverse hardware solutions.
The launch of the Blackwell architecture chips sets the stage for NVIDIA to further enhance its offerings in fulfilling the burgeoning computational demands across various sectors. During CES, NVIDIA unveiled the GB200 NVL72, comprising 72 Blackwell GPUs tailored for heightened performance and energy efficiency, introducing advanced functionalities to accelerate large language model workloads.
However, the competitive landscape in the inference market is increasingly crowded, with tech giants like AWS, Google, and Microsoft actively iterating their ASIC and TPU chips. Startups like Groq, SambaNova, and Positron AI are also striving for their share in this burgeoning market.
NVIDIA's dominance in the training market complicates entry for competitors, driving many to focus their efforts on the inference sector, which has traditionally been regarded as a peripheral market.
However, inference is rapidly becoming the focal point of innovation within the industry.
This burgeoning market is fostering differentiation among competitors focused on optimizing performance specifically for inference workloads or innovating cost-effective computational solutions through hardware-software collaboration.
The pursuit of new competitive advantages in a demanding computational environment is challenging, since the battleground is dominated by GPUs, the undisputed kings of compute. As demand for inference computing intensifies, hardware offerings are diversifying and competition is stiffening.
While growth forecasts for the inference market are optimistic, carving it into profitable niches remains a fierce and difficult challenge, especially as NVIDIA currently captures the bulk of the market.
Interestingly, NVIDIA itself began as a player in the edge market, with GPUs initially designed for graphics computing—a segment that evolved from being a peripheral branch of computing tasks to becoming a crucial component of contemporary processing needs.
Similarly, inference is transitioning from the margins to the center stage, suggesting a bright future filled with advanced inference chips and applications