At the March 2026 GTC conference, NVIDIA introduced an LPU inference chip incorporating Groq technology, a move that marks a pivotal shift in AI computing from training-centric to inference-centric workloads. The LPU achieves its performance and cost gains through three key design choices: storing model weights directly in on-chip SRAM, using a compile-time statically scheduled architecture, and applying distributed inference techniques. On Llama2-70B inference tasks, the LPU reportedly runs roughly 10 times faster than the H100 GPU at about one-tenth the cost. The launch pushes AI inference hardware toward greater specialization and efficiency, strengthens NVIDIA's market position, and is driving upgrades across the PCB supply chain, including surging demand for high-density PCBs and M9-grade high-frequency materials. The LPU complements rather than replaces GPUs: GPUs excel at large-scale parallel computing and dominate model training, while LPUs are optimized for low-latency inference in real-time interactive scenarios such as text generation. By integrating the LPU into the Vera Rubin platform, NVIDIA reports gains in inference throughput and power efficiency of up to 35 times while retaining compatibility with the CUDA software ecosystem, which should speed adoption in the AI inference market.
At the 2026 NVIDIA GTC Global AI Summit, XR brand VITURE presented its XR-AI Lab Automation solution, developed jointly with NVIDIA and the Cong Lab at Stanford University. At the summit, VITURE also gave the first public demonstration of its immersive 3D cloud gaming experience, powered by NVIDIA's cloud gaming service, GeForce NOW. According to reports, VITURE's XR+AI solution has already been applied in fields including immunotherapy, stem cell engineering, and materials science.
To mark the 130th anniversary of Shanghai Jiao Tong University (SJTU), Jing Xiandong, an alumnus of the 1994 undergraduate class and current Chairman of Ant Group, together with his wife, donated 130 million yuan in cash and Ant Group shares to establish the "AI Future Foundation Fund" at their alma mater. The donation-signing and university-trustee appointment ceremony was held at the Wenbo Building on the Minhang Campus.
According to Polygon, Niantic Spatial has formed a strategic partnership with Coco Robotics to utilize geospatial data accumulated from Pokémon Go and Ingress to train urban delivery robots. Coco Robotics has deployed approximately 1,000 delivery robots across multiple cities in the United States and Europe for last-mile delivery. Niantic's visual positioning system, trained on 30 billion images, provides centimeter-level positioning accuracy in urban environments with weak GPS signals, enabling more reliable navigation for the robots.
Early this morning, at NVIDIA's GTC presentation, the open-source framework NemoClaw was officially introduced, adding a security and privacy control layer to the OpenClaw ecosystem. NemoClaw deploys autonomous AI agents with a single-line command while enforcing security standards and safe data handling, and its broad compatibility across NVIDIA hardware underscores its role in fostering a thriving AI ecosystem.
On March 16, the Party committee of the State-owned Assets Supervision and Administration Commission of the State Council convened an enlarged meeting, stressing that central enterprises must act as stabilizers and ballast for the national economy: setting business objectives scientifically, broadening development horizons, improving business quality, expanding effective investment, and firmly holding the bottom line of preventing systemic risks. The meeting also called for central enterprises to cultivate new quality productive forces and to serve as the national backbone for technological self-reliance and strength and as the primary force in building a modern industrial system. It is essential to intensify efforts to overcome key core technologies, increase investment in fundamental research, accelerate the commercialization of scientific and technological achievements, fully implement the "AI+" special initiative, and develop emerging pillar industries in line with enterprise capabilities.
The Rubin platform now integrates the NVIDIA Groq 3 LPU, a chip designed specifically to accelerate inference. The integration sharply improves the system's ability to generate and dispatch tokens at low latency and high volume, enabling highly interactive AI models. Building on this, NVIDIA plans to introduce the Groq 3 LPX rack, which will house 256 Groq 3 LPUs. Ian Buck, NVIDIA's Vice President of Hyperscale Business, said the Groq LPX will act as a coprocessor for the Rubin platform, tuning decoding performance across "every layer of the AI model for each token." The move positions Rubin to support the next phase of AI: multi-agent systems, which must remain interactive over context windows of millions of tokens while running models with trillions of parameters.
On March 16, the OpenAtom "Campus Tour" - Special Event for Open-Source Datasets was held in Shanghai. During this event, the "Embodied Artificial Intelligence Open-Source Dataset Community" was officially launched, and the OpenLET Whole-Body Motion Control Dataset made its worldwide premiere. Spearheaded by the OpenAtom Foundation and guided by LEJU Robotics, this community is a collaborative effort involving multiple core organizations. It aims to offer institutional support and facilitate resource collaboration for the development of embodied AI data infrastructure, thereby fostering the seamless integration of technology and industry.
On March 17, 2026, NVIDIA announced that the DGX Spark now supports cluster networking of up to four units, enabling the construction of a compact "desktop data center." Each unit, nicknamed the "Little Golden Box," delivers 1 petaFLOPS of AI computing power and ships with 128GB of LPDDR5X unified memory; four units operating in parallel reach 4 petaFLOPS and 512GB of unified memory.
The DGX Spark supports a variety of operational topologies, utilizing ConnectX-7 network cards to enable low-latency RoCE (RDMA over Converged Ethernet) communication. This setup is optimized for a range of scenarios, including low-latency inference and fine-tuning of large AI models. NVIDIA asserts that the parallel operation of multiple DGX Spark units achieves near-linear performance enhancements without the need for intricate configurations.
Furthermore, the DGX Spark introduces an open-source technology stack for local development and deployment of long-running, autonomous AI agents, and the stack scales to data center infrastructures such as AI factories. The DGX Spark has already been deployed across a range of industries, including finance, healthcare, energy, and telecommunications.
Tesla CEO Elon Musk recently praised the latest research from Chinese AI firm Kimi in public, and Kimi's official account replied in kind: "Your rockets are quite impressive too!" The technical report released by the Kimi team introduces an Attention Residuals mechanism, a rethinking of the traditional residual connection that has been a staple of deep learning for nearly a decade, and it has drawn worldwide attention.
By giving each layer an "intelligent filter," the mechanism lets the model dynamically select and extract useful information from preceding layers. This improves information flow and addresses long-standing problems of traditional residual connections, such as the dilution of shallow-layer information and inefficient training. In tests, training efficiency on a 48B-parameter model improved by a factor of 1.25, while scores on scientific reasoning and math problem-solving rose by 7.5% and 3.6%, respectively.
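The report's exact formulation is not reproduced here, but the general idea of a gated residual that attends over all earlier hidden states can be sketched in a toy form. In the sketch below, the function names, the scalar hidden state, and the fixed gate scores are all illustrative assumptions; in a real model the states would be tensors and the gate scores would be learned.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def plain_residual(x, layers):
    # Standard residual connection: h_l = h_{l-1} + f_l(h_{l-1}).
    # Each layer only sees the immediately preceding state.
    h = x
    for f in layers:
        h = h + f(h)
    return h

def attention_residual(x, layers, scores):
    # Gated variant (toy): before each layer, mix ALL previous hidden
    # states with a softmax gate, so shallow information is not diluted.
    hist = [x]
    for f, sc in zip(layers, scores):
        gate = softmax(sc[: len(hist)])          # gate over the history
        mix = sum(w * h for w, h in zip(gate, hist))
        hist.append(mix + f(hist[-1]))           # gated skip + layer output
    return hist[-1]

if __name__ == "__main__":
    f = lambda h: 0.5 * h                        # trivial stand-in "layer"
    print(plain_residual(1.0, [f, f]))           # 1.0 -> 1.5 -> 2.25
    print(attention_residual(1.0, [f, f], [[0.0], [0.0, 0.0]]))
```

With uniform gate scores the second layer blends the input and the first layer's output equally instead of relying only on the latest state, which is the intuition behind letting each layer "filter" earlier representations.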
Recently, the Institute for AI Industry Research (AIR) at Tsinghua University, in collaboration with AI-driven drug development company Biomind, has jointly open-sourced BioMedGPT-Mol—a versatile foundational model tailored for chemical molecules. Leveraging the Qwen framework as its base, this model has demonstrated state-of-the-art (SOTA) performance across six key task categories, encompassing molecular description and property prediction.
Recently, a joint team from Beijing Galaxy Robotics and Tsinghua University introduced what they describe as the world's first fully autonomous tennis-playing humanoid robot. Standing about 1.75 meters tall, the robot runs the LATENT intelligent planning and control algorithm and, rather than relying on pre-programming, acquires tennis skills autonomously through deep reinforcement learning, allowing it to play continuous matches against humans on a real tennis court.
According to a report by The Wall Street Journal, the AI field is undergoing a major transformation with far-reaching implications. Over the past five years the field has centered on training large language models, a costly process that requires tens of thousands of chips, consumes enormous amounts of energy, and takes place in remote large-scale data centers. During training, clusters of thousands of specialized microprocessor chips ingest billions of pieces of information, such as word definitions and historical facts, and must run around the clock for weeks or even months.
As reported by Business Insider, Jensen Huang, the CEO of NVIDIA, underscored at the GTC conference the urgency for global companies to promptly devise OpenClaw strategies, considering it as a new computing infrastructure. He highlighted that OpenClaw, an open-source AI agent framework, carries a significance akin to that of Windows in driving the widespread adoption of personal computers. It extends AI capabilities from the cloud to end-user devices, serving as a pivotal tool for enterprises in their digital transformation journey. To tackle security concerns, NVIDIA concurrently introduced an enterprise version, NemoClaw. This version guarantees the secure operation of AI agents within corporate environments by incorporating network guardrails and privacy routing functionalities. Furthermore, Huang projected that by 2027, the market demand for NVIDIA's Blackwell and Rubin AI chips would soar to $1 trillion.
The NVIDIA GTC Global Developer Conference took place in San Jose, California, USA. At this significant event, Lenovo Group, which has been NVIDIA's strategic partner for three decades, was designated as the global launch partner for NVIDIA Vera Rubin NVL72. Additionally, Lenovo unveiled a comprehensive, rack-level AI system featuring full liquid-cooling technology, built on this innovative platform, marking the dawn of a new era in agentic AI.
On Monday (local time), three girls, two of whom are minors, in the United States lodged a class-action lawsuit against Elon Musk's xAI company in the U.S. District Court for the Northern District of California. They alleged that the company's Grok image generation model had been exploited to produce and disseminate sexually suggestive photos and videos of them, which fall under the category of Child Sexual Abuse Material (CSAM). This lawsuit represents the first instance of minors taking legal action against a company that permits the generation of sexual content without consent.
The plaintiffs claimed that xAI was fully aware of the model's potential for misuse in creating sexual abuse materials, yet it still proceeded to design, market, and profit from it, all while neglecting to put in place preventive measures that are standard in the industry. The core of the case revolves around the prolonged harassment the victims endured at the hands of a man who utilized Grok to generate and spread sexually suggestive images of them. These images were then widely circulated across various social media platforms and online communities. As of now, xAI has not issued any response to requests for comments.
OpenAI executives are finalizing a major strategic shift to concentrate on programming and business user segments. The previous diversification strategy led to resource dispersion and strategic ambiguity, prompting plans to scale back non-core operations and focus resources on enhancing core competitiveness. This move aims to address industry competition, particularly pressure from Anthropic, which has emerged as a leading supplier with enterprise-grade and code market products. OpenAI has released a new version of the Codex application and the GPT 5.4 model, with over 2 million weekly active users, indicating initial success in its strategic adjustment.
On March 16, local time, Jensen Huang, the CEO of NVIDIA, delivered a keynote address that spanned over two hours at the annual GTC Developers Conference. Throughout his speech, he introduced several groundbreaking products. Among these were the Vera CPU, tailored specifically for Agentic AI applications; the language processor Groq 3; and the innovative neural rendering technology known as DLSS 5.
Alibaba Group has launched an internal initiative granting employees token allocations to encourage the use of advanced AI models and tools. Under the scheme, employees can use premium AI tools, such as those in the Wukong and Qoder series, free of charge for technical R&D and routine office work, and can also claim reimbursement for Bailian Coding Plan memberships or for external AI development tools.
Luo Fuli, formerly a researcher at DeepSeek and now head of Xiaomi's MiMo large-model initiative, has teamed up with Peking University to build ARL-Tangram, a unified resource management system. Using a unified action-level formulation and an elastic scheduling algorithm, the system handles heterogeneous resource constraints, accelerates action completion, and manages diverse resources in a tailored way. Evaluations show that it improves average action completion time (ACT), shortens reinforcement learning training phases, and yields significant savings in external resources. This is the second technological result Luo Fuli has published since joining Xiaomi, following her first paper in October of the previous year. She made her first public appearance at the 2025 Xiaomi Human-Vehicle-Home Ecosystem Partner Conference and announced on social media that she had joined the Xiaomi MiMo large-model team.
