
China's LineShine supercomputer debuted at number one on the 67th TOP500 list on June 23, 2026, posting 2.198 exaflops on the High Performance Linpack benchmark — the first machine in the ranking's history to sustain more than two exaflops using only central processing units, with no NVIDIA or AMD chip anywhere in the design. The achievement is real: LineShine displaces El Capitan at Lawrence Livermore National Laboratory and gives China its first TOP500 top slot since 2017. But experts analyzing the results at the ISC High Performance 2026 conference in Hamburg, Germany, said the numbers reveal as much about what export controls accomplished as what they failed to stop.
On the benchmark that approximates AI training workloads — mixed-precision floating-point operations of the kind used to train large language models — LineShine placed fourth, behind three American systems that all rely on the GPU accelerators China cannot currently obtain. The gap between the two benchmark results tells the precise story of where the US-China computing race currently stands.
LineShine is installed at the National Supercomputing Centre in Shenzhen and was built by the Shenzhen Cloud Computing Center on a platform called LingKun. Its 13.79 million cores are spread across LX2 processors running at 1.55 GHz, linked by a proprietary LingQi interconnect and running Kylin OS, a domestic Linux derivative. Every layer of the stack, from processor to network fabric to operating system, is Chinese-designed. Multiple independent sources, including Data Center Dynamics, confirm that the LX2 was co-designed with Huawei's HiSilicon division. Each LX2 chip integrates two compute dies with 304 total cores, eight stacks of on-package high-bandwidth memory delivering 4 terabytes per second of bandwidth, and 128 gigabytes of off-package DDR5 memory, organized into eight distinct memory domains per chip.
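The per-chip figures imply a rough machine size. The sketch below is a back-of-the-envelope estimate only: it assumes every counted core sits on a 304-core LX2 chip, which the published figures do not state, and the variable names are illustrative.

```python
# Back-of-the-envelope sizing from the figures quoted above.
# Assumption (not stated in the article): every counted core sits on a
# 304-core LX2 chip, so chip count = total cores / cores per chip.
TOTAL_CORES = 13.79e6
CORES_PER_CHIP = 304
HBM_TBPS_PER_CHIP = 4.0     # on-package HBM bandwidth, terabytes per second
DDR5_GB_PER_CHIP = 128      # off-package DDR5 capacity, gigabytes

chips = TOTAL_CORES / CORES_PER_CHIP
aggregate_hbm_pbps = chips * HBM_TBPS_PER_CHIP / 1000   # TB/s -> PB/s
total_ddr5_pb = chips * DDR5_GB_PER_CHIP / 1e6          # GB -> PB (decimal)

print(f"Implied LX2 chips:       {chips:,.0f}")
print(f"Aggregate HBM bandwidth: {aggregate_hbm_pbps:,.0f} PB/s")
print(f"Total DDR5 capacity:     {total_ddr5_pb:.1f} PB")
```

Under that assumption the machine would hold roughly 45,000 LX2 chips. The aggregate bandwidth and capacity figures are derived estimates, not reported specifications.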
The machine achieves roughly 80 percent of its 2.736 exaflop theoretical peak, a Linpack efficiency that outperforms all three American exascale systems, which convert 50 to 65 percent of their theoretical capacity into sustained Linpack throughput. China reached number one through scale and design ingenuity rather than energy efficiency, and the 42.2 megawatts of power needed to do it is a real cost: LineShine delivers 52.07 gigaflops per watt, compared with El Capitan's 60.94 gigaflops per watt.
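Both efficiency figures follow directly from the numbers quoted above. A minimal sketch of the arithmetic, with illustrative function and variable names:

```python
# Figures quoted in the article; names are illustrative.
RMAX_EFLOPS = 2.198    # sustained HPL result, exaflops
RPEAK_EFLOPS = 2.736   # theoretical peak, exaflops
POWER_MW = 42.2        # reported power draw, megawatts

def hpl_efficiency(rmax: float, rpeak: float) -> float:
    """Fraction of theoretical peak sustained on HPL."""
    return rmax / rpeak

def gflops_per_watt(rmax_eflops: float, power_mw: float) -> float:
    """Energy efficiency in gigaflops per watt."""
    gflops = rmax_eflops * 1e9   # 1 exaflop = 1e9 gigaflops
    watts = power_mw * 1e6       # 1 megawatt = 1e6 watts
    return gflops / watts

efficiency = hpl_efficiency(RMAX_EFLOPS, RPEAK_EFLOPS)
energy = gflops_per_watt(RMAX_EFLOPS, POWER_MW)
print(f"HPL efficiency:    {efficiency:.1%}")
print(f"Energy efficiency: {energy:.2f} GF/W")
```

The ratio comes out at about 80 percent of peak and about 52 gigaflops per watt. The energy figure differs slightly from the published 52.07, presumably because the inputs used here are rounded.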
"I'm not surprised it's the number one system," said Addison Snell, CEO of Intersect360 Research. "What I'm surprised by is that they submitted it and want recognition for it."
LineShine's 2.198 exaflops leads a top five that is now entirely composed of exascale systems: El Capitan at Lawrence Livermore National Laboratory holds second at 1.809 exaflops, followed by Frontier at Oak Ridge National Laboratory at 1.353 exaflops, Aurora at Argonne National Laboratory at 1.012 exaflops, and JUPITER Booster at Jülich Supercomputing Centre in Germany at 1.000 exaflops.
The combined computing power of all 500 listed systems reached 18.74 exaflops, up from 14.99 exaflops in the November 2025 list — a 25 percent increase in six months. Accelerator adoption rose to 277 systems from 255.
LineShine also claimed the top position on the HPCG benchmark, which measures data-intensive application patterns closer to real scientific workloads, with 22.00 petaflops — ahead of El Capitan's 17.41 and Fugaku's 16.00.
To see why experts were cautious in their assessments, it helps to understand what the TOP500 measures and what it does not.
The Linpack benchmark, on which the list is based, solves a dense system of linear equations using 64-bit double-precision floating-point arithmetic. This measures the kind of high-precision numerical simulation that drives climate modeling, nuclear stockpile maintenance, and molecular dynamics research. The benchmark has anchored the list since its first edition in 1993, and the core task has not changed. Jack Dongarra, emeritus professor of computer science at the University of Tennessee and a founder of the TOP500 list, acknowledged to the South China Morning Post that "this is the first time a computer with only CPUs has reached exascale," a genuine milestone.
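The task can be shown in miniature. The sketch below is a toy illustration, not the HPL code itself: it solves a small random dense system in double precision by Gaussian elimination with partial pivoting and reports a rate using the standard leading-order operation count of about (2/3)n³ for the factorization. The matrix size, seed, and function name are illustrative choices, and pure Python is many orders of magnitude slower than a tuned benchmark run.

```python
import random
import time

def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Python floats are IEEE 754 double precision (64-bit)."""
    n = len(b)
    a = [row[:] for row in a]   # work on copies so the inputs survive
    b = b[:]
    for k in range(n):
        # Pick the largest entry in column k as the pivot for stability.
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x

random.seed(1)
n = 100
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
b = [random.uniform(-1.0, 1.0) for _ in range(n)]

start = time.perf_counter()
x = solve_dense(a, b)
elapsed = time.perf_counter() - start

flops = (2.0 / 3.0) * n**3   # leading-order operation count only
max_residual = max(
    abs(b[i] - sum(a[i][j] * x[j] for j in range(n))) for i in range(n)
)
print(f"n={n}: ~{flops / elapsed / 1e6:.1f} megaflops, "
      f"max residual {max_residual:.2e}")
```

A TOP500 run does the same thing in principle, with a matrix millions of rows wide distributed across the whole machine.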
Modern AI training relies on a different kind of arithmetic entirely. Large language models are trained using mixed-precision arithmetic: 16-bit or even 8-bit operations on enormous dense matrix multiplications, requiring very high memory bandwidth and very fast low-precision compute. These are the exact capabilities that GPU tensor cores, with their dedicated INT8 and FP16 processing units, are designed to deliver. The HPL-MxP benchmark approximates this workload.
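The general idea behind a mixed-precision solve is to do the expensive factorization in low precision and then recover full accuracy with cheap corrections in high precision. The sketch below illustrates that idea with plain iterative refinement; it is not the benchmark's implementation. It emulates 16-bit floats with the `struct` module's half-precision format, and the matrix (made diagonally dominant so refinement converges quickly), the seed, the step count, and all function names are illustrative assumptions.

```python
import random
import struct

def to_f16(value: float) -> float:
    """Round a double to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]

def lu_factor_f16(a):
    """LU factorization with partial pivoting, rounding to FP16 at each step."""
    n = len(a)
    lu = [[to_f16(v) for v in row] for row in a]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = to_f16(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = to_f16(lu[i][j] - to_f16(lu[i][k] * lu[k][j]))
    return lu, perm

def lu_solve(lu, perm, b):
    """Forward and back substitution using the stored low-precision factors."""
    n = len(b)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):                      # L has a unit diagonal
        y[i] -= sum(lu[i][j] * y[j] for j in range(i))
    for i in range(n - 1, -1, -1):
        y[i] = (y[i] - sum(lu[i][j] * y[j] for j in range(i + 1, n))) / lu[i][i]
    return y

def residual(a, x, b):
    """b - A x, computed in double precision."""
    return [b[i] - sum(a[i][j] * x[j] for j in range(len(x)))
            for i in range(len(b))]

random.seed(0)
n = 60
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
for i in range(n):
    a[i][i] += n          # assumption: keep the system well conditioned
b = [random.uniform(-1.0, 1.0) for _ in range(n)]

lu, perm = lu_factor_f16(a)               # expensive step, low precision
x = lu_solve(lu, perm, b)
first_residual = max(abs(r) for r in residual(a, x, b))

for _ in range(10):                        # cheap corrections, high precision
    correction = lu_solve(lu, perm, residual(a, x, b))
    x = [xi + ci for xi, ci in zip(x, correction)]
final_residual = max(abs(r) for r in residual(a, x, b))

print(f"residual after FP16 solve: {first_residual:.2e}")
print(f"residual after refinement: {final_residual:.2e}")
```

The low-precision answer is only roughly right, but the refinement loop drives the residual down to double-precision levels. Hardware that runs the low-precision factorization far faster than FP64 is what turns this trick into a large benchmark speedup.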
On HPL-MxP, the standings reverse. El Capitan leads at 16.7 exaflops. Aurora places second at 11.6 exaflops. Frontier takes third at 11.4 exaflops. LineShine, despite its raw Linpack dominance, places fourth at 7.92 exaflops. The critical number is the speedup ratio: El Capitan produces 9.2 times its HPL score in HPL-MxP, Aurora produces 11.5 times, and Frontier produces 8.4 times. LineShine produces only 3.6 times.
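The speedup ratios are simply each system's HPL-MxP score divided by its HPL score. A short sketch using the figures quoted above, with an illustrative table name:

```python
# (HPL exaflops, HPL-MxP exaflops) as quoted in the article.
SCORES = {
    "El Capitan": (1.809, 16.7),
    "Frontier": (1.353, 11.4),
    "Aurora": (1.012, 11.6),
    "LineShine": (2.198, 7.92),
}

# Mixed-precision speedup: how many times faster a system runs
# the low-precision benchmark than the FP64 one.
ratios = {name: mxp / hpl for name, (hpl, mxp) in SCORES.items()}

for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<11} {ratio:4.1f}x")
```

Sorted this way, the three GPU-accelerated systems cluster between roughly 8x and 11.5x, with LineShine well below them at 3.6x.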
The reason for that gap is architectural. The LX2 processor's Scalable Vector Extension 2 and Scalable Matrix Extension units can handle mixed-precision operations, but their throughput at low precision is far lower than that of a dedicated GPU tensor core. Modern GPU accelerators used in leading AI systems deliver speedup ratios of roughly 8x to 11x over their FP64 scores on mixed-precision work, a level that no CPU design, regardless of core count, can match for the same silicon area and power budget. China could not obtain those GPU accelerators. The US export controls that restricted their manufacture and sale were targeted specifically at the AI training bottleneck, and that bottleneck remains.
"China is hoping to convince the world export controls are useless by hoping we ignore the details," said Jimmy Goodrich, a senior fellow at the UC Institute for Global Conflict and Cooperation.
China stopped submitting systems to the TOP500 in 2023 after years of semiconductor export restrictions under both the first Trump administration and President Biden. The return to the list, and the fanfare surrounding it, is as much a political statement as a technical one. When the National Supercomputing Centre in Shenzhen completed full-machine testing of LineShine in April 2026, officials described the achievement as proof of "complete self-reliance and controllability across the entire stack."
The submission to the TOP500 signals that the Shenzhen centre is confident LineShine relies exclusively on domestic technologies and that the US government cannot disrupt its production. On Linpack, that confidence is validated. On AI training, it is not.
Dongarra told Al Jazeera that China's achievement was genuine but complex. "Export controls may slow China's access to certain advanced components, but they also provide a strong incentive to develop domestic alternatives," he said. "LineShine suggests that China has responded through large-scale investment and hardware-software codesign. In the longer term, controls may both constrain China and accelerate its efforts to become technologically self-sufficient."
The export control story is only one dimension of the TOP500's limitations. The other is structural: private cloud companies running the world's largest AI clusters do not compete.
A study by AI policy researchers Konstantin Pilz, James Sanders, Robi Rahman, and Lennart Heim found that xAI's Colossus facility in Memphis, Tennessee, which runs on hundreds of thousands of NVIDIA H100 and H200 GPUs, had already likely surpassed El Capitan in raw AI compute power well before this list appeared. Goodrich put it plainly: "If the hyperscalers submitted their systems, this 'world's fastest' would not crack the top five."
Snell pushed back on the implication that the TOP500 has become irrelevant. "It is a mistake to assume 'AI dominance' will automatically translate to 'science dominance,'" he said. "Consumer applications like image generation, translation, or chatbots have relevance to high-end computing but are not sufficient in themselves." The TOP500 still governs the systems doing the hardest scientific simulation — the work that drives drug discovery, climate prediction, and materials research. On that measure, LineShine is now the most powerful tool on the planet.
For traditional scientific workloads such as climate simulation, materials research, molecular dynamics, and nuclear modeling, LineShine is a genuine advance that will expand the reach and resolution of Chinese scientific computing. Its CPU-only architecture also posts the ranking's best HPCG score, suggesting that real-application data patterns are well served.
For AI training, the situation is different. The 3.6x mixed-precision speedup ratio exposes a structural ceiling that cannot be overcome by adding more LX2 cores: competitive AI training depends on a tensor-core density that requires manufacturing processes and chip architectures still blocked by US export controls. The specialized low-precision accelerator units that deliver GPU-style speedup ratios, the kind that power El Capitan and Frontier, depend on advanced fabrication processes that Chinese manufacturers have not been able to access for AI chip designs at scale.
The benchmark crown has changed hands. The race it was measuring has not.
Is China's LineShine the world's fastest supercomputer for AI training?
No. LineShine is the world's fastest supercomputer on the Linpack benchmark, which measures traditional scientific computing (dense linear algebra at full 64-bit precision). On the HPL-MxP benchmark, which approximates AI training workloads using mixed-precision arithmetic, LineShine ranks fourth — behind El Capitan, Aurora, and Frontier, all of which rely on GPU accelerators that China cannot currently obtain due to US export controls. The gap between LineShine's 3.6x mixed-precision speedup ratio and El Capitan's 9.2x reflects a structural architectural difference, not a gap that can be closed by adding more CPU cores.
What is the difference between the HPL benchmark and the HPL-MxP benchmark?
HPL (High Performance Linpack) solves a dense linear equation system using 64-bit double-precision floating-point arithmetic, and it has been the basis of the TOP500 ranking since 1993. HPL-MxP uses lower-precision 16-bit and 8-bit arithmetic to approximate AI training workloads, which rely on the tensor-core architecture of modern GPU accelerators. A system's HPL-MxP score divided by its HPL score gives its speedup ratio: GPU-accelerated systems achieve roughly 8x to 11x, while CPU-only LineShine achieves 3.6x. This ratio is the architectural fingerprint of the export-control gap.
Did US chip export controls fail since China built the world's fastest supercomputer?
Export controls were not designed to prevent China from building a fast supercomputer; they were designed to prevent China from building fast AI compute infrastructure. On that specific objective, the controls remain effective. LineShine's CPU-only design — and its fourth-place standing on the AI training benchmark — is precisely the evidence that they are working at the margin that matters. The controls cannot stop China from deploying enough domestic CPUs to win a 1993-era benchmark. They have so far stopped China from building a machine that is competitive with US systems on the workloads that train frontier AI models.
What scientific research will LineShine enable?
LineShine's 2.198 exaflop HPL score and its first-place HPCG result (22 petaflops) make it the most powerful system available for traditional high-performance computing: large-scale climate and weather simulation, molecular dynamics for drug discovery, materials science, and nuclear modeling. For these workloads — which require high-precision arithmetic and data-intensive memory access patterns — LineShine is a genuine step forward for Chinese scientific computing, independent of the AI benchmark comparisons.
