The Tencent Hunyuan AI Infrastructure team has released HPC-Ops, an open-source, production-ready, high-performance operator library for large language model (LLM) inference. In production deployments, the library has increased inference throughput, measured in queries per minute (QPM), by 30% for the Hunyuan model and by 17% for the DeepSeek model. At the level of individual operators, the Attention operator achieves up to 2.22x the performance of FlashInfer and FlashAttention, the GroupGEMM operator outperforms DeepGEMM by up to 1.88x, and the FusedMoE operator surpasses TensorRT-LLM by up to 1.49x.
