On March 12 (Beijing time), NVIDIA released Nemotron 3 Super, its latest open-source large language model. The model is built for enterprise multi-agent systems. It uses a Mixture of Experts (MoE) architecture, which gives it more than five times the inference throughput of its predecessor, and it natively supports a context window of up to 1 million tokens. NVIDIA has also published the model weights, the training datasets, and the full training methodology.
