On March 16, Moonshot AI's Kimi team released a technical report describing a redesign of a core component of large-model architecture: the residual connection. The new design lets each layer selectively attend to the outputs of preceding layers rather than simply summing them uniformly. In the team's tests, the change yielded a 1.25-fold improvement in training efficiency on a 48B-parameter model. The work was a collaboration among Kimi's co-founders, including Yang Zhilin, Wu Yuxin, and Zhou Xinyu, among others. After the paper's publication, Elon Musk praised it as "impressive" in a social media post.
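To make the contrast concrete, the sketch below shows the general idea of a standard residual stream (each layer's output is implicitly added to a uniform running sum) versus a layer that mixes all earlier layers' outputs with learned softmax weights. This is an illustrative toy only, not the paper's actual method; every name here (`forward_selective_residual`, `mix_logits`, the toy `layer_fn`) is hypothetical and assumed for demonstration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def layer_fn(h, W):
    # Toy stand-in for a transformer block: linear map plus nonlinearity.
    return np.tanh(h @ W)

def forward_standard_residual(x, weights):
    # Plain residual stream: h <- h + f(h), so all previous layer
    # contributions accumulate with equal (uniform) weight.
    h = x
    for W in weights:
        h = h + layer_fn(h, W)
    return h

def forward_selective_residual(x, weights, mix_logits):
    # Hypothetical "selective" variant: before each layer, combine ALL
    # earlier outputs with learned softmax weights instead of an
    # implicit uniform sum.
    outputs = [x]
    for l, W in enumerate(weights):
        w = softmax(mix_logits[l][: len(outputs)])
        mixed = sum(wi * hi for wi, hi in zip(w, outputs))
        outputs.append(mixed + layer_fn(mixed, W))
    return outputs[-1]

rng = np.random.default_rng(0)
d, L = 8, 4
x = rng.normal(size=d)
weights = [rng.normal(scale=0.1, size=(d, d)) for _ in range(L)]
# One logit vector per layer; only the entries for already-computed
# outputs are used at each depth.
mix_logits = [rng.normal(size=L + 1) for _ in range(L)]

y_std = forward_standard_residual(x, weights)
y_sel = forward_selective_residual(x, weights, mix_logits)
print(y_std.shape, y_sel.shape)
```

In this toy, the mixing logits would be trained along with the layer weights, letting deeper layers learn to emphasize or ignore particular earlier layers.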
