BIT Team Achieves Remarkable Breakthroughs in Lightweight Large Language Models, Value Alignment, Inference Optimization, and Practical Applications
Author: Editor

Recently, the research team led by Professor Song Dawei from the School of Computer Science at Beijing Institute of Technology (BIT) has achieved significant advances in large language model research. Following the team's 'Outstanding Paper Award' at ACL2025 (a top-tier international conference, CCF Category A), four additional papers from the team have been accepted at ACL2026. ACL2025 was held in Vienna, Austria, from July 27 to August 1, 2025. In the award-winning paper, Zhang Chen, a Ph.D. candidate on the team, introduced for the first time a teacher-student capacity disparity law for large model distillation: for a given student model scale, the optimal teacher model size is approximately linearly proportional to the student model size. Guided by this law, the team's distilled 3B model outperformed contemporary baseline models of the same scale on standard benchmarks, advancing the compute-performance Pareto frontier.

ACL2026 is scheduled to be held in San Diego, California, USA, from July 2 to July 7, 2026, with acceptance rates of 19% for the main conference and 18% for Findings. The team's four accepted papers are authored by Li Zelin, a master's graduate; Tian Yanzhi, a Ph.D. student (co-supervised by Dr. Guo Yuhang from the School of Computer Science); Sui Yi; and Meng Ling'ang. Among these contributions, Li Zelin and colleagues proposed the RAO method, which achieves precise, point-by-point alignment optimization of large language model values. Tian Yanzhi and his co-authors introduced the RATE evaluation framework, which improves the accuracy of machine translation quality assessment, particularly for non-literal translations. Sui Yi and team proposed the STACK framework, which mitigates 'over-reasoning' and inefficiency in the long-chain reasoning processes of large models. Lastly, Meng Ling'ang and his collaborators proposed the VADE framework, enabling more nuanced and fine-grained emotional reasoning.