On June 18, 2026, Alibaba's ATH-Token Foundry, in collaboration with the Gaoling School of Artificial Intelligence at Renmin University of China, announced the open-source release of LOGOS (Language Of Generative Objects in Science), the first multi-domain scientific generative foundation model built on a unified "scientific grammar". LOGOS uses a shared vocabulary to encode heterogeneous objects such as proteins, small molecules, and materials as discrete token sequences in a single format, and its pre-training corpus spans seven modality categories and totals 44.87 billion tokens. Across six major scientific tasks, the model matched or surpassed domain-specific methods using pure sequence modeling alone, and it did so with high parameter efficiency: LOGOS-1B, for instance, outperformed Microsoft's NatureLM with just 1/56 of the parameters. LOGOS also addresses the mismatch between pre-training objectives and downstream tasks, so its generative capabilities are available without complex fine-tuning. The model weights, inference code, and technical report have all been open-sourced.
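
To make the shared-vocabulary idea concrete, here is a minimal Python sketch of how objects from different scientific domains could be mapped into one discrete token space. It is an illustration only and not the LOGOS tokenizer: the `SharedVocab` class, the domain tags such as `<protein>`, the `<eos>` marker, and the choice to prefix each symbol with its domain (so that "C" for cysteine and "C" for carbon get different IDs) are all assumptions made for this example, since the announcement does not describe the actual vocabulary.

```python
from typing import Dict, List

# Hypothetical special tokens; the announcement does not list LOGOS's real ones.
DOMAIN_TAGS: Dict[str, str] = {
    "protein": "<protein>",
    "molecule": "<molecule>",
    "material": "<material>",
}
END_TOKEN = "<eos>"


class SharedVocab:
    """Toy vocabulary that assigns IDs from one table to every domain."""

    def __init__(self) -> None:
        self.token_to_id: Dict[str, int] = {}
        # Reserve the first IDs for the end marker and the domain tags.
        for token in [END_TOKEN, *DOMAIN_TAGS.values()]:
            self._add(token)

    def _add(self, token: str) -> int:
        # Assign the next free ID the first time a token is seen.
        if token not in self.token_to_id:
            self.token_to_id[token] = len(self.token_to_id)
        return self.token_to_id[token]

    def encode(self, domain: str, symbols: List[str]) -> List[int]:
        # Sequence layout (assumed): domain tag, namespaced symbols, end marker.
        ids = [self.token_to_id[DOMAIN_TAGS[domain]]]
        ids += [self._add(f"{domain}:{symbol}") for symbol in symbols]
        ids.append(self.token_to_id[END_TOKEN])
        return ids


vocab = SharedVocab()
protein_ids = vocab.encode("protein", list("MKT"))      # amino-acid letters
molecule_ids = vocab.encode("molecule", list("CCO"))    # SMILES characters for ethanol
material_ids = vocab.encode("material", ["Na", "Cl"])   # element symbols

print(protein_ids, molecule_ids, material_ids)
```

All three sequences draw their IDs from the same table, so a single sequence model can be trained on them without domain-specific input layers. A production tokenizer would presumably also handle multi-character SMILES atoms, 3D structure, and other modalities, which this sketch leaves out.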
