Chinese Internet Basic Corpus 3.0 Officially Released to Support AI Large Model Training
Author: Editor

On September 18, 2025, the Chinese Internet Basic Corpus 3.0 was officially released at the Sub-forum on AI Security Governance of the 2025 National Cybersecurity Publicity Week in Kunming. The new version contains 120 GB of data, draws on an expanded set of high-quality Chinese websites, and applies stricter filtering of illegal and undesirable content. It is intended to supply Chinese-language data for training large language models and for the broader development of artificial intelligence technology.