In 2024, "Brain Rot" was crowned the Oxford Word of the Year. This term describes the cognitive deterioration phenomenon that occurs when humans are exposed to fragmented and low - value online information for extended periods. Recently, a research team from institutions including Texas A&M University put forward and validated the "LLM Brain Rot Hypothesis." This hypothesis suggests that if large language models (LLMs) are persistently exposed to low - quality online text, their cognitive abilities will suffer a long - lasting decline and will be challenging to restore.
In a set of carefully controlled experiments, the team operationalized junk data along two dimensions. The first, M1, is based on user engagement: short, highly popular content designed to grab attention quickly. The second, M2, is based on semantic quality: clickbait and hollow content with little real substance.
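To make the M1 notion concrete, here is a minimal Python sketch of an engagement-based selector. The post fields (text, likes, retweets) and the thresholds are illustrative assumptions, not the paper's actual selection criteria.

```python
from dataclasses import dataclass

@dataclass
class Post:
    text: str
    likes: int
    retweets: int

def is_m1_junk(post: Post, max_tokens: int = 30, min_engagement: int = 500) -> bool:
    """Flag short, highly popular posts as engagement-based (M1) junk."""
    length = len(post.text.split())          # crude word count as a length proxy
    engagement = post.likes + post.retweets  # simple popularity proxy
    return length <= max_tokens and engagement >= min_engagement

posts = [
    Post("you won't BELIEVE what this model just did", likes=12_000, retweets=3_400),
    Post("A detailed walkthrough of attention-sparsity trade-offs in long-context "
         "models, with ablations and failure cases discussed step by step.", likes=42, retweets=5),
]
junk = [p for p in posts if is_m1_junk(p)]
clean = [p for p in posts if not is_m1_junk(p)]
print(len(junk), len(clean))  # -> 1 1
```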
The results were striking. As the proportion of junk data in training rose from 0% to 100%, the model's core cognitive abilities degraded substantially: its ARC-Challenge reasoning score fell from 74.9 to 57.2, and its RULER-CWE long-context understanding score dropped from 84.4 to 52.3. Moreover, M1 (engagement-based) junk caused far more severe damage than M2.
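The dose-response design behind these numbers can be illustrated with a short sketch that assembles training mixes at different junk ratios. The corpus contents and sizes below are placeholders, not the study's actual data.

```python
import random

def build_mix(junk_docs, control_docs, junk_ratio, total_docs, seed=0):
    """Sample a corpus of `total_docs` documents with the given junk proportion."""
    rng = random.Random(seed)
    n_junk = round(total_docs * junk_ratio)
    n_control = total_docs - n_junk
    mix = rng.sample(junk_docs, n_junk) + rng.sample(control_docs, n_control)
    rng.shuffle(mix)
    return mix

junk_docs = [f"junk-{i}" for i in range(1000)]
control_docs = [f"control-{i}" for i in range(1000)]

# Sweep the junk share from 0% to 100%, one training mix per ratio.
for ratio in (0.0, 0.2, 0.5, 0.8, 1.0):
    corpus = build_mix(junk_docs, control_docs, ratio, total_docs=200)
    print(ratio, len(corpus))
```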
A deeper error analysis found that the dominant cause of these reasoning failures was a "leap of thought" pattern: the model skips intermediate reasoning steps and jumps straight to a conclusion. This pattern accounted for up to 84% of the reasoning failures.
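As a rough illustration only (not the study's annotation protocol), such outputs could be flagged with a heuristic that checks whether an answer contains any intermediate reasoning before its conclusion. The marker list and thresholds here are assumptions.

```python
# Crude connective-based check for answers that skip intermediate reasoning.
REASONING_MARKERS = ("because", "therefore", "step", "first", "then", "so")

def looks_like_thought_skipping(answer: str, min_markers: int = 2, min_words: int = 40) -> bool:
    """Flag answers that are very short or nearly free of reasoning connectives."""
    text = answer.lower()
    marker_hits = sum(text.count(m) for m in REASONING_MARKERS)
    return len(text.split()) < min_words or marker_hits < min_markers

print(looks_like_thought_skipping("The answer is B."))  # True: conclusion with no reasoning
print(looks_like_thought_skipping(
    "First, the train covers 60 km in the first hour; then it slows, so the "
    "second hour adds 40 km; therefore the total distance is 100 km, and "
    "because the trip takes two hours the average speed is 50 km/h."))  # False
```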
The study also tested remediation methods such as reflective reasoning and retraining, but none fully restored the model to its original performance, suggesting that the brain rot effect becomes deeply ingrained in the model.
The team calls for a reassessment of current web data collection and continual pretraining practices, advocating stricter data screening and quality control to prevent cumulative damage to AI models.
