Meta's Departing Chief AI Scientist Acknowledges Company's Manipulation of Test Results for Llama 4 Release
2026-01-03
Author: Site editor

In April 2025, Meta unveiled its Llama 4 large-scale model in two variants, Scout and Maverick. The models use a Mixture of Experts (MoE) architecture with a total parameter count of up to 400 billion, support multimodal processing, and advertise a context window of up to 10 million tokens.

After release, however, the model's actual performance fell well short of expectations, especially on programming tasks. The Maverick variant scored only 16% on the aider multilingual (polyglot) coding benchmark, far below what was anticipated and behind models with far fewer parameters. Llama 4 also showed notable weaknesses in long-context recall and conversational coherence, leaving a substantial gap between its real-world performance and the claims in the official announcements.

More contentiously, internal staff alleged that during training Meta had mixed benchmark test-set data into the training set to inflate the model's benchmark scores, prompting accusations of "cheating." Meta officially denied the allegations, but the damage to Llama 4's reputation was done, and key team members subsequently departed.
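To make the alleged problem concrete: "test-set contamination" means benchmark questions appear verbatim (or near-verbatim) in the training data, so high scores measure memorization rather than capability. Below is a minimal, illustrative sketch of how such contamination is commonly audited, by flagging training documents that share long word-level n-grams with benchmark prompts. All function names and the toy data are hypothetical; real audits operate on billions of documents with hashed n-gram indexes, and nothing here describes Meta's actual pipeline.

```python
# Hedged sketch: flag training docs that share an 8-word n-gram with
# any benchmark test prompt. Illustrative only; names are invented.

def ngrams(text: str, n: int = 8):
    """Yield word-level n-grams of a text as tuples."""
    words = text.lower().split()
    for i in range(len(words) - n + 1):
        yield tuple(words[i:i + n])

def contaminated(train_docs, test_docs, n: int = 8):
    """Return indices of training docs sharing any n-gram with the test set."""
    test_grams = set()
    for doc in test_docs:
        test_grams.update(ngrams(doc, n))
    return [i for i, doc in enumerate(train_docs)
            if any(g in test_grams for g in ngrams(doc, n))]

# Toy example: the second training doc copies a test prompt verbatim.
test = ["write a function that reverses a linked list in place using constant memory"]
train = ["an unrelated news article about model releases and benchmarks today ok",
         "prompt: write a function that reverses a linked list in place using constant memory"]
print(contaminated(train, test))  # → [1]
```

A model trained on the flagged document would reproduce the benchmark answer from memory, which is why overlap checks like this are a standard (if imperfect) part of reporting trustworthy benchmark results.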

The incident not only exposed Llama 4's technical shortcomings but also ignited broad discussion in the open-source AI community about transparency and ethics in model evaluation.