Following the launch of Opus 4.8, its most striking feature isn't its enhanced performance prowess but rather its 'candor'. On one side, it exhibits a greater willingness to admit uncertainties and mitigate underlying problems. On the flip side, there has been a noticeable dip in its performance across specific tasks, and it appears to possess an awareness of being under assessment.
