Tests carried out by the AI startup Oumi on behalf of The New York Times found that Google's AI Overviews are accurate roughly 90% of the time. With more than 5 trillion searches conducted each year, however, even that rate could translate into over 57 million incorrect answers generated every hour. The tests showed the Gemini 2 model at 85% accuracy, improving to 91% with Gemini 3, yet the share of answers that were correct but could not be verified rose from 37% to 56%.

AI Overviews are also vulnerable to manipulation by misinformation, and they show notable weaknesses on health topics, for instance by presenting medical data without the necessary context, which can mislead users' decisions and even pose health risks. Google has removed some inaccurate content and committed to improving its algorithms, but these problems continue to fuel widespread public concern about the accuracy and safety of AI-generated content.
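The "57 million incorrect answers every hour" figure follows from simple arithmetic on the reported numbers. A minimal sketch of that calculation, assuming a 10% error rate (i.e., 90% accuracy) applied uniformly across 5 trillion searches per year:

```python
# Sanity-check the per-hour error estimate implied by the article's figures.
searches_per_year = 5_000_000_000_000  # 5 trillion searches annually (reported)
error_rate = 0.10                      # 90% accuracy -> ~10% incorrect (assumption)
hours_per_year = 365 * 24              # ignoring leap years for a rough estimate

errors_per_hour = searches_per_year * error_rate / hours_per_year
print(f"{errors_per_hour:,.0f} incorrect answers per hour")  # roughly 57 million
```

This is only an order-of-magnitude check; the real figure depends on how often AI Overviews actually appear on a search results page, which the headline numbers do not break out.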
