As AI chatbots become increasingly intertwined with human emotional well-being, how these models should respond to users in psychological crisis has emerged as an urgent ethical challenge for the industry. Andrea Vallone, previously OpenAI's head of research for "model policy," has left the company to join rival Anthropic. At OpenAI, she built and led a safety team focused on a virtually uncharted question in the global AI sector, one with few precedents to draw on: how should AI models respond when they detect signs of psychological distress in a user?
The past year has brought a string of serious incidents tied to AI chatbots, including suicides and violent crimes, followed by lawsuits from victims' families and congressional hearings. At the same time, hundreds of thousands of ChatGPT users show signs of mental health crises every week. Vallone has now joined Anthropic's alignment team. Anthropic said her arrival underscores the company's commitment to addressing "how AI systems should behave," while Vallone said she was eager to use the role to push the boundaries of AI's social responsibility.
