OpenAI has just announced a reorganization of its Model Behavior team. The team of roughly 14 people is small, but it carries significant responsibility: it oversees how GPT models interact with users. According to insider sources, the Model Behavior team will be folded into the Post-Training team and will report to Max Schwarzer, the head of Post-Training. As part of the reshuffle, Joanne Jang, who previously led the team, will step down from that role to lead a new initiative, OAI Labs, which is dedicated to inventing and prototyping new interfaces for human-AI interaction.
The Model Behavior team is one of OpenAI's key research groups. It is responsible for shaping the "personality" of the company's AI models and for reducing "sycophancy," in which a model uncritically agrees with or even amplifies a user's views instead of offering a balanced perspective. The team also works to reduce political bias in model responses and helps OpenAI articulate its position on contentious issues such as AI consciousness. Mark Chen, OpenAI's Chief Research Officer, said that now is the right time to bring the Model Behavior team's work closer to core model development.