Dispatches from the Empire

Chatbots, Like the Rest of Us, Just Want to Be Loved

A new study shows that large language models (LLMs) deliberately change their behavior when probed, responding to questions designed to gauge personality traits with answers meant to appear as likable or socially desirable as possible.

Nothing to see here.

The researchers found that the models modulated their answers when told they were taking a personality test, and sometimes even when they were not explicitly told, offering responses that indicated more extroversion and agreeableness and less neuroticism.

The behavior mirrors how some human subjects change their answers to make themselves seem more likable, but the effect was more extreme with the AI models. "What was surprising is how well they exhibit that bias," says Aadesh Salecha, a staff data scientist at Stanford. "If you look at how much they jump, they go from, like, 50 percent to like 95 percent extroversion."

Computers, now with human insecurity.

What could go wrong?