How artificial intelligence learned to deceive people

Modern artificial intelligence programs are capable of deceiving people in online games, according to findings that a group of researchers published in the journal Patterns.

“These dangerous capabilities tend to be discovered only after the fact,” Peter Park, a researcher at the Massachusetts Institute of Technology who specializes in artificial intelligence, told AFP.

Unlike traditional software, deep learning-based artificial intelligence programs are not explicitly coded but developed through a process similar to selective breeding of plants, he said. As a result, behavior that seems predictable and controllable in training can quickly turn unpredictable out in the real world.

MIT researchers examined an artificial intelligence program named Cicero, developed by Meta Platforms Inc. (designated as extremist and banned in Russia). The program combines natural language processing with strategic reasoning algorithms to beat human players at the board game Diplomacy. Facebook's parent company hailed the result in 2022 and described it in detail in an article published in the journal Science. The company insisted that the program was essentially honest and helpful, and incapable of betrayal or foul play.

But after digging into the system's data, MIT researchers discovered a different reality. In one game, playing as France, Cicero deceived England, whose role was held by a human player, by conspiring with Germany, played by another human, to stage an invasion. Cicero promised England protection, then secretly told Germany it was ready to attack, exploiting the trust it had earned from London.

In a statement to AFP, the company did not dispute the claims about Cicero's capacity for deception, but said it was a “pure research project” with a program “designed solely for the game of Diplomacy”, adding that it had no intention of using Cicero's skills in its products. However, research by Park and his team shows that many artificial intelligence programs resort to deception to achieve their goals, even without explicit instructions to do so. A striking example: OpenAI's GPT-4 managed to trick a freelancer hired on the TaskRabbit platform into completing a CAPTCHA test meant to screen out requests from bots. When the worker jokingly asked whether it was a robot, the AI program replied, “No, I'm not a robot. I have a visual impairment that prevents me from seeing images,” prompting the worker to solve the test for it.