Meta creates an AI that can play the strategy game Diplomacy and is said to be able to deceive human players, after controversies over its AI systems BlenderBot 3 and Galactica

Meta, Facebook’s parent company, on Tuesday introduced Cicero, an AI agent that it says can play the classic strategy game Diplomacy at a level comparable to that of most human players. Cicero’s team presents it as an AI that negotiates, persuades and cooperates with people using natural language. This is a significant achievement in the field of natural language processing (NLP), as the game requires deep interpersonal negotiation skills, implying that Cicero has acquired some mastery of the language needed to win the game. But there are fears that it could be diverted from its original purpose.

Cicero: an AI said to play the strategy game Diplomacy like humans

Even before Deep Blue beat Garry Kasparov at chess in 1997, board games were a useful measure of AI achievements. In 2016, another barrier came down when AlphaGo defeated Go master Lee Sedol. Both of these games follow a relatively clear set of analytic rules (although the rules of Go are generally simplified for computer AI). But with Diplomacy, much of the gameplay involves social skills. Players must empathize, use natural language, and build relationships to win, a potentially difficult task for a computer player.

Developed in the 1950s and currently published by Hasbro, Diplomacy focuses on communication and negotiation between players, who take on the role of seven European powers in the early 20th century. It is considered by some players as the perfect way to lose friends. Diplomacy simulates taking territories on a map of Europe. Rather than taking turns, players write their moves in advance and perform them simultaneously. To avoid making moves that are blocked because an opponent has made a counter move, players communicate with each other in private.

They discuss potential coordinated actions, then write down their moves, respecting or violating the commitments made to the other players. Diplomacy’s emphasis on communication, trust, and betrayal makes it a different challenge from more rules-based and resource-based games like chess and Go. The question Meta’s researchers set out to answer is whether it is possible to build more effective and flexible agents that can use language to negotiate, persuade, and work with people to achieve strategic goals, as humans do.

In a blog post on Tuesday, Meta describes Cicero as essentially a chatbot that can negotiate with other Diplomacy players to make effective in-game moves. Cicero played online Diplomacy at webDiplomacy.net. Over time, it is said to have become a master of the game, getting “more than double the average score” of human players and ranking in the top 10% of participants who played more than one game. Meta researchers claim that “Cicero carefully manipulates natural language and is able to deceive human players.”

Diplomacy has been seen for decades as a nigh-impossible grand challenge in AI: players must master the art of understanding the motivations and perspectives of others, devise intricate plans, adjust their strategies, and use natural language to make deals with other people, convince them to form partnerships and alliances, and more. Cicero is so good at using natural language to negotiate with people in Diplomacy that human players often preferred working with Cicero over other human participants, Meta said.

Cicero can reportedly cooperate with human players or deceive them

While AI agents for games like chess can be trained through reinforcement learning, modeling the cooperative game of Diplomacy required a different technique. According to Meta, the classic approach would involve supervised learning, whereby an agent would be trained using labeled data from past Diplomacy games. But supervised learning alone produces a credulous AI agent that can be easily manipulated by lying players. The company announced that its researchers have implemented a new approach better suited to Diplomacy.

To create Cicero, Meta took AI models for strategic reasoning (similar to AlphaGo) and natural language processing (similar to GPT-3) and integrated them into a single agent. During each game, Cicero examines the state of the game board and the chat history and predicts how the other players will act. It formulates a plan, which it executes through a language model capable of generating human-like dialogue, allowing it to coordinate with the other players. Meta calls Cicero’s natural language component a “controllable dialogue model”.

Cicero is based on a 2.7 billion parameter BART-like language model. Like GPT-3, Meta’s AI is trained on text from the Internet; it was then further trained on a dataset of over 40,000 Diplomacy games played on webDiplomacy.net. According to the Meta blog post, these games contained more than 12 million messages exchanged between players. Cicero also includes an iterative planning algorithm called piKL, which refines an initial prediction of other players’ policies and planned moves, based on the dialogue between the bot and other players.
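The article does not detail how piKL performs this refinement. As a minimal sketch only, assuming the refinement takes the common KL-regularized form, in which a predicted policy stays close to a human-like prior while shifting toward moves with a higher estimated value, one update step could look like the following. The function name, the move names, and all numbers are hypothetical and are not taken from Meta's code.

```python
import math

def kl_regularized_policy(anchor, values, lam):
    """Return pi(a) proportional to anchor[a] * exp(values[a] / lam).

    anchor: prior probability of each move (all entries must be > 0)
    values: estimated value of each move
    lam:    regularization strength; large values keep pi near the prior
    """
    # Work in log space and subtract the maximum for numerical stability.
    logits = {a: math.log(anchor[a]) + values[a] / lam for a in anchor}
    top = max(logits.values())
    weights = {a: math.exp(l - top) for a, l in logits.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# Hypothetical inputs: a human-like prior over three candidate orders and
# an estimated value for each order from some planning step.
anchor = {"hold": 0.6, "support_ally": 0.3, "attack_ally": 0.1}
values = {"hold": 0.0, "support_ally": 0.5, "attack_ally": 1.0}

close_to_prior = kl_regularized_policy(anchor, values, lam=10.0)
value_driven = kl_regularized_policy(anchor, values, lam=0.1)

print(max(close_to_prior, key=close_to_prior.get))
print(max(value_driven, key=value_driven.get))
```

With a large `lam` the most likely move remains the one the prior favored, while a small `lam` lets the value estimates dominate. An iterative planner would repeat a step like this for every player, re-estimating the values against the other players' updated policies each round.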

The algorithm attempts to improve the predicted move sets for other players by evaluating different choices that would produce better results. Meta said the resulting model mastered the intricacies of a complex game. Cicero can deduce, for example, that later in the game it will need the support of a particular player, then devise a strategy to win that person’s favor, and even recognize the risks and opportunities that player sees from their particular point of view, explains Meta. Three-time Diplomacy world champion Andrew Goff praised Cicero’s dispassionate approach.

“Many human players soften their approach or begin to be motivated by revenge, but Cicero never does that. He simply plays the situation as he sees it. So he’s ruthless in executing his strategy, but he’s not so ruthless as to annoy other players,” Goff said. Meta reported that Cicero played 40 games of Diplomacy anonymously in a “blitz” league on webDiplomacy.net between August 19 and October 13, 2022, and finished in the top 10% of participants who played more than one game. And of the 19 who played five or more games, Cicero reportedly finished second.

Across all 40 games, Cicero’s average score was reportedly 25.8%, more than double the 12.4% average among its 82 opponents. Although Cicero still makes a few mistakes, Meta engineers expect their research to be useful for other applications, such as chatbots that can hold long conversations or video game characters that understand player motivations and can interact more effectively. Cicero’s code has been released under an open source license in the hope that the AI developer community can improve it further.
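The “more than double” figure can be checked directly from the two reported averages. The short calculation below uses only the numbers quoted above; the variable names are illustrative.

```python
cicero_avg = 25.8    # Cicero's reported average score, in percent
opponent_avg = 12.4  # reported average score of its 82 opponents, in percent

# Ratio of Cicero's average score to the opponents' average score.
ratio = cicero_avg / opponent_avg
print(f"Cicero scored {ratio:.2f}x the opponents' average")
```

The ratio comes out at roughly 2.08, which is consistent with the claim.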

Meta’s recent AI systems produced racist output and spread false information

Meta’s research on Cicero was published in the journal Science under the title “Human-level play in the game of Diplomacy by combining language models with strategic reasoning”. As for broader applications, Meta suggests its research on Cicero could “mitigate communication barriers” between humans and AI, such as maintaining a long-term conversation to teach someone a new skill. It could also power a video game where NPCs could talk like humans, understand player motivations, and adapt along the way.

According to analysts, this is a significant achievement in the field of natural language processing. It might help people forget last week’s debut of Galactica, a large language model that Meta engineers trained on scientific papers. Galactica presented falsehoods as facts and was taken offline after three days of criticism from the scientific community. It was designed as an academic search engine on steroids and was meant to help scientists. Instead, it thoughtlessly spat out biased and incorrect nonsense.

Within hours of Galactica going live, Twitter users began posting examples in which Meta’s AI generated completely fake and racist research. One user discovered that Galactica was making up information about Stanford University researchers creating “gaydar” software to find gay people on Facebook. Another managed to get the model to create a fake study on the benefits of eating crushed glass. Galactica also filtered out queries on topics such as queer theory, AIDS, and racism.

Perhaps one of the most baffling aspects of this case, however, is that Galactica was creating entirely bogus studies and attributing them to real scientists. Michael Black, director of the Max Planck Institute for Intelligent Systems in Germany, reported in a Twitter thread several instances in which Galactica created fake citations attributed to real-world researchers. These citations were attached to very convincing text generated by the model, which seemed, at first glance, quite plausible and real. Meta’s AI BlenderBot 3 sparked a similar controversy in August.

Early testing of BlenderBot 3, a chatbot released by Meta in August, revealed that it was far from the high-performing chatbot the company claimed. For example, BlenderBot 3 called CEO Mark Zuckerberg “scary and manipulative.” It also claimed that “Zuckerberg is a good businessman, but his business practices are not always ethical.” Other conversations with BlenderBot 3 showed that it has racial biases and spreads conspiracy theories. It described Facebook as having privacy issues and said that the platform spreads false information.

On the other hand, Meta’s Cicero could be used to manipulate humans by impersonating people and deceiving them in potentially dangerous ways, depending on the context. Meta therefore hopes that other researchers can build on its code “responsibly”. The company says it has taken steps to detect and remove “toxic messages in this new realm”, which likely refers to the dialogue Cicero learned from the internet texts it ingested, always a risk for large language models.

Sources: Meta, Cicero (PDF), Article by Meta researchers in the journal Science, GitHub repository of the Cicero project

And you?

What is your opinion on the subject?

What do you think of Meta’s Cicero AI?

In your opinion, what could be the use cases for such an AI?

In your opinion, could Cicero be diverted from its initial use? If so, to what end?

