AI systems from DeepMind and Meta now beat human players at the strategy board games Stratego and Diplomacy. This is a new milestone for the technology, but some experts fear a risk of serious abuse…
For many years, artificial intelligence has beaten humans at chess and Go. Until now, however, it had been unable to master more complex board games.
AI has just taken a new step by becoming an expert at Stratego and Diplomacy. These two strategy games are based on the notion of “imperfect information”, unlike chess and Go, where players see all the pieces on the board.
In Stratego, the identity of each piece is hidden until another piece encounters it. Diplomacy, for its part, consists of forging agreements, alliances and vendettas whose nature is kept secret.
These games therefore do not simply require calculating paths to victory; they demand more subtle abilities, such as guessing what the opponent is thinking and adjusting one’s own strategy to thwart their plans. Players must bluff and persuade.
These two board games were mastered a few days apart by two different AI models, one developed by DeepMind and the other by Meta (formerly Facebook).
DeepMind’s DeepNash wins at Stratego
The Stratego-playing model developed by DeepMind is called DeepNash. Rather than focusing on executing clever moves, it is designed to play in an unpredictable way.
This game has features that make it more complicated than chess, Go or poker, all of which AI has already mastered. Two players each place 40 pieces on a board, but cannot see the identity of the opponent’s pieces. The goal is to move one’s pieces to eliminate those of the opponent and capture a flag.
In total, a game of Stratego can unfold in 10^535 different ways. By comparison, this number is 10^360 for the game of Go. Similarly, in terms of imperfect information at the start of the game, Stratego has 10^66 possible hidden positions, against 10^6 in a game of poker.
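To put these magnitudes side by side, here is a minimal sketch that reads the figures above as powers of ten (10^535 and 10^360 possible games, 10^66 and 10^6 hidden starting positions) and compares them through their exponents. The constant and function names are illustrative only.

```python
# Base-10 exponents quoted in the article.
STRATEGO_GAMES_EXP = 535   # possible ways a Stratego game can unfold
GO_GAMES_EXP = 360         # same figure for Go
STRATEGO_HIDDEN_EXP = 66   # hidden starting positions in Stratego
POKER_HIDDEN_EXP = 6       # hidden starting positions in poker


def orders_of_magnitude_apart(exp_a: int, exp_b: int) -> int:
    """Return how many powers of ten separate 10**exp_a from 10**exp_b."""
    return exp_a - exp_b


games_gap = orders_of_magnitude_apart(STRATEGO_GAMES_EXP, GO_GAMES_EXP)
hidden_gap = orders_of_magnitude_apart(STRATEGO_HIDDEN_EXP, POKER_HIDDEN_EXP)

print(f"Stratego has 10^{games_gap} times as many possible games as Go")
print(f"Stratego has 10^{hidden_gap} times as many hidden starts as poker")
```

Dividing powers of ten amounts to subtracting exponents, which is why the gap can be computed without ever handling the full numbers.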
This AI is sometimes bold. In one game against a human, it sacrificed several high-ranking pieces and found itself outnumbered. This was in fact a calculated risk, meant to push the player to bring out their strongest pieces. It then won by building its strategy around that information.
The DeepNash model is good enough at Stratego to beat every other system every time, and to win 84% of its games against experienced humans. Over 50 games on the Gravon online gaming platform, it rose to third place among all Stratego players on the platform since 2002.
To reach this level of performance, DeepMind could not use the same algorithms as for chess and Go, which were not suited to this game. The researchers therefore invented a new algorithmic method called Regularized Nash Dynamics.
The DeepNash model combines a reinforcement learning algorithm with a deep neural network. To find the ideal action to perform for each state of a game, this AI played 5.5 billion games against itself.
Cicero: Meta’s AI masters the game Diplomacy
The AI that masters Diplomacy, for its part, was developed by Meta and CSAIL and is called Cicero. Despite the difficulty of this game, the model is able to compete with human players.
In Diplomacy, up to seven players compete, each representing a European power before the First World War. The objective is to control supply centers by moving fleets and armies.
This game requires a sense of scheming, a talent for betrayal and false promises, and genuine Machiavellianism. The complexity lies not in the world map or the counters, but in the strategy surrounding the agreements made. In addition, players must communicate with one another and convince others of the sincerity of their intentions.
Here again, therefore, it is not only a question of computing power. To beat humans at this game, Cicero follows a multi-step process.
First, the AI relies on the current state of the board and the ongoing discussions to make an initial prediction of each player’s actions. It then refines those predictions and uses them to form an intent for itself and its partners.
Next, it generates multiple candidate messages based on the state of the board, the dialogue and its intents. The candidate messages are then filtered to reduce nonsense, maximize value and ensure consistency with the intents.
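The steps described above can be sketched roughly as follows. This is a toy illustration and not Meta’s code: every name (`GameState`, `predict_actions`, `choose_intent`, `generate_candidates`, `filter_messages`, `next_message`) is hypothetical, and each function body is a trivial stand-in for what is, in the real system, a learned model.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    board: dict[str, str]   # region -> owning power (hypothetical layout)
    dialogue: list[str]     # messages exchanged so far


def predict_actions(state: GameState, powers: list[str]) -> dict[str, str]:
    """Step 1 (stand-in): guess one action per player from board and dialogue."""
    return {power: "hold" for power in powers}


def choose_intent(predictions: dict[str, str], me: str, partner: str) -> dict[str, str]:
    """Step 2 (stand-in): turn the predictions into a plan for me and a partner."""
    return {me: "support " + partner, partner: predictions[partner]}


def generate_candidates(state: GameState, intent: dict[str, str],
                        me: str, partner: str) -> list[str]:
    """Step 3 (stand-in): draft several messages conditioned on the intent."""
    return [
        f"{partner}, I will {intent[me]} this turn.",
        "asdf qwerty",                                # nonsensical draft
        f"{partner}, I will attack you this turn.",   # contradicts the intent
    ]


def filter_messages(candidates: list[str], intent: dict[str, str], me: str) -> list[str]:
    """Step 4 (stand-in): keep only drafts that state the intended action."""
    return [message for message in candidates if intent[me] in message]


def next_message(state: GameState, powers: list[str], me: str, partner: str):
    """Chain the four steps; return the first surviving draft, or None."""
    predictions = predict_actions(state, powers)
    intent = choose_intent(predictions, me, partner)
    drafts = generate_candidates(state, intent, me, partner)
    kept = filter_messages(drafts, intent, me)
    return kept[0] if kept else None


state = GameState(board={}, dialogue=[])
message = next_message(state, ["FRANCE", "ENGLAND"], "FRANCE", "ENGLAND")
print(message)
```

The point of the sketch is only the ordering: intents are fixed before any text is drafted, and filtering happens last, so a message that contradicts the plan never gets sent.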
This artificial intelligence was trained on data from 125,261 games played on the online version of Diplomacy, combined with data from games played against itself. Its strategic reasoning module (SRM) thus learned to predict the players’ actions and to choose an optimal action accordingly.
Its dialogue module, used to communicate its intentions to its allies, is based on a 2.7-billion-parameter language model pre-trained on text from the internet and then fine-tuned on messages from games of Diplomacy played by humans. Based on the intents of the SRM, this module generates a chat message.
On webDiplomacy.net, Cicero managed to hold its own against its human opponents. It finished second in a ranking of 19 players and outscored most of them.
An AI capable of starting a war?
According to Michael Wellman of the University of Michigan, “the speed at which different game characteristics have been conquered or mastered by AI in recent years is quite remarkable”. This computer science researcher studies strategic reasoning and game theory.
As he points out, “Stratego and Diplomacy are quite different from each other, and also present challenges significantly different from those of games where similar successes have been achieved”.
According to Meta AI researcher Noam Brown, these game AIs capable of interacting with humans and taking into account non-optimal or even irrational actions could pave the way to real-world applications.
In his words, “if you make a self-driving car, you don’t want to assume that every other driver on the road is perfectly rational or is going to behave optimally. Cicero is a big step in that direction”.
He believes that this technology could help virtual assistants better understand what consumers want, or make virtual beings in the metaverse more engaging and realistic. The goal of these researchers is not to create AI capable of beating humans at games, but AI that can cooperate with them in the real world.
However, some experts are significantly less optimistic. According to University of Michigan artificial intelligence expert Kentaro Toyama, these AIs are “scary and could be used for evil”. Just as generative AIs worry artists, this type of artificial intelligence also poses a threat.
He fears that their ability to hide information, to think several turns ahead of their adversaries and to outsmart humans represents a risk. In his eyes, this technology could be used to create more convincing scams or more realistic deepfakes.
Cicero’s code is open to the public, and malicious actors could copy it and use its negotiation and communication skills to craft persuasive emails and extort their victims.
Worse still, if someone were to train this language model on data like the diplomatic documents leaked by WikiLeaks, Toyama fears the system could impersonate a diplomat and initiate communication with a foreign power.
According to this specialist, “AI is like the nuclear power of this era. It has colossal potential for both good and evil, but… I think if we don’t start regulating evil, all science fiction about dystopian AI will become science fact”.