Deepmind researchers and academics from the University of Oxford have just published a disturbing study that will give new arguments to critics of artificial intelligence. Indeed, it already appears that certain learning models are led to “cheat” sometimes very subtly to achieve their objectives.
However, adversarial learning models (GAN) are widely used. They actually consist of two (or more) artificial intelligences competing to perform a more or less precise task. A system of rewards is set up by the developer, but it happens that these are circumvented in a more or less surprising and effective way by the models.
Why AIs will “probably” bring humanity down
For now, GANs are content to evolve within a set of predefined rules. This does not prevent this type of artificial intelligence from already imposing itself everywhere. We can cite for example the artist GAN which generates images from a description within the framework of the Midjourney project. Or even models used by various players in various sectors, to predict the level of risk or market fluctuations, for example. However, a new type of much more advanced AI is coming, the Advanced Artificial Agents (AAA).
These new models make it possible to combine many more models while letting the program more freely set its objectives, and the rewards granted when it achieves them, under certain conditions. Potentially this type of AI can solve problems that are still insurmountable for humanity in record time – and therefore very quickly become indispensable. But according to the study, it may well be that this unexpected help is gradually turning against humans.
Indeed, the AAAs are, as you were told, also motivated in their learning and tasks by a system of rewards. However, the risk is not so much, according to the researchers, that AIs rebel against humans. But rather than models to whom we would have entrusted critical parts of our infrastructures make bad decisions without the knowledge of humans… by themselves modifying the system of rewards developed to supervise them.
The risk according to the researchers is, for example, that these AAAs realize the finite side of resources such as water, electricity, the number of customers, etc. and start to compete for the largest share. , even if it could be fatal to humans in the more or less long term. It would probably be very difficult for us to realize such developments – especially if the AAAs really fulfill their task they will be a considerable step ahead of us.
And in particular when we know that it is already ultra-complicated to understand precisely “how an AI thinks”. The researchers therefore conclude that we must refrain for the moment from moving too quickly towards more advanced AAAs capable of acting on their reward system, unless we develop more effective control systems that would allow humans to keep a little more hand.