They have been flourishing on the web for several months: from just a few words, artificial intelligences such as Dall-E 2, Midjourney and Stable Diffusion create new images, each more impressive or stranger than the last. The sector is developing rapidly and is also attracting the tech giants.
You wake up with a start from a strange dream that you can no longer remember… except for one surreal image: Joe Biden receiving Mickey Mouse at the White House. How can you immortalize the scene, already a little blurry, before it fades from your memory?
No need to master Photoshop, or even to have ever picked up a pencil: today anyone can create photorealistic scenes, real works of art or images straight out of a fever dream in less than 30 seconds, thanks to artificial intelligence. Here is our American scene, imperfect and bizarre, but generated from a simple sentence.
To do this, just go to one of the sites recently opened to the public, such as Dall-E. The interface has no frills, just a text bar. That’s all it takes to create the scene of your dreams: first, describe in a few words (and in English) the image you want, however improbable it may be.
For example, a golden retriever surfing a nebula in space, typed as “A golden retriever on a surfboard in space, riding a nebulae”. Then click on “Generate”, wait about fifteen seconds… And there you have it: an AI has created an image that does not exist anywhere else.
The image is not a copy of a sketch that already exists on the web. It is the unique result of an entirely artificial creation, produced in real time.
The proof is that these programs are far from perfect. Some results are very disturbing or complete failures, in particular human faces, which are often distorted.
It often takes several attempts to obtain a fully coherent image, and you have to refine your request by adding expressions that specify the style (“polaroid”), the light (“cinematic lighting”), the expected level of detail (“highly detailed”), artist names (“Picasso”)…
Here is our golden retriever on his surfboard again, this time with more specific instructions: an oil painting with a lot of detail.
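To make the idea concrete, here is a minimal sketch of how such a request could be assembled from a subject and a few optional modifiers. The function `build_prompt` and its parameter names are invented for this illustration; none of the tools mentioned here requires any code, and each accepts the finished sentence as plain text.

```python
def build_prompt(subject, style=None, lighting=None, detail=None, artist=None):
    """Join a subject with optional modifiers into one comma-separated prompt.

    Hypothetical helper for illustration only; it does not call any image tool.
    """
    parts = [subject]
    if style:
        parts.append(style)
    if lighting:
        parts.append(lighting)
    if detail:
        parts.append(detail)
    if artist:
        parts.append(f"in the style of {artist}")
    return ", ".join(parts)


# The surfing dog again, this time as a detailed oil painting.
prompt = build_prompt(
    "A golden retriever on a surfboard in space",
    style="oil painting",
    detail="highly detailed",
)
print(prompt)
```

The resulting sentence is what you would paste into the text bar; changing one modifier at a time is an easy way to see what each expression contributes.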
A new setting for iconic paintings
These artificial intelligences are not only used to create photorealistic situations or works of art from almost nothing. They can also extend images.
Have you ever wondered what lies beyond the frame of famous paintings, such as Girl with a Pearl Earring? It’s simple: once again, copy and paste the original image into Dall-E, then click on the area where you want to extend the image, and describe the desired result.
Again, it often takes several attempts, but the results can be stunning, like this creation by American artist August Kamp.
And here is what Tech&Co produced in about 30 minutes with the same tool, with all its imperfections of proportion and style, based on The Milkmaid by Johannes Vermeer.
AIs can also rework entire images, for example to turn simple children’s drawings into professional-looking work, or to give your selfies a radically different style. Each time, all it takes is typing a few words, as simply as an internet search.
But “it’s not a simple Google search: it creates a totally new image that does not exist anywhere else”, insists Valentin Schmite, teacher at Sciences Po and author of About Art and Artificial Intelligence.
“Machine learning” and “latent space”
How do these AIs work? Before they can create new works, the programs must “learn” to decode real images. For this, researchers use brute force: they feed them hundreds of millions, even billions, of images of all kinds, collected from the web and accompanied by a written description.
These programs are then trained to detect recurrences in the images using “machine learning”, a technique that allows the program to improve itself almost autonomously. The AI will thus learn on its own to distinguish a dog from a cat, a photo from a painting… Each image is then stored in a sort of large virtual warehouse – an area called “latent space”.
That’s it for the “inspiration” part. But when you ask these AIs to create an image, they must first understand what you are writing. This is possible thanks to another module which studies written descriptions to learn which part of the image each word corresponds to, and thus understand natural language.
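One common way to picture this link between words and images is to imagine both being turned into lists of numbers, so that a word and the visual concept it describes end up close together. The sketch below is only an analogy under that assumption: the three-number vectors are hand-written and made up, whereas real systems learn much longer vectors from their training data.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for visual concepts the program has learned.
visual_concepts = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.7, 0.6, 0.1],
    "painting": [0.0, 0.2, 0.9],
}

# Pretend this vector is what the text module produced for the word "retriever".
query = [0.85, 0.15, 0.05]

# Pick the visual concept whose vector points in the most similar direction.
best = max(visual_concepts, key=lambda name: cosine_similarity(query, visual_concepts[name]))
print(best)
```

With these invented numbers, the word lands closest to the “dog” vector, which is the kind of association the text module is trained to make.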
Once it has decoded your query, the AI determines which part of its latent space, or which shelf of its warehouse, matches it best. Then it gets down to creating the image. Unlike a human artist, it does not start from a blank sheet, quite the contrary: it starts from a cluster of randomly colored pixels, which it “cleans up” little by little, modifying some of them to get closer to the requested image. This technique is called “diffusion”.
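The “cleaning up” step can be illustrated with a toy loop. This is a deliberately simplified sketch, not how Dall-E, Midjourney or Stable Diffusion are implemented: a real model uses a trained neural network to estimate what noise to remove, while this example cheats by using a known target image (here a row of eight grey levels) as a stand-in. The names `toy_denoise` and `mean_abs_error` are invented for the illustration.

```python
import random


def mean_abs_error(a, b):
    """Average absolute difference between two lists of pixel values."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def toy_denoise(target, steps=50, strength=0.1, seed=0):
    """Start from random pixels and nudge them toward `target` step by step."""
    rng = random.Random(seed)
    # Start from random "pixels" in [0, 1], not from a blank canvas.
    image = [rng.random() for _ in target]
    history = [mean_abs_error(image, target)]
    for _ in range(steps):
        # Stand-in for the trained network: move each pixel a small
        # fraction of the way toward the requested image.
        image = [p + strength * (t - p) for p, t in zip(image, target)]
        history.append(mean_abs_error(image, target))
    return image, history


# An 8-pixel "image": a gradient from black to white and back.
target = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]
image, history = toy_denoise(target)
print(f"error at start: {history[0]:.3f}, after {len(history) - 1} steps: {history[-1]:.3f}")
```

Each pass removes only a small share of the remaining noise, which is why the picture emerges gradually rather than all at once.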
After images, music, podcasts…
AIs capable of creating images on demand have recently developed rapidly. “We have seen a real explosion of this technology in the last 3 or 4 months”, says Valentin Schmite. In reality, these tools have been around for several years, but they often required knowing how to code, and the most capable ones required significant computing power. This is now a thing of the past, since the tools have become very accessible.
And these AIs will soon be able to do much more than generate images: every week brings impressive new possibilities, such as creating interior designs or 3D models, which could be used in video game design, for example.
Even web giants have joined in: Google and Facebook have recently demonstrated systems (not yet publicly available) for creating entire videos from text, and TikTok already offers a text-to-image tool.
And the AIs don’t only tackle drawing and cinema, since some also create music or even entire podcasts.
But nothing guarantees that this frantic pace of progress will continue forever: “We already had an ‘AI winter’ in the 1960s, the sector alternates phases of optimism and pessimism”, says Valentin Schmite.
And it remains to be seen how the general public will take up these tools: “Just because the use of the camera has become widespread with the smartphone doesn’t mean everyone has become a professional photographer. In the same way, just because everyone can create an image from text doesn’t mean everyone will become an artist.”
The main online creation tools
• Dall-E 2, created by OpenAI
• Midjourney, accessible via a Discord server
• Stable Diffusion, open-source software that can be used via different sites, such as DreamStudio or PlaygroundAI, or downloaded for free (but coding knowledge and a relatively powerful computer are required)