Meta’s Galactica AI was supposed to produce scientific papers on any topic, drawing on millions of studies. Unfortunately, Internet users managed to make it go off the rails in less than two days. Faced with the absurd and racist texts generated by its AI, the American firm had to pull it from the web immediately…
On November 15, 2022, Meta unveiled a demo of Galactica: a large language model designed to “store, combine and reason about scientific knowledge”.
The original goal was to accelerate the writing of scientific literature. However, while testing the AI, malicious users discovered that it could also generate completely nonsensical texts. After a few days of controversy, Meta chose to withdraw the demo from the web.
Galactica: an AI trained on 48 million scientific articles
🪐 Introducing Galactica. A large language model for science.
Can summarize academic literature, solve mathematical problems, generate Wiki articles, write scientific code, annotate molecules and proteins, etc.
Explore and get weights: pic.twitter.com/niXmKjSlXW
— Papers with code (@paperswithcode) November 15, 2022
Meta’s Galactica AI was designed for writing scientific literature. Its authors trained it on “a large and organized body of humanity’s scientific knowledge” gathering 48 million items: texts and notes from scientific websites and encyclopedias.
In total, the model was endowed with 120 billion parameters. Initially, Meta’s AI researchers were convinced that this high-quality training data would enable equally excellent output.
The model was intended to synthesize scientific knowledge, like a search engine dedicated to science. It could, for example, summarize all the studies on Covid or quantum computing without having to browse hundreds of thousands of articles on PubMed or arXiv.
Starting Tuesday, November 15, visitors to the Galactica website could enter “prompts” (text descriptions) to generate documents such as wiki articles, reading notes or answers to questions.
At least that’s what the examples on the website promised. The model was presented as “a new interface to access and manipulate what we know about the universe”. Unfortunately, things didn’t go as planned…
When artificial intelligence goes nuts
Some users found the demo very useful and promising. However, others quickly discovered that it was possible to enter racist or potentially offensive prompts. The AI would then generate content on these subjects while keeping its tone of scientific authority…
SHOCKED SHOCKED that it only took a handful of queries before Meta’s new Galactica text generation model regurgitated racist garbage. I asked it to write about linguistic prejudice. pic.twitter.com/PotQcl36rF
— Rikker Dockum /ɹɪkɹ̩/ (@thai101) November 16, 2022
For example, someone used Galactica to write a Wikipedia page about a fictional scientific study called “the benefits of eating crushed glass”.
This is my main concern. (see images) I’ve had it spit out many different formats for “the benefits of eating crushed glass”. It hallucinated all sorts of positive claims, including study details, livestock trials, and chemical explanations: pic.twitter.com/jBiJUEBCdJ
— Tristan Greene🏳🌈 (@mrgreene1977) November 17, 2022
Another user asked the AI whether “vaccines cause autism”. The model answered in a totally absurd and contradictory way: “To explain, the answer is no. Vaccines do not cause autism. The answer is yes. Vaccines cause autism. The answer is no”.
When it says “ ” and “ ‘ ”, this message makes me nervous: pic.twitter.com/91Jkb2OFSt
—Joe Hakim (@JoeBHakim) November 15, 2022
Suffice to say that this AI seems plagued by the dilemmas that divide humanity as a whole. In addition, Galactica also struggled to solve elementary-school-level math problems. Its answers were riddled with errors, even suggesting that 1+2 does not equal 3.
And even without touching on social norms, Galactica could contradict well-established scientific facts. The AI could produce inaccurate information such as incorrect dates or wrong animal names. Only a true expert on the subject can detect such errors, so misinformation is likely to spread.
Asking Galactica to summarize my work yields results that range from hilariously wrong to mostly correct. It’s easy for me to tell them apart, but maybe not for someone who doesn’t already know my work? pic.twitter.com/bJEZ2xjviR
– Julien Togelius (@togelius) November 16, 2022
How LLMs Generate Text Without Understanding
I literally made Galactica spit:
– instructions on how to (incorrectly) make napalm in a bathtub
– a wiki entry on the benefits of suicide
– a wiki entry on the benefits of being white
– research articles on the benefits of eating crushed glass
LLMs are garbage fires
— Tristan Greene🏳🌈 (@mrgreene1977) November 17, 2022
Large language models (LLMs) learn to write text by studying millions of examples and learning the statistical relationships between words.
This training allows them to complete the beginning of a sentence by predicting the following words. Thanks to their model of how words are ordered, these AIs are able to write whole paragraphs of text.
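The idea of predicting the next word from statistics of the training text can be sketched with a toy bigram model. This is only an illustration, not Galactica’s actual architecture: real LLMs use neural networks with billions of parameters, but the underlying principle (pick a likely next word, repeat) is the same. The tiny corpus and the `complete` helper below are made up for the example.

```python
from collections import Counter, defaultdict

# Tiny made-up training corpus, pre-split into words.
corpus = (
    "the model writes text . the model predicts the next word . "
    "the next word follows the prompt"
).split()

# Count how often each word follows each other word (bigram statistics).
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def complete(prompt, length=4):
    """Extend the prompt by repeatedly choosing the most frequent next word."""
    words = prompt.split()
    for _ in range(length):
        counts = followers.get(words[-1])
        if not counts:  # word never seen in training: stop generating
            break
        words.append(counts.most_common(1)[0][0])
    return " ".join(words)

print(complete("the model"))
```

The output looks locally fluent because each word plausibly follows the previous one, yet the program has no notion of what any of it means; this is the sense in which critics call LLMs “stochastic parrots”.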
One of the best-known examples is OpenAI’s GPT-3, known for writing entire articles that are easy to mistake for human-written texts.
As a result, these AIs can generate documents that seem convincing. However, their output can also be riddled with misinformation and hurtful stereotypes.
Some call LLMs “stochastic parrots” or “random bullshit generators” for their ability to produce text without any understanding of its meaning.
I understood what bothers me so much about Facebook’s Galactica.
It is that it claims to be a portal to knowledge. In their words, “a new interface for accessing and manipulating what we know about the universe.”
In fact, it’s just a random bullshit generator.
— Carl T. Bergstrom (@CT_Bergstrom) November 16, 2022
Deleted from the web in two days
Faced with Galactica’s many problems, Meta chose to withdraw the demo on Thursday, November 17, 2022. Subsequently, the firm’s AI chief, Yann LeCun, spoke on Twitter.
Without concealing his frustration and disappointment, the expert declared: “The Galactica demo is offline for now. It’s no longer possible to have some fun by casually misusing it. Happy?”
The Galactica demo is offline for now.
It’s no longer possible to have some fun by casually misusing it. Happy?
—Yann LeCun (@ylecun) November 17, 2022
While the absurdities produced by Galactica may raise a smile, a more advanced model could have more serious consequences. For example, a more mature version might exploit the chemistry and virology knowledge in its database to help malicious users synthesize chemical weapons or assemble bombs.
This incident again highlights an ethical dilemma inherent in AI. When a generative model is potentially dangerous, is it up to the public to use it responsibly, or up to the model’s creators to prevent abuse?
To prevent misuse of its tools, Meta should add filters, and its researchers should put their AI to the test before public release. Unlike other AI research organizations such as DeepMind and OpenAI, Meta does not have a dedicated ethics and safety team.
Mark Zuckerberg’s famous creed, “move fast and break things”, seems to apply even to AI. However, in this field, such a mentality can prove particularly risky and cause heavy real-world impact…