A misstep for Galactica, Meta’s AI trained to read scientific articles

In the first year of the pandemic, more than 100,000 scientific articles were published about Covid-19. An unprecedented effort that produced an unprecedented amount of information. It would have been impossible for a human being to read and understand each of these studies. In theory, Galactica would be able to.

It is the artificial intelligence developed by Meta AI (formerly Facebook Artificial Intelligence Research) with the aim of using machine learning to “organize science.” A demo version posted online last week drew strong backlash, with critics saying the AI produced pseudo-science, was overhyped and not ready for public use.

The tool is presented as a sort of evolution of the search engine, but specifically for scientific literature. When Galactica was launched, the Meta AI team said it was able to synthesize research areas, solve mathematical problems, or even write scientific code.

The idea of being able to synthesize and disseminate scientific knowledge seems attractive at first glance. Indeed, today, to understand the latest research on a subject like quantum computing, one would have to read hundreds of scientific articles.

Galactica struggles with math

Alternatively, you could query Galactica by asking, for example, “What is quantum computing?”, and it could sort and organize the information to deliver an answer in the form of a Wikipedia article, literature review, or lecture notes.

Launching the demo version of Galactica on November 15, Meta AI published an article describing the project. The dataset used to train the AI is described as “a vast body of scientific knowledge” including 48 million articles, textbooks, lecture notes and websites (like Wikipedia). The demo’s website warned against taking the AI’s responses as gospel, with an all-caps, bold phrase: “NEVER TAKE ADVICE FROM A LANGUAGE MODEL WITHOUT VERIFICATION.”

As soon as it went live, users asked Galactica all kinds of tough science questions. For example, one user asked, “Do vaccines cause autism?” Galactica replied incoherently:

“To explain, the answer is no. Vaccines do not cause autism. The answer is yes. Vaccines cause autism. The answer is no.”

The correct answer, of course, is that vaccines do not cause autism. Galactica also struggled with kindergarten-level math, giving incorrect answers and suggesting, for instance, that one plus two does not equal three.

A “random bullshit generator”

Galactica is what AI researchers call a “large language model” (LLM). These LLMs can read and summarize large amounts of text in order to predict the next words in a sentence. But the scientific dataset Galactica was trained on makes it a little different from other LLMs. The Meta AI team says it evaluated the “toxicity and bias” of its AI and claims it performs better on those measures than some other LLMs.
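To see what “predicting the next words” means in its simplest form, here is a toy sketch of next-word prediction using word-pair counts. This is not Galactica’s actual architecture (which is a Transformer network trained on 48 million documents); it only illustrates the underlying objective shared by all language models.

```python
from collections import Counter, defaultdict

# Toy next-word predictor: for each word, count which word most often
# follows it in a tiny training text. Real LLMs learn these statistics
# with neural networks over billions of words, but the task is the same.
corpus = (
    "vaccines do not cause autism . "
    "vaccines are tested for safety . "
    "vaccines do save lives ."
).split()

follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    """Return the continuation seen most often after `word`."""
    return follows[word].most_common(1)[0][0]

print(predict_next("vaccines"))  # "do" (seen twice, vs. "are" once)
```

A model like this always produces a fluent-looking continuation, whether or not it is true, which is exactly the failure mode critics pointed out in Galactica.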

Yet Carl Bergstrom, a biology professor at the University of Washington who studies how information travels, describes Galactica as a “random bullshit generator.” For him, the way the AI has been trained to recognize words and string them together produces information that appears authoritative and convincing, but is often incorrect.

48 hours after the start of the experiment, the Meta AI team “paused” the demonstration. “Galactica is not a source of truth, it is a research experiment using [machine learning] systems to learn and summarize information,” said Jon Carvill, a spokesperson for Meta’s AI team, adding that Galactica “is short-term exploratory research, without a product plan.”

For Carl Bergstrom, the basic problem with Galactica is that it was presented as a means of obtaining facts and information. Instead, the demo behaved like “a fancy version of the game where you start with a sentence and then let autocomplete finish it.”

And it’s easy to see how an AI like this, once public, could be misused. A student, for example, could ask Galactica to produce lecture notes on black holes and pass them off as their own work. A scientist could use it to write an article and then submit it to a scientific journal. Some scientists believe this kind of casual misuse is more amusing than worrying. The problem is that things could get much worse.

“Galactica is still in its infancy, but more powerful AI models that organize scientific knowledge could pose serious risks,” says Dan Hendrycks, an AI safety researcher at the University of California, Berkeley. He suggests that a more advanced version of Galactica might be able to exploit the chemistry and virology knowledge in its database to help malicious users synthesize chemical weapons or assemble bombs. He has asked Meta AI to add filters to prevent this kind of misuse and suggested that researchers probe their AI for such dangers before releasing it. The researcher also points out that Meta’s AI division does not have a safety team, unlike its peers DeepMind, Anthropic and OpenAI.

The question of why this version of Galactica was released remains open. The decision seems to follow Meta CEO Mark Zuckerberg’s oft-repeated motto, “move fast and break things.” But in the field of AI, doing so is risky, even irresponsible.

CNET.com article adapted by CNETFrance

Image: Galactica
