Meta has developed an AI capable of decoding speech from brain activity

This artificial intelligence model could help the millions of people who suffer a traumatic brain injury each year.

Through its branch dedicated to artificial intelligence (AI), Meta is working on tools for various fields, including medicine. After presenting a platform designed to help develop prostheses, the company announced on August 31 that it had built an AI model capable of decoding speech from recordings of brain activity. This, it says, could improve the lives of the millions of people who suffer a traumatic brain injury every year that leaves them unable to communicate through speech or gestures.

As Meta’s AI division explains in its release, decoding speech-related brain activity has been a long-standing goal of neuroscientists and clinicians. The problem is that most of the progress in this field relies on invasive techniques that require surgery to implant a device, as Elon Musk’s company Neuralink does. Meta AI chose a non-invasive approach in order to offer a safer solution that a greater number of people could benefit from.

The problem with non-invasive technologies

Meta AI used electroencephalography (EEG) and magnetoencephalography (MEG), two technologies that respectively measure the fluctuations in electric and magnetic fields caused by neuronal activity. While they are less invasive than other techniques, they are also known to be imprecise. “EEG and MEG recordings are known to vary widely between individuals due to individual brain anatomy, differences in the location and timing of neuronal functions across brain regions, and the position of the sensors during a recording session,” explains Jean-Rémi King, research scientist at Meta AI. These recordings can also be extremely noisy.

To solve this problem, the researchers turned to machine learning algorithms to help “clean up” the noise. They used a model called wav2vec 2.0, developed by Meta’s FAIR team in 2020, to identify “the complex representations of speech in the brains of volunteers listening to audiobooks”. The system was trained on four open-source EEG and MEG datasets comprising more than 150 hours of recordings from 169 healthy volunteers listening to audiobooks and isolated sentences in English and Dutch.
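The release does not spell out the training objective, but the general idea is to teach a small “brain module” to map EEG/MEG windows into the same embedding space as the pretrained wav2vec 2.0 speech representations. Below is a minimal PyTorch sketch of that kind of alignment; the layer sizes, sensor counts and the CLIP-style contrastive loss are illustrative assumptions, not Meta’s actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BrainEncoder(nn.Module):
    """Hypothetical 'brain module': maps a window of M/EEG sensor data
    into the same embedding space as the pretrained speech model."""
    def __init__(self, n_sensors=273, dim=768):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_sensors, 320, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv1d(320, 320, kernel_size=3, padding=1),
            nn.GELU(),
        )
        self.head = nn.Linear(320, dim)

    def forward(self, x):              # x: (batch, n_sensors, n_times)
        h = self.conv(x).mean(dim=-1)  # pool over time -> (batch, 320)
        return self.head(h)            # (batch, dim)

def contrastive_loss(brain_emb, speech_emb, temperature=0.1):
    """CLIP-style objective: each brain window should match the speech
    segment it was recorded with, not the other segments in the batch."""
    b = F.normalize(brain_emb, dim=-1)
    s = F.normalize(speech_emb, dim=-1)
    logits = b @ s.t() / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(len(b))            # diagonal entries are the true pairs
    return F.cross_entropy(logits, targets)

# Toy training step on random tensors standing in for real data:
brain_encoder = BrainEncoder()
opt = torch.optim.Adam(brain_encoder.parameters(), lr=1e-4)
meg = torch.randn(8, 273, 360)        # 8 three-second MEG windows (placeholder shapes)
speech_emb = torch.randn(8, 768)      # matching precomputed wav2vec 2.0 embeddings
loss = contrastive_loss(brain_encoder(meg), speech_emb)
loss.backward(); opt.step()
```

Once trained this way, a brain recording and a speech segment that belong together end up close in the shared embedding space, which is what makes the matching step described next possible.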

The Meta AI system thus manages to perform “zero-shot” classification: from an excerpt of brain activity, it can determine which audio clip a person heard among several candidates, and from there deduce the words they probably heard (a sketch of this matching step follows below). “From three seconds of brain activity, our results show that our model can decode the corresponding speech segments with a maximum accuracy of 73% from a vocabulary of 793 words, i.e. a large part of the words that we usually use on a daily basis,” says Jean-Rémi King. For the researchers, these results are promising because they show that AI can be trained to decode speech from non-invasive recordings of brain activity. They now hope to extend this capability to decoding speech directly, without relying on a set of audio clips.
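In embedding terms, this zero-shot step can be read as a nearest-neighbour search: the brain window is embedded once, then compared against the embeddings of every candidate audio segment. The sketch below shows that comparison; the embedding dimension, cosine similarity and candidate count are chosen for illustration, as the article does not specify them.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def rank_candidates(brain_emb, candidate_speech_embs):
    """Zero-shot matching: rank candidate speech segments by cosine
    similarity to one brain-activity embedding; the top hit is the
    segment the model predicts the person was listening to."""
    b = F.normalize(brain_emb, dim=-1)                 # (1, dim)
    s = F.normalize(candidate_speech_embs, dim=-1)     # (n_candidates, dim)
    scores = (b @ s.t()).squeeze(0)                    # (n_candidates,)
    return scores.argsort(descending=True)             # best match first

# Placeholder tensors, with the candidate count loosely echoing the article's 793 words:
brain_emb = torch.randn(1, 768)        # embedding of one 3-second brain window
candidates = torch.randn(793, 768)     # precomputed wav2vec 2.0 segment embeddings
ranking = rank_candidates(brain_emb, candidates)
print("predicted segment index:", ranking[0].item())
# Accuracy is then measured by checking whether the true segment
# appears at (or near) the top of `ranking`.
```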
