Researchers at Stanford University have made a significant breakthrough in the development of brain-computer interfaces (BCIs). By creating technology that can decode attempted speech at up to 62 words per minute, the team improved on the previous record by nearly 3 times.
It is a development that brings these systems a little closer to the rhythm of natural conversation, and to practically instantaneous voice conversion.
Words words words
Max Hodak, who co-founded Neuralink alongside Elon Musk, called the Stanford research “a significant shift in the utility of brain-computer implants.” But what exactly does it consist of?
The crux of the work, detailed in a paper that I link to here, is the ability to “translate” brain signals into coherent speech using a machine learning algorithm, and to do so by analyzing brain activity in a relatively small region of the cortex.
The target? Helping people who can no longer speak due to diseases such as ALS recover their voice. A real leap in quality: a vocal interface of this type could significantly accelerate the decoding of brain signals.
The tests
In one experiment, the team recorded (from two small areas of the brain) the neural activity of an ALS patient who can move his mouth but has difficulty forming words.
Using a recurrent neural network decoder that predicts text, the researchers then transformed these signals into words, at a pace never seen before.
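The idea behind a decoder like this can be sketched in a few lines. What follows is not the Stanford team's model, just a toy illustration with made-up dimensions and randomly initialized weights (standing in for a trained network): a recurrent network consumes binned neural features one timestep at a time and emits a probability distribution over text units (e.g. phonemes) at each step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions, purely illustrative (the real system is far larger):
N_FEATURES = 8    # e.g. binned spike counts from the recorded electrodes
N_HIDDEN = 16     # recurrent state size
N_TOKENS = 5      # e.g. phoneme classes the decoder can predict

# Random weights stand in for a trained model.
W_in = rng.normal(scale=0.1, size=(N_HIDDEN, N_FEATURES))
W_rec = rng.normal(scale=0.1, size=(N_HIDDEN, N_HIDDEN))
W_out = rng.normal(scale=0.1, size=(N_TOKENS, N_HIDDEN))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode(neural_frames):
    """Run the RNN over a sequence of neural feature vectors,
    returning one probability distribution over tokens per timestep."""
    h = np.zeros(N_HIDDEN)
    probs = []
    for x in neural_frames:
        h = np.tanh(W_in @ x + W_rec @ h)  # recurrent state update
        probs.append(softmax(W_out @ h))   # token probabilities this step
    return np.array(probs)

frames = rng.normal(size=(20, N_FEATURES))  # 20 timesteps of fake activity
p = decode(frames)
print(p.shape)  # one distribution per timestep: (20, 5)
```

In a real system, these per-timestep distributions would then be combined with a language model to produce the final word sequence; here the point is only the shape of the computation, neural activity in, text probabilities out.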
The team found that the neural activity associated with attempted facial movements remains strong enough to drive a brain-computer interface, despite the paralysis and despite recording from only a limited portion of the cerebral cortex.
The challenges to face
Currently the system is fast, but still imperfect: the error rate of the recurrent neural network (RNN) decoder used by the researchers is still 20%.
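For context, an error rate like this is typically a word error rate (WER): the number of word-level substitutions, insertions, and deletions needed to turn the decoded sentence into the intended one, divided by the length of the intended sentence. A minimal sketch of the standard edit-distance computation (not the researchers' evaluation code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[-1][-1] / len(ref)

# One substitution ("fox" -> "fax") and one deletion ("jumps") out of 5 words:
print(wer("the quick brown fox jumps", "the quick brown fax"))  # 0.4
```

A 20% WER means roughly one word in five comes out wrong, which is why the authors call the approach promising but not yet clinically viable.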
The researchers know this well: “Our demonstration,” they write, “is evidence that decoding attempted speech movements from intracortical recordings is a promising approach, even if it is not yet a complete, clinically viable system.”
To improve the error rate and optimize the algorithm, the studies will now aim to probe more areas of the brain.
Imagine such technologies combined with artificial intelligence: algorithms capable of perfectly cloning a voice, such as the one recently presented by Microsoft, which needs just 3 seconds of audio.
In the future, no one will remain silent.