Jan 29, 2019
 

Towards reconstructing intelligible speech from the human auditory cortex

Abstract

Auditory stimulus reconstruction is a technique that finds the best approximation of the acoustic stimulus from the population of evoked neural activity. Reconstructing speech from the human auditory cortex creates the possibility of a speech neuroprosthetic to establish a direct communication with the brain and has been shown to be possible in both overt and covert conditions. However, the low quality of the reconstructed speech has severely limited the utility of this method for brain-computer interface (BCI) applications. To advance the state-of-the-art in speech neuroprosthesis, we combined the recent advances in deep learning with the latest innovations in speech synthesis technologies to reconstruct closed-set intelligible speech from the human auditory cortex. We investigated the dependence of reconstruction accuracy on linear and nonlinear (deep neural network) regression methods and the acoustic representation that is used as the target of reconstruction, including auditory spectrogram and speech synthesis parameters. In addition, we compared the reconstruction accuracy from low and high neural frequency ranges. Our results show that a deep neural network model that directly estimates the parameters of a speech synthesizer from all neural frequencies achieves the highest subjective and objective scores on a digit recognition task, improving the intelligibility by 65% over the baseline method which used linear regression to reconstruct the auditory spectrogram. These results demonstrate the efficacy of deep learning and speech synthesis algorithms for designing the next generation of speech BCI systems, which not only can restore communications for paralyzed patients but also have the potential to transform human-computer interaction technologies.

Translation: we’re living in the future and machines can understand your brain. I included a version of this, some 500 years more advanced, in my “Zaneverse” stories: easily installed implants can understand your brain and hook you up to a facility comm system (ship, station, building, whatever). The result is “electronic telepathy” built into something like a cell phone system, so two people can “think” to each other and it comes through much like speech. A handy system for easy comms, as well as just the thing for chatting quietly without other people noticing or listening in.

Of course, what they’re shooting for in the scientific study isn’t a useful plot device for science fiction stories but a way to give voice to people who have functional brains but non-functional vocal systems. If I understand it correctly, the system would read your brain wave patterns as you think “dog,” and it would not understand that you are thinking the word or concept “dog”; instead it would understand the *sounds* you are trying to make, and thus it spits out the sound “dog.” As a result, the output is apparently a bit incomprehensible at times, but 75% understandable is a hell of an improvement for someone who can’t speak *at all.*

This technology would of course be extremely valuable to many handicapped people. It would also be very handy to interrogators.

Posted at 3:59 pm