University of Utah bioengineers recently discovered that our understanding of language may depend more heavily on vision than previously thought: under the right conditions, what you see can override what you hear. These findings suggest artificial hearing devices and speech-recognition software could benefit from a camera, not just a microphone.
“For the first time, we were able to link the auditory signal in the brain to what a person said they heard when what they actually heard was something different. We found vision is influencing the hearing part of the brain to change your perception of reality – and you can’t turn off the illusion,” says the new study’s first author, Elliot Smith, a bioengineering and neuroscience graduate student at the University of Utah. “People think there is this tight coupling between physical phenomena in the world around us and what we experience subjectively, and that is not the case.”
The brain uses both sight and sound when processing speech. However, if the two are slightly different, visual cues dominate sound. This phenomenon is named the McGurk Effect for Scottish cognitive psychologist Harry McGurk, who pioneered studies on the link between hearing and vision in speech perception in the 1970s. The McGurk Effect has been observed for decades, but its origin has been elusive.
In the new study, which appears in the journal PLOS ONE, the University of Utah team pinpointed the source of the McGurk Effect by recording and analyzing brain signals in the temporal cortex, the region of the brain that typically processes sound.
In the study, four test subjects were asked to watch and listen to videos focused on a person’s mouth as the person said the syllables “ba,” “va,” “ga” and “tha.” Depending on which of three different videos was being watched, the patients had one of three possible experiences as they watched the syllables being mouthed: the motion of the mouth matched the sound; the motion of the mouth obviously did not match the corresponding sound; or the motion of the mouth was only slightly mismatched with the corresponding sound.
By measuring the electrical signals in the brain while each video was being watched, Smith and Greger could pinpoint whether auditory or visual brain signals were being used to identify the syllable in each video. When the syllable being mouthed matched the sound or didn’t match at all, brain activity increased in correlation with the sound being heard. However, when the slightly mismatched McGurk Effect video was viewed, the activity pattern changed to resemble what the person saw, not what they heard. Statistical analyses confirmed the effect in all test subjects.
“We’ve shown neural signals in the brain that should be driven by sound are being overridden by visual cues that say, ‘Hear this!’” says Greger. “Your brain is essentially ignoring the physics of sound in the ear and following what’s happening through your vision.”
Greger, the study’s senior author, was an assistant professor of bioengineering at the University of Utah when the research was conducted. He recently took a faculty position at Arizona State University.
The new findings could help researchers understand what drives language processing in humans, especially in a developing infant brain trying to connect sounds and lip movement to learn language. These findings also may help researchers sort out how language processing goes wrong when visual and auditory inputs are not integrated correctly, such as in dyslexia, Greger says.