What might other people’s mental experiences look like? The answer to this question may not lie too far in the future. Scientists from the University of California, Berkeley, published a paper in the September issue of Current Biology that explains how the visual experience of watching movie trailer clips can be reconstructed from brain activity using short segments of YouTube video.
The researchers wanted to study how the brain, specifically the early visual system, encodes incoming visual information. The early visual system is the first stage of visual processing to receive this information; it picks up simple features in the environment, such as oriented edges, patches of texture, and motion.
The three participants, who were also co-authors of the study, went inside a functional magnetic resonance imaging (fMRI) machine and watched about three hours of Hollywood movie trailers over the course of a few weeks. Data from this task were used to build a model describing how simple features in the movies related to activity at different points in the brain. In total, the researchers measured activity at about 4,000 different points in the brain. To decode the movie trailers seen inside the fMRI machine, they used 18 million seconds of randomly downloaded YouTube videos. YouTube was chosen because it was the quickest way to build a library that was independent of the movies shown in the fMRI machine.
The idea was that the model would reconstruct the movie trailer the participants saw by using YouTube clips the participants had never seen. In response to concerns about overlap, the researchers noted that the movie trailer clips share common cinematic themes and features, which the YouTube clips complement. The YouTube clips were expected to provide variety and reinforce the basic reconstruction of the trailer clips.
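The matching idea described above can be sketched in a few lines of Python. This is only a toy illustration, not the authors' actual pipeline: it assumes a simple linear model in which each brain point's activity is a weighted sum of clip features, and every name and number in it (`weights`, `library`, `clip_a`, and so on) is hypothetical. The sketch predicts the brain response each library clip would evoke, ranks the clips by how closely that prediction matches the observed response, and averages the best matches.

```python
import math

def predict_response(weights, features):
    """Predicted activity at each brain point: a weighted sum of clip features."""
    return [sum(w * f for w, f in zip(point_w, features)) for point_w in weights]

def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def rank_library(weights, library, observed):
    """Sort clip names from best to worst match with the observed response."""
    scores = {
        name: correlation(predict_response(weights, feats), observed)
        for name, feats in library.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

def reconstruct(library, ranked_names, top_k):
    """Average the features of the top_k best-matching library clips."""
    best = ranked_names[:top_k]
    n_features = len(library[best[0]])
    return [sum(library[name][i] for name in best) / top_k
            for i in range(n_features)]

# Toy setup (all values made up): 3 brain points, 2 features per clip.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
library = {
    "clip_a": [1.0, 0.0],
    "clip_b": [0.0, 1.0],
    "clip_c": [0.8, 0.2],
}

# Pretend the participant watched a clip that is NOT in the library.
observed = predict_response(weights, [0.9, 0.1])

ranked = rank_library(weights, library, observed)
reconstruction = reconstruct(library, ranked, top_k=2)
print(ranked, reconstruction)
```

In this toy case the two best-matching clips bracket the unseen clip, so their average lands close to it. That is the intuition behind building a reconstruction out of a large, varied library rather than a single best match.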
Reconstructing visual images with a model derived from measurements of blood flow, rather than from direct recordings of neural activity, was quite successful. Functional MRI measures the changes in blood flow and blood oxygenation that follow neural activity. Although it is coarse compared with direct neural recordings, it is noninvasive, which made it well suited to an experiment with human participants. Brain blood flow was measured using blood oxygen level-dependent signals, or BOLD signals, in the participants’ occipitotemporal visual cortex. BOLD signals were used because they are indirect indicators of underlying neural activity.
A longstanding problem with BOLD signals is that they change very slowly compared with the neural activity behind them, making it hard for researchers to map brain responses to dynamic stimuli. With a new motion-energy encoding model, however, the researchers were able both to track BOLD signals and to use them to decode participants’ visual experiences.
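To see why slow BOLD signals are a problem, consider the following minimal sketch. It is not the motion-energy model from the study; it only illustrates the sluggishness that such a model has to account for. The response curve `hrf` is a hand-made toy shape, chosen to peak a few seconds after the event, and all names and numbers are assumptions made for illustration.

```python
def convolve(signal, kernel):
    """Discrete convolution: every sample in `signal` launches a scaled copy of `kernel`."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

# One sample per second. A single brief burst of neural activity at t = 1 s.
neural = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Toy blood-flow response (hypothetical values): rises slowly and peaks 4 s after the burst.
hrf = [0.0, 0.1, 0.4, 0.8, 1.0, 0.8, 0.4, 0.2, 0.1, 0.0]

bold = convolve(neural, hrf)
peak_time = bold.index(max(bold))
print("neural burst at 1 s, BOLD peak at", peak_time, "s")
```

A burst lasting one second is smeared over several seconds of measured signal and peaks well after the event. Events that follow each other quickly, as they do in a movie, therefore overlap in the recording.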
It is important to emphasize that the researchers decoded only the early stages of the visual system and did not take the remaining visual areas into account. A decoding mechanism that combines both the lower and higher levels of the visual system would likely provide a much clearer and more accurate image. In the visual system hierarchy, the primary visual cortex is concerned with basic features, like the location of edges, where characters are moving in a scene, and basic texture patterns. This part of the visual system does not register any ‘meaning’ behind the perceived objects. Higher-level parts of the visual system, on the other hand, deal with the semantic elements of the scene (putting a name on whatever it is you are seeing).
The limitations of this study include the accuracy of the reconstructions: some people might wonder why the images are blurry. The researchers did not set out to fine-tune the decoded brain activity, so the resulting images are not very detailed. The authors point out that using quantifiable methods like fMRI makes it easier for researchers to interpret the results of decoding. It should be noted that the videos posted on the lab website were reconstructed from approximately 10 minutes of data, although the full study collected far more than that.
Technical improvements on this study will not only give science fantasy novelists something more to add to their plots but could also lead to reliable reconstruction of typical dynamic visual experiences. However, reconstructions of involuntary, subjective mental states like dreams, hallucinations, and memories may be harder to verify as accurate, because there is no external image to compare them against.
Since the visual system makes up about a third of the human brain, studies like these open doors to understanding its many unique aspects and could advance the technology available to hospitalized patients who cannot communicate. The brain, it seems, is the antenna to visual reality.