In an article recently published in the journal Communications Biology, researchers demonstrated the feasibility of using synthetically generated images or optimized natural images to modulate brain responses.
Background
The brain’s visual system has long been a central topic in neuroscience. Identifying the response preferences of macro-scale regions of the visual cortex and of single neurons has contributed to understanding how the brain processes and interprets incoming visual information. Artificial neural networks (ANNs), specifically deep neural networks, are used to model the human visual system owing to their biologically inspired architecture and excellent performance on object recognition and image classification tasks.
Recent studies have primarily focused on comparing encoding models, that is, ANNs trained to predict the brain’s responses to visual stimuli, with the brain’s visual system. Although several studies have revealed mismatches between biological neural networks and ANNs, ANNs remain among the most suitable models for representing and probing visual systems.
Role of generative models
Recent advances in generative models, such as variational autoencoders and generative adversarial networks (GANs), have drawn attention to optimal stimulus design and neural decoding as novel methods to control and understand neural responses to visual stimuli.
Pretrained generators can be coupled with ANN-based encoding models to accurately decode viewed images from brain responses, with the decoded images matching the ground truth in both low-level detail and high-level semantics. ANNs capable of performing image classification can also be combined with generative networks to create preferred inputs for artificial neurons through activation maximization.
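Activation maximization can be illustrated with a toy example. The sketch below is not the study's implementation: the linear-plus-tanh `generate` function stands in for a pretrained generator, the weighted-sum `activation` function stands in for an ANN unit, and all names, sizes, and the step size are assumptions chosen for illustration.

```python
import math
import random


def generate(G, z):
    """Toy 'generator' (assumption): a fixed linear map followed by tanh."""
    return [math.tanh(sum(g * zj for g, zj in zip(row, z))) for row in G]


def activation(w, x):
    """Toy 'neuron' (assumption): a weighted sum of the image pixels."""
    return sum(wi * xi for wi, xi in zip(w, x))


def maximize_activation(G, w, z, steps=500, lr=0.005):
    """Gradient ascent on the latent code z to increase the unit's activation."""
    for _ in range(steps):
        x = generate(G, z)
        # Chain rule: d(activation)/d(z_j) = sum_i w_i * (1 - x_i^2) * G[i][j]
        grad = [
            sum(w[i] * (1 - x[i] ** 2) * G[i][j] for i in range(len(G)))
            for j in range(len(z))
        ]
        z = [zj + lr * gj for zj, gj in zip(z, grad)]
    return z


random.seed(0)
n_pixels, n_latent = 16, 4  # hypothetical sizes
G = [[random.gauss(0, 1) for _ in range(n_latent)] for _ in range(n_pixels)]
w = [random.gauss(0, 1) for _ in range(n_pixels)]
z0 = [random.gauss(0, 0.1) for _ in range(n_latent)]

z_opt = maximize_activation(G, w, z0)
before = activation(w, generate(G, z0))
after = activation(w, generate(G, z_opt))
print(f"activation before: {before:.3f}, after: {after:.3f}")
```

The optimization runs over the latent code rather than over pixels, so every candidate stays within the range of images the generator can produce. In practice the generator and the target unit would be deep networks, and the gradient would come from automatic differentiation rather than a hand-written formula.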
Although similar approaches have been adopted to design optimal stimuli to maximize firing rate in populations of neurons or single neurons in macaque monkeys, no study has recorded macro-scale human brain activation responding to synthetic visual stimuli that have been designed to realize targeted, specific brain activation patterns.
The proposed approach
In this study, researchers proposed modulating activation responses in specific human brain regions in a controlled, personalized way, by selecting optimal natural stimuli and generating optimal synthetic stimuli, to improve the understanding of the human visual system.
They also evaluated their approach for targeted activation of specific brain regions in different individuals. The objective was to demonstrate that the approach can be utilized to generate and select optimal visual stimuli that are designed for targeted modulation of macro-scale human brain activity, and this modulation can be performed at an individual level.
Researchers utilized the large-scale Natural Scenes Dataset (NSD), which contains 30K paired images and brain responses from each of eight subjects, to train ANN-based individual-level encoding models with high accuracy. The NSD images were fed into the encoding models, and the predicted activations, averaged across the NSD subject population, were sorted to obtain sets of natural images predicted to attain average or maximal levels of activity for the targeted brain region.
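The selection step can be sketched in a few lines. This is a minimal illustration, not the study's code: the image IDs and activation values are made up, and defining the "average" set as the images whose group-mean prediction lies closest to the overall mean is an assumption.

```python
from statistics import mean


def select_images(predicted, k):
    """Pick k 'maximal' and k 'average' images for one target region.

    predicted maps an image ID to the list of activations predicted for
    that image by each subject's encoding model.
    """
    # Average the predictions across subjects for each image.
    group_mean = {img: mean(vals) for img, vals in predicted.items()}

    # Maximal set: the k images with the highest group-mean prediction.
    ranked = sorted(group_mean, key=group_mean.get, reverse=True)
    max_set = ranked[:k]

    # Average set (assumption): the k images closest to the overall mean.
    overall = mean(list(group_mean.values()))
    avg_set = sorted(group_mean, key=lambda img: abs(group_mean[img] - overall))[:k]
    return max_set, avg_set


# Hypothetical predictions from two subject-level models for six images.
predicted = {
    "img_a": [0.1, 0.3],
    "img_b": [1.9, 2.1],
    "img_c": [0.9, 1.1],
    "img_d": [0.7, 0.9],
    "img_e": [-0.2, 0.0],
    "img_f": [1.7, 1.9],
}
max_set, avg_set = select_images(predicted, k=2)
print(max_set, avg_set)
```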
Additionally, the NeuroGen framework was employed for designing synthetic images predicted to realize maximal/average levels of activity for the targeted region. Once all synthetic and natural image sets were obtained, researchers enrolled six individuals and determined their brain responses to the images using functional MRI (fMRI), which led to the generation of image-response data from the six enrolled subjects.
Subsequently, researchers applied their recently developed linear ensemble method to the image-response data to create individual-level, personalized encoding models for each subject. Sets of individual-specific synthetic and natural images were obtained using the personalized encoding models of the subjects based on the image generation/selection procedure described above, and the regional responses of the subject’s brain to the personalized images were obtained by performing a second fMRI scan.
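The article does not describe how the linear ensemble method works. One plausible reading, treated here as an assumption, is that a new subject's response is predicted as a weighted sum of the predictions of already-trained encoding models, with the weights fitted by least squares on that subject's small image-response dataset. The sketch below uses made-up numbers and hypothetical function names.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_ensemble(P, y):
    """Least-squares weights via the normal equations (P^T P) w = P^T y.

    P[i][m] is the prediction of pretrained model m for image i;
    y[i] is the new subject's measured response to image i.
    """
    n_models = len(P[0])
    gram = [
        [sum(row[a] * row[b] for row in P) for b in range(n_models)]
        for a in range(n_models)
    ]
    rhs = [sum(row[a] * yi for row, yi in zip(P, y)) for a in range(n_models)]
    return solve(gram, rhs)


def predict(weights, preds):
    """Ensemble prediction: weighted sum of the pretrained models' predictions."""
    return sum(w * p for w, p in zip(weights, preds))


# Hypothetical predictions of three pretrained models for five images.
P = [
    [1.0, 0.2, 0.5],
    [0.3, 1.1, 0.4],
    [0.6, 0.7, 1.3],
    [1.2, 0.1, 0.9],
    [0.2, 0.9, 0.1],
]
true_w = [0.5, 0.3, 0.2]  # made-up weights used to simulate responses
y = [predict(true_w, row) for row in P]

weights = fit_ensemble(P, y)
print([round(w, 3) for w in weights])
```

Because only a handful of weights are fitted per region, an approach of this kind needs far fewer image-response pairs than training a full encoding model from scratch.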
Significance of the study
Images predicted to attain maximal activations using a group-level encoding model evoked higher responses than images predicted to achieve average activations. Two visual regions, the anterior temporal lobe face area (aTLfaces) and fusiform body area 1 (FBA1), displayed significantly higher activation in response to the maximal synthetic images than to the maximal natural images.
However, one face area, the medial temporal lobe face area (mTLfaces), and two word regions, visual word form areas 1 and 2 (VWFA1 and VWFA2), demonstrated higher activation in response to natural images than to synthetic images. The modulation ability was associated with encoding model accuracy: the more accurate encoding models used in this study gave more accurate control over brain activity.
Additionally, synthetic images derived using personalized encoding models elicited higher responses than synthetic images designed using group-level or other individuals’ encoding models, specifically in face regions higher in the processing hierarchy. In aTLfaces, the highest-level face processing region, the optimal personalized synthetic images led to larger responses than the optimal natural images did.
Journal reference:
- Gu, Z., Jamison, K., Sabuncu, M. R., Kuceyeski, A. (2023). Human brain responses are modulated when exposed to optimized natural images or synthetically generated images. Communications Biology, 6(1), 1-12. https://doi.org/10.1038/s42003-023-05440-7, https://www.nature.com/articles/s42003-023-05440-7