In a study published in Communications Biology, researchers developed a spiking neural network (SNN)-based model that can accurately predict brain activity patterns evoked by visual stimuli. Mapping how the brain responds to sensory inputs, known as neural encoding, is a significant focus of neuroscience and underpins brain-computer interfaces.
Despite the success of deep convolutional neural networks (CNNs) for encoding, differences exist between their computational rules and those of real biological neurons. To address this, the researchers proposed an encoding framework built on unsupervised SNNs that operate via spike timing, much like the brain.
What did the researchers do?
They constructed a two-layer spiking convolutional neural network (SCNN) to extract visual features from images. The first layer emulated retinal ganglion cells using Difference of Gaussians (DoG) filters, while the second layer mimicked cortical processing with spiking neurons and spike-timing-dependent plasticity (STDP) learning. To predict fMRI voxel responses, the SCNN features were fed into regression models that mapped each voxel's receptive field.
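As a rough illustration of the retinal stage, a Difference of Gaussians response can be computed by subtracting two Gaussian-blurred copies of an image and converting response strength into spike timing. The sketch below (Python with NumPy/SciPy) is only schematic; the filter widths, image size, and the intensity-to-latency rule are illustrative assumptions, not the study's parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_response(image, sigma_center=1.0, sigma_surround=2.0):
    """Difference of Gaussians: a center-surround response akin to retinal
    ganglion cells. The sigma values are illustrative, not those of the study."""
    center = gaussian_filter(image.astype(float), sigma_center)
    surround = gaussian_filter(image.astype(float), sigma_surround)
    return center - surround  # ON-center response; negate for OFF-center

# Illustrative stimulus: a random 64x64 "image"
img = np.random.rand(64, 64)
on_map = dog_response(img)

# A simple intensity-to-latency code: stronger responses spike earlier
latencies = 1.0 / (np.abs(on_map) + 1e-6)
```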
The approach was evaluated on four public datasets: handwritten characters, handwritten digits, grayscale natural images, and color natural images. SCNN-based encoding achieved significantly higher accuracy than CNN and Gabor wavelet models across the datasets. It also enabled successful decoding for image reconstruction and identification. The results highlight that SNNs can capture subtle patterns in visual data and predict brain activity better than traditional deep networks.
Neural Encoding
Neural encoding refers to modeling how sensory stimuli are transformed into brain responses. Vision is a key focus because it is fundamental to how we perceive the world. Previous encoding models used Gabor wavelet filters, motivated by findings that V1 neurons act as oriented edge detectors. With hierarchical representational structures that mirror the visual system, deep CNNs have also become prevalent for encoding. However, differences exist between deep networks and biological computation: traditional artificial neurons propagate continuous values, while real neurons communicate via discrete spikes.
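For context, the V1-inspired baselines mentioned above rely on banks of oriented Gabor filters, a Gaussian envelope multiplied by a sinusoidal carrier. The snippet below is a minimal sketch of such a filter bank; the kernel size, wavelength, and orientations are illustrative and not those of the paper's Gabor wavelet baseline.

```python
import numpy as np

def gabor_kernel(size=21, sigma=4.0, theta=0.0, wavelength=8.0, phase=0.0):
    """Oriented Gabor filter: Gaussian envelope times a sinusoidal carrier.
    All parameter values here are illustrative."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_r = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates to theta
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r**2 + y_r**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_r / wavelength + phase)
    return envelope * carrier

# A small bank of orientations, as used in classical V1 encoding models
bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
```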
SNNs incorporate spiking neurons and spike-timing-dependent plasticity (STDP) learning, offering greater biological fidelity. Recent works have applied SNNs to object recognition, but their potential for predictive brain modeling remains largely unexplored. This study aimed to develop an SNN-based encoding framework that maps visual inputs to fMRI responses more plausibly than prior deep network models.
Methodology of SCNN Encoding
The encoding model comprised an SCNN feature extractor and voxel-wise linear regression predictors. The SCNN architecture included a Difference of Gaussians (DoG) retinal layer and a convolutional spiking layer. For an input image, the DoG layer outputs spike patterns that are fed to integrate-and-fire (IF) neurons in the convolutional layer for integration. Unsupervised STDP learning tuned the convolutional features based on the timing of these spikes.
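A minimal sketch of this second stage is given below, assuming non-leaky integrate-and-fire units and a simplified pair-based STDP rule; the thresholds, learning rates, and update rule are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def if_layer(input_spikes, weights, threshold=1.0):
    """Non-leaky integrate-and-fire neurons driven by binary input spike trains.
    input_spikes: (timesteps, n_inputs); weights: (n_inputs, n_neurons)."""
    n_steps, _ = input_spikes.shape
    n_neurons = weights.shape[1]
    potential = np.zeros(n_neurons)
    out_spikes = np.zeros((n_steps, n_neurons))
    for t in range(n_steps):
        potential += input_spikes[t] @ weights   # integrate weighted input
        fired = potential >= threshold
        out_spikes[t] = fired
        potential[fired] = 0.0                   # reset after a spike
    return out_spikes

def stdp_update(weights, pre_spikes, post_spikes, a_plus=0.01, a_minus=0.012):
    """Simplified STDP: potentiate inputs that fired at or before a postsynaptic
    spike, depress those that did not (illustrative rule, not the paper's)."""
    for t in range(post_spikes.shape[0]):
        post = post_spikes[t].astype(bool)
        if not post.any():
            continue
        pre_recent = pre_spikes[:t + 1].any(axis=0).astype(float)
        weights[:, post] += a_plus * pre_recent[:, None]
        weights[:, post] -= a_minus * (1.0 - pre_recent)[:, None]
    return np.clip(weights, 0.0, 1.0)
```

In the full model, the presynaptic spikes would come from the DoG layer's output, and the weights learned in this unsupervised way would define the SCNN's convolutional features.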
To predict a voxel's response, the SCNN features were selected according to that voxel's receptive field location, derived from the retinal topology. Linear regression then mapped the SCNN features within a voxel's receptive field to its measured fMRI signal, in line with findings that voxels respond to localized regions of the visual field. The model was trained on four image-fMRI datasets: handwritten characters, handwritten digits, grayscale natural images, and color natural images.
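The voxel-wise readout can be sketched as a linear regression fit on the SCNN features that fall inside a voxel's receptive field. The function name, shapes, and receptive-field window below are hypothetical and only illustrate the idea of a localized feature-to-voxel mapping.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_voxel_encoder(feature_maps, voxel_responses, rf_slice):
    """Fit a linear mapping from SCNN features inside a voxel's receptive field
    to that voxel's fMRI response across stimuli.
    feature_maps: (n_stimuli, channels, H, W); voxel_responses: (n_stimuli,);
    rf_slice: (row_slice, col_slice) covering the receptive field."""
    rows, cols = rf_slice
    X = feature_maps[:, :, rows, cols].reshape(len(feature_maps), -1)
    return LinearRegression().fit(X, voxel_responses)

# Illustrative data: 100 stimuli, 8 feature channels on a 16x16 grid
feats = np.random.rand(100, 8, 16, 16)
voxel = np.random.rand(100)
encoder = fit_voxel_encoder(feats, voxel, (slice(4, 9), slice(4, 9)))
predicted = encoder.predict(feats[:, :, 4:9, 4:9].reshape(100, -1))
```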
Evaluation of Encoding and Decoding Performance
The SCNN framework achieved significantly higher encoding accuracy than traditional models on all datasets. It outperformed Gabor wavelet models for natural images and CNNs across multiple datasets, demonstrating the advantage of the SNN's biological plausibility for predictive modeling. The SCNN also enabled effective decoding, supporting image reconstruction and identification from brain activity patterns.
Notably, SCNN performed comparably to a CNN directly optimized for reconstructing images from fMRI. This highlights the potential of unsupervised SNNs to capture informative visual features without needing brain responses as explicit training targets. Analyses of encoding voxels showed that their receptive field locations corresponded to areas of high stimulus intensity, validating the regression approach.
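Encoding accuracy in studies of this kind is commonly scored as the correlation between predicted and measured voxel responses on held-out stimuli, and identification as picking the candidate stimulus whose predicted pattern best matches the measured one. The sketch below illustrates both metrics under those assumptions; it is not the paper's exact evaluation protocol.

```python
import numpy as np

def encoding_accuracy(predicted, measured):
    """Pearson correlation per voxel between predicted and measured responses.
    predicted, measured: (n_stimuli, n_voxels)."""
    p = (predicted - predicted.mean(0)) / (predicted.std(0) + 1e-8)
    m = (measured - measured.mean(0)) / (measured.std(0) + 1e-8)
    return (p * m).mean(axis=0)

def identify(measured_pattern, predicted_patterns):
    """Correlation-based identification: return the index of the candidate
    whose predicted pattern best matches the measured fMRI pattern."""
    scores = [np.corrcoef(measured_pattern, p)[0, 1] for p in predicted_patterns]
    return int(np.argmax(scores))
```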
The findings showcase three key benefits of the spike-based encoding model. First, the SCNN automatically extracts complex features suited to the specific visual task, rather than relying on hand-designed filters that require expert knowledge. Second, STDP learning works well even with limited samples, unlike CNNs, which require extensive labeled data. Finally, encoding time-series signals as two-dimensional SNN spike patterns enables better feature learning than statistical approaches, which risk information loss.
Current Limitations
While demonstrating significant promise, the model has some limitations. The shallow SCNN depth restricts hierarchical feature learning compared to deeper CNNs, and the simplified spiking neuron model lacks the biological detail of real neurons. Additionally, the impact of SCNN hyperparameters on encoding performance warrants investigation. Nonetheless, the study provides a robust framework for predictive brain modeling using neural networks that computationally resemble the visual system.
Significance of Study Findings
This work highlights SNNs as an effective tool for neural encoding, mapping visual stimuli to brain activity via spike-based computations. The SCNN approach achieved predictive accuracy superior to leading deep network models across the experimental datasets and enabled successful reconstruction and identification directly from fMRI patterns.
These results demonstrate that mimicking the brain's spiking mechanisms can produce better models than non-spiking networks for encoding sensory information. The study contributes a robust SNN-based encoding methodology with interdisciplinary impact across neuroscience, bioengineering, and computer vision. The proposed framework can provide biological insights and support real-world applications such as brain-computer interfaces. Overall, the findings help bridge the gap between AI and neurobiology to advance our understanding of vision in the brain.
Future Outlook
While this work provides a robust proof of concept for SNN-based encoding, ample opportunities remain to build on these initial findings. Future work can incorporate advances in deep SNNs and more realistic neuron models to enhance encoding capabilities. Another exciting direction is extending SNN encoding to non-visual cognitive functions such as memory and emotion.
One research direction is exploring how encoding performance changes with deeper and more complex SCNN architectures. The current model used a relatively simple two-layer spiking convolutional network; incorporating techniques for training deeper multi-layer SNNs could enhance representation learning and encoding accuracy.
Another critical area is extending the SNN encoding framework beyond vision to other sensory modalities and cognitive processes. As the method is generalizable, it can likely be adapted to predict fMRI responses in higher-order brain regions involving memory, emotion, and decision-making. This could shed light on how more complex information is neurally encoded. Additionally, cross-modal SNN encoding could reveal how the brain integrates multisensory stimuli. Advancing on these fronts will open up new vistas for predictive biomimetic modeling of the brain using spiking networks.
Journal reference:
- Wang, C.; Yan, H.; Huang, W.; Sheng, W.; Wang, Y.; Fan, Y.S.; Liu, T.; Zou, T.; Li, R.; Chen, H. Neural Encoding with Unsupervised Spiking Convolutional Neural Network. Communications Biology 2023, 6, 880. https://doi.org/10.1038/s42003-023-05257-4, https://www.nature.com/articles/s42003-023-05257-4