ReAInet: Bridging Artificial and Human Vision through EEG-Aligned Neural Networks

In a recent preprint submitted to arXiv*, researchers from Ohio State University and the University of Texas, USA, aimed to bridge the gap between artificial and human vision and pave the way for more brain-like artificial intelligence systems.

Study: ReAInet: Bridging Artificial and Human Vision through EEG-Aligned Neural Networks. Image credit: Mrspopman1985/Shutterstock

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.

They developed a novel vision model called ReAlnet that aligns with human brain activity based on non-invasive electroencephalography (EEG) recordings, demonstrating a significantly higher similarity to human brain representations than existing models. The model also exhibited increased adversarial robustness and hierarchical individual variability across layers, reflecting the complexity and adaptability of human visual processing.

Background

An object detection model is a computer vision system that identifies and classifies objects within an image or video. Popular deep learning-based models include the Faster Region-based Convolutional Neural Network (Faster R-CNN), You Only Look Once (YOLO), and the Single Shot MultiBox Detector (SSD). These models are crucial for applications such as autonomous vehicles, surveillance, and image analysis.
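For readers unfamiliar with such systems, the sketch below shows how a pretrained Faster R-CNN detector can be run with the torchvision library. The model choice, input, and confidence threshold here are illustrative and are not taken from the study.

```python
# Minimal sketch: running a pretrained Faster R-CNN detector with torchvision.
# The input tensor and the 0.5 score threshold are placeholder assumptions.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)  # stand-in for a real RGB image in [0, 1]

with torch.no_grad():
    predictions = model([image])[0]  # dict with 'boxes', 'labels', 'scores'

keep = predictions["scores"] > 0.5  # discard low-confidence detections
print(predictions["boxes"][keep], predictions["labels"][keep])
```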

Artificial intelligence has made substantial progress, yet existing object recognition models still fall short of replicating the complex visual information processing observed in the human brain. Recent research has highlighted the promise of using neural data to guide models toward brain-like processing; however, such work has relied heavily on invasive neural recordings from non-human subjects.

About the Research

In the present paper, the authors used CORnet-S (core object recognition network with a simple architecture), a state-of-the-art vision model, as the foundational architecture for ReAlnet. CORnet-S is a recurrent convolutional neural network that mimics the hierarchical structure of the ventral visual stream, the brain pathway responsible for object recognition.
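The defining feature of such a network is that the same convolutional weights are applied repeatedly over several time steps within a block. The sketch below illustrates that idea in simplified form; the layer sizes, normalization scheme, and step counts are placeholders, not the published CORnet-S architecture.

```python
# Illustrative sketch of the core idea behind CORnet-S: a convolutional
# block whose weights are reused across several recurrent time steps.
import torch
import torch.nn as nn

class RecurrentConvBlock(nn.Module):
    def __init__(self, channels: int, times: int = 2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.norm = nn.BatchNorm2d(channels)
        self.times = times  # number of recurrent passes through the same weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        state = x
        for _ in range(self.times):
            state = torch.relu(self.norm(self.conv(state)) + x)  # recurrence with a skip
        return state

block = RecurrentConvBlock(channels=64, times=2)
features = block(torch.rand(1, 64, 56, 56))
```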

The study added an EEG generation module to CORnet-S, consisting of a series of encoders that transform the latent features from each visual layer of the model into predicted EEG signals. EEG signals are recordings of the brain's electrical activity made with electrodes attached to the scalp, and they reflect the brain's responses to different visual stimuli, such as images of objects. The authors used these signals to align their vision model with human brain representations and thereby achieve more human brain-like vision.
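A minimal sketch of one such encoder is shown below: it maps a layer's latent feature map to a predicted EEG signal. The electrode/time-point dimensions and the hidden size are assumptions chosen for illustration; the paper's exact encoder design may differ.

```python
# Sketch of an EEG generation encoder: latent feature map -> predicted EEG.
# N_ELECTRODES and N_TIMEPOINTS are assumed values, not from the paper.
import torch
import torch.nn as nn

N_ELECTRODES, N_TIMEPOINTS = 17, 100  # assumed EEG recording shape

class EEGEncoder(nn.Module):
    def __init__(self, in_channels: int, hidden: int = 256):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # collapse spatial dimensions
        self.head = nn.Sequential(
            nn.Linear(in_channels, hidden),
            nn.ReLU(),
            nn.Linear(hidden, N_ELECTRODES * N_TIMEPOINTS),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        z = self.pool(feats).flatten(1)   # (batch, in_channels)
        eeg = self.head(z)                # (batch, electrodes * time points)
        return eeg.view(-1, N_ELECTRODES, N_TIMEPOINTS)
```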

The model was trained to jointly minimize a classification loss on ImageNet labels and a generation loss between the predicted and real EEG signals, using a large and rich EEG dataset (THINGS EEG2) that recorded human brain activity while participants viewed images of objects from different categories. The researchers also used a second EEG dataset, THINGS EEG1, and a functional magnetic resonance imaging (fMRI) dataset, the Shen fMRI dataset, to evaluate the model's similarity to human brain representations across modalities, subjects, and images.
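The dual objective could be sketched as below. The weighting factor `alpha` and the mean-squared-error choice for the generation term are assumptions; the paper may combine or weight the terms differently.

```python
# Sketch of the dual training objective described above: classification
# loss on ImageNet labels plus a generation loss on EEG predictions.
import torch
import torch.nn.functional as F

def combined_loss(logits, labels, predicted_eeg, real_eeg, alpha: float = 1.0):
    cls_loss = F.cross_entropy(logits, labels)      # ImageNet classification
    gen_loss = F.mse_loss(predicted_eeg, real_eeg)  # EEG generation (assumed MSE)
    return cls_loss + alpha * gen_loss

# In a training step: loss = combined_loss(...); loss.backward(); optimizer.step()
```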

The THINGS EEG2 dataset contains human EEG responses from 10 subjects to 22,248 images from 1,854 object concepts. THINGS EEG1 comprises responses from 50 subjects to 4,320 images from 720 object concepts, and the Shen fMRI dataset contains human fMRI responses from three subjects to 40 images from different categories.

The study evaluated ReAlnet on several aspects: its similarity to human EEG and fMRI representations, the individual variability across subjects and layers, and its adversarial robustness against white-box attacks. The authors compared ReAlnet with CORnet-S and other baseline models, such as ResNet-101 (residual network) and CLIP (contrastive language-image pre-training).
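The article does not specify how model-brain similarity was quantified; representational similarity analysis (RSA) is a common choice for such comparisons and is sketched below. Each representational dissimilarity matrix (RDM) holds pairwise correlation distances between stimulus responses, and the two RDMs are compared with a Spearman correlation. The array shapes are placeholders.

```python
# Hedged sketch: comparing model and EEG representations with RSA.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(responses: np.ndarray) -> np.ndarray:
    """responses: (n_stimuli, n_features) -> condensed correlation-distance RDM."""
    return pdist(responses, metric="correlation")

model_acts = np.random.rand(50, 512)   # stand-in model activations per image
eeg_data = np.random.rand(50, 1700)    # stand-in EEG patterns per image

similarity, _ = spearmanr(rdm(model_acts), rdm(eeg_data))
print(f"model-brain RSA similarity: {similarity:.3f}")
```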

Research Findings

The results showed that ReAlnet achieved significantly higher similarity to human EEG neural dynamics across all four visual layers than CORnet-S, and it also outperformed the other baseline models, including ResNet-101 and CLIP. This similarity held across different EEG and fMRI datasets, indicating that ReAlnet learned general, cross-modal patterns of brain representation.

The model also showed higher similarity to human fMRI activity across different brain regions, even though it was never trained on fMRI data, suggesting that ReAlnet learned general neural representations of the human brain rather than dataset-specific ones. It exhibited hierarchical individual variability across layers, reflecting the increasing complexity and diversity of neural representations in the human brain. Furthermore, it demonstrated greater adversarial robustness than CORnet-S, indicating that aligning with human neural representations can improve a model's stability and generalization.
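The article mentions white-box attacks without naming a specific method; the fast gradient sign method (FGSM) is a standard example of such an attack and is sketched below as an assumed illustration. Here, `model` stands for any image classifier with inputs in [0, 1].

```python
# Hedged sketch of a white-box robustness check using FGSM: each pixel is
# perturbed in the direction that increases the classification loss.
import torch
import torch.nn.functional as F

def fgsm_accuracy(model, images, labels, epsilon: float = 0.03):
    images = images.clone().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()  # gradients of the loss with respect to the input pixels
    adversarial = (images + epsilon * images.grad.sign()).clamp(0, 1)
    with torch.no_grad():
        preds = model(adversarial).argmax(dim=1)
    return (preds == labels).float().mean().item()  # accuracy under attack
```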

The developed techniques have implications for both computer vision and cognitive neuroscience. For computer vision, the model offers a novel and effective way to increase the resemblance between vision models and the human brain, which can improve the models' robustness and generalization and enable more brain-like artificial intelligence systems. For cognitive neuroscience, the method can serve as a useful tool for exploring the mechanisms of human visual processing and for testing hypotheses and predictions about the brain's representational patterns.

Conclusion

In summary, the researchers presented an effective, efficient, and adaptable framework for human neural representational alignment, together with the resulting human brain-like model, ReAlnet. The model not only aligns closely with human EEG and fMRI responses but also exhibits hierarchical individual variability and increased adversarial robustness, mirroring human visual processing. In doing so, the research helps close the gap between artificial and human vision.

The study suggested that the developed technique can be extended to other neural modalities, such as fMRI and magnetoencephalography (MEG), and to other tasks, including natural language and auditory processing, using unsupervised or self-supervised models. The researchers acknowledged challenges and limitations, such as the small size of the neural datasets and the lack of shared labels among them.


Journal reference:

Written by

Muhammad Osama

Muhammad Osama is a full-time data analytics consultant and freelance technical writer based in Delhi, India. He specializes in transforming complex technical concepts into accessible content. He has a Bachelor of Technology in Mechanical Engineering with specialization in AI & Robotics from Galgotias University, India, and he has extensive experience in technical content writing, data science and analytics, and artificial intelligence.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Osama, Muhammad. (2024, February 02). ReAInet: Bridging Artificial and Human Vision through EEG-Aligned Neural Networks. AZoAi. Retrieved on July 06, 2024 from https://www.azoai.com/news/20240202/ReAInet-Bridging-Artificial-and-Human-Vision-through-EEG-Aligned-Neural-Networks.aspx.

  • MLA

    Osama, Muhammad. "ReAInet: Bridging Artificial and Human Vision through EEG-Aligned Neural Networks". AZoAi. 06 July 2024. <https://www.azoai.com/news/20240202/ReAInet-Bridging-Artificial-and-Human-Vision-through-EEG-Aligned-Neural-Networks.aspx>.

  • Chicago

    Osama, Muhammad. "ReAInet: Bridging Artificial and Human Vision through EEG-Aligned Neural Networks". AZoAi. https://www.azoai.com/news/20240202/ReAInet-Bridging-Artificial-and-Human-Vision-through-EEG-Aligned-Neural-Networks.aspx. (accessed July 06, 2024).

  • Harvard

    Osama, Muhammad. 2024. ReAInet: Bridging Artificial and Human Vision through EEG-Aligned Neural Networks. AZoAi, viewed 06 July 2024, https://www.azoai.com/news/20240202/ReAInet-Bridging-Artificial-and-Human-Vision-through-EEG-Aligned-Neural-Networks.aspx.
