In a recent publication in the journal Scientific Reports, researchers introduced a novel EEG-based approach in which a virtual robot navigates grids of varying sizes and identifies targets, with any location a potential target.
Background
In the context of brain-computer interfaces (BCIs), the need for users to micromanage low-level actions to achieve high-level goals has been a significant challenge. BCIs for robot navigation have existed for some time, but they demand that users actively control each individual action, imposing a substantial mental workload. Recent research explores the concept of "cognitive probing," wherein reactive brain signals elicited by machine actions serve as feedback for reinforcement learning (RL).
These studies predominantly focus on distinguishing correct from erroneous actions using error-related potentials (ErrPs), which are generated spontaneously in response to recognized errors. Notably, recent research has demonstrated that more nuanced information can be extracted from reactive brain signals, for example to subclassify navigational errors and correct actions. Building upon this, the current study pioneers a four-way electroencephalography (EEG)-based approach to robot navigation and target identification.
Experimental design
The current study employed real EEG data from previous research involving ten healthy adults with a mean age of 27.3 years. Participants provided written informed consent in accordance with the Declaration of Helsinki. EEG signals were recorded at 500 Hz using an Enobio 8 headset.
The experiment involved a virtual robot navigating a one-dimensional space with nine locations, aiming to reach and identify the target. Preset probabilities introduced erroneous movements and erroneous target identifications. The target's initial placement was random, and the robot began two or three steps away from it. The robot performed an action every two seconds, with preset chances of moving toward the target, moving away from it, or identifying a location incorrectly.
Runs ended upon the robot's identification of a location as the target, whether correct or erroneous, followed by a five-second break. Participants could move and blink freely between runs but were asked to remain still during them. Each block lasted around four minutes, with inter-block breaks as needed. Two participants observed two blocks each, while the remaining eight observed six blocks each.
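For concreteness, this observation protocol can be sketched as a small simulation. The Python sketch below is our own illustration: the three probabilities are placeholders, since the study's preset values are not given here.

```python
import random

# Minimal sketch of one observation run; all three probabilities are
# illustrative assumptions, not the study's preset values.
N_LOCATIONS = 9      # one-dimensional grid of nine locations
P_WRONG_MOVE = 0.2   # assumed chance that a movement goes away from the target
P_FALSE_ID = 0.05    # assumed chance of identifying a non-target location
P_STEP_OFF = 0.2     # assumed chance of stepping off the target instead of identifying it

def run_once():
    target = random.randrange(N_LOCATIONS)
    # The robot starts two or three steps from the target, inside the grid.
    starts = [target + d for d in (-3, -2, 2, 3)
              if 0 <= target + d < N_LOCATIONS]
    robot, steps = random.choice(starts), 0
    while True:                                  # one action every two seconds
        steps += 1
        if robot == target:
            if random.random() < P_STEP_OFF:
                robot += random.choice([-1, 1])  # SO: erroneously steps off
                robot = min(max(robot, 0), N_LOCATIONS - 1)
                continue
            return steps, True                   # run ends: correct identification
        if random.random() < P_FALSE_ID:
            return steps, False                  # run ends: false identification
        move = 1 if target > robot else -1       # TT: step toward the target
        if random.random() < P_WRONG_MOVE:
            move = -move                         # FA: erroneous step away
        robot = min(max(robot + move, 0), N_LOCATIONS - 1)
```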
The study followed an independent workflow for each participant: observation, EEG data preprocessing, separation of training and test data, classifier training, and offline simulations. The offline simulations aimed to replicate real-time human-machine interaction by replaying previously recorded EEG data as feedback to the robot. Eight participants contributed EEG data to the simulations, which was divided into training and test sets.
Offline simulations occurred in two grids: a one-dimensional space with nine squares and a two-dimensional space with 400 squares. Three navigation strategies were tested: "Bayesian Inference," "React," and "Random." For the first two, EEG trials from a participant who had observed the experiment were retrieved after each robot action and classified in real time to inform subsequent actions. The navigation strategy determined when the target was deemed reached, concluding the run. Each strategy was executed 1,000 times per participant for each grid size.
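The article does not spell out the Bayesian Inference strategy's update rule, but one plausible reading is that each classified EEG response serves as evidence about where the target lies. The Python sketch below illustrates that idea for the one-dimensional grid; the `implied_class` helper, the assumed classifier accuracy, and the update rule itself are illustrative choices, not the study's implementation.

```python
import numpy as np

# Hypothetical Bayesian update over candidate target squares, driven by the
# label decoded from EEG after each robot action. Values are assumptions.
N, P_CORRECT = 9, 0.75   # grid size; assumed four-way classifier accuracy

def implied_class(prev, new, cand):
    """Movement class the last action would have had if `cand` were the target."""
    if new == cand:
        return "TR"
    if prev == cand:
        return "SO"
    return "TT" if abs(new - cand) < abs(prev - cand) else "FA"

def update(belief, prev, new, eeg_label):
    """One Bayesian step: upweight candidates whose implied movement class
    matches the EEG label, then renormalize to a probability distribution."""
    like = np.array([P_CORRECT if implied_class(prev, new, c) == eeg_label
                     else (1 - P_CORRECT) / 3 for c in range(N)])
    posterior = belief * like
    return posterior / posterior.sum()

belief = np.full(N, 1 / N)                       # uniform prior over nine squares
belief = update(belief, prev=4, new=5, eeg_label="TT")
print(belief.round(3))                           # mass shifts toward squares 6-8
```

Under such a scheme, the run would end once the belief in some square exceeds a stringency threshold, which is consistent with the stringency settings discussed below.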
Methodology and results
Researchers employed a four-way classification model to categorize the observed robot movements for each participant. The four movement classes were defined as follows: the TT condition (moving toward the target), the FA condition (moving farther from the target while at an off-target location), the TR condition (target reached), and the SO condition (stepping off the target). The distance to the target was defined as the minimum number of steps from the robot's current position to the target, counting only horizontal and vertical movements. Movements that decreased this distance were classified as the TT condition, and movements that increased it as the FA condition.
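In code, this labeling rule reduces to comparing Manhattan distances to the target before and after each step, as in the following sketch (the function names are illustrative, not the study's):

```python
# Illustrative encoding of the movement-labeling rule.
def manhattan(a, b):
    """Minimum number of horizontal/vertical steps between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def movement_class(prev_pos, new_pos, target):
    """Assign one of the four movement classes relative to the target."""
    if new_pos == target:
        return "TR"    # target reached
    if prev_pos == target:
        return "SO"    # stepped off the target
    if manhattan(new_pos, target) < manhattan(prev_pos, target):
        return "TT"    # moved toward the target
    return "FA"        # moved farther away while off-target

assert movement_class((0, 0), (0, 1), (0, 3)) == "TT"
assert movement_class((0, 2), (0, 1), (0, 3)) == "FA"
```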
The classification process involved a two-stage binary tree. Initially, EEG trials were categorized as responses to either erroneous (SO or FA) or correct (TR or TT) movements. Each branch was then further split into its specific category. Stepwise linear discriminant analysis was employed for classification, and data preprocessing involved baseline correction, downsampling, and bandpass filtering. In addition, a binary classification of the target identification (TI) action was performed for each participant to determine whether the identified location was correct (CTI) or false (FTI).
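A rough Python sketch of this pipeline follows; scikit-learn's plain LDA stands in for the study's stepwise LDA, and the baseline window, filter cutoffs, and downsampling factor are assumed values rather than the study's parameters.

```python
from scipy.signal import butter, filtfilt, decimate
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

FS = 500      # Hz, the Enobio 8 recording rate
FACTOR = 5    # assumed downsampling factor (500 Hz -> 100 Hz)

def preprocess(trial, baseline_samples=50, low=0.5, high=10.0):
    """Baseline-correct, downsample, and bandpass filter one EEG trial
    (channels x samples); all parameter values here are assumptions."""
    trial = trial - trial[:, :baseline_samples].mean(axis=1, keepdims=True)
    trial = decimate(trial, FACTOR, axis=1)        # applies its own antialiasing
    b, a = butter(4, [low, high], btype="band", fs=FS / FACTOR)
    return filtfilt(b, a, trial, axis=1).ravel()   # flatten to a feature vector

# Two-stage binary tree; each classifier must first be fitted on labeled
# training features from the participant's own observation blocks.
stage1 = LinearDiscriminantAnalysis()        # erroneous {SO, FA} vs. correct {TR, TT}
stage2_err = LinearDiscriminantAnalysis()    # SO vs. FA
stage2_cor = LinearDiscriminantAnalysis()    # TR vs. TT

def classify(features):
    branch = stage1.predict([features])[0]
    stage2 = stage2_err if branch == "error" else stage2_cor
    return stage2.predict([features])[0]
```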
The study investigated the impact of the more detailed EEG information by comparing the Bayesian inference strategy to an equivalent system with TI classification switched off, and to a system using a binary error-versus-correct movement classification. Both alternative systems were evaluated across a range of stringency values.
Evaluation of the navigation strategies in small and large grids involved two key metrics: the percentage of targets correctly identified (PTCI), which measures accuracy, and the mean normalized number of steps (MNS), which measures speed. Results revealed that the Bayesian strategy achieved significantly higher PTCI scores than the other strategies, while statistical analysis of the MNS scores revealed no significant difference between strategies.
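Both metrics are straightforward to compute from per-run outcomes. In the sketch below, treating "normalized" as division by each run's shortest-path length is our assumption about the MNS definition.

```python
# Illustrative computation of the two evaluation metrics from per-run records.
def ptci(runs):
    """Percentage of runs whose identified location was the true target."""
    return 100.0 * sum(r["correct"] for r in runs) / len(runs)

def mns(runs):
    """Mean step count, each run normalized by its minimum possible steps."""
    return sum(r["steps"] / r["optimal_steps"] for r in runs) / len(runs)

runs = [
    {"correct": True,  "steps": 4, "optimal_steps": 3},
    {"correct": False, "steps": 9, "optimal_steps": 2},
]
print(ptci(runs), mns(runs))   # 50.0 and (4/3 + 9/2) / 2 ~ 2.92
```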
A trade-off between speed and accuracy was evident: higher stringency settings improved accuracy but increased the number of steps. Switching TI classification on improved the system's robustness, and the four-way movement classification outperformed the binary classification in both speed and consistency.
Conclusion
In summary, the authors introduced a scalable approach that leverages reactive EEG not only to navigate to target locations but also to accurately identify those targets upon reaching them. They demonstrated the superiority of the Bayesian inference strategy and suggested its potential for various applications. Future research could investigate other learning approaches, such as inverse reinforcement learning, as well as long-term learning.