Enhancing Continual Learning in AI: A Multi-Learner Approach with Active Forgetting

Continual learning aims to strike a delicate equilibrium between memory stability and learning plasticity so that artificial intelligence (AI) systems can adapt to real-world scenarios. Current methodologies predominantly focus on preserving memory stability and, as a result, struggle to accommodate incremental changes effectively.

Study: Enhancing Continual Learning in AI: A Multi-Learner Approach with Active Forgetting. Image credit: 3rdtimeluckystudio/Shutterstock

In a recent publication in the journal Nature Machine Intelligence, researchers proposed a generic solution that employs multiple learning modules to actively regulate forgetting. This approach diminishes the impact of old memories on parameter distributions, thereby enhancing learning plasticity. The study adopts a multi-learner architecture so that the model remains compatible with the distributions of old and new tasks as problem-solving evolves.

Background

Continual learning, also known as lifelong learning, serves as the foundation for empowering AI systems to navigate dynamic and unpredictable real-world situations. Existing strategies primarily concentrate on preserving memory stability but exhibit limited efficacy across diverse experimental settings, especially in mitigating catastrophic forgetting in neural networks.

To address this gap, the authors advocate a comprehensive approach that balances the memory stability of old tasks with learning plasticity for new tasks while ensuring compatibility with their respective distributions. Drawing inspiration from natural continual learning in the gamma (γ) subset of the Drosophila mushroom body (γMB), the study explores the functional advantages of the γMB system, particularly its ability to actively regulate memories.

The proposed strategy integrates active forgetting with stability protection, achieving a nuanced trade-off between new and old tasks. The model, featuring parallel learning modules mirroring the compartmentalized organization of γMB, demonstrates superior generality and performance across various continual learning benchmarks.

Synaptic expansion-renormalization framework

The authors introduced a framework called synaptic expansion-renormalization for continual learning. It is presented for the two-task case and then extended to multiple tasks. The approach seeks a posterior distribution over network parameters that combines knowledge from the different tasks: the learner optimizes a loss function that accounts for the current task as well as knowledge retained from previous tasks.
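
To make the objective concrete, the sketch below shows a regularized loss of the kind this family of methods builds on: an EWC-style quadratic penalty (EWC is one of the baselines discussed later) that anchors each parameter to its value after the previous task, weighted by an estimated importance. The function and variable names are illustrative assumptions, not the authors' implementation.

```python
import torch

def stability_regularized_loss(model, new_task_loss, old_params, importance, lam=1.0):
    """Illustrative EWC-style objective: the new-task loss plus a quadratic
    penalty anchoring each parameter to its value after the previous task,
    weighted by an importance estimate (e.g., the diagonal Fisher information).

    old_params / importance: dicts of tensors saved when the previous task
    finished, keyed by parameter name. Names here are placeholders.
    """
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in old_params:
            penalty = penalty + (importance[name] * (p - old_params[name]) ** 2).sum()
    return new_task_loss + lam * penalty
```

Under a Laplace approximation of the old-task posterior, a quadratic penalty of this form is one standard way to combine previous-task knowledge with the current-task likelihood.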

The method introduces active forgetting, governed by a forgetting rate, to enhance learning plasticity. The framework is further extended to multiple parallel continual learners, each with its own neural network. Theoretical analyses discuss the generalization ability of the approach and the benefits of using multiple learners.
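
A minimal sketch of the multi-learner idea is shown below, assuming (purely for illustration) small fully connected sub-networks whose logits are averaged at prediction time; the number of learners, their sizes, and the per-learner forgetting rates are placeholders rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class ParallelContinualLearners(nn.Module):
    """Several independent sub-networks trained on the same task stream,
    each intended to use its own forgetting rate during training, with
    their outputs averaged at prediction time. Purely illustrative."""

    def __init__(self, num_learners=5, in_dim=784, num_classes=10, forgetting_rates=None):
        super().__init__()
        self.learners = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, num_classes))
            for _ in range(num_learners)
        ])
        # One forgetting rate per learner; diverse rates encourage diverse expertise.
        # These would modulate each learner's regularization during training (not shown here).
        self.forgetting_rates = forgetting_rates or [0.1 * (k + 1) for k in range(num_learners)]

    def forward(self, x):
        # Average the learners' logits (averaging softmax outputs is another option).
        return torch.stack([f(x) for f in self.learners], dim=0).mean(dim=0)
```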

For practical evaluation, the approach is tested on various benchmark datasets and compared with baseline methods such as elastic weight consolidation (EWC), synaptic intelligence (SI), memory-aware synapses (MAS), adaptive group sparsity-based continual learning (AGS-CL), progress and compress (P&C), and classifier-projection regularization (CPR). Evaluation metrics include average accuracy, forward transfer, backward transfer, and the diversity of the learners' predictions. The method is also applied to Atari reinforcement learning tasks, where it is evaluated on normalized average reward, normalized plasticity, and normalized stability.
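
For reference, the classification metrics can be computed from a task-accuracy matrix; the sketch below uses the definitions commonly adopted in the continual learning literature, which may differ slightly from the paper's exact variants.

```python
import numpy as np

def continual_learning_metrics(R, b=None):
    """R[i, j]: test accuracy on task j after training on tasks 0..i (T x T matrix).
    b[j]: accuracy of a randomly initialized model on task j (needed for forward transfer).
    Standard definitions from the continual learning literature; illustrative only."""
    T = R.shape[0]
    avg_acc = R[T - 1].mean()  # average accuracy over all tasks after the final task
    bwt = np.mean([R[T - 1, j] - R[j, j] for j in range(T - 1)])  # backward transfer
    fwt = None
    if b is not None:
        fwt = np.mean([R[j - 1, j] - b[j] for j in range(1, T)])  # forward transfer
    return avg_acc, bwt, fwt
```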

The methodological treatment concludes with a proposition on the benefits of using multiple continual learners and active forgetting for tightening generalization bounds. Theoretical analyses based on probably approximately correct (PAC)-Bayes theory provide insight into the generalization error of the proposed solution.
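
For orientation, analyses of this kind typically build on a classical PAC-Bayes bound; the textbook form below is given for context and is not the paper's specific statement.

```latex
% With probability at least 1 - \delta over an i.i.d. sample S of size n,
% for any posterior Q over parameters and any prior P fixed before seeing S:
\[
  L_{\mathcal{D}}(Q) \;\le\; \widehat{L}_{S}(Q)
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
\]
```

The KL term measures how far the learned posterior drifts from the prior; PAC-Bayes analyses bound the generalization error in terms of this divergence.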

Results and analysis

The precise recall of old tasks can impede the effective learning of new tasks when their data distributions differ. Drawing inspiration from biological active forgetting, a forgetting rate is introduced to modulate the influence of old knowledge. The loss function is formulated to balance stability protection and active forgetting, with hyperparameters controlling the strength of each regularization term. Active forgetting is optimized in two equivalent ways (AF-1 and AF-2), both of which encourage network parameters to renormalize while learning a new task. The benefits of active forgetting are analyzed theoretically, showing an improved probability of learning new tasks and reduced generalization error.
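
A minimal sketch of how such an objective could be assembled is shown below, assuming (as an illustration only) that active forgetting discounts the old-task importance by a forgetting rate; the (1 - beta) scaling and the hyperparameter names are placeholders, not the paper's AF-1/AF-2 formulation.

```python
import torch

def loss_with_active_forgetting(model, new_task_loss, old_params, importance,
                                beta=0.1, lam=1.0):
    """Quadratic anchor to the old-task parameters, as in the earlier sketch,
    but with the old importance discounted by a forgetting rate beta so that
    part of the old memory is actively released in favor of the new task.
    beta = 0 recovers pure stability protection; beta = 1 drops the anchor."""
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in old_params:
            penalty = penalty + ((1.0 - beta) * importance[name]
                                 * (p - old_params[name]) ** 2).sum()
    return new_task_loss + lam * penalty
```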

Evaluations on visual classification benchmarks demonstrate the efficacy of active forgetting, particularly in improving average accuracy and forward transfer. The study also explores a γMB-like architecture with multiple parallel continual learners, showing that adaptive implementations of active forgetting improve performance by effectively coordinating the diversity of expertise among learners. The approach scales well and outperforms single continual learners across various experimental settings, including task-incremental learning scenarios and Atari reinforcement learning tasks.

Conclusion

In summary, the authors introduced a generic approach for continual learning in artificial neural networks inspired by biological learning systems. The proposed method demonstrates superior performance and generality, holding promise for applications in smartphones, robotics, and autonomous driving. Because continual learning avoids retraining on all previously seen data, it can be deployed in an energy-efficient manner, aligning with eco-friendly AI development.

Active forgetting, allowing flexibility for external changes, is supported by theoretical and empirical evidence in the computational model, offering testable hypotheses for further research. The study highlights the importance of generalized theories and methodologies for integrating advances in artificial and biological intelligence, promoting mutual progress and inspiration.


Written by

Dr. Sampath Lonka

Dr. Sampath Lonka is a scientific writer based in Bangalore, India, with a strong academic background in Mathematics and extensive experience in content writing. He has a Ph.D. in Mathematics from the University of Hyderabad and is deeply passionate about teaching, writing, and research. Sampath enjoys teaching Mathematics, Statistics, and AI to both undergraduate and postgraduate students. What sets him apart is his unique approach to teaching Mathematics through programming, making the subject more engaging and practical for students.

