Unraveling ChatGPT's Working Memory Capacity

In a paper presented at NeurIPS 2023, researchers assessed the working memory capacity of ChatGPT, a large language model, using verbal and spatial n-back tasks. As the information load increased, ChatGPT's performance declined in a pattern resembling human limitations. The study also showed how different prompting strategies affect performance and highlighted the value of n-back tasks for assessing working memory in large language models (LLMs).


Background

The emergence of LLMs such as ChatGPT and GPT-4 has sparked interest in artificial general intelligence and in probing which human-level abilities these models actually possess. A recent study delves into the working memory capacity of LLMs, shedding light on their ability to retain contextual information during multi-turn conversations. Working memory, an essential component of human intelligence, underpins advanced cognitive processes such as problem-solving, logical reasoning, and language understanding.

Studies involving human participants have revealed a fundamental limit to how much information can be held in working memory, although researchers have yet to agree on why this limit exists or how it is enforced. According to the executive attention hypothesis, sustained and regulated attention is what controls which information is retained in working memory, making attentional control central to working memory capacity.

To test the executive attention hypothesis, the research employs the n-back task, a widely used and dependable method for evaluating working memory capacity. In this task, participants view a continuous stream of stimuli and must indicate whether the current item matches the one presented n steps earlier, continually updating the items held in mind while discarding those no longer relevant. Working memory capacity is strongly connected to fluid intelligence, the ability to reason logically and solve novel problems independently, and prior studies have suggested that training working memory with the n-back task can enhance fluid intelligence.
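To make the task concrete, here is a minimal Python sketch of how a verbal n-back block might be generated and scored. The letter set, block length, and match rate are illustrative assumptions, not the study's exact design:

    import random

    def generate_n_back_block(n, num_trials=24, match_rate=0.33,
                              letters="bcdfghjklmnpqrstvwxz"):
        # Illustrative parameters only, not the study's actual materials.
        # Start with n letters that cannot yet be matches.
        seq = [random.choice(letters) for _ in range(n)]
        for _ in range(num_trials - n):
            if random.random() < match_rate:
                seq.append(seq[-n])                  # plant an n-back match
            else:
                seq.append(random.choice(letters))   # non-match filler
        # 'm' marks trials whose letter matches the one shown n steps back.
        targets = ["m" if i >= n and seq[i] == seq[i - n] else "-"
                   for i in range(len(seq))]
        return seq, targets

    def accuracy(responses, targets):
        # Fraction of trials where the response matches the correct label.
        return sum(r == t for r, t in zip(responses, targets)) / len(targets)

For n = 2, for instance, a response is correct whenever the respondent flags exactly those letters that repeat the letter shown two trials earlier; raising n increases the number of items that must be held and updated at once.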

Although LLMs have displayed impressive performance across many tasks, assessing and comparing their cognitive abilities remains challenging. The current study defines the working memory of LLMs as the capacity to select and manipulate information in service of ongoing cognitive processes, in line with the executive attention hypothesis. It also proposes that a language model's performance on n-back tasks is a reliable measure of its working memory capacity and, by extension, of its general intelligence in reasoning and problem-solving.

Using ChatGPT as a representative large language model, the study devises two types of n-back tasks to assess its working memory capacity. Remarkably consistent patterns indicating a capacity limit emerge across experimental conditions, suggesting that humans and LLMs may share working memory mechanisms. These findings matter for both cognitive scientists and LLM researchers, paving the way toward a better understanding of human working memory limits and the development of more intelligent LLMs with enhanced working memory capacity.

Related work

Researchers have long investigated working memory in human cognition, attributing its maintenance to sustained neural activity across interconnected brain networks. By contrast, working memory in LLMs has received limited attention. Recent studies highlight the importance of studying and improving the working memory of LLMs, as it contributes to their overall performance.

LLMs have achieved impressive results across a wide range of tasks. While fine-tuning is a popular way to adapt pre-trained models to new tasks, it may not be feasible for extremely large models or when data is scarce. In-context learning, a promising alternative, demonstrates the ability of LLMs to learn from a few examples without gradient-based weight updates. This mirrors the functioning of human working memory: the model retrieves pre-existing knowledge and integrates it with the given context. Work on in-context learning in LLMs has attracted significant interest, yielding diverse strategies for harnessing this capability.
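As a simple illustration (not drawn from the study itself), the snippet below shows the idea behind in-context learning: the task is specified entirely through examples embedded in the prompt, and the model must hold the inferred rule in context for the final query, with no weight updates involved.

    # In-context learning in miniature: the task is defined only by the
    # examples in the prompt; the model infers the rule and applies it to
    # the final query without any gradient updates.
    few_shot_prompt = """Classify each review as positive or negative.

    Review: "The battery lasts all day." -> positive
    Review: "The screen cracked within a week." -> negative
    Review: "Setup was quick and painless." -> positive
    Review: "Customer support never replied." ->"""
    # Sent to an LLM, the expected completion is "negative": the pattern
    # is carried in the context window, much like items held in working memory.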

However, this study is the first to offer an empirical analysis of the working memory capacity of LLMs through the lens of cognitive science.

Methods

The researchers administered verbal and spatial n-back tasks to ChatGPT to assess its working memory. For the verbal tasks, they varied the prompting strategy, including conditions with feedback, injected noise, and chain-of-thought prompts. Performance declined as task difficulty (the value of n) increased. Chain-of-thought prompting improved performance, while noise and feedback had adverse effects. In the spatial tasks, the researchers observed abstract spatial reasoning, with larger grid sizes leading to better performance.
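As a sketch of how such a task could be administered programmatically, the following Python snippet presents a verbal n-back block to a chat model one letter per turn via the OpenAI chat completions API. The instruction wording, model name, and single-character response format are assumptions for illustration, not the authors' exact prompts.

    from openai import OpenAI  # assumes the official openai package is installed

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    N = 2
    INSTRUCTION = (
        f"You will receive a sequence of letters, one per message. Reply 'm' if "
        f"the current letter matches the letter from {N} steps earlier, and '-' "
        f"otherwise. Reply with a single character only."
    )

    def run_block(letter_sequence, model="gpt-3.5-turbo"):
        # Present the block turn by turn so earlier letters stay in the
        # conversation context, which is exactly what the task stresses.
        messages = [{"role": "system", "content": INSTRUCTION}]
        responses = []
        for letter in letter_sequence:
            messages.append({"role": "user", "content": letter})
            reply = client.chat.completions.create(model=model, messages=messages)
            answer = reply.choices[0].message.content.strip()
            messages.append({"role": "assistant", "content": answer})
            responses.append(answer)
        return responses

Combined with a block generator and scorer like the one sketched earlier, this would yield an accuracy curve over increasing n, the kind of measurement the study uses to probe capacity limits.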

The empirical analysis of ChatGPT's working memory capacity presented in this study provides valuable insights for cognitive scientists and LLM researchers. The observed capacity limit aligns with the executive attention hypothesis, highlighting the importance of attentional processes in working memory. The performance of LLMs on n-back tasks serves as a reliable metric to assess their working memory capacity, reflecting their general intelligence in reasoning and problem-solving.

Conclusion

In summary, the study reveals that artificial intelligence language models such as ChatGPT exhibit working memory capacity limits similar to those of humans. The research provides insights into the executive attention hypothesis, the relationship between working memory capacity and fluid intelligence, and the potential of n-back tasks as a metric for evaluating LLMs' cognitive abilities. These findings have implications for the development of more advanced and intelligent language models, contributing to the pursuit of artificial general intelligence.


Written by

Dr. Sampath Lonka

Dr. Sampath Lonka is a scientific writer based in Bangalore, India, with a strong academic background in Mathematics and extensive experience in content writing. He has a Ph.D. in Mathematics from the University of Hyderabad and is deeply passionate about teaching, writing, and research. Sampath enjoys teaching Mathematics, Statistics, and AI to both undergraduate and postgraduate students. What sets him apart is his unique approach to teaching Mathematics through programming, making the subject more engaging and practical for students.

