In an article published in the journal Information Sciences, researchers explored the transformative potential of cooperative DRL and its synergy with a Shapley value reward system to improve traffic signal management, reducing congestion and enhancing traffic flow efficiency.
Urban traffic congestion remains a pressing challenge, impacting daily commutes, environmental sustainability, and overall urban productivity. Traditional traffic signal control methods have shown limitations in adapting to the complex and dynamic nature of traffic patterns. Recently, artificial intelligence and machine learning have emerged as potential game-changers. In particular, deep reinforcement learning (DRL) and the concept of cooperation between traffic intersections have gained prominence.
Cooperative deep reinforcement learning
Deep reinforcement learning (DRL) is an area of machine learning that holds immense promise in solving intricate challenges. In the context of traffic signal control, intersections can be likened to intelligent agents. Each agent learns how to time traffic signals by interacting with the traffic environment, adapting its actions to maximize a predefined reward, typically reduced travel times and minimized congestion. However, the real power of DRL emerges when these agents collaborate. Cooperative DRL introduces the concept of communication and collaboration between agents. By sharing information, such as queue lengths and vehicle counts, agents can collectively work to optimize traffic flow. This approach proves particularly effective in urban scenarios where intersections are interconnected, such as traffic grids in cities.
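The idea above can be made concrete with a minimal sketch. This is not the authors' implementation: it uses simple tabular Q-learning rather than deep networks, and the state and action encodings (queue lengths, a binary keep/switch phase action) are illustrative assumptions. The cooperative element is that each agent's state includes its neighbours' queue lengths, so its decisions account for upstream and downstream congestion.

```python
import random
from collections import defaultdict

class IntersectionAgent:
    """Minimal Q-learning agent for one intersection (hypothetical encoding)."""
    def __init__(self, actions=(0, 1), alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.actions = actions        # 0 = keep current phase, 1 = switch phase
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

# Cooperation: an agent's state combines its own queue with its
# neighbours' shared queue lengths.
def joint_state(own_queue, neighbour_queues):
    return (own_queue,) + tuple(neighbour_queues)

agent = IntersectionAgent()
s = joint_state(own_queue=5, neighbour_queues=[3, 7])
a = agent.act(s)
# Negative reward = vehicles still queued after this step (a common choice).
agent.learn(s, a, reward=-5, next_state=joint_state(4, [3, 6]))
```

In a deep variant, the `defaultdict` Q-table would be replaced by a neural network that maps the joint state to action values, but the information-sharing pattern stays the same.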
Shapley value reward system
Cooperation among agents necessitates a well-designed and equitable reward system. This is where the Shapley value comes into play. The Shapley value is a concept from cooperative game theory that fairly distributes the contribution of each agent in a cooperative setting. In the context of traffic signal control, the Shapley value assigns a reward to each intersection based on its individual contribution to reducing traffic congestion. This elegant mechanism encourages intersections to work together harmoniously, as their actions directly influence the collective goal of optimizing traffic flow.
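The Shapley value can be computed exactly for small groups by averaging each agent's marginal contribution over every possible joining order. The sketch below does this for three hypothetical intersections; the coalition "values" (congestion reduction in vehicles per hour) are made-up numbers for illustration, not results from the paper.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over all orderings in which the coalition could form."""
    contrib = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            contrib[p] += value(frozenset(coalition)) - before
    return {p: c / len(perms) for p, c in contrib.items()}

# Hypothetical congestion reduction achieved by each coalition of
# intersections A, B, C (note AB together beat the sum of A and B alone).
reductions = {
    frozenset(): 0, frozenset("A"): 10, frozenset("B"): 20, frozenset("C"): 5,
    frozenset("AB"): 40, frozenset("AC"): 18, frozenset("BC"): 28,
    frozenset("ABC"): 50,
}
rewards = shapley_values("ABC", reductions.__getitem__)
```

A useful sanity check is the efficiency property: the individual Shapley rewards always sum to the value of the full coalition (here, 50), so no contribution is double-counted or lost. This is exactly what makes the reward split feel fair to each agent.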
Putting theory into practice
Implementing cooperative DRL with the Shapley value reward system involves several pivotal steps:
Problem Framing: Intersections are considered intelligent agents responsible for traffic signal control. The primary objective is to enhance collaboration among these agents to achieve optimal traffic flow.
Cooperative Learning: Agents engage in communication, sharing their localized observations and relevant information with neighboring intersections. This facilitates joint decision-making, thereby improving traffic signal synchronization.
Optimized Learning: Agents learn from their experiences through deep neural networks. To ensure stable and effective learning, outdated experiences are eliminated using the Kullback-Leibler divergence technique.
Shapley Value Reward: The Shapley value reward system calculates rewards for each agent based on their contributions toward mitigating traffic congestion. By doing so, the system encourages intersections to harmonize their actions and work collectively to enhance traffic flow.
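The experience-filtering step above can be sketched in miniature. One plausible reading (an assumption, not the paper's exact mechanism) is that a stored experience is considered "outdated" when the policy that generated it has drifted too far, in KL divergence, from the current policy for that state:

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete action distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def filter_replay(buffer, current_policy, threshold=0.5):
    """Keep only experiences whose recorded behaviour policy is still
    close (in KL divergence) to the current policy for that state."""
    kept = []
    for state, action, reward, next_state, old_policy in buffer:
        if kl_divergence(old_policy, current_policy(state)) <= threshold:
            kept.append((state, action, reward, next_state, old_policy))
    return kept

# Hypothetical two-entry replay buffer: each entry stores the action
# distribution the agent followed when the experience was recorded.
buffer = [
    ("s0", 1, -3.0, "s1", [0.5, 0.5]),    # still close to current policy
    ("s2", 0, -1.0, "s3", [0.99, 0.01]),  # recorded under a stale policy
]
current = lambda s: [0.5, 0.5]
fresh = filter_replay(buffer, current)  # only the first experience survives
```

Discarding drifted experiences keeps the training distribution close to what the current policy would actually encounter, which is what gives the learning process its stability under changing traffic conditions.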
Results and comparison
To validate the effectiveness of the proposed approach, a series of experiments was conducted using both simulated and real-world traffic datasets. The outcomes demonstrate the substantial benefits of cooperative DRL with Shapley value rewards compared to conventional fixed-time signal control methods:
Remarkable congestion reduction: The approach significantly reduces average travel times, thereby alleviating congestion and enhancing overall traffic flow efficiency.
Strengthened collaborative efforts: Cooperative DRL fosters deeper collaboration among traffic intersections, leading to smoother traffic flow and reduced bottlenecks.
Consistency and stability: Combining the optimized loss function and the Shapley value reward system ensures stable learning, even in the face of dynamically changing traffic conditions.
Outperforming conventional approaches: The cooperative approach consistently outperforms traditional methods across diverse traffic grid sizes and complexities.
Conclusion
Traffic congestion continues to be a persistent urban challenge, but the convergence of cooperative DRL and the Shapley value reward system offers a promising avenue for resolution. By empowering intersections to communicate and collaborate effectively, this innovative approach paves the way for traffic signal control systems optimized for all road users' benefit. As cooperative DRL evolves and matures, it presents a powerful solution for enhancing traffic flow, reducing congestion, and elevating the overall urban commuting experience. As urban populations continue to grow, the potential impact of this synergy becomes increasingly profound, suggesting a brighter and less congested future for cities.
Journal reference:
- Liu, J., Qin, S., Su, M., Luo, Y., Wang, Y., & Yang, S. (2023). Multiple Intersections Traffic Signal Control based on Cooperative Multi-agent Reinforcement Learning. Information Sciences. https://doi.org/10.1016/j.ins.2023.119484