In a paper published in the journal Actuators, researchers introduced a deep reinforcement learning (DRL) model that combined pushing and grasping actions to improve robotic manipulation in cluttered environments.
The model used two convolutional neural networks (CNNs), push-net and grasp-net, to predict actions from heightmap images. It achieved a grasp success rate of 87%, significantly outperforming traditional grasp-only methods, and it generalized well across varied scenarios, showing potential for real-world applications.
Background
Past work in robotic manipulation highlighted advancements in grasping techniques, especially in cluttered environments, but significant gaps remained. Earlier approaches often struggled in extreme clutter, which requires tighter integration of non-prehensile actions such as pushing, and they had difficulty generalizing robustly across diverse object shapes and sizes. The challenge of transferring learned behaviors from simulation to real-world scenarios also persisted.
Simulated Robotic Manipulation
The section details a simulation setup involving a Universal Robots 5 (UR5) arm equipped with a two-finger parallel gripper and an Intel RealSense D435 depth camera. The team used this arm to test the proposed manipulation model in various environments, including cluttered and well-ordered configurations with a mix of known and novel objects.
The hardware comprised the UR5 arm with six degrees of freedom, the parallel-jaw gripper, and a red-green-blue-depth (RGB-D) camera to capture scene data, processed on a system with an Intel Core i7 processor.
The simulation was conducted using CoppeliaSim and the PyTorch framework. The environment was controlled, and the robot's tasks included identifying, planning, and executing grasps. The UR5's mathematical model was based on Denavit-Hartenberg (D-H) parameters for forward kinematics, allowing precise end-effector control. Data collection involved capturing RGB-D images and converting them into heightmaps for processing by the densely connected convolutional network (DenseNet-121) model.
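The forward-kinematics step can be illustrated with a short computation. The sketch below uses nominal, publicly available D-H parameters for the UR5 and the standard D-H transform; the exact values and conventions used in the paper may differ.

```python
import numpy as np

# Nominal UR5 D-H parameters (approximate published values; the paper's
# exact parameterization may differ).
A     = [0.0, -0.425, -0.39225, 0.0, 0.0, 0.0]          # link lengths (m)
D     = [0.089159, 0.0, 0.0, 0.10915, 0.09465, 0.0823]  # link offsets (m)
ALPHA = [np.pi / 2, 0.0, 0.0, np.pi / 2, -np.pi / 2, 0.0]  # link twists (rad)

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform for one joint in the standard D-H convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(joint_angles):
    """Compose the six joint transforms to obtain the end-effector pose."""
    T = np.eye(4)
    for theta, d, a, alpha in zip(joint_angles, D, A, ALPHA):
        T = T @ dh_transform(theta, d, a, alpha)
    return T  # 4x4 pose of the end effector in the base frame

# Example: end-effector position at the zero joint configuration.
print(forward_kinematics([0.0] * 6)[:3, 3])
```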
DRL was employed to optimize the robot's actions, using a Markov decision process to model state transitions and rewards. The agent learned through Q-learning, with experience replay and a target network to enhance training efficiency. The DenseNet-121 architecture processed heightmaps to predict Q-values for pushing and grasping actions, guiding the robot in its decision-making. The learning process was designed to maximize long-term rewards, with the agent selecting actions based on the highest predicted Q-values from the model.
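A minimal PyTorch sketch of this training loop is shown below. The module and variable names (QNet, push_net, grasp_net, select_action, train_step) are hypothetical, and the published push-net/grasp-net architecture is more elaborate (e.g., rotated heightmaps and fully convolutional outputs); the sketch only illustrates Q-value prediction on a DenseNet-121 trunk, greedy action selection, experience replay, and a target-network update.

```python
import random
from collections import deque

import torch
import torch.nn as nn
import torchvision

class QNet(nn.Module):
    """Hypothetical pixel-wise Q-value head on a DenseNet-121 trunk."""
    def __init__(self):
        super().__init__()
        self.trunk = torchvision.models.densenet121(weights=None).features
        self.head = nn.Conv2d(1024, 1, kernel_size=1)   # per-pixel Q-value

    def forward(self, heightmap):                # (B, 3, H, W) heightmap tensor
        return self.head(self.trunk(heightmap))  # (B, 1, h, w) Q-value map

# Separate networks for the two action primitives, plus target copies.
push_net, grasp_net = QNet(), QNet()
push_target, grasp_target = QNet(), QNet()
push_target.load_state_dict(push_net.state_dict())
grasp_target.load_state_dict(grasp_net.state_dict())

replay = deque(maxlen=10_000)   # experience replay buffer
gamma = 0.5                     # discount factor (illustrative value)
opt = torch.optim.SGD(
    list(push_net.parameters()) + list(grasp_net.parameters()),
    lr=1e-4, momentum=0.9)

def select_action(heightmap):
    """Greedy policy: pick the primitive and pixel with the highest Q-value."""
    with torch.no_grad():
        q_push, q_grasp = push_net(heightmap), grasp_net(heightmap)
    if q_push.max() > q_grasp.max():
        return "push", q_push.argmax()
    return "grasp", q_grasp.argmax()

def train_step(batch_size=8):
    """One Q-learning update from replayed transitions (sketch)."""
    if len(replay) < batch_size:
        return
    for state, primitive, idx, reward, next_state in random.sample(replay, batch_size):
        net = push_net if primitive == "push" else grasp_net
        with torch.no_grad():
            # Bootstrap from the target networks for stability.
            next_q = max(push_target(next_state).max(), grasp_target(next_state).max())
            target = reward + gamma * next_q
        q = net(state).flatten()[idx]            # Q-value of the executed pixel
        loss = nn.functional.smooth_l1_loss(q, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The use of SGD with momentum here mirrors the comparison reported later in the article, where the proposed model is contrasted with a variant trained by SGD without momentum.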
Model Performance Summary
The results section presents the findings from the proposed model's training and testing sessions. After training the model with self-supervised DRL, the team evaluated it in test scenarios featuring different levels of clutter and novel objects. Performance was assessed in cluttered environments, in challenging well-ordered configurations, and on novel objects.
The proposed model demonstrated significant improvements in cluttered environments over traditional grasping-only policies. While conventional approaches achieved a grasping success rate of 60%, the new model reached 87%, thanks to the effective integration of pushing and grasping actions. The model's approach involved pushing actions to create space around the target object, facilitating a better grasp.
The introduction of rewards for successful pushes further enhanced grasping efficiency. Comparison graphs indicated that the proposed model outperformed other methods, including those without rewards for pushing or using stochastic gradient descent (SGD) without momentum.
Testing in environments with randomly arranged objects showed that the model could effectively handle dense clutter. With an increased number of objects, the model achieved a grasp success rate of 50.5% and a grasp-to-push ratio of 77.1% in highly cluttered scenes. This performance underscores the model's ability to generalize to various configurations, demonstrating robustness in managing complex environments.
The model excelled in challenging, well-ordered configurations, where objects were stacked closely and placed in difficult orientations. Its performance in these test cases, including high grasp success and completion rates, further validated its capability. The model also proved its generalization ability by maintaining high performance on novel objects not seen during training, including items such as bottles, bananas, and screwdrivers, highlighting its versatility and robustness for real-world applications.
Conclusion
In summary, the proposed model improved grasping in cluttered environments by integrating pushing and grasping actions, achieving an 87% success rate. This approach surpassed grasping-only policies, no-reward-for-pushing policies, and SGD-without-momentum strategies by 27%, 16%, and 8%, respectively.
The model demonstrated robustness against lighting variations through RGB-D camera data and PyTorch data augmentation. Future work will incorporate domain randomization, transfer learning, and robust reward structures to enhance real-world applicability. Additionally, further improvements will be essential to address object properties like fragility and deformability.
Journal reference:
- Shiferaw, B. A., Agidew, T. F., et al. (2024). Synergistic Pushing and Grasping for Enhanced Robotic Manipulation Using Deep Reinforcement Learning. Actuators, 13(8), 316. DOI: 10.3390/act13080316, https://www.mdpi.com/2076-0825/13/8/316