AI Thrives in Real-World Chaos After Training in Calm, Simulated Environments

Discover how MIT’s groundbreaking 'indoor training effect' is revolutionizing AI performance, enabling robots and agents to excel in unpredictable real-world environments—starting with Atari games and beyond!

Research: The Indoor-Training Effect: unexpected gains from distribution shifts in the transition function

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.

A home robot trained to perform household tasks in a factory may fail to scrub the sink effectively or take out the trash when deployed in a user's kitchen, as this new environment differs from its training space.

To avoid this, engineers often try to match the simulated training environment as closely as possible with the real world where the agent will be deployed.

However, researchers from MIT and elsewhere have now found that, despite this conventional wisdom, sometimes training in a completely different environment yields a better-performing artificial intelligence agent.

Their results indicate that, in some situations, training a simulated AI agent in a world with less uncertainty, or "noise," enables it to perform better than a competing AI agent trained in the same noisy world used to test both agents.

The researchers call this unexpected phenomenon the indoor training effect.

The researchers studied this phenomenon by training AI agents to play Atari games, which they modified by adding some unpredictability. They were surprised to find that the indoor training effect consistently occurred across Atari games and game variations.

They hope these results fuel additional research toward developing better training methods for AI agents.

"This is an entirely new axis to think about. Rather than trying to match the training and testing environments, we may be able to construct simulated environments where an AI agent learns even better," adds co-author Spandan Madan, a graduate student at Harvard University.

Madan and lead author Serena Bono, a research assistant in the MIT Media Lab, are joined on the paper by Ishaan Grover, an MIT graduate student; Mao Yasueda, a graduate student at Yale University; Cynthia Breazeal, professor of media arts and sciences and leader of the Personal Robotics Group in the MIT Media Lab; Hanspeter Pfister, the An Wang Professor of Computer Science at Harvard; and Gabriel Kreiman, a professor at Harvard Medical School. The research will be presented at the Association for the Advancement of Artificial Intelligence Conference.

Training troubles

The researchers set out to explore why reinforcement learning agents perform so poorly when tested in environments that differ from their training space.

Reinforcement learning is a trial-and-error method in which the agent explores a training space and learns to take actions that maximize its reward.
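As a rough illustration, the loop below sketches this trial-and-error process as tabular Q-learning, a common reinforcement learning algorithm. The env object, its actions list, and its reset/step methods are hypothetical placeholders, not the setup used in the paper.

import random
from collections import defaultdict

def train(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    q = defaultdict(float)  # value estimates keyed by (state, action)
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Trial and error: usually exploit the best-known action, sometimes explore.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            # Nudge the estimate toward the observed reward plus the discounted
            # value of the best action in the next state.
            best_next = max(q[(next_state, a)] for a in env.actions)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q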

The team developed a technique for explicitly adding noise to the transition function, an element of the reinforcement learning problem that defines an agent's probability of moving from one state to another based on the action it chooses.

If the agent is playing Pac-Man, a transition function might define the probability that ghosts on the game board will move up, down, left, or right. In standard reinforcement learning, the AI is trained and tested using the same transition function.
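The sketch below illustrates this idea: a hypothetical transition function assigns probabilities to a ghost's next move, and noise is injected by blending that distribution with a uniform one. The specific probabilities and the blending scheme are assumptions for illustration, not the paper's exact noise model.

import random

def ghost_move_probs(state):
    # Hypothetical base transition probabilities for one ghost; a real game
    # would compute these from the maze layout and Pac-Man's position.
    return {"up": 0.4, "down": 0.2, "left": 0.2, "right": 0.2}

def add_noise(probs, noise=0.3):
    # Blend the original distribution with a uniform one; at noise=1.0 the
    # ghost's next move is completely random.
    uniform = 1.0 / len(probs)
    return {move: (1 - noise) * p + noise * uniform for move, p in probs.items()}

def sample_move(probs):
    moves, weights = zip(*probs.items())
    return random.choices(moves, weights=weights)[0]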

Starting from this conventional setup, the researchers added noise to the transition function, which, as expected, degraded the agent's Pac-Man performance.

However, when the researchers trained the agent with a noise-free Pac-Man game and tested it in an environment where they injected noise into the transition function, it performed better than an agent trained on the noisy game.
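The comparison described above can be summarized as a simple protocol, sketched below with placeholder make_env, train, and evaluate functions: one agent trains without noise, another trains with the test-time noise, and both are evaluated in the same noisy environment.

def indoor_training_comparison(make_env, train, evaluate, noise=0.3):
    clean_env = make_env(noise=0.0)
    noisy_env = make_env(noise=noise)

    indoor_agent = train(clean_env)    # trained without noise
    matched_agent = train(noisy_env)   # trained with the test-time noise

    # Both agents are evaluated in the same noisy environment.
    return {
        "indoor_trained": evaluate(indoor_agent, noisy_env),
        "noise_matched": evaluate(matched_agent, noisy_env),
    }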

"The rule of thumb is that you should try to capture the deployment condition's transition function as well as you can during training to get the most bang for your buck. We really tested this insight to death because we couldn't believe it ourselves," Madan says.

Injecting varying amounts of noise into the transition function allowed the researchers to test many environments, but it didn't create realistic games. The more noise they injected into Pac-Man, the more likely the ghosts were to teleport randomly to different squares.

To see if the indoor training effect occurred in normal Pac-Man games, they adjusted the underlying probabilities so ghosts moved normally but were more likely to move up and down than left and right. AI agents trained in noise-free environments still performed better in these realistic games.
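One way to picture this variant, purely as an illustration, is a transition function whose probabilities are skewed toward vertical moves; the specific numbers below are assumptions, not values from the paper.

def biased_ghost_probs(vertical_bias=0.35, horizontal_bias=0.15):
    # Ghosts still move one square at a time, but up/down moves are more
    # likely than left/right moves.
    probs = {
        "up": vertical_bias,
        "down": vertical_bias,
        "left": horizontal_bias,
        "right": horizontal_bias,
    }
    assert abs(sum(probs.values()) - 1.0) < 1e-9  # sanity check: a valid distribution
    return probs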

"It was not only due to the way we added noise to create ad hoc environments. This seems to be a property of the reinforcement learning problem. And that was even more surprising to see," Bono says.

Exploration explanations

When the researchers dug deeper for an explanation, they saw some correlations in how the AI agents explore the training space.

When both AI agents explore mostly the same areas, the agent trained in the non-noisy environment performs better, perhaps because it is easier for the agent to learn the game's rules without noise interference.

If their exploration patterns differ, then the agent trained in the noisy environment tends to perform better. This might occur because the agent needs to understand patterns it can't learn in a noise-free environment.
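One simple way to quantify whether two agents "explore mostly the same areas" is the overlap between the sets of states each visits during training; the Jaccard measure below is an illustrative choice, not necessarily the metric used in the paper.

def exploration_overlap(visited_a, visited_b):
    # Jaccard overlap between two sets of visited states: 1.0 means the
    # agents covered exactly the same states, 0.0 means no overlap.
    a, b = set(visited_a), set(visited_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Example with toy (x, y) grid positions: overlap of 0.5.
print(exploration_overlap({(1, 2), (1, 3), (2, 3)}, {(1, 2), (2, 3), (4, 4)}))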

"If I only learn to play tennis with my forehand in the non-noisy environment, but then in the noisy one I have to also play with my backhand, I won't play as well in the non-noisy environment," Bono explains.

In the future, the researchers hope to explore how the indoor training effect might occur in more complex reinforcement learning environments or with other techniques like computer vision and natural language processing. They also want to build training environments designed to leverage this effect, which could help AI agents perform better in uncertain environments.

Journal reference:
  • Preliminary scientific report. Bono, S., Madan, S., Grover, I., Yasueda, M., Breazeal, C., Pfister, H., & Kreiman, G. (2024). The Indoor-Training Effect: Unexpected gains from distribution shifts in the transition function. ArXiv. https://arxiv.org/abs/2401.15856
