Driven by inherent uncertainty and the sim-to-real gap, robust reinforcement learning (RL) seeks to improve resilience against the complexity and variability in agent-environment sequential interactions. Despite the existence of a large number of RL benchmarks, there is a lack of standardized benchmarks for robust RL. Current robust RL policies often focus on a specific type of uncertainty and are evaluated in distinct, one-off environments. In this work, we introduce Robust-Gymnasium, a unified modular benchmark designed for robust RL that supports a wide variety of disruptions across all key RL components—agents' observed state and reward, agents' actions, and the environment. It offers over sixty diverse task environments spanning control and robotics, safe RL, and multi-agent RL. Robust-Gymnasium provides an open-source and user-friendly tool for the community to assess current methods and foster the development of robust RL algorithms. In addition, we benchmark existing standard and robust RL algorithms in Robust-Gymnasium, uncovering significant deficiencies in each and offering new insights.
This benchmark aims to advance robust reinforcement learning (RL) for real-world applications and domain adaptation. It provides a comprehensive set of tasks covering a range of robustness requirements under uncertainty in states, actions, rewards, and environment dynamics, and spans diverse applications including control, robot manipulation, dexterous-hand tasks, and more. (This repository is under active development; we appreciate any constructive comments and suggestions.)
Each of these tasks incorporates perturbations to observations, actions, reward signals, and dynamics, so that the robustness of RL algorithms can be evaluated against each source of uncertainty.
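To make this concrete, below is a minimal sketch of how such disruptions can be injected around a standard Gymnasium environment. It is purely illustrative and does not use the Robust-Gymnasium API; the wrapper name, noise scales, and task id are placeholders.

```python
# Illustrative sketch only (not the Robust-Gymnasium API): inject Gaussian
# noise into the observations, actions, and rewards of a Gymnasium env.
import numpy as np
import gymnasium as gym


class GaussianDisruptionWrapper(gym.Wrapper):
    """Perturbs the agent-environment interaction at three interfaces."""

    def __init__(self, env, obs_noise=0.05, act_noise=0.05, rew_noise=0.1, seed=None):
        super().__init__(env)
        self.obs_noise = obs_noise
        self.act_noise = act_noise
        self.rew_noise = rew_noise
        self.rng = np.random.default_rng(seed)

    def step(self, action):
        # Perturb the action before it reaches the environment.
        action = np.asarray(action, dtype=np.float64)
        action = action + self.rng.normal(0.0, self.act_noise, size=action.shape)
        action = np.clip(action, self.action_space.low, self.action_space.high)
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Perturb what the agent observes and the reward it receives.
        obs = obs + self.rng.normal(0.0, self.obs_noise, size=obs.shape)
        reward = float(reward) + self.rng.normal(0.0, self.rew_noise)
        return obs, reward, terminated, truncated, info


# Placeholder task id; any continuous-control Gymnasium task works.
env = GaussianDisruptionWrapper(gym.make("Pendulum-v1"), seed=0)
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```

Wrapping the environment keeps the disruption logic separate from both the agent and the task, which mirrors the modular design described below.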
We hope this benchmark serves as a useful platform for pushing the boundaries of RL in real-world problems by promoting robustness and domain adaptation!
Robust RL problems typically consist of three modules:

- An agent module, which selects actions based on the (possibly perturbed) observations and rewards it receives;
- An environment module, which returns the next state and reward in response to the agent's action;
- A disruptor module, which perturbs the quantities exchanged between the agent and the environment (observations, actions, rewards) or the environment's dynamics.
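The conceptual sketch below shows how these modules interact during a rollout, with the disruptor intervening at each interface. All names here (`disrupt`, the random policy, the Pendulum task, and the mass attribute used for the dynamics shift) are assumptions for illustration, not the benchmark's interface.

```python
# Conceptual sketch of one rollout with the three modules (assumed names;
# not the Robust-Gymnasium interface).
import numpy as np
import gymnasium as gym

rng = np.random.default_rng(0)

def disrupt(x, scale=0.05):
    """Disruptor module: here, simple additive Gaussian noise."""
    return x + rng.normal(0.0, scale, size=np.shape(x))

env = gym.make("Pendulum-v1")                    # environment module
policy = lambda obs: env.action_space.sample()   # agent module (random stand-in)

obs, info = env.reset(seed=0)
env.unwrapped.m = 1.0 + rng.normal(0.0, 0.1)     # dynamics shift: perturb the pendulum
                                                 # mass (env-specific attribute, assumed)
for _ in range(200):
    obs = disrupt(obs)                           # perturb the agent's observation
    action = disrupt(policy(obs))                # perturb the chosen action
    obs, reward, terminated, truncated, info = env.step(action)
    reward = disrupt(reward)                     # perturb the observed reward
    if terminated or truncated:
        obs, info = env.reset()
```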
By leveraging this benchmark, we can evaluate the robustness of RL algorithms and develop new ones that perform reliably under real-world uncertainties and adversarial conditions. This involves creating agents that maintain their performance despite distributional shifts, noisy data, and unforeseen perturbations. Therefore, there are vast opportunities for future research with this benchmark, such as:
In conclusion, by using this benchmark, we can test and refine the robustness of RL algorithms before deploying them in diverse, real-world scenarios.
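As a hedged example of such a pre-deployment check, the sketch below sweeps a single disruption parameter (the observation-noise scale) and reports the average return at each level; a robust policy should degrade gracefully as the noise grows. The random policy and task id are placeholders for a trained agent and a benchmark task.

```python
# Illustrative robustness check: average return of a policy as the
# observation-noise level grows (placeholder random policy and task id).
import numpy as np
import gymnasium as gym

def average_return(env, policy, obs_noise, episodes=5, seed=0):
    """Mean undiscounted return when Gaussian noise corrupts each observation."""
    rng = np.random.default_rng(seed)
    returns = []
    for ep in range(episodes):
        obs, info = env.reset(seed=seed + ep)
        done, total = False, 0.0
        while not done:
            noisy_obs = obs + rng.normal(0.0, obs_noise, size=obs.shape)
            obs, reward, terminated, truncated, info = env.step(policy(noisy_obs))
            total += reward
            done = terminated or truncated
        returns.append(total)
    return float(np.mean(returns))

env = gym.make("Pendulum-v1")
policy = lambda obs: env.action_space.sample()   # stand-in for a trained agent

# A robust policy should degrade gracefully as the disruption level grows.
for obs_noise in (0.0, 0.05, 0.1, 0.2):
    print(f"obs_noise={obs_noise:.2f}  avg_return={average_return(env, policy, obs_noise):.1f}")
```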
@article{robustrl2024,
  title={Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning},
  author={Gu, Shangding and Shi, Laixi and Wen, Muning and Jin, Ming and Mazumdar, Eric and Chi, Yuejie and Wierman, Adam and Spanos, Costas},
  journal={GitHub},
  year={2024}
}