Deep Reinforcement Learning (DRL) has become a powerful model-free framework for learning optimal policies. However, in real-world navigation tasks, DRL methods often suffer from insufficient exploration, especially in cluttered scenarios with sparse rewards or complex dynamics under system disturbances. To overcome this challenge, we bridge general graph-based motion planning with DRL, enabling RL agents to explore their environments comprehensively and achieve optimal performance. Specifically, we design a dense reward function based on a graph structure that spans the entire state space. This graph serves as a rich source of information, guiding the RL agent toward discovering optimal strategies. We validate our approach in dynamic and challenging environments, demonstrating significant improvements in exploration efficiency and task success rates.
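The abstract does not include code, so the following is a minimal sketch of how we read the graph-construction step: sample states in the free space, connect nearby collision-free samples (an RRG-style random geometric graph), and precompute every node's shortest-path distance to the goal. The workspace bounds, circular obstacles, the `build_state_graph` name, and all parameter values are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): random geometric graph over the free 2-D
# workspace, with each node's graph-geodesic distance to the goal precomputed.
import numpy as np
import networkx as nx

def build_state_graph(goal, obstacles, n_samples=500, radius=0.15, seed=0):
    """Sample collision-free states, connect nearby ones, and precompute
    each node's shortest-path (graph) distance to the goal."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(0.0, 1.0, size=(n_samples, 2))
    pts = np.vstack([pts, goal])                      # make sure the goal is a node
    free = [p for p in pts
            if all(np.linalg.norm(p - c) > r for c, r in obstacles)]

    g = nx.Graph()
    for i, p in enumerate(free):
        g.add_node(i, pos=p)
    for i in range(len(free)):
        for j in range(i + 1, len(free)):
            d = float(np.linalg.norm(free[i] - free[j]))
            if d < radius:                            # edge collision checks omitted for brevity
                g.add_edge(i, j, weight=d)

    goal_idx = min(g.nodes, key=lambda i: np.linalg.norm(g.nodes[i]["pos"] - goal))
    dist_to_goal = nx.single_source_dijkstra_path_length(g, goal_idx, weight="weight")
    return g, dist_to_goal

goal = np.array([0.9, 0.9])
obstacles = [(np.array([0.5, 0.5]), 0.1)]             # circular obstacles: (center, radius)
graph, dist_to_goal = build_state_graph(goal, obstacles)
```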
The key idea of our work is to integrate a graph-based structure with model-free reinforcement learning to enhance exploration in complex environments. This integration provides structured guidance for agents, addressing the common issue of poor exploration in cluttered or high-dimensional state spaces.
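Given such a graph, one way to turn it into a dense reward (again a hedged sketch, not the paper's code) is to reward the agent on each transition for reducing its graph-geodesic distance to the goal, added on top of the environment's sparse reward. The helper names and the scaling factor `k` are assumptions; the snippet reuses `graph` and `dist_to_goal` from the sketch above.

```python
# Hedged sketch of the dense-reward idea: bonus for progress along the graph toward the goal.
import numpy as np

def nearest_node_distance(state_xy, graph, dist_to_goal):
    """Graph distance to the goal from the node closest to the agent's position."""
    idx = min(dist_to_goal,
              key=lambda i: np.linalg.norm(graph.nodes[i]["pos"] - state_xy))
    return dist_to_goal[idx]

def dense_reward(prev_xy, next_xy, sparse_r, graph, dist_to_goal, k=1.0):
    """Environment's sparse reward plus a bonus for reducing graph distance to the goal."""
    progress = (nearest_node_distance(prev_xy, graph, dist_to_goal)
                - nearest_node_distance(next_xy, graph, dist_to_goal))
    return sparse_r + k * progress
```

Because this shaping term only reads the state, it can be dropped into the experience-collection loop of any model-free algorithm (e.g., SAC or PPO) without modifying the learner itself.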
Contributions: We propose a novel graph-based framework that is compatible with a wide range of model-free reinforcement learning algorithms to improve exploration efficiency. Compared to prior methods, our approach enables more complete coverage of the environment’s state space, thus fully utilizing the strengths of model-free RL in learning optimal policies under unknown dynamics. We provide a theoretical guarantee that our exploration strategy preserves the original RL objective and accelerates convergence. Moreover, our framework allows agents to generalize across arbitrary initial states without retraining or policy modification, making it practical for real-world deployment across diverse scenarios.
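On the claim that the exploration strategy preserves the original RL objective: the paper's proof is not reproduced here, but a standard way to obtain this kind of guarantee is potential-based reward shaping (Ng et al., 1999), where the shaping term takes the form gamma * Phi(s') - Phi(s) and adding it is known not to change the set of optimal policies. Whether the paper's reward takes precisely this form is our assumption; the sketch below only shows the shape of that construction, with Phi set to the negative graph distance from the earlier sketches.

```python
# Hedged sketch: potential-based shaping, assuming Phi(s) = -(graph distance to goal).
def shaped_reward(sparse_r, phi_prev, phi_next, gamma=0.99):
    """F(s, s') = gamma * Phi(s') - Phi(s); adding F leaves the optimal policies
    of the original MDP unchanged (Ng et al., 1999)."""
    return sparse_r + gamma * phi_next - phi_prev

# Illustrative potential built from the earlier sketches (an assumption, not the paper's code):
# phi = lambda xy: -nearest_node_distance(xy, graph, dist_to_goal)
```

The same graph lookup is consistent with the arbitrary-initial-state property: since the graph spans the entire state space, the potential is defined wherever an episode starts, so no retraining or policy modification is required.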
| Obstacles | Dynamic model | Baseline | Success rate |
| --- | --- | --- | --- |
| Static obstacles | Quadrotor | RRG | 100% |
| Static obstacles | Quadrotor | RRT | 100% |
| Static obstacles | Quadrotor | Binary | 0% |
| Dynamic obstacles | Vehicle | RRG | 100% |
| Dynamic obstacles | Vehicle | RRT | 0% |
| Dynamic obstacles | Vehicle | Binary | 0% |
@misc{luo2025bridgingdeepreinforcementlearning,
title={Bridging Deep Reinforcement Learning and Motion Planning for Model-Free Navigation in Cluttered Environments},
author={Licheng Luo and Mingyu Cai},
year={2025},
eprint={2504.07283},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2504.07283},
}