CPS-Small: Provable Enforcement of Hard Constraints in Reinforcement Learning-Based Controllers for Safety-Critical CPS
Lead PI:
Negar Mehr
Abstract
As autonomous systems like self-driving cars and delivery robots become more common, making sure that they operate safely is critical. These systems often learn how to act using reinforcement learning (RL). While powerful, RL methods typically do not guarantee safety, which limits their use in the real world. A common approach to capturing safety in RL is to enforce safety constraints. However, verifying that a learned RL controller never leads to any constraint violation is, in general, nontrivial because of the black-box nature of such controllers. In this project, we will develop new techniques that allow autonomous systems to learn effective behaviors while always respecting strict safety constraints. This research tackles a core challenge in making intelligent machines reliable and trustworthy. As such, the results of this project could impact areas like transportation, aerospace, and healthcare, where safety is non-negotiable. We will develop a formal framework for learning control policies that provably satisfy hard safety constraints, even when the system dynamics are unknown or treated as black boxes. We will develop RL-based methods that guarantee constraint satisfaction throughout training and deployment in three research thrusts: 1) we will enforce a single affine constraint of relative degree one by embedding an appropriate structure into the policy network to ensure constraint satisfaction by design; 2) we will extend this framework to handle multiple constraints and more general nonlinear safety specifications by lifting them into an augmented state space; 3) we will expand the methodology to handle constraints of higher relative degree, which require greater anticipatory control, and explore how to generalize the approach to cyber-physical systems with hybrid or non-smooth dynamics.
This progression allows us to systematically tackle increasingly complex safety requirements, enabling practical and reliable RL deployment in real-world systems. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
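To make thrust 1 concrete, the sketch below illustrates one way a policy can satisfy a single affine constraint of relative degree one "by design." This is an illustrative assumption, not the project's actual method: it uses toy 1-D single-integrator dynamics x' = u with the constraint x <= x_max, so the barrier h(x) = x_max - x has the control appearing directly in its derivative (relative degree one). A control-barrier-function-style condition dh/dt >= -alpha*h then reduces to a closed-form clamp on the action, which can serve as the policy's final layer. The names `safe_action`, `x_max`, and `alpha` are hypothetical.

```python
# Illustrative sketch (assumed setup, not the awarded project's method):
# a "safety layer" enforcing an affine constraint of relative degree one.
# Dynamics: x' = u.  Constraint: x <= x_max, i.e. h(x) = x_max - x >= 0.
# Since dh/dt = -u, the CBF-style condition dh/dt >= -alpha * h(x)
# becomes the closed-form clamp u <= alpha * h(x).

def safe_action(u_nom: float, x: float, x_max: float = 1.0, alpha: float = 2.0) -> float:
    """Project a nominal RL action onto the set satisfying dh/dt >= -alpha*h."""
    h = x_max - x                  # barrier value; h >= 0 means the state is safe
    return min(u_nom, alpha * h)   # clamp enforces the barrier condition

# Roll out an aggressive nominal policy; the filtered trajectory stays safe.
x, dt = 0.0, 0.01
for _ in range(2000):
    u = safe_action(u_nom=5.0, x=x)   # nominal policy always pushes toward the bound
    x += dt * u                       # Euler step of x' = u
    assert x <= 1.0 + 1e-9, "constraint violated"
print(f"final state x = {x:.4f} (constraint: x <= 1.0)")
```

Under this clamp the state approaches x_max asymptotically but never crosses it, regardless of how aggressive the learned nominal action is; because the filter is a simple differentiable min, it can in principle be embedded in the policy network itself.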
Performance Period: 01/01/2026 - 12/31/2028
Institution: University of California-Berkeley
Award Number: 2529645