This research project will study novel methods for designing sequential, non-myopic, online decision procedures for societal-scale cyber-physical systems, such as public transit, emergency response systems, and the power grid, which form the critical infrastructure of our communities. Online optimization of these systems entails taking actions that consider their tightly integrated spatial, temporal, and human dimensions while accounting for uncertainty caused by changes in the system and the environment. For example, operators of emergency response management (ERM) systems must optimally dispatch ambulances and help trucks to respond to incidents while accounting for traffic pattern changes and road closures. Similarly, public transportation agencies operating electric vehicles must schedule those vehicles to meet expected travel demand while choosing charging schedules that account for the overall grid load. The project's proposed approach focuses on designing a modular and reusable online decision-making pipeline that combines the advantages of online planning methods, such as Monte-Carlo Tree Search, with those of offline policy learning methods, such as reinforcement learning, and promises faster convergence and robustness to changes in the environment. The research activities of the proposed project are complemented by educational activities focused on designing cloud-based teaching environments that can help students and operators who have the prerequisite domain and statistical knowledge to design, manage, and experiment with decision procedures.
The societal-scale cyber-physical systems (CPS) that we study have spatial-temporal properties. The spatial aspect refers to location-specific state variables, such as traffic congestion, transportation demand, and the frequency with which incidents occur at a location. The temporal aspect refers to the dynamic nature of these systems; for example, traffic congestion evolves over time. Non-myopic decisions entail selecting actions over time under uncertainty while accounting for their future impact and the future demand for resources. The combined research and education efforts proposed in the project focus on answering the following critical questions for these systems. First, how do we sample future states and environmental actions across a high-dimensional space while also handling non-stationarity? Second, how do we achieve robust, fast, non-myopic planning that also handles potential non-stationarity? Third, how do we engage non-computer science students and community partners with the solutions built using the approaches pioneered in the project? The proposed approach involves investigating novel machine learning methods, such as normalizing flows for designing generative models, and designing planning algorithms based on a policy-augmented hybrid Monte-Carlo Tree Search. A significant effort of the project will focus on complementing fundamental research with the design of a cloud-based visual domain-specific modeling environment that uses a block-based compositional approach to help explain the design, operation, and introspection of these methods. The work will be augmented with course modules and online tutorials accompanying the cloud-based environment.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.