Consider two future autonomous system use cases: (i) a bomb-defusing rover sent into an unfamiliar, GPS- and communication-denied environment (e.g., a cave or mine), tasked with locating and defusing an improvised explosive device, and (ii) an autonomous racing drone competing in a future autonomous incarnation of the Drone Racing League. Both systems will make decisions based on inputs from a combination of simple, single-output sensing devices, such as inertial measurement units, and complex, high-dimensional sensing modalities, such as cameras and LiDAR. This shift from relying only on simple, single-output sensing devices to systems that incorporate rich, complex perceptual sensing modalities requires rethinking the design of safety-critical autonomous systems, especially given the inextricable role that machine and deep learning play in the design of modern perceptual sensors. These two motivating examples raise an even more fundamental question, however: given the vastly different dynamics, environments, objectives, and safety/risk constraints, should these two systems have perceptual sensors with different properties? Indeed, due to the extremely safety-critical nature of the bomb-defusing task, an emphasis on robustness, risk aversion, and safety seems necessary. Conversely, the designer of the drone racer may be willing to sacrifice robustness to maximize responsiveness and lower lap times. This extreme diversity in requirements highlights the need for a principled approach to navigating tradeoffs in this complex design space, which this proposal seeks to develop. Existing approaches to designing perception/action pipelines are either modular, which often ignore uncertainty and limit interaction between components, or monolithic and end-to-end, which are difficult to interpret and troubleshoot and have high sample complexity.
This project proposes an alternative approach that rethinks the scientific foundations of using machine learning and computer vision to process rich, high-dimensional perceptual data for use in safety-critical cyber-physical control applications. The research thrusts will develop integration between perception, planning, and control that allows for their co-design and co-optimization. Using novel robust learning methods for perceptual representations and predictive models that characterize tradeoffs between robustness (e.g., to lighting and weather changes, rotations) and performance (e.g., responsiveness, discriminativeness), jointly learned perception maps and uncertainty profiles will be abstracted as ``noisy virtual sensors'' for use in uncertainty-aware perception-based planning and control algorithms with stability, performance, and safety guarantees. These insights will be integrated into novel perception-based model predictive control algorithms, which allow for planning, stability, and safety guarantees through a unifying optimization-based framework acting on rich perceptual data. Experimental validation of the benefits of these methods will be conducted at Penn using photorealistic simulations and physical camera-equipped quadcopters, and will be used to demonstrate perception-based planning and control algorithms at the extremes of the speed/safety tradeoff. On the educational front, the research outcomes of this proposal will be used to develop a sequence of courses on safe autonomy, safe perception, and learning and control at the University of Pennsylvania. Longer term, the goal of this project is to create a new community of researchers focused on robust learning for perception-based control. Towards this goal, departmental efforts will be leveraged to increase and diversify the pool of PhD students working on this project.
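As a minimal illustrative sketch of the ``noisy virtual sensor'' abstraction (with notation assumed for exposition rather than drawn from the proposal), a learned perception map that returns a state estimate with a bounded error profile can be folded into a robust model predictive control problem whose constraints are tightened by the learned uncertainty:

% Minimal sketch, assumed notation: the learned perception map p acts as a noisy
% virtual sensor with error bound eps supplied by its learned uncertainty profile,
% and the MPC constraints are tightened accordingly. Requires amsmath; \ominus
% denotes the Pontryagin (set) difference.
\begin{align*}
  \hat{x}_t &= p(z_t) = x_t + e_t, \quad \|e_t\| \le \varepsilon
      && \text{perception map as a noisy virtual sensor}\\
  \min_{u_{0:T-1}} \; & \sum_{t=0}^{T-1} \ell(\hat{x}_t, u_t)
      && \text{performance objective (e.g., responsiveness, lap time)}\\
  \text{s.t.} \quad & \hat{x}_{t+1} = f(\hat{x}_t, u_t), \quad u_t \in \mathcal{U}
      && \text{nominal dynamics and input constraints}\\
  & \hat{x}_t \in \mathcal{X} \ominus \mathcal{B}_\varepsilon
      && \text{state constraints tightened by the uncertainty profile}
\end{align*}

In a sketch of this form, the size of the error bound is one knob for navigating the speed/safety tradeoff described above: a larger bound (more robustness) yields more conservative behavior, while a tighter bound favors responsiveness.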
Performance Period: 09/15/2020 - 08/31/2024
Institution: University of Pennsylvania
Sponsor: National Science Foundation
Award Number: 2038873