Multi-agent coordination and collaboration is a core challenge for future cyber-physical systems as they begin to interact in more complex ways with each other or with humans in homes or cities. A key difficulty is that agents must be able to reason about and learn the behavior of other agents in order to make decisions. This is particularly challenging because state-of-the-art approaches, such as recursive belief modeling over partner policies, often do not scale. Humans, however, coordinate and collaborate with each other very effectively without any expensive recursive belief modeling. One hypothesis is that humans can effectively capture the sufficient representations required for coordinating on tasks. Similarly, the agents in a multi-agent setting can look for the sufficient statistics needed for coordination and collaboration. This project is about learning and approximating such sufficient statistics to enable effective collaboration and coordination. In addition, the investigators will study teaching and learning in settings where the agents have only partial observations of the world and need to teach and learn from each other in order to achieve a collaborative task.
Prominent successful demonstrations of reinforcement learning for single agents have spurred efforts to determine whether such methods can extend to multiple agents. There have also been notable developments in the area of multi-agent systems, both in understanding the structure of the resulting interacting dynamics and in the development of practical reinforcement learning algorithms. The core objectives of this project are: 1) the development of learning methods that approximate the well-known concept of sufficient statistics in multi-agent interactions; 2) the development of a reinforcement learning algorithm that leverages the representations of sufficient statistics for more effective planning, coordination, and collaboration in multi-agent settings; and 3) the development of algorithms that use the representations of sufficient statistics to enable teaching and learning in multi-agent settings under partial observation of the environment. The overall outcome of this project will be a new formalism along with algorithms, tools, and techniques that enhance multi-agent learning and control. The investigators will ground this work in two main applications: 1) collaborative search and exploration and 2) collaborative transport of objects.
Abstract
Performance Period: 09/15/2021 - 08/31/2024
Institution: Stanford University
Award Number: 2125511