Secure Control in Partially Observable Environments to Satisfy LTL Specifications
This paper studies the synthesis of control
policies for an agent that has to satisfy a temporal logic
specification in a partially observable environment, in
the presence of an adversary. The interaction of the
agent (defender) with the adversary is modeled as a partially
observable stochastic game. The goal is to generate
a defender policy to maximize satisfaction of a given
temporal logic specification under any adversary policy.
The search for policies is limited to the space of finite
state controllers, which leads to a tractable approach
to determine policies. We relate the satisfaction of the
specification to reaching (a subset of) recurrent states of a
Markov chain. We present an algorithm to determine a set
of defender and adversary finite state controllers of fixed
sizes that will satisfy the temporal logic specification, and
prove that it is sound. We then propose a value-iteration
algorithm to maximize the probability of satisfying the
temporal logic specification under finite state controllers
of fixed sizes. Lastly, we extend this setting to the scenario
where the size of the finite state controller of the defender
can be increased to improve the satisfaction probability.
We illustrate our approach with an example.