A Hybrid Game Theory and Reinforcement Learning Approach for Cyber-Physical Systems Security

Title: A Hybrid Game Theory and Reinforcement Learning Approach for Cyber-Physical Systems Security
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Khoury, J., Nassar, M.
Conference Name: NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium
Keywords: CPS, CPS network, cps security, cyber-attacks, Cyber-physical systems, Cyber-physical systems security, Damage Assessment, game theory, hybrid game theory, ICs, invasive software, learning (artificial intelligence), malware author, Multi-Agent Reinforcement Learning, multi-agent systems, multiagent reinforcement learning, Nash equilibrium, pubcrawl, resilience, Resiliency, SCADA systems, supervisory control and data acquisition systems, Virus Spreading
Abstract: Cyber-Physical Systems (CPS) are monitored and controlled by Supervisory Control and Data Acquisition (SCADA) systems that use advanced computing, sensors, control systems, and communication networks. Initially, CPS and SCADA systems were protected and secured by isolation. However, with recent advances in industrial technology, the increased connectivity of CPS and SCADA systems to enterprise networks has exposed them to new cybersecurity threats and made them a primary target for cyber-attacks with the potential to cause catastrophic economic, social, and environmental damage. Recent research focuses on new methodologies for risk modeling and assessment using game theory and reinforcement learning. This paper proposes to frame CPS security on two different levels, strategic and battlefield, by combining ideas from game theory and Multi-Agent Reinforcement Learning (MARL). The strategic level is modeled as an imperfect-information, extensive-form game. Here, the human administrator and the malware author decide on the strategies of defense and attack, respectively. At the battlefield level, strategies are implemented by machine learning agents that derive optimal policies for run-time decisions. The outcomes of these policies manifest as the utility at the strategic level, where we aim to reach a Nash Equilibrium (NE) in favor of the defender. We simulate the scenario of a virus spreading in the context of a CPS network. We present experiments using the MiniCPS simulator and the OpenAI Gym toolkit and discuss the results.
DOI: 10.1109/NOMS47738.2020.9110453
Citation Key: khoury_hybrid_2020
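
The two-level idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it flattens the strategic level into a small simultaneous-move game (a simplification of the imperfect-information, extensive-form game the abstract describes), and the strategy names and utility values are hypothetical placeholders for the outcomes that battlefield-level agents would produce. The function checks each strategy pair for a pure-strategy Nash equilibrium by testing whether either player could gain by deviating alone.

```python
from itertools import product

# Hypothetical strategy names; the abstract does not list the actual strategy sets.
DEFENDER = ["patch_often", "monitor_only"]
ATTACKER = ["fast_spread", "stealthy_spread"]

# Hypothetical utilities (defender, attacker). In the approach described by the
# abstract, such values would come from battlefield-level policy outcomes;
# here they are made-up numbers for illustration only.
PAYOFFS = {
    ("patch_often", "fast_spread"): (3, -3),
    ("patch_often", "stealthy_spread"): (1, -1),
    ("monitor_only", "fast_spread"): (-4, 4),
    ("monitor_only", "stealthy_spread"): (0, 0),
}


def pure_nash_equilibria(defender, attacker, payoffs):
    """Return all strategy pairs where neither player gains by deviating alone."""
    equilibria = []
    for d, a in product(defender, attacker):
        u_d, u_a = payoffs[(d, a)]
        # Defender cannot improve by switching strategy while the attacker stays put.
        defender_ok = all(payoffs[(d2, a)][0] <= u_d for d2 in defender)
        # Attacker cannot improve by switching strategy while the defender stays put.
        attacker_ok = all(payoffs[(d, a2)][1] <= u_a for a2 in attacker)
        if defender_ok and attacker_ok:
            equilibria.append((d, a))
    return equilibria


if __name__ == "__main__":
    for d, a in pure_nash_equilibria(DEFENDER, ATTACKER, PAYOFFS):
        print(d, a, PAYOFFS[(d, a)])
```

With these placeholder numbers, the only pure-strategy equilibrium pairs `patch_often` with `stealthy_spread`, and the defender's utility there is positive, which is the kind of defender-favoring equilibrium the abstract aims for. A game without a pure-strategy equilibrium would need a mixed-strategy solver instead.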