
Efficient Solving Methods for POMDP-based Threat Defense Environments on Bayesian Attack Graphs

Hampus Ramström ; Johan Backman
Gothenburg : Chalmers University of Technology, 2018. 82 pp.
[Degree project, advanced level]

In this work, we show how to formulate a threat defense environment as a Partially Observable Markov Decision Process (POMDP) that allows for fast approximate defense algorithms against multiple attackers. This is done through an action extension, coined the Inspect action, which allows the agent to reveal the true state of the environment, thereby reducing the problem to a traditional Markov Decision Process (MDP) for the current time-step. The work extends previous definitions of the same problem. Based on the new definition, we define and show the optimal policy, as well as two new solving algorithms, n-Myopic and n-Lookahead. To evaluate their performance, we compare the results of these new algorithms with those of more standard solving algorithms, such as Q-learning and Policy Gradients. The experimental results show that the new algorithms perform better than previous attempts and allow for larger-scale threat environments thanks to the approximate MDP reduction. Additionally, to facilitate future research, two OpenAI Gym environments were developed and are publicly available for new research to build upon. We encourage new research with a similar problem description to use this software library, opening the way to standardized performance results.
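
The role of the Inspect action can be illustrated with a minimal Python sketch. It is not the thesis's Gym environment: the class ToyThreatEnv, its parameters (p_attack, p_detect, inspect_cost), the attack dynamics, and the reward are all hypothetical, chosen only to show how one extra action can replace a noisy observation with the true state for the current time-step.

```python
import random


class ToyThreatEnv:
    """Hypothetical toy environment (not the thesis's definition).

    State: one boolean per node, True if the node is compromised.
    Actions 0..n_nodes-1 defend (clean) a node; action n_nodes is Inspect.
    """

    def __init__(self, n_nodes=4, p_attack=0.3, p_detect=0.6,
                 inspect_cost=0.5, seed=0):
        self.n_nodes = n_nodes
        self.p_attack = p_attack          # assumed per-node attack success rate
        self.p_detect = p_detect          # assumed detection rate of the sensor
        self.inspect_cost = inspect_cost  # assumed price of inspecting
        self.inspect = n_nodes            # index of the Inspect action
        self.rng = random.Random(seed)
        self.state = [False] * n_nodes

    def _noisy_observation(self):
        # A compromised node is reported only with probability p_detect;
        # clean nodes are never reported (no false positives in this toy).
        return [c and self.rng.random() < self.p_detect for c in self.state]

    def step(self, action):
        # Defender acts first: a defend action cleans the chosen node.
        if action != self.inspect:
            self.state[action] = False
        # Attacker then tries to compromise every clean node.
        self.state = [c or self.rng.random() < self.p_attack
                      for c in self.state]
        # Inspect reveals the true state; any other action yields a noisy view.
        if action == self.inspect:
            obs = list(self.state)
            cost = self.inspect_cost
        else:
            obs = self._noisy_observation()
            cost = 0.0
        reward = -float(sum(self.state)) - cost
        return obs, reward


env = ToyThreatEnv(seed=1)
noisy_obs, _ = env.step(0)           # partial, possibly incomplete view
true_obs, _ = env.step(env.inspect)  # full state for this time-step
print(noisy_obs, true_obs)
```

After an Inspect step the agent knows the state exactly, so for that time-step it can plan as in an ordinary MDP; how often to pay for inspection is the trade-off that the sketch's inspect_cost stands in for.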

Keywords: Reinforcement Learning, POMDP, Bayesian Attack Graphs, Security, Defense Policies, OpenAI Gym, Threat Defense



The publication was registered on 2018-12-14. It was last modified on 2018-12-14.

CPL ID: 256400

This is a service from Chalmers Library