Learning Deceptive Strategies in Adversarial Settings: A Two-Player Game with Asymmetric Information
2025
Sai Krishna Reddy Mareddy | Dipankar Maity
This study explores strategic deception and counter-deception in multi-agent reinforcement learning environments for a <i>police officer–robber</i> game. The research is motivated by real-world scenarios where agents must operate with partial observability and adversarial intent. We develop a suite of progressively complex grid-based environments featuring dynamic goals, fake targets, and navigational obstacles. Agents are trained using deep Q-networks (DQNs) with game-theoretic reward shaping to encourage deceptive behavior in the <i>robber</i> and intent inference in the <i>police officer</i>. The <i>robber</i> learns to reach the true goal while misleading the <i>police officer</i>, and the <i>police officer</i> adapts to infer the <i>robber</i>’s intent and allocate resources effectively. The environments include fixed and dynamic layouts with varying numbers of goals and obstacles, allowing us to evaluate scalability and generalization. Experimental results demonstrate that the agents converge to equilibrium-like behaviors across all settings. The inclusion of obstacles increases complexity but also strengthens learned policies when guided by reward shaping. We conclude that integrating game theory with deep reinforcement learning enables the emergence of robust, deceptive strategies and effective counter-strategies, even in dynamic, high-dimensional environments. This work advances the design of intelligent agents capable of strategic reasoning under uncertainty and adversarial conditions.
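The abstract describes game-theoretic reward shaping that rewards the <i>robber</i> for misleading the <i>police officer</i> while pursuing the true goal. The sketch below is a minimal illustration of that idea, not the paper's implementation: it uses tabular Q-learning in place of the paper's deep Q-networks, and all names, grid dimensions, and reward constants (the capture penalty, goal payoff, and deception-lure weight) are hypothetical choices for illustration.

```python
import random

GRID = 5
TRUE_GOAL, FAKE_GOAL = (4, 4), (0, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def shaped_reward(robber, police, true_goal, fake_goal, w_deceive=0.1):
    """Illustrative shaping for the robber: terminal payoffs plus a small
    'lure' bonus for appearing closer to the fake goal than the true one."""
    if robber == police:
        return -10.0                      # captured by the police officer
    if robber == true_goal:
        return 10.0                       # reached the true goal
    lure = manhattan(robber, true_goal) - manhattan(robber, fake_goal)
    return w_deceive * lure - 0.01        # deception bonus minus a step cost

def police_step(police, robber):
    """Greedy Manhattan pursuit: close the wider axis gap first."""
    px, py = police
    rx, ry = robber
    if abs(rx - px) >= abs(ry - py) and rx != px:
        px += 1 if rx > px else -1
    elif ry != py:
        py += 1 if ry > py else -1
    return (px, py)

def train(episodes=300, alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning for the robber against the greedy police officer."""
    rng = random.Random(seed)
    q = {}  # state (robber, police) -> list of 4 action values
    for _ in range(episodes):
        robber, police = (0, 0), (2, 2)
        for _ in range(40):               # per-episode step cap
            state = (robber, police)
            qs = q.setdefault(state, [0.0] * 4)
            a = rng.randrange(4) if rng.random() < eps else qs.index(max(qs))
            dx, dy = ACTIONS[a]
            robber = (min(max(robber[0] + dx, 0), GRID - 1),
                      min(max(robber[1] + dy, 0), GRID - 1))
            police = police_step(police, robber)
            r = shaped_reward(robber, police, TRUE_GOAL, FAKE_GOAL)
            nqs = q.setdefault((robber, police), [0.0] * 4)
            qs[a] += alpha * (r + gamma * max(nqs) - qs[a])
            if robber == police or robber == TRUE_GOAL:
                break                     # episode ends on capture or success
    return q

q_table = train()
```

In the paper's setting, the tabular dictionary would be replaced by a DQN approximating Q-values over the grid observation, and the <i>police officer</i> would itself be a learning agent inferring intent rather than a fixed greedy chaser; the shaping term is the part this sketch is meant to illustrate.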
Bibliographic information
This bibliographic record was provided by the Directory of Open Access Journals.