Sensor and Signal Processing Section
Journal of the Korea Institute of Military Science and Technology 2020;23(4):337-345.
DOI: https://doi.org/10.9766/KIMST.2020.23.4.337    Published online August 5, 2020.
Mean Field Game based Reinforcement Learning for Weapon-Target Assignment
Min Kyu Shin, Soon-Seo Park, Daniel Lee, Han-Lim Choi
Department of Aerospace Engineering, Korea Advanced Institute of Science and Technology, Korea
Korean title (translated): Weapon-Target Assignment through Mean Field Game-Based Reinforcement Learning
Correspondence: Min Kyu Shin
Email: mkshin@lics.kaist.ac.kr
Received: 14 April 2020; Revised: 10 June 2020; Accepted: 26 June 2020
The Weapon-Target Assignment (WTA) problem can be formulated as an optimization problem that minimizes the threat posed by targets. Existing methods trade off optimality against execution time to meet various mission objectives. We propose a multi-agent reinforcement learning algorithm for WTA, based on mean field games, that solves the problem in real time with near-optimal accuracy. Mean field games are a recent approach introduced to relieve the curse of dimensionality in multi-agent learning algorithms. In addition, previous reinforcement learning models for WTA generally do not consider weapon interference, which may be critical in real-world operations. We therefore modify the reward function to discourage weapon trajectories from crossing. The feasibility of the proposed method was verified through simulation of a WTA problem with multiple targets in real time, in which the proposed algorithm assigned weapons to all targets without crossing weapon trajectories.
Key Words: Weapon-Target Assignment Problem, Multi-Agent Reinforcement Learning, Mean Field Game
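
The abstract does not give the objective or the modified reward, so the following is only an illustrative sketch under stated assumptions. It uses the common textbook static WTA objective (expected surviving threat, with each weapon assigned to exactly one target) and assumes straight-line 2D trajectories with a fixed penalty per crossing pair. The names `expected_threat`, `count_crossings`, `reward`, and the `penalty` parameter are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def expected_threat(assignment, kill_prob, threat):
    """Expected surviving threat of a static WTA instance (textbook form).

    assignment[i] is the target index chosen by weapon i,
    kill_prob[i][j] is the assumed kill probability of weapon i on target j,
    threat[j] is the assumed threat value of target j.
    """
    surviving = list(threat)
    for weapon, target in enumerate(assignment):
        surviving[target] *= 1.0 - kill_prob[weapon][target]
    return sum(surviving)


def _orient(a, b, c):
    """Signed area test: > 0 if c lies to the left of the line a -> b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1, p2, q1, q2):
    """True if segments p1-p2 and q1-q2 properly intersect.

    Shared endpoints and collinear overlaps are deliberately not counted.
    """
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def count_crossings(assignment, weapon_pos, target_pos):
    """Number of weapon pairs whose straight-line paths cross (assumed model)."""
    paths = [(weapon_pos[i], target_pos[t]) for i, t in enumerate(assignment)]
    return sum(
        segments_cross(a[0], a[1], b[0], b[1]) for a, b in combinations(paths, 2)
    )


def reward(assignment, kill_prob, threat, weapon_pos, target_pos, penalty=1.0):
    """Hypothetical shaped reward: lower threat is better, crossings are penalized."""
    crossings = count_crossings(assignment, weapon_pos, target_pos)
    return -expected_threat(assignment, kill_prob, threat) - penalty * crossings
```

With two weapons and two targets placed at the corners of a unit square, the "diagonal" assignment produces one crossing and therefore a lower reward than the "parallel" assignment, even when both reduce the threat equally. A learned policy trained on such a signal would be nudged toward non-crossing assignments, which is the qualitative behavior the abstract describes.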

