Time-varying Proportional Navigation Guidance using Deep Reinforcement Learning
Hyeok-Joo Chae¹, Daniel Lee¹, Su-Jeong Park¹, Han-Lim Choi¹, Han-Sol Park², Kyeong-Soo An²
¹Department of Aerospace Engineering, Korea Advanced Institute of Science and Technology, Korea
²Avionics R&D Center, Hanwha Systems, Korea
Correspondence: Han-Lim Choi, Email: hanlimc@kaist.ac.kr
Received: 11 April 2020; Revised: 29 May 2020; Accepted: 26 June 2020
Abstract
In this paper, we propose a time-varying proportional navigation guidance law that determines the proportional navigation gain in real time according to the operating situation. When intercepting a target, an unidentified evasion strategy causes a loss of optimality. To compensate for this loss, a proper proportional navigation gain is derived at every time step by solving an optimal control problem with the inferred strategy of the evader. Deep reinforcement learning algorithms have recently been introduced to deal efficiently with complex optimal control problems. We adapt the actor-critic method to build a proportional navigation gain network, and the network is trained with the Proximal Policy Optimization (PPO) algorithm to learn the evasion strategy of the target. Numerical experiments show the effectiveness and optimality of the proposed method.
Key Words: Pursuit-Evasion Game, Proportional Navigation Guidance, Reinforcement Learning
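
To make the idea of a time-varying gain concrete, the following is a minimal planar sketch. It assumes the textbook proportional navigation form, in which the commanded lateral acceleration is the gain times the closing speed times the line-of-sight rate; the exact variant used in the paper may differ. The function names and `example_gain_schedule` are hypothetical placeholders introduced here for illustration. In particular, the schedule is a hand-written stand-in, not the trained PPO gain network.

```python
import math


def los_rate_and_closing_speed(rel_pos, rel_vel):
    """Return (line-of-sight rate, closing speed) for a planar engagement.

    rel_pos and rel_vel are (x, y) tuples of target minus pursuer
    position and velocity. Units are assumed consistent (e.g. m, m/s).
    """
    rx, ry = rel_pos
    vx, vy = rel_vel
    r2 = rx * rx + ry * ry
    r = math.sqrt(r2)
    # Time derivative of the line-of-sight angle atan2(ry, rx).
    los_rate = (rx * vy - ry * vx) / r2
    # Closing speed is the negative of the range rate.
    closing_speed = -(rx * vx + ry * vy) / r
    return los_rate, closing_speed


def pn_command(rel_pos, rel_vel, gain):
    """Textbook PN command: a = N * V_c * (LOS rate)."""
    los_rate, closing_speed = los_rate_and_closing_speed(rel_pos, rel_vel)
    return gain * closing_speed * los_rate


def example_gain_schedule(t):
    """Hypothetical time-varying gain: ramps from 3 to 4 over 10 s.

    Assumption for illustration only; in the paper the gain is produced
    by a learned network, not by a fixed schedule like this one.
    """
    return 3.0 + 1.0 * min(t / 10.0, 1.0)


if __name__ == "__main__":
    rel_pos = (1000.0, 0.0)   # target 1000 m ahead of the pursuer
    rel_vel = (-300.0, 50.0)  # closing at 300 m/s with 50 m/s cross-range drift
    for t in (0.0, 5.0, 10.0):
        n = example_gain_schedule(t)
        print(t, n, pn_command(rel_pos, rel_vel, n))
```

With a constant gain this reduces to conventional proportional navigation; replacing the schedule with a function of the observed engagement state is the point at which a learned gain policy would plug in.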