# Biblio

This paper investigates the decentralized dynamic power allocation problem for mobile ad hoc networks (MANETs) at the tactical edge. Because of the mobility and self-organizing nature of MANETs and the environmental uncertainties of the battlefield, many existing optimal power allocation algorithms are neither efficient nor practical. Furthermore, the continuously growing number of wireless connections in the emerging Internet of Battlefield Things (IoBT) introduces additional challenges for optimal power allocation due to the "curse of dimensionality". To address these challenges, a novel Actor-Critic-Mass algorithm is proposed that integrates mean field game theory with online reinforcement learning. The proposed approach not only learns the optimal power allocation for the IoBT in a decentralized manner, but also effectively handles uncertainties arising from the harsh environment at the tactical edge. In the developed scheme, each agent in the IoBT maintains three neural networks (NNs): 1) a Critic NN that learns the optimal cost function, which minimizes the signal-to-interference-plus-noise ratio (SINR) error, 2) an Actor NN that estimates the optimal transmitter power adjustment rate, and 3) a Mass NN that learns the probability density function of all agents' transmit power in the IoBT. The three NNs are tuned using the Fokker-Planck-Kolmogorov (FPK) and Hamilton-Jacobi-Bellman (HJB) equations from mean field game theory. An IoBT wireless network is simulated to evaluate the effectiveness of the proposed algorithm. The results demonstrate that the Actor-Critic-Mass algorithm can effectively approximate the probability distribution of all agents' transmit power and drive the SINR to its target value. Moreover, the optimal decentralized power allocation is obtained by integrating mean field game theory with reinforcement learning.
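To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It computes the per-link SINR from a set of transmit powers and builds a histogram estimate of the transmit power density, which is the kind of distribution a Mass NN would be trained to approximate. The gain matrix layout, the noise power, the power limit `p_max`, and both function names are assumptions introduced here for illustration.

```python
import numpy as np

def sinr(power, gain, noise):
    """Per-link SINR for n transmitter-receiver pairs.

    Assumption: gain[i, j] is the channel gain from transmitter j
    to receiver i, and link i is the pair (transmitter i, receiver i).
    """
    power = np.asarray(power, dtype=float)
    gain = np.asarray(gain, dtype=float)
    received = gain * power          # received[i, j] = gain[i, j] * power[j]
    signal = np.diag(received)       # desired signal on each link
    interference = received.sum(axis=1) - signal
    return signal / (interference + noise)

def empirical_power_density(power, bins, p_max):
    """Histogram estimate of the transmit power density over [0, p_max].

    Assumption: every power value lies inside [0, p_max].
    """
    density, edges = np.histogram(
        power, bins=bins, range=(0.0, p_max), density=True
    )
    return density, edges

# Hypothetical two-link example.
powers = [1.0, 2.0]
gains = [[1.0, 0.1],
         [0.2, 1.0]]
link_sinr = sinr(powers, gains, noise=0.1)
density, edges = empirical_power_density(powers, bins=4, p_max=4.0)
```

In a decentralized setting an agent cannot observe every other agent's power directly, which is why the abstract describes learning the density rather than computing it from the full power vector as this sketch does.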