Resource Allocation in Uplink NOMA-IoT Networks: A Reinforcement-Learning Approach

Abstract

Non-orthogonal multiple access (NOMA) exploits the power domain to enhance connectivity for the Internet of Things (IoT). Because communication channels vary over time, dynamic user clustering is a promising method for increasing the throughput of NOMA-IoT networks. This paper develops an intelligent resource allocation scheme for uplink NOMA-IoT communications. To maximize the average sum rate, this work designs an efficient optimization approach based on two reinforcement learning algorithms, namely deep reinforcement learning (DRL) and SARSA-learning. For light traffic, SARSA-learning is used to explore the safest resource allocation policy at low cost. For heavy traffic, DRL is used to handle the large number of variables that the traffic introduces. With this approach, the work addresses two main problems of fair resource allocation in NOMA: 1) allocating users dynamically and 2) balancing resource blocks and network traffic. We analytically demonstrate that the rate of convergence is inversely proportional to the network size. Numerical results show that: 1) compared with the optimal benchmark scheme, the proposed DRL and SARSA-learning algorithms achieve high accuracy with low complexity and 2) NOMA-enabled IoT networks outperform conventional orthogonal multiple access (OMA) based IoT networks in terms of system throughput.

Existing System

• Most existing work on resource allocation assumes that the amount of harvested energy is known or that traffic loads are predictable, information that is hard to obtain in practical wireless networks.
• Power-domain non-orthogonal multiple access (PD-NOMA) technologies can dramatically improve system capacity and spectrum efficiency.
• Unlike existing NOMA scheduling, which mainly focuses on fairness, this paper proposes a power control solution for uplink hybrid OMA and PD-NOMA in dual dynamic environments: dynamic and imperfect channel information together with random, user-specific, hierarchical quality of service (QoS).
• This paper also transforms the hierarchical QoS constraint under NOMA successive interference cancellation (SIC) so that it fits the DRL framework.
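The capacity gain of power-domain NOMA with SIC can be illustrated with a small calculation. The following is a minimal sketch, not the model used in the paper: it assumes a single resource block, perfect SIC at the base station with the strongest received signal decoded first, and an OMA baseline in which each user gets an equal share of the block. The function names and the received-power values are hypothetical.

```python
import math


def uplink_noma_rates(rx_powers, noise_power):
    """Per-user rates (bit/s/Hz) under perfect SIC, strongest signal decoded first."""
    order = sorted(range(len(rx_powers)), key=lambda k: rx_powers[k], reverse=True)
    rates = [0.0] * len(rx_powers)
    remaining = sum(rx_powers)
    for k in order:
        # Signals that have not been decoded yet act as interference.
        remaining -= rx_powers[k]
        rates[k] = math.log2(1.0 + rx_powers[k] / (remaining + noise_power))
    return rates


def oma_rates(rx_powers, noise_power):
    """Per-user rates when each user gets an equal share of the resource block."""
    share = 1.0 / len(rx_powers)
    return [share * math.log2(1.0 + p / noise_power) for p in rx_powers]


# Assumed received powers (transmit power times channel gain), noise normalized to 1.
rx = [8.0, 2.0]
print("NOMA sum rate:", sum(uplink_noma_rates(rx, 1.0)))
print("OMA sum rate: ", sum(oma_rates(rx, 1.0)))
```

With these assumed values the NOMA sum rate is log2(11), about 3.46 bit/s/Hz, against about 2.38 bit/s/Hz for the equal-share OMA baseline.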

Disadvantages

• The optimization of clustering is an NP-hard problem.
• This is because traditional approaches cannot, by default, extract knowledge from a given problem (e.g., given distributions) online.
• Combining multi-user relationships with resource allocation increases the complexity of NOMA-IoT systems and introduces new problems for optimizing power allocation and scheduling schemes.
• Therefore, given the high complexity of the problem in multi-cell, multi-user cases, AI can be a feasible option for dynamic resource allocation.
• We propose two RL techniques, namely SARSA-learning with ε-greedy exploration and DRL, to solve this long-term optimization problem.
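The SARSA-learning update with ε-greedy exploration mentioned above can be sketched in a few lines. This is a generic tabular SARSA sketch under stated assumptions, not the authors' algorithm: the two-state environment `toy_step`, the reward definition, and all hyperparameter values are hypothetical stand-ins for the channel states, clustering actions, and sum-rate reward of a real NOMA-IoT system.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def sarsa(step, n_states, n_actions, episodes=500, horizon=20,
          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular SARSA: on-policy TD control using the action actually taken next."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = rng.randrange(n_states)
        action = epsilon_greedy(q[state], epsilon, rng)
        for _ in range(horizon):
            next_state, reward = step(state, action, rng)
            next_action = epsilon_greedy(q[next_state], epsilon, rng)
            td_target = reward + gamma * q[next_state][next_action]
            q[state][action] += alpha * (td_target - q[state][action])
            state, action = next_state, next_action
    return q


def toy_step(state, action, rng):
    """Hypothetical environment: reward 1 when the chosen cluster matches the channel state."""
    reward = 1.0 if action == state else 0.0
    return rng.randrange(2), reward


q_table = sarsa(toy_step, n_states=2, n_actions=2)
policy = [row.index(max(row)) for row in q_table]
print("Greedy action per state:", policy)
```

Because SARSA bootstraps from the action the ε-greedy policy actually selects, it learns the value of the exploring policy itself, which is commonly cited as the reason it tends toward more conservative policies than off-policy Q-learning.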

Proposed System

• By adopting deep neural networks, the proposed scheme can work effectively in systems with large state and action spaces.
• Prior work has proposed joint resource allocation and transmission mode selection to maximize the secrecy rate in cognitive radio networks.
• Compared with a POMDP scheme, the proposed scheme relieves the complexity of formulation and computation regardless of the dynamic properties of the environment.
• In the proposed scheme, a deep neural network is trained to obtain the optimal policy, under which the system reward converges to the optimal value.
• We investigate the performance of uplink NOMA systems using the proposed scheme.
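The idea of training a neural network to approximate action values can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' architecture: it uses a single hidden layer, a standard Q-learning target, and plain Python lists instead of a deep-learning library. The class name `TinyQNetwork`, the layer sizes, and the learning rate are hypothetical, and practical DRL additions such as experience replay and a target network are omitted.

```python
import math
import random


class TinyQNetwork:
    """One-hidden-layer network mapping a state vector to one Q-value per action."""

    def __init__(self, n_inputs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_actions)]
        self.b2 = [0.0] * n_actions

    def forward(self, state):
        """Return the hidden activations and the Q-values for a state."""
        hidden = [math.tanh(sum(w * x for w, x in zip(row, state)) + b)
                  for row, b in zip(self.w1, self.b1)]
        q = [sum(w * h for w, h in zip(row, hidden)) + b
             for row, b in zip(self.w2, self.b2)]
        return hidden, q

    def q_values(self, state):
        return self.forward(state)[1]

    def td_update(self, state, action, reward, next_state, gamma=0.9, lr=0.05):
        """One semi-gradient Q-learning step on 0.5 * (Q(s, a) - target)^2."""
        target = reward + gamma * max(self.q_values(next_state))
        hidden, q = self.forward(state)
        error = q[action] - target
        # Gradient w.r.t. hidden pre-activations, computed before w2 changes.
        grad_hidden = [error * self.w2[action][j] * (1.0 - hidden[j] ** 2)
                       for j in range(len(hidden))]
        # Only the output row of the chosen action receives a gradient.
        for j, h in enumerate(hidden):
            self.w2[action][j] -= lr * error * h
        self.b2[action] -= lr * error
        for j, g in enumerate(grad_hidden):
            for i, x in enumerate(state):
                self.w1[j][i] -= lr * g * x
            self.b1[j] -= lr * g
        return error


# Assumed usage: a 2-feature state (e.g. normalized channel gain and queue length)
# and 3 candidate allocation actions.
net = TinyQNetwork(n_inputs=2, n_hidden=8, n_actions=3)
err = net.td_update([0.5, 1.0], action=2, reward=1.0, next_state=[0.4, 0.9])
print("TD error before the update:", err)
```

The network replaces the Q-table of tabular methods, so its memory cost depends on the number of weights rather than on the number of state-action pairs, which is what makes this family of methods usable when the state and action spaces are large.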

Advantages

• The sum rate is an important metric for characterizing the average performance of wireless networks, built from the rate achieved by each user.
• To characterize communication distances, prior work analysed the performance of large-scale NOMA communications via stochastic geometry.
• A related study therefore considered a practical framework with dynamic channel state information for evaluating the performance of massive connectivity.
• Various model-based schemes have been proposed to improve different metrics of NOMA-IoT networks, such as coverage performance, energy efficiency, and system throughput (sum rate).
• In contrast to these, another study used 2D matching theory to perform dynamic resource allocation that accounts for energy efficiency in downlink NOMA.
