Repository: yjwong1999/Twin-TD3
Branch: main
Commit: 1e172dc58daa
Files: 47
Total size: 180.3 KB
Directory structure:
gitextract_sya5mzu7/
├── README.md
├── batch_eval.sh
├── batch_train.sh
├── channel.py
├── data/
│ ├── .vscode/
│ │ └── settings.json
│ ├── data_test.py
│ ├── init_location.xlsx
│ ├── readme.md
│ ├── simulation_result.xlsx
│ └── storage/
│ ├── benchmark/
│ │ ├── ddpg_see_benchmark/
│ │ │ ├── Actor_UAV_ddpg
│ │ │ ├── Critic_UAV_ddpg
│ │ │ ├── TargetActor_UAV_ddpg
│ │ │ └── TargetCritic_UAV_ddpg
│ │ ├── ddpg_ssr_benchmark/
│ │ │ ├── Actor_UAV_ddpg
│ │ │ ├── Critic_UAV_ddpg
│ │ │ ├── TargetActor_UAV_ddpg
│ │ │ └── TargetCritic_UAV_ddpg
│ │ ├── td3_see_benchmark/
│ │ │ ├── Actor_UAV_TD3
│ │ │ ├── Critic_1_UAV_TD3
│ │ │ ├── Critic_2_UAV_TD3
│ │ │ ├── TargetActor_UAV_TD3
│ │ │ ├── TargetCritic_1_UAV_TD3
│ │ │ └── TargetCritic_2_UAV_TD3
│ │ └── td3_ssr_benchmark/
│ │ ├── Actor_UAV_TD3
│ │ ├── Critic_1_UAV_TD3
│ │ ├── Critic_2_UAV_TD3
│ │ ├── TargetActor_UAV_TD3
│ │ ├── TargetCritic_1_UAV_TD3
│ │ └── TargetCritic_2_UAV_TD3
│ └── readme.md
├── data_manager.py
├── ddpg.py
├── entity.py
├── env.py
├── env1.py
├── load_and_plot.py
├── main_RIS.py
├── main_ref.py
├── main_train.py
├── math_tool.py
├── plot_see.py
├── plot_ssr.py
├── plot_traj.py
├── render.py
├── requirements.txt
├── run_simulation.py
└── td3.py
================================================
FILE CONTENTS
================================================
================================================
FILE: README.md
================================================
# Deep Reinforcement Learning for Secrecy Energy-Efficient UAV Communication with Reconfigurable Intelligent Surfaces
**IEEE Wireless Communications and Networking Conference 2023 (WCNC 2023)** </br>
Simulation for Conference Proceedings https://doi.org/10.1109/WCNC55385.2023.10118891 </br>
Refer [this link](https://github.com/yjwong1999/Twin-TD3/blob/main/WCNC2023%20WS-09%20%231570879488.pdf) for the preprint
## Abstract
This paper investigates the **physical layer security (PLS)** issue in **reconfigurable intelligent surface (RIS) aided millimeter-wave rotary-wing unmanned aerial vehicle (UAV) communications** under the presence of multiple eavesdroppers and imperfect channel state information (CSI). The goal is to maximize the **worst-case secrecy energy efficiency (SEE)** of UAV via a **joint optimization of flight trajectory, UAV active beamforming and RIS passive beamforming**. By interacting with the dynamically changing UAV environment, real-time decision making per time slot is possible via deep reinforcement learning (DRL). To decouple the continuous optimization variables, we introduce a **twin twin-delayed deep deterministic policy gradient (TTD3)** to maximize the expected cumulative reward, which is linked to SEE enhancement. Simulation results confirm that the proposed method achieves greater secrecy energy savings than the traditional twin-deep deterministic policy gradient DRL (TDDRL)-based method.
## TLDR
### System model:
**RIS-aided mmWave UAV system** under the presence of **eavesdroppers** and **imperfect channel state information (CSI)** </br>
### Solution:
A **Twin-TD3 (TTD3) algorithm** to decouple the joint optimization of:
1. UAV active beamforming and RIS passive beamforming
2. UAV flight trajectory
We adopt **double DRL framework**, where the 1st and 2nd agent provides the policy for task (1) and (2), respectively.
## How to use this repo
Setup the repo
```
conda create --name <env> python=3.10.4
conda activate <env>
git clone https://github.com/yjwong1999/Twin-TD3.git
cd Twin-TD3
pip install -r requirements.txt
```
User can train two types of algorithm for training:
1. Twin DDPG is [TDDRL algorithm](https://doi.org/10.1109/LWC.2021.3081464)
2. Twin TD3 is our proposed TTD3 algorithm
Run the following in the `bash` or `powershell`
`main_train.py` is the main python file to train the DRL algorithms
```shell
# To use Twin DDPG with SSR as optimization goal
python3 main_train.py --drl ddpg --reward ssr
# To use Twin TD3 with SSR as optimization goal
python3 main_train.py --drl td3 --reward ssr
# To use Twin DDPG with SEE as optimization goal
python3 main_train.py --drl ddpg --reward see
# To use Twin TD3 with SEE as optimization goal
python3 main_train.py --drl td3 --reward see
# To use pretrained DRL for UAV trajectory (recommended for stable convergence)
python3 main_train.py --drl td3 --reward see --trained-uav
# To set number of episodes (default is 300)
python3 main_train.py --drl td3 --reward see --ep-num 300
# To set seeds for DRL weight initialization (not recommended)
python3 main_train.py --drl td3 --reward see --seeds 0 # weights of both DRL are initialized with seed 0
python3 main_train.py --drl td3 --reward see --seeds 0 1 # weights of DRL 1 and DRL2 are initialized with seed 0 and 1, respectively
```
`run_simulation.py` is the python file to run the simulation using your trained models
```shell
# plot everything for each episode
python3 run_simulation.py --path data/storage/scratch/<DIR> # if you train the algorithm without the pretrained uav
python3 run_simulation.py --path data/storage/trained_uav/<DIR> # if you train the algorithm with the pretrained uav
```
`load_and_plot.py` is the python file to plot the (i) Rewards, (ii) Sum Secrecy Rate (SSR), (iii) Secrecy Energy Efficient (SEE), (iv) UAV Trajectory, (v) RIS configs for each episode in one experiments. The plotted figures are saved at `data/storage/scratch/<DIR>/plot` or `data/storage/trained_uav/<DIR>plot`
```shell
# plot everything for each episode
python3 load_and_plot.py --path data/storage/scratch/<DIR> --ep-num 300 # if you train the algorithm without the pretrained uav
python3 load_and_plot.py --path data/storage/trained_uav/<DIR> --ep-num 300 # if you train the algorithm with the pretrained uav
```
Note that you can use the bash script `batch_train.sh` and `batch_eval.sh` to train the algorithms and evaluate them using the previous two python codes
```shell
# To train on batch
bash batch_train.sh
# To evaluate on batch
bash batch_eval.sh
```
## Results
We run the ```main_train.py``` for 5 times for each settings below, and averaged out the performance
SSR and SEE (the higher the better)
Total Energy Consumption (the lower the better)
| Algorithms | SSR (bits/s/Hz)| Energy (kJ) | SEE (bits/s/Hz/kJ)|
|--------------------------------|----------------|-------------|-------------------|
| TDDRL | 5.03 | 12.4 | 40.8 |
| TTD3 | 6.05 | 12.7 | 48.2 |
| TDDRL (with energy constraint) | 4.68 | 11.2 | 39.4 |
| TTD3 (with energy constraint) | 5.39 | 11.2 | 48.4 |
Summary
1. In terms of SSR, TTD3 outperforms TDDRL with or without energy constraint
2. In terms of SEE and Energy, TTD3 (with energy constraint) outperforms all other algorithms
3. Generally, TTD3 algorithm are better than TTDRL
4. Even with energy contraint (trade-off between energy consumption and SSR), TTD3 outperforms TDDRL in all aspects
\* Remarks: </br>
Note that the performance of DRL (especially twin DRL) has a big variation, sometimes you may get extremely good (or bad) performance </br>
The above benchmark results are averaged performance of several experiments, to get a more holistic understandings on the algorithms </br>
It is advised to use the benchmark UAV models we trained, for better convergence. </br>
This approach is consistent with the codes provided by [TDDRL](https://github.com/Brook1711/WCL-pulish-code)
## References and Acknowledgement
This work was supported by the **British Council** under **UK-ASEAN Institutional Links Early Career Researchers Scheme** with project number 913030644.
Both **RIS Simulation** and the **System Model** for this Research Project are based the research work provided by [Brook1711](https://github.com/Brook1711). </br>
We intended to fork the original repo for the system model (as stated below) as the base of this project. </br>
However, GitHub does not allow a forked repo to be private. </br>
Hence, we could not maintain our code based a forked version of the original repo, while keeping it private until the project is completed.
We would like to express our utmost gratitude for [Brook1711](https://github.com/Brook1711) and his co-authors for their research work.
### RIS Simulation
RIS Simulation is based on the following research work: </br>
[SimRIS Channel Simulator for Reconfigurable Intelligent Surface-Empowered Communication Systems](https://ieeexplore.ieee.org/document/9282349) </br>
The original simulation code is coded in matlab, this [GitHub repo](https://github.com/Brook1711/RIS_components) provides a Python version of the simulation.
### System Model: RIS-aided mmWave UAV communications
The simulation of the System Model is provided by the following research work: </br>
[Learning-Based Robust and Secure Transmission for Reconfigurable Intelligent Surface Aided Millimeter Wave UAV Communications](https://doi.org/10.1109/LWC.2021.3081464) </br>
The code is provided in this [GitHub repo](https://github.com/Brook1711/WCL-pulish-code).
### Rotary-Wing UAV
We can derive the Rotary-Wing UAV’s propulsion energy consumption based on the following research work: </br>
[Energy Minimization in Internet-of-Things System Based on Rotary-Wing UAV](https://doi.org/10.1109/LWC.2019.2916549)
### TD3
Main reference for TD3 implementation: </br>
[PyTorch/TensorFlow 2.0 for TD3](https://github.com/philtabor/Actor-Critic-Methods-Paper-To-Code/tree/master/TD3)
## TODO
- [x] Add argparse arguments to set drl algo and reward type
- [x] Add argparse arguments to set episode number
- [x] Add argparse arguments to set seeds for the two DRLs
- [x] Add argparse arguments to load pretrained DRL for UAV trajectory
- [x] Add benchmark/pretrained model
- [x] Project naming (use <DRL>_<Reward>_<Num> instead of using datetime format)
- [x] Remove saving "best model", there are no best model, only latest model
- [ ] The following codes can be used, but you have to manually change the filepath in the codes
`plot_ssr.py` is the python file to plot the final episode's SSR for the 4 benchmarks in the paper
```shell
# plot ssr
python3 plot_ssr.py
```
`plot_see.py` is the python file to plot the final episode's SSR for the 4 benchmarks in the paper
```shell
# plot see
python3 plot_see.py
```
`plot_traj.py` is the python file to plot the final episode's UAV trajectory for the 4 benchmarks in the paper
```shell
# plot UAV trajectory
python3 plot_traj.py
```
## Cite this repository
```
@INPROCEEDINGS{10118891,
author={Tham, Mau-Luen and Wong, Yi Jie and Iqbal, Amjad and Ramli, Nordin Bin and Zhu, Yongxu and Dagiuklas, Tasos},
booktitle={2023 IEEE Wireless Communications and Networking Conference (WCNC)},
title={Deep Reinforcement Learning for Secrecy Energy- Efficient UAV Communication with Reconfigurable Intelligent Surface},
year={2023},
doi={10.1109/WCNC55385.2023.10118891}}
```
<details>
<summary>Star History</summary>
[](https://star-history.com/#yjwong1999/Twin-TD3&Date)
</details>
================================================
FILE: batch_eval.sh
================================================
#!/bin/bash
echo
echo ddpg_ssr
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_ssr --ep-num 300
echo ddpg_ssr_2
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_ssr_2 --ep-num 300
echo ddpg_ssr_3
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_ssr_3 --ep-num 300
echo ddpg_ssr_4
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_ssr_4 --ep-num 300
echo ddpg_ssr_5
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_ssr_5 --ep-num 300
echo td3_ssr
python3 load_and_plot.py --path data/storage/trained_uav/td3_ssr --ep-num 300
echo td3_ssr_2
python3 load_and_plot.py --path data/storage/trained_uav/td3_ssr_2 --ep-num 300
echo td3_ssr_3
python3 load_and_plot.py --path data/storage/trained_uav/td3_ssr_3 --ep-num 300
echo td3_ssr_4
python3 load_and_plot.py --path data/storage/trained_uav/td3_ssr_4 --ep-num 300
echo td3_ssr_5
python3 load_and_plot.py --path data/storage/trained_uav/td3_ssr_5 --ep-num 300
echo ddpg_see
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_see --ep-num 300
echo ddpg_see_2
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_see_2 --ep-num 300
echo ddpg_see_3
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_see_3 --ep-num 300
echo ddpg_see_4
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_see_4 --ep-num 300
echo ddpg_see_5
python3 load_and_plot.py --path data/storage/trained_uav/ddpg_see_5 --ep-num 300
echo td3_see
python3 load_and_plot.py --path data/storage/trained_uav/td3_see --ep-num 300
echo td3_see_2
python3 load_and_plot.py --path data/storage/trained_uav/td3_see_2 --ep-num 300
echo td3_see_3
python3 load_and_plot.py --path data/storage/trained_uav/td3_see_3 --ep-num 300
echo td3_see_4
python3 load_and_plot.py --path data/storage/trained_uav/td3_see_4 --ep-num 300
echo td3_see_5
python3 load_and_plot.py --path data/storage/trained_uav/td3_see_5 --ep-num 300
================================================
FILE: batch_train.sh
================================================
#!/bin/bash
echo
echo ddpg_ssr
python3 main_train.py --drl ddpg --reward ssr --ep-num 300 --trained-uav
echo ddpg_ssr_2
python3 main_train.py --drl ddpg --reward ssr --ep-num 300 --trained-uav
echo ddpg_ssr_3
python3 main_train.py --drl ddpg --reward ssr --ep-num 300 --trained-uav
echo ddpg_ssr_4
python3 main_train.py --drl ddpg --reward ssr --ep-num 300 --trained-uav
echo ddpg_ssr_5
python3 main_train.py --drl ddpg --reward ssr --ep-num 300 --trained-uav
echo td3_ssr
python3 main_train.py --drl td3 --reward ssr --ep-num 300 --trained-uav
echo td3_ssr_2
python3 main_train.py --drl td3 --reward ssr --ep-num 300 --trained-uav
echo td3_ssr_3
python3 main_train.py --drl td3 --reward ssr --ep-num 300 --trained-uav
echo td3_ssr_4
python3 main_train.py --drl td3 --reward ssr --ep-num 300 --trained-uav
echo td3_ssr_5
python3 main_train.py --drl td3 --reward ssr --ep-num 300 --trained-uav
echo ddpg_see
python3 main_train.py --drl ddpg --reward see --ep-num 300 --trained-uav
echo ddpg_see_2
python3 main_train.py --drl ddpg --reward see --ep-num 300 --trained-uav
echo ddpg_see_3
python3 main_train.py --drl ddpg --reward see --ep-num 300 --trained-uav
echo ddpg_see_4
python3 main_train.py --drl ddpg --reward see --ep-num 300 --trained-uav
echo ddpg_see_5
python3 main_train.py --drl ddpg --reward see --ep-num 300 --trained-uav
echo td3_see
python3 main_train.py --drl td3 --reward see --ep-num 300 --trained-uav
echo td3_see_2
python3 main_train.py --drl td3 --reward see --ep-num 300 --trained-uav
echo td3_see_3
python3 main_train.py --drl td3 --reward see --ep-num 300 --trained-uav
echo td3_see_4
python3 main_train.py --drl td3 --reward see --ep-num 300 --trained-uav
echo td3_see_5
python3 main_train.py --drl td3 --reward see --ep-num 300 --trained-uav
================================================
FILE: channel.py
================================================
import numpy as np
import math
import cmath
from math_tool import *
class mmWave_channel(object):
"""
generate MmWave under UMi open
input: distance, angle, pair entity object
output: Instantaneous CSI
"""
def __init__(self, transmitter, receiver, frequncy):
"""
transmitter: object in entity.py
receiver: object in entity.py
"""
self.channel_name = ''
self.n = 0
self.sigma = 0
self.transmitter = transmitter
self.receiver = receiver
self.channel_type = self.init_type() # 'UAV_RIS', 'UAV_user', 'UAV_attacker', 'RIS_user', 'RIS_attacker'
# self.distance = np.linalg.norm(transmitter.coordinate - receiver.coordinate)
self.frequncy = frequncy
# init & updata path loss
self.path_loss_normal = self.get_channel_path_loss()
self.path_loss_dB = normal_to_dB(self.path_loss_normal)
# init & update channel CSI matrix
self.channel_matrix = self.get_estimated_channel_matrix()
def init_type(self):
channel_type = self.transmitter.type+'_'+self.receiver.type
if channel_type == 'UAV_RIS' or channel_type == 'RIS_UAV':
self.n = 2.2
self.sigma = 3
self.channel_name = 'H_UR'
elif channel_type == 'UAV_user' or channel_type == 'UAV_attacker':
self.n = 3.5
self.sigma = 3
if channel_type =='UAV_user':
self.channel_name = 'h_U_k,' + str(self.transmitter.index)
elif channel_type == 'UAV_attacker':
self.channel_name = 'h_U_p,' + str(self.transmitter.index)
elif channel_type == 'user_UAV' or channel_type == 'attacker_UAV':
self.n = 3.5
self.sigma = 3
if channel_type =='user_UAV':
self.channel_name = 'h_U_k,' + str(self.transmitter.index)
elif channel_type == 'attacker_UAV':
self.channel_name = 'h_U_p,' + str(self.transmitter.index)
elif channel_type == 'RIS_user' or channel_type == 'RIS_attacker':
self.n = 2.8
self.sigma = 3
if channel_type =='RIS_user':
self.channel_name = 'h_R_k,' + str(self.transmitter.index)
elif channel_type == 'RIS_attacker':
self.channel_name = 'h_R_p,' + str(self.transmitter.index)
elif channel_type == 'user_RIS' or channel_type == 'attacker_RIS':
self.n = 2.8
self.sigma = 3
if channel_type =='user_RIS':
self.channel_name = 'h_R_k,' + str(self.transmitter.index)
elif channel_type == 'attacker_RIS':
self.channel_name = 'h_R_p,' + str(self.transmitter.index)
return channel_type
def get_channel_path_loss(self):
"""
calculate the path loss including shadow fading
(in normal form)
"""
distance = np.linalg.norm(self.transmitter.coordinate - self.receiver.coordinate)
PL = -20 * math.log10(4*math.pi/(3e8/self.frequncy)) - 10*self.n*math.log10(distance)
shadow_loss = np.random.normal() * self.sigma
# return dB_to_normal(PL - shadow_loss)
return dB_to_normal(PL)
def get_estimated_channel_matrix(self):
"""
init & update channel matrix
"""
# init matrix
N_t = self.transmitter.ant_num
N_r = self.receiver.ant_num
channel_matrix = np.mat(np.ones(shape=(N_r,N_t),dtype=complex), dtype=complex)
# get relevant coordinate receiver under transmitter system
r_under_t_car_coor = get_coor_ref(\
self.transmitter.coor_sys, \
self.receiver.coordinate - self.transmitter.coordinate)
# get relevant spherical_coordinate
r_t_r, r_t_theta, r_t_fai = cartesian_coordinate_to_spherical_coordinate(\
cartesian_coordinate=r_under_t_car_coor\
)
# get relevant coordinate transmitter under receiver system
t_under_r_car_coor = get_coor_ref(\
# remmber to Meet channel direction restrictions
[-self.receiver.coor_sys[0], self.receiver.coor_sys[1], -self.receiver.coor_sys[2]],\
self.transmitter.coordinate - self.receiver.coordinate)
# get relevant spherical_coordinate
t_r_r, t_r_theta, t_r_fai = cartesian_coordinate_to_spherical_coordinate(\
cartesian_coordinate=t_under_r_car_coor\
)
# calculate array response
t_array_response = self.generate_array_response(self.transmitter, r_t_theta, r_t_fai)
r_array_response = self.generate_array_response(self.receiver, t_r_theta, t_r_fai)
array_response_product = r_array_response * t_array_response.H
# get H_LOS
# get LOS path loss
PL = self.path_loss_normal
# get LOS phase shift
LOS_fai = 2 * math.pi * self.frequncy * np.linalg.norm(self.transmitter.coordinate - self.receiver.coordinate) / 3e8
channel_matrix = cmath.exp(1j*LOS_fai)* math.pow(PL, 0.5) * array_response_product
return channel_matrix
def generate_array_response(self, transceiver, theta, fai):
"""
if the ant_type is 'UPA'
generate_UPA_response
if the ant_type is 'ULA'
generate_ULA_response
if the ant_type is 'single'
generate_singleant_response
"""
ant_type = transceiver.ant_type
ant_num = transceiver.ant_num
if ant_type == 'UPA':
row_num = int(math.sqrt(ant_num))
Planar_response = np.mat(np.ones(shape=(ant_num, 1)), dtype=complex)
for i in range(row_num):
for j in range(row_num):
Planar_response[j+i*row_num,0] = cmath.exp(1j *\
(math.sin(theta) * math.cos(fai)*i*math.pi + math.sin(theta)*math.sin(fai))\
)
return Planar_response
elif ant_type == 'ULA':
Linear_response = np.mat(np.ones(shape=(ant_num,1)), dtype=complex)
for i in range(ant_num):
Linear_response[i, 0] = cmath.exp(1j * math.sin(theta) * math.cos(fai)*i*math.pi)
return Linear_response
elif ant_type == 'single':
return np.mat(np.array([1]))
else:
return False
def update_CSI(self):
"""
update pathloss and channel matrix
"""
# init & updata path loss
self.path_loss_normal = self.get_channel_path_loss()
self.path_loss_dB = normal_to_dB(self.path_loss_normal)
# init & update channel CSI matrix
self.channel_matrix = self.get_estimated_channel_matrix()
================================================
FILE: data/.vscode/settings.json
================================================
{
"python.pythonPath": "C:\\Users\\67039\\anaconda3\\python.exe"
}
================================================
FILE: data/data_test.py
================================================
import pandas as pd
user_sheet = pd.read_excel('init_location.xlsx', sheet_name='user')
user_data = user_sheet
print(user_data)
================================================
FILE: data/readme.md
================================================
#### init_location.xlsx
- the config files for the initilization postion of each entity (such as UAV, users, attackers, RIS)
#### data_test.py (should change name)
- to read and show the initialization position of each entity
================================================
FILE: data/storage/readme.md
================================================
# This directory stores all experiments
- Each experiment will be saved as a file project in this directory
- benchmark directory stores all the pretrained DRL for UAV trajectory (agent 2 in the code), which will be used when --trained-uav flag is raised
================================================
FILE: data_manager.py
================================================
import numpy as np
import scipy.io
import pandas as pd
import os
import time, csv
class DataManager(object):
"""
class to read and store simulation results
before use, please create a direction under current file path './data'
and must have a file 'init_location.xlsx' which contain the position of each entities
"""
def __init__(self, store_list = ['beamforming_matrix', 'reflecting_coefficient', 'UAV_state', 'user_capacity'],file_path = './data', store_path = './data/storage', project_name = None):
# 1 init location data
self.store_list = store_list
self.init_data_file = file_path + '/init_location.xlsx'
if project_name is None:
self.time_stemp = time.strftime('/%Y-%m-%d %H_%M_%S',time.localtime(time.time()))
self.store_path = store_path + self.time_stemp
else:
for i in range(1, 100, 1):
if i == 1:
dir_name = store_path + '/' + project_name
else:
dir_name = store_path + '/' + project_name + f'_{i}'
if not os.path.isdir(dir_name):
self.store_path = dir_name
break
os.makedirs(self.store_path)
# self.writer = pd.ExcelWriter(self.store_path + '/simulation_result.xlsx', engine='openpyxl') # pylint: disable=abstract-class-instantiated
self.simulation_result_dic = {}
self.init_format()
def save_file(self, episode_cnt = 10):
# record step counts per episode
with open(self.store_path + "/step_num_per_episode.csv", "a", newline='') as f:
writer = csv.writer(f)
writer.writerow([len(list(self.simulation_result_dic.values())[0])])
# when ended, auto save to .mat file
scipy.io.savemat(self.store_path + '/simulation_result_ep_' + str(episode_cnt) + '.mat', {'result_' + str(episode_cnt):self.simulation_result_dic})
self.simulation_result_dic = {}
self.init_format()
def save_meta_data(self, meta_dic):
"""
save system and agent information
"""
scipy.io.savemat(self.store_path + '/meta_data.mat', {'meta_data': meta_dic})
def init_format(self):
"""
used only one time in env.py
"""
for store_item in self.store_list:
self.simulation_result_dic.update({store_item:[]})
def read_init_location(self, entity_type = 'user', index = 0):
if entity_type == 'user' or 'attacker' or 'RIS' or 'RIS_norm_vec' or 'UAV':
return np.array([\
pd.read_excel(self.init_data_file, sheet_name=entity_type)['x'][index],\
pd.read_excel(self.init_data_file, sheet_name=entity_type)['y'][index],\
pd.read_excel(self.init_data_file, sheet_name=entity_type)['z'][index]])
else:
return None
def store_data(self, row_data, value_name):
"""
docstring
"""
self.simulation_result_dic[value_name].append(row_data)
================================================
FILE: ddpg.py
================================================
import os
import torch as T
#import torch.cuda as T
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
from data_manager import DataManager
class OUActionNoise(object):
def __init__(self, mu, sigma=0.15, theta=.2, dt=1e-2, x0=None):
self.theta = theta
self.mu = mu
self.sigma = sigma
self.dt = dt
self.x0 = x0
self.reset()
def __call__(self):
x = self.x_prev + self.theta * (self.mu - self.x_prev) * self.dt + \
self.sigma * np.sqrt(self.dt) * np.random.normal(size=self.mu.shape)
self.x_prev = x
return x
def reset(self):
self.x_prev = self.x0 if self.x0 is not None else np.zeros_like(self.mu)
def __repr__(self):
return 'OrnsteinUhlenbeckActionNoise(mu={}, sigma={})'.format(
self.mu, self.sigma)
class AWGNActionNoise(object):
def __init__(self, mu = 0, sigma=1):
self.mu = mu
self.sigma = sigma
def __call__(self):
#self.mu = mu
#self.sigma = sigma
x = np.random.normal(size=self.mu.shape) * self.sigma
return x
class ReplayBuffer(object):
def __init__(self, max_size, input_shape, n_actions):
self.mem_size = max_size
self.mem_cntr = 0
self.state_memory = np.zeros((self.mem_size, *input_shape))
self.new_state_memory = np.zeros((self.mem_size, *input_shape))
self.action_memory = np.zeros((self.mem_size, n_actions))
self.reward_memory = np.zeros(self.mem_size)
self.terminal_memory = np.zeros(self.mem_size, dtype=np.float32)
def store_transition(self, state, action, reward, state_, done):
index = self.mem_cntr % self.mem_size
self.state_memory[index] = state
self.new_state_memory[index] = state_
self.action_memory[index] = action
self.reward_memory[index] = reward
self.terminal_memory[index] = 1 - done
self.mem_cntr += 1
def sample_buffer(self, batch_size):
max_mem = min(self.mem_cntr, self.mem_size)
batch = np.random.choice(max_mem, batch_size)
states = self.state_memory[batch]
actions = self.action_memory[batch]
rewards = self.reward_memory[batch]
states_ = self.new_state_memory[batch]
terminal = self.terminal_memory[batch]
return states, actions, rewards, states_, terminal
class CriticNetwork(nn.Module):
def __init__(self, beta, input_dims, fc1_dims, fc2_dims, fc3_dims, fc4_dims, n_actions, name,
chkpt_dir='C:\\demo\\IRS_DDPG_minimal\\main_foder\\tmp\\ddpg', load_file = ''):
super(CriticNetwork, self).__init__()
self.input_dims = input_dims
self.fc1_dims = fc1_dims
self.fc2_dims = fc2_dims
self.fc3_dims = fc3_dims
self.fc4_dims = fc4_dims
self.n_actions = n_actions
self.checkpoint_file = os.path.join(chkpt_dir,name+'_ddpg')
self.load_file = 'C:\\demo\\other_branch\\Learning-based_Secure_Transmission_for_RIS_Aided_mmWave-UAV_Communications_with_Imperfect_CSI\\data\\mannal_store\\models\\Critic_UAV_ddpg'
self.fc1 = nn.Linear(*self.input_dims, self.fc1_dims)
f1 = 1./np.sqrt(self.fc1.weight.data.size()[0])
T.nn.init.uniform_(self.fc1.weight.data, -f1, f1)
T.nn.init.uniform_(self.fc1.bias.data, -f1, f1)
#self.fc1.weight.data.uniform_(-f1, f1)
#self.fc1.bias.data.uniform_(-f1, f1)
self.bn1 = nn.LayerNorm(self.fc1_dims)
self.fc2 = nn.Linear(self.fc1_dims, self.fc2_dims)
f2 = 1./np.sqrt(self.fc2.weight.data.size()[0])
#f2 = 0.002
T.nn.init.uniform_(self.fc2.weight.data, -f2, f2)
T.nn.init.uniform_(self.fc2.bias.data, -f2, f2)
#self.fc2.weight.data.uniform_(-f2, f2)
#self.fc2.bias.data.uniform_(-f2, f2)
self.bn2 = nn.LayerNorm(self.fc2_dims)
self.fc3 = nn.Linear(self.fc2_dims, self.fc3_dims)
f3 = 1./np.sqrt(self.fc3.weight.data.size()[0])
T.nn.init.uniform_(self.fc3.weight.data, -f3, f3)
T.nn.init.uniform_(self.fc3.bias.data, -f3, f3)
#self.fc2.weight.data.uniform_(-f2, f2)
#self.fc2.bias.data.uniform_(-f2, f2)
self.bn3 = nn.LayerNorm(self.fc3_dims)
self.fc4 = nn.Linear(self.fc3_dims, self.fc4_dims)
f4 = 1./np.sqrt(self.fc4.weight.data.size()[0])
T.nn.init.uniform_(self.fc4.weight.data, -f4, f4)
T.nn.init.uniform_(self.fc4.bias.data, -f4, f4)
#self.fc2.weight.data.uniform_(-f2, f2)
#self.fc2.bias.data.uniform_(-f2, f2)
self.bn4 = nn.LayerNorm(self.fc4_dims)
self.action_value = nn.Linear(self.n_actions, self.fc4_dims)
f5 = 0.003
self.q = nn.Linear(self.fc4_dims, 1)
T.nn.init.uniform_(self.q.weight.data, -f5, f5)
T.nn.init.uniform_(self.q.bias.data, -f5, f5)
#self.q.weight.data.uniform_(-f3, f3)
#self.q.bias.data.uniform_(-f3, f3)
self.optimizer = optim.Adam(self.parameters(), lr=beta)
# if torch.cuda.available():
# import torch.cuda as T
# else:
# import torch as T
self.device = T.device('cuda:0' if T.cuda.is_available() else 'cpu')
self.to(self.device)
def forward(self, state, action):
state_value = self.fc1(state)
state_value = self.bn1(state_value)
state_value = F.relu(state_value)
state_value = self.fc2(state_value)
state_value = self.bn2(state_value)
state_value = F.relu(state_value)
state_value = self.fc3(state_value)
state_value = self.bn3(state_value)
state_value = F.relu(state_value)
state_value = self.fc4(state_value)
state_value = self.bn4(state_value)
action_value = F.relu(self.action_value(action))
state_action_value = F.relu(T.add(state_value, action_value))
state_action_value = self.q(state_action_value)
return state_action_value
def save_checkpoint(self):
print('... saving checkpoint ...')
T.save(self.state_dict(), self.checkpoint_file)
def load_checkpoint(self,load_file = ''):
print('... loading checkpoint ...')
if T.cuda.is_available():
self.load_state_dict(T.load(load_file))
else:
self.load_state_dict(T.load(load_file, map_location=T.device('cpu')))
class ActorNetwork(nn.Module):
def __init__(self, alpha, input_dims, fc1_dims, fc2_dims, fc3_dims, fc4_dims, n_actions, name,
chkpt_dir='C:\\demo\\IRS_DDPG_minimal\\main_foder\\tmp\\ddpg', load_file = ''):
super(ActorNetwork, self).__init__()
self.input_dims = input_dims
self.fc1_dims = fc1_dims
self.fc2_dims = fc2_dims
self.fc3_dims = fc3_dims
self.fc4_dims = fc4_dims
self.n_actions = n_actions
self.checkpoint_file = os.path.join(chkpt_dir,name+'_ddpg')
self.load_file = 'C:\\demo\\other_branch\\Learning-based_Secure_Transmission_for_RIS_Aided_mmWave-UAV_Communications_with_Imperfect_CSI\\data\\mannal_store\\models\\Actor_UAV_ddpg'
self.fc1 = nn.Linear(*self.input_dims, self.fc1_dims)
f1 = 1./np.sqrt(self.fc1.weight.data.size()[0])
# T.nn.init.uniform_(self.fc1.weight.data, -f1, f1)
# T.nn.init.uniform_(self.fc1.bias.data, -f1, f1)
self.fc1.weight.data.uniform_(-f1, f1)
self.fc1.bias.data.uniform_(-f1, f1)
self.bn1 = nn.LayerNorm(self.fc1_dims)
self.fc2 = nn.Linear(self.fc1_dims, self.fc2_dims)
#f2 = 0.002
f2 = 1./np.sqrt(self.fc2.weight.data.size()[0])
# T.nn.init.uniform_(self.fc2.weight.data, -f2, f2)
# T.nn.init.uniform_(self.fc2.bias.data, -f2, f2)
self.fc2.weight.data.uniform_(-f2, f2)
self.fc2.bias.data.uniform_(-f2, f2)
self.bn2 = nn.LayerNorm(self.fc2_dims)
self.fc3 = nn.Linear(self.fc2_dims, self.fc3_dims)
f3 = 1./np.sqrt(self.fc3.weight.data.size()[0])
# T.nn.init.uniform_(self.fc3.weight.data, -f3, f3)
# T.nn.init.uniform_(self.fc3.bias.data, -f3, f3)
self.fc2.weight.data.uniform_(-f2, f2)
self.fc2.bias.data.uniform_(-f2, f2)
self.bn3 = nn.LayerNorm(self.fc3_dims)
self.fc4 = nn.Linear(self.fc3_dims, self.fc4_dims)
f4 = 1./np.sqrt(self.fc4.weight.data.size()[0])
# T.nn.init.uniform_(self.fc4.weight.data, -f4, f4)
# T.nn.init.uniform_(self.fc4.bias.data, -f4, f4)
self.fc2.weight.data.uniform_(-f2, f2)
self.fc2.bias.data.uniform_(-f2, f2)
self.bn4 = nn.LayerNorm(self.fc4_dims)
#f3 = 0.004
f5 = 0.003
self.mu = nn.Linear(self.fc4_dims, self.n_actions)
# T.nn.init.uniform_(self.mu.weight.data, -f5, f5)
# T.nn.init.uniform_(self.mu.bias.data, -f5, f5)
self.mu.weight.data.uniform_(-f3, f3)
self.mu.bias.data.uniform_(-f3, f3)
self.optimizer = optim.Adam(self.parameters(), lr=alpha)
self.device = T.device('cuda:0' if T.cuda.is_available() else 'cpu')
self.to(self.device)
def forward(self, state):
x = self.fc1(state)
x = self.bn1(x)
x = F.relu(x)
x = self.fc2(x)
x = self.bn2(x)
x = F.relu(x)
x = self.fc3(x)
x = self.bn3(x)
x = F.relu(x)
x = self.fc4(x)
x = self.bn4(x)
x = F.relu(x)
x = T.tanh(self.mu(x))
return x
def save_checkpoint(self):
print('... saving checkpoint ...')
T.save(self.state_dict(), self.checkpoint_file)
def load_checkpoint(self, load_file=''):
print('... loading checkpoint ...')
if T.cuda.is_available():
self.load_state_dict(T.load(load_file))
else:
self.load_state_dict(T.load(load_file, map_location=T.device('cpu')))
class Agent(object):
def __init__(self, alpha, beta, input_dims, tau, env, gamma=0.99,
n_actions=2, max_size=1000000, layer1_size=400,
layer2_size=300, layer3_size=256, layer4_size=128, batch_size=64, noise = 'AWGN', agent_name = 'default', load_file = ''):
self.load_file = load_file
self.layer1_size = layer1_size
self.layer2_size = layer2_size
self.layer3_size = layer3_size
self.layer4_size = layer4_size
self.gamma = gamma
self.tau = tau
self.memory = ReplayBuffer(max_size, input_dims, n_actions)
self.batch_size = batch_size
self.actor = ActorNetwork(alpha, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='Actor_' + agent_name,chkpt_dir=env.data_manager.store_path )
self.critic = CriticNetwork(beta, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='Critic_' + agent_name,chkpt_dir=env.data_manager.store_path)
self.target_actor = ActorNetwork(alpha, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='TargetActor_' + agent_name,chkpt_dir=env.data_manager.store_path)
self.target_critic = CriticNetwork(beta, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='TargetCritic_' + agent_name,chkpt_dir=env.data_manager.store_path)
if noise == 'OU':
self.noise = OUActionNoise(mu=np.zeros(n_actions))
elif noise == 'AWGN':
self.noise = AWGNActionNoise(mu = np.zeros(n_actions))
# tau = 1 means copy parameters to target
self.update_network_parameters(tau=1)
def choose_action(self, observation, greedy=0.5, epsilon = 1):
self.actor.eval()
observation = T.tensor(observation, dtype=T.float).to(self.actor.device)
mu = self.actor.forward(observation).to(self.actor.device)
mu_prime = mu + T.tensor(greedy * self.noise(),
dtype=T.float).to(self.actor.device)
self.actor.train()
return mu_prime.cpu().detach().numpy()
def remember(self, state, action, reward, new_state, done):
self.memory.store_transition(state, action, reward, new_state, done)
def learn(self):
if self.memory.mem_cntr < self.batch_size:
return
# the done here is opposite of the done in the env
state, action, reward, new_state, done = \
self.memory.sample_buffer(self.batch_size)
# trun s, a, r, new_s into tensor
reward = T.tensor(reward, dtype=T.float).to(self.critic.device)
done = T.tensor(done).to(self.critic.device)
new_state = T.tensor(new_state, dtype=T.float).to(self.critic.device)
action = T.tensor(action, dtype=T.float).to(self.critic.device)
state = T.tensor(state, dtype=T.float).to(self.critic.device)
# trun on evaliation mode of target actor, target critic, critic net
# fix these three nets
self.target_actor.eval()
self.target_critic.eval()
self.critic.eval()
# argmax_action*(Q(new_state,action*)),i.e. pi(s|pi),i.e. stratage choose the action* maxmium the future Q
target_actions = self.target_actor.forward(new_state)
# caculate the Q(new_state, new_action)
critic_value_ = self.target_critic.forward(new_state, target_actions)
# caculate the Q(state, action)
critic_value = self.critic.forward(state, action)
target = []
for j in range(self.batch_size):
target.append(reward[j] + self.gamma*critic_value_[j]*done[j])
target = T.tensor(target).to(self.critic.device)
target = target.view(self.batch_size, 1)
# here update the critic net using mse of (r + gamma * Q_argmax_a*(newstate, a*)) - Q(state, action)
self.critic.train()
self.critic.optimizer.zero_grad()
critic_loss = F.mse_loss(target, critic_value)
critic_loss.backward()
self.critic.optimizer.step()
# here update the actor net by policy gradient
# first fix the critic net
self.critic.eval()
self.actor.optimizer.zero_grad()
mu = self.actor.forward(state)
self.actor.train()
actor_loss = -self.critic.forward(state, mu)
actor_loss = T.mean(actor_loss)
actor_loss.backward()
self.actor.optimizer.step()
self.update_network_parameters()
def update_network_parameters(self, tau=None):
if tau is None:
tau = self.tau
actor_params = self.actor.named_parameters()
critic_params = self.critic.named_parameters()
target_actor_params = self.target_actor.named_parameters()
target_critic_params = self.target_critic.named_parameters()
critic_state_dict = dict(critic_params)
actor_state_dict = dict(actor_params)
target_critic_dict = dict(target_critic_params)
target_actor_dict = dict(target_actor_params)
for name in critic_state_dict:
critic_state_dict[name] = tau*critic_state_dict[name].clone() + \
(1-tau)*target_critic_dict[name].clone()
self.target_critic.load_state_dict(critic_state_dict)
for name in actor_state_dict:
actor_state_dict[name] = tau*actor_state_dict[name].clone() + \
(1-tau)*target_actor_dict[name].clone()
self.target_actor.load_state_dict(actor_state_dict)
"""
#Verify that the copy assignment worked correctly
target_actor_params = self.target_actor.named_parameters()
target_critic_params = self.target_critic.named_parameters()
critic_state_dict = dict(target_critic_params)
actor_state_dict = dict(target_actor_params)
print('\nActor Networks', tau)
for name, param in self.actor.named_parameters():
print(name, T.equal(param, actor_state_dict[name]))
print('\nCritic Networks', tau)
for name, param in self.critic.named_parameters():
print(name, T.equal(param, critic_state_dict[name]))
input()
"""
def save_models(self):
self.actor.save_checkpoint()
self.target_actor.save_checkpoint()
self.critic.save_checkpoint()
self.target_critic.save_checkpoint()
def load_models(self, load_file_actor = '',load_file_critic =''):
self.actor.load_checkpoint(load_file = load_file_actor)
self.target_actor.load_checkpoint(load_file = load_file_actor)
self.critic.load_checkpoint(load_file = load_file_critic)
self.target_critic.load_checkpoint(load_file = load_file_critic)
def check_actor_params(self):
current_actor_params = self.actor.named_parameters()
current_actor_dict = dict(current_actor_params)
original_actor_dict = dict(self.original_actor.named_parameters())
original_critic_dict = dict(self.original_critic.named_parameters())
current_critic_params = self.critic.named_parameters()
current_critic_dict = dict(current_critic_params)
print('Checking Actor parameters')
for param in current_actor_dict:
print(param, T.equal(original_actor_dict[param], current_actor_dict[param]))
print('Checking critic parameters')
for param in current_critic_dict:
print(param, T.equal(original_critic_dict[param], current_critic_dict[param]))
input()
================================================
FILE: entity.py
================================================
import numpy as np
import math
#from math_tool import *
class UAV(object):
"""
UAV object with coordinate
And with ULA antenas, default 8
And limited power
And with fixed rotation angle
"""
def __init__(self, coordinate, index = 0, rotation = 0, ant_num=16, ant_type = 'ULA', max_movement_per_time_slot = 0.5):
"""
coordinate is the init coordinate of UAV, meters, np.array
"""
self.max_movement_per_time_slot = max_movement_per_time_slot
self.type = 'UAV'
self.coordinate = coordinate
self.rotation = rotation
self.ant_num = ant_num
self.ant_type = ant_type
self.coor_sys = [np.array([1, 0, 0]), np.array([0, -1, 0]), np.array([0, 0, -1])]
self.index = index
# init beamforming matrix in UAV (must be inited in env.py)
self.G = np.mat(np.zeros((ant_num, 1)))
self.G_Pmax = 0
def reset(self, coordinate):
"""
reset UAV coordinate
"""
self.coordinate = coordinate
def update_coor_sys(self, delta_angle):
"""
used in function move to update the relevant coordinate system
"""
self.rotation = self.rotation + delta_angle
coor_sys_x = np.array([\
math.cos(self.rotation),\
math.sin(self.rotation),\
0])
coor_sys_z = np.array([\
0,\
0,\
-1])
coor_sys_y = np.cross(coor_sys_z, coor_sys_x)
self.coor_sys = np.array([coor_sys_x,coor_sys_y,coor_sys_z])
def update_coordinate(self, distance_delta_d, direction_fai):
"""
used in function move to update UAV cordinate
"""
delta_x = distance_delta_d * math.cos(direction_fai)
delta_y = distance_delta_d * math.sin(direction_fai)
self.coordinate[0] += delta_x
self.coordinate[1] += delta_y
def move(self, distance_delta_d, direction_fai, delta_angle = 0):
"""
preform the 2D movement every step
"""
self.update_coordinate(distance_delta_d, direction_fai)
self.update_coor_sys(delta_angle)
class RIS(object):
"""
reconfigrable intelligent surface
with N reflecting elements, UPA, default 4 X 4 = 16
continues phase shift
"""
def __init__(self, coordinate, coor_sys_z, index = 0, ant_num=36, ant_type = 'UPA'):
"""
coordinate is the init coordinate of with N reflecting elements, meters, np.array
norm_vec is the normal vector of the reflecting direction
!!! ant_num Must be the square of a certain int number
"""
self.type = 'RIS'
self.coordinate = coordinate
self.ant_num = ant_num
self.ant_type = ant_type
coor_sys_z = coor_sys_z / np.linalg.norm(coor_sys_z)
coor_sys_x = np.cross(coor_sys_z, np.array([0,0,1]))
coor_sys_x = coor_sys_x / np.linalg.norm(coor_sys_x)
coor_sys_y = np.cross(coor_sys_z, coor_sys_x)
self.coor_sys = [coor_sys_x,coor_sys_y,coor_sys_z]
self.index = index
# init reflecting phase shift
self.Phi = np.mat(np.diag(np.ones(self.ant_num, dtype=complex)), dtype = complex)
class User(object):
"""
user with single antenas
"""
def __init__(self, coordinate, index, ant_num = 1, ant_type = 'single'):
"""
coordinate is the init coordinate of user, meters, np.array
ant_num is the antenas number of user
"""
self.type = 'user'
self.coordinate = coordinate
self.ant_num = ant_num
self.ant_type = ant_type
self.index = index
self.coor_sys = [np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([0, 0, 1])]
# init the capacity
self.capacity = 0
self.secure_capacity = 0
self.QoS_constrain = 0
# init the comprehensive_channel, (must used in env.py to init)
self.comprehensive_channel = 0
# init receive noise sigma in dB
self.noise_power = -114
def reset(self, coordinate):
"""
reset user coordinate
"""
self.coordinate = coordinate
def update_coordinate(self, distance_delta_d, direction_fai):
"""
used in function move to update UAV cordinate
"""
delta_x = distance_delta_d * math.cos(direction_fai)
delta_y = distance_delta_d * math.sin(direction_fai)
self.coordinate[0] += delta_x
self.coordinate[1] += delta_y
def move(self, distance_delta_d, direction_fai):
"""
preform the 2D movement every step
"""
self.update_coordinate(distance_delta_d, direction_fai)
class Attacker(object):
"""
Attacker with single antenas
"""
def __init__(self, coordinate, index, ant_num = 1, ant_type= 'single'):
"""
coordinate is the init coordinate of Attacker, meters, np.array
ant_num is the antenas number of Attacker
"""
self.type = 'attacker'
self.coordinate = coordinate
self.ant_num = ant_num
self.ant_type = ant_type
self.index = index
self.coor_sys = [np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([0, 0, 1])]
# init the capacity, this is a K length np.array ,shape: (K,)
# represent the attack rate for kth user, (must init in env.py)
self.capacity = 0
self.comprehensive_channel = 0
# init receive noise sigma in dBmW
self.noise_power = -114
def reset(self, coordinate):
"""
reset attacker coordinate
"""
self.coordinate = coordinate
def update_coordinate(self, distance_delta_d, direction_fai):
"""
used in function move to update UAV cordinate
"""
delta_x = distance_delta_d * math.cos(direction_fai)
delta_y = distance_delta_d * math.sin(direction_fai)
self.coordinate[0] += delta_x
self.coordinate[1] += delta_y
def move(self, distance_delta_d, direction_fai):
"""
preform the 2D movement every step
"""
self.update_coordinate(distance_delta_d, direction_fai)
================================================
FILE: env.py
================================================
#%matplotlib inline
import numpy as np
from entity import *
from channel import *
from math_tool import *
from datetime import datetime
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
from render import Render
from data_manager import DataManager
# s.t every simulition is the same model
np.random.seed(2)
######################################################
# new for energy
# energy related parameters of rotary-wing UAV
# based on Energy Minimization in Internet-of-Things System Based on Rotary-Wing UAV
P_i = 790.6715
P_0 = 580.65
U2_tip = (200) ** 2
s = 0.05
d_0 = 0.3
p = 1.225
A = 0.79
delta_time = 0.1/1000 #0.1ms
# add ons hover veloctiy
# based on https://www.intechopen.com/chapters/57483
m = 1.3 # mass: assume 1.3kg https://www.droneblog.com/average-weights-of-common-types-of-drones/#:~:text=In%20most%20cases%2C%20toy%20drones,What%20is%20this%3F
g = 9.81 # gravity
T = m * g # thrust
v_0 = (T / (A * 2 * p)) ** 0.5
def get_energy_consumption(v_t):
'''
arg
1) v_t = displacement per time slot
'''
energy_1 = P_0 \
+ 3 * P_0 * (abs(v_t)) ** 2 / U2_tip \
+ 0.5 * d_0 * p * s * A * (abs(v_t))**3
energy_2 = P_i * ((
(1 + (abs(v_t) ** 4) / (4 * (v_0 ** 4))) ** 0.5 \
- (abs(v_t) ** 2) / (2 * (v_0 **2)) \
) ** 0.5)
energy = delta_time * (energy_1 + energy_2)
return energy
ENERGY_MIN = get_energy_consumption(0.25)
ENERGY_MAX = get_energy_consumption(0)
######################################################
class MiniSystem(object):
#class MiniSystem(K=1):
"""
define mini RIS communication system with one UAV
and one RIS and one user, one attacker
"""
def __init__(self, UAV_num = 1, RIS_num = 1, user_num = 1, attacker_num = 1, fre = 28e9, \
RIS_ant_num = 16, UAV_ant_num=8, if_dir_link = 1, if_with_RIS = True, \
if_move_users = True, if_movements = True, reverse_x_y = (True, True), \
if_UAV_pos_state = True, reward_design = 'ssr', project_name = None, step_num=100):
self.if_dir_link = if_dir_link
self.if_with_RIS = if_with_RIS
self.if_move_users = if_move_users
self.if_movements = if_movements
self.if_UAV_pos_state = if_UAV_pos_state
self.reverse_x_y = reverse_x_y
self.user_num = user_num
self.attacker_num = attacker_num
self.border = [(-25,25), (0, 50)]
# 1.init entities: 1 UAV, 1 RIS, many users and attackers
self.data_manager = DataManager(file_path='./data', project_name = project_name, \
store_list = ['beamforming_matrix', 'reflecting_coefficient', 'UAV_state', 'user_capacity', 'secure_capacity', 'attaker_capacity','G_power', 'reward','UAV_movement'])
# 1.1 init UAV position and beamforming matrix
self.UAV = UAV(
coordinate=self.data_manager.read_init_location('UAV', 0),
ant_num= UAV_ant_num,
max_movement_per_time_slot=0.25)
self.UAV.G = np.mat(np.ones((self.UAV.ant_num, user_num), dtype=complex), dtype=complex)
self.power_factor = 100
self.UAV.G_Pmax = np.trace(self.UAV.G * self.UAV.G.H) * self.power_factor
# 1.2 init RIS
self.RIS = RIS(\
coordinate=self.data_manager.read_init_location('RIS', 0), \
coor_sys_z=self.data_manager.read_init_location('RIS_norm_vec', 0), \
ant_num=RIS_ant_num)
# 1.3 init users
self.user_list = []
for i in range(user_num):
user_coordinate = self.data_manager.read_init_location('user', i)
user = User(coordinate=user_coordinate, index=i)
user.noise_power = -114
self.user_list.append(user)
# 1.4 init attackers
self.attacker_list = []
for i in range(attacker_num):
attacker_coordinate = self.data_manager.read_init_location('attacker', i)
attacker = Attacker(coordinate=attacker_coordinate, index=i)
attacker.capacity = np.zeros((user_num))
attacker.noise_power = -114
self.attacker_list.append(attacker)
# 1.5 generate the eavesdrop capacity array , shape: P X K
self.eavesdrop_capacity_array= np.zeros((attacker_num, user_num))
# 1.6 reward design
self.reward_design = reward_design # reward_design is ['ssr' or 'see']
# 1.7 step_num
self.step_num = step_num
# 2.init channel
self.H_UR = mmWave_channel(self.UAV, self.RIS, fre)
self.h_U_k = []
self.h_R_k = []
self.h_U_p = []
self.h_R_p = []
for user_k in self.user_list:
self.h_U_k.append(mmWave_channel(user_k, self.UAV, fre))
self.h_R_k.append(mmWave_channel(user_k, self.RIS, fre))
for attacker_p in self.attacker_list:
self.h_U_p.append(mmWave_channel(attacker_p, self.UAV, fre))
self.h_R_p.append(mmWave_channel(attacker_p, self.RIS, fre))
# 3 update user and attaker channel capacity
self.update_channel_capacity()
# 4 draw system
self.render_obj = Render(self)
def reset(self):
"""
reset UAV, users, attackers, beamforming matrix, reflecting coefficient
"""
# 1 reset UAV
self.UAV.reset(coordinate=self.data_manager.read_init_location('UAV', 0))
# 2 reset users
for i in range(self.user_num):
user_coordinate = self.data_manager.read_init_location('user', i)
self.user_list[i].reset(coordinate=user_coordinate)
# 3 reset attackers
for i in range(self.attacker_num):
attacker_coordinate = self.data_manager.read_init_location('attacker', i)
self.attacker_list[i].reset(coordinate=attacker_coordinate)
# 4 reset beamforming matrix
self.UAV.G = np.mat(np.ones((self.UAV.ant_num, self.user_num), dtype=complex), dtype=complex)
self.UAV.G_Pmax = np.trace(self.UAV.G * self.UAV.G.H) * self.power_factor
# 5 reset reflecting coefficient
"""self.RIS = RIS(\
coordinate=self.data_manager.read_init_location('RIS', 0), \
coor_sys_z=self.data_manager.read_init_location('RIS_norm_vec', 0), \
ant_num=16)"""
self.RIS.Phi = np.mat(np.diag(np.ones(self.RIS.ant_num, dtype=complex)), dtype = complex)
# 6 reset time
self.render_obj.t_index = 0
# 7 reset CSI
self.H_UR.update_CSI()
for h in self.h_U_k + self.h_U_p + self.h_R_k + self.h_R_p:
h.update_CSI()
# 8 reset capcaity
self.update_channel_capacity()
def step(self, action_0 = 0, action_1 = 0, G = 0, Phi = 0, set_pos_x = 0, set_pos_y = 0):
"""
test step only move UAV and update channel
"""
# 0 update render
self.render_obj.t_index += 1
# 1 update entities
if self.if_move_users:
self.user_list[0].update_coordinate(0.2, -1/2 * math.pi)
self.user_list[1].update_coordinate(0.2, -1/2 * math.pi)
if self.if_movements:
move_x = action_0 * self.UAV.max_movement_per_time_slot
move_y = action_1 * self.UAV.max_movement_per_time_slot
######################################################
# new for energy
v_t = (move_x ** 2 + move_y ** 2) ** 0.5
#self.data_manager.store_data([v_t],'velocity')
######################################################
if self.reverse_x_y[0]:
move_x = -move_x
if self.reverse_x_y[1]:
move_y = -move_y
self.UAV.coordinate[0] +=move_x
self.UAV.coordinate[1] +=move_y
self.data_manager.store_data([move_x, move_y], 'UAV_movement')
else:
set_pos_x = map_to(set_pos_x, (-1, 1), self.border[0])
set_pos_y = map_to(set_pos_y, (-1, 1), self.border[1])
self.UAV.coordinate[0] = set_pos_x
self.UAV.coordinate[1] = set_pos_y
# 2 update channel CSI
for h in self.h_U_k + self.h_U_p + self.h_R_k + self.h_R_p:
h.update_CSI()
# !!! test to make direct link zero
if self.if_dir_link == 0:
for h in self.h_U_k + self.h_U_p:
h.channel_matrix = np.mat(np.zeros(shape = np.shape(h.channel_matrix)), dtype=complex)
if self.if_with_RIS == False:
self.H_UR.channel_matrix = np.mat(np.zeros((self.RIS.ant_num, self.UAV.ant_num)), dtype=complex)
else:
self.H_UR.update_CSI()
# 3 update beamforming matrix & reflecting phase shift
"""
self.UAV.G = G
self.RIS.Phi = Phi
"""
self.UAV.G = convert_list_to_complex_matrix(G, (self.UAV.ant_num, self.user_num)) * math.pow(self.power_factor, 0.5)
# fix beamforming matrix
#self.UAV.G = np.mat(np.ones((self.UAV.ant_num, self.user_num), dtype=complex), dtype=complex) * math.pow(self.power_factor, 0.5)
if self.if_with_RIS:
self.RIS.Phi = convert_list_to_complex_diag(Phi, self.RIS.ant_num)
# 4 update channel capacity in every user and attacker
self.update_channel_capacity()
# 5 store current system state to .mat
self.store_current_system_sate()
# 6 get new state
new_state = self.observe()
# 7 get reward
reward = self.reward()
# 7.1 reward with energy efficiency
######################################################
if self.reward_design == 'see':
# new for see
energy = energy_raw = get_energy_consumption(v_t)
energy -= ENERGY_MIN
energy /= (ENERGY_MAX - ENERGY_MIN)
energy_penalty = -1 * 0.1 * abs(reward) * energy # -1 * 0.1 * reward * energy
if reward > 0:
reward += energy_penalty
######################################################
# 8 calculate if UAV is cross the bourder
reward = math.tanh(reward) # new for energy (ori not commented)
done = False
x, y = self.UAV.coordinate[0:2]
if x < self.border[0][0] or x > self.border[0][1]:
done = True
reward = -10
if y < self.border[1][0] or y > self.border[1][1]:
done = True
reward = -10
self.data_manager.store_data([reward],'reward')
return new_state, reward, done, []
def reward(self):
"""
used in function step to get the reward of current step
"""
reward = 0
reward_ = 0
P = np.trace(self.UAV.G * self.UAV.G.H)
if abs(P) > abs(self.UAV.G_Pmax) :
reward = abs(self.UAV.G_Pmax) - abs(P)
reward /= self.power_factor
else:
for user in self.user_list:
r = user.capacity - max(self.eavesdrop_capacity_array[:, user.index])
if r < user.QoS_constrain:
reward_ += r - user.QoS_constrain
else:
reward += r/(self.user_num*2)
if reward_ < 0:
reward = reward_ * self.user_num * 10
return reward
def observe(self):
"""
used in function main to get current state
the state is a list with
"""
# users' and attackers' comprehensive channel
comprehensive_channel_elements_list = []
for entity in self.user_list + self.attacker_list:
tmp_list = list(np.array(np.reshape(entity.comprehensive_channel, (1,-1)))[0])
comprehensive_channel_elements_list += list(np.real(tmp_list)) + list(np.imag(tmp_list))
UAV_position_list = []
if self.if_UAV_pos_state:
UAV_position_list = list(self.UAV.coordinate)
return comprehensive_channel_elements_list + UAV_position_list
def store_current_system_sate(self):
"""
function used in step() to store system state
"""
# 1 store beamforming matrix
row_data = list(np.array(np.reshape(self.UAV.G, (1, -1)))[0,:])
self.data_manager.store_data(row_data, 'beamforming_matrix')
# 2 store reflecting coefficient matrix
row_data = list(np.array(np.reshape(diag(self.RIS.Phi), (1,-1)))[0,:])
self.data_manager.store_data(row_data, 'reflecting_coefficient')
# 3 store UAV state
row_data = list(self.UAV.coordinate)
self.data_manager.store_data(row_data, 'UAV_state')
# 4 store user_capicity
row_data = [user.secure_capacity for user in self.user_list] \
+ [user.capacity for user in self.user_list]
# 5 store G_power
row_data = [np.trace(self.UAV.G*self.UAV.G.H), self.UAV.G_Pmax]
self.data_manager.store_data(row_data, 'G_power')
row_data = []
for user in self.user_list:
row_data.append(user.capacity)
self.data_manager.store_data(row_data, 'user_capacity')
row_data = []
for attacker in self.attacker_list:
row_data.append(attacker.capacity)
self.data_manager.store_data(row_data, 'attaker_capacity')
row_data = []
for user in self.user_list:
row_data.append(user.secure_capacity)
self.data_manager.store_data(row_data, 'secure_capacity')
def update_channel_capacity(self):
"""
function used in step to calculate user and attackers' capacity
"""
# 1 calculate eavesdrop rate
for attacker in self.attacker_list:
attacker.capacity = self.calculate_capacity_array_of_attacker_p(attacker.index)
self.eavesdrop_capacity_array[attacker.index, :] = attacker.capacity
# remmeber to update comprehensive_channel
attacker.comprehensive_channel = self.calculate_comprehensive_channel_of_attacker_p(attacker.index)
# 2 calculate unsecure rate
for user in self.user_list:
user.capacity = self.calculate_capacity_of_user_k(user.index)
# 3 calculate secure rate
user.secure_capacity = self.calculate_secure_capacity_of_user_k(user.index)
# remmeber to update comprehensive_channel
user.comprehensive_channel = self.calculate_comprehensive_channel_of_user_k(user.index)
def calculate_comprehensive_channel_of_attacker_p(self, p):
"""
used in update_channel_capacity to calculate the comprehensive_channel of attacker p
"""
h_U_p = self.h_U_p[p].channel_matrix
h_R_p = self.h_R_p[p].channel_matrix
Psi = diag_to_vector(self.RIS.Phi)
H_c = vector_to_diag(h_R_p).H * self.H_UR.channel_matrix
return h_U_p.H + Psi.H * H_c
def calculate_comprehensive_channel_of_user_k(self, k):
"""
used in update_channel_capacity to calculate the comprehensive_channel of user k
"""
h_U_k = self.h_U_k[k].channel_matrix
h_R_k = self.h_R_k[k].channel_matrix
Psi = diag_to_vector(self.RIS.Phi)
H_c = vector_to_diag(h_R_k).H * self.H_UR.channel_matrix
return h_U_k.H + Psi.H * H_c
def calculate_capacity_of_user_k(self, k):
"""
function used in update_channel_capacity to calculate one user
"""
noise_power = self.user_list[k].noise_power
h_U_k = self.h_U_k[k].channel_matrix
h_R_k = self.h_R_k[k].channel_matrix
Psi = diag_to_vector(self.RIS.Phi)
H_c = vector_to_diag(h_R_k).H * self.H_UR.channel_matrix
G_k = self.UAV.G[:, k]
G_k_ = 0
if len(self.user_list) == 1:
G_k_ = np.mat(np.zeros((self.UAV.ant_num, 1), dtype=complex), dtype=complex)
else:
G_k_1 = self.UAV.G[:, 0:k]
G_k_2 = self.UAV.G[:, k+1:]
G_k_ = np.hstack((G_k_1, G_k_2))
alpha_k = math.pow(abs((h_U_k.H + Psi.H * H_c) * G_k), 2)
beta_k = math.pow(np.linalg.norm((h_U_k.H + Psi.H * H_c)*G_k_), 2) + dB_to_normal(noise_power) * 1e-3
return math.log10(1 + abs(alpha_k / beta_k))
def calculate_capacity_array_of_attacker_p(self, p):
"""
function used in update_channel_capacity to calculate one attacker capacities to K users
output is a K length np.array ,shape: (K,)
"""
K = len(self.user_list)
noise_power = self.attacker_list[p].noise_power
h_U_p = self.h_U_p[p].channel_matrix
h_R_p = self.h_R_p[p].channel_matrix
Psi = diag_to_vector(self.RIS.Phi)
H_c = vector_to_diag(h_R_p).H * self.H_UR.channel_matrix
if K == 1:
G_k = self.UAV.G
G_k_ = np.mat(np.zeros((self.UAV.ant_num, 1), dtype=complex), dtype=complex)
alpha_p = math.pow(abs((h_U_p.H + Psi.H * H_c) * G_k), 2)
beta_p = math.pow(np.linalg.norm((h_U_p.H + Psi.H * H_c)*G_k_), 2) + dB_to_normal(noise_power) * 1e-3
return np.array([math.log10(1 + abs(alpha_p / beta_p))])
else:
result = np.zeros(K)
for k in range(K):
G_k = G_k = self.UAV.G[:, k]
G_k_1 = self.UAV.G[:, 0:k]
G_k_2 = self.UAV.G[:, k+1:]
G_k_ = np.hstack((G_k_1, G_k_2))
alpha_p = math.pow(abs((h_U_p.H + Psi.H * H_c) * G_k), 2)
beta_p = math.pow(np.linalg.norm((h_U_p.H + Psi.H * H_c)*G_k_), 2) + dB_to_normal(noise_power) * 1e-3
result[k] = math.log10(1 + abs(alpha_p / beta_p))
return result
def calculate_secure_capacity_of_user_k(self, k=2):
"""
function used in update_channel_capacity to calculate the secure rate of user k
"""
user = self.user_list[k]
R_k_unsecure = user.capacity
R_k_maxeavesdrop = max(self.eavesdrop_capacity_array[:, k])
secrecy_rate= max(0, R_k_unsecure - R_k_maxeavesdrop)
return secrecy_rate
def get_system_action_dim(self):
"""
function used in main function to get the dimention of actions
"""
result = 0
# 0 UAV movement
result += 2
# 1 RIS reflecting elements
if self.if_with_RIS:
result += self.RIS.ant_num
else:
result += 0
# 2 beamforming matrix dimention
result += 2 * self.UAV.ant_num * self.user_num
return result
def get_system_state_dim(self):
"""
function used in main function to get the dimention of states
"""
result = 0
# users' and attackers' comprehensive channel
result += 2 * (self.user_num + self.attacker_num) * self.UAV.ant_num
# UAV position
if self.if_UAV_pos_state:
result += 3
return result
================================================
FILE: env1.py
================================================
import numpy as np
import math
import cmath
np.random.seed(42)
class minimal_IRS_system():
def __init__(self, BS_M = 8, IRS_N_x = 4, IRS_N_y =2, K = 8, statistic = False):
self.M = BS_M # number of BS antennas
self.N = IRS_N_x * IRS_N_y # number of IRS elements
self.K = K # number of users
self.z_BS = 0 # hight of BS antenas
self.BS_rotation = 0 #
self.BS_coordinate = np.array([0, 0, 30])
self.BS_elevation_angle = 0
self.BS_max_power = 1000 # BS power constrain, W
self.BS_normal_vecter = np.array([math.cos(self.BS_rotation), math.sin(self.BS_rotation), math.sin(self.BS_elevation_angle)*np.linalg.norm([-math.cos(self.BS_rotation), math.sin(self.BS_rotation)])])
self.BS_normal_vecter = self.BS_normal_vecter / np.linalg.norm(self.BS_normal_vecter)
# channel parameters class
class channel_parameters():
"""
channel parameters
"""
def __init__(self):
self.noise_dBm = noise_dBm = -114 # channel noise, -114 dBm
self.noise_segma = 0 # channel noise std,
self.d_0 = 1 # path loss referance distance 1 m
self.rho_0 = 0.01 # path loss parameter
self.alpha_BS_to_IRS = 3 # path loss exponent
self.alpha_IRS_to_user = 2.5 # path loss exponent
self.alpha_BS_to_user = 3.5 # path loss exponent
self.K_BS = math.pow(10, 3/10) # rician factor, 3dB
self.K_IRS = math.pow(10, 3/10) # rician factor, 3dB
self.channel_parameters = channel_parameters()
self.channel_parameters.noise_segma = math.pow(self.dBm_to_W(self.channel_parameters.noise_dBm),0.5)
# IRS initial
class IRS():
"""
store user parameters
"""
def __init__(self):
self.IRS_N_x = IRS_N_x
self.IRS_N_y = IRS_N_y
self.IRS_N = IRS_N_x * IRS_N_y
self.x_IRS = 1000 # position x of IRS, 1000 m
self.y_IRS = 0 # position y of IRS, 0 m
self.z_IRS = 25 # position z of IRS, 0 m
self.IRS_coordinate = np.array([self.x_IRS,self.y_IRS,self.z_IRS])
self.deltaD_to_lambda = 0.5
self.d_BS_to_IRS = np.linalg.norm(self.IRS_coordinate)
self.theta_IRS = math.atan(self.y_IRS / self.x_IRS) # theta_IRS
self.psi_IRS = 0 # rotation of IRS , range: -pi/2 + theta_IRS ~ pi/2 - theta_IRS
self.elevation_angle = 0 # elevation angle
self.IRS_normal_vecter = np.array([-math.cos(self.psi_IRS), math.sin(self.psi_IRS),math.sin(self.elevation_angle)*np.linalg.norm([-math.cos(self.psi_IRS), math.sin(self.psi_IRS)])])
self.IRS_normal_vecter = self.IRS_normal_vecter / np.linalg.norm(self.IRS_normal_vecter)
self.theta_BS_to_IRS = 0
self.psi_BS_to_IRS = 0
self.theta_IRS_to_BS = 0
self.psi_IRS_to_BS = 0
self.IRS = IRS()
# users initial
class user():
"""
store user parameters
"""
def __init__(self):
self.user_index = 0 # user ID
self.distance_to_IRS = 10 # distance to IRS(need caculation), range of (10, 30)
self.psi = 0 # psi angle to IRS
self.psi_IRS_surface = 0 # psi used to calculate the UPA response
self.x_user = 10 # position x
self.y_user = 10 # position y
self.z_user = 0 # position z
self.user_coordinate = np.array([self.x_user,self.y_user,self.z_user])
self.distance_to_BS = np.linalg.norm(self.user_coordinate) # distance to BS
self.theta_user = 0 # theta to BS
self.theta_IRS_to_kth_user = 0
self.psi_IRS_to_kth_user = 0
mini_distance = 10 # minimal distance for users to IRS, m
max_distance = 30 # maxmial distance for users to IRS, m
statistic_distance_list = [10, 12, 14, 17, 20, 23, 25, 29]
statistic_angle_list = [0, math.pi/4, math.pi/2, 3*math.pi/4, math.pi, 5*math.pi/4, 3*math.pi/2, 7*math.pi/4]
stocastic_distance_list = np.random.uniform(mini_distance, max_distance, self.K) # list of dis
stocastic_angle_list = np.random.uniform(- math.pi / 2, math.pi / 2, self.K) # list of angle
self.user_list = []
for index_user in range(K):
user_temp = user()
user_temp.user_index = index_user
if statistic == True:
user_temp.distance_to_IRS = statistic_distance_list[index_user]
user_temp.psi = statistic_angle_list[index_user]
user_temp.x_user = self.IRS.x_IRS - user_temp.distance_to_IRS * math.cos(user_temp.psi)
user_temp.y_user = self.IRS.y_IRS + user_temp.distance_to_IRS * math.sin(user_temp.psi)
else:
user_temp.distance_to_IRS = stocastic_distance_list[index_user]
user_temp.psi = stocastic_angle_list[index_user]
user_temp.x_user = self.IRS.x_IRS - user_temp.distance_to_IRS * math.cos(user_temp.psi)
user_temp.y_user = self.IRS.y_IRS + user_temp.distance_to_IRS * math.sin(user_temp.psi)
user_temp.psi_IRS_surface = user_temp.psi - self.IRS.psi_IRS
#user_temp.distance_to_BS = np.linalg.norm([user_temp.x_user, user_temp.y_user])
user_temp.theta_user = math.atan(user_temp.y_user / user_temp.x_user )
user_temp.user_coordinate = np.array([user_temp.x_user,user_temp.y_user,user_temp.z_user])
self.user_list.append(user_temp)
# transmit signal
self.X_transmit = np.mat(np.ones((K, 1))) # BS transmit signal, unit variance, shape: K X 1
# active beamforming at BS
self.G_beamforming = np.mat(np.ones((self.M, K),dtype=complex)) # BS beamforming matrix, shape: M X K
self.BS_max_power = self.calculate_total_transmit_power()
self.reset_G = np.mat(np.ones((self.M, K),dtype=complex))
# channal gain from BS to IRS
#self.H_BS_to_IRS = np.mat(np.ones((self.N, self.M))) # channel gain BS to IRS, shape: N X M
self.H_BS_to_IRS = self.calculate_channel_BS_to_IRS(self.IRS)
# channal gain from IRS to users
# self.H_IRS_to_user = np.mat(np.ones((K, self.N),dtype=complex)) # channel gain IRS to user, shape: K X N
self.H_IRS_to_user = self.calculate_channel_IRS_to_user(self.IRS)
# IRS phase shifter parameter
self.Fai = np.mat(np.identity(self.N), dtype=complex) # IRS phase shift
self.reset_Fai = np.mat(np.identity(self.N), dtype=complex)
# data rate matrex container
self.data_rate = np.mat(np.ones((K,1))) # channal capacities for all K users
# calculate sum of all user data rate
self.sum_of_user_data_rates = self.calculate_data_rate()
def calculate_total_transmit_power(self):
"""
calculate total power
"""
G = self.G_beamforming
G_H = G.H
temp_power = 0
for m in range(self.M):
temp_power += G[m,:] * G_H[:, m]
return float(np.real(temp_power[0,0]))
def dBm_to_W(self, dBm):
mW = math.pow(10, dBm/10)
W = mW/1000.0
return W
def generate_noise(self, mu, segma, size = 1):
"""
generates noise of variance segma ^ 2
size can be a tuple like (3,3)
output : a float of noise
or an array with size size of average mu and std of segma
"""
result = np.random.normal(mu, segma, size)
if size == 1:
return float(result)
else:
return result
def calculate_received_signal_amplitude(self):
"""
calculate_received_signal_amplitude from BS to user
output : a matix , size: K X 1
"""
noise_array = self.generate_noise(0, self.channel_parameters.noise_segma, (self.K,1))
noise_matrix = np.mat(noise_array)
Y_matrix = self.H_IRS_to_user * self.Fai * self.H_BS_to_IRS * self.G_beamforming * self.X_transmit + noise_matrix
return Y_matrix
def calculate_SINR_of_kth_user(self, k):
"""
calculate kth user's SINR
output : a float
"""
expected_amplitude = abs(self.H_IRS_to_user[k] * self.Fai * self.H_BS_to_IRS * self.G_beamforming[:, k])
expected_power = math.pow(expected_amplitude, 2)
noise_power = 0
noise_power = noise_power + math.pow(self.channel_parameters.noise_segma, 2)
for n in range(self.K):
if n != k:
noise_power = noise_power + math.pow(abs(self.H_IRS_to_user[n] * self.Fai * self.H_BS_to_IRS * self.G_beamforming[:, n]),2)
return expected_power/noise_power
def calculate_data_rate(self):
"""
calculate K data rates
output : sum of datarate
"""
for kth in range(self.K):
self.data_rate[kth][0] = math.log2(1 + self.calculate_SINR_of_kth_user(kth))
return float(self.data_rate.sum())
def calculate_theta_and_psi(self, ref_norm_vector, coordinate):
"""
calculate_theta_and_psi
"""
theta_ref_to_coor = math.acos(abs(np.dot(coordinate/ np.linalg.norm(coordinate),ref_norm_vector)))
coordinate_xoy = coordinate - abs(math.cos(theta_ref_to_coor)) * np.linalg.norm(coordinate) * ref_norm_vector
temp_x_vector = [ref_norm_vector[1], -ref_norm_vector[0], 0]
psi_ref_to_coor = abs(np.dot(temp_x_vector, coordinate_xoy)) / (np.linalg.norm(temp_x_vector) * np.linalg.norm(coordinate_xoy))
if math.isnan(psi_ref_to_coor):
psi_ref_to_coor = 0
return theta_ref_to_coor, psi_ref_to_coor
def calculate_channel_BS_to_IRS(self, IRS): # IRS elements spacing half wavelength
"""
denate the self.H_BS_to_IRS (N X M)
"""
temp_vertical_matrix = np.zeros((IRS.IRS_N,1), dtype=complex)
temp_horizontal_matrix = np.zeros((1, self.M), dtype=complex)
"""
theta_BS_to_IRS = math.acos(np.dot(IRS.IRS_coordinate/ np.linalg.norm(IRS.IRS_coordinate),self.BS_normal_vecter))
IRS_xoy = IRS.IRS_coordinate - abs(math.cos(theta_BS_to_IRS)) * np.linalg.norm(IRS.IRS_coordinate) * self.BS_normal_vecter
temp_x_BS = [self.BS_normal_vecter[1], -self.BS_normal_vecter[0], 0]
psi_BS_to_IRS = abs(np.dot(temp_x_BS, IRS_xoy)) / (np.linalg.norm(temp_x_BS) * np.linalg.norm(IRS_xoy))
if math.isnan(psi_BS_to_IRS):
psi_BS_to_IRS = 0
theta_IRS_to_BS = math.acos(abs(np.dot(IRS.IRS_coordinate/ np.linalg.norm(IRS.IRS_coordinate), IRS.IRS_normal_vecter)))
BS_xoy = -1 * np.array(IRS.IRS_coordinate) - abs(math.cos(theta_IRS_to_BS)) * np.linalg.norm(IRS.IRS_coordinate) * IRS.IRS_normal_vecter
temp_x_IRS = [IRS.IRS_normal_vecter[1], -IRS.IRS_normal_vecter[0], 0]
psi_IRS_to_BS = abs(np.dot(temp_x_IRS, BS_xoy)) / (np.linalg.norm(temp_x_IRS) * np.linalg.norm(BS_xoy))
if math.isnan(psi_IRS_to_BS):
psi_IRS_to_BS = 0
# old way to calculate theta and psi
"""
theta_BS_to_IRS, psi_BS_to_IRS = self.calculate_theta_and_psi(self.BS_normal_vecter, IRS.IRS_coordinate - self.BS_coordinate)
theta_IRS_to_BS, psi_IRS_to_BS = self.calculate_theta_and_psi(IRS.IRS_normal_vecter, self.BS_coordinate - IRS.IRS_coordinate)
IRS.theta_BS_to_IRS = theta_BS_to_IRS
IRS.psi_BS_to_IRS = psi_BS_to_IRS
IRS.theta_IRS_to_BS = theta_IRS_to_BS
IRS.psi_IRS_to_BS = psi_IRS_to_BS
for i in range(IRS.IRS_N_y):
for j in range(IRS.IRS_N_x):
temp_vertical_matrix[i * IRS.IRS_N_x + j][0] = \
cmath.exp(-1j * 2 * math.pi * IRS.deltaD_to_lambda * \
(i * math.sin(theta_IRS_to_BS)* math.cos(psi_IRS_to_BS)+ \
j * math.sin(theta_IRS_to_BS)* math.sin(psi_IRS_to_BS)) \
)
for i in range(self.M):
temp_horizontal_matrix[0][i] = \
cmath.exp(-1j * 2 * math.pi * IRS.deltaD_to_lambda * \
(i * math.sin(theta_BS_to_IRS)* math.cos(psi_BS_to_IRS)) \
)
temp_horizontal_matrix = np.mat(temp_horizontal_matrix)
temp_vertical_matrix = np.mat(temp_vertical_matrix)
H_LOS = temp_vertical_matrix * temp_horizontal_matrix
H = math.pow(self.channel_parameters.rho_0* (math.pow(IRS.d_BS_to_IRS/self.channel_parameters.d_0, - self.channel_parameters.alpha_BS_to_IRS)), 0.5) \
* (H_LOS)
return H
def calculate_channel_IRS_to_user(self, IRS):
"""
channel gain IRS to user, shape: K X N
"""
temp_array = np.zeros((self.K, self.N), dtype=complex)
for kth in range(self.K):
theta_IRS_to_kth_user, psi_IRS_to_kth_user = self.calculate_theta_and_psi(IRS.IRS_normal_vecter, self.user_list[kth].user_coordinate - IRS.IRS_coordinate)
self.user_list[kth].theta_IRS_to_kth_user = theta_IRS_to_kth_user
self.user_list[kth].psi_IRS_to_kth_user = psi_IRS_to_kth_user
for i in range(IRS.IRS_N_y):
for j in range(IRS.IRS_N_x):
temp_array[kth][i * IRS.IRS_N_x + j] = \
cmath.exp(-1j * 2 * math.pi * IRS.deltaD_to_lambda * \
(i * math.sin(theta_IRS_to_kth_user)* math.cos(psi_IRS_to_kth_user)+ \
j * math.sin(theta_IRS_to_kth_user)* math.sin(psi_IRS_to_kth_user)) \
)
H_LOS = temp_array
for kth in range(self.K):
H_LOS[kth] = math.pow(self.channel_parameters.rho_0* (math.pow(np.linalg.norm(IRS.IRS_coordinate - self.user_list[kth].user_coordinate)/self.channel_parameters.d_0, - self.channel_parameters.alpha_BS_to_IRS)), 0.5) \
* (H_LOS[kth])
H = np.mat(H_LOS)
return H
def reset(self):
"""
RL implement,
"""
# 1 reset all parameters
self.G_beamforming = self.reset_G
self.Fai = self.reset_Fai
# 2 get state
return self.get_state()
def apply_action(self, action):
"""
apply action
"""
# 0 parameters and all kinds of matrix
K = self.K
M = self.M
N = self.N
# 1 apply action
# 1.1 divide action list into two parts
# 1.1.1 beamforming part, MXK
action_beamforming = action[0 : 2*M*K]
# 1.1.2 IRS reflecting coinfetions
action_Fai = action[2*M*K : 2*M*K + 2*N]
# 1.2 apply action
# 1.2.1 beamforming part, MXK
for m in range(M):
for k in range(K):
index = m * K + k
temp_Gmk = action_beamforming[2*index] + 1j * action_beamforming[2*index + 1]
self.G_beamforming[m, k] = temp_Gmk
# 1.2.2 IRS reflecting coinfetions
for n in range(N):
temp_Fainn = action_Fai[2*n] + 1j * action_Fai[2*n + 1]
self.Fai[n, n] = temp_Fainn
return True
def get_state(self):
"""
get current state
"""
# 0 parameters and all kinds of matrix
K = self.K
M = self.M
N = self.N
new_state_dims = 2*K + 2*K**2 + 2*N + 2*M*K + 2*N*M + 2*K*N
# 1 get new state
new_state = []
G = self.G_beamforming
G_H = self.G_beamforming.H
H_IRS_to_user = self.H_IRS_to_user
Fai = self.Fai
H_BS_to_IRS = self.H_BS_to_IRS
# 1.1 transmit power to k users, 2K
state_tran_power_BS_to_users = [] # result container
for kth in range(K):
state_tran_power_BS_to_users.append(\
math.pow(float(np.real(G_H[kth,:] * G[:,kth])),2))
state_tran_power_BS_to_users.append(\
math.pow(float(np.imag(G_H[kth,:] * G[:,kth])),2))
# 1.2 received power for users, 2K^2
state_action_receive_power_for_users = []
for k1 in range(K):
user_k1_received_power_from_all_other_users = \
H_IRS_to_user[k1, :] * Fai * H_BS_to_IRS * G
for k2 in range(K):
state_action_receive_power_for_users.append(\
math.pow(np.real(user_k1_received_power_from_all_other_users[0, k2]), 2))
state_action_receive_power_for_users.append(\
math.pow(np.imag(user_k1_received_power_from_all_other_users[0, k2]), 2))
# 1.3 beamforming at BS, 2*M*K+2*N
state_beamforming_at_BS = []
for m in range(M):
for k in range(K):
state_beamforming_at_BS.append(\
math.pow(np.real(G[m, k]), 2))
state_beamforming_at_BS.append(\
math.pow(np.imag(G[m, k]), 2))
for n in range(N):
state_beamforming_at_BS.append(\
math.pow(np.real(Fai[n, n]), 2))
state_beamforming_at_BS.append(\
math.pow(np.imag(Fai[n, n]), 2))
# 1.4 channel state at both BS-IRS & IRS-users, 2*N*M+2*K*M
state_channel_BS_IRS_and_IRS_users = []
for n in range(N):
for m in range(M):
state_channel_BS_IRS_and_IRS_users.append(\
math.pow(np.real(H_BS_to_IRS[n, m]), 2))
state_channel_BS_IRS_and_IRS_users.append(\
math.pow(np.imag(H_BS_to_IRS[n, m]), 2))
for k in range(K):
for n in range(N):
state_channel_BS_IRS_and_IRS_users.append(\
math.pow(np.real(H_IRS_to_user[k, n]), 2))
state_channel_BS_IRS_and_IRS_users.append(\
math.pow(np.imag(H_IRS_to_user[k, n]), 2))
# 1.5 finnish getting state
new_state.append(state_tran_power_BS_to_users)
new_state.append(state_action_receive_power_for_users)
new_state.append(state_beamforming_at_BS)
new_state.append(state_channel_BS_IRS_and_IRS_users)
result = state_tran_power_BS_to_users + \
state_action_receive_power_for_users + \
state_beamforming_at_BS + \
state_channel_BS_IRS_and_IRS_users
return result
def step(self, action):
"""
RL function ,state is a 1 X (2K+2K^2+2N+2MK+2NM+2KN) vecter
"""
done = False
# 1 get new state
new_state = self.get_state()
# 2 apply action
self.apply_action(action)
# 3 judge if done
total_power = self.calculate_total_transmit_power()
if total_power > self.BS_max_power:
done = True
# 4 get other reward
reward = 0
if total_power > self.BS_max_power:
reward = np.real(self.BS_max_power - total_power)
# reward = np.real(self.BS_max_power[0,0] - total_power[0,0] )/1000
else:
reward = self.calculate_data_rate()*100
info = 0
return new_state, reward, done, info
def render(self):
"""
show system function
"""
return True
================================================
FILE: load_and_plot.py
================================================
import matplotlib.pyplot as plt
import numpy as np
import cmath
from scipy.io import loadmat, savemat
import pandas as pd
import os
import copy
import math
import csv
import argparse
# get argument from user
parser = argparse.ArgumentParser()
parser.add_argument('--path', type = str, required = False, default=None, help='the path where the training/simulation data is stored')
parser.add_argument('--ep-num', type = int, required = False, default=300, help='total number of episodes')
# extract argument
args = parser.parse_args()
STORE_PATH = args.path
EP_NUM = args.ep_num
######################################################
# new for energy
# energy related parameters of rotary-wing UAV
# based on Energy Minimization in Internet-of-Things System Based on Rotary-Wing UAV
P_i = 790.6715
P_0 = 580.65
U2_tip = (200) ** 2
s = 0.05
d_0 = 0.3
p = 1.225
A = 0.79
delta_time = 0.1 #0.1/1000 #0.1ms
# add ons hover veloctiy
# based on https://www.intechopen.com/chapters/57483
m = 1.3 # mass: assume 1.3kg https://www.droneblog.com/average-weights-of-common-types-of-drones/#:~:text=In%20most%20cases%2C%20toy%20drones,What%20is%20this%3F
g = 9.81 # gravity
T = m * g # thrust
v_0 = (T / (A * 2 * p)) ** 0.5
def get_energy_consumption(v_t):
'''
arg
1) v_t = displacement per time slot
'''
energy_1 = P_0 \
+ 3 * P_0 * (abs(v_t)) ** 2 / U2_tip \
+ 0.5 * d_0 * p * s * A * (abs(v_t))**3
energy_2 = P_i * ((
(1 + (abs(v_t) ** 4) / (4 * (v_0 ** 4))) ** 0.5 \
- (abs(v_t) ** 2) / (2 * (v_0 **2)) \
) ** 0.5)
energy = delta_time * (energy_1 + energy_2)
return energy
ENERGY_MIN = get_energy_consumption(0.25)
ENERGY_MAX = get_energy_consumption(0)
######################################################
# modified from data_manager.py
init_data_file = 'data/init_location.xlsx'
def read_init_location(entity_type = 'user', index = 0):
if entity_type == 'user' or 'attacker' or 'RIS' or 'RIS_norm_vec' or 'UAV':
return np.array([\
pd.read_excel(init_data_file, sheet_name=entity_type)['x'][index],\
pd.read_excel(init_data_file, sheet_name=entity_type)['y'][index],\
pd.read_excel(init_data_file, sheet_name=entity_type)['z'][index]])
else:
return None
# load and plot everything
class LoadAndPlot(object):
"""
load date and plot 2022-07-22 16_16_26
"""
def __init__(self, store_path, \
user_num = 2, attacker_num = 1, RIS_ant_num = 4, \
ep_num = EP_NUM, step_num = 100): # RIS_ant_num = 16 (not true)
self.color_list = ['b', 'c', 'g', 'k', 'm', 'r', 'y']
self.store_path = store_path + '//'
self.user_num = user_num
self.attacker_num = attacker_num
self.RIS_ant_num = RIS_ant_num
self.ep_num = ep_num
self.step_num = step_num
self.all_steps = self.load_all_steps()
def load_one_ep(self, file_name):
m = loadmat(self.store_path + file_name)
return m
def load_all_steps(self):
result_dic = {}
result_dic.update({'reward':[]})
result_dic.update({'user_capacity':[]})
for i in range(self.user_num):
result_dic['user_capacity'].append([])
result_dic.update({'secure_capacity':[]})
for i in range(self.user_num):
result_dic['secure_capacity'].append([])
result_dic.update({'attaker_capacity':[]})
for i in range(self.attacker_num):
result_dic['attaker_capacity'].append([])
result_dic.update({'RIS_elements':[]})
for i in range(self.RIS_ant_num):
result_dic['RIS_elements'].append([])
for ep_cnt in range(self.ep_num):
mat_ep = self.load_one_ep("simulation_result_ep_" + str(ep_cnt) + ".mat")
one_ep_reward = mat_ep["result_" + str(ep_cnt)]["reward"][0][0]
result_dic['reward'] += list(one_ep_reward[:, 0])
one_ep_user_capacity = mat_ep["result_" + str(ep_cnt)]["user_capacity"][0][0]
for i in range(self.user_num):
result_dic['user_capacity'][i] += list(one_ep_user_capacity[:, i])
one_ep_secure_capacity = mat_ep["result_" + str(ep_cnt)]["secure_capacity"][0][0]
for i in range(self.user_num):
result_dic['secure_capacity'][i] += list(one_ep_secure_capacity[:, i])
one_ep_attaker_capacity = mat_ep["result_" + str(ep_cnt)]["attaker_capacity"][0][0]
for i in range(self.attacker_num):
result_dic['attaker_capacity'][i] += list(one_ep_attaker_capacity[:, i])
one_ep_RIS_first_element = mat_ep["result_" + str(ep_cnt)]["reflecting_coefficient"][0][0]
for i in range(self.RIS_ant_num):
result_dic['RIS_elements'][i] += list(one_ep_RIS_first_element[:, i])
return result_dic
def plot(self):
"""
plot result
b--blue c--cyan(青色) g--green k--black m--magenta(紫红色) r--red w--white y--yellow
"""
if not os.path.exists(self.store_path + 'plot'):
os.makedirs(self.store_path + 'plot')
os.makedirs(self.store_path + 'plot/RIS')
color_list = ['b', 'g', 'c', 'k', 'm', 'r', 'y']
###############################
# read step counts per episode
###############################
step_num_per_episode = []
with open(self.store_path + 'step_num_per_episode.csv', newline='') as csvfile:
reader = csv.reader(csvfile)
for row in reader:
step_num_per_episode.append(int(row[0]))
###############################
# plot reward
###############################
fig = plt.figure('reward')
plt.plot(range(len(self.all_steps['reward'])), self.all_steps['reward'])
plt.xlabel("Time Steps ($t$)")
plt.ylabel("Reward")
plt.savefig(self.store_path + 'plot/reward.png')
plt.cla()
###############################
# plot secure capacity
###############################
fig = plt.figure('secure_capacity')
for i in range(self.user_num):
plt.plot(range(len(self.all_steps['secure_capacity'][i])), self.all_steps['secure_capacity'][i], c=color_list[i])
plt.legend(['user_' + str(i) for i in range(self.user_num)])
plt.xlabel("Time Steps ($t$)")
plt.ylabel("Secure Capacity")
plt.savefig(self.store_path + 'plot/secure_capacity.png')
plt.cla()
###############################
# plot average sum secrecy rate of each episode
###############################
fig = plt.figure('average_sum_secrecy_rate')
sum_secrecy_rate = np.array(self.all_steps['secure_capacity'])
sum_secrecy_rate = np.sum(sum_secrecy_rate, axis = 0)
average_sum_secrecy_rate = []
ssr = []
j = 0
for i in range(self.ep_num):
ssr_one_episode = sum_secrecy_rate[j:j+step_num_per_episode[i]] # ssr means Sum Secrecy Rate
#print(j, j+step_num_per_episode[i])
j = j+step_num_per_episode[i]
ssr.append(ssr_one_episode)
try:
_ = sum(ssr_one_episode) / len(ssr_one_episode)
except:
_ = 0
average_sum_secrecy_rate.append(_)
plt.plot(range(len(average_sum_secrecy_rate)), average_sum_secrecy_rate)
plt.xlabel("Episodes (Ep)")
plt.ylabel("Average Sum Secrecy Rate")
plt.savefig(self.store_path + 'plot/average_sum_secrecy_rate.png')
plt.cla()
print()
print('###########################################################')
print('Metrics\t\t\tLast Episode\tMax Values Reached')
print('###########################################################')
print('SSR (bits/s/Hz)\t\t{:.2f}\t\t{:.2f}'.format(average_sum_secrecy_rate[-1], max(average_sum_secrecy_rate)))
###############################
# plot secrecy energy efficient
###############################
fig = plt.figure('average_secrecy_energy_efficiency')
# get init location
init_uav_coord = read_init_location(entity_type = 'UAV')
init_user_coord_0 = read_init_location(entity_type = 'user', index=0)
init_user_coord_1 = read_init_location(entity_type = 'user', index=1)
ep_num = EP_NUM
energies = []
for i in range(ep_num):
# read the mat file
filename = f'simulation_result_ep_{i}.mat'
filename = os.path.join(STORE_PATH, filename)
data = loadmat(filename)
# v_ts
energies_one_episode = []
# loop all uav movt
uav_movt = data[f'result_{i}'][0][0][-1]
for j in range(uav_movt.shape[0]):
move_x = uav_movt[j][0]
move_y = uav_movt[j][1]
v_t = (move_x ** 2 + move_y ** 2) ** 0.5
energy = get_energy_consumption(v_t / delta_time)
energies_one_episode.append(energy)
energies.append(energies_one_episode)
average_see = []
for ssr_one_episode, energies_one_episode in zip(ssr, energies):
ssr_one_episode = ssr_one_episode[:len(energies_one_episode)]
energies_one_episode = energies_one_episode[:len(ssr_one_episode)]
#print(len(ssr_one_episode), len(energies_one_episode))
try:
see = np.array(ssr_one_episode) / np.array(energies_one_episode)
average_see.append(sum(see)/len(see))
except:
average_see.append(0)
plt.plot(range(len(average_see)), average_see)
plt.xlabel("Episodes (Ep)")
plt.ylabel("Average Secrecy Energy Efficiency")
plt.savefig(self.store_path + 'plot/average_secrecy_energy_efficiency.png')
plt.cla()
print('Energy (kJ)\t\t{:.2f}\t\t{:.2f}'.format(sum(energies[-1])/1000, sum(energies[np.argmax(average_see)])/1000))
print('SEE (bits/s/Hz/kJ)\t{:.2f}\t\t{:.2f}'.format(average_see[-1]*1000, max(average_see)*1000))
print('\nThe final performance is evalulated based on the Last Episode (where exploration=0)\n')
###############################
# plot user capacity
###############################
fig = plt.figure('user_capacity')
for i in range(self.user_num):
plt.plot(range(len(self.all_steps['user_capacity'][i])), self.all_steps['user_capacity'][i], c=color_list[i])
plt.legend(['user_' + str(i) for i in range(self.user_num)])
plt.xlabel("Time Steps ($t$)")
plt.ylabel("User Capacity")
plt.savefig(self.store_path + 'plot/user_capacity.png')
plt.cla()
###############################
# plot attacker capacity
###############################
fig = plt.figure('attaker_capacity')
for i in range(self.attacker_num):
plt.plot(range(len(self.all_steps['attaker_capacity'][i])), self.all_steps['attaker_capacity'][i], c=color_list[i])
plt.legend(['attacker_' + str(i) for i in range(self.attacker_num)])
plt.xlabel("Time Steps ($t$)")
plt.ylabel("Attack Capacity")
plt.savefig(self.store_path + 'plot/attaker_capacity.png')
plt.close('all')
###############################
# plot ris
###############################
for i in range(self.RIS_ant_num):
self.plot_one_RIS_element(i)
###############################
# plot trajectory
###############################
self.plot_trajectory()
def plot_one_RIS_element(self, index):
"""
docstring
"""
ax_real_imag = plt.subplot(1,1,1)
ax_pase = ax_real_imag.twinx()
#plt.ylim(ymax = 1, ymin = -1)
#plt.xlim(xmax = 10000 , xmin = 10000 - 100)
ax_real_imag.plot(range(len(self.all_steps['RIS_elements'][index])), np.real(self.all_steps['RIS_elements'][index]), c = self.color_list[0])
ax_real_imag.plot(range(len(self.all_steps['RIS_elements'][index])), np.imag(self.all_steps['RIS_elements'][index]), c = self.color_list[1])
phase_list = []
for complex_num in self.all_steps['RIS_elements'][index]:
phase_list.append(cmath.phase(complex_num))
plt.ylim(ymax = cmath.pi, ymin = -cmath.pi)
ax_pase.plot(range(len(self.all_steps['RIS_elements'][index])), phase_list, c = self.color_list[2])
# plt.xlabel("Time Steps ($t$)")
# plt.ylabel("RIS Dimension")
# plt.set_ylabel("position")
# plt.set_ylabel("position")
# plt.set_xlabel("Time Steps ($t$)")
plt.savefig(self.store_path + 'plot/RIS/RIS_' + str(index) + '_element.png')
plt.close('all')
pass
def plot_trajectory(self):
# get init location
init_uav_coord = read_init_location(entity_type = 'UAV')
init_user_coord_0 = read_init_location(entity_type = 'user', index=0)
init_user_coord_1 = read_init_location(entity_type = 'user', index=1)
ep_num = EP_NUM
interval = int(0.2 * EP_NUM)
ep_list = [0] + [i for i in range(20-1, ep_num, interval)]
if EP_NUM - 1 not in ep_list: ep_list.append(EP_NUM - 1)
color_list_template = ['b', 'g', 'c', 'k', 'm', 'r', 'y', 'black', 'red']
color_list = copy.deepcopy(color_list_template)
for i in ep_list:
# read the mat file
filename = f'simulation_result_ep_{i}.mat'
filename = os.path.join(STORE_PATH, filename)
data = loadmat(filename)
# uav movt
uav_coord = [ [init_uav_coord[0]], [init_uav_coord[1]] ]
uav_movt = data[f'result_{i}'][0][0][-1]
for j in range(uav_movt.shape[0]):
move_x = uav_movt[j][0]
move_y = uav_movt[j][1]
prev_x = uav_coord[0][-1]
prev_y = uav_coord[1][-1]
current_x = prev_x + move_x
current_y = prev_y + move_y
uav_coord[0].append(current_x)
uav_coord[1].append(current_y)
plt.plot(uav_coord[1],uav_coord[0], c=color_list.pop(0))
# user 0 movt
direction_fai = -1/2*math.pi
distance_delta_d = 0.2
user_coord_0 = [ [init_user_coord_0[0]], [init_user_coord_0[1]] ]
#color_list = copy.deepcopy(color_list_template)
for j in range(uav_movt.shape[0]):
delta_x = distance_delta_d * math.cos(direction_fai)
delta_y = distance_delta_d * math.sin(direction_fai)
prev_x = user_coord_0[0][-1]
prev_y = user_coord_0[1][-1]
current_x = prev_x + delta_x
current_y = prev_y + delta_y
user_coord_0[0].append(current_x)
user_coord_0[1].append(current_y)
plt.plot(user_coord_0[1],user_coord_0[0], c=color_list.pop(0))
plt.plot(user_coord_0[1][0], user_coord_0[0][0], marker="o", markersize=10, markeredgecolor="red", markerfacecolor="red")
# user 1 movt
direction_fai = -1/2*math.pi
distance_delta_d = 0.2
user_coord_0 = [ [init_user_coord_1[0]], [init_user_coord_1[1]] ]
#color_list = copy.deepcopy(color_list_template)
for j in range(uav_movt.shape[0]):
delta_x = distance_delta_d * math.cos(direction_fai)
delta_y = distance_delta_d * math.sin(direction_fai)
prev_x = user_coord_0[0][-1]
prev_y = user_coord_0[1][-1]
current_x = prev_x + delta_x
current_y = prev_y + delta_y
user_coord_0[0].append(current_x)
user_coord_0[1].append(current_y)
plt.plot(user_coord_0[1],user_coord_0[0], c=color_list.pop(0))
plt.plot(user_coord_0[1][0], user_coord_0[0][0], marker="o", markersize=10, markeredgecolor="red", markerfacecolor="red")
plt.legend(ep_list)
plt.grid()
plt.xlim(0, 50)
plt.ylim(-10, 30)
plt.gca().invert_yaxis()
plt.savefig(self.store_path + 'plot/trajectory.png')
plt.cla()
def restruct(self):
savemat(self.store_path + 'all_steps.mat',self.all_steps)
return 0
if __name__ == '__main__':
LoadPlotObject = LoadAndPlot(
store_path = STORE_PATH,
)
LoadPlotObject.plot()
LoadPlotObject.restruct()
================================================
FILE: main_RIS.py
================================================
from env import *
episode_num= 100
step_num = 100
a= int(5/5 * episode_num * step_num)
print(a)
================================================
FILE: main_ref.py
================================================
from ddpg import Agent
from env1 import minimal_IRS_system # orginal
from env import MiniSystem
import numpy as np
#from utils import plotLearning
import matplotlib.pyplot as plt
import os
from data_manager import DataManager
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
IRS_system = minimal_IRS_system(K = 1)
#IRS_system = MiniSystem()
K = IRS_system.K
M = IRS_system.M
N = IRS_system.N
RL_state_dims = 2*K + 2*K**2 + 2*N + 2*M*K + 2*N*M + 2*K*N
RL_input_dims = RL_state_dims
RL_action_dims = 2 * (M * K) + N
steps_per_ep = 200
alpha_actor_learning_rate = 0.001
beta_critic_learning_rate = 0.001
agent = Agent(alpha=alpha_actor_learning_rate, beta=beta_critic_learning_rate, input_dims=[RL_input_dims], tau=0.001, env=IRS_system,
batch_size=64, layer1_size=400 * 2, layer2_size=300 * 2, n_actions=RL_action_dims)
scores = []
for i in range(1000):
observersion = IRS_system.reset()
done = False
done_sys = False
score = 0
cnt_in_one_epi = 0
best_bit_per_Hz = 0
draw_bit_rate_list = []
draw_tran_power_list = []
#draw_bit_rate_one_element = {'if_exceed_max_power':False,'bit_rate' : 0}
while not done:
cnt_in_one_epi += 1
if cnt_in_one_epi > 500:
done = True
action = agent.choose_action(observersion)
new_state, reward, done_sys , info = IRS_system.step(action)
bit_per_Hz = IRS_system.calculate_data_rate()
#draw_bit_rate_one_element['bit_rate'] = bit_per_Hz
#draw_bit_rate_one_element['if_exceed_max_power']=done_sys
draw_bit_rate_list.append(bit_per_Hz)
total_power = IRS_system.calculate_total_transmit_power()
draw_tran_power_list.append(total_power)
if done_sys == False:# if not exceed max transmit power
if bit_per_Hz > best_bit_per_Hz:
best_bit_per_Hz = bit_per_Hz
agent.remember(observersion, action, reward, new_state, int(done))
agent.learn()
score += reward
observersion = new_state
IRS_system.render()
plt.cla()
plt.plot(range(len(draw_bit_rate_list)), draw_bit_rate_list, color = 'green')
plt.plot(range(len(draw_tran_power_list)), draw_tran_power_list, color = 'red')
# plt.show()
filename_i =os.path.abspath(os.curdir) + '\\main_foder\\image_result\\' + str(i) + '.png'
plt.savefig(filename_i)
scores.append(score)
#if i % 25 == 0:
#agent.save_models()
print('episode ', i, 'score %.2f' % score, 'best sum rate %.3f bit/s/Hz' % best_bit_per_Hz,
'trailing 100 games avg %.4f' % np.mean(scores[-100:]))
filename = 'C:\\demo\\IRS_DDPG_minimal\\main_foder\\LunarLander-alpha000025-beta00025-400-300.png'
# plotLearning(scores, filename, window=100)
================================================
FILE: main_train.py
================================================
# debug field
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
import argparse
# get argument from user
parser = argparse.ArgumentParser()
parser.add_argument('--drl', type = str, required = True, default='td3', help="which drl algo would you like to choose ['ddpg', 'td3']")
parser.add_argument('--reward', type = str, required = True, default='see', help="which reward would you like to implement ['ssr', 'see']")
parser.add_argument('--seeds', type = int, required = False, default=None, nargs='+', help="what seed(s) would you like to use for DRL 1 and 2, please provide in one or two int")
parser.add_argument('--ep-num', type = int, required = False, default=300, help="how many episodes do you want to train your DRL")
parser.add_argument('--trained-uav', default=False, action='store_true', help='use trained uav instead of retraining')
args = parser.parse_args()
DRL_ALGO = args.drl
REWARD_DESIGN = args.reward
SEEDS = args.seeds
EPISODE_NUM = args.ep_num
TRAINED_UAV = args.trained_uav
# process the argument
assert DRL_ALGO in ['ddpg', 'td3'], "drl must be ['ddpg', 'td3']"
assert REWARD_DESIGN in ['ssr', 'see'], "reward must be ['ssr', 'see']"
if SEEDS is not None:
assert len(SEEDS) in [1, 2] and isinstance(SEEDS[0], int) and isinstance(SEEDS[-1], int), "seeds must be a list of 1 or 2 integer"
if DRL_ALGO == 'td3':
from td3 import Agent
elif DRL_ALGO == 'ddpg':
from ddpg import Agent
import ddpg
from env import MiniSystem
import numpy as np
import math
import time
import torch
# 1 init system model
episode_num = EPISODE_NUM # recommend to be 300
episode_cnt = 0
step_num = 100
project_name = f'trained_uav/{DRL_ALGO}_{REWARD_DESIGN}' if TRAINED_UAV else f'scratch/{DRL_ALGO}_{REWARD_DESIGN}'
system = MiniSystem(
user_num=2,
RIS_ant_num=4,
UAV_ant_num=4,
if_dir_link=1,
if_with_RIS=True,
if_move_users=True,
if_movements=True,
reverse_x_y=(False, False),
if_UAV_pos_state = True,
reward_design = REWARD_DESIGN,
project_name = project_name,
step_num = step_num
)
if_Theta_fixed = False
if_G_fixed = False
if_BS = False
if_robust = True
# 2 init RL Agent
agent_1_param_dic = {}
agent_1_param_dic["alpha"] = 0.0001
agent_1_param_dic["beta"] = 0.001
agent_1_param_dic["input_dims"] = system.get_system_state_dim()
agent_1_param_dic["tau"] = 0.001
agent_1_param_dic["batch_size"] = 64
agent_1_param_dic["n_actions"] = system.get_system_action_dim() - 2
agent_1_param_dic["action_noise_factor"] = 0.1
agent_1_param_dic["memory_max_size"] = int(5/5 * episode_num * step_num) #/2
agent_1_param_dic["agent_name"] = "G_and_Phi"
agent_1_param_dic["layer1_size"] = 800
agent_1_param_dic["layer2_size"] = 600
agent_1_param_dic["layer3_size"] = 512
agent_1_param_dic["layer4_size"] = 256
agent_2_param_dic = {}
agent_2_param_dic["alpha"] = 0.0001
agent_2_param_dic["beta"] = 0.001
agent_2_param_dic["input_dims"] = 3
agent_2_param_dic["tau"] = 0.001
agent_2_param_dic["batch_size"] = 64
agent_2_param_dic["n_actions"] = 2
agent_2_param_dic["action_noise_factor"] = 0.5
agent_2_param_dic["memory_max_size"] = int(5/5 * episode_num * step_num) #/2
agent_2_param_dic["agent_name"] = "UAV"
agent_2_param_dic["layer1_size"] = 400
agent_2_param_dic["layer2_size"] = 300
agent_2_param_dic["layer3_size"] = 256
agent_2_param_dic["layer4_size"] = 128
if SEEDS is not None:
torch.manual_seed(SEEDS[0]) # 1
torch.cuda.manual_seed_all(SEEDS[0]) # 1
agent_1 = Agent(
alpha = agent_1_param_dic["alpha"],
beta = agent_1_param_dic["beta"],
input_dims = [agent_1_param_dic["input_dims"]],
tau = agent_1_param_dic["tau"],
env = system,
batch_size = agent_1_param_dic["batch_size"],
layer1_size=agent_1_param_dic["layer1_size"],
layer2_size=agent_1_param_dic["layer2_size"],
layer3_size=agent_1_param_dic["layer3_size"],
layer4_size=agent_1_param_dic["layer4_size"],
n_actions = agent_1_param_dic["n_actions"],
max_size = agent_1_param_dic["memory_max_size"],
agent_name= agent_1_param_dic["agent_name"]
)
if SEEDS is not None:
torch.manual_seed(SEEDS[-1]) # 2
torch.cuda.manual_seed_all(SEEDS[-1]) # 2
agent_2 = Agent(
alpha = agent_2_param_dic["alpha"],
beta = agent_2_param_dic["beta"],
input_dims = [agent_2_param_dic["input_dims"]],
tau = agent_2_param_dic["tau"],
env = system,
batch_size = agent_2_param_dic["batch_size"],
layer1_size=agent_2_param_dic["layer1_size"],
layer2_size=agent_2_param_dic["layer2_size"],
layer3_size=agent_2_param_dic["layer3_size"],
layer4_size=agent_2_param_dic["layer4_size"],
n_actions = agent_2_param_dic["n_actions"],
max_size = agent_2_param_dic["memory_max_size"],
agent_name= agent_2_param_dic["agent_name"]
)
if TRAINED_UAV:
benchmark = f'data/storage/benchmark/{DRL_ALGO}_{REWARD_DESIGN}_benchmark'
if DRL_ALGO == 'td3':
agent_2.load_models(
load_file_actor = benchmark + '/Actor_UAV_TD3',
load_file_critic_1 = benchmark + '/Critic_1_UAV_TD3',
load_file_critic_2 = benchmark + '/Critic_2_UAV_TD3'
)
elif DRL_ALGO == 'ddpg':
agent_2.load_models(
load_file_actor = benchmark + '/Actor_UAV_ddpg',
load_file_critic = benchmark + '/Critic_UAV_ddpg'
)
meta_dic = {}
print("***********************system information******************************")
print("folder_name: "+str(system.data_manager.store_path))
meta_dic['folder_name'] = system.data_manager.store_path
print("user_num: "+str(system.user_num))
meta_dic['user_num'] = system.user_num
print("if_dir: "+str(system.if_dir_link))
meta_dic['if_dir_link'] = system.if_dir_link
print("if_with_RIS: "+str(system.if_with_RIS))
meta_dic['if_with_RIS'] = system.if_with_RIS
print("if_user_m: "+str(system.if_move_users))
meta_dic['if_move_users'] = system.if_move_users
print("RIS_ant_num: "+str(system.RIS.ant_num))
meta_dic['system_RIS_ant_num'] = system.RIS.ant_num
print("UAV_ant_num: "+str(system.UAV.ant_num))
meta_dic['system_UAV_ant_num'] = system.UAV.ant_num
print("if_movements: "+str(system.if_movements))
meta_dic['system_if_movements'] = system.if_movements
print("reverse_x_y: "+str(system.reverse_x_y))
meta_dic['system_reverse_x_y'] = system.reverse_x_y
print("if_UAV_pos_state:"+str(system.if_UAV_pos_state))
meta_dic['if_UAV_pos_state'] = system.if_UAV_pos_state
print("ep_num: "+str(episode_num))
meta_dic['episode_num'] = episode_num
print("step_num: "+str(step_num))
meta_dic['step_num'] = step_num
print("***********************agent_1 information******************************")
tplt = "{0:{2}^20}\t{1:{2}^20}"
for i in agent_1_param_dic:
parm = agent_1_param_dic[i]
print(tplt.format(i, parm, chr(12288)))
meta_dic["agent_1"] = agent_1_param_dic
print("***********************agent_2 information******************************")
for i in agent_2_param_dic:
parm = agent_2_param_dic[i]
print(tplt.format(i, parm, chr(12288)))
meta_dic["agent_2"] = agent_2_param_dic
system.data_manager.save_meta_data(meta_dic)
print("***********************traning information******************************")
while episode_cnt < episode_num:
# 1 reset the whole system
system.reset()
step_cnt = 0
score_per_ep = 0
# 2 get the initial state
if if_robust:
tmp = system.observe()
#z = np.random.multivariate_normal(np.zeros(2), 0.5*np.eye(2), size=len(tmp)).view(np.complex128)
z = np.random.normal(size=len(tmp))
observersion_1 = list(
np.array(tmp) + 0.6 *1e-7* z
)
else:
observersion_1 = system.observe()
observersion_2 = list(system.UAV.coordinate)
if episode_cnt == 80:
print("break point")
while step_cnt < step_num:
# 1 count num of step in one episode
step_cnt += 1
# judge if pause the whole system
if not system.render_obj.pause:
# 2 choose action acoording to current state
action_1 = agent_1.choose_action(observersion_1, greedy=agent_1_param_dic["action_noise_factor"] * math.pow((1-episode_cnt / episode_num), 2))
action_2 = agent_2.choose_action(observersion_2, greedy=agent_2_param_dic["action_noise_factor"]* math.pow((1-episode_cnt / episode_num), 2))
if if_BS:
action_2[0]=0
action_2[1]=0
if if_Theta_fixed:
action_1[0+2 * system.UAV.ant_num * system.user_num:] = len(action_1[0+2 * system.UAV.ant_num * system.user_num:])*[0]
if if_G_fixed:
action_1[0:0+2 * system.UAV.ant_num * system.user_num]=np.array([-0.0313, -0.9838, 0.3210, 1.0, -0.9786, -0.1448, 0.3518, 0.5813, -1.0, -0.2803, -0.4616, -0.6352, -0.1449, 0.7040, 0.4090, -0.8521]) * math.pow(episode_cnt / episode_num, 2) * 0.7
#action_1[0:0+2 * system.UAV.ant_num * system.user_num]=len(action_1[0:0+2 * system.UAV.ant_num * system.user_num])*[0.5]
# 3 get newstate, reward
if system.if_with_RIS:
new_state_1, reward, done, info = system.step(
action_0=action_2[0],
action_1=action_2[1],
G=action_1[0:0+2 * system.UAV.ant_num * system.user_num],
Phi=action_1[0+2 * system.UAV.ant_num * system.user_num:],
set_pos_x=action_2[0],
set_pos_y=action_2[1]
)
new_state_2 = list(system.UAV.coordinate)
else:
new_state_1, reward, done, info = system.step(
action_0=action_2[0],
action_1=action_2[1],
G=action_1[0:0+2 * system.UAV.ant_num * system.user_num],
set_pos_x=action_2[0],
set_pos_y=action_2[1]
)
new_state_2 = list(system.UAV.coordinate)
score_per_ep += reward
# 4 store state pair into mem pool
agent_1.remember(observersion_1, action_1, reward, new_state_1, int(done))
agent_2.remember(observersion_2, action_2, reward, new_state_2, int(done))
# 5 update DDPG net
agent_1.learn()
if not TRAINED_UAV:
agent_2.learn()
#system.render_obj.render(0.001) # no rendering for faster
observersion_1 = new_state_1
observersion_2 = new_state_2
if done == True:
break
else:
#system.render_obj.render_pause() # no rendering for faster
time.sleep(0.001) #time.sleep(1)
system.data_manager.save_file(episode_cnt=episode_cnt)
system.reset()
print("ep_num: "+str(episode_cnt)+" ep_score: "+str(score_per_ep))
episode_cnt +=1
if episode_cnt % 10 == 0:
agent_1.save_models()
agent_2.save_models()
# save the last model
agent_1.save_models()
agent_2.save_models()
================================================
FILE: math_tool.py
================================================
import math
import cmath
import numpy as np
import pandas as pd
from numpy import *
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.patches import FancyArrowPatch
from mpl_toolkits.mplot3d import proj3d
def cartesian_coordinate_to_spherical_coordinate(cartesian_coordinate):
"""
transmit cartesian_coordinate_to_spherical_coordinate
input 1 X 3 np.array, [x, y, z]
output 1 X 3 np.array, [r, theta, fai]
"""
r = np.linalg.norm(cartesian_coordinate)
if abs(cartesian_coordinate[2]) < 1e-8:
theta = math.atan(np.linalg.norm(cartesian_coordinate[0:2])/1e-8)
else:
theta = math.atan(np.linalg.norm(cartesian_coordinate[0:2])/cartesian_coordinate[2])
if abs(cartesian_coordinate[0]) < 1e-8:
x = 1e-8
else:
x = cartesian_coordinate[0]
y = cartesian_coordinate[1]
if abs(y) < 1e-8:
y = 1e-8
if y > 0 and x > 0:
fai = math.atan(y/x)
elif x < 0 and y > 0:
fai = math.atan(y/x) + math.pi
elif x < 0 and y < 0:
fai = math.atan(y/x) - math.pi
else:
fai = math.atan(y/x)
return r, theta, fai
def vecter_normalization(cartesian_coordinate):
return cartesian_coordinate/np.linalg.norm(cartesian_coordinate)
def get_coor_ref(coor_sys, coor):
"""
input: coor_sys: normalized 1,3 np.array list (1,3)
coor: coordinate under earth system
output: referenced coordinate for x,y, normalized 1,3 np.array
"""
x_ref = np.dot(coor_sys[0],coor)
y_ref = np.dot(coor_sys[1],coor)
z_ref = np.dot(coor_sys[2],coor)
return np.array([x_ref, y_ref, z_ref])
def dB_to_normal(dB):
"""
input: dB
output: normal vaule
"""
return math.pow(10, (dB/10))
def normal_to_dB(normal):
"""
input: normal
output: dB value
"""
return -10 * math.log10(normal)
def diag_to_vector(diag):
"""
transfer a diagnal matrix into a vector
"""
vec_size = np.shape(diag)[0]
vector = np.mat(np.zeros((vec_size, 1), dtype=complex), dtype=complex)
for i in range(vec_size):
vector[i, 0] = diag[i, i]
return vector
def vector_to_diag(vector):
"""
transfer a vector into a diagnal matrix
"""
vec_size = np.shape(vector)[0]
diag = np.mat(np.zeros((vec_size, vec_size), dtype=complex), dtype=complex)
for i in range(vec_size):
diag[i, i] = vector[i, 0]
return diag
def bigger_than_zero(value):
"""
max(0,value)
"""
return max(0, value)
def dataframe_to_dictionary(df):
"""
docstring
"""
return {col_name : df[col_name].values for col_name in df.columns.values}
def convert_list_to_complex_matrix(list_real, shape):
"""
list_real is a 2* N*K dim list, convert it to N X K complex matrix
shape is a tuple (N, K)
"""
N = shape[0]
K = shape[1]
matrix_complex =np.mat(np.zeros((N, K), dtype=complex), dtype=complex)
for i in range(N):
for j in range(K):
matrix_complex[i, j] = list_real[2*(i*K + j)] + 1j * list_real[2*(i*K + j) + 1]
return matrix_complex
def convert_list_to_complex_diag(list_real, diag_row_num):
"""
list_real is a M dim list, convert it to M X M complex diag matrix
diag_row_num is the M
"""
M = diag_row_num
diag_matrix_complex = np.mat(np.zeros((M, M), dtype=complex), dtype=complex)
for i in range(M):
diag_matrix_complex[i, i] = cmath.exp(1j * list_real[i] * math.pi)
return diag_matrix_complex
def map_to(x, x_range:tuple, y_range:tuple):
x_min = x_range[0]
x_max = x_range[1]
y_min = y_range[0]
y_max = y_range[1]
y = y_min+(y_max - y_min) / (x_max - x_min) * (x - x_min)
return y
================================================
FILE: plot_see.py
================================================
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300
import matplotlib.pyplot as plt
import numpy as np
import cmath
from scipy.io import loadmat, savemat
import pandas as pd
import os
import copy
import math
######################################################
# new for energy
# energy related parameters of rotary-wing UAV
# based on Energy Minimization in Internet-of-Things System Based on Rotary-Wing UAV
P_i = 790.6715
P_0 = 580.65
U2_tip = (200) ** 2
s = 0.05
d_0 = 0.3
p = 1.225
A = 0.79
delta_time = 0.1 #0.1/1000 #0.1ms
# add ons hover veloctiy
# based on https://www.intechopen.com/chapters/57483
m = 1.3 # mass: assume 1.3kg https://www.droneblog.com/average-weights-of-common-types-of-drones/#:~:text=In%20most%20cases%2C%20toy%20drones,What%20is%20this%3F
g = 9.81 # gravity
T = m * g # thrust
v_0 = (T / (A * 2 * p)) ** 0.5
def get_energy_consumption(v_t):
'''
arg
1) v_t = displacement per time slot
'''
energy_1 = P_0 \
+ 3 * P_0 * (abs(v_t)) ** 2 / U2_tip \
+ 0.5 * d_0 * p * s * A * (abs(v_t))**3
energy_2 = P_i * ((
(1 + (abs(v_t) ** 4) / (4 * (v_0 ** 4))) ** 0.5 \
- (abs(v_t) ** 2) / (2 * (v_0 **2)) \
) ** 0.5)
energy = delta_time * (energy_1 + energy_2)
return energy
ENERGY_MIN = get_energy_consumption(0.25)
ENERGY_MAX = get_energy_consumption(0)
######################################################
# modified from data_manager.py
init_data_file = 'data/init_location.xlsx'
def read_init_location(entity_type = 'user', index = 0):
if entity_type == 'user' or 'attacker' or 'RIS' or 'RIS_norm_vec' or 'UAV':
return np.array([\
pd.read_excel(init_data_file, sheet_name=entity_type)['x'][index],\
pd.read_excel(init_data_file, sheet_name=entity_type)['y'][index],\
pd.read_excel(init_data_file, sheet_name=entity_type)['z'][index]])
else:
return None
# load and plot everything
class LoadAndPlot(object):
"""
load date and plot 2022-07-22 16_16_26
"""
def __init__(self, store_paths, \
user_num = 2, attacker_num = 1, RIS_ant_num = 4, \
ep_num = 300, step_num = 100): # RIS_ant_num = 16 (not true)
self.store_paths = store_paths
self.color_list = ['b', 'c', 'g', 'k', 'm', 'r', 'y']
# self.store_paths = store_paths + '//'
self.user_num = user_num
self.attacker_num = attacker_num
self.RIS_ant_num = RIS_ant_num
self.ep_num = ep_num
self.step_num = step_num
def load_one_ep(self, file_name):
m = loadmat(self.store_path + file_name)
return m
def load_all_steps(self):
result_dic = {}
result_dic.update({'reward':[]})
result_dic.update({'user_capacity':[]})
for i in range(self.user_num):
result_dic['user_capacity'].append([])
result_dic.update({'secure_capacity':[]})
for i in range(self.user_num):
result_dic['secure_capacity'].append([])
result_dic.update({'attaker_capacity':[]})
for i in range(self.attacker_num):
result_dic['attaker_capacity'].append([])
result_dic.update({'RIS_elements':[]})
for i in range(self.RIS_ant_num):
result_dic['RIS_elements'].append([])
for ep_cnt in range(self.ep_num):
mat_ep = self.load_one_ep("simulation_result_ep_" + str(ep_cnt) + ".mat")
one_ep_reward = mat_ep["result_" + str(ep_cnt)]["reward"][0][0]
result_dic['reward'] += list(one_ep_reward[:, 0])
one_ep_user_capacity = mat_ep["result_" + str(ep_cnt)]["user_capacity"][0][0]
for i in range(self.user_num):
result_dic['user_capacity'][i] += list(one_ep_user_capacity[:, i])
one_ep_secure_capacity = mat_ep["result_" + str(ep_cnt)]["secure_capacity"][0][0]
for i in range(self.user_num):
result_dic['secure_capacity'][i] += list(one_ep_secure_capacity[:, i])
one_ep_attaker_capacity = mat_ep["result_" + str(ep_cnt)]["attaker_capacity"][0][0]
for i in range(self.attacker_num):
result_dic['attaker_capacity'][i] += list(one_ep_attaker_capacity[:, i])
one_ep_RIS_first_element = mat_ep["result_" + str(ep_cnt)]["reflecting_coefficient"][0][0]
for i in range(self.RIS_ant_num):
result_dic['RIS_elements'][i] += list(one_ep_RIS_first_element[:, i])
return result_dic
def plot(self):
"""
plot result
b--blue c--cyan(青色) g--green k--black m--magenta(紫红色) r--red w--white y--yellow
"""
###############################
# plot average secrecy energy efficient of each episode
###############################
fig = plt.figure('average_secrecy_energy_efficiency')
# get init location
init_uav_coord = read_init_location(entity_type = 'UAV')
init_user_coord_0 = read_init_location(entity_type = 'user', index=0)
init_user_coord_1 = read_init_location(entity_type = 'user', index=1)
# paths
legends = ['TDDRL', 'TTD3', 'TDDRL (Energy Penalty)', 'TTD3 (Energy Penalty)']
all_average_see = []
all_energies = []
# energies
for store_path in self.store_paths:
energies = []
for i in range(self.ep_num):
# read the mat file
filename = f'simulation_result_ep_{i}.mat'
filename = os.path.join(store_path, filename)
data = loadmat(filename)
# v_ts
energies_one_episode = []
# loop all uav movt
uav_movt = data[f'result_{i}'][0][0][-1]
for j in range(uav_movt.shape[0]):
move_x = uav_movt[j][0]
move_y = uav_movt[j][1]
v_t = (move_x ** 2 + move_y ** 2) ** 0.5
energy = get_energy_consumption(v_t / delta_time)
energies_one_episode.append(energy)
energies.append(energies_one_episode)
all_energies.append(energies)
# see
for store_path, legend in zip(self.store_paths, legends):
average_see = []
# ssr
self.store_path = store_path + '//'
self.all_steps = self.load_all_steps()
sum_secrecy_rate = np.array(self.all_steps['secure_capacity'])
sum_secrecy_rate = np.sum(sum_secrecy_rate, axis = 0)
# energy
energies = all_energies.pop(0)
for i in range(0, self.ep_num * self.step_num, self.step_num):
ssr_one_episode = sum_secrecy_rate[i:i+self.step_num] # ssr means Sum Secrecy Rate
energies_one_episode = energies.pop(0)
ssr_one_episode = ssr_one_episode[:len(energies_one_episode)]
energies_one_episode = energies_one_episode[:len(ssr_one_episode)]
try:
see = np.array(ssr_one_episode) / np.array(energies_one_episode)
average_see.append(sum(see)/len(see))
except:
average_see.append(0)
# change from /J to /kJ
average_see = np.array(average_see) * 1000
average_see = list(average_see)
all_average_see.append(average_see)
plt.plot(range(len(average_see)), average_see, label=legend)
plt.xlabel("Episodes (Ep)")
plt.ylabel("Average Secrecy Energy Efficiency")
plt.legend()
plt.savefig('data/average_secrecy_energy_efficiency.png')
# dictionary of lists
dict = {legend: average_see for legend, average_see in zip(legends, all_average_see)}
df = pd.DataFrame(dict)
df.to_excel('data/average_secrecy_energy_efficiency.xlsx', index=False)
if __name__ == '__main__':
LoadPlotObject = LoadAndPlot(
store_paths = ['data/storage/scratch/ddpg_ssr', 'data/storage/scratch/td3_ssr', 'data/storage/scratch/ddpg_see', 'data/storage/scratch/td3_see'],
ep_num = 300,
)
LoadPlotObject.plot()
================================================
FILE: plot_ssr.py
================================================
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300
import matplotlib.pyplot as plt
import numpy as np
import cmath
from scipy.io import loadmat, savemat
import pandas as pd
import os
import copy
import math
######################################################
# new for energy
# energy related parameters of rotary-wing UAV
# based on Energy Minimization in Internet-of-Things System Based on Rotary-Wing UAV
P_i = 790.6715
P_0 = 580.65
U2_tip = (200) ** 2
s = 0.05
d_0 = 0.3
p = 1.225
A = 0.79
delta_time = 0.1 #0.1/1000 #0.1ms
# add ons hover veloctiy
# based on https://www.intechopen.com/chapters/57483
m = 1.3 # mass: assume 1.3kg https://www.droneblog.com/average-weights-of-common-types-of-drones/#:~:text=In%20most%20cases%2C%20toy%20drones,What%20is%20this%3F
g = 9.81 # gravity
T = m * g # thrust
v_0 = (T / (A * 2 * p)) ** 0.5
def get_energy_consumption(v_t):
'''
arg
1) v_t = displacement per time slot
'''
energy_1 = P_0 \
+ 3 * P_0 * (abs(v_t)) ** 2 / U2_tip \
+ 0.5 * d_0 * p * s * A * (abs(v_t))**3
energy_2 = P_i * ((
(1 + (abs(v_t) ** 4) / (4 * (v_0 ** 4))) ** 0.5 \
- (abs(v_t) ** 2) / (2 * (v_0 **2)) \
) ** 0.5)
energy = delta_time * (energy_1 + energy_2)
return energy
ENERGY_MIN = get_energy_consumption(0.25)
ENERGY_MAX = get_energy_consumption(0)
######################################################
# modified from data_manager.py
init_data_file = 'data/init_location.xlsx'
def read_init_location(entity_type = 'user', index = 0):
if entity_type == 'user' or 'attacker' or 'RIS' or 'RIS_norm_vec' or 'UAV':
return np.array([\
pd.read_excel(init_data_file, sheet_name=entity_type)['x'][index],\
pd.read_excel(init_data_file, sheet_name=entity_type)['y'][index],\
pd.read_excel(init_data_file, sheet_name=entity_type)['z'][index]])
else:
return None
# load and plot everything
class LoadAndPlot(object):
"""
load date and plot 2022-07-22 16_16_26
"""
def __init__(self, store_paths, \
user_num = 2, attacker_num = 1, RIS_ant_num = 4, \
ep_num = 300, step_num = 100): # RIS_ant_num = 16 (not true)
self.store_paths = store_paths
self.color_list = ['b', 'c', 'g', 'k', 'm', 'r', 'y']
# self.store_path = store_path + '//'
self.user_num = user_num
self.attacker_num = attacker_num
self.RIS_ant_num = RIS_ant_num
self.ep_num = ep_num
self.step_num = step_num
def load_one_ep(self, file_name):
m = loadmat(self.store_path + file_name)
return m
def load_all_steps(self):
result_dic = {}
result_dic.update({'reward':[]})
result_dic.update({'user_capacity':[]})
for i in range(self.user_num):
result_dic['user_capacity'].append([])
result_dic.update({'secure_capacity':[]})
for i in range(self.user_num):
result_dic['secure_capacity'].append([])
result_dic.update({'attaker_capacity':[]})
for i in range(self.attacker_num):
result_dic['attaker_capacity'].append([])
result_dic.update({'RIS_elements':[]})
for i in range(self.RIS_ant_num):
result_dic['RIS_elements'].append([])
for ep_cnt in range(self.ep_num):
mat_ep = self.load_one_ep("simulation_result_ep_" + str(ep_cnt) + ".mat")
one_ep_reward = mat_ep["result_" + str(ep_cnt)]["reward"][0][0]
result_dic['reward'] += list(one_ep_reward[:, 0])
one_ep_user_capacity = mat_ep["result_" + str(ep_cnt)]["user_capacity"][0][0]
for i in range(self.user_num):
result_dic['user_capacity'][i] += list(one_ep_user_capacity[:, i])
one_ep_secure_capacity = mat_ep["result_" + str(ep_cnt)]["secure_capacity"][0][0]
for i in range(self.user_num):
result_dic['secure_capacity'][i] += list(one_ep_secure_capacity[:, i])
one_ep_attaker_capacity = mat_ep["result_" + str(ep_cnt)]["attaker_capacity"][0][0]
for i in range(self.attacker_num):
result_dic['attaker_capacity'][i] += list(one_ep_attaker_capacity[:, i])
one_ep_RIS_first_element = mat_ep["result_" + str(ep_cnt)]["reflecting_coefficient"][0][0]
for i in range(self.RIS_ant_num):
result_dic['RIS_elements'][i] += list(one_ep_RIS_first_element[:, i])
return result_dic
def plot(self):
"""
plot result
b--blue c--cyan(青色) g--green k--black m--magenta(紫红色) r--red w--white y--yellow
"""
###############################
# plot average sum secrecy rate of each episode
###############################
fig = plt.figure('average_sum_secrecy_rate')
# store_paths = ['data/storage/ddpg 4', 'data/storage/td3 3', 'data/storage/ddpg seem 6', 'data/storage/td3 seem 2']
# store_paths = r'data/storage/scratch/td3_ssr'
legends = ['TDDRL', 'TTD3', 'TDDRL (Energy Penalty)', 'TTD3 (Energy Penalty)']
all_average_sum_secrecy_rate = []
for store_path, legend in zip(self.store_paths, legends):
self.store_path = store_path + '//'
self.all_steps = self.load_all_steps()
sum_secrecy_rate = np.array(self.all_steps['secure_capacity'])
sum_secrecy_rate = np.sum(sum_secrecy_rate, axis = 0)
average_sum_secrecy_rate = []
ssr = []
for i in range(0, self.ep_num * self.step_num, self.step_num):
ssr_one_episode = sum_secrecy_rate[i:i+self.step_num] # ssr means Sum Secrecy Rate
ssr.append(ssr_one_episode)
try:
_ = sum(ssr_one_episode) / len(ssr_one_episode)
except:
_ = 0
average_sum_secrecy_rate.append(_)
all_average_sum_secrecy_rate.append(average_sum_secrecy_rate)
plt.plot(range(len(average_sum_secrecy_rate)), average_sum_secrecy_rate, label=legend)
plt.xlabel("Episodes (Ep)")
plt.ylabel("Average Sum Secrecy Rate")
plt.legend()
plt.savefig('data/average_sum_secrecy_rate.png')
# transpose
'''
all_average_sum_secrecy_rate = np.array(all_average_sum_secrecy_rate)
all_average_sum_secrecy_rate = np.transpose(all_average_sum_secrecy_rate)
all_average_sum_secrecy_rate = list(all_average_sum_secrecy_rate)
'''
# dictionary of lists
dict = {legend: average_sum_secrecy_rate for legend, average_sum_secrecy_rate in zip(legends, all_average_sum_secrecy_rate)}
df = pd.DataFrame(dict)
df.to_excel('data/average_sum_secrecy_rate.xlsx', index=False)
if __name__ == '__main__':
LoadPlotObject = LoadAndPlot(
store_paths = ['data/storage/scratch/ddpg_ssr', 'data/storage/scratch/td3_ssr', 'data/storage/scratch/ddpg_see', 'data/storage/scratch/td3_see'],
ep_num = 300,
)
LoadPlotObject.plot()
================================================
FILE: plot_traj.py
================================================
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 100
import matplotlib.pyplot as plt
import numpy as np
import cmath
from scipy.io import loadmat, savemat
import pandas as pd
import os
import copy
import math
######################################################
# new for energy
# energy related parameters of rotary-wing UAV
# based on Energy Minimization in Internet-of-Things System Based on Rotary-Wing UAV
P_i = 790.6715
P_0 = 580.65
U2_tip = (200) ** 2
s = 0.05
d_0 = 0.3
p = 1.225
A = 0.79
delta_time = 0.1 #0.1/1000 #0.1ms
# add ons hover veloctiy
# based on https://www.intechopen.com/chapters/57483
m = 1.3 # mass: assume 1.3kg https://www.droneblog.com/average-weights-of-common-types-of-drones/#:~:text=In%20most%20cases%2C%20toy%20drones,What%20is%20this%3F
g = 9.81 # gravity
T = m * g # thrust
v_0 = (T / (A * 2 * p)) ** 0.5
def get_energy_consumption(v_t):
'''
arg
1) v_t = displacement per time slot
'''
energy_1 = P_0 \
+ 3 * P_0 * (abs(v_t)) ** 2 / U2_tip \
+ 0.5 * d_0 * p * s * A * (abs(v_t))**3
energy_2 = P_i * ((
(1 + (abs(v_t) ** 4) / (4 * (v_0 ** 4))) ** 0.5 \
- (abs(v_t) ** 2) / (2 * (v_0 **2)) \
) ** 0.5)
energy = delta_time * (energy_1 + energy_2)
return energy
ENERGY_MIN = get_energy_consumption(0.25)
ENERGY_MAX = get_energy_consumption(0)
######################################################
# modified from data_manager.py
init_data_file = 'data/init_location.xlsx'
def read_init_location(entity_type = 'user', index = 0):
if entity_type == 'user' or 'attacker' or 'RIS' or 'RIS_norm_vec' or 'UAV':
return np.array([\
pd.read_excel(init_data_file, sheet_name=entity_type)['x'][index],\
pd.read_excel(init_data_file, sheet_name=entity_type)['y'][index],\
pd.read_excel(init_data_file, sheet_name=entity_type)['z'][index]])
else:
return None
# load and plot everything
class LoadAndPlot(object):
"""
load date and plot 2022-07-22 16_16_26
"""
def __init__(self, store_paths, \
user_num = 2, attacker_num = 1, RIS_ant_num = 4, \
ep_num = 300, step_num = 100): # RIS_ant_num = 16 (not true)
self.store_paths = store_paths
self.color_list = ['b', 'c', 'g', 'k', 'm', 'r', 'y']
# self.store_path = store_path + '//'
self.user_num = user_num
self.attacker_num = attacker_num
self.RIS_ant_num = RIS_ant_num
self.ep_num = ep_num
self.step_num = step_num
def load_one_ep(self, file_name):
m = loadmat(self.store_path + file_name)
return m
def load_all_steps(self):
result_dic = {}
result_dic.update({'reward':[]})
result_dic.update({'user_capacity':[]})
for i in range(self.user_num):
result_dic['user_capacity'].append([])
result_dic.update({'secure_capacity':[]})
for i in range(self.user_num):
result_dic['secure_capacity'].append([])
result_dic.update({'attaker_capacity':[]})
for i in range(self.attacker_num):
result_dic['attaker_capacity'].append([])
result_dic.update({'RIS_elements':[]})
for i in range(self.RIS_ant_num):
result_dic['RIS_elements'].append([])
for ep_cnt in range(self.ep_num):
mat_ep = self.load_one_ep("simulation_result_ep_" + str(ep_cnt) + ".mat")
one_ep_reward = mat_ep["result_" + str(ep_cnt)]["reward"][0][0]
result_dic['reward'] += list(one_ep_reward[:, 0])
one_ep_user_capacity = mat_ep["result_" + str(ep_cnt)]["user_capacity"][0][0]
for i in range(self.user_num):
result_dic['user_capacity'][i] += list(one_ep_user_capacity[:, i])
one_ep_secure_capacity = mat_ep["result_" + str(ep_cnt)]["secure_capacity"][0][0]
for i in range(self.user_num):
result_dic['secure_capacity'][i] += list(one_ep_secure_capacity[:, i])
one_ep_attaker_capacity = mat_ep["result_" + str(ep_cnt)]["attaker_capacity"][0][0]
for i in range(self.attacker_num):
result_dic['attaker_capacity'][i] += list(one_ep_attaker_capacity[:, i])
one_ep_RIS_first_element = mat_ep["result_" + str(ep_cnt)]["reflecting_coefficient"][0][0]
for i in range(self.RIS_ant_num):
result_dic['RIS_elements'][i] += list(one_ep_RIS_first_element[:, i])
return result_dic
def plot(self):
"""
plot result
b--blue c--cyan(青色) g--green k--black m--magenta(紫红色) r--red w--white y--yellow
"""
###############################
# plot trajectory
###############################
# create a fig
fig, ax = plt.subplots(figsize=(5.4,5.2))
#fig = plt.figure('trajectory')
MARKER_SIZE = 8
# colour
color_list_template = ['b', 'g', 'c', 'k', 'm', 'r', 'y', 'black', 'red']
color_list_template = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
color_list = copy.deepcopy(color_list_template)
# get init location
init_uav_coord = read_init_location(entity_type = 'UAV')
init_ris_coord = read_init_location(entity_type = 'RIS')
init_eaves_coord = read_init_location(entity_type = 'attacker')
init_user_coord_0 = read_init_location(entity_type = 'user', index=0)
init_user_coord_1 = read_init_location(entity_type = 'user', index=1)
plt.text(20, init_uav_coord[0]-1, 'UAV Initial Coordinate', fontsize = 11)
plt.plot([init_uav_coord[1]], [init_uav_coord[0]], marker="s", markersize=MARKER_SIZE, markeredgecolor="black", markerfacecolor="none")
plt.text(46, init_ris_coord[0]-1, 'RIS', fontsize = 11)
plt.plot([init_ris_coord[1]], [init_ris_coord[0]], marker="d", markersize=MARKER_SIZE, markeredgecolor="black", markerfacecolor="none")
plt.text(36, init_eaves_coord[0]-1, 'Eavesdropper', fontsize = 11)
plt.plot([init_eaves_coord[1]], [init_eaves_coord[0]], marker="v", markersize=MARKER_SIZE, markeredgecolor="black", markerfacecolor="none")
# paths
# store_paths = ['data/storage/ddpg 2', 'data/storage/td3 3', 'data/storage/ddpg seem 3', 'data/storage/td3 seem 5']
# store_paths = ['data/storage/ddpg 2', 'data/storage/td3 3', 'data/storage/ddpg seem 3', 'data/storage/td3 seem 5']
legends = ['TDDRL', 'TTD3', 'TDDRL (Energy Penalty)', 'TTD3 (Energy Penalty)']
legends = ['Benchmark 1', 'Benchmark 2', 'Benchmark 3', 'Proposed method']
for store_path, legend in zip(self.store_paths, legends):
# read the mat file
i = 5 - 1 # episode 300
filename = f'simulation_result_ep_{i}.mat'
filename = os.path.join(store_path, filename)
data = loadmat(filename)
# uav movt
uav_coord = [ [init_uav_coord[0]], [init_uav_coord[1]] ]
uav_movt = data[f'result_{i}'][0][0][-1]
for j in range(uav_movt.shape[0]):
move_x = uav_movt[j][0]
move_y = uav_movt[j][1]
prev_x = uav_coord[0][-1]
prev_y = uav_coord[1][-1]
current_x = prev_x + move_x
current_y = prev_y + move_y
uav_coord[0].append(current_x)
uav_coord[1].append(current_y)
plt.plot(uav_coord[1],uav_coord[0], c=color_list.pop(0), label=legend)
# user 0 movt
direction_fai = -1/2*math.pi
distance_delta_d = 0.25
user_coord_0 = [ [init_user_coord_0[0]], [init_user_coord_0[1]] ]
plt.text(29, init_user_coord_0[0]-1, 'User 1 Initial Coordinate', fontsize = 11)
#color_list = copy.deepcopy(color_list_template)
for j in range(uav_movt.shape[0]):
delta_x = distance_delta_d * math.cos(direction_fai)
delta_y = distance_delta_d * math.sin(direction_fai)
prev_x = user_coord_0[0][-1]
prev_y = user_coord_0[1][-1]
current_x = prev_x + delta_x
current_y = prev_y + delta_y
user_coord_0[0].append(current_x)
user_coord_0[1].append(current_y)
plt.plot(user_coord_0[1],user_coord_0[0], c=color_list.pop(0), linestyle='dashed', linewidth=2, label='User 1')
plt.plot(user_coord_0[1][0], user_coord_0[0][0], marker="o", markersize=MARKER_SIZE, markeredgecolor="black", markerfacecolor="none")
# user 1 movt
direction_fai = -1/2*math.pi
distance_delta_d = 0.25
user_coord_1 = [ [init_user_coord_1[0]], [init_user_coord_1[1]] ]
plt.text(13, init_user_coord_1[0]-1, 'User 2 Initial Coordinate', fontsize = 11)
#color_list = copy.deepcopy(color_list_template)
for j in range(uav_movt.shape[0]):
delta_x = distance_delta_d * math.cos(direction_fai)
delta_y = distance_delta_d * math.sin(direction_fai)
prev_x = user_coord_1[0][-1]
prev_y = user_coord_1[1][-1]
current_x = prev_x + delta_x
current_y = prev_y + delta_y
user_coord_1[0].append(current_x)
user_coord_1[1].append(current_y)
plt.plot(user_coord_1[1],user_coord_1[0], c=color_list.pop(0), linestyle='dashed', linewidth=2, label='User 2')
plt.plot(user_coord_1[1][0], user_coord_1[0][0], marker="o", markersize=MARKER_SIZE, markeredgecolor="black", markerfacecolor="none")
# plot a line between last coord of user 0 and user 1
plt.plot([user_coord_0[1][-1], user_coord_1[1][0-1]], [user_coord_0[0][-1], user_coord_1[0][-1]], 'gray', linestyle='dashed')
# plot midpoint between last coord of user 0 and user 1
plt.plot([(user_coord_0[1][-1] + user_coord_1[1][0-1])/2], [(user_coord_0[0][-1] + user_coord_1[0][0-1])/2], marker="o", markersize=MARKER_SIZE, markeredgecolor="black", markerfacecolor="none")
plt.text(12, 18, "Midpoint of \ntwo user's last location", fontsize = 11)
plt.legend(loc='center right', fontsize=10)
plt.grid()
plt.xlim(0, 50)
plt.ylim(-10, 30)
plt.xlabel('x(m)')
plt.ylabel('y(m)')
plt.tight_layout()
plt.gca().invert_yaxis()
plt.savefig('data/trajectory.png')
#plt.cla()
if __name__ == '__main__':
LoadPlotObject = LoadAndPlot(
store_paths = ['data/storage/scratch/ddpg_ssr', 'data/storage/scratch/td3_ssr', 'data/storage/scratch/ddpg_see', 'data/storage/scratch/td3_see'],
ep_num=300,
)
LoadPlotObject.plot()
================================================
FILE: render.py
================================================
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.patches import FancyArrowPatch
from mpl_toolkits.mplot3d import proj3d
class Arrow3D(FancyArrowPatch):
def __init__(self, xs, ys, zs, *args, **kwargs):
FancyArrowPatch.__init__(self, (0,0), (0,0), *args, **kwargs)
self._verts3d = xs, ys, zs
def draw(self, renderer):
xs3d, ys3d, zs3d = self._verts3d
xs, ys, zs = proj3d.proj_transform(xs3d, ys3d, zs3d, renderer.M)
self.set_positions((xs[0],ys[0]),(xs[1],ys[1]))
FancyArrowPatch.draw(self, renderer)
class Render(object):
"""
plot functions to render the whole system
"""
def __init__(self, system, canv_x = (-25, 25), canv_y = (0, 50), canv_z = (0, 60)):
"""
docstring
"""
plt.ion()
self.system = system
self.fig = plt.figure(1)
self.pause = False
self.t_index = 0
def render_pause(self):
"""
show whole system by using plt.show
"""
plt.ion()
ax = self.plot_config()
# plot the position of UAV, RIS, Users & Attakers
self.plot_entities(ax)
self.plot_channels(ax)
self.plot_text(ax)
plt.show(self.fig)
plt.cla()
self.pause = False
plt.ioff()
def render(self, interval):
"""
show whole system in 3D figure
"""
plt.ion()
ax = self.plot_config()
# plot the position of UAV, RIS, Users & Attakers
self.plot_entities(ax)
self.plot_channels(ax)
self.plot_text(ax)
plt.pause(interval)
plt.cla()
plt.ioff()
def plot_click(self, event):
self.pause ^= True
def plot_config(self):
self.fig = plt.figure(1)
ax = plt.axes(projection='3d')
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
ax.set_xlim3d(-25, 25)
ax.set_ylim3d(0, 50)
ax.set_zlim3d(0, 60)
ax.view_init(90, 0)
self.fig.canvas.mpl_connect('key_press_event', self.plot_click)
return ax
def plot_entities(self, ax):
"""
function used in render to show the UAV, RIS, users and attackers
"""
ax.scatter(\
self.system.UAV.coordinate[0],\
self.system.UAV.coordinate[1],\
self.system.UAV.coordinate[2],\
color='r')
ax.text(self.system.UAV.coordinate[0],self.system.UAV.coordinate[1],self.system.UAV.coordinate[2], \
'UAV', size=15, zorder=1, color='r')
ax.scatter(\
self.system.RIS.coordinate[0],\
self.system.RIS.coordinate[1],\
self.system.RIS.coordinate[2],\
color='g')
ax.text(self.system.RIS.coordinate[0],self.system.RIS.coordinate[1],self.system.RIS.coordinate[2], \
'RIS', size=15, zorder=1, color='g')
for user in self.system.user_list:
ax.scatter(
user.coordinate[0],\
user.coordinate[1],\
user.coordinate[2],\
color='b'
)
text = 'user_'+str(user.index) + '\n'\
+ 'noise power(dB) = ' + str(user.noise_power) + '\n' \
+ 'capacity = ' + str(user.capacity) + '\n'\
+ 'secure_capacity = ' + str(user.secure_capacity)
ax.text(user.coordinate[0],user.coordinate[1],user.coordinate[2], \
text, size=10, zorder=1, color='b')
for attacker in self.system.attacker_list:
ax.scatter(
attacker.coordinate[0],\
attacker.coordinate[1],\
attacker.coordinate[2],\
color='y'
)
ax.text(attacker.coordinate[0],attacker.coordinate[1],attacker.coordinate[2], \
'attacker_'+str(attacker.index) + '\n'\
+'capacities:' + str(attacker.capacity), size=10, zorder=1, color='y')
def plot_channels(self, ax):
"""
function used in render to show the H_UR, h_U_k, h_R_k
"""
for channel in self.system.h_R_k:
self.plot_one_channel(ax, channel, "b")
for channel in self.system.h_R_p:
self.plot_one_channel(ax, channel, "y")
for channel in self.system.h_U_k:
self.plot_one_channel(ax, channel, "b")
for channel in self.system.h_U_p:
self.plot_one_channel(ax, channel, "y")
self.plot_one_channel(ax, self.system.H_UR, "r")
def plot_one_channel(self, ax, channel, color, text = "channel"):
"""
function used in plot channels to show only one channel
"""
arrow_side_coor = channel.receiver.coordinate
point_side_coor = channel.transmitter.coordinate
text = channel.channel_name + '\n' \
+ 'n=' + str(channel.n) \
+ ' sigma=' + str(channel.sigma) +'\n'\
+ 'PL=' + str(channel.path_loss_normal) + '\n'\
+ 'PL(dB)=' + str(channel.path_loss_dB)
x = (arrow_side_coor[0] + point_side_coor[0]) / 2
y = (arrow_side_coor[1] + point_side_coor[1]) / 2
z = (arrow_side_coor[2] + point_side_coor[2]) / 2
ax.text(x, y, z, text, size=10, zorder=1, color=color)
channel_arrow = Arrow3D(\
[point_side_coor[0], arrow_side_coor[0]], \
[point_side_coor[1], arrow_side_coor[1]], \
[point_side_coor[2], arrow_side_coor[2]],\
mutation_scale=20, lw = 3, arrowstyle="-|>", color=color
)
ax.add_artist(channel_arrow)
def plot_text(self, ax):
"""
used in render to polt texts
"""
text = "pause = " + str(self.pause) + "\n"\
+ "t_index = " + str(self.t_index)
ax.text(0, 0, 60, text, size=10, zorder=1, color='b')
================================================
FILE: requirements.txt
================================================
# Twin-TD3 requirements
# Usage: pip install -r requirements.txt
matplotlib==3.4.3
pyparsing==3.0.9
# sklearn==0.0.post1
joblib==1.2.0
pandas==1.4.3
cycler==0.11.0
seaborn==0.12.2
kiwisolver==1.4.3
torchvision==0.13.0
torch>=1.7.0
scikit-learn==1.2.2
python-dateutil==2.8.2
pytz==2022.1
scipy==1.8.1
openpyxl==3.0.9
================================================
FILE: run_simulation.py
================================================
# debug field
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
import argparse
# get argument from user
parser = argparse.ArgumentParser()
parser.add_argument('--path', type = str, required = True, help="pretrained model weight path")
args = parser.parse_args()
STORE_PATH = args.path
# validate the weight path
if not os.path.isdir(STORE_PATH):
raise NotImplementedError("The provided weight path does not exist!")
# get DRL_ALGO
if 'td3' in STORE_PATH:
DRL_ALGO = 'td3'
else:
DRL_ALGO = 'ddpg'
# reward design
if 'see' in STORE_PATH:
REWARD_DESIGN = 'see'
else:
REWARD_DESIGN = 'ssr'
# seeds and episode number
SEEDS = None
# process the argument
assert DRL_ALGO in ['ddpg', 'td3'], "drl must be ['ddpg', 'td3']"
assert REWARD_DESIGN in ['ssr', 'see'], "reward must be ['ssr', 'see']"
if SEEDS is not None:
assert len(SEEDS) in [1, 2] and isinstance(SEEDS[0], int) and isinstance(SEEDS[-1], int), "seeds must be a list of 1 or 2 integer"
if DRL_ALGO == 'td3':
from td3 import Agent
elif DRL_ALGO == 'ddpg':
from ddpg import Agent
import ddpg
from env import MiniSystem
import numpy as np
import math
import time
import torch
import shutil
# 1 init system model
episode_num = 1
episode_cnt = 0
step_num = 100
project_name = STORE_PATH
system = MiniSystem(
user_num=2,
RIS_ant_num=4,
UAV_ant_num=4,
if_dir_link=1,
if_with_RIS=True,
if_move_users=True,
if_movements=True,
reverse_x_y=(False, False),
if_UAV_pos_state = True,
reward_design = REWARD_DESIGN,
project_name = project_name,
step_num = step_num
)
if_Theta_fixed = False
if_G_fixed = False
if_BS = False
if_robust = True
# 2 init RL Agent
agent_1_param_dic = {}
agent_1_param_dic["alpha"] = 0.0001
agent_1_param_dic["beta"] = 0.001
agent_1_param_dic["input_dims"] = system.get_system_state_dim()
agent_1_param_dic["tau"] = 0.001
agent_1_param_dic["batch_size"] = 64
agent_1_param_dic["n_actions"] = system.get_system_action_dim() - 2
agent_1_param_dic["action_noise_factor"] = 0.1
agent_1_param_dic["memory_max_size"] = int(5/5 * episode_num * step_num) #/2
agent_1_param_dic["agent_name"] = "G_and_Phi"
agent_1_param_dic["layer1_size"] = 800
agent_1_param_dic["layer2_size"] = 600
agent_1_param_dic["layer3_size"] = 512
agent_1_param_dic["layer4_size"] = 256
agent_2_param_dic = {}
agent_2_param_dic["alpha"] = 0.0001
agent_2_param_dic["beta"] = 0.001
agent_2_param_dic["input_dims"] = 3
agent_2_param_dic["tau"] = 0.001
agent_2_param_dic["batch_size"] = 64
agent_2_param_dic["n_actions"] = 2
agent_2_param_dic["action_noise_factor"] = 0.5
agent_2_param_dic["memory_max_size"] = int(5/5 * episode_num * step_num) #/2
agent_2_param_dic["agent_name"] = "UAV"
agent_2_param_dic["layer1_size"] = 400
agent_2_param_dic["layer2_size"] = 300
agent_2_param_dic["layer3_size"] = 256
agent_2_param_dic["layer4_size"] = 128
if SEEDS is not None:
torch.manual_seed(SEEDS[0]) # 1
torch.cuda.manual_seed_all(SEEDS[0]) # 1
agent_1 = Agent(
alpha = agent_1_param_dic["alpha"],
beta = agent_1_param_dic["beta"],
input_dims = [agent_1_param_dic["input_dims"]],
tau = agent_1_param_dic["tau"],
env = system,
batch_size = agent_1_param_dic["batch_size"],
layer1_size=agent_1_param_dic["layer1_size"],
layer2_size=agent_1_param_dic["layer2_size"],
layer3_size=agent_1_param_dic["layer3_size"],
layer4_size=agent_1_param_dic["layer4_size"],
n_actions = agent_1_param_dic["n_actions"],
max_size = agent_1_param_dic["memory_max_size"],
agent_name= agent_1_param_dic["agent_name"]
)
if SEEDS is not None:
torch.manual_seed(SEEDS[-1]) # 2
torch.cuda.manual_seed_all(SEEDS[-1]) # 2
agent_2 = Agent(
alpha = agent_2_param_dic["alpha"],
beta = agent_2_param_dic["beta"],
input_dims = [agent_2_param_dic["input_dims"]],
tau = agent_2_param_dic["tau"],
env = system,
batch_size = agent_2_param_dic["batch_size"],
layer1_size=agent_2_param_dic["layer1_size"],
layer2_size=agent_2_param_dic["layer2_size"],
layer3_size=agent_2_param_dic["layer3_size"],
layer4_size=agent_2_param_dic["layer4_size"],
n_actions = agent_2_param_dic["n_actions"],
max_size = agent_2_param_dic["memory_max_size"],
agent_name= agent_2_param_dic["agent_name"]
)
if DRL_ALGO == 'td3':
agent_1.load_models(
load_file_actor = STORE_PATH + '/Actor_G_and_Phi_TD3',
load_file_critic_1 = STORE_PATH + '/Critic_1_G_and_Phi_TD3',
load_file_critic_2 = STORE_PATH + '/Critic_2_G_and_Phi_TD3'
)
agent_2.load_models(
load_file_actor = STORE_PATH + '/Actor_UAV_TD3',
load_file_critic_1 = STORE_PATH + '/Critic_1_UAV_TD3',
load_file_critic_2 = STORE_PATH + '/Critic_2_UAV_TD3'
)
elif DRL_ALGO == 'ddpg':
agent_1.load_models(
load_file_actor = STORE_PATH + '/Actor_G_and_Phi_ddpg',
load_file_critic = STORE_PATH + '/Critic_G_and_Phi_ddpg'
)
agent_2.load_models(
load_file_actor = STORE_PATH + '/Actor_UAV_ddpg',
load_file_critic = STORE_PATH + '/Critic_UAV_ddpg'
)
meta_dic = {}
print("***********************system information******************************")
print("folder_name: "+str(system.data_manager.store_path))
meta_dic['folder_name'] = system.data_manager.store_path
print("user_num: "+str(system.user_num))
meta_dic['user_num'] = system.user_num
print("if_dir: "+str(system.if_dir_link))
meta_dic['if_dir_link'] = system.if_dir_link
print("if_with_RIS: "+str(system.if_with_RIS))
meta_dic['if_with_RIS'] = system.if_with_RIS
print("if_user_m: "+str(system.if_move_users))
meta_dic['if_move_users'] = system.if_move_users
print("RIS_ant_num: "+str(system.RIS.ant_num))
meta_dic['system_RIS_ant_num'] = system.RIS.ant_num
print("UAV_ant_num: "+str(system.UAV.ant_num))
meta_dic['system_UAV_ant_num'] = system.UAV.ant_num
print("if_movements: "+str(system.if_movements))
meta_dic['system_if_movements'] = system.if_movements
print("reverse_x_y: "+str(system.reverse_x_y))
meta_dic['system_reverse_x_y'] = system.reverse_x_y
print("if_UAV_pos_state:"+str(system.if_UAV_pos_state))
meta_dic['if_UAV_pos_state'] = system.if_UAV_pos_state
print("ep_num: "+str(episode_num))
meta_dic['episode_num'] = episode_num
print("step_num: "+str(step_num))
meta_dic['step_num'] = step_num
print("***********************agent_1 information******************************")
tplt = "{0:{2}^20}\t{1:{2}^20}"
for i in agent_1_param_dic:
parm = agent_1_param_dic[i]
print(tplt.format(i, parm, chr(12288)))
meta_dic["agent_1"] = agent_1_param_dic
print("***********************agent_2 information******************************")
for i in agent_2_param_dic:
parm = agent_2_param_dic[i]
print(tplt.format(i, parm, chr(12288)))
meta_dic["agent_2"] = agent_2_param_dic
system.data_manager.save_meta_data(meta_dic)
print("***********************traning information******************************")
try:
while episode_cnt < episode_num:
# 1 reset the whole system
system.reset()
step_cnt = 0
score_per_ep = 0
# 2 get the initial state
if if_robust:
tmp = system.observe()
#z = np.random.multivariate_normal(np.zeros(2), 0.5*np.eye(2), size=len(tmp)).view(np.complex128)
z = np.random.normal(size=len(tmp))
observersion_1 = list(
np.array(tmp) + 0.6 *1e-7* z
)
else:
observersion_1 = system.observe()
observersion_2 = list(system.UAV.coordinate)
while step_cnt < step_num:
# 1 count num of step in one episode
step_cnt += 1
# judge if pause the whole system
if not system.render_obj.pause:
# 2 choose action acoording to current state
action_1 = agent_1.choose_action(observersion_1, greedy=agent_1_param_dic["action_noise_factor"] * math.pow((1-episode_cnt / episode_num), 2))
action_2 = agent_2.choose_action(observersion_2, greedy=agent_2_param_dic["action_noise_factor"]* math.pow((1-episode_cnt / episode_num), 2))
if if_BS:
action_2[0]=0
action_2[1]=0
if if_Theta_fixed:
action_1[0+2 * system.UAV.ant_num * system.user_num:] = len(action_1[0+2 * system.UAV.ant_num * system.user_num:])*[0]
if if_G_fixed:
action_1[0:0+2 * system.UAV.ant_num * system.user_num]=np.array([-0.0313, -0.9838, 0.3210, 1.0, -0.9786, -0.1448, 0.3518, 0.5813, -1.0, -0.2803, -0.4616, -0.6352, -0.1449, 0.7040, 0.4090, -0.8521]) * math.pow(episode_cnt / episode_num, 2) * 0.7
#action_1[0:0+2 * system.UAV.ant_num * system.user_num]=len(action_1[0:0+2 * system.UAV.ant_num * system.user_num])*[0.5]
# 3 get newstate, reward
if system.if_with_RIS:
new_state_1, reward, done, info = system.step(
action_0=action_2[0],
action_1=action_2[1],
G=action_1[0:0+2 * system.UAV.ant_num * system.user_num],
Phi=action_1[0+2 * system.UAV.ant_num * system.user_num:],
set_pos_x=action_2[0],
set_pos_y=action_2[1]
)
new_state_2 = list(system.UAV.coordinate)
else:
new_state_1, reward, done, info = system.step(
action_0=action_2[0],
action_1=action_2[1],
G=action_1[0:0+2 * system.UAV.ant_num * system.user_num],
set_pos_x=action_2[0],
set_pos_y=action_2[1]
)
new_state_2 = list(system.UAV.coordinate)
score_per_ep += reward
# render
system.render_obj.render(0.001) # no rendering for faster
observersion_1 = new_state_1
observersion_2 = new_state_2
if done == True:
break
else:
system.render_obj.render_pause() # no rendering for faster
time.sleep(0.001) #time.sleep(1)
system.reset()
print("ep_num: "+str(episode_cnt)+" ep_score: "+str(score_per_ep))
episode_cnt +=1
except KeyboardInterrupt:
raise KeyboardInterrupt
finally:
shutil.rmtree('data/storage/data')
================================================
FILE: td3.py
================================================
import os
import torch as T
#import torch.cuda as T
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
from data_manager import DataManager
class OUActionNoise(object):
def __init__(self, mu, sigma=0.15, theta=.2, dt=1e-2, x0=None):
self.theta = theta
self.mu = mu
self.sigma = sigma
self.dt = dt
self.x0 = x0
self.reset()
def __call__(self):
x = self.x_prev + self.theta * (self.mu - self.x_prev) * self.dt + \
self.sigma * np.sqrt(self.dt) * np.random.normal(size=self.mu.shape)
self.x_prev = x
return x
def reset(self):
self.x_prev = self.x0 if self.x0 is not None else np.zeros_like(self.mu)
def __repr__(self):
return 'OrnsteinUhlenbeckActionNoise(mu={}, sigma={})'.format(
self.mu, self.sigma)
class AWGNActionNoise(object):
def __init__(self, mu = 0, sigma=1):
self.mu = mu
self.sigma = sigma
def __call__(self):
#self.mu = mu
#self.sigma = sigma
x = np.random.normal(size=self.mu.shape) * self.sigma
return x
class ReplayBuffer(object):
def __init__(self, max_size, input_shape, n_actions):
self.mem_size = max_size
self.mem_cntr = 0
self.state_memory = np.zeros((self.mem_size, *input_shape))
self.new_state_memory = np.zeros((self.mem_size, *input_shape))
self.action_memory = np.zeros((self.mem_size, n_actions))
self.reward_memory = np.zeros(self.mem_size)
self.terminal_memory = np.zeros(self.mem_size, dtype=np.float32)
def store_transition(self, state, action, reward, state_, done):
index = self.mem_cntr % self.mem_size
self.state_memory[index] = state
self.new_state_memory[index] = state_
self.action_memory[index] = action
self.reward_memory[index] = reward
self.terminal_memory[index] = 1 - done
self.mem_cntr += 1
def sample_buffer(self, batch_size):
max_mem = min(self.mem_cntr, self.mem_size)
batch = np.random.choice(max_mem, batch_size)
states = self.state_memory[batch]
actions = self.action_memory[batch]
rewards = self.reward_memory[batch]
states_ = self.new_state_memory[batch]
terminal = self.terminal_memory[batch]
return states, actions, rewards, states_, terminal
class CriticNetwork(nn.Module):
def __init__(self, beta, input_dims, fc1_dims, fc2_dims, fc3_dims, fc4_dims, n_actions, name,
chkpt_dir='C:\\demo\\IRS_TD3_minimal\\main_foder\\tmp\\TD3', load_file = ''):
super(CriticNetwork, self).__init__()
self.input_dims = input_dims
self.fc1_dims = fc1_dims
self.fc2_dims = fc2_dims
self.fc3_dims = fc3_dims
self.fc4_dims = fc4_dims
self.n_actions = n_actions
self.checkpoint_file = os.path.join(chkpt_dir,name+'_TD3')
self.load_file = 'C:\\demo\\other_branch\\Learning-based_Secure_Transmission_for_RIS_Aided_mmWave-UAV_Communications_with_Imperfect_CSI\\data\\mannal_store\\models\\Critic_UAV_TD3'
self.fc1 = nn.Linear(*self.input_dims, self.fc1_dims)
f1 = 1./np.sqrt(self.fc1.weight.data.size()[0])
T.nn.init.uniform_(self.fc1.weight.data, -f1, f1)
T.nn.init.uniform_(self.fc1.bias.data, -f1, f1)
#self.fc1.weight.data.uniform_(-f1, f1)
#self.fc1.bias.data.uniform_(-f1, f1)
self.bn1 = nn.LayerNorm(self.fc1_dims)
self.fc2 = nn.Linear(self.fc1_dims, self.fc2_dims)
f2 = 1./np.sqrt(self.fc2.weight.data.size()[0])
#f2 = 0.002
T.nn.init.uniform_(self.fc2.weight.data, -f2, f2)
T.nn.init.uniform_(self.fc2.bias.data, -f2, f2)
#self.fc2.weight.data.uniform_(-f2, f2)
#self.fc2.bias.data.uniform_(-f2, f2)
self.bn2 = nn.LayerNorm(self.fc2_dims)
self.fc3 = nn.Linear(self.fc2_dims, self.fc3_dims)
f3 = 1./np.sqrt(self.fc3.weight.data.size()[0])
T.nn.init.uniform_(self.fc3.weight.data, -f3, f3)
T.nn.init.uniform_(self.fc3.bias.data, -f3, f3)
#self.fc2.weight.data.uniform_(-f2, f2)
#self.fc2.bias.data.uniform_(-f2, f2)
self.bn3 = nn.LayerNorm(self.fc3_dims)
self.fc4 = nn.Linear(self.fc3_dims, self.fc4_dims)
f4 = 1./np.sqrt(self.fc4.weight.data.size()[0])
T.nn.init.uniform_(self.fc4.weight.data, -f4, f4)
T.nn.init.uniform_(self.fc4.bias.data, -f4, f4)
#self.fc2.weight.data.uniform_(-f2, f2)
#self.fc2.bias.data.uniform_(-f2, f2)
self.bn4 = nn.LayerNorm(self.fc4_dims)
self.action_value = nn.Linear(self.n_actions, self.fc4_dims)
f5 = 0.003
self.q = nn.Linear(self.fc4_dims, 1)
T.nn.init.uniform_(self.q.weight.data, -f5, f5)
T.nn.init.uniform_(self.q.bias.data, -f5, f5)
#self.q.weight.data.uniform_(-f3, f3)
#self.q.bias.data.uniform_(-f3, f3)
self.optimizer = optim.Adam(self.parameters(), lr=beta)
# if torch.cuda.available():
# import torch.cuda as T
# else:
# import torch as T
self.device = T.device('cuda:0' if T.cuda.is_available() else 'cpu')
self.to(self.device)
def forward(self, state, action):
state_value = self.fc1(state)
state_value = self.bn1(state_value)
state_value = F.relu(state_value)
state_value = self.fc2(state_value)
state_value = self.bn2(state_value)
state_value = F.relu(state_value)
state_value = self.fc3(state_value)
state_value = self.bn3(state_value)
state_value = F.relu(state_value)
state_value = self.fc4(state_value)
state_value = self.bn4(state_value)
action_value = F.relu(self.action_value(action))
state_action_value = F.relu(T.add(state_value, action_value))
state_action_value = self.q(state_action_value)
return state_action_value
def save_checkpoint(self):
print('... saving checkpoint ...')
T.save(self.state_dict(), self.checkpoint_file)
def load_checkpoint(self,load_file = ''):
print('... loading checkpoint ...')
if T.cuda.is_available():
self.load_state_dict(T.load(load_file))
else:
self.load_state_dict(T.load(load_file, map_location=T.device('cpu')))
class ActorNetwork(nn.Module):
def __init__(self, alpha, input_dims, fc1_dims, fc2_dims, fc3_dims, fc4_dims, n_actions, name,
chkpt_dir='C:\\demo\\IRS_TD3_minimal\\main_foder\\tmp\\TD3', load_file = ''):
super(ActorNetwork, self).__init__()
self.input_dims = input_dims
self.fc1_dims = fc1_dims
self.fc2_dims = fc2_dims
self.fc3_dims = fc3_dims
self.fc4_dims = fc4_dims
self.n_actions = n_actions
self.checkpoint_file = os.path.join(chkpt_dir,name+'_TD3')
self.load_file = 'C:\\demo\\other_branch\\Learning-based_Secure_Transmission_for_RIS_Aided_mmWave-UAV_Communications_with_Imperfect_CSI\\data\\mannal_store\\models\\Actor_UAV_TD3'
self.fc1 = nn.Linear(*self.input_dims, self.fc1_dims)
f1 = 1./np.sqrt(self.fc1.weight.data.size()[0])
# T.nn.init.uniform_(self.fc1.weight.data, -f1, f1)
# T.nn.init.uniform_(self.fc1.bias.data, -f1, f1)
self.fc1.weight.data.uniform_(-f1, f1)
self.fc1.bias.data.uniform_(-f1, f1)
self.bn1 = nn.LayerNorm(self.fc1_dims)
self.fc2 = nn.Linear(self.fc1_dims, self.fc2_dims)
#f2 = 0.002
f2 = 1./np.sqrt(self.fc2.weight.data.size()[0])
# T.nn.init.uniform_(self.fc2.weight.data, -f2, f2)
# T.nn.init.uniform_(self.fc2.bias.data, -f2, f2)
self.fc2.weight.data.uniform_(-f2, f2)
self.fc2.bias.data.uniform_(-f2, f2)
self.bn2 = nn.LayerNorm(self.fc2_dims)
self.fc3 = nn.Linear(self.fc2_dims, self.fc3_dims)
f3 = 1./np.sqrt(self.fc3.weight.data.size()[0])
# T.nn.init.uniform_(self.fc3.weight.data, -f3, f3)
# T.nn.init.uniform_(self.fc3.bias.data, -f3, f3)
self.fc2.weight.data.uniform_(-f2, f2)
self.fc2.bias.data.uniform_(-f2, f2)
self.bn3 = nn.LayerNorm(self.fc3_dims)
self.fc4 = nn.Linear(self.fc3_dims, self.fc4_dims)
f4 = 1./np.sqrt(self.fc4.weight.data.size()[0])
# T.nn.init.uniform_(self.fc4.weight.data, -f4, f4)
# T.nn.init.uniform_(self.fc4.bias.data, -f4, f4)
self.fc2.weight.data.uniform_(-f2, f2)
self.fc2.bias.data.uniform_(-f2, f2)
self.bn4 = nn.LayerNorm(self.fc4_dims)
#f3 = 0.004
f5 = 0.003
self.mu = nn.Linear(self.fc4_dims, self.n_actions)
# T.nn.init.uniform_(self.mu.weight.data, -f5, f5)
# T.nn.init.uniform_(self.mu.bias.data, -f5, f5)
self.mu.weight.data.uniform_(-f3, f3)
self.mu.bias.data.uniform_(-f3, f3)
self.optimizer = optim.Adam(self.parameters(), lr=alpha)
self.device = T.device('cuda:0' if T.cuda.is_available() else 'cpu')
self.to(self.device)
def forward(self, state):
x = self.fc1(state)
x = self.bn1(x)
x = F.relu(x)
x = self.fc2(x)
x = self.bn2(x)
x = F.relu(x)
x = self.fc3(x)
x = self.bn3(x)
x = F.relu(x)
x = self.fc4(x)
x = self.bn4(x)
x = F.relu(x)
x = T.tanh(self.mu(x))
return x
def save_checkpoint(self):
print('... saving checkpoint ...')
T.save(self.state_dict(), self.checkpoint_file)
def load_checkpoint(self, load_file=''):
print('... loading checkpoint ...')
if T.cuda.is_available():
self.load_state_dict(T.load(load_file))
else:
self.load_state_dict(T.load(load_file, map_location=T.device('cpu')))
class Agent(object):
def __init__(self, alpha, beta, input_dims, tau, env, gamma=0.99,
n_actions=2, max_size=1000000, layer1_size=400,
layer2_size=300, layer3_size=256, layer4_size=128, batch_size=64,
update_actor_interval=2, noise = 'AWGN', agent_name = 'default', load_file = ''):
self.load_file = load_file
self.layer1_size = layer1_size
self.layer2_size = layer2_size
self.layer3_size = layer3_size
self.layer4_size = layer4_size
self.gamma = gamma
self.tau = tau
self.memory = ReplayBuffer(max_size, input_dims, n_actions)
self.batch_size = batch_size
self.learn_step_cntr = 0
self.update_actor_iter = update_actor_interval
self.actor = ActorNetwork(alpha, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='Actor_' + agent_name,chkpt_dir=env.data_manager.store_path )
self.critic_1 = CriticNetwork(beta, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='Critic_1_' + agent_name,chkpt_dir=env.data_manager.store_path)
self.critic_2 = CriticNetwork(beta, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='Critic_2_' + agent_name,chkpt_dir=env.data_manager.store_path)
self.target_actor = ActorNetwork(alpha, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='TargetActor_' + agent_name,chkpt_dir=env.data_manager.store_path)
self.target_critic_1 = CriticNetwork(beta, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='TargetCritic_1_' + agent_name,chkpt_dir=env.data_manager.store_path)
self.target_critic_2 = CriticNetwork(beta, input_dims, layer1_size,
layer2_size, layer3_size, layer4_size, n_actions=n_actions,
name='TargetCritic_2_' + agent_name,chkpt_dir=env.data_manager.store_path)
if noise == 'OU':
self.noise = OUActionNoise(mu=np.zeros(n_actions))
elif noise == 'AWGN':
self.noise = AWGNActionNoise(mu = np.zeros(n_actions))
# tau = 1 means copy parameters to target
self.update_network_parameters(tau=1)
def choose_action(self, observation, greedy=0.5, epsilon = 1):
self.actor.eval()
observation = T.tensor(observation, dtype=T.float).to(self.actor.device)
mu = self.actor.forward(observation).to(self.actor.device)
mu_prime = mu + T.tensor(greedy * self.noise(),
dtype=T.float).to(self.actor.device)
self.actor.train()
return mu_prime.cpu().detach().numpy()
def remember(self, state, action, reward, new_state, done):
self.memory.store_transition(state, action, reward, new_state, done)
def learn(self):
if self.memory.mem_cntr < self.batch_size:
return
# the done here is opposite of the done in the env
state, action, reward, new_state, done = \
self.memory.sample_buffer(self.batch_size)
# trun s, a, r, new_s into tensor
reward = T.tensor(reward, dtype=T.float).to(self.critic_1.device)
done = T.tensor(done).to(self.critic_1.device)
new_state = T.tensor(new_state, dtype=T.float).to(self.critic_1.device)
action = T.tensor(action, dtype=T.float).to(self.critic_1.device)
state = T.tensor(state, dtype=T.float).to(self.critic_1.device)
# trun on evaliation mode of target actor, target critic, critic net
# fix these three nets
self.target_actor.eval()
self.target_critic_1.eval()
self.target_critic_2.eval()
self.critic_1.eval()
self.critic_2.eval()
target_actions = self.target_actor.forward(new_state)
# target_actions = target_actions + \
# T.clamp(T.tensor(np.random.normal(scale=0.2)), -0.5, 0.5)
# # might break if elements of min and max are not all equal
# target_actions = T.clamp(target_actions, self.min_action[0], self.max_action[0])
critic_value_1_ = self.target_critic_1.forward(new_state, target_actions)
critic_value_2_ = self.target_critic_2.forward(new_state, target_actions)
critic_value_1 = self.critic_1.forward(state, action)
critic_value_2 = self.critic_2.forward(state, action)
critic_value_ = T.min(critic_value_1_, critic_value_2_)
target = []
for j in range(self.batch_size):
target.append(reward[j] + self.gamma*critic_value_[j]*done[j])
target = T.tensor(target).to(self.critic_1.device)
target = target.view(self.batch_size, 1)
# here update the critic net using mse of (r + gamma * Q_argmax_a*(newstate, a*)) - Q(state, action)
self.critic_1.train()
self.critic_2.train()
self.critic_1.optimizer.zero_grad()
self.critic_2.optimizer.zero_grad()
critic_1_loss = F.mse_loss(target, critic_value_1)
critic_2_loss = F.mse_loss(target, critic_value_2)
critic_loss = critic_1_loss + critic_2_loss
critic_loss.backward()
self.critic_1.optimizer.step()
self.critic_2.optimizer.step()
self.learn_step_cntr += 1
if self.learn_step_cntr % self.update_actor_iter != 0:
return
# here update the actor net by policy gradient
# first fix the critic net
self.critic_1.eval()
self.critic_2.eval()
self.actor.optimizer.zero_grad()
mu = self.actor.forward(state)
self.actor.train()
actor_q1_loss = self.critic_1.forward(state, mu)
actor_loss = -T.mean(actor_q1_loss)
actor_loss.backward()
self.actor.optimizer.step()
self.update_network_parameters()
def update_network_parameters(self, tau=None):
if tau is None:
tau = self.tau
actor_params = self.actor.named_parameters()
critic_1_params = self.critic_1.named_parameters()
critic_2_params = self.critic_2.named_parameters()
target_actor_params = self.target_actor.named_parameters()
target_critic_1_params = self.target_critic_1.named_parameters()
target_critic_2_params = self.target_critic_2.named_parameters()
critic_1_state_dict = dict(critic_1_params)
critic_2_state_dict = dict(critic_2_params)
actor_state_dict = dict(actor_params)
target_actor_state_dict = dict(target_actor_params)
target_critic_1_state_dict = dict(target_critic_1_params)
target_critic_2_state_dict = dict(target_critic_2_params)
for name in critic_1_state_dict:
critic_1_state_dict[name] = tau*critic_1_state_dict[name].clone() + \
(1-tau)*target_critic_1_state_dict[name].clone()
for name in critic_2_state_dict:
critic_2_state_dict[name] = tau*critic_2_state_dict[name].clone() + \
(1-tau)*target_critic_2_state_dict[name].clone()
for name in actor_state_dict:
actor_state_dict[name] = tau*actor_state_dict[name].clone() + \
(1-tau)*target_actor_state_dict[name].clone()
self.target_critic_1.load_state_dict(critic_1_state_dict)
self.target_critic_2.load_state_dict(critic_2_state_dict)
self.target_actor.load_state_dict(actor_state_dict)
def save_models(self):
self.actor.save_checkpoint()
self.target_actor.save_checkpoint()
self.critic_1.save_checkpoint()
self.critic_2.save_checkpoint()
self.target_critic_1.save_checkpoint()
self.target_critic_2.save_checkpoint()
def load_models(self, load_file_actor = '',load_file_critic_1 ='',load_file_critic_2 =''):
self.actor.load_checkpoint(load_file = load_file_actor)
self.target_actor.load_checkpoint(load_file = load_file_actor)
self.critic_1.load_checkpoint(load_file = load_file_critic_1)
self.critic_2.load_checkpoint(load_file = load_file_critic_2)
self.target_critic_1.load_checkpoint(load_file = load_file_critic_1)
self.target_critic_2.load_checkpoint(load_file = load_file_critic_2)
gitextract_sya5mzu7/ ├── README.md ├── batch_eval.sh ├── batch_train.sh ├── channel.py ├── data/ │ ├── .vscode/ │ │ └── settings.json │ ├── data_test.py │ ├── init_location.xlsx │ ├── readme.md │ ├── simulation_result.xlsx │ └── storage/ │ ├── benchmark/ │ │ ├── ddpg_see_benchmark/ │ │ │ ├── Actor_UAV_ddpg │ │ │ ├── Critic_UAV_ddpg │ │ │ ├── TargetActor_UAV_ddpg │ │ │ └── TargetCritic_UAV_ddpg │ │ ├── ddpg_ssr_benchmark/ │ │ │ ├── Actor_UAV_ddpg │ │ │ ├── Critic_UAV_ddpg │ │ │ ├── TargetActor_UAV_ddpg │ │ │ └── TargetCritic_UAV_ddpg │ │ ├── td3_see_benchmark/ │ │ │ ├── Actor_UAV_TD3 │ │ │ ├── Critic_1_UAV_TD3 │ │ │ ├── Critic_2_UAV_TD3 │ │ │ ├── TargetActor_UAV_TD3 │ │ │ ├── TargetCritic_1_UAV_TD3 │ │ │ └── TargetCritic_2_UAV_TD3 │ │ └── td3_ssr_benchmark/ │ │ ├── Actor_UAV_TD3 │ │ ├── Critic_1_UAV_TD3 │ │ ├── Critic_2_UAV_TD3 │ │ ├── TargetActor_UAV_TD3 │ │ ├── TargetCritic_1_UAV_TD3 │ │ └── TargetCritic_2_UAV_TD3 │ └── readme.md ├── data_manager.py ├── ddpg.py ├── entity.py ├── env.py ├── env1.py ├── load_and_plot.py ├── main_RIS.py ├── main_ref.py ├── main_train.py ├── math_tool.py ├── plot_see.py ├── plot_ssr.py ├── plot_traj.py ├── render.py ├── requirements.txt ├── run_simulation.py └── td3.py
SYMBOL INDEX (181 symbols across 13 files)
FILE: channel.py
class mmWave_channel (line 6) | class mmWave_channel(object):
method __init__ (line 12) | def __init__(self, transmitter, receiver, frequncy):
method init_type (line 33) | def init_type(self):
method get_channel_path_loss (line 70) | def get_channel_path_loss(self):
method get_estimated_channel_matrix (line 81) | def get_estimated_channel_matrix(self):
method generate_array_response (line 123) | def generate_array_response(self, transceiver, theta, fai):
method update_CSI (line 155) | def update_CSI(self):
FILE: data_manager.py
class DataManager (line 7) | class DataManager(object):
method __init__ (line 13) | def __init__(self, store_list = ['beamforming_matrix', 'reflecting_coe...
method save_file (line 34) | def save_file(self, episode_cnt = 10):
method save_meta_data (line 45) | def save_meta_data(self, meta_dic):
method init_format (line 51) | def init_format(self):
method read_init_location (line 58) | def read_init_location(self, entity_type = 'user', index = 0):
method store_data (line 67) | def store_data(self, row_data, value_name):
FILE: ddpg.py
class OUActionNoise (line 9) | class OUActionNoise(object):
method __init__ (line 10) | def __init__(self, mu, sigma=0.15, theta=.2, dt=1e-2, x0=None):
method __call__ (line 18) | def __call__(self):
method reset (line 24) | def reset(self):
method __repr__ (line 27) | def __repr__(self):
class AWGNActionNoise (line 31) | class AWGNActionNoise(object):
method __init__ (line 32) | def __init__(self, mu = 0, sigma=1):
method __call__ (line 36) | def __call__(self):
class ReplayBuffer (line 42) | class ReplayBuffer(object):
method __init__ (line 43) | def __init__(self, max_size, input_shape, n_actions):
method store_transition (line 52) | def store_transition(self, state, action, reward, state_, done):
method sample_buffer (line 61) | def sample_buffer(self, batch_size):
class CriticNetwork (line 74) | class CriticNetwork(nn.Module):
method __init__ (line 75) | def __init__(self, beta, input_dims, fc1_dims, fc2_dims, fc3_dims, fc4...
method forward (line 136) | def forward(self, state, action):
method save_checkpoint (line 155) | def save_checkpoint(self):
method load_checkpoint (line 159) | def load_checkpoint(self,load_file = ''):
class ActorNetwork (line 166) | class ActorNetwork(nn.Module):
method __init__ (line 167) | def __init__(self, alpha, input_dims, fc1_dims, fc2_dims, fc3_dims, fc...
method forward (line 224) | def forward(self, state):
method save_checkpoint (line 241) | def save_checkpoint(self):
method load_checkpoint (line 245) | def load_checkpoint(self, load_file=''):
class Agent (line 252) | class Agent(object):
method __init__ (line 253) | def __init__(self, alpha, beta, input_dims, tau, env, gamma=0.99,
method choose_action (line 286) | def choose_action(self, observation, greedy=0.5, epsilon = 1):
method remember (line 296) | def remember(self, state, action, reward, new_state, done):
method learn (line 299) | def learn(self):
method update_network_parameters (line 352) | def update_network_parameters(self, tau=None):
method save_models (line 391) | def save_models(self):
method load_models (line 397) | def load_models(self, load_file_actor = '',load_file_critic =''):
method check_actor_params (line 403) | def check_actor_params(self):
FILE: entity.py
class UAV (line 5) | class UAV(object):
method __init__ (line 12) | def __init__(self, coordinate, index = 0, rotation = 0, ant_num=16, an...
method reset (line 29) | def reset(self, coordinate):
method update_coor_sys (line 35) | def update_coor_sys(self, delta_angle):
method update_coordinate (line 51) | def update_coordinate(self, distance_delta_d, direction_fai):
method move (line 60) | def move(self, distance_delta_d, direction_fai, delta_angle = 0):
class RIS (line 67) | class RIS(object):
method __init__ (line 73) | def __init__(self, coordinate, coor_sys_z, index = 0, ant_num=36, ant_...
class User (line 93) | class User(object):
method __init__ (line 97) | def __init__(self, coordinate, index, ant_num = 1, ant_type = 'single'):
method reset (line 118) | def reset(self, coordinate):
method update_coordinate (line 124) | def update_coordinate(self, distance_delta_d, direction_fai):
method move (line 133) | def move(self, distance_delta_d, direction_fai):
class Attacker (line 139) | class Attacker(object):
method __init__ (line 143) | def __init__(self, coordinate, index, ant_num = 1, ant_type= 'single'):
method reset (line 162) | def reset(self, coordinate):
method update_coordinate (line 168) | def update_coordinate(self, distance_delta_d, direction_fai):
method move (line 177) | def move(self, distance_delta_d, direction_fai):
FILE: env.py
function get_energy_consumption (line 34) | def get_energy_consumption(v_t):
class MiniSystem (line 57) | class MiniSystem(object):
method __init__ (line 63) | def __init__(self, UAV_num = 1, RIS_num = 1, user_num = 1, attacker_nu...
method reset (line 138) | def reset(self):
method step (line 170) | def step(self, action_0 = 0, action_1 = 0, G = 0, Phi = 0, set_pos_x =...
method reward (line 265) | def reward(self):
method observe (line 287) | def observe(self):
method store_current_system_sate (line 303) | def store_current_system_sate(self):
method update_channel_capacity (line 338) | def update_channel_capacity(self):
method calculate_comprehensive_channel_of_attacker_p (line 356) | def calculate_comprehensive_channel_of_attacker_p(self, p):
method calculate_comprehensive_channel_of_user_k (line 366) | def calculate_comprehensive_channel_of_user_k(self, k):
method calculate_capacity_of_user_k (line 376) | def calculate_capacity_of_user_k(self, k):
method calculate_capacity_array_of_attacker_p (line 397) | def calculate_capacity_array_of_attacker_p(self, p):
method calculate_secure_capacity_of_user_k (line 426) | def calculate_secure_capacity_of_user_k(self, k=2):
method get_system_action_dim (line 436) | def get_system_action_dim(self):
method get_system_state_dim (line 452) | def get_system_state_dim(self):
FILE: env1.py
class minimal_IRS_system (line 6) | class minimal_IRS_system():
method __init__ (line 7) | def __init__(self, BS_M = 8, IRS_N_x = 4, IRS_N_y =2, K = 8, statistic...
method calculate_total_transmit_power (line 137) | def calculate_total_transmit_power(self):
method dBm_to_W (line 148) | def dBm_to_W(self, dBm):
method generate_noise (line 153) | def generate_noise(self, mu, segma, size = 1):
method calculate_received_signal_amplitude (line 167) | def calculate_received_signal_amplitude(self):
method calculate_SINR_of_kth_user (line 179) | def calculate_SINR_of_kth_user(self, k):
method calculate_data_rate (line 195) | def calculate_data_rate(self):
method calculate_theta_and_psi (line 205) | def calculate_theta_and_psi(self, ref_norm_vector, coordinate):
method calculate_channel_BS_to_IRS (line 217) | def calculate_channel_BS_to_IRS(self, IRS): # IRS elements spacing hal...
method calculate_channel_IRS_to_user (line 267) | def calculate_channel_IRS_to_user(self, IRS):
method reset (line 292) | def reset(self):
method apply_action (line 302) | def apply_action(self, action):
method get_state (line 330) | def get_state(self):
method step (line 407) | def step(self, action):
method render (line 431) | def render(self):
FILE: load_and_plot.py
function get_energy_consumption (line 45) | def get_energy_consumption(v_t):
function read_init_location (line 70) | def read_init_location(entity_type = 'user', index = 0):
class LoadAndPlot (line 81) | class LoadAndPlot(object):
method __init__ (line 85) | def __init__(self, store_path, \
method load_one_ep (line 100) | def load_one_ep(self, file_name):
method load_all_steps (line 105) | def load_all_steps(self):
method plot (line 150) | def plot(self):
method plot_one_RIS_element (line 321) | def plot_one_RIS_element(self, index):
method plot_trajectory (line 346) | def plot_trajectory(self):
method restruct (line 434) | def restruct(self):
FILE: math_tool.py
function cartesian_coordinate_to_spherical_coordinate (line 11) | def cartesian_coordinate_to_spherical_coordinate(cartesian_coordinate):
function vecter_normalization (line 42) | def vecter_normalization(cartesian_coordinate):
function get_coor_ref (line 45) | def get_coor_ref(coor_sys, coor):
function dB_to_normal (line 56) | def dB_to_normal(dB):
function normal_to_dB (line 63) | def normal_to_dB(normal):
function diag_to_vector (line 70) | def diag_to_vector(diag):
function vector_to_diag (line 80) | def vector_to_diag(vector):
function bigger_than_zero (line 90) | def bigger_than_zero(value):
function dataframe_to_dictionary (line 96) | def dataframe_to_dictionary(df):
function convert_list_to_complex_matrix (line 102) | def convert_list_to_complex_matrix(list_real, shape):
function convert_list_to_complex_diag (line 116) | def convert_list_to_complex_diag(list_real, diag_row_num):
function map_to (line 127) | def map_to(x, x_range:tuple, y_range:tuple):
FILE: plot_see.py
function get_energy_consumption (line 33) | def get_energy_consumption(v_t):
function read_init_location (line 58) | def read_init_location(entity_type = 'user', index = 0):
class LoadAndPlot (line 69) | class LoadAndPlot(object):
method __init__ (line 73) | def __init__(self, store_paths, \
method load_one_ep (line 88) | def load_one_ep(self, file_name):
method load_all_steps (line 93) | def load_all_steps(self):
method plot (line 138) | def plot(self):
FILE: plot_ssr.py
function get_energy_consumption (line 33) | def get_energy_consumption(v_t):
function read_init_location (line 58) | def read_init_location(entity_type = 'user', index = 0):
class LoadAndPlot (line 69) | class LoadAndPlot(object):
method __init__ (line 73) | def __init__(self, store_paths, \
method load_one_ep (line 88) | def load_one_ep(self, file_name):
method load_all_steps (line 93) | def load_all_steps(self):
method plot (line 138) | def plot(self):
FILE: plot_traj.py
function get_energy_consumption (line 33) | def get_energy_consumption(v_t):
function read_init_location (line 58) | def read_init_location(entity_type = 'user', index = 0):
class LoadAndPlot (line 69) | class LoadAndPlot(object):
method __init__ (line 73) | def __init__(self, store_paths, \
method load_one_ep (line 87) | def load_one_ep(self, file_name):
method load_all_steps (line 92) | def load_all_steps(self):
method plot (line 137) | def plot(self):
FILE: render.py
class Arrow3D (line 7) | class Arrow3D(FancyArrowPatch):
method __init__ (line 8) | def __init__(self, xs, ys, zs, *args, **kwargs):
method draw (line 12) | def draw(self, renderer):
class Render (line 18) | class Render(object):
method __init__ (line 22) | def __init__(self, system, canv_x = (-25, 25), canv_y = (0, 50), canv_...
method render_pause (line 32) | def render_pause(self):
method render (line 47) | def render(self, interval):
method plot_click (line 61) | def plot_click(self, event):
method plot_config (line 64) | def plot_config(self):
method plot_entities (line 77) | def plot_entities(self, ax):
method plot_channels (line 122) | def plot_channels(self, ax):
method plot_one_channel (line 137) | def plot_one_channel(self, ax, channel, color, text = "channel"):
method plot_text (line 163) | def plot_text(self, ax):
FILE: td3.py
class OUActionNoise (line 9) | class OUActionNoise(object):
method __init__ (line 10) | def __init__(self, mu, sigma=0.15, theta=.2, dt=1e-2, x0=None):
method __call__ (line 18) | def __call__(self):
method reset (line 24) | def reset(self):
method __repr__ (line 27) | def __repr__(self):
class AWGNActionNoise (line 31) | class AWGNActionNoise(object):
method __init__ (line 32) | def __init__(self, mu = 0, sigma=1):
method __call__ (line 36) | def __call__(self):
class ReplayBuffer (line 42) | class ReplayBuffer(object):
method __init__ (line 43) | def __init__(self, max_size, input_shape, n_actions):
method store_transition (line 52) | def store_transition(self, state, action, reward, state_, done):
method sample_buffer (line 61) | def sample_buffer(self, batch_size):
class CriticNetwork (line 74) | class CriticNetwork(nn.Module):
method __init__ (line 75) | def __init__(self, beta, input_dims, fc1_dims, fc2_dims, fc3_dims, fc4...
method forward (line 136) | def forward(self, state, action):
method save_checkpoint (line 155) | def save_checkpoint(self):
method load_checkpoint (line 159) | def load_checkpoint(self,load_file = ''):
class ActorNetwork (line 166) | class ActorNetwork(nn.Module):
method __init__ (line 167) | def __init__(self, alpha, input_dims, fc1_dims, fc2_dims, fc3_dims, fc...
method forward (line 224) | def forward(self, state):
method save_checkpoint (line 241) | def save_checkpoint(self):
method load_checkpoint (line 245) | def load_checkpoint(self, load_file=''):
class Agent (line 252) | class Agent(object):
method __init__ (line 253) | def __init__(self, alpha, beta, input_dims, tau, env, gamma=0.99,
method choose_action (line 295) | def choose_action(self, observation, greedy=0.5, epsilon = 1):
method remember (line 305) | def remember(self, state, action, reward, new_state, done):
method learn (line 308) | def learn(self):
method update_network_parameters (line 387) | def update_network_parameters(self, tau=None):
method save_models (line 422) | def save_models(self):
method load_models (line 430) | def load_models(self, load_file_actor = '',load_file_critic_1 ='',load...
Condensed preview — 47 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (192K chars).
[
{
"path": "README.md",
"chars": 9829,
"preview": "# Deep Reinforcement Learning for Secrecy Energy-Efficient UAV Communication with Reconfigurable Intelligent Surfaces\n\n*"
},
{
"path": "batch_eval.sh",
"chars": 1932,
"preview": "#!/bin/bash\n\necho \n\necho ddpg_ssr\npython3 load_and_plot.py --path data/storage/trained_uav/ddpg_ssr --ep-num 300\necho dd"
},
{
"path": "batch_train.sh",
"chars": 1775,
"preview": "#!/bin/bash\n\necho \n\necho ddpg_ssr\npython3 main_train.py --drl ddpg --reward ssr --ep-num 300 --trained-uav\necho ddpg_ssr"
},
{
"path": "channel.py",
"chars": 6795,
"preview": "import numpy as np\nimport math\nimport cmath\nfrom math_tool import *\n\nclass mmWave_channel(object):\n \"\"\"\n generate "
},
{
"path": "data/.vscode/settings.json",
"chars": 70,
"preview": "{\n \"python.pythonPath\": \"C:\\\\Users\\\\67039\\\\anaconda3\\\\python.exe\"\n}"
},
{
"path": "data/data_test.py",
"chars": 129,
"preview": "import pandas as pd \nuser_sheet = pd.read_excel('init_location.xlsx', sheet_name='user')\nuser_data = user_sheet\n\nprint(u"
},
{
"path": "data/readme.md",
"chars": 227,
"preview": "#### init_location.xlsx\n- the config files for the initilization postion of each entity (such as UAV, users, attackers, "
},
{
"path": "data/storage/readme.md",
"chars": 256,
"preview": "# This directory stores all experiments\n\n- Each experiment will be saved as a file project in this directory\n- benchmark"
},
{
"path": "data_manager.py",
"chars": 3051,
"preview": "import numpy as np\nimport scipy.io\nimport pandas as pd\nimport os\nimport time, csv\n\nclass DataManager(object):\n \"\"\"\n "
},
{
"path": "ddpg.py",
"chars": 17900,
"preview": "import os\nimport torch as T\n#import torch.cuda as T\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.o"
},
{
"path": "entity.py",
"chars": 6191,
"preview": "import numpy as np\nimport math\n#from math_tool import *\n\nclass UAV(object):\n \"\"\"\n UAV object with coordinate \n "
},
{
"path": "env.py",
"chars": 19041,
"preview": "#%matplotlib inline\nimport numpy as np\nfrom entity import *\nfrom channel import *\nfrom math_tool import *\nfrom datetime "
},
{
"path": "env1.py",
"chars": 20072,
"preview": "import numpy as np\nimport math\nimport cmath\nnp.random.seed(42)\n\nclass minimal_IRS_system():\n def __init__(self, BS_M "
},
{
"path": "load_and_plot.py",
"chars": 17003,
"preview": "import matplotlib.pyplot as plt\nimport numpy as np\nimport cmath\nfrom scipy.io import loadmat, savemat\nimport pandas as p"
},
{
"path": "main_RIS.py",
"chars": 99,
"preview": "from env import *\r\nepisode_num= 100\r\nstep_num = 100\r\na= int(5/5 * episode_num * step_num)\r\nprint(a)"
},
{
"path": "main_ref.py",
"chars": 2759,
"preview": "from ddpg import Agent\nfrom env1 import minimal_IRS_system # orginal\nfrom env import MiniSystem\nimport numpy as np\n#from"
},
{
"path": "main_train.py",
"chars": 11108,
"preview": "# debug field\nimport os\nos.environ[\"KMP_DUPLICATE_LIB_OK\"]=\"TRUE\"\n\nimport argparse\n\n# get argument from user\nparser = ar"
},
{
"path": "math_tool.py",
"chars": 3778,
"preview": "import math\nimport cmath\nimport numpy as np\nimport pandas as pd\nfrom numpy import *\nfrom matplotlib import pyplot as plt"
},
{
"path": "plot_see.py",
"chars": 8485,
"preview": "import matplotlib as mpl\nmpl.rcParams['figure.dpi'] = 300\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport cmat"
},
{
"path": "plot_ssr.py",
"chars": 7338,
"preview": "import matplotlib as mpl\nmpl.rcParams['figure.dpi'] = 300\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport cmat"
},
{
"path": "plot_traj.py",
"chars": 11048,
"preview": "import matplotlib as mpl\nmpl.rcParams['figure.dpi'] = 100\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport cmat"
},
{
"path": "render.py",
"chars": 5905,
"preview": "from mpl_toolkits import mplot3d\nimport matplotlib.pyplot as plt\nfrom mpl_toolkits.mplot3d import Axes3D\nfrom matplotlib"
},
{
"path": "requirements.txt",
"chars": 317,
"preview": "# Twin-TD3 requirements\n# Usage: pip install -r requirements.txt\n\nmatplotlib==3.4.3\npyparsing==3.0.9\n# sklearn==0.0.post"
},
{
"path": "run_simulation.py",
"chars": 10669,
"preview": "# debug field\nimport os\nos.environ[\"KMP_DUPLICATE_LIB_OK\"]=\"TRUE\"\n\nimport argparse\n\n# get argument from user\nparser = ar"
},
{
"path": "td3.py",
"chars": 18835,
"preview": "import os\nimport torch as T\n#import torch.cuda as T\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.o"
}
]
// ... and 22 more files (download for full content)
About this extraction
This page contains the full source code of the yjwong1999/Twin-TD3 GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 47 files (180.3 KB), approximately 49.4k tokens, and a symbol index with 181 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.
Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.