Repository: Graph-Machine-Learning-Group/grin
Branch: main
Commit: 4a28afbb0926
Files: 102
Total size: 201.4 KB
Directory structure:
gitextract__pvx4bn4/
├── .gitignore
├── README.md
├── conda_env.yml
├── config/
│ ├── bimpgru/
│ │ ├── air.yaml
│ │ ├── air36.yaml
│ │ ├── bay_block.yaml
│ │ ├── bay_point.yaml
│ │ ├── irish_block.yaml
│ │ ├── irish_point.yaml
│ │ ├── la_block.yaml
│ │ └── la_point.yaml
│ ├── brits/
│ │ ├── air.yaml
│ │ ├── air36.yaml
│ │ ├── bay_block.yaml
│ │ ├── bay_point.yaml
│ │ ├── irish_block.yaml
│ │ ├── irish_point.yaml
│ │ ├── la_block.yaml
│ │ ├── la_point.yaml
│ │ └── synthetic.yaml
│ ├── grin/
│ │ ├── air.yaml
│ │ ├── air36.yaml
│ │ ├── bay_block.yaml
│ │ ├── bay_point.yaml
│ │ ├── irish_block.yaml
│ │ ├── irish_point.yaml
│ │ ├── la_block.yaml
│ │ ├── la_point.yaml
│ │ └── synthetic.yaml
│ ├── mpgru/
│ │ ├── air.yaml
│ │ ├── air36.yaml
│ │ ├── bay_block.yaml
│ │ ├── bay_point.yaml
│ │ ├── irish_block.yaml
│ │ ├── irish_point.yaml
│ │ ├── la_block.yaml
│ │ └── la_point.yaml
│ ├── rgain/
│ │ ├── air.yaml
│ │ ├── air36.yaml
│ │ ├── bay_block.yaml
│ │ ├── bay_point.yaml
│ │ ├── irish_block.yaml
│ │ ├── irish_point.yaml
│ │ ├── la_block.yaml
│ │ └── la_point.yaml
│ └── var/
│ ├── air.yaml
│ ├── air36.yaml
│ ├── bay_block.yaml
│ ├── bay_point.yaml
│ ├── irish_block.yaml
│ ├── irish_point.yaml
│ ├── la_block.yaml
│ └── la_point.yaml
├── lib/
│ ├── __init__.py
│ ├── data/
│ │ ├── __init__.py
│ │ ├── datamodule/
│ │ │ ├── __init__.py
│ │ │ └── spatiotemporal.py
│ │ ├── imputation_dataset.py
│ │ ├── preprocessing/
│ │ │ ├── __init__.py
│ │ │ └── scalers.py
│ │ ├── spatiotemporal_dataset.py
│ │ └── temporal_dataset.py
│ ├── datasets/
│ │ ├── __init__.py
│ │ ├── air_quality.py
│ │ ├── metr_la.py
│ │ ├── pd_dataset.py
│ │ ├── pems_bay.py
│ │ └── synthetic.py
│ ├── fillers/
│ │ ├── __init__.py
│ │ ├── britsfiller.py
│ │ ├── filler.py
│ │ ├── graphfiller.py
│ │ ├── multi_imputation_filler.py
│ │ └── rgainfiller.py
│ ├── nn/
│ │ ├── __init__.py
│ │ ├── layers/
│ │ │ ├── __init__.py
│ │ │ ├── gcrnn.py
│ │ │ ├── gril.py
│ │ │ ├── imputation.py
│ │ │ ├── mpgru.py
│ │ │ ├── rits.py
│ │ │ ├── spatial_attention.py
│ │ │ └── spatial_conv.py
│ │ ├── models/
│ │ │ ├── __init__.py
│ │ │ ├── brits.py
│ │ │ ├── grin.py
│ │ │ ├── mpgru.py
│ │ │ ├── rgain.py
│ │ │ ├── rnn_imputers.py
│ │ │ └── var.py
│ │ └── utils/
│ │ ├── __init__.py
│ │ ├── metric_base.py
│ │ ├── metrics.py
│ │ └── ops.py
│ └── utils/
│ ├── __init__.py
│ ├── numpy_metrics.py
│ ├── parser_utils.py
│ └── utils.py
├── requirements.txt
└── scripts/
├── run_baselines.py
├── run_imputation.py
└── run_synthetic.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
*.DS_STORE
================================================
FILE: README.md
================================================
# Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks (ICLR 2022 - [open review](https://openreview.net/forum?id=kOu3-S3wJ7) - [pdf](https://openreview.net/pdf?id=kOu3-S3wJ7))
[](https://openreview.net/forum?id=kOu3-S3wJ7)
[](https://openreview.net/pdf?id=kOu3-S3wJ7)
[](https://arxiv.org/abs/2108.00298)
This repository contains the code for the reproducibility of the experiments presented in the paper "Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks" (ICLR 2022). In this paper, we propose a graph neural network architecture for multivariate time series imputation and achieve state-of-the-art results on several benchmarks.
**Authors**: [Andrea Cini](mailto:andrea.cini@usi.ch), [Ivan Marisca](mailto:ivan.marisca@usi.ch), Cesare Alippi
**‼️ PyG implementation of GRIN is now available inside [Torch Spatiotemporal](https://github.com/TorchSpatiotemporal/tsl), a library built to accelerate research on neural spatiotemporal data processing methods, with a focus on Graph Neural Networks.**
---
GRIN in a nutshell
The [paper](https://arxiv.org/abs/2108.00298) introduces __GRIN__, a method and an architecture to exploit relational inductive biases to reconstruct missing values in multivariate time series coming from sensor networks. GRIN features a bidirectional recurrent GNN which learns __spatio-temporal node-level representations__ tailored to reconstruct observations at neighboring nodes.
---
## Directory structure
The directory is structured as follows:
```
.
├── config
│ ├── bimpgru
│ ├── brits
│ ├── grin
│ ├── mpgru
│ ├── rgain
│ └── var
├── datasets
│ ├── air_quality
│ ├── metr_la
│ ├── pems_bay
│ └── synthetic
├── lib
│ ├── __init__.py
│ ├── data
│ ├── datasets
│ ├── fillers
│ ├── nn
│ └── utils
├── requirements.txt
└── scripts
├── run_baselines.py
├── run_imputation.py
└── run_synthetic.py
```
Note that, given the size of the files, the datasets are not readily available in the folder. See the next section for the downloading instructions.
## Datasets
All the datasets used in the experiment, except CER-E, are open and can be downloaded from this [link](https://mega.nz/folder/qwwG3Qba#c6qFTeT7apmZKKyEunCzSg). The CER-E dataset can be obtained free of charge for research purposes following the instructions at this [link](https://www.ucd.ie/issda/data/commissionforenergyregulationcer/). We recommend storing the downloaded datasets in a folder named `datasets` inside this directory.
## Configuration files
The `config` directory stores all the configuration files used to run the experiment. They are divided into folders, according to the model.
## Library
The support code, including the models and the datasets readers, are packed in a python library named `lib`. Should you have to change the paths to the datasets location, you have to edit the `__init__.py` file of the library.
## Scripts
The scripts used for the experiment in the paper are in the `scripts` folder.
* `run_baselines.py` is used to compute the metrics for the `MEAN`, `KNN`, `MF` and `MICE` imputation methods. An example of usage is
```
python ./scripts/run_baselines.py --datasets air36 air --imputers mean knn --k 10 --in-sample True --n-runs 5
```
* `run_imputation.py` is used to compute the metrics for the deep imputation methods. An example of usage is
```
python ./scripts/run_imputation.py --config config/grin/air36.yaml --in-sample False
```
* `run_synthetic.py` is used for the experiments on the synthetic datasets. An example of usage is
```
python ./scripts/run_synthetic.py --config config/grin/synthetic.yaml --static-adj False
```
## Requirements
We run all the experiments in `python 3.8`, see `requirements.txt` for the list of `pip` dependencies.
## Bibtex reference
If you find this code useful please consider to cite our paper:
```
@inproceedings{cini2022filling,
title={Filling the G\_ap\_s: Multivariate Time Series Imputation by Graph Neural Networks},
author={Andrea Cini and Ivan Marisca and Cesare Alippi},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=kOu3-S3wJ7}
}
```
================================================
FILE: conda_env.yml
================================================
name: grin
channels:
- defaults
- pytorch
- conda-forge
dependencies:
- pip
- pytables
- python=3.8
- pytorch=1.8
- torchvision
- torchaudio
- wheel
- pip:
- einops
- fancyimpute==0.6
- h5py
- openpyxl
- pandas
- pytorch-lightning==1.4
- pyyaml
- scikit-learn
- scipy
- tensorboard
================================================
FILE: config/bimpgru/air.yaml
================================================
dataset_name: 'air'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/bimpgru/air36.yaml
================================================
dataset_name: 'air36'
window: 36
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/bimpgru/bay_block.yaml
================================================
dataset_name: 'bay_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/bimpgru/bay_point.yaml
================================================
dataset_name: 'bay_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/bimpgru/irish_block.yaml
================================================
dataset_name: 'irish_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/bimpgru/irish_point.yaml
================================================
dataset_name: 'irish_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/bimpgru/la_block.yaml
================================================
dataset_name: 'la_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/bimpgru/la_point.yaml
================================================
dataset_name: 'la_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/brits/air.yaml
================================================
dataset_name: 'air'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'brits'
d_hidden: 128
================================================
FILE: config/brits/air36.yaml
================================================
dataset_name: 'air36'
window: 36
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'brits'
d_hidden: 64
================================================
FILE: config/brits/bay_block.yaml
================================================
dataset_name: 'bay_block'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'brits'
d_hidden: 256
================================================
FILE: config/brits/bay_point.yaml
================================================
dataset_name: 'bay_point'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'brits'
d_hidden: 256
================================================
FILE: config/brits/irish_block.yaml
================================================
dataset_name: 'irish_block'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'brits'
d_hidden: 256
================================================
FILE: config/brits/irish_point.yaml
================================================
dataset_name: 'irish_point'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'brits'
d_hidden: 256
================================================
FILE: config/brits/la_block.yaml
================================================
dataset_name: 'la_block'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'brits'
d_hidden: 128
================================================
FILE: config/brits/la_point.yaml
================================================
dataset_name: 'la_point'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'brits'
d_hidden: 128
================================================
FILE: config/brits/synthetic.yaml
================================================
window: 36
p_block: 0.025
p_point: 0.025
min_seq: 4
max_seq: 9
use_exogenous: False
epochs: 200
batch_size: 32
model_name: 'brits'
d_hidden: 32
================================================
FILE: config/grin/air.yaml
================================================
dataset_name: 'air'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'grin'
pred_loss_weight: 1
d_hidden: 64
d_emb: 8
d_ff: 64
ff_dropout: 0
kernel_size: 2
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/grin/air36.yaml
================================================
dataset_name: 'air36'
window: 36
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'grin'
pred_loss_weight: 1
d_hidden: 64
d_emb: 8
d_ff: 64
ff_dropout: 0
kernel_size: 2
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/grin/bay_block.yaml
================================================
dataset_name: 'bay_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'grin'
pred_loss_weight: 1
d_hidden: 64
d_emb: 8
d_ff: 64
ff_dropout: 0
kernel_size: 2
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/grin/bay_point.yaml
================================================
dataset_name: 'bay_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'grin'
pred_loss_weight: 1
d_hidden: 64
d_emb: 8
d_ff: 64
ff_dropout: 0
kernel_size: 2
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/grin/irish_block.yaml
================================================
dataset_name: 'irish_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'grin'
pred_loss_weight: 1
d_hidden: 64
d_emb: 8
d_ff: 64
ff_dropout: 0
kernel_size: 2
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/grin/irish_point.yaml
================================================
dataset_name: 'irish_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'grin'
pred_loss_weight: 1
d_hidden: 64
d_emb: 8
d_ff: 64
ff_dropout: 0
kernel_size: 2
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/grin/la_block.yaml
================================================
dataset_name: 'la_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'grin'
pred_loss_weight: 1
d_hidden: 64
d_emb: 8
d_ff: 64
ff_dropout: 0
kernel_size: 2
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/grin/la_point.yaml
================================================
dataset_name: 'la_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'grin'
pred_loss_weight: 1
d_hidden: 64
d_emb: 8
d_ff: 64
ff_dropout: 0
kernel_size: 2
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/grin/synthetic.yaml
================================================
window: 36
p_block: 0.025
p_point: 0.025
min_seq: 4
max_seq: 9
use_exogenous: False
epochs: 200
batch_size: 32
model_name: 'grin'
d_hidden: 16
d_emb: 0
d_ff: 16
ff_dropout: 0
kernel_size: 1
decoder_order: 1
n_layers: 1
layer_norm: false
merge: 'mlp'
================================================
FILE: config/mpgru/air.yaml
================================================
dataset_name: 'air'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'mpgru'
pred_loss_weight: 1
d_hidden: 64
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
================================================
FILE: config/mpgru/air36.yaml
================================================
dataset_name: 'air36'
window: 36
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'mpgru'
pred_loss_weight: 1
d_hidden: 64
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
================================================
FILE: config/mpgru/bay_block.yaml
================================================
dataset_name: 'bay_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'mpgru'
pred_loss_weight: 1
d_hidden: 64
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
================================================
FILE: config/mpgru/bay_point.yaml
================================================
dataset_name: 'bay_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'mpgru'
pred_loss_weight: 1
d_hidden: 64
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
================================================
FILE: config/mpgru/irish_block.yaml
================================================
dataset_name: 'irish_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'mpgru'
pred_loss_weight: 1
d_hidden: 64
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
================================================
FILE: config/mpgru/irish_point.yaml
================================================
dataset_name: 'irish_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'mpgru'
pred_loss_weight: 1
d_hidden: 64
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
================================================
FILE: config/mpgru/la_block.yaml
================================================
dataset_name: 'la_block'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'mpgru'
pred_loss_weight: 1
d_hidden: 64
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
================================================
FILE: config/mpgru/la_point.yaml
================================================
dataset_name: 'la_point'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global' # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'mpgru'
pred_loss_weight: 1
d_hidden: 64
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
================================================
FILE: config/rgain/air.yaml
================================================
dataset_name: 'air'
window: 24
whiten_prob: 0.2
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
loss_fn: mse_loss
consistency_loss: False
use_lr_schedule: True
grad_clip_val: -1
aggregate_by: ['mean']
model_name: 'gain'
d_model: 128
d_z: 4
dropout: 0.2
inject_noise: true
alpha: 20
g_train_freq: 3
d_train_freq: 1
================================================
FILE: config/rgain/air36.yaml
================================================
dataset_name: 'air36'
window: 36
whiten_prob: 0.2
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
loss_fn: mse_loss
consistency_loss: False
use_lr_schedule: True
grad_clip_val: -1
aggregate_by: ['mean']
model_name: 'gain'
d_model: 64
d_z: 4
dropout: 0.1
inject_noise: true
alpha: 20
g_train_freq: 3
d_train_freq: 1
================================================
FILE: config/rgain/bay_block.yaml
================================================
dataset_name: 'bay_block'
window: 24
whiten_prob: 0.2
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
loss_fn: mse_loss
consistency_loss: False
use_lr_schedule: True
grad_clip_val: -1
aggregate_by: ['mean']
model_name: 'gain'
d_model: 256
d_z: 4
dropout: 0.2
inject_noise: true
alpha: 20
g_train_freq: 3
d_train_freq: 1
================================================
FILE: config/rgain/bay_point.yaml
================================================
dataset_name: 'bay_point'
window: 24
whiten_prob: 0.2
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
loss_fn: mse_loss
consistency_loss: False
use_lr_schedule: True
grad_clip_val: -1
aggregate_by: ['mean']
model_name: 'gain'
d_model: 256
d_z: 4
dropout: 0.2
inject_noise: true
alpha: 20
g_train_freq: 3
d_train_freq: 1
================================================
FILE: config/rgain/irish_block.yaml
================================================
dataset_name: 'irish_block'
window: 24
whiten_prob: 0.2
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
loss_fn: mse_loss
consistency_loss: False
use_lr_schedule: True
grad_clip_val: -1
aggregate_by: ['mean']
model_name: 'gain'
d_model: 256
d_z: 4
dropout: 0.2
inject_noise: true
alpha: 20
g_train_freq: 3
d_train_freq: 1
================================================
FILE: config/rgain/irish_point.yaml
================================================
dataset_name: 'irish_point'
window: 24
whiten_prob: 0.2
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
loss_fn: mse_loss
consistency_loss: False
use_lr_schedule: True
grad_clip_val: -1
aggregate_by: ['mean']
model_name: 'gain'
d_model: 256
d_z: 4
dropout: 0.2
inject_noise: true
alpha: 20
g_train_freq: 3
d_train_freq: 1
================================================
FILE: config/rgain/la_block.yaml
================================================
dataset_name: 'la_block'
window: 24
whiten_prob: 0.2
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
loss_fn: mse_loss
consistency_loss: False
use_lr_schedule: True
grad_clip_val: -1
aggregate_by: ['mean']
model_name: 'gain'
d_model: 128
d_z: 4
dropout: 0.2
inject_noise: true
alpha: 20
g_train_freq: 3
d_train_freq: 1
================================================
FILE: config/rgain/la_point.yaml
================================================
dataset_name: 'la_point'
window: 24
whiten_prob: 0.2
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
batch_size: 32
loss_fn: mse_loss
consistency_loss: False
use_lr_schedule: True
grad_clip_val: -1
aggregate_by: ['mean']
model_name: 'gain'
d_model: 128
d_z: 4
dropout: 0.2
inject_noise: true
alpha: 20
g_train_freq: 3
d_train_freq: 1
================================================
FILE: config/var/air.yaml
================================================
dataset_name: 'air'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'
================================================
FILE: config/var/air36.yaml
================================================
dataset_name: 'air36'
window: 36
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'
================================================
FILE: config/var/bay_block.yaml
================================================
dataset_name: 'bay_block'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'
================================================
FILE: config/var/bay_point.yaml
================================================
dataset_name: 'bay_point'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'
================================================
FILE: config/var/irish_block.yaml
================================================
dataset_name: 'irish_block'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'
================================================
FILE: config/var/irish_point.yaml
================================================
dataset_name: 'irish_point'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'
================================================
FILE: config/var/la_block.yaml
================================================
dataset_name: 'la_block'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'
================================================
FILE: config/var/la_point.yaml
================================================
dataset_name: 'la_point'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120 # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'
================================================
FILE: lib/__init__.py
================================================
import os
base_dir = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
config = {
'logs': 'logs/'
}
datasets_path = {
'air': 'datasets/air_quality',
'la': 'datasets/metr_la',
'bay': 'datasets/pems_bay',
'synthetic': 'datasets/synthetic'
}
epsilon = 1e-8
for k, v in config.items():
config[k] = os.path.join(base_dir, v)
for k, v in datasets_path.items():
datasets_path[k] = os.path.join(base_dir, v)
================================================
FILE: lib/data/__init__.py
================================================
from .temporal_dataset import TemporalDataset
from .spatiotemporal_dataset import SpatioTemporalDataset
================================================
FILE: lib/data/datamodule/__init__.py
================================================
from .spatiotemporal import SpatioTemporalDataModule
================================================
FILE: lib/data/datamodule/spatiotemporal.py
================================================
import pytorch_lightning as pl
from torch.utils.data import DataLoader, Subset, RandomSampler
from .. import TemporalDataset, SpatioTemporalDataset
from ..preprocessing import StandardScaler, MinMaxScaler
from ...utils import ensure_list
from ...utils.parser_utils import str_to_bool
class SpatioTemporalDataModule(pl.LightningDataModule):
"""
Pytorch Lightning DataModule for TimeSeriesDatasets
"""
def __init__(self, dataset: TemporalDataset,
scale=True,
scaling_axis='samples',
scaling_type='std',
scale_exogenous=None,
train_idxs=None,
val_idxs=None,
test_idxs=None,
batch_size=32,
workers=1,
samples_per_epoch=None):
super(SpatioTemporalDataModule, self).__init__()
self.torch_dataset = dataset
# splitting
self.trainset = Subset(self.torch_dataset, train_idxs if train_idxs is not None else [])
self.valset = Subset(self.torch_dataset, val_idxs if val_idxs is not None else [])
self.testset = Subset(self.torch_dataset, test_idxs if test_idxs is not None else [])
# preprocessing
self.scale = scale
self.scaling_type = scaling_type
self.scaling_axis = scaling_axis
self.scale_exogenous = ensure_list(scale_exogenous) if scale_exogenous is not None else None
# data loaders
self.batch_size = batch_size
self.workers = workers
self.samples_per_epoch = samples_per_epoch
@property
def is_spatial(self):
return isinstance(self.torch_dataset, SpatioTemporalDataset)
@property
def n_nodes(self):
if not self.has_setup_fit:
raise ValueError('You should initialize the datamodule first.')
return self.torch_dataset.n_nodes if self.is_spatial else None
@property
def d_in(self):
if not self.has_setup_fit:
raise ValueError('You should initialize the datamodule first.')
return self.torch_dataset.n_channels
@property
def d_out(self):
if not self.has_setup_fit:
raise ValueError('You should initialize the datamodule first.')
return self.torch_dataset.horizon
@property
def train_slice(self):
return self.torch_dataset.expand_indices(self.trainset.indices, merge=True)
@property
def val_slice(self):
return self.torch_dataset.expand_indices(self.valset.indices, merge=True)
@property
def test_slice(self):
return self.torch_dataset.expand_indices(self.testset.indices, merge=True)
def get_scaling_axes(self, dim='global'):
scaling_axis = tuple()
if dim == 'global':
scaling_axis = (0, 1, 2)
elif dim == 'channels':
scaling_axis = (0, 1)
elif dim == 'nodes':
scaling_axis = (0,)
# Remove last dimension for temporal datasets
if not self.is_spatial:
scaling_axis = scaling_axis[:-1]
if not len(scaling_axis):
raise ValueError(f'Scaling axis "{dim}" not valid.')
return scaling_axis
def get_scaler(self):
if self.scaling_type == 'std':
return StandardScaler
elif self.scaling_type == 'minmax':
return MinMaxScaler
else:
return NotImplementedError
def setup(self, stage=None):
if self.scale:
scaling_axis = self.get_scaling_axes(self.scaling_axis)
train = self.torch_dataset.data.numpy()[self.train_slice]
train_mask = self.torch_dataset.mask.numpy()[self.train_slice] if 'mask' in self.torch_dataset else None
scaler = self.get_scaler()(scaling_axis).fit(train, mask=train_mask, keepdims=True).to_torch()
self.torch_dataset.scaler = scaler
if self.scale_exogenous is not None:
for label in self.scale_exogenous:
exo = getattr(self.torch_dataset, label)
scaler = self.get_scaler()(scaling_axis)
scaler.fit(exo[self.train_slice], keepdims=True).to_torch()
setattr(self.torch_dataset, label, scaler.transform(exo))
def _data_loader(self, dataset, shuffle=False, batch_size=None, **kwargs):
batch_size = self.batch_size if batch_size is None else batch_size
return DataLoader(dataset,
shuffle=shuffle,
batch_size=batch_size,
num_workers=self.workers,
**kwargs)
def train_dataloader(self, shuffle=True, batch_size=None):
if self.samples_per_epoch is not None:
sampler = RandomSampler(self.trainset, replacement=True, num_samples=self.samples_per_epoch)
return self._data_loader(self.trainset, False, batch_size, sampler=sampler, drop_last=True)
return self._data_loader(self.trainset, shuffle, batch_size, drop_last=True)
def val_dataloader(self, shuffle=False, batch_size=None):
return self._data_loader(self.valset, shuffle, batch_size)
def test_dataloader(self, shuffle=False, batch_size=None):
return self._data_loader(self.testset, shuffle, batch_size)
@staticmethod
def add_argparse_args(parser, **kwargs):
parser.add_argument('--batch-size', type=int, default=64)
parser.add_argument('--scaling-axis', type=str, default="channels")
parser.add_argument('--scaling-type', type=str, default="std")
parser.add_argument('--scale', type=str_to_bool, nargs='?', const=True, default=True)
parser.add_argument('--workers', type=int, default=0)
parser.add_argument('--samples-per-epoch', type=int, default=None)
return parser
================================================
FILE: lib/data/imputation_dataset.py
================================================
import numpy as np
import torch
from . import TemporalDataset, SpatioTemporalDataset
class ImputationDataset(TemporalDataset):
def __init__(self, data,
index=None,
mask=None,
eval_mask=None,
freq=None,
trend=None,
scaler=None,
window=24,
stride=1,
exogenous=None):
if mask is None:
mask = np.ones_like(data)
if exogenous is None:
exogenous = dict()
exogenous['mask_window'] = mask
if eval_mask is not None:
exogenous['eval_mask_window'] = eval_mask
super(ImputationDataset, self).__init__(data,
index=index,
exogenous=exogenous,
trend=trend,
scaler=scaler,
freq=freq,
window=window,
horizon=window,
delay=-window,
stride=stride)
def get(self, item, preprocess=False):
res, transform = super(ImputationDataset, self).get(item, preprocess)
res['x'] = torch.where(res['mask'], res['x'], torch.zeros_like(res['x']))
return res, transform
class GraphImputationDataset(ImputationDataset, SpatioTemporalDataset):
pass
================================================
FILE: lib/data/preprocessing/__init__.py
================================================
from .scalers import *
================================================
FILE: lib/data/preprocessing/scalers.py
================================================
from abc import ABC, abstractmethod
import numpy as np
class AbstractScaler(ABC):
def __init__(self, **kwargs):
for k, v in kwargs.items():
setattr(self, k, v)
def __repr__(self):
params = ", ".join([f"{k}={str(v)}" for k, v in self.params().items()])
return "{}({})".format(self.__class__.__name__, params)
def __call__(self, *args, **kwargs):
return self.transform(*args, **kwargs)
def params(self):
return {k: v for k, v in self.__dict__.items() if not callable(v) and not k.startswith("__")}
@abstractmethod
def fit(self, x):
pass
@abstractmethod
def transform(self, x):
pass
@abstractmethod
def inverse_transform(self, x):
pass
def fit_transform(self, x):
self.fit(x)
return self.transform(x)
def to_torch(self):
import torch
for p in self.params():
param = getattr(self, p)
param = np.atleast_1d(param)
param = torch.tensor(param).float()
setattr(self, p, param)
return self
class Scaler(AbstractScaler):
def __init__(self, offset=0., scale=1.):
self.bias = offset
self.scale = scale
super(Scaler, self).__init__()
def params(self):
return dict(bias=self.bias, scale=self.scale)
def fit(self, x, mask=None, keepdims=True):
pass
def transform(self, x):
return (x - self.bias) / self.scale
def inverse_transform(self, x):
return x * self.scale + self.bias
def fit_transform(self, x, mask=None, keepdims=True):
self.fit(x, mask, keepdims)
return self.transform(x)
class StandardScaler(Scaler):
def __init__(self, axis=0):
self.axis = axis
super(StandardScaler, self).__init__()
def fit(self, x, mask=None, keepdims=True):
if mask is not None:
x = np.where(mask, x, np.nan)
self.bias = np.nanmean(x, axis=self.axis, keepdims=keepdims)
self.scale = np.nanstd(x, axis=self.axis, keepdims=keepdims)
else:
self.bias = x.mean(axis=self.axis, keepdims=keepdims)
self.scale = x.std(axis=self.axis, keepdims=keepdims)
return self
class MinMaxScaler(Scaler):
def __init__(self, axis=0):
self.axis = axis
super(MinMaxScaler, self).__init__()
def fit(self, x, mask=None, keepdims=True):
if mask is not None:
x = np.where(mask, x, np.nan)
self.bias = np.nanmin(x, axis=self.axis, keepdims=keepdims)
self.scale = (np.nanmax(x, axis=self.axis, keepdims=keepdims) - self.bias)
else:
self.bias = x.min(axis=self.axis, keepdims=keepdims)
self.scale = (x.max(axis=self.axis, keepdims=keepdims) - self.bias)
return self
================================================
FILE: lib/data/spatiotemporal_dataset.py
================================================
import numpy as np
import pandas as pd
from einops import rearrange
from .temporal_dataset import TemporalDataset
class SpatioTemporalDataset(TemporalDataset):
def __init__(self, data,
index=None,
trend=None,
scaler=None,
freq=None,
window=24,
horizon=24,
delay=0,
stride=1,
**exogenous):
"""
Pytorch dataset for data that can be represented as a single TimeSeries
:param data:
raw target time series (ts) (can be multivariate), shape: [steps, (features), nodes]
:param exog:
global exogenous variables, shape: [steps, nodes]
:param trend:
trend time series to be removed from the ts, shape: [steps, (features), (nodes)]
:param bias:
bias to be removed from the ts (after de-trending), shape [steps, (features), (nodes)]
:param scale: r
scaling factor to scale the ts (after de-trending), shape [steps, (features), (nodes)]
:param mask:
mask for valid data, 1 -> valid time step, 0 -> invalid. same shape of ts.
:param target_exog:
exogenous variables of the target, shape: [steps, nodes]
:param window:
length of windows returned by __get_intem__
:param horizon:
length of prediction horizon returned by __get_intem__
:param delay:
delay between input and prediction
"""
super(SpatioTemporalDataset, self).__init__(data,
index=index,
trend=trend,
scaler=scaler,
freq=freq,
window=window,
horizon=horizon,
delay=delay,
stride=stride,
**exogenous)
def __repr__(self):
return "{}(n_samples={}, n_nodes={})".format(self.__class__.__name__, len(self), self.n_nodes)
@property
def n_nodes(self):
return self.data.shape[1]
@staticmethod
def check_dim(data):
if data.ndim == 2: # [steps, nodes] -> [steps, nodes, features]
data = rearrange(data, 's (n f) -> s n f', f=1)
elif data.ndim == 1:
data = rearrange(data, '(s n f) -> s n f', n=1, f=1)
elif data.ndim == 3:
pass
else:
raise ValueError(f'Invalid data dimensions {data.shape}')
return data
def dataframe(self):
if self.n_channels == 1:
return pd.DataFrame(data=np.squeeze(self.data, -1), index=self.index)
raise NotImplementedError()
================================================
FILE: lib/data/temporal_dataset.py
================================================
import numpy as np
import pandas as pd
import torch
from einops import rearrange
from pandas import DatetimeIndex
from torch.utils.data import Dataset
from .preprocessing import AbstractScaler
class TemporalDataset(Dataset):
def __init__(self, data,
index=None,
freq=None,
exogenous=None,
trend=None,
scaler=None,
window=24,
horizon=24,
delay=0,
stride=1):
"""Wrapper class for dataset whose entry are dependent from a sequence of temporal indices.
Parameters
----------
data : np.ndarray
Data relative to the main signal.
index : DatetimeIndex or None
Temporal indices for the data.
exogenous : dict or None
Exogenous data and label paired with main signal (default is None).
trend : np.ndarray or None
Trend paired with main signal (default is None). Must be of the same length of 'data'.
scaler : AbstractScaler or None
Scaler that must be used for data (default is None).
freq : pd.DateTimeIndex.freq or str
Frequency of the indices (defaults is indices.freq).
window : int
Size of the sliding window in the past.
horizon : int
Size of the prediction horizon.
delay : int
Offset between end of window and start of horizon.
Raises
----------
ValueError
If a frequency for the temporal indices is not provided neither in indices nor explicitly.
If preprocess is True and data_scaler is None.
"""
super(TemporalDataset, self).__init__()
# Initialize signatures
self.__exogenous_keys = dict()
self.__reserved_signature = {'data', 'trend', 'x', 'y'}
# Store data
self.data = data
if exogenous is not None:
for name, value in exogenous.items():
self.add_exogenous(value, name, for_window=True, for_horizon=True)
# Store time information
self.index = index
try:
freq = freq or index.freq or index.inferred_freq
self.freq = pd.tseries.frequencies.to_offset(freq)
except AttributeError:
self.freq = None
# Store offset information
self.window = window
self.delay = delay
self.horizon = horizon
self.stride = stride
# Identify the indices of the samples
self._indices = np.arange(self.data.shape[0] - self.sample_span + 1)[::self.stride]
# Store preprocessing options
self.trend = trend
self.scaler = scaler
def __getitem__(self, item):
return self.get(item, self.preprocess)
def __contains__(self, item):
return item in self.__exogenous_keys
def __len__(self):
return len(self._indices)
def __repr__(self):
return "{}(n_samples={})".format(self.__class__.__name__, len(self))
# Getter and setter for data
@property
def data(self):
return self.__data
@data.setter
def data(self, value):
assert value is not None
self.__data = self.check_input(value)
@property
def trend(self):
return self.__trend
@trend.setter
def trend(self, value):
self.__trend = self.check_input(value)
# Setter for exogenous data
def add_exogenous(self, obj, name, for_window=True, for_horizon=False):
assert isinstance(name, str)
if name.endswith('_window'):
name = name[:-7]
for_window, for_horizon = True, False
if name.endswith('_horizon'):
name = name[:-8]
for_window, for_horizon = False, True
if name in self.__reserved_signature:
raise ValueError("Channel '{0}' cannot be added in this way. Use obj.{0} instead.".format(name))
if not (for_window or for_horizon):
raise ValueError("Either for_window or for_horizon must be True.")
obj = self.check_input(obj)
setattr(self, name, obj)
self.__exogenous_keys[name] = dict(for_window=for_window, for_horizon=for_horizon)
return self
# Dataset properties
@property
def horizon_offset(self):
return self.window + self.delay
@property
def sample_span(self):
return max(self.horizon_offset + self.horizon, self.window)
@property
def preprocess(self):
return (self.trend is not None) or (self.scaler is not None)
@property
def n_steps(self):
return self.data.shape[0]
@property
def n_channels(self):
return self.data.shape[-1]
@property
def indices(self):
return self._indices
# Signature information
@property
def exo_window_keys(self):
return {k for k, v in self.__exogenous_keys.items() if v['for_window']}
@property
def exo_horizon_keys(self):
return {k for k, v in self.__exogenous_keys.items() if v['for_horizon']}
@property
def exo_common_keys(self):
return self.exo_window_keys.intersection(self.exo_horizon_keys)
@property
def signature(self):
attrs = []
if self.window > 0:
attrs.append('x')
for attr in self.exo_window_keys:
attrs.append(attr if attr not in self.exo_common_keys else (attr + '_window'))
for attr in self.exo_horizon_keys:
attrs.append(attr if attr not in self.exo_common_keys else (attr + '_horizon'))
attrs.append('y')
attrs = tuple(attrs)
preprocess = []
if self.trend is not None:
preprocess.append('trend')
if self.scaler is not None:
preprocess.extend(self.scaler.params())
preprocess = tuple(preprocess)
return dict(data=attrs, preprocessing=preprocess)
# Item getters
def get(self, item, preprocess=False):
idx = self._indices[item]
res, transform = dict(), dict()
if self.window > 0:
res['x'] = self.data[idx:idx + self.window]
for attr in self.exo_window_keys:
key = attr if attr not in self.exo_common_keys else (attr + '_window')
res[key] = getattr(self, attr)[idx:idx + self.window]
for attr in self.exo_horizon_keys:
key = attr if attr not in self.exo_common_keys else (attr + '_horizon')
res[key] = getattr(self, attr)[idx + self.horizon_offset:idx + self.horizon_offset + self.horizon]
res['y'] = self.data[idx + self.horizon_offset:idx + self.horizon_offset + self.horizon]
if preprocess:
if self.trend is not None:
y_trend = self.trend[idx + self.horizon_offset:idx + self.horizon_offset + self.horizon]
res['y'] = res['y'] - y_trend
transform['trend'] = y_trend
if 'x' in res:
res['x'] = res['x'] - self.trend[idx:idx + self.window]
if self.scaler is not None:
transform.update(self.scaler.params())
if 'x' in res:
res['x'] = self.scaler.transform(res['x'])
return res, transform
def snapshot(self, indices=None, preprocess=True):
if not self.preprocess:
preprocess = False
data, prep = [{k: [] for k in sign} for sign in self.signature.values()]
indices = np.arange(len(self._indices)) if indices is None else indices
for idx in indices:
data_i, prep_i = self.get(idx, preprocess)
[v.append(data_i[k]) for k, v in data.items()]
if len(prep_i):
[v.append(prep_i[k]) for k, v in prep.items()]
data = {k: np.stack(ds) for k, ds in data.items() if len(ds)}
if len(prep):
prep = {k: np.stack(ds) if k == 'trend' else ds[0] for k, ds in prep.items() if len(ds)}
return data, prep
# Data utilities
def expand_indices(self, indices=None, unique=False, merge=False):
ds_indices = dict.fromkeys([time for time in ['window', 'horizon'] if getattr(self, time) > 0])
indices = np.arange(len(self._indices)) if indices is None else indices
if 'window' in ds_indices:
w_idxs = [np.arange(idx, idx + self.window) for idx in self._indices[indices]]
ds_indices['window'] = np.concatenate(w_idxs)
if 'horizon' in ds_indices:
h_idxs = [np.arange(idx + self.horizon_offset, idx + self.horizon_offset + self.horizon)
for idx in self._indices[indices]]
ds_indices['horizon'] = np.concatenate(h_idxs)
if unique:
ds_indices = {k: np.unique(v) for k, v in ds_indices.items()}
if merge:
ds_indices = np.unique(np.concatenate(list(ds_indices.values())))
return ds_indices
def overlapping_indices(self, idxs1, idxs2, synch_mode='window', as_mask=False):
assert synch_mode in ['window', 'horizon']
ts1 = self.data_timestamps(idxs1, flatten=False)[synch_mode]
ts2 = self.data_timestamps(idxs2, flatten=False)[synch_mode]
common_ts = np.intersect1d(np.unique(ts1), np.unique(ts2))
is_overlapping = lambda sample: np.any(np.in1d(sample, common_ts))
m1 = np.apply_along_axis(is_overlapping, 1, ts1)
m2 = np.apply_along_axis(is_overlapping, 1, ts2)
if as_mask:
return m1, m2
return np.sort(idxs1[m1]), np.sort(idxs2[m2])
def data_timestamps(self, indices=None, flatten=True):
ds_indices = self.expand_indices(indices, unique=False)
ds_timestamps = {k: self.index[v] for k, v in ds_indices.items()}
if not flatten:
ds_timestamps = {k: np.array(v).reshape(-1, getattr(self, k)) for k, v in ds_timestamps.items()}
return ds_timestamps
def reduce_dataset(self, indices, inplace=False):
if not inplace:
from copy import deepcopy
dataset = deepcopy(self)
else:
dataset = self
old_index = dataset.index[dataset._indices[indices]]
ds_indices = dataset.expand_indices(indices, merge=True)
dataset.index = dataset.index[ds_indices]
dataset.data = dataset.data[ds_indices]
if dataset.mask is not None:
dataset.mask = dataset.mask[ds_indices]
if dataset.trend is not None:
dataset.trend = dataset.trend[ds_indices]
for attr in dataset.exo_window_keys.union(dataset.exo_horizon_keys):
if getattr(dataset, attr, None) is not None:
setattr(dataset, attr, getattr(dataset, attr)[ds_indices])
dataset._indices = np.flatnonzero(np.in1d(dataset.index, old_index))
return dataset
def check_input(self, data):
if data is None:
return data
data = self.check_dim(data)
data = data.clone().detach() if isinstance(data, torch.Tensor) else torch.tensor(data)
# cast data
if torch.is_floating_point(data):
return data.float()
elif data.dtype in [torch.int, torch.int8, torch.int16, torch.int32, torch.int64]:
return data.int()
return data
# Class-specific methods (override in children)
@staticmethod
def check_dim(data):
if data.ndim == 1: # [steps] -> [steps, features]
data = rearrange(data, '(s f) -> s f', f=1)
elif data.ndim != 2:
raise ValueError(f'Invalid data dimensions {data.shape}')
return data
def dataframe(self):
return pd.DataFrame(data=self.data, index=self.index)
@staticmethod
def add_argparse_args(parser, **kwargs):
parser.add_argument('--window', type=int, default=24)
parser.add_argument('--horizon', type=int, default=24)
parser.add_argument('--delay', type=int, default=0)
parser.add_argument('--stride', type=int, default=1)
return parser
================================================
FILE: lib/datasets/__init__.py
================================================
from .air_quality import AirQuality
from .metr_la import MissingValuesMetrLA
from .pems_bay import MissingValuesPemsBay
from .synthetic import ChargedParticles
================================================
FILE: lib/datasets/air_quality.py
================================================
import os
import numpy as np
import pandas as pd
from lib import datasets_path
from .pd_dataset import PandasDataset
from ..utils.utils import disjoint_months, infer_mask, compute_mean, geographical_distance, thresholded_gaussian_kernel
class AirQuality(PandasDataset):
SEED = 3210
def __init__(self, impute_nans=False, small=False, freq='60T', masked_sensors=None):
self.random = np.random.default_rng(self.SEED)
self.test_months = [3, 6, 9, 12]
self.infer_eval_from = 'next'
self.eval_mask = None
df, dist, mask = self.load(impute_nans=impute_nans, small=small, masked_sensors=masked_sensors)
self.dist = dist
if masked_sensors is None:
self.masked_sensors = list()
else:
self.masked_sensors = list(masked_sensors)
super().__init__(dataframe=df, u=None, mask=mask, name='air', freq=freq, aggr='nearest')
def load_raw(self, small=False):
if small:
path = os.path.join(datasets_path['air'], 'small36.h5')
eval_mask = pd.DataFrame(pd.read_hdf(path, 'eval_mask'))
else:
path = os.path.join(datasets_path['air'], 'full437.h5')
eval_mask = None
df = pd.DataFrame(pd.read_hdf(path, 'pm25'))
stations = pd.DataFrame(pd.read_hdf(path, 'stations'))
return df, stations, eval_mask
def load(self, impute_nans=True, small=False, masked_sensors=None):
# load readings and stations metadata
df, stations, eval_mask = self.load_raw(small)
# compute the masks
mask = (~np.isnan(df.values)).astype('uint8') # 1 if value is not nan else 0
if eval_mask is None:
eval_mask = infer_mask(df, infer_from=self.infer_eval_from)
eval_mask = eval_mask.values.astype('uint8')
if masked_sensors is not None:
eval_mask[:, masked_sensors] = np.where(mask[:, masked_sensors], 1, 0)
self.eval_mask = eval_mask # 1 if value is ground-truth for imputation else 0
# eventually replace nans with weekly mean by hour
if impute_nans:
df = df.fillna(compute_mean(df))
# compute distances from latitude and longitude degrees
st_coord = stations.loc[:, ['latitude', 'longitude']]
dist = geographical_distance(st_coord, to_rad=True).values
return df, dist, mask
def splitter(self, dataset, val_len=1., in_sample=False, window=0):
nontest_idxs, test_idxs = disjoint_months(dataset, months=self.test_months, synch_mode='horizon')
if in_sample:
train_idxs = np.arange(len(dataset))
val_months = [(m - 1) % 12 for m in self.test_months]
_, val_idxs = disjoint_months(dataset, months=val_months, synch_mode='horizon')
else:
# take equal number of samples before each month of testing
val_len = (int(val_len * len(nontest_idxs)) if val_len < 1 else val_len) // len(self.test_months)
# get indices of first day of each testing month
delta_idxs = np.diff(test_idxs)
end_month_idxs = test_idxs[1:][np.flatnonzero(delta_idxs > delta_idxs.min())]
if len(end_month_idxs) < len(self.test_months):
end_month_idxs = np.insert(end_month_idxs, 0, test_idxs[0])
# expand month indices
month_val_idxs = [np.arange(v_idx - val_len, v_idx) - window for v_idx in end_month_idxs]
val_idxs = np.concatenate(month_val_idxs) % len(dataset)
# remove overlapping indices from training set
ovl_idxs, _ = dataset.overlapping_indices(nontest_idxs, val_idxs, synch_mode='horizon', as_mask=True)
train_idxs = nontest_idxs[~ovl_idxs]
return [train_idxs, val_idxs, test_idxs]
def get_similarity(self, thr=0.1, include_self=False, force_symmetric=False, sparse=False, **kwargs):
theta = np.std(self.dist[:36, :36]) # use same theta for both air and air36
adj = thresholded_gaussian_kernel(self.dist, theta=theta, threshold=thr)
if not include_self:
adj[np.diag_indices_from(adj)] = 0.
if force_symmetric:
adj = np.maximum.reduce([adj, adj.T])
if sparse:
import scipy.sparse as sps
adj = sps.coo_matrix(adj)
return adj
@property
def mask(self):
return self._mask
@property
def training_mask(self):
return self._mask if self.eval_mask is None else (self._mask & (1 - self.eval_mask))
def test_interval_mask(self, dtype=bool, squeeze=True):
m = np.in1d(self.df.index.month, self.test_months).astype(dtype)
if squeeze:
return m
return m[:, None]
================================================
FILE: lib/datasets/metr_la.py
================================================
import os
import numpy as np
import pandas as pd
from lib import datasets_path
from .pd_dataset import PandasDataset
from ..utils import sample_mask
class MetrLA(PandasDataset):
def __init__(self, impute_zeros=False, freq='5T'):
df, dist, mask = self.load(impute_zeros=impute_zeros)
self.dist = dist
super().__init__(dataframe=df, u=None, mask=mask, name='la', freq=freq, aggr='nearest')
def load(self, impute_zeros=True):
path = os.path.join(datasets_path['la'], 'metr_la.h5')
df = pd.read_hdf(path)
datetime_idx = sorted(df.index)
date_range = pd.date_range(datetime_idx[0], datetime_idx[-1], freq='5T')
df = df.reindex(index=date_range)
mask = ~np.isnan(df.values)
if impute_zeros:
mask = mask * (df.values != 0.).astype('uint8')
df = df.replace(to_replace=0., method='ffill')
else:
mask = None
dist = self.load_distance_matrix()
return df, dist, mask
def load_distance_matrix(self):
path = os.path.join(datasets_path['la'], 'metr_la_dist.npy')
try:
dist = np.load(path)
except:
distances = pd.read_csv(os.path.join(datasets_path['la'], 'distances_la.csv'))
with open(os.path.join(datasets_path['la'], 'sensor_ids_la.txt')) as f:
ids = f.read().strip().split(',')
num_sensors = len(ids)
dist = np.ones((num_sensors, num_sensors), dtype=np.float32) * np.inf
# Builds sensor id to index map.
sensor_id_to_ind = {int(sensor_id): i for i, sensor_id in enumerate(ids)}
# Fills cells in the matrix with distances.
for row in distances.values:
if row[0] not in sensor_id_to_ind or row[1] not in sensor_id_to_ind:
continue
dist[sensor_id_to_ind[row[0]], sensor_id_to_ind[row[1]]] = row[2]
np.save(path, dist)
return dist
def get_similarity(self, thr=0.1, force_symmetric=False, sparse=False):
finite_dist = self.dist.reshape(-1)
finite_dist = finite_dist[~np.isinf(finite_dist)]
sigma = finite_dist.std()
adj = np.exp(-np.square(self.dist / sigma))
adj[adj < thr] = 0.
if force_symmetric:
adj = np.maximum.reduce([adj, adj.T])
if sparse:
import scipy.sparse as sps
adj = sps.coo_matrix(adj)
return adj
@property
def mask(self):
return self._mask
class MissingValuesMetrLA(MetrLA):
SEED = 9101112
def __init__(self, p_fault=0.0015, p_noise=0.05):
super(MissingValuesMetrLA, self).__init__(impute_zeros=True)
self.rng = np.random.default_rng(self.SEED)
self.p_fault = p_fault
self.p_noise = p_noise
eval_mask = sample_mask(self.numpy().shape,
p=p_fault,
p_noise=p_noise,
min_seq=12,
max_seq=12 * 4,
rng=self.rng)
self.eval_mask = (eval_mask & self.mask).astype('uint8')
@property
def training_mask(self):
return self.mask if self.eval_mask is None else (self.mask & (1 - self.eval_mask))
def splitter(self, dataset, val_len=0, test_len=0, window=0):
idx = np.arange(len(dataset))
if test_len < 1:
test_len = int(test_len * len(idx))
if val_len < 1:
val_len = int(val_len * (len(idx) - test_len))
test_start = len(idx) - test_len
val_start = test_start - val_len
return [idx[:val_start - window], idx[val_start:test_start - window], idx[test_start:]]
================================================
FILE: lib/datasets/pd_dataset.py
================================================
import numpy as np
import pandas as pd
import torch
class PandasDataset:
def __init__(self, dataframe: pd.DataFrame, u: pd.DataFrame = None, name='pd-dataset', mask=None, freq=None,
aggr='sum', **kwargs):
"""
Initialize a tsl dataset from a pandas dataframe.
:param dataframe: dataframe containing the data, shape: n_steps, n_nodes
:param u: dataframe with exog variables
:param name: optional name of the dataset
:param mask: mask for valid data (1:valid, 0:not valid)
:param freq: force a frequency (possibly by resampling)
:param aggr: aggregation method after resampling
"""
super().__init__()
self.name = name
# set dataset dataframe
self.df = dataframe
# set optional exog_variable dataframe
# make sure to consider only the overlapping part of the two dataframes
# assumption u.index \in df.index
idx = sorted(self.df.index)
self.start = idx[0]
self.end = idx[-1]
if u is not None:
self.u = u[self.start:self.end]
else:
self.u = None
if mask is not None:
mask = np.asarray(mask).astype('uint8')
self._mask = mask
if freq is not None:
self.resample_(freq=freq, aggr=aggr)
else:
self.freq = self.df.index.inferred_freq
# make sure that all the dataframes are aligned
self.resample_(self.freq, aggr=aggr)
assert 'T' in self.freq
self.samples_per_day = int(60 / int(self.freq[:-1]) * 24)
def __repr__(self):
return "{}(nodes={}, length={})".format(self.__class__.__name__, self.n_nodes, self.length)
@property
def has_mask(self):
return self._mask is not None
@property
def has_u(self):
return self.u is not None
def resample_(self, freq, aggr):
resampler = self.df.resample(freq)
idx = self.df.index
if aggr == 'sum':
self.df = resampler.sum()
elif aggr == 'mean':
self.df = resampler.mean()
elif aggr == 'nearest':
self.df = resampler.nearest()
else:
raise ValueError(f'{aggr} if not a valid aggregation method.')
if self.has_mask:
resampler = pd.DataFrame(self._mask, index=idx).resample(freq)
self._mask = resampler.min().to_numpy()
if self.has_u:
resampler = self.u.resample(freq)
self.u = resampler.nearest()
self.freq = freq
def dataframe(self) -> pd.DataFrame:
return self.df.copy()
@property
def length(self):
return self.df.values.shape[0]
@property
def n_nodes(self):
return self.df.values.shape[1]
@property
def mask(self):
if self._mask is None:
return np.ones_like(self.df.values).astype('uint8')
return self._mask
def numpy(self, return_idx=False):
if return_idx:
return self.numpy(), self.df.index
return self.df.values
def pytorch(self):
data = self.numpy()
return torch.FloatTensor(data)
def __len__(self):
return self.length
@staticmethod
def build():
raise NotImplementedError
def load_raw(self):
raise NotImplementedError
def load(self):
raise NotImplementedError
================================================
FILE: lib/datasets/pems_bay.py
================================================
import os
import numpy as np
import pandas as pd
from lib import datasets_path
from .pd_dataset import PandasDataset
from ..utils import sample_mask
class PemsBay(PandasDataset):
def __init__(self):
df, dist, mask = self.load()
self.dist = dist
super().__init__(dataframe=df, u=None, mask=mask, name='bay', freq='5T', aggr='nearest')
def load(self, impute_zeros=True):
path = os.path.join(datasets_path['bay'], 'pems_bay.h5')
df = pd.read_hdf(path)
datetime_idx = sorted(df.index)
date_range = pd.date_range(datetime_idx[0], datetime_idx[-1], freq='5T')
df = df.reindex(index=date_range)
mask = ~np.isnan(df.values)
df.fillna(method='ffill', axis=0, inplace=True)
dist = self.load_distance_matrix(list(df.columns))
return df.astype('float32'), dist, mask.astype('uint8')
def load_distance_matrix(self, ids):
path = os.path.join(datasets_path['bay'], 'pems_bay_dist.npy')
try:
dist = np.load(path)
except:
distances = pd.read_csv(os.path.join(datasets_path['bay'], 'distances_bay.csv'))
num_sensors = len(ids)
dist = np.ones((num_sensors, num_sensors), dtype=np.float32) * np.inf
# Builds sensor id to index map.
sensor_id_to_ind = {int(sensor_id): i for i, sensor_id in enumerate(ids)}
# Fills cells in the matrix with distances.
for row in distances.values:
if row[0] not in sensor_id_to_ind or row[1] not in sensor_id_to_ind:
continue
dist[sensor_id_to_ind[row[0]], sensor_id_to_ind[row[1]]] = row[2]
np.save(path, dist)
return dist
def get_similarity(self, type='dcrnn', thr=0.1, force_symmetric=False, sparse=False):
"""
Return similarity matrix among nodes. Implemented to match DCRNN.
:param type: type of similarity matrix.
:param thr: threshold to increase saprseness.
:param trainlen: number of steps that can be used for computing the similarity.
:param force_symmetric: force the result to be simmetric.
:return: and NxN array representig similarity among nodes.
"""
if type == 'dcrnn':
finite_dist = self.dist.reshape(-1)
finite_dist = finite_dist[~np.isinf(finite_dist)]
sigma = finite_dist.std()
adj = np.exp(-np.square(self.dist / sigma))
elif type == 'stcn':
sigma = 10
adj = np.exp(-np.square(self.dist) / sigma)
else:
raise NotImplementedError
adj[adj < thr] = 0.
if force_symmetric:
adj = np.maximum.reduce([adj, adj.T])
if sparse:
import scipy.sparse as sps
adj = sps.coo_matrix(adj)
return adj
@property
def mask(self):
if self._mask is None:
return self.df.values != 0.
return self._mask
class MissingValuesPemsBay(PemsBay):
SEED = 56789
def __init__(self, p_fault=0.0015, p_noise=0.05):
super(MissingValuesPemsBay, self).__init__()
self.rng = np.random.default_rng(self.SEED)
self.p_fault = p_fault
self.p_noise = p_noise
eval_mask = sample_mask(self.numpy().shape,
p=p_fault,
p_noise=p_noise,
min_seq=12,
max_seq=12 * 4,
rng=self.rng)
self.eval_mask = (eval_mask & self.mask).astype('uint8')
@property
def training_mask(self):
return self.mask if self.eval_mask is None else (self.mask & (1 - self.eval_mask))
def splitter(self, dataset, val_len=0, test_len=0, window=0):
idx = np.arange(len(dataset))
if test_len < 1:
test_len = int(test_len * len(idx))
if val_len < 1:
val_len = int(val_len * (len(idx) - test_len))
test_start = len(idx) - test_len
val_start = test_start - val_len
return [idx[:val_start - window], idx[val_start:test_start - window], idx[test_start:]]
================================================
FILE: lib/datasets/synthetic.py
================================================
import os.path
import numpy as np
import torch
from einops import rearrange
from torch.utils.data import Dataset, DataLoader, Subset
from lib import datasets_path
def generate_mask(shape, p_block=0.01, p_point=0.01, max_seq=1, min_seq=1, rng=None):
"""Generate mask in which 1 denotes valid values, 0 missing ones. Assuming shape=(steps, ...)."""
if rng is None:
rand = np.random.random
randint = np.random.randint
else:
rand = rng.random
randint = rng.integers
# init mask
mask = np.ones(shape, dtype='uint8')
# block missing
if p_block > 0:
assert max_seq >= min_seq
for col in range(shape[1]):
i = 0
while i < shape[0]:
if rand() > p_block:
i += 1
else:
fault_len = int(randint(min_seq, max_seq + 1))
mask[i:i + fault_len, col] = 0
i += fault_len + 1 # at least one valid value between two blocks
# point missing
# let values before and after block missing always valid
diff = np.zeros(mask.shape, dtype='uint8')
diff[:-1] |= np.diff(mask, axis=0) < 0
diff[1:] |= np.diff(mask, axis=0) > 0
mask = np.where(mask - diff, rand(shape) > p_point, mask)
return mask
class SyntheticDataset(Dataset):
SEED: int
def __init__(self, filename,
window=None,
p_block=0.05,
p_point=0.05,
max_seq=6,
min_seq=4,
use_exogenous=True,
mask_exogenous=True,
graph_mode=True):
super(SyntheticDataset, self).__init__()
self.mask_exogenous = mask_exogenous
self.use_exogenous = use_exogenous
self.graph_mode = graph_mode
# fetch data
content = self.load(filename)
self.window = window if window is not None else content['loc'].shape[1]
self.loc = torch.tensor(content['loc'][:, :self.window]).float()
self.vel = torch.tensor(content['vel'][:, :self.window]).float()
self.adj = content['adj']
self.SEED = content['seed'].item()
# compute masks
self.rng = np.random.default_rng(self.SEED)
mask_shape = (len(self), self.window, self.n_nodes, 1)
mask = generate_mask(mask_shape,
p_block=p_block,
p_point=p_point,
max_seq=max_seq,
min_seq=min_seq,
rng=self.rng).repeat(self.n_channels, -1)
eval_mask = 1 - generate_mask(mask_shape,
p_block=p_block,
p_point=p_point,
max_seq=max_seq,
min_seq=min_seq,
rng=self.rng).repeat(self.n_channels, -1)
self.mask = torch.tensor(mask).byte()
self.eval_mask = torch.tensor(eval_mask).byte() & self.mask
# store splitting indices
self.train_idxs = None
self.val_idxs = None
self.test_idxs = None
def __len__(self):
return self.loc.size(0)
def __getitem__(self, index):
eval_mask = self.eval_mask[index]
mask = self.training_mask[index]
x = mask * self.loc[index]
res = dict(x=x, mask=mask, eval_mask=eval_mask)
if self.use_exogenous:
u = self.vel[index]
if self.mask_exogenous:
u *= mask.all(-1, keepdims=True)
res.update(u=u)
res.update(y=self.loc[index])
if not self.graph_mode:
res = {k: rearrange(v, 's n f -> s (n f)') for k, v in res.items()}
return res
@property
def n_channels(self):
return self.loc.size(-1)
@property
def n_nodes(self):
return self.loc.size(-2)
@property
def n_exogenous(self):
return self.vel.size(-1) if self.use_exogenous else 0
@property
def training_mask(self):
return self.mask if self.eval_mask is None else (self.mask & (1 - self.eval_mask))
@staticmethod
def load(filename):
return np.load(filename)
def get_similarity(self, sparse=False):
return self.adj
# Splitting options
def split(self, val_len=0, test_len=0):
idx = np.arange(len(self))
if test_len < 1:
test_len = int(test_len * len(idx))
if val_len < 1:
val_len = int(val_len * (len(idx) - test_len))
test_start = len(idx) - test_len
val_start = test_start - val_len
# split dataset
self.train_idxs = idx[:val_start]
self.val_idxs = idx[val_start:test_start]
self.test_idxs = idx[test_start:]
def train_dataloader(self, shuffle=True, batch_size=32):
return DataLoader(Subset(self, self.train_idxs), shuffle=shuffle, batch_size=batch_size, drop_last=True)
def val_dataloader(self, shuffle=False, batch_size=32):
return DataLoader(Subset(self, self.val_idxs), shuffle=shuffle, batch_size=batch_size)
def test_dataloader(self, shuffle=False, batch_size=32):
return DataLoader(Subset(self, self.test_idxs), shuffle=shuffle, batch_size=batch_size)
class ChargedParticles(SyntheticDataset):
def __init__(self, static_adj=False,
window=None,
p_block=0.05,
p_point=0.05,
max_seq=6,
min_seq=4,
use_exogenous=True,
mask_exogenous=True,
graph_mode=True):
if static_adj:
filename = os.path.join(datasets_path['synthetic'], 'charged_static.npz')
else:
filename = os.path.join(datasets_path['synthetic'], 'charged_varying.npz')
self.static_adj = static_adj
super(ChargedParticles, self).__init__(filename, window,
p_block=p_block,
p_point=p_point,
max_seq=max_seq,
min_seq=min_seq,
use_exogenous=use_exogenous,
mask_exogenous=mask_exogenous,
graph_mode=graph_mode)
charges = self.load(filename)['charges']
self.charges = torch.tensor(charges).float()
def __getitem__(self, item):
res = super(ChargedParticles, self).__getitem__(item)
# add charges as exogenous features
if self.use_exogenous:
charges = self.charges[item] if not self.static_adj else self.charges
stacked_charges = charges[None].expand(self.window, -1, -1)
if not self.graph_mode:
stacked_charges = rearrange(stacked_charges, 's n f -> s (n f)')
res.update(u=torch.cat([res['u'], stacked_charges], -1))
return res
def get_similarity(self, sparse=False):
return np.ones((self.n_nodes, self.n_nodes)) - np.eye(self.n_nodes)
@property
def n_exogenous(self):
if self.use_exogenous:
return super(ChargedParticles, self).n_exogenous + 1 # add charges to features
return 0
================================================
FILE: lib/fillers/__init__.py
================================================
from .filler import Filler
from .britsfiller import BRITSFiller
from .graphfiller import GraphFiller
from .rgainfiller import RGAINFiller
from .multi_imputation_filler import MultiImputationFiller
================================================
FILE: lib/fillers/britsfiller.py
================================================
import torch
from . import Filler
from ..nn import BRITS
class BRITSFiller(Filler):
def training_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
mask = batch_data['mask'].clone().detach()
batch_data['mask'] = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte()
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
# Compute predictions and compute loss
out, imputations, predictions = self.predict_batch(batch, preprocess=False, postprocess=False)
if self.scaled_target:
target = self._preprocess(y, batch_preprocessing)
else:
target = y
imputations = [self._postprocess(imp, batch_preprocessing) for imp in imputations]
predictions = [self._postprocess(prd, batch_preprocessing) for prd in predictions]
loss = sum([self.loss_fn(pred, target, mask) for pred in predictions])
loss += BRITS.consistency_loss(*imputations)
# Logging
metrics_mask = (mask | eval_mask) - batch_data['mask'] # all unseen data
out = self._postprocess(out, batch_preprocessing)
self.train_metrics.update(out.detach(), y, metrics_mask)
self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('train_loss', loss, on_step=False, on_epoch=True, logger=True, prog_bar=False)
return loss
def validation_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
mask = batch_data.get('mask')
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
# Compute predictions and compute loss
out, imputations, predictions = self.predict_batch(batch, preprocess=False, postprocess=False)
if self.scaled_target:
target = self._preprocess(y, batch_preprocessing)
else:
target = y
predictions = [self._postprocess(prd, batch_preprocessing) for prd in predictions]
val_loss = sum([self.loss_fn(pred, target, mask) for pred in predictions])
# Logging
out = self._postprocess(out, batch_preprocessing)
self.val_metrics.update(out.detach(), y, eval_mask)
self.log_dict(self.val_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('val_loss', val_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
return val_loss
def test_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
# Compute outputs and rescale
imputation, *_ = self.predict_batch(batch, preprocess=False, postprocess=True)
test_loss = self.loss_fn(imputation, y, eval_mask)
# Logging
self.test_metrics.update(imputation.detach(), y, eval_mask)
self.log_dict(self.test_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('test_loss', test_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
return test_loss
================================================
FILE: lib/fillers/filler.py
================================================
import inspect
from copy import deepcopy
import pytorch_lightning as pl
import torch
from pytorch_lightning.core.decorators import auto_move_data
from pytorch_lightning.metrics import MetricCollection
from pytorch_lightning.utilities import move_data_to_device
from .. import epsilon
from ..nn.utils.metric_base import MaskedMetric
from ..utils.utils import ensure_list
class Filler(pl.LightningModule):
def __init__(self,
model_class,
model_kwargs,
optim_class,
optim_kwargs,
loss_fn,
scaled_target=False,
whiten_prob=0.05,
metrics=None,
scheduler_class=None,
scheduler_kwargs=None):
"""
PL module to implement hole fillers.
:param model_class: Class of pytorch nn.Module implementing the imputer.
:param model_kwargs: Model's keyword arguments.
:param optim_class: Optimizer class.
:param optim_kwargs: Optimizer's keyword arguments.
:param loss_fn: Loss function used for training.
:param scaled_target: Whether to scale target before computing loss using batch processing information.
:param whiten_prob: Probability of removing a value and using it as ground truth for imputation.
:param metrics: Dictionary of type {'metric1_name':metric1_fn, 'metric2_name':metric2_fn ...}.
:param scheduler_class: Scheduler class.
:param scheduler_kwargs: Scheduler's keyword arguments.
"""
super(Filler, self).__init__()
self.save_hyperparameters(model_kwargs)
self.model_cls = model_class
self.model_kwargs = model_kwargs
self.optim_class = optim_class
self.optim_kwargs = optim_kwargs
self.scheduler_class = scheduler_class
if scheduler_kwargs is None:
self.scheduler_kwargs = dict()
else:
self.scheduler_kwargs = scheduler_kwargs
if loss_fn is not None:
self.loss_fn = self._check_metric(loss_fn, on_step=True)
else:
self.loss_fn = None
self.scaled_target = scaled_target
# during training whiten ground-truth values with this probability
assert 0. <= whiten_prob <= 1.
self.keep_prob = 1. - whiten_prob
if metrics is None:
metrics = dict()
self._set_metrics(metrics)
# instantiate model
self.model = self.model_cls(**self.model_kwargs)
def reset_model(self):
self.model = self.model_cls(**self.model_kwargs)
@property
def trainable_parameters(self):
return sum(p.numel() for p in self.model.parameters() if p.requires_grad)
@auto_move_data
def forward(self, *args, **kwargs):
return self.model(*args, **kwargs)
@staticmethod
def _check_metric(metric, on_step=False):
if not isinstance(metric, MaskedMetric):
if 'reduction' in inspect.getfullargspec(metric).args:
metric_kwargs = {'reduction': 'none'}
else:
metric_kwargs = dict()
return MaskedMetric(metric, compute_on_step=on_step, metric_kwargs=metric_kwargs)
return deepcopy(metric)
def _set_metrics(self, metrics):
self.train_metrics = MetricCollection(
{f'train_{k}': self._check_metric(m, on_step=True) for k, m in metrics.items()})
self.val_metrics = MetricCollection({f'val_{k}': self._check_metric(m) for k, m in metrics.items()})
self.test_metrics = MetricCollection({f'test_{k}': self._check_metric(m) for k, m in metrics.items()})
def _preprocess(self, data, batch_preprocessing):
"""
Perform preprocessing of a given input.
:param data: pytorch tensor of shape [batch, steps, nodes, features] to preprocess
:param batch_preprocessing: dictionary containing preprocessing data
:return: preprocessed data
"""
if isinstance(data, (list, tuple)):
return [self._preprocess(d, batch_preprocessing) for d in data]
trend = batch_preprocessing.get('trend', 0.)
bias = batch_preprocessing.get('bias', 0.)
scale = batch_preprocessing.get('scale', 1.)
return (data - trend - bias) / (scale + epsilon)
def _postprocess(self, data, batch_preprocessing):
"""
Perform postprocessing (inverse transform) of a given input.
:param data: pytorch tensor of shape [batch, steps, nodes, features] to trasform
:param batch_preprocessing: dictionary containing preprocessing data
:return: inverse transformed data
"""
if isinstance(data, (list, tuple)):
return [self._postprocess(d, batch_preprocessing) for d in data]
trend = batch_preprocessing.get('trend', 0.)
bias = batch_preprocessing.get('bias', 0.)
scale = batch_preprocessing.get('scale', 1.)
return data * (scale + epsilon) + bias + trend
def predict_batch(self, batch, preprocess=False, postprocess=True, return_target=False):
"""
This method takes as an input a batch as a two dictionaries containing tensors and outputs the predictions.
Prediction should have a shape [batch, nodes, horizon]
:param batch: list dictionary following the structure [data:
{'x':[...], 'y':[...], 'u':[...], ...},
preprocessing:
{'bias': ..., 'scale': ..., 'x_trend':[...], 'y_trend':[...]}]
:param preprocess: whether the data need to be preprocessed (note that inputs are by default preprocessed before creating the batch)
:param postprocess: whether to postprocess the predictions (if True we assume that the model has learned to predict the trasformed signal)
:param return_target: whether to return the prediction target y_true and the prediction mask
:return: (y_true), y_hat, (mask)
"""
batch_data, batch_preprocessing = self._unpack_batch(batch)
if preprocess:
x = batch_data.pop('x')
x = self._preprocess(x, batch_preprocessing)
y_hat = self.forward(x, **batch_data)
else:
y_hat = self.forward(**batch_data)
# Rescale outputs
if postprocess:
y_hat = self._postprocess(y_hat, batch_preprocessing)
if return_target:
y = batch_data.get('y')
mask = batch_data.get('mask', None)
return y, y_hat, mask
return y_hat
def predict_loader(self, loader, preprocess=False, postprocess=True, return_mask=True):
"""
Makes predictions for an input dataloader. Returns both the predictions and the predictions targets.
:param loader: torch dataloader
:param preprocess: whether to preprocess the data
:param postprocess: whether to postprocess the data
:param return_mask: whether to return the valid mask (if it exists)
:return: y_true, y_hat
"""
targets, imputations, masks = [], [], []
for batch in loader:
batch = move_data_to_device(batch, self.device)
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
y_hat = self.predict_batch(batch, preprocess=preprocess, postprocess=postprocess)
if isinstance(y_hat, (list, tuple)):
y_hat = y_hat[0]
targets.append(y)
imputations.append(y_hat)
masks.append(eval_mask)
y = torch.cat(targets, 0)
y_hat = torch.cat(imputations, 0)
if return_mask:
mask = torch.cat(masks, 0) if masks[0] is not None else None
return y, y_hat, mask
return y, y_hat
def _unpack_batch(self, batch):
"""
Unpack a batch into data and preprocessing dictionaries.
:param batch: the batch
:return: batch_data, batch_preprocessing
"""
if isinstance(batch, (tuple, list)) and (len(batch) == 2):
batch_data, batch_preprocessing = batch
else:
batch_data = batch
batch_preprocessing = dict()
return batch_data, batch_preprocessing
def training_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
mask = batch_data['mask'].clone().detach()
batch_data['mask'] = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte()
eval_mask = batch_data.pop('eval_mask')
eval_mask = (mask | eval_mask) - batch_data['mask']
y = batch_data.pop('y')
# Compute predictions and compute loss
imputation = self.predict_batch(batch, preprocess=False, postprocess=False)
if self.scaled_target:
target = self._preprocess(y, batch_preprocessing)
else:
target = y
imputation = self._postprocess(imputation, batch_preprocessing)
loss = self.loss_fn(imputation, target, mask)
# Logging
if self.scaled_target:
imputation = self._postprocess(imputation, batch_preprocessing)
self.train_metrics.update(imputation.detach(), y, eval_mask) # all unseen data
self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('train_loss', loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
return loss
def validation_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
# Compute predictions and compute loss
imputation = self.predict_batch(batch, preprocess=False, postprocess=False)
if self.scaled_target:
target = self._preprocess(y, batch_preprocessing)
else:
target = y
imputation = self._postprocess(imputation, batch_preprocessing)
val_loss = self.loss_fn(imputation, target, eval_mask)
# Logging
if self.scaled_target:
imputation = self._postprocess(imputation, batch_preprocessing)
self.val_metrics.update(imputation.detach(), y, eval_mask)
self.log_dict(self.val_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('val_loss', val_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
return val_loss
def test_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
# Compute outputs and rescale
imputation = self.predict_batch(batch, preprocess=False, postprocess=True)
test_loss = self.loss_fn(imputation, y, eval_mask)
# Logging
self.test_metrics.update(imputation.detach(), y, eval_mask)
self.log_dict(self.test_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
return test_loss
def on_train_epoch_start(self) -> None:
optimizers = ensure_list(self.optimizers())
for i, optimizer in enumerate(optimizers):
lr = optimizer.optimizer.param_groups[0]['lr']
self.log(f'lr_{i}', lr, on_step=False, on_epoch=True, logger=True, prog_bar=False)
def configure_optimizers(self):
cfg = dict()
optimizer = self.optim_class(self.parameters(), **self.optim_kwargs)
cfg['optimizer'] = optimizer
if self.scheduler_class is not None:
metric = self.scheduler_kwargs.pop('monitor', None)
scheduler = self.scheduler_class(optimizer, **self.scheduler_kwargs)
cfg['lr_scheduler'] = scheduler
if metric is not None:
cfg['monitor'] = metric
return cfg
================================================
FILE: lib/fillers/graphfiller.py
================================================
import torch
from . import Filler
from ..nn.models import MPGRUNet, GRINet, BiMPGRUNet
class GraphFiller(Filler):
def __init__(self,
model_class,
model_kwargs,
optim_class,
optim_kwargs,
loss_fn,
scaled_target=False,
whiten_prob=0.05,
pred_loss_weight=1.,
warm_up=0,
metrics=None,
scheduler_class=None,
scheduler_kwargs=None):
super(GraphFiller, self).__init__(model_class=model_class,
model_kwargs=model_kwargs,
optim_class=optim_class,
optim_kwargs=optim_kwargs,
loss_fn=loss_fn,
scaled_target=scaled_target,
whiten_prob=whiten_prob,
metrics=metrics,
scheduler_class=scheduler_class,
scheduler_kwargs=scheduler_kwargs)
self.tradeoff = pred_loss_weight
if model_class is MPGRUNet:
self.trimming = (warm_up, 0)
elif model_class in [GRINet, BiMPGRUNet]:
self.trimming = (warm_up, warm_up)
def trim_seq(self, *seq):
seq = [s[:, self.trimming[0]:s.size(1) - self.trimming[1]] for s in seq]
if len(seq) == 1:
return seq[0]
return seq
def training_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Compute masks
mask = batch_data['mask'].clone().detach()
batch_data['mask'] = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte()
eval_mask = batch_data.pop('eval_mask', None)
eval_mask = (mask | eval_mask) - batch_data['mask'] # all unseen data
y = batch_data.pop('y')
# Compute predictions and compute loss
res = self.predict_batch(batch, preprocess=False, postprocess=False)
imputation, predictions = (res[0], res[1:]) if isinstance(res, (list, tuple)) else (res, [])
# trim to imputation horizon len
imputation, mask, eval_mask, y = self.trim_seq(imputation, mask, eval_mask, y)
predictions = self.trim_seq(*predictions)
if self.scaled_target:
target = self._preprocess(y, batch_preprocessing)
else:
target = y
imputation = self._postprocess(imputation, batch_preprocessing)
for i, _ in enumerate(predictions):
predictions[i] = self._postprocess(predictions[i], batch_preprocessing)
loss = self.loss_fn(imputation, target, mask)
for pred in predictions:
loss += self.tradeoff * self.loss_fn(pred, target, mask)
# Logging
if self.scaled_target:
imputation = self._postprocess(imputation, batch_preprocessing)
self.train_metrics.update(imputation.detach(), y, eval_mask) # all unseen data
self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('train_loss', loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
return loss
def validation_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
mask = batch_data.get('mask')
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
# Compute predictions and compute loss
imputation = self.predict_batch(batch, preprocess=False, postprocess=False)
# trim to imputation horizon len
imputation, mask, eval_mask, y = self.trim_seq(imputation, mask, eval_mask, y)
if self.scaled_target:
target = self._preprocess(y, batch_preprocessing)
else:
target = y
imputation = self._postprocess(imputation, batch_preprocessing)
val_loss = self.loss_fn(imputation, target, eval_mask)
# Logging
if self.scaled_target:
imputation = self._postprocess(imputation, batch_preprocessing)
self.val_metrics.update(imputation.detach(), y, eval_mask)
self.log_dict(self.val_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('val_loss', val_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
return val_loss
def test_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
# Compute outputs and rescale
imputation = self.predict_batch(batch, preprocess=False, postprocess=True)
test_loss = self.loss_fn(imputation, y, eval_mask)
# Logging
self.test_metrics.update(imputation.detach(), y, eval_mask)
self.log_dict(self.test_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('test_loss', test_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
return test_loss
================================================
FILE: lib/fillers/multi_imputation_filler.py
================================================
import torch
from pytorch_lightning.core.decorators import auto_move_data
from . import Filler
class MultiImputationFiller(Filler):
"""
Filler with multiple imputation outputs
"""
def __init__(self,
model_class,
model_kwargs,
optim_class,
optim_kwargs,
loss_fn,
consistency_loss=False,
scaled_target=False,
whiten_prob=0.05,
metrics=None,
scheduler_class=None,
scheduler_kwargs=None):
super().__init__(model_class,
model_kwargs,
optim_class,
optim_kwargs,
loss_fn,
scaled_target,
whiten_prob,
metrics,
scheduler_class,
scheduler_kwargs)
self.consistency_loss = consistency_loss
@auto_move_data
def forward(self, *args, **kwargs):
out = self.model(*args, **kwargs)
assert isinstance(out, (list, tuple))
if self.training:
return out
return out[0] # we assume that the final imputation is the first one
def _consistency_loss(self, imputations, mask):
from itertools import combinations
return sum([self.loss_fn(imp1, imp2, mask) for imp1, imp2 in combinations(imputations, 2)])
def training_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
# Extract mask and target
mask = batch_data['mask'].clone().detach()
batch_data['mask'] = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte()
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
# Compute predictions and compute loss
imputations = self.predict_batch(batch, preprocess=False, postprocess=False)
if self.scaled_target:
target = self._preprocess(y, batch_preprocessing)
else:
target = y
imputations = [self._postprocess(imp, batch_preprocessing) for imp in imputations]
loss = sum([self.loss_fn(imp, target, mask) for imp in imputations])
if self.consistency_loss:
loss += self._consistency_loss(imputations, mask)
# Logging
metrics_mask = (mask | eval_mask) - batch_data['mask'] # all unseen data
x_hat = imputations[0]
x_hat = self._postprocess(x_hat, batch_preprocessing)
self.train_metrics.update(x_hat.detach(), y, metrics_mask)
self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('train_loss', loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
return loss
================================================
FILE: lib/fillers/rgainfiller.py
================================================
import torch
from torch.nn import functional as F
from .multi_imputation_filler import MultiImputationFiller
from ..nn.utils.metric_base import MaskedMetric
class MaskedBCEWithLogits(MaskedMetric):
def __init__(self,
mask_nans=False,
mask_inf=False,
compute_on_step=True,
dist_sync_on_step=False,
process_group=None,
dist_sync_fn=None,
at=None):
super(MaskedBCEWithLogits, self).__init__(metric_fn=F.binary_cross_entropy_with_logits,
mask_nans=mask_nans,
mask_inf=mask_inf,
compute_on_step=compute_on_step,
dist_sync_on_step=dist_sync_on_step,
process_group=process_group,
dist_sync_fn=dist_sync_fn,
metric_kwargs={'reduction': 'none'},
at=at)
class RGAINFiller(MultiImputationFiller):
def __init__(self,
model_class,
model_kwargs,
optim_class,
optim_kwargs,
loss_fn,
g_train_freq=1,
d_train_freq=5,
consistency_loss=False,
scaled_target=True,
whiten_prob=0.05,
hint_rate=0.7,
alpha=10.,
metrics=None,
scheduler_class=None,
scheduler_kwargs=None):
super(RGAINFiller, self).__init__(model_class=model_class,
model_kwargs=model_kwargs,
optim_class=optim_class,
optim_kwargs=optim_kwargs,
loss_fn=loss_fn,
scaled_target=scaled_target,
whiten_prob=whiten_prob,
metrics=metrics,
consistency_loss=consistency_loss,
scheduler_class=scheduler_class,
scheduler_kwargs=scheduler_kwargs)
# discriminator training params
self.alpha = alpha
self.g_train_freq = g_train_freq
self.d_train_freq = d_train_freq
self.masked_bce_loss = MaskedBCEWithLogits(compute_on_step=True)
# activate manual optimization
self.automatic_optimization = False
self.hint_rate = hint_rate
def training_step(self, batch, batch_idx):
# Unpack batch
batch_data, batch_preprocessing = self._unpack_batch(batch)
g_opt, d_opt = self.optimizers()
schedulers = self.lr_schedulers()
# Extract mask and target
x = batch_data.pop('x')
mask = batch_data['mask'].clone().detach()
training_mask = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte()
eval_mask = batch_data.pop('eval_mask', None)
y = batch_data.pop('y')
##########################
# generate imputations
##########################
imputations = self.model.generator(x, training_mask)
imputed_seq = imputations[0]
target = self._preprocess(y, batch_preprocessing)
y_hat = self._postprocess(imputed_seq, batch_preprocessing)
x_in = training_mask * x + (1 - training_mask) * imputed_seq
hint = torch.rand_like(training_mask, dtype=torch.float) < self.hint_rate
hint = hint.byte()
hint = hint * training_mask + (1 - hint) * 0.5
#########################
# train generator
#########################
if (batch_idx % self.g_train_freq) == 0:
g_opt.zero_grad()
rec_loss = sum([torch.sqrt(self.loss_fn(imp, target, mask)) for imp in imputations])
if self.consistency_loss:
rec_loss += self._consistency_loss(imputations, mask)
logits = self.model.discriminator(x_in, hint)
# maximize logit
adv_loss = self.masked_bce_loss(logits, torch.ones_like(logits), 1 - training_mask)
g_loss = self.alpha * rec_loss + adv_loss
self.manual_backward(g_loss)
g_opt.step()
# Logging
metrics_mask = (mask | eval_mask) - training_mask
self.train_metrics.update(y_hat.detach(), y, metrics_mask) # all unseen data
self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True)
self.log('gen_loss', adv_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
self.log('imp_loss', rec_loss.detach(), on_step=True, on_epoch=True, logger=True, prog_bar=True)
###########################
# train discriminator
###########################
if (batch_idx % self.d_train_freq) == 0:
d_opt.zero_grad()
logits = self.model.discriminator(x_in.detach(), hint)
d_loss = self.masked_bce_loss(logits, training_mask.to(logits.dtype))
self.manual_backward(d_loss)
d_opt.step()
self.log('d_loss', d_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False)
if (schedulers is not None) and self.trainer.is_last_batch:
for sch in schedulers:
sch.step()
def configure_optimizers(self):
opt_g = self.optim_class(self.model.generator.parameters(), **self.optim_kwargs)
opt_d = self.optim_class(self.model.discriminator.parameters(), **self.optim_kwargs)
optimizers = [opt_g, opt_d]
if self.scheduler_class is not None:
metric = self.scheduler_kwargs.pop('monitor', None)
schedulers = [{"scheduler": self.scheduler_class(opt, **self.scheduler_kwargs), "monitor": metric}
for opt in optimizers]
return optimizers, schedulers
return optimizers
================================================
FILE: lib/nn/__init__.py
================================================
from .layers import *
================================================
FILE: lib/nn/layers/__init__.py
================================================
from .rits import RITS, BRITS
from .gril import GRIL, BiGRIL
from .spatial_conv import SpatialConvOrderK
from .mpgru import MPGRUImputer
================================================
FILE: lib/nn/layers/gcrnn.py
================================================
import torch
import torch.nn as nn
from .spatial_conv import SpatialConvOrderK
class GCGRUCell(nn.Module):
"""
Graph Convolution Gated Recurrent Unit Cell.
"""
def __init__(self, d_in, num_units, support_len, order, activation='tanh'):
"""
:param num_units: the hidden dim of rnn
:param support_len: the (weighted) adjacency matrix of the graph, in numpy ndarray form
:param order: the max diffusion step
:param activation: if None, don't do activation for cell state
"""
super(GCGRUCell, self).__init__()
self.activation_fn = getattr(torch, activation)
self.forget_gate = SpatialConvOrderK(c_in=d_in + num_units, c_out=num_units, support_len=support_len,
order=order)
self.update_gate = SpatialConvOrderK(c_in=d_in + num_units, c_out=num_units, support_len=support_len,
order=order)
self.c_gate = SpatialConvOrderK(c_in=d_in + num_units, c_out=num_units, support_len=support_len, order=order)
def forward(self, x, h, adj):
"""
:param x: (B, input_dim, num_nodes)
:param h: (B, num_units, num_nodes)
:param adj: (num_nodes, num_nodes)
:return:
"""
# we start with bias 1.0 to not reset and not update
x_gates = torch.cat([x, h], dim=1)
r = torch.sigmoid(self.forget_gate(x_gates, adj))
u = torch.sigmoid(self.update_gate(x_gates, adj))
x_c = torch.cat([x, r * h], dim=1)
c = self.c_gate(x_c, adj) # batch_size, self._num_nodes * output_size
c = self.activation_fn(c)
return u * h + (1. - u) * c
class GCRNN(nn.Module):
def __init__(self,
d_in,
d_model,
d_out,
n_layers,
support_len,
kernel_size=2):
super(GCRNN, self).__init__()
self.d_in = d_in
self.d_model = d_model
self.d_out = d_out
self.n_layers = n_layers
self.ks = kernel_size
self.support_len = support_len
self.rnn_cells = nn.ModuleList()
for i in range(self.n_layers):
self.rnn_cells.append(GCGRUCell(d_in=self.d_in if i == 0 else self.d_model,
num_units=self.d_model, support_len=self.support_len, order=self.ks))
self.output_layer = nn.Conv2d(self.d_model, self.d_out, kernel_size=1)
def init_hidden_states(self, x):
return [torch.zeros(size=(x.shape[0], self.d_model, x.shape[2])).to(x.device) for _ in range(self.n_layers)]
def single_pass(self, x, h, adj):
out = x
for l, layer in enumerate(self.rnn_cells):
out = h[l] = layer(out, h[l], adj)
return out, h
def forward(self, x, adj, h=None):
# x:[batch, features, nodes, steps]
*_, steps = x.size()
if h is None:
h = self.init_hidden_states(x)
# temporal conv
for step in range(steps):
out, h = self.single_pass(x[..., step], h, adj)
return self.output_layer(out[..., None])
================================================
FILE: lib/nn/layers/gril.py
================================================
import torch
import torch.nn as nn
from einops import rearrange
from .spatial_conv import SpatialConvOrderK
from .gcrnn import GCGRUCell
from .spatial_attention import SpatialAttention
from ..utils.ops import reverse_tensor
class SpatialDecoder(nn.Module):
def __init__(self, d_in, d_model, d_out, support_len, order=1, attention_block=False, nheads=2, dropout=0.):
super(SpatialDecoder, self).__init__()
self.order = order
self.lin_in = nn.Conv1d(d_in, d_model, kernel_size=1)
self.graph_conv = SpatialConvOrderK(c_in=d_model, c_out=d_model,
support_len=support_len * order, order=1, include_self=False)
if attention_block:
self.spatial_att = SpatialAttention(d_in=d_model,
d_model=d_model,
nheads=nheads,
dropout=dropout)
self.lin_out = nn.Conv1d(3 * d_model, d_model, kernel_size=1)
else:
self.register_parameter('spatial_att', None)
self.lin_out = nn.Conv1d(2 * d_model, d_model, kernel_size=1)
self.read_out = nn.Conv1d(2 * d_model, d_out, kernel_size=1)
self.activation = nn.PReLU()
self.adj = None
def forward(self, x, m, h, u, adj, cached_support=False):
# [batch, channels, nodes]
x_in = [x, m, h] if u is None else [x, m, u, h]
x_in = torch.cat(x_in, 1)
if self.order > 1:
if cached_support and (self.adj is not None):
adj = self.adj
else:
adj = SpatialConvOrderK.compute_support_orderK(adj, self.order, include_self=False, device=x_in.device)
self.adj = adj if cached_support else None
x_in = self.lin_in(x_in)
out = self.graph_conv(x_in, adj)
if self.spatial_att is not None:
# [batch, channels, nodes] -> [batch, steps, nodes, features]
x_in = rearrange(x_in, 'b f n -> b 1 n f')
out_att = self.spatial_att(x_in, torch.eye(x_in.size(2), dtype=torch.bool, device=x_in.device))
out_att = rearrange(out_att, 'b s n f -> b f (n s)')
out = torch.cat([out, out_att], 1)
out = torch.cat([out, h], 1)
out = self.activation(self.lin_out(out))
# out = self.lin_out(out)
out = torch.cat([out, h], 1)
return self.read_out(out), out
class GRIL(nn.Module):
def __init__(self,
input_size,
hidden_size,
u_size=None,
n_layers=1,
dropout=0.,
kernel_size=2,
decoder_order=1,
global_att=False,
support_len=2,
n_nodes=None,
layer_norm=False):
super(GRIL, self).__init__()
self.input_size = int(input_size)
self.hidden_size = int(hidden_size)
self.u_size = int(u_size) if u_size is not None else 0
self.n_layers = int(n_layers)
rnn_input_size = 2 * self.input_size + self.u_size # input + mask + (eventually) exogenous
# Spatio-temporal encoder (rnn_input_size -> hidden_size)
self.cells = nn.ModuleList()
self.norms = nn.ModuleList()
for i in range(self.n_layers):
self.cells.append(GCGRUCell(d_in=rnn_input_size if i == 0 else self.hidden_size,
num_units=self.hidden_size, support_len=support_len, order=kernel_size))
if layer_norm:
self.norms.append(nn.GroupNorm(num_groups=1, num_channels=self.hidden_size))
else:
self.norms.append(nn.Identity())
self.dropout = nn.Dropout(dropout) if dropout > 0. else None
# Fist stage readout
self.first_stage = nn.Conv1d(in_channels=self.hidden_size, out_channels=self.input_size, kernel_size=1)
# Spatial decoder (rnn_input_size + hidden_size -> hidden_size)
self.spatial_decoder = SpatialDecoder(d_in=rnn_input_size + self.hidden_size,
d_model=self.hidden_size,
d_out=self.input_size,
support_len=2,
order=decoder_order,
attention_block=global_att)
# Hidden state initialization embedding
if n_nodes is not None:
self.h0 = self.init_hidden_states(n_nodes)
else:
self.register_parameter('h0', None)
def init_hidden_states(self, n_nodes):
h0 = []
for l in range(self.n_layers):
std = 1. / torch.sqrt(torch.tensor(self.hidden_size, dtype=torch.float))
vals = torch.distributions.Normal(0, std).sample((self.hidden_size, n_nodes))
h0.append(nn.Parameter(vals))
return nn.ParameterList(h0)
def get_h0(self, x):
if self.h0 is not None:
return [h.expand(x.shape[0], -1, -1) for h in self.h0]
return [torch.zeros(size=(x.shape[0], self.hidden_size, x.shape[2])).to(x.device)] * self.n_layers
def update_state(self, x, h, adj):
rnn_in = x
for layer, (cell, norm) in enumerate(zip(self.cells, self.norms)):
rnn_in = h[layer] = norm(cell(rnn_in, h[layer], adj))
if self.dropout is not None and layer < (self.n_layers - 1):
rnn_in = self.dropout(rnn_in)
return h
def forward(self, x, adj, mask=None, u=None, h=None, cached_support=False):
# x:[batch, features, nodes, steps]
*_, steps = x.size()
# infer all valid if mask is None
if mask is None:
mask = torch.ones_like(x, dtype=torch.uint8)
# init hidden state using node embedding or the empty state
if h is None:
h = self.get_h0(x)
elif not isinstance(h, list):
h = [*h]
# Temporal conv
predictions, imputations, states = [], [], []
representations = []
for step in range(steps):
x_s = x[..., step]
m_s = mask[..., step]
h_s = h[-1]
u_s = u[..., step] if u is not None else None
# firstly impute missing values with predictions from state
xs_hat_1 = self.first_stage(h_s)
# fill missing values in input with prediction
x_s = torch.where(m_s, x_s, xs_hat_1)
# prepare inputs
# retrieve maximum information from neighbors
xs_hat_2, repr_s = self.spatial_decoder(x=x_s, m=m_s, h=h_s, u=u_s, adj=adj,
cached_support=cached_support) # receive messages from neighbors (no self-loop!)
# readout of imputation state + mask to retrieve imputations
# prepare inputs
x_s = torch.where(m_s, x_s, xs_hat_2)
inputs = [x_s, m_s]
if u_s is not None:
inputs.append(u_s)
inputs = torch.cat(inputs, dim=1) # x_hat_2 + mask + exogenous
# update state with original sequence filled using imputations
h = self.update_state(inputs, h, adj)
# store imputations and states
imputations.append(xs_hat_2)
predictions.append(xs_hat_1)
states.append(torch.stack(h, dim=0))
representations.append(repr_s)
# Aggregate outputs -> [batch, features, nodes, steps]
imputations = torch.stack(imputations, dim=-1)
predictions = torch.stack(predictions, dim=-1)
states = torch.stack(states, dim=-1)
representations = torch.stack(representations, dim=-1)
return imputations, predictions, representations, states
class BiGRIL(nn.Module):
def __init__(self,
input_size,
hidden_size,
ff_size,
ff_dropout,
n_layers=1,
dropout=0.,
n_nodes=None,
support_len=2,
kernel_size=2,
decoder_order=1,
global_att=False,
u_size=0,
embedding_size=0,
layer_norm=False,
merge='mlp'):
super(BiGRIL, self).__init__()
self.fwd_rnn = GRIL(input_size=input_size,
hidden_size=hidden_size,
n_layers=n_layers,
dropout=dropout,
n_nodes=n_nodes,
support_len=support_len,
kernel_size=kernel_size,
decoder_order=decoder_order,
global_att=global_att,
u_size=u_size,
layer_norm=layer_norm)
self.bwd_rnn = GRIL(input_size=input_size,
hidden_size=hidden_size,
n_layers=n_layers,
dropout=dropout,
n_nodes=n_nodes,
support_len=support_len,
kernel_size=kernel_size,
decoder_order=decoder_order,
global_att=global_att,
u_size=u_size,
layer_norm=layer_norm)
if n_nodes is None:
embedding_size = 0
if embedding_size > 0:
self.emb = nn.Parameter(torch.empty(embedding_size, n_nodes))
nn.init.kaiming_normal_(self.emb, nonlinearity='relu')
else:
self.register_parameter('emb', None)
if merge == 'mlp':
self._impute_from_states = True
self.out = nn.Sequential(
nn.Conv2d(in_channels=4 * hidden_size + input_size + embedding_size,
out_channels=ff_size, kernel_size=1),
nn.ReLU(),
nn.Dropout(ff_dropout),
nn.Conv2d(in_channels=ff_size, out_channels=input_size, kernel_size=1)
)
elif merge in ['mean', 'sum', 'min', 'max']:
self._impute_from_states = False
self.out = getattr(torch, merge)
else:
raise ValueError("Merge option %s not allowed." % merge)
self.supp = None
def forward(self, x, adj, mask=None, u=None, cached_support=False):
if cached_support and (self.supp is not None):
supp = self.supp
else:
supp = SpatialConvOrderK.compute_support(adj, x.device)
self.supp = supp if cached_support else None
# Forward
fwd_out, fwd_pred, fwd_repr, _ = self.fwd_rnn(x, supp, mask=mask, u=u, cached_support=cached_support)
# Backward
rev_x, rev_mask, rev_u = [reverse_tensor(tens) for tens in (x, mask, u)]
*bwd_res, _ = self.bwd_rnn(rev_x, supp, mask=rev_mask, u=rev_u, cached_support=cached_support)
bwd_out, bwd_pred, bwd_repr = [reverse_tensor(res) for res in bwd_res]
if self._impute_from_states:
inputs = [fwd_repr, bwd_repr, mask]
if self.emb is not None:
b, *_, s = fwd_repr.shape # fwd_h: [batches, channels, nodes, steps]
inputs += [self.emb.view(1, *self.emb.shape, 1).expand(b, -1, -1, s)] # stack emb for batches and steps
imputation = torch.cat(inputs, dim=1)
imputation = self.out(imputation)
else:
imputation = torch.stack([fwd_out, bwd_out], dim=1)
imputation = self.out(imputation, dim=1)
predictions = torch.stack([fwd_out, bwd_out, fwd_pred, bwd_pred], dim=0)
return imputation, predictions
================================================
FILE: lib/nn/layers/imputation.py
================================================
import math
import torch
from torch import nn
from torch.nn import functional as F
class ImputationLayer(nn.Module):
def __init__(self, d_in, bias=True):
super(ImputationLayer, self).__init__()
self.W = nn.Parameter(torch.Tensor(d_in, d_in))
if bias:
self.b = nn.Parameter(torch.Tensor(d_in))
else:
self.register_buffer('b', None)
mask = 1. - torch.eye(d_in)
self.register_buffer('mask', mask)
self.reset_parameters()
def reset_parameters(self):
nn.init.kaiming_uniform_(self.W, a=math.sqrt(5))
if self.b is not None:
fan_in, _ = nn.init._calculate_fan_in_and_fan_out(self.W)
bound = 1 / math.sqrt(fan_in)
nn.init.uniform_(self.b, -bound, bound)
def forward(self, x):
# batch, features
return F.linear(x, self.mask * self.W, self.b)
================================================
FILE: lib/nn/layers/mpgru.py
================================================
import torch
from torch import nn
from .gcrnn import GCGRUCell
class MPGRUImputer(nn.Module):
def __init__(self,
input_size,
hidden_size,
ff_size=None,
u_size=None,
n_layers=1,
dropout=0.,
kernel_size=2,
support_len=2,
n_nodes=None,
layer_norm=False,
autoencoder_mode=False):
super(MPGRUImputer, self).__init__()
self.input_size = int(input_size)
self.hidden_size = int(hidden_size)
self.ff_size = int(ff_size) if ff_size is not None else 0
self.u_size = int(u_size) if u_size is not None else 0
self.n_layers = int(n_layers)
rnn_input_size = 2 * self.input_size + self.u_size # input + mask + (eventually) exogenous
# Spatio-temporal encoder (rnn_input_size -> hidden_size)
self.cells = nn.ModuleList()
self.norms = nn.ModuleList()
for i in range(self.n_layers):
self.cells.append(GCGRUCell(d_in=rnn_input_size if i == 0 else self.hidden_size,
num_units=self.hidden_size, support_len=support_len, order=kernel_size))
if layer_norm:
self.norms.append(nn.GroupNorm(num_groups=1, num_channels=self.hidden_size))
else:
self.norms.append(nn.Identity())
self.dropout = nn.Dropout(dropout) if dropout > 0. else None
# Readout
if self.ff_size:
self.pred_readout = nn.Sequential(
nn.Conv1d(in_channels=self.hidden_size, out_channels=self.ff_size, kernel_size=1),
nn.PReLU(),
nn.Conv1d(in_channels=self.ff_size, out_channels=self.input_size, kernel_size=1)
)
else:
self.pred_readout = nn.Conv1d(in_channels=self.hidden_size, out_channels=self.input_size, kernel_size=1)
# Hidden state initialization embedding
if n_nodes is not None:
self.h0 = self.init_hidden_states(n_nodes)
else:
self.register_parameter('h0', None)
self.autoencoder_mode = autoencoder_mode
def init_hidden_states(self, n_nodes):
h0 = []
for l in range(self.n_layers):
std = 1. / torch.sqrt(torch.tensor(self.hidden_size, dtype=torch.float))
vals = torch.distributions.Normal(0, std).sample((self.hidden_size, n_nodes))
h0.append(nn.Parameter(vals))
return nn.ParameterList(h0)
def get_h0(self, x):
if self.h0 is not None:
return [h.expand(x.shape[0], -1, -1) for h in self.h0]
return [torch.zeros(size=(x.shape[0], self.hidden_size, x.shape[2])).to(x.device)] * self.n_layers
def update_state(self, x, h, adj):
rnn_in = x
for layer, (cell, norm) in enumerate(zip(self.cells, self.norms)):
rnn_in = h[layer] = norm(cell(rnn_in, h[layer], adj))
if self.dropout is not None and layer < (self.n_layers - 1):
rnn_in = self.dropout(rnn_in)
return h
def forward(self, x, adj, mask=None, u=None, h=None):
# x:[batch, features, nodes, steps]
*_, steps = x.size()
# infer all valid if mask is None
if mask is None:
mask = torch.ones_like(x, dtype=torch.uint8)
# init hidden state using node embedding or the empty state
if h is None:
h = self.get_h0(x)
elif not isinstance(h, list):
h = [*h]
# Temporal conv
predictions, states = [], []
for step in range(steps):
x_s = x[..., step]
m_s = mask[..., step]
h_s = h[-1]
u_s = u[..., step] if u is not None else None
# impute missing values with predictions from state
x_s_hat = self.pred_readout(h_s)
# store imputations and state
predictions.append(x_s_hat)
states.append(torch.stack(h, dim=0))
# fill missing values in input with prediction
x_s = torch.where(m_s, x_s, x_s_hat)
inputs = [x_s, m_s]
if u_s is not None:
inputs.append(u_s)
inputs = torch.cat(inputs, dim=1) # x_hat complemented + mask + exogenous
# update state with original sequence filled using imputations
h = self.update_state(inputs, h, adj)
# In autoencoder mode use states after input processing
if self.autoencoder_mode:
states = states[1:] + [torch.stack(h, dim=0)]
# Aggregate outputs -> [batch, features, nodes, steps]
predictions = torch.stack(predictions, dim=-1)
states = torch.stack(states, dim=-1)
return predictions, states
================================================
FILE: lib/nn/layers/rits.py
================================================
import math
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from torch.nn.parameter import Parameter
from ..utils.ops import reverse_tensor
class FeatureRegression(nn.Module):
def __init__(self, input_size):
super(FeatureRegression, self).__init__()
self.W = Parameter(torch.Tensor(input_size, input_size))
self.b = Parameter(torch.Tensor(input_size))
m = torch.ones(input_size, input_size) - torch.eye(input_size, input_size)
self.register_buffer('m', m)
self.reset_parameters()
def reset_parameters(self):
stdv = 1. / math.sqrt(self.W.shape[0])
self.W.data.uniform_(-stdv, stdv)
if self.b is not None:
self.b.data.uniform_(-stdv, stdv)
def forward(self, x):
z_h = F.linear(x, self.W * Variable(self.m), self.b)
return z_h
class TemporalDecay(nn.Module):
def __init__(self, d_in, d_out, diag=False):
super(TemporalDecay, self).__init__()
self.diag = diag
self.W = Parameter(torch.Tensor(d_out, d_in))
self.b = Parameter(torch.Tensor(d_out))
if self.diag:
assert (d_in == d_out)
m = torch.eye(d_in, d_in)
self.register_buffer('m', m)
self.reset_parameters()
def reset_parameters(self):
stdv = 1. / math.sqrt(self.W.shape[0])
self.W.data.uniform_(-stdv, stdv)
if self.b is not None:
self.b.data.uniform_(-stdv, stdv)
@staticmethod
def compute_delta(mask, freq=1):
delta = torch.zeros_like(mask).float()
one_step = torch.tensor(freq, dtype=delta.dtype, device=delta.device)
for i in range(1, delta.shape[-2]):
m = mask[..., i - 1, :]
delta[..., i, :] = m * one_step + (1 - m) * torch.add(delta[..., i - 1, :], freq)
return delta
def forward(self, d):
if self.diag:
gamma = F.relu(F.linear(d, self.W * Variable(self.m), self.b))
else:
gamma = F.relu(F.linear(d, self.W, self.b))
gamma = torch.exp(-gamma)
return gamma
class RITS(nn.Module):
def __init__(self,
input_size,
hidden_size=64):
super(RITS, self).__init__()
self.input_size = int(input_size)
self.hidden_size = int(hidden_size)
self.rnn_cell = nn.LSTMCell(2 * self.input_size, self.hidden_size)
self.temp_decay_h = TemporalDecay(d_in=self.input_size, d_out=self.hidden_size, diag=False)
self.temp_decay_x = TemporalDecay(d_in=self.input_size, d_out=self.input_size, diag=True)
self.hist_reg = nn.Linear(self.hidden_size, self.input_size)
self.feat_reg = FeatureRegression(self.input_size)
self.weight_combine = nn.Linear(2 * self.input_size, self.input_size)
def init_hidden_states(self, x):
return Variable(torch.zeros((x.shape[0], self.hidden_size))).to(x.device)
def forward(self, x, mask=None, delta=None):
# x : [batch, steps, features]
steps = x.shape[-2]
if mask is None:
mask = torch.ones_like(x, dtype=torch.uint8)
if delta is None:
delta = TemporalDecay.compute_delta(mask)
# init rnn states
h = self.init_hidden_states(x)
c = self.init_hidden_states(x)
imputation = []
predictions = []
for step in range(steps):
d = delta[:, step, :]
m = mask[:, step, :]
x_s = x[:, step, :]
gamma_h = self.temp_decay_h(d)
# history prediction
x_h = self.hist_reg(h)
x_c = m * x_s + (1 - m) * x_h
h = h * gamma_h
# feature prediction
z_h = self.feat_reg(x_c)
# predictions combination
gamma_x = self.temp_decay_x(d)
alpha = self.weight_combine(torch.cat([gamma_x, m], dim=1))
alpha = torch.sigmoid(alpha)
c_h = alpha * z_h + (1 - alpha) * x_h
c_c = m * x_s + (1 - m) * c_h
inputs = torch.cat([c_c, m], dim=1)
h, c = self.rnn_cell(inputs, (h, c))
imputation.append(c_c)
predictions.append(torch.stack((c_h, z_h, x_h), dim=0))
# imputation -> [batch, steps, features]
imputation = torch.stack(imputation, dim=-2)
# predictions -> [predictions, batch, steps, features]
predictions = torch.stack(predictions, dim=-2)
c_h, z_h, x_h = predictions
return imputation, (c_h, z_h, x_h)
class BRITS(nn.Module):
def __init__(self, input_size, hidden_size):
super().__init__()
self.rits_fwd = RITS(input_size, hidden_size)
self.rits_bwd = RITS(input_size, hidden_size)
def forward(self, x, mask=None):
# x: [batches, steps, features]
# forward
imp_fwd, pred_fwd = self.rits_fwd(x, mask)
# backward
x_bwd = reverse_tensor(x, axis=1)
mask_bwd = reverse_tensor(mask, axis=1) if mask is not None else None
imp_bwd, pred_bwd = self.rits_bwd(x_bwd, mask_bwd)
imp_bwd, pred_bwd = reverse_tensor(imp_bwd, axis=1), [reverse_tensor(pb, axis=1) for pb in pred_bwd]
# stack into shape = [batch, directions, steps, features]
imputation = torch.stack([imp_fwd, imp_bwd], dim=1)
predictions = [torch.stack([pf, pb], dim=1) for pf, pb in zip(pred_fwd, pred_bwd)]
c_h, z_h, x_h = predictions
return imputation, (c_h, z_h, x_h)
@staticmethod
def consistency_loss(imp_fwd, imp_bwd):
loss = 0.1 * torch.abs(imp_fwd - imp_bwd).mean()
return loss
================================================
FILE: lib/nn/layers/spatial_attention.py
================================================
import torch.nn as nn
from einops import rearrange
class SpatialAttention(nn.Module):
def __init__(self, d_in, d_model, nheads, dropout=0.):
super(SpatialAttention, self).__init__()
self.lin_in = nn.Linear(d_in, d_model)
self.self_attn = nn.MultiheadAttention(d_model, nheads, dropout=dropout)
def forward(self, x, att_mask=None, **kwargs):
r"""Pass the input through the encoder layer.
Args:
src: the sequence to the encoder layer (required).
src_mask: the mask for the src sequence (optional).
src_key_padding_mask: the mask for the src keys per batch (optional).
Shape:
see the docs in Transformer class.
"""
b, s, n, f = x.size()
x = rearrange(x, 'b s n f -> n (b s) f')
x = self.lin_in(x)
x = self.self_attn(x, x, x, attn_mask=att_mask)[0]
x = rearrange(x, 'n (b s) f -> b s n f', b=b, s=s)
return x
================================================
FILE: lib/nn/layers/spatial_conv.py
================================================
import torch
from torch import nn
from ... import epsilon
class SpatialConvOrderK(nn.Module):
"""
Spatial convolution of order K with possibly different diffusion matrices (useful for directed graphs)
Efficient implementation inspired from graph-wavenet codebase
"""
def __init__(self, c_in, c_out, support_len=3, order=2, include_self=True):
super(SpatialConvOrderK, self).__init__()
self.include_self = include_self
c_in = (order * support_len + (1 if include_self else 0)) * c_in
self.mlp = nn.Conv2d(c_in, c_out, kernel_size=1)
self.order = order
@staticmethod
def compute_support(adj, device=None):
if device is not None:
adj = adj.to(device)
adj_bwd = adj.T
adj_fwd = adj / (adj.sum(1, keepdims=True) + epsilon)
adj_bwd = adj_bwd / (adj_bwd.sum(1, keepdims=True) + epsilon)
support = [adj_fwd, adj_bwd]
return support
@staticmethod
def compute_support_orderK(adj, k, include_self=False, device=None):
if isinstance(adj, (list, tuple)):
support = adj
else:
support = SpatialConvOrderK.compute_support(adj, device)
supp_k = []
for a in support:
ak = a
for i in range(k - 1):
ak = torch.matmul(ak, a.T)
if not include_self:
ak.fill_diagonal_(0.)
supp_k.append(ak)
return support + supp_k
def forward(self, x, support):
# [batch, features, nodes, steps]
if x.dim() < 4:
squeeze = True
x = torch.unsqueeze(x, -1)
else:
squeeze = False
out = [x] if self.include_self else []
if (type(support) is not list):
support = [support]
for a in support:
x1 = torch.einsum('ncvl,wv->ncwl', (x, a)).contiguous()
out.append(x1)
for k in range(2, self.order + 1):
x2 = torch.einsum('ncvl,wv->ncwl', (x1, a)).contiguous()
out.append(x2)
x1 = x2
out = torch.cat(out, dim=1)
out = self.mlp(out)
if squeeze:
out = out.squeeze(-1)
return out
================================================
FILE: lib/nn/models/__init__.py
================================================
from .grin import GRINet
from .brits import BRITSNet
from .mpgru import MPGRUNet, BiMPGRUNet
from .var import VARImputer
from .rgain import RGAINNet
from .rnn_imputers import BiRNNImputer, RNNImputer
================================================
FILE: lib/nn/models/brits.py
================================================
import torch
from torch import nn
from ..layers import BRITS
class BRITSNet(nn.Module):
def __init__(self,
d_in,
d_hidden=64):
super(BRITSNet, self).__init__()
self.birits = BRITS(input_size=d_in,
hidden_size=d_hidden)
def forward(self, x, mask=None, **kwargs):
# x: [batches, steps, features]
imputations, predictions = self.birits(x, mask=mask)
# predictions: [batch, directions, steps, features] x 3
out = torch.mean(imputations, dim=1) # -> [batch, steps, features]
predictions = torch.cat(predictions, dim=1) # -> [batch, directions * n_predictions, steps, features]
# reshape
imputations = torch.transpose(imputations, 0, 1) # rearrange(imputations, 'b d s f -> d b s f')
predictions = torch.transpose(predictions, 0, 1) # rearrange(predictions, 'b d s f -> d b s f')
return out, imputations, predictions
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--d-in', type=int)
parser.add_argument('--d-hidden', type=int, default=64)
return parser
================================================
FILE: lib/nn/models/grin.py
================================================
import torch
from einops import rearrange
from torch import nn
from ..layers import BiGRIL
from ...utils.parser_utils import str_to_bool
class GRINet(nn.Module):
def __init__(self,
adj,
d_in,
d_hidden,
d_ff,
ff_dropout,
n_layers=1,
kernel_size=2,
decoder_order=1,
global_att=False,
d_u=0,
d_emb=0,
layer_norm=False,
merge='mlp',
impute_only_holes=True):
super(GRINet, self).__init__()
self.d_in = d_in
self.d_hidden = d_hidden
self.d_u = int(d_u) if d_u is not None else 0
self.d_emb = int(d_emb) if d_emb is not None else 0
self.register_buffer('adj', torch.tensor(adj).float())
self.impute_only_holes = impute_only_holes
self.bigrill = BiGRIL(input_size=self.d_in,
ff_size=d_ff,
ff_dropout=ff_dropout,
hidden_size=self.d_hidden,
embedding_size=self.d_emb,
n_nodes=self.adj.shape[0],
n_layers=n_layers,
kernel_size=kernel_size,
decoder_order=decoder_order,
global_att=global_att,
u_size=self.d_u,
layer_norm=layer_norm,
merge=merge)
def forward(self, x, mask=None, u=None, **kwargs):
# x: [batches, steps, nodes, channels] -> [batches, channels, nodes, steps]
x = rearrange(x, 'b s n c -> b c n s')
if mask is not None:
mask = rearrange(mask, 'b s n c -> b c n s')
if u is not None:
u = rearrange(u, 'b s n c -> b c n s')
# imputation: [batches, channels, nodes, steps] prediction: [4, batches, channels, nodes, steps]
imputation, prediction = self.bigrill(x, self.adj, mask=mask, u=u, cached_support=self.training)
# In evaluation stage impute only missing values
if self.impute_only_holes and not self.training:
imputation = torch.where(mask, x, imputation)
# out: [batches, channels, nodes, steps] -> [batches, steps, nodes, channels]
imputation = torch.transpose(imputation, -3, -1)
prediction = torch.transpose(prediction, -3, -1)
if self.training:
return imputation, prediction
return imputation
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--d-hidden', type=int, default=64)
parser.add_argument('--d-ff', type=int, default=64)
parser.add_argument('--ff-dropout', type=int, default=0.)
parser.add_argument('--n-layers', type=int, default=1)
parser.add_argument('--kernel-size', type=int, default=2)
parser.add_argument('--decoder-order', type=int, default=1)
parser.add_argument('--d-u', type=int, default=0)
parser.add_argument('--d-emb', type=int, default=8)
parser.add_argument('--layer-norm', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--global-att', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--merge', type=str, default='mlp')
parser.add_argument('--impute-only-holes', type=str_to_bool, nargs='?', const=True, default=True)
return parser
================================================
FILE: lib/nn/models/mpgru.py
================================================
import torch
from einops import rearrange
from torch import nn
from ..layers import MPGRUImputer, SpatialConvOrderK
from ..utils.ops import reverse_tensor
from ...utils.parser_utils import str_to_bool
class MPGRUNet(nn.Module):
def __init__(self,
adj,
d_in,
d_hidden,
d_ff=0,
d_u=0,
n_layers=1,
dropout=0.,
kernel_size=2,
support_len=2,
layer_norm=False,
impute_only_holes=True):
super(MPGRUNet, self).__init__()
self.register_buffer('adj', torch.tensor(adj).float())
n_nodes = adj.shape[0]
self.gcgru = MPGRUImputer(input_size=d_in,
hidden_size=d_hidden,
ff_size=d_ff,
u_size=d_u,
n_layers=n_layers,
dropout=dropout,
kernel_size=kernel_size,
support_len=support_len,
layer_norm=layer_norm,
n_nodes=n_nodes)
self.impute_only_holes = impute_only_holes
def forward(self, x, mask=None, u=None, h=None):
# x: [batches, steps, nodes, channels] -> [batches, channels, nodes, steps]
x = rearrange(x, 'b s n c -> b c n s')
if mask is not None:
mask = rearrange(mask, 'b s n c -> b c n s')
if u is not None:
u = rearrange(u, 'b s n c -> b c n s')
adj = SpatialConvOrderK.compute_support(self.adj, x.device)
imputation, _ = self.gcgru(x, adj, mask=mask, u=u, h=h)
# In evaluation stage impute only missing values
if self.impute_only_holes and not self.training:
imputation = torch.where(mask, x, imputation)
# out: [batches, channels, nodes, steps] -> [batches, steps, nodes, channels]
imputation = rearrange(imputation, 'b c n s -> b s n c')
return imputation
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--d-hidden', type=int, default=64)
parser.add_argument('--d-ff', type=int, default=64)
parser.add_argument('--n-layers', type=int, default=1)
parser.add_argument('--kernel-size', type=int, default=2)
parser.add_argument('--layer-norm', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--impute-only-holes', type=str_to_bool, nargs='?', const=True, default=True)
parser.add_argument('--dropout', type=float, default=0.)
return parser
class BiMPGRUNet(nn.Module):
def __init__(self,
adj,
d_in,
d_hidden,
d_ff=0,
d_u=0,
n_layers=1,
dropout=0.,
kernel_size=2,
support_len=2,
layer_norm=False,
embedding_size=0,
merge='mlp',
impute_only_holes=True,
autoencoder_mode=False):
super(BiMPGRUNet, self).__init__()
self.register_buffer('adj', torch.tensor(adj).float())
n_nodes = adj.shape[0]
self.gcgru_fwd = MPGRUImputer(input_size=d_in,
hidden_size=d_hidden,
u_size=d_u,
n_layers=n_layers,
dropout=dropout,
kernel_size=kernel_size,
support_len=support_len,
layer_norm=layer_norm,
n_nodes=n_nodes,
autoencoder_mode=autoencoder_mode)
self.gcgru_bwd = MPGRUImputer(input_size=d_in,
hidden_size=d_hidden,
u_size=d_u,
n_layers=n_layers,
dropout=dropout,
kernel_size=kernel_size,
support_len=support_len,
layer_norm=layer_norm,
n_nodes=n_nodes,
autoencoder_mode=autoencoder_mode)
self.impute_only_holes = impute_only_holes
if n_nodes is None:
embedding_size = 0
if embedding_size > 0:
self.emb = nn.Parameter(torch.empty(embedding_size, n_nodes))
nn.init.kaiming_normal_(self.emb, nonlinearity='relu')
else:
self.register_parameter('emb', None)
if merge == 'mlp':
self._impute_from_states = True
self.out = nn.Sequential(
nn.Conv2d(in_channels=2 * d_hidden + d_in + embedding_size,
out_channels=d_ff, kernel_size=1),
nn.ReLU(),
nn.Conv2d(in_channels=d_ff, out_channels=d_in, kernel_size=1)
)
elif merge in ['mean', 'sum', 'min', 'max']:
self._impute_from_states = False
self.out = getattr(torch, merge)
else:
raise ValueError("Merge option %s not allowed." % merge)
def forward(self, x, mask=None, u=None, h=None):
# x: [batches, steps, nodes, channels] -> [batches, channels, nodes, steps]
x = rearrange(x, 'b s n c -> b c n s')
if mask is not None:
mask = rearrange(mask, 'b s n c -> b c n s')
if u is not None:
u = rearrange(u, 'b s n c -> b c n s')
adj = SpatialConvOrderK.compute_support(self.adj, x.device)
# Forward
fwd_pred, fwd_states = self.gcgru_fwd(x, adj, mask=mask, u=u)
# Backward
rev_x, rev_mask, rev_u = [reverse_tensor(tens, axis=-1) for tens in (x, mask, u)]
bwd_res = self.gcgru_bwd(rev_x, adj, mask=rev_mask, u=rev_u)
bwd_pred, bwd_states = [reverse_tensor(res, axis=-1) for res in bwd_res]
if self._impute_from_states:
inputs = [fwd_states[-1], bwd_states[-1], mask] # take only state of last gcgru layer
if self.emb is not None:
b, *_, s = x.shape # fwd_h: [batches, channels, nodes, steps]
inputs += [self.emb.view(1, *self.emb.shape, 1).expand(b, -1, -1, s)] # stack emb for batches and steps
imputation = torch.cat(inputs, dim=1)
imputation = self.out(imputation)
else:
imputation = torch.stack([fwd_pred, bwd_pred], dim=1)
imputation = self.out(imputation, dim=1)
# In evaluation stage impute only missing values
if self.impute_only_holes and not self.training:
imputation = torch.where(mask, x, imputation)
# out: [batches, channels, nodes, steps] -> [batches, steps, nodes, channels]
imputation = rearrange(imputation, 'b c n s -> b s n c')
return imputation
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--d-hidden', type=int, default=64)
parser.add_argument('--d-ff', type=int, default=64)
parser.add_argument('--n-layers', type=int, default=1)
parser.add_argument('--kernel-size', type=int, default=2)
parser.add_argument('--d-emb', type=int, default=8)
parser.add_argument('--layer-norm', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--merge', type=str, default='mlp')
parser.add_argument('--impute-only-holes', type=str_to_bool, nargs='?', const=True, default=True)
parser.add_argument('--dropout', type=float, default=0.)
parser.add_argument('--autoencoder-mode', type=str_to_bool, nargs='?', const=True, default=False)
return parser
================================================
FILE: lib/nn/models/rgain.py
================================================
import torch
from torch import nn
from .rnn_imputers import BiRNNImputer
from ...utils.parser_utils import str_to_bool
class Generator(nn.Module):
def __init__(self, d_in, d_model, d_z, dropout=0., inject_noise=True):
super(Generator, self).__init__()
self.inject_noise = inject_noise
self.d_z = d_z if inject_noise else 0
self.birnn = BiRNNImputer(d_in,
d_model,
d_u=d_z,
concat_mask=True,
detach_inputs=False,
dropout=dropout,
state_init='zero')
def forward(self, x, mask):
if self.inject_noise:
z = torch.rand(x.size(0), x.size(1), self.d_z, device=x.device) * 0.1
else:
z = None
return self.birnn(x, mask, u=z)
class Discriminator(torch.nn.Module):
def __init__(self, d_in, d_model, dropout=0.):
super(Discriminator, self).__init__()
self.birnn = nn.GRU(2 * d_in, d_model, bidirectional=True, batch_first=True)
self.dropout = nn.Dropout(dropout)
self.read_out = nn.Linear(2 * d_model, d_in)
def forward(self, x, h):
x_in = torch.cat([x, h], dim=-1)
out, _ = self.birnn(x_in)
logits = self.read_out(self.dropout(out))
return logits
class RGAINNet(torch.nn.Module):
def __init__(self, d_in, d_model, d_z, dropout=0., inject_noise=False, k=5):
super(RGAINNet, self).__init__()
self.inject_noise = inject_noise
self.k = k
self.generator = Generator(d_in, d_model, d_z=d_z, dropout=dropout, inject_noise=inject_noise)
self.discriminator = Discriminator(d_in, d_model, dropout)
def forward(self, x, mask, **kwargs):
if not self.training and self.inject_noise:
res = []
for _ in range(self.k):
res.append(self.generator(x, mask)[0])
return torch.stack(res, 0).mean(0),
return self.generator(x, mask)
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--d-in', type=int)
parser.add_argument('--d-model', type=int, default=None)
parser.add_argument('--d-z', type=int, default=8)
parser.add_argument('--k', type=int, default=5)
parser.add_argument('--inject-noise', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--dropout', type=float, default=0.)
return parser
================================================
FILE: lib/nn/models/rnn_imputers.py
================================================
import torch
from torch import nn
from ..utils.ops import reverse_tensor
class RNNImputer(nn.Module):
"""Fill the blanks with a 1-step-ahead GRU predictor."""
def __init__(self, d_in, d_model, concat_mask=True, detach_inputs=False, state_init='zero', d_u=0):
super(RNNImputer, self).__init__()
self.concat_mask = concat_mask
self.detach_inputs = detach_inputs
self.state_init = state_init
self.d_model = d_model
self.input_dim = d_in + d_u if not concat_mask else 2 * d_in + d_u
self.rnn_cell = nn.GRUCell(self.input_dim, d_model)
self.read_out = nn.Linear(d_model, d_in)
def init_hidden_state(self, x):
if self.state_init == 'zero':
return torch.zeros((x.size(0), self.d_model), device=x.device, dtype=x.dtype)
if self.state_init == 'noise':
return torch.randn(x.size(0), self.d_model, device=x.device, dtype=x.dtype)
def _preprocess_input(self, x, x_hat, m, u):
if self.detach_inputs:
x_p = torch.where(m, x, x_hat.detach())
else:
x_p = torch.where(m, x, x_hat)
if u is not None:
x_p = torch.cat([x_p, u], -1)
if self.concat_mask:
x_p = torch.cat([x_p, m], -1)
return x_p
def forward(self, x, mask, u=None, return_hidden=False):
# x: [batches, steps, features]
steps = x.size(1)
# ensure masked values are not visible
x = torch.where(mask, x, torch.zeros_like(x))
h = self.init_hidden_state(x)
x_hat = self.read_out(h)
hs = [h]
preds = [x_hat]
for s in range(steps - 1):
u_t = None if u is None else u[:, s]
x_t = self._preprocess_input(x[:, s], x_hat, mask[:, s], u_t)
h = self.rnn_cell(x_t, h)
x_hat = self.read_out(h)
hs.append(h)
preds.append(x_hat)
x_hat = torch.stack(preds, 1)
h = torch.stack(hs, 1)
if return_hidden:
return x_hat, h
return x_hat
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--d-in', type=int)
parser.add_argument('--d-model', type=int, default=None)
return parser
class BiRNNImputer(nn.Module):
"""Fill the blanks with a 1-step-ahead GRU predictor."""
def __init__(self, d_in, d_model, dropout=0., concat_mask=True, detach_inputs=False, state_init='zero', d_u=0):
super(BiRNNImputer, self).__init__()
self.d_model = d_model
self.fwd_rnn = RNNImputer(d_in, d_model, concat_mask, detach_inputs=detach_inputs, state_init=state_init,
d_u=d_u)
self.bwd_rnn = RNNImputer(d_in, d_model, concat_mask, detach_inputs=detach_inputs, state_init=state_init,
d_u=d_u)
self.dropout = nn.Dropout(dropout)
self.read_out = nn.Linear(2 * d_model, d_in)
def forward(self, x, mask, u=None, return_hidden=False):
# x: [batches, steps, features]
x_hat_fwd, h_fwd = self.fwd_rnn(x, mask, u=u, return_hidden=True)
x_hat_bwd, h_bwd = self.bwd_rnn(reverse_tensor(x, 1),
reverse_tensor(mask, 1),
u=reverse_tensor(u, 1) if u is not None else None,
return_hidden=True)
x_hat_bwd = reverse_tensor(x_hat_bwd, 1)
h_bwd = reverse_tensor(h_bwd, 1)
h = self.dropout(torch.cat([h_fwd, h_bwd], -1))
x_hat = self.read_out(h)
if return_hidden:
return (x_hat, x_hat_fwd, x_hat_bwd), h
return x_hat, x_hat_fwd, x_hat_bwd
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--d-in', type=int)
parser.add_argument('--d-model', type=int, default=None)
parser.add_argument('--dropout', type=float, default=0.)
return parser
================================================
FILE: lib/nn/models/var.py
================================================
import torch
from einops import rearrange
from torch import nn
from lib import epsilon
class VAR(nn.Module):
def __init__(self, order, d_in, d_out=None, steps_ahead=1, bias=True):
super(VAR, self).__init__()
self.order = order
self.d_in = d_in
self.d_out = d_out if d_out is not None else d_in
self.steps_ahead = steps_ahead
self.lin = nn.Linear(order * d_in, steps_ahead * self.d_out, bias=bias)
def forward(self, x):
# x: [batches, steps, features]
x = rearrange(x, 'b s f -> b (s f)')
out = self.lin(x)
out = rearrange(out, 'b (s f) -> b s f', s=self.steps_ahead, f=self.d_out)
return out
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--order', type=int)
parser.add_argument('--d-in', type=int)
parser.add_argument('--d-out', type=int, default=None)
parser.add_argument('--steps-ahead', type=int, default=1)
return parser
class VARImputer(nn.Module):
"""Fill the blanks with a 1-step-ahead VAR predictor."""
def __init__(self, order, d_in, padding='mean'):
super(VARImputer, self).__init__()
assert padding in ['mean', 'zero']
self.order = order
self.padding = padding
self.predictor = VAR(order, d_in, d_out=d_in, steps_ahead=1)
def forward(self, x, mask=None):
# x: [batches, steps, features]
batch_size, steps, n_feats = x.shape
if mask is None:
mask = torch.ones_like(x, dtype=torch.uint8)
x = x * mask
# pad input sequence to start filling from first step
if self.padding == 'mean':
mean = torch.sum(x, 1) / (torch.sum(mask, 1) + epsilon)
pad = torch.repeat_interleave(mean.unsqueeze(1), self.order, 1)
elif self.padding == 'zero':
pad = torch.zeros((batch_size, self.order, n_feats)).to(x.device)
x = torch.cat([pad, x], 1)
# x: [batch, order + steps, features]
x = [x[:, i] for i in range(x.shape[1])]
for s in range(steps):
x_hat = self.predictor(torch.stack(x[s:s + self.order], 1))
x_hat = x_hat[:, 0]
x[s + self.order] = torch.where(mask[:, s], x[s + self.order], x_hat)
x = torch.stack(x[self.order:], 1) # remove padding
return x
@staticmethod
def add_model_specific_args(parser):
parser.add_argument('--order', type=int)
parser.add_argument('--d-in', type=int)
parser.add_argument("--padding", type=str, default='mean')
return parser
================================================
FILE: lib/nn/utils/__init__.py
================================================
================================================
FILE: lib/nn/utils/metric_base.py
================================================
from functools import partial
import torch
from pytorch_lightning.metrics import Metric
from torchmetrics.utilities.checks import _check_same_shape
class MaskedMetric(Metric):
def __init__(self,
metric_fn,
mask_nans=False,
mask_inf=False,
compute_on_step=True,
dist_sync_on_step=False,
process_group=None,
dist_sync_fn=None,
metric_kwargs=None,
at=None):
super(MaskedMetric, self).__init__(compute_on_step=compute_on_step,
dist_sync_on_step=dist_sync_on_step,
process_group=process_group,
dist_sync_fn=dist_sync_fn)
if metric_kwargs is None:
metric_kwargs = dict()
self.metric_fn = partial(metric_fn, **metric_kwargs)
self.mask_nans = mask_nans
self.mask_inf = mask_inf
if at is None:
self.at = slice(None)
else:
self.at = slice(at, at + 1)
self.add_state('value', dist_reduce_fx='sum', default=torch.tensor(0.).float())
self.add_state('numel', dist_reduce_fx='sum', default=torch.tensor(0))
def _check_mask(self, mask, val):
if mask is None:
mask = torch.ones_like(val).byte()
else:
_check_same_shape(mask, val)
if self.mask_nans:
mask = mask * ~torch.isnan(val)
if self.mask_inf:
mask = mask * ~torch.isinf(val)
return mask
def _compute_masked(self, y_hat, y, mask):
_check_same_shape(y_hat, y)
val = self.metric_fn(y_hat, y)
mask = self._check_mask(mask, val)
val = torch.where(mask, val, torch.tensor(0., device=val.device).float())
return val.sum(), mask.sum()
def _compute_std(self, y_hat, y):
_check_same_shape(y_hat, y)
val = self.metric_fn(y_hat, y)
return val.sum(), val.numel()
def is_masked(self, mask):
return self.mask_inf or self.mask_nans or (mask is not None)
def update(self, y_hat, y, mask=None):
y_hat = y_hat[:, self.at]
y = y[:, self.at]
if mask is not None:
mask = mask[:, self.at]
if self.is_masked(mask):
val, numel = self._compute_masked(y_hat, y, mask)
else:
val, numel = self._compute_std(y_hat, y)
self.value += val
self.numel += numel
def compute(self):
if self.numel > 0:
return self.value / self.numel
return self.value
================================================
FILE: lib/nn/utils/metrics.py
================================================
from .metric_base import MaskedMetric
from .ops import mape
from torch.nn import functional as F
import torch
from torchmetrics.utilities.checks import _check_same_shape
from ... import epsilon
class MaskedMAE(MaskedMetric):
def __init__(self,
mask_nans=False,
mask_inf=False,
compute_on_step=True,
dist_sync_on_step=False,
process_group=None,
dist_sync_fn=None,
at=None):
super(MaskedMAE, self).__init__(metric_fn=F.l1_loss,
mask_nans=mask_nans,
mask_inf=mask_inf,
compute_on_step=compute_on_step,
dist_sync_on_step=dist_sync_on_step,
process_group=process_group,
dist_sync_fn=dist_sync_fn,
metric_kwargs={'reduction': 'none'},
at=at)
class MaskedMAPE(MaskedMetric):
def __init__(self,
mask_nans=False,
compute_on_step=True,
dist_sync_on_step=False,
process_group=None,
dist_sync_fn=None,
at=None):
super(MaskedMAPE, self).__init__(metric_fn=mape,
mask_nans=mask_nans,
mask_inf=True,
compute_on_step=compute_on_step,
dist_sync_on_step=dist_sync_on_step,
process_group=process_group,
dist_sync_fn=dist_sync_fn,
at=at)
class MaskedMSE(MaskedMetric):
def __init__(self,
mask_nans=False,
compute_on_step=True,
dist_sync_on_step=False,
process_group=None,
dist_sync_fn=None,
at=None):
super(MaskedMSE, self).__init__(metric_fn=F.mse_loss,
mask_nans=mask_nans,
mask_inf=True,
compute_on_step=compute_on_step,
dist_sync_on_step=dist_sync_on_step,
process_group=process_group,
dist_sync_fn=dist_sync_fn,
metric_kwargs={'reduction': 'none'},
at=at)
class MaskedMRE(MaskedMetric):
def __init__(self,
mask_nans=False,
mask_inf=False,
compute_on_step=True,
dist_sync_on_step=False,
process_group=None,
dist_sync_fn=None,
at=None):
super(MaskedMRE, self).__init__(metric_fn=F.l1_loss,
mask_nans=mask_nans,
mask_inf=mask_inf,
compute_on_step=compute_on_step,
dist_sync_on_step=dist_sync_on_step,
process_group=process_group,
dist_sync_fn=dist_sync_fn,
metric_kwargs={'reduction': 'none'},
at=at)
self.add_state('tot', dist_reduce_fx='sum', default=torch.tensor(0., dtype=torch.float))
def _compute_masked(self, y_hat, y, mask):
_check_same_shape(y_hat, y)
val = self.metric_fn(y_hat, y)
mask = self._check_mask(mask, val)
val = torch.where(mask, val, torch.tensor(0., device=y.device, dtype=torch.float))
y_masked = torch.where(mask, y, torch.tensor(0., device=y.device, dtype=torch.float))
return val.sum(), mask.sum(), y_masked.sum()
def _compute_std(self, y_hat, y):
_check_same_shape(y_hat, y)
val = self.metric_fn(y_hat, y)
return val.sum(), val.numel(), y.sum()
def compute(self):
if self.tot > epsilon:
return self.value / self.tot
return self.value
def update(self, y_hat, y, mask=None):
y_hat = y_hat[:, self.at]
y = y[:, self.at]
if mask is not None:
mask = mask[:, self.at]
if self.is_masked(mask):
val, numel, tot = self._compute_masked(y_hat, y, mask)
else:
val, numel, tot = self._compute_std(y_hat, y)
self.value += val
self.numel += numel
self.tot += tot
================================================
FILE: lib/nn/utils/ops.py
================================================
import torch
import torch.nn.functional as F
from einops import reduce
from torch.autograd import Variable
from ... import epsilon
def mae(y_hat, y, reduction='none'):
return F.l1_loss(y_hat, y, reduction=reduction)
def mape(y_hat, y):
return torch.abs((y_hat - y) / y)
def wape_loss(y_hat, y):
l = torch.abs(y_hat - y)
return l.sum() / (y.sum() + epsilon)
def smape_loss(y_hat, y):
c = torch.abs(y) > epsilon
l_minus = torch.abs(y_hat - y)
l_plus = torch.abs(y_hat + y) + epsilon
l = 2 * l_minus / l_plus * c.float()
return l.sum() / c.sum()
def peak_prediction_loss(y_hat, y, reduction='none'):
y_max = reduce(y, 'b s n 1 -> b 1 n 1', 'max')
y_min = reduce(y, 'b s n 1 -> b 1 n 1', 'min')
target = torch.cat([y_max, y_min], dim=1)
return F.mse_loss(y_hat, target, reduction=reduction)
def wrap_loss_fn(base_loss):
def loss_fn(y_hat, y_true, mask=None):
scaling = 1.
if mask is not None:
try:
loss = base_loss(y_hat, y_true, reduction='none')
except TypeError:
loss = base_loss(y_hat, y_true)
loss = loss * mask
loss = loss.sum() / (mask.sum() + epsilon)
# scaling = mask.sum() / torch.numel(mask)
else:
loss = base_loss(y_hat, y_true).mean()
return scaling * loss
return loss_fn
def rbf_sim(x, gamma, device='cpu'):
n = x.size()[0]
a = torch.exp(-gamma * F.pdist(x, 2) ** 2)
row_idx, col_idx = torch.triu_indices(n, n, 1)
A = 0.5 * torch.eye(n, n).to(device)
A[row_idx, col_idx] = a
return A + A.T
def reverse_tensor(tensor=None, axis=-1):
if tensor is None:
return None
if tensor.dim() <= 1:
return tensor
indices = range(tensor.size()[axis])[::-1]
indices = Variable(torch.LongTensor(indices), requires_grad=False).to(tensor.device)
return tensor.index_select(axis, indices)
================================================
FILE: lib/utils/__init__.py
================================================
from .utils import *
================================================
FILE: lib/utils/numpy_metrics.py
================================================
import numpy as np
def mae(y_hat, y):
return np.abs(y_hat - y).mean()
def nmae(y_hat, y):
delta = np.max(y) - np.min(y) + 1e-8
return mae(y_hat, y) * 100 / delta
def mape(y_hat, y):
return 100 * np.abs((y_hat - y) / (y + 1e-8)).mean()
def mse(y_hat, y):
return np.square(y_hat - y).mean()
def rmse(y_hat, y):
return np.sqrt(mse(y_hat, y))
def nrmse(y_hat, y):
delta = np.max(y) - np.min(y) + 1e-8
return rmse(y_hat, y) * 100 / delta
def nrmse_2(y_hat, y):
nrmse_ = np.sqrt(np.square(y_hat - y).sum() / np.square(y).sum())
return nrmse_ * 100
def r2(y_hat, y):
return 1. - np.square(y_hat - y).sum() / (np.square(y.mean(0) - y).sum())
def masked_mae(y_hat, y, mask):
err = np.abs(y_hat - y) * mask
return err.sum() / mask.sum()
def masked_mape(y_hat, y, mask):
err = np.abs((y_hat - y) / (y + 1e-8)) * mask
return err.sum() / mask.sum()
def masked_mse(y_hat, y, mask):
err = np.square(y_hat - y) * mask
return err.sum() / mask.sum()
def masked_rmse(y_hat, y, mask):
err = np.square(y_hat - y) * mask
return np.sqrt(err.sum() / mask.sum())
def masked_mre(y_hat, y, mask):
err = np.abs(y_hat - y) * mask
return err.sum() / ((y * mask).sum() + 1e-8)
================================================
FILE: lib/utils/parser_utils.py
================================================
import inspect
from argparse import Namespace, ArgumentParser
from typing import Union
def str_to_bool(value):
if isinstance(value, bool):
return value
if value.lower() in {'false', 'f', '0', 'no', 'n', 'off'}:
return False
elif value.lower() in {'true', 't', '1', 'yes', 'y', 'on'}:
return True
raise ValueError(f'{value} is not a valid boolean value')
def config_dict_from_args(args):
"""
Extract a dictionary with the experiment configuration from arguments (necessary to filter TestTube arguments)
:param args: TTNamespace
:return: hyparams dict
"""
keys_to_remove = {'hpc_exp_number', 'trials', 'optimize_parallel', 'optimize_parallel_gpu',
'optimize_parallel_cpu', 'generate_trials', 'optimize_trials_parallel_gpu'}
hparams = {key: v for key, v in args.__dict__.items() if key not in keys_to_remove}
return hparams
def update_from_config(args: Namespace, config: dict):
assert set(config.keys()) <= set(vars(args)), f'{set(config.keys()).difference(vars(args))} not in args.'
args.__dict__.update(config)
return args
def parse_by_group(parser):
"""
Create a nested namespace using the groups defined in the argument parser.
Adapted from https://stackoverflow.com/a/56631542/6524027
:param args: arguments
:param parser: the parser
:return:
"""
assert isinstance(parser, ArgumentParser)
args = parser.parse_args()
# the first two argument groups are 'positional_arguments' and 'optional_arguments'
pos_group, optional_group = parser._action_groups[0], parser._action_groups[1]
args_dict = args._get_kwargs()
pos_optional_arg_names = [arg.dest for arg in pos_group._group_actions] + [arg.dest for arg in
optional_group._group_actions]
pos_optional_args = {name: value for name, value in args_dict if name in pos_optional_arg_names}
other_group_args = dict()
# If there are additional argument groups, add them as nested namespaces
if len(parser._action_groups) > 2:
for group in parser._action_groups[2:]:
group_arg_names = [arg.dest for arg in group._group_actions]
other_group_args[group.title] = Namespace(
**{name: value for name, value in args_dict if name in group_arg_names})
# combine the positiona/optional args and the group args
combined_args = pos_optional_args
combined_args.update(other_group_args)
return Namespace(flat=args, **combined_args)
def filter_args(args: Union[Namespace, dict], target_cls, return_dict=False):
argspec = inspect.getfullargspec(target_cls.__init__)
target_args = argspec.args
if isinstance(args, Namespace):
args = vars(args)
filtered_args = {k: args[k] for k in target_args if k in args}
if return_dict:
return filtered_args
return Namespace(**filtered_args)
def filter_function_args(args: Union[Namespace, dict], function, return_dict=False):
argspec = inspect.getfullargspec(function)
target_args = argspec.args
if isinstance(args, Namespace):
args = vars(args)
filtered_args = {k: args[k] for k in target_args if k in args}
if return_dict:
return filtered_args
return Namespace(**filtered_args)
================================================
FILE: lib/utils/utils.py
================================================
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import haversine_distances
def sample_mask(shape, p=0.002, p_noise=0., max_seq=1, min_seq=1, rng=None):
if rng is None:
rand = np.random.random
randint = np.random.randint
else:
rand = rng.random
randint = rng.integers
mask = rand(shape) < p
for col in range(mask.shape[1]):
idxs = np.flatnonzero(mask[:, col])
if not len(idxs):
continue
fault_len = min_seq
if max_seq > min_seq:
fault_len = fault_len + int(randint(max_seq - min_seq))
idxs_ext = np.concatenate([np.arange(i, i + fault_len) for i in idxs])
idxs = np.unique(idxs_ext)
idxs = np.clip(idxs, 0, shape[0] - 1)
mask[idxs, col] = True
mask = mask | (rand(mask.shape) < p_noise)
return mask.astype('uint8')
def compute_mean(x, index=None):
"""Compute the mean values for each datetime. The mean is first computed hourly over the week of the year.
Further NaN values are computed using hourly mean over the same month through the years. If other NaN are present,
they are removed using the mean of the sole hours. Hoping reasonably that there is at least a non-NaN entry of the
same hour of the NaN datetime in all the dataset."""
if isinstance(x, np.ndarray) and index is not None:
shape = x.shape
x = x.reshape((shape[0], -1))
df_mean = pd.DataFrame(x, index=index)
else:
df_mean = x.copy()
cond0 = [df_mean.index.year, df_mean.index.isocalendar().week, df_mean.index.hour]
cond1 = [df_mean.index.year, df_mean.index.month, df_mean.index.hour]
conditions = [cond0, cond1, cond1[1:], cond1[2:]]
while df_mean.isna().values.sum() and len(conditions):
nan_mean = df_mean.groupby(conditions[0]).transform(np.nanmean)
df_mean = df_mean.fillna(nan_mean)
conditions = conditions[1:]
if df_mean.isna().values.sum():
df_mean = df_mean.fillna(method='ffill')
df_mean = df_mean.fillna(method='bfill')
if isinstance(x, np.ndarray):
df_mean = df_mean.values.reshape(shape)
return df_mean
def geographical_distance(x=None, to_rad=True):
"""
Compute the as-the-crow-flies distance between every pair of samples in `x`. The first dimension of each point is
assumed to be the latitude, the second is the longitude. The inputs is assumed to be in degrees. If it is not the
case, `to_rad` must be set to False. The dimension of the data must be 2.
Parameters
----------
x : pd.DataFrame or np.ndarray
array_like structure of shape (n_samples_2, 2).
to_rad : bool
whether to convert inputs to radians (provided that they are in degrees).
Returns
-------
distances :
The distance between the points in kilometers.
"""
_AVG_EARTH_RADIUS_KM = 6371.0088
# Extract values of X if it is a DataFrame, else assume it is 2-dim array of lat-lon pairs
latlon_pairs = x.values if isinstance(x, pd.DataFrame) else x
# If the input values are in degrees, convert them in radians
if to_rad:
latlon_pairs = np.vectorize(np.radians)(latlon_pairs)
distances = haversine_distances(latlon_pairs) * _AVG_EARTH_RADIUS_KM
# Cast response
if isinstance(x, pd.DataFrame):
res = pd.DataFrame(distances, x.index, x.index)
else:
res = distances
return res
def infer_mask(df, infer_from='next'):
"""Infer evaluation mask from DataFrame. In the evaluation mask a value is 1 if it is present in the DataFrame and
absent in the `infer_from` month.
@param pd.DataFrame df: the DataFrame.
@param str infer_from: denotes from which month the evaluation value must be inferred.
Can be either `previous` or `next`.
@return: pd.DataFrame eval_mask: the evaluation mask for the DataFrame
"""
mask = (~df.isna()).astype('uint8')
eval_mask = pd.DataFrame(index=mask.index, columns=mask.columns, data=0).astype('uint8')
if infer_from == 'previous':
offset = -1
elif infer_from == 'next':
offset = 1
else:
raise ValueError('infer_from can only be one of %s' % ['previous', 'next'])
months = sorted(set(zip(mask.index.year, mask.index.month)))
length = len(months)
for i in range(length):
j = (i + offset) % length
year_i, month_i = months[i]
year_j, month_j = months[j]
mask_j = mask[(mask.index.year == year_j) & (mask.index.month == month_j)]
mask_i = mask_j.shift(1, pd.DateOffset(months=12 * (year_i - year_j) + (month_i - month_j)))
mask_i = mask_i[~mask_i.index.duplicated(keep='first')]
mask_i = mask_i[np.in1d(mask_i.index, mask.index)]
eval_mask.loc[mask_i.index] = ~mask_i.loc[mask_i.index] & mask.loc[mask_i.index]
return eval_mask
def prediction_dataframe(y, index, columns=None, aggregate_by='mean'):
"""Aggregate batched predictions in a single DataFrame.
@param (list or np.ndarray) y: the list of predictions.
@param (list or np.ndarray) index: the list of time indexes coupled with the predictions.
@param (list or pd.Index) columns: the columns of the returned DataFrame.
@param (str or list) aggregate_by: how to aggregate the predictions in case there are more than one for a step.
- `mean`: take the mean of the predictions
- `central`: take the prediction at the central position, assuming that the predictions are ordered chronologically
- `smooth_central`: average the predictions weighted by a gaussian signal with std=1
- `last`: take the last prediction
@return: pd.DataFrame df: the evaluation mask for the DataFrame
"""
dfs = [pd.DataFrame(data=data.reshape(data.shape[:2]), index=idx, columns=columns) for data, idx in zip(y, index)]
df = pd.concat(dfs)
preds_by_step = df.groupby(df.index)
# aggregate according passed methods
aggr_methods = ensure_list(aggregate_by)
dfs = []
for aggr_by in aggr_methods:
if aggr_by == 'mean':
dfs.append(preds_by_step.mean())
elif aggr_by == 'central':
dfs.append(preds_by_step.aggregate(lambda x: x[int(len(x) // 2)]))
elif aggr_by == 'smooth_central':
from scipy.signal import gaussian
dfs.append(preds_by_step.aggregate(lambda x: np.average(x, weights=gaussian(len(x), 1))))
elif aggr_by == 'last':
dfs.append(preds_by_step.aggregate(lambda x: x[0])) # first imputation has missing value in last position
else:
raise ValueError('aggregate_by can only be one of %s' % ['mean', 'central' 'smooth_central', 'last'])
if isinstance(aggregate_by, str):
return dfs[0]
return dfs
def ensure_list(obj):
if isinstance(obj, (list, tuple)):
return list(obj)
else:
return [obj]
def missing_val_lens(mask):
m = np.concatenate([np.zeros((1, mask.shape[1])),
(~mask.astype('bool')).astype('int'),
np.zeros((1, mask.shape[1]))])
mdiff = np.diff(m, axis=0)
lens = []
for c in range(m.shape[1]):
mj, = mdiff[:, c].nonzero()
diff = np.diff(mj)[::2]
lens.extend(list(diff))
return lens
def disjoint_months(dataset, months=None, synch_mode='window'):
idxs = np.arange(len(dataset))
months = ensure_list(months)
# divide indices according to window or horizon
if synch_mode == 'window':
start, end = 0, dataset.window - 1
elif synch_mode == 'horizon':
start, end = dataset.horizon_offset, dataset.horizon_offset + dataset.horizon - 1
else:
raise ValueError('synch_mode can only be one of %s' % ['window', 'horizon'])
# after idxs
start_in_months = np.in1d(dataset.index[dataset._indices + start].month, months)
end_in_months = np.in1d(dataset.index[dataset._indices + end].month, months)
idxs_in_months = start_in_months & end_in_months
after_idxs = idxs[idxs_in_months]
# previous idxs
months = np.setdiff1d(np.arange(1, 13), months)
start_in_months = np.in1d(dataset.index[dataset._indices + start].month, months)
end_in_months = np.in1d(dataset.index[dataset._indices + end].month, months)
idxs_in_months = start_in_months & end_in_months
prev_idxs = idxs[idxs_in_months]
return prev_idxs, after_idxs
def thresholded_gaussian_kernel(x, theta=None, threshold=None, threshold_on_input=False):
if theta is None:
theta = np.std(x)
weights = np.exp(-np.square(x / theta))
if threshold is not None:
mask = x > threshold if threshold_on_input else weights < threshold
weights[mask] = 0.
return weights
================================================
FILE: requirements.txt
================================================
einops
fancyimpute==0.6
h5py
openpyxl
numpy
pandas
pytorch-lightning==1.4
pyyaml
scikit-learn
scipy
tables
tensorboard
tensorflow==2.5.0
tensorflow-gpu==2.4.0
torch==1.8
torchvision
torchaudio
torchmetrics==0.5
================================================
FILE: scripts/run_baselines.py
================================================
from argparse import ArgumentParser
import numpy as np
from fancyimpute import MatrixFactorization, IterativeImputer
from sklearn.neighbors import kneighbors_graph
from lib import datasets
from lib.utils import numpy_metrics
from lib.utils.parser_utils import str_to_bool
metrics = {
'mae': numpy_metrics.masked_mae,
'mse': numpy_metrics.masked_mse,
'mre': numpy_metrics.masked_mre,
'mape': numpy_metrics.masked_mape
}
def parse_args():
parser = ArgumentParser()
# experiment setting
parser.add_argument('--datasets', nargs='+', type=str, default=['all'])
parser.add_argument('--imputers', nargs='+', type=str, default=['all'])
parser.add_argument('--n-runs', type=int, default=5)
parser.add_argument('--in-sample', type=str_to_bool, nargs='?', const=True, default=True)
# SpatialKNNImputer params
parser.add_argument('--k', type=int, default=10)
# MFImputer params
parser.add_argument('--rank', type=int, default=10)
# MICEImputer params
parser.add_argument('--mice-iterations', type=int, default=100)
parser.add_argument('--mice-n-features', type=int, default=None)
args = parser.parse_args()
# parse dataset
if args.datasets[0] == 'all':
args.datasets = ['air36', 'air', 'bay', 'irish', 'la', 'bay_noise', 'irish_noise', 'la_noise']
# parse imputers
if args.imputers[0] == 'all':
args.imputers = ['mean', 'knn', 'mf', 'mice']
if not args.in_sample:
args.imputers = [name for name in args.imputers if name in ['mean', 'mice']]
return args
class Imputer:
short_name: str
def __init__(self, method=None, is_deterministic=True, in_sample=True):
self.name = self.__class__.__name__
self.method = method
self.is_deterministic = is_deterministic
self.in_sample = in_sample
def fit(self, x, mask):
if not self.in_sample:
x_hat = np.where(mask, x, np.nan)
return self.method.fit(x_hat)
def predict(self, x, mask):
x_hat = np.where(mask, x, np.nan)
if self.in_sample:
return self.method.fit_transform(x_hat)
else:
return self.method.transform(x_hat)
def params(self):
return dict()
class SpatialKNNImputer(Imputer):
short_name = 'knn'
def __init__(self, adj, k=20):
super(SpatialKNNImputer, self).__init__()
self.k = k
# normalize sim between [0, 1]
sim = (adj + adj.min()) / (adj.max() + adj.min())
knns = kneighbors_graph(1 - sim,
n_neighbors=self.k,
include_self=False,
metric='precomputed').toarray()
self.knns = knns
def fit(self, x, mask):
pass
def predict(self, x, mask):
x = np.where(mask, x, 0)
with np.errstate(divide='ignore', invalid='ignore'):
y_hat = (x @ self.knns.T) / (mask @ self.knns.T)
y_hat[~np.isfinite(y_hat)] = x.mean()
return np.where(mask, x, y_hat)
def params(self):
return dict(k=self.k)
class MeanImputer(Imputer):
short_name = 'mean'
def fit(self, x, mask):
d = np.where(mask, x, np.nan)
self.means = np.nanmean(d, axis=0, keepdims=True)
def predict(self, x, mask):
if self.in_sample:
d = np.where(mask, x, np.nan)
means = np.nanmean(d, axis=0, keepdims=True)
else:
means = self.means
return np.where(mask, x, means)
class MatrixFactorizationImputer(Imputer):
short_name = 'mf'
def __init__(self, rank=10, loss='mae', verbose=0):
method = MatrixFactorization(rank=rank, loss=loss, verbose=verbose)
super(MatrixFactorizationImputer, self).__init__(method, is_deterministic=False, in_sample=True)
def params(self):
return dict(rank=self.method.rank)
class MICEImputer(Imputer):
short_name = 'mice'
def __init__(self, max_iter=100, n_nearest_features=None, in_sample=True, verbose=False):
method = IterativeImputer(max_iter=max_iter, n_nearest_features=n_nearest_features, verbose=verbose)
is_deterministic = n_nearest_features is None
super(MICEImputer, self).__init__(method, is_deterministic=is_deterministic, in_sample=in_sample)
def params(self):
return dict(max_iter=self.method.max_iter, k=self.method.n_nearest_features or -1)
def get_dataset(dataset_name):
if dataset_name[:3] == 'air':
dataset = datasets.AirQuality(impute_nans=True, small=dataset_name[3:] == '36')
elif dataset_name == 'bay':
dataset = datasets.MissingValuesPemsBay()
elif dataset_name == 'la':
dataset = datasets.MissingValuesMetrLA()
elif dataset_name == 'la_noise':
dataset = datasets.MissingValuesMetrLA(p_fault=0., p_noise=0.25)
elif dataset_name == 'bay_noise':
dataset = datasets.MissingValuesPemsBay(p_fault=0., p_noise=0.25)
else:
raise ValueError(f"Dataset {dataset_name} not available in this setting.")
# split in train/test
if isinstance(dataset, datasets.AirQuality):
test_slice = np.in1d(dataset.df.index.month, dataset.test_months)
train_slice = ~test_slice
else:
train_slice = np.zeros(len(dataset)).astype(bool)
train_slice[:-int(0.2 * len(dataset))] = True
# integrate back eval values in dataset
dataset.eval_mask[train_slice] = 0
return dataset, train_slice
def get_imputer(imputer_name, args):
if imputer_name == 'mean':
imputer = MeanImputer(in_sample=args.in_sample)
elif imputer_name == 'knn':
imputer = SpatialKNNImputer(adj=args.adj, k=args.k)
elif imputer_name == 'mf':
imputer = MatrixFactorizationImputer(rank=args.rank)
elif imputer_name == 'mice':
imputer = MICEImputer(max_iter=args.mice_iterations,
n_nearest_features=args.mice_n_features,
in_sample=args.in_sample)
else:
raise ValueError(f"Imputer {imputer_name} not available in this setting.")
return imputer
def run(imputer, dataset, train_slice):
test_slice = ~train_slice
if args.in_sample:
x_train, mask_train = dataset.numpy(), dataset.training_mask
y_hat = imputer.predict(x_train, mask_train)[test_slice]
else:
x_train, mask_train = dataset.numpy()[train_slice], dataset.training_mask[train_slice]
imputer.fit(x_train, mask_train)
x_test, mask_test = dataset.numpy()[test_slice], dataset.training_mask[test_slice]
y_hat = imputer.predict(x_test, mask_test)
# Evaluate model
y_true = dataset.numpy()[test_slice]
eval_mask = dataset.eval_mask[test_slice]
for metric, metric_fn in metrics.items():
error = metric_fn(y_hat, y_true, eval_mask)
print(f'{imputer.name} on {ds_name} {metric}: {error:.4f}')
if __name__ == '__main__':
args = parse_args()
print(args.__dict__)
for ds_name in args.datasets:
dataset, train_slice = get_dataset(ds_name)
args.adj = dataset.get_similarity(thr=0.1)
# Instantiate imputers
imputers = [get_imputer(name, args) for name in args.imputers]
for imputer in imputers:
n_runs = 1 if imputer.is_deterministic else args.n_runs
for _ in range(n_runs):
run(imputer, dataset, train_slice)
================================================
FILE: scripts/run_imputation.py
================================================
import copy
import datetime
import os
import pathlib
from argparse import ArgumentParser
import numpy as np
import pytorch_lightning as pl
import torch
import torch.nn.functional as F
import yaml
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint
from pytorch_lightning.loggers import TensorBoardLogger
from torch.optim.lr_scheduler import CosineAnnealingLR
from lib import fillers, datasets, config
from lib.data.datamodule import SpatioTemporalDataModule
from lib.data.imputation_dataset import ImputationDataset, GraphImputationDataset
from lib.nn import models
from lib.nn.utils.metric_base import MaskedMetric
from lib.nn.utils.metrics import MaskedMAE, MaskedMAPE, MaskedMSE, MaskedMRE
from lib.utils import parser_utils, numpy_metrics, ensure_list, prediction_dataframe
from lib.utils.parser_utils import str_to_bool
def has_graph_support(model_cls):
return model_cls in [models.GRINet, models.MPGRUNet, models.BiMPGRUNet]
def get_model_classes(model_str):
if model_str == 'brits':
model, filler = models.BRITSNet, fillers.BRITSFiller
elif model_str == 'grin':
model, filler = models.GRINet, fillers.GraphFiller
elif model_str == 'mpgru':
model, filler = models.MPGRUNet, fillers.GraphFiller
elif model_str == 'bimpgru':
model, filler = models.BiMPGRUNet, fillers.GraphFiller
elif model_str == 'var':
model, filler = models.VARImputer, fillers.Filler
elif model_str == 'gain':
model, filler = models.RGAINNet, fillers.RGAINFiller
elif model_str == 'birnn':
model, filler = models.BiRNNImputer, fillers.MultiImputationFiller
elif model_str == 'rnn':
model, filler = models.RNNImputer, fillers.Filler
else:
raise ValueError(f'Model {model_str} not available.')
return model, filler
def get_dataset(dataset_name):
if dataset_name[:3] == 'air':
dataset = datasets.AirQuality(impute_nans=True, small=dataset_name[3:] == '36')
elif dataset_name == 'bay_block':
dataset = datasets.MissingValuesPemsBay()
elif dataset_name == 'la_block':
dataset = datasets.MissingValuesMetrLA()
elif dataset_name == 'la_point':
dataset = datasets.MissingValuesMetrLA(p_fault=0., p_noise=0.25)
elif dataset_name == 'bay_point':
dataset = datasets.MissingValuesPemsBay(p_fault=0., p_noise=0.25)
else:
raise ValueError(f"Dataset {dataset_name} not available in this setting.")
return dataset
def parse_args():
# Argument parser
parser = ArgumentParser()
parser.add_argument('--seed', type=int, default=-1)
parser.add_argument("--model-name", type=str, default='brits')
parser.add_argument("--dataset-name", type=str, default='air36')
parser.add_argument("--config", type=str, default=None)
# Splitting/aggregation params
parser.add_argument('--in-sample', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--val-len', type=float, default=0.1)
parser.add_argument('--test-len', type=float, default=0.2)
parser.add_argument('--aggregate-by', type=str, default='mean')
# Training params
parser.add_argument('--lr', type=float, default=0.001)
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--patience', type=int, default=40)
parser.add_argument('--l2-reg', type=float, default=0.)
parser.add_argument('--scaled-target', type=str_to_bool, nargs='?', const=True, default=True)
parser.add_argument('--grad-clip-val', type=float, default=5.)
parser.add_argument('--grad-clip-algorithm', type=str, default='norm')
parser.add_argument('--loss-fn', type=str, default='l1_loss')
parser.add_argument('--use-lr-schedule', type=str_to_bool, nargs='?', const=True, default=True)
parser.add_argument('--consistency-loss', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--whiten-prob', type=float, default=0.05)
parser.add_argument('--pred-loss-weight', type=float, default=1.0)
parser.add_argument('--warm-up', type=int, default=0)
# graph params
parser.add_argument("--adj-threshold", type=float, default=0.1)
# gain hparams
parser.add_argument('--alpha', type=float, default=10.)
parser.add_argument('--hint-rate', type=float, default=0.7)
parser.add_argument('--g-train-freq', type=int, default=1)
parser.add_argument('--d-train-freq', type=int, default=5)
known_args, _ = parser.parse_known_args()
model_cls, _ = get_model_classes(known_args.model_name)
parser = model_cls.add_model_specific_args(parser)
parser = SpatioTemporalDataModule.add_argparse_args(parser)
parser = ImputationDataset.add_argparse_args(parser)
args = parser.parse_args()
if args.config is not None:
with open(args.config, 'r') as fp:
config_args = yaml.load(fp, Loader=yaml.FullLoader)
for arg in config_args:
setattr(args, arg, config_args[arg])
return args
def run_experiment(args):
# Set configuration and seed
args = copy.deepcopy(args)
if args.seed < 0:
args.seed = np.random.randint(1e9)
torch.set_num_threads(1)
pl.seed_everything(args.seed)
model_cls, filler_cls = get_model_classes(args.model_name)
dataset = get_dataset(args.dataset_name)
########################################
# create logdir and save configuration #
########################################
exp_name = f"{datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}_{args.seed}"
logdir = os.path.join(config['logs'], args.dataset_name, args.model_name, exp_name)
# save config for logging
pathlib.Path(logdir).mkdir(parents=True)
with open(os.path.join(logdir, 'config.yaml'), 'w') as fp:
yaml.dump(parser_utils.config_dict_from_args(args), fp, indent=4, sort_keys=True)
########################################
# data module #
########################################
# instantiate dataset
dataset_cls = GraphImputationDataset if has_graph_support(model_cls) else ImputationDataset
torch_dataset = dataset_cls(*dataset.numpy(return_idx=True),
mask=dataset.training_mask,
eval_mask=dataset.eval_mask,
window=args.window,
stride=args.stride)
# get train/val/test indices
split_conf = parser_utils.filter_function_args(args, dataset.splitter, return_dict=True)
train_idxs, val_idxs, test_idxs = dataset.splitter(torch_dataset, **split_conf)
# configure datamodule
data_conf = parser_utils.filter_args(args, SpatioTemporalDataModule, return_dict=True)
dm = SpatioTemporalDataModule(torch_dataset, train_idxs=train_idxs, val_idxs=val_idxs, test_idxs=test_idxs,
**data_conf)
dm.setup()
# if out of sample in air, add values removed for evaluation in train set
if not args.in_sample and args.dataset_name[:3] == 'air':
dm.torch_dataset.mask[dm.train_slice] |= dm.torch_dataset.eval_mask[dm.train_slice]
# get adjacency matrix
adj = dataset.get_similarity(thr=args.adj_threshold)
# force adj with no self loop
np.fill_diagonal(adj, 0.)
########################################
# predictor #
########################################
# model's inputs
additional_model_hparams = dict(adj=adj, d_in=dm.d_in, n_nodes=dm.n_nodes)
model_kwargs = parser_utils.filter_args(args={**vars(args), **additional_model_hparams},
target_cls=model_cls,
return_dict=True)
# loss and metrics
loss_fn = MaskedMetric(metric_fn=getattr(F, args.loss_fn),
compute_on_step=True,
metric_kwargs={'reduction': 'none'})
metrics = {'mae': MaskedMAE(compute_on_step=False),
'mape': MaskedMAPE(compute_on_step=False),
'mse': MaskedMSE(compute_on_step=False),
'mre': MaskedMRE(compute_on_step=False)}
# filler's inputs
scheduler_class = CosineAnnealingLR if args.use_lr_schedule else None
additional_filler_hparams = dict(model_class=model_cls,
model_kwargs=model_kwargs,
optim_class=torch.optim.Adam,
optim_kwargs={'lr': args.lr,
'weight_decay': args.l2_reg},
loss_fn=loss_fn,
metrics=metrics,
scheduler_class=scheduler_class,
scheduler_kwargs={
'eta_min': 0.0001,
'T_max': args.epochs
},
alpha=args.alpha,
hint_rate=args.hint_rate,
g_train_freq=args.g_train_freq,
d_train_freq=args.d_train_freq)
filler_kwargs = parser_utils.filter_args(args={**vars(args), **additional_filler_hparams},
target_cls=filler_cls,
return_dict=True)
filler = filler_cls(**filler_kwargs)
########################################
# training #
########################################
# callbacks
early_stop_callback = EarlyStopping(monitor='val_mae', patience=args.patience, mode='min')
checkpoint_callback = ModelCheckpoint(dirpath=logdir, save_top_k=1, monitor='val_mae', mode='min')
logger = TensorBoardLogger(logdir, name="model")
trainer = pl.Trainer(max_epochs=args.epochs,
logger=logger,
default_root_dir=logdir,
gpus=1 if torch.cuda.is_available() else None,
gradient_clip_val=args.grad_clip_val,
gradient_clip_algorithm=args.grad_clip_algorithm,
callbacks=[early_stop_callback, checkpoint_callback])
trainer.fit(filler, datamodule=dm)
########################################
# testing #
########################################
filler.load_state_dict(torch.load(checkpoint_callback.best_model_path,
lambda storage, loc: storage)['state_dict'])
filler.freeze()
trainer.test()
filler.eval()
if torch.cuda.is_available():
filler.cuda()
with torch.no_grad():
y_true, y_hat, mask = filler.predict_loader(dm.test_dataloader(), return_mask=True)
y_hat = y_hat.detach().cpu().numpy().reshape(y_hat.shape[:3]) # reshape to (eventually) squeeze node channels
# Test imputations in whole series
eval_mask = dataset.eval_mask[dm.test_slice]
df_true = dataset.df.iloc[dm.test_slice]
metrics = {
'mae': numpy_metrics.masked_mae,
'mse': numpy_metrics.masked_mse,
'mre': numpy_metrics.masked_mre,
'mape': numpy_metrics.masked_mape
}
# Aggregate predictions in dataframes
index = dm.torch_dataset.data_timestamps(dm.testset.indices, flatten=False)['horizon']
aggr_methods = ensure_list(args.aggregate_by)
df_hats = prediction_dataframe(y_hat, index, dataset.df.columns, aggregate_by=aggr_methods)
df_hats = dict(zip(aggr_methods, df_hats))
for aggr_by, df_hat in df_hats.items():
# Compute error
print(f'- AGGREGATE BY {aggr_by.upper()}')
for metric_name, metric_fn in metrics.items():
error = metric_fn(df_hat.values, df_true.values, eval_mask).item()
print(f' {metric_name}: {error:.4f}')
return y_true, y_hat, mask
if __name__ == '__main__':
args = parse_args()
run_experiment(args)
================================================
FILE: scripts/run_synthetic.py
================================================
import copy
import datetime
import os
import pathlib
from argparse import ArgumentParser
import numpy as np
import pytorch_lightning as pl
import torch
import torch.nn.functional as F
import yaml
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint
from pytorch_lightning.loggers import TensorBoardLogger
from torch.optim.lr_scheduler import CosineAnnealingLR
from lib import fillers, config
from lib.datasets import ChargedParticles
from lib.nn import models
from lib.nn.utils.metric_base import MaskedMetric
from lib.nn.utils.metrics import MaskedMAE, MaskedMAPE, MaskedMSE, MaskedMRE
from lib.utils import parser_utils
from lib.utils.parser_utils import str_to_bool
def has_graph_support(model_cls):
return model_cls is models.GRINet
def get_model_classes(model_str):
if model_str == 'brits':
model, filler = models.BRITSNet, fillers.BRITSFiller
elif model_str == 'grin':
model, filler = models.GRINet, fillers.GraphFiller
else:
raise ValueError(f'Model {model_str} not available.')
return model, filler
def parse_args():
# Argument parser
parser = ArgumentParser()
parser.add_argument('--seed', type=int, default=-1)
parser.add_argument("--model-name", type=str, default='bigrill')
parser.add_argument("--config", type=str, default=None)
# Dataset params
parser.add_argument('--static-adj', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--window', type=int, default=50)
parser.add_argument('--p-block', type=float, default=0.025)
parser.add_argument('--p-point', type=float, default=0.025)
parser.add_argument('--min-seq', type=int, default=5)
parser.add_argument('--max-seq', type=int, default=10)
parser.add_argument('--use-exogenous', type=str_to_bool, nargs='?', const=True, default=True)
# Splitting/aggregation params
parser.add_argument('--val-len', type=float, default=0.1)
parser.add_argument('--test-len', type=float, default=0.2)
# Training params
parser.add_argument('--lr', type=float, default=0.001)
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--patience', type=int, default=40)
parser.add_argument('--l2-reg', type=float, default=0.)
parser.add_argument('--scaled-target', type=str_to_bool, nargs='?', const=True, default=False)
parser.add_argument('--grad-clip-val', type=float, default=5.)
parser.add_argument('--grad-clip-algorithm', type=str, default='norm')
parser.add_argument('--loss-fn', type=str, default='mse_loss')
parser.add_argument('--use-lr-schedule', type=str_to_bool, nargs='?', const=True, default=True)
parser.add_argument('--whiten-prob', type=float, default=0.05)
parser.add_argument('--pred-loss-weight', type=float, default=1.0)
parser.add_argument('--warm-up', type=int, default=0)
# graph params
parser.add_argument("--adj-threshold", type=float, default=0.1)
known_args, _ = parser.parse_known_args()
model_cls, _ = get_model_classes(known_args.model_name)
parser = model_cls.add_model_specific_args(parser)
args = parser.parse_args()
if args.config is not None:
with open(args.config, 'r') as fp:
config_args = yaml.load(fp, Loader=yaml.FullLoader)
for arg in config_args:
setattr(args, arg, config_args[arg])
return args
def run_experiment(args):
# Set configuration and seed
args = copy.deepcopy(args)
if args.seed < 0:
args.seed = np.random.randint(1e9)
torch.set_num_threads(1)
pl.seed_everything(args.seed)
########################################
# load dataset and model #
########################################
model_cls, filler_cls = get_model_classes(args.model_name)
dataset = ChargedParticles(static_adj=args.static_adj,
window=args.window,
p_block=args.p_block,
p_point=args.p_point,
max_seq=args.max_seq,
min_seq=args.min_seq,
use_exogenous=args.use_exogenous,
graph_mode=has_graph_support(model_cls))
dataset.split(args.val_len, args.test_len)
# get adjacency matrix
adj = dataset.get_similarity()
np.fill_diagonal(adj, 0.) # force adj with no self loop
########################################
# create logdir and save configuration #
########################################
exp_name = f"{datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}_{args.seed}"
logdir = os.path.join(config['logs'], 'synthetic', args.model_name, exp_name)
# save config for logging
pathlib.Path(logdir).mkdir(parents=True)
with open(os.path.join(logdir, 'config.yaml'), 'w') as fp:
yaml.dump(parser_utils.config_dict_from_args(args), fp, indent=4, sort_keys=True)
########################################
# predictor #
########################################
# model's inputs
if has_graph_support(model_cls):
model_params = dict(adj=adj, d_in=dataset.n_channels, d_u=dataset.n_exogenous, n_nodes=dataset.n_nodes)
else:
model_params = dict(d_in=(dataset.n_channels * dataset.n_nodes), d_u=(dataset.n_channels * dataset.n_exogenous))
model_kwargs = parser_utils.filter_args(args={**vars(args), **model_params},
target_cls=model_cls,
return_dict=True)
# loss and metrics
loss_fn = MaskedMetric(metric_fn=getattr(F, args.loss_fn),
compute_on_step=True,
metric_kwargs={'reduction': 'none'})
metrics = {'mae': MaskedMAE(compute_on_step=False),
'mape': MaskedMAPE(compute_on_step=False),
'mse': MaskedMSE(compute_on_step=False),
'mre': MaskedMRE(compute_on_step=False)}
# filler's inputs
scheduler_class = CosineAnnealingLR if args.use_lr_schedule else None
additional_filler_hparams = dict(model_class=model_cls,
model_kwargs=model_kwargs,
optim_class=torch.optim.Adam,
optim_kwargs={'lr': args.lr,
'weight_decay': args.l2_reg},
loss_fn=loss_fn,
metrics=metrics,
scheduler_class=scheduler_class,
scheduler_kwargs={
'eta_min': 0.0001,
'T_max': args.epochs
},
alpha=args.alpha,
hint_rate=args.hint_rate,
g_train_freq=args.g_train_freq,
d_train_freq=args.d_train_freq)
filler_kwargs = parser_utils.filter_args(args={**vars(args), **additional_filler_hparams},
target_cls=filler_cls,
return_dict=True)
filler = filler_cls(**filler_kwargs)
########################################
# logging options #
########################################
# log number of parameters
args.trainable_parameters = filler.trainable_parameters
# log statistics on masks
for mask_type in ['mask', 'eval_mask', 'training_mask']:
mask_type_mean = getattr(dataset, mask_type).float().mean().item()
setattr(args, mask_type, mask_type_mean)
print(args)
########################################
# training #
########################################
# callbacks
early_stop_callback = EarlyStopping(monitor='val_mse', patience=args.patience, mode='min')
checkpoint_callback = ModelCheckpoint(dirpath=logdir, save_top_k=1, monitor='val_mse', mode='min')
logger = TensorBoardLogger(logdir, name="model")
trainer = pl.Trainer(max_epochs=args.epochs,
default_root_dir=logdir,
logger=logger,
gpus=1 if torch.cuda.is_available() else None,
gradient_clip_val=args.grad_clip_val,
gradient_clip_algorithm=args.grad_clip_algorithm,
callbacks=[early_stop_callback, checkpoint_callback])
trainer.fit(filler,
train_dataloader=dataset.train_dataloader(batch_size=args.batch_size),
val_dataloaders=dataset.val_dataloader(batch_size=args.batch_size))
########################################
# testing #
########################################
filler.load_state_dict(torch.load(checkpoint_callback.best_model_path,
lambda storage, loc: storage)['state_dict'])
filler.freeze()
trainer.test(filler, test_dataloaders=dataset.test_dataloader(batch_size=args.batch_size))
filler.eval()
if __name__ == '__main__':
args = parse_args()
run_experiment(args)