Repository: Graph-Machine-Learning-Group/grin
Branch: main
Commit: 4a28afbb0926
Files: 102
Total size: 201.4 KB

Directory structure:
gitextract__pvx4bn4/
├── .gitignore
├── README.md
├── conda_env.yml
├── config/
│   ├── bimpgru/
│   │   ├── air.yaml
│   │   ├── air36.yaml
│   │   ├── bay_block.yaml
│   │   ├── bay_point.yaml
│   │   ├── irish_block.yaml
│   │   ├── irish_point.yaml
│   │   ├── la_block.yaml
│   │   └── la_point.yaml
│   ├── brits/
│   │   ├── air.yaml
│   │   ├── air36.yaml
│   │   ├── bay_block.yaml
│   │   ├── bay_point.yaml
│   │   ├── irish_block.yaml
│   │   ├── irish_point.yaml
│   │   ├── la_block.yaml
│   │   ├── la_point.yaml
│   │   └── synthetic.yaml
│   ├── grin/
│   │   ├── air.yaml
│   │   ├── air36.yaml
│   │   ├── bay_block.yaml
│   │   ├── bay_point.yaml
│   │   ├── irish_block.yaml
│   │   ├── irish_point.yaml
│   │   ├── la_block.yaml
│   │   ├── la_point.yaml
│   │   └── synthetic.yaml
│   ├── mpgru/
│   │   ├── air.yaml
│   │   ├── air36.yaml
│   │   ├── bay_block.yaml
│   │   ├── bay_point.yaml
│   │   ├── irish_block.yaml
│   │   ├── irish_point.yaml
│   │   ├── la_block.yaml
│   │   └── la_point.yaml
│   ├── rgain/
│   │   ├── air.yaml
│   │   ├── air36.yaml
│   │   ├── bay_block.yaml
│   │   ├── bay_point.yaml
│   │   ├── irish_block.yaml
│   │   ├── irish_point.yaml
│   │   ├── la_block.yaml
│   │   └── la_point.yaml
│   └── var/
│       ├── air.yaml
│       ├── air36.yaml
│       ├── bay_block.yaml
│       ├── bay_point.yaml
│       ├── irish_block.yaml
│       ├── irish_point.yaml
│       ├── la_block.yaml
│       └── la_point.yaml
├── lib/
│   ├── __init__.py
│   ├── data/
│   │   ├── __init__.py
│   │   ├── datamodule/
│   │   │   ├── __init__.py
│   │   │   └── spatiotemporal.py
│   │   ├── imputation_dataset.py
│   │   ├── preprocessing/
│   │   │   ├── __init__.py
│   │   │   └── scalers.py
│   │   ├── spatiotemporal_dataset.py
│   │   └── temporal_dataset.py
│   ├── datasets/
│   │   ├── __init__.py
│   │   ├── air_quality.py
│   │   ├── metr_la.py
│   │   ├── pd_dataset.py
│   │   ├── pems_bay.py
│   │   └── synthetic.py
│   ├── fillers/
│   │   ├── __init__.py
│   │   ├── britsfiller.py
│   │   ├── filler.py
│   │   ├── graphfiller.py
│   │   ├── multi_imputation_filler.py
│   │   └── rgainfiller.py
│   ├── nn/
│   │   ├── __init__.py
│   │   ├── layers/
│   │   │   ├── __init__.py
│   │   │   ├── gcrnn.py
│   │   │   ├── gril.py
│   │   │   ├── imputation.py
│   │   │   ├── mpgru.py
│   │   │   ├── rits.py
│   │   │   ├── spatial_attention.py
│   │   │   └── spatial_conv.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── brits.py
│   │   │   ├── grin.py
│   │   │   ├── mpgru.py
│   │   │   ├── rgain.py
│   │   │   ├── rnn_imputers.py
│   │   │   └── var.py
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── metric_base.py
│   │       ├── metrics.py
│   │       └── ops.py
│   └── utils/
│       ├── __init__.py
│       ├── numpy_metrics.py
│       ├── parser_utils.py
│       └── utils.py
├── requirements.txt
└── scripts/
    ├── run_baselines.py
    ├── run_imputation.py
    └── run_synthetic.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
*.DS_STORE

================================================
FILE: README.md
================================================
# Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks (ICLR 2022 - [open review](https://openreview.net/forum?id=kOu3-S3wJ7) - [pdf](https://openreview.net/pdf?id=kOu3-S3wJ7))

[![ICLR](https://img.shields.io/badge/ICLR-2022-blue.svg?style=flat-square)](https://openreview.net/forum?id=kOu3-S3wJ7)
[![PDF](https://img.shields.io/badge/%E2%87%A9-PDF-orange.svg?style=flat-square)](https://openreview.net/pdf?id=kOu3-S3wJ7)
[![arXiv](https://img.shields.io/badge/arXiv-2108.00298-b31b1b.svg?style=flat-square)](https://arxiv.org/abs/2108.00298)

This repository contains the code for the reproducibility of the experiments presented in the paper "Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks" (ICLR 2022). In this paper, we propose a graph neural network architecture for multivariate time series imputation and achieve state-of-the-art results on several benchmarks.

**Authors**: [Andrea Cini](mailto:andrea.cini@usi.ch), [Ivan Marisca](mailto:ivan.marisca@usi.ch), Cesare Alippi

**‼️ PyG implementation of GRIN is now available inside [Torch Spatiotemporal](https://github.com/TorchSpatiotemporal/tsl), a library built to accelerate research on neural spatiotemporal data processing methods, with a focus on Graph Neural Networks.**

---

## GRIN in a nutshell

The [paper](https://arxiv.org/abs/2108.00298) introduces __GRIN__, a method and an architecture to exploit relational inductive biases to reconstruct missing values in multivariate time series coming from sensor networks. GRIN features a bidirectional recurrent GNN which learns __spatio-temporal node-level representations__ tailored to reconstruct observations at neighboring nodes.
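
As a toy illustration of the relational inductive bias described above, the sketch below fills each missing reading with the adjacency-weighted mean of the readings observed at neighboring nodes at the same time step. This is not the GRIN architecture (which is a bidirectional recurrent GNN, see `lib/nn/models/grin.py`); the function name `neighbor_mean_impute`, the example graph, and the data are assumptions made only for this example.

```python
def neighbor_mean_impute(x, mask, adj):
    """Fill missing readings with the weighted mean of observed neighbors.

    x:    list of time steps, each a list with one reading per node
          (entries at missing positions are ignored and may be None).
    mask: same shape as x, 1 where the reading is observed, 0 where missing.
    adj:  square matrix, adj[i][j] is the non-negative weight of node j
          when reconstructing node i.
    """
    n_nodes = len(adj)
    filled = []
    for row, observed in zip(x, mask):
        new_row = list(row)
        for i in range(n_nodes):
            if observed[i]:
                continue  # keep observed readings untouched
            # Aggregate only over neighbors that are observed at this step.
            num = sum(adj[i][j] * row[j] for j in range(n_nodes) if observed[j])
            den = sum(adj[i][j] for j in range(n_nodes) if observed[j])
            # If no neighbor is observed, the value stays missing (None).
            new_row[i] = num / den if den > 0 else None
        filled.append(new_row)
    return filled


# Hypothetical example: three sensors on a line graph (0 - 1 - 2).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
readings = [[1.0, None, 3.0],
            [None, 4.0, None]]
mask = [[1, 0, 1],
        [0, 1, 0]]

filled = neighbor_mean_impute(readings, mask, adj)
print(filled)  # [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
```

A purely spatial average like this ignores temporal dynamics entirely; it only shows why knowing the sensor graph helps when reconstructing a node from its neighbors.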


---

## Directory structure

The directory is structured as follows:

```
.
├── config
│   ├── bimpgru
│   ├── brits
│   ├── grin
│   ├── mpgru
│   ├── rgain
│   └── var
├── datasets
│   ├── air_quality
│   ├── metr_la
│   ├── pems_bay
│   └── synthetic
├── lib
│   ├── __init__.py
│   ├── data
│   ├── datasets
│   ├── fillers
│   ├── nn
│   └── utils
├── requirements.txt
└── scripts
    ├── run_baselines.py
    ├── run_imputation.py
    └── run_synthetic.py
```

Note that, given the size of the files, the datasets are not readily available in the folder. See the next section for the downloading instructions.

## Datasets

All the datasets used in the experiment, except CER-E, are open and can be downloaded from this [link](https://mega.nz/folder/qwwG3Qba#c6qFTeT7apmZKKyEunCzSg). The CER-E dataset can be obtained free of charge for research purposes following the instructions at this [link](https://www.ucd.ie/issda/data/commissionforenergyregulationcer/). We recommend storing the downloaded datasets in a folder named `datasets` inside this directory.

## Configuration files

The `config` directory stores all the configuration files used to run the experiment. They are divided into folders, according to the model.

## Library

The support code, including the models and the dataset readers, is packed in a Python library named `lib`. If you need to change the paths to the dataset locations, edit the `__init__.py` file of the library.

## Scripts

The scripts used for the experiment in the paper are in the `scripts` folder.

* `run_baselines.py` is used to compute the metrics for the `MEAN`, `KNN`, `MF` and `MICE` imputation methods. An example of usage is

  ```
  python ./scripts/run_baselines.py --datasets air36 air --imputers mean knn --k 10 --in-sample True --n-runs 5
  ```

* `run_imputation.py` is used to compute the metrics for the deep imputation methods.
  An example of usage is

  ```
  python ./scripts/run_imputation.py --config config/grin/air36.yaml --in-sample False
  ```

* `run_synthetic.py` is used for the experiments on the synthetic datasets. An example of usage is

  ```
  python ./scripts/run_synthetic.py --config config/grin/synthetic.yaml --static-adj False
  ```

## Requirements

We ran all the experiments with `python 3.8`; see `requirements.txt` for the list of `pip` dependencies.

## Bibtex reference

If you find this code useful, please consider citing our paper:

```
@inproceedings{cini2022filling,
    title={Filling the G\_ap\_s: Multivariate Time Series Imputation by Graph Neural Networks},
    author={Andrea Cini and Ivan Marisca and Cesare Alippi},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=kOu3-S3wJ7}
}
```

================================================
FILE: conda_env.yml
================================================
name: grin
channels:
  - defaults
  - pytorch
  - conda-forge
dependencies:
  - pip
  - pytables
  - python=3.8
  - pytorch=1.8
  - torchvision
  - torchaudio
  - wheel
  - pip:
    - einops
    - fancyimpute==0.6
    - h5py
    - openpyxl
    - pandas
    - pytorch-lightning==1.4
    - pyyaml
    - scikit-learn
    - scipy
    - tensorboard

================================================
FILE: config/bimpgru/air.yaml
================================================
dataset_name: 'air'
window: 24
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global'  # ['channels', 'global']
scaled_target: True
epochs: 300
samples_per_epoch: 5120  # 160 batch of 32
batch_size: 32
aggregate_by: ['mean']
model_name: 'bimpgru'
d_hidden: 64
d_emb: 8
d_ff: 64
dropout: 0
kernel_size: 2
n_layers: 1
layer_norm: false
merge: 'mlp'

================================================
FILE: config/bimpgru/air36.yaml
================================================
dataset_name: 'air36'
window: 36
adj_threshold: 0.1
detrend: False
scale: True
scaling_axis: 'global'  # ['channels', 'global']
scaled_target: True
epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'bimpgru' d_hidden: 64 d_emb: 8 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/bimpgru/bay_block.yaml ================================================ dataset_name: 'bay_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'bimpgru' d_hidden: 64 d_emb: 8 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/bimpgru/bay_point.yaml ================================================ dataset_name: 'bay_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'bimpgru' d_hidden: 64 d_emb: 8 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/bimpgru/irish_block.yaml ================================================ dataset_name: 'irish_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'bimpgru' d_hidden: 64 d_emb: 8 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/bimpgru/irish_point.yaml ================================================ dataset_name: 'irish_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: 
True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'bimpgru' d_hidden: 64 d_emb: 8 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/bimpgru/la_block.yaml ================================================ dataset_name: 'la_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'bimpgru' d_hidden: 64 d_emb: 8 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/bimpgru/la_point.yaml ================================================ dataset_name: 'la_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'bimpgru' d_hidden: 64 d_emb: 8 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/brits/air.yaml ================================================ dataset_name: 'air' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'brits' d_hidden: 128 ================================================ FILE: config/brits/air36.yaml ================================================ dataset_name: 'air36' window: 36 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'brits' d_hidden: 64 ================================================ FILE: 
config/brits/bay_block.yaml ================================================ dataset_name: 'bay_block' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'brits' d_hidden: 256 ================================================ FILE: config/brits/bay_point.yaml ================================================ dataset_name: 'bay_point' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'brits' d_hidden: 256 ================================================ FILE: config/brits/irish_block.yaml ================================================ dataset_name: 'irish_block' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'brits' d_hidden: 256 ================================================ FILE: config/brits/irish_point.yaml ================================================ dataset_name: 'irish_point' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'brits' d_hidden: 256 ================================================ FILE: config/brits/la_block.yaml ================================================ dataset_name: 'la_block' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'brits' d_hidden: 128 ================================================ FILE: config/brits/la_point.yaml ================================================ dataset_name: 'la_point' window: 24 detrend: False scale: True scaling_axis: 
'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'brits' d_hidden: 128 ================================================ FILE: config/brits/synthetic.yaml ================================================ window: 36 p_block: 0.025 p_point: 0.025 min_seq: 4 max_seq: 9 use_exogenous: False epochs: 200 batch_size: 32 model_name: 'brits' d_hidden: 32 ================================================ FILE: config/grin/air.yaml ================================================ dataset_name: 'air' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'grin' pred_loss_weight: 1 d_hidden: 64 d_emb: 8 d_ff: 64 ff_dropout: 0 kernel_size: 2 decoder_order: 1 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/grin/air36.yaml ================================================ dataset_name: 'air36' window: 36 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'grin' pred_loss_weight: 1 d_hidden: 64 d_emb: 8 d_ff: 64 ff_dropout: 0 kernel_size: 2 decoder_order: 1 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/grin/bay_block.yaml ================================================ dataset_name: 'bay_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'grin' pred_loss_weight: 1 d_hidden: 64 d_emb: 8 d_ff: 64 ff_dropout: 0 kernel_size: 2 decoder_order: 1 n_layers: 1 layer_norm: false 
merge: 'mlp' ================================================ FILE: config/grin/bay_point.yaml ================================================ dataset_name: 'bay_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'grin' pred_loss_weight: 1 d_hidden: 64 d_emb: 8 d_ff: 64 ff_dropout: 0 kernel_size: 2 decoder_order: 1 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/grin/irish_block.yaml ================================================ dataset_name: 'irish_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'grin' pred_loss_weight: 1 d_hidden: 64 d_emb: 8 d_ff: 64 ff_dropout: 0 kernel_size: 2 decoder_order: 1 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/grin/irish_point.yaml ================================================ dataset_name: 'irish_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'grin' pred_loss_weight: 1 d_hidden: 64 d_emb: 8 d_ff: 64 ff_dropout: 0 kernel_size: 2 decoder_order: 1 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/grin/la_block.yaml ================================================ dataset_name: 'la_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] 
model_name: 'grin' pred_loss_weight: 1 d_hidden: 64 d_emb: 8 d_ff: 64 ff_dropout: 0 kernel_size: 2 decoder_order: 1 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/grin/la_point.yaml ================================================ dataset_name: 'la_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'grin' pred_loss_weight: 1 d_hidden: 64 d_emb: 8 d_ff: 64 ff_dropout: 0 kernel_size: 2 decoder_order: 1 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/grin/synthetic.yaml ================================================ window: 36 p_block: 0.025 p_point: 0.025 min_seq: 4 max_seq: 9 use_exogenous: False epochs: 200 batch_size: 32 model_name: 'grin' d_hidden: 16 d_emb: 0 d_ff: 16 ff_dropout: 0 kernel_size: 1 decoder_order: 1 n_layers: 1 layer_norm: false merge: 'mlp' ================================================ FILE: config/mpgru/air.yaml ================================================ dataset_name: 'air' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'mpgru' pred_loss_weight: 1 d_hidden: 64 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false ================================================ FILE: config/mpgru/air36.yaml ================================================ dataset_name: 'air36' window: 36 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'mpgru' pred_loss_weight: 1 d_hidden: 64 d_ff: 64 dropout: 0 
kernel_size: 2 n_layers: 1 layer_norm: false ================================================ FILE: config/mpgru/bay_block.yaml ================================================ dataset_name: 'bay_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'mpgru' pred_loss_weight: 1 d_hidden: 64 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false ================================================ FILE: config/mpgru/bay_point.yaml ================================================ dataset_name: 'bay_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'mpgru' pred_loss_weight: 1 d_hidden: 64 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false ================================================ FILE: config/mpgru/irish_block.yaml ================================================ dataset_name: 'irish_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'mpgru' pred_loss_weight: 1 d_hidden: 64 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false ================================================ FILE: config/mpgru/irish_point.yaml ================================================ dataset_name: 'irish_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'mpgru' pred_loss_weight: 1 d_hidden: 64 d_ff: 64 dropout: 0 kernel_size: 2 
n_layers: 1 layer_norm: false ================================================ FILE: config/mpgru/la_block.yaml ================================================ dataset_name: 'la_block' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'mpgru' pred_loss_weight: 1 d_hidden: 64 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false ================================================ FILE: config/mpgru/la_point.yaml ================================================ dataset_name: 'la_point' window: 24 adj_threshold: 0.1 detrend: False scale: True scaling_axis: 'global' # ['channels', 'global'] scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 aggregate_by: ['mean'] model_name: 'mpgru' pred_loss_weight: 1 d_hidden: 64 d_ff: 64 dropout: 0 kernel_size: 2 n_layers: 1 layer_norm: false ================================================ FILE: config/rgain/air.yaml ================================================ dataset_name: 'air' window: 24 whiten_prob: 0.2 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 loss_fn: mse_loss consistency_loss: False use_lr_schedule: True grad_clip_val: -1 aggregate_by: ['mean'] model_name: 'gain' d_model: 128 d_z: 4 dropout: 0.2 inject_noise: true alpha: 20 g_train_freq: 3 d_train_freq: 1 ================================================ FILE: config/rgain/air36.yaml ================================================ dataset_name: 'air36' window: 36 whiten_prob: 0.2 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 loss_fn: mse_loss consistency_loss: False use_lr_schedule: True grad_clip_val: -1 aggregate_by: ['mean'] model_name: 'gain' d_model: 64 
d_z: 4 dropout: 0.1 inject_noise: true alpha: 20 g_train_freq: 3 d_train_freq: 1 ================================================ FILE: config/rgain/bay_block.yaml ================================================ dataset_name: 'bay_block' window: 24 whiten_prob: 0.2 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 loss_fn: mse_loss consistency_loss: False use_lr_schedule: True grad_clip_val: -1 aggregate_by: ['mean'] model_name: 'gain' d_model: 256 d_z: 4 dropout: 0.2 inject_noise: true alpha: 20 g_train_freq: 3 d_train_freq: 1 ================================================ FILE: config/rgain/bay_point.yaml ================================================ dataset_name: 'bay_point' window: 24 whiten_prob: 0.2 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 loss_fn: mse_loss consistency_loss: False use_lr_schedule: True grad_clip_val: -1 aggregate_by: ['mean'] model_name: 'gain' d_model: 256 d_z: 4 dropout: 0.2 inject_noise: true alpha: 20 g_train_freq: 3 d_train_freq: 1 ================================================ FILE: config/rgain/irish_block.yaml ================================================ dataset_name: 'irish_block' window: 24 whiten_prob: 0.2 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 loss_fn: mse_loss consistency_loss: False use_lr_schedule: True grad_clip_val: -1 aggregate_by: ['mean'] model_name: 'gain' d_model: 256 d_z: 4 dropout: 0.2 inject_noise: true alpha: 20 g_train_freq: 3 d_train_freq: 1 ================================================ FILE: config/rgain/irish_point.yaml ================================================ dataset_name: 'irish_point' window: 24 whiten_prob: 0.2 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 
samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 loss_fn: mse_loss consistency_loss: False use_lr_schedule: True grad_clip_val: -1 aggregate_by: ['mean'] model_name: 'gain' d_model: 256 d_z: 4 dropout: 0.2 inject_noise: true alpha: 20 g_train_freq: 3 d_train_freq: 1 ================================================ FILE: config/rgain/la_block.yaml ================================================ dataset_name: 'la_block' window: 24 whiten_prob: 0.2 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 loss_fn: mse_loss consistency_loss: False use_lr_schedule: True grad_clip_val: -1 aggregate_by: ['mean'] model_name: 'gain' d_model: 128 d_z: 4 dropout: 0.2 inject_noise: true alpha: 20 g_train_freq: 3 d_train_freq: 1 ================================================ FILE: config/rgain/la_point.yaml ================================================ dataset_name: 'la_point' window: 24 whiten_prob: 0.2 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 batch_size: 32 loss_fn: mse_loss consistency_loss: False use_lr_schedule: True grad_clip_val: -1 aggregate_by: ['mean'] model_name: 'gain' d_model: 128 d_z: 4 dropout: 0.2 inject_noise: true alpha: 20 g_train_freq: 3 d_train_freq: 1 ================================================ FILE: config/var/air.yaml ================================================ dataset_name: 'air' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 lr: 0.0005 batch_size: 64 aggregate_by: ['mean'] model_name: 'var' order: 5 padding: 'mean' ================================================ FILE: config/var/air36.yaml ================================================ dataset_name: 'air36' window: 36 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 
5120 # 160 batch of 32 lr: 0.0005 batch_size: 64 aggregate_by: ['mean'] model_name: 'var' order: 5 padding: 'mean' ================================================ FILE: config/var/bay_block.yaml ================================================ dataset_name: 'bay_block' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 lr: 0.0005 batch_size: 64 aggregate_by: ['mean'] model_name: 'var' order: 5 padding: 'mean' ================================================ FILE: config/var/bay_point.yaml ================================================ dataset_name: 'bay_point' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 lr: 0.0005 batch_size: 64 aggregate_by: ['mean'] model_name: 'var' order: 5 padding: 'mean' ================================================ FILE: config/var/irish_block.yaml ================================================ dataset_name: 'irish_block' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 lr: 0.0005 batch_size: 64 aggregate_by: ['mean'] model_name: 'var' order: 5 padding: 'mean' ================================================ FILE: config/var/irish_point.yaml ================================================ dataset_name: 'irish_point' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 lr: 0.0005 batch_size: 64 aggregate_by: ['mean'] model_name: 'var' order: 5 padding: 'mean' ================================================ FILE: config/var/la_block.yaml ================================================ dataset_name: 'la_block' window: 24 detrend: False scale: True scaling_axis: 'channels' scaled_target: True epochs: 300 samples_per_epoch: 5120 # 160 batch of 32 lr: 0.0005 batch_size: 64 aggregate_by: 
['mean']
model_name: 'var'
order: 5
padding: 'mean'

================================================
FILE: config/var/la_point.yaml
================================================
dataset_name: 'la_point'
window: 24
detrend: False
scale: True
scaling_axis: 'channels'
scaled_target: True
epochs: 300
samples_per_epoch: 5120  # 160 batch of 32
lr: 0.0005
batch_size: 64
aggregate_by: ['mean']
model_name: 'var'
order: 5
padding: 'mean'

================================================
FILE: lib/__init__.py
================================================
import os

base_dir = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
config = {
    'logs': 'logs/'
}
datasets_path = {
    'air': 'datasets/air_quality',
    'la': 'datasets/metr_la',
    'bay': 'datasets/pems_bay',
    'synthetic': 'datasets/synthetic'
}
epsilon = 1e-8

for k, v in config.items():
    config[k] = os.path.join(base_dir, v)
for k, v in datasets_path.items():
    datasets_path[k] = os.path.join(base_dir, v)

================================================
FILE: lib/data/__init__.py
================================================
from .temporal_dataset import TemporalDataset
from .spatiotemporal_dataset import SpatioTemporalDataset

================================================
FILE: lib/data/datamodule/__init__.py
================================================
from .spatiotemporal import SpatioTemporalDataModule

================================================
FILE: lib/data/datamodule/spatiotemporal.py
================================================
import pytorch_lightning as pl
from torch.utils.data import DataLoader, Subset, RandomSampler

from .. import TemporalDataset, SpatioTemporalDataset
from ..preprocessing import StandardScaler, MinMaxScaler
from ...utils import ensure_list
from ...utils.parser_utils import str_to_bool


class SpatioTemporalDataModule(pl.LightningDataModule):
    """
    Pytorch Lightning DataModule for TimeSeriesDatasets
    """

    def __init__(self, dataset: TemporalDataset,
                 scale=True,
                 scaling_axis='samples',
                 scaling_type='std',
                 scale_exogenous=None,
                 train_idxs=None,
                 val_idxs=None,
                 test_idxs=None,
                 batch_size=32,
                 workers=1,
                 samples_per_epoch=None):
        super(SpatioTemporalDataModule, self).__init__()
        self.torch_dataset = dataset
        # splitting
        self.trainset = Subset(self.torch_dataset, train_idxs if train_idxs is not None else [])
        self.valset = Subset(self.torch_dataset, val_idxs if val_idxs is not None else [])
        self.testset = Subset(self.torch_dataset, test_idxs if test_idxs is not None else [])
        # preprocessing
        self.scale = scale
        self.scaling_type = scaling_type
        self.scaling_axis = scaling_axis
        self.scale_exogenous = ensure_list(scale_exogenous) if scale_exogenous is not None else None
        # data loaders
        self.batch_size = batch_size
        self.workers = workers
        self.samples_per_epoch = samples_per_epoch

    @property
    def is_spatial(self):
        return isinstance(self.torch_dataset, SpatioTemporalDataset)

    @property
    def n_nodes(self):
        if not self.has_setup_fit:
            raise ValueError('You should initialize the datamodule first.')
        return self.torch_dataset.n_nodes if self.is_spatial else None

    @property
    def d_in(self):
        if not self.has_setup_fit:
            raise ValueError('You should initialize the datamodule first.')
        return self.torch_dataset.n_channels

    @property
    def d_out(self):
        if not self.has_setup_fit:
            raise ValueError('You should initialize the datamodule first.')
        return self.torch_dataset.horizon

    @property
    def train_slice(self):
        return self.torch_dataset.expand_indices(self.trainset.indices, merge=True)

    @property
    def val_slice(self):
        return self.torch_dataset.expand_indices(self.valset.indices, merge=True)

    @property
    def test_slice(self):
return self.torch_dataset.expand_indices(self.testset.indices, merge=True) def get_scaling_axes(self, dim='global'): scaling_axis = tuple() if dim == 'global': scaling_axis = (0, 1, 2) elif dim == 'channels': scaling_axis = (0, 1) elif dim == 'nodes': scaling_axis = (0,) # Remove last dimension for temporal datasets if not self.is_spatial: scaling_axis = scaling_axis[:-1] if not len(scaling_axis): raise ValueError(f'Scaling axis "{dim}" not valid.') return scaling_axis def get_scaler(self): if self.scaling_type == 'std': return StandardScaler elif self.scaling_type == 'minmax': return MinMaxScaler else: # raise (not return) so an unsupported scaling type fails here, not later when the result is called raise NotImplementedError(f'Scaling type "{self.scaling_type}" is not supported.') def setup(self, stage=None): if self.scale: scaling_axis = self.get_scaling_axes(self.scaling_axis) train = self.torch_dataset.data.numpy()[self.train_slice] train_mask = self.torch_dataset.mask.numpy()[self.train_slice] if 'mask' in self.torch_dataset else None scaler = self.get_scaler()(scaling_axis).fit(train, mask=train_mask, keepdims=True).to_torch() self.torch_dataset.scaler = scaler if self.scale_exogenous is not None: for label in self.scale_exogenous: exo = getattr(self.torch_dataset, label) scaler = self.get_scaler()(scaling_axis) scaler.fit(exo[self.train_slice], keepdims=True).to_torch() setattr(self.torch_dataset, label, scaler.transform(exo)) def _data_loader(self, dataset, shuffle=False, batch_size=None, **kwargs): batch_size = self.batch_size if batch_size is None else batch_size return DataLoader(dataset, shuffle=shuffle, batch_size=batch_size, num_workers=self.workers, **kwargs) def train_dataloader(self, shuffle=True, batch_size=None): if self.samples_per_epoch is not None: sampler = RandomSampler(self.trainset, replacement=True, num_samples=self.samples_per_epoch) return self._data_loader(self.trainset, False, batch_size, sampler=sampler, drop_last=True) return self._data_loader(self.trainset, shuffle, batch_size, drop_last=True) def val_dataloader(self, shuffle=False, batch_size=None): return 
self._data_loader(self.valset, shuffle, batch_size) def test_dataloader(self, shuffle=False, batch_size=None): return self._data_loader(self.testset, shuffle, batch_size) @staticmethod def add_argparse_args(parser, **kwargs): parser.add_argument('--batch-size', type=int, default=64) parser.add_argument('--scaling-axis', type=str, default="channels") parser.add_argument('--scaling-type', type=str, default="std") parser.add_argument('--scale', type=str_to_bool, nargs='?', const=True, default=True) parser.add_argument('--workers', type=int, default=0) parser.add_argument('--samples-per-epoch', type=int, default=None) return parser ================================================ FILE: lib/data/imputation_dataset.py ================================================ import numpy as np import torch from . import TemporalDataset, SpatioTemporalDataset class ImputationDataset(TemporalDataset): def __init__(self, data, index=None, mask=None, eval_mask=None, freq=None, trend=None, scaler=None, window=24, stride=1, exogenous=None): if mask is None: mask = np.ones_like(data) if exogenous is None: exogenous = dict() exogenous['mask_window'] = mask if eval_mask is not None: exogenous['eval_mask_window'] = eval_mask super(ImputationDataset, self).__init__(data, index=index, exogenous=exogenous, trend=trend, scaler=scaler, freq=freq, window=window, horizon=window, delay=-window, stride=stride) def get(self, item, preprocess=False): res, transform = super(ImputationDataset, self).get(item, preprocess) res['x'] = torch.where(res['mask'], res['x'], torch.zeros_like(res['x'])) return res, transform class GraphImputationDataset(ImputationDataset, SpatioTemporalDataset): pass ================================================ FILE: lib/data/preprocessing/__init__.py ================================================ from .scalers import * ================================================ FILE: lib/data/preprocessing/scalers.py ================================================ from abc import 
ABC, abstractmethod import numpy as np class AbstractScaler(ABC): def __init__(self, **kwargs): for k, v in kwargs.items(): setattr(self, k, v) def __repr__(self): params = ", ".join([f"{k}={str(v)}" for k, v in self.params().items()]) return "{}({})".format(self.__class__.__name__, params) def __call__(self, *args, **kwargs): return self.transform(*args, **kwargs) def params(self): return {k: v for k, v in self.__dict__.items() if not callable(v) and not k.startswith("__")} @abstractmethod def fit(self, x): pass @abstractmethod def transform(self, x): pass @abstractmethod def inverse_transform(self, x): pass def fit_transform(self, x): self.fit(x) return self.transform(x) def to_torch(self): import torch for p in self.params(): param = getattr(self, p) param = np.atleast_1d(param) param = torch.tensor(param).float() setattr(self, p, param) return self class Scaler(AbstractScaler): def __init__(self, offset=0., scale=1.): self.bias = offset self.scale = scale super(Scaler, self).__init__() def params(self): return dict(bias=self.bias, scale=self.scale) def fit(self, x, mask=None, keepdims=True): pass def transform(self, x): return (x - self.bias) / self.scale def inverse_transform(self, x): return x * self.scale + self.bias def fit_transform(self, x, mask=None, keepdims=True): self.fit(x, mask, keepdims) return self.transform(x) class StandardScaler(Scaler): def __init__(self, axis=0): self.axis = axis super(StandardScaler, self).__init__() def fit(self, x, mask=None, keepdims=True): if mask is not None: x = np.where(mask, x, np.nan) self.bias = np.nanmean(x, axis=self.axis, keepdims=keepdims) self.scale = np.nanstd(x, axis=self.axis, keepdims=keepdims) else: self.bias = x.mean(axis=self.axis, keepdims=keepdims) self.scale = x.std(axis=self.axis, keepdims=keepdims) return self class MinMaxScaler(Scaler): def __init__(self, axis=0): self.axis = axis super(MinMaxScaler, self).__init__() def fit(self, x, mask=None, keepdims=True): if mask is not None: x = 
np.where(mask, x, np.nan) self.bias = np.nanmin(x, axis=self.axis, keepdims=keepdims) self.scale = (np.nanmax(x, axis=self.axis, keepdims=keepdims) - self.bias) else: self.bias = x.min(axis=self.axis, keepdims=keepdims) self.scale = (x.max(axis=self.axis, keepdims=keepdims) - self.bias) return self ================================================ FILE: lib/data/spatiotemporal_dataset.py ================================================ import numpy as np import pandas as pd from einops import rearrange from .temporal_dataset import TemporalDataset class SpatioTemporalDataset(TemporalDataset): def __init__(self, data, index=None, trend=None, scaler=None, freq=None, window=24, horizon=24, delay=0, stride=1, **exogenous): """ Pytorch dataset for data that can be represented as a single TimeSeries :param data: raw target time series (ts) (can be multivariate), shape: [steps, (features), nodes] :param exog: global exogenous variables, shape: [steps, nodes] :param trend: trend time series to be removed from the ts, shape: [steps, (features), (nodes)] :param bias: bias to be removed from the ts (after de-trending), shape [steps, (features), (nodes)] :param scale: scaling factor to scale the ts (after de-trending), shape [steps, (features), (nodes)] :param mask: mask for valid data, 1 -> valid time step, 0 -> invalid. Same shape as ts. 
:param target_exog: exogenous variables of the target, shape: [steps, nodes] :param window: length of windows returned by __getitem__ :param horizon: length of prediction horizon returned by __getitem__ :param delay: delay between input and prediction """ super(SpatioTemporalDataset, self).__init__(data, index=index, trend=trend, scaler=scaler, freq=freq, window=window, horizon=horizon, delay=delay, stride=stride, **exogenous) def __repr__(self): return "{}(n_samples={}, n_nodes={})".format(self.__class__.__name__, len(self), self.n_nodes) @property def n_nodes(self): return self.data.shape[1] @staticmethod def check_dim(data): if data.ndim == 2: # [steps, nodes] -> [steps, nodes, features] data = rearrange(data, 's (n f) -> s n f', f=1) elif data.ndim == 1: data = rearrange(data, '(s n f) -> s n f', n=1, f=1) elif data.ndim == 3: pass else: raise ValueError(f'Invalid data dimensions {data.shape}') return data def dataframe(self): if self.n_channels == 1: return pd.DataFrame(data=np.squeeze(self.data, -1), index=self.index) raise NotImplementedError() ================================================ FILE: lib/data/temporal_dataset.py ================================================ import numpy as np import pandas as pd import torch from einops import rearrange from pandas import DatetimeIndex from torch.utils.data import Dataset from .preprocessing import AbstractScaler class TemporalDataset(Dataset): def __init__(self, data, index=None, freq=None, exogenous=None, trend=None, scaler=None, window=24, horizon=24, delay=0, stride=1): """Wrapper class for datasets whose entries depend on a sequence of temporal indices. Parameters ---------- data : np.ndarray Data relative to the main signal. index : DatetimeIndex or None Temporal indices for the data. exogenous : dict or None Exogenous data and label paired with main signal (default is None). trend : np.ndarray or None Trend paired with main signal (default is None). Must have the same length as 'data'. 
scaler : AbstractScaler or None Scaler that must be used for data (default is None). freq : pd.DateTimeIndex.freq or str Frequency of the indices (default is index.freq). window : int Size of the sliding window in the past. horizon : int Size of the prediction horizon. delay : int Offset between end of window and start of horizon. Raises ------ ValueError If a frequency for the temporal indices is provided neither in the index nor explicitly. If preprocess is True and data_scaler is None. """ super(TemporalDataset, self).__init__() # Initialize signatures self.__exogenous_keys = dict() self.__reserved_signature = {'data', 'trend', 'x', 'y'} # Store data self.data = data if exogenous is not None: for name, value in exogenous.items(): self.add_exogenous(value, name, for_window=True, for_horizon=True) # Store time information self.index = index try: freq = freq or index.freq or index.inferred_freq self.freq = pd.tseries.frequencies.to_offset(freq) except AttributeError: self.freq = None # Store offset information self.window = window self.delay = delay self.horizon = horizon self.stride = stride # Identify the indices of the samples self._indices = np.arange(self.data.shape[0] - self.sample_span + 1)[::self.stride] # Store preprocessing options self.trend = trend self.scaler = scaler def __getitem__(self, item): return self.get(item, self.preprocess) def __contains__(self, item): return item in self.__exogenous_keys def __len__(self): return len(self._indices) def __repr__(self): return "{}(n_samples={})".format(self.__class__.__name__, len(self)) # Getter and setter for data @property def data(self): return self.__data @data.setter def data(self, value): assert value is not None self.__data = self.check_input(value) @property def trend(self): return self.__trend @trend.setter def trend(self, value): self.__trend = self.check_input(value) # Setter for exogenous data def add_exogenous(self, obj, name, for_window=True, for_horizon=False): assert 
isinstance(name, str) if name.endswith('_window'): name = name[:-7] for_window, for_horizon = True, False if name.endswith('_horizon'): name = name[:-8] for_window, for_horizon = False, True if name in self.__reserved_signature: raise ValueError("Channel '{0}' cannot be added in this way. Use obj.{0} instead.".format(name)) if not (for_window or for_horizon): raise ValueError("Either for_window or for_horizon must be True.") obj = self.check_input(obj) setattr(self, name, obj) self.__exogenous_keys[name] = dict(for_window=for_window, for_horizon=for_horizon) return self # Dataset properties @property def horizon_offset(self): return self.window + self.delay @property def sample_span(self): return max(self.horizon_offset + self.horizon, self.window) @property def preprocess(self): return (self.trend is not None) or (self.scaler is not None) @property def n_steps(self): return self.data.shape[0] @property def n_channels(self): return self.data.shape[-1] @property def indices(self): return self._indices # Signature information @property def exo_window_keys(self): return {k for k, v in self.__exogenous_keys.items() if v['for_window']} @property def exo_horizon_keys(self): return {k for k, v in self.__exogenous_keys.items() if v['for_horizon']} @property def exo_common_keys(self): return self.exo_window_keys.intersection(self.exo_horizon_keys) @property def signature(self): attrs = [] if self.window > 0: attrs.append('x') for attr in self.exo_window_keys: attrs.append(attr if attr not in self.exo_common_keys else (attr + '_window')) for attr in self.exo_horizon_keys: attrs.append(attr if attr not in self.exo_common_keys else (attr + '_horizon')) attrs.append('y') attrs = tuple(attrs) preprocess = [] if self.trend is not None: preprocess.append('trend') if self.scaler is not None: preprocess.extend(self.scaler.params()) preprocess = tuple(preprocess) return dict(data=attrs, preprocessing=preprocess) # Item getters def get(self, item, preprocess=False): idx = 
self._indices[item] res, transform = dict(), dict() if self.window > 0: res['x'] = self.data[idx:idx + self.window] for attr in self.exo_window_keys: key = attr if attr not in self.exo_common_keys else (attr + '_window') res[key] = getattr(self, attr)[idx:idx + self.window] for attr in self.exo_horizon_keys: key = attr if attr not in self.exo_common_keys else (attr + '_horizon') res[key] = getattr(self, attr)[idx + self.horizon_offset:idx + self.horizon_offset + self.horizon] res['y'] = self.data[idx + self.horizon_offset:idx + self.horizon_offset + self.horizon] if preprocess: if self.trend is not None: y_trend = self.trend[idx + self.horizon_offset:idx + self.horizon_offset + self.horizon] res['y'] = res['y'] - y_trend transform['trend'] = y_trend if 'x' in res: res['x'] = res['x'] - self.trend[idx:idx + self.window] if self.scaler is not None: transform.update(self.scaler.params()) if 'x' in res: res['x'] = self.scaler.transform(res['x']) return res, transform def snapshot(self, indices=None, preprocess=True): if not self.preprocess: preprocess = False data, prep = [{k: [] for k in sign} for sign in self.signature.values()] indices = np.arange(len(self._indices)) if indices is None else indices for idx in indices: data_i, prep_i = self.get(idx, preprocess) [v.append(data_i[k]) for k, v in data.items()] if len(prep_i): [v.append(prep_i[k]) for k, v in prep.items()] data = {k: np.stack(ds) for k, ds in data.items() if len(ds)} if len(prep): prep = {k: np.stack(ds) if k == 'trend' else ds[0] for k, ds in prep.items() if len(ds)} return data, prep # Data utilities def expand_indices(self, indices=None, unique=False, merge=False): ds_indices = dict.fromkeys([time for time in ['window', 'horizon'] if getattr(self, time) > 0]) indices = np.arange(len(self._indices)) if indices is None else indices if 'window' in ds_indices: w_idxs = [np.arange(idx, idx + self.window) for idx in self._indices[indices]] ds_indices['window'] = np.concatenate(w_idxs) if 'horizon' in 
ds_indices: h_idxs = [np.arange(idx + self.horizon_offset, idx + self.horizon_offset + self.horizon) for idx in self._indices[indices]] ds_indices['horizon'] = np.concatenate(h_idxs) if unique: ds_indices = {k: np.unique(v) for k, v in ds_indices.items()} if merge: ds_indices = np.unique(np.concatenate(list(ds_indices.values()))) return ds_indices def overlapping_indices(self, idxs1, idxs2, synch_mode='window', as_mask=False): assert synch_mode in ['window', 'horizon'] ts1 = self.data_timestamps(idxs1, flatten=False)[synch_mode] ts2 = self.data_timestamps(idxs2, flatten=False)[synch_mode] common_ts = np.intersect1d(np.unique(ts1), np.unique(ts2)) is_overlapping = lambda sample: np.any(np.in1d(sample, common_ts)) m1 = np.apply_along_axis(is_overlapping, 1, ts1) m2 = np.apply_along_axis(is_overlapping, 1, ts2) if as_mask: return m1, m2 return np.sort(idxs1[m1]), np.sort(idxs2[m2]) def data_timestamps(self, indices=None, flatten=True): ds_indices = self.expand_indices(indices, unique=False) ds_timestamps = {k: self.index[v] for k, v in ds_indices.items()} if not flatten: ds_timestamps = {k: np.array(v).reshape(-1, getattr(self, k)) for k, v in ds_timestamps.items()} return ds_timestamps def reduce_dataset(self, indices, inplace=False): if not inplace: from copy import deepcopy dataset = deepcopy(self) else: dataset = self old_index = dataset.index[dataset._indices[indices]] ds_indices = dataset.expand_indices(indices, merge=True) dataset.index = dataset.index[ds_indices] dataset.data = dataset.data[ds_indices] if dataset.mask is not None: dataset.mask = dataset.mask[ds_indices] if dataset.trend is not None: dataset.trend = dataset.trend[ds_indices] for attr in dataset.exo_window_keys.union(dataset.exo_horizon_keys): if getattr(dataset, attr, None) is not None: setattr(dataset, attr, getattr(dataset, attr)[ds_indices]) dataset._indices = np.flatnonzero(np.in1d(dataset.index, old_index)) return dataset def check_input(self, data): if data is None: return data data = 
self.check_dim(data) data = data.clone().detach() if isinstance(data, torch.Tensor) else torch.tensor(data) # cast data if torch.is_floating_point(data): return data.float() elif data.dtype in [torch.int, torch.int8, torch.int16, torch.int32, torch.int64]: return data.int() return data # Class-specific methods (override in children) @staticmethod def check_dim(data): if data.ndim == 1: # [steps] -> [steps, features] data = rearrange(data, '(s f) -> s f', f=1) elif data.ndim != 2: raise ValueError(f'Invalid data dimensions {data.shape}') return data def dataframe(self): return pd.DataFrame(data=self.data, index=self.index) @staticmethod def add_argparse_args(parser, **kwargs): parser.add_argument('--window', type=int, default=24) parser.add_argument('--horizon', type=int, default=24) parser.add_argument('--delay', type=int, default=0) parser.add_argument('--stride', type=int, default=1) return parser ================================================ FILE: lib/datasets/__init__.py ================================================ from .air_quality import AirQuality from .metr_la import MissingValuesMetrLA from .pems_bay import MissingValuesPemsBay from .synthetic import ChargedParticles ================================================ FILE: lib/datasets/air_quality.py ================================================ import os import numpy as np import pandas as pd from lib import datasets_path from .pd_dataset import PandasDataset from ..utils.utils import disjoint_months, infer_mask, compute_mean, geographical_distance, thresholded_gaussian_kernel class AirQuality(PandasDataset): SEED = 3210 def __init__(self, impute_nans=False, small=False, freq='60T', masked_sensors=None): self.random = np.random.default_rng(self.SEED) self.test_months = [3, 6, 9, 12] self.infer_eval_from = 'next' self.eval_mask = None df, dist, mask = self.load(impute_nans=impute_nans, small=small, masked_sensors=masked_sensors) self.dist = dist if masked_sensors is None: self.masked_sensors = 
list() else: self.masked_sensors = list(masked_sensors) super().__init__(dataframe=df, u=None, mask=mask, name='air', freq=freq, aggr='nearest') def load_raw(self, small=False): if small: path = os.path.join(datasets_path['air'], 'small36.h5') eval_mask = pd.DataFrame(pd.read_hdf(path, 'eval_mask')) else: path = os.path.join(datasets_path['air'], 'full437.h5') eval_mask = None df = pd.DataFrame(pd.read_hdf(path, 'pm25')) stations = pd.DataFrame(pd.read_hdf(path, 'stations')) return df, stations, eval_mask def load(self, impute_nans=True, small=False, masked_sensors=None): # load readings and stations metadata df, stations, eval_mask = self.load_raw(small) # compute the masks mask = (~np.isnan(df.values)).astype('uint8') # 1 if value is not nan else 0 if eval_mask is None: eval_mask = infer_mask(df, infer_from=self.infer_eval_from) eval_mask = eval_mask.values.astype('uint8') if masked_sensors is not None: eval_mask[:, masked_sensors] = np.where(mask[:, masked_sensors], 1, 0) self.eval_mask = eval_mask # 1 if value is ground-truth for imputation else 0 # eventually replace nans with weekly mean by hour if impute_nans: df = df.fillna(compute_mean(df)) # compute distances from latitude and longitude degrees st_coord = stations.loc[:, ['latitude', 'longitude']] dist = geographical_distance(st_coord, to_rad=True).values return df, dist, mask def splitter(self, dataset, val_len=1., in_sample=False, window=0): nontest_idxs, test_idxs = disjoint_months(dataset, months=self.test_months, synch_mode='horizon') if in_sample: train_idxs = np.arange(len(dataset)) val_months = [(m - 1) % 12 for m in self.test_months] _, val_idxs = disjoint_months(dataset, months=val_months, synch_mode='horizon') else: # take equal number of samples before each month of testing val_len = (int(val_len * len(nontest_idxs)) if val_len < 1 else val_len) // len(self.test_months) # get indices of first day of each testing month delta_idxs = np.diff(test_idxs) end_month_idxs = 
test_idxs[1:][np.flatnonzero(delta_idxs > delta_idxs.min())] if len(end_month_idxs) < len(self.test_months): end_month_idxs = np.insert(end_month_idxs, 0, test_idxs[0]) # expand month indices month_val_idxs = [np.arange(v_idx - val_len, v_idx) - window for v_idx in end_month_idxs] val_idxs = np.concatenate(month_val_idxs) % len(dataset) # remove overlapping indices from training set ovl_idxs, _ = dataset.overlapping_indices(nontest_idxs, val_idxs, synch_mode='horizon', as_mask=True) train_idxs = nontest_idxs[~ovl_idxs] return [train_idxs, val_idxs, test_idxs] def get_similarity(self, thr=0.1, include_self=False, force_symmetric=False, sparse=False, **kwargs): theta = np.std(self.dist[:36, :36]) # use same theta for both air and air36 adj = thresholded_gaussian_kernel(self.dist, theta=theta, threshold=thr) if not include_self: adj[np.diag_indices_from(adj)] = 0. if force_symmetric: adj = np.maximum.reduce([adj, adj.T]) if sparse: import scipy.sparse as sps adj = sps.coo_matrix(adj) return adj @property def mask(self): return self._mask @property def training_mask(self): return self._mask if self.eval_mask is None else (self._mask & (1 - self.eval_mask)) def test_interval_mask(self, dtype=bool, squeeze=True): m = np.in1d(self.df.index.month, self.test_months).astype(dtype) if squeeze: return m return m[:, None] ================================================ FILE: lib/datasets/metr_la.py ================================================ import os import numpy as np import pandas as pd from lib import datasets_path from .pd_dataset import PandasDataset from ..utils import sample_mask class MetrLA(PandasDataset): def __init__(self, impute_zeros=False, freq='5T'): df, dist, mask = self.load(impute_zeros=impute_zeros) self.dist = dist super().__init__(dataframe=df, u=None, mask=mask, name='la', freq=freq, aggr='nearest') def load(self, impute_zeros=True): path = os.path.join(datasets_path['la'], 'metr_la.h5') df = pd.read_hdf(path) datetime_idx = sorted(df.index) 
date_range = pd.date_range(datetime_idx[0], datetime_idx[-1], freq='5T') df = df.reindex(index=date_range) mask = ~np.isnan(df.values) if impute_zeros: mask = mask * (df.values != 0.).astype('uint8') df = df.replace(to_replace=0., method='ffill') else: mask = None dist = self.load_distance_matrix() return df, dist, mask def load_distance_matrix(self): path = os.path.join(datasets_path['la'], 'metr_la_dist.npy') try: dist = np.load(path) except: distances = pd.read_csv(os.path.join(datasets_path['la'], 'distances_la.csv')) with open(os.path.join(datasets_path['la'], 'sensor_ids_la.txt')) as f: ids = f.read().strip().split(',') num_sensors = len(ids) dist = np.ones((num_sensors, num_sensors), dtype=np.float32) * np.inf # Builds sensor id to index map. sensor_id_to_ind = {int(sensor_id): i for i, sensor_id in enumerate(ids)} # Fills cells in the matrix with distances. for row in distances.values: if row[0] not in sensor_id_to_ind or row[1] not in sensor_id_to_ind: continue dist[sensor_id_to_ind[row[0]], sensor_id_to_ind[row[1]]] = row[2] np.save(path, dist) return dist def get_similarity(self, thr=0.1, force_symmetric=False, sparse=False): finite_dist = self.dist.reshape(-1) finite_dist = finite_dist[~np.isinf(finite_dist)] sigma = finite_dist.std() adj = np.exp(-np.square(self.dist / sigma)) adj[adj < thr] = 0. 
if force_symmetric: adj = np.maximum.reduce([adj, adj.T]) if sparse: import scipy.sparse as sps adj = sps.coo_matrix(adj) return adj @property def mask(self): return self._mask class MissingValuesMetrLA(MetrLA): SEED = 9101112 def __init__(self, p_fault=0.0015, p_noise=0.05): super(MissingValuesMetrLA, self).__init__(impute_zeros=True) self.rng = np.random.default_rng(self.SEED) self.p_fault = p_fault self.p_noise = p_noise eval_mask = sample_mask(self.numpy().shape, p=p_fault, p_noise=p_noise, min_seq=12, max_seq=12 * 4, rng=self.rng) self.eval_mask = (eval_mask & self.mask).astype('uint8') @property def training_mask(self): return self.mask if self.eval_mask is None else (self.mask & (1 - self.eval_mask)) def splitter(self, dataset, val_len=0, test_len=0, window=0): idx = np.arange(len(dataset)) if test_len < 1: test_len = int(test_len * len(idx)) if val_len < 1: val_len = int(val_len * (len(idx) - test_len)) test_start = len(idx) - test_len val_start = test_start - val_len return [idx[:val_start - window], idx[val_start:test_start - window], idx[test_start:]] ================================================ FILE: lib/datasets/pd_dataset.py ================================================ import numpy as np import pandas as pd import torch class PandasDataset: def __init__(self, dataframe: pd.DataFrame, u: pd.DataFrame = None, name='pd-dataset', mask=None, freq=None, aggr='sum', **kwargs): """ Initialize a tsl dataset from a pandas dataframe. 
:param dataframe: dataframe containing the data, shape: n_steps, n_nodes :param u: dataframe with exog variables :param name: optional name of the dataset :param mask: mask for valid data (1:valid, 0:not valid) :param freq: force a frequency (possibly by resampling) :param aggr: aggregation method after resampling """ super().__init__() self.name = name # set dataset dataframe self.df = dataframe # set optional exog_variable dataframe # make sure to consider only the overlapping part of the two dataframes # assumption u.index \in df.index idx = sorted(self.df.index) self.start = idx[0] self.end = idx[-1] if u is not None: self.u = u[self.start:self.end] else: self.u = None if mask is not None: mask = np.asarray(mask).astype('uint8') self._mask = mask if freq is not None: self.resample_(freq=freq, aggr=aggr) else: self.freq = self.df.index.inferred_freq # make sure that all the dataframes are aligned self.resample_(self.freq, aggr=aggr) assert 'T' in self.freq self.samples_per_day = int(60 / int(self.freq[:-1]) * 24) def __repr__(self): return "{}(nodes={}, length={})".format(self.__class__.__name__, self.n_nodes, self.length) @property def has_mask(self): return self._mask is not None @property def has_u(self): return self.u is not None def resample_(self, freq, aggr): resampler = self.df.resample(freq) idx = self.df.index if aggr == 'sum': self.df = resampler.sum() elif aggr == 'mean': self.df = resampler.mean() elif aggr == 'nearest': self.df = resampler.nearest() else: raise ValueError(f'{aggr} is not a valid aggregation method.') if self.has_mask: resampler = pd.DataFrame(self._mask, index=idx).resample(freq) self._mask = resampler.min().to_numpy() if self.has_u: resampler = self.u.resample(freq) self.u = resampler.nearest() self.freq = freq def dataframe(self) -> pd.DataFrame: return self.df.copy() @property def length(self): return self.df.values.shape[0] @property def n_nodes(self): return self.df.values.shape[1] @property def mask(self): if self._mask is 
None: return np.ones_like(self.df.values).astype('uint8') return self._mask def numpy(self, return_idx=False): if return_idx: return self.numpy(), self.df.index return self.df.values def pytorch(self): data = self.numpy() return torch.FloatTensor(data) def __len__(self): return self.length @staticmethod def build(): raise NotImplementedError def load_raw(self): raise NotImplementedError def load(self): raise NotImplementedError ================================================ FILE: lib/datasets/pems_bay.py ================================================ import os import numpy as np import pandas as pd from lib import datasets_path from .pd_dataset import PandasDataset from ..utils import sample_mask class PemsBay(PandasDataset): def __init__(self): df, dist, mask = self.load() self.dist = dist super().__init__(dataframe=df, u=None, mask=mask, name='bay', freq='5T', aggr='nearest') def load(self, impute_zeros=True): path = os.path.join(datasets_path['bay'], 'pems_bay.h5') df = pd.read_hdf(path) datetime_idx = sorted(df.index) date_range = pd.date_range(datetime_idx[0], datetime_idx[-1], freq='5T') df = df.reindex(index=date_range) mask = ~np.isnan(df.values) df.fillna(method='ffill', axis=0, inplace=True) dist = self.load_distance_matrix(list(df.columns)) return df.astype('float32'), dist, mask.astype('uint8') def load_distance_matrix(self, ids): path = os.path.join(datasets_path['bay'], 'pems_bay_dist.npy') try: dist = np.load(path) except: distances = pd.read_csv(os.path.join(datasets_path['bay'], 'distances_bay.csv')) num_sensors = len(ids) dist = np.ones((num_sensors, num_sensors), dtype=np.float32) * np.inf # Builds sensor id to index map. sensor_id_to_ind = {int(sensor_id): i for i, sensor_id in enumerate(ids)} # Fills cells in the matrix with distances. 
for row in distances.values: if row[0] not in sensor_id_to_ind or row[1] not in sensor_id_to_ind: continue dist[sensor_id_to_ind[row[0]], sensor_id_to_ind[row[1]]] = row[2] np.save(path, dist) return dist def get_similarity(self, type='dcrnn', thr=0.1, force_symmetric=False, sparse=False): """ Return similarity matrix among nodes. Implemented to match DCRNN. :param type: type of similarity matrix ('dcrnn' or 'stcn'). :param thr: threshold to increase sparseness. :param force_symmetric: force the result to be symmetric. :param sparse: if True, return the matrix as a scipy.sparse COO matrix. :return: an NxN array representing similarity among nodes. """ if type == 'dcrnn': finite_dist = self.dist.reshape(-1) finite_dist = finite_dist[~np.isinf(finite_dist)] sigma = finite_dist.std() adj = np.exp(-np.square(self.dist / sigma)) elif type == 'stcn': sigma = 10 adj = np.exp(-np.square(self.dist) / sigma) else: raise NotImplementedError adj[adj < thr] = 0. if force_symmetric: adj = np.maximum.reduce([adj, adj.T]) if sparse: import scipy.sparse as sps adj = sps.coo_matrix(adj) return adj @property def mask(self): if self._mask is None: return self.df.values != 0. 
return self._mask class MissingValuesPemsBay(PemsBay): SEED = 56789 def __init__(self, p_fault=0.0015, p_noise=0.05): super(MissingValuesPemsBay, self).__init__() self.rng = np.random.default_rng(self.SEED) self.p_fault = p_fault self.p_noise = p_noise eval_mask = sample_mask(self.numpy().shape, p=p_fault, p_noise=p_noise, min_seq=12, max_seq=12 * 4, rng=self.rng) self.eval_mask = (eval_mask & self.mask).astype('uint8') @property def training_mask(self): return self.mask if self.eval_mask is None else (self.mask & (1 - self.eval_mask)) def splitter(self, dataset, val_len=0, test_len=0, window=0): idx = np.arange(len(dataset)) if test_len < 1: test_len = int(test_len * len(idx)) if val_len < 1: val_len = int(val_len * (len(idx) - test_len)) test_start = len(idx) - test_len val_start = test_start - val_len return [idx[:val_start - window], idx[val_start:test_start - window], idx[test_start:]] ================================================ FILE: lib/datasets/synthetic.py ================================================ import os.path import numpy as np import torch from einops import rearrange from torch.utils.data import Dataset, DataLoader, Subset from lib import datasets_path def generate_mask(shape, p_block=0.01, p_point=0.01, max_seq=1, min_seq=1, rng=None): """Generate mask in which 1 denotes valid values, 0 missing ones. 
Assuming shape=(steps, ...).""" if rng is None: rand = np.random.random randint = np.random.randint else: rand = rng.random randint = rng.integers # init mask mask = np.ones(shape, dtype='uint8') # block missing if p_block > 0: assert max_seq >= min_seq for col in range(shape[1]): i = 0 while i < shape[0]: if rand() > p_block: i += 1 else: fault_len = int(randint(min_seq, max_seq + 1)) mask[i:i + fault_len, col] = 0 i += fault_len + 1 # at least one valid value between two blocks # point missing # let values before and after block missing always valid diff = np.zeros(mask.shape, dtype='uint8') diff[:-1] |= np.diff(mask, axis=0) < 0 diff[1:] |= np.diff(mask, axis=0) > 0 mask = np.where(mask - diff, rand(shape) > p_point, mask) return mask class SyntheticDataset(Dataset): SEED: int def __init__(self, filename, window=None, p_block=0.05, p_point=0.05, max_seq=6, min_seq=4, use_exogenous=True, mask_exogenous=True, graph_mode=True): super(SyntheticDataset, self).__init__() self.mask_exogenous = mask_exogenous self.use_exogenous = use_exogenous self.graph_mode = graph_mode # fetch data content = self.load(filename) self.window = window if window is not None else content['loc'].shape[1] self.loc = torch.tensor(content['loc'][:, :self.window]).float() self.vel = torch.tensor(content['vel'][:, :self.window]).float() self.adj = content['adj'] self.SEED = content['seed'].item() # compute masks self.rng = np.random.default_rng(self.SEED) mask_shape = (len(self), self.window, self.n_nodes, 1) mask = generate_mask(mask_shape, p_block=p_block, p_point=p_point, max_seq=max_seq, min_seq=min_seq, rng=self.rng).repeat(self.n_channels, -1) eval_mask = 1 - generate_mask(mask_shape, p_block=p_block, p_point=p_point, max_seq=max_seq, min_seq=min_seq, rng=self.rng).repeat(self.n_channels, -1) self.mask = torch.tensor(mask).byte() self.eval_mask = torch.tensor(eval_mask).byte() & self.mask # store splitting indices self.train_idxs = None self.val_idxs = None self.test_idxs = None def 
__len__(self): return self.loc.size(0) def __getitem__(self, index): eval_mask = self.eval_mask[index] mask = self.training_mask[index] x = mask * self.loc[index] res = dict(x=x, mask=mask, eval_mask=eval_mask) if self.use_exogenous: u = self.vel[index] if self.mask_exogenous: u *= mask.all(-1, keepdims=True) res.update(u=u) res.update(y=self.loc[index]) if not self.graph_mode: res = {k: rearrange(v, 's n f -> s (n f)') for k, v in res.items()} return res @property def n_channels(self): return self.loc.size(-1) @property def n_nodes(self): return self.loc.size(-2) @property def n_exogenous(self): return self.vel.size(-1) if self.use_exogenous else 0 @property def training_mask(self): return self.mask if self.eval_mask is None else (self.mask & (1 - self.eval_mask)) @staticmethod def load(filename): return np.load(filename) def get_similarity(self, sparse=False): return self.adj # Splitting options def split(self, val_len=0, test_len=0): idx = np.arange(len(self)) if test_len < 1: test_len = int(test_len * len(idx)) if val_len < 1: val_len = int(val_len * (len(idx) - test_len)) test_start = len(idx) - test_len val_start = test_start - val_len # split dataset self.train_idxs = idx[:val_start] self.val_idxs = idx[val_start:test_start] self.test_idxs = idx[test_start:] def train_dataloader(self, shuffle=True, batch_size=32): return DataLoader(Subset(self, self.train_idxs), shuffle=shuffle, batch_size=batch_size, drop_last=True) def val_dataloader(self, shuffle=False, batch_size=32): return DataLoader(Subset(self, self.val_idxs), shuffle=shuffle, batch_size=batch_size) def test_dataloader(self, shuffle=False, batch_size=32): return DataLoader(Subset(self, self.test_idxs), shuffle=shuffle, batch_size=batch_size) class ChargedParticles(SyntheticDataset): def __init__(self, static_adj=False, window=None, p_block=0.05, p_point=0.05, max_seq=6, min_seq=4, use_exogenous=True, mask_exogenous=True, graph_mode=True): if static_adj: filename = 
os.path.join(datasets_path['synthetic'], 'charged_static.npz') else: filename = os.path.join(datasets_path['synthetic'], 'charged_varying.npz') self.static_adj = static_adj super(ChargedParticles, self).__init__(filename, window, p_block=p_block, p_point=p_point, max_seq=max_seq, min_seq=min_seq, use_exogenous=use_exogenous, mask_exogenous=mask_exogenous, graph_mode=graph_mode) charges = self.load(filename)['charges'] self.charges = torch.tensor(charges).float() def __getitem__(self, item): res = super(ChargedParticles, self).__getitem__(item) # add charges as exogenous features if self.use_exogenous: charges = self.charges[item] if not self.static_adj else self.charges stacked_charges = charges[None].expand(self.window, -1, -1) if not self.graph_mode: stacked_charges = rearrange(stacked_charges, 's n f -> s (n f)') res.update(u=torch.cat([res['u'], stacked_charges], -1)) return res def get_similarity(self, sparse=False): return np.ones((self.n_nodes, self.n_nodes)) - np.eye(self.n_nodes) @property def n_exogenous(self): if self.use_exogenous: return super(ChargedParticles, self).n_exogenous + 1 # add charges to features return 0 ================================================ FILE: lib/fillers/__init__.py ================================================ from .filler import Filler from .britsfiller import BRITSFiller from .graphfiller import GraphFiller from .rgainfiller import RGAINFiller from .multi_imputation_filler import MultiImputationFiller ================================================ FILE: lib/fillers/britsfiller.py ================================================ import torch from . 
import Filler from ..nn import BRITS class BRITSFiller(Filler): def training_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target mask = batch_data['mask'].clone().detach() batch_data['mask'] = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte() eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') # Compute predictions and compute loss out, imputations, predictions = self.predict_batch(batch, preprocess=False, postprocess=False) if self.scaled_target: target = self._preprocess(y, batch_preprocessing) else: target = y imputations = [self._postprocess(imp, batch_preprocessing) for imp in imputations] predictions = [self._postprocess(prd, batch_preprocessing) for prd in predictions] loss = sum([self.loss_fn(pred, target, mask) for pred in predictions]) loss += BRITS.consistency_loss(*imputations) # Logging metrics_mask = (mask | eval_mask) - batch_data['mask'] # all unseen data out = self._postprocess(out, batch_preprocessing) self.train_metrics.update(out.detach(), y, metrics_mask) self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('train_loss', loss, on_step=False, on_epoch=True, logger=True, prog_bar=False) return loss def validation_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target mask = batch_data.get('mask') eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') # Compute predictions and compute loss out, imputations, predictions = self.predict_batch(batch, preprocess=False, postprocess=False) if self.scaled_target: target = self._preprocess(y, batch_preprocessing) else: target = y predictions = [self._postprocess(prd, batch_preprocessing) for prd in predictions] val_loss = sum([self.loss_fn(pred, target, mask) for pred in predictions]) # Logging out = self._postprocess(out, batch_preprocessing) 
self.val_metrics.update(out.detach(), y, eval_mask) self.log_dict(self.val_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('val_loss', val_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) return val_loss def test_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') # Compute outputs and rescale imputation, *_ = self.predict_batch(batch, preprocess=False, postprocess=True) test_loss = self.loss_fn(imputation, y, eval_mask) # Logging self.test_metrics.update(imputation.detach(), y, eval_mask) self.log_dict(self.test_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('test_loss', test_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) return test_loss ================================================ FILE: lib/fillers/filler.py ================================================ import inspect from copy import deepcopy import pytorch_lightning as pl import torch from pytorch_lightning.core.decorators import auto_move_data from pytorch_lightning.metrics import MetricCollection from pytorch_lightning.utilities import move_data_to_device from .. import epsilon from ..nn.utils.metric_base import MaskedMetric from ..utils.utils import ensure_list class Filler(pl.LightningModule): def __init__(self, model_class, model_kwargs, optim_class, optim_kwargs, loss_fn, scaled_target=False, whiten_prob=0.05, metrics=None, scheduler_class=None, scheduler_kwargs=None): """ PL module to implement hole fillers. :param model_class: Class of pytorch nn.Module implementing the imputer. :param model_kwargs: Model's keyword arguments. :param optim_class: Optimizer class. :param optim_kwargs: Optimizer's keyword arguments. :param loss_fn: Loss function used for training. 
:param scaled_target: Whether to scale target before computing loss using batch processing information. :param whiten_prob: Probability of removing a value and using it as ground truth for imputation. :param metrics: Dictionary of type {'metric1_name':metric1_fn, 'metric2_name':metric2_fn ...}. :param scheduler_class: Scheduler class. :param scheduler_kwargs: Scheduler's keyword arguments. """ super(Filler, self).__init__() self.save_hyperparameters(model_kwargs) self.model_cls = model_class self.model_kwargs = model_kwargs self.optim_class = optim_class self.optim_kwargs = optim_kwargs self.scheduler_class = scheduler_class if scheduler_kwargs is None: self.scheduler_kwargs = dict() else: self.scheduler_kwargs = scheduler_kwargs if loss_fn is not None: self.loss_fn = self._check_metric(loss_fn, on_step=True) else: self.loss_fn = None self.scaled_target = scaled_target # during training whiten ground-truth values with this probability assert 0. <= whiten_prob <= 1. self.keep_prob = 1. - whiten_prob if metrics is None: metrics = dict() self._set_metrics(metrics) # instantiate model self.model = self.model_cls(**self.model_kwargs) def reset_model(self): self.model = self.model_cls(**self.model_kwargs) @property def trainable_parameters(self): return sum(p.numel() for p in self.model.parameters() if p.requires_grad) @auto_move_data def forward(self, *args, **kwargs): return self.model(*args, **kwargs) @staticmethod def _check_metric(metric, on_step=False): if not isinstance(metric, MaskedMetric): if 'reduction' in inspect.getfullargspec(metric).args: metric_kwargs = {'reduction': 'none'} else: metric_kwargs = dict() return MaskedMetric(metric, compute_on_step=on_step, metric_kwargs=metric_kwargs) return deepcopy(metric) def _set_metrics(self, metrics): self.train_metrics = MetricCollection( {f'train_{k}': self._check_metric(m, on_step=True) for k, m in metrics.items()}) self.val_metrics = MetricCollection({f'val_{k}': self._check_metric(m) for k, m in 
metrics.items()}) self.test_metrics = MetricCollection({f'test_{k}': self._check_metric(m) for k, m in metrics.items()}) def _preprocess(self, data, batch_preprocessing): """ Perform preprocessing of a given input. :param data: pytorch tensor of shape [batch, steps, nodes, features] to preprocess :param batch_preprocessing: dictionary containing preprocessing data :return: preprocessed data """ if isinstance(data, (list, tuple)): return [self._preprocess(d, batch_preprocessing) for d in data] trend = batch_preprocessing.get('trend', 0.) bias = batch_preprocessing.get('bias', 0.) scale = batch_preprocessing.get('scale', 1.) return (data - trend - bias) / (scale + epsilon) def _postprocess(self, data, batch_preprocessing): """ Perform postprocessing (inverse transform) of a given input. :param data: pytorch tensor of shape [batch, steps, nodes, features] to transform :param batch_preprocessing: dictionary containing preprocessing data :return: inverse transformed data """ if isinstance(data, (list, tuple)): return [self._postprocess(d, batch_preprocessing) for d in data] trend = batch_preprocessing.get('trend', 0.) bias = batch_preprocessing.get('bias', 0.) scale = batch_preprocessing.get('scale', 1.) return data * (scale + epsilon) + bias + trend def predict_batch(self, batch, preprocess=False, postprocess=True, return_target=False): """ This method takes as input a batch made of two dictionaries containing tensors and outputs the predictions. 
Predictions should have shape [batch, nodes, horizon] :param batch: list of two dictionaries following the structure [data: {'x':[...], 'y':[...], 'u':[...], ...}, preprocessing: {'bias': ..., 'scale': ..., 'x_trend':[...], 'y_trend':[...]}] :param preprocess: whether the data need to be preprocessed (note that inputs are by default preprocessed before creating the batch) :param postprocess: whether to postprocess the predictions (if True we assume that the model has learned to predict the transformed signal) :param return_target: whether to return the prediction target y_true and the prediction mask :return: (y_true), y_hat, (mask) """ batch_data, batch_preprocessing = self._unpack_batch(batch) if preprocess: x = batch_data.pop('x') x = self._preprocess(x, batch_preprocessing) y_hat = self.forward(x, **batch_data) else: y_hat = self.forward(**batch_data) # Rescale outputs if postprocess: y_hat = self._postprocess(y_hat, batch_preprocessing) if return_target: y = batch_data.get('y') mask = batch_data.get('mask', None) return y, y_hat, mask return y_hat def predict_loader(self, loader, preprocess=False, postprocess=True, return_mask=True): """ Makes predictions for an input dataloader. Returns both the predictions and the prediction targets. 
:param loader: torch dataloader :param preprocess: whether to preprocess the data :param postprocess: whether to postprocess the data :param return_mask: whether to return the valid mask (if it exists) :return: y_true, y_hat """ targets, imputations, masks = [], [], [] for batch in loader: batch = move_data_to_device(batch, self.device) batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') y_hat = self.predict_batch(batch, preprocess=preprocess, postprocess=postprocess) if isinstance(y_hat, (list, tuple)): y_hat = y_hat[0] targets.append(y) imputations.append(y_hat) masks.append(eval_mask) y = torch.cat(targets, 0) y_hat = torch.cat(imputations, 0) if return_mask: mask = torch.cat(masks, 0) if masks[0] is not None else None return y, y_hat, mask return y, y_hat def _unpack_batch(self, batch): """ Unpack a batch into data and preprocessing dictionaries. :param batch: the batch :return: batch_data, batch_preprocessing """ if isinstance(batch, (tuple, list)) and (len(batch) == 2): batch_data, batch_preprocessing = batch else: batch_data = batch batch_preprocessing = dict() return batch_data, batch_preprocessing def training_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target mask = batch_data['mask'].clone().detach() batch_data['mask'] = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte() eval_mask = batch_data.pop('eval_mask') eval_mask = (mask | eval_mask) - batch_data['mask'] y = batch_data.pop('y') # Compute predictions and compute loss imputation = self.predict_batch(batch, preprocess=False, postprocess=False) if self.scaled_target: target = self._preprocess(y, batch_preprocessing) else: target = y imputation = self._postprocess(imputation, batch_preprocessing) loss = self.loss_fn(imputation, target, mask) # Logging if self.scaled_target: imputation = 
self._postprocess(imputation, batch_preprocessing) self.train_metrics.update(imputation.detach(), y, eval_mask) # all unseen data self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('train_loss', loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) return loss def validation_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') # Compute predictions and compute loss imputation = self.predict_batch(batch, preprocess=False, postprocess=False) if self.scaled_target: target = self._preprocess(y, batch_preprocessing) else: target = y imputation = self._postprocess(imputation, batch_preprocessing) val_loss = self.loss_fn(imputation, target, eval_mask) # Logging if self.scaled_target: imputation = self._postprocess(imputation, batch_preprocessing) self.val_metrics.update(imputation.detach(), y, eval_mask) self.log_dict(self.val_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('val_loss', val_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) return val_loss def test_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') # Compute outputs and rescale imputation = self.predict_batch(batch, preprocess=False, postprocess=True) test_loss = self.loss_fn(imputation, y, eval_mask) # Logging self.test_metrics.update(imputation.detach(), y, eval_mask) self.log_dict(self.test_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) return test_loss def on_train_epoch_start(self) -> None: optimizers = ensure_list(self.optimizers()) for i, optimizer in enumerate(optimizers): lr = optimizer.optimizer.param_groups[0]['lr'] self.log(f'lr_{i}', lr, on_step=False, 
on_epoch=True, logger=True, prog_bar=False) def configure_optimizers(self): cfg = dict() optimizer = self.optim_class(self.parameters(), **self.optim_kwargs) cfg['optimizer'] = optimizer if self.scheduler_class is not None: metric = self.scheduler_kwargs.pop('monitor', None) scheduler = self.scheduler_class(optimizer, **self.scheduler_kwargs) cfg['lr_scheduler'] = scheduler if metric is not None: cfg['monitor'] = metric return cfg ================================================ FILE: lib/fillers/graphfiller.py ================================================ import torch from . import Filler from ..nn.models import MPGRUNet, GRINet, BiMPGRUNet class GraphFiller(Filler): def __init__(self, model_class, model_kwargs, optim_class, optim_kwargs, loss_fn, scaled_target=False, whiten_prob=0.05, pred_loss_weight=1., warm_up=0, metrics=None, scheduler_class=None, scheduler_kwargs=None): super(GraphFiller, self).__init__(model_class=model_class, model_kwargs=model_kwargs, optim_class=optim_class, optim_kwargs=optim_kwargs, loss_fn=loss_fn, scaled_target=scaled_target, whiten_prob=whiten_prob, metrics=metrics, scheduler_class=scheduler_class, scheduler_kwargs=scheduler_kwargs) self.tradeoff = pred_loss_weight if model_class is MPGRUNet: self.trimming = (warm_up, 0) elif model_class in [GRINet, BiMPGRUNet]: self.trimming = (warm_up, warm_up) def trim_seq(self, *seq): seq = [s[:, self.trimming[0]:s.size(1) - self.trimming[1]] for s in seq] if len(seq) == 1: return seq[0] return seq def training_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Compute masks mask = batch_data['mask'].clone().detach() batch_data['mask'] = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte() eval_mask = batch_data.pop('eval_mask', None) eval_mask = (mask | eval_mask) - batch_data['mask'] # all unseen data y = batch_data.pop('y') # Compute predictions and compute loss res = self.predict_batch(batch, preprocess=False, 
postprocess=False) imputation, predictions = (res[0], res[1:]) if isinstance(res, (list, tuple)) else (res, []) # trim to imputation horizon len imputation, mask, eval_mask, y = self.trim_seq(imputation, mask, eval_mask, y) predictions = self.trim_seq(*predictions) if self.scaled_target: target = self._preprocess(y, batch_preprocessing) else: target = y imputation = self._postprocess(imputation, batch_preprocessing) for i, _ in enumerate(predictions): predictions[i] = self._postprocess(predictions[i], batch_preprocessing) loss = self.loss_fn(imputation, target, mask) for pred in predictions: loss += self.tradeoff * self.loss_fn(pred, target, mask) # Logging if self.scaled_target: imputation = self._postprocess(imputation, batch_preprocessing) self.train_metrics.update(imputation.detach(), y, eval_mask) # all unseen data self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('train_loss', loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) return loss def validation_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target mask = batch_data.get('mask') eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') # Compute predictions and compute loss imputation = self.predict_batch(batch, preprocess=False, postprocess=False) # trim to imputation horizon len imputation, mask, eval_mask, y = self.trim_seq(imputation, mask, eval_mask, y) if self.scaled_target: target = self._preprocess(y, batch_preprocessing) else: target = y imputation = self._postprocess(imputation, batch_preprocessing) val_loss = self.loss_fn(imputation, target, eval_mask) # Logging if self.scaled_target: imputation = self._postprocess(imputation, batch_preprocessing) self.val_metrics.update(imputation.detach(), y, eval_mask) self.log_dict(self.val_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('val_loss', 
val_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) return val_loss def test_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and target eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') # Compute outputs and rescale imputation = self.predict_batch(batch, preprocess=False, postprocess=True) test_loss = self.loss_fn(imputation, y, eval_mask) # Logging self.test_metrics.update(imputation.detach(), y, eval_mask) self.log_dict(self.test_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('test_loss', test_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) return test_loss ================================================ FILE: lib/fillers/multi_imputation_filler.py ================================================ import torch from pytorch_lightning.core.decorators import auto_move_data from . import Filler class MultiImputationFiller(Filler): """ Filler with multiple imputation outputs """ def __init__(self, model_class, model_kwargs, optim_class, optim_kwargs, loss_fn, consistency_loss=False, scaled_target=False, whiten_prob=0.05, metrics=None, scheduler_class=None, scheduler_kwargs=None): super().__init__(model_class, model_kwargs, optim_class, optim_kwargs, loss_fn, scaled_target, whiten_prob, metrics, scheduler_class, scheduler_kwargs) self.consistency_loss = consistency_loss @auto_move_data def forward(self, *args, **kwargs): out = self.model(*args, **kwargs) assert isinstance(out, (list, tuple)) if self.training: return out return out[0] # we assume that the final imputation is the first one def _consistency_loss(self, imputations, mask): from itertools import combinations return sum([self.loss_fn(imp1, imp2, mask) for imp1, imp2 in combinations(imputations, 2)]) def training_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) # Extract mask and 
target mask = batch_data['mask'].clone().detach() batch_data['mask'] = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte() eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') # Compute predictions and compute loss imputations = self.predict_batch(batch, preprocess=False, postprocess=False) if self.scaled_target: target = self._preprocess(y, batch_preprocessing) else: target = y imputations = [self._postprocess(imp, batch_preprocessing) for imp in imputations] loss = sum([self.loss_fn(imp, target, mask) for imp in imputations]) if self.consistency_loss: loss += self._consistency_loss(imputations, mask) # Logging metrics_mask = (mask | eval_mask) - batch_data['mask'] # all unseen data x_hat = imputations[0] x_hat = self._postprocess(x_hat, batch_preprocessing) self.train_metrics.update(x_hat.detach(), y, metrics_mask) self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('train_loss', loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) return loss ================================================ FILE: lib/fillers/rgainfiller.py ================================================ import torch from torch.nn import functional as F from .multi_imputation_filler import MultiImputationFiller from ..nn.utils.metric_base import MaskedMetric class MaskedBCEWithLogits(MaskedMetric): def __init__(self, mask_nans=False, mask_inf=False, compute_on_step=True, dist_sync_on_step=False, process_group=None, dist_sync_fn=None, at=None): super(MaskedBCEWithLogits, self).__init__(metric_fn=F.binary_cross_entropy_with_logits, mask_nans=mask_nans, mask_inf=mask_inf, compute_on_step=compute_on_step, dist_sync_on_step=dist_sync_on_step, process_group=process_group, dist_sync_fn=dist_sync_fn, metric_kwargs={'reduction': 'none'}, at=at) class RGAINFiller(MultiImputationFiller): def __init__(self, model_class, model_kwargs, optim_class, optim_kwargs, loss_fn, g_train_freq=1, d_train_freq=5, 
consistency_loss=False, scaled_target=True, whiten_prob=0.05, hint_rate=0.7, alpha=10., metrics=None, scheduler_class=None, scheduler_kwargs=None): super(RGAINFiller, self).__init__(model_class=model_class, model_kwargs=model_kwargs, optim_class=optim_class, optim_kwargs=optim_kwargs, loss_fn=loss_fn, scaled_target=scaled_target, whiten_prob=whiten_prob, metrics=metrics, consistency_loss=consistency_loss, scheduler_class=scheduler_class, scheduler_kwargs=scheduler_kwargs) # discriminator training params self.alpha = alpha self.g_train_freq = g_train_freq self.d_train_freq = d_train_freq self.masked_bce_loss = MaskedBCEWithLogits(compute_on_step=True) # activate manual optimization self.automatic_optimization = False self.hint_rate = hint_rate def training_step(self, batch, batch_idx): # Unpack batch batch_data, batch_preprocessing = self._unpack_batch(batch) g_opt, d_opt = self.optimizers() schedulers = self.lr_schedulers() # Extract mask and target x = batch_data.pop('x') mask = batch_data['mask'].clone().detach() training_mask = torch.bernoulli(mask.clone().detach().float() * self.keep_prob).byte() eval_mask = batch_data.pop('eval_mask', None) y = batch_data.pop('y') ########################## # generate imputations ########################## imputations = self.model.generator(x, training_mask) imputed_seq = imputations[0] target = self._preprocess(y, batch_preprocessing) y_hat = self._postprocess(imputed_seq, batch_preprocessing) x_in = training_mask * x + (1 - training_mask) * imputed_seq hint = torch.rand_like(training_mask, dtype=torch.float) < self.hint_rate hint = hint.byte() hint = hint * training_mask + (1 - hint) * 0.5 ######################### # train generator ######################### if (batch_idx % self.g_train_freq) == 0: g_opt.zero_grad() rec_loss = sum([torch.sqrt(self.loss_fn(imp, target, mask)) for imp in imputations]) if self.consistency_loss: rec_loss += self._consistency_loss(imputations, mask) logits = self.model.discriminator(x_in, hint) # 
maximize logit adv_loss = self.masked_bce_loss(logits, torch.ones_like(logits), 1 - training_mask) g_loss = self.alpha * rec_loss + adv_loss self.manual_backward(g_loss) g_opt.step() # Logging metrics_mask = (mask | eval_mask) - training_mask self.train_metrics.update(y_hat.detach(), y, metrics_mask) # all unseen data self.log_dict(self.train_metrics, on_step=False, on_epoch=True, logger=True, prog_bar=True) self.log('gen_loss', adv_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) self.log('imp_loss', rec_loss.detach(), on_step=True, on_epoch=True, logger=True, prog_bar=True) ########################### # train discriminator ########################### if (batch_idx % self.d_train_freq) == 0: d_opt.zero_grad() logits = self.model.discriminator(x_in.detach(), hint) d_loss = self.masked_bce_loss(logits, training_mask.to(logits.dtype)) self.manual_backward(d_loss) d_opt.step() self.log('d_loss', d_loss.detach(), on_step=False, on_epoch=True, logger=True, prog_bar=False) if (schedulers is not None) and self.trainer.is_last_batch: for sch in schedulers: sch.step() def configure_optimizers(self): opt_g = self.optim_class(self.model.generator.parameters(), **self.optim_kwargs) opt_d = self.optim_class(self.model.discriminator.parameters(), **self.optim_kwargs) optimizers = [opt_g, opt_d] if self.scheduler_class is not None: metric = self.scheduler_kwargs.pop('monitor', None) schedulers = [{"scheduler": self.scheduler_class(opt, **self.scheduler_kwargs), "monitor": metric} for opt in optimizers] return optimizers, schedulers return optimizers ================================================ FILE: lib/nn/__init__.py ================================================ from .layers import * ================================================ FILE: lib/nn/layers/__init__.py ================================================ from .rits import RITS, BRITS from .gril import GRIL, BiGRIL from .spatial_conv import SpatialConvOrderK from .mpgru import MPGRUImputer 
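Note on the alternating updates in `RGAINFiller.training_step` (lib/fillers/rgainfiller.py, above): the generator takes an optimizer step when `batch_idx % g_train_freq == 0` and the discriminator when `batch_idx % d_train_freq == 0`. The sketch below only illustrates which batches trigger which step under the default values `g_train_freq=1` and `d_train_freq=5`. The helper `update_schedule` is hypothetical and not part of the repository, and it involves no model, loss, or optimizer.

```python
def update_schedule(n_batches, g_train_freq=1, d_train_freq=5):
    """List, for each batch index, which networks would take an optimizer step.

    Hypothetical helper: it mirrors only the `batch_idx % freq == 0` checks
    found in RGAINFiller.training_step, nothing else.
    """
    schedule = []
    for batch_idx in range(n_batches):
        steps = []
        # generator branch: runs when batch_idx is a multiple of g_train_freq
        if batch_idx % g_train_freq == 0:
            steps.append('generator')
        # discriminator branch: runs when batch_idx is a multiple of d_train_freq
        if batch_idx % d_train_freq == 0:
            steps.append('discriminator')
        schedule.append((batch_idx, steps))
    return schedule


if __name__ == '__main__':
    for batch_idx, steps in update_schedule(6):
        print(batch_idx, steps)
```

With the defaults, the generator is stepped on every batch and the discriminator on batches 0, 5, 10 and so on. Raising `g_train_freq` above 1 would make some batches skip the generator branch entirely.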
================================================ FILE: lib/nn/layers/gcrnn.py ================================================ import torch import torch.nn as nn from .spatial_conv import SpatialConvOrderK class GCGRUCell(nn.Module): """ Graph Convolution Gated Recurrent Unit Cell. """ def __init__(self, d_in, num_units, support_len, order, activation='tanh'): """ :param d_in: the input feature dim :param num_units: the hidden dim of rnn :param support_len: the number of supports (adjacency matrices) used by each graph convolution :param order: the max diffusion step :param activation: name of the torch function applied to the candidate state (looked up with getattr(torch, activation)) """ super(GCGRUCell, self).__init__() self.activation_fn = getattr(torch, activation) self.forget_gate = SpatialConvOrderK(c_in=d_in + num_units, c_out=num_units, support_len=support_len, order=order) self.update_gate = SpatialConvOrderK(c_in=d_in + num_units, c_out=num_units, support_len=support_len, order=order) self.c_gate = SpatialConvOrderK(c_in=d_in + num_units, c_out=num_units, support_len=support_len, order=order) def forward(self, x, h, adj): """ :param x: (B, input_dim, num_nodes) :param h: (B, num_units, num_nodes) :param adj: (num_nodes, num_nodes) :return: updated hidden state, (B, num_units, num_nodes) """ # we start with bias 1.0 to not reset and not update x_gates = torch.cat([x, h], dim=1) r = torch.sigmoid(self.forget_gate(x_gates, adj)) u = torch.sigmoid(self.update_gate(x_gates, adj)) x_c = torch.cat([x, r * h], dim=1) c = self.c_gate(x_c, adj) # batch_size, self._num_nodes * output_size c = self.activation_fn(c) return u * h + (1. 
- u) * c class GCRNN(nn.Module): def __init__(self, d_in, d_model, d_out, n_layers, support_len, kernel_size=2): super(GCRNN, self).__init__() self.d_in = d_in self.d_model = d_model self.d_out = d_out self.n_layers = n_layers self.ks = kernel_size self.support_len = support_len self.rnn_cells = nn.ModuleList() for i in range(self.n_layers): self.rnn_cells.append(GCGRUCell(d_in=self.d_in if i == 0 else self.d_model, num_units=self.d_model, support_len=self.support_len, order=self.ks)) self.output_layer = nn.Conv2d(self.d_model, self.d_out, kernel_size=1) def init_hidden_states(self, x): return [torch.zeros(size=(x.shape[0], self.d_model, x.shape[2])).to(x.device) for _ in range(self.n_layers)] def single_pass(self, x, h, adj): out = x for l, layer in enumerate(self.rnn_cells): out = h[l] = layer(out, h[l], adj) return out, h def forward(self, x, adj, h=None): # x:[batch, features, nodes, steps] *_, steps = x.size() if h is None: h = self.init_hidden_states(x) # temporal conv for step in range(steps): out, h = self.single_pass(x[..., step], h, adj) return self.output_layer(out[..., None]) ================================================ FILE: lib/nn/layers/gril.py ================================================ import torch import torch.nn as nn from einops import rearrange from .spatial_conv import SpatialConvOrderK from .gcrnn import GCGRUCell from .spatial_attention import SpatialAttention from ..utils.ops import reverse_tensor class SpatialDecoder(nn.Module): def __init__(self, d_in, d_model, d_out, support_len, order=1, attention_block=False, nheads=2, dropout=0.): super(SpatialDecoder, self).__init__() self.order = order self.lin_in = nn.Conv1d(d_in, d_model, kernel_size=1) self.graph_conv = SpatialConvOrderK(c_in=d_model, c_out=d_model, support_len=support_len * order, order=1, include_self=False) if attention_block: self.spatial_att = SpatialAttention(d_in=d_model, d_model=d_model, nheads=nheads, dropout=dropout) self.lin_out = nn.Conv1d(3 * d_model, 
d_model, kernel_size=1) else: self.register_parameter('spatial_att', None) self.lin_out = nn.Conv1d(2 * d_model, d_model, kernel_size=1) self.read_out = nn.Conv1d(2 * d_model, d_out, kernel_size=1) self.activation = nn.PReLU() self.adj = None def forward(self, x, m, h, u, adj, cached_support=False): # [batch, channels, nodes] x_in = [x, m, h] if u is None else [x, m, u, h] x_in = torch.cat(x_in, 1) if self.order > 1: if cached_support and (self.adj is not None): adj = self.adj else: adj = SpatialConvOrderK.compute_support_orderK(adj, self.order, include_self=False, device=x_in.device) self.adj = adj if cached_support else None x_in = self.lin_in(x_in) out = self.graph_conv(x_in, adj) if self.spatial_att is not None: # [batch, channels, nodes] -> [batch, steps, nodes, features] x_in = rearrange(x_in, 'b f n -> b 1 n f') out_att = self.spatial_att(x_in, torch.eye(x_in.size(2), dtype=torch.bool, device=x_in.device)) out_att = rearrange(out_att, 'b s n f -> b f (n s)') out = torch.cat([out, out_att], 1) out = torch.cat([out, h], 1) out = self.activation(self.lin_out(out)) # out = self.lin_out(out) out = torch.cat([out, h], 1) return self.read_out(out), out class GRIL(nn.Module): def __init__(self, input_size, hidden_size, u_size=None, n_layers=1, dropout=0., kernel_size=2, decoder_order=1, global_att=False, support_len=2, n_nodes=None, layer_norm=False): super(GRIL, self).__init__() self.input_size = int(input_size) self.hidden_size = int(hidden_size) self.u_size = int(u_size) if u_size is not None else 0 self.n_layers = int(n_layers) rnn_input_size = 2 * self.input_size + self.u_size # input + mask + (eventually) exogenous # Spatio-temporal encoder (rnn_input_size -> hidden_size) self.cells = nn.ModuleList() self.norms = nn.ModuleList() for i in range(self.n_layers): self.cells.append(GCGRUCell(d_in=rnn_input_size if i == 0 else self.hidden_size, num_units=self.hidden_size, support_len=support_len, order=kernel_size)) if layer_norm: 
                self.norms.append(nn.GroupNorm(num_groups=1, num_channels=self.hidden_size))
            else:
                self.norms.append(nn.Identity())
        self.dropout = nn.Dropout(dropout) if dropout > 0. else None

        # First stage readout
        self.first_stage = nn.Conv1d(in_channels=self.hidden_size, out_channels=self.input_size, kernel_size=1)

        # Spatial decoder (rnn_input_size + hidden_size -> hidden_size)
        self.spatial_decoder = SpatialDecoder(d_in=rnn_input_size + self.hidden_size,
                                              d_model=self.hidden_size,
                                              d_out=self.input_size,
                                              support_len=2,
                                              order=decoder_order,
                                              attention_block=global_att)

        # Hidden state initialization embedding
        if n_nodes is not None:
            self.h0 = self.init_hidden_states(n_nodes)
        else:
            self.register_parameter('h0', None)

    def init_hidden_states(self, n_nodes):
        h0 = []
        for l in range(self.n_layers):
            std = 1. / torch.sqrt(torch.tensor(self.hidden_size, dtype=torch.float))
            vals = torch.distributions.Normal(0, std).sample((self.hidden_size, n_nodes))
            h0.append(nn.Parameter(vals))
        return nn.ParameterList(h0)

    def get_h0(self, x):
        if self.h0 is not None:
            return [h.expand(x.shape[0], -1, -1) for h in self.h0]
        return [torch.zeros(size=(x.shape[0], self.hidden_size, x.shape[2])).to(x.device)] * self.n_layers

    def update_state(self, x, h, adj):
        rnn_in = x
        for layer, (cell, norm) in enumerate(zip(self.cells, self.norms)):
            rnn_in = h[layer] = norm(cell(rnn_in, h[layer], adj))
            if self.dropout is not None and layer < (self.n_layers - 1):
                rnn_in = self.dropout(rnn_in)
        return h

    def forward(self, x, adj, mask=None, u=None, h=None, cached_support=False):
        # x:[batch, features, nodes, steps]
        *_, steps = x.size()

        # infer all valid if mask is None
        # (boolean dtype, since the mask is used as the condition of torch.where)
        if mask is None:
            mask = torch.ones_like(x, dtype=torch.bool)

        # init hidden state using node embedding or the empty state
        if h is None:
            h = self.get_h0(x)
        elif not isinstance(h, list):
            h = [*h]

        # Temporal conv
        predictions, imputations, states = [], [], []
        representations = []
        for step in range(steps):
            x_s = x[..., step]
            m_s = mask[..., step]
            h_s = h[-1]
            u_s = u[...,
step] if u is not None else None # firstly impute missing values with predictions from state xs_hat_1 = self.first_stage(h_s) # fill missing values in input with prediction x_s = torch.where(m_s, x_s, xs_hat_1) # prepare inputs # retrieve maximum information from neighbors xs_hat_2, repr_s = self.spatial_decoder(x=x_s, m=m_s, h=h_s, u=u_s, adj=adj, cached_support=cached_support) # receive messages from neighbors (no self-loop!) # readout of imputation state + mask to retrieve imputations # prepare inputs x_s = torch.where(m_s, x_s, xs_hat_2) inputs = [x_s, m_s] if u_s is not None: inputs.append(u_s) inputs = torch.cat(inputs, dim=1) # x_hat_2 + mask + exogenous # update state with original sequence filled using imputations h = self.update_state(inputs, h, adj) # store imputations and states imputations.append(xs_hat_2) predictions.append(xs_hat_1) states.append(torch.stack(h, dim=0)) representations.append(repr_s) # Aggregate outputs -> [batch, features, nodes, steps] imputations = torch.stack(imputations, dim=-1) predictions = torch.stack(predictions, dim=-1) states = torch.stack(states, dim=-1) representations = torch.stack(representations, dim=-1) return imputations, predictions, representations, states class BiGRIL(nn.Module): def __init__(self, input_size, hidden_size, ff_size, ff_dropout, n_layers=1, dropout=0., n_nodes=None, support_len=2, kernel_size=2, decoder_order=1, global_att=False, u_size=0, embedding_size=0, layer_norm=False, merge='mlp'): super(BiGRIL, self).__init__() self.fwd_rnn = GRIL(input_size=input_size, hidden_size=hidden_size, n_layers=n_layers, dropout=dropout, n_nodes=n_nodes, support_len=support_len, kernel_size=kernel_size, decoder_order=decoder_order, global_att=global_att, u_size=u_size, layer_norm=layer_norm) self.bwd_rnn = GRIL(input_size=input_size, hidden_size=hidden_size, n_layers=n_layers, dropout=dropout, n_nodes=n_nodes, support_len=support_len, kernel_size=kernel_size, decoder_order=decoder_order, global_att=global_att, 
u_size=u_size, layer_norm=layer_norm) if n_nodes is None: embedding_size = 0 if embedding_size > 0: self.emb = nn.Parameter(torch.empty(embedding_size, n_nodes)) nn.init.kaiming_normal_(self.emb, nonlinearity='relu') else: self.register_parameter('emb', None) if merge == 'mlp': self._impute_from_states = True self.out = nn.Sequential( nn.Conv2d(in_channels=4 * hidden_size + input_size + embedding_size, out_channels=ff_size, kernel_size=1), nn.ReLU(), nn.Dropout(ff_dropout), nn.Conv2d(in_channels=ff_size, out_channels=input_size, kernel_size=1) ) elif merge in ['mean', 'sum', 'min', 'max']: self._impute_from_states = False self.out = getattr(torch, merge) else: raise ValueError("Merge option %s not allowed." % merge) self.supp = None def forward(self, x, adj, mask=None, u=None, cached_support=False): if cached_support and (self.supp is not None): supp = self.supp else: supp = SpatialConvOrderK.compute_support(adj, x.device) self.supp = supp if cached_support else None # Forward fwd_out, fwd_pred, fwd_repr, _ = self.fwd_rnn(x, supp, mask=mask, u=u, cached_support=cached_support) # Backward rev_x, rev_mask, rev_u = [reverse_tensor(tens) for tens in (x, mask, u)] *bwd_res, _ = self.bwd_rnn(rev_x, supp, mask=rev_mask, u=rev_u, cached_support=cached_support) bwd_out, bwd_pred, bwd_repr = [reverse_tensor(res) for res in bwd_res] if self._impute_from_states: inputs = [fwd_repr, bwd_repr, mask] if self.emb is not None: b, *_, s = fwd_repr.shape # fwd_h: [batches, channels, nodes, steps] inputs += [self.emb.view(1, *self.emb.shape, 1).expand(b, -1, -1, s)] # stack emb for batches and steps imputation = torch.cat(inputs, dim=1) imputation = self.out(imputation) else: imputation = torch.stack([fwd_out, bwd_out], dim=1) imputation = self.out(imputation, dim=1) predictions = torch.stack([fwd_out, bwd_out, fwd_pred, bwd_pred], dim=0) return imputation, predictions ================================================ FILE: lib/nn/layers/imputation.py 
================================================ import math import torch from torch import nn from torch.nn import functional as F class ImputationLayer(nn.Module): def __init__(self, d_in, bias=True): super(ImputationLayer, self).__init__() self.W = nn.Parameter(torch.Tensor(d_in, d_in)) if bias: self.b = nn.Parameter(torch.Tensor(d_in)) else: self.register_buffer('b', None) mask = 1. - torch.eye(d_in) self.register_buffer('mask', mask) self.reset_parameters() def reset_parameters(self): nn.init.kaiming_uniform_(self.W, a=math.sqrt(5)) if self.b is not None: fan_in, _ = nn.init._calculate_fan_in_and_fan_out(self.W) bound = 1 / math.sqrt(fan_in) nn.init.uniform_(self.b, -bound, bound) def forward(self, x): # batch, features return F.linear(x, self.mask * self.W, self.b) ================================================ FILE: lib/nn/layers/mpgru.py ================================================ import torch from torch import nn from .gcrnn import GCGRUCell class MPGRUImputer(nn.Module): def __init__(self, input_size, hidden_size, ff_size=None, u_size=None, n_layers=1, dropout=0., kernel_size=2, support_len=2, n_nodes=None, layer_norm=False, autoencoder_mode=False): super(MPGRUImputer, self).__init__() self.input_size = int(input_size) self.hidden_size = int(hidden_size) self.ff_size = int(ff_size) if ff_size is not None else 0 self.u_size = int(u_size) if u_size is not None else 0 self.n_layers = int(n_layers) rnn_input_size = 2 * self.input_size + self.u_size # input + mask + (eventually) exogenous # Spatio-temporal encoder (rnn_input_size -> hidden_size) self.cells = nn.ModuleList() self.norms = nn.ModuleList() for i in range(self.n_layers): self.cells.append(GCGRUCell(d_in=rnn_input_size if i == 0 else self.hidden_size, num_units=self.hidden_size, support_len=support_len, order=kernel_size)) if layer_norm: self.norms.append(nn.GroupNorm(num_groups=1, num_channels=self.hidden_size)) else: self.norms.append(nn.Identity()) self.dropout = nn.Dropout(dropout) if 
dropout > 0. else None # Readout if self.ff_size: self.pred_readout = nn.Sequential( nn.Conv1d(in_channels=self.hidden_size, out_channels=self.ff_size, kernel_size=1), nn.PReLU(), nn.Conv1d(in_channels=self.ff_size, out_channels=self.input_size, kernel_size=1) ) else: self.pred_readout = nn.Conv1d(in_channels=self.hidden_size, out_channels=self.input_size, kernel_size=1) # Hidden state initialization embedding if n_nodes is not None: self.h0 = self.init_hidden_states(n_nodes) else: self.register_parameter('h0', None) self.autoencoder_mode = autoencoder_mode def init_hidden_states(self, n_nodes): h0 = [] for l in range(self.n_layers): std = 1. / torch.sqrt(torch.tensor(self.hidden_size, dtype=torch.float)) vals = torch.distributions.Normal(0, std).sample((self.hidden_size, n_nodes)) h0.append(nn.Parameter(vals)) return nn.ParameterList(h0) def get_h0(self, x): if self.h0 is not None: return [h.expand(x.shape[0], -1, -1) for h in self.h0] return [torch.zeros(size=(x.shape[0], self.hidden_size, x.shape[2])).to(x.device)] * self.n_layers def update_state(self, x, h, adj): rnn_in = x for layer, (cell, norm) in enumerate(zip(self.cells, self.norms)): rnn_in = h[layer] = norm(cell(rnn_in, h[layer], adj)) if self.dropout is not None and layer < (self.n_layers - 1): rnn_in = self.dropout(rnn_in) return h def forward(self, x, adj, mask=None, u=None, h=None): # x:[batch, features, nodes, steps] *_, steps = x.size() # infer all valid if mask is None if mask is None: mask = torch.ones_like(x, dtype=torch.uint8) # init hidden state using node embedding or the empty state if h is None: h = self.get_h0(x) elif not isinstance(h, list): h = [*h] # Temporal conv predictions, states = [], [] for step in range(steps): x_s = x[..., step] m_s = mask[..., step] h_s = h[-1] u_s = u[..., step] if u is not None else None # impute missing values with predictions from state x_s_hat = self.pred_readout(h_s) # store imputations and state predictions.append(x_s_hat) states.append(torch.stack(h, 
dim=0)) # fill missing values in input with prediction x_s = torch.where(m_s, x_s, x_s_hat) inputs = [x_s, m_s] if u_s is not None: inputs.append(u_s) inputs = torch.cat(inputs, dim=1) # x_hat complemented + mask + exogenous # update state with original sequence filled using imputations h = self.update_state(inputs, h, adj) # In autoencoder mode use states after input processing if self.autoencoder_mode: states = states[1:] + [torch.stack(h, dim=0)] # Aggregate outputs -> [batch, features, nodes, steps] predictions = torch.stack(predictions, dim=-1) states = torch.stack(states, dim=-1) return predictions, states ================================================ FILE: lib/nn/layers/rits.py ================================================ import math import torch import torch.nn as nn import torch.nn.functional as F from torch.autograd import Variable from torch.nn.parameter import Parameter from ..utils.ops import reverse_tensor class FeatureRegression(nn.Module): def __init__(self, input_size): super(FeatureRegression, self).__init__() self.W = Parameter(torch.Tensor(input_size, input_size)) self.b = Parameter(torch.Tensor(input_size)) m = torch.ones(input_size, input_size) - torch.eye(input_size, input_size) self.register_buffer('m', m) self.reset_parameters() def reset_parameters(self): stdv = 1. / math.sqrt(self.W.shape[0]) self.W.data.uniform_(-stdv, stdv) if self.b is not None: self.b.data.uniform_(-stdv, stdv) def forward(self, x): z_h = F.linear(x, self.W * Variable(self.m), self.b) return z_h class TemporalDecay(nn.Module): def __init__(self, d_in, d_out, diag=False): super(TemporalDecay, self).__init__() self.diag = diag self.W = Parameter(torch.Tensor(d_out, d_in)) self.b = Parameter(torch.Tensor(d_out)) if self.diag: assert (d_in == d_out) m = torch.eye(d_in, d_in) self.register_buffer('m', m) self.reset_parameters() def reset_parameters(self): stdv = 1. 
/ math.sqrt(self.W.shape[0]) self.W.data.uniform_(-stdv, stdv) if self.b is not None: self.b.data.uniform_(-stdv, stdv) @staticmethod def compute_delta(mask, freq=1): delta = torch.zeros_like(mask).float() one_step = torch.tensor(freq, dtype=delta.dtype, device=delta.device) for i in range(1, delta.shape[-2]): m = mask[..., i - 1, :] delta[..., i, :] = m * one_step + (1 - m) * torch.add(delta[..., i - 1, :], freq) return delta def forward(self, d): if self.diag: gamma = F.relu(F.linear(d, self.W * Variable(self.m), self.b)) else: gamma = F.relu(F.linear(d, self.W, self.b)) gamma = torch.exp(-gamma) return gamma class RITS(nn.Module): def __init__(self, input_size, hidden_size=64): super(RITS, self).__init__() self.input_size = int(input_size) self.hidden_size = int(hidden_size) self.rnn_cell = nn.LSTMCell(2 * self.input_size, self.hidden_size) self.temp_decay_h = TemporalDecay(d_in=self.input_size, d_out=self.hidden_size, diag=False) self.temp_decay_x = TemporalDecay(d_in=self.input_size, d_out=self.input_size, diag=True) self.hist_reg = nn.Linear(self.hidden_size, self.input_size) self.feat_reg = FeatureRegression(self.input_size) self.weight_combine = nn.Linear(2 * self.input_size, self.input_size) def init_hidden_states(self, x): return Variable(torch.zeros((x.shape[0], self.hidden_size))).to(x.device) def forward(self, x, mask=None, delta=None): # x : [batch, steps, features] steps = x.shape[-2] if mask is None: mask = torch.ones_like(x, dtype=torch.uint8) if delta is None: delta = TemporalDecay.compute_delta(mask) # init rnn states h = self.init_hidden_states(x) c = self.init_hidden_states(x) imputation = [] predictions = [] for step in range(steps): d = delta[:, step, :] m = mask[:, step, :] x_s = x[:, step, :] gamma_h = self.temp_decay_h(d) # history prediction x_h = self.hist_reg(h) x_c = m * x_s + (1 - m) * x_h h = h * gamma_h # feature prediction z_h = self.feat_reg(x_c) # predictions combination gamma_x = self.temp_decay_x(d) alpha = 
self.weight_combine(torch.cat([gamma_x, m], dim=1)) alpha = torch.sigmoid(alpha) c_h = alpha * z_h + (1 - alpha) * x_h c_c = m * x_s + (1 - m) * c_h inputs = torch.cat([c_c, m], dim=1) h, c = self.rnn_cell(inputs, (h, c)) imputation.append(c_c) predictions.append(torch.stack((c_h, z_h, x_h), dim=0)) # imputation -> [batch, steps, features] imputation = torch.stack(imputation, dim=-2) # predictions -> [predictions, batch, steps, features] predictions = torch.stack(predictions, dim=-2) c_h, z_h, x_h = predictions return imputation, (c_h, z_h, x_h) class BRITS(nn.Module): def __init__(self, input_size, hidden_size): super().__init__() self.rits_fwd = RITS(input_size, hidden_size) self.rits_bwd = RITS(input_size, hidden_size) def forward(self, x, mask=None): # x: [batches, steps, features] # forward imp_fwd, pred_fwd = self.rits_fwd(x, mask) # backward x_bwd = reverse_tensor(x, axis=1) mask_bwd = reverse_tensor(mask, axis=1) if mask is not None else None imp_bwd, pred_bwd = self.rits_bwd(x_bwd, mask_bwd) imp_bwd, pred_bwd = reverse_tensor(imp_bwd, axis=1), [reverse_tensor(pb, axis=1) for pb in pred_bwd] # stack into shape = [batch, directions, steps, features] imputation = torch.stack([imp_fwd, imp_bwd], dim=1) predictions = [torch.stack([pf, pb], dim=1) for pf, pb in zip(pred_fwd, pred_bwd)] c_h, z_h, x_h = predictions return imputation, (c_h, z_h, x_h) @staticmethod def consistency_loss(imp_fwd, imp_bwd): loss = 0.1 * torch.abs(imp_fwd - imp_bwd).mean() return loss ================================================ FILE: lib/nn/layers/spatial_attention.py ================================================ import torch.nn as nn from einops import rearrange class SpatialAttention(nn.Module): def __init__(self, d_in, d_model, nheads, dropout=0.): super(SpatialAttention, self).__init__() self.lin_in = nn.Linear(d_in, d_model) self.self_attn = nn.MultiheadAttention(d_model, nheads, dropout=dropout) def forward(self, x, att_mask=None, **kwargs): r"""Pass the input through the 
        attention layer.

        Args:
            x: the input tensor, with shape [batch, steps, nodes, features].
                Attention is computed across the nodes dimension.
            att_mask: the mask forwarded to nn.MultiheadAttention as `attn_mask` (optional).

        Returns:
            A tensor with shape [batch, steps, nodes, d_model].
        """
        b, s, n, f = x.size()
        x = rearrange(x, 'b s n f -> n (b s) f')
        x = self.lin_in(x)
        x = self.self_attn(x, x, x, attn_mask=att_mask)[0]
        x = rearrange(x, 'n (b s) f -> b s n f', b=b, s=s)
        return x

================================================
FILE: lib/nn/layers/spatial_conv.py
================================================
import torch
from torch import nn

from ... import epsilon


class SpatialConvOrderK(nn.Module):
    """
    Spatial convolution of order K with possibly different diffusion matrices (useful for directed graphs)

    Efficient implementation inspired by the graph-wavenet codebase
    """

    def __init__(self, c_in, c_out, support_len=3, order=2, include_self=True):
        super(SpatialConvOrderK, self).__init__()
        self.include_self = include_self
        c_in = (order * support_len + (1 if include_self else 0)) * c_in
        self.mlp = nn.Conv2d(c_in, c_out, kernel_size=1)
        self.order = order

    @staticmethod
    def compute_support(adj, device=None):
        if device is not None:
            adj = adj.to(device)
        adj_bwd = adj.T
        adj_fwd = adj / (adj.sum(1, keepdims=True) + epsilon)
        adj_bwd = adj_bwd / (adj_bwd.sum(1, keepdims=True) + epsilon)
        support = [adj_fwd, adj_bwd]
        return support

    @staticmethod
    def compute_support_orderK(adj, k, include_self=False, device=None):
        if isinstance(adj, (list, tuple)):
            support = adj
        else:
            support = SpatialConvOrderK.compute_support(adj, device)
        supp_k = []
        for a in support:
            ak = a
            for i in range(k - 1):
                ak = torch.matmul(ak, a.T)
                if not include_self:
                    ak.fill_diagonal_(0.)
supp_k.append(ak) return support + supp_k def forward(self, x, support): # [batch, features, nodes, steps] if x.dim() < 4: squeeze = True x = torch.unsqueeze(x, -1) else: squeeze = False out = [x] if self.include_self else [] if (type(support) is not list): support = [support] for a in support: x1 = torch.einsum('ncvl,wv->ncwl', (x, a)).contiguous() out.append(x1) for k in range(2, self.order + 1): x2 = torch.einsum('ncvl,wv->ncwl', (x1, a)).contiguous() out.append(x2) x1 = x2 out = torch.cat(out, dim=1) out = self.mlp(out) if squeeze: out = out.squeeze(-1) return out ================================================ FILE: lib/nn/models/__init__.py ================================================ from .grin import GRINet from .brits import BRITSNet from .mpgru import MPGRUNet, BiMPGRUNet from .var import VARImputer from .rgain import RGAINNet from .rnn_imputers import BiRNNImputer, RNNImputer ================================================ FILE: lib/nn/models/brits.py ================================================ import torch from torch import nn from ..layers import BRITS class BRITSNet(nn.Module): def __init__(self, d_in, d_hidden=64): super(BRITSNet, self).__init__() self.birits = BRITS(input_size=d_in, hidden_size=d_hidden) def forward(self, x, mask=None, **kwargs): # x: [batches, steps, features] imputations, predictions = self.birits(x, mask=mask) # predictions: [batch, directions, steps, features] x 3 out = torch.mean(imputations, dim=1) # -> [batch, steps, features] predictions = torch.cat(predictions, dim=1) # -> [batch, directions * n_predictions, steps, features] # reshape imputations = torch.transpose(imputations, 0, 1) # rearrange(imputations, 'b d s f -> d b s f') predictions = torch.transpose(predictions, 0, 1) # rearrange(predictions, 'b d s f -> d b s f') return out, imputations, predictions @staticmethod def add_model_specific_args(parser): parser.add_argument('--d-in', type=int) parser.add_argument('--d-hidden', type=int, default=64) return 
parser ================================================ FILE: lib/nn/models/grin.py ================================================ import torch from einops import rearrange from torch import nn from ..layers import BiGRIL from ...utils.parser_utils import str_to_bool class GRINet(nn.Module): def __init__(self, adj, d_in, d_hidden, d_ff, ff_dropout, n_layers=1, kernel_size=2, decoder_order=1, global_att=False, d_u=0, d_emb=0, layer_norm=False, merge='mlp', impute_only_holes=True): super(GRINet, self).__init__() self.d_in = d_in self.d_hidden = d_hidden self.d_u = int(d_u) if d_u is not None else 0 self.d_emb = int(d_emb) if d_emb is not None else 0 self.register_buffer('adj', torch.tensor(adj).float()) self.impute_only_holes = impute_only_holes self.bigrill = BiGRIL(input_size=self.d_in, ff_size=d_ff, ff_dropout=ff_dropout, hidden_size=self.d_hidden, embedding_size=self.d_emb, n_nodes=self.adj.shape[0], n_layers=n_layers, kernel_size=kernel_size, decoder_order=decoder_order, global_att=global_att, u_size=self.d_u, layer_norm=layer_norm, merge=merge) def forward(self, x, mask=None, u=None, **kwargs): # x: [batches, steps, nodes, channels] -> [batches, channels, nodes, steps] x = rearrange(x, 'b s n c -> b c n s') if mask is not None: mask = rearrange(mask, 'b s n c -> b c n s') if u is not None: u = rearrange(u, 'b s n c -> b c n s') # imputation: [batches, channels, nodes, steps] prediction: [4, batches, channels, nodes, steps] imputation, prediction = self.bigrill(x, self.adj, mask=mask, u=u, cached_support=self.training) # In evaluation stage impute only missing values if self.impute_only_holes and not self.training: imputation = torch.where(mask, x, imputation) # out: [batches, channels, nodes, steps] -> [batches, steps, nodes, channels] imputation = torch.transpose(imputation, -3, -1) prediction = torch.transpose(prediction, -3, -1) if self.training: return imputation, prediction return imputation @staticmethod def add_model_specific_args(parser): 
        parser.add_argument('--d-hidden', type=int, default=64)
        parser.add_argument('--d-ff', type=int, default=64)
        # dropout is a probability, so it is parsed as a float (it was type=int)
        parser.add_argument('--ff-dropout', type=float, default=0.)
        parser.add_argument('--n-layers', type=int, default=1)
        parser.add_argument('--kernel-size', type=int, default=2)
        parser.add_argument('--decoder-order', type=int, default=1)
        parser.add_argument('--d-u', type=int, default=0)
        parser.add_argument('--d-emb', type=int, default=8)
        parser.add_argument('--layer-norm', type=str_to_bool, nargs='?', const=True, default=False)
        parser.add_argument('--global-att', type=str_to_bool, nargs='?', const=True, default=False)
        parser.add_argument('--merge', type=str, default='mlp')
        parser.add_argument('--impute-only-holes', type=str_to_bool, nargs='?', const=True, default=True)
        return parser

================================================
FILE: lib/nn/models/mpgru.py
================================================
import torch
from einops import rearrange
from torch import nn

from ..layers import MPGRUImputer, SpatialConvOrderK
from ..utils.ops import reverse_tensor
from ...utils.parser_utils import str_to_bool


class MPGRUNet(nn.Module):
    def __init__(self,
                 adj,
                 d_in,
                 d_hidden,
                 d_ff=0,
                 d_u=0,
                 n_layers=1,
                 dropout=0.,
                 kernel_size=2,
                 support_len=2,
                 layer_norm=False,
                 impute_only_holes=True):
        super(MPGRUNet, self).__init__()
        self.register_buffer('adj', torch.tensor(adj).float())
        n_nodes = adj.shape[0]
        self.gcgru = MPGRUImputer(input_size=d_in,
                                  hidden_size=d_hidden,
                                  ff_size=d_ff,
                                  u_size=d_u,
                                  n_layers=n_layers,
                                  dropout=dropout,
                                  kernel_size=kernel_size,
                                  support_len=support_len,
                                  layer_norm=layer_norm,
                                  n_nodes=n_nodes)
        self.impute_only_holes = impute_only_holes

    def forward(self, x, mask=None, u=None, h=None):
        # x: [batches, steps, nodes, channels] -> [batches, channels, nodes, steps]
        x = rearrange(x, 'b s n c -> b c n s')
        if mask is not None:
            mask = rearrange(mask, 'b s n c -> b c n s')
        if u is not None:
            u = rearrange(u, 'b s n c -> b c n s')

        adj =
SpatialConvOrderK.compute_support(self.adj, x.device) imputation, _ = self.gcgru(x, adj, mask=mask, u=u, h=h) # In evaluation stage impute only missing values if self.impute_only_holes and not self.training: imputation = torch.where(mask, x, imputation) # out: [batches, channels, nodes, steps] -> [batches, steps, nodes, channels] imputation = rearrange(imputation, 'b c n s -> b s n c') return imputation @staticmethod def add_model_specific_args(parser): parser.add_argument('--d-hidden', type=int, default=64) parser.add_argument('--d-ff', type=int, default=64) parser.add_argument('--n-layers', type=int, default=1) parser.add_argument('--kernel-size', type=int, default=2) parser.add_argument('--layer-norm', type=str_to_bool, nargs='?', const=True, default=False) parser.add_argument('--impute-only-holes', type=str_to_bool, nargs='?', const=True, default=True) parser.add_argument('--dropout', type=float, default=0.) return parser class BiMPGRUNet(nn.Module): def __init__(self, adj, d_in, d_hidden, d_ff=0, d_u=0, n_layers=1, dropout=0., kernel_size=2, support_len=2, layer_norm=False, embedding_size=0, merge='mlp', impute_only_holes=True, autoencoder_mode=False): super(BiMPGRUNet, self).__init__() self.register_buffer('adj', torch.tensor(adj).float()) n_nodes = adj.shape[0] self.gcgru_fwd = MPGRUImputer(input_size=d_in, hidden_size=d_hidden, u_size=d_u, n_layers=n_layers, dropout=dropout, kernel_size=kernel_size, support_len=support_len, layer_norm=layer_norm, n_nodes=n_nodes, autoencoder_mode=autoencoder_mode) self.gcgru_bwd = MPGRUImputer(input_size=d_in, hidden_size=d_hidden, u_size=d_u, n_layers=n_layers, dropout=dropout, kernel_size=kernel_size, support_len=support_len, layer_norm=layer_norm, n_nodes=n_nodes, autoencoder_mode=autoencoder_mode) self.impute_only_holes = impute_only_holes if n_nodes is None: embedding_size = 0 if embedding_size > 0: self.emb = nn.Parameter(torch.empty(embedding_size, n_nodes)) nn.init.kaiming_normal_(self.emb, nonlinearity='relu') 
else: self.register_parameter('emb', None) if merge == 'mlp': self._impute_from_states = True self.out = nn.Sequential( nn.Conv2d(in_channels=2 * d_hidden + d_in + embedding_size, out_channels=d_ff, kernel_size=1), nn.ReLU(), nn.Conv2d(in_channels=d_ff, out_channels=d_in, kernel_size=1) ) elif merge in ['mean', 'sum', 'min', 'max']: self._impute_from_states = False self.out = getattr(torch, merge) else: raise ValueError("Merge option %s not allowed." % merge) def forward(self, x, mask=None, u=None, h=None): # x: [batches, steps, nodes, channels] -> [batches, channels, nodes, steps] x = rearrange(x, 'b s n c -> b c n s') if mask is not None: mask = rearrange(mask, 'b s n c -> b c n s') if u is not None: u = rearrange(u, 'b s n c -> b c n s') adj = SpatialConvOrderK.compute_support(self.adj, x.device) # Forward fwd_pred, fwd_states = self.gcgru_fwd(x, adj, mask=mask, u=u) # Backward rev_x, rev_mask, rev_u = [reverse_tensor(tens, axis=-1) for tens in (x, mask, u)] bwd_res = self.gcgru_bwd(rev_x, adj, mask=rev_mask, u=rev_u) bwd_pred, bwd_states = [reverse_tensor(res, axis=-1) for res in bwd_res] if self._impute_from_states: inputs = [fwd_states[-1], bwd_states[-1], mask] # take only state of last gcgru layer if self.emb is not None: b, *_, s = x.shape # fwd_h: [batches, channels, nodes, steps] inputs += [self.emb.view(1, *self.emb.shape, 1).expand(b, -1, -1, s)] # stack emb for batches and steps imputation = torch.cat(inputs, dim=1) imputation = self.out(imputation) else: imputation = torch.stack([fwd_pred, bwd_pred], dim=1) imputation = self.out(imputation, dim=1) # In evaluation stage impute only missing values if self.impute_only_holes and not self.training: imputation = torch.where(mask, x, imputation) # out: [batches, channels, nodes, steps] -> [batches, steps, nodes, channels] imputation = rearrange(imputation, 'b c n s -> b s n c') return imputation @staticmethod def add_model_specific_args(parser): parser.add_argument('--d-hidden', type=int, default=64) 
        parser.add_argument('--d-ff', type=int, default=64)
        parser.add_argument('--n-layers', type=int, default=1)
        parser.add_argument('--kernel-size', type=int, default=2)
        parser.add_argument('--d-emb', type=int, default=8)
        parser.add_argument('--layer-norm', type=str_to_bool, nargs='?', const=True, default=False)
        parser.add_argument('--merge', type=str, default='mlp')
        parser.add_argument('--impute-only-holes', type=str_to_bool, nargs='?', const=True, default=True)
        parser.add_argument('--dropout', type=float, default=0.)
        parser.add_argument('--autoencoder-mode', type=str_to_bool, nargs='?', const=True, default=False)
        return parser

================================================
FILE: lib/nn/models/rgain.py
================================================
import torch
from torch import nn

from .rnn_imputers import BiRNNImputer
from ...utils.parser_utils import str_to_bool


class Generator(nn.Module):
    def __init__(self, d_in, d_model, d_z, dropout=0., inject_noise=True):
        super(Generator, self).__init__()
        self.inject_noise = inject_noise
        self.d_z = d_z if inject_noise else 0
        # use self.d_z (0 when noise is disabled) so the RNN input size matches what forward() feeds it
        self.birnn = BiRNNImputer(d_in,
                                  d_model,
                                  d_u=self.d_z,
                                  concat_mask=True,
                                  detach_inputs=False,
                                  dropout=dropout,
                                  state_init='zero')

    def forward(self, x, mask):
        if self.inject_noise:
            z = torch.rand(x.size(0), x.size(1), self.d_z, device=x.device) * 0.1
        else:
            z = None
        return self.birnn(x, mask, u=z)


class Discriminator(torch.nn.Module):
    def __init__(self, d_in, d_model, dropout=0.):
        super(Discriminator, self).__init__()
        self.birnn = nn.GRU(2 * d_in, d_model, bidirectional=True, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.read_out = nn.Linear(2 * d_model, d_in)

    def forward(self, x, h):
        x_in = torch.cat([x, h], dim=-1)
        out, _ = self.birnn(x_in)
        logits = self.read_out(self.dropout(out))
        return logits


class RGAINNet(torch.nn.Module):
    def __init__(self, d_in, d_model, d_z, dropout=0., inject_noise=False, k=5):
        super(RGAINNet, self).__init__()
        self.inject_noise = inject_noise
        self.k = k
        self.generator = Generator(d_in, d_model, d_z=d_z, dropout=dropout, inject_noise=inject_noise)
        self.discriminator = Discriminator(d_in, d_model, dropout)

    def forward(self, x, mask, **kwargs):
        if not self.training and self.inject_noise:
            # average k stochastic imputations; the result is returned as a 1-tuple
            res = []
            for _ in range(self.k):
                res.append(self.generator(x, mask)[0])
            return torch.stack(res, 0).mean(0),

        return self.generator(x, mask)

    @staticmethod
    def add_model_specific_args(parser):
        parser.add_argument('--d-in', type=int)
        parser.add_argument('--d-model', type=int, default=None)
        parser.add_argument('--d-z', type=int, default=8)
        parser.add_argument('--k', type=int, default=5)
        parser.add_argument('--inject-noise', type=str_to_bool, nargs='?', const=True, default=False)
        parser.add_argument('--dropout', type=float, default=0.)
        return parser

================================================
FILE: lib/nn/models/rnn_imputers.py
================================================
import torch
from torch import nn

from ..utils.ops import reverse_tensor


class RNNImputer(nn.Module):
    """Fill the blanks with a 1-step-ahead GRU predictor."""

    def __init__(self, d_in, d_model, concat_mask=True, detach_inputs=False, state_init='zero', d_u=0):
        super(RNNImputer, self).__init__()
        self.concat_mask = concat_mask
        self.detach_inputs = detach_inputs
        self.state_init = state_init
        self.d_model = d_model
        self.input_dim = d_in + d_u if not concat_mask else 2 * d_in + d_u
        self.rnn_cell = nn.GRUCell(self.input_dim, d_model)
        self.read_out = nn.Linear(d_model, d_in)

    def init_hidden_state(self, x):
        if self.state_init == 'zero':
            return torch.zeros((x.size(0), self.d_model), device=x.device, dtype=x.dtype)
        if self.state_init == 'noise':
            return torch.randn(x.size(0), self.d_model, device=x.device, dtype=x.dtype)

    def _preprocess_input(self, x, x_hat, m, u):
        if self.detach_inputs:
            x_p = torch.where(m, x, x_hat.detach())
        else:
            x_p = torch.where(m, x, x_hat)
        if u is not None:
            x_p = torch.cat([x_p, u], -1)
        if self.concat_mask:
            x_p = torch.cat([x_p, m], -1)
        return x_p

    def forward(self, x, mask, u=None, return_hidden=False):
        # x: [batches, steps, features]
        steps = x.size(1)

        # ensure masked values are not visible
        x = torch.where(mask, x, torch.zeros_like(x))

        h = self.init_hidden_state(x)
        x_hat = self.read_out(h)
        hs = [h]
        preds = [x_hat]
        for s in range(steps - 1):
            u_t = None if u is None else u[:, s]
            x_t = self._preprocess_input(x[:, s], x_hat, mask[:, s], u_t)
            h = self.rnn_cell(x_t, h)
            x_hat = self.read_out(h)
            hs.append(h)
            preds.append(x_hat)

        x_hat = torch.stack(preds, 1)
        h = torch.stack(hs, 1)

        if return_hidden:
            return x_hat, h
        return x_hat

    @staticmethod
    def add_model_specific_args(parser):
        parser.add_argument('--d-in', type=int)
        parser.add_argument('--d-model', type=int, default=None)
        return parser


class BiRNNImputer(nn.Module):
    """Fill the blanks with a bidirectional 1-step-ahead GRU predictor."""

    def __init__(self, d_in, d_model, dropout=0., concat_mask=True, detach_inputs=False, state_init='zero', d_u=0):
        super(BiRNNImputer, self).__init__()
        self.d_model = d_model
        self.fwd_rnn = RNNImputer(d_in, d_model, concat_mask, detach_inputs=detach_inputs, state_init=state_init, d_u=d_u)
        self.bwd_rnn = RNNImputer(d_in, d_model, concat_mask, detach_inputs=detach_inputs, state_init=state_init, d_u=d_u)
        self.dropout = nn.Dropout(dropout)
        self.read_out = nn.Linear(2 * d_model, d_in)

    def forward(self, x, mask, u=None, return_hidden=False):
        # x: [batches, steps, features]
        x_hat_fwd, h_fwd = self.fwd_rnn(x, mask, u=u, return_hidden=True)
        x_hat_bwd, h_bwd = self.bwd_rnn(reverse_tensor(x, 1),
                                        reverse_tensor(mask, 1),
                                        u=reverse_tensor(u, 1) if u is not None else None,
                                        return_hidden=True)
        x_hat_bwd = reverse_tensor(x_hat_bwd, 1)
        h_bwd = reverse_tensor(h_bwd, 1)
        h = self.dropout(torch.cat([h_fwd, h_bwd], -1))
        x_hat = self.read_out(h)
        if return_hidden:
            return (x_hat, x_hat_fwd, x_hat_bwd), h
        return x_hat, x_hat_fwd, x_hat_bwd

    @staticmethod
    def add_model_specific_args(parser):
        parser.add_argument('--d-in', type=int)
        parser.add_argument('--d-model', type=int, default=None)
        parser.add_argument('--dropout', type=float, default=0.)
        return parser

================================================
FILE: lib/nn/models/var.py
================================================
import torch
from einops import rearrange
from torch import nn

from lib import epsilon


class VAR(nn.Module):
    def __init__(self, order, d_in, d_out=None, steps_ahead=1, bias=True):
        super(VAR, self).__init__()
        self.order = order
        self.d_in = d_in
        self.d_out = d_out if d_out is not None else d_in
        self.steps_ahead = steps_ahead
        self.lin = nn.Linear(order * d_in, steps_ahead * self.d_out, bias=bias)

    def forward(self, x):
        # x: [batches, steps, features]
        x = rearrange(x, 'b s f -> b (s f)')
        out = self.lin(x)
        out = rearrange(out, 'b (s f) -> b s f', s=self.steps_ahead, f=self.d_out)
        return out

    @staticmethod
    def add_model_specific_args(parser):
        parser.add_argument('--order', type=int)
        parser.add_argument('--d-in', type=int)
        parser.add_argument('--d-out', type=int, default=None)
        parser.add_argument('--steps-ahead', type=int, default=1)
        return parser


class VARImputer(nn.Module):
    """Fill the blanks with a 1-step-ahead VAR predictor."""

    def __init__(self, order, d_in, padding='mean'):
        super(VARImputer, self).__init__()
        assert padding in ['mean', 'zero']
        self.order = order
        self.padding = padding
        self.predictor = VAR(order, d_in, d_out=d_in, steps_ahead=1)

    def forward(self, x, mask=None):
        # x: [batches, steps, features]
        batch_size, steps, n_feats = x.shape
        if mask is None:
            mask = torch.ones_like(x, dtype=torch.uint8)
        x = x * mask
        # pad input sequence to start filling from first step
        if self.padding == 'mean':
            mean = torch.sum(x, 1) / (torch.sum(mask, 1) + epsilon)
            pad = torch.repeat_interleave(mean.unsqueeze(1), self.order, 1)
        elif self.padding == 'zero':
            pad = torch.zeros((batch_size, self.order, n_feats)).to(x.device)
        x = torch.cat([pad, x], 1)
        # x: [batch, order + steps, features]
        x = [x[:, i] for i in range(x.shape[1])]
        for s in range(steps):
            x_hat = self.predictor(torch.stack(x[s:s + self.order], 1))
            x_hat = x_hat[:, 0]
            x[s + self.order] = torch.where(mask[:, s], x[s + self.order], x_hat)
        x = torch.stack(x[self.order:], 1)  # remove padding
        return x

    @staticmethod
    def add_model_specific_args(parser):
        parser.add_argument('--order', type=int)
        parser.add_argument('--d-in', type=int)
        parser.add_argument("--padding", type=str, default='mean')
        return parser

================================================
FILE: lib/nn/utils/__init__.py
================================================

================================================
FILE: lib/nn/utils/metric_base.py
================================================
from functools import partial

import torch
from pytorch_lightning.metrics import Metric
from torchmetrics.utilities.checks import _check_same_shape


class MaskedMetric(Metric):
    def __init__(self,
                 metric_fn,
                 mask_nans=False,
                 mask_inf=False,
                 compute_on_step=True,
                 dist_sync_on_step=False,
                 process_group=None,
                 dist_sync_fn=None,
                 metric_kwargs=None,
                 at=None):
        super(MaskedMetric, self).__init__(compute_on_step=compute_on_step,
                                           dist_sync_on_step=dist_sync_on_step,
                                           process_group=process_group,
                                           dist_sync_fn=dist_sync_fn)

        if metric_kwargs is None:
            metric_kwargs = dict()
        self.metric_fn = partial(metric_fn, **metric_kwargs)
        self.mask_nans = mask_nans
        self.mask_inf = mask_inf
        if at is None:
            self.at = slice(None)
        else:
            self.at = slice(at, at + 1)
        self.add_state('value', dist_reduce_fx='sum', default=torch.tensor(0.).float())
        self.add_state('numel', dist_reduce_fx='sum', default=torch.tensor(0))

    def _check_mask(self, mask, val):
        if mask is None:
            mask = torch.ones_like(val).byte()
        else:
            _check_same_shape(mask, val)
        if self.mask_nans:
            mask = mask * ~torch.isnan(val)
        if self.mask_inf:
            mask = mask * ~torch.isinf(val)
        return mask

    def _compute_masked(self, y_hat, y, mask):
        _check_same_shape(y_hat, y)
        val = self.metric_fn(y_hat, y)
        mask = self._check_mask(mask, val)
        val = torch.where(mask, val, torch.tensor(0., device=val.device).float())
        return val.sum(), mask.sum()

    def _compute_std(self, y_hat, y):
        _check_same_shape(y_hat, y)
        val = self.metric_fn(y_hat, y)
        return val.sum(), val.numel()

    def is_masked(self, mask):
        return self.mask_inf or self.mask_nans or (mask is not None)

    def update(self, y_hat, y, mask=None):
        y_hat = y_hat[:, self.at]
        y = y[:, self.at]
        if mask is not None:
            mask = mask[:, self.at]
        if self.is_masked(mask):
            val, numel = self._compute_masked(y_hat, y, mask)
        else:
            val, numel = self._compute_std(y_hat, y)
        self.value += val
        self.numel += numel

    def compute(self):
        if self.numel > 0:
            return self.value / self.numel
        return self.value

================================================
FILE: lib/nn/utils/metrics.py
================================================
from .metric_base import MaskedMetric
from .ops import mape
from torch.nn import functional as F
import torch

from torchmetrics.utilities.checks import _check_same_shape
from ... import epsilon


class MaskedMAE(MaskedMetric):
    def __init__(self,
                 mask_nans=False,
                 mask_inf=False,
                 compute_on_step=True,
                 dist_sync_on_step=False,
                 process_group=None,
                 dist_sync_fn=None,
                 at=None):
        super(MaskedMAE, self).__init__(metric_fn=F.l1_loss,
                                        mask_nans=mask_nans,
                                        mask_inf=mask_inf,
                                        compute_on_step=compute_on_step,
                                        dist_sync_on_step=dist_sync_on_step,
                                        process_group=process_group,
                                        dist_sync_fn=dist_sync_fn,
                                        metric_kwargs={'reduction': 'none'},
                                        at=at)


class MaskedMAPE(MaskedMetric):
    def __init__(self,
                 mask_nans=False,
                 compute_on_step=True,
                 dist_sync_on_step=False,
                 process_group=None,
                 dist_sync_fn=None,
                 at=None):
        super(MaskedMAPE, self).__init__(metric_fn=mape,
                                         mask_nans=mask_nans,
                                         mask_inf=True,
                                         compute_on_step=compute_on_step,
                                         dist_sync_on_step=dist_sync_on_step,
                                         process_group=process_group,
                                         dist_sync_fn=dist_sync_fn,
                                         at=at)


class MaskedMSE(MaskedMetric):
    def __init__(self,
                 mask_nans=False,
                 compute_on_step=True,
                 dist_sync_on_step=False,
                 process_group=None,
                 dist_sync_fn=None,
                 at=None):
        super(MaskedMSE, self).__init__(metric_fn=F.mse_loss,
                                        mask_nans=mask_nans,
                                        mask_inf=True,
                                        compute_on_step=compute_on_step,
                                        dist_sync_on_step=dist_sync_on_step,
                                        process_group=process_group,
                                        dist_sync_fn=dist_sync_fn,
                                        metric_kwargs={'reduction': 'none'},
                                        at=at)


class MaskedMRE(MaskedMetric):
    def __init__(self,
                 mask_nans=False,
                 mask_inf=False,
                 compute_on_step=True,
                 dist_sync_on_step=False,
                 process_group=None,
                 dist_sync_fn=None,
                 at=None):
        super(MaskedMRE, self).__init__(metric_fn=F.l1_loss,
                                        mask_nans=mask_nans,
                                        mask_inf=mask_inf,
                                        compute_on_step=compute_on_step,
                                        dist_sync_on_step=dist_sync_on_step,
                                        process_group=process_group,
                                        dist_sync_fn=dist_sync_fn,
                                        metric_kwargs={'reduction': 'none'},
                                        at=at)
        self.add_state('tot', dist_reduce_fx='sum', default=torch.tensor(0., dtype=torch.float))

    def _compute_masked(self, y_hat, y, mask):
        _check_same_shape(y_hat, y)
        val = self.metric_fn(y_hat, y)
        mask = self._check_mask(mask, val)
        val = torch.where(mask, val, torch.tensor(0., device=y.device, dtype=torch.float))
        y_masked = torch.where(mask, y, torch.tensor(0., device=y.device, dtype=torch.float))
        return val.sum(), mask.sum(), y_masked.sum()

    def _compute_std(self, y_hat, y):
        _check_same_shape(y_hat, y)
        val = self.metric_fn(y_hat, y)
        return val.sum(), val.numel(), y.sum()

    def compute(self):
        if self.tot > epsilon:
            return self.value / self.tot
        return self.value

    def update(self, y_hat, y, mask=None):
        y_hat = y_hat[:, self.at]
        y = y[:, self.at]
        if mask is not None:
            mask = mask[:, self.at]
        if self.is_masked(mask):
            val, numel, tot = self._compute_masked(y_hat, y, mask)
        else:
            val, numel, tot = self._compute_std(y_hat, y)
        self.value += val
        self.numel += numel
        self.tot += tot

================================================
FILE: lib/nn/utils/ops.py
================================================
import torch
import torch.nn.functional as F
from einops import reduce

from ... import epsilon


def mae(y_hat, y, reduction='none'):
    return F.l1_loss(y_hat, y, reduction=reduction)


def mape(y_hat, y):
    return torch.abs((y_hat - y) / y)


def wape_loss(y_hat, y):
    l = torch.abs(y_hat - y)
    return l.sum() / (y.sum() + epsilon)


def smape_loss(y_hat, y):
    c = torch.abs(y) > epsilon
    l_minus = torch.abs(y_hat - y)
    l_plus = torch.abs(y_hat + y) + epsilon
    l = 2 * l_minus / l_plus * c.float()
    return l.sum() / c.sum()


def peak_prediction_loss(y_hat, y, reduction='none'):
    y_max = reduce(y, 'b s n 1 -> b 1 n 1', 'max')
    y_min = reduce(y, 'b s n 1 -> b 1 n 1', 'min')
    target = torch.cat([y_max, y_min], dim=1)
    return F.mse_loss(y_hat, target, reduction=reduction)


def wrap_loss_fn(base_loss):
    def loss_fn(y_hat, y_true, mask=None):
        scaling = 1.
        if mask is not None:
            try:
                loss = base_loss(y_hat, y_true, reduction='none')
            except TypeError:
                loss = base_loss(y_hat, y_true)
            loss = loss * mask
            loss = loss.sum() / (mask.sum() + epsilon)
            # scaling = mask.sum() / torch.numel(mask)
        else:
            loss = base_loss(y_hat, y_true).mean()
        return scaling * loss

    return loss_fn


def rbf_sim(x, gamma, device='cpu'):
    n = x.size()[0]
    a = torch.exp(-gamma * F.pdist(x, 2) ** 2)
    row_idx, col_idx = torch.triu_indices(n, n, 1)
    A = 0.5 * torch.eye(n, n).to(device)
    A[row_idx, col_idx] = a
    return A + A.T


def reverse_tensor(tensor=None, axis=-1):
    if tensor is None:
        return None
    if tensor.dim() <= 1:
        return tensor
    # descending indices created directly on the tensor's device (no deprecated Variable wrapper)
    indices = torch.arange(tensor.size(axis) - 1, -1, -1, device=tensor.device)
    return tensor.index_select(axis, indices)

================================================
FILE: lib/utils/__init__.py
================================================
from .utils import *

================================================
FILE: lib/utils/numpy_metrics.py
================================================
import numpy as np


def mae(y_hat, y):
    return np.abs(y_hat - y).mean()


def nmae(y_hat, y):
    delta = np.max(y) - np.min(y) + 1e-8
    return mae(y_hat, y) * 100 / delta


def mape(y_hat, y):
    return 100 * np.abs((y_hat - y) / (y + 1e-8)).mean()


def mse(y_hat, y):
    return np.square(y_hat - y).mean()


def rmse(y_hat, y):
    return np.sqrt(mse(y_hat, y))


def nrmse(y_hat, y):
    delta = np.max(y) - np.min(y) + 1e-8
    return rmse(y_hat, y) * 100 / delta


def nrmse_2(y_hat, y):
    nrmse_ = np.sqrt(np.square(y_hat - y).sum() / np.square(y).sum())
    return nrmse_ * 100


def r2(y_hat, y):
    return 1. - np.square(y_hat - y).sum() / (np.square(y.mean(0) - y).sum())


def masked_mae(y_hat, y, mask):
    err = np.abs(y_hat - y) * mask
    return err.sum() / mask.sum()


def masked_mape(y_hat, y, mask):
    err = np.abs((y_hat - y) / (y + 1e-8)) * mask
    return err.sum() / mask.sum()


def masked_mse(y_hat, y, mask):
    err = np.square(y_hat - y) * mask
    return err.sum() / mask.sum()


def masked_rmse(y_hat, y, mask):
    err = np.square(y_hat - y) * mask
    return np.sqrt(err.sum() / mask.sum())


def masked_mre(y_hat, y, mask):
    err = np.abs(y_hat - y) * mask
    return err.sum() / ((y * mask).sum() + 1e-8)

================================================
FILE: lib/utils/parser_utils.py
================================================
import inspect
from argparse import Namespace, ArgumentParser
from typing import Union


def str_to_bool(value):
    if isinstance(value, bool):
        return value
    if value.lower() in {'false', 'f', '0', 'no', 'n', 'off'}:
        return False
    elif value.lower() in {'true', 't', '1', 'yes', 'y', 'on'}:
        return True
    raise ValueError(f'{value} is not a valid boolean value')


def config_dict_from_args(args):
    """
    Extract a dictionary with the experiment configuration from arguments (necessary to filter TestTube arguments)

    :param args: TTNamespace
    :return: hparams dict
    """
    keys_to_remove = {'hpc_exp_number', 'trials', 'optimize_parallel', 'optimize_parallel_gpu',
                      'optimize_parallel_cpu', 'generate_trials', 'optimize_trials_parallel_gpu'}
    hparams = {key: v for key, v in args.__dict__.items() if key not in keys_to_remove}
    return hparams


def update_from_config(args: Namespace, config: dict):
    assert set(config.keys()) <= set(vars(args)), f'{set(config.keys()).difference(vars(args))} not in args.'
    args.__dict__.update(config)
    return args


def parse_by_group(parser):
    """
    Create a nested namespace using the groups defined in the argument parser.
    Adapted from https://stackoverflow.com/a/56631542/6524027

    :param parser: the parser
    :return: a Namespace holding the flat arguments under `flat`, the positional/optional arguments as
        attributes, and one nested Namespace per additional argument group
    """
    assert isinstance(parser, ArgumentParser)
    args = parser.parse_args()

    # the first two argument groups are 'positional_arguments' and 'optional_arguments'
    pos_group, optional_group = parser._action_groups[0], parser._action_groups[1]
    args_dict = args._get_kwargs()
    pos_optional_arg_names = [arg.dest for arg in pos_group._group_actions] + [arg.dest for arg in optional_group._group_actions]
    pos_optional_args = {name: value for name, value in args_dict if name in pos_optional_arg_names}
    other_group_args = dict()

    # If there are additional argument groups, add them as nested namespaces
    if len(parser._action_groups) > 2:
        for group in parser._action_groups[2:]:
            group_arg_names = [arg.dest for arg in group._group_actions]
            other_group_args[group.title] = Namespace(
                **{name: value for name, value in args_dict if name in group_arg_names})

    # combine the positional/optional args and the group args
    combined_args = pos_optional_args
    combined_args.update(other_group_args)
    return Namespace(flat=args, **combined_args)


def filter_args(args: Union[Namespace, dict], target_cls, return_dict=False):
    argspec = inspect.getfullargspec(target_cls.__init__)
    target_args = argspec.args
    if isinstance(args, Namespace):
        args = vars(args)
    filtered_args = {k: args[k] for k in target_args if k in args}
    if return_dict:
        return filtered_args
    return Namespace(**filtered_args)


def filter_function_args(args: Union[Namespace, dict], function, return_dict=False):
    argspec = inspect.getfullargspec(function)
    target_args = argspec.args
    if isinstance(args, Namespace):
        args = vars(args)
    filtered_args = {k: args[k] for k in target_args if k in args}
    if return_dict:
        return filtered_args
    return Namespace(**filtered_args)

================================================
FILE: lib/utils/utils.py
================================================
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import haversine_distances


def sample_mask(shape, p=0.002, p_noise=0., max_seq=1, min_seq=1, rng=None):
    if rng is None:
        rand = np.random.random
        randint = np.random.randint
    else:
        rand = rng.random
        randint = rng.integers
    mask = rand(shape) < p
    for col in range(mask.shape[1]):
        idxs = np.flatnonzero(mask[:, col])
        if not len(idxs):
            continue
        fault_len = min_seq
        if max_seq > min_seq:
            fault_len = fault_len + int(randint(max_seq - min_seq))
        idxs_ext = np.concatenate([np.arange(i, i + fault_len) for i in idxs])
        idxs = np.unique(idxs_ext)
        idxs = np.clip(idxs, 0, shape[0] - 1)
        mask[idxs, col] = True
    mask = mask | (rand(mask.shape) < p_noise)
    return mask.astype('uint8')


def compute_mean(x, index=None):
    """Compute the mean values for each datetime. The mean is first computed hourly over the week of the year.
    Remaining NaN values are filled using the hourly mean over the same month through the years. If NaNs are
    still present, they are filled using the mean of the hour alone.
    This relies on the reasonable assumption that the dataset contains at least one non-NaN entry for the same
    hour as each NaN datetime."""
    if isinstance(x, np.ndarray) and index is not None:
        shape = x.shape
        x = x.reshape((shape[0], -1))
        df_mean = pd.DataFrame(x, index=index)
    else:
        df_mean = x.copy()
    cond0 = [df_mean.index.year, df_mean.index.isocalendar().week, df_mean.index.hour]
    cond1 = [df_mean.index.year, df_mean.index.month, df_mean.index.hour]
    conditions = [cond0, cond1, cond1[1:], cond1[2:]]
    while df_mean.isna().values.sum() and len(conditions):
        nan_mean = df_mean.groupby(conditions[0]).transform(np.nanmean)
        df_mean = df_mean.fillna(nan_mean)
        conditions = conditions[1:]
    if df_mean.isna().values.sum():
        # last resort: propagate neighbouring values forward, then backward
        df_mean = df_mean.ffill()
        df_mean = df_mean.bfill()
    if isinstance(x, np.ndarray):
        df_mean = df_mean.values.reshape(shape)
    return df_mean


def geographical_distance(x=None, to_rad=True):
    """
    Compute the as-the-crow-flies distance between every pair of samples in `x`. The first dimension of each point
    is assumed to be the latitude, the second is the longitude. The input is assumed to be in degrees; if it is
    already in radians, `to_rad` must be set to False. The dimension of the data must be 2.

    Parameters
    ----------
    x : pd.DataFrame or np.ndarray
        array_like structure of shape (n_samples, 2).
    to_rad : bool
        whether to convert inputs to radians (provided that they are in degrees).

    Returns
    -------
    distances :
        The distance between the points in kilometers.
    """
    _AVG_EARTH_RADIUS_KM = 6371.0088

    # Extract values of X if it is a DataFrame, else assume it is 2-dim array of lat-lon pairs
    latlon_pairs = x.values if isinstance(x, pd.DataFrame) else x

    # If the input values are in degrees, convert them to radians
    if to_rad:
        latlon_pairs = np.radians(latlon_pairs)

    distances = haversine_distances(latlon_pairs) * _AVG_EARTH_RADIUS_KM

    # Cast response
    if isinstance(x, pd.DataFrame):
        res = pd.DataFrame(distances, x.index, x.index)
    else:
        res = distances

    return res


def infer_mask(df, infer_from='next'):
    """Infer evaluation mask from DataFrame. In the evaluation mask a value is 1 if it is present in the DataFrame
    and absent in the `infer_from` month.

    @param pd.DataFrame df: the DataFrame.
    @param str infer_from: denotes from which month the evaluation value must be inferred.
    Can be either `previous` or `next`.
    @return: pd.DataFrame eval_mask: the evaluation mask for the DataFrame
    """
    mask = (~df.isna()).astype('uint8')
    eval_mask = pd.DataFrame(index=mask.index, columns=mask.columns, data=0).astype('uint8')
    if infer_from == 'previous':
        offset = -1
    elif infer_from == 'next':
        offset = 1
    else:
        raise ValueError('infer_from can only be one of %s' % ['previous', 'next'])
    months = sorted(set(zip(mask.index.year, mask.index.month)))
    length = len(months)
    for i in range(length):
        j = (i + offset) % length
        year_i, month_i = months[i]
        year_j, month_j = months[j]
        mask_j = mask[(mask.index.year == year_j) & (mask.index.month == month_j)]
        mask_i = mask_j.shift(1, pd.DateOffset(months=12 * (year_i - year_j) + (month_i - month_j)))
        mask_i = mask_i[~mask_i.index.duplicated(keep='first')]
        mask_i = mask_i[np.in1d(mask_i.index, mask.index)]
        eval_mask.loc[mask_i.index] = ~mask_i.loc[mask_i.index] & mask.loc[mask_i.index]
    return eval_mask


def prediction_dataframe(y, index, columns=None, aggregate_by='mean'):
    """Aggregate batched predictions in a single DataFrame.

    @param (list or np.ndarray) y: the list of predictions.
    @param (list or np.ndarray) index: the list of time indexes coupled with the predictions.
    @param (list or pd.Index) columns: the columns of the returned DataFrame.
    @param (str or list) aggregate_by: how to aggregate the predictions in case there are more than one for a step.
    - `mean`: take the mean of the predictions
    - `central`: take the prediction at the central position, assuming that the predictions are ordered chronologically
    - `smooth_central`: average the predictions weighted by a gaussian signal with std=1
    - `last`: take the last prediction
    @return: pd.DataFrame df: the aggregated predictions (a list of DataFrames, one per method, if `aggregate_by`
    is a list)
    """
    dfs = [pd.DataFrame(data=data.reshape(data.shape[:2]), index=idx, columns=columns) for data, idx in zip(y, index)]
    df = pd.concat(dfs)
    preds_by_step = df.groupby(df.index)
    # aggregate according to the passed methods
    aggr_methods = ensure_list(aggregate_by)
    dfs = []
    for aggr_by in aggr_methods:
        if aggr_by == 'mean':
            dfs.append(preds_by_step.mean())
        elif aggr_by == 'central':
            dfs.append(preds_by_step.aggregate(lambda x: x[int(len(x) // 2)]))
        elif aggr_by == 'smooth_central':
            # `gaussian` lives in scipy.signal.windows; the scipy.signal alias is deprecated
            from scipy.signal.windows import gaussian
            dfs.append(preds_by_step.aggregate(lambda x: np.average(x, weights=gaussian(len(x), 1))))
        elif aggr_by == 'last':
            dfs.append(preds_by_step.aggregate(lambda x: x[0]))  # first imputation has missing value in last position
        else:
            raise ValueError('aggregate_by can only be one of %s' % ['mean', 'central', 'smooth_central', 'last'])
    if isinstance(aggregate_by, str):
        return dfs[0]
    return dfs


def ensure_list(obj):
    if isinstance(obj, (list, tuple)):
        return list(obj)
    else:
        return [obj]


def missing_val_lens(mask):
    m = np.concatenate([np.zeros((1, mask.shape[1])),
                        (~mask.astype('bool')).astype('int'),
                        np.zeros((1, mask.shape[1]))])
    mdiff = np.diff(m, axis=0)
    lens = []
    for c in range(m.shape[1]):
        mj, = mdiff[:, c].nonzero()
        diff = np.diff(mj)[::2]
        lens.extend(list(diff))
    return lens


def disjoint_months(dataset, months=None, synch_mode='window'):
    idxs = np.arange(len(dataset))
    months = ensure_list(months)
    # divide indices according to window or horizon
    if synch_mode == 'window':
        start, end = 0, dataset.window - 1
    elif synch_mode == 'horizon':
        start, end = dataset.horizon_offset, dataset.horizon_offset + dataset.horizon - 1
    else:
        raise ValueError('synch_mode can only be one of %s' % ['window', 'horizon'])
    # after idxs
    start_in_months = np.in1d(dataset.index[dataset._indices + start].month, months)
    end_in_months = np.in1d(dataset.index[dataset._indices + end].month, months)
    idxs_in_months = start_in_months & end_in_months
    after_idxs = idxs[idxs_in_months]
    # previous idxs
    months = np.setdiff1d(np.arange(1, 13), months)
    start_in_months = np.in1d(dataset.index[dataset._indices + start].month, months)
    end_in_months = np.in1d(dataset.index[dataset._indices + end].month, months)
    idxs_in_months = start_in_months & end_in_months
    prev_idxs = idxs[idxs_in_months]
    return prev_idxs, after_idxs


def thresholded_gaussian_kernel(x, theta=None, threshold=None, threshold_on_input=False):
    if theta is None:
        theta = np.std(x)
    weights = np.exp(-np.square(x / theta))
    if threshold is not None:
        mask = x > threshold if threshold_on_input else weights < threshold
        weights[mask] = 0.
return weights ================================================ FILE: requirements.txt ================================================ einops fancyimpute==0.6 h5py openpyxl numpy pandas pytorch-lightning==1.4 pyyaml scikit-learn scipy tables tensorboard tensorflow==2.5.0 tensorflow-gpu==2.4.0 torch==1.8 torchvision torchaudio torchmetrics==0.5 ================================================ FILE: scripts/run_baselines.py ================================================ from argparse import ArgumentParser import numpy as np from fancyimpute import MatrixFactorization, IterativeImputer from sklearn.neighbors import kneighbors_graph from lib import datasets from lib.utils import numpy_metrics from lib.utils.parser_utils import str_to_bool metrics = { 'mae': numpy_metrics.masked_mae, 'mse': numpy_metrics.masked_mse, 'mre': numpy_metrics.masked_mre, 'mape': numpy_metrics.masked_mape } def parse_args(): parser = ArgumentParser() # experiment setting parser.add_argument('--datasets', nargs='+', type=str, default=['all']) parser.add_argument('--imputers', nargs='+', type=str, default=['all']) parser.add_argument('--n-runs', type=int, default=5) parser.add_argument('--in-sample', type=str_to_bool, nargs='?', const=True, default=True) # SpatialKNNImputer params parser.add_argument('--k', type=int, default=10) # MFImputer params parser.add_argument('--rank', type=int, default=10) # MICEImputer params parser.add_argument('--mice-iterations', type=int, default=100) parser.add_argument('--mice-n-features', type=int, default=None) args = parser.parse_args() # parse dataset if args.datasets[0] == 'all': args.datasets = ['air36', 'air', 'bay', 'irish', 'la', 'bay_noise', 'irish_noise', 'la_noise'] # parse imputers if args.imputers[0] == 'all': args.imputers = ['mean', 'knn', 'mf', 'mice'] if not args.in_sample: args.imputers = [name for name in args.imputers if name in ['mean', 'mice']] return args class Imputer: short_name: str def __init__(self, method=None, 
is_deterministic=True, in_sample=True): self.name = self.__class__.__name__ self.method = method self.is_deterministic = is_deterministic self.in_sample = in_sample def fit(self, x, mask): if not self.in_sample: x_hat = np.where(mask, x, np.nan) return self.method.fit(x_hat) def predict(self, x, mask): x_hat = np.where(mask, x, np.nan) if self.in_sample: return self.method.fit_transform(x_hat) else: return self.method.transform(x_hat) def params(self): return dict() class SpatialKNNImputer(Imputer): short_name = 'knn' def __init__(self, adj, k=20): super(SpatialKNNImputer, self).__init__() self.k = k # normalize sim between [0, 1] sim = (adj + adj.min()) / (adj.max() + adj.min()) knns = kneighbors_graph(1 - sim, n_neighbors=self.k, include_self=False, metric='precomputed').toarray() self.knns = knns def fit(self, x, mask): pass def predict(self, x, mask): x = np.where(mask, x, 0) with np.errstate(divide='ignore', invalid='ignore'): y_hat = (x @ self.knns.T) / (mask @ self.knns.T) y_hat[~np.isfinite(y_hat)] = x.mean() return np.where(mask, x, y_hat) def params(self): return dict(k=self.k) class MeanImputer(Imputer): short_name = 'mean' def fit(self, x, mask): d = np.where(mask, x, np.nan) self.means = np.nanmean(d, axis=0, keepdims=True) def predict(self, x, mask): if self.in_sample: d = np.where(mask, x, np.nan) means = np.nanmean(d, axis=0, keepdims=True) else: means = self.means return np.where(mask, x, means) class MatrixFactorizationImputer(Imputer): short_name = 'mf' def __init__(self, rank=10, loss='mae', verbose=0): method = MatrixFactorization(rank=rank, loss=loss, verbose=verbose) super(MatrixFactorizationImputer, self).__init__(method, is_deterministic=False, in_sample=True) def params(self): return dict(rank=self.method.rank) class MICEImputer(Imputer): short_name = 'mice' def __init__(self, max_iter=100, n_nearest_features=None, in_sample=True, verbose=False): method = IterativeImputer(max_iter=max_iter, n_nearest_features=n_nearest_features, 
verbose=verbose) is_deterministic = n_nearest_features is None super(MICEImputer, self).__init__(method, is_deterministic=is_deterministic, in_sample=in_sample) def params(self): return dict(max_iter=self.method.max_iter, k=self.method.n_nearest_features or -1) def get_dataset(dataset_name): if dataset_name[:3] == 'air': dataset = datasets.AirQuality(impute_nans=True, small=dataset_name[3:] == '36') elif dataset_name == 'bay': dataset = datasets.MissingValuesPemsBay() elif dataset_name == 'la': dataset = datasets.MissingValuesMetrLA() elif dataset_name == 'la_noise': dataset = datasets.MissingValuesMetrLA(p_fault=0., p_noise=0.25) elif dataset_name == 'bay_noise': dataset = datasets.MissingValuesPemsBay(p_fault=0., p_noise=0.25) else: raise ValueError(f"Dataset {dataset_name} not available in this setting.") # split in train/test if isinstance(dataset, datasets.AirQuality): test_slice = np.in1d(dataset.df.index.month, dataset.test_months) train_slice = ~test_slice else: train_slice = np.zeros(len(dataset)).astype(bool) train_slice[:-int(0.2 * len(dataset))] = True # integrate back eval values in dataset dataset.eval_mask[train_slice] = 0 return dataset, train_slice def get_imputer(imputer_name, args): if imputer_name == 'mean': imputer = MeanImputer(in_sample=args.in_sample) elif imputer_name == 'knn': imputer = SpatialKNNImputer(adj=args.adj, k=args.k) elif imputer_name == 'mf': imputer = MatrixFactorizationImputer(rank=args.rank) elif imputer_name == 'mice': imputer = MICEImputer(max_iter=args.mice_iterations, n_nearest_features=args.mice_n_features, in_sample=args.in_sample) else: raise ValueError(f"Imputer {imputer_name} not available in this setting.") return imputer def run(imputer, dataset, train_slice): test_slice = ~train_slice if args.in_sample: x_train, mask_train = dataset.numpy(), dataset.training_mask y_hat = imputer.predict(x_train, mask_train)[test_slice] else: x_train, mask_train = dataset.numpy()[train_slice], dataset.training_mask[train_slice] 
imputer.fit(x_train, mask_train) x_test, mask_test = dataset.numpy()[test_slice], dataset.training_mask[test_slice] y_hat = imputer.predict(x_test, mask_test) # Evaluate model y_true = dataset.numpy()[test_slice] eval_mask = dataset.eval_mask[test_slice] for metric, metric_fn in metrics.items(): error = metric_fn(y_hat, y_true, eval_mask) print(f'{imputer.name} on {ds_name} {metric}: {error:.4f}') if __name__ == '__main__': args = parse_args() print(args.__dict__) for ds_name in args.datasets: dataset, train_slice = get_dataset(ds_name) args.adj = dataset.get_similarity(thr=0.1) # Instantiate imputers imputers = [get_imputer(name, args) for name in args.imputers] for imputer in imputers: n_runs = 1 if imputer.is_deterministic else args.n_runs for _ in range(n_runs): run(imputer, dataset, train_slice) ================================================ FILE: scripts/run_imputation.py ================================================ import copy import datetime import os import pathlib from argparse import ArgumentParser import numpy as np import pytorch_lightning as pl import torch import torch.nn.functional as F import yaml from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint from pytorch_lightning.loggers import TensorBoardLogger from torch.optim.lr_scheduler import CosineAnnealingLR from lib import fillers, datasets, config from lib.data.datamodule import SpatioTemporalDataModule from lib.data.imputation_dataset import ImputationDataset, GraphImputationDataset from lib.nn import models from lib.nn.utils.metric_base import MaskedMetric from lib.nn.utils.metrics import MaskedMAE, MaskedMAPE, MaskedMSE, MaskedMRE from lib.utils import parser_utils, numpy_metrics, ensure_list, prediction_dataframe from lib.utils.parser_utils import str_to_bool def has_graph_support(model_cls): return model_cls in [models.GRINet, models.MPGRUNet, models.BiMPGRUNet] def get_model_classes(model_str): if model_str == 'brits': model, filler = models.BRITSNet, 
            fillers.BRITSFiller
    elif model_str == 'grin':
        model, filler = models.GRINet, fillers.GraphFiller
    elif model_str == 'mpgru':
        model, filler = models.MPGRUNet, fillers.GraphFiller
    elif model_str == 'bimpgru':
        model, filler = models.BiMPGRUNet, fillers.GraphFiller
    elif model_str == 'var':
        model, filler = models.VARImputer, fillers.Filler
    elif model_str == 'gain':
        model, filler = models.RGAINNet, fillers.RGAINFiller
    elif model_str == 'birnn':
        model, filler = models.BiRNNImputer, fillers.MultiImputationFiller
    elif model_str == 'rnn':
        model, filler = models.RNNImputer, fillers.Filler
    else:
        raise ValueError(f'Model {model_str} not available.')
    return model, filler


def get_dataset(dataset_name):
    if dataset_name[:3] == 'air':
        dataset = datasets.AirQuality(impute_nans=True, small=dataset_name[3:] == '36')
    elif dataset_name == 'bay_block':
        dataset = datasets.MissingValuesPemsBay()
    elif dataset_name == 'la_block':
        dataset = datasets.MissingValuesMetrLA()
    elif dataset_name == 'la_point':
        dataset = datasets.MissingValuesMetrLA(p_fault=0., p_noise=0.25)
    elif dataset_name == 'bay_point':
        dataset = datasets.MissingValuesPemsBay(p_fault=0., p_noise=0.25)
    else:
        raise ValueError(f"Dataset {dataset_name} not available in this setting.")
    return dataset


def parse_args():
    # Argument parser
    parser = ArgumentParser()
    parser.add_argument('--seed', type=int, default=-1)
    parser.add_argument("--model-name", type=str, default='brits')
    parser.add_argument("--dataset-name", type=str, default='air36')
    parser.add_argument("--config", type=str, default=None)
    # Splitting/aggregation params
    parser.add_argument('--in-sample', type=str_to_bool, nargs='?', const=True, default=False)
    parser.add_argument('--val-len', type=float, default=0.1)
    parser.add_argument('--test-len', type=float, default=0.2)
    parser.add_argument('--aggregate-by', type=str, default='mean')
    # Training params
    parser.add_argument('--lr', type=float, default=0.001)
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--patience', type=int, default=40)
    parser.add_argument('--l2-reg', type=float, default=0.)
    parser.add_argument('--scaled-target', type=str_to_bool, nargs='?', const=True, default=True)
    parser.add_argument('--grad-clip-val', type=float, default=5.)
    parser.add_argument('--grad-clip-algorithm', type=str, default='norm')
    parser.add_argument('--loss-fn', type=str, default='l1_loss')
    parser.add_argument('--use-lr-schedule', type=str_to_bool, nargs='?', const=True, default=True)
    parser.add_argument('--consistency-loss', type=str_to_bool, nargs='?', const=True, default=False)
    parser.add_argument('--whiten-prob', type=float, default=0.05)
    parser.add_argument('--pred-loss-weight', type=float, default=1.0)
    parser.add_argument('--warm-up', type=int, default=0)
    # graph params
    parser.add_argument("--adj-threshold", type=float, default=0.1)
    # gain hparams
    parser.add_argument('--alpha', type=float, default=10.)
    parser.add_argument('--hint-rate', type=float, default=0.7)
    parser.add_argument('--g-train-freq', type=int, default=1)
    parser.add_argument('--d-train-freq', type=int, default=5)

    known_args, _ = parser.parse_known_args()
    model_cls, _ = get_model_classes(known_args.model_name)
    parser = model_cls.add_model_specific_args(parser)
    parser = SpatioTemporalDataModule.add_argparse_args(parser)
    parser = ImputationDataset.add_argparse_args(parser)

    args = parser.parse_args()
    if args.config is not None:
        with open(args.config, 'r') as fp:
            config_args = yaml.load(fp, Loader=yaml.FullLoader)
        for arg in config_args:
            setattr(args, arg, config_args[arg])

    return args


def run_experiment(args):
    # Set configuration and seed
    args = copy.deepcopy(args)
    if args.seed < 0:
        args.seed = np.random.randint(1e9)
    torch.set_num_threads(1)
    pl.seed_everything(args.seed)

    model_cls, filler_cls = get_model_classes(args.model_name)
    dataset = get_dataset(args.dataset_name)

    ########################################
    # create logdir and save configuration #
    ########################################
    exp_name = f"{datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}_{args.seed}"
    logdir = os.path.join(config['logs'], args.dataset_name, args.model_name, exp_name)
    # save config for logging
    pathlib.Path(logdir).mkdir(parents=True)
    with open(os.path.join(logdir, 'config.yaml'), 'w') as fp:
        yaml.dump(parser_utils.config_dict_from_args(args), fp, indent=4, sort_keys=True)

    ########################################
    # data module                          #
    ########################################

    # instantiate dataset
    dataset_cls = GraphImputationDataset if has_graph_support(model_cls) else ImputationDataset
    torch_dataset = dataset_cls(*dataset.numpy(return_idx=True),
                                mask=dataset.training_mask,
                                eval_mask=dataset.eval_mask,
                                window=args.window,
                                stride=args.stride)

    # get train/val/test indices
    split_conf = parser_utils.filter_function_args(args, dataset.splitter, return_dict=True)
    train_idxs, val_idxs, test_idxs = dataset.splitter(torch_dataset, **split_conf)

    # configure datamodule
    data_conf = parser_utils.filter_args(args, SpatioTemporalDataModule, return_dict=True)
    dm = SpatioTemporalDataModule(torch_dataset,
                                  train_idxs=train_idxs,
                                  val_idxs=val_idxs,
                                  test_idxs=test_idxs,
                                  **data_conf)
    dm.setup()

    # if out of sample in air, add values removed for evaluation in train set
    if not args.in_sample and args.dataset_name[:3] == 'air':
        dm.torch_dataset.mask[dm.train_slice] |= dm.torch_dataset.eval_mask[dm.train_slice]

    # get adjacency matrix
    adj = dataset.get_similarity(thr=args.adj_threshold)
    # force adj with no self loop
    np.fill_diagonal(adj, 0.)
    ########################################
    # predictor                            #
    ########################################

    # model's inputs
    additional_model_hparams = dict(adj=adj, d_in=dm.d_in, n_nodes=dm.n_nodes)
    model_kwargs = parser_utils.filter_args(args={**vars(args), **additional_model_hparams},
                                            target_cls=model_cls,
                                            return_dict=True)

    # loss and metrics
    loss_fn = MaskedMetric(metric_fn=getattr(F, args.loss_fn),
                           compute_on_step=True,
                           metric_kwargs={'reduction': 'none'})

    metrics = {'mae': MaskedMAE(compute_on_step=False),
               'mape': MaskedMAPE(compute_on_step=False),
               'mse': MaskedMSE(compute_on_step=False),
               'mre': MaskedMRE(compute_on_step=False)}

    # filler's inputs
    scheduler_class = CosineAnnealingLR if args.use_lr_schedule else None
    additional_filler_hparams = dict(model_class=model_cls,
                                     model_kwargs=model_kwargs,
                                     optim_class=torch.optim.Adam,
                                     optim_kwargs={'lr': args.lr,
                                                   'weight_decay': args.l2_reg},
                                     loss_fn=loss_fn,
                                     metrics=metrics,
                                     scheduler_class=scheduler_class,
                                     scheduler_kwargs={
                                         'eta_min': 0.0001,
                                         'T_max': args.epochs
                                     },
                                     alpha=args.alpha,
                                     hint_rate=args.hint_rate,
                                     g_train_freq=args.g_train_freq,
                                     d_train_freq=args.d_train_freq)
    filler_kwargs = parser_utils.filter_args(args={**vars(args), **additional_filler_hparams},
                                             target_cls=filler_cls,
                                             return_dict=True)
    filler = filler_cls(**filler_kwargs)

    ########################################
    # training                             #
    ########################################

    # callbacks
    early_stop_callback = EarlyStopping(monitor='val_mae', patience=args.patience, mode='min')
    checkpoint_callback = ModelCheckpoint(dirpath=logdir, save_top_k=1, monitor='val_mae', mode='min')

    logger = TensorBoardLogger(logdir, name="model")

    trainer = pl.Trainer(max_epochs=args.epochs,
                         logger=logger,
                         default_root_dir=logdir,
                         gpus=1 if torch.cuda.is_available() else None,
                         gradient_clip_val=args.grad_clip_val,
                         gradient_clip_algorithm=args.grad_clip_algorithm,
                         callbacks=[early_stop_callback, checkpoint_callback])

    trainer.fit(filler, datamodule=dm)

    ########################################
    #
    # testing                              #
    ########################################

    filler.load_state_dict(torch.load(checkpoint_callback.best_model_path,
                                      lambda storage, loc: storage)['state_dict'])
    filler.freeze()
    trainer.test()
    filler.eval()

    if torch.cuda.is_available():
        filler.cuda()

    with torch.no_grad():
        y_true, y_hat, mask = filler.predict_loader(dm.test_dataloader(), return_mask=True)
    y_hat = y_hat.detach().cpu().numpy().reshape(y_hat.shape[:3])  # reshape to (eventually) squeeze node channels

    # Test imputations in whole series
    eval_mask = dataset.eval_mask[dm.test_slice]
    df_true = dataset.df.iloc[dm.test_slice]
    metrics = {
        'mae': numpy_metrics.masked_mae,
        'mse': numpy_metrics.masked_mse,
        'mre': numpy_metrics.masked_mre,
        'mape': numpy_metrics.masked_mape
    }
    # Aggregate predictions in dataframes
    index = dm.torch_dataset.data_timestamps(dm.testset.indices, flatten=False)['horizon']
    aggr_methods = ensure_list(args.aggregate_by)
    df_hats = prediction_dataframe(y_hat, index, dataset.df.columns, aggregate_by=aggr_methods)
    df_hats = dict(zip(aggr_methods, df_hats))
    for aggr_by, df_hat in df_hats.items():
        # Compute error
        print(f'- AGGREGATE BY {aggr_by.upper()}')
        for metric_name, metric_fn in metrics.items():
            error = metric_fn(df_hat.values, df_true.values, eval_mask).item()
            print(f' {metric_name}: {error:.4f}')

    return y_true, y_hat, mask


if __name__ == '__main__':
    args = parse_args()
    run_experiment(args)

================================================
FILE: scripts/run_synthetic.py
================================================
import copy
import datetime
import os
import pathlib
from argparse import ArgumentParser

import numpy as np
import pytorch_lightning as pl
import torch
import torch.nn.functional as F
import yaml
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint
from pytorch_lightning.loggers import TensorBoardLogger
from torch.optim.lr_scheduler import CosineAnnealingLR

from lib import fillers, config
from lib.datasets import ChargedParticles
from lib.nn import models
from lib.nn.utils.metric_base import MaskedMetric
from lib.nn.utils.metrics import MaskedMAE, MaskedMAPE, MaskedMSE, MaskedMRE
from lib.utils import parser_utils
from lib.utils.parser_utils import str_to_bool


def has_graph_support(model_cls):
    return model_cls is models.GRINet


def get_model_classes(model_str):
    if model_str == 'brits':
        model, filler = models.BRITSNet, fillers.BRITSFiller
    elif model_str == 'grin':
        model, filler = models.GRINet, fillers.GraphFiller
    else:
        raise ValueError(f'Model {model_str} not available.')
    return model, filler


def parse_args():
    # Argument parser
    parser = ArgumentParser()
    parser.add_argument('--seed', type=int, default=-1)
    # the default must be a name handled by get_model_classes ('brits' or 'grin')
    parser.add_argument("--model-name", type=str, default='grin')
    parser.add_argument("--config", type=str, default=None)
    # Dataset params
    parser.add_argument('--static-adj', type=str_to_bool, nargs='?', const=True, default=False)
    parser.add_argument('--window', type=int, default=50)
    parser.add_argument('--p-block', type=float, default=0.025)
    parser.add_argument('--p-point', type=float, default=0.025)
    parser.add_argument('--min-seq', type=int, default=5)
    parser.add_argument('--max-seq', type=int, default=10)
    parser.add_argument('--use-exogenous', type=str_to_bool, nargs='?', const=True, default=True)
    # Splitting/aggregation params
    parser.add_argument('--val-len', type=float, default=0.1)
    parser.add_argument('--test-len', type=float, default=0.2)
    # Training params
    parser.add_argument('--lr', type=float, default=0.001)
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--patience', type=int, default=40)
    parser.add_argument('--l2-reg', type=float, default=0.)
    parser.add_argument('--scaled-target', type=str_to_bool, nargs='?', const=True, default=False)
    parser.add_argument('--grad-clip-val', type=float, default=5.)
    parser.add_argument('--grad-clip-algorithm', type=str, default='norm')
    parser.add_argument('--loss-fn', type=str, default='mse_loss')
    parser.add_argument('--use-lr-schedule', type=str_to_bool, nargs='?', const=True, default=True)
    parser.add_argument('--whiten-prob', type=float, default=0.05)
    parser.add_argument('--pred-loss-weight', type=float, default=1.0)
    parser.add_argument('--warm-up', type=int, default=0)
    # graph params
    parser.add_argument("--adj-threshold", type=float, default=0.1)

    known_args, _ = parser.parse_known_args()
    model_cls, _ = get_model_classes(known_args.model_name)
    parser = model_cls.add_model_specific_args(parser)

    args = parser.parse_args()
    if args.config is not None:
        with open(args.config, 'r') as fp:
            config_args = yaml.load(fp, Loader=yaml.FullLoader)
        for arg in config_args:
            setattr(args, arg, config_args[arg])

    return args


def run_experiment(args):
    # Set configuration and seed
    args = copy.deepcopy(args)
    if args.seed < 0:
        args.seed = np.random.randint(1e9)
    torch.set_num_threads(1)
    pl.seed_everything(args.seed)

    ########################################
    # load dataset and model               #
    ########################################

    model_cls, filler_cls = get_model_classes(args.model_name)
    dataset = ChargedParticles(static_adj=args.static_adj,
                               window=args.window,
                               p_block=args.p_block,
                               p_point=args.p_point,
                               max_seq=args.max_seq,
                               min_seq=args.min_seq,
                               use_exogenous=args.use_exogenous,
                               graph_mode=has_graph_support(model_cls))

    dataset.split(args.val_len, args.test_len)

    # get adjacency matrix
    adj = dataset.get_similarity()
    np.fill_diagonal(adj, 0.)
    # force adj with no self loop

    ########################################
    # create logdir and save configuration #
    ########################################

    exp_name = f"{datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}_{args.seed}"
    logdir = os.path.join(config['logs'], 'synthetic', args.model_name, exp_name)
    # save config for logging
    pathlib.Path(logdir).mkdir(parents=True)
    with open(os.path.join(logdir, 'config.yaml'), 'w') as fp:
        yaml.dump(parser_utils.config_dict_from_args(args), fp, indent=4, sort_keys=True)

    ########################################
    # predictor                            #
    ########################################

    # model's inputs
    if has_graph_support(model_cls):
        model_params = dict(adj=adj,
                            d_in=dataset.n_channels,
                            d_u=dataset.n_exogenous,
                            n_nodes=dataset.n_nodes)
    else:
        model_params = dict(d_in=(dataset.n_channels * dataset.n_nodes),
                            d_u=(dataset.n_channels * dataset.n_exogenous))
    model_kwargs = parser_utils.filter_args(args={**vars(args), **model_params},
                                            target_cls=model_cls,
                                            return_dict=True)

    # loss and metrics
    loss_fn = MaskedMetric(metric_fn=getattr(F, args.loss_fn),
                           compute_on_step=True,
                           metric_kwargs={'reduction': 'none'})

    metrics = {'mae': MaskedMAE(compute_on_step=False),
               'mape': MaskedMAPE(compute_on_step=False),
               'mse': MaskedMSE(compute_on_step=False),
               'mre': MaskedMRE(compute_on_step=False)}

    # filler's inputs
    scheduler_class = CosineAnnealingLR if args.use_lr_schedule else None
    # GAIN options (alpha, hint_rate, g_train_freq, d_train_freq) are not declared
    # by this script's parser, so reading them from args would raise AttributeError;
    # they are not forwarded to the filler here.
    additional_filler_hparams = dict(model_class=model_cls,
                                     model_kwargs=model_kwargs,
                                     optim_class=torch.optim.Adam,
                                     optim_kwargs={'lr': args.lr,
                                                   'weight_decay': args.l2_reg},
                                     loss_fn=loss_fn,
                                     metrics=metrics,
                                     scheduler_class=scheduler_class,
                                     scheduler_kwargs={
                                         'eta_min': 0.0001,
                                         'T_max': args.epochs
                                     })
    filler_kwargs = parser_utils.filter_args(args={**vars(args), **additional_filler_hparams},
                                             target_cls=filler_cls,
                                             return_dict=True)
    filler = filler_cls(**filler_kwargs)
    ########################################
    # logging options                      #
    ########################################

    # log number of parameters
    args.trainable_parameters = filler.trainable_parameters

    # log statistics on masks
    for mask_type in ['mask', 'eval_mask', 'training_mask']:
        mask_type_mean = getattr(dataset, mask_type).float().mean().item()
        setattr(args, mask_type, mask_type_mean)

    print(args)

    ########################################
    # training                             #
    ########################################

    # callbacks
    early_stop_callback = EarlyStopping(monitor='val_mse', patience=args.patience, mode='min')
    checkpoint_callback = ModelCheckpoint(dirpath=logdir, save_top_k=1, monitor='val_mse', mode='min')

    logger = TensorBoardLogger(logdir, name="model")

    trainer = pl.Trainer(max_epochs=args.epochs,
                         default_root_dir=logdir,
                         logger=logger,
                         gpus=1 if torch.cuda.is_available() else None,
                         gradient_clip_val=args.grad_clip_val,
                         gradient_clip_algorithm=args.grad_clip_algorithm,
                         callbacks=[early_stop_callback, checkpoint_callback])

    trainer.fit(filler,
                train_dataloader=dataset.train_dataloader(batch_size=args.batch_size),
                val_dataloaders=dataset.val_dataloader(batch_size=args.batch_size))

    ########################################
    # testing                              #
    ########################################

    filler.load_state_dict(torch.load(checkpoint_callback.best_model_path,
                                      lambda storage, loc: storage)['state_dict'])
    filler.freeze()
    trainer.test(filler, test_dataloaders=dataset.test_dataloader(batch_size=args.batch_size))
    filler.eval()


if __name__ == '__main__':
    args = parse_args()
    run_experiment(args)