Full Code of googleinterns/deep-stabilization for AI

master 7159c09d21ae cached

65 files

42.8 MB

79.8k tokens

342 symbols

1 requests

Download .txt

Showing preview only (303K chars total). Download the full file or copy to clipboard to get everything.

Repository: googleinterns/deep-stabilization
Branch: master
Commit: 7159c09d21ae
Files: 65
Total size: 42.8 MB

Directory structure:
gitextract__lkvtuhi/

├── .gitignore
├── LICENSE
├── README.md
├── docs/
│   ├── code-of-conduct.md
│   └── contributing.md
└── dvs/
    ├── checkpoint/
    │   └── stabilzation/
    │       └── stabilzation_last.checkpoint
    ├── conf/
    │   ├── stabilzation.yaml
    │   └── stabilzation_train.yaml
    ├── dataset.py
    ├── flownet2/
    │   ├── LICENSE
    │   ├── README.md
    │   ├── __init__.py
    │   ├── convert.py
    │   ├── datasets.py
    │   ├── install.sh
    │   ├── losses.py
    │   ├── main.py
    │   ├── models.py
    │   ├── networks/
    │   │   ├── FlowNetC.py
    │   │   ├── FlowNetFusion.py
    │   │   ├── FlowNetS.py
    │   │   ├── FlowNetSD.py
    │   │   ├── __init__.py
    │   │   ├── channelnorm_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── channelnorm.py
    │   │   │   ├── channelnorm_cuda.cc
    │   │   │   ├── channelnorm_kernel.cu
    │   │   │   ├── channelnorm_kernel.cuh
    │   │   │   └── setup.py
    │   │   ├── correlation_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── correlation.py
    │   │   │   ├── correlation_cuda.cc
    │   │   │   ├── correlation_cuda_kernel.cu
    │   │   │   ├── correlation_cuda_kernel.cuh
    │   │   │   └── setup.py
    │   │   ├── resample2d_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── resample2d.py
    │   │   │   ├── resample2d_cuda.cc
    │   │   │   ├── resample2d_kernel.cu
    │   │   │   ├── resample2d_kernel.cuh
    │   │   │   └── setup.py
    │   │   └── submodules.py
    │   ├── run.sh
    │   ├── run_release.sh
    │   └── utils/
    │       ├── __init__.py
    │       ├── flow_utils.py
    │       ├── frame_utils.py
    │       ├── param_utils.py
    │       └── tools.py
    ├── gyro/
    │   ├── __init__.py
    │   ├── gyro_function.py
    │   └── gyro_io.py
    ├── inference.py
    ├── load_frame_sensor_data.py
    ├── loss.py
    ├── metrics.py
    ├── model.py
    ├── printer.py
    ├── requirements.txt
    ├── train.py
    ├── util.py
    └── warp/
        ├── __init__.py
        ├── rasterizer.py
        ├── read_write.py
        └── warping.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
*.pyc
.torch
_ext
*.o
_ext/
*.png
*.jpg
*.tar
log/*


================================================
FILE: LICENSE
================================================

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.


================================================
FILE: README.md
================================================
# Deep Online Fused Video Stabilization

[[Paper]](https://openaccess.thecvf.com/content/WACV2022/papers/Shi_Deep_Online_Fused_Video_Stabilization_WACV_2022_paper.pdf)[[Supplementary]](https://zhmeishi.github.io/dvs/paper/dvs_supp.pdf)  [[Project Page]](https://zhmeishi.github.io/dvs/) [[Dataset]](https://storage.googleapis.com/dataset_release/all.zip) [[Our Result]](https://storage.googleapis.com/dataset_release/inference_result_release.zip) [[More Results]](https://zhmeishi.github.io/dvs/supp/results.html) 

This repository contains the Pytorch implementation of our method in the paper "Deep Online Fused Video Stabilization".

## Environment Setting
Python version >= 3.6
Pytorch with CUDA >= 1.0.0 (guide is [here](https://pytorch.org/get-started/locally/))
Install other used packages:
```
cd dvs
pip install -r requirements.txt --ignore-installed
```

## Data Preparation
Download sample video [here](https://drive.google.com/file/d/1PpF3-6BbQKy9fldjIfwa5AlbtQflx3sG/view?usp=sharing).
Uncompress the *video* folder under the *dvs* folder.
```
python load_frame_sensor_data.py 
```
Demo of curve visualization:
The **gyro/OIS curve visualization** can be found at *dvs/video/s_114_outdoor_running_trail_daytime/ControlCam_20200930_104820_real.jpg*.


## FlowNet2 Preparation
Note, we provide optical flow result of one test video in our Data Preparation. If you would like to generate them for all test videos, please follow [FlowNet2 official website](https://github.com/NVIDIA/flownet2-pytorch) and guide below. Otherwise, you can skip this section. 

Note, FlowNet2 installation is tricky. Please use Python=3.6 and Pytorch=1.0.0. More details are [here](https://github.com/NVIDIA/flownet2-pytorch/issues/156) or contact us for any questions.

Download FlowNet2 model *FlowNet2_checkpoint.pth.tar* [here](https://drive.google.com/file/d/1hF8vS6YeHkx3j2pfCeQqqZGwA_PJq_Da/view).  Move it under folder *dvs/flownet2*.
```
python warp/read_write.py # video2frames
cd flownet2
bash install.sh # install package
bash run.sh # generate optical flow file for dataset
``` 

## Running Inference 
```
python inference.py
python metrics.py
``` 
The loss and metric information will be printed in the terminal. The metric numbers can be slightly different due to difference on opencv/pytorch versions.  

The result is under *dvs/test/stabilzation*.   
In *s_114_outdoor_running_trail_daytime.jpg*, the blue curve is the output of our models, and the green curve is the input.   
*s_114_outdoor_running_trail_daytime_stab.mp4* is uncropped stabilized video.  
*s_114_outdoor_running_trail_daytime_stab_crop.mp4* is cropped stabilized video. Note, the cropped video is generated after running the metrics code.   

## Training
Download dataset for training and test [here](https://storage.googleapis.com/dataset_release/all.zip). 
Uncompress *all.zip* and move *dataset_release* folder under the *dvs* folder.

Follow FlowNet2 Preparation Section.
```
python warp/read_write.py --dir_path ./dataset_release # video2frames
cd flownet2
bash run_release.sh # generate optical flow file for dataset
``` 

Run training code.
```
python train.py
``` 
The model is saved in *checkpoint/stabilzation_train*.

## Citation 
If you use this code or dataset for your research, please cite our paper.
```
@inproceedings{shi2022deep,
  title={Deep Online Fused Video Stabilization},
  author={Shi, Zhenmei and Shi, Fuhao and Lai, Wei-Sheng and Liang, Chia-Kai and Liang, Yingyu},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={1250--1258},
  year={2022}
}
```


================================================
FILE: docs/code-of-conduct.md
================================================
# Google Open Source Community Guidelines

At Google, we recognize and celebrate the creativity and collaboration of open
source contributors and the diversity of skills, experiences, cultures, and
opinions they bring to the projects and communities they participate in.

Every one of Google's open source projects and communities are inclusive
environments, based on treating all individuals respectfully, regardless of
gender identity and expression, sexual orientation, disabilities,
neurodiversity, physical appearance, body size, ethnicity, nationality, race,
age, religion, or similar personal characteristic.

We value diverse opinions, but we value respectful behavior more.

Respectful behavior includes:

* Being considerate, kind, constructive, and helpful.
* Not engaging in demeaning, discriminatory, harassing, hateful, sexualized, or
  physically threatening behavior, speech, and imagery.
* Not engaging in unwanted physical contact.

Some Google open source projects [may adopt][] an explicit project code of
conduct, which may have additional detailed expectations for participants. Most
of those projects will use our [modified Contributor Covenant][].

[may adopt]: https://opensource.google/docs/releasing/preparing/#conduct
[modified Contributor Covenant]: https://opensource.google/docs/releasing/template/CODE_OF_CONDUCT/

## Resolve peacefully

We do not believe that all conflict is necessarily bad; healthy debate and
disagreement often yields positive results. However, it is never okay to be
disrespectful.

If you see someone behaving disrespectfully, you are encouraged to address the
behavior directly with those involved. Many issues can be resolved quickly and
easily, and this gives people more control over the outcome of their dispute.
If you are unable to resolve the matter for any reason, or if the behavior is
threatening or harassing, report it. We are dedicated to providing an
environment where participants feel welcome and safe.

## Reporting problems

Some Google open source projects may adopt a project-specific code of conduct.
In those cases, a Google employee will be identified as the Project Steward,
who will receive and handle reports of code of conduct violations. In the event
that a project hasn’t identified a Project Steward, you can report problems by
emailing opensource@google.com.

We will investigate every complaint, but you may not receive a direct response.
We will use our discretion in determining when and how to follow up on reported
incidents, which may range from not taking action to permanent expulsion from
the project and project-sponsored spaces. We will notify the accused of the
report and provide them an opportunity to discuss it before any action is
taken. The identity of the reporter will be omitted from the details of the
report supplied to the accused. In potentially harmful situations, such as
ongoing harassment or threats to anyone's safety, we may take action without
notice.

*This document was adapted from the [IndieWeb Code of Conduct][] and can also
be found at <https://opensource.google/conduct/>.*

[IndieWeb Code of Conduct]: https://indieweb.org/code-of-conduct


================================================
FILE: docs/contributing.md
================================================
# How to Contribute

We'd love to accept your patches and contributions to this project. There are
just a few small guidelines you need to follow.

## Contributor License Agreement

Contributions to this project must be accompanied by a Contributor License
Agreement. You (or your employer) retain the copyright to your contribution;
this simply gives us permission to use and redistribute your contributions as
part of the project. Head over to <https://cla.developers.google.com/> to see
your current agreements on file or to sign a new one.

You generally only need to submit a CLA once, so if you've already submitted one
(even if it was for a different project), you probably don't need to do it
again.

## Code reviews

All submissions, including submissions by project members, require review. We
use GitHub pull requests for this purpose. Consult
[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more
information on using pull requests.

## Community Guidelines

This project follows [Google's Open Source Community
Guidelines](https://opensource.google/conduct/).


================================================
FILE: dvs/checkpoint/stabilzation/stabilzation_last.checkpoint
================================================
[File too large to display: 42.5 MB]

================================================
FILE: dvs/conf/stabilzation.yaml
================================================
data:
  exp: 'stabilzation'
  checkpoints_dir: './checkpoint'
  log: './log'
  data_dir: './video'           
  use_cuda: true
  batch_size: 16
  resize_ratio: 0.25
  number_real: 10
  number_virtual: 2
  time_train: 2000  # ms
  sample_freq: 40   # ms
  channel_size: 1
  num_workers: 16                    # num_workers for data_loader
model:
  load_model:  null
  cnn:
    activate_function: relu         # sigmoid, relu, tanh, quadratic
    batch_norm: true
    gap: false
    layers:
  rnn:
    layers:  
    - - 512                        
      - true  
    - - 512                        
      - true    
  fc:
    activate_function: relu
    batch_norm: false               # (batch_norm and drop_out) is False
    layers:  
    - - 256                        
      - true  
    - - 4                         # last layer should be equal to nr_class
      - true
    drop_out: 0
train:
  optimizer: "adam"                  # adam or sgd
  momentum: 0.9                     # for sgd
  decay_epoch: null
  epoch: 400
  snapshot: 2
  init_lr: 0.0001
  lr_decay: 0.5
  lr_step: 200                       # if > 0 decay_epoch should be null
  seed: 1
  weight_decay: 0.0001
  clip_norm: False
  init: "xavier_uniform"            # xavier_uniform or xavier_normal
loss:
  follow: 10
  angle: 1
  smooth: 10 #10
  c2_smooth: 200 #20
  undefine: 2.0
  opt: 0.1
  stay: 0

================================================
FILE: dvs/conf/stabilzation_train.yaml
================================================
data:
  exp: 'stabilzation_train'
  checkpoints_dir: './checkpoint'
  log: './log'
  data_dir: './dataset_release'           
  use_cuda: true
  batch_size: 16
  resize_ratio: 0.25
  number_real: 10
  number_virtual: 2
  time_train: 2000  # ms
  sample_freq: 40   # ms
  channel_size: 1
  num_workers: 16                    # num_workers for data_loader
model:
  load_model:  null
  cnn:
    activate_function: relu         # sigmoid, relu, tanh, quadratic
    batch_norm: true
    gap: false
    layers:
  rnn:
    layers:  
    - - 512                        
      - true  
    - - 512                        
      - true    
  fc:
    activate_function: relu
    batch_norm: false               # (batch_norm and drop_out) is False
    layers:  
    - - 256                        
      - true  
    - - 4                         # last layer should be equal to nr_class
      - true
    drop_out: 0
train:
  optimizer: "adam"                  # adam or sgd
  momentum: 0.9                     # for sgd
  decay_epoch: null
  epoch: 400
  snapshot: 2
  init_lr: 0.0001
  lr_decay: 0.5
  lr_step: 200                       # if > 0 decay_epoch should be null
  seed: 1
  weight_decay: 0.0001
  clip_norm: False
  init: "xavier_uniform"            # xavier_uniform or xavier_normal
loss:
  follow: 10
  angle: 1
  smooth: 10 #10
  c2_smooth: 200 #20
  undefine: 2.0
  opt: 0.1
  stay: 0

================================================
FILE: dvs/dataset.py
================================================
from torch.utils.data import Dataset
import os
import collections
from gyro import (
    LoadGyroData, 
    LoadOISData, 
    LoadFrameData, 
    GetGyroAtTimeStamp, 
    get_static, 
    GetMetadata, 
    GetProjections, 
    train_GetGyroAtTimeStamp,
    QuaternionProduct,
    QuaternionReciprocal,
    FindOISAtTimeStamp,
    norm_quat
    )
import random
import numpy as np
import torchvision.transforms as transforms
import torch
from flownet2 import flow_utils
from scipy import ndimage, misc
from numpy import linalg as LA

def get_data_loader(cf, no_flo = False):
    size = cf["data"]["batch_size"]
    num_workers = cf["data"]["num_workers"]
    train_data, test_data = get_dataset(cf, no_flo)
    trainloader = torch.utils.data.DataLoader(train_data, batch_size=size,shuffle=True, pin_memory=True, num_workers=num_workers)
    testloader = torch.utils.data.DataLoader(test_data, batch_size=size,shuffle=False, pin_memory=True, num_workers=num_workers)
    return trainloader,testloader

def get_dataset(cf, no_flo = False):
    resize_ratio = cf["data"]["resize_ratio"]
    train_transform, test_transform = _data_transforms()
    train_path = os.path.join(cf["data"]["data_dir"], "training")
    test_path = os.path.join(cf["data"]["data_dir"], "test")
    if not os.path.exists(train_path):
        train_path = cf["data"]["data_dir"]
    if not os.path.exists(test_path):
        test_path = cf["data"]["data_dir"]

    train_data = Dataset_Gyro(
        train_path, sample_freq = cf["data"]["sample_freq"]*1000000, number_real = cf["data"]["number_real"], 
        time_train = cf["data"]["time_train"]*1000000, transform = train_transform, resize_ratio = resize_ratio, no_flo = no_flo)
    test_data = Dataset_Gyro(
        test_path, sample_freq = cf["data"]["sample_freq"]*1000000, number_real = cf["data"]["number_real"], 
        time_train = cf["data"]["time_train"]*1000000, transform = test_transform, resize_ratio = resize_ratio, no_flo = no_flo)
    return train_data, test_data

def get_inference_data_loader(cf, data_path, no_flo = False):
    test_data = get_inference_dataset(cf, data_path, no_flo)
    testloader = torch.utils.data.DataLoader(test_data, batch_size=1,shuffle=False, pin_memory=True, num_workers=1)
    return testloader

def get_inference_dataset(cf, data_path, no_flo = False):
    resize_ratio = cf["data"]["resize_ratio"]
    _, test_transform = _data_transforms()
    test_data = Dataset_Gyro(
        data_path, sample_freq = cf["data"]["sample_freq"]*1000000, number_real = cf["data"]["number_real"], 
        time_train = cf["data"]["time_train"]*1000000, transform = test_transform, resize_ratio = resize_ratio,
        inference_only = True, no_flo = no_flo)
    return test_data

def _data_transforms():

    test_transform = transforms.Compose(
        [transforms.ToTensor(),
        ])
    train_transform = transforms.Compose(
        [transforms.ToTensor(),
        ])

    return train_transform, test_transform

class DVS_data():
    def __init__(self):
        self.gyro = None
        self.ois = None
        self.frame = None
        self.length = 0
        self.flo_path = None
        self.flo_shape = None
        self.flo_back_path = None

class Dataset_Gyro(Dataset):
    def __init__(self, path, sample_freq = 33*1000000, number_real = 10, time_train = 2000*1000000, \
        transform = None, inference_only = False, no_flo = False, resize_ratio = 1): 
        r"""
        Arguments:
            sample_freq: real quaternions [t-sample_freq*number_real, t+sample_freq*number_real] ns
            number_real: real gyro num in half time_interval
            time_train: time for a batch ns
        """
        self.sample_freq = sample_freq
        self.number_real = number_real
        self.no_flo = no_flo
        self.resize_ratio = resize_ratio
        self.static_options = get_static()
        self.inference_only = inference_only

        self.ois_ratio = np.array([self.static_options["crop_window_width"] / self.static_options["width"], \
            self.static_options["crop_window_height"] / self.static_options["height"]]) * 0.01
        self.unit_size = 4

        if inference_only:
            self.length = 1
            self.data = [self.process_one_video(path)]
            self.number_train = self.data[0].length  
            return

        self.time_train = time_train
        self.number_train = time_train//self.sample_freq
        
        self.data_name = sorted(os.listdir(path))
        self.length = len(self.data_name)
        self.data = []
        for i in range(self.length):
            self.data.append(self.process_one_video(os.path.join(path,self.data_name[i])))
    
    def process_one_video(self, path):
        dvs_data = DVS_data()
        files = sorted(os.listdir(path))
        print(path)
        for f in files:
            file_path = os.path.join(path,f)
            if "gimbal" in file_path.lower():
                continue
            if "frame" in f and "txt" in f:
                dvs_data.frame = LoadFrameData(file_path)
                print("frame:", dvs_data.frame.shape, end="    ")
            elif "gyro" in f:
                dvs_data.gyro = LoadGyroData(file_path)
                dvs_data.gyro = preprocess_gyro(dvs_data.gyro) 
                print("gyro:", dvs_data.gyro.shape, end="    ")
            elif "ois" in f and "txt" in f:
                dvs_data.ois = LoadOISData(file_path)
                print("ois:", dvs_data.ois.shape, end="    ")
            elif f == "flo":
                dvs_data.flo_path, dvs_data.flo_shape = LoadFlow(file_path)
                print("flo_path:", len(dvs_data.flo_path), end="    ")
                print("flo_shape:", dvs_data.flo_shape, end="    ")
            elif f == "flo_back":
                dvs_data.flo_back_path, _ = LoadFlow(file_path)
            
        print()
        if dvs_data.flo_path is not None:
            dvs_data.length = min(dvs_data.frame.shape[0] - 1, len(dvs_data.flo_path))
        else:
            dvs_data.length = dvs_data.frame.shape[0] - 1 
        return dvs_data

    def generate_quaternions(self, dvs_data):
        first_id = random.randint(0, dvs_data.length - self.number_train) + 1 # skip the first frame

        sample_data = np.zeros((self.number_train, 2 * self.number_real + 1, self.unit_size), dtype=np.float32)
        sample_ois = np.zeros((self.number_train, 2), dtype=np.float32)

        sample_time = np.zeros((self.number_train+1), dtype=np.float32)
        sample_time[0] = get_timestamp(dvs_data.frame, first_id - 1)

        real_postion = np.zeros((self.number_train, 4), dtype=np.float32)

        time_start = sample_time[0]

        for i in range(self.number_train):
            sample_time[i+1] = get_timestamp(dvs_data.frame, first_id + i)
            real_postion[i] = GetGyroAtTimeStamp(dvs_data.gyro, sample_time[i+1] - self.sample_freq)
            sample_ois[i] = self.get_ois_at_timestamp(dvs_data.ois, sample_time[i+1])
            for j in range(-self.number_real, self.number_real+1):
                index = j + self.number_real
                time_stamp = sample_time[i+1] + self.sample_freq * j 
                sample_data[i, index] = self.get_data_at_timestamp(dvs_data.gyro, dvs_data.ois, time_stamp, real_postion[i])
                
        sample_data = np.reshape(sample_data, (self.number_train, (2*self.number_real+1) * self.unit_size))
        return sample_data, sample_time, first_id, real_postion, sample_ois

    def load_flo(self, idx, first_id):
        shape = self.data[idx].flo_shape
        h, w = shape[0], shape[1]
        flo = np.zeros((self.number_train, h, w, 2))
        flo_back = np.zeros((self.number_train, h, w, 2))

        for i in range(self.number_train):
            frame_id = i + first_id
            f = flow_utils.readFlow(self.data[idx].flo_path[frame_id-1]).astype(np.float32) 
            flo[i] = f

            f_b = flow_utils.readFlow(self.data[idx].flo_back_path[frame_id-1]).astype(np.float32) 
            flo_back[i] = f_b

        return flo, flo_back

    def load_real_projections(self, idx, first_id):
        real_projections = np.zeros((self.number_train + 1, self.static_options["num_grid_rows"], 3, 3))
        for i in range(self.number_train + 1):
            frame_id = i + first_id
            metadata = GetMetadata(self.data[idx].frame, frame_id - 1)
            real_projections[i] = np.array(GetProjections(self.static_options, metadata, self.data[idx].gyro, np.zeros(self.data[idx].ois.shape), no_shutter = True))
        return real_projections

    def __getitem__(self, idx):
        inputs, times, first_id, real_postion, ois = self.generate_quaternions(self.data[idx]) 
        real_projections = self.load_real_projections(idx, first_id)
        if self.no_flo:
            flo, flo_back = 0, 0
        else:
            flo, flo_back = self.load_flo(idx, first_id)
        return inputs, times, flo, flo_back, real_projections, real_postion, ois, idx

    def __len__(self):
        return self.length

    def get_virtual_data(self, virtual_queue, real_queue_idx, pre_times, cur_times, time_start, batch_size, number_virtual, quat_t_1):
        # virtual_queue: [batch_size, num, 5 (timestamp, quats)]
        # eular angle, 
        # deta R angular velocity [Q't-1, Q't-2] 
        # output virtual angular velocity, x, x*dtime => detaQt
        virtual_data = np.zeros((batch_size, number_virtual, 4), dtype=np.float32)
        vt_1 = np.zeros((batch_size, 4), dtype=np.float32)
        quat_t_1 = quat_t_1.numpy()
        for i in range(batch_size):
            sample_time = cur_times[i]
            for j in range(number_virtual):
                time_stamp = sample_time - self.sample_freq * (number_virtual - j) 
                virtual_data[i, j] = get_virtual_at_timestamp(virtual_queue[i], self.data[real_queue_idx[i]].gyro, time_stamp, time_start[i], quat_t_1[i])
            vt_1[i] = get_virtual_at_timestamp(virtual_queue[i], self.data[real_queue_idx[i]].gyro, pre_times[i], time_start[i], None)
        virtual_data = np.reshape(virtual_data, (batch_size, number_virtual * 4))
        return torch.tensor(virtual_data, dtype=torch.float), torch.tensor(vt_1, dtype=torch.float)

    def update_virtual_queue(self, batch_size, virtual_queue, out, times):
        virtual_data = np.zeros((batch_size, 5))
        virtual_data[:,0] = times
        virtual_data[:, 1:] = out
        virtual_data = np.expand_dims(virtual_data, axis = 1)

        if None in virtual_queue:
            virtual_queue = virtual_data
        else:
            virtual_queue = np.concatenate((virtual_queue, virtual_data), axis = 1)
        return virtual_queue

    def random_init_virtual_queue(self, batch_size, real_postion, times):
        virtual_queue = np.zeros((batch_size, 3, 5))
        virtual_queue[:, 2, 0] = times - 0.1 * self.sample_freq
        virtual_queue[:, 1, 0] = times - 1.1 * self.sample_freq
        virtual_queue[:, 0, 0] = times - 2.1 * self.sample_freq
        for i in range(batch_size):
            quat = np.random.uniform(low=-0.06, high= 0.06, size=4) # transfer to angle # 0.05
            quat[3] = 1
            quat = quat / LA.norm(quat)
            quat = norm_quat(QuaternionProduct(real_postion[i], quat))
            virtual_queue[i, 2, 1:] = quat
            virtual_queue[i, 1, 1:] = quat
            virtual_queue[i, 0, 1:] = quat
        return virtual_queue

    def get_data_at_timestamp(self, gyro_data, ois_data, time_stamp, quat_t_1):
        quat_t = GetGyroAtTimeStamp(gyro_data, time_stamp)
        quat_dif = QuaternionProduct(quat_t, QuaternionReciprocal(quat_t_1))  
        return quat_dif

    def get_ois_at_timestamp(self, ois_data, time_stamp):
        ois_t = FindOISAtTimeStamp(ois_data, time_stamp)
        ois_t = np.array(ois_t) / self.ois_ratio
        return ois_t

def get_timestamp(frame_data, idx):
    sample_time = frame_data[idx, 0]
    metadata = GetMetadata(frame_data, idx)
    timestmap_ns = metadata["timestamp_ns"] + metadata["rs_time_ns"] * 0.5
    return timestmap_ns

def preprocess_gyro(gyro, extend = 200):
    fake_gyro = np.zeros((extend, 5))
    time_start = gyro[0,0]
    for i in range(extend):
        fake_gyro[-i-1, 0] = time_start - (gyro[i+1, 0] - time_start)
        fake_gyro[-i-1, 4] = gyro[i+1, 4]
        fake_gyro[-i-1, 1:4] = -gyro[i+1, 1:4]

    new_gyro = np.concatenate((fake_gyro, gyro), axis = 0)
    return new_gyro

def LoadFlow(path):
    file_names = sorted(os.listdir(path))
    file_path =[]
    for n in file_names:
        file_path.append(os.path.join(path, n))
    return file_path, flow_utils.readFlow(file_path[0]).shape

def get_virtual_at_timestamp(virtual_queue, real_queue, time_stamp, time_start, quat_t_1 = None, sample_freq = None):
    if virtual_queue is None:
        quat_t = GetGyroAtTimeStamp(real_queue, time_stamp)
    else:
        quat_t = train_GetGyroAtTimeStamp(virtual_queue, time_stamp)
        if quat_t is None:
            quat_t = GetGyroAtTimeStamp(real_queue, time_stamp)
            
    if quat_t_1 is None:
        return quat_t
    else:
        quat_dif = QuaternionProduct(quat_t, QuaternionReciprocal(quat_t_1))  
        return quat_dif


================================================
FILE: dvs/flownet2/LICENSE
================================================
Copyright 2017 NVIDIA CORPORATION

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

================================================
FILE: dvs/flownet2/README.md
================================================
# flownet2-pytorch 

Pytorch implementation of [FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks](https://arxiv.org/abs/1612.01925). 

Multiple GPU training is supported, and the code provides examples for training or inference on [MPI-Sintel](http://sintel.is.tue.mpg.de/) clean and final datasets. The same commands can be used for training or inference with other datasets. See below for more detail.

Inference using fp16 (half-precision) is also supported.

For more help, type <br />
    
    python main.py --help

## Network architectures
Below are the different flownet neural network architectures that are provided. <br />
A batchnorm version for each network is also available.

 - **FlowNet2S**
 - **FlowNet2C**
 - **FlowNet2CS**
 - **FlowNet2CSS**
 - **FlowNet2SD**
 - **FlowNet2**

## Custom layers

`FlowNet2` or `FlowNet2C*` achitectures rely on custom layers `Resample2d` or `Correlation`. <br />
A pytorch implementation of these layers with cuda kernels are available at [./networks](./networks). <br />
Note : Currently, half precision kernels are not available for these layers.

## Data Loaders

Dataloaders for FlyingChairs, FlyingThings, ChairsSDHom and ImagesFromFolder are available in [datasets.py](./datasets.py). <br />

## Loss Functions

L1 and L2 losses with multi-scale support are available in [losses.py](./losses.py). <br />

## Installation 

    # get flownet2-pytorch source
    git clone https://github.com/NVIDIA/flownet2-pytorch.git
    cd flownet2-pytorch

    # install custom layers
    bash install.sh
    
### Python requirements 
Currently, the code supports python 3
* numpy 
* PyTorch ( == 0.4.1, for <= 0.4.0 see branch [python36-PyTorch0.4](https://github.com/NVIDIA/flownet2-pytorch/tree/python36-PyTorch0.4))
* scipy 
* scikit-image
* tensorboardX
* colorama, tqdm, setproctitle 

## Converted Caffe Pre-trained Models
We've included caffe pre-trained models. Should you use these pre-trained weights, please adhere to the [license agreements](https://drive.google.com/file/d/1TVv0BnNFh3rpHZvD-easMb9jYrPE2Eqd/view?usp=sharing). 

* [FlowNet2](https://drive.google.com/file/d/1hF8vS6YeHkx3j2pfCeQqqZGwA_PJq_Da/view?usp=sharing)[620MB]
* [FlowNet2-C](https://drive.google.com/file/d/1BFT6b7KgKJC8rA59RmOVAXRM_S7aSfKE/view?usp=sharing)[149MB]
* [FlowNet2-CS](https://drive.google.com/file/d/1iBJ1_o7PloaINpa8m7u_7TsLCX0Dt_jS/view?usp=sharing)[297MB]
* [FlowNet2-CSS](https://drive.google.com/file/d/157zuzVf4YMN6ABAQgZc8rRmR5cgWzSu8/view?usp=sharing)[445MB]
* [FlowNet2-CSS-ft-sd](https://drive.google.com/file/d/1R5xafCIzJCXc8ia4TGfC65irmTNiMg6u/view?usp=sharing)[445MB]
* [FlowNet2-S](https://drive.google.com/file/d/1V61dZjFomwlynwlYklJHC-TLfdFom3Lg/view?usp=sharing)[148MB]
* [FlowNet2-SD](https://drive.google.com/file/d/1QW03eyYG_vD-dT-Mx4wopYvtPu_msTKn/view?usp=sharing)[173MB]
    
## Inference
    # Example on MPISintel Clean   
    python main.py --inference --model FlowNet2 --save_flow --inference_dataset MpiSintelClean \
    --inference_dataset_root /path/to/mpi-sintel/clean/dataset \
    --resume /path/to/checkpoints 
    
## Training and validation

    # Example on MPISintel Final and Clean, with L1Loss on FlowNet2 model
    python main.py --batch_size 8 --model FlowNet2 --loss=L1Loss --optimizer=Adam --optimizer_lr=1e-4 \
    --training_dataset MpiSintelFinal --training_dataset_root /path/to/mpi-sintel/final/dataset  \
    --validation_dataset MpiSintelClean --validation_dataset_root /path/to/mpi-sintel/clean/dataset

    # Example on MPISintel Final and Clean, with MultiScale loss on FlowNet2C model 
    python main.py --batch_size 8 --model FlowNet2C --optimizer=Adam --optimizer_lr=1e-4 --loss=MultiScale --loss_norm=L1 \
    --loss_numScales=5 --loss_startScale=4 --optimizer_lr=1e-4 --crop_size 384 512 \
    --training_dataset FlyingChairs --training_dataset_root /path/to/flying-chairs/dataset  \
    --validation_dataset MpiSintelClean --validation_dataset_root /path/to/mpi-sintel/clean/dataset
    
## Results on MPI-Sintel
[![Predicted flows on MPI-Sintel](./image.png)](https://www.youtube.com/watch?v=HtBmabY8aeU "Predicted flows on MPI-Sintel")

## Reference 
If you find this implementation useful in your work, please acknowledge it appropriately and cite the paper:
````
@InProceedings{IMKDB17,
  author       = "E. Ilg and N. Mayer and T. Saikia and M. Keuper and A. Dosovitskiy and T. Brox",
  title        = "FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks",
  booktitle    = "IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
  month        = "Jul",
  year         = "2017",
  url          = "http://lmb.informatik.uni-freiburg.de//Publications/2017/IMKDB17"
}
````
```
@misc{flownet2-pytorch,
  author = {Fitsum Reda and Robert Pottorff and Jon Barker and Bryan Catanzaro},
  title = {flownet2-pytorch: Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks},
  year = {2017},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/NVIDIA/flownet2-pytorch}}
}
```
## Related Optical Flow Work from Nvidia 
Code (in Caffe and Pytorch): [PWC-Net](https://github.com/NVlabs/PWC-Net) <br />
Paper : [PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume](https://arxiv.org/abs/1709.02371). 

## Acknowledgments
Parts of this code were derived, as noted in the code, from [ClementPinard/FlowNetPytorch](https://github.com/ClementPinard/FlowNetPytorch).


================================================
FILE: dvs/flownet2/__init__.py
================================================
from .utils import flow_utils, tools

================================================
FILE: dvs/flownet2/convert.py
================================================
#!/usr/bin/env python2.7

import caffe
from caffe.proto import caffe_pb2
import sys, os

import torch
import torch.nn as nn

import argparse, tempfile
import numpy as np

parser = argparse.ArgumentParser()
parser.add_argument('caffe_model', help='input model in hdf5 or caffemodel format')
parser.add_argument('prototxt_template',help='prototxt template')
parser.add_argument('flownet2_pytorch', help='path to flownet2-pytorch')

args = parser.parse_args()

args.rgb_max = 255
args.fp16 = False
args.grads = {}

# load models
sys.path.append(args.flownet2_pytorch)

import models
from utils.param_utils import *

width = 256
height = 256
keys = {'TARGET_WIDTH': width, 
        'TARGET_HEIGHT': height,
        'ADAPTED_WIDTH':width,
        'ADAPTED_HEIGHT':height,
        'SCALE_WIDTH':1.,
        'SCALE_HEIGHT':1.,}

template = '\n'.join(np.loadtxt(args.prototxt_template, dtype=str, delimiter='\n'))
for k in keys:
    template = template.replace('$%s$'%(k),str(keys[k]))

prototxt = tempfile.NamedTemporaryFile(mode='w', delete=True)
prototxt.write(template)
prototxt.flush()

net = caffe.Net(prototxt.name, args.caffe_model, caffe.TEST)

weights = {}
biases = {}

for k, v in list(net.params.items()):
    weights[k] = np.array(v[0].data).reshape(v[0].data.shape)
    biases[k] = np.array(v[1].data).reshape(v[1].data.shape)
    print((k, weights[k].shape, biases[k].shape))

if 'FlowNet2/' in args.caffe_model:
    model = models.FlowNet2(args)

    parse_flownetc(model.flownetc.modules(), weights, biases)
    parse_flownets(model.flownets_1.modules(), weights, biases, param_prefix='net2_')
    parse_flownets(model.flownets_2.modules(), weights, biases, param_prefix='net3_')
    parse_flownetsd(model.flownets_d.modules(), weights, biases, param_prefix='netsd_')
    parse_flownetfusion(model.flownetfusion.modules(), weights, biases, param_prefix='fuse_')

    state = {'epoch': 0,
             'state_dict': model.state_dict(),
             'best_EPE': 1e10}
    torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2_checkpoint.pth.tar'))

elif 'FlowNet2-C/' in args.caffe_model:
    model = models.FlowNet2C(args)

    parse_flownetc(model.modules(), weights, biases)
    state = {'epoch': 0,
             'state_dict': model.state_dict(),
             'best_EPE': 1e10}
    torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-C_checkpoint.pth.tar'))

elif 'FlowNet2-CS/' in args.caffe_model:
    model = models.FlowNet2CS(args)

    parse_flownetc(model.flownetc.modules(), weights, biases)
    parse_flownets(model.flownets_1.modules(), weights, biases, param_prefix='net2_')

    state = {'epoch': 0,
             'state_dict': model.state_dict(),
             'best_EPE': 1e10}
    torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-CS_checkpoint.pth.tar'))

elif 'FlowNet2-CSS/' in args.caffe_model:
    model = models.FlowNet2CSS(args)

    parse_flownetc(model.flownetc.modules(), weights, biases)
    parse_flownets(model.flownets_1.modules(), weights, biases, param_prefix='net2_')
    parse_flownets(model.flownets_2.modules(), weights, biases, param_prefix='net3_')

    state = {'epoch': 0,
             'state_dict': model.state_dict(),
             'best_EPE': 1e10}
    torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-CSS_checkpoint.pth.tar'))

elif 'FlowNet2-CSS-ft-sd/' in args.caffe_model:
    model = models.FlowNet2CSS(args)

    parse_flownetc(model.flownetc.modules(), weights, biases)
    parse_flownets(model.flownets_1.modules(), weights, biases, param_prefix='net2_')
    parse_flownets(model.flownets_2.modules(), weights, biases, param_prefix='net3_')

    state = {'epoch': 0,
             'state_dict': model.state_dict(),
             'best_EPE': 1e10}
    torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-CSS-ft-sd_checkpoint.pth.tar'))

elif 'FlowNet2-S/' in args.caffe_model:
    model = models.FlowNet2S(args)

    parse_flownetsonly(model.modules(), weights, biases, param_prefix='')
    state = {'epoch': 0,
             'state_dict': model.state_dict(),
             'best_EPE': 1e10}
    torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-S_checkpoint.pth.tar'))

elif 'FlowNet2-SD/' in args.caffe_model:
    model = models.FlowNet2SD(args)

    parse_flownetsd(model.modules(), weights, biases, param_prefix='')

    state = {'epoch': 0,
             'state_dict': model.state_dict(),
             'best_EPE': 1e10}
    torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-SD_checkpoint.pth.tar'))

else:
    print(('model type cound not be determined from input caffe model %s'%(args.caffe_model)))
    quit()
print(("done converting ", args.caffe_model))

================================================
FILE: dvs/flownet2/datasets.py
================================================
import torch
import torch.utils.data as data

import os, math, random
from os.path import *
import numpy as np

from glob import glob
import utils.frame_utils as frame_utils

from imageio import imread

class StaticRandomCrop(object):
    def __init__(self, image_size, crop_size):
        self.th, self.tw = crop_size
        h, w = image_size
        self.h1 = random.randint(0, h - self.th)
        self.w1 = random.randint(0, w - self.tw)

    def __call__(self, img):
        return img[self.h1:(self.h1+self.th), self.w1:(self.w1+self.tw),:]

class StaticCenterCrop(object):
    def __init__(self, image_size, crop_size):
        self.th, self.tw = crop_size
        self.h, self.w = image_size
    def __call__(self, img):
        return img[(self.h-self.th)//2:(self.h+self.th)//2, (self.w-self.tw)//2:(self.w+self.tw)//2,:]

class Padding(object):
    def __init__(self, image_size, pad_size):
        self.th, self.tw = pad_size
        self.h, self.w = image_size
    def __call__(self, img):
        out = np.zeros((self.th, self.tw, 3))
        out[:self.h, :self.w,:] = img
        return out

class MpiSintel(data.Dataset):
    def __init__(self, args, is_cropped = False, root = '', dstype = 'clean', replicates = 1):
        self.args = args
        self.is_cropped = is_cropped
        self.crop_size = args.crop_size
        self.render_size = args.inference_size
        self.replicates = replicates

        flow_root = join(root, 'flow')
        image_root = join(root, dstype)

        file_list = sorted(glob(join(flow_root, '*/*.flo')))

        self.flow_list = []
        self.image_list = []

        for file in file_list:
            if 'test' in file:
                # print file
                continue

            fbase = file[len(flow_root)+1:]
            fprefix = fbase[:-8]
            fnum = int(fbase[-8:-4])

            img1 = join(image_root, fprefix + "%04d"%(fnum+0) + '.png')
            img2 = join(image_root, fprefix + "%04d"%(fnum+1) + '.png')

            if not isfile(img1) or not isfile(img2) or not isfile(file):
                continue

            self.image_list += [[img1, img2]]
            self.flow_list += [file]

        self.size = len(self.image_list)

        self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape

        if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
            self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
            self.render_size[1] = ( (self.frame_size[1])//64 ) * 64

        args.inference_size = self.render_size

        assert (len(self.image_list) == len(self.flow_list))

    def __getitem__(self, index):

        index = index % self.size

        img1 = frame_utils.read_gen(self.image_list[index][0])
        img2 = frame_utils.read_gen(self.image_list[index][1])

        flow = frame_utils.read_gen(self.flow_list[index])

        images = [img1, img2]
        image_size = img1.shape[:2]

        if self.is_cropped:
            cropper = StaticRandomCrop(image_size, self.crop_size)
        else:
            cropper = StaticCenterCrop(image_size, self.render_size)
        images = list(map(cropper, images))
        flow = cropper(flow)

        images = np.array(images).transpose(3,0,1,2)
        flow = flow.transpose(2,0,1)

        images = torch.from_numpy(images.astype(np.float32))
        flow = torch.from_numpy(flow.astype(np.float32))

        return [images], [flow]

    def __len__(self):
        return self.size * self.replicates

class MpiSintelClean(MpiSintel):
    def __init__(self, args, is_cropped = False, root = '', replicates = 1):
        super(MpiSintelClean, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'clean', replicates = replicates)

class MpiSintelFinal(MpiSintel):
    def __init__(self, args, is_cropped = False, root = '', replicates = 1):
        super(MpiSintelFinal, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'final', replicates = replicates)

class FlyingChairs(data.Dataset):
  def __init__(self, args, is_cropped, root = '/path/to/FlyingChairs_release/data', replicates = 1):
    self.args = args
    self.is_cropped = is_cropped
    self.crop_size = args.crop_size
    self.render_size = args.inference_size
    self.replicates = replicates

    images = sorted( glob( join(root, '*.ppm') ) )

    self.flow_list = sorted( glob( join(root, '*.flo') ) )

    assert (len(images)//2 == len(self.flow_list))

    self.image_list = []
    for i in range(len(self.flow_list)):
        im1 = images[2*i]
        im2 = images[2*i + 1]
        self.image_list += [ [ im1, im2 ] ]

    assert len(self.image_list) == len(self.flow_list)

    self.size = len(self.image_list)

    self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape

    if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
        self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
        self.render_size[1] = ( (self.frame_size[1])//64 ) * 64

    args.inference_size = self.render_size

  def __getitem__(self, index):
    index = index % self.size

    img1 = frame_utils.read_gen(self.image_list[index][0])
    img2 = frame_utils.read_gen(self.image_list[index][1])

    flow = frame_utils.read_gen(self.flow_list[index])

    images = [img1, img2]
    image_size = img1.shape[:2]
    if self.is_cropped:
        cropper = StaticRandomCrop(image_size, self.crop_size)
    else:
        cropper = StaticCenterCrop(image_size, self.render_size)
    images = list(map(cropper, images))
    flow = cropper(flow)


    images = np.array(images).transpose(3,0,1,2)
    flow = flow.transpose(2,0,1)

    images = torch.from_numpy(images.astype(np.float32))
    flow = torch.from_numpy(flow.astype(np.float32))

    return [images], [flow]

  def __len__(self):
    return self.size * self.replicates

class FlyingThings(data.Dataset):
  def __init__(self, args, is_cropped, root = '/path/to/flyingthings3d', dstype = 'frames_cleanpass', replicates = 1):
    self.args = args
    self.is_cropped = is_cropped
    self.crop_size = args.crop_size
    self.render_size = args.inference_size
    self.replicates = replicates

    image_dirs = sorted(glob(join(root, dstype, 'TRAIN/*/*')))
    image_dirs = sorted([join(f, 'left') for f in image_dirs] + [join(f, 'right') for f in image_dirs])

    flow_dirs = sorted(glob(join(root, 'optical_flow_flo_format/TRAIN/*/*')))
    flow_dirs = sorted([join(f, 'into_future/left') for f in flow_dirs] + [join(f, 'into_future/right') for f in flow_dirs])

    assert (len(image_dirs) == len(flow_dirs))

    self.image_list = []
    self.flow_list = []

    for idir, fdir in zip(image_dirs, flow_dirs):
        images = sorted( glob(join(idir, '*.png')) )
        flows = sorted( glob(join(fdir, '*.flo')) )
        for i in range(len(flows)):
            self.image_list += [ [ images[i], images[i+1] ] ]
            self.flow_list += [flows[i]]

    assert len(self.image_list) == len(self.flow_list)

    self.size = len(self.image_list)

    self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape

    if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
        self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
        self.render_size[1] = ( (self.frame_size[1])//64 ) * 64

    args.inference_size = self.render_size

  def __getitem__(self, index):
    index = index % self.size

    img1 = frame_utils.read_gen(self.image_list[index][0])
    img2 = frame_utils.read_gen(self.image_list[index][1])

    flow = frame_utils.read_gen(self.flow_list[index])

    images = [img1, img2]
    image_size = img1.shape[:2]
    if self.is_cropped:
        cropper = StaticRandomCrop(image_size, self.crop_size)
    else:
        cropper = StaticCenterCrop(image_size, self.render_size)
    images = list(map(cropper, images))
    flow = cropper(flow)


    images = np.array(images).transpose(3,0,1,2)
    flow = flow.transpose(2,0,1)

    images = torch.from_numpy(images.astype(np.float32))
    flow = torch.from_numpy(flow.astype(np.float32))

    return [images], [flow]

  def __len__(self):
    return self.size * self.replicates

class FlyingThingsClean(FlyingThings):
    def __init__(self, args, is_cropped = False, root = '', replicates = 1):
        super(FlyingThingsClean, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'frames_cleanpass', replicates = replicates)

class FlyingThingsFinal(FlyingThings):
    def __init__(self, args, is_cropped = False, root = '', replicates = 1):
        super(FlyingThingsFinal, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'frames_finalpass', replicates = replicates)

class ChairsSDHom(data.Dataset):
  def __init__(self, args, is_cropped, root = '/path/to/chairssdhom/data', dstype = 'train', replicates = 1):
    self.args = args
    self.is_cropped = is_cropped
    self.crop_size = args.crop_size
    self.render_size = args.inference_size
    self.replicates = replicates

    image1 = sorted( glob( join(root, dstype, 't0/*.png') ) )
    image2 = sorted( glob( join(root, dstype, 't1/*.png') ) )
    self.flow_list = sorted( glob( join(root, dstype, 'flow/*.flo') ) )

    assert (len(image1) == len(self.flow_list))

    self.image_list = []
    for i in range(len(self.flow_list)):
        im1 = image1[i]
        im2 = image2[i]
        self.image_list += [ [ im1, im2 ] ]

    assert len(self.image_list) == len(self.flow_list)

    self.size = len(self.image_list)

    self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape

    if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
        self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
        self.render_size[1] = ( (self.frame_size[1])//64 ) * 64

    args.inference_size = self.render_size

  def __getitem__(self, index):
    index = index % self.size

    img1 = frame_utils.read_gen(self.image_list[index][0])
    img2 = frame_utils.read_gen(self.image_list[index][1])

    flow = frame_utils.read_gen(self.flow_list[index])
    flow = flow[::-1,:,:]

    images = [img1, img2]
    image_size = img1.shape[:2]
    if self.is_cropped:
        cropper = StaticRandomCrop(image_size, self.crop_size)
    else:
        cropper = StaticCenterCrop(image_size, self.render_size)
    images = list(map(cropper, images))
    flow = cropper(flow)


    images = np.array(images).transpose(3,0,1,2)
    flow = flow.transpose(2,0,1)

    images = torch.from_numpy(images.astype(np.float32))
    flow = torch.from_numpy(flow.astype(np.float32))

    return [images], [flow]

  def __len__(self):
    return self.size * self.replicates

class ChairsSDHomTrain(ChairsSDHom):
    def __init__(self, args, is_cropped = False, root = '', replicates = 1):
        super(ChairsSDHomTrain, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'train', replicates = replicates)

class ChairsSDHomTest(ChairsSDHom):
    def __init__(self, args, is_cropped = False, root = '', replicates = 1):
        super(ChairsSDHomTest, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'test', replicates = replicates)

class ImagesFromFolder(data.Dataset):
  def __init__(self, args, is_cropped, root = '/path/to/frames/only/folder', iext = 'png', replicates = 1):
    self.args = args
    self.is_cropped = is_cropped
    self.crop_size = args.crop_size
    self.render_size = args.inference_size
    self.replicates = replicates

    images = sorted( glob( join(root, '*.' + iext) ) )
    self.image_list = []
    for i in range(len(images)-1):
        im1 = images[i]
        im2 = images[i+1]
        self.image_list += [ [ im1, im2 ] ]

    self.size = len(self.image_list)

    self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape

    if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
        self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
        self.render_size[1] = ( (self.frame_size[1])//64 ) * 64

    args.inference_size = self.render_size

  def __getitem__(self, index):
    index = index % self.size

    img1 = frame_utils.read_gen(self.image_list[index][0])
    img2 = frame_utils.read_gen(self.image_list[index][1])

    images = [img1, img2]
    image_size = img1.shape[:2]
    if self.is_cropped:
        cropper = StaticRandomCrop(image_size, self.crop_size)
    else:
        cropper = StaticCenterCrop(image_size, self.render_size)
    images = list(map(cropper, images))
    
    images = np.array(images).transpose(3,0,1,2)
    images = torch.from_numpy(images.astype(np.float32))

    return [images], [torch.zeros(images.size()[0:1] + (2,) + images.size()[-2:])]

  def __len__(self):
    return self.size * self.replicates


class Google(data.Dataset):
    def __init__(self, args, is_cropped = False, root = '', dstype = 'frames', replicates = 1):
        self.args = args
        self.is_cropped = is_cropped
        self.crop_size = args.crop_size
        self.render_size = args.inference_size
        self.replicates = replicates

        image_root = join(root, dstype)

        file_list = sorted(glob(join(image_root, '*.png')))

        self.image_list = []

        for i in range(len(file_list)-1):

            img1 = join(file_list[i])
            img2 = join(file_list[i+1])

            if not isfile(img1) or not isfile(img2):
                continue

            self.image_list += [[img1, img2]]

        self.size = len(self.image_list)

        self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape

        if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
            self.render_size[0] = ( math.ceil(self.frame_size[0]/64) ) * 64
            self.render_size[1] = ( math.ceil(self.frame_size[1]/64) ) * 64


        args.inference_size = self.render_size

    def __getitem__(self, index):

        index = index % self.size

        img1 = frame_utils.read_gen(self.image_list[index][0])
        img2 = frame_utils.read_gen(self.image_list[index][1])

        images = [img1, img2]
        image_size = img1.shape[:2]

        if self.is_cropped:
            cropper = StaticRandomCrop(image_size, self.crop_size)
        else:
            cropper = Padding(image_size, self.render_size)
        images = list(map(cropper, images))

        images = np.array(images).transpose(3,0,1,2)

        images = torch.from_numpy(images.astype(np.float32))

        return [images]

    def __len__(self):
        return self.size * self.replicates

'''
import argparse
import sys, os
import importlib
from scipy.misc import imsave
import numpy as np

import datasets
reload(datasets)

parser = argparse.ArgumentParser()
args = parser.parse_args()
args.inference_size = [1080, 1920]
args.crop_size = [384, 512]
args.effective_batch_size = 1

index = 500
v_dataset = datasets.MpiSintelClean(args, True, root='../MPI-Sintel/flow/training')
a, b = v_dataset[index]
im1 = a[0].numpy()[:,0,:,:].transpose(1,2,0)
im2 = a[0].numpy()[:,1,:,:].transpose(1,2,0)
imsave('./img1.png', im1)
imsave('./img2.png', im2)
flow_utils.writeFlow('./flow.flo', b[0].numpy().transpose(1,2,0))

'''


================================================
FILE: dvs/flownet2/install.sh
================================================
#!/bin/bash
cd ./networks/correlation_package
rm -rf *_cuda.egg-info build dist __pycache__
python3 setup.py install --user

cd ../resample2d_package
rm -rf *_cuda.egg-info build dist __pycache__
python3 setup.py install --user

cd ../channelnorm_package
rm -rf *_cuda.egg-info build dist __pycache__
python3 setup.py install --user

cd ..


================================================
FILE: dvs/flownet2/losses.py
================================================
'''
Portions of this code copyright 2017, Clement Pinard
'''

# freda (todo) : adversarial loss 

import torch
import torch.nn as nn
import math

def EPE(input_flow, target_flow):
    return torch.norm(target_flow-input_flow,p=2,dim=1).mean()

class L1(nn.Module):
    def __init__(self):
        super(L1, self).__init__()
    def forward(self, output, target):
        lossvalue = torch.abs(output - target).mean()
        return lossvalue

class L2(nn.Module):
    def __init__(self):
        super(L2, self).__init__()
    def forward(self, output, target):
        lossvalue = torch.norm(output-target,p=2,dim=1).mean()
        return lossvalue

class L1Loss(nn.Module):
    def __init__(self, args):
        super(L1Loss, self).__init__()
        self.args = args
        self.loss = L1()
        self.loss_labels = ['L1', 'EPE']

    def forward(self, output, target):
        lossvalue = self.loss(output, target)
        epevalue = EPE(output, target)
        return [lossvalue, epevalue]

class L2Loss(nn.Module):
    def __init__(self, args):
        super(L2Loss, self).__init__()
        self.args = args
        self.loss = L2()
        self.loss_labels = ['L2', 'EPE']

    def forward(self, output, target):
        lossvalue = self.loss(output, target)
        epevalue = EPE(output, target)
        return [lossvalue, epevalue]

class MultiScale(nn.Module):
    def __init__(self, args, startScale = 4, numScales = 5, l_weight= 0.32, norm= 'L1'):
        super(MultiScale,self).__init__()

        self.startScale = startScale
        self.numScales = numScales
        self.loss_weights = torch.FloatTensor([(l_weight / 2 ** scale) for scale in range(self.numScales)])
        self.args = args
        self.l_type = norm
        self.div_flow = 0.05
        assert(len(self.loss_weights) == self.numScales)

        if self.l_type == 'L1':
            self.loss = L1()
        else:
            self.loss = L2()

        self.multiScales = [nn.AvgPool2d(self.startScale * (2**scale), self.startScale * (2**scale)) for scale in range(self.numScales)]
        self.loss_labels = ['MultiScale-'+self.l_type, 'EPE'],

    def forward(self, output, target):
        lossvalue = 0
        epevalue = 0

        if type(output) is tuple:
            target = self.div_flow * target
            for i, output_ in enumerate(output):
                target_ = self.multiScales[i](target)
                epevalue += self.loss_weights[i]*EPE(output_, target_)
                lossvalue += self.loss_weights[i]*self.loss(output_, target_)
            return [lossvalue, epevalue]
        else:
            epevalue += EPE(output, target)
            lossvalue += self.loss(output, target)
            return  [lossvalue, epevalue]



================================================
FILE: dvs/flownet2/main.py
================================================
#!/usr/bin/env python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torch.autograd import Variable
from tensorboardX import SummaryWriter

import argparse, os, sys, subprocess
import colorama
import numpy as np
from tqdm import tqdm
from glob import glob
from os.path import *

import models, datasets
from utils import flow_utils, tools
import time

    # Reusable function for inference
def inference(args, epoch, data_path, data_loader, model, offset=0):

    model.eval()
    
    if args.save_flow or args.render_validation:
        flow_folder = "{}/flo".format(data_path)
        flow_back_folder = "{}/flo_back".format(data_path)
        if not os.path.exists(flow_folder):
            os.makedirs(flow_folder)
        if not os.path.exists(flow_back_folder):
            os.makedirs(flow_back_folder)
    
    # visualization folder
    if args.inference_visualize:
        flow_vis_folder = "{}/flo_vis".format(data_path)
        if not os.path.exists(flow_vis_folder):
            os.makedirs(flow_vis_folder)
        flow_back_vis_folder = "{}/flo_back_vis".format(data_path)
        if not os.path.exists(flow_back_vis_folder):
            os.makedirs(flow_back_vis_folder)
    
    args.inference_n_batches = np.inf if args.inference_n_batches < 0 else args.inference_n_batches

    progress = tqdm(data_loader, ncols=100, total=np.minimum(len(data_loader), args.inference_n_batches), desc='Inferencing ', 
        leave=True, position=offset)

    for batch_idx, (data) in enumerate(progress):
        data = data[0]
        data_back = torch.cat((data[:,:,1:,:,:], data[:,:,:1,:,:]), dim = 2)
        if args.cuda:
            data_forward = data.cuda(non_blocking=True)
            data_back = data_back.cuda(non_blocking=True)
        data_forward = Variable(data_forward)
        data_back = Variable(data_back)

        flo_path = join(flow_folder, '%06d.flo'%(batch_idx))
        flo_back_path = join(flow_back_folder, '%06d.flo'%(batch_idx))
        frame_size = data_loader.dataset.frame_size
        if not os.path.exists(flo_path):
            with torch.no_grad():
                output = model(data_forward)[:,:,:frame_size[0], :frame_size[1]]
            if args.save_flow or args.render_validation:
                _pflow = output[0].data.cpu().numpy().transpose(1, 2, 0)
                flow_utils.writeFlow( flo_path,  _pflow)
                if args.inference_visualize:
                    flow_utils.visulize_flow_file(
                        join(flow_folder, '%06d.flo' % (batch_idx)),flow_vis_folder)

        if not os.path.exists(flo_back_path):
            with torch.no_grad():
                output = model(data_back)[:,:,:frame_size[0], :frame_size[1]]
            if args.save_flow or args.render_validation:
                _pflow = output[0].data.cpu().numpy().transpose(1, 2, 0)
                flow_utils.writeFlow( flo_back_path,  _pflow)
                if args.inference_visualize:
                    flow_utils.visulize_flow_file(
                        join(flow_back_folder, '%06d.flo' % (batch_idx)), flow_back_vis_folder)
                
        progress.update(1)

        if batch_idx == (args.inference_n_batches - 1):
            break
    progress.close()
    return

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--fp16', action='store_true', help='Run model in pseudo-fp16 mode (fp16 storage fp32 math).')
    parser.add_argument('--fp16_scale', type=float, default=1024., help='Loss scaling, positive power of 2 values can improve fp16 convergence.')

    parser.add_argument('--start_epoch', type=int, default=1)
    parser.add_argument('--batch_size', '-b', type=int, default=8, help="Batch size")
    parser.add_argument('--crop_size', type=int, nargs='+', default = [256, 256], help="Spatial dimension to crop training samples for training")
    parser.add_argument("--rgb_max", type=float, default = 255.)

    parser.add_argument('--number_workers', '-nw', '--num_workers', type=int, default=8)
    parser.add_argument('--number_gpus', '-ng', type=int, default=-1, help='number of GPUs to use')
    parser.add_argument('--no_cuda', action='store_true')

    parser.add_argument('--save', '-s', default='./Google', type=str, help='directory for saving')

    parser.add_argument('--inference', action='store_true')
    parser.add_argument('--inference_visualize', action='store_true',
                        help="visualize the optical flow during inference")
    parser.add_argument('--inference_size', type=int, nargs='+', default = [-1,-1], help='spatial size divisible by 64. default (-1,-1) - largest possible valid size would be used')
    parser.add_argument('--inference_batch_size', type=int, default=1)
    parser.add_argument('--inference_n_batches', type=int, default=-1)
    parser.add_argument('--save_flow', action='store_true', help='save predicted flows to file')

    parser.add_argument('--resume', default='', type=str, metavar='PATH', help='path to latest checkpoint (default: none)')
    parser.add_argument('--log_frequency', '--summ_iter', type=int, default=1, help="Log every n batches")

    tools.add_arguments_for_module(parser, models, argument_for_class='model', default='FlowNet2')
    
    tools.add_arguments_for_module(parser, datasets, argument_for_class='inference_dataset', default='Google', 
                                    skip_params=['is_cropped'],
                                    parameter_defaults={'root': './Google/train',
                                                        'replicates': 1})

    main_dir = os.path.dirname(os.path.realpath(__file__))
    os.chdir(main_dir)

    # Parse the official arguments
    with tools.TimerBlock("Parsing Arguments") as block:
        args = parser.parse_args()
        if args.number_gpus < 0 : args.number_gpus = torch.cuda.device_count()

        # Get argument defaults (hastag #thisisahack)
        parser.add_argument('--IGNORE',  action='store_true')
        defaults = vars(parser.parse_args(['--IGNORE']))

        # Print all arguments, color the non-defaults
        for argument, value in sorted(vars(args).items()):
            reset = colorama.Style.RESET_ALL
            color = reset if value == defaults[argument] else colorama.Fore.MAGENTA
            block.log('{}{}: {}{}'.format(color, argument, value, reset))

        args.model_class = tools.module_to_dict(models)[args.model]

        args.inference_dataset_class = tools.module_to_dict(datasets)[args.inference_dataset]

        args.cuda = not args.no_cuda and torch.cuda.is_available()
        # args.current_hash = subprocess.check_output(["git", "rev-parse", "HEAD"]).rstrip()
        args.log_file = join(args.save, 'args.txt')

        # dict to collect activation gradients (for training debug purpose)
        args.grads = {}

        args.total_epochs = 1
        args.inference_dir = "{}/inference".format(args.save)

    print('Source Code')
    # print(('  Current Git Hash: {}\n'.format(args.current_hash)))

    # Dynamically load the dataset class with parameters passed in via "--argument_[param]=[value]" arguments
    with tools.TimerBlock("Initializing Datasets") as block:
        args.effective_batch_size = args.batch_size * args.number_gpus
        args.effective_inference_batch_size = args.inference_batch_size * args.number_gpus
        args.effective_number_workers = args.number_workers * args.number_gpus
        gpuargs = {'num_workers': args.effective_number_workers, 
                   'pin_memory': True, 
                   'drop_last' : True} if args.cuda else {}
        inf_gpuargs = gpuargs.copy()
        inf_gpuargs['num_workers'] = args.number_workers

        block.log('Inference Dataset: {}'.format(args.inference_dataset))

        dataset_root = args.inference_dataset_root 
        data_name = sorted(os.listdir(dataset_root))

        block.log(data_name)
        inference_loaders = {}
        for i in range(len(data_name)):
            dataset_path = os.path.join(dataset_root, data_name[i])
            args.inference_dataset_root  = dataset_path
            inference_dataset = args.inference_dataset_class(args, False, **tools.kwargs_from_args(args, 'inference_dataset'))
            inference_loaders[dataset_path] = DataLoader(inference_dataset, batch_size=args.effective_inference_batch_size, shuffle=False, **inf_gpuargs)
            block.log('Inference Input: {}'.format(' '.join([str([d for d in x.size()]) for x in inference_dataset[0][0]])))

    # Dynamically load model and loss class with parameters passed in via "--model_[param]=[value]" or "--loss_[param]=[value]" arguments
    with tools.TimerBlock("Building {} model".format(args.model)) as block:
        class Model(nn.Module):
            def __init__(self, args):
                super(Model, self).__init__()
                kwargs = tools.kwargs_from_args(args, 'model')
                self.model = args.model_class(args, **kwargs)
                
            def forward(self, data):
                output = self.model(data)
                return output

        model = Model(args)

        block.log('Effective Batch Size: {}'.format(args.effective_batch_size))
        block.log('Number of parameters: {}'.format(sum([p.data.nelement() if p.requires_grad else 0 for p in model.parameters()])))

        if args.cuda and args.number_gpus > 0:
            block.log('Initializing CUDA')
            model = model.cuda()
            block.log('Parallelizing')
            model = nn.parallel.DataParallel(model, device_ids=list(range(args.number_gpus)))

        # Load weights if needed, otherwise randomly initialize
        if args.resume and os.path.isfile(args.resume):
            block.log("Loading checkpoint '{}'".format(args.resume))
            checkpoint = torch.load(args.resume)
            model.module.model.load_state_dict(checkpoint['state_dict'])
            block.log("Loaded checkpoint '{}' (at epoch {})".format(args.resume, checkpoint['epoch']))

        elif args.resume and args.inference:
            block.log("No checkpoint found at '{}'".format(args.resume))
            quit()

        else:
            block.log("Random initialization")

        block.log("Initializing save directory: {}".format(args.save))
        if not os.path.exists(args.save):
            os.makedirs(args.save)

    # Log all arguments to file
    for argument, value in sorted(vars(args).items()):
        block.log2file(args.log_file, '{}: {}'.format(argument, value))

    for data_path in inference_loaders:
        # Primary epoch loop
        progress = tqdm(list(range(args.start_epoch, args.total_epochs + 1)), miniters=1, ncols=100, desc='Overall Progress', leave=True, position=0)
        offset = 1

        for epoch in progress:
            stats = inference(args=args, epoch=epoch - 1, data_path = data_path, data_loader=inference_loaders[data_path], model=model, offset=offset)
            offset += 1
        print("\n")

================================================
FILE: dvs/flownet2/models.py
================================================
import torch
import torch.nn as nn
from torch.nn import init

import math
import numpy as np

try:
    from networks.resample2d_package.resample2d import Resample2d
    from networks.channelnorm_package.channelnorm import ChannelNorm

    from networks import FlowNetC
    from networks import FlowNetS
    from networks import FlowNetSD
    from networks import FlowNetFusion

    from networks.submodules import *
except:
    from .networks.resample2d_package.resample2d import Resample2d
    from .networks.channelnorm_package.channelnorm import ChannelNorm

    from .networks import FlowNetC
    from .networks import FlowNetS
    from .networks import FlowNetSD
    from .networks import FlowNetFusion

    from .networks.submodules import *
'Parameter count = 162,518,834'

class FlowNet2(nn.Module):

    def __init__(self, args, batchNorm=False, div_flow = 20.):
        super(FlowNet2,self).__init__()
        self.batchNorm = batchNorm
        self.div_flow = div_flow
        self.rgb_max = args.rgb_max
        self.args = args

        self.channelnorm = ChannelNorm()

        # First Block (FlowNetC)
        self.flownetc = FlowNetC.FlowNetC(args, batchNorm=self.batchNorm)
        self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')

        if args.fp16:
            self.resample1 = nn.Sequential(
                            tofp32(), 
                            Resample2d(),
                            tofp16()) 
        else:
            self.resample1 = Resample2d()

        # Block (FlowNetS1)
        self.flownets_1 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
        self.upsample2 = nn.Upsample(scale_factor=4, mode='bilinear')
        if args.fp16:
            self.resample2 = nn.Sequential(
                            tofp32(), 
                            Resample2d(),
                            tofp16()) 
        else:
            self.resample2 = Resample2d()


        # Block (FlowNetS2)
        self.flownets_2 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)

        # Block (FlowNetSD)
        self.flownets_d = FlowNetSD.FlowNetSD(args, batchNorm=self.batchNorm) 
        self.upsample3 = nn.Upsample(scale_factor=4, mode='nearest') 
        self.upsample4 = nn.Upsample(scale_factor=4, mode='nearest') 

        if args.fp16:
            self.resample3 = nn.Sequential(
                            tofp32(), 
                            Resample2d(),
                            tofp16()) 
        else:
            self.resample3 = Resample2d()

        if args.fp16:
            self.resample4 = nn.Sequential(
                            tofp32(), 
                            Resample2d(),
                            tofp16()) 
        else:
            self.resample4 = Resample2d()

        # Block (FLowNetFusion)
        self.flownetfusion = FlowNetFusion.FlowNetFusion(args, batchNorm=self.batchNorm)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)

            if isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)
                # init_deconv_bilinear(m.weight)

    def init_deconv_bilinear(self, weight):
        f_shape = weight.size()
        heigh, width = f_shape[-2], f_shape[-1]
        f = np.ceil(width/2.0)
        c = (2 * f - 1 - f % 2) / (2.0 * f)
        bilinear = np.zeros([heigh, width])
        for x in range(width):
            for y in range(heigh):
                value = (1 - abs(x / f - c)) * (1 - abs(y / f - c))
                bilinear[x, y] = value
        min_dim = min(f_shape[0], f_shape[1])
        weight.data.fill_(0.)
        for i in range(min_dim):
            weight.data[i,i,:,:] = torch.from_numpy(bilinear)
        return 

    def forward(self, inputs):
        rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
        
        x = (inputs - rgb_mean) / self.rgb_max
        x1 = x[:,:,0,:,:]
        x2 = x[:,:,1,:,:]
        x = torch.cat((x1,x2), dim = 1)

        # flownetc
        flownetc_flow2 = self.flownetc(x)[0]
        flownetc_flow = self.upsample1(flownetc_flow2*self.div_flow)
        
        # warp img1 to img0; magnitude of diff between img0 and and warped_img1, 
        resampled_img1 = self.resample1(x[:,3:,:,:], flownetc_flow)
        diff_img0 = x[:,:3,:,:] - resampled_img1 
        norm_diff_img0 = self.channelnorm(diff_img0)

        # concat img0, img1, img1->img0, flow, diff-mag ; 
        concat1 = torch.cat((x, resampled_img1, flownetc_flow/self.div_flow, norm_diff_img0), dim=1)
        
        # flownets1
        flownets1_flow2 = self.flownets_1(concat1)[0]
        flownets1_flow = self.upsample2(flownets1_flow2*self.div_flow) 

        # warp img1 to img0 using flownets1; magnitude of diff between img0 and and warped_img1
        resampled_img1 = self.resample2(x[:,3:,:,:], flownets1_flow)
        diff_img0 = x[:,:3,:,:] - resampled_img1
        norm_diff_img0 = self.channelnorm(diff_img0)

        # concat img0, img1, img1->img0, flow, diff-mag
        concat2 = torch.cat((x, resampled_img1, flownets1_flow/self.div_flow, norm_diff_img0), dim=1)

        # flownets2
        flownets2_flow2 = self.flownets_2(concat2)[0]
        flownets2_flow = self.upsample4(flownets2_flow2 * self.div_flow)
        norm_flownets2_flow = self.channelnorm(flownets2_flow)

        diff_flownets2_flow = self.resample4(x[:,3:,:,:], flownets2_flow)
        # if not diff_flownets2_flow.volatile:
        #     diff_flownets2_flow.register_hook(save_grad(self.args.grads, 'diff_flownets2_flow'))

        diff_flownets2_img1 = self.channelnorm((x[:,:3,:,:]-diff_flownets2_flow))
        # if not diff_flownets2_img1.volatile:
        #     diff_flownets2_img1.register_hook(save_grad(self.args.grads, 'diff_flownets2_img1'))

        # flownetsd
        flownetsd_flow2 = self.flownets_d(x)[0]
        flownetsd_flow = self.upsample3(flownetsd_flow2 / self.div_flow)
        norm_flownetsd_flow = self.channelnorm(flownetsd_flow)
        
        diff_flownetsd_flow = self.resample3(x[:,3:,:,:], flownetsd_flow)
        # if not diff_flownetsd_flow.volatile:
        #     diff_flownetsd_flow.register_hook(save_grad(self.args.grads, 'diff_flownetsd_flow'))

        diff_flownetsd_img1 = self.channelnorm((x[:,:3,:,:]-diff_flownetsd_flow))
        # if not diff_flownetsd_img1.volatile:
        #     diff_flownetsd_img1.register_hook(save_grad(self.args.grads, 'diff_flownetsd_img1'))

        # concat img1 flownetsd, flownets2, norm_flownetsd, norm_flownets2, diff_flownetsd_img1, diff_flownets2_img1
        concat3 = torch.cat((x[:,:3,:,:], flownetsd_flow, flownets2_flow, norm_flownetsd_flow, norm_flownets2_flow, diff_flownetsd_img1, diff_flownets2_img1), dim=1)
        flownetfusion_flow = self.flownetfusion(concat3)

        # if not flownetfusion_flow.volatile:
        #     flownetfusion_flow.register_hook(save_grad(self.args.grads, 'flownetfusion_flow'))

        return flownetfusion_flow

class FlowNet2C(FlowNetC.FlowNetC):
    def __init__(self, args, batchNorm=False, div_flow=20):
        super(FlowNet2C,self).__init__(args, batchNorm=batchNorm, div_flow=20)
        self.rgb_max = args.rgb_max

    def forward(self, inputs):
        rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
        
        x = (inputs - rgb_mean) / self.rgb_max
        x1 = x[:,:,0,:,:]
        x2 = x[:,:,1,:,:]

        # FlownetC top input stream
        out_conv1a = self.conv1(x1)
        out_conv2a = self.conv2(out_conv1a)
        out_conv3a = self.conv3(out_conv2a)

        # FlownetC bottom input stream
        out_conv1b = self.conv1(x2)
        
        out_conv2b = self.conv2(out_conv1b)
        out_conv3b = self.conv3(out_conv2b)

        # Merge streams
        out_corr = self.corr(out_conv3a, out_conv3b) # False
        out_corr = self.corr_activation(out_corr)

        # Redirect top input stream and concatenate
        out_conv_redir = self.conv_redir(out_conv3a)

        in_conv3_1 = torch.cat((out_conv_redir, out_corr), 1)

        # Merged conv layers
        out_conv3_1 = self.conv3_1(in_conv3_1)

        out_conv4 = self.conv4_1(self.conv4(out_conv3_1))

        out_conv5 = self.conv5_1(self.conv5(out_conv4))
        out_conv6 = self.conv6_1(self.conv6(out_conv5))

        flow6       = self.predict_flow6(out_conv6)
        flow6_up    = self.upsampled_flow6_to_5(flow6)
        out_deconv5 = self.deconv5(out_conv6)

        concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)

        flow5       = self.predict_flow5(concat5)
        flow5_up    = self.upsampled_flow5_to_4(flow5)
        out_deconv4 = self.deconv4(concat5)
        concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)

        flow4       = self.predict_flow4(concat4)
        flow4_up    = self.upsampled_flow4_to_3(flow4)
        out_deconv3 = self.deconv3(concat4)
        concat3 = torch.cat((out_conv3_1,out_deconv3,flow4_up),1)

        flow3       = self.predict_flow3(concat3)
        flow3_up    = self.upsampled_flow3_to_2(flow3)
        out_deconv2 = self.deconv2(concat3)
        concat2 = torch.cat((out_conv2a,out_deconv2,flow3_up),1)

        flow2 = self.predict_flow2(concat2)

        if self.training:
            return flow2,flow3,flow4,flow5,flow6
        else:
            return self.upsample1(flow2*self.div_flow)

class FlowNet2S(FlowNetS.FlowNetS):
    def __init__(self, args, batchNorm=False, div_flow=20):
        super(FlowNet2S,self).__init__(args, input_channels = 6, batchNorm=batchNorm)
        self.rgb_max = args.rgb_max
        self.div_flow = div_flow
        
    def forward(self, inputs):
        rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
        x = (inputs - rgb_mean) / self.rgb_max
        x = torch.cat( (x[:,:,0,:,:], x[:,:,1,:,:]), dim = 1)

        out_conv1 = self.conv1(x)

        out_conv2 = self.conv2(out_conv1)
        out_conv3 = self.conv3_1(self.conv3(out_conv2))
        out_conv4 = self.conv4_1(self.conv4(out_conv3))
        out_conv5 = self.conv5_1(self.conv5(out_conv4))
        out_conv6 = self.conv6_1(self.conv6(out_conv5))

        flow6       = self.predict_flow6(out_conv6)
        flow6_up    = self.upsampled_flow6_to_5(flow6)
        out_deconv5 = self.deconv5(out_conv6)
        
        concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
        flow5       = self.predict_flow5(concat5)
        flow5_up    = self.upsampled_flow5_to_4(flow5)
        out_deconv4 = self.deconv4(concat5)
        
        concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
        flow4       = self.predict_flow4(concat4)
        flow4_up    = self.upsampled_flow4_to_3(flow4)
        out_deconv3 = self.deconv3(concat4)
        
        concat3 = torch.cat((out_conv3,out_deconv3,flow4_up),1)
        flow3       = self.predict_flow3(concat3)
        flow3_up    = self.upsampled_flow3_to_2(flow3)
        out_deconv2 = self.deconv2(concat3)

        concat2 = torch.cat((out_conv2,out_deconv2,flow3_up),1)
        flow2 = self.predict_flow2(concat2)

        if self.training:
            return flow2,flow3,flow4,flow5,flow6
        else:
            return self.upsample1(flow2*self.div_flow)

class FlowNet2SD(FlowNetSD.FlowNetSD):
    def __init__(self, args, batchNorm=False, div_flow=20):
        super(FlowNet2SD,self).__init__(args, batchNorm=batchNorm)
        self.rgb_max = args.rgb_max
        self.div_flow = div_flow

    def forward(self, inputs):
        rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
        x = (inputs - rgb_mean) / self.rgb_max
        x = torch.cat( (x[:,:,0,:,:], x[:,:,1,:,:]), dim = 1)

        out_conv0 = self.conv0(x)
        out_conv1 = self.conv1_1(self.conv1(out_conv0))
        out_conv2 = self.conv2_1(self.conv2(out_conv1))

        out_conv3 = self.conv3_1(self.conv3(out_conv2))
        out_conv4 = self.conv4_1(self.conv4(out_conv3))
        out_conv5 = self.conv5_1(self.conv5(out_conv4))
        out_conv6 = self.conv6_1(self.conv6(out_conv5))

        flow6       = self.predict_flow6(out_conv6)
        flow6_up    = self.upsampled_flow6_to_5(flow6)
        out_deconv5 = self.deconv5(out_conv6)
        
        concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
        out_interconv5 = self.inter_conv5(concat5)
        flow5       = self.predict_flow5(out_interconv5)

        flow5_up    = self.upsampled_flow5_to_4(flow5)
        out_deconv4 = self.deconv4(concat5)
        
        concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
        out_interconv4 = self.inter_conv4(concat4)
        flow4       = self.predict_flow4(out_interconv4)
        flow4_up    = self.upsampled_flow4_to_3(flow4)
        out_deconv3 = self.deconv3(concat4)
        
        concat3 = torch.cat((out_conv3,out_deconv3,flow4_up),1)
        out_interconv3 = self.inter_conv3(concat3)
        flow3       = self.predict_flow3(out_interconv3)
        flow3_up    = self.upsampled_flow3_to_2(flow3)
        out_deconv2 = self.deconv2(concat3)

        concat2 = torch.cat((out_conv2,out_deconv2,flow3_up),1)
        out_interconv2 = self.inter_conv2(concat2)
        flow2 = self.predict_flow2(out_interconv2)

        if self.training:
            return flow2,flow3,flow4,flow5,flow6
        else:
            return self.upsample1(flow2*self.div_flow)

class FlowNet2CS(nn.Module):

    def __init__(self, args, batchNorm=False, div_flow = 20.):
        super(FlowNet2CS,self).__init__()
        self.batchNorm = batchNorm
        self.div_flow = div_flow
        self.rgb_max = args.rgb_max
        self.args = args

        self.channelnorm = ChannelNorm()

        # First Block (FlowNetC)
        self.flownetc = FlowNetC.FlowNetC(args, batchNorm=self.batchNorm)
        self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')

        if args.fp16:
            self.resample1 = nn.Sequential(
                            tofp32(), 
                            Resample2d(),
                            tofp16()) 
        else:
            self.resample1 = Resample2d()

        # Block (FlowNetS1)
        self.flownets_1 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
        self.upsample2 = nn.Upsample(scale_factor=4, mode='bilinear')

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                if m.bias is not None:
                    init.uniform(m.bias)
                init.xavier_uniform(m.weight)

            if isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    init.uniform(m.bias)
                init.xavier_uniform(m.weight)
                # init_deconv_bilinear(m.weight)

    def forward(self, inputs):
        rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
        
        x = (inputs - rgb_mean) / self.rgb_max
        x1 = x[:,:,0,:,:]
        x2 = x[:,:,1,:,:]
        x = torch.cat((x1,x2), dim = 1)

        # flownetc
        flownetc_flow2 = self.flownetc(x)[0]
        flownetc_flow = self.upsample1(flownetc_flow2*self.div_flow)
        
        # warp img1 to img0; magnitude of diff between img0 and and warped_img1, 
        resampled_img1 = self.resample1(x[:,3:,:,:], flownetc_flow)
        diff_img0 = x[:,:3,:,:] - resampled_img1 
        norm_diff_img0 = self.channelnorm(diff_img0)

        # concat img0, img1, img1->img0, flow, diff-mag ; 
        concat1 = torch.cat((x, resampled_img1, flownetc_flow/self.div_flow, norm_diff_img0), dim=1)
        
        # flownets1
        flownets1_flow2 = self.flownets_1(concat1)[0]
        flownets1_flow = self.upsample2(flownets1_flow2*self.div_flow) 

        return flownets1_flow

class FlowNet2CSS(nn.Module):

    def __init__(self, args, batchNorm=False, div_flow = 20.):
        super(FlowNet2CSS,self).__init__()
        self.batchNorm = batchNorm
        self.div_flow = div_flow
        self.rgb_max = args.rgb_max
        self.args = args

        self.channelnorm = ChannelNorm()

        # First Block (FlowNetC)
        self.flownetc = FlowNetC.FlowNetC(args, batchNorm=self.batchNorm)
        self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')

        if args.fp16:
            self.resample1 = nn.Sequential(
                            tofp32(), 
                            Resample2d(),
                            tofp16()) 
        else:
            self.resample1 = Resample2d()

        # Block (FlowNetS1)
        self.flownets_1 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
        self.upsample2 = nn.Upsample(scale_factor=4, mode='bilinear')
        if args.fp16:
            self.resample2 = nn.Sequential(
                            tofp32(), 
                            Resample2d(),
                            tofp16()) 
        else:
            self.resample2 = Resample2d()


        # Block (FlowNetS2)
        self.flownets_2 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
        self.upsample3 = nn.Upsample(scale_factor=4, mode='nearest') 

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                if m.bias is not None:
                    init.uniform(m.bias)
                init.xavier_uniform(m.weight)

            if isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    init.uniform(m.bias)
                init.xavier_uniform(m.weight)
                # init_deconv_bilinear(m.weight)

    def forward(self, inputs):
        rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
        
        x = (inputs - rgb_mean) / self.rgb_max
        x1 = x[:,:,0,:,:]
        x2 = x[:,:,1,:,:]
        x = torch.cat((x1,x2), dim = 1)

        # flownetc
        flownetc_flow2 = self.flownetc(x)[0]
        flownetc_flow = self.upsample1(flownetc_flow2*self.div_flow)
        
        # warp img1 to img0; magnitude of diff between img0 and and warped_img1, 
        resampled_img1 = self.resample1(x[:,3:,:,:], flownetc_flow)
        diff_img0 = x[:,:3,:,:] - resampled_img1 
        norm_diff_img0 = self.channelnorm(diff_img0)

        # concat img0, img1, img1->img0, flow, diff-mag ; 
        concat1 = torch.cat((x, resampled_img1, flownetc_flow/self.div_flow, norm_diff_img0), dim=1)
        
        # flownets1
        flownets1_flow2 = self.flownets_1(concat1)[0]
        flownets1_flow = self.upsample2(flownets1_flow2*self.div_flow) 

        # warp img1 to img0 using flownets1; magnitude of diff between img0 and and warped_img1
        resampled_img1 = self.resample2(x[:,3:,:,:], flownets1_flow)
        diff_img0 = x[:,:3,:,:] - resampled_img1
        norm_diff_img0 = self.channelnorm(diff_img0)

        # concat img0, img1, img1->img0, flow, diff-mag
        concat2 = torch.cat((x, resampled_img1, flownets1_flow/self.div_flow, norm_diff_img0), dim=1)

        # flownets2
        flownets2_flow2 = self.flownets_2(concat2)[0]
        flownets2_flow = self.upsample3(flownets2_flow2 * self.div_flow)

        return flownets2_flow



================================================
FILE: dvs/flownet2/networks/FlowNetC.py
================================================
import torch
import torch.nn as nn
from torch.nn import init

import math
import numpy as np

from .correlation_package.correlation import Correlation

from .submodules import *
'Parameter count , 39,175,298 '

class FlowNetC(nn.Module):
    def __init__(self,args, batchNorm=True, div_flow = 20):
        super(FlowNetC,self).__init__()

        self.batchNorm = batchNorm
        self.div_flow = div_flow

        self.conv1   = conv(self.batchNorm,   3,   64, kernel_size=7, stride=2)
        self.conv2   = conv(self.batchNorm,  64,  128, kernel_size=5, stride=2)
        self.conv3   = conv(self.batchNorm, 128,  256, kernel_size=5, stride=2)
        self.conv_redir  = conv(self.batchNorm, 256,   32, kernel_size=1, stride=1)

        if args.fp16:
            self.corr = nn.Sequential(
                tofp32(),
                Correlation(pad_size=20, kernel_size=1, max_displacement=20, stride1=1, stride2=2, corr_multiply=1),
                tofp16())
        else:
            self.corr = Correlation(pad_size=20, kernel_size=1, max_displacement=20, stride1=1, stride2=2, corr_multiply=1)

        self.corr_activation = nn.LeakyReLU(0.1,inplace=True)
        self.conv3_1 = conv(self.batchNorm, 473,  256)
        self.conv4   = conv(self.batchNorm, 256,  512, stride=2)
        self.conv4_1 = conv(self.batchNorm, 512,  512)
        self.conv5   = conv(self.batchNorm, 512,  512, stride=2)
        self.conv5_1 = conv(self.batchNorm, 512,  512)
        self.conv6   = conv(self.batchNorm, 512, 1024, stride=2)
        self.conv6_1 = conv(self.batchNorm,1024, 1024)

        self.deconv5 = deconv(1024,512)
        self.deconv4 = deconv(1026,256)
        self.deconv3 = deconv(770,128)
        self.deconv2 = deconv(386,64)

        self.predict_flow6 = predict_flow(1024)
        self.predict_flow5 = predict_flow(1026)
        self.predict_flow4 = predict_flow(770)
        self.predict_flow3 = predict_flow(386)
        self.predict_flow2 = predict_flow(194)

        self.upsampled_flow6_to_5 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=True)
        self.upsampled_flow5_to_4 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=True)
        self.upsampled_flow4_to_3 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=True)
        self.upsampled_flow3_to_2 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=True)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)

            if isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)
                # init_deconv_bilinear(m.weight)
        self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')

    def forward(self, x):
        x1 = x[:,0:3,:,:]
        x2 = x[:,3::,:,:]

        out_conv1a = self.conv1(x1)
        out_conv2a = self.conv2(out_conv1a)
        out_conv3a = self.conv3(out_conv2a)

        # FlownetC bottom input stream
        out_conv1b = self.conv1(x2)
        
        out_conv2b = self.conv2(out_conv1b)
        out_conv3b = self.conv3(out_conv2b)

        # Merge streams
        out_corr = self.corr(out_conv3a, out_conv3b) # False
        out_corr = self.corr_activation(out_corr)

        # Redirect top input stream and concatenate
        out_conv_redir = self.conv_redir(out_conv3a)

        in_conv3_1 = torch.cat((out_conv_redir, out_corr), 1)

        # Merged conv layers
        out_conv3_1 = self.conv3_1(in_conv3_1)

        out_conv4 = self.conv4_1(self.conv4(out_conv3_1))

        out_conv5 = self.conv5_1(self.conv5(out_conv4))
        out_conv6 = self.conv6_1(self.conv6(out_conv5))

        flow6       = self.predict_flow6(out_conv6)
        flow6_up    = self.upsampled_flow6_to_5(flow6)
        out_deconv5 = self.deconv5(out_conv6)

        concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)

        flow5       = self.predict_flow5(concat5)
        flow5_up    = self.upsampled_flow5_to_4(flow5)
        out_deconv4 = self.deconv4(concat5)
        concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)

        flow4       = self.predict_flow4(concat4)
        flow4_up    = self.upsampled_flow4_to_3(flow4)
        out_deconv3 = self.deconv3(concat4)
        concat3 = torch.cat((out_conv3_1,out_deconv3,flow4_up),1)

        flow3       = self.predict_flow3(concat3)
        flow3_up    = self.upsampled_flow3_to_2(flow3)
        out_deconv2 = self.deconv2(concat3)
        concat2 = torch.cat((out_conv2a,out_deconv2,flow3_up),1)

        flow2 = self.predict_flow2(concat2)

        if self.training:
            return flow2,flow3,flow4,flow5,flow6
        else:
            return flow2,


================================================
FILE: dvs/flownet2/networks/FlowNetFusion.py
================================================
import torch
import torch.nn as nn
from torch.nn import init

import math
import numpy as np

from .submodules import *
'Parameter count = 581,226'

class FlowNetFusion(nn.Module):
    def __init__(self,args, batchNorm=True):
        super(FlowNetFusion,self).__init__()

        self.batchNorm = batchNorm
        self.conv0   = conv(self.batchNorm,  11,   64)
        self.conv1   = conv(self.batchNorm,  64,   64, stride=2)
        self.conv1_1 = conv(self.batchNorm,  64,   128)
        self.conv2   = conv(self.batchNorm,  128,  128, stride=2)
        self.conv2_1 = conv(self.batchNorm,  128,  128)

        self.deconv1 = deconv(128,32)
        self.deconv0 = deconv(162,16)

        self.inter_conv1 = i_conv(self.batchNorm,  162,   32)
        self.inter_conv0 = i_conv(self.batchNorm,  82,   16)

        self.predict_flow2 = predict_flow(128)
        self.predict_flow1 = predict_flow(32)
        self.predict_flow0 = predict_flow(16)

        self.upsampled_flow2_to_1 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
        self.upsampled_flow1_to_0 = nn.ConvTranspose2d(2, 2, 4, 2, 1)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)

            if isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)
                # init_deconv_bilinear(m.weight)

    def forward(self, x):
        out_conv0 = self.conv0(x)
        out_conv1 = self.conv1_1(self.conv1(out_conv0))
        out_conv2 = self.conv2_1(self.conv2(out_conv1))

        flow2       = self.predict_flow2(out_conv2)
        flow2_up    = self.upsampled_flow2_to_1(flow2)
        out_deconv1 = self.deconv1(out_conv2)
        
        concat1 = torch.cat((out_conv1,out_deconv1,flow2_up),1)
        out_interconv1 = self.inter_conv1(concat1)
        flow1       = self.predict_flow1(out_interconv1)
        flow1_up    = self.upsampled_flow1_to_0(flow1)
        out_deconv0 = self.deconv0(concat1)
        
        concat0 = torch.cat((out_conv0,out_deconv0,flow1_up),1)
        out_interconv0 = self.inter_conv0(concat0)
        flow0       = self.predict_flow0(out_interconv0)

        return flow0



================================================
FILE: dvs/flownet2/networks/FlowNetS.py
================================================
'''
Portions of this code copyright 2017, Clement Pinard
'''

import torch
import torch.nn as nn
from torch.nn import init

import math
import numpy as np

from .submodules import *
'Parameter count : 38,676,504 '

class FlowNetS(nn.Module):
    def __init__(self, args, input_channels = 12, batchNorm=True):
        super(FlowNetS,self).__init__()

        self.batchNorm = batchNorm
        self.conv1   = conv(self.batchNorm,  input_channels,   64, kernel_size=7, stride=2)
        self.conv2   = conv(self.batchNorm,  64,  128, kernel_size=5, stride=2)
        self.conv3   = conv(self.batchNorm, 128,  256, kernel_size=5, stride=2)
        self.conv3_1 = conv(self.batchNorm, 256,  256)
        self.conv4   = conv(self.batchNorm, 256,  512, stride=2)
        self.conv4_1 = conv(self.batchNorm, 512,  512)
        self.conv5   = conv(self.batchNorm, 512,  512, stride=2)
        self.conv5_1 = conv(self.batchNorm, 512,  512)
        self.conv6   = conv(self.batchNorm, 512, 1024, stride=2)
        self.conv6_1 = conv(self.batchNorm,1024, 1024)

        self.deconv5 = deconv(1024,512)
        self.deconv4 = deconv(1026,256)
        self.deconv3 = deconv(770,128)
        self.deconv2 = deconv(386,64)

        self.predict_flow6 = predict_flow(1024)
        self.predict_flow5 = predict_flow(1026)
        self.predict_flow4 = predict_flow(770)
        self.predict_flow3 = predict_flow(386)
        self.predict_flow2 = predict_flow(194)

        self.upsampled_flow6_to_5 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=False)
        self.upsampled_flow5_to_4 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=False)
        self.upsampled_flow4_to_3 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=False)
        self.upsampled_flow3_to_2 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=False)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)

            if isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)
                # init_deconv_bilinear(m.weight)
        self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')

    def forward(self, x):
        out_conv1 = self.conv1(x)

        out_conv2 = self.conv2(out_conv1)
        out_conv3 = self.conv3_1(self.conv3(out_conv2))
        out_conv4 = self.conv4_1(self.conv4(out_conv3))
        out_conv5 = self.conv5_1(self.conv5(out_conv4))
        out_conv6 = self.conv6_1(self.conv6(out_conv5))

        flow6       = self.predict_flow6(out_conv6)
        flow6_up    = self.upsampled_flow6_to_5(flow6)
        out_deconv5 = self.deconv5(out_conv6)
        
        concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
        flow5       = self.predict_flow5(concat5)
        flow5_up    = self.upsampled_flow5_to_4(flow5)
        out_deconv4 = self.deconv4(concat5)
        
        concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
        flow4       = self.predict_flow4(concat4)
        flow4_up    = self.upsampled_flow4_to_3(flow4)
        out_deconv3 = self.deconv3(concat4)
        
        concat3 = torch.cat((out_conv3,out_deconv3,flow4_up),1)
        flow3       = self.predict_flow3(concat3)
        flow3_up    = self.upsampled_flow3_to_2(flow3)
        out_deconv2 = self.deconv2(concat3)

        concat2 = torch.cat((out_conv2,out_deconv2,flow3_up),1)
        flow2 = self.predict_flow2(concat2)

        if self.training:
            return flow2,flow3,flow4,flow5,flow6
        else:
            return flow2,



================================================
FILE: dvs/flownet2/networks/FlowNetSD.py
================================================
import torch
import torch.nn as nn
from torch.nn import init

import math
import numpy as np

from .submodules import *
'Parameter count = 45,371,666'

class FlowNetSD(nn.Module):
    def __init__(self, args, batchNorm=True):
        super(FlowNetSD,self).__init__()

        self.batchNorm = batchNorm
        self.conv0   = conv(self.batchNorm,  6,   64)
        self.conv1   = conv(self.batchNorm,  64,   64, stride=2)
        self.conv1_1 = conv(self.batchNorm,  64,   128)
        self.conv2   = conv(self.batchNorm,  128,  128, stride=2)
        self.conv2_1 = conv(self.batchNorm,  128,  128)
        self.conv3   = conv(self.batchNorm, 128,  256, stride=2)
        self.conv3_1 = conv(self.batchNorm, 256,  256)
        self.conv4   = conv(self.batchNorm, 256,  512, stride=2)
        self.conv4_1 = conv(self.batchNorm, 512,  512)
        self.conv5   = conv(self.batchNorm, 512,  512, stride=2)
        self.conv5_1 = conv(self.batchNorm, 512,  512)
        self.conv6   = conv(self.batchNorm, 512, 1024, stride=2)
        self.conv6_1 = conv(self.batchNorm,1024, 1024)

        self.deconv5 = deconv(1024,512)
        self.deconv4 = deconv(1026,256)
        self.deconv3 = deconv(770,128)
        self.deconv2 = deconv(386,64)

        self.inter_conv5 = i_conv(self.batchNorm,  1026,   512)
        self.inter_conv4 = i_conv(self.batchNorm,  770,   256)
        self.inter_conv3 = i_conv(self.batchNorm,  386,   128)
        self.inter_conv2 = i_conv(self.batchNorm,  194,   64)

        self.predict_flow6 = predict_flow(1024)
        self.predict_flow5 = predict_flow(512)
        self.predict_flow4 = predict_flow(256)
        self.predict_flow3 = predict_flow(128)
        self.predict_flow2 = predict_flow(64)

        self.upsampled_flow6_to_5 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
        self.upsampled_flow5_to_4 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
        self.upsampled_flow4_to_3 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
        self.upsampled_flow3_to_2 = nn.ConvTranspose2d(2, 2, 4, 2, 1)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)

            if isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    init.uniform_(m.bias)
                init.xavier_uniform_(m.weight)
                # init_deconv_bilinear(m.weight)
        self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')



    def forward(self, x):
        out_conv0 = self.conv0(x)
        out_conv1 = self.conv1_1(self.conv1(out_conv0))
        out_conv2 = self.conv2_1(self.conv2(out_conv1))

        out_conv3 = self.conv3_1(self.conv3(out_conv2))
        out_conv4 = self.conv4_1(self.conv4(out_conv3))
        out_conv5 = self.conv5_1(self.conv5(out_conv4))
        out_conv6 = self.conv6_1(self.conv6(out_conv5))

        flow6       = self.predict_flow6(out_conv6)
        flow6_up    = self.upsampled_flow6_to_5(flow6)
        out_deconv5 = self.deconv5(out_conv6)
        
        concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
        out_interconv5 = self.inter_conv5(concat5)
        flow5       = self.predict_flow5(out_interconv5)

        flow5_up    = self.upsampled_flow5_to_4(flow5)
        out_deconv4 = self.deconv4(concat5)
        
        concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
        out_interconv4 = self.inter_conv4(concat4)
        flow4       = self.predict_flow4(out_interconv4)
        flow4_up    = self.upsampled_flow4_to_3(flow4)
        out_deconv3 = self.deconv3(concat4)
        
        concat3 = torch.cat((out_conv3,out_deconv3,flow4_up),1)
        out_interconv3 = self.inter_conv3(concat3)
        flow3       = self.predict_flow3(out_interconv3)
        flow3_up    = self.upsampled_flow3_to_2(flow3)
        out_deconv2 = self.deconv2(concat3)

        concat2 = torch.cat((out_conv2,out_deconv2,flow3_up),1)
        out_interconv2 = self.inter_conv2(concat2)
        flow2 = self.predict_flow2(out_interconv2)

        if self.training:
            return flow2,flow3,flow4,flow5,flow6
        else:
            return flow2,


================================================
FILE: dvs/flownet2/networks/__init__.py
================================================


================================================
FILE: dvs/flownet2/networks/channelnorm_package/__init__.py
================================================


================================================
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm.py
================================================
from torch.autograd import Function, Variable
from torch.nn.modules.module import Module
import channelnorm_cuda

class ChannelNormFunction(Function):

    @staticmethod
    def forward(ctx, input1, norm_deg=2):
        assert input1.is_contiguous()
        b, _, h, w = input1.size()
        output = input1.new(b, 1, h, w).zero_()

        channelnorm_cuda.forward(input1, output, norm_deg)
        ctx.save_for_backward(input1, output)
        ctx.norm_deg = norm_deg

        return output

    @staticmethod
    def backward(ctx, grad_output):
        input1, output = ctx.saved_tensors

        grad_input1 = Variable(input1.new(input1.size()).zero_())

        channelnorm_cuda.backward(input1, output, grad_output.data,
                                              grad_input1.data, ctx.norm_deg)

        return grad_input1, None


class ChannelNorm(Module):

    def __init__(self, norm_deg=2):
        super(ChannelNorm, self).__init__()
        self.norm_deg = norm_deg

    def forward(self, input1):
        return ChannelNormFunction.apply(input1, self.norm_deg)



================================================
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm_cuda.cc
================================================
#include <torch/torch.h>
#include <ATen/ATen.h>

#include "channelnorm_kernel.cuh"

int channelnorm_cuda_forward(
    at::Tensor& input1, 
    at::Tensor& output,
    int norm_deg) {

    channelnorm_kernel_forward(input1, output, norm_deg);
    return 1;
}


int channelnorm_cuda_backward(
    at::Tensor& input1, 
    at::Tensor& output,
    at::Tensor& gradOutput,
    at::Tensor& gradInput1,
    int norm_deg) {

    channelnorm_kernel_backward(input1, output, gradOutput, gradInput1, norm_deg);
    return 1;
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("forward", &channelnorm_cuda_forward, "Channel norm forward (CUDA)");
  m.def("backward", &channelnorm_cuda_backward, "Channel norm backward (CUDA)");
}



================================================
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm_kernel.cu
================================================
#include <ATen/ATen.h>
#include <ATen/Context.h>
#include <ATen/cuda/CUDAContext.h>

#include "channelnorm_kernel.cuh"

#define CUDA_NUM_THREADS 512 

#define DIM0(TENSOR) ((TENSOR).x)
#define DIM1(TENSOR) ((TENSOR).y)
#define DIM2(TENSOR) ((TENSOR).z)
#define DIM3(TENSOR) ((TENSOR).w)

#define DIM3_INDEX(TENSOR, xx, yy, zz, ww) ((TENSOR)[((xx) * (TENSOR##_stride.x)) + ((yy) * (TENSOR##_stride.y)) + ((zz) * (TENSOR##_stride.z)) + ((ww) * (TENSOR##_stride.w))])

using at::Half;

template <typename scalar_t>
__global__ void kernel_channelnorm_update_output(
    const int n, 
    const scalar_t* __restrict__ input1,
    const long4 input1_size,
    const long4 input1_stride,
    scalar_t* __restrict__ output, 
    const long4 output_size,
    const long4 output_stride,
    int norm_deg) {

    int index = blockIdx.x * blockDim.x + threadIdx.x;

    if (index >= n) {
        return;
    }

    int dim_b = DIM0(output_size);
    int dim_c = DIM1(output_size);
    int dim_h = DIM2(output_size);
    int dim_w = DIM3(output_size);
    int dim_chw = dim_c * dim_h * dim_w;

    int b = ( index / dim_chw ) % dim_b;
    int y = ( index / dim_w )   % dim_h;
    int x = ( index          )  % dim_w;

    int i1dim_c = DIM1(input1_size);
    int i1dim_h = DIM2(input1_size);
    int i1dim_w = DIM3(input1_size);
    int i1dim_chw = i1dim_c * i1dim_h * i1dim_w;
    int i1dim_hw  = i1dim_h * i1dim_w;

    float result = 0.0;

    for (int c = 0; c < i1dim_c; ++c) {
        int i1Index = b * i1dim_chw + c * i1dim_hw + y * i1dim_w + x;
        scalar_t val = input1[i1Index];
        result += static_cast<float>(val * val);
    }
    result = sqrt(result);
    output[index] = static_cast<scalar_t>(result);
}


template <typename scalar_t>
__global__ void kernel_channelnorm_backward_input1(
    const int n,
    const scalar_t* __restrict__ input1, const long4 input1_size, const long4 input1_stride,
    const scalar_t* __restrict__ output, const long4 output_size, const long4 output_stride, 
    const scalar_t* __restrict__ gradOutput, const long4 gradOutput_size, const long4 gradOutput_stride,
    scalar_t* __restrict__ gradInput, const long4 gradInput_size, const long4 gradInput_stride, 
    int norm_deg) {

    int index = blockIdx.x * blockDim.x + threadIdx.x;

    if (index >= n) {
        return;
    }

    float val = 0.0;

    int dim_b = DIM0(gradInput_size);
    int dim_c = DIM1(gradInput_size);
    int dim_h = DIM2(gradInput_size);
    int dim_w = DIM3(gradInput_size);
    int dim_chw = dim_c * dim_h * dim_w;
    int dim_hw  = dim_h * dim_w;

    int b = ( index / dim_chw ) % dim_b;
    int y = ( index / dim_w )   % dim_h;
    int x = ( index          )  % dim_w;


    int outIndex = b * dim_hw + y * dim_w + x;
    val = static_cast<float>(gradOutput[outIndex]) * static_cast<float>(input1[index]) / (static_cast<float>(output[outIndex])+1e-9);
    gradInput[index] = static_cast<scalar_t>(val);

}

void channelnorm_kernel_forward(
    at::Tensor& input1, 
    at::Tensor& output, 
    int norm_deg) {

    const long4 input1_size = make_long4(input1.size(0), input1.size(1), input1.size(2), input1.size(3));
    const long4 input1_stride = make_long4(input1.stride(0), input1.stride(1), input1.stride(2), input1.stride(3));

    const long4 output_size = make_long4(output.size(0), output.size(1), output.size(2), output.size(3));
    const long4 output_stride = make_long4(output.stride(0), output.stride(1), output.stride(2), output.stride(3));

    int n = output.numel();

    AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "channelnorm_forward", ([&] {

      kernel_channelnorm_update_output<scalar_t><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
          n,
          input1.data<scalar_t>(), 
          input1_size,
          input1_stride, 
          output.data<scalar_t>(),
          output_size,
          output_stride, 
          norm_deg);

    }));

      // TODO: ATen-equivalent check

     // THCudaCheck(cudaGetLastError());
}

void channelnorm_kernel_backward(
    at::Tensor& input1, 
    at::Tensor& output,
    at::Tensor& gradOutput, 
    at::Tensor& gradInput1, 
    int norm_deg) {

    const long4 input1_size = make_long4(input1.size(0), input1.size(1), input1.size(2), input1.size(3));
    const long4 input1_stride = make_long4(input1.stride(0), input1.stride(1), input1.stride(2), input1.stride(3));

    const long4 output_size = make_long4(output.size(0), output.size(1), output.size(2), output.size(3));
    const long4 output_stride = make_long4(output.stride(0), output.stride(1), output.stride(2), output.stride(3));

    const long4 gradOutput_size = make_long4(gradOutput.size(0), gradOutput.size(1), gradOutput.size(2), gradOutput.size(3));
    const long4 gradOutput_stride = make_long4(gradOutput.stride(0), gradOutput.stride(1), gradOutput.stride(2), gradOutput.stride(3));

    const long4 gradInput1_size = make_long4(gradInput1.size(0), gradInput1.size(1), gradInput1.size(2), gradInput1.size(3));
    const long4 gradInput1_stride = make_long4(gradInput1.stride(0), gradInput1.stride(1), gradInput1.stride(2), gradInput1.stride(3));

    int n = gradInput1.numel();

    AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "channelnorm_backward_input1", ([&] {

      kernel_channelnorm_backward_input1<scalar_t><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
          n, 
          input1.data<scalar_t>(),
          input1_size,
          input1_stride,
          output.data<scalar_t>(),
          output_size,
          output_stride,
          gradOutput.data<scalar_t>(),
          gradOutput_size,
          gradOutput_stride, 
          gradInput1.data<scalar_t>(),
          gradInput1_size,
          gradInput1_stride,
          norm_deg
    );

    }));

    // TODO: Add ATen-equivalent check

//    THCudaCheck(cudaGetLastError());
}


================================================
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm_kernel.cuh
================================================
#pragma once

#include <ATen/ATen.h>

void channelnorm_kernel_forward(
    at::Tensor& input1,
    at::Tensor& output, 
    int norm_deg);


void channelnorm_kernel_backward(
    at::Tensor& input1,
    at::Tensor& output,
    at::Tensor& gradOutput,
    at::Tensor& gradInput1,
    int norm_deg);


================================================
FILE: dvs/flownet2/networks/channelnorm_package/setup.py
================================================
#!/usr/bin/env python3
import os
import torch

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

cxx_args = ['-std=c++11']

nvcc_args = [
    '-gencode', 'arch=compute_52,code=sm_52',
    '-gencode', 'arch=compute_60,code=sm_60',
    '-gencode', 'arch=compute_61,code=sm_61',
    '-gencode', 'arch=compute_70,code=sm_70',
    '-gencode', 'arch=compute_70,code=compute_70'
]

setup(
    name='channelnorm_cuda',
    ext_modules=[
        CUDAExtension('channelnorm_cuda', [
            'channelnorm_cuda.cc',
            'channelnorm_kernel.cu'
        ], extra_compile_args={'cxx': cxx_args, 'nvcc': nvcc_args})
    ],
    cmdclass={
        'build_ext': BuildExtension
    })


================================================
FILE: dvs/flownet2/networks/correlation_package/__init__.py
================================================


================================================
FILE: dvs/flownet2/networks/correlation_package/correlation.py
================================================
import torch
from torch.nn.modules.module import Module
from torch.autograd import Function
import correlation_cuda

class CorrelationFunction(Function):

    @staticmethod
    def forward(ctx, input1, input2, pad_size=3, kernel_size=3, max_displacement=20, stride1=1, stride2=2, corr_multiply=1):
        ctx.save_for_backward(input1, input2)

        ctx.pad_size = pad_size
        ctx.kernel_size = kernel_size
        ctx.max_displacement = max_displacement
        ctx.stride1 = stride1
        ctx.stride2 = stride2
        ctx.corr_multiply = corr_multiply

        with torch.cuda.device_of(input1):
            rbot1 = input1.new()
            rbot2 = input2.new()
            output = input1.new()

            correlation_cuda.forward(input1, input2, rbot1, rbot2, output,
                ctx.pad_size, ctx.kernel_size, ctx.max_displacement, ctx.stride1, ctx.stride2, ctx.corr_multiply)

        return output

    @staticmethod
    def backward(ctx, grad_output):
        input1, input2 = ctx.saved_tensors

        with torch.cuda.device_of(input1):
            rbot1 = input1.new()
            rbot2 = input2.new()

            grad_input1 = input1.new()
            grad_input2 = input2.new()

            correlation_cuda.backward(input1, input2, rbot1, rbot2, grad_output, grad_input1, grad_input2,
                ctx.pad_size, ctx.kernel_size, ctx.max_displacement, ctx.stride1, ctx.stride2, ctx.corr_multiply)

        return grad_input1, grad_input2, None, None, None, None, None, None


class Correlation(Module):
    def __init__(self, pad_size=0, kernel_size=0, max_displacement=0, stride1=1, stride2=2, corr_multiply=1):
        super(Correlation, self).__init__()
        self.pad_size = pad_size
        self.kernel_size = kernel_size
        self.max_displacement = max_displacement
        self.stride1 = stride1
        self.stride2 = stride2
        self.corr_multiply = corr_multiply

    def forward(self, input1, input2):

        result = CorrelationFunction.apply(input1, input2, self.pad_size, self.kernel_size, self.max_displacement, self.stride1, self.stride2, self.corr_multiply)

        return result



================================================
FILE: dvs/flownet2/networks/correlation_package/correlation_cuda.cc
================================================
#include <torch/torch.h>
#include <ATen/ATen.h>
#include <ATen/Context.h>
#include <ATen/cuda/CUDAContext.h>
#include <stdio.h>
#include <iostream>

#include "correlation_cuda_kernel.cuh"

int correlation_forward_cuda(at::Tensor& input1, at::Tensor& input2, at::Tensor& rInput1, at::Tensor& rInput2, at::Tensor& output,
                       int pad_size,
                       int kernel_size,
                       int max_displacement,
                       int stride1,
                       int stride2,
                       int corr_type_multiply)
{

  int batchSize = input1.size(0);

  int nInputChannels = input1.size(1);
  int inputHeight = input1.size(2);
  int inputWidth = input1.size(3);

  int kernel_radius = (kernel_size - 1) / 2;
  int border_radius = kernel_radius + max_displacement;

  int paddedInputHeight = inputHeight + 2 * pad_size;
  int paddedInputWidth = inputWidth + 2 * pad_size;

  int nOutputChannels = ((max_displacement/stride2)*2 + 1) * ((max_displacement/stride2)*2 + 1);

  int outputHeight = ceil(static_cast<float>(paddedInputHeight - 2 * border_radius) / static_cast<float>(stride1));
  int outputwidth = ceil(static_cast<float>(paddedInputWidth - 2 * border_radius) / static_cast<float>(stride1));

  rInput1.resize_({batchSize, paddedInputHeight, paddedInputWidth, nInputChannels});
  rInput2.resize_({batchSize, paddedInputHeight, paddedInputWidth, nInputChannels});
  output.resize_({batchSize, nOutputChannels, outputHeight, outputwidth});

  rInput1.fill_(0);
  rInput2.fill_(0);
  output.fill_(0);

  int success = correlation_forward_cuda_kernel(
    output,
    output.size(0), 
    output.size(1),
    output.size(2),
    output.size(3),
    output.stride(0),
    output.stride(1),
    output.stride(2),
    output.stride(3),
    input1,
    input1.size(1),
    input1.size(2),
    input1.size(3),
    input1.stride(0),
    input1.stride(1),
    input1.stride(2),
    input1.stride(3),
    input2,
    input2.size(1),
    input2.stride(0),
    input2.stride(1),
    input2.stride(2),
    input2.stride(3),
    rInput1,
    rInput2,
    pad_size,     
    kernel_size,
    max_displacement,
    stride1,
    stride2,
    corr_type_multiply,
	at::cuda::getCurrentCUDAStream()
    //at::globalContext().getCurrentCUDAStream()
  );

  //check for errors
  if (!success) {
    AT_ERROR("CUDA call failed");
  }

  return 1;

}

int correlation_backward_cuda(at::Tensor& input1, at::Tensor& input2, at::Tensor& rInput1, at::Tensor& rInput2, at::Tensor& gradOutput, 
                       at::Tensor& gradInput1, at::Tensor& gradInput2,
                       int pad_size,
                       int kernel_size,
                       int max_displacement,
                       int stride1,
                       int stride2,
                       int corr_type_multiply)
{

  int batchSize = input1.size(0);
  int nInputChannels = input1.size(1);
  int paddedInputHeight = input1.size(2)+ 2 * pad_size;
  int paddedInputWidth = input1.size(3)+ 2 * pad_size;

  int height = input1.size(2);
  int width = input1.size(3);

  rInput1.resize_({batchSize, paddedInputHeight, paddedInputWidth, nInputChannels});
  rInput2.resize_({batchSize, paddedInputHeight, paddedInputWidth, nInputChannels});
  gradInput1.resize_({batchSize, nInputChannels, height, width});
  gradInput2.resize_({batchSize, nInputChannels, height, width});

  rInput1.fill_(0);
  rInput2.fill_(0);
  gradInput1.fill_(0);
  gradInput2.fill_(0);

  int success = correlation_backward_cuda_kernel(gradOutput,
                                                gradOutput.size(0),
                                                gradOutput.size(1),
                                                gradOutput.size(2),
                                                gradOutput.size(3),
                                                gradOutput.stride(0),
                                                gradOutput.stride(1),
                                                gradOutput.stride(2),
                                                gradOutput.stride(3),
                                                input1,
                                                input1.size(1),
                                                input1.size(2),
                                                input1.size(3),
                                                input1.stride(0),
                                                input1.stride(1),
                                                input1.stride(2),
                                                input1.stride(3),
                                                input2,  
                                                input2.stride(0),
                                                input2.stride(1),
                                                input2.stride(2),
                                                input2.stride(3),
                                                gradInput1,
                                                gradInput1.stride(0),
                                                gradInput1.stride(1),
                                                gradInput1.stride(2),
                                                gradInput1.stride(3),
                                                gradInput2,
                                                gradInput2.size(1),
                                                gradInput2.stride(0),
                                                gradInput2.stride(1),
                                                gradInput2.stride(2),
                                                gradInput2.stride(3),
                                                rInput1,
                                                rInput2,
                                                pad_size,
                                                kernel_size,
                                                max_displacement,
                                                stride1, 
                                                stride2,
                                                corr_type_multiply,
												at::cuda::getCurrentCUDAStream()
                                                //at::globalContext().getCurrentCUDAStream()
                                               );

  if (!success) {
    AT_ERROR("CUDA call failed");
  }

  return 1;
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("forward", &correlation_forward_cuda, "Correlation forward (CUDA)");
  m.def("backward", &correlation_backward_cuda, "Correlation backward (CUDA)");
}



================================================
FILE: dvs/flownet2/networks/correlation_package/correlation_cuda_kernel.cu
================================================
#include <stdio.h>

#include "correlation_cuda_kernel.cuh"

#define CUDA_NUM_THREADS 1024
#define THREADS_PER_BLOCK 32
#define FULL_MASK 0xffffffff

#include <ATen/ATen.h>
#include <ATen/NativeFunctions.h>
#include <ATen/Dispatch.h>
#include <ATen/cuda/CUDAApplyUtils.cuh>

using at::Half;

template<typename scalar_t>
__forceinline__ __device__ scalar_t warpReduceSum(scalar_t val) {
        for (int offset = 16; offset > 0; offset /= 2)
                val += __shfl_down_sync(FULL_MASK, val, offset);
        return val;
}

template<typename scalar_t>
__forceinline__ __device__ scalar_t blockReduceSum(scalar_t val) {

        static __shared__ scalar_t shared[32];
        int lane = threadIdx.x % warpSize;
        int wid = threadIdx.x / warpSize;

        val = warpReduceSum(val);

        if (lane == 0)
                shared[wid] = val;

        __syncthreads();

        val = (threadIdx.x < blockDim.x / warpSize) ? shared[lane] : 0;

        if (wid == 0)
                val = warpReduceSum(val);

        return val;
}


template <typename scalar_t>
__global__ void channels_first(const scalar_t* __restrict__ input, scalar_t* rinput, int channels, int height, int width, int pad_size)
{

    // n (batch size), c (num of channels), y (height), x (width)
    int n = blockIdx.x;
    int y = blockIdx.y;
    int x = blockIdx.z;

    int ch_off = threadIdx.x;
    scalar_t value;

    int dimcyx = channels * height * width;
    int dimyx = height * width;

    int p_dimx = (width + 2 * pad_size);
    int p_dimy = (height + 2 * pad_size);
    int p_dimyxc = channels * p_dimy * p_dimx;
    int p_dimxc = p_dimx * channels;

    for (int c = ch_off; c < channels; c += THREADS_PER_BLOCK) {
      value = input[n * dimcyx + c * dimyx + y * width + x];
      rinput[n * p_dimyxc + (y + pad_size) * p_dimxc + (x + pad_size) * channels + c] = value;
    }
}


template<typename scalar_t>
__global__ void correlation_forward(scalar_t* __restrict__ output, const int nOutputChannels,
                const int outputHeight, const int outputWidth, const scalar_t* __restrict__ rInput1,
                const int nInputChannels, const int inputHeight, const int inputWidth,
                const scalar_t* __restrict__ rInput2, const int pad_size, const int kernel_size,
                const int max_displacement, const int stride1, const int stride2) {

        int32_t pInputWidth = inputWidth + 2 * pad_size;
        int32_t pInputHeight = inputHeight + 2 * pad_size;

        int32_t kernel_rad = (kernel_size - 1) / 2;

        int32_t displacement_rad = max_displacement / stride2;

        int32_t displacement_size = 2 * displacement_rad + 1;

        int32_t n = blockIdx.x;
        int32_t y1 = blockIdx.y * stride1 + max_displacement;
        int32_t x1 = blockIdx.z * stride1 + max_displacement;
        int32_t c = threadIdx.x;

        int32_t pdimyxc = pInputHeight * pInputWidth * nInputChannels;

        int32_t pdimxc = pInputWidth * nInputChannels;

        int32_t pdimc = nInputChannels;

        int32_t tdimcyx = nOutputChannels * outputHeight * outputWidth;
        int32_t tdimyx = outputHeight * outputWidth;
        int32_t tdimx = outputWidth;

        int32_t nelems = kernel_size * kernel_size * pdimc;

        // element-wise product along channel axis
        for (int tj = -displacement_rad; tj <= displacement_rad; ++tj) {
                for (int ti = -displacement_rad; ti <= displacement_rad; ++ti) {
                        int x2 = x1 + ti * stride2;
                        int y2 = y1 + tj * stride2;

                        float acc0 = 0.0f;

                        for (int j = -kernel_rad; j <= kernel_rad; ++j) {
                                for (int i = -kernel_rad; i <= kernel_rad; ++i) {
                                        // THREADS_PER_BLOCK
                                        #pragma unroll
                                        for (int ch = c; ch < pdimc; ch += blockDim.x) {

                                                int indx1 = n * pdimyxc + (y1 + j) * pdimxc
                                                                + (x1 + i) * pdimc + ch;
                                                int indx2 = n * pdimyxc + (y2 + j) * pdimxc
                                                                + (x2 + i) * pdimc + ch;
                                                acc0 += static_cast<float>(rInput1[indx1] * rInput2[indx2]);
                                        }
                                }
                        }

                        if (blockDim.x == warpSize) {
                            __syncwarp();
                            acc0 = warpReduceSum(acc0);
                        } else {
                            __syncthreads();
                            acc0 = blockReduceSum(acc0);
                        }

                        if (threadIdx.x == 0) {

                                int tc = (tj + displacement_rad) * displacement_size
                                                + (ti + displacement_rad);
                                const int tindx = n * tdimcyx + tc * tdimyx + blockIdx.y * tdimx
                                                + blockIdx.z;
                                output[tindx] = static_cast<scalar_t>(acc0 / nelems);
                        }
            }
        }
}


template <typename scalar_t>
__global__ void correlation_backward_input1(int item, scalar_t* gradInput1, int nInputChannels, int inputHeight, int inputWidth, 
                                            const scalar_t* __restrict__ gradOutput, int nOutputChannels, int outputHeight, int outputWidth, 
                                            const scalar_t* __restrict__ rInput2, 
                                            int pad_size,
                                            int kernel_size,
                                            int max_displacement,
                                            int stride1,
                                            int stride2)
  {
    // n (batch size), c (num of channels), y (height), x (width)

    int n = item; 
    int y = blockIdx.x * stride1 + pad_size;
    int x = blockIdx.y * stride1 + pad_size;
    int c = blockIdx.z;
    int tch_off = threadIdx.x;

    int kernel_rad = (kernel_size - 1) / 2;
    int displacement_rad = max_displacement / stride2;
    int displacement_size = 2 * displacement_rad + 1;

    int xmin = (x - kernel_rad - max_displacement) / stride1;
    int ymin = (y - kernel_rad - max_displacement) / stride1;

    int xmax = (x + kernel_rad - max_displacement) / stride1;
    int ymax = (y + kernel_rad - max_displacement) / stride1;

    if (xmax < 0 || ymax < 0 || xmin >= outputWidth || ymin >= outputHeight) {
        // assumes gradInput1 is pre-allocated and zero filled
      return;
    }

    if (xmin > xmax || ymin > ymax) {
        // assumes gradInput1 is pre-allocated and zero filled
        return;
    }

    xmin = max(0,xmin);
    xmax = min(outputWidth-1,xmax);

    ymin = max(0,ymin);
    ymax = min(outputHeight-1,ymax);

    int pInputWidth = inputWidth + 2 * pad_size;
    int pInputHeight = inputHeight + 2 * pad_size;

    int pdimyxc = pInputHeight * pInputWidth * nInputChannels;
    int pdimxc = pInputWidth * nInputChannels;
    int pdimc = nInputChannels;

    int tdimcyx = nOutputChannels * outputHeight * outputWidth;
    int tdimyx = outputHeight * outputWidth;
    int tdimx = outputWidth;

    int odimcyx = nInputChannels * inputHeight* inputWidth;
    int odimyx = inputHeight * inputWidth;
    int odimx = inputWidth;

    scalar_t nelems = kernel_size * kernel_size * nInputChannels;

    __shared__ scalar_t prod_sum[THREADS_PER_BLOCK];
    prod_sum[tch_off] = 0;

    for (int tc = tch_off; tc < nOutputChannels; tc += THREADS_PER_BLOCK) {

      int i2 = (tc % displacement_size - displacement_rad) * stride2;
      int j2 = (tc / displacement_size - displacement_rad) * stride2;

      int indx2 = n * pdimyxc + (y + j2)* pdimxc + (x + i2) * pdimc + c;
      
      scalar_t val2 = rInput2[indx2];

      for (int j = ymin; j <= ymax; ++j) {
        for (int i = xmin; i <= xmax; ++i) {
          int tindx = n * tdimcyx + tc * tdimyx + j * tdimx + i;
          prod_sum[tch_off] += gradOutput[tindx] * val2;
        }
      }
    }
    __syncthreads();

    if(tch_off == 0) {
      scalar_t reduce_sum = 0;
      for(int idx = 0; idx < THREADS_PER_BLOCK; idx++) {
          reduce_sum += prod_sum[idx];
      }
      const int indx1 = n * odimcyx + c * odimyx + (y - pad_size) * odimx + (x - pad_size);
      gradInput1[indx1] = reduce_sum / nelems;
    }

}

template <typename scalar_t>
__global__ void correlation_backward_input2(int item, scalar_t*  gradInput2, int nInputChannels, int inputHeight, int inputWidth,
                                            const scalar_t* __restrict__ gradOutput, int nOutputChannels, int outputHeight, int outputWidth,
                                            const scalar_t* __restrict__ rInput1,
                                            int pad_size,
                                            int kernel_size,
                                            int max_displacement,
                                            int stride1,
                                            int stride2)
{
    // n (batch size), c (num of channels), y (height), x (width)

    int n = item;
    int y = blockIdx.x * stride1 + pad_size;
    int x = blockIdx.y * stride1 + pad_size;
    int c = blockIdx.z;

    int tch_off = threadIdx.x;

    int kernel_rad = (kernel_size - 1) / 2;
    int displacement_rad = max_displacement / stride2;
    int displacement_size = 2 * displacement_rad + 1;

    int pInputWidth = inputWidth + 2 * pad_size;
    int pInputHeight = inputHeight + 2 * pad_size;

    int pdimyxc = pInputHeight * pInputWidth * nInputChannels;
    int pdimxc = pInputWidth * nInputChannels;
    int pdimc = nInputChannels;

    int tdimcyx = nOutputChannels * outputHeight * outputWidth;
    int tdimyx = outputHeight * outputWidth;
    int tdimx = outputWidth;

    int odimcyx = nInputChannels * inputHeight* inputWidth;
    int odimyx = inputHeight * inputWidth;
    int odimx = inputWidth;

    scalar_t nelems = kernel_size * kernel_size * nInputChannels;

    __shared__ scalar_t prod_sum[THREADS_PER_BLOCK];
    prod_sum[tch_off] = 0;

    for (int tc = tch_off; tc < nOutputChannels; tc += THREADS_PER_BLOCK) {
      int i2 = (tc % displacement_size - displacement_rad) * stride2;
      int j2 = (tc / displacement_size - displacement_rad) * stride2;

      int xmin = (x - kernel_rad - max_displacement - i2) / stride1;
      int ymin = (y - kernel_rad - max_displacement - j2) / stride1;

      int xmax = (x + kernel_rad - max_displacement - i2) / stride1;
      int ymax = (y + kernel_rad - max_displacement - j2) / stride1;

      if (xmax < 0 || ymax < 0 || xmin >= outputWidth || ymin >= outputHeight) {
          // assumes gradInput2 is pre-allocated and zero filled
        continue;
      }

      if (xmin > xmax || ymin > ymax) {
          // assumes gradInput2 is pre-allocated and zero filled
          continue;
      }

      xmin = max(0,xmin);
      xmax = min(outputWidth-1,xmax);

      ymin = max(0,ymin);
      ymax = min(outputHeight-1,ymax);
      
      int indx1 = n * pdimyxc + (y - j2)* pdimxc + (x - i2) * pdimc + c;
      scalar_t val1 = rInput1[indx1];

      for (int j = ymin; j <= ymax; ++j) {
        for (int i = xmin; i <= xmax; ++i) {
          int tindx = n * tdimcyx + tc * tdimyx + j * tdimx + i;
          prod_sum[tch_off] += gradOutput[tindx] * val1;
        }
      }
    }

    __syncthreads();

    if(tch_off == 0) {
      scalar_t reduce_sum = 0;
      for(int idx = 0; idx < THREADS_PER_BLOCK; idx++) {
          reduce_sum += prod_sum[idx];
      }
      const int indx2 = n * odimcyx + c * odimyx + (y - pad_size) * odimx + (x - pad_size);
      gradInput2[indx2] = reduce_sum / nelems;
    }

}

int correlation_forward_cuda_kernel(at::Tensor& output,
                                    int ob,
                                    int oc,
                                    int oh,
                                    int ow,
                                    int osb,
                                    int osc,
                                    int osh,
                                    int osw,

                                    at::Tensor& input1,
                                    int ic,
                                    int ih,
                                    int iw,
                                    int isb,
                                    int isc,
                                    int ish,
                                    int isw,

                                    at::Tensor& input2,
                                    int gc,
                                    int gsb,
                                    int gsc,
                                    int gsh,
                                    int gsw,

                                    at::Tensor& rInput1,
                                    at::Tensor& rInput2,
                                    int pad_size,
                                    int kernel_size,
                                    int max_displacement,
                                    int stride1,
                                    int stride2,
                                    int corr_type_multiply,
                                    cudaStream_t stream) 
{

   int batchSize = ob;

   int nInputChannels = ic;
   int inputWidth = iw;
   int inputHeight = ih;

   int nOutputChannels = oc;
   int outputWidth = ow;
   int outputHeight = oh;

   dim3 blocks_grid(batchSize, inputHeight, inputWidth);
   dim3 threads_block(THREADS_PER_BLOCK);

  AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "channels_first_fwd_1", ([&] {

  channels_first<scalar_t><<<blocks_grid,threads_block, 0, stream>>>(
      input1.data<scalar_t>(), rInput1.data<scalar_t>(), nInputChannels, inputHeight, inputWidth, pad_size);

  }));

  AT_DISPATCH_FLOATING_TYPES_AND_HALF(input2.type(), "channels_first_fwd_2", ([&] {

  channels_first<scalar_t><<<blocks_grid,threads_block, 0, stream>>> (
      input2.data<scalar_t>(), rInput2.data<scalar_t>(), nInputChannels, inputHeight, inputWidth, pad_size);

  }));

   dim3 threadsPerBlock(THREADS_PER_BLOCK);
   dim3 totalBlocksCorr(batchSize, outputHeight, outputWidth);

  AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "correlation_forward", ([&] {

   correlation_forward<scalar_t><<<totalBlocksCorr, threadsPerBlock, 0, stream>>> 
                        (output.data<scalar_t>(), nOutputChannels, outputHeight, outputWidth,
                         rInput1.data<scalar_t>(), nInputChannels, inputHeight, inputWidth,
                         rInput2.data<scalar_t>(),
                         pad_size,
                         kernel_size,
                         max_displacement,
                         stride1,
                         stride2);

  }));

  cudaError_t err = cudaGetLastError();


  // check for errors
  if (err != cudaSuccess) {
    printf("error in correlation_forward_cuda_kernel: %s\n", cudaGetErrorString(err));
    return 0;
  }

  return 1;
}


int correlation_backward_cuda_kernel(
                                    at::Tensor& gradOutput,
                                    int gob,
                                    int goc,
                                    int goh,
                                    int gow,
                                    int gosb,
                                    int gosc,
                                    int gosh,
                                    int gosw,

                                    at::Tensor& input1,
                                    int ic,
                                    int ih,
                                    int iw,
                                    int isb,
                                    int isc,
                                    int ish,
                                    int isw,

                                    at::Tensor& input2,
                                    int gsb,
                                    int gsc,
                                    int gsh,
                                    int gsw,

                                    at::Tensor& gradInput1,
                                    int gisb,
                                    int gisc,
                                    int gish,
                                    int gisw,

                                    at::Tensor& gradInput2,
                                    int ggc,
                                    int ggsb,
                                    int ggsc,
                                    int ggsh,
                                    int ggsw,

                                    at::Tensor& rInput1,
                                    at::Tensor& rInput2,
                                    int pad_size,
                                    int kernel_size,
                                    int max_displacement,
                                    int stride1,
                                    int stride2,
                                    int corr_type_multiply,
                                    cudaStream_t stream)
{

    int batchSize = gob;
    int num = batchSize;

    int nInputChannels = ic;
    int inputWidth = iw;
    int inputHeight = ih;

    int nOutputChannels = goc;
    int outputWidth = gow;
    int outputHeight = goh;

    dim3 blocks_grid(batchSize, inputHeight, inputWidth);
    dim3 threads_block(THREADS_PER_BLOCK);


    AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "lltm_forward_cuda", ([&] {

        channels_first<scalar_t><<<blocks_grid, threads_block, 0, stream>>>(
            input1.data<scalar_t>(),
            rInput1.data<scalar_t>(),
            nInputChannels,
            inputHeight,
            inputWidth,
            pad_size
        );
    }));

    AT_DISPATCH_FLOATING_TYPES_AND_HALF(input2.type(), "lltm_forward_cuda", ([&] {

        channels_first<scalar_t><<<blocks_grid, threads_block, 0, stream>>>(
            input2.data<scalar_t>(),
            rInput2.data<scalar_t>(),
            nInputChannels,
            inputHeight,
            inputWidth,
            pad_size
        );
    }));

    dim3 threadsPerBlock(THREADS_PER_BLOCK);
    dim3 totalBlocksCorr(inputHeight, inputWidth, nInputChannels);

    for (int n = 0; n < num; ++n) {

      AT_DISPATCH_FLOATING_TYPES_AND_HALF(input2.type(), "lltm_forward_cuda", ([&] {


          correlation_backward_input1<scalar_t><<<totalBlocksCorr, threadsPerBlock, 0, stream>>> (
              n, gradInput1.data<scalar_t>(), nInputChannels, inputHeight, inputWidth,
              gradOutput.data<scalar_t>(), nOutputChannels, outputHeight, outputWidth,
              rInput2.data<scalar_t>(),
              pad_size,
              kernel_size,
              max_displacement,
              stride1,
              stride2);
      }));
    }

    for(int n = 0; n < batchSize; n++) {

      AT_DISPATCH_FLOATING_TYPES_AND_HALF(rInput1.type(), "lltm_forward_cuda", ([&] {

        correlation_backward_input2<scalar_t><<<totalBlocksCorr, threadsPerBlock, 0, stream>>>(
            n, gradInput2.data<scalar_t>(), nInputChannels, inputHeight, inputWidth,
            gradOutput.data<scalar_t>(), nOutputChannels, outputHeight, outputWidth,
            rInput1.data<scalar_t>(),
            pad_size,
            kernel_size,
            max_displacement,
            stride1,
            stride2);

        }));
    }

  // check for errors
  cudaError_t err = cudaGetLastError();
  if (err != cudaSuccess) {
    printf("error in correlation_backward_cuda_kernel: %s\n", cudaGetErrorString(err));
    return 0;
  }

  return 1;
}


================================================
FILE: dvs/flownet2/networks/correlation_package/correlation_cuda_kernel.cuh
================================================
#pragma once

#include <ATen/ATen.h>
#include <ATen/Context.h>
#include <cuda_runtime.h>

int correlation_forward_cuda_kernel(at::Tensor& output,
    int ob,
    int oc,
    int oh,
    int ow,
    int osb,
    int osc,
    int osh,
    int osw,

    at::Tensor& input1,
    int ic,
    int ih,
    int iw,
    int isb,
    int isc,
    int ish,
    int isw,

    at::Tensor& input2,
    int gc,
    int gsb,
    int gsc,
    int gsh,
    int gsw,

    at::Tensor& rInput1,
    at::Tensor& rInput2,
    int pad_size,
    int kernel_size,
    int max_displacement,
    int stride1,
    int stride2,
    int corr_type_multiply,
    cudaStream_t stream);


int correlation_backward_cuda_kernel(   
    at::Tensor& gradOutput,
    int gob,
    int goc,
    int goh,
    int gow,
    int gosb,
    int gosc,
    int gosh,
    int gosw,

    at::Tensor& input1,
    int ic,
    int ih,
    int iw,
    int isb,
    int isc,
    int ish,
    int isw,

    at::Tensor& input2,
    int gsb,
    int gsc,
    int gsh,
    int gsw,

    at::Tensor& gradInput1, 
    int gisb,
    int gisc,
    int gish,
    int gisw,

    at::Tensor& gradInput2,
    int ggc,
    int ggsb,
    int ggsc,
    int ggsh,
    int ggsw,

    at::Tensor& rInput1,
    at::Tensor& rInput2,
    int pad_size,
    int kernel_size,
    int max_displacement,
    int stride1,
    int stride2,
    int corr_type_multiply,
    cudaStream_t stream);


================================================
FILE: dvs/flownet2/networks/correlation_package/setup.py
================================================
#!/usr/bin/env python3
import os
import torch

from setuptools import setup, find_packages
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

cxx_args = ['-std=c++11']

nvcc_args = [
    '-gencode', 'arch=compute_50,code=sm_50',
    '-gencode', 'arch=compute_52,code=sm_52',
    '-gencode', 'arch=compute_60,code=sm_60',
    '-gencode', 'arch=compute_61,code=sm_61',
    '-gencode', 'arch=compute_70,code=sm_70',
    '-gencode', 'arch=compute_70,code=compute_70'
]

setup(
    name='correlation_cuda',
    ext_modules=[
        CUDAExtension('correlation_cuda', [
            'correlation_cuda.cc',
            'correlation_cuda_kernel.cu'
        ], extra_compile_args={'cxx': cxx_args, 'nvcc': nvcc_args})
    ],
    cmdclass={
        'build_ext': BuildExtension
    })


================================================
FILE: dvs/flownet2/networks/resample2d_package/__init__.py
================================================


================================================
FILE: dvs/flownet2/networks/resample2d_package/resample2d.py
================================================
from torch.nn.modules.module import Module
from torch.autograd import Function, Variable
import resample2d_cuda

class Resample2dFunction(Function):

    @staticmethod
    def forward(ctx, input1, input2, kernel_size=1, bilinear= True):
        assert input1.is_contiguous()
        assert input2.is_contiguous()

        ctx.save_for_backward(input1, input2)
        ctx.kernel_size = kernel_size
        ctx.bilinear = bilinear

        _, d, _, _ = input1.size()
        b, _, h, w = input2.size()
        output = input1.new(b, d, h, w).zero_()

        resample2d_cuda.forward(input1, input2, output, kernel_size, bilinear)

        return output

    @staticmethod
    def backward(ctx, grad_output):
        grad_output = grad_output.contiguous()
        assert grad_output.is_contiguous()

        input1, input2 = ctx.saved_tensors

        grad_input1 = Variable(input1.new(input1.size()).zero_())
        grad_input2 = Variable(input1.new(input2.size()).zero_())

        resample2d_cuda.backward(input1, input2, grad_output.data,
                                 grad_input1.data, grad_input2.data,
                                 ctx.kernel_size, ctx.bilinear)

        return grad_input1, grad_input2, None, None

class Resample2d(Module):

    def __init__(self, kernel_size=1, bilinear = True):
        super(Resample2d, self).__init__()
        self.kernel_size = kernel_size
        self.bilinear = bilinear

    def forward(self, input1, input2):
        input1_c = input1.contiguous()
        return Resample2dFunction.apply(input1_c, input2, self.kernel_size, self.bilinear)


================================================
FILE: dvs/flownet2/networks/resample2d_package/resample2d_cuda.cc
================================================
#include <ATen/ATen.h>
#include <torch/torch.h>

#include "resample2d_kernel.cuh"

int resample2d_cuda_forward(
    at::Tensor& input1,
    at::Tensor& input2, 
    at::Tensor& output,
    int kernel_size, bool bilinear) {
      resample2d_kernel_forward(input1, input2, output, kernel_size, bilinear);
    return 1;
}

int resample2d_cuda_backward(
    at::Tensor& input1, 
    at::Tensor& input2,
    at::Tensor& gradOutput,
    at::Tensor& gradInput1, 
    at::Tensor& gradInput2, 
    int kernel_size, bool bilinear) {
        resample2d_kernel_backward(input1, input2, gradOutput, gradInput1, gradInput2, kernel_size, bilinear);
    return 1;
}



PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("forward", &resample2d_cuda_forward, "Resample2D forward (CUDA)");
  m.def("backward", &resample2d_cuda_backward, "Resample2D backward (CUDA)");
}



================================================
FILE: dvs/flownet2/networks/resample2d_package/resample2d_kernel.cu
================================================
#include <ATen/ATen.h>
#include <ATen/Context.h>
#include <ATen/cuda/CUDAContext.h>

#define CUDA_NUM_THREADS 512 
#define THREADS_PER_BLOCK 64 

#define DIM0(TENSOR) ((TENSOR).x)
#define DIM1(TENSOR) ((TENSOR).y)
#define DIM2(TENSOR) ((TENSOR).z)
#define DIM3(TENSOR) ((TENSOR).w)

#define DIM3_INDEX(TENSOR, xx, yy, zz, ww) ((TENSOR)[((xx) * (TENSOR##_stride.x)) + ((yy) * (TENSOR##_stride.y)) + ((zz) * (TENSOR##_stride.z)) + ((ww) * (TENSOR##_stride.w))])

template <typename scalar_t>
__global__ void kernel_resample2d_update_output(const int n, 
                                               const scalar_t* __restrict__ input1, const long4 input1_size, const long4 input1_stride,
                                               const scalar_t* __restrict__ input2, const long4 input2_size, const long4 input2_stride, 
                                               scalar_t* __restrict__ output, const long4 output_size, const long4 output_stride, int kernel_size, bool bilinear) {
    int index = blockIdx.x * blockDim.x + threadIdx.x;

    if (index >= n) {
        return;
    }

    scalar_t val = 0.0f;

    int dim_b = DIM0(output_size);
    int dim_c = DIM1(output_size);
    int dim_h = DIM2(output_size);
    int dim_w = DIM3(output_size);
    int dim_chw = dim_c * dim_h * dim_w;
    int dim_hw  = dim_h * dim_w;

    int b = ( index / dim_chw ) % dim_b;
    int c = ( index / dim_hw )  % dim_c;
    int y = ( index / dim_w )   % dim_h;
    int x = ( index          )  % dim_w;

    scalar_t dx = DIM3_INDEX(input2, b, 0, y, x);
    scalar_t dy = DIM3_INDEX(input2, b, 1, y, x);

    scalar_t xf = static_cast<scalar_t>(x) + dx;
    scalar_t yf = static_cast<scalar_t>(y) + dy;
    scalar_t alpha = xf - floor(xf); // alpha
    scalar_t beta = yf - floor(yf); // beta

    if (bilinear) {
        int xL = max(min( int (floor(xf)),    dim_w-1), 0);
        int xR = max(min( int (floor(xf)+1), dim_w -1), 0);
        int yT = max(min( int (floor(yf)),    dim_h-1), 0);
        int yB = max(min( int (floor(yf)+1),  dim_h-1), 0);

        for (int fy = 0; fy < kernel_size; fy += 1) {
            for (int fx = 0; fx < kernel_size; fx += 1) {
                val += static_cast<float>((1. - alpha)*(1. - beta) * DIM3_INDEX(input1, b, c, yT + fy, xL + fx));
                val += static_cast<float>((alpha)*(1. - beta) * DIM3_INDEX(input1, b, c, yT + fy, xR + fx));
                val += static_cast<float>((1. - alpha)*(beta) * DIM3_INDEX(input1, b, c, yB + fy, xL + fx));
                val += static_cast<float>((alpha)*(beta) * DIM3_INDEX(input1, b, c, yB + fy, xR + fx));
            }
        }

        output[index] = val;
    }
    else {
        int xN = max(min( int (floor(xf + 0.5)), dim_w - 1), 0);
        int yN = max(min( int (floor(yf + 0.5)), dim_h - 1), 0);

        output[index] = static_cast<float> ( DIM3_INDEX(input1, b, c, yN, xN) );
    }

}


template <typename scalar_t>
__global__ void kernel_resample2d_backward_input1(
    const int n, const scalar_t* __restrict__ input1, const long4 input1_size, const long4 input1_stride,
    const scalar_t* __restrict__ input2, const long4 input2_size, const long4 input2_stride,
    const scalar_t* __restrict__ gradOutput, const long4 gradOutput_size, const long4 gradOutput_stride,
    scalar_t* __restrict__ gradInput, const long4 gradInput_size, const long4 gradInput_stride, int kernel_size, bool bilinear) {

    int index = blockIdx.x * blockDim.x + threadIdx.x;

    if (index >= n) {
        return;
    }

    int dim_b = DIM0(gradOutput_size);
    int dim_c = DIM1(gradOutput_size);
    int dim_h = DIM2(gradOutput_size);
    int dim_w = DIM3(gradOutput_size);
    int dim_chw = dim_c * dim_h * dim_w;
    int dim_hw  = dim_h * dim_w;

    int b = ( index / dim_chw ) % dim_b;
    int c = ( index / dim_hw )  % dim_c;
    int y = ( index / dim_w )   % dim_h;
    int x = ( index          )  % dim_w;

    scalar_t dx = DIM3_INDEX(input2, b, 0, y, x);
    scalar_t dy = DIM3_INDEX(input2, b, 1, y, x);

    scalar_t xf = static_cast<scalar_t>(x) + dx;
    scalar_t yf = static_cast<scalar_t>(y) + dy;
    scalar_t alpha = xf - int(xf); // alpha
    scalar_t beta = yf - int(yf); // beta

    int idim_h = DIM2(input1_size);
    int idim_w = DIM3(input1_size);

    int xL = max(min( int (floor(xf)),    idim_w-1), 0);
    int xR = max(min( int (floor(xf)+1), idim_w -1), 0);
    int yT = max(min( int (floor(yf)),    idim_h-1), 0);
    int yB = max(min( int (floor(yf)+1),  idim_h-1), 0);

    for (int fy = 0; fy < kernel_size; fy += 1) {
        for (int fx = 0; fx < kernel_size; fx += 1) {
            atomicAdd(&DIM3_INDEX(gradInput, b, c, (yT + fy), (xL + fx)), (1-alpha)*(1-beta) * DIM3_INDEX(gradOutput, b, c, y, x));
            atomicAdd(&DIM3_INDEX(gradInput, b, c, (yT + fy), (xR + fx)),   (alpha)*(1-beta) * DIM3_INDEX(gradOutput, b, c, y, x));
            atomicAdd(&DIM3_INDEX(gradInput, b, c, (yB + fy), (xL + fx)),   (1-alpha)*(beta) * DIM3_INDEX(gradOutput, b, c, y, x));
            atomicAdd(&DIM3_INDEX(gradInput, b, c, (yB + fy), (xR + fx)),     (alpha)*(beta) * DIM3_INDEX(gradOutput, b, c, y, x));
        }
    }

}

template <typename scalar_t>
__global__ void kernel_resample2d_backward_input2(
    const int n, const scalar_t* __restrict__ input1, const long4 input1_size, const long4 input1_stride,
    const scalar_t* __restrict__ input2, const long4 input2_size, const long4 input2_stride,
    const scalar_t* __restrict__ gradOutput, const long4 gradOutput_size, const long4 gradOutput_stride,
    scalar_t* __restrict__ gradInput, const long4 gradInput_size, const long4 gradInput_stride, int kernel_size, bool bilinear) {

    int index = blockIdx.x * blockDim.x + threadIdx.x;

    if (index >= n) {
        return;
    }

    scalar_t output = 0.0;
    int kernel_rad = (kernel_size - 1)/2;

    int dim_b = DIM0(gradInput_size);
    int dim_c = DIM1(gradInput_size);
    int dim_h = DIM2(gradInput_size);
    int dim_w = DIM3(gradInput_size);
    int dim_chw = dim_c * dim_h * dim_w;
    int dim_hw  = dim_h * dim_w;

    int b = ( index / dim_chw ) % dim_b;
    int c = ( index / dim_hw )  % dim_c;
    int y = ( index / dim_w )   % dim_h;
    int x = ( index          )  % dim_w;

    int odim_c = DIM1(gradOutput_size);

    scalar_t dx = DIM3_INDEX(input2, b, 0, y, x);
    scalar_t dy = DIM3_INDEX(input2, b, 1, y, x);

    scalar_t xf = static_cast<scalar_t>(x) + dx;
    scalar_t yf = static_cast<scalar_t>(y) + dy;

    int xL = max(min( int (floor(xf)),    dim_w-1), 0);
    int xR = max(min( int (floor(xf)+1), dim_w -1), 0);
    int yT = max(min( int (floor(yf)),    dim_h-1), 0);
    int yB = max(min( int (floor(yf)+1),  dim_h-1), 0);
    
    if (c % 2) {
        float gamma = 1 - (xf - floor(xf)); // alpha
        for (int i = 0; i <= 2*kernel_rad; ++i) {
            for (int j = 0; j <= 2*kernel_rad; ++j) {
                for (int ch = 0; ch < odim_c; ++ch) {
                    output += (gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yB + j), (xL + i));
                    output -= (gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yT + j), (xL + i));
                    output += (1-gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yB + j), (xR + i));
                    output -= (1-gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yT + j), (xR + i));
                }
            }
        }
    }
    else {
        float gamma = 1 - (yf - floor(yf)); // alpha
        for (int i = 0; i <= 2*kernel_rad; ++i) {
            for (int j = 0; j <= 2*kernel_rad; ++j) {
                for (int ch = 0; ch < odim_c; ++ch) {
                    output += (gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yT + j), (xR + i));
                    output -= (gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yT + j), (xL + i));
                    output += (1-gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yB + j), (xR + i));
                    output -= (1-gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yB + j), (xL + i));
                }
            }
        }

    }

    gradInput[index] = output;

}

void resample2d_kernel_forward(
    at::Tensor& input1, 
    at::Tensor& input2,
    at::Tensor& output, 
    int kernel_size,
    bool bilinear) {

    int n = output.numel();

    const long4 input1_size = make_long4(input1.size(0), input1.size(1), input1.size(2), input1.size(3));
    const long4 input1_stride = make_long4(input1.stride(0), input1.stride(1), input1.stride(2), input1.stride(3));

    const long4 input2_size = make_long4(input2.size(0), input2.size(1), input2.size(2), input2.size(3));
    const long4 input2_stride = make_long4(input2.stride(0), input2.stride(1), input2.stride(2), input2.stride(3));

    const long4 output_size = make_long4(output.size(0), output.size(1), output.size(2), output.size(3));
    const long4 output_stride = make_long4(output.stride(0), output.stride(1), output.stride(2), output.stride(3));

    // TODO: when atomicAdd gets resolved, change to AT_DISPATCH_FLOATING_TYPES_AND_HALF
//    AT_DISPATCH_FLOATING_TYPES(input1.type(), "resample_forward_kernel", ([&] {

        kernel_resample2d_update_output<float><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
            n,
            input1.data<float>(),
            input1_size,
            input1_stride, 
            input2.data<float>(),
            input2_size,
            input2_stride,
            output.data<float>(),
            output_size,
            output_stride,
            kernel_size,
            bilinear);

//    }));

        // TODO: ATen-equivalent check

       //    THCudaCheck(cudaGetLastError());

}

void resample2d_kernel_backward(
    at::Tensor& input1,
    at::Tensor& input2,
    at::Tensor& gradOutput,
    at::Tensor& gradInput1,
    at::Tensor& gradInput2,
    int kernel_size,
    bool bilinear) {

    int n = gradOutput.numel();

    const long4 input1_size = make_long4(input1.size(0), input1.size(1), input1.size(2), input1.size(3));
    const long4 input1_stride = make_long4(input1.stride(0), input1.stride(1), input1.stride(2), input1.stride(3));

    const long4 input2_size = make_long4(input2.size(0), input2.size(1), input2.size(2), input2.size(3));
    const long4 input2_stride = make_long4(input2.stride(0), input2.stride(1), input2.stride(2), input2.stride(3));

    const long4 gradOutput_size = make_long4(gradOutput.size(0), gradOutput.size(1), gradOutput.size(2), gradOutput.size(3));
    const long4 gradOutput_stride = make_long4(gradOutput.stride(0), gradOutput.stride(1), gradOutput.stride(2), gradOutput.stride(3));

    const long4 gradInput1_size = make_long4(gradInput1.size(0), gradInput1.size(1), gradInput1.size(2), gradInput1.size(3));
    const long4 gradInput1_stride = make_long4(gradInput1.stride(0), gradInput1.stride(1), gradInput1.stride(2), gradInput1.stride(3));

//    AT_DISPATCH_FLOATING_TYPES(input1.type(), "resample_backward_input1", ([&] {

        kernel_resample2d_backward_input1<float><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
            n, 
            input1.data<float>(), 
            input1_size,
            input1_stride,
            input2.data<float>(),
            input2_size, 
            input2_stride,
            gradOutput.data<float>(),
            gradOutput_size,
            gradOutput_stride,
            gradInput1.data<float>(),
            gradInput1_size,
            gradInput1_stride, 
            kernel_size,
            bilinear
        );

//    }));

    const long4 gradInput2_size = make_long4(gradInput2.size(0), gradInput2.size(1), gradInput2.size(2), gradInput2.size(3));
    const long4 gradInput2_stride = make_long4(gradInput2.stride(0), gradInput2.stride(1), gradInput2.stride(2), gradInput2.stride(3));

    n = gradInput2.numel();

//    AT_DISPATCH_FLOATING_TYPES(gradInput2.type(), "resample_backward_input2", ([&] {


        kernel_resample2d_backward_input2<float><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
            n, 
            input1.data<float>(), 
            input1_size, 
            input1_stride,
            input2.data<float>(), 
            input2_size,
            input2_stride,
            gradOutput.data<float>(),
            gradOutput_size,
            gradOutput_stride,
            gradInput2.data<float>(),
            gradInput2_size,
            gradInput2_stride,
            kernel_size,
            bilinear
       );

//    }));

    // TODO: Use the ATen equivalent to get last error

    //    THCudaCheck(cudaGetLastError());

}


================================================
FILE: dvs/flownet2/networks/resample2d_package/resample2d_kernel.cuh
================================================
#pragma once

#include <ATen/ATen.h>

void resample2d_kernel_forward(
    at::Tensor& input1,
    at::Tensor& input2,
    at::Tensor& output,
    int kernel_size,
    bool bilinear);

void resample2d_kernel_backward(
    at::Tensor& input1,
    at::Tensor& input2,
    at::Tensor& gradOutput,
    at::Tensor& gradInput1, 
    at::Tensor& gradInput2, 
    int kernel_size,
    bool bilinear);

================================================
FILE: dvs/flownet2/networks/resample2d_package/setup.py
================================================
#!/usr/bin/env python3
import os
import torch

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

cxx_args = ['-std=c++11']

nvcc_args = [
    '-gencode', 'arch=compute_50,code=sm_50',
    '-gencode', 'arch=compute_52,code=sm_52',
    '-gencode', 'arch=compute_60,code=sm_60',
    '-gencode', 'arch=compute_61,code=sm_61',
    '-gencode', 'arch=compute_70,code=sm_70',
    '-gencode', 'arch=compute_70,code=compute_70'
]

setup(
    name='resample2d_cuda',
    ext_modules=[
        CUDAExtension('resample2d_cuda', [
            'resample2d_cuda.cc',
            'resample2d_kernel.cu'
        ], extra_compile_args={'cxx': cxx_args, 'nvcc': nvcc_args})
    ],
    cmdclass={
        'build_ext': BuildExtension
    })


================================================
FILE: dvs/flownet2/networks/submodules.py
================================================
# freda (todo) : 

import torch.nn as nn
import torch
import numpy as np 

def conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1):
    if batchNorm:
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=(kernel_size-1)//2, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.LeakyReLU(0.1,inplace=True)
        )
    else:
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=(kernel_size-1)//2, bias=True),
            nn.LeakyReLU(0.1,inplace=True)
        )

def i_conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1, bias = True):
    if batchNorm:
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=(kernel_size-1)//2, bias=bias),
            nn.BatchNorm2d(out_planes),
        )
    else:
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=(kernel_size-1)//2, bias=bias),
        )

def predict_flow(in_planes):
    return nn.Conv2d(in_planes,2,kernel_size=3,stride=1,padding=1,bias=True)

def deconv(in_planes, out_planes):
    return nn.Sequential(
        nn.ConvTranspose2d(in_planes, out_planes, kernel_size=4, stride=2, padding=1, bias=True),
        nn.LeakyReLU(0.1,inplace=True)
    )

class tofp16(nn.Module):
    def __init__(self):
        super(tofp16, self).__init__()

    def forward(self, input):
        return input.half()


class tofp32(nn.Module):
    def __init__(self):
        super(tofp32, self).__init__()

    def forward(self, input):
        return input.float()


def init_deconv_bilinear(weight):
    f_shape = weight.size()
    heigh, width = f_shape[-2], f_shape[-1]
    f = np.ceil(width/2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    bilinear = np.zeros([heigh, width])
    for x in range(width):
        for y in range(heigh):
            value = (1 - abs(x / f - c)) * (1 - abs(y / f - c))
            bilinear[x, y] = value
    weight.data.fill_(0.)
    for i in range(f_shape[0]):
        for j in range(f_shape[1]):
            weight.data[i,j,:,:] = torch.from_numpy(bilinear)


def save_grad(grads, name):
    def hook(grad):
        grads[name] = grad
    return hook

'''
def save_grad(grads, name):
    def hook(grad):
        grads[name] = grad
    return hook
import torch
from channelnorm_package.modules.channelnorm import ChannelNorm 
model = ChannelNorm().cuda()
grads = {}
a = 100*torch.autograd.Variable(torch.randn((1,3,5,5)).cuda(), requires_grad=True)
a.register_hook(save_grad(grads, 'a'))
b = model(a)
y = torch.mean(b)
y.backward()

'''


================================================
FILE: dvs/flownet2/run.sh
================================================
#!/bin/bash
python main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \
	--inference_dataset_root ./../video \
	--resume ./FlowNet2_checkpoint.pth.tar \
	--inference_visualize


================================================
FILE: dvs/flownet2/run_release.sh
================================================
#!/bin/bash
python main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \
	--inference_dataset_root ./../dataset_release/test \
	--resume ./FlowNet2_checkpoint.pth.tar \
	--inference_visualize

python main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \
	--inference_dataset_root ./../dataset_release/training \
	--resume ./FlowNet2_checkpoint.pth.tar \
	--inference_visualize

================================================
FILE: dvs/flownet2/utils/__init__.py
================================================


================================================
FILE: dvs/flownet2/utils/flow_utils.py
================================================
import numpy as np
import matplotlib.pyplot as plt
import os.path

TAG_CHAR = np.array([202021.25], np.float32)

def readFlow(fn):
    """ Read .flo file in Middlebury format"""
    # Code adapted from:
    # http://stackoverflow.com/questions/28013200/reading-middlebury-flow-files-with-python-bytes-array-numpy

    # WARNING: this will work on little-endian architectures (eg Intel x86) only!
    # print 'fn = %s'%(fn)
    with open(fn, 'rb') as f:
        magic = np.fromfile(f, np.float32, count=1)
        if 202021.25 != magic:
            print('Magic number incorrect. Invalid .flo file')
            return None
        else:
            w = np.fromfile(f, np.int32, count=1)
            h = np.fromfile(f, np.int32, count=1)
            # print 'Reading %d x %d flo file\n' % (w, h)
            data = np.fromfile(f, np.float32, count=2*int(w)*int(h))
            # Reshape data into 3D array (columns, rows, bands)
            # The reshape here is for visualization, the original code is (w,h,2)
            return np.resize(data, (int(h), int(w), 2))

def writeFlow(filename,uv,v=None):
    """ Write optical flow to file.
    
    If v is None, uv is assumed to contain both u and v channels,
    stacked in depth.
    Original code by Deqing Sun, adapted from Daniel Scharstein.
    """
    nBands = 2

    if v is None:
        assert(uv.ndim == 3)
        assert(uv.shape[2] == 2)
        u = uv[:,:,0]
        v = uv[:,:,1]
    else:
        u = uv

    assert(u.shape == v.shape)
    height,width = u.shape
    f = open(filename,'wb')
    # write the header
    f.write(TAG_CHAR)
    np.array(width).astype(np.int32).tofile(f)
    np.array(height).astype(np.int32).tofile(f)
    # arrange into matrix form
    tmp = np.zeros((height, width*nBands))
    tmp[:,np.arange(width)*2] = u
    tmp[:,np.arange(width)*2 + 1] = v
    tmp.astype(np.float32).tofile(f)
    f.close()


# ref: https://github.com/sampepose/flownet2-tf/
# blob/18f87081db44939414fc4a48834f9e0da3e69f4c/src/flowlib.py#L240
def visulize_flow_file(flow_filename, save_dir=None):
	flow_data = readFlow(flow_filename)
	img = flow2img(flow_data)
	# plt.imshow(img)
	# plt.show()
	if save_dir:
		idx = flow_filename.rfind("/") + 1
		plt.imsave(os.path.join(save_dir, "%s-vis.png" % flow_filename[idx:-4]), img)


def flow2img(flow_data):
	"""
	convert optical flow into color image
	:param flow_data:
	:return: color image
	"""
	# print(flow_data.shape)
	# print(type(flow_data))
	u = flow_data[:, :, 0]
	v = flow_data[:, :, 1]

	UNKNOW_FLOW_THRESHOLD = 1e7
	pr1 = abs(u) > UNKNOW_FLOW_THRESHOLD
	pr2 = abs(v) > UNKNOW_FLOW_THRESHOLD
	idx_unknown = (pr1 | pr2)
	u[idx_unknown] = v[idx_unknown] = 0

	# get max value in each direction
	maxu = -999.
	maxv = -999.
	minu = 999.
	minv = 999.
	maxu = max(maxu, np.max(u))
	maxv = max(maxv, np.max(v))
	minu = min(minu, np.min(u))
	minv = min(minv, np.min(v))

	rad = np.sqrt(u ** 2 + v ** 2)
	maxrad = max(-1, np.max(rad))
	u = u / maxrad + np.finfo(float).eps
	v = v / maxrad + np.finfo(float).eps

	img = compute_color(u, v)

	idx = np.repeat(idx_unknown[:, :, np.newaxis], 3, axis=2)
	img[idx] = 0

	return np.uint8(img)


def compute_color(u, v):
	"""
	compute optical flow color map
	:param u: horizontal optical flow
	:param v: vertical optical flow
	:return:
	"""

	height, width = u.shape
	img = np.zeros((height, width, 3))

	NAN_idx = np.isnan(u) | np.isnan(v)
	u[NAN_idx] = v[NAN_idx] = 0

	colorwheel = make_color_wheel()
	ncols = np.size(colorwheel, 0)

	rad = np.sqrt(u ** 2 + v ** 2)

	a = np.arctan2(-v, -u) / np.pi

	fk = (a + 1) / 2 * (ncols - 1) + 1

	k0 = np.floor(fk).astype(int)

	k1 = k0 + 1
	k1[k1 == ncols + 1] = 1
	f = fk - k0

	for i in range(0, np.size(colorwheel, 1)):
		tmp = colorwheel[:, i]
		col0 = tmp[k0 - 1] / 255
		col1 = tmp[k1 - 1] / 255
		col = (1 - f) * col0 + f * col1

		idx = rad <= 1
		col[idx] = 1 - rad[idx] * (1 - col[idx])
		notidx = np.logical_not(idx)

		col[notidx] *= 0.75
		img[:, :, i] = np.uint8(np.floor(255 * col * (1 - NAN_idx)))

	return img


def make_color_wheel():
	"""
	Generate color wheel according Middlebury color code
	:return: Color wheel
	"""
	RY = 15
	YG = 6
	GC = 4
	CB = 11
	BM = 13
	MR = 6

	ncols = RY + YG + GC + CB + BM + MR

	colorwheel = np.zeros([ncols, 3])

	col = 0

	# RY
	colorwheel[0:RY, 0] = 255
	colorwheel[0:RY, 1] = np.transpose(np.floor(255 * np.arange(0, RY) / RY))
	col += RY

	# YG
	colorwheel[col:col + YG, 0] = 255 - np.transpose(np.floor(255 * np.arange(0, YG) / YG))
	colorwheel[col:col + YG, 1] = 255
	col += YG

	# GC
	colorwheel[col:col + GC, 1] = 255
	colorwheel[col:col + GC, 2] = np.transpose(np.floor(255 * np.arange(0, GC) / GC))
	col += GC

	# CB
	colorwheel[col:col + CB, 1] = 255 - np.transpose(np.floor(255 * np.arange(0, CB) / CB))
	colorwheel[col:col + CB, 2] = 255
	col += CB

	# BM
	colorwheel[col:col + BM, 2] = 255
	colorwheel[col:col + BM, 0] = np.transpose(np.floor(255 * np.arange(0, BM) / BM))
	col += + BM

	# MR
	colorwheel[col:col + MR, 2] = 255 - np.transpose(np.floor(255 * np.arange(0, MR) / MR))
	colorwheel[col:col + MR, 0] = 255

	return colorwheel


================================================
FILE: dvs/flownet2/utils/frame_utils.py
================================================
import numpy as np
from os.path import *
from imageio import imread
from . import flow_utils 

def read_gen(file_name):
    ext = splitext(file_name)[-1]
    if ext == '.png' or ext == '.jpeg' or ext == '.ppm' or ext == '.jpg':
        im = imread(file_name)
        if im.shape[2] > 3:
            return im[:,:,:3]
        else:
            return im
    elif ext == '.bin' or ext == '.raw':
        return np.load(file_name)
    elif ext == '.flo':
        return flow_utils.readFlow(file_name).astype(np.float32)
    return []


================================================
FILE: dvs/flownet2/utils/param_utils.py
================================================
import torch
import torch.nn as nn
import numpy as np

def parse_flownetc(modules, weights, biases):
    keys = [
    'conv1',
    'conv2',
    'conv3',
    'conv_redir',
    'conv3_1',
    'conv4',
    'conv4_1',
    'conv5',
    'conv5_1',
    'conv6',
    'conv6_1',
    
    'deconv5',
    'deconv4',
    'deconv3',
    'deconv2',
    
    'Convolution1',
    'Convolution2',
    'Convolution3',
    'Convolution4',
    'Convolution5',

    'upsample_flow6to5',
    'upsample_flow5to4',
    'upsample_flow4to3',
    'upsample_flow3to2',
    
    ]
    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == 'conv1':
                m.weight.data[:,:,:,:] = torch.from_numpy(np.flip(weight, axis=1).copy())
                m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                m.bias.data[:] = torch.from_numpy(bias)                    

            i = i + 1
    return

def parse_flownets(modules, weights, biases, param_prefix='net2_'):
    keys = [
    'conv1',
    'conv2',
    'conv3',
    'conv3_1',
    'conv4',
    'conv4_1',
    'conv5',
    'conv5_1',
    'conv6',
    'conv6_1',
    
    'deconv5',
    'deconv4',
    'deconv3',
    'deconv2',
    
    'predict_conv6',
    'predict_conv5',
    'predict_conv4',
    'predict_conv3',
    'predict_conv2',

    'upsample_flow6to5',
    'upsample_flow5to4',
    'upsample_flow4to3',
    'upsample_flow3to2',
    ]
    for i, k in enumerate(keys):
        if 'upsample' in k:
            keys[i] = param_prefix + param_prefix + k
        else:
            keys[i] = param_prefix + k
    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == param_prefix+'conv1':
                m.weight.data[:,0:3,:,:] = torch.from_numpy(np.flip(weight[:,0:3,:,:], axis=1).copy())
                m.weight.data[:,3:6,:,:] = torch.from_numpy(np.flip(weight[:,3:6,:,:], axis=1).copy())
                m.weight.data[:,6:9,:,:] = torch.from_numpy(np.flip(weight[:,6:9,:,:], axis=1).copy())
                m.weight.data[:,9::,:,:] = torch.from_numpy(weight[:,9:,:,:].copy())
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1
    return

def parse_flownetsonly(modules, weights, biases, param_prefix=''):
    keys = [
    'conv1',
    'conv2',
    'conv3',
    'conv3_1',
    'conv4',
    'conv4_1',
    'conv5',
    'conv5_1',
    'conv6',
    'conv6_1',
    
    'deconv5',
    'deconv4',
    'deconv3',
    'deconv2',
    
    'Convolution1',
    'Convolution2',
    'Convolution3',
    'Convolution4',
    'Convolution5',

    'upsample_flow6to5',
    'upsample_flow5to4',
    'upsample_flow4to3',
    'upsample_flow3to2',
    ]
    for i, k in enumerate(keys):
        if 'upsample' in k:
            keys[i] = param_prefix + param_prefix + k
        else:
            keys[i] = param_prefix + k
    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == param_prefix+'conv1':
                # print ("%s :"%(keys[i]), m.weight.size(), m.bias.size(), tf_w[keys[i]].shape[::-1])
                m.weight.data[:,0:3,:,:] = torch.from_numpy(np.flip(weight[:,0:3,:,:], axis=1).copy())
                m.weight.data[:,3:6,:,:] = torch.from_numpy(np.flip(weight[:,3:6,:,:], axis=1).copy())
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1
    return

def parse_flownetsd(modules, weights, biases, param_prefix='netsd_'):
    keys = [
    'conv0',
    'conv1',
    'conv1_1',
    'conv2',
    'conv2_1',
    'conv3',
    'conv3_1',
    'conv4',
    'conv4_1',
    'conv5',
    'conv5_1',
    'conv6',
    'conv6_1',
    
    'deconv5',
    'deconv4',
    'deconv3',
    'deconv2',

    'interconv5',
    'interconv4',
    'interconv3',
    'interconv2',
    
    'Convolution1',
    'Convolution2',
    'Convolution3',
    'Convolution4',
    'Convolution5',

    'upsample_flow6to5',
    'upsample_flow5to4',
    'upsample_flow4to3',
    'upsample_flow3to2',
    ]
    for i, k in enumerate(keys):
        keys[i] = param_prefix + k

    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == param_prefix+'conv0':
                m.weight.data[:,0:3,:,:] = torch.from_numpy(np.flip(weight[:,0:3,:,:], axis=1).copy())
                m.weight.data[:,3:6,:,:] = torch.from_numpy(np.flip(weight[:,3:6,:,:], axis=1).copy())
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1

    return

def parse_flownetfusion(modules, weights, biases, param_prefix='fuse_'):
    keys = [
    'conv0',
    'conv1',
    'conv1_1',
    'conv2',
    'conv2_1',

    'deconv1',
    'deconv0',

    'interconv1',
    'interconv0',
    
    '_Convolution5',
    '_Convolution6',
    '_Convolution7',

    'upsample_flow2to1',
    'upsample_flow1to0',
    ]
    for i, k in enumerate(keys):
        keys[i] = param_prefix + k

    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == param_prefix+'conv0':
                m.weight.data[:,0:3,:,:] = torch.from_numpy(np.flip(weight[:,0:3,:,:], axis=1).copy())
                m.weight.data[:,3::,:,:] = torch.from_numpy(weight[:,3:,:,:].copy())
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1

    return


================================================
FILE: dvs/flownet2/utils/tools.py
================================================
# freda (todo) : 

import os, time, sys, math
import subprocess, shutil
from os.path import *
import numpy as np
from inspect import isclass
from pytz import timezone
from datetime import datetime
import inspect
import torch

def datestr():
    pacific = timezone('US/Pacific')
    now = datetime.now(pacific)
    return '{}{:02}{:02}_{:02}{:02}'.format(now.year, now.month, now.day, now.hour, now.minute)

def module_to_dict(module, exclude=[]):
        return dict([(x, getattr(module, x)) for x in dir(module)
                     if isclass(getattr(module, x))
                     and x not in exclude
                     and getattr(module, x) not in exclude])

class TimerBlock: 
    def __init__(self, title):
        print(("{}".format(title)))

    def __enter__(self):
        self.start = time.clock()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.end = time.clock()
        self.interval = self.end - self.start

        if exc_type is not None:
            self.log("Operation failed\n")
        else:
            self.log("Operation finished\n")


    def log(self, string):
        duration = time.clock() - self.start
        units = 's'
        if duration > 60:
            duration = duration / 60.
            units = 'm'
        print(("  [{:.3f}{}] {}".format(duration, units, string)))
    
    def log2file(self, fid, string):
        fid = open(fid, 'a')
        fid.write("%s\n"%(string))
        fid.close()

def add_arguments_for_module(parser, module, argument_for_class, default, skip_params=[], parameter_defaults={}):
    argument_group = parser.add_argument_group(argument_for_class.capitalize())

    module_dict = module_to_dict(module)
    argument_group.add_argument('--' + argument_for_class, type=str, default=default, choices=list(module_dict.keys()))
    
    args, unknown_args = parser.parse_known_args()
    class_obj = module_dict[vars(args)[argument_for_class]]

    argspec = inspect.getargspec(class_obj.__init__)

    defaults = argspec.defaults[::-1] if argspec.defaults else None

    args = argspec.args[::-1]
    for i, arg in enumerate(args):
        cmd_arg = '{}_{}'.format(argument_for_class, arg)
        if arg not in skip_params + ['self', 'args']:
            if arg in list(parameter_defaults.keys()):
                argument_group.add_argument('--{}'.format(cmd_arg), type=type(parameter_defaults[arg]), default=parameter_defaults[arg])
            elif (defaults is not None and i < len(defaults)):
                argument_group.add_argument('--{}'.format(cmd_arg), type=type(defaults[i]), default=defaults[i])
            else:
                print(("[Warning]: non-default argument '{}' detected on class '{}'. This argument cannot be modified via the command line"
                        .format(arg, module.__class__.__name__)))
            # We don't have a good way of dealing with inferring the type of the argument
            # TODO: try creating a custom action and using ast's infer type?
            # else:
            #     argument_group.add_argument('--{}'.format(cmd_arg), required=True)

def kwargs_from_args(args, argument_for_class):
    argument_for_class = argument_for_class + '_'
    return {key[len(argument_for_class):]: value for key, value in list(vars(args).items()) if argument_for_class in key and key != argument_for_class + 'class'}

def format_dictionary_of_losses(labels, values):
    try:
        string = ', '.join([('{}: {:' + ('.3f' if value >= 0.001 else '.1e') +'}').format(name, value) for name, value in zip(labels, values)])
    except (TypeError, ValueError) as e:
        print((list(zip(labels, values))))
        string = '[Log Error] ' + str(e)

    return string


class IteratorTimer():
    def __init__(self, iterable):
        self.iterable = iterable
        self.iterator = self.iterable.__iter__()

    def __iter__(self):
        return self

    def __len__(self):
        return len(self.iterable)

    def __next__(self):
        start = time.time()
        n = next(self.iterator)
        self.last_duration = (time.time() - start)
        return n

    next = __next__

def gpumemusage():
    gpu_mem = subprocess.check_output("nvidia-smi | grep MiB | cut -f 3 -d '|'", shell=True).replace(' ', '').replace('\n', '').replace('i', '')
    all_stat = [float(a) for a in gpu_mem.replace('/','').split('MB')[:-1]]

    gpu_mem = ''
    for i in range(len(all_stat)/2):
        curr, tot = all_stat[2*i], all_stat[2*i+1]
        util = "%1.2f"%(100*curr/tot)+'%'
        cmem = str(int(math.ceil(curr/1024.)))+'GB'
        gmem = str(int(math.ceil(tot/1024.)))+'GB'
        gpu_mem += util + '--' + join(cmem, gmem) + ' '
    return gpu_mem


def update_hyperparameter_schedule(args, epoch, global_iteration, optimizer):
    if args.schedule_lr_frequency > 0:
        for param_group in optimizer.param_groups:
            if (global_iteration + 1) % args.schedule_lr_frequency == 0:
                param_group['lr'] /= float(args.schedule_lr_fraction)
                param_group['lr'] = float(np.maximum(param_group['lr'], 0.000001))

def save_checkpoint(state, is_best, path, prefix, filename='checkpoint.pth.tar'):
    prefix_save = os.path.join(path, prefix)
    name = prefix_save + '_' + filename
    torch.save(state, name)
    if is_best:
        shutil.copyfile(name, prefix_save + '_model_best.pth.tar')



================================================
FILE: dvs/gyro/__init__.py
================================================
from .gyro_function import (
    GetGyroAtTimeStamp,
    QuaternionProduct,
    QuaternionReciprocal,
    ConvertQuaternionToAxisAngle,
    FindOISAtTimeStamp,
    GetMetadata,
    GetProjections,
    GetVirtualProjection,
    GetForwardGrid,
    CenterZoom,
    GetWarpingFlow,
    torch_norm_quat,
    torch_QuaternionProduct, 
    torch_QuaternionReciprocal,
    torch_GetVirtualProjection,
    get_static,
    torch_GetForwardGrid,
    torch_GetWarpingFlow,
    train_GetGyroAtTimeStamp,
    train_ConvertQuaternionToAxisAngle,
    ConvertAxisAngleToQuaternion,
    torch_ConvertAxisAngleToQuaternion,
    torch_ConvertQuaternionToAxisAngle,
    ConvertAxisAngleToQuaternion_no_angle,
    ConvertQuaternionToAxisAngle_no_angle,
    torch_GetHomographyTransformFromProjections,
    torch_ApplyTransform,
    norm_quat,
    SlerpWithDefault
    )
from .gyro_io import (
    LoadGyroData, 
    LoadOISData, 
    LoadFrameData, 
    LoadStabResult,
    get_grid, 
    get_rotations, 
    visual_rotation
    )

================================================
FILE: dvs/gyro/gyro_function.py
================================================
import numpy as np
from numpy import linalg as LA
import matplotlib.pyplot as plt
import torch
from torch.autograd import Variable

def get_static(height = 1080, width = 1920, ratio = 0.1):
    static_options = {}
    static_options["active_array_width"] = 4032
    static_options["active_array_height"] = 3024
    static_options["crop_window_width"] = 4032
    static_options["crop_window_height"] = 2272
    static_options["num_grid_rows"] = 12
    static_options["num_grid_cols"] = 12
    static_options["dim_homography"] = 9
    static_options["width"] = width  # frame width.
    static_options["height"] = height # frame height
    # static_options["fov"] = 1.27 # sensor_width/sensor_focal_length
    static_options["cropping_ratio"] = 0.0 #ratio # normalized cropping ratio at each side. 
    return static_options

# Quaternion: [x, y, z, w]

def norm_quat(quat):
    norm_quat = LA.norm(quat)   
    if norm_quat > 1e-6:
        quat = quat / norm_quat   
        #     [0 norm_quat norm_quat - 1e-6]
    else:
        # print('bad len for Reciprocal')
        quat = np.array([0,0,0,1])
    return quat

def torch_norm_quat(quat, USE_CUDA = True):
    # Method 1:
    batch_size = quat.size()[0]
    quat_out = Variable(torch.zeros((batch_size, 4), requires_grad=True))
    if USE_CUDA == True:
        quat_out = quat_out.cuda()
    for i in range(batch_size):
        norm_quat = torch.norm(quat[i])   
        if norm_quat > 1e-6:        
            quat_out[i] = quat[i] / norm_quat  
            #     [0 norm_quat norm_quat - 1e-6]
        else:
            quat_out[i,:3] = quat[i,:3] * 0
            quat_out[i,3] = quat[i,3] / quat[i,3]

    # Method 2:
    # quat = quat / (torch.unsqueeze(torch.norm(quat, dim = 1), 1) + 1e-6) # check norm
    return quat_out

def ConvertAxisAngleToQuaternion(axis, angle):
    if LA.norm(axis) > 1e-6 and angle > 1e-6: 
        axis = axis/LA.norm(axis)  
    half_angle = angle*0.5  
    sin_half_angle = np.sin(half_angle)
    quat = np.array([sin_half_angle* axis[0], sin_half_angle* axis[1], sin_half_angle* axis[2], np.cos(half_angle)])

    return norm_quat(quat)

def ConvertAxisAngleToQuaternion_no_angle(axis):
    angle = LA.norm(axis)  
    if LA.norm(axis) > 1e-6: 
        axis = axis/LA.norm(axis)  
    half_angle = angle*0.5  
    sin_half_angle = np.sin(half_angle)
    quat = np.array([sin_half_angle* axis[0], sin_half_angle* axis[1], sin_half_angle* axis[2], np.cos(half_angle)])

    return norm_quat(quat)

def torch_ConvertAxisAngleToQuaternion(axis, USE_CUDA = True):
    batch_size = axis.size()[0]

    angle = torch.norm(axis[:,:3], dim = 1)

    half_angle = angle * 0.5 
    sin_half_angle = torch.sin(half_angle)
    quats = Variable(torch.zeros((batch_size, 4), requires_grad=True))
    norm_axis = axis[:,:3] * 1
    if USE_CUDA:
        quats = quats.cuda()
    for i in range(batch_size):
        if angle[i] > 1e-6:
            norm_axis[i] = axis[i,:3]/angle[i]
    quats[:, :3] = sin_half_angle * norm_axis
    quats[:, 3] = torch.cos(half_angle)
    return torch_norm_quat(quats)

def ConvertQuaternionToAxisAngle(quat):
    quat = quat/LA.norm(quat)   
    axis_norm = LA.norm(quat[0:3])
    axis = np.array([0.0, 0.0, 0.0])
    if axis_norm < 1e-6:
        angle = 0   
    else:
        axis_norm_reciprocal = 1/axis_norm

Download .txt

gitextract__lkvtuhi/

├── .gitignore
├── LICENSE
├── README.md
├── docs/
│   ├── code-of-conduct.md
│   └── contributing.md
└── dvs/
    ├── checkpoint/
    │   └── stabilzation/
    │       └── stabilzation_last.checkpoint
    ├── conf/
    │   ├── stabilzation.yaml
    │   └── stabilzation_train.yaml
    ├── dataset.py
    ├── flownet2/
    │   ├── LICENSE
    │   ├── README.md
    │   ├── __init__.py
    │   ├── convert.py
    │   ├── datasets.py
    │   ├── install.sh
    │   ├── losses.py
    │   ├── main.py
    │   ├── models.py
    │   ├── networks/
    │   │   ├── FlowNetC.py
    │   │   ├── FlowNetFusion.py
    │   │   ├── FlowNetS.py
    │   │   ├── FlowNetSD.py
    │   │   ├── __init__.py
    │   │   ├── channelnorm_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── channelnorm.py
    │   │   │   ├── channelnorm_cuda.cc
    │   │   │   ├── channelnorm_kernel.cu
    │   │   │   ├── channelnorm_kernel.cuh
    │   │   │   └── setup.py
    │   │   ├── correlation_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── correlation.py
    │   │   │   ├── correlation_cuda.cc
    │   │   │   ├── correlation_cuda_kernel.cu
    │   │   │   ├── correlation_cuda_kernel.cuh
    │   │   │   └── setup.py
    │   │   ├── resample2d_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── resample2d.py
    │   │   │   ├── resample2d_cuda.cc
    │   │   │   ├── resample2d_kernel.cu
    │   │   │   ├── resample2d_kernel.cuh
    │   │   │   └── setup.py
    │   │   └── submodules.py
    │   ├── run.sh
    │   ├── run_release.sh
    │   └── utils/
    │       ├── __init__.py
    │       ├── flow_utils.py
    │       ├── frame_utils.py
    │       ├── param_utils.py
    │       └── tools.py
    ├── gyro/
    │   ├── __init__.py
    │   ├── gyro_function.py
    │   └── gyro_io.py
    ├── inference.py
    ├── load_frame_sensor_data.py
    ├── loss.py
    ├── metrics.py
    ├── model.py
    ├── printer.py
    ├── requirements.txt
    ├── train.py
    ├── util.py
    └── warp/
        ├── __init__.py
        ├── rasterizer.py
        ├── read_write.py
        └── warping.py

Download .txt

SYMBOL INDEX (342 symbols across 33 files)

FILE: dvs/dataset.py
  function get_data_loader (line 26) | def get_data_loader(cf, no_flo = False):
  function get_dataset (line 34) | def get_dataset(cf, no_flo = False):
  function get_inference_data_loader (line 52) | def get_inference_data_loader(cf, data_path, no_flo = False):
  function get_inference_dataset (line 57) | def get_inference_dataset(cf, data_path, no_flo = False):
  function _data_transforms (line 66) | def _data_transforms():
  class DVS_data (line 77) | class DVS_data():
    method __init__ (line 78) | def __init__(self):
  class Dataset_Gyro (line 87) | class Dataset_Gyro(Dataset):
    method __init__ (line 88) | def __init__(self, path, sample_freq = 33*1000000, number_real = 10, t...
    method process_one_video (line 122) | def process_one_video(self, path):
    method generate_quaternions (line 154) | def generate_quaternions(self, dvs_data):
    method load_flo (line 179) | def load_flo(self, idx, first_id):
    method load_real_projections (line 195) | def load_real_projections(self, idx, first_id):
    method __getitem__ (line 203) | def __getitem__(self, idx):
    method __len__ (line 212) | def __len__(self):
    method get_virtual_data (line 215) | def get_virtual_data(self, virtual_queue, real_queue_idx, pre_times, c...
    method update_virtual_queue (line 232) | def update_virtual_queue(self, batch_size, virtual_queue, out, times):
    method random_init_virtual_queue (line 244) | def random_init_virtual_queue(self, batch_size, real_postion, times):
    method get_data_at_timestamp (line 259) | def get_data_at_timestamp(self, gyro_data, ois_data, time_stamp, quat_...
    method get_ois_at_timestamp (line 264) | def get_ois_at_timestamp(self, ois_data, time_stamp):
  function get_timestamp (line 269) | def get_timestamp(frame_data, idx):
  function preprocess_gyro (line 275) | def preprocess_gyro(gyro, extend = 200):
  function LoadFlow (line 286) | def LoadFlow(path):
  function get_virtual_at_timestamp (line 293) | def get_virtual_at_timestamp(virtual_queue, real_queue, time_stamp, time...

FILE: dvs/flownet2/datasets.py
  class StaticRandomCrop (line 13) | class StaticRandomCrop(object):
    method __init__ (line 14) | def __init__(self, image_size, crop_size):
    method __call__ (line 20) | def __call__(self, img):
  class StaticCenterCrop (line 23) | class StaticCenterCrop(object):
    method __init__ (line 24) | def __init__(self, image_size, crop_size):
    method __call__ (line 27) | def __call__(self, img):
  class Padding (line 30) | class Padding(object):
    method __init__ (line 31) | def __init__(self, image_size, pad_size):
    method __call__ (line 34) | def __call__(self, img):
  class MpiSintel (line 39) | class MpiSintel(data.Dataset):
    method __init__ (line 40) | def __init__(self, args, is_cropped = False, root = '', dstype = 'clea...
    method __getitem__ (line 85) | def __getitem__(self, index):
    method __len__ (line 112) | def __len__(self):
  class MpiSintelClean (line 115) | class MpiSintelClean(MpiSintel):
    method __init__ (line 116) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
  class MpiSintelFinal (line 119) | class MpiSintelFinal(MpiSintel):
    method __init__ (line 120) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
  class FlyingChairs (line 123) | class FlyingChairs(data.Dataset):
    method __init__ (line 124) | def __init__(self, args, is_cropped, root = '/path/to/FlyingChairs_rel...
    method __getitem__ (line 155) | def __getitem__(self, index):
    method __len__ (line 181) | def __len__(self):
  class FlyingThings (line 184) | class FlyingThings(data.Dataset):
    method __init__ (line 185) | def __init__(self, args, is_cropped, root = '/path/to/flyingthings3d',...
    method __getitem__ (line 222) | def __getitem__(self, index):
    method __len__ (line 248) | def __len__(self):
  class FlyingThingsClean (line 251) | class FlyingThingsClean(FlyingThings):
    method __init__ (line 252) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
  class FlyingThingsFinal (line 255) | class FlyingThingsFinal(FlyingThings):
    method __init__ (line 256) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
  class ChairsSDHom (line 259) | class ChairsSDHom(data.Dataset):
    method __init__ (line 260) | def __init__(self, args, is_cropped, root = '/path/to/chairssdhom/data...
    method __getitem__ (line 291) | def __getitem__(self, index):
    method __len__ (line 318) | def __len__(self):
  class ChairsSDHomTrain (line 321) | class ChairsSDHomTrain(ChairsSDHom):
    method __init__ (line 322) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
  class ChairsSDHomTest (line 325) | class ChairsSDHomTest(ChairsSDHom):
    method __init__ (line 326) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
  class ImagesFromFolder (line 329) | class ImagesFromFolder(data.Dataset):
    method __init__ (line 330) | def __init__(self, args, is_cropped, root = '/path/to/frames/only/fold...
    method __getitem__ (line 354) | def __getitem__(self, index):
    method __len__ (line 373) | def __len__(self):
  class Google (line 377) | class Google(data.Dataset):
    method __init__ (line 378) | def __init__(self, args, is_cropped = False, root = '', dstype = 'fram...
    method __getitem__ (line 412) | def __getitem__(self, index):
    method __len__ (line 434) | def __len__(self):

FILE: dvs/flownet2/losses.py
  function EPE (line 11) | def EPE(input_flow, target_flow):
  class L1 (line 14) | class L1(nn.Module):
    method __init__ (line 15) | def __init__(self):
    method forward (line 17) | def forward(self, output, target):
  class L2 (line 21) | class L2(nn.Module):
    method __init__ (line 22) | def __init__(self):
    method forward (line 24) | def forward(self, output, target):
  class L1Loss (line 28) | class L1Loss(nn.Module):
    method __init__ (line 29) | def __init__(self, args):
    method forward (line 35) | def forward(self, output, target):
  class L2Loss (line 40) | class L2Loss(nn.Module):
    method __init__ (line 41) | def __init__(self, args):
    method forward (line 47) | def forward(self, output, target):
  class MultiScale (line 52) | class MultiScale(nn.Module):
    method __init__ (line 53) | def __init__(self, args, startScale = 4, numScales = 5, l_weight= 0.32...
    method forward (line 72) | def forward(self, output, target):

FILE: dvs/flownet2/main.py
  function inference (line 22) | def inference(args, epoch, data_path, data_loader, model, offset=0):
  class Model (line 183) | class Model(nn.Module):
    method __init__ (line 184) | def __init__(self, args):
    method forward (line 189) | def forward(self, data):

FILE: dvs/flownet2/models.py
  class FlowNet2 (line 30) | class FlowNet2(nn.Module):
    method __init__ (line 32) | def __init__(self, args, batchNorm=False, div_flow = 20.):
    method init_deconv_bilinear (line 104) | def init_deconv_bilinear(self, weight):
    method forward (line 120) | def forward(self, inputs):
  class FlowNet2C (line 187) | class FlowNet2C(FlowNetC.FlowNetC):
    method __init__ (line 188) | def __init__(self, args, batchNorm=False, div_flow=20):
    method forward (line 192) | def forward(self, inputs):
  class FlowNet2S (line 255) | class FlowNet2S(FlowNetS.FlowNetS):
    method __init__ (line 256) | def __init__(self, args, batchNorm=False, div_flow=20):
    method forward (line 261) | def forward(self, inputs):
  class FlowNet2SD (line 301) | class FlowNet2SD(FlowNetSD.FlowNetSD):
    method __init__ (line 302) | def __init__(self, args, batchNorm=False, div_flow=20):
    method forward (line 307) | def forward(self, inputs):
  class FlowNet2CS (line 353) | class FlowNet2CS(nn.Module):
    method __init__ (line 355) | def __init__(self, args, batchNorm=False, div_flow = 20.):
    method forward (line 392) | def forward(self, inputs):
  class FlowNet2CSS (line 418) | class FlowNet2CSS(nn.Module):
    method __init__ (line 420) | def __init__(self, args, batchNorm=False, div_flow = 20.):
    method forward (line 469) | def forward(self, inputs):

FILE: dvs/flownet2/networks/FlowNetC.py
  class FlowNetC (line 13) | class FlowNetC(nn.Module):
    method __init__ (line 14) | def __init__(self,args, batchNorm=True, div_flow = 20):
    method forward (line 71) | def forward(self, x):

FILE: dvs/flownet2/networks/FlowNetFusion.py
  class FlowNetFusion (line 11) | class FlowNetFusion(nn.Module):
    method __init__ (line 12) | def __init__(self,args, batchNorm=True):
    method forward (line 47) | def forward(self, x):

FILE: dvs/flownet2/networks/FlowNetS.py
  class FlowNetS (line 15) | class FlowNetS(nn.Module):
    method __init__ (line 16) | def __init__(self, args, input_channels = 12, batchNorm=True):
    method forward (line 60) | def forward(self, x):

FILE: dvs/flownet2/networks/FlowNetSD.py
  class FlowNetSD (line 11) | class FlowNetSD(nn.Module):
    method __init__ (line 12) | def __init__(self, args, batchNorm=True):
    method forward (line 66) | def forward(self, x):

FILE: dvs/flownet2/networks/channelnorm_package/channelnorm.py
  class ChannelNormFunction (line 5) | class ChannelNormFunction(Function):
    method forward (line 8) | def forward(ctx, input1, norm_deg=2):
    method backward (line 20) | def backward(ctx, grad_output):
  class ChannelNorm (line 31) | class ChannelNorm(Module):
    method __init__ (line 33) | def __init__(self, norm_deg=2):
    method forward (line 37) | def forward(self, input1):

FILE: dvs/flownet2/networks/channelnorm_package/channelnorm_cuda.cc
  function channelnorm_cuda_forward (line 6) | int channelnorm_cuda_forward(
  function channelnorm_cuda_backward (line 16) | int channelnorm_cuda_backward(
  function PYBIND11_MODULE (line 27) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: dvs/flownet2/networks/correlation_package/correlation.py
  class CorrelationFunction (line 6) | class CorrelationFunction(Function):
    method forward (line 9) | def forward(ctx, input1, input2, pad_size=3, kernel_size=3, max_displa...
    method backward (line 30) | def backward(ctx, grad_output):
  class Correlation (line 46) | class Correlation(Module):
    method __init__ (line 47) | def __init__(self, pad_size=0, kernel_size=0, max_displacement=0, stri...
    method forward (line 56) | def forward(self, input1, input2):

FILE: dvs/flownet2/networks/correlation_package/correlation_cuda.cc
  function correlation_forward_cuda (line 10) | int correlation_forward_cuda(at::Tensor& input1, at::Tensor& input2, at:...
  function correlation_backward_cuda (line 89) | int correlation_backward_cuda(at::Tensor& input1, at::Tensor& input2, at...
  function PYBIND11_MODULE (line 169) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: dvs/flownet2/networks/resample2d_package/resample2d.py
  class Resample2dFunction (line 5) | class Resample2dFunction(Function):
    method forward (line 8) | def forward(ctx, input1, input2, kernel_size=1, bilinear= True):
    method backward (line 25) | def backward(ctx, grad_output):
  class Resample2d (line 40) | class Resample2d(Module):
    method __init__ (line 42) | def __init__(self, kernel_size=1, bilinear = True):
    method forward (line 47) | def forward(self, input1, input2):

FILE: dvs/flownet2/networks/resample2d_package/resample2d_cuda.cc
  function resample2d_cuda_forward (line 6) | int resample2d_cuda_forward(
  function resample2d_cuda_backward (line 15) | int resample2d_cuda_backward(
  function PYBIND11_MODULE (line 28) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: dvs/flownet2/networks/submodules.py
  function conv (line 7) | def conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1):
  function i_conv (line 20) | def i_conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1, bi...
  function predict_flow (line 31) | def predict_flow(in_planes):
  function deconv (line 34) | def deconv(in_planes, out_planes):
  class tofp16 (line 40) | class tofp16(nn.Module):
    method __init__ (line 41) | def __init__(self):
    method forward (line 44) | def forward(self, input):
  class tofp32 (line 48) | class tofp32(nn.Module):
    method __init__ (line 49) | def __init__(self):
    method forward (line 52) | def forward(self, input):
  function init_deconv_bilinear (line 56) | def init_deconv_bilinear(weight):
  function save_grad (line 72) | def save_grad(grads, name):

FILE: dvs/flownet2/utils/flow_utils.py
  function readFlow (line 7) | def readFlow(fn):
  function writeFlow (line 28) | def writeFlow(filename,uv,v=None):
  function visulize_flow_file (line 62) | def visulize_flow_file(flow_filename, save_dir=None):
  function flow2img (line 72) | def flow2img(flow_data):
  function compute_color (line 112) | def compute_color(u, v):
  function make_color_wheel (line 157) | def make_color_wheel():

FILE: dvs/flownet2/utils/frame_utils.py
  function read_gen (line 6) | def read_gen(file_name):

FILE: dvs/flownet2/utils/param_utils.py
  function parse_flownetc (line 5) | def parse_flownetc(modules, weights, biases):
  function parse_flownets (line 51) | def parse_flownets(modules, weights, biases, param_prefix='net2_'):
  function parse_flownetsonly (line 104) | def parse_flownetsonly(modules, weights, biases, param_prefix=''):
  function parse_flownetsd (line 156) | def parse_flownetsd(modules, weights, biases, param_prefix='netsd_'):
  function parse_flownetfusion (line 214) | def parse_flownetfusion(modules, weights, biases, param_prefix='fuse_'):

FILE: dvs/flownet2/utils/tools.py
  function datestr (line 13) | def datestr():
  function module_to_dict (line 18) | def module_to_dict(module, exclude=[]):
  class TimerBlock (line 24) | class TimerBlock:
    method __init__ (line 25) | def __init__(self, title):
    method __enter__ (line 28) | def __enter__(self):
    method __exit__ (line 32) | def __exit__(self, exc_type, exc_value, traceback):
    method log (line 42) | def log(self, string):
    method log2file (line 50) | def log2file(self, fid, string):
  function add_arguments_for_module (line 55) | def add_arguments_for_module(parser, module, argument_for_class, default...
  function kwargs_from_args (line 84) | def kwargs_from_args(args, argument_for_class):
  function format_dictionary_of_losses (line 88) | def format_dictionary_of_losses(labels, values):
  class IteratorTimer (line 98) | class IteratorTimer():
    method __init__ (line 99) | def __init__(self, iterable):
    method __iter__ (line 103) | def __iter__(self):
    method __len__ (line 106) | def __len__(self):
    method __next__ (line 109) | def __next__(self):
  function gpumemusage (line 117) | def gpumemusage():
  function update_hyperparameter_schedule (line 131) | def update_hyperparameter_schedule(args, epoch, global_iteration, optimi...
  function save_checkpoint (line 138) | def save_checkpoint(state, is_best, path, prefix, filename='checkpoint.p...

FILE: dvs/gyro/gyro_function.py
  function get_static (line 7) | def get_static(height = 1080, width = 1920, ratio = 0.1):
  function norm_quat (line 24) | def norm_quat(quat):
  function torch_norm_quat (line 34) | def torch_norm_quat(quat, USE_CUDA = True):
  function ConvertAxisAngleToQuaternion (line 53) | def ConvertAxisAngleToQuaternion(axis, angle):
  function ConvertAxisAngleToQuaternion_no_angle (line 62) | def ConvertAxisAngleToQuaternion_no_angle(axis):
  function torch_ConvertAxisAngleToQuaternion (line 72) | def torch_ConvertAxisAngleToQuaternion(axis, USE_CUDA = True):
  function ConvertQuaternionToAxisAngle (line 90) | def ConvertQuaternionToAxisAngle(quat):
  function ConvertQuaternionToAxisAngle_no_angle (line 104) | def ConvertQuaternionToAxisAngle_no_angle(quat):
  function torch_ConvertQuaternionToAxisAngle (line 115) | def torch_ConvertQuaternionToAxisAngle(quat, USE_CUDA = True):
  function train_ConvertQuaternionToAxisAngle (line 129) | def train_ConvertQuaternionToAxisAngle(quat):
  function AngularVelocityToQuat (line 134) | def AngularVelocityToQuat(angular_v, dt):
  function QuaternionProduct (line 144) | def QuaternionProduct(q1, q2):
  function torch_QuaternionProduct (line 163) | def torch_QuaternionProduct(q1, q2, USE_CUDA = True):
  function ProcessGyroRotation (line 188) | def ProcessGyroRotation(gyro_data):
  function QuaternionReciprocal (line 199) | def QuaternionReciprocal(q):
  function torch_QuaternionReciprocal (line 203) | def torch_QuaternionReciprocal(q,  USE_CUDA = True):
  function ProcessGyroData (line 210) | def ProcessGyroData(gyro_data):
  function SlerpWithDefault (line 221) | def SlerpWithDefault(q1, q2, t, q_default):
  function GetGyroAtTimeStamp (line 260) | def GetGyroAtTimeStamp(gyro_data, timestamp):
  function train_GetGyroAtTimeStamp (line 275) | def train_GetGyroAtTimeStamp(gyro_data, timestamp, check = False):
  function FindOISAtTimeStamp (line 291) | def FindOISAtTimeStamp(ois_log, time):
  function GetMetadata (line 311) | def GetMetadata(frame_data, frame_index, result_poses = {} ):
  function GetProjections (line 330) | def GetProjections(static_options, metadata, quats_data, ois_data,  no_s...
  function GetRealProjection (line 344) | def GetRealProjection(static_options, quats_data, ois_data, fov, timesta...
  function GetProjectionHomography (line 354) | def GetProjectionHomography(rot, fov, offset, width, height):
  function torch_GetProjectionHomography (line 365) | def torch_GetProjectionHomography(rot, fov, width, height, USE_CUDA = Tr...
  function ConvertQuaternionToRotationMatrix (line 381) | def ConvertQuaternionToRotationMatrix(quat):
  function torch_ConvertQuaternionToRotationMatrix (line 399) | def torch_ConvertQuaternionToRotationMatrix(quat, USE_CUDA = True):
  function ConvertRotationMatrixToQuaternion (line 422) | def ConvertRotationMatrixToQuaternion(m):
  function GetIntrinsics (line 450) | def GetIntrinsics(focal_length, offset, width, height):
  function GetVirtualProjection (line 459) | def GetVirtualProjection(static_options, result_pose, metadata, frame_in...
  function torch_GetVirtualProjection (line 470) | def torch_GetVirtualProjection(static_options, quat, virtual_fov = 1.27):
  function GetForwardGrid (line 476) | def GetForwardGrid(static_options, real_projections, virtual_projection):
  function torch_GetForwardGrid (line 498) | def torch_GetForwardGrid(static_options, real_projections, virtual_proje...
  function GetWarpingFlow (line 530) | def GetWarpingFlow(real_projections_src, real_projections_dst, num_rows,...
  function torch_GetWarpingFlow (line 549) | def torch_GetWarpingFlow(static_options, real_projections_src, real_proj...
  function GetHomographyTransformFromProjections (line 581) | def GetHomographyTransformFromProjections(proj_src, proj_dst):
  function torch_GetHomographyTransformFromProjections (line 584) | def torch_GetHomographyTransformFromProjections(proj_src, proj_dst):
  function ApplyTransform (line 587) | def ApplyTransform(transform, point):
  function torch_ApplyTransform (line 594) | def torch_ApplyTransform(transform, point):
  function CenterZoom (line 601) | def CenterZoom(grid, ratio):

FILE: dvs/gyro/gyro_io.py
  function load_gyro_mesh (line 13) | def load_gyro_mesh(input_name):
  function get_grid (line 19) | def get_grid(static_options, frame_data, quats_data, ois_data, virtual_d...
  function get_rotations (line 34) | def get_rotations(frame_data, quats_data, ois_data, num_frames):
  function visual_rotation (line 50) | def visual_rotation(rotations_real, lens_offsets_real, rotations_virtual...
  function LoadOISData (line 106) | def LoadOISData(ois_name):
  function LoadFrameData (line 111) | def LoadFrameData(frame_log_name):
  function LoadGyroData (line 117) | def LoadGyroData(gyro_log_name):
  function LoadStabResult (line 126) | def LoadStabResult(input_name):
  function ReadLine (line 142) | def ReadLine(fid):
  function str2num (line 162) | def str2num(string):

FILE: dvs/inference.py
  function run (line 29) | def run(model, loader, cf, USE_CUDA=True):
  function inference (line 120) | def inference(cf, data_path, USE_CUDA):
  function visual_result (line 173) | def visual_result(cf, data, video_name, virtual_queue, virtual_queue2 = ...
  function main (line 187) | def main(args = None):

FILE: dvs/load_frame_sensor_data.py
  function run (line 31) | def run(loader, cf, USE_CUDA=True):
  function inference (line 68) | def inference(cf, data_path, USE_CUDA):
  function main (line 97) | def main(args = None):

FILE: dvs/loss.py
  class C2_Smooth_loss (line 21) | class C2_Smooth_loss(torch.nn.Module):
    method __init__ (line 22) | def __init__(self):
    method forward (line 26) | def forward(self, Qt, Qt_1, Qt_2):
  class C1_Smooth_loss (line 30) | class C1_Smooth_loss(torch.nn.Module):
    method __init__ (line 31) | def __init__(self):
    method forward (line 35) | def forward(self, v_r_axis, v_axis_t_1 = None, real_postion = None):
  class Follow_loss (line 40) | class Follow_loss(torch.nn.Module):
    method __init__ (line 41) | def __init__(self):
    method forward (line 45) | def forward(self, virtual_quat, real_quat, real_postion = None):
  class Stay_loss (line 50) | class Stay_loss(torch.nn.Module):
    method __init__ (line 51) | def __init__(self):
    method forward (line 55) | def forward(self, virtual_quat):
  class Angle_loss (line 59) | class Angle_loss(torch.nn.Module):
    method __init__ (line 60) | def __init__(self):
    method forward (line 64) | def forward(self, Q1, Q2, threshold = 0.5236, logistic_beta1 = 100):
  class Optical_loss (line 73) | class Optical_loss(torch.nn.Module):
    method __init__ (line 74) | def __init__(self):
    method forward (line 79) | def forward(self, Vt, Vt_1, flo, flo_back, real_projection_t, real_pro...
  function get_mesh (line 122) | def get_mesh(height = 270, width = 480, USE_CUDA = True):
  class Undefine_loss (line 132) | class Undefine_loss(torch.nn.Module):
    method __init__ (line 133) | def __init__(self, ratio = 0.08, inner_ratio = 0.04, USE_CUDA = True):
    method forward (line 153) | def forward(self, Vt, Rt, ratio = 0.04):
    method get_loss (line 177) | def get_loss(self, p):

FILE: dvs/metrics.py
  function _pickle_keypoints (line 19) | def _pickle_keypoints(point):
  function crop_metric (line 30) | def crop_metric(M):
  function get_scale (line 39) | def get_scale(M):
  function get_rescale_matrix (line 53) | def get_rescale_matrix(M, sx, sy):
  function metrics (line 64) | def metrics(in_src, out_src, package, crop_scale = False, re_compute = F...
  function crop_rm_outlier (line 294) | def crop_rm_outlier(crop):

FILE: dvs/model.py
  class LayerLSTM (line 16) | class LayerLSTM(nn.Module):
    method __init__ (line 17) | def __init__(self, input_size, hidden_size, bias):
    method init_hidden (line 22) | def init_hidden(self, batch_size):
    method forward (line 26) | def forward(self, x):
  class LayerCNN (line 31) | class LayerCNN(nn.Module):
    method __init__ (line 32) | def __init__(self, in_channel, out_channel, kernel_size, stride, paddi...
    method forward (line 43) | def forward(self, x):
  class LayerFC (line 52) | class LayerFC(nn.Module):
    method __init__ (line 53) | def __init__(self, in_features, out_features, bias, drop_out=0, activa...
    method forward (line 61) | def forward(self, x):
  class Net (line 71) | class Net(nn.Module):
    method __init__ (line 72) | def __init__(self, cf):
    method init_hidden (line 160) | def init_hidden(self, batch_size):
    method forward (line 164) | def forward(self, x, flo, ois):
  class Model (line 178) | class Model():
    method __init__ (line 179) | def __init__(self, cf):
    method loss (line 203) | def loss(
    method init_weights (line 243) | def init_weights(self, cf):
    method save_checkpoint (line 258) | def save_checkpoint(self, epoch = 0, optimizer=None):
  class UNet (line 272) | class UNet(nn.Module):
    method __init__ (line 273) | def __init__(self, n_channels = 4, n_classes = 16, bilinear=True):
    method forward (line 288) | def forward(self, x, x_back = None):
  class DoubleConv (line 304) | class DoubleConv(nn.Module):
    method __init__ (line 307) | def __init__(self, in_channels, out_channels, mid_channels=None):
    method forward (line 316) | def forward(self, x):
  class Down (line 320) | class Down(nn.Module):
    method __init__ (line 323) | def __init__(self, in_channels, out_channels):
    method forward (line 330) | def forward(self, x):
  class Up (line 334) | class Up(nn.Module):
    method __init__ (line 337) | def __init__(self, in_channels, out_channels, bilinear=True):
    method forward (line 349) | def forward(self, x1, x2):
  class OutConv (line 364) | class OutConv(nn.Module):
    method __init__ (line 365) | def __init__(self, in_channels, out_channels):
    method forward (line 369) | def forward(self, x):

FILE: dvs/printer.py
  class Printer (line 3) | class Printer(object):
    method __init__ (line 4) | def __init__(self, *files):
    method open (line 8) | def open(self):
    method close (line 15) | def close(self):
    method write (line 23) | def write(self, obj):
    method flush (line 28) | def flush(self):

FILE: dvs/train.py
  function run_epoch (line 22) | def run_epoch(model, loader, cf, epoch, lr, optimizer=None, is_training=...
  function train (line 150) | def train(args = None):

FILE: dvs/util.py
  function save_train_info (line 11) | def save_train_info(name, checkpoints_dir, cf, model, count, optimizer =...
  function make_dir (line 21) | def make_dir(checkpoints_dir ,cf):
  function get_optimizer (line 38) | def get_optimizer(optimizer, model, init_lr, cf):
  function crop_video (line 45) | def crop_video(in_path, out_path, crop_ratio):
  function norm_flow (line 55) | def norm_flow(flow, h, w):
  class AverageMeter (line 64) | class AverageMeter(object):
    method __init__ (line 65) | def __init__(self):
    method reset (line 68) | def reset(self):
    method update (line 73) | def update(self, val, n=1):

FILE: dvs/warp/rasterizer.py
  function Rasterization (line 10) | def Rasterization(image, grid, get_mesh_only = False):
  function grid_to_triangle (line 73) | def grid_to_triangle(grid):
  function grid_size (line 91) | def grid_size(upper_triangle, lower_triangle, height, width):
  function generate_mesh_grid (line 103) | def generate_mesh_grid(height, width):
  function triangle2mask (line 111) | def triangle2mask(d, height, width): # d: [N x T x 3 x 2]
  function edgefunc (line 144) | def edgefunc(v0, v1, p):

FILE: dvs/warp/read_write.py
  function load_video (line 11) | def load_video(path, save_dir = None, resize = None, length = -1): # N x...
  function video2frame (line 39) | def video2frame(path, resize = None):
  function video2frame_one_seq (line 55) | def video2frame_one_seq(path, save_dir = None, resize = None): # N x H x...
  function save_video (line 79) | def save_video(path,frame_array, fps, size, losses = None, frame_number ...
  function draw_number (line 97) | def draw_number(frame, num, x = 10, y = 10, message = "Frame: "):

FILE: dvs/warp/warping.py
  function warp_video (line 9) | def warp_video(mesh_path, video_path, save_path, losses = None, frame_nu...
  function warpping_rast (line 30) | def warpping_rast(grid_data, frame_array, losses = None):
  function warpping_one_frame_rast (line 37) | def warpping_one_frame_rast(image, grid):

Download .json

Condensed preview — 65 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (304K chars).

[
  {
    "path": ".gitignore",
    "chars": 52,
    "preview": "*.pyc\n.torch\n_ext\n*.o\n_ext/\n*.png\n*.jpg\n*.tar\nlog/*\n"
  },
  {
    "path": "LICENSE",
    "chars": 11358,
    "preview": "\n                                 Apache License\n                           Version 2.0, January 2004\n                  "
  },
  {
    "path": "README.md",
    "chars": 3608,
    "preview": "# Deep Online Fused Video Stabilization\n\n[[Paper]](https://openaccess.thecvf.com/content/WACV2022/papers/Shi_Deep_Online"
  },
  {
    "path": "docs/code-of-conduct.md",
    "chars": 3167,
    "preview": "# Google Open Source Community Guidelines\n\nAt Google, we recognize and celebrate the creativity and collaboration of ope"
  },
  {
    "path": "docs/contributing.md",
    "chars": 1097,
    "preview": "# How to Contribute\n\nWe'd love to accept your patches and contributions to this project. There are\njust a few small guid"
  },
  {
    "path": "dvs/conf/stabilzation.yaml",
    "chars": 1374,
    "preview": "data:\n  exp: 'stabilzation'\n  checkpoints_dir: './checkpoint'\n  log: './log'\n  data_dir: './video'           \n  use_cuda"
  },
  {
    "path": "dvs/conf/stabilzation_train.yaml",
    "chars": 1390,
    "preview": "data:\n  exp: 'stabilzation_train'\n  checkpoints_dir: './checkpoint'\n  log: './log'\n  data_dir: './dataset_release'      "
  },
  {
    "path": "dvs/dataset.py",
    "chars": 13230,
    "preview": "from torch.utils.data import Dataset\nimport os\nimport collections\nfrom gyro import (\n    LoadGyroData, \n    LoadOISData,"
  },
  {
    "path": "dvs/flownet2/LICENSE",
    "chars": 558,
    "preview": "Copyright 2017 NVIDIA CORPORATION\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this "
  },
  {
    "path": "dvs/flownet2/README.md",
    "chars": 5514,
    "preview": "# flownet2-pytorch \n\nPytorch implementation of [FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks](ht"
  },
  {
    "path": "dvs/flownet2/__init__.py",
    "chars": 36,
    "preview": "from .utils import flow_utils, tools"
  },
  {
    "path": "dvs/flownet2/convert.py",
    "chars": 4703,
    "preview": "#!/usr/bin/env python2.7\n\nimport caffe\nfrom caffe.proto import caffe_pb2\nimport sys, os\n\nimport torch\nimport torch.nn as"
  },
  {
    "path": "dvs/flownet2/datasets.py",
    "chars": 15386,
    "preview": "import torch\nimport torch.utils.data as data\n\nimport os, math, random\nfrom os.path import *\nimport numpy as np\n\nfrom glo"
  },
  {
    "path": "dvs/flownet2/install.sh",
    "chars": 340,
    "preview": "#!/bin/bash\ncd ./networks/correlation_package\nrm -rf *_cuda.egg-info build dist __pycache__\npython3 setup.py install --u"
  },
  {
    "path": "dvs/flownet2/losses.py",
    "chars": 2826,
    "preview": "'''\r\nPortions of this code copyright 2017, Clement Pinard\r\n'''\r\n\r\n# freda (todo) : adversarial loss \r\n\r\nimport torch\r\nim"
  },
  {
    "path": "dvs/flownet2/main.py",
    "chars": 11058,
    "preview": "#!/usr/bin/env python\nimport os\nos.environ[\"CUDA_VISIBLE_DEVICES\"] = \"0\"\nimport torch\nimport torch.nn as nn\nfrom torch.u"
  },
  {
    "path": "dvs/flownet2/models.py",
    "chars": 19918,
    "preview": "import torch\r\nimport torch.nn as nn\r\nfrom torch.nn import init\r\n\r\nimport math\r\nimport numpy as np\r\n\r\ntry:\r\n    from netw"
  },
  {
    "path": "dvs/flownet2/networks/FlowNetC.py",
    "chars": 4757,
    "preview": "import torch\nimport torch.nn as nn\nfrom torch.nn import init\n\nimport math\nimport numpy as np\n\nfrom .correlation_package."
  },
  {
    "path": "dvs/flownet2/networks/FlowNetFusion.py",
    "chars": 2329,
    "preview": "import torch\nimport torch.nn as nn\nfrom torch.nn import init\n\nimport math\nimport numpy as np\n\nfrom .submodules import *\n"
  },
  {
    "path": "dvs/flownet2/networks/FlowNetS.py",
    "chars": 3652,
    "preview": "'''\nPortions of this code copyright 2017, Clement Pinard\n'''\n\nimport torch\nimport torch.nn as nn\nfrom torch.nn import in"
  },
  {
    "path": "dvs/flownet2/networks/FlowNetSD.py",
    "chars": 4187,
    "preview": "import torch\nimport torch.nn as nn\nfrom torch.nn import init\n\nimport math\nimport numpy as np\n\nfrom .submodules import *\n"
  },
  {
    "path": "dvs/flownet2/networks/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "dvs/flownet2/networks/channelnorm_package/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "dvs/flownet2/networks/channelnorm_package/channelnorm.py",
    "chars": 1080,
    "preview": "from torch.autograd import Function, Variable\nfrom torch.nn.modules.module import Module\nimport channelnorm_cuda\n\nclass "
  },
  {
    "path": "dvs/flownet2/networks/channelnorm_package/channelnorm_cuda.cc",
    "chars": 722,
    "preview": "#include <torch/torch.h>\n#include <ATen/ATen.h>\n\n#include \"channelnorm_kernel.cuh\"\n\nint channelnorm_cuda_forward(\n    at"
  },
  {
    "path": "dvs/flownet2/networks/channelnorm_package/channelnorm_kernel.cu",
    "chars": 6061,
    "preview": "#include <ATen/ATen.h>\n#include <ATen/Context.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#include \"channelnorm_kernel.cuh\"\n\n"
  },
  {
    "path": "dvs/flownet2/networks/channelnorm_package/channelnorm_kernel.cuh",
    "chars": 298,
    "preview": "#pragma once\n\n#include <ATen/ATen.h>\n\nvoid channelnorm_kernel_forward(\n    at::Tensor& input1,\n    at::Tensor& output, \n"
  },
  {
    "path": "dvs/flownet2/networks/channelnorm_package/setup.py",
    "chars": 725,
    "preview": "#!/usr/bin/env python3\nimport os\nimport torch\n\nfrom setuptools import setup\nfrom torch.utils.cpp_extension import BuildE"
  },
  {
    "path": "dvs/flownet2/networks/correlation_package/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "dvs/flownet2/networks/correlation_package/correlation.py",
    "chars": 2145,
    "preview": "import torch\nfrom torch.nn.modules.module import Module\nfrom torch.autograd import Function\nimport correlation_cuda\n\ncla"
  },
  {
    "path": "dvs/flownet2/networks/correlation_package/correlation_cuda.cc",
    "chars": 6611,
    "preview": "#include <torch/torch.h>\n#include <ATen/ATen.h>\n#include <ATen/Context.h>\n#include <ATen/cuda/CUDAContext.h>\n#include <s"
  },
  {
    "path": "dvs/flownet2/networks/correlation_package/correlation_cuda_kernel.cu",
    "chars": 19919,
    "preview": "#include <stdio.h>\n\n#include \"correlation_cuda_kernel.cuh\"\n\n#define CUDA_NUM_THREADS 1024\n#define THREADS_PER_BLOCK 32\n#"
  },
  {
    "path": "dvs/flownet2/networks/correlation_package/correlation_cuda_kernel.cuh",
    "chars": 1409,
    "preview": "#pragma once\n\n#include <ATen/ATen.h>\n#include <ATen/Context.h>\n#include <cuda_runtime.h>\n\nint correlation_forward_cuda_k"
  },
  {
    "path": "dvs/flownet2/networks/correlation_package/setup.py",
    "chars": 791,
    "preview": "#!/usr/bin/env python3\nimport os\nimport torch\n\nfrom setuptools import setup, find_packages\nfrom torch.utils.cpp_extensio"
  },
  {
    "path": "dvs/flownet2/networks/resample2d_package/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "dvs/flownet2/networks/resample2d_package/resample2d.py",
    "chars": 1597,
    "preview": "from torch.nn.modules.module import Module\nfrom torch.autograd import Function, Variable\nimport resample2d_cuda\n\nclass R"
  },
  {
    "path": "dvs/flownet2/networks/resample2d_package/resample2d_cuda.cc",
    "chars": 852,
    "preview": "#include <ATen/ATen.h>\n#include <torch/torch.h>\n\n#include \"resample2d_kernel.cuh\"\n\nint resample2d_cuda_forward(\n    at::"
  },
  {
    "path": "dvs/flownet2/networks/resample2d_package/resample2d_kernel.cu",
    "chars": 13033,
    "preview": "#include <ATen/ATen.h>\n#include <ATen/Context.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#define CUDA_NUM_THREADS 512 \n#defi"
  },
  {
    "path": "dvs/flownet2/networks/resample2d_package/resample2d_kernel.cuh",
    "chars": 391,
    "preview": "#pragma once\n\n#include <ATen/ATen.h>\n\nvoid resample2d_kernel_forward(\n    at::Tensor& input1,\n    at::Tensor& input2,\n  "
  },
  {
    "path": "dvs/flownet2/networks/resample2d_package/setup.py",
    "chars": 767,
    "preview": "#!/usr/bin/env python3\nimport os\nimport torch\n\nfrom setuptools import setup\nfrom torch.utils.cpp_extension import BuildE"
  },
  {
    "path": "dvs/flownet2/networks/submodules.py",
    "chars": 2820,
    "preview": "# freda (todo) : \r\n\r\nimport torch.nn as nn\r\nimport torch\r\nimport numpy as np \r\n\r\ndef conv(batchNorm, in_planes, out_plan"
  },
  {
    "path": "dvs/flownet2/run.sh",
    "chars": 201,
    "preview": "#!/bin/bash\npython main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \\\n\t--inference_dataset_ro"
  },
  {
    "path": "dvs/flownet2/run_release.sh",
    "chars": 424,
    "preview": "#!/bin/bash\npython main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \\\n\t--inference_dataset_ro"
  },
  {
    "path": "dvs/flownet2/utils/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "dvs/flownet2/utils/flow_utils.py",
    "chars": 5106,
    "preview": "import numpy as np\nimport matplotlib.pyplot as plt\nimport os.path\n\nTAG_CHAR = np.array([202021.25], np.float32)\n\ndef rea"
  },
  {
    "path": "dvs/flownet2/utils/frame_utils.py",
    "chars": 531,
    "preview": "import numpy as np\nfrom os.path import *\nfrom imageio import imread\nfrom . import flow_utils \n\ndef read_gen(file_name):\n"
  },
  {
    "path": "dvs/flownet2/utils/param_utils.py",
    "chars": 6878,
    "preview": "import torch\nimport torch.nn as nn\nimport numpy as np\n\ndef parse_flownetc(modules, weights, biases):\n    keys = [\n    'c"
  },
  {
    "path": "dvs/flownet2/utils/tools.py",
    "chars": 5388,
    "preview": "# freda (todo) : \n\nimport os, time, sys, math\nimport subprocess, shutil\nfrom os.path import *\nimport numpy as np\nfrom in"
  },
  {
    "path": "dvs/gyro/__init__.py",
    "chars": 1009,
    "preview": "from .gyro_function import (\n    GetGyroAtTimeStamp,\n    QuaternionProduct,\n    QuaternionReciprocal,\n    ConvertQuatern"
  },
  {
    "path": "dvs/gyro/gyro_function.py",
    "chars": 22582,
    "preview": "import numpy as np\nfrom numpy import linalg as LA\nimport matplotlib.pyplot as plt\nimport torch\nfrom torch.autograd impor"
  },
  {
    "path": "dvs/gyro/gyro_io.py",
    "chars": 5633,
    "preview": "import numpy as np\nfrom numpy import linalg as LA\nimport matplotlib.pyplot as plt\nimport scipy.io as sio\nfrom .gyro_func"
  },
  {
    "path": "dvs/inference.py",
    "chars": 8597,
    "preview": "import os\nimport sys\nimport torch\nimport torchvision\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nimport t"
  },
  {
    "path": "dvs/load_frame_sensor_data.py",
    "chars": 4283,
    "preview": "import os\nos.environ[\"CUDA_VISIBLE_DEVICES\"] = \"0\"\nimport sys\nimport torch\nimport torchvision\nimport torch.nn as nn\nfrom"
  },
  {
    "path": "dvs/loss.py",
    "chars": 7532,
    "preview": "import torch\nimport numpy as np\nfrom torch.autograd import Variable\nimport operator\nimport torch.nn.functional as F\nimpo"
  },
  {
    "path": "dvs/metrics.py",
    "chars": 11587,
    "preview": "import os\nimport sys\nimport numpy as np\nimport cv2\nimport math\nimport pdb\nimport matplotlib.pyplot as plt\nfrom printer i"
  },
  {
    "path": "dvs/model.py",
    "chars": 15138,
    "preview": "import math\nimport torch\nfrom collections import OrderedDict\n\nimport torch.nn as nn\nimport numpy as np\nimport util\nimpor"
  },
  {
    "path": "dvs/printer.py",
    "chars": 812,
    "preview": "import sys\n\nclass Printer(object):\n    def __init__(self, *files):\n        self.files = files\n        \n    #Redirect Pri"
  },
  {
    "path": "dvs/requirements.txt",
    "chars": 187,
    "preview": "colorama==0.4.4\nffmpeg==1.4\nimageio==2.9.0\nmatplotlib==3.3.4\nopencv-contrib-python==4.5.1.48\nopencv-python==4.5.1.48\npyt"
  },
  {
    "path": "dvs/train.py",
    "chars": 9694,
    "preview": "import os\nimport sys\nimport torch\nimport torchvision\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nimport t"
  },
  {
    "path": "dvs/util.py",
    "chars": 2687,
    "preview": "import os\nimport torch\nimport cv2\nfrom itertools import chain\nfrom warp import load_video, save_video\nimport numpy as np"
  },
  {
    "path": "dvs/warp/__init__.py",
    "chars": 131,
    "preview": "from .warping import (\n    warp_video\n    )\nfrom .read_write import (\n    save_video,\n    load_video,\n    video2frame_on"
  },
  {
    "path": "dvs/warp/rasterizer.py",
    "chars": 6666,
    "preview": "import numpy as np\nimport matplotlib.pyplot as plt\nfrom numpy import array\nimport torch\nimport cv2\nimport time\n\ndevice ="
  },
  {
    "path": "dvs/warp/read_write.py",
    "chars": 4172,
    "preview": "import numpy as np\nimport cv2\nimport os\nfrom PIL import Image, ImageDraw, ImageFont\nimport matplotlib.pyplot as plt\nimpo"
  },
  {
    "path": "dvs/warp/warping.py",
    "chars": 1625,
    "preview": "import numpy as np\nfrom .read_write import load_video, save_video\nimport torch\nimport cv2\nfrom .rasterizer import Raster"
  }
]

// ... and 1 more files (download for full content)

About this extraction

This page contains the full source code of the googleinterns/deep-stabilization GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 65 files (42.8 MB), approximately 79.8k tokens, and a symbol index with 342 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Extract another repo