Repository: googleinterns/deep-stabilization
Branch: master
Commit: 7159c09d21ae
Files: 65
Total size: 42.8 MB
Directory structure:
gitextract__lkvtuhi/
├── .gitignore
├── LICENSE
├── README.md
├── docs/
│   ├── code-of-conduct.md
│   └── contributing.md
└── dvs/
    ├── checkpoint/
    │   └── stabilzation/
    │       └── stabilzation_last.checkpoint
    ├── conf/
    │   ├── stabilzation.yaml
    │   └── stabilzation_train.yaml
    ├── dataset.py
    ├── flownet2/
    │   ├── LICENSE
    │   ├── README.md
    │   ├── __init__.py
    │   ├── convert.py
    │   ├── datasets.py
    │   ├── install.sh
    │   ├── losses.py
    │   ├── main.py
    │   ├── models.py
    │   ├── networks/
    │   │   ├── FlowNetC.py
    │   │   ├── FlowNetFusion.py
    │   │   ├── FlowNetS.py
    │   │   ├── FlowNetSD.py
    │   │   ├── __init__.py
    │   │   ├── channelnorm_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── channelnorm.py
    │   │   │   ├── channelnorm_cuda.cc
    │   │   │   ├── channelnorm_kernel.cu
    │   │   │   ├── channelnorm_kernel.cuh
    │   │   │   └── setup.py
    │   │   ├── correlation_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── correlation.py
    │   │   │   ├── correlation_cuda.cc
    │   │   │   ├── correlation_cuda_kernel.cu
    │   │   │   ├── correlation_cuda_kernel.cuh
    │   │   │   └── setup.py
    │   │   ├── resample2d_package/
    │   │   │   ├── __init__.py
    │   │   │   ├── resample2d.py
    │   │   │   ├── resample2d_cuda.cc
    │   │   │   ├── resample2d_kernel.cu
    │   │   │   ├── resample2d_kernel.cuh
    │   │   │   └── setup.py
    │   │   └── submodules.py
    │   ├── run.sh
    │   ├── run_release.sh
    │   └── utils/
    │       ├── __init__.py
    │       ├── flow_utils.py
    │       ├── frame_utils.py
    │       ├── param_utils.py
    │       └── tools.py
    ├── gyro/
    │   ├── __init__.py
    │   ├── gyro_function.py
    │   └── gyro_io.py
    ├── inference.py
    ├── load_frame_sensor_data.py
    ├── loss.py
    ├── metrics.py
    ├── model.py
    ├── printer.py
    ├── requirements.txt
    ├── train.py
    ├── util.py
    └── warp/
        ├── __init__.py
        ├── rasterizer.py
        ├── read_write.py
        └── warping.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
*.pyc
.torch
_ext
*.o
_ext/
*.png
*.jpg
*.tar
log/*
================================================
FILE: LICENSE
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: README.md
================================================
# Deep Online Fused Video Stabilization
[[Paper]](https://openaccess.thecvf.com/content/WACV2022/papers/Shi_Deep_Online_Fused_Video_Stabilization_WACV_2022_paper.pdf)[[Supplementary]](https://zhmeishi.github.io/dvs/paper/dvs_supp.pdf) [[Project Page]](https://zhmeishi.github.io/dvs/) [[Dataset]](https://storage.googleapis.com/dataset_release/all.zip) [[Our Result]](https://storage.googleapis.com/dataset_release/inference_result_release.zip) [[More Results]](https://zhmeishi.github.io/dvs/supp/results.html)
This repository contains the PyTorch implementation of the method described in the paper "Deep Online Fused Video Stabilization".
## Environment Setting
Python version >= 3.6
PyTorch >= 1.0.0 with CUDA support (installation guide [here](https://pytorch.org/get-started/locally/))
Install the other required packages:
```
cd dvs
pip install -r requirements.txt --ignore-installed
```
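Before moving on, it can help to confirm that the interpreter and the PyTorch build meet the requirements above. The following is a minimal sketch, assuming only that PyTorch is importable as `torch`; the helper name `check_environment` is hypothetical and not part of this repository.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 6)):
    """Report whether the interpreter and PyTorch look usable (hypothetical helper)."""
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False`, the `use_cuda: true` setting in the config files will likely need attention before running inference or training.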
## Data Preparation
Download the sample video [here](https://drive.google.com/file/d/1PpF3-6BbQKy9fldjIfwa5AlbtQflx3sG/view?usp=sharing).
Uncompress it and place the *video* folder under the *dvs* folder.
```
python load_frame_sensor_data.py
```
Demo of curve visualization:
The **gyro/OIS curve visualization** can be found at *dvs/video/s_114_outdoor_running_trail_daytime/ControlCam_20200930_104820_real.jpg*.
## FlowNet2 Preparation
Note: the sample data from Data Preparation already includes the optical flow result for one test video. To generate optical flow for all test videos, follow the [FlowNet2 official website](https://github.com/NVIDIA/flownet2-pytorch) and the guide below. Otherwise, you can skip this section.
Note: FlowNet2 installation is tricky. Please use Python 3.6 and PyTorch 1.0.0. More details are [here](https://github.com/NVIDIA/flownet2-pytorch/issues/156), or contact us with any questions.
Download the FlowNet2 model *FlowNet2_checkpoint.pth.tar* [here](https://drive.google.com/file/d/1hF8vS6YeHkx3j2pfCeQqqZGwA_PJq_Da/view). Move it into the *dvs/flownet2* folder.
```
python warp/read_write.py # video2frames
cd flownet2
bash install.sh # install package
bash run.sh # generate optical flow file for dataset
```
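The generated optical flow files are later read in *dataset.py* through `flow_utils.readFlow`. Assuming they follow the Middlebury `.flo` layout that FlowNet2 tooling conventionally uses (a float32 magic number 202021.25, then int32 width and height, then interleaved float32 (u, v) values in row-major order), a file can be inspected with the standard library alone. This is only a sketch under that assumption; `write_flo` and `read_flo` are hypothetical helpers, not functions from this repository.

```python
import struct

FLO_MAGIC = 202021.25  # sanity-check value at the start of a Middlebury .flo file


def write_flo(path, width, height, uv_values):
    """Write interleaved (u, v) float32 values, row-major, to a .flo file."""
    if len(uv_values) != width * height * 2:
        raise ValueError("expected width * height * 2 flow values")
    with open(path, "wb") as f:
        f.write(struct.pack("<f", FLO_MAGIC))
        f.write(struct.pack("<ii", width, height))
        f.write(struct.pack("<%df" % len(uv_values), *uv_values))


def read_flo(path):
    """Return (width, height, flat list of interleaved u, v values)."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<f", f.read(4))
        if magic != FLO_MAGIC:
            raise ValueError("not a .flo file: bad magic number")
        width, height = struct.unpack("<ii", f.read(8))
        count = width * height * 2
        values = struct.unpack("<%df" % count, f.read(4 * count))
    return width, height, list(values)
```

Reading the header of one generated file this way is a quick check that the flow resolution matches what you expect before starting inference.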
## Running Inference
```
python inference.py
python metrics.py
```
The loss and metric information is printed in the terminal. The metric numbers can differ slightly depending on the OpenCV and PyTorch versions.
The results are saved under *dvs/test/stabilzation*.
In *s_114_outdoor_running_trail_daytime.jpg*, the blue curve is the output of our model and the green curve is the input.
*s_114_outdoor_running_trail_daytime_stab.mp4* is the uncropped stabilized video.
*s_114_outdoor_running_trail_daytime_stab_crop.mp4* is the cropped stabilized video. Note that the cropped video is generated only after running the metrics code.
## Training
Download the dataset for training and testing [here](https://storage.googleapis.com/dataset_release/all.zip).
Uncompress *all.zip* and move the *dataset_release* folder under the *dvs* folder.
Follow the FlowNet2 Preparation section.
```
python warp/read_write.py --dir_path ./dataset_release # video2frames
cd flownet2
bash run_release.sh # generate optical flow file for dataset
```
Run the training code.
```
python train.py
```
The model is saved in *checkpoint/stabilzation_train*.
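The sizes of the samples that `Dataset_Gyro` produces follow from the `data` block of *conf/stabilzation_train.yaml*. The sketch below reproduces the arithmetic from *dataset.py* with the default values hard-coded rather than parsed from the YAML file; the function name `derived_shapes` is hypothetical.

```python
NS_PER_MS = 1000000  # dataset.py multiplies the millisecond settings by this factor


def derived_shapes(time_train_ms=2000, sample_freq_ms=40, number_real=10, unit_size=4):
    """Mirror the size computations in Dataset_Gyro for the default config."""
    time_train_ns = time_train_ms * NS_PER_MS
    sample_freq_ns = sample_freq_ms * NS_PER_MS
    # Number of consecutive frames in one training sample.
    number_train = time_train_ns // sample_freq_ns
    # Relative quaternions sampled at t - number_real*dt, ..., t, ..., t + number_real*dt,
    # each with unit_size components, flattened per frame.
    features_per_frame = (2 * number_real + 1) * unit_size
    return number_train, features_per_frame


print(derived_shapes())  # (50, 84)
```

With the defaults, each training sample therefore covers 50 frames, and each frame contributes 84 real-gyro input values.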
## Citation
If you use this code or dataset for your research, please cite our paper.
```
@inproceedings{shi2022deep,
title={Deep Online Fused Video Stabilization},
author={Shi, Zhenmei and Shi, Fuhao and Lai, Wei-Sheng and Liang, Chia-Kai and Liang, Yingyu},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={1250--1258},
year={2022}
}
```
================================================
FILE: docs/code-of-conduct.md
================================================
# Google Open Source Community Guidelines
At Google, we recognize and celebrate the creativity and collaboration of open
source contributors and the diversity of skills, experiences, cultures, and
opinions they bring to the projects and communities they participate in.
Every one of Google's open source projects and communities are inclusive
environments, based on treating all individuals respectfully, regardless of
gender identity and expression, sexual orientation, disabilities,
neurodiversity, physical appearance, body size, ethnicity, nationality, race,
age, religion, or similar personal characteristic.
We value diverse opinions, but we value respectful behavior more.
Respectful behavior includes:
* Being considerate, kind, constructive, and helpful.
* Not engaging in demeaning, discriminatory, harassing, hateful, sexualized, or
physically threatening behavior, speech, and imagery.
* Not engaging in unwanted physical contact.
Some Google open source projects [may adopt][] an explicit project code of
conduct, which may have additional detailed expectations for participants. Most
of those projects will use our [modified Contributor Covenant][].
[may adopt]: https://opensource.google/docs/releasing/preparing/#conduct
[modified Contributor Covenant]: https://opensource.google/docs/releasing/template/CODE_OF_CONDUCT/
## Resolve peacefully
We do not believe that all conflict is necessarily bad; healthy debate and
disagreement often yields positive results. However, it is never okay to be
disrespectful.
If you see someone behaving disrespectfully, you are encouraged to address the
behavior directly with those involved. Many issues can be resolved quickly and
easily, and this gives people more control over the outcome of their dispute.
If you are unable to resolve the matter for any reason, or if the behavior is
threatening or harassing, report it. We are dedicated to providing an
environment where participants feel welcome and safe.
## Reporting problems
Some Google open source projects may adopt a project-specific code of conduct.
In those cases, a Google employee will be identified as the Project Steward,
who will receive and handle reports of code of conduct violations. In the event
that a project hasn’t identified a Project Steward, you can report problems by
emailing opensource@google.com.
We will investigate every complaint, but you may not receive a direct response.
We will use our discretion in determining when and how to follow up on reported
incidents, which may range from not taking action to permanent expulsion from
the project and project-sponsored spaces. We will notify the accused of the
report and provide them an opportunity to discuss it before any action is
taken. The identity of the reporter will be omitted from the details of the
report supplied to the accused. In potentially harmful situations, such as
ongoing harassment or threats to anyone's safety, we may take action without
notice.
*This document was adapted from the [IndieWeb Code of Conduct][] and can also
be found at <https://opensource.google/conduct/>.*
[IndieWeb Code of Conduct]: https://indieweb.org/code-of-conduct
================================================
FILE: docs/contributing.md
================================================
# How to Contribute
We'd love to accept your patches and contributions to this project. There are
just a few small guidelines you need to follow.
## Contributor License Agreement
Contributions to this project must be accompanied by a Contributor License
Agreement. You (or your employer) retain the copyright to your contribution;
this simply gives us permission to use and redistribute your contributions as
part of the project. Head over to <https://cla.developers.google.com/> to see
your current agreements on file or to sign a new one.
You generally only need to submit a CLA once, so if you've already submitted one
(even if it was for a different project), you probably don't need to do it
again.
## Code reviews
All submissions, including submissions by project members, require review. We
use GitHub pull requests for this purpose. Consult
[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more
information on using pull requests.
## Community Guidelines
This project follows [Google's Open Source Community
Guidelines](https://opensource.google/conduct/).
================================================
FILE: dvs/checkpoint/stabilzation/stabilzation_last.checkpoint
================================================
[File too large to display: 42.5 MB]
================================================
FILE: dvs/conf/stabilzation.yaml
================================================
data:
  exp: 'stabilzation'
  checkpoints_dir: './checkpoint'
  log: './log'
  data_dir: './video'
  use_cuda: true
  batch_size: 16
  resize_ratio: 0.25
  number_real: 10
  number_virtual: 2
  time_train: 2000 # ms
  sample_freq: 40 # ms
  channel_size: 1
  num_workers: 16 # num_workers for data_loader
model:
  load_model: null
  cnn:
    activate_function: relu # sigmoid, relu, tanh, quadratic
    batch_norm: true
    gap: false
    layers:
  rnn:
    layers:
      - - 512
        - true
      - - 512
        - true
  fc:
    activate_function: relu
    batch_norm: false # (batch_norm and drop_out) is False
    layers:
      - - 256
        - true
      - - 4 # last layer should be equal to nr_class
        - true
    drop_out: 0
train:
  optimizer: "adam" # adam or sgd
  momentum: 0.9 # for sgd
  decay_epoch: null
  epoch: 400
  snapshot: 2
  init_lr: 0.0001
  lr_decay: 0.5
  lr_step: 200 # if > 0 decay_epoch should be null
  seed: 1
  weight_decay: 0.0001
  clip_norm: False
  init: "xavier_uniform" # xavier_uniform or xavier_normal
loss:
  follow: 10
  angle: 1
  smooth: 10 #10
  c2_smooth: 200 #20
  undefine: 2.0
  opt: 0.1
  stay: 0
================================================
FILE: dvs/conf/stabilzation_train.yaml
================================================
data:
  exp: 'stabilzation_train'
  checkpoints_dir: './checkpoint'
  log: './log'
  data_dir: './dataset_release'
  use_cuda: true
  batch_size: 16
  resize_ratio: 0.25
  number_real: 10
  number_virtual: 2
  time_train: 2000 # ms
  sample_freq: 40 # ms
  channel_size: 1
  num_workers: 16 # num_workers for data_loader
model:
  load_model: null
  cnn:
    activate_function: relu # sigmoid, relu, tanh, quadratic
    batch_norm: true
    gap: false
    layers:
  rnn:
    layers:
      - - 512
        - true
      - - 512
        - true
  fc:
    activate_function: relu
    batch_norm: false # (batch_norm and drop_out) is False
    layers:
      - - 256
        - true
      - - 4 # last layer should be equal to nr_class
        - true
    drop_out: 0
train:
  optimizer: "adam" # adam or sgd
  momentum: 0.9 # for sgd
  decay_epoch: null
  epoch: 400
  snapshot: 2
  init_lr: 0.0001
  lr_decay: 0.5
  lr_step: 200 # if > 0 decay_epoch should be null
  seed: 1
  weight_decay: 0.0001
  clip_norm: False
  init: "xavier_uniform" # xavier_uniform or xavier_normal
loss:
  follow: 10
  angle: 1
  smooth: 10 #10
  c2_smooth: 200 #20
  undefine: 2.0
  opt: 0.1
  stay: 0
================================================
FILE: dvs/dataset.py
================================================
from torch.utils.data import Dataset
import os
import collections
from gyro import (
    LoadGyroData,
    LoadOISData,
    LoadFrameData,
    GetGyroAtTimeStamp,
    get_static,
    GetMetadata,
    GetProjections,
    train_GetGyroAtTimeStamp,
    QuaternionProduct,
    QuaternionReciprocal,
    FindOISAtTimeStamp,
    norm_quat
)
import random
import numpy as np
import torchvision.transforms as transforms
import torch
from flownet2 import flow_utils
from scipy import ndimage, misc
from numpy import linalg as LA

def get_data_loader(cf, no_flo = False):
    size = cf["data"]["batch_size"]
    num_workers = cf["data"]["num_workers"]
    train_data, test_data = get_dataset(cf, no_flo)
    trainloader = torch.utils.data.DataLoader(train_data, batch_size=size,shuffle=True, pin_memory=True, num_workers=num_workers)
    testloader = torch.utils.data.DataLoader(test_data, batch_size=size,shuffle=False, pin_memory=True, num_workers=num_workers)
    return trainloader,testloader

def get_dataset(cf, no_flo = False):
    resize_ratio = cf["data"]["resize_ratio"]
    train_transform, test_transform = _data_transforms()
    train_path = os.path.join(cf["data"]["data_dir"], "training")
    test_path = os.path.join(cf["data"]["data_dir"], "test")
    if not os.path.exists(train_path):
        train_path = cf["data"]["data_dir"]
    if not os.path.exists(test_path):
        test_path = cf["data"]["data_dir"]
    train_data = Dataset_Gyro(
        train_path, sample_freq = cf["data"]["sample_freq"]*1000000, number_real = cf["data"]["number_real"],
        time_train = cf["data"]["time_train"]*1000000, transform = train_transform, resize_ratio = resize_ratio, no_flo = no_flo)
    test_data = Dataset_Gyro(
        test_path, sample_freq = cf["data"]["sample_freq"]*1000000, number_real = cf["data"]["number_real"],
        time_train = cf["data"]["time_train"]*1000000, transform = test_transform, resize_ratio = resize_ratio, no_flo = no_flo)
    return train_data, test_data

def get_inference_data_loader(cf, data_path, no_flo = False):
    test_data = get_inference_dataset(cf, data_path, no_flo)
    testloader = torch.utils.data.DataLoader(test_data, batch_size=1,shuffle=False, pin_memory=True, num_workers=1)
    return testloader

def get_inference_dataset(cf, data_path, no_flo = False):
    resize_ratio = cf["data"]["resize_ratio"]
    _, test_transform = _data_transforms()
    test_data = Dataset_Gyro(
        data_path, sample_freq = cf["data"]["sample_freq"]*1000000, number_real = cf["data"]["number_real"],
        time_train = cf["data"]["time_train"]*1000000, transform = test_transform, resize_ratio = resize_ratio,
        inference_only = True, no_flo = no_flo)
    return test_data

def _data_transforms():
    test_transform = transforms.Compose(
        [transforms.ToTensor(),
        ])
    train_transform = transforms.Compose(
        [transforms.ToTensor(),
        ])
    return train_transform, test_transform

class DVS_data():
    def __init__(self):
        self.gyro = None
        self.ois = None
        self.frame = None
        self.length = 0
        self.flo_path = None
        self.flo_shape = None
        self.flo_back_path = None

class Dataset_Gyro(Dataset):
    def __init__(self, path, sample_freq = 33*1000000, number_real = 10, time_train = 2000*1000000, \
        transform = None, inference_only = False, no_flo = False, resize_ratio = 1):
        r"""
        Arguments:
            sample_freq: sampling interval in ns; real quaternions are sampled over
                [t-sample_freq*number_real, t+sample_freq*number_real]
            number_real: number of real gyro samples on each side of t
            time_train: duration of one training sample in ns
        """
        self.sample_freq = sample_freq
        self.number_real = number_real
        self.no_flo = no_flo
        self.resize_ratio = resize_ratio
        self.static_options = get_static()
        self.inference_only = inference_only

        self.ois_ratio = np.array([self.static_options["crop_window_width"] / self.static_options["width"], \
            self.static_options["crop_window_height"] / self.static_options["height"]]) * 0.01
        self.unit_size = 4

        if inference_only:
            self.length = 1
            self.data = [self.process_one_video(path)]
            self.number_train = self.data[0].length
            return

        self.time_train = time_train
        self.number_train = time_train//self.sample_freq

        self.data_name = sorted(os.listdir(path))
        self.length = len(self.data_name)
        self.data = []
        for i in range(self.length):
            self.data.append(self.process_one_video(os.path.join(path,self.data_name[i])))

    def process_one_video(self, path):
        dvs_data = DVS_data()
        files = sorted(os.listdir(path))
        print(path)
        for f in files:
            file_path = os.path.join(path,f)
            if "gimbal" in file_path.lower():
                continue
            if "frame" in f and "txt" in f:
                dvs_data.frame = LoadFrameData(file_path)
                print("frame:", dvs_data.frame.shape, end=" ")
            elif "gyro" in f:
                dvs_data.gyro = LoadGyroData(file_path)
                dvs_data.gyro = preprocess_gyro(dvs_data.gyro)
                print("gyro:", dvs_data.gyro.shape, end=" ")
            elif "ois" in f and "txt" in f:
                dvs_data.ois = LoadOISData(file_path)
                print("ois:", dvs_data.ois.shape, end=" ")
            elif f == "flo":
                dvs_data.flo_path, dvs_data.flo_shape = LoadFlow(file_path)
                print("flo_path:", len(dvs_data.flo_path), end=" ")
                print("flo_shape:", dvs_data.flo_shape, end=" ")
            elif f == "flo_back":
                dvs_data.flo_back_path, _ = LoadFlow(file_path)

        print()
        if dvs_data.flo_path is not None:
            dvs_data.length = min(dvs_data.frame.shape[0] - 1, len(dvs_data.flo_path))
        else:
            dvs_data.length = dvs_data.frame.shape[0] - 1
        return dvs_data

    def generate_quaternions(self, dvs_data):
        first_id = random.randint(0, dvs_data.length - self.number_train) + 1 # skip the first frame

        sample_data = np.zeros((self.number_train, 2 * self.number_real + 1, self.unit_size), dtype=np.float32)
        sample_ois = np.zeros((self.number_train, 2), dtype=np.float32)

        sample_time = np.zeros((self.number_train+1), dtype=np.float32)
        sample_time[0] = get_timestamp(dvs_data.frame, first_id - 1)

        real_postion = np.zeros((self.number_train, 4), dtype=np.float32)

        time_start = sample_time[0]

        for i in range(self.number_train):
            sample_time[i+1] = get_timestamp(dvs_data.frame, first_id + i)
            real_postion[i] = GetGyroAtTimeStamp(dvs_data.gyro, sample_time[i+1] - self.sample_freq)
            sample_ois[i] = self.get_ois_at_timestamp(dvs_data.ois, sample_time[i+1])
            for j in range(-self.number_real, self.number_real+1):
                index = j + self.number_real
                time_stamp = sample_time[i+1] + self.sample_freq * j
                sample_data[i, index] = self.get_data_at_timestamp(dvs_data.gyro, dvs_data.ois, time_stamp, real_postion[i])

        sample_data = np.reshape(sample_data, (self.number_train, (2*self.number_real+1) * self.unit_size))
        return sample_data, sample_time, first_id, real_postion, sample_ois

    def load_flo(self, idx, first_id):
        shape = self.data[idx].flo_shape
        h, w = shape[0], shape[1]
        flo = np.zeros((self.number_train, h, w, 2))
        flo_back = np.zeros((self.number_train, h, w, 2))

        for i in range(self.number_train):
            frame_id = i + first_id
            f = flow_utils.readFlow(self.data[idx].flo_path[frame_id-1]).astype(np.float32)
            flo[i] = f

            f_b = flow_utils.readFlow(self.data[idx].flo_back_path[frame_id-1]).astype(np.float32)
            flo_back[i] = f_b

        return flo, flo_back

    def load_real_projections(self, idx, first_id):
        real_projections = np.zeros((self.number_train + 1, self.static_options["num_grid_rows"], 3, 3))
        for i in range(self.number_train + 1):
            frame_id = i + first_id
            metadata = GetMetadata(self.data[idx].frame, frame_id - 1)
            real_projections[i] = np.array(GetProjections(self.static_options, metadata, self.data[idx].gyro, np.zeros(self.data[idx].ois.shape), no_shutter = True))
        return real_projections

    def __getitem__(self, idx):
        inputs, times, first_id, real_postion, ois = self.generate_quaternions(self.data[idx])
        real_projections = self.load_real_projections(idx, first_id)
        if self.no_flo:
            flo, flo_back = 0, 0
        else:
            flo, flo_back = self.load_flo(idx, first_id)
        return inputs, times, flo, flo_back, real_projections, real_postion, ois, idx

    def __len__(self):
        return self.length

    def get_virtual_data(self, virtual_queue, real_queue_idx, pre_times, cur_times, time_start, batch_size, number_virtual, quat_t_1):
        # virtual_queue: [batch_size, num, 5 (timestamp, quats)]
        # Euler angle,
        # delta R angular velocity [Q't-1, Q't-2]
        # output virtual angular velocity, x, x*dtime => deltaQt
        virtual_data = np.zeros((batch_size, number_virtual, 4), dtype=np.float32)
        vt_1 = np.zeros((batch_size, 4), dtype=np.float32)
        quat_t_1 = quat_t_1.numpy()
        for i in range(batch_size):
            sample_time = cur_times[i]
            for j in range(number_virtual):
                time_stamp = sample_time - self.sample_freq * (number_virtual - j)
                virtual_data[i, j] = get_virtual_at_timestamp(virtual_queue[i], self.data[real_queue_idx[i]].gyro, time_stamp, time_start[i], quat_t_1[i])
            vt_1[i] = get_virtual_at_timestamp(virtual_queue[i], self.data[real_queue_idx[i]].gyro, pre_times[i], time_start[i], None)
        virtual_data = np.reshape(virtual_data, (batch_size, number_virtual * 4))
        return torch.tensor(virtual_data, dtype=torch.float), torch.tensor(vt_1, dtype=torch.float)

    def update_virtual_queue(self, batch_size, virtual_queue, out, times):
        virtual_data = np.zeros((batch_size, 5))
        virtual_data[:,0] = times
        virtual_data[:, 1:] = out
        virtual_data = np.expand_dims(virtual_data, axis = 1)

        if None in virtual_queue:
            virtual_queue = virtual_data
        else:
            virtual_queue = np.concatenate((virtual_queue, virtual_data), axis = 1)
        return virtual_queue

    def random_init_virtual_queue(self, batch_size, real_postion, times):
        virtual_queue = np.zeros((batch_size, 3, 5))
        virtual_queue[:, 2, 0] = times - 0.1 * self.sample_freq
        virtual_queue[:, 1, 0] = times - 1.1 * self.sample_freq
        virtual_queue[:, 0, 0] = times - 2.1 * self.sample_freq
        for i in range(batch_size):
            quat = np.random.uniform(low=-0.06, high= 0.06, size=4) # transfer to angle # 0.05
            quat[3] = 1
            quat = quat / LA.norm(quat)
            quat = norm_quat(QuaternionProduct(real_postion[i], quat))
            virtual_queue[i, 2, 1:] = quat
            virtual_queue[i, 1, 1:] = quat
            virtual_queue[i, 0, 1:] = quat
        return virtual_queue

    def get_data_at_timestamp(self, gyro_data, ois_data, time_stamp, quat_t_1):
        quat_t = GetGyroAtTimeStamp(gyro_data, time_stamp)
        quat_dif = QuaternionProduct(quat_t, QuaternionReciprocal(quat_t_1))
        return quat_dif

    def get_ois_at_timestamp(self, ois_data, time_stamp):
        ois_t = FindOISAtTimeStamp(ois_data, time_stamp)
        ois_t = np.array(ois_t) / self.ois_ratio
        return ois_t
def get_timestamp(frame_data, idx):
sample_time = frame_data[idx, 0]
metadata = GetMetadata(frame_data, idx)
timestamp_ns = metadata["timestamp_ns"] + metadata["rs_time_ns"] * 0.5
return timestamp_ns
def preprocess_gyro(gyro, extend = 200):
fake_gyro = np.zeros((extend, 5))
time_start = gyro[0,0]
for i in range(extend):
fake_gyro[-i-1, 0] = time_start - (gyro[i+1, 0] - time_start)
fake_gyro[-i-1, 4] = gyro[i+1, 4]
fake_gyro[-i-1, 1:4] = -gyro[i+1, 1:4]
new_gyro = np.concatenate((fake_gyro, gyro), axis = 0)
return new_gyro
def LoadFlow(path):
file_names = sorted(os.listdir(path))
file_path =[]
for n in file_names:
file_path.append(os.path.join(path, n))
return file_path, flow_utils.readFlow(file_path[0]).shape
def get_virtual_at_timestamp(virtual_queue, real_queue, time_stamp, time_start, quat_t_1 = None, sample_freq = None):
if virtual_queue is None:
quat_t = GetGyroAtTimeStamp(real_queue, time_stamp)
else:
quat_t = train_GetGyroAtTimeStamp(virtual_queue, time_stamp)
if quat_t is None:
quat_t = GetGyroAtTimeStamp(real_queue, time_stamp)
if quat_t_1 is None:
return quat_t
else:
quat_dif = QuaternionProduct(quat_t, QuaternionReciprocal(quat_t_1))
return quat_dif
================================================
FILE: dvs/flownet2/LICENSE
================================================
Copyright 2017 NVIDIA CORPORATION
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: dvs/flownet2/README.md
================================================
# flownet2-pytorch
PyTorch implementation of [FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks](https://arxiv.org/abs/1612.01925).
Multiple GPU training is supported, and the code provides examples for training or inference on [MPI-Sintel](http://sintel.is.tue.mpg.de/) clean and final datasets. The same commands can be used for training or inference with other datasets. See below for more detail.
Inference using fp16 (half-precision) is also supported.
For more help, type:

    python main.py --help
## Network architectures
Below are the different flownet neural network architectures that are provided. <br />
A batchnorm version for each network is also available.
- **FlowNet2S**
- **FlowNet2C**
- **FlowNet2CS**
- **FlowNet2CSS**
- **FlowNet2SD**
- **FlowNet2**
## Custom layers
`FlowNet2` or `FlowNet2C*` architectures rely on the custom layers `Resample2d` or `Correlation`. <br />
PyTorch implementations of these layers, with CUDA kernels, are available at [./networks](./networks). <br />
Note: Currently, half-precision kernels are not available for these layers.
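
To give a feel for what a correlation layer computes, here is a naive NumPy sketch of a cost volume: for each displacement in a small window, it averages the channel-wise product of the first feature map with a shifted copy of the second. This is an illustration only, not the CUDA kernel shipped in `./networks`. The function name `naive_correlation` and its `max_disp` parameter are assumptions made for this example, and the real layer exposes further options that are not modelled here.

```python
import numpy as np

def naive_correlation(f1, f2, max_disp=1):
    """Naive cost volume between two (C, H, W) feature maps.

    Returns an array of shape ((2 * max_disp + 1) ** 2, H, W) with one
    channel per displacement (dy, dx) in [-max_disp, max_disp].
    """
    c, h, w = f1.shape
    d = 2 * max_disp + 1
    out = np.zeros((d * d, h, w), dtype=np.float32)
    # Zero-pad the second map so every shifted window stays in bounds.
    padded = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    k = 0
    for dy in range(d):
        for dx in range(d):
            shifted = padded[:, dy:dy + h, dx:dx + w]
            # Mean over channels of the element-wise product.
            out[k] = (f1 * shifted).mean(axis=0)
            k += 1
    return out

# Hypothetical input: identical all-ones feature maps.
features = np.ones((3, 4, 5), dtype=np.float32)
volume = naive_correlation(features, features, max_disp=1)
print(volume.shape)
```

The centre channel (index `(2 * max_disp + 1) ** 2 // 2`) corresponds to zero displacement, so for identical inputs it holds the per-pixel mean of squared features.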
## Data Loaders
Dataloaders for FlyingChairs, FlyingThings, ChairsSDHom and ImagesFromFolder are available in [datasets.py](./datasets.py). <br />
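
As a rough sketch of what the folder-based loaders in `datasets.py` do: they sort the frames, pair each frame with its successor, and pick a render size that is a multiple of 64. The helper names below (`consecutive_pairs`, `floor_to_multiple`, `ceil_to_multiple`) are hypothetical and exist only for this illustration; the 436-pixel height is just an example value.

```python
def consecutive_pairs(frames):
    """Pair each frame with the next one: [a, b, c] -> [[a, b], [b, c]]."""
    ordered = sorted(frames)
    return [[ordered[i], ordered[i + 1]] for i in range(len(ordered) - 1)]

def floor_to_multiple(size, base=64):
    """Largest multiple of `base` not exceeding `size` (center-crop case)."""
    return (size // base) * base

def ceil_to_multiple(size, base=64):
    """Smallest multiple of `base` not below `size` (padding case)."""
    return -(-size // base) * base

frames = ["0002.png", "0000.png", "0001.png"]
print(consecutive_pairs(frames))
# [['0000.png', '0001.png'], ['0001.png', '0002.png']]
print(floor_to_multiple(436), ceil_to_multiple(436))
# 384 448
```

Rounding down pairs naturally with a center crop, while rounding up requires padding the frame and cropping the network output back to the original size afterwards.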
## Loss Functions
L1 and L2 losses with multi-scale support are available in [losses.py](./losses.py). <br />
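
The multi-scale idea can be sketched in NumPy as follows. The sketch mirrors the defaults visible in `losses.py` (start scale 4, base weight 0.32 halved per scale, target scaled by 0.05), but it is a simplified illustration rather than the module itself; the function names `epe`, `avg_pool`, and `multiscale_l1` are assumptions made for this example.

```python
import numpy as np

def epe(pred, target):
    """Mean end-point error for flows shaped (N, 2, H, W)."""
    return np.linalg.norm(target - pred, axis=1).mean()

def avg_pool(x, k):
    """Non-overlapping k x k average pooling on (N, C, H, W).

    Assumes H and W are divisible by k.
    """
    n, c, h, w = x.shape
    return x.reshape(n, c, h // k, k, w // k, k).mean(axis=(3, 5))

def multiscale_l1(preds, target, start_scale=4, l_weight=0.32, div_flow=0.05):
    """Weighted sum of L1 losses between each prediction and a pooled target."""
    target = div_flow * target
    total = 0.0
    for i, pred in enumerate(preds):
        k = start_scale * 2 ** i      # pooling factor: 4, 8, 16, ...
        weight = l_weight / 2 ** i    # loss weight: 0.32, 0.16, 0.08, ...
        total += weight * np.abs(pred - avg_pool(target, k)).mean()
    return total

# Hypothetical data: a constant target flow and two all-zero predictions.
target = np.full((1, 2, 64, 64), 20.0)
preds = [np.zeros((1, 2, 16, 16)), np.zeros((1, 2, 8, 8))]
print(multiscale_l1(preds, target))
```

Coarser predictions are compared against more heavily pooled targets and contribute with smaller weights, so the finest scale dominates the total.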
## Installation

    # get flownet2-pytorch source
    git clone https://github.com/NVIDIA/flownet2-pytorch.git
    cd flownet2-pytorch

    # install custom layers
    bash install.sh

### Python requirements
Currently, the code supports Python 3.
* numpy
* PyTorch ( == 0.4.1, for <= 0.4.0 see branch [python36-PyTorch0.4](https://github.com/NVIDIA/flownet2-pytorch/tree/python36-PyTorch0.4))
* scipy
* scikit-image
* tensorboardX
* colorama, tqdm, setproctitle
## Converted Caffe Pre-trained Models
We've included Caffe pre-trained models. Should you use these pre-trained weights, please adhere to the [license agreements](https://drive.google.com/file/d/1TVv0BnNFh3rpHZvD-easMb9jYrPE2Eqd/view?usp=sharing).
* [FlowNet2](https://drive.google.com/file/d/1hF8vS6YeHkx3j2pfCeQqqZGwA_PJq_Da/view?usp=sharing)[620MB]
* [FlowNet2-C](https://drive.google.com/file/d/1BFT6b7KgKJC8rA59RmOVAXRM_S7aSfKE/view?usp=sharing)[149MB]
* [FlowNet2-CS](https://drive.google.com/file/d/1iBJ1_o7PloaINpa8m7u_7TsLCX0Dt_jS/view?usp=sharing)[297MB]
* [FlowNet2-CSS](https://drive.google.com/file/d/157zuzVf4YMN6ABAQgZc8rRmR5cgWzSu8/view?usp=sharing)[445MB]
* [FlowNet2-CSS-ft-sd](https://drive.google.com/file/d/1R5xafCIzJCXc8ia4TGfC65irmTNiMg6u/view?usp=sharing)[445MB]
* [FlowNet2-S](https://drive.google.com/file/d/1V61dZjFomwlynwlYklJHC-TLfdFom3Lg/view?usp=sharing)[148MB]
* [FlowNet2-SD](https://drive.google.com/file/d/1QW03eyYG_vD-dT-Mx4wopYvtPu_msTKn/view?usp=sharing)[173MB]
## Inference

    # Example on MPISintel Clean
    python main.py --inference --model FlowNet2 --save_flow --inference_dataset MpiSintelClean \
    --inference_dataset_root /path/to/mpi-sintel/clean/dataset \
    --resume /path/to/checkpoints

## Training and validation

    # Example on MPISintel Final and Clean, with L1Loss on FlowNet2 model
    python main.py --batch_size 8 --model FlowNet2 --loss=L1Loss --optimizer=Adam --optimizer_lr=1e-4 \
    --training_dataset MpiSintelFinal --training_dataset_root /path/to/mpi-sintel/final/dataset \
    --validation_dataset MpiSintelClean --validation_dataset_root /path/to/mpi-sintel/clean/dataset

    # Example on MPISintel Final and Clean, with MultiScale loss on FlowNet2C model
    python main.py --batch_size 8 --model FlowNet2C --optimizer=Adam --optimizer_lr=1e-4 --loss=MultiScale --loss_norm=L1 \
    --loss_numScales=5 --loss_startScale=4 --crop_size 384 512 \
    --training_dataset FlyingChairs --training_dataset_root /path/to/flying-chairs/dataset \
    --validation_dataset MpiSintelClean --validation_dataset_root /path/to/mpi-sintel/clean/dataset

## Results on MPI-Sintel
[Predicted flows on MPI-Sintel](https://www.youtube.com/watch?v=HtBmabY8aeU)
## Reference
If you find this implementation useful in your work, please acknowledge it appropriately and cite the paper:
````
@InProceedings{IMKDB17,
author = "E. Ilg and N. Mayer and T. Saikia and M. Keuper and A. Dosovitskiy and T. Brox",
title = "FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks",
booktitle = "IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
month = "Jul",
year = "2017",
url = "http://lmb.informatik.uni-freiburg.de//Publications/2017/IMKDB17"
}
````
```
@misc{flownet2-pytorch,
author = {Fitsum Reda and Robert Pottorff and Jon Barker and Bryan Catanzaro},
title = {flownet2-pytorch: Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks},
year = {2017},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/NVIDIA/flownet2-pytorch}}
}
```
## Related Optical Flow Work from Nvidia
Code (in Caffe and Pytorch): [PWC-Net](https://github.com/NVlabs/PWC-Net) <br />
Paper : [PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume](https://arxiv.org/abs/1709.02371).
## Acknowledgments
Parts of this code were derived, as noted in the code, from [ClementPinard/FlowNetPytorch](https://github.com/ClementPinard/FlowNetPytorch).
================================================
FILE: dvs/flownet2/__init__.py
================================================
from .utils import flow_utils, tools
================================================
FILE: dvs/flownet2/convert.py
================================================
#!/usr/bin/env python2.7
import caffe
from caffe.proto import caffe_pb2
import sys, os
import torch
import torch.nn as nn
import argparse, tempfile
import numpy as np
parser = argparse.ArgumentParser()
parser.add_argument('caffe_model', help='input model in hdf5 or caffemodel format')
parser.add_argument('prototxt_template',help='prototxt template')
parser.add_argument('flownet2_pytorch', help='path to flownet2-pytorch')
args = parser.parse_args()
args.rgb_max = 255
args.fp16 = False
args.grads = {}
# load models
sys.path.append(args.flownet2_pytorch)
import models
from utils.param_utils import *
width = 256
height = 256
keys = {'TARGET_WIDTH': width,
'TARGET_HEIGHT': height,
'ADAPTED_WIDTH':width,
'ADAPTED_HEIGHT':height,
'SCALE_WIDTH':1.,
'SCALE_HEIGHT':1.,}
template = '\n'.join(np.loadtxt(args.prototxt_template, dtype=str, delimiter='\n'))
for k in keys:
template = template.replace('$%s$'%(k),str(keys[k]))
prototxt = tempfile.NamedTemporaryFile(mode='w', delete=True)
prototxt.write(template)
prototxt.flush()
net = caffe.Net(prototxt.name, args.caffe_model, caffe.TEST)
weights = {}
biases = {}
for k, v in list(net.params.items()):
weights[k] = np.array(v[0].data).reshape(v[0].data.shape)
biases[k] = np.array(v[1].data).reshape(v[1].data.shape)
print((k, weights[k].shape, biases[k].shape))
if 'FlowNet2/' in args.caffe_model:
model = models.FlowNet2(args)
parse_flownetc(model.flownetc.modules(), weights, biases)
parse_flownets(model.flownets_1.modules(), weights, biases, param_prefix='net2_')
parse_flownets(model.flownets_2.modules(), weights, biases, param_prefix='net3_')
parse_flownetsd(model.flownets_d.modules(), weights, biases, param_prefix='netsd_')
parse_flownetfusion(model.flownetfusion.modules(), weights, biases, param_prefix='fuse_')
state = {'epoch': 0,
'state_dict': model.state_dict(),
'best_EPE': 1e10}
torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2_checkpoint.pth.tar'))
elif 'FlowNet2-C/' in args.caffe_model:
model = models.FlowNet2C(args)
parse_flownetc(model.modules(), weights, biases)
state = {'epoch': 0,
'state_dict': model.state_dict(),
'best_EPE': 1e10}
torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-C_checkpoint.pth.tar'))
elif 'FlowNet2-CS/' in args.caffe_model:
model = models.FlowNet2CS(args)
parse_flownetc(model.flownetc.modules(), weights, biases)
parse_flownets(model.flownets_1.modules(), weights, biases, param_prefix='net2_')
state = {'epoch': 0,
'state_dict': model.state_dict(),
'best_EPE': 1e10}
torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-CS_checkpoint.pth.tar'))
elif 'FlowNet2-CSS/' in args.caffe_model:
model = models.FlowNet2CSS(args)
parse_flownetc(model.flownetc.modules(), weights, biases)
parse_flownets(model.flownets_1.modules(), weights, biases, param_prefix='net2_')
parse_flownets(model.flownets_2.modules(), weights, biases, param_prefix='net3_')
state = {'epoch': 0,
'state_dict': model.state_dict(),
'best_EPE': 1e10}
torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-CSS_checkpoint.pth.tar'))
elif 'FlowNet2-CSS-ft-sd/' in args.caffe_model:
model = models.FlowNet2CSS(args)
parse_flownetc(model.flownetc.modules(), weights, biases)
parse_flownets(model.flownets_1.modules(), weights, biases, param_prefix='net2_')
parse_flownets(model.flownets_2.modules(), weights, biases, param_prefix='net3_')
state = {'epoch': 0,
'state_dict': model.state_dict(),
'best_EPE': 1e10}
torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-CSS-ft-sd_checkpoint.pth.tar'))
elif 'FlowNet2-S/' in args.caffe_model:
model = models.FlowNet2S(args)
parse_flownetsonly(model.modules(), weights, biases, param_prefix='')
state = {'epoch': 0,
'state_dict': model.state_dict(),
'best_EPE': 1e10}
torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-S_checkpoint.pth.tar'))
elif 'FlowNet2-SD/' in args.caffe_model:
model = models.FlowNet2SD(args)
parse_flownetsd(model.modules(), weights, biases, param_prefix='')
state = {'epoch': 0,
'state_dict': model.state_dict(),
'best_EPE': 1e10}
torch.save(state, os.path.join(args.flownet2_pytorch, 'FlowNet2-SD_checkpoint.pth.tar'))
else:
print(('model type could not be determined from input caffe model %s'%(args.caffe_model)))
quit()
print(("done converting ", args.caffe_model))
================================================
FILE: dvs/flownet2/datasets.py
================================================
import torch
import torch.utils.data as data
import os, math, random
from os.path import *
import numpy as np
from glob import glob
import utils.frame_utils as frame_utils
from imageio import imread
class StaticRandomCrop(object):
def __init__(self, image_size, crop_size):
self.th, self.tw = crop_size
h, w = image_size
self.h1 = random.randint(0, h - self.th)
self.w1 = random.randint(0, w - self.tw)
def __call__(self, img):
return img[self.h1:(self.h1+self.th), self.w1:(self.w1+self.tw),:]
class StaticCenterCrop(object):
def __init__(self, image_size, crop_size):
self.th, self.tw = crop_size
self.h, self.w = image_size
def __call__(self, img):
return img[(self.h-self.th)//2:(self.h+self.th)//2, (self.w-self.tw)//2:(self.w+self.tw)//2,:]
class Padding(object):
def __init__(self, image_size, pad_size):
self.th, self.tw = pad_size
self.h, self.w = image_size
def __call__(self, img):
out = np.zeros((self.th, self.tw, 3))
out[:self.h, :self.w,:] = img
return out
class MpiSintel(data.Dataset):
def __init__(self, args, is_cropped = False, root = '', dstype = 'clean', replicates = 1):
self.args = args
self.is_cropped = is_cropped
self.crop_size = args.crop_size
self.render_size = args.inference_size
self.replicates = replicates
flow_root = join(root, 'flow')
image_root = join(root, dstype)
file_list = sorted(glob(join(flow_root, '*/*.flo')))
self.flow_list = []
self.image_list = []
for file in file_list:
if 'test' in file:
# print file
continue
fbase = file[len(flow_root)+1:]
fprefix = fbase[:-8]
fnum = int(fbase[-8:-4])
img1 = join(image_root, fprefix + "%04d"%(fnum+0) + '.png')
img2 = join(image_root, fprefix + "%04d"%(fnum+1) + '.png')
if not isfile(img1) or not isfile(img2) or not isfile(file):
continue
self.image_list += [[img1, img2]]
self.flow_list += [file]
self.size = len(self.image_list)
self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape
if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
self.render_size[1] = ( (self.frame_size[1])//64 ) * 64
args.inference_size = self.render_size
assert (len(self.image_list) == len(self.flow_list))
def __getitem__(self, index):
index = index % self.size
img1 = frame_utils.read_gen(self.image_list[index][0])
img2 = frame_utils.read_gen(self.image_list[index][1])
flow = frame_utils.read_gen(self.flow_list[index])
images = [img1, img2]
image_size = img1.shape[:2]
if self.is_cropped:
cropper = StaticRandomCrop(image_size, self.crop_size)
else:
cropper = StaticCenterCrop(image_size, self.render_size)
images = list(map(cropper, images))
flow = cropper(flow)
images = np.array(images).transpose(3,0,1,2)
flow = flow.transpose(2,0,1)
images = torch.from_numpy(images.astype(np.float32))
flow = torch.from_numpy(flow.astype(np.float32))
return [images], [flow]
def __len__(self):
return self.size * self.replicates
class MpiSintelClean(MpiSintel):
def __init__(self, args, is_cropped = False, root = '', replicates = 1):
super(MpiSintelClean, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'clean', replicates = replicates)
class MpiSintelFinal(MpiSintel):
def __init__(self, args, is_cropped = False, root = '', replicates = 1):
super(MpiSintelFinal, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'final', replicates = replicates)
class FlyingChairs(data.Dataset):
def __init__(self, args, is_cropped, root = '/path/to/FlyingChairs_release/data', replicates = 1):
self.args = args
self.is_cropped = is_cropped
self.crop_size = args.crop_size
self.render_size = args.inference_size
self.replicates = replicates
images = sorted( glob( join(root, '*.ppm') ) )
self.flow_list = sorted( glob( join(root, '*.flo') ) )
assert (len(images)//2 == len(self.flow_list))
self.image_list = []
for i in range(len(self.flow_list)):
im1 = images[2*i]
im2 = images[2*i + 1]
self.image_list += [ [ im1, im2 ] ]
assert len(self.image_list) == len(self.flow_list)
self.size = len(self.image_list)
self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape
if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
self.render_size[1] = ( (self.frame_size[1])//64 ) * 64
args.inference_size = self.render_size
def __getitem__(self, index):
index = index % self.size
img1 = frame_utils.read_gen(self.image_list[index][0])
img2 = frame_utils.read_gen(self.image_list[index][1])
flow = frame_utils.read_gen(self.flow_list[index])
images = [img1, img2]
image_size = img1.shape[:2]
if self.is_cropped:
cropper = StaticRandomCrop(image_size, self.crop_size)
else:
cropper = StaticCenterCrop(image_size, self.render_size)
images = list(map(cropper, images))
flow = cropper(flow)
images = np.array(images).transpose(3,0,1,2)
flow = flow.transpose(2,0,1)
images = torch.from_numpy(images.astype(np.float32))
flow = torch.from_numpy(flow.astype(np.float32))
return [images], [flow]
def __len__(self):
return self.size * self.replicates
class FlyingThings(data.Dataset):
def __init__(self, args, is_cropped, root = '/path/to/flyingthings3d', dstype = 'frames_cleanpass', replicates = 1):
self.args = args
self.is_cropped = is_cropped
self.crop_size = args.crop_size
self.render_size = args.inference_size
self.replicates = replicates
image_dirs = sorted(glob(join(root, dstype, 'TRAIN/*/*')))
image_dirs = sorted([join(f, 'left') for f in image_dirs] + [join(f, 'right') for f in image_dirs])
flow_dirs = sorted(glob(join(root, 'optical_flow_flo_format/TRAIN/*/*')))
flow_dirs = sorted([join(f, 'into_future/left') for f in flow_dirs] + [join(f, 'into_future/right') for f in flow_dirs])
assert (len(image_dirs) == len(flow_dirs))
self.image_list = []
self.flow_list = []
for idir, fdir in zip(image_dirs, flow_dirs):
images = sorted( glob(join(idir, '*.png')) )
flows = sorted( glob(join(fdir, '*.flo')) )
for i in range(len(flows)):
self.image_list += [ [ images[i], images[i+1] ] ]
self.flow_list += [flows[i]]
assert len(self.image_list) == len(self.flow_list)
self.size = len(self.image_list)
self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape
if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
self.render_size[1] = ( (self.frame_size[1])//64 ) * 64
args.inference_size = self.render_size
def __getitem__(self, index):
index = index % self.size
img1 = frame_utils.read_gen(self.image_list[index][0])
img2 = frame_utils.read_gen(self.image_list[index][1])
flow = frame_utils.read_gen(self.flow_list[index])
images = [img1, img2]
image_size = img1.shape[:2]
if self.is_cropped:
cropper = StaticRandomCrop(image_size, self.crop_size)
else:
cropper = StaticCenterCrop(image_size, self.render_size)
images = list(map(cropper, images))
flow = cropper(flow)
images = np.array(images).transpose(3,0,1,2)
flow = flow.transpose(2,0,1)
images = torch.from_numpy(images.astype(np.float32))
flow = torch.from_numpy(flow.astype(np.float32))
return [images], [flow]
def __len__(self):
return self.size * self.replicates
class FlyingThingsClean(FlyingThings):
def __init__(self, args, is_cropped = False, root = '', replicates = 1):
super(FlyingThingsClean, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'frames_cleanpass', replicates = replicates)
class FlyingThingsFinal(FlyingThings):
def __init__(self, args, is_cropped = False, root = '', replicates = 1):
super(FlyingThingsFinal, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'frames_finalpass', replicates = replicates)
class ChairsSDHom(data.Dataset):
def __init__(self, args, is_cropped, root = '/path/to/chairssdhom/data', dstype = 'train', replicates = 1):
self.args = args
self.is_cropped = is_cropped
self.crop_size = args.crop_size
self.render_size = args.inference_size
self.replicates = replicates
image1 = sorted( glob( join(root, dstype, 't0/*.png') ) )
image2 = sorted( glob( join(root, dstype, 't1/*.png') ) )
self.flow_list = sorted( glob( join(root, dstype, 'flow/*.flo') ) )
assert (len(image1) == len(self.flow_list))
self.image_list = []
for i in range(len(self.flow_list)):
im1 = image1[i]
im2 = image2[i]
self.image_list += [ [ im1, im2 ] ]
assert len(self.image_list) == len(self.flow_list)
self.size = len(self.image_list)
self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape
if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
self.render_size[1] = ( (self.frame_size[1])//64 ) * 64
args.inference_size = self.render_size
def __getitem__(self, index):
index = index % self.size
img1 = frame_utils.read_gen(self.image_list[index][0])
img2 = frame_utils.read_gen(self.image_list[index][1])
flow = frame_utils.read_gen(self.flow_list[index])
flow = flow[::-1,:,:]
images = [img1, img2]
image_size = img1.shape[:2]
if self.is_cropped:
cropper = StaticRandomCrop(image_size, self.crop_size)
else:
cropper = StaticCenterCrop(image_size, self.render_size)
images = list(map(cropper, images))
flow = cropper(flow)
images = np.array(images).transpose(3,0,1,2)
flow = flow.transpose(2,0,1)
images = torch.from_numpy(images.astype(np.float32))
flow = torch.from_numpy(flow.astype(np.float32))
return [images], [flow]
def __len__(self):
return self.size * self.replicates
class ChairsSDHomTrain(ChairsSDHom):
def __init__(self, args, is_cropped = False, root = '', replicates = 1):
super(ChairsSDHomTrain, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'train', replicates = replicates)
class ChairsSDHomTest(ChairsSDHom):
def __init__(self, args, is_cropped = False, root = '', replicates = 1):
super(ChairsSDHomTest, self).__init__(args, is_cropped = is_cropped, root = root, dstype = 'test', replicates = replicates)
class ImagesFromFolder(data.Dataset):
def __init__(self, args, is_cropped, root = '/path/to/frames/only/folder', iext = 'png', replicates = 1):
self.args = args
self.is_cropped = is_cropped
self.crop_size = args.crop_size
self.render_size = args.inference_size
self.replicates = replicates
images = sorted( glob( join(root, '*.' + iext) ) )
self.image_list = []
for i in range(len(images)-1):
im1 = images[i]
im2 = images[i+1]
self.image_list += [ [ im1, im2 ] ]
self.size = len(self.image_list)
self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape
if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
self.render_size[0] = ( (self.frame_size[0])//64 ) * 64
self.render_size[1] = ( (self.frame_size[1])//64 ) * 64
args.inference_size = self.render_size
def __getitem__(self, index):
index = index % self.size
img1 = frame_utils.read_gen(self.image_list[index][0])
img2 = frame_utils.read_gen(self.image_list[index][1])
images = [img1, img2]
image_size = img1.shape[:2]
if self.is_cropped:
cropper = StaticRandomCrop(image_size, self.crop_size)
else:
cropper = StaticCenterCrop(image_size, self.render_size)
images = list(map(cropper, images))
images = np.array(images).transpose(3,0,1,2)
images = torch.from_numpy(images.astype(np.float32))
return [images], [torch.zeros(images.size()[0:1] + (2,) + images.size()[-2:])]
def __len__(self):
return self.size * self.replicates
class Google(data.Dataset):
def __init__(self, args, is_cropped = False, root = '', dstype = 'frames', replicates = 1):
self.args = args
self.is_cropped = is_cropped
self.crop_size = args.crop_size
self.render_size = args.inference_size
self.replicates = replicates
image_root = join(root, dstype)
file_list = sorted(glob(join(image_root, '*.png')))
self.image_list = []
for i in range(len(file_list)-1):
img1 = join(file_list[i])
img2 = join(file_list[i+1])
if not isfile(img1) or not isfile(img2):
continue
self.image_list += [[img1, img2]]
self.size = len(self.image_list)
self.frame_size = frame_utils.read_gen(self.image_list[0][0]).shape
if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0]%64) or (self.frame_size[1]%64):
self.render_size[0] = ( math.ceil(self.frame_size[0]/64) ) * 64
self.render_size[1] = ( math.ceil(self.frame_size[1]/64) ) * 64
args.inference_size = self.render_size
def __getitem__(self, index):
index = index % self.size
img1 = frame_utils.read_gen(self.image_list[index][0])
img2 = frame_utils.read_gen(self.image_list[index][1])
images = [img1, img2]
image_size = img1.shape[:2]
if self.is_cropped:
cropper = StaticRandomCrop(image_size, self.crop_size)
else:
cropper = Padding(image_size, self.render_size)
images = list(map(cropper, images))
images = np.array(images).transpose(3,0,1,2)
images = torch.from_numpy(images.astype(np.float32))
return [images]
def __len__(self):
return self.size * self.replicates
'''
import argparse
import sys, os
import importlib
from scipy.misc import imsave
import numpy as np
import datasets
reload(datasets)
parser = argparse.ArgumentParser()
args = parser.parse_args()
args.inference_size = [1080, 1920]
args.crop_size = [384, 512]
args.effective_batch_size = 1
index = 500
v_dataset = datasets.MpiSintelClean(args, True, root='../MPI-Sintel/flow/training')
a, b = v_dataset[index]
im1 = a[0].numpy()[:,0,:,:].transpose(1,2,0)
im2 = a[0].numpy()[:,1,:,:].transpose(1,2,0)
imsave('./img1.png', im1)
imsave('./img2.png', im2)
flow_utils.writeFlow('./flow.flo', b[0].numpy().transpose(1,2,0))
'''
================================================
FILE: dvs/flownet2/install.sh
================================================
#!/bin/bash
cd ./networks/correlation_package
rm -rf *_cuda.egg-info build dist __pycache__
python3 setup.py install --user
cd ../resample2d_package
rm -rf *_cuda.egg-info build dist __pycache__
python3 setup.py install --user
cd ../channelnorm_package
rm -rf *_cuda.egg-info build dist __pycache__
python3 setup.py install --user
cd ..
================================================
FILE: dvs/flownet2/losses.py
================================================
'''
Portions of this code copyright 2017, Clement Pinard
'''
# freda (todo) : adversarial loss
import torch
import torch.nn as nn
import math
def EPE(input_flow, target_flow):
return torch.norm(target_flow-input_flow,p=2,dim=1).mean()
class L1(nn.Module):
def __init__(self):
super(L1, self).__init__()
def forward(self, output, target):
lossvalue = torch.abs(output - target).mean()
return lossvalue
class L2(nn.Module):
def __init__(self):
super(L2, self).__init__()
def forward(self, output, target):
lossvalue = torch.norm(output-target,p=2,dim=1).mean()
return lossvalue
class L1Loss(nn.Module):
def __init__(self, args):
super(L1Loss, self).__init__()
self.args = args
self.loss = L1()
self.loss_labels = ['L1', 'EPE']
def forward(self, output, target):
lossvalue = self.loss(output, target)
epevalue = EPE(output, target)
return [lossvalue, epevalue]
class L2Loss(nn.Module):
def __init__(self, args):
super(L2Loss, self).__init__()
self.args = args
self.loss = L2()
self.loss_labels = ['L2', 'EPE']
def forward(self, output, target):
lossvalue = self.loss(output, target)
epevalue = EPE(output, target)
return [lossvalue, epevalue]
class MultiScale(nn.Module):
def __init__(self, args, startScale = 4, numScales = 5, l_weight= 0.32, norm= 'L1'):
super(MultiScale,self).__init__()
self.startScale = startScale
self.numScales = numScales
self.loss_weights = torch.FloatTensor([(l_weight / 2 ** scale) for scale in range(self.numScales)])
self.args = args
self.l_type = norm
self.div_flow = 0.05
assert(len(self.loss_weights) == self.numScales)
if self.l_type == 'L1':
self.loss = L1()
else:
self.loss = L2()
self.multiScales = [nn.AvgPool2d(self.startScale * (2**scale), self.startScale * (2**scale)) for scale in range(self.numScales)]
self.loss_labels = ['MultiScale-'+self.l_type, 'EPE']
def forward(self, output, target):
lossvalue = 0
epevalue = 0
if type(output) is tuple:
target = self.div_flow * target
for i, output_ in enumerate(output):
target_ = self.multiScales[i](target)
epevalue += self.loss_weights[i]*EPE(output_, target_)
lossvalue += self.loss_weights[i]*self.loss(output_, target_)
return [lossvalue, epevalue]
else:
epevalue += EPE(output, target)
lossvalue += self.loss(output, target)
return [lossvalue, epevalue]
================================================
FILE: dvs/flownet2/main.py
================================================
#!/usr/bin/env python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torch.autograd import Variable
from tensorboardX import SummaryWriter
import argparse, os, sys, subprocess
import colorama
import numpy as np
from tqdm import tqdm
from glob import glob
from os.path import *
import models, datasets
from utils import flow_utils, tools
import time
# Reusable function for inference
def inference(args, epoch, data_path, data_loader, model, offset=0):
model.eval()
if args.save_flow or args.render_validation:
flow_folder = "{}/flo".format(data_path)
flow_back_folder = "{}/flo_back".format(data_path)
if not os.path.exists(flow_folder):
os.makedirs(flow_folder)
if not os.path.exists(flow_back_folder):
os.makedirs(flow_back_folder)
# visualization folder
if args.inference_visualize:
flow_vis_folder = "{}/flo_vis".format(data_path)
if not os.path.exists(flow_vis_folder):
os.makedirs(flow_vis_folder)
flow_back_vis_folder = "{}/flo_back_vis".format(data_path)
if not os.path.exists(flow_back_vis_folder):
os.makedirs(flow_back_vis_folder)
args.inference_n_batches = np.inf if args.inference_n_batches < 0 else args.inference_n_batches
progress = tqdm(data_loader, ncols=100, total=np.minimum(len(data_loader), args.inference_n_batches), desc='Inferencing ',
leave=True, position=offset)
for batch_idx, (data) in enumerate(progress):
data = data[0]
data_back = torch.cat((data[:,:,1:,:,:], data[:,:,:1,:,:]), dim = 2)
if args.cuda:
data_forward = data.cuda(non_blocking=True)
data_back = data_back.cuda(non_blocking=True)
data_forward = Variable(data_forward)
data_back = Variable(data_back)
flo_path = join(flow_folder, '%06d.flo'%(batch_idx))
flo_back_path = join(flow_back_folder, '%06d.flo'%(batch_idx))
frame_size = data_loader.dataset.frame_size
if not os.path.exists(flo_path):
with torch.no_grad():
output = model(data_forward)[:,:,:frame_size[0], :frame_size[1]]
if args.save_flow or args.render_validation:
_pflow = output[0].data.cpu().numpy().transpose(1, 2, 0)
flow_utils.writeFlow( flo_path, _pflow)
if args.inference_visualize:
flow_utils.visulize_flow_file(
join(flow_folder, '%06d.flo' % (batch_idx)),flow_vis_folder)
if not os.path.exists(flo_back_path):
with torch.no_grad():
output = model(data_back)[:,:,:frame_size[0], :frame_size[1]]
if args.save_flow or args.render_validation:
_pflow = output[0].data.cpu().numpy().transpose(1, 2, 0)
flow_utils.writeFlow( flo_back_path, _pflow)
if args.inference_visualize:
flow_utils.visulize_flow_file(
join(flow_back_folder, '%06d.flo' % (batch_idx)), flow_back_vis_folder)
progress.update(1)
if batch_idx == (args.inference_n_batches - 1):
break
progress.close()
return
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--fp16', action='store_true', help='Run model in pseudo-fp16 mode (fp16 storage fp32 math).')
parser.add_argument('--fp16_scale', type=float, default=1024., help='Loss scaling, positive power of 2 values can improve fp16 convergence.')
parser.add_argument('--start_epoch', type=int, default=1)
parser.add_argument('--batch_size', '-b', type=int, default=8, help="Batch size")
parser.add_argument('--crop_size', type=int, nargs='+', default = [256, 256], help="Spatial dimensions of the crop applied to training samples")
parser.add_argument("--rgb_max", type=float, default = 255.)
parser.add_argument('--number_workers', '-nw', '--num_workers', type=int, default=8)
parser.add_argument('--number_gpus', '-ng', type=int, default=-1, help='number of GPUs to use')
parser.add_argument('--no_cuda', action='store_true')
parser.add_argument('--save', '-s', default='./Google', type=str, help='directory for saving')
parser.add_argument('--inference', action='store_true')
parser.add_argument('--inference_visualize', action='store_true',
help="visualize the optical flow during inference")
parser.add_argument('--inference_size', type=int, nargs='+', default = [-1,-1], help='Spatial size, divisible by 64. The default (-1,-1) uses the largest valid size.')
parser.add_argument('--inference_batch_size', type=int, default=1)
parser.add_argument('--inference_n_batches', type=int, default=-1)
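parser.add_argument('--save_flow', action='store_true', help='save predicted flows to file')
# inference() reads args.render_validation, so the flag must be defined to avoid an AttributeError.
parser.add_argument('--render_validation', action='store_true', help='also save predicted flows to file (checked alongside --save_flow)')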
parser.add_argument('--resume', default='', type=str, metavar='PATH', help='path to latest checkpoint (default: none)')
parser.add_argument('--log_frequency', '--summ_iter', type=int, default=1, help="Log every n batches")
tools.add_arguments_for_module(parser, models, argument_for_class='model', default='FlowNet2')
tools.add_arguments_for_module(parser, datasets, argument_for_class='inference_dataset', default='Google',
skip_params=['is_cropped'],
parameter_defaults={'root': './Google/train',
'replicates': 1})
main_dir = os.path.dirname(os.path.realpath(__file__))
os.chdir(main_dir)
# Parse the official arguments
with tools.TimerBlock("Parsing Arguments") as block:
args = parser.parse_args()
if args.number_gpus < 0 : args.number_gpus = torch.cuda.device_count()
# Get argument defaults (hashtag #thisisahack)
parser.add_argument('--IGNORE', action='store_true')
defaults = vars(parser.parse_args(['--IGNORE']))
# Print all arguments, color the non-defaults
for argument, value in sorted(vars(args).items()):
reset = colorama.Style.RESET_ALL
color = reset if value == defaults[argument] else colorama.Fore.MAGENTA
block.log('{}{}: {}{}'.format(color, argument, value, reset))
args.model_class = tools.module_to_dict(models)[args.model]
args.inference_dataset_class = tools.module_to_dict(datasets)[args.inference_dataset]
args.cuda = not args.no_cuda and torch.cuda.is_available()
# args.current_hash = subprocess.check_output(["git", "rev-parse", "HEAD"]).rstrip()
args.log_file = join(args.save, 'args.txt')
# dict to collect activation gradients (for training debug purpose)
args.grads = {}
args.total_epochs = 1
args.inference_dir = "{}/inference".format(args.save)
print('Source Code')
# print((' Current Git Hash: {}\n'.format(args.current_hash)))
# Dynamically load the dataset class with parameters passed in via "--argument_[param]=[value]" arguments
with tools.TimerBlock("Initializing Datasets") as block:
args.effective_batch_size = args.batch_size * args.number_gpus
args.effective_inference_batch_size = args.inference_batch_size * args.number_gpus
args.effective_number_workers = args.number_workers * args.number_gpus
gpuargs = {'num_workers': args.effective_number_workers,
'pin_memory': True,
'drop_last' : True} if args.cuda else {}
inf_gpuargs = gpuargs.copy()
inf_gpuargs['num_workers'] = args.number_workers
block.log('Inference Dataset: {}'.format(args.inference_dataset))
dataset_root = args.inference_dataset_root
data_name = sorted(os.listdir(dataset_root))
block.log(data_name)
inference_loaders = {}
for i in range(len(data_name)):
dataset_path = os.path.join(dataset_root, data_name[i])
args.inference_dataset_root = dataset_path
inference_dataset = args.inference_dataset_class(args, False, **tools.kwargs_from_args(args, 'inference_dataset'))
inference_loaders[dataset_path] = DataLoader(inference_dataset, batch_size=args.effective_inference_batch_size, shuffle=False, **inf_gpuargs)
block.log('Inference Input: {}'.format(' '.join([str([d for d in x.size()]) for x in inference_dataset[0][0]])))
# Dynamically load the model class with parameters passed in via "--model_[param]=[value]" arguments
with tools.TimerBlock("Building {} model".format(args.model)) as block:
class Model(nn.Module):
def __init__(self, args):
super(Model, self).__init__()
kwargs = tools.kwargs_from_args(args, 'model')
self.model = args.model_class(args, **kwargs)
def forward(self, data):
output = self.model(data)
return output
model = Model(args)
block.log('Effective Batch Size: {}'.format(args.effective_batch_size))
block.log('Number of parameters: {}'.format(sum([p.data.nelement() if p.requires_grad else 0 for p in model.parameters()])))
if args.cuda and args.number_gpus > 0:
block.log('Initializing CUDA')
model = model.cuda()
block.log('Parallelizing')
model = nn.parallel.DataParallel(model, device_ids=list(range(args.number_gpus)))
# Load weights if needed, otherwise randomly initialize
if args.resume and os.path.isfile(args.resume):
block.log("Loading checkpoint '{}'".format(args.resume))
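# Map tensors to the CPU when CUDA is unavailable; unwrap DataParallel only if the model was wrapped.
checkpoint = torch.load(args.resume, map_location=None if args.cuda else 'cpu')
getattr(model, 'module', model).model.load_state_dict(checkpoint['state_dict'])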
block.log("Loaded checkpoint '{}' (at epoch {})".format(args.resume, checkpoint['epoch']))
elif args.resume and args.inference:
block.log("No checkpoint found at '{}'".format(args.resume))
quit()
else:
block.log("Random initialization")
block.log("Initializing save directory: {}".format(args.save))
if not os.path.exists(args.save):
os.makedirs(args.save)
# Log all arguments to file
for argument, value in sorted(vars(args).items()):
block.log2file(args.log_file, '{}: {}'.format(argument, value))
for data_path in inference_loaders:
# Primary epoch loop
progress = tqdm(list(range(args.start_epoch, args.total_epochs + 1)), miniters=1, ncols=100, desc='Overall Progress', leave=True, position=0)
offset = 1
for epoch in progress:
stats = inference(args=args, epoch=epoch - 1, data_path = data_path, data_loader=inference_loaders[data_path], model=model, offset=offset)
offset += 1
print("\n")
================================================
FILE: dvs/flownet2/models.py
================================================
import torch
import torch.nn as nn
from torch.nn import init
import math
import numpy as np
try:
from networks.resample2d_package.resample2d import Resample2d
from networks.channelnorm_package.channelnorm import ChannelNorm
from networks import FlowNetC
from networks import FlowNetS
from networks import FlowNetSD
from networks import FlowNetFusion
from networks.submodules import *
except ImportError: # fall back to package-relative imports
from .networks.resample2d_package.resample2d import Resample2d
from .networks.channelnorm_package.channelnorm import ChannelNorm
from .networks import FlowNetC
from .networks import FlowNetS
from .networks import FlowNetSD
from .networks import FlowNetFusion
from .networks.submodules import *
'Parameter count = 162,518,834'
class FlowNet2(nn.Module):
def __init__(self, args, batchNorm=False, div_flow = 20.):
super(FlowNet2,self).__init__()
self.batchNorm = batchNorm
self.div_flow = div_flow
self.rgb_max = args.rgb_max
self.args = args
self.channelnorm = ChannelNorm()
# First Block (FlowNetC)
self.flownetc = FlowNetC.FlowNetC(args, batchNorm=self.batchNorm)
self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')
if args.fp16:
self.resample1 = nn.Sequential(
tofp32(),
Resample2d(),
tofp16())
else:
self.resample1 = Resample2d()
# Block (FlowNetS1)
self.flownets_1 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
self.upsample2 = nn.Upsample(scale_factor=4, mode='bilinear')
if args.fp16:
self.resample2 = nn.Sequential(
tofp32(),
Resample2d(),
tofp16())
else:
self.resample2 = Resample2d()
# Block (FlowNetS2)
self.flownets_2 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
# Block (FlowNetSD)
self.flownets_d = FlowNetSD.FlowNetSD(args, batchNorm=self.batchNorm)
self.upsample3 = nn.Upsample(scale_factor=4, mode='nearest')
self.upsample4 = nn.Upsample(scale_factor=4, mode='nearest')
if args.fp16:
self.resample3 = nn.Sequential(
tofp32(),
Resample2d(),
tofp16())
else:
self.resample3 = Resample2d()
if args.fp16:
self.resample4 = nn.Sequential(
tofp32(),
Resample2d(),
tofp16())
else:
self.resample4 = Resample2d()
# Block (FlowNetFusion)
self.flownetfusion = FlowNetFusion.FlowNetFusion(args, batchNorm=self.batchNorm)
for m in self.modules():
if isinstance(m, nn.Conv2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
if isinstance(m, nn.ConvTranspose2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
# init_deconv_bilinear(m.weight)
def init_deconv_bilinear(self, weight):
f_shape = weight.size()
height, width = f_shape[-2], f_shape[-1]
f = np.ceil(width/2.0)
c = (2 * f - 1 - f % 2) / (2.0 * f)
bilinear = np.zeros([height, width])
for x in range(width):
for y in range(height):
value = (1 - abs(x / f - c)) * (1 - abs(y / f - c))
# rows index y (height), columns index x (width)
bilinear[y, x] = value
min_dim = min(f_shape[0], f_shape[1])
weight.data.fill_(0.)
for i in range(min_dim):
weight.data[i,i,:,:] = torch.from_numpy(bilinear)
return
def forward(self, inputs):
rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
x = (inputs - rgb_mean) / self.rgb_max
x1 = x[:,:,0,:,:]
x2 = x[:,:,1,:,:]
x = torch.cat((x1,x2), dim = 1)
# flownetc
flownetc_flow2 = self.flownetc(x)[0]
flownetc_flow = self.upsample1(flownetc_flow2*self.div_flow)
# warp img1 to img0; magnitude of diff between img0 and warped_img1
resampled_img1 = self.resample1(x[:,3:,:,:], flownetc_flow)
diff_img0 = x[:,:3,:,:] - resampled_img1
norm_diff_img0 = self.channelnorm(diff_img0)
# concat img0, img1, img1->img0, flow, diff-mag
concat1 = torch.cat((x, resampled_img1, flownetc_flow/self.div_flow, norm_diff_img0), dim=1)
# flownets1
flownets1_flow2 = self.flownets_1(concat1)[0]
flownets1_flow = self.upsample2(flownets1_flow2*self.div_flow)
# warp img1 to img0 using flownets1; magnitude of diff between img0 and warped_img1
resampled_img1 = self.resample2(x[:,3:,:,:], flownets1_flow)
diff_img0 = x[:,:3,:,:] - resampled_img1
norm_diff_img0 = self.channelnorm(diff_img0)
# concat img0, img1, img1->img0, flow, diff-mag
concat2 = torch.cat((x, resampled_img1, flownets1_flow/self.div_flow, norm_diff_img0), dim=1)
# flownets2
flownets2_flow2 = self.flownets_2(concat2)[0]
flownets2_flow = self.upsample4(flownets2_flow2 * self.div_flow)
norm_flownets2_flow = self.channelnorm(flownets2_flow)
diff_flownets2_flow = self.resample4(x[:,3:,:,:], flownets2_flow)
# if not diff_flownets2_flow.volatile:
# diff_flownets2_flow.register_hook(save_grad(self.args.grads, 'diff_flownets2_flow'))
diff_flownets2_img1 = self.channelnorm((x[:,:3,:,:]-diff_flownets2_flow))
# if not diff_flownets2_img1.volatile:
# diff_flownets2_img1.register_hook(save_grad(self.args.grads, 'diff_flownets2_img1'))
# flownetsd
flownetsd_flow2 = self.flownets_d(x)[0]
flownetsd_flow = self.upsample3(flownetsd_flow2 / self.div_flow)
norm_flownetsd_flow = self.channelnorm(flownetsd_flow)
diff_flownetsd_flow = self.resample3(x[:,3:,:,:], flownetsd_flow)
# if not diff_flownetsd_flow.volatile:
# diff_flownetsd_flow.register_hook(save_grad(self.args.grads, 'diff_flownetsd_flow'))
diff_flownetsd_img1 = self.channelnorm((x[:,:3,:,:]-diff_flownetsd_flow))
# if not diff_flownetsd_img1.volatile:
# diff_flownetsd_img1.register_hook(save_grad(self.args.grads, 'diff_flownetsd_img1'))
# concat img0 (x[:,:3]), flownetsd, flownets2, norm_flownetsd, norm_flownets2, diff_flownetsd_img1, diff_flownets2_img1
concat3 = torch.cat((x[:,:3,:,:], flownetsd_flow, flownets2_flow, norm_flownetsd_flow, norm_flownets2_flow, diff_flownetsd_img1, diff_flownets2_img1), dim=1)
flownetfusion_flow = self.flownetfusion(concat3)
# if not flownetfusion_flow.volatile:
# flownetfusion_flow.register_hook(save_grad(self.args.grads, 'flownetfusion_flow'))
return flownetfusion_flow
class FlowNet2C(FlowNetC.FlowNetC):
def __init__(self, args, batchNorm=False, div_flow=20):
super(FlowNet2C,self).__init__(args, batchNorm=batchNorm, div_flow=div_flow)
self.rgb_max = args.rgb_max
def forward(self, inputs):
rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
x = (inputs - rgb_mean) / self.rgb_max
x1 = x[:,:,0,:,:]
x2 = x[:,:,1,:,:]
# FlownetC top input stream
out_conv1a = self.conv1(x1)
out_conv2a = self.conv2(out_conv1a)
out_conv3a = self.conv3(out_conv2a)
# FlownetC bottom input stream
out_conv1b = self.conv1(x2)
out_conv2b = self.conv2(out_conv1b)
out_conv3b = self.conv3(out_conv2b)
# Merge streams
out_corr = self.corr(out_conv3a, out_conv3b) # False
out_corr = self.corr_activation(out_corr)
# Redirect top input stream and concatenate
out_conv_redir = self.conv_redir(out_conv3a)
in_conv3_1 = torch.cat((out_conv_redir, out_corr), 1)
# Merged conv layers
out_conv3_1 = self.conv3_1(in_conv3_1)
out_conv4 = self.conv4_1(self.conv4(out_conv3_1))
out_conv5 = self.conv5_1(self.conv5(out_conv4))
out_conv6 = self.conv6_1(self.conv6(out_conv5))
flow6 = self.predict_flow6(out_conv6)
flow6_up = self.upsampled_flow6_to_5(flow6)
out_deconv5 = self.deconv5(out_conv6)
concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
flow5 = self.predict_flow5(concat5)
flow5_up = self.upsampled_flow5_to_4(flow5)
out_deconv4 = self.deconv4(concat5)
concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
flow4 = self.predict_flow4(concat4)
flow4_up = self.upsampled_flow4_to_3(flow4)
out_deconv3 = self.deconv3(concat4)
concat3 = torch.cat((out_conv3_1,out_deconv3,flow4_up),1)
flow3 = self.predict_flow3(concat3)
flow3_up = self.upsampled_flow3_to_2(flow3)
out_deconv2 = self.deconv2(concat3)
concat2 = torch.cat((out_conv2a,out_deconv2,flow3_up),1)
flow2 = self.predict_flow2(concat2)
if self.training:
return flow2,flow3,flow4,flow5,flow6
else:
return self.upsample1(flow2*self.div_flow)
class FlowNet2S(FlowNetS.FlowNetS):
def __init__(self, args, batchNorm=False, div_flow=20):
super(FlowNet2S,self).__init__(args, input_channels = 6, batchNorm=batchNorm)
self.rgb_max = args.rgb_max
self.div_flow = div_flow
def forward(self, inputs):
rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
x = (inputs - rgb_mean) / self.rgb_max
x = torch.cat( (x[:,:,0,:,:], x[:,:,1,:,:]), dim = 1)
out_conv1 = self.conv1(x)
out_conv2 = self.conv2(out_conv1)
out_conv3 = self.conv3_1(self.conv3(out_conv2))
out_conv4 = self.conv4_1(self.conv4(out_conv3))
out_conv5 = self.conv5_1(self.conv5(out_conv4))
out_conv6 = self.conv6_1(self.conv6(out_conv5))
flow6 = self.predict_flow6(out_conv6)
flow6_up = self.upsampled_flow6_to_5(flow6)
out_deconv5 = self.deconv5(out_conv6)
concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
flow5 = self.predict_flow5(concat5)
flow5_up = self.upsampled_flow5_to_4(flow5)
out_deconv4 = self.deconv4(concat5)
concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
flow4 = self.predict_flow4(concat4)
flow4_up = self.upsampled_flow4_to_3(flow4)
out_deconv3 = self.deconv3(concat4)
concat3 = torch.cat((out_conv3,out_deconv3,flow4_up),1)
flow3 = self.predict_flow3(concat3)
flow3_up = self.upsampled_flow3_to_2(flow3)
out_deconv2 = self.deconv2(concat3)
concat2 = torch.cat((out_conv2,out_deconv2,flow3_up),1)
flow2 = self.predict_flow2(concat2)
if self.training:
return flow2,flow3,flow4,flow5,flow6
else:
return self.upsample1(flow2*self.div_flow)
class FlowNet2SD(FlowNetSD.FlowNetSD):
def __init__(self, args, batchNorm=False, div_flow=20):
super(FlowNet2SD,self).__init__(args, batchNorm=batchNorm)
self.rgb_max = args.rgb_max
self.div_flow = div_flow
def forward(self, inputs):
rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
x = (inputs - rgb_mean) / self.rgb_max
x = torch.cat( (x[:,:,0,:,:], x[:,:,1,:,:]), dim = 1)
out_conv0 = self.conv0(x)
out_conv1 = self.conv1_1(self.conv1(out_conv0))
out_conv2 = self.conv2_1(self.conv2(out_conv1))
out_conv3 = self.conv3_1(self.conv3(out_conv2))
out_conv4 = self.conv4_1(self.conv4(out_conv3))
out_conv5 = self.conv5_1(self.conv5(out_conv4))
out_conv6 = self.conv6_1(self.conv6(out_conv5))
flow6 = self.predict_flow6(out_conv6)
flow6_up = self.upsampled_flow6_to_5(flow6)
out_deconv5 = self.deconv5(out_conv6)
concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
out_interconv5 = self.inter_conv5(concat5)
flow5 = self.predict_flow5(out_interconv5)
flow5_up = self.upsampled_flow5_to_4(flow5)
out_deconv4 = self.deconv4(concat5)
concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
out_interconv4 = self.inter_conv4(concat4)
flow4 = self.predict_flow4(out_interconv4)
flow4_up = self.upsampled_flow4_to_3(flow4)
out_deconv3 = self.deconv3(concat4)
concat3 = torch.cat((out_conv3,out_deconv3,flow4_up),1)
out_interconv3 = self.inter_conv3(concat3)
flow3 = self.predict_flow3(out_interconv3)
flow3_up = self.upsampled_flow3_to_2(flow3)
out_deconv2 = self.deconv2(concat3)
concat2 = torch.cat((out_conv2,out_deconv2,flow3_up),1)
out_interconv2 = self.inter_conv2(concat2)
flow2 = self.predict_flow2(out_interconv2)
if self.training:
return flow2,flow3,flow4,flow5,flow6
else:
return self.upsample1(flow2*self.div_flow)
class FlowNet2CS(nn.Module):
def __init__(self, args, batchNorm=False, div_flow = 20.):
super(FlowNet2CS,self).__init__()
self.batchNorm = batchNorm
self.div_flow = div_flow
self.rgb_max = args.rgb_max
self.args = args
self.channelnorm = ChannelNorm()
# First Block (FlowNetC)
self.flownetc = FlowNetC.FlowNetC(args, batchNorm=self.batchNorm)
self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')
if args.fp16:
self.resample1 = nn.Sequential(
tofp32(),
Resample2d(),
tofp16())
else:
self.resample1 = Resample2d()
# Block (FlowNetS1)
self.flownets_1 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
self.upsample2 = nn.Upsample(scale_factor=4, mode='bilinear')
for m in self.modules():
if isinstance(m, nn.Conv2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
if isinstance(m, nn.ConvTranspose2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
# init_deconv_bilinear(m.weight)
def forward(self, inputs):
rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
x = (inputs - rgb_mean) / self.rgb_max
x1 = x[:,:,0,:,:]
x2 = x[:,:,1,:,:]
x = torch.cat((x1,x2), dim = 1)
# flownetc
flownetc_flow2 = self.flownetc(x)[0]
flownetc_flow = self.upsample1(flownetc_flow2*self.div_flow)
# warp img1 to img0; magnitude of diff between img0 and warped_img1
resampled_img1 = self.resample1(x[:,3:,:,:], flownetc_flow)
diff_img0 = x[:,:3,:,:] - resampled_img1
norm_diff_img0 = self.channelnorm(diff_img0)
# concat img0, img1, img1->img0, flow, diff-mag
concat1 = torch.cat((x, resampled_img1, flownetc_flow/self.div_flow, norm_diff_img0), dim=1)
# flownets1
flownets1_flow2 = self.flownets_1(concat1)[0]
flownets1_flow = self.upsample2(flownets1_flow2*self.div_flow)
return flownets1_flow
class FlowNet2CSS(nn.Module):
def __init__(self, args, batchNorm=False, div_flow = 20.):
super(FlowNet2CSS,self).__init__()
self.batchNorm = batchNorm
self.div_flow = div_flow
self.rgb_max = args.rgb_max
self.args = args
self.channelnorm = ChannelNorm()
# First Block (FlowNetC)
self.flownetc = FlowNetC.FlowNetC(args, batchNorm=self.batchNorm)
self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')
if args.fp16:
self.resample1 = nn.Sequential(
tofp32(),
Resample2d(),
tofp16())
else:
self.resample1 = Resample2d()
# Block (FlowNetS1)
self.flownets_1 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
self.upsample2 = nn.Upsample(scale_factor=4, mode='bilinear')
if args.fp16:
self.resample2 = nn.Sequential(
tofp32(),
Resample2d(),
tofp16())
else:
self.resample2 = Resample2d()
# Block (FlowNetS2)
self.flownets_2 = FlowNetS.FlowNetS(args, batchNorm=self.batchNorm)
self.upsample3 = nn.Upsample(scale_factor=4, mode='nearest')
for m in self.modules():
if isinstance(m, nn.Conv2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
if isinstance(m, nn.ConvTranspose2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
# init_deconv_bilinear(m.weight)
def forward(self, inputs):
rgb_mean = inputs.contiguous().view(inputs.size()[:2]+(-1,)).mean(dim=-1).view(inputs.size()[:2] + (1,1,1,))
x = (inputs - rgb_mean) / self.rgb_max
x1 = x[:,:,0,:,:]
x2 = x[:,:,1,:,:]
x = torch.cat((x1,x2), dim = 1)
# flownetc
flownetc_flow2 = self.flownetc(x)[0]
flownetc_flow = self.upsample1(flownetc_flow2*self.div_flow)
# warp img1 to img0; magnitude of diff between img0 and warped_img1
resampled_img1 = self.resample1(x[:,3:,:,:], flownetc_flow)
diff_img0 = x[:,:3,:,:] - resampled_img1
norm_diff_img0 = self.channelnorm(diff_img0)
# concat img0, img1, img1->img0, flow, diff-mag
concat1 = torch.cat((x, resampled_img1, flownetc_flow/self.div_flow, norm_diff_img0), dim=1)
# flownets1
flownets1_flow2 = self.flownets_1(concat1)[0]
flownets1_flow = self.upsample2(flownets1_flow2*self.div_flow)
# warp img1 to img0 using flownets1; magnitude of diff between img0 and warped_img1
resampled_img1 = self.resample2(x[:,3:,:,:], flownets1_flow)
diff_img0 = x[:,:3,:,:] - resampled_img1
norm_diff_img0 = self.channelnorm(diff_img0)
# concat img0, img1, img1->img0, flow, diff-mag
concat2 = torch.cat((x, resampled_img1, flownets1_flow/self.div_flow, norm_diff_img0), dim=1)
# flownets2
flownets2_flow2 = self.flownets_2(concat2)[0]
flownets2_flow = self.upsample3(flownets2_flow2 * self.div_flow)
return flownets2_flow
================================================
FILE: dvs/flownet2/networks/FlowNetC.py
================================================
import torch
import torch.nn as nn
from torch.nn import init
import math
import numpy as np
from .correlation_package.correlation import Correlation
from .submodules import *
'Parameter count = 39,175,298'
class FlowNetC(nn.Module):
def __init__(self,args, batchNorm=True, div_flow = 20):
super(FlowNetC,self).__init__()
self.batchNorm = batchNorm
self.div_flow = div_flow
self.conv1 = conv(self.batchNorm, 3, 64, kernel_size=7, stride=2)
self.conv2 = conv(self.batchNorm, 64, 128, kernel_size=5, stride=2)
self.conv3 = conv(self.batchNorm, 128, 256, kernel_size=5, stride=2)
self.conv_redir = conv(self.batchNorm, 256, 32, kernel_size=1, stride=1)
if args.fp16:
self.corr = nn.Sequential(
tofp32(),
Correlation(pad_size=20, kernel_size=1, max_displacement=20, stride1=1, stride2=2, corr_multiply=1),
tofp16())
else:
self.corr = Correlation(pad_size=20, kernel_size=1, max_displacement=20, stride1=1, stride2=2, corr_multiply=1)
self.corr_activation = nn.LeakyReLU(0.1,inplace=True)
self.conv3_1 = conv(self.batchNorm, 473, 256)
self.conv4 = conv(self.batchNorm, 256, 512, stride=2)
self.conv4_1 = conv(self.batchNorm, 512, 512)
self.conv5 = conv(self.batchNorm, 512, 512, stride=2)
self.conv5_1 = conv(self.batchNorm, 512, 512)
self.conv6 = conv(self.batchNorm, 512, 1024, stride=2)
self.conv6_1 = conv(self.batchNorm,1024, 1024)
self.deconv5 = deconv(1024,512)
self.deconv4 = deconv(1026,256)
self.deconv3 = deconv(770,128)
self.deconv2 = deconv(386,64)
self.predict_flow6 = predict_flow(1024)
self.predict_flow5 = predict_flow(1026)
self.predict_flow4 = predict_flow(770)
self.predict_flow3 = predict_flow(386)
self.predict_flow2 = predict_flow(194)
self.upsampled_flow6_to_5 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=True)
self.upsampled_flow5_to_4 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=True)
self.upsampled_flow4_to_3 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=True)
self.upsampled_flow3_to_2 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=True)
for m in self.modules():
if isinstance(m, nn.Conv2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
if isinstance(m, nn.ConvTranspose2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
# init_deconv_bilinear(m.weight)
self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')
def forward(self, x):
x1 = x[:,0:3,:,:]
x2 = x[:,3::,:,:]
out_conv1a = self.conv1(x1)
out_conv2a = self.conv2(out_conv1a)
out_conv3a = self.conv3(out_conv2a)
# FlownetC bottom input stream
out_conv1b = self.conv1(x2)
out_conv2b = self.conv2(out_conv1b)
out_conv3b = self.conv3(out_conv2b)
# Merge streams
out_corr = self.corr(out_conv3a, out_conv3b) # False
out_corr = self.corr_activation(out_corr)
# Redirect top input stream and concatenate
out_conv_redir = self.conv_redir(out_conv3a)
in_conv3_1 = torch.cat((out_conv_redir, out_corr), 1)
# Merged conv layers
out_conv3_1 = self.conv3_1(in_conv3_1)
out_conv4 = self.conv4_1(self.conv4(out_conv3_1))
out_conv5 = self.conv5_1(self.conv5(out_conv4))
out_conv6 = self.conv6_1(self.conv6(out_conv5))
flow6 = self.predict_flow6(out_conv6)
flow6_up = self.upsampled_flow6_to_5(flow6)
out_deconv5 = self.deconv5(out_conv6)
concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
flow5 = self.predict_flow5(concat5)
flow5_up = self.upsampled_flow5_to_4(flow5)
out_deconv4 = self.deconv4(concat5)
concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
flow4 = self.predict_flow4(concat4)
flow4_up = self.upsampled_flow4_to_3(flow4)
out_deconv3 = self.deconv3(concat4)
concat3 = torch.cat((out_conv3_1,out_deconv3,flow4_up),1)
flow3 = self.predict_flow3(concat3)
flow3_up = self.upsampled_flow3_to_2(flow3)
out_deconv2 = self.deconv2(concat3)
concat2 = torch.cat((out_conv2a,out_deconv2,flow3_up),1)
flow2 = self.predict_flow2(concat2)
if self.training:
return flow2,flow3,flow4,flow5,flow6
else:
return flow2,
================================================
FILE: dvs/flownet2/networks/FlowNetFusion.py
================================================
import torch
import torch.nn as nn
from torch.nn import init
import math
import numpy as np
from .submodules import *
'Parameter count = 581,226'
class FlowNetFusion(nn.Module):
def __init__(self,args, batchNorm=True):
super(FlowNetFusion,self).__init__()
self.batchNorm = batchNorm
self.conv0 = conv(self.batchNorm, 11, 64)
self.conv1 = conv(self.batchNorm, 64, 64, stride=2)
self.conv1_1 = conv(self.batchNorm, 64, 128)
self.conv2 = conv(self.batchNorm, 128, 128, stride=2)
self.conv2_1 = conv(self.batchNorm, 128, 128)
self.deconv1 = deconv(128,32)
self.deconv0 = deconv(162,16)
self.inter_conv1 = i_conv(self.batchNorm, 162, 32)
self.inter_conv0 = i_conv(self.batchNorm, 82, 16)
self.predict_flow2 = predict_flow(128)
self.predict_flow1 = predict_flow(32)
self.predict_flow0 = predict_flow(16)
self.upsampled_flow2_to_1 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
self.upsampled_flow1_to_0 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
for m in self.modules():
if isinstance(m, nn.Conv2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
if isinstance(m, nn.ConvTranspose2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
# init_deconv_bilinear(m.weight)
def forward(self, x):
out_conv0 = self.conv0(x)
out_conv1 = self.conv1_1(self.conv1(out_conv0))
out_conv2 = self.conv2_1(self.conv2(out_conv1))
flow2 = self.predict_flow2(out_conv2)
flow2_up = self.upsampled_flow2_to_1(flow2)
out_deconv1 = self.deconv1(out_conv2)
concat1 = torch.cat((out_conv1,out_deconv1,flow2_up),1)
out_interconv1 = self.inter_conv1(concat1)
flow1 = self.predict_flow1(out_interconv1)
flow1_up = self.upsampled_flow1_to_0(flow1)
out_deconv0 = self.deconv0(concat1)
concat0 = torch.cat((out_conv0,out_deconv0,flow1_up),1)
out_interconv0 = self.inter_conv0(concat0)
flow0 = self.predict_flow0(out_interconv0)
return flow0
================================================
FILE: dvs/flownet2/networks/FlowNetS.py
================================================
'''
Portions of this code copyright 2017, Clement Pinard
'''
import torch
import torch.nn as nn
from torch.nn import init
import math
import numpy as np
from .submodules import *
'Parameter count = 38,676,504'
class FlowNetS(nn.Module):
def __init__(self, args, input_channels = 12, batchNorm=True):
super(FlowNetS,self).__init__()
self.batchNorm = batchNorm
self.conv1 = conv(self.batchNorm, input_channels, 64, kernel_size=7, stride=2)
self.conv2 = conv(self.batchNorm, 64, 128, kernel_size=5, stride=2)
self.conv3 = conv(self.batchNorm, 128, 256, kernel_size=5, stride=2)
self.conv3_1 = conv(self.batchNorm, 256, 256)
self.conv4 = conv(self.batchNorm, 256, 512, stride=2)
self.conv4_1 = conv(self.batchNorm, 512, 512)
self.conv5 = conv(self.batchNorm, 512, 512, stride=2)
self.conv5_1 = conv(self.batchNorm, 512, 512)
self.conv6 = conv(self.batchNorm, 512, 1024, stride=2)
self.conv6_1 = conv(self.batchNorm,1024, 1024)
self.deconv5 = deconv(1024,512)
self.deconv4 = deconv(1026,256)
self.deconv3 = deconv(770,128)
self.deconv2 = deconv(386,64)
self.predict_flow6 = predict_flow(1024)
self.predict_flow5 = predict_flow(1026)
self.predict_flow4 = predict_flow(770)
self.predict_flow3 = predict_flow(386)
self.predict_flow2 = predict_flow(194)
self.upsampled_flow6_to_5 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=False)
self.upsampled_flow5_to_4 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=False)
self.upsampled_flow4_to_3 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=False)
self.upsampled_flow3_to_2 = nn.ConvTranspose2d(2, 2, 4, 2, 1, bias=False)
for m in self.modules():
if isinstance(m, nn.Conv2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
if isinstance(m, nn.ConvTranspose2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
# init_deconv_bilinear(m.weight)
self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')
def forward(self, x):
out_conv1 = self.conv1(x)
out_conv2 = self.conv2(out_conv1)
out_conv3 = self.conv3_1(self.conv3(out_conv2))
out_conv4 = self.conv4_1(self.conv4(out_conv3))
out_conv5 = self.conv5_1(self.conv5(out_conv4))
out_conv6 = self.conv6_1(self.conv6(out_conv5))
flow6 = self.predict_flow6(out_conv6)
flow6_up = self.upsampled_flow6_to_5(flow6)
out_deconv5 = self.deconv5(out_conv6)
concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
flow5 = self.predict_flow5(concat5)
flow5_up = self.upsampled_flow5_to_4(flow5)
out_deconv4 = self.deconv4(concat5)
concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
flow4 = self.predict_flow4(concat4)
flow4_up = self.upsampled_flow4_to_3(flow4)
out_deconv3 = self.deconv3(concat4)
concat3 = torch.cat((out_conv3,out_deconv3,flow4_up),1)
flow3 = self.predict_flow3(concat3)
flow3_up = self.upsampled_flow3_to_2(flow3)
out_deconv2 = self.deconv2(concat3)
concat2 = torch.cat((out_conv2,out_deconv2,flow3_up),1)
flow2 = self.predict_flow2(concat2)
if self.training:
return flow2,flow3,flow4,flow5,flow6
else:
return flow2,
================================================
FILE: dvs/flownet2/networks/FlowNetSD.py
================================================
import torch
import torch.nn as nn
from torch.nn import init
import math
import numpy as np
from .submodules import *
'Parameter count = 45,371,666'
class FlowNetSD(nn.Module):
def __init__(self, args, batchNorm=True):
super(FlowNetSD,self).__init__()
self.batchNorm = batchNorm
self.conv0 = conv(self.batchNorm, 6, 64)
self.conv1 = conv(self.batchNorm, 64, 64, stride=2)
self.conv1_1 = conv(self.batchNorm, 64, 128)
self.conv2 = conv(self.batchNorm, 128, 128, stride=2)
self.conv2_1 = conv(self.batchNorm, 128, 128)
self.conv3 = conv(self.batchNorm, 128, 256, stride=2)
self.conv3_1 = conv(self.batchNorm, 256, 256)
self.conv4 = conv(self.batchNorm, 256, 512, stride=2)
self.conv4_1 = conv(self.batchNorm, 512, 512)
self.conv5 = conv(self.batchNorm, 512, 512, stride=2)
self.conv5_1 = conv(self.batchNorm, 512, 512)
self.conv6 = conv(self.batchNorm, 512, 1024, stride=2)
self.conv6_1 = conv(self.batchNorm,1024, 1024)
self.deconv5 = deconv(1024,512)
self.deconv4 = deconv(1026,256)
self.deconv3 = deconv(770,128)
self.deconv2 = deconv(386,64)
self.inter_conv5 = i_conv(self.batchNorm, 1026, 512)
self.inter_conv4 = i_conv(self.batchNorm, 770, 256)
self.inter_conv3 = i_conv(self.batchNorm, 386, 128)
self.inter_conv2 = i_conv(self.batchNorm, 194, 64)
self.predict_flow6 = predict_flow(1024)
self.predict_flow5 = predict_flow(512)
self.predict_flow4 = predict_flow(256)
self.predict_flow3 = predict_flow(128)
self.predict_flow2 = predict_flow(64)
self.upsampled_flow6_to_5 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
self.upsampled_flow5_to_4 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
self.upsampled_flow4_to_3 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
self.upsampled_flow3_to_2 = nn.ConvTranspose2d(2, 2, 4, 2, 1)
for m in self.modules():
if isinstance(m, nn.Conv2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
if isinstance(m, nn.ConvTranspose2d):
if m.bias is not None:
init.uniform_(m.bias)
init.xavier_uniform_(m.weight)
# init_deconv_bilinear(m.weight)
self.upsample1 = nn.Upsample(scale_factor=4, mode='bilinear')
def forward(self, x):
out_conv0 = self.conv0(x)
out_conv1 = self.conv1_1(self.conv1(out_conv0))
out_conv2 = self.conv2_1(self.conv2(out_conv1))
out_conv3 = self.conv3_1(self.conv3(out_conv2))
out_conv4 = self.conv4_1(self.conv4(out_conv3))
out_conv5 = self.conv5_1(self.conv5(out_conv4))
out_conv6 = self.conv6_1(self.conv6(out_conv5))
flow6 = self.predict_flow6(out_conv6)
flow6_up = self.upsampled_flow6_to_5(flow6)
out_deconv5 = self.deconv5(out_conv6)
concat5 = torch.cat((out_conv5,out_deconv5,flow6_up),1)
out_interconv5 = self.inter_conv5(concat5)
flow5 = self.predict_flow5(out_interconv5)
flow5_up = self.upsampled_flow5_to_4(flow5)
out_deconv4 = self.deconv4(concat5)
concat4 = torch.cat((out_conv4,out_deconv4,flow5_up),1)
out_interconv4 = self.inter_conv4(concat4)
flow4 = self.predict_flow4(out_interconv4)
flow4_up = self.upsampled_flow4_to_3(flow4)
out_deconv3 = self.deconv3(concat4)
concat3 = torch.cat((out_conv3,out_deconv3,flow4_up),1)
out_interconv3 = self.inter_conv3(concat3)
flow3 = self.predict_flow3(out_interconv3)
flow3_up = self.upsampled_flow3_to_2(flow3)
out_deconv2 = self.deconv2(concat3)
concat2 = torch.cat((out_conv2,out_deconv2,flow3_up),1)
out_interconv2 = self.inter_conv2(concat2)
flow2 = self.predict_flow2(out_interconv2)
if self.training:
return flow2,flow3,flow4,flow5,flow6
else:
return flow2,
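# The decoder widths in FlowNetSD above (1026, 770, 386, 194) follow from the
# torch.cat calls in forward(): encoder skip channels + deconv output channels
# + 2 flow channels. The sketch below only re-derives those numbers; the helper
# name concat_channels and the levels table are hypothetical and are not part
# of the repository.

```python
# Hypothetical helper, not part of the repository: recompute the decoder
# input widths of FlowNetSD from the encoder and deconv widths.
FLOW_CHANNELS = 2  # each upsampled flow field carries (u, v)


def concat_channels(skip_channels, deconv_channels):
    """Channel count of torch.cat((skip, deconv_out, upsampled_flow), 1)."""
    return skip_channels + deconv_channels + FLOW_CHANNELS


# (encoder skip width, deconv output width) for concat5 .. concat2
levels = {
    "concat5": (512, 512),  # out_conv5, deconv5(1024 -> 512)
    "concat4": (512, 256),  # out_conv4, deconv4(1026 -> 256)
    "concat3": (256, 128),  # out_conv3, deconv3(770 -> 128)
    "concat2": (128, 64),   # out_conv2, deconv2(386 -> 64)
}

widths = {name: concat_channels(s, d) for name, (s, d) in levels.items()}
print(widths)
```

# Each value matches the input width of the corresponding inter_conv layer,
# and concat5..concat3 also match the input widths of deconv4..deconv2.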
================================================
FILE: dvs/flownet2/networks/__init__.py
================================================
================================================
FILE: dvs/flownet2/networks/channelnorm_package/__init__.py
================================================
================================================
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm.py
================================================
from torch.autograd import Function, Variable
from torch.nn.modules.module import Module
import channelnorm_cuda
class ChannelNormFunction(Function):
@staticmethod
def forward(ctx, input1, norm_deg=2):
assert input1.is_contiguous()
b, _, h, w = input1.size()
output = input1.new(b, 1, h, w).zero_()
channelnorm_cuda.forward(input1, output, norm_deg)
ctx.save_for_backward(input1, output)
ctx.norm_deg = norm_deg
return output
@staticmethod
def backward(ctx, grad_output):
input1, output = ctx.saved_tensors
grad_input1 = Variable(input1.new(input1.size()).zero_())
channelnorm_cuda.backward(input1, output, grad_output.data,
grad_input1.data, ctx.norm_deg)
return grad_input1, None
class ChannelNorm(Module):
def __init__(self, norm_deg=2):
super(ChannelNorm, self).__init__()
self.norm_deg = norm_deg
def forward(self, input1):
return ChannelNormFunction.apply(input1, self.norm_deg)
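# For checking the CUDA op on small inputs, a plain-Python reference can help.
# The forward kernel in channelnorm_kernel.cu accumulates val * val over the
# channel axis and takes the square root, so this sketch does the same on
# nested lists shaped [B][C][H][W]. The function name channel_norm_reference
# is hypothetical and is not part of the repository.

```python
import math


def channel_norm_reference(batch):
    """L2 norm over the channel axis of a nested list shaped [B][C][H][W].

    Returns a nested list shaped [B][1][H][W].
    """
    out = []
    for sample in batch:
        height = len(sample[0])
        width = len(sample[0][0])
        plane = [
            [math.sqrt(sum(channel[y][x] ** 2 for channel in sample))
             for x in range(width)]
            for y in range(height)
        ]
        out.append([plane])  # single output channel
    return out


example = [[[[3.0, 0.0]], [[4.0, 1.0]]]]  # B=1, C=2, H=1, W=2
print(channel_norm_reference(example))
```

# Comparing this against ChannelNorm on a tiny tensor is one way to
# sanity-check a fresh build of the extension.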
================================================
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm_cuda.cc
================================================
#include <torch/torch.h>
#include <ATen/ATen.h>
#include "channelnorm_kernel.cuh"
int channelnorm_cuda_forward(
at::Tensor& input1,
at::Tensor& output,
int norm_deg) {
channelnorm_kernel_forward(input1, output, norm_deg);
return 1;
}
int channelnorm_cuda_backward(
at::Tensor& input1,
at::Tensor& output,
at::Tensor& gradOutput,
at::Tensor& gradInput1,
int norm_deg) {
channelnorm_kernel_backward(input1, output, gradOutput, gradInput1, norm_deg);
return 1;
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("forward", &channelnorm_cuda_forward, "Channel norm forward (CUDA)");
m.def("backward", &channelnorm_cuda_backward, "Channel norm backward (CUDA)");
}
================================================
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm_kernel.cu
================================================
#include <ATen/ATen.h>
#include <ATen/Context.h>
#include <ATen/cuda/CUDAContext.h>
#include "channelnorm_kernel.cuh"
#define CUDA_NUM_THREADS 512
#define DIM0(TENSOR) ((TENSOR).x)
#define DIM1(TENSOR) ((TENSOR).y)
#define DIM2(TENSOR) ((TENSOR).z)
#define DIM3(TENSOR) ((TENSOR).w)
#define DIM3_INDEX(TENSOR, xx, yy, zz, ww) ((TENSOR)[((xx) * (TENSOR##_stride.x)) + ((yy) * (TENSOR##_stride.y)) + ((zz) * (TENSOR##_stride.z)) + ((ww) * (TENSOR##_stride.w))])
using at::Half;
template <typename scalar_t>
__global__ void kernel_channelnorm_update_output(
const int n,
const scalar_t* __restrict__ input1,
const long4 input1_size,
const long4 input1_stride,
scalar_t* __restrict__ output,
const long4 output_size,
const long4 output_stride,
int norm_deg) {
int index = blockIdx.x * blockDim.x + threadIdx.x;
if (index >= n) {
return;
}
int dim_b = DIM0(output_size);
int dim_c = DIM1(output_size);
int dim_h = DIM2(output_size);
int dim_w = DIM3(output_size);
int dim_chw = dim_c * dim_h * dim_w;
int b = ( index / dim_chw ) % dim_b;
int y = ( index / dim_w ) % dim_h;
int x = ( index ) % dim_w;
int i1dim_c = DIM1(input1_size);
int i1dim_h = DIM2(input1_size);
int i1dim_w = DIM3(input1_size);
int i1dim_chw = i1dim_c * i1dim_h * i1dim_w;
int i1dim_hw = i1dim_h * i1dim_w;
float result = 0.0;
for (int c = 0; c < i1dim_c; ++c) {
int i1Index = b * i1dim_chw + c * i1dim_hw + y * i1dim_w + x;
scalar_t val = input1[i1Index];
result += static_cast<float>(val * val);
}
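// norm_deg is not used here: the kernel always takes the square root of the
// sum of squares over channels, i.e. the L2 norm.
result = sqrt(result);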
output[index] = static_cast<scalar_t>(result);
}
template <typename scalar_t>
__global__ void kernel_channelnorm_backward_input1(
const int n,
const scalar_t* __restrict__ input1, const long4 input1_size, const long4 input1_stride,
const scalar_t* __restrict__ output, const long4 output_size, const long4 output_stride,
const scalar_t* __restrict__ gradOutput, const long4 gradOutput_size, const long4 gradOutput_stride,
scalar_t* __restrict__ gradInput, const long4 gradInput_size, const long4 gradInput_stride,
int norm_deg) {
int index = blockIdx.x * blockDim.x + threadIdx.x;
if (index >= n) {
return;
}
float val = 0.0;
int dim_b = DIM0(gradInput_size);
int dim_c = DIM1(gradInput_size);
int dim_h = DIM2(gradInput_size);
int dim_w = DIM3(gradInput_size);
int dim_chw = dim_c * dim_h * dim_w;
int dim_hw = dim_h * dim_w;
int b = ( index / dim_chw ) % dim_b;
int y = ( index / dim_w ) % dim_h;
int x = ( index ) % dim_w;
int outIndex = b * dim_hw + y * dim_w + x;
// d(norm)/d(input) = input / norm; the 1e-9 term guards against division by zero.
val = static_cast<float>(gradOutput[outIndex]) * static_cast<float>(input1[index]) / (static_cast<float>(output[outIndex])+1e-9);
gradInput[index] = static_cast<scalar_t>(val);
}
void channelnorm_kernel_forward(
at::Tensor& input1,
at::Tensor& output,
int norm_deg) {
const long4 input1_size = make_long4(input1.size(0), input1.size(1), input1.size(2), input1.size(3));
const long4 input1_stride = make_long4(input1.stride(0), input1.stride(1), input1.stride(2), input1.stride(3));
const long4 output_size = make_long4(output.size(0), output.size(1), output.size(2), output.size(3));
const long4 output_stride = make_long4(output.stride(0), output.stride(1), output.stride(2), output.stride(3));
int n = output.numel();
AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "channelnorm_forward", ([&] {
kernel_channelnorm_update_output<scalar_t><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
n,
input1.data<scalar_t>(),
input1_size,
input1_stride,
output.data<scalar_t>(),
output_size,
output_stride,
norm_deg);
}));
// TODO: ATen-equivalent check
// THCudaCheck(cudaGetLastError());
}
void channelnorm_kernel_backward(
at::Tensor& input1,
at::Tensor& output,
at::Tensor& gradOutput,
at::Tensor& gradInput1,
int norm_deg) {
const long4 input1_size = make_long4(input1.size(0), input1.size(1), input1.size(2), input1.size(3));
const long4 input1_stride = make_long4(input1.stride(0), input1.stride(1), input1.stride(2), input1.stride(3));
const long4 output_size = make_long4(output.size(0), output.size(1), output.size(2), output.size(3));
const long4 output_stride = make_long4(output.stride(0), output.stride(1), output.stride(2), output.stride(3));
const long4 gradOutput_size = make_long4(gradOutput.size(0), gradOutput.size(1), gradOutput.size(2), gradOutput.size(3));
const long4 gradOutput_stride = make_long4(gradOutput.stride(0), gradOutput.stride(1), gradOutput.stride(2), gradOutput.stride(3));
const long4 gradInput1_size = make_long4(gradInput1.size(0), gradInput1.size(1), gradInput1.size(2), gradInput1.size(3));
const long4 gradInput1_stride = make_long4(gradInput1.stride(0), gradInput1.stride(1), gradInput1.stride(2), gradInput1.stride(3));
int n = gradInput1.numel();
AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "channelnorm_backward_input1", ([&] {
kernel_channelnorm_backward_input1<scalar_t><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
n,
input1.data<scalar_t>(),
input1_size,
input1_stride,
output.data<scalar_t>(),
output_size,
output_stride,
gradOutput.data<scalar_t>(),
gradOutput_size,
gradOutput_stride,
gradInput1.data<scalar_t>(),
gradInput1_size,
gradInput1_stride,
norm_deg
);
}));
// TODO: Add ATen-equivalent check
// THCudaCheck(cudaGetLastError());
}
================================================
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm_kernel.cuh
================================================
#pragma once
#include <ATen/ATen.h>
void channelnorm_kernel_forward(
at::Tensor& input1,
at::Tensor& output,
int norm_deg);
void channelnorm_kernel_backward(
at::Tensor& input1,
at::Tensor& output,
at::Tensor& gradOutput,
at::Tensor& gradInput1,
int norm_deg);
================================================
FILE: dvs/flownet2/networks/channelnorm_package/setup.py
================================================
#!/usr/bin/env python3
import os
import torch
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
cxx_args = ['-std=c++11']
nvcc_args = [
'-gencode', 'arch=compute_52,code=sm_52',
'-gencode', 'arch=compute_60,code=sm_60',
'-gencode', 'arch=compute_61,code=sm_61',
'-gencode', 'arch=compute_70,code=sm_70',
'-gencode', 'arch=compute_70,code=compute_70'
]
setup(
name='channelnorm_cuda',
ext_modules=[
CUDAExtension('channelnorm_cuda', [
'channelnorm_cuda.cc',
'channelnorm_kernel.cu'
], extra_compile_args={'cxx': cxx_args, 'nvcc': nvcc_args})
],
cmdclass={
'build_ext': BuildExtension
})
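# The nvcc_args list above repeats one pattern per architecture. A minimal
# sketch of generating it from a list of compute capabilities follows; the
# helper name gencode_args and its parameters are hypothetical, and the
# architecture numbers are simply the ones already listed in this setup.py.

```python
def gencode_args(sm_archs, ptx_arch=None):
    """Build nvcc -gencode flags.

    sm_archs: compute capabilities to emit native SASS for, e.g. [52, 60].
    ptx_arch: optional capability to also embed PTX for (forward compatibility).
    """
    args = []
    for arch in sm_archs:
        args += ['-gencode', 'arch=compute_{0},code=sm_{0}'.format(arch)]
    if ptx_arch is not None:
        args += ['-gencode', 'arch=compute_{0},code=compute_{0}'.format(ptx_arch)]
    return args


nvcc_args = gencode_args([52, 60, 61, 70], ptx_arch=70)
print(nvcc_args)
```

# With these inputs the result is the same list that is written out by hand
# above; GPUs newer than the listed architectures rely on the embedded PTX.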
================================================
FILE: dvs/flownet2/networks/correlation_package/__init__.py
================================================
================================================
FILE: dvs/flownet2/networks/correlation_package/correlation.py
================================================
import torch
from torch.nn.modules.module import Module
from torch.autograd import Function
import correlation_cuda
class CorrelationFunction(Function):
@staticmethod
def forward(ctx, input1, input2, pad_size=3, kernel_size=3, max_displacement=20, stride1=1, stride2=2, corr_multiply=1):
ctx.save_for_backward(input1, input2)
ctx.pad_size = pad_size
ctx.kernel_size = kernel_size
ctx.max_displacement = max_displacement
ctx.stride1 = stride1
ctx.stride2 = stride2
ctx.corr_multiply = corr_multiply
with torch.cuda.device_of(input1):
rbot1 = input1.new()
rbot2 = input2.new()
output = input1.new()
correlation_cuda.forward(input1, input2, rbot1, rbot2, output,
ctx.pad_size, ctx.kernel_size, ctx.max_displacement, ctx.stride1, ctx.stride2, ctx.corr_multiply)
return output
@staticmethod
def backward(ctx, grad_output):
input1, input2 = ctx.saved_tensors
with torch.cuda.device_of(input1):
rbot1 = input1.new()
rbot2 = input2.new()
grad_input1 = input1.new()
grad_input2 = input2.new()
correlation_cuda.backward(input1, input2, rbot1, rbot2, grad_output, grad_input1, grad_input2,
ctx.pad_size, ctx.kernel_size, ctx.max_displacement, ctx.stride1, ctx.stride2, ctx.corr_multiply)
return grad_input1, grad_input2, None, None, None, None, None, None
class Correlation(Module):
def __init__(self, pad_size=0, kernel_size=0, max_displacement=0, stride1=1, stride2=2, corr_multiply=1):
super(Correlation, self).__init__()
self.pad_size = pad_size
self.kernel_size = kernel_size
self.max_displacement = max_displacement
self.stride1 = stride1
self.stride2 = stride2
self.corr_multiply = corr_multiply
def forward(self, input1, input2):
result = CorrelationFunction.apply(input1, input2, self.pad_size, self.kernel_size, self.max_displacement, self.stride1, self.stride2, self.corr_multiply)
return result
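# The output tensor of Correlation is sized inside correlation_forward_cuda
# (correlation_cuda.cc). The sketch below mirrors that arithmetic in plain
# Python so the shape can be predicted without a GPU. The function name
# correlation_output_shape is hypothetical, and the parameter values in the
# example are illustrative assumptions rather than values taken from this
# file.

```python
import math


def correlation_output_shape(batch, height, width, pad_size, kernel_size,
                             max_displacement, stride1, stride2):
    """Mirror the size computation in correlation_forward_cuda."""
    # int(a / b) truncates toward zero like C integer division.
    kernel_radius = int((kernel_size - 1) / 2)
    border_radius = kernel_radius + max_displacement
    padded_h = height + 2 * pad_size
    padded_w = width + 2 * pad_size
    # One output channel per displacement on a (2r + 1) x (2r + 1) grid.
    grid = int(max_displacement / stride2) * 2 + 1
    out_channels = grid * grid
    out_h = math.ceil((padded_h - 2 * border_radius) / stride1)
    out_w = math.ceil((padded_w - 2 * border_radius) / stride1)
    return (batch, out_channels, out_h, out_w)


# Illustrative values only (assumed, not read from this repository).
shape = correlation_output_shape(batch=1, height=48, width=64, pad_size=20,
                                 kernel_size=1, max_displacement=20,
                                 stride1=1, stride2=2)
print(shape)
```

# When pad_size equals the border radius and stride1 is 1, the spatial size
# of the output equals that of the input.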
================================================
FILE: dvs/flownet2/networks/correlation_package/correlation_cuda.cc
================================================
#include <torch/torch.h>
#include <ATen/ATen.h>
#include <ATen/Context.h>
#include <ATen/cuda/CUDAContext.h>
#include <stdio.h>
#include <iostream>
#include "correlation_cuda_kernel.cuh"
int correlation_forward_cuda(at::Tensor& input1, at::Tensor& input2, at::Tensor& rInput1, at::Tensor& rInput2, at::Tensor& output,
int pad_size,
int kernel_size,
int max_displacement,
int stride1,
int stride2,
int corr_type_multiply)
{
int batchSize = input1.size(0);
int nInputChannels = input1.size(1);
int inputHeight = input1.size(2);
int inputWidth = input1.size(3);
int kernel_radius = (kernel_size - 1) / 2;
int border_radius = kernel_radius + max_displacement;
int paddedInputHeight = inputHeight + 2 * pad_size;
int paddedInputWidth = inputWidth + 2 * pad_size;
int nOutputChannels = ((max_displacement/stride2)*2 + 1) * ((max_displacement/stride2)*2 + 1);
int outputHeight = ceil(static_cast<float>(paddedInputHeight - 2 * border_radius) / static_cast<float>(stride1));
int outputwidth = ceil(static_cast<float>(paddedInputWidth - 2 * border_radius) / static_cast<float>(stride1));
rInput1.resize_({batchSize, paddedInputHeight, paddedInputWidth, nInputChannels});
rInput2.resize_({batchSize, paddedInputHeight, paddedInputWidth, nInputChannels});
output.resize_({batchSize, nOutputChannels, outputHeight, outputwidth});
rInput1.fill_(0);
rInput2.fill_(0);
output.fill_(0);
int success = correlation_forward_cuda_kernel(
output,
output.size(0),
output.size(1),
output.size(2),
output.size(3),
output.stride(0),
output.stride(1),
output.stride(2),
output.stride(3),
input1,
input1.size(1),
input1.size(2),
input1.size(3),
input1.stride(0),
input1.stride(1),
input1.stride(2),
input1.stride(3),
input2,
input2.size(1),
input2.stride(0),
input2.stride(1),
input2.stride(2),
input2.stride(3),
rInput1,
rInput2,
pad_size,
kernel_size,
max_displacement,
stride1,
stride2,
corr_type_multiply,
at::cuda::getCurrentCUDAStream()
//at::globalContext().getCurrentCUDAStream()
);
//check for errors
if (!success) {
AT_ERROR("CUDA call failed");
}
return 1;
}
int correlation_backward_cuda(at::Tensor& input1, at::Tensor& input2, at::Tensor& rInput1, at::Tensor& rInput2, at::Tensor& gradOutput,
at::Tensor& gradInput1, at::Tensor& gradInput2,
int pad_size,
int kernel_size,
int max_displacement,
int stride1,
int stride2,
int corr_type_multiply)
{
int batchSize = input1.size(0);
int nInputChannels = input1.size(1);
int paddedInputHeight = input1.size(2)+ 2 * pad_size;
int paddedInputWidth = input1.size(3)+ 2 * pad_size;
int height = input1.size(2);
int width = input1.size(3);
rInput1.resize_({batchSize, paddedInputHeight, paddedInputWidth, nInputChannels});
rInput2.resize_({batchSize, paddedInputHeight, paddedInputWidth, nInputChannels});
gradInput1.resize_({batchSize, nInputChannels, height, width});
gradInput2.resize_({batchSize, nInputChannels, height, width});
rInput1.fill_(0);
rInput2.fill_(0);
gradInput1.fill_(0);
gradInput2.fill_(0);
int success = correlation_backward_cuda_kernel(gradOutput,
gradOutput.size(0),
gradOutput.size(1),
gradOutput.size(2),
gradOutput.size(3),
gradOutput.stride(0),
gradOutput.stride(1),
gradOutput.stride(2),
gradOutput.stride(3),
input1,
input1.size(1),
input1.size(2),
input1.size(3),
input1.stride(0),
input1.stride(1),
input1.stride(2),
input1.stride(3),
input2,
input2.stride(0),
input2.stride(1),
input2.stride(2),
input2.stride(3),
gradInput1,
gradInput1.stride(0),
gradInput1.stride(1),
gradInput1.stride(2),
gradInput1.stride(3),
gradInput2,
gradInput2.size(1),
gradInput2.stride(0),
gradInput2.stride(1),
gradInput2.stride(2),
gradInput2.stride(3),
rInput1,
rInput2,
pad_size,
kernel_size,
max_displacement,
stride1,
stride2,
corr_type_multiply,
at::cuda::getCurrentCUDAStream()
//at::globalContext().getCurrentCUDAStream()
);
if (!success) {
AT_ERROR("CUDA call failed");
}
return 1;
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("forward", &correlation_forward_cuda, "Correlation forward (CUDA)");
m.def("backward", &correlation_backward_cuda, "Correlation backward (CUDA)");
}
================================================
FILE: dvs/flownet2/networks/correlation_package/correlation_cuda_kernel.cu
================================================
#include <stdio.h>
#include "correlation_cuda_kernel.cuh"
#define CUDA_NUM_THREADS 1024
#define THREADS_PER_BLOCK 32
#define FULL_MASK 0xffffffff
#include <ATen/ATen.h>
#include <ATen/NativeFunctions.h>
#include <ATen/Dispatch.h>
#include <ATen/cuda/CUDAApplyUtils.cuh>
using at::Half;
// Sums val across the 32 lanes of a warp with shuffle-down; lane 0 ends up with the total.
template<typename scalar_t>
__forceinline__ __device__ scalar_t warpReduceSum(scalar_t val) {
for (int offset = 16; offset > 0; offset /= 2)
val += __shfl_down_sync(FULL_MASK, val, offset);
return val;
}
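// Sums val across a thread block: each warp reduces its lanes and writes a partial
// sum to shared memory, then warp 0 reduces the partial sums. Thread 0 holds the total.
template<typename scalar_t>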
__forceinline__ __device__ scalar_t blockReduceSum(scalar_t val) {
static __shared__ scalar_t shared[32];
int lane = threadIdx.x % warpSize;
int wid = threadIdx.x / warpSize;
val = warpReduceSum(val);
if (lane == 0)
shared[wid] = val;
__syncthreads();
val = (threadIdx.x < blockDim.x / warpSize) ? shared[lane] : 0;
if (wid == 0)
val = warpReduceSum(val);
return val;
}
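// Despite its name, this kernel copies an NCHW input into a padded NHWC buffer (rinput):
// one block per (n, y, x), with threads striding over the channels.
template <typename scalar_t>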
__global__ void channels_first(const scalar_t* __restrict__ input, scalar_t* rinput, int channels, int height, int width, int pad_size)
{
// n (batch size), c (num of channels), y (height), x (width)
int n = blockIdx.x;
int y = blockIdx.y;
int x = blockIdx.z;
int ch_off = threadIdx.x;
scalar_t value;
int dimcyx = channels * height * width;
int dimyx = height * width;
int p_dimx = (width + 2 * pad_size);
int p_dimy = (height + 2 * pad_size);
int p_dimyxc = channels * p_dimy * p_dimx;
int p_dimxc = p_dimx * channels;
for (int c = ch_off; c < channels; c += THREADS_PER_BLOCK) {
value = input[n * dimcyx + c * dimyx + y * width + x];
rinput[n * p_dimyxc + (y + pad_size) * p_dimxc + (x + pad_size) * channels + c] = value;
}
}
template<typename scalar_t>
__global__ void correlation_forward(scalar_t* __restrict__ output, const int nOutputChannels,
const int outputHeight, const int outputWidth, const scalar_t* __restrict__ rInput1,
const int nInputChannels, const int inputHeight, const int inputWidth,
const scalar_t* __restrict__ rInput2, const int pad_size, const int kernel_size,
const int max_displacement, const int stride1, const int stride2) {
int32_t pInputWidth = inputWidth + 2 * pad_size;
int32_t pInputHeight = inputHeight + 2 * pad_size;
int32_t kernel_rad = (kernel_size - 1) / 2;
int32_t displacement_rad = max_displacement / stride2;
int32_t displacement_size = 2 * displacement_rad + 1;
int32_t n = blockIdx.x;
int32_t y1 = blockIdx.y * stride1 + max_displacement;
int32_t x1 = blockIdx.z * stride1 + max_displacement;
int32_t c = threadIdx.x;
int32_t pdimyxc = pInputHeight * pInputWidth * nInputChannels;
int32_t pdimxc = pInputWidth * nInputChannels;
int32_t pdimc = nInputChannels;
int32_t tdimcyx = nOutputChannels * outputHeight * outputWidth;
int32_t tdimyx = outputHeight * outputWidth;
int32_t tdimx = outputWidth;
int32_t nelems = kernel_size * kernel_size * pdimc;
// element-wise product along channel axis
for (int tj = -displacement_rad; tj <= displacement_rad; ++tj) {
for (int ti = -displacement_rad; ti <= displacement_rad; ++ti) {
int x2 = x1 + ti * stride2;
int y2 = y1 + tj * stride2;
float acc0 = 0.0f;
for (int j = -kernel_rad; j <= kernel_rad; ++j) {
for (int i = -kernel_rad; i <= kernel_rad; ++i) {
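// Each thread strides over channels in steps of blockDim.x
// (the kernel is launched with THREADS_PER_BLOCK threads per block).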
#pragma unroll
for (int ch = c; ch < pdimc; ch += blockDim.x) {
int indx1 = n * pdimyxc + (y1 + j) * pdimxc
+ (x1 + i) * pdimc + ch;
int indx2 = n * pdimyxc + (y2 + j) * pdimxc
+ (x2 + i) * pdimc + ch;
acc0 += static_cast<float>(rInput1[indx1] * rInput2[indx2]);
}
}
}
if (blockDim.x == warpSize) {
__syncwarp();
acc0 = warpReduceSum(acc0);
} else {
__syncthreads();
acc0 = blockReduceSum(acc0);
}
if (threadIdx.x == 0) {
int tc = (tj + displacement_rad) * displacement_size
+ (ti + displacement_rad);
const int tindx = n * tdimcyx + tc * tdimyx + blockIdx.y * tdimx
+ blockIdx.z;
output[tindx] = static_cast<scalar_t>(acc0 / nelems);
}
}
}
}
template <typename scalar_t>
__global__ void correlation_backward_input1(int item, scalar_t* gradInput1, int nInputChannels, int inputHeight, int inputWidth,
const scalar_t* __restrict__ gradOutput, int nOutputChannels, int outputHeight, int outputWidth,
const scalar_t* __restrict__ rInput2,
int pad_size,
int kernel_size,
int max_displacement,
int stride1,
int stride2)
{
// n (batch size), c (num of channels), y (height), x (width)
int n = item;
int y = blockIdx.x * stride1 + pad_size;
int x = blockIdx.y * stride1 + pad_size;
int c = blockIdx.z;
int tch_off = threadIdx.x;
int kernel_rad = (kernel_size - 1) / 2;
int displacement_rad = max_displacement / stride2;
int displacement_size = 2 * displacement_rad + 1;
int xmin = (x - kernel_rad - max_displacement) / stride1;
int ymin = (y - kernel_rad - max_displacement) / stride1;
int xmax = (x + kernel_rad - max_displacement) / stride1;
int ymax = (y + kernel_rad - max_displacement) / stride1;
if (xmax < 0 || ymax < 0 || xmin >= outputWidth || ymin >= outputHeight) {
// assumes gradInput1 is pre-allocated and zero filled
return;
}
if (xmin > xmax || ymin > ymax) {
// assumes gradInput1 is pre-allocated and zero filled
return;
}
xmin = max(0,xmin);
xmax = min(outputWidth-1,xmax);
ymin = max(0,ymin);
ymax = min(outputHeight-1,ymax);
int pInputWidth = inputWidth + 2 * pad_size;
int pInputHeight = inputHeight + 2 * pad_size;
int pdimyxc = pInputHeight * pInputWidth * nInputChannels;
int pdimxc = pInputWidth * nInputChannels;
int pdimc = nInputChannels;
int tdimcyx = nOutputChannels * outputHeight * outputWidth;
int tdimyx = outputHeight * outputWidth;
int tdimx = outputWidth;
int odimcyx = nInputChannels * inputHeight* inputWidth;
int odimyx = inputHeight * inputWidth;
int odimx = inputWidth;
scalar_t nelems = kernel_size * kernel_size * nInputChannels;
__shared__ scalar_t prod_sum[THREADS_PER_BLOCK];
prod_sum[tch_off] = 0;
for (int tc = tch_off; tc < nOutputChannels; tc += THREADS_PER_BLOCK) {
int i2 = (tc % displacement_size - displacement_rad) * stride2;
int j2 = (tc / displacement_size - displacement_rad) * stride2;
int indx2 = n * pdimyxc + (y + j2)* pdimxc + (x + i2) * pdimc + c;
scalar_t val2 = rInput2[indx2];
for (int j = ymin; j <= ymax; ++j) {
for (int i = xmin; i <= xmax; ++i) {
int tindx = n * tdimcyx + tc * tdimyx + j * tdimx + i;
prod_sum[tch_off] += gradOutput[tindx] * val2;
}
}
}
__syncthreads();
if(tch_off == 0) {
scalar_t reduce_sum = 0;
for(int idx = 0; idx < THREADS_PER_BLOCK; idx++) {
reduce_sum += prod_sum[idx];
}
const int indx1 = n * odimcyx + c * odimyx + (y - pad_size) * odimx + (x - pad_size);
gradInput1[indx1] = reduce_sum / nelems;
}
}
template <typename scalar_t>
__global__ void correlation_backward_input2(int item, scalar_t* gradInput2, int nInputChannels, int inputHeight, int inputWidth,
const scalar_t* __restrict__ gradOutput, int nOutputChannels, int outputHeight, int outputWidth,
const scalar_t* __restrict__ rInput1,
int pad_size,
int kernel_size,
int max_displacement,
int stride1,
int stride2)
{
// n (batch size), c (num of channels), y (height), x (width)
int n = item;
int y = blockIdx.x * stride1 + pad_size;
int x = blockIdx.y * stride1 + pad_size;
int c = blockIdx.z;
int tch_off = threadIdx.x;
int kernel_rad = (kernel_size - 1) / 2;
int displacement_rad = max_displacement / stride2;
int displacement_size = 2 * displacement_rad + 1;
int pInputWidth = inputWidth + 2 * pad_size;
int pInputHeight = inputHeight + 2 * pad_size;
int pdimyxc = pInputHeight * pInputWidth * nInputChannels;
int pdimxc = pInputWidth * nInputChannels;
int pdimc = nInputChannels;
int tdimcyx = nOutputChannels * outputHeight * outputWidth;
int tdimyx = outputHeight * outputWidth;
int tdimx = outputWidth;
int odimcyx = nInputChannels * inputHeight* inputWidth;
int odimyx = inputHeight * inputWidth;
int odimx = inputWidth;
scalar_t nelems = kernel_size * kernel_size * nInputChannels;
__shared__ scalar_t prod_sum[THREADS_PER_BLOCK];
prod_sum[tch_off] = 0;
for (int tc = tch_off; tc < nOutputChannels; tc += THREADS_PER_BLOCK) {
int i2 = (tc % displacement_size - displacement_rad) * stride2;
int j2 = (tc / displacement_size - displacement_rad) * stride2;
int xmin = (x - kernel_rad - max_displacement - i2) / stride1;
int ymin = (y - kernel_rad - max_displacement - j2) / stride1;
int xmax = (x + kernel_rad - max_displacement - i2) / stride1;
int ymax = (y + kernel_rad - max_displacement - j2) / stride1;
if (xmax < 0 || ymax < 0 || xmin >= outputWidth || ymin >= outputHeight) {
// assumes gradInput2 is pre-allocated and zero filled
continue;
}
if (xmin > xmax || ymin > ymax) {
// assumes gradInput2 is pre-allocated and zero filled
continue;
}
xmin = max(0,xmin);
xmax = min(outputWidth-1,xmax);
ymin = max(0,ymin);
ymax = min(outputHeight-1,ymax);
int indx1 = n * pdimyxc + (y - j2)* pdimxc + (x - i2) * pdimc + c;
scalar_t val1 = rInput1[indx1];
for (int j = ymin; j <= ymax; ++j) {
for (int i = xmin; i <= xmax; ++i) {
int tindx = n * tdimcyx + tc * tdimyx + j * tdimx + i;
prod_sum[tch_off] += gradOutput[tindx] * val1;
}
}
}
__syncthreads();
if(tch_off == 0) {
scalar_t reduce_sum = 0;
for(int idx = 0; idx < THREADS_PER_BLOCK; idx++) {
reduce_sum += prod_sum[idx];
}
const int indx2 = n * odimcyx + c * odimyx + (y - pad_size) * odimx + (x - pad_size);
gradInput2[indx2] = reduce_sum / nelems;
}
}
int correlation_forward_cuda_kernel(at::Tensor& output,
int ob,
int oc,
int oh,
int ow,
int osb,
int osc,
int osh,
int osw,
at::Tensor& input1,
int ic,
int ih,
int iw,
int isb,
int isc,
int ish,
int isw,
at::Tensor& input2,
int gc,
int gsb,
int gsc,
int gsh,
int gsw,
at::Tensor& rInput1,
at::Tensor& rInput2,
int pad_size,
int kernel_size,
int max_displacement,
int stride1,
int stride2,
int corr_type_multiply,
cudaStream_t stream)
{
int batchSize = ob;
int nInputChannels = ic;
int inputWidth = iw;
int inputHeight = ih;
int nOutputChannels = oc;
int outputWidth = ow;
int outputHeight = oh;
dim3 blocks_grid(batchSize, inputHeight, inputWidth);
dim3 threads_block(THREADS_PER_BLOCK);
AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "channels_first_fwd_1", ([&] {
channels_first<scalar_t><<<blocks_grid,threads_block, 0, stream>>>(
input1.data<scalar_t>(), rInput1.data<scalar_t>(), nInputChannels, inputHeight, inputWidth, pad_size);
}));
AT_DISPATCH_FLOATING_TYPES_AND_HALF(input2.type(), "channels_first_fwd_2", ([&] {
channels_first<scalar_t><<<blocks_grid,threads_block, 0, stream>>> (
input2.data<scalar_t>(), rInput2.data<scalar_t>(), nInputChannels, inputHeight, inputWidth, pad_size);
}));
dim3 threadsPerBlock(THREADS_PER_BLOCK);
dim3 totalBlocksCorr(batchSize, outputHeight, outputWidth);
AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "correlation_forward", ([&] {
correlation_forward<scalar_t><<<totalBlocksCorr, threadsPerBlock, 0, stream>>>
(output.data<scalar_t>(), nOutputChannels, outputHeight, outputWidth,
rInput1.data<scalar_t>(), nInputChannels, inputHeight, inputWidth,
rInput2.data<scalar_t>(),
pad_size,
kernel_size,
max_displacement,
stride1,
stride2);
}));
cudaError_t err = cudaGetLastError();
// check for errors
if (err != cudaSuccess) {
printf("error in correlation_forward_cuda_kernel: %s\n", cudaGetErrorString(err));
return 0;
}
return 1;
}
int correlation_backward_cuda_kernel(
at::Tensor& gradOutput,
int gob,
int goc,
int goh,
int gow,
int gosb,
int gosc,
int gosh,
int gosw,
at::Tensor& input1,
int ic,
int ih,
int iw,
int isb,
int isc,
int ish,
int isw,
at::Tensor& input2,
int gsb,
int gsc,
int gsh,
int gsw,
at::Tensor& gradInput1,
int gisb,
int gisc,
int gish,
int gisw,
at::Tensor& gradInput2,
int ggc,
int ggsb,
int ggsc,
int ggsh,
int ggsw,
at::Tensor& rInput1,
at::Tensor& rInput2,
int pad_size,
int kernel_size,
int max_displacement,
int stride1,
int stride2,
int corr_type_multiply,
cudaStream_t stream)
{
int batchSize = gob;
int num = batchSize;
int nInputChannels = ic;
int inputWidth = iw;
int inputHeight = ih;
int nOutputChannels = goc;
int outputWidth = gow;
int outputHeight = goh;
dim3 blocks_grid(batchSize, inputHeight, inputWidth);
dim3 threads_block(THREADS_PER_BLOCK);
AT_DISPATCH_FLOATING_TYPES_AND_HALF(input1.type(), "channels_first_bwd_1", ([&] {
channels_first<scalar_t><<<blocks_grid, threads_block, 0, stream>>>(
input1.data<scalar_t>(),
rInput1.data<scalar_t>(),
nInputChannels,
inputHeight,
inputWidth,
pad_size
);
}));
AT_DISPATCH_FLOATING_TYPES_AND_HALF(input2.type(), "channels_first_bwd_2", ([&] {
channels_first<scalar_t><<<blocks_grid, threads_block, 0, stream>>>(
input2.data<scalar_t>(),
rInput2.data<scalar_t>(),
nInputChannels,
inputHeight,
inputWidth,
pad_size
);
}));
dim3 threadsPerBlock(THREADS_PER_BLOCK);
dim3 totalBlocksCorr(inputHeight, inputWidth, nInputChannels);
for (int n = 0; n < num; ++n) {
AT_DISPATCH_FLOATING_TYPES_AND_HALF(input2.type(), "correlation_backward_input1", ([&] {
correlation_backward_input1<scalar_t><<<totalBlocksCorr, threadsPerBlock, 0, stream>>> (
n, gradInput1.data<scalar_t>(), nInputChannels, inputHeight, inputWidth,
gradOutput.data<scalar_t>(), nOutputChannels, outputHeight, outputWidth,
rInput2.data<scalar_t>(),
pad_size,
kernel_size,
max_displacement,
stride1,
stride2);
}));
}
for(int n = 0; n < batchSize; n++) {
AT_DISPATCH_FLOATING_TYPES_AND_HALF(rInput1.type(), "correlation_backward_input2", ([&] {
correlation_backward_input2<scalar_t><<<totalBlocksCorr, threadsPerBlock, 0, stream>>>(
n, gradInput2.data<scalar_t>(), nInputChannels, inputHeight, inputWidth,
gradOutput.data<scalar_t>(), nOutputChannels, outputHeight, outputWidth,
rInput1.data<scalar_t>(),
pad_size,
kernel_size,
max_displacement,
stride1,
stride2);
}));
}
// check for errors
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess) {
printf("error in correlation_backward_cuda_kernel: %s\n", cudaGetErrorString(err));
return 0;
}
return 1;
}
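// warpReduceSum above relies on __shfl_down_sync with offsets 16, 8, 4, 2, 1.
// The plain-Python simulation below is only a sketch of that data movement,
// assuming that a lane reading past the end of the warp gets its own value
// back. The names WARP_SIZE and warp_reduce_sum are hypothetical and are not
// part of the repository.

```python
WARP_SIZE = 32


def warp_reduce_sum(lane_values):
    """Simulate the shuffle-down tree reduction over one warp.

    Returns the per-lane values after the reduction; only lane 0 is
    guaranteed to hold the full sum.
    """
    vals = list(lane_values)
    assert len(vals) == WARP_SIZE
    offset = WARP_SIZE // 2
    while offset > 0:
        # All lanes read their neighbour's value before any lane updates.
        shifted = [vals[i + offset] if i + offset < WARP_SIZE else vals[i]
                   for i in range(WARP_SIZE)]
        vals = [v + s for v, s in zip(vals, shifted)]
        offset //= 2
    return vals


result = warp_reduce_sum(range(WARP_SIZE))
print(result[0])
```

// Only lane 0 is meaningful afterwards, which is why correlation_forward
// writes the output from threadIdx.x == 0.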
================================================
FILE: dvs/flownet2/networks/correlation_package/correlation_cuda_kernel.cuh
================================================
#pragma once
#include <ATen/ATen.h>
#include <ATen/Context.h>
#include <cuda_runtime.h>
int correlation_forward_cuda_kernel(at::Tensor& output,
int ob,
int oc,
int oh,
int ow,
int osb,
int osc,
int osh,
int osw,
at::Tensor& input1,
int ic,
int ih,
int iw,
int isb,
int isc,
int ish,
int isw,
at::Tensor& input2,
int gc,
int gsb,
int gsc,
int gsh,
int gsw,
at::Tensor& rInput1,
at::Tensor& rInput2,
int pad_size,
int kernel_size,
int max_displacement,
int stride1,
int stride2,
int corr_type_multiply,
cudaStream_t stream);
int correlation_backward_cuda_kernel(
at::Tensor& gradOutput,
int gob,
int goc,
int goh,
int gow,
int gosb,
int gosc,
int gosh,
int gosw,
at::Tensor& input1,
int ic,
int ih,
int iw,
int isb,
int isc,
int ish,
int isw,
at::Tensor& input2,
int gsb,
int gsc,
int gsh,
int gsw,
at::Tensor& gradInput1,
int gisb,
int gisc,
int gish,
int gisw,
at::Tensor& gradInput2,
int ggc,
int ggsb,
int ggsc,
int ggsh,
int ggsw,
at::Tensor& rInput1,
at::Tensor& rInput2,
int pad_size,
int kernel_size,
int max_displacement,
int stride1,
int stride2,
int corr_type_multiply,
cudaStream_t stream);
================================================
FILE: dvs/flownet2/networks/correlation_package/setup.py
================================================
#!/usr/bin/env python3
import os
import torch
from setuptools import setup, find_packages
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
cxx_args = ['-std=c++11']
nvcc_args = [
'-gencode', 'arch=compute_50,code=sm_50',
'-gencode', 'arch=compute_52,code=sm_52',
'-gencode', 'arch=compute_60,code=sm_60',
'-gencode', 'arch=compute_61,code=sm_61',
'-gencode', 'arch=compute_70,code=sm_70',
'-gencode', 'arch=compute_70,code=compute_70'
]
setup(
name='correlation_cuda',
ext_modules=[
CUDAExtension('correlation_cuda', [
'correlation_cuda.cc',
'correlation_cuda_kernel.cu'
], extra_compile_args={'cxx': cxx_args, 'nvcc': nvcc_args})
],
cmdclass={
'build_ext': BuildExtension
})
================================================
FILE: dvs/flownet2/networks/resample2d_package/__init__.py
================================================
================================================
FILE: dvs/flownet2/networks/resample2d_package/resample2d.py
================================================
from torch.nn.modules.module import Module
from torch.autograd import Function, Variable
import resample2d_cuda
class Resample2dFunction(Function):
    @staticmethod
    def forward(ctx, input1, input2, kernel_size=1, bilinear=True):
        assert input1.is_contiguous()
        assert input2.is_contiguous()
        ctx.save_for_backward(input1, input2)
        ctx.kernel_size = kernel_size
        ctx.bilinear = bilinear
        # Output takes its channel count from input1 and its batch/spatial size from the flow (input2).
        _, d, _, _ = input1.size()
        b, _, h, w = input2.size()
        output = input1.new(b, d, h, w).zero_()
        resample2d_cuda.forward(input1, input2, output, kernel_size, bilinear)
        return output
    @staticmethod
    def backward(ctx, grad_output):
        grad_output = grad_output.contiguous()
        assert grad_output.is_contiguous()
        input1, input2 = ctx.saved_tensors
        grad_input1 = Variable(input1.new(input1.size()).zero_())
        grad_input2 = Variable(input1.new(input2.size()).zero_())
        resample2d_cuda.backward(input1, input2, grad_output.data,
                                 grad_input1.data, grad_input2.data,
                                 ctx.kernel_size, ctx.bilinear)
        # No gradients for kernel_size and bilinear.
        return grad_input1, grad_input2, None, None
class Resample2d(Module):
    def __init__(self, kernel_size=1, bilinear=True):
        super(Resample2d, self).__init__()
        self.kernel_size = kernel_size
        self.bilinear = bilinear
    def forward(self, input1, input2):
        input1_c = input1.contiguous()
        return Resample2dFunction.apply(input1_c, input2, self.kernel_size, self.bilinear)
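# ---------------------------------------------------------------------------
# Illustrative sketch only (not part of the original module). It mirrors, in
# plain Python, what the bilinear branch of the CUDA forward kernel
# (kernel_resample2d_update_output) appears to compute for kernel_size=1:
# sample an image at the flow-displaced position (xf, yf), clamping the four
# neighbours to the image border. The helper name and the list-of-rows image
# layout are assumptions made for this example.
# ---------------------------------------------------------------------------
import math
def _bilinear_sample_reference(img, xf, yf):
    """Bilinearly sample a 2D list-of-rows `img` at float position (xf, yf)."""
    height, width = len(img), len(img[0])
    alpha = xf - math.floor(xf)  # weight of the right-hand neighbours
    beta = yf - math.floor(yf)   # weight of the bottom neighbours
    def clamp(value, upper):
        return max(min(value, upper), 0)
    x_left = clamp(int(math.floor(xf)), width - 1)
    x_right = clamp(int(math.floor(xf)) + 1, width - 1)
    y_top = clamp(int(math.floor(yf)), height - 1)
    y_bottom = clamp(int(math.floor(yf)) + 1, height - 1)
    return ((1 - alpha) * (1 - beta) * img[y_top][x_left]
            + alpha * (1 - beta) * img[y_top][x_right]
            + (1 - alpha) * beta * img[y_bottom][x_left]
            + alpha * beta * img[y_bottom][x_right])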
================================================
FILE: dvs/flownet2/networks/resample2d_package/resample2d_cuda.cc
================================================
#include <ATen/ATen.h>
#include <torch/torch.h>
#include "resample2d_kernel.cuh"
int resample2d_cuda_forward(
at::Tensor& input1,
at::Tensor& input2,
at::Tensor& output,
int kernel_size, bool bilinear) {
resample2d_kernel_forward(input1, input2, output, kernel_size, bilinear);
return 1;
}
int resample2d_cuda_backward(
at::Tensor& input1,
at::Tensor& input2,
at::Tensor& gradOutput,
at::Tensor& gradInput1,
at::Tensor& gradInput2,
int kernel_size, bool bilinear) {
resample2d_kernel_backward(input1, input2, gradOutput, gradInput1, gradInput2, kernel_size, bilinear);
return 1;
}
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("forward", &resample2d_cuda_forward, "Resample2D forward (CUDA)");
m.def("backward", &resample2d_cuda_backward, "Resample2D backward (CUDA)");
}
================================================
FILE: dvs/flownet2/networks/resample2d_package/resample2d_kernel.cu
================================================
#include <ATen/ATen.h>
#include <ATen/Context.h>
#include <ATen/cuda/CUDAContext.h>
#define CUDA_NUM_THREADS 512
#define THREADS_PER_BLOCK 64
#define DIM0(TENSOR) ((TENSOR).x)
#define DIM1(TENSOR) ((TENSOR).y)
#define DIM2(TENSOR) ((TENSOR).z)
#define DIM3(TENSOR) ((TENSOR).w)
#define DIM3_INDEX(TENSOR, xx, yy, zz, ww) ((TENSOR)[((xx) * (TENSOR##_stride.x)) + ((yy) * (TENSOR##_stride.y)) + ((zz) * (TENSOR##_stride.z)) + ((ww) * (TENSOR##_stride.w))])
template <typename scalar_t>
__global__ void kernel_resample2d_update_output(const int n,
const scalar_t* __restrict__ input1, const long4 input1_size, const long4 input1_stride,
const scalar_t* __restrict__ input2, const long4 input2_size, const long4 input2_stride,
scalar_t* __restrict__ output, const long4 output_size, const long4 output_stride, int kernel_size, bool bilinear) {
int index = blockIdx.x * blockDim.x + threadIdx.x;
if (index >= n) {
return;
}
scalar_t val = 0.0f;
int dim_b = DIM0(output_size);
int dim_c = DIM1(output_size);
int dim_h = DIM2(output_size);
int dim_w = DIM3(output_size);
int dim_chw = dim_c * dim_h * dim_w;
int dim_hw = dim_h * dim_w;
int b = ( index / dim_chw ) % dim_b;
int c = ( index / dim_hw ) % dim_c;
int y = ( index / dim_w ) % dim_h;
int x = ( index ) % dim_w;
scalar_t dx = DIM3_INDEX(input2, b, 0, y, x);
scalar_t dy = DIM3_INDEX(input2, b, 1, y, x);
scalar_t xf = static_cast<scalar_t>(x) + dx;
scalar_t yf = static_cast<scalar_t>(y) + dy;
scalar_t alpha = xf - floor(xf); // alpha
scalar_t beta = yf - floor(yf); // beta
if (bilinear) {
int xL = max(min( int (floor(xf)), dim_w-1), 0);
int xR = max(min( int (floor(xf)+1), dim_w -1), 0);
int yT = max(min( int (floor(yf)), dim_h-1), 0);
int yB = max(min( int (floor(yf)+1), dim_h-1), 0);
for (int fy = 0; fy < kernel_size; fy += 1) {
for (int fx = 0; fx < kernel_size; fx += 1) {
val += static_cast<float>((1. - alpha)*(1. - beta) * DIM3_INDEX(input1, b, c, yT + fy, xL + fx));
val += static_cast<float>((alpha)*(1. - beta) * DIM3_INDEX(input1, b, c, yT + fy, xR + fx));
val += static_cast<float>((1. - alpha)*(beta) * DIM3_INDEX(input1, b, c, yB + fy, xL + fx));
val += static_cast<float>((alpha)*(beta) * DIM3_INDEX(input1, b, c, yB + fy, xR + fx));
}
}
output[index] = val;
}
else {
int xN = max(min( int (floor(xf + 0.5)), dim_w - 1), 0);
int yN = max(min( int (floor(yf + 0.5)), dim_h - 1), 0);
output[index] = static_cast<float> ( DIM3_INDEX(input1, b, c, yN, xN) );
}
}
template <typename scalar_t>
__global__ void kernel_resample2d_backward_input1(
const int n, const scalar_t* __restrict__ input1, const long4 input1_size, const long4 input1_stride,
const scalar_t* __restrict__ input2, const long4 input2_size, const long4 input2_stride,
const scalar_t* __restrict__ gradOutput, const long4 gradOutput_size, const long4 gradOutput_stride,
scalar_t* __restrict__ gradInput, const long4 gradInput_size, const long4 gradInput_stride, int kernel_size, bool bilinear) {
int index = blockIdx.x * blockDim.x + threadIdx.x;
if (index >= n) {
return;
}
int dim_b = DIM0(gradOutput_size);
int dim_c = DIM1(gradOutput_size);
int dim_h = DIM2(gradOutput_size);
int dim_w = DIM3(gradOutput_size);
int dim_chw = dim_c * dim_h * dim_w;
int dim_hw = dim_h * dim_w;
int b = ( index / dim_chw ) % dim_b;
int c = ( index / dim_hw ) % dim_c;
int y = ( index / dim_w ) % dim_h;
int x = ( index ) % dim_w;
scalar_t dx = DIM3_INDEX(input2, b, 0, y, x);
scalar_t dy = DIM3_INDEX(input2, b, 1, y, x);
scalar_t xf = static_cast<scalar_t>(x) + dx;
scalar_t yf = static_cast<scalar_t>(y) + dy;
scalar_t alpha = xf - floor(xf); // fractional x offset; floor() matches the forward kernel for negative coordinates
scalar_t beta = yf - floor(yf); // fractional y offset
int idim_h = DIM2(input1_size);
int idim_w = DIM3(input1_size);
int xL = max(min( int (floor(xf)), idim_w-1), 0);
int xR = max(min( int (floor(xf)+1), idim_w -1), 0);
int yT = max(min( int (floor(yf)), idim_h-1), 0);
int yB = max(min( int (floor(yf)+1), idim_h-1), 0);
for (int fy = 0; fy < kernel_size; fy += 1) {
for (int fx = 0; fx < kernel_size; fx += 1) {
atomicAdd(&DIM3_INDEX(gradInput, b, c, (yT + fy), (xL + fx)), (1-alpha)*(1-beta) * DIM3_INDEX(gradOutput, b, c, y, x));
atomicAdd(&DIM3_INDEX(gradInput, b, c, (yT + fy), (xR + fx)), (alpha)*(1-beta) * DIM3_INDEX(gradOutput, b, c, y, x));
atomicAdd(&DIM3_INDEX(gradInput, b, c, (yB + fy), (xL + fx)), (1-alpha)*(beta) * DIM3_INDEX(gradOutput, b, c, y, x));
atomicAdd(&DIM3_INDEX(gradInput, b, c, (yB + fy), (xR + fx)), (alpha)*(beta) * DIM3_INDEX(gradOutput, b, c, y, x));
}
}
}
template <typename scalar_t>
__global__ void kernel_resample2d_backward_input2(
const int n, const scalar_t* __restrict__ input1, const long4 input1_size, const long4 input1_stride,
const scalar_t* __restrict__ input2, const long4 input2_size, const long4 input2_stride,
const scalar_t* __restrict__ gradOutput, const long4 gradOutput_size, const long4 gradOutput_stride,
scalar_t* __restrict__ gradInput, const long4 gradInput_size, const long4 gradInput_stride, int kernel_size, bool bilinear) {
int index = blockIdx.x * blockDim.x + threadIdx.x;
if (index >= n) {
return;
}
scalar_t output = 0.0;
int kernel_rad = (kernel_size - 1)/2;
int dim_b = DIM0(gradInput_size);
int dim_c = DIM1(gradInput_size);
int dim_h = DIM2(gradInput_size);
int dim_w = DIM3(gradInput_size);
int dim_chw = dim_c * dim_h * dim_w;
int dim_hw = dim_h * dim_w;
int b = ( index / dim_chw ) % dim_b;
int c = ( index / dim_hw ) % dim_c;
int y = ( index / dim_w ) % dim_h;
int x = ( index ) % dim_w;
int odim_c = DIM1(gradOutput_size);
scalar_t dx = DIM3_INDEX(input2, b, 0, y, x);
scalar_t dy = DIM3_INDEX(input2, b, 1, y, x);
scalar_t xf = static_cast<scalar_t>(x) + dx;
scalar_t yf = static_cast<scalar_t>(y) + dy;
int xL = max(min( int (floor(xf)), dim_w-1), 0);
int xR = max(min( int (floor(xf)+1), dim_w -1), 0);
int yT = max(min( int (floor(yf)), dim_h-1), 0);
int yB = max(min( int (floor(yf)+1), dim_h-1), 0);
if (c % 2) {
float gamma = 1 - (xf - floor(xf)); // 1 - alpha (horizontal weight of the left column)
for (int i = 0; i <= 2*kernel_rad; ++i) {
for (int j = 0; j <= 2*kernel_rad; ++j) {
for (int ch = 0; ch < odim_c; ++ch) {
output += (gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yB + j), (xL + i));
output -= (gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yT + j), (xL + i));
output += (1-gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yB + j), (xR + i));
output -= (1-gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yT + j), (xR + i));
}
}
}
}
else {
float gamma = 1 - (yf - floor(yf)); // 1 - beta (vertical weight of the top row)
for (int i = 0; i <= 2*kernel_rad; ++i) {
for (int j = 0; j <= 2*kernel_rad; ++j) {
for (int ch = 0; ch < odim_c; ++ch) {
output += (gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yT + j), (xR + i));
output -= (gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yT + j), (xL + i));
output += (1-gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yB + j), (xR + i));
output -= (1-gamma) * DIM3_INDEX(gradOutput, b, ch, y, x) * DIM3_INDEX(input1, b, ch, (yB + j), (xL + i));
}
}
}
}
gradInput[index] = output;
}
void resample2d_kernel_forward(
at::Tensor& input1,
at::Tensor& input2,
at::Tensor& output,
int kernel_size,
bool bilinear) {
int n = output.numel();
const long4 input1_size = make_long4(input1.size(0), input1.size(1), input1.size(2), input1.size(3));
const long4 input1_stride = make_long4(input1.stride(0), input1.stride(1), input1.stride(2), input1.stride(3));
const long4 input2_size = make_long4(input2.size(0), input2.size(1), input2.size(2), input2.size(3));
const long4 input2_stride = make_long4(input2.stride(0), input2.stride(1), input2.stride(2), input2.stride(3));
const long4 output_size = make_long4(output.size(0), output.size(1), output.size(2), output.size(3));
const long4 output_stride = make_long4(output.stride(0), output.stride(1), output.stride(2), output.stride(3));
// TODO: when atomicAdd gets resolved, change to AT_DISPATCH_FLOATING_TYPES_AND_HALF
// AT_DISPATCH_FLOATING_TYPES(input1.type(), "resample_forward_kernel", ([&] {
kernel_resample2d_update_output<float><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
n,
input1.data<float>(),
input1_size,
input1_stride,
input2.data<float>(),
input2_size,
input2_stride,
output.data<float>(),
output_size,
output_stride,
kernel_size,
bilinear);
// }));
// Surface kernel-launch errors (ATen counterpart of THCudaCheck).
AT_CUDA_CHECK(cudaGetLastError());
}
void resample2d_kernel_backward(
at::Tensor& input1,
at::Tensor& input2,
at::Tensor& gradOutput,
at::Tensor& gradInput1,
at::Tensor& gradInput2,
int kernel_size,
bool bilinear) {
int n = gradOutput.numel();
const long4 input1_size = make_long4(input1.size(0), input1.size(1), input1.size(2), input1.size(3));
const long4 input1_stride = make_long4(input1.stride(0), input1.stride(1), input1.stride(2), input1.stride(3));
const long4 input2_size = make_long4(input2.size(0), input2.size(1), input2.size(2), input2.size(3));
const long4 input2_stride = make_long4(input2.stride(0), input2.stride(1), input2.stride(2), input2.stride(3));
const long4 gradOutput_size = make_long4(gradOutput.size(0), gradOutput.size(1), gradOutput.size(2), gradOutput.size(3));
const long4 gradOutput_stride = make_long4(gradOutput.stride(0), gradOutput.stride(1), gradOutput.stride(2), gradOutput.stride(3));
const long4 gradInput1_size = make_long4(gradInput1.size(0), gradInput1.size(1), gradInput1.size(2), gradInput1.size(3));
const long4 gradInput1_stride = make_long4(gradInput1.stride(0), gradInput1.stride(1), gradInput1.stride(2), gradInput1.stride(3));
// AT_DISPATCH_FLOATING_TYPES(input1.type(), "resample_backward_input1", ([&] {
kernel_resample2d_backward_input1<float><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
n,
input1.data<float>(),
input1_size,
input1_stride,
input2.data<float>(),
input2_size,
input2_stride,
gradOutput.data<float>(),
gradOutput_size,
gradOutput_stride,
gradInput1.data<float>(),
gradInput1_size,
gradInput1_stride,
kernel_size,
bilinear
);
// }));
const long4 gradInput2_size = make_long4(gradInput2.size(0), gradInput2.size(1), gradInput2.size(2), gradInput2.size(3));
const long4 gradInput2_stride = make_long4(gradInput2.stride(0), gradInput2.stride(1), gradInput2.stride(2), gradInput2.stride(3));
n = gradInput2.numel();
// AT_DISPATCH_FLOATING_TYPES(gradInput2.type(), "resample_backward_input2", ([&] {
kernel_resample2d_backward_input2<float><<< (n + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream() >>>(
//at::globalContext().getCurrentCUDAStream() >>>(
n,
input1.data<float>(),
input1_size,
input1_stride,
input2.data<float>(),
input2_size,
input2_stride,
gradOutput.data<float>(),
gradOutput_size,
gradOutput_stride,
gradInput2.data<float>(),
gradInput2_size,
gradInput2_stride,
kernel_size,
bilinear
);
// }));
// Surface kernel-launch errors (ATen counterpart of THCudaCheck).
AT_CUDA_CHECK(cudaGetLastError());
}
================================================
FILE: dvs/flownet2/networks/resample2d_package/resample2d_kernel.cuh
================================================
#pragma once
#include <ATen/ATen.h>
void resample2d_kernel_forward(
at::Tensor& input1,
at::Tensor& input2,
at::Tensor& output,
int kernel_size,
bool bilinear);
void resample2d_kernel_backward(
at::Tensor& input1,
at::Tensor& input2,
at::Tensor& gradOutput,
at::Tensor& gradInput1,
at::Tensor& gradInput2,
int kernel_size,
bool bilinear);
================================================
FILE: dvs/flownet2/networks/resample2d_package/setup.py
================================================
#!/usr/bin/env python3
import os
import torch
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
cxx_args = ['-std=c++11']
nvcc_args = [
'-gencode', 'arch=compute_50,code=sm_50',
'-gencode', 'arch=compute_52,code=sm_52',
'-gencode', 'arch=compute_60,code=sm_60',
'-gencode', 'arch=compute_61,code=sm_61',
'-gencode', 'arch=compute_70,code=sm_70',
'-gencode', 'arch=compute_70,code=compute_70'
]
setup(
name='resample2d_cuda',
ext_modules=[
CUDAExtension('resample2d_cuda', [
'resample2d_cuda.cc',
'resample2d_kernel.cu'
], extra_compile_args={'cxx': cxx_args, 'nvcc': nvcc_args})
],
cmdclass={
'build_ext': BuildExtension
})
================================================
FILE: dvs/flownet2/networks/submodules.py
================================================
# freda (todo) :
import torch.nn as nn
import torch
import numpy as np
def conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1):
    # Conv (+ optional BatchNorm) + LeakyReLU; padding keeps the spatial size when stride == 1.
    if batchNorm:
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=(kernel_size-1)//2, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.LeakyReLU(0.1,inplace=True)
        )
    else:
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=(kernel_size-1)//2, bias=True),
            nn.LeakyReLU(0.1,inplace=True)
        )
def i_conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1, bias = True):
    # Same as conv() but without the trailing activation.
    if batchNorm:
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=(kernel_size-1)//2, bias=bias),
            nn.BatchNorm2d(out_planes),
        )
    else:
        return nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=(kernel_size-1)//2, bias=bias),
        )
def predict_flow(in_planes):
    # Two output channels: horizontal and vertical flow.
    return nn.Conv2d(in_planes,2,kernel_size=3,stride=1,padding=1,bias=True)
def deconv(in_planes, out_planes):
    # Transposed conv with kernel 4, stride 2, padding 1 doubles the spatial size.
    return nn.Sequential(
        nn.ConvTranspose2d(in_planes, out_planes, kernel_size=4, stride=2, padding=1, bias=True),
        nn.LeakyReLU(0.1,inplace=True)
    )
class tofp16(nn.Module):
    def __init__(self):
        super(tofp16, self).__init__()
    def forward(self, input):
        return input.half()
class tofp32(nn.Module):
    def __init__(self):
        super(tofp32, self).__init__()
    def forward(self, input):
        return input.float()
def init_deconv_bilinear(weight):
    # Fill every (out, in) kernel slice with the same bilinear upsampling filter.
    f_shape = weight.size()
    height, width = f_shape[-2], f_shape[-1]
    f = np.ceil(width/2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    bilinear = np.zeros([height, width])
    for x in range(width):
        for y in range(height):
            value = (1 - abs(x / f - c)) * (1 - abs(y / f - c))
            # Index as [row, column] so non-square kernels stay in bounds.
            bilinear[y, x] = value
    weight.data.fill_(0.)
    for i in range(f_shape[0]):
        for j in range(f_shape[1]):
            weight.data[i,j,:,:] = torch.from_numpy(bilinear)
def save_grad(grads, name):
    # Returns a hook that stores the incoming gradient under grads[name].
    def hook(grad):
        grads[name] = grad
    return hook
'''
def save_grad(grads, name):
def hook(grad):
grads[name] = grad
return hook
import torch
from channelnorm_package.modules.channelnorm import ChannelNorm
model = ChannelNorm().cuda()
grads = {}
a = 100*torch.autograd.Variable(torch.randn((1,3,5,5)).cuda(), requires_grad=True)
a.register_hook(save_grad(grads, 'a'))
b = model(a)
y = torch.mean(b)
y.backward()
'''
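# ---------------------------------------------------------------------------
# Illustrative sketch only (not part of the original module). It works through
# the spatial-size arithmetic behind conv() and deconv() above, using the
# standard nn.Conv2d / nn.ConvTranspose2d output-size formulas and assuming
# dilation 1 and no output_padding. The helper names are hypothetical.
# ---------------------------------------------------------------------------
def _conv_out_size(size, kernel_size=3, stride=1):
    """Output length of conv() along one axis; padding is (kernel_size-1)//2."""
    padding = (kernel_size - 1) // 2
    return (size + 2 * padding - kernel_size) // stride + 1
def _deconv_out_size(size, kernel_size=4, stride=2, padding=1):
    """Output length of deconv() along one axis (defaults match deconv())."""
    return (size - 1) * stride - 2 * padding + kernel_size
# With stride 1 the size is preserved, stride 2 halves it, and deconv()
# doubles it again, which is what lets decoder features line up with the
# encoder features at the same scale.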
================================================
FILE: dvs/flownet2/run.sh
================================================
#!/bin/bash
python main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \
--inference_dataset_root ./../video \
--resume ./FlowNet2_checkpoint.pth.tar \
--inference_visualize
================================================
FILE: dvs/flownet2/run_release.sh
================================================
#!/bin/bash
python main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \
--inference_dataset_root ./../dataset_release/test \
--resume ./FlowNet2_checkpoint.pth.tar \
--inference_visualize
python main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \
--inference_dataset_root ./../dataset_release/training \
--resume ./FlowNet2_checkpoint.pth.tar \
--inference_visualize
================================================
FILE: dvs/flownet2/utils/__init__.py
================================================
================================================
FILE: dvs/flownet2/utils/flow_utils.py
================================================
import numpy as np
import matplotlib.pyplot as plt
import os.path
TAG_CHAR = np.array([202021.25], np.float32)
def readFlow(fn):
    """ Read .flo file in Middlebury format"""
    # Code adapted from:
    # http://stackoverflow.com/questions/28013200/reading-middlebury-flow-files-with-python-bytes-array-numpy
    # WARNING: this will work on little-endian architectures (eg Intel x86) only!
    # print 'fn = %s'%(fn)
    with open(fn, 'rb') as f:
        magic = np.fromfile(f, np.float32, count=1)
        if magic.size != 1 or magic[0] != 202021.25:
            print('Magic number incorrect. Invalid .flo file')
            return None
        else:
            # Take the scalar out of each 1-element array before using it as a size.
            w = int(np.fromfile(f, np.int32, count=1)[0])
            h = int(np.fromfile(f, np.int32, count=1)[0])
            # print 'Reading %d x %d flo file\n' % (w, h)
            data = np.fromfile(f, np.float32, count=2*w*h)
            # Reshape data into 3D array (rows, columns, bands)
            # The reshape here is for visualization, the original code is (w,h,2)
            return np.resize(data, (h, w, 2))
def writeFlow(filename,uv,v=None):
    """ Write optical flow to file.
    If v is None, uv is assumed to contain both u and v channels,
    stacked in depth.
    Original code by Deqing Sun, adapted from Daniel Scharstein.
    """
    nBands = 2
    if v is None:
        assert(uv.ndim == 3)
        assert(uv.shape[2] == 2)
        u = uv[:,:,0]
        v = uv[:,:,1]
    else:
        u = uv
    assert(u.shape == v.shape)
    height,width = u.shape
    with open(filename,'wb') as f:
        # write the header
        f.write(TAG_CHAR)
        np.array(width).astype(np.int32).tofile(f)
        np.array(height).astype(np.int32).tofile(f)
        # arrange into matrix form (u and v interleaved along each row)
        tmp = np.zeros((height, width*nBands))
        tmp[:,np.arange(width)*2] = u
        tmp[:,np.arange(width)*2 + 1] = v
        tmp.astype(np.float32).tofile(f)
# ref: https://github.com/sampepose/flownet2-tf/
# blob/18f87081db44939414fc4a48834f9e0da3e69f4c/src/flowlib.py#L240
def visulize_flow_file(flow_filename, save_dir=None):
    flow_data = readFlow(flow_filename)
    img = flow2img(flow_data)
    # plt.imshow(img)
    # plt.show()
    if save_dir:
        # Strip the directory part and the 4-character extension (".flo").
        idx = flow_filename.rfind("/") + 1
        plt.imsave(os.path.join(save_dir, "%s-vis.png" % flow_filename[idx:-4]), img)
def flow2img(flow_data):
    """
    convert optical flow into color image
    :param flow_data:
    :return: color image
    """
    # print(flow_data.shape)
    # print(type(flow_data))
    u = flow_data[:, :, 0]
    v = flow_data[:, :, 1]
    UNKNOW_FLOW_THRESHOLD = 1e7
    pr1 = abs(u) > UNKNOW_FLOW_THRESHOLD
    pr2 = abs(v) > UNKNOW_FLOW_THRESHOLD
    idx_unknown = (pr1 | pr2)
    u[idx_unknown] = v[idx_unknown] = 0
    # get max value in each direction
    maxu = -999.
    maxv = -999.
    minu = 999.
    minv = 999.
    maxu = max(maxu, np.max(u))
    maxv = max(maxv, np.max(v))
    minu = min(minu, np.min(u))
    minv = min(minv, np.min(v))
    rad = np.sqrt(u ** 2 + v ** 2)
    maxrad = max(-1, np.max(rad))
    # Add eps to the divisor so an all-zero flow field does not divide by zero.
    u = u / (maxrad + np.finfo(float).eps)
    v = v / (maxrad + np.finfo(float).eps)
    img = compute_color(u, v)
    idx = np.repeat(idx_unknown[:, :, np.newaxis], 3, axis=2)
    img[idx] = 0
    return np.uint8(img)
def compute_color(u, v):
    """
    compute optical flow color map
    :param u: horizontal optical flow
    :param v: vertical optical flow
    :return:
    """
    height, width = u.shape
    img = np.zeros((height, width, 3))
    NAN_idx = np.isnan(u) | np.isnan(v)
    u[NAN_idx] = v[NAN_idx] = 0
    colorwheel = make_color_wheel()
    ncols = np.size(colorwheel, 0)
    rad = np.sqrt(u ** 2 + v ** 2)
    # Map the flow angle to a fractional (1-based) position on the color wheel.
    a = np.arctan2(-v, -u) / np.pi
    fk = (a + 1) / 2 * (ncols - 1) + 1
    k0 = np.floor(fk).astype(int)
    k1 = k0 + 1
    k1[k1 == ncols + 1] = 1
    f = fk - k0
    for i in range(0, np.size(colorwheel, 1)):
        tmp = colorwheel[:, i]
        col0 = tmp[k0 - 1] / 255
        col1 = tmp[k1 - 1] / 255
        # Interpolate between the two neighbouring wheel colors.
        col = (1 - f) * col0 + f * col1
        idx = rad <= 1
        # Desaturate towards white as the magnitude shrinks.
        col[idx] = 1 - rad[idx] * (1 - col[idx])
        notidx = np.logical_not(idx)
        col[notidx] *= 0.75
        img[:, :, i] = np.uint8(np.floor(255 * col * (1 - NAN_idx)))
    return img
def make_color_wheel():
    """
    Generate the color wheel according to the Middlebury color code
    :return: Color wheel
    """
    RY = 15
    YG = 6
    GC = 4
    CB = 11
    BM = 13
    MR = 6
    ncols = RY + YG + GC + CB + BM + MR
    colorwheel = np.zeros([ncols, 3])
    col = 0
    # RY
    colorwheel[0:RY, 0] = 255
    colorwheel[0:RY, 1] = np.transpose(np.floor(255 * np.arange(0, RY) / RY))
    col += RY
    # YG
    colorwheel[col:col + YG, 0] = 255 - np.transpose(np.floor(255 * np.arange(0, YG) / YG))
    colorwheel[col:col + YG, 1] = 255
    col += YG
    # GC
    colorwheel[col:col + GC, 1] = 255
    colorwheel[col:col + GC, 2] = np.transpose(np.floor(255 * np.arange(0, GC) / GC))
    col += GC
    # CB
    colorwheel[col:col + CB, 1] = 255 - np.transpose(np.floor(255 * np.arange(0, CB) / CB))
    colorwheel[col:col + CB, 2] = 255
    col += CB
    # BM
    colorwheel[col:col + BM, 2] = 255
    colorwheel[col:col + BM, 0] = np.transpose(np.floor(255 * np.arange(0, BM) / BM))
    col += BM
    # MR
    colorwheel[col:col + MR, 2] = 255 - np.transpose(np.floor(255 * np.arange(0, MR) / MR))
    colorwheel[col:col + MR, 0] = 255
    return colorwheel
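# ---------------------------------------------------------------------------
# Illustrative sketch only (not part of the original module). A stdlib-only
# round trip of the .flo layout that readFlow/writeFlow above handle: a float32
# tag (202021.25), int32 width, int32 height, then row-major interleaved (u, v)
# float32 pairs. The helper names are hypothetical, and little-endian byte
# order is assumed, in line with the warning in readFlow.
# ---------------------------------------------------------------------------
import struct
def _flo_pack(width, height, uv_pairs):
    """Serialize a row-major list of (u, v) tuples into .flo bytes."""
    assert len(uv_pairs) == width * height
    header = struct.pack('<fii', 202021.25, width, height)
    body = b''.join(struct.pack('<ff', u, v) for u, v in uv_pairs)
    return header + body
def _flo_unpack(blob):
    """Parse .flo bytes; returns (width, height, list of (u, v) tuples)."""
    tag, width, height = struct.unpack_from('<fii', blob, 0)
    if tag != 202021.25:
        raise ValueError('Magic number incorrect. Invalid .flo file')
    # The 12-byte header is followed by 2 * width * height float32 values.
    flat = struct.unpack_from('<%df' % (2 * width * height), blob, 12)
    return width, height, list(zip(flat[0::2], flat[1::2]))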
================================================
FILE: dvs/flownet2/utils/frame_utils.py
================================================
import numpy as np
from os.path import *
from imageio import imread
from . import flow_utils
def read_gen(file_name):
    # Dispatch on the file extension: images, raw arrays, or .flo flow files.
    ext = splitext(file_name)[-1]
    if ext == '.png' or ext == '.jpeg' or ext == '.ppm' or ext == '.jpg':
        im = imread(file_name)
        if im.shape[2] > 3:
            # Drop the alpha channel.
            return im[:,:,:3]
        else:
            return im
    elif ext == '.bin' or ext == '.raw':
        return np.load(file_name)
    elif ext == '.flo':
        return flow_utils.readFlow(file_name).astype(np.float32)
    return []
================================================
FILE: dvs/flownet2/utils/param_utils.py
================================================
import torch
import torch.nn as nn
import numpy as np
def parse_flownetc(modules, weights, biases):
    keys = [
        'conv1',
        'conv2',
        'conv3',
        'conv_redir',
        'conv3_1',
        'conv4',
        'conv4_1',
        'conv5',
        'conv5_1',
        'conv6',
        'conv6_1',
        'deconv5',
        'deconv4',
        'deconv3',
        'deconv2',
        'Convolution1',
        'Convolution2',
        'Convolution3',
        'Convolution4',
        'Convolution5',
        'upsample_flow6to5',
        'upsample_flow5to4',
        'upsample_flow4to3',
        'upsample_flow3to2',
    ]
    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == 'conv1':
                # Reverse the input-channel order of the first layer.
                m.weight.data[:,:,:,:] = torch.from_numpy(np.flip(weight, axis=1).copy())
                m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1
    return
def parse_flownets(modules, weights, biases, param_prefix='net2_'):
    keys = [
        'conv1',
        'conv2',
        'conv3',
        'conv3_1',
        'conv4',
        'conv4_1',
        'conv5',
        'conv5_1',
        'conv6',
        'conv6_1',
        'deconv5',
        'deconv4',
        'deconv3',
        'deconv2',
        'predict_conv6',
        'predict_conv5',
        'predict_conv4',
        'predict_conv3',
        'predict_conv2',
        'upsample_flow6to5',
        'upsample_flow5to4',
        'upsample_flow4to3',
        'upsample_flow3to2',
    ]
    for i, k in enumerate(keys):
        if 'upsample' in k:
            keys[i] = param_prefix + param_prefix + k
        else:
            keys[i] = param_prefix + k
    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == param_prefix+'conv1':
                # Reverse the channel order within each 3-channel image block; copy the rest as-is.
                m.weight.data[:,0:3,:,:] = torch.from_numpy(np.flip(weight[:,0:3,:,:], axis=1).copy())
                m.weight.data[:,3:6,:,:] = torch.from_numpy(np.flip(weight[:,3:6,:,:], axis=1).copy())
                m.weight.data[:,6:9,:,:] = torch.from_numpy(np.flip(weight[:,6:9,:,:], axis=1).copy())
                m.weight.data[:,9::,:,:] = torch.from_numpy(weight[:,9:,:,:].copy())
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1
    return
def parse_flownetsonly(modules, weights, biases, param_prefix=''):
    keys = [
        'conv1',
        'conv2',
        'conv3',
        'conv3_1',
        'conv4',
        'conv4_1',
        'conv5',
        'conv5_1',
        'conv6',
        'conv6_1',
        'deconv5',
        'deconv4',
        'deconv3',
        'deconv2',
        'Convolution1',
        'Convolution2',
        'Convolution3',
        'Convolution4',
        'Convolution5',
        'upsample_flow6to5',
        'upsample_flow5to4',
        'upsample_flow4to3',
        'upsample_flow3to2',
    ]
    for i, k in enumerate(keys):
        if 'upsample' in k:
            keys[i] = param_prefix + param_prefix + k
        else:
            keys[i] = param_prefix + k
    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == param_prefix+'conv1':
                # print ("%s :"%(keys[i]), m.weight.size(), m.bias.size(), tf_w[keys[i]].shape[::-1])
                m.weight.data[:,0:3,:,:] = torch.from_numpy(np.flip(weight[:,0:3,:,:], axis=1).copy())
                m.weight.data[:,3:6,:,:] = torch.from_numpy(np.flip(weight[:,3:6,:,:], axis=1).copy())
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1
    return
def parse_flownetsd(modules, weights, biases, param_prefix='netsd_'):
    keys = [
        'conv0',
        'conv1',
        'conv1_1',
        'conv2',
        'conv2_1',
        'conv3',
        'conv3_1',
        'conv4',
        'conv4_1',
        'conv5',
        'conv5_1',
        'conv6',
        'conv6_1',
        'deconv5',
        'deconv4',
        'deconv3',
        'deconv2',
        'interconv5',
        'interconv4',
        'interconv3',
        'interconv2',
        'Convolution1',
        'Convolution2',
        'Convolution3',
        'Convolution4',
        'Convolution5',
        'upsample_flow6to5',
        'upsample_flow5to4',
        'upsample_flow4to3',
        'upsample_flow3to2',
    ]
    for i, k in enumerate(keys):
        keys[i] = param_prefix + k
    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == param_prefix+'conv0':
                m.weight.data[:,0:3,:,:] = torch.from_numpy(np.flip(weight[:,0:3,:,:], axis=1).copy())
                m.weight.data[:,3:6,:,:] = torch.from_numpy(np.flip(weight[:,3:6,:,:], axis=1).copy())
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1
    return
def parse_flownetfusion(modules, weights, biases, param_prefix='fuse_'):
    keys = [
        'conv0',
        'conv1',
        'conv1_1',
        'conv2',
        'conv2_1',
        'deconv1',
        'deconv0',
        'interconv1',
        'interconv0',
        '_Convolution5',
        '_Convolution6',
        '_Convolution7',
        'upsample_flow2to1',
        'upsample_flow1to0',
    ]
    for i, k in enumerate(keys):
        keys[i] = param_prefix + k
    i = 0
    for m in modules:
        if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
            weight = weights[keys[i]].copy()
            bias = biases[keys[i]].copy()
            if keys[i] == param_prefix+'conv0':
                m.weight.data[:,0:3,:,:] = torch.from_numpy(np.flip(weight[:,0:3,:,:], axis=1).copy())
                m.weight.data[:,3::,:,:] = torch.from_numpy(weight[:,3:,:,:].copy())
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            else:
                m.weight.data[:,:,:,:] = torch.from_numpy(weight)
                if m.bias is not None:
                    m.bias.data[:] = torch.from_numpy(bias)
            i = i + 1
    return
================================================
FILE: dvs/flownet2/utils/tools.py
================================================
# freda (todo) :
import os, time, sys, math
import subprocess, shutil
from os.path import *
import numpy as np
from inspect import isclass
from pytz import timezone
from datetime import datetime
import inspect
import torch
def datestr():
    # Timestamp in US/Pacific time, formatted as YYYYMMDD_HHMM.
    pacific = timezone('US/Pacific')
    now = datetime.now(pacific)
    return '{}{:02}{:02}_{:02}{:02}'.format(now.year, now.month, now.day, now.hour, now.minute)
def module_to_dict(module, exclude=[]):
    # Map class name -> class object for every class exposed by `module`.
    return dict([(x, getattr(module, x)) for x in dir(module)
                 if isclass(getattr(module, x))
                 and x not in exclude
                 and getattr(module, x) not in exclude])
class TimerBlock:
    def __init__(self, title):
        print(("{}".format(title)))
    def __enter__(self):
        # time.clock() was removed in Python 3.8; time.perf_counter() is used for interval timing instead.
        self.start = time.perf_counter()
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        self.end = time.perf_counter()
        self.interval = self.end - self.start
        if exc_type is not None:
            self.log("Operation failed\n")
        else:
            self.log("Operation finished\n")
    def log(self, string):
        duration = time.perf_counter() - self.start
        units = 's'
        if duration > 60:
            duration = duration / 60.
            units = 'm'
        print((" [{:.3f}{}] {}".format(duration, units, string)))
    def log2file(self, fid, string):
        # fid is a file path: append one line and close the handle even if the write fails.
        with open(fid, 'a') as f:
            f.write("%s\n"%(string))
def add_arguments_for_module(parser, module, argument_for_class, default, skip_params=[], parameter_defaults={}):
    argument_group = parser.add_argument_group(argument_for_class.capitalize())
    module_dict = module_to_dict(module)
    argument_group.add_argument('--' + argument_for_class, type=str, default=default, choices=list(module_dict.keys()))
    args, unknown_args = parser.parse_known_args()
    class_obj = module_dict[vars(args)[argument_for_class]]
    # inspect.getargspec() was removed in Python 3.11; getfullargspec() exposes the same .args and .defaults.
    argspec = inspect.getfullargspec(class_obj.__init__)
    # Reverse both lists so that defaults[i] lines up with the i-th argument counted from the end.
    defaults = argspec.defaults[::-1] if argspec.defaults else None
    args = argspec.args[::-1]
    for i, arg in enumerate(args):
        cmd_arg = '{}_{}'.format(argument_for_class, arg)
        if arg not in skip_params + ['self', 'args']:
            if arg in list(parameter_defaults.keys()):
                argument_group.add_argument('--{}'.format(cmd_arg), type=type(parameter_defaults[arg]), default=parameter_defaults[arg])
            elif (defaults is not None and i < len(defaults)):
                argument_group.add_argument('--{}'.format(cmd_arg), type=type(defaults[i]), default=defaults[i])
            else:
                print(("[Warning]: non-default argument '{}' detected on class '{}'. This argument cannot be modified via the command line"
                       .format(arg, module.__class__.__name__)))
            # We don't have a good way of dealing with inferring the type of the argument
            # TODO: try creating a custom action and using ast's infer type?
            # else:
            #     argument_group.add_argument('--{}'.format(cmd_arg), required=True)
def kwargs_from_args(args, argument_for_class):
    argument_for_class = argument_for_class + '_'
    return {key[len(argument_for_class):]: value for key, value in list(vars(args).items()) if argument_for_class in key and key != argument_for_class + 'class'}
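# Illustrative sketch, not part of the original file: how the prefix stripping
# in kwargs_from_args behaves on a parsed namespace. The function body is
# restated so the snippet runs on its own; the namespace fields (model,
# model_batchNorm, model_div_flow, batch_size) are hypothetical values chosen
# for the example.

```python
import argparse


def kwargs_from_args(args, argument_for_class):
    prefix = argument_for_class + '_'
    # Keep every "<prefix>name" entry except "<prefix>class", dropping the prefix.
    return {key[len(prefix):]: value
            for key, value in vars(args).items()
            if prefix in key and key != prefix + 'class'}


args = argparse.Namespace(model='FlowNet2', model_batchNorm=False,
                          model_div_flow=20.0, batch_size=8)
model_kwargs = kwargs_from_args(args, 'model')
print(model_kwargs)  # -> {'batchNorm': False, 'div_flow': 20.0}
```

# The filter tests `prefix in key` rather than key.startswith(prefix), so a
# key that merely contains the prefix (for example 'my_model_x') also matches
# and is sliced at the wrong offset.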
def format_dictionary_of_losses(labels, values):
    try:
        string = ', '.join([('{}: {:' + ('.3f' if value >= 0.001 else '.1e') +'}').format(name, value) for name, value in zip(labels, values)])
    except (TypeError, ValueError) as e:
        print((list(zip(labels, values))))
        string = '[Log Error] ' + str(e)
    return string
class IteratorTimer():
    def __init__(self, iterable):
        self.iterable = iterable
        self.iterator = self.iterable.__iter__()
    def __iter__(self):
        return self
    def __len__(self):
        return len(self.iterable)
    def __next__(self):
        start = time.time()
        n = next(self.iterator)
        self.last_duration = (time.time() - start)
        return n
    next = __next__
def gpumemusage():
    # universal_newlines=True makes check_output return str instead of bytes, so the str replacements below work on Python 3.
    gpu_mem = subprocess.check_output("nvidia-smi | grep MiB | cut -f 3 -d '|'", shell=True, universal_newlines=True).replace(' ', '').replace('\n', '').replace('i', '')
    all_stat = [float(a) for a in gpu_mem.replace('/','').split('MB')[:-1]]
    gpu_mem = ''
    # Values come in (used, total) pairs per GPU; range() needs an int, hence floor division.
    for i in range(len(all_stat)//2):
        curr, tot = all_stat[2*i], all_stat[2*i+1]
        util = "%1.2f"%(100*curr/tot)+'%'
        cmem = str(int(math.ceil(curr/1024.)))+'GB'
        gmem = str(int(math.ceil(tot/1024.)))+'GB'
        gpu_mem += util + '--' + join(cmem, gmem) + ' '
    return gpu_mem
def update_hyperparameter_schedule(args, epoch, global_iteration, optimizer):
    if args.schedule_lr_frequency > 0:
        for param_group in optimizer.param_groups:
            if (global_iteration + 1) % args.schedule_lr_frequency == 0:
                param_group['lr'] /= float(args.schedule_lr_fraction)
                param_group['lr'] = float(np.maximum(param_group['lr'], 0.000001))
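# Illustrative sketch, not part of the original file: the step schedule in
# update_hyperparameter_schedule divides the learning rate by
# schedule_lr_fraction every schedule_lr_frequency iterations and clamps it at
# 1e-6. The version below uses plain dicts instead of a torch optimizer; the
# helper name and the settings (frequency 100, fraction 10, initial lr 1e-4)
# are hypothetical.

```python
from types import SimpleNamespace


def update_lr(args, global_iteration, param_groups, floor=1e-6):
    if args.schedule_lr_frequency > 0:
        for group in param_groups:
            if (global_iteration + 1) % args.schedule_lr_frequency == 0:
                group['lr'] /= float(args.schedule_lr_fraction)
                group['lr'] = max(group['lr'], floor)  # never drop below the floor


args = SimpleNamespace(schedule_lr_frequency=100, schedule_lr_fraction=10)
groups = [{'lr': 1e-4}]
for iteration in range(300):
    update_lr(args, iteration, groups)
print(groups[0]['lr'])
```

# With these settings the rate is divided at iterations 99, 199 and 299; the
# third division would fall below the floor, so the clamp returns 1e-6.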
def save_checkpoint(state, is_best, path, prefix, filename='checkpoint.pth.tar'):
    prefix_save = os.path.join(path, prefix)
    name = prefix_save + '_' + filename
    torch.save(state, name)
    if is_best:
        shutil.copyfile(name, prefix_save + '_model_best.pth.tar')
================================================
FILE: dvs/gyro/__init__.py
================================================
from .gyro_function import (
GetGyroAtTimeStamp,
QuaternionProduct,
QuaternionReciprocal,
ConvertQuaternionToAxisAngle,
FindOISAtTimeStamp,
GetMetadata,
GetProjections,
GetVirtualProjection,
GetForwardGrid,
CenterZoom,
GetWarpingFlow,
torch_norm_quat,
torch_QuaternionProduct,
torch_QuaternionReciprocal,
torch_GetVirtualProjection,
get_static,
torch_GetForwardGrid,
torch_GetWarpingFlow,
train_GetGyroAtTimeStamp,
train_ConvertQuaternionToAxisAngle,
ConvertAxisAngleToQuaternion,
torch_ConvertAxisAngleToQuaternion,
torch_ConvertQuaternionToAxisAngle,
ConvertAxisAngleToQuaternion_no_angle,
ConvertQuaternionToAxisAngle_no_angle,
torch_GetHomographyTransformFromProjections,
torch_ApplyTransform,
norm_quat,
SlerpWithDefault
)
from .gyro_io import (
LoadGyroData,
LoadOISData,
LoadFrameData,
LoadStabResult,
get_grid,
get_rotations,
visual_rotation
)
================================================
FILE: dvs/gyro/gyro_function.py
================================================
import numpy as np
from numpy import linalg as LA
import matplotlib.pyplot as plt
import torch
from torch.autograd import Variable
def get_static(height = 1080, width = 1920, ratio = 0.1):
    static_options = {}
    static_options["active_array_width"] = 4032
    static_options["active_array_height"] = 3024
    static_options["crop_window_width"] = 4032
    static_options["crop_window_height"] = 2272
    static_options["num_grid_rows"] = 12
    static_options["num_grid_cols"] = 12
    static_options["dim_homography"] = 9
    static_options["width"] = width # frame width.
    static_options["height"] = height # frame height
    # static_options["fov"] = 1.27 # sensor_width/sensor_focal_length
    static_options["cropping_ratio"] = 0.0 #ratio # normalized cropping ratio at each side.
    return static_options
# Quaternion: [x, y, z, w]
def norm_quat(quat):
    norm_quat = LA.norm(quat)
    if norm_quat > 1e-6:
        quat = quat / norm_quat
        # [0 norm_quat norm_quat - 1e-6]
    else:
        # print('bad len for Reciprocal')
        quat = np.array([0,0,0,1])
    return quat
def torch_norm_quat(quat, USE_CUDA = True):
    # Method 1:
    batch_size = quat.size()[0]
    quat_out = Variable(torch.zeros((batch_size, 4), requires_grad=True))
    if USE_CUDA == True:
        quat_out = quat_out.cuda()
    for i in range(batch_size):
        norm_quat = torch.norm(quat[i])
        if norm_quat > 1e-6:
            quat_out[i] = quat[i] / norm_quat
            # [0 norm_quat norm_quat - 1e-6]
        else:
            quat_out[i,:3] = quat[i,:3] * 0
            quat_out[i,3] = quat[i,3] / quat[i,3]
    # Method 2:
    # quat = quat / (torch.unsqueeze(torch.norm(quat, dim = 1), 1) + 1e-6) # check norm
    return quat_out
def ConvertAxisAngleToQuaternion(axis, angle):
    if LA.norm(axis) > 1e-6 and angle > 1e-6:
        axis = axis/LA.norm(axis)
    half_angle = angle*0.5
    sin_half_angle = np.sin(half_angle)
    quat = np.array([sin_half_angle* axis[0], sin_half_angle* axis[1], sin_half_angle* axis[2], np.cos(half_angle)])
    return norm_quat(quat)
def ConvertAxisAngleToQuaternion_no_angle(axis):
    angle = LA.norm(axis)
    if LA.norm(axis) > 1e-6:
        axis = axis/LA.norm(axis)
    half_angle = angle*0.5
    sin_half_angle = np.sin(half_angle)
    quat = np.array([sin_half_angle* axis[0], sin_half_angle* axis[1], sin_half_angle* axis[2], np.cos(half_angle)])
    return norm_quat(quat)
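# Illustrative sketch, not part of the original file: the conversion in
# ConvertAxisAngleToQuaternion_no_angle treats the input as a rotation vector
# whose length is the angle in radians, and emits a quaternion in the
# [x, y, z, w] order used throughout this module. The version below uses only
# the standard library; the function name is hypothetical and the final
# norm_quat renormalization is omitted for brevity.

```python
import math


def axis_angle_to_quat(rotvec):
    """Rotation vector (axis * angle, radians) -> quaternion [x, y, z, w]."""
    angle = math.sqrt(sum(c * c for c in rotvec))
    if angle > 1e-6:
        axis = [c / angle for c in rotvec]
    else:
        axis = list(rotvec)  # near-zero rotation: result is close to identity
    half = 0.5 * angle
    s = math.sin(half)
    return [s * axis[0], s * axis[1], s * axis[2], math.cos(half)]


# A quarter turn about the z axis.
q = axis_angle_to_quat([0.0, 0.0, math.pi / 2])
print(q)
```

# For this input the x and y components are zero and the z and w components
# both equal sin(pi/4) = cos(pi/4); a zero vector maps to the identity
# quaternion [0, 0, 0, 1].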
def torch_ConvertAxisAngleToQuaternion(axis, USE_CUDA = True):
    batch_size = axis.size()[0]
    angle = torch.norm(axis[:,:3], dim = 1)
    half_angle = angle * 0.5
    sin_half_angle = torch.sin(half_angle)
    quats = Variable(torch.zeros((batch_size, 4), requires_grad=True))
    norm_axis = axis[:,:3] * 1
    if USE_CUDA:
        quats = quats.cuda()
    for i in range(batch_size):
        if angle[i] > 1e-6:
            norm_axis[i] = axis[i,:3]/angle[i]
    # sin_half_angle has shape (batch,); add a trailing dim so it broadcasts against (batch, 3).
    quats[:, :3] = sin_half_angle.unsqueeze(1) * norm_axis
    quats[:, 3] = torch.cos(half_angle)
    # Forward USE_CUDA so a CPU call does not fall back to the default USE_CUDA = True.
    return torch_norm_quat(quats, USE_CUDA)
def ConvertQuaternionToAxisAngle(quat):
    quat = quat/LA.norm(quat)
    axis_norm = LA.norm(quat[0:3])
    axis = np.array([0.0, 0.0, 0.0])
    if axis_norm < 1e-6:
        angle = 0
    else:
        axis_norm_reciprocal = 1/axis_norm
SYMBOL INDEX (342 symbols across 33 files)
FILE: dvs/dataset.py
function get_data_loader (line 26) | def get_data_loader(cf, no_flo = False):
function get_dataset (line 34) | def get_dataset(cf, no_flo = False):
function get_inference_data_loader (line 52) | def get_inference_data_loader(cf, data_path, no_flo = False):
function get_inference_dataset (line 57) | def get_inference_dataset(cf, data_path, no_flo = False):
function _data_transforms (line 66) | def _data_transforms():
class DVS_data (line 77) | class DVS_data():
method __init__ (line 78) | def __init__(self):
class Dataset_Gyro (line 87) | class Dataset_Gyro(Dataset):
method __init__ (line 88) | def __init__(self, path, sample_freq = 33*1000000, number_real = 10, t...
method process_one_video (line 122) | def process_one_video(self, path):
method generate_quaternions (line 154) | def generate_quaternions(self, dvs_data):
method load_flo (line 179) | def load_flo(self, idx, first_id):
method load_real_projections (line 195) | def load_real_projections(self, idx, first_id):
method __getitem__ (line 203) | def __getitem__(self, idx):
method __len__ (line 212) | def __len__(self):
method get_virtual_data (line 215) | def get_virtual_data(self, virtual_queue, real_queue_idx, pre_times, c...
method update_virtual_queue (line 232) | def update_virtual_queue(self, batch_size, virtual_queue, out, times):
method random_init_virtual_queue (line 244) | def random_init_virtual_queue(self, batch_size, real_postion, times):
method get_data_at_timestamp (line 259) | def get_data_at_timestamp(self, gyro_data, ois_data, time_stamp, quat_...
method get_ois_at_timestamp (line 264) | def get_ois_at_timestamp(self, ois_data, time_stamp):
function get_timestamp (line 269) | def get_timestamp(frame_data, idx):
function preprocess_gyro (line 275) | def preprocess_gyro(gyro, extend = 200):
function LoadFlow (line 286) | def LoadFlow(path):
function get_virtual_at_timestamp (line 293) | def get_virtual_at_timestamp(virtual_queue, real_queue, time_stamp, time...
FILE: dvs/flownet2/datasets.py
class StaticRandomCrop (line 13) | class StaticRandomCrop(object):
method __init__ (line 14) | def __init__(self, image_size, crop_size):
method __call__ (line 20) | def __call__(self, img):
class StaticCenterCrop (line 23) | class StaticCenterCrop(object):
method __init__ (line 24) | def __init__(self, image_size, crop_size):
method __call__ (line 27) | def __call__(self, img):
class Padding (line 30) | class Padding(object):
method __init__ (line 31) | def __init__(self, image_size, pad_size):
method __call__ (line 34) | def __call__(self, img):
class MpiSintel (line 39) | class MpiSintel(data.Dataset):
method __init__ (line 40) | def __init__(self, args, is_cropped = False, root = '', dstype = 'clea...
method __getitem__ (line 85) | def __getitem__(self, index):
method __len__ (line 112) | def __len__(self):
class MpiSintelClean (line 115) | class MpiSintelClean(MpiSintel):
method __init__ (line 116) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
class MpiSintelFinal (line 119) | class MpiSintelFinal(MpiSintel):
method __init__ (line 120) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
class FlyingChairs (line 123) | class FlyingChairs(data.Dataset):
method __init__ (line 124) | def __init__(self, args, is_cropped, root = '/path/to/FlyingChairs_rel...
method __getitem__ (line 155) | def __getitem__(self, index):
method __len__ (line 181) | def __len__(self):
class FlyingThings (line 184) | class FlyingThings(data.Dataset):
method __init__ (line 185) | def __init__(self, args, is_cropped, root = '/path/to/flyingthings3d',...
method __getitem__ (line 222) | def __getitem__(self, index):
method __len__ (line 248) | def __len__(self):
class FlyingThingsClean (line 251) | class FlyingThingsClean(FlyingThings):
method __init__ (line 252) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
class FlyingThingsFinal (line 255) | class FlyingThingsFinal(FlyingThings):
method __init__ (line 256) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
class ChairsSDHom (line 259) | class ChairsSDHom(data.Dataset):
method __init__ (line 260) | def __init__(self, args, is_cropped, root = '/path/to/chairssdhom/data...
method __getitem__ (line 291) | def __getitem__(self, index):
method __len__ (line 318) | def __len__(self):
class ChairsSDHomTrain (line 321) | class ChairsSDHomTrain(ChairsSDHom):
method __init__ (line 322) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
class ChairsSDHomTest (line 325) | class ChairsSDHomTest(ChairsSDHom):
method __init__ (line 326) | def __init__(self, args, is_cropped = False, root = '', replicates = 1):
class ImagesFromFolder (line 329) | class ImagesFromFolder(data.Dataset):
method __init__ (line 330) | def __init__(self, args, is_cropped, root = '/path/to/frames/only/fold...
method __getitem__ (line 354) | def __getitem__(self, index):
method __len__ (line 373) | def __len__(self):
class Google (line 377) | class Google(data.Dataset):
method __init__ (line 378) | def __init__(self, args, is_cropped = False, root = '', dstype = 'fram...
method __getitem__ (line 412) | def __getitem__(self, index):
method __len__ (line 434) | def __len__(self):
FILE: dvs/flownet2/losses.py
function EPE (line 11) | def EPE(input_flow, target_flow):
class L1 (line 14) | class L1(nn.Module):
method __init__ (line 15) | def __init__(self):
method forward (line 17) | def forward(self, output, target):
class L2 (line 21) | class L2(nn.Module):
method __init__ (line 22) | def __init__(self):
method forward (line 24) | def forward(self, output, target):
class L1Loss (line 28) | class L1Loss(nn.Module):
method __init__ (line 29) | def __init__(self, args):
method forward (line 35) | def forward(self, output, target):
class L2Loss (line 40) | class L2Loss(nn.Module):
method __init__ (line 41) | def __init__(self, args):
method forward (line 47) | def forward(self, output, target):
class MultiScale (line 52) | class MultiScale(nn.Module):
method __init__ (line 53) | def __init__(self, args, startScale = 4, numScales = 5, l_weight= 0.32...
method forward (line 72) | def forward(self, output, target):
FILE: dvs/flownet2/main.py
function inference (line 22) | def inference(args, epoch, data_path, data_loader, model, offset=0):
class Model (line 183) | class Model(nn.Module):
method __init__ (line 184) | def __init__(self, args):
method forward (line 189) | def forward(self, data):
FILE: dvs/flownet2/models.py
class FlowNet2 (line 30) | class FlowNet2(nn.Module):
method __init__ (line 32) | def __init__(self, args, batchNorm=False, div_flow = 20.):
method init_deconv_bilinear (line 104) | def init_deconv_bilinear(self, weight):
method forward (line 120) | def forward(self, inputs):
class FlowNet2C (line 187) | class FlowNet2C(FlowNetC.FlowNetC):
method __init__ (line 188) | def __init__(self, args, batchNorm=False, div_flow=20):
method forward (line 192) | def forward(self, inputs):
class FlowNet2S (line 255) | class FlowNet2S(FlowNetS.FlowNetS):
method __init__ (line 256) | def __init__(self, args, batchNorm=False, div_flow=20):
method forward (line 261) | def forward(self, inputs):
class FlowNet2SD (line 301) | class FlowNet2SD(FlowNetSD.FlowNetSD):
method __init__ (line 302) | def __init__(self, args, batchNorm=False, div_flow=20):
method forward (line 307) | def forward(self, inputs):
class FlowNet2CS (line 353) | class FlowNet2CS(nn.Module):
method __init__ (line 355) | def __init__(self, args, batchNorm=False, div_flow = 20.):
method forward (line 392) | def forward(self, inputs):
class FlowNet2CSS (line 418) | class FlowNet2CSS(nn.Module):
method __init__ (line 420) | def __init__(self, args, batchNorm=False, div_flow = 20.):
method forward (line 469) | def forward(self, inputs):
FILE: dvs/flownet2/networks/FlowNetC.py
class FlowNetC (line 13) | class FlowNetC(nn.Module):
method __init__ (line 14) | def __init__(self,args, batchNorm=True, div_flow = 20):
method forward (line 71) | def forward(self, x):
FILE: dvs/flownet2/networks/FlowNetFusion.py
class FlowNetFusion (line 11) | class FlowNetFusion(nn.Module):
method __init__ (line 12) | def __init__(self,args, batchNorm=True):
method forward (line 47) | def forward(self, x):
FILE: dvs/flownet2/networks/FlowNetS.py
class FlowNetS (line 15) | class FlowNetS(nn.Module):
method __init__ (line 16) | def __init__(self, args, input_channels = 12, batchNorm=True):
method forward (line 60) | def forward(self, x):
FILE: dvs/flownet2/networks/FlowNetSD.py
class FlowNetSD (line 11) | class FlowNetSD(nn.Module):
method __init__ (line 12) | def __init__(self, args, batchNorm=True):
method forward (line 66) | def forward(self, x):
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm.py
class ChannelNormFunction (line 5) | class ChannelNormFunction(Function):
method forward (line 8) | def forward(ctx, input1, norm_deg=2):
method backward (line 20) | def backward(ctx, grad_output):
class ChannelNorm (line 31) | class ChannelNorm(Module):
method __init__ (line 33) | def __init__(self, norm_deg=2):
method forward (line 37) | def forward(self, input1):
FILE: dvs/flownet2/networks/channelnorm_package/channelnorm_cuda.cc
function channelnorm_cuda_forward (line 6) | int channelnorm_cuda_forward(
function channelnorm_cuda_backward (line 16) | int channelnorm_cuda_backward(
function PYBIND11_MODULE (line 27) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: dvs/flownet2/networks/correlation_package/correlation.py
class CorrelationFunction (line 6) | class CorrelationFunction(Function):
method forward (line 9) | def forward(ctx, input1, input2, pad_size=3, kernel_size=3, max_displa...
method backward (line 30) | def backward(ctx, grad_output):
class Correlation (line 46) | class Correlation(Module):
method __init__ (line 47) | def __init__(self, pad_size=0, kernel_size=0, max_displacement=0, stri...
method forward (line 56) | def forward(self, input1, input2):
FILE: dvs/flownet2/networks/correlation_package/correlation_cuda.cc
function correlation_forward_cuda (line 10) | int correlation_forward_cuda(at::Tensor& input1, at::Tensor& input2, at:...
function correlation_backward_cuda (line 89) | int correlation_backward_cuda(at::Tensor& input1, at::Tensor& input2, at...
function PYBIND11_MODULE (line 169) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: dvs/flownet2/networks/resample2d_package/resample2d.py
class Resample2dFunction (line 5) | class Resample2dFunction(Function):
method forward (line 8) | def forward(ctx, input1, input2, kernel_size=1, bilinear= True):
method backward (line 25) | def backward(ctx, grad_output):
class Resample2d (line 40) | class Resample2d(Module):
method __init__ (line 42) | def __init__(self, kernel_size=1, bilinear = True):
method forward (line 47) | def forward(self, input1, input2):
FILE: dvs/flownet2/networks/resample2d_package/resample2d_cuda.cc
function resample2d_cuda_forward (line 6) | int resample2d_cuda_forward(
function resample2d_cuda_backward (line 15) | int resample2d_cuda_backward(
function PYBIND11_MODULE (line 28) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
FILE: dvs/flownet2/networks/submodules.py
function conv (line 7) | def conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1):
function i_conv (line 20) | def i_conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1, bi...
function predict_flow (line 31) | def predict_flow(in_planes):
function deconv (line 34) | def deconv(in_planes, out_planes):
class tofp16 (line 40) | class tofp16(nn.Module):
method __init__ (line 41) | def __init__(self):
method forward (line 44) | def forward(self, input):
class tofp32 (line 48) | class tofp32(nn.Module):
method __init__ (line 49) | def __init__(self):
method forward (line 52) | def forward(self, input):
function init_deconv_bilinear (line 56) | def init_deconv_bilinear(weight):
function save_grad (line 72) | def save_grad(grads, name):
FILE: dvs/flownet2/utils/flow_utils.py
function readFlow (line 7) | def readFlow(fn):
function writeFlow (line 28) | def writeFlow(filename,uv,v=None):
function visulize_flow_file (line 62) | def visulize_flow_file(flow_filename, save_dir=None):
function flow2img (line 72) | def flow2img(flow_data):
function compute_color (line 112) | def compute_color(u, v):
function make_color_wheel (line 157) | def make_color_wheel():
FILE: dvs/flownet2/utils/frame_utils.py
function read_gen (line 6) | def read_gen(file_name):
FILE: dvs/flownet2/utils/param_utils.py
function parse_flownetc (line 5) | def parse_flownetc(modules, weights, biases):
function parse_flownets (line 51) | def parse_flownets(modules, weights, biases, param_prefix='net2_'):
function parse_flownetsonly (line 104) | def parse_flownetsonly(modules, weights, biases, param_prefix=''):
function parse_flownetsd (line 156) | def parse_flownetsd(modules, weights, biases, param_prefix='netsd_'):
function parse_flownetfusion (line 214) | def parse_flownetfusion(modules, weights, biases, param_prefix='fuse_'):
FILE: dvs/flownet2/utils/tools.py
function datestr (line 13) | def datestr():
function module_to_dict (line 18) | def module_to_dict(module, exclude=[]):
class TimerBlock (line 24) | class TimerBlock:
method __init__ (line 25) | def __init__(self, title):
method __enter__ (line 28) | def __enter__(self):
method __exit__ (line 32) | def __exit__(self, exc_type, exc_value, traceback):
method log (line 42) | def log(self, string):
method log2file (line 50) | def log2file(self, fid, string):
function add_arguments_for_module (line 55) | def add_arguments_for_module(parser, module, argument_for_class, default...
function kwargs_from_args (line 84) | def kwargs_from_args(args, argument_for_class):
function format_dictionary_of_losses (line 88) | def format_dictionary_of_losses(labels, values):
class IteratorTimer (line 98) | class IteratorTimer():
method __init__ (line 99) | def __init__(self, iterable):
method __iter__ (line 103) | def __iter__(self):
method __len__ (line 106) | def __len__(self):
method __next__ (line 109) | def __next__(self):
function gpumemusage (line 117) | def gpumemusage():
function update_hyperparameter_schedule (line 131) | def update_hyperparameter_schedule(args, epoch, global_iteration, optimi...
function save_checkpoint (line 138) | def save_checkpoint(state, is_best, path, prefix, filename='checkpoint.p...
FILE: dvs/gyro/gyro_function.py
function get_static (line 7) | def get_static(height = 1080, width = 1920, ratio = 0.1):
function norm_quat (line 24) | def norm_quat(quat):
function torch_norm_quat (line 34) | def torch_norm_quat(quat, USE_CUDA = True):
function ConvertAxisAngleToQuaternion (line 53) | def ConvertAxisAngleToQuaternion(axis, angle):
function ConvertAxisAngleToQuaternion_no_angle (line 62) | def ConvertAxisAngleToQuaternion_no_angle(axis):
function torch_ConvertAxisAngleToQuaternion (line 72) | def torch_ConvertAxisAngleToQuaternion(axis, USE_CUDA = True):
function ConvertQuaternionToAxisAngle (line 90) | def ConvertQuaternionToAxisAngle(quat):
function ConvertQuaternionToAxisAngle_no_angle (line 104) | def ConvertQuaternionToAxisAngle_no_angle(quat):
function torch_ConvertQuaternionToAxisAngle (line 115) | def torch_ConvertQuaternionToAxisAngle(quat, USE_CUDA = True):
function train_ConvertQuaternionToAxisAngle (line 129) | def train_ConvertQuaternionToAxisAngle(quat):
function AngularVelocityToQuat (line 134) | def AngularVelocityToQuat(angular_v, dt):
function QuaternionProduct (line 144) | def QuaternionProduct(q1, q2):
function torch_QuaternionProduct (line 163) | def torch_QuaternionProduct(q1, q2, USE_CUDA = True):
function ProcessGyroRotation (line 188) | def ProcessGyroRotation(gyro_data):
function QuaternionReciprocal (line 199) | def QuaternionReciprocal(q):
function torch_QuaternionReciprocal (line 203) | def torch_QuaternionReciprocal(q, USE_CUDA = True):
function ProcessGyroData (line 210) | def ProcessGyroData(gyro_data):
function SlerpWithDefault (line 221) | def SlerpWithDefault(q1, q2, t, q_default):
function GetGyroAtTimeStamp (line 260) | def GetGyroAtTimeStamp(gyro_data, timestamp):
function train_GetGyroAtTimeStamp (line 275) | def train_GetGyroAtTimeStamp(gyro_data, timestamp, check = False):
function FindOISAtTimeStamp (line 291) | def FindOISAtTimeStamp(ois_log, time):
function GetMetadata (line 311) | def GetMetadata(frame_data, frame_index, result_poses = {} ):
function GetProjections (line 330) | def GetProjections(static_options, metadata, quats_data, ois_data, no_s...
function GetRealProjection (line 344) | def GetRealProjection(static_options, quats_data, ois_data, fov, timesta...
function GetProjectionHomography (line 354) | def GetProjectionHomography(rot, fov, offset, width, height):
function torch_GetProjectionHomography (line 365) | def torch_GetProjectionHomography(rot, fov, width, height, USE_CUDA = Tr...
function ConvertQuaternionToRotationMatrix (line 381) | def ConvertQuaternionToRotationMatrix(quat):
function torch_ConvertQuaternionToRotationMatrix (line 399) | def torch_ConvertQuaternionToRotationMatrix(quat, USE_CUDA = True):
function ConvertRotationMatrixToQuaternion (line 422) | def ConvertRotationMatrixToQuaternion(m):
function GetIntrinsics (line 450) | def GetIntrinsics(focal_length, offset, width, height):
function GetVirtualProjection (line 459) | def GetVirtualProjection(static_options, result_pose, metadata, frame_in...
function torch_GetVirtualProjection (line 470) | def torch_GetVirtualProjection(static_options, quat, virtual_fov = 1.27):
function GetForwardGrid (line 476) | def GetForwardGrid(static_options, real_projections, virtual_projection):
function torch_GetForwardGrid (line 498) | def torch_GetForwardGrid(static_options, real_projections, virtual_proje...
function GetWarpingFlow (line 530) | def GetWarpingFlow(real_projections_src, real_projections_dst, num_rows,...
function torch_GetWarpingFlow (line 549) | def torch_GetWarpingFlow(static_options, real_projections_src, real_proj...
function GetHomographyTransformFromProjections (line 581) | def GetHomographyTransformFromProjections(proj_src, proj_dst):
function torch_GetHomographyTransformFromProjections (line 584) | def torch_GetHomographyTransformFromProjections(proj_src, proj_dst):
function ApplyTransform (line 587) | def ApplyTransform(transform, point):
function torch_ApplyTransform (line 594) | def torch_ApplyTransform(transform, point):
function CenterZoom (line 601) | def CenterZoom(grid, ratio):
FILE: dvs/gyro/gyro_io.py
function load_gyro_mesh (line 13) | def load_gyro_mesh(input_name):
function get_grid (line 19) | def get_grid(static_options, frame_data, quats_data, ois_data, virtual_d...
function get_rotations (line 34) | def get_rotations(frame_data, quats_data, ois_data, num_frames):
function visual_rotation (line 50) | def visual_rotation(rotations_real, lens_offsets_real, rotations_virtual...
function LoadOISData (line 106) | def LoadOISData(ois_name):
function LoadFrameData (line 111) | def LoadFrameData(frame_log_name):
function LoadGyroData (line 117) | def LoadGyroData(gyro_log_name):
function LoadStabResult (line 126) | def LoadStabResult(input_name):
function ReadLine (line 142) | def ReadLine(fid):
function str2num (line 162) | def str2num(string):
FILE: dvs/inference.py
function run (line 29) | def run(model, loader, cf, USE_CUDA=True):
function inference (line 120) | def inference(cf, data_path, USE_CUDA):
function visual_result (line 173) | def visual_result(cf, data, video_name, virtual_queue, virtual_queue2 = ...
function main (line 187) | def main(args = None):
FILE: dvs/load_frame_sensor_data.py
function run (line 31) | def run(loader, cf, USE_CUDA=True):
function inference (line 68) | def inference(cf, data_path, USE_CUDA):
function main (line 97) | def main(args = None):
FILE: dvs/loss.py
class C2_Smooth_loss (line 21) | class C2_Smooth_loss(torch.nn.Module):
method __init__ (line 22) | def __init__(self):
method forward (line 26) | def forward(self, Qt, Qt_1, Qt_2):
class C1_Smooth_loss (line 30) | class C1_Smooth_loss(torch.nn.Module):
method __init__ (line 31) | def __init__(self):
method forward (line 35) | def forward(self, v_r_axis, v_axis_t_1 = None, real_postion = None):
class Follow_loss (line 40) | class Follow_loss(torch.nn.Module):
method __init__ (line 41) | def __init__(self):
method forward (line 45) | def forward(self, virtual_quat, real_quat, real_postion = None):
class Stay_loss (line 50) | class Stay_loss(torch.nn.Module):
method __init__ (line 51) | def __init__(self):
method forward (line 55) | def forward(self, virtual_quat):
class Angle_loss (line 59) | class Angle_loss(torch.nn.Module):
method __init__ (line 60) | def __init__(self):
method forward (line 64) | def forward(self, Q1, Q2, threshold = 0.5236, logistic_beta1 = 100):
class Optical_loss (line 73) | class Optical_loss(torch.nn.Module):
method __init__ (line 74) | def __init__(self):
method forward (line 79) | def forward(self, Vt, Vt_1, flo, flo_back, real_projection_t, real_pro...
function get_mesh (line 122) | def get_mesh(height = 270, width = 480, USE_CUDA = True):
class Undefine_loss (line 132) | class Undefine_loss(torch.nn.Module):
method __init__ (line 133) | def __init__(self, ratio = 0.08, inner_ratio = 0.04, USE_CUDA = True):
method forward (line 153) | def forward(self, Vt, Rt, ratio = 0.04):
method get_loss (line 177) | def get_loss(self, p):
FILE: dvs/metrics.py
function _pickle_keypoints (line 19) | def _pickle_keypoints(point):
function crop_metric (line 30) | def crop_metric(M):
function get_scale (line 39) | def get_scale(M):
function get_rescale_matrix (line 53) | def get_rescale_matrix(M, sx, sy):
function metrics (line 64) | def metrics(in_src, out_src, package, crop_scale = False, re_compute = F...
function crop_rm_outlier (line 294) | def crop_rm_outlier(crop):
FILE: dvs/model.py
class LayerLSTM (line 16) | class LayerLSTM(nn.Module):
method __init__ (line 17) | def __init__(self, input_size, hidden_size, bias):
method init_hidden (line 22) | def init_hidden(self, batch_size):
method forward (line 26) | def forward(self, x):
class LayerCNN (line 31) | class LayerCNN(nn.Module):
method __init__ (line 32) | def __init__(self, in_channel, out_channel, kernel_size, stride, paddi...
method forward (line 43) | def forward(self, x):
class LayerFC (line 52) | class LayerFC(nn.Module):
method __init__ (line 53) | def __init__(self, in_features, out_features, bias, drop_out=0, activa...
method forward (line 61) | def forward(self, x):
class Net (line 71) | class Net(nn.Module):
method __init__ (line 72) | def __init__(self, cf):
method init_hidden (line 160) | def init_hidden(self, batch_size):
method forward (line 164) | def forward(self, x, flo, ois):
class Model (line 178) | class Model():
method __init__ (line 179) | def __init__(self, cf):
method loss (line 203) | def loss(
method init_weights (line 243) | def init_weights(self, cf):
method save_checkpoint (line 258) | def save_checkpoint(self, epoch = 0, optimizer=None):
class UNet (line 272) | class UNet(nn.Module):
method __init__ (line 273) | def __init__(self, n_channels = 4, n_classes = 16, bilinear=True):
method forward (line 288) | def forward(self, x, x_back = None):
class DoubleConv (line 304) | class DoubleConv(nn.Module):
method __init__ (line 307) | def __init__(self, in_channels, out_channels, mid_channels=None):
method forward (line 316) | def forward(self, x):
class Down (line 320) | class Down(nn.Module):
method __init__ (line 323) | def __init__(self, in_channels, out_channels):
method forward (line 330) | def forward(self, x):
class Up (line 334) | class Up(nn.Module):
method __init__ (line 337) | def __init__(self, in_channels, out_channels, bilinear=True):
method forward (line 349) | def forward(self, x1, x2):
class OutConv (line 364) | class OutConv(nn.Module):
method __init__ (line 365) | def __init__(self, in_channels, out_channels):
method forward (line 369) | def forward(self, x):
FILE: dvs/printer.py
class Printer (line 3) | class Printer(object):
method __init__ (line 4) | def __init__(self, *files):
method open (line 8) | def open(self):
method close (line 15) | def close(self):
method write (line 23) | def write(self, obj):
method flush (line 28) | def flush(self):
FILE: dvs/train.py
function run_epoch (line 22) | def run_epoch(model, loader, cf, epoch, lr, optimizer=None, is_training=...
function train (line 150) | def train(args = None):
FILE: dvs/util.py
function save_train_info (line 11) | def save_train_info(name, checkpoints_dir, cf, model, count, optimizer =...
function make_dir (line 21) | def make_dir(checkpoints_dir ,cf):
function get_optimizer (line 38) | def get_optimizer(optimizer, model, init_lr, cf):
function crop_video (line 45) | def crop_video(in_path, out_path, crop_ratio):
function norm_flow (line 55) | def norm_flow(flow, h, w):
class AverageMeter (line 64) | class AverageMeter(object):
method __init__ (line 65) | def __init__(self):
method reset (line 68) | def reset(self):
method update (line 73) | def update(self, val, n=1):
FILE: dvs/warp/rasterizer.py
function Rasterization (line 10) | def Rasterization(image, grid, get_mesh_only = False):
function grid_to_triangle (line 73) | def grid_to_triangle(grid):
function grid_size (line 91) | def grid_size(upper_triangle, lower_triangle, height, width):
function generate_mesh_grid (line 103) | def generate_mesh_grid(height, width):
function triangle2mask (line 111) | def triangle2mask(d, height, width): # d: [N x T x 3 x 2]
function edgefunc (line 144) | def edgefunc(v0, v1, p):
FILE: dvs/warp/read_write.py
function load_video (line 11) | def load_video(path, save_dir = None, resize = None, length = -1): # N x...
function video2frame (line 39) | def video2frame(path, resize = None):
function video2frame_one_seq (line 55) | def video2frame_one_seq(path, save_dir = None, resize = None): # N x H x...
function save_video (line 79) | def save_video(path,frame_array, fps, size, losses = None, frame_number ...
function draw_number (line 97) | def draw_number(frame, num, x = 10, y = 10, message = "Frame: "):
FILE: dvs/warp/warping.py
function warp_video (line 9) | def warp_video(mesh_path, video_path, save_path, losses = None, frame_nu...
function warpping_rast (line 30) | def warpping_rast(grid_data, frame_array, losses = None):
function warpping_one_frame_rast (line 37) | def warpping_one_frame_rast(image, grid):
Condensed preview: 65 files, each showing path, character count, and a content snippet.
[
{
"path": ".gitignore",
"chars": 52,
"preview": "*.pyc\n.torch\n_ext\n*.o\n_ext/\n*.png\n*.jpg\n*.tar\nlog/*\n"
},
{
"path": "LICENSE",
"chars": 11358,
"preview": "\n Apache License\n Version 2.0, January 2004\n "
},
{
"path": "README.md",
"chars": 3608,
"preview": "# Deep Online Fused Video Stabilization\n\n[[Paper]](https://openaccess.thecvf.com/content/WACV2022/papers/Shi_Deep_Online"
},
{
"path": "docs/code-of-conduct.md",
"chars": 3167,
"preview": "# Google Open Source Community Guidelines\n\nAt Google, we recognize and celebrate the creativity and collaboration of ope"
},
{
"path": "docs/contributing.md",
"chars": 1097,
"preview": "# How to Contribute\n\nWe'd love to accept your patches and contributions to this project. There are\njust a few small guid"
},
{
"path": "dvs/conf/stabilzation.yaml",
"chars": 1374,
"preview": "data:\n exp: 'stabilzation'\n checkpoints_dir: './checkpoint'\n log: './log'\n data_dir: './video' \n use_cuda"
},
{
"path": "dvs/conf/stabilzation_train.yaml",
"chars": 1390,
"preview": "data:\n exp: 'stabilzation_train'\n checkpoints_dir: './checkpoint'\n log: './log'\n data_dir: './dataset_release' "
},
{
"path": "dvs/dataset.py",
"chars": 13230,
"preview": "from torch.utils.data import Dataset\nimport os\nimport collections\nfrom gyro import (\n LoadGyroData, \n LoadOISData,"
},
{
"path": "dvs/flownet2/LICENSE",
"chars": 558,
"preview": "Copyright 2017 NVIDIA CORPORATION\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this "
},
{
"path": "dvs/flownet2/README.md",
"chars": 5514,
"preview": "# flownet2-pytorch \n\nPytorch implementation of [FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks](ht"
},
{
"path": "dvs/flownet2/__init__.py",
"chars": 36,
"preview": "from .utils import flow_utils, tools"
},
{
"path": "dvs/flownet2/convert.py",
"chars": 4703,
"preview": "#!/usr/bin/env python2.7\n\nimport caffe\nfrom caffe.proto import caffe_pb2\nimport sys, os\n\nimport torch\nimport torch.nn as"
},
{
"path": "dvs/flownet2/datasets.py",
"chars": 15386,
"preview": "import torch\nimport torch.utils.data as data\n\nimport os, math, random\nfrom os.path import *\nimport numpy as np\n\nfrom glo"
},
{
"path": "dvs/flownet2/install.sh",
"chars": 340,
"preview": "#!/bin/bash\ncd ./networks/correlation_package\nrm -rf *_cuda.egg-info build dist __pycache__\npython3 setup.py install --u"
},
{
"path": "dvs/flownet2/losses.py",
"chars": 2826,
"preview": "'''\r\nPortions of this code copyright 2017, Clement Pinard\r\n'''\r\n\r\n# freda (todo) : adversarial loss \r\n\r\nimport torch\r\nim"
},
{
"path": "dvs/flownet2/main.py",
"chars": 11058,
"preview": "#!/usr/bin/env python\nimport os\nos.environ[\"CUDA_VISIBLE_DEVICES\"] = \"0\"\nimport torch\nimport torch.nn as nn\nfrom torch.u"
},
{
"path": "dvs/flownet2/models.py",
"chars": 19918,
"preview": "import torch\r\nimport torch.nn as nn\r\nfrom torch.nn import init\r\n\r\nimport math\r\nimport numpy as np\r\n\r\ntry:\r\n from netw"
},
{
"path": "dvs/flownet2/networks/FlowNetC.py",
"chars": 4757,
"preview": "import torch\nimport torch.nn as nn\nfrom torch.nn import init\n\nimport math\nimport numpy as np\n\nfrom .correlation_package."
},
{
"path": "dvs/flownet2/networks/FlowNetFusion.py",
"chars": 2329,
"preview": "import torch\nimport torch.nn as nn\nfrom torch.nn import init\n\nimport math\nimport numpy as np\n\nfrom .submodules import *\n"
},
{
"path": "dvs/flownet2/networks/FlowNetS.py",
"chars": 3652,
"preview": "'''\nPortions of this code copyright 2017, Clement Pinard\n'''\n\nimport torch\nimport torch.nn as nn\nfrom torch.nn import in"
},
{
"path": "dvs/flownet2/networks/FlowNetSD.py",
"chars": 4187,
"preview": "import torch\nimport torch.nn as nn\nfrom torch.nn import init\n\nimport math\nimport numpy as np\n\nfrom .submodules import *\n"
},
{
"path": "dvs/flownet2/networks/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "dvs/flownet2/networks/channelnorm_package/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "dvs/flownet2/networks/channelnorm_package/channelnorm.py",
"chars": 1080,
"preview": "from torch.autograd import Function, Variable\nfrom torch.nn.modules.module import Module\nimport channelnorm_cuda\n\nclass "
},
{
"path": "dvs/flownet2/networks/channelnorm_package/channelnorm_cuda.cc",
"chars": 722,
"preview": "#include <torch/torch.h>\n#include <ATen/ATen.h>\n\n#include \"channelnorm_kernel.cuh\"\n\nint channelnorm_cuda_forward(\n at"
},
{
"path": "dvs/flownet2/networks/channelnorm_package/channelnorm_kernel.cu",
"chars": 6061,
"preview": "#include <ATen/ATen.h>\n#include <ATen/Context.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#include \"channelnorm_kernel.cuh\"\n\n"
},
{
"path": "dvs/flownet2/networks/channelnorm_package/channelnorm_kernel.cuh",
"chars": 298,
"preview": "#pragma once\n\n#include <ATen/ATen.h>\n\nvoid channelnorm_kernel_forward(\n at::Tensor& input1,\n at::Tensor& output, \n"
},
{
"path": "dvs/flownet2/networks/channelnorm_package/setup.py",
"chars": 725,
"preview": "#!/usr/bin/env python3\nimport os\nimport torch\n\nfrom setuptools import setup\nfrom torch.utils.cpp_extension import BuildE"
},
{
"path": "dvs/flownet2/networks/correlation_package/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "dvs/flownet2/networks/correlation_package/correlation.py",
"chars": 2145,
"preview": "import torch\nfrom torch.nn.modules.module import Module\nfrom torch.autograd import Function\nimport correlation_cuda\n\ncla"
},
{
"path": "dvs/flownet2/networks/correlation_package/correlation_cuda.cc",
"chars": 6611,
"preview": "#include <torch/torch.h>\n#include <ATen/ATen.h>\n#include <ATen/Context.h>\n#include <ATen/cuda/CUDAContext.h>\n#include <s"
},
{
"path": "dvs/flownet2/networks/correlation_package/correlation_cuda_kernel.cu",
"chars": 19919,
"preview": "#include <stdio.h>\n\n#include \"correlation_cuda_kernel.cuh\"\n\n#define CUDA_NUM_THREADS 1024\n#define THREADS_PER_BLOCK 32\n#"
},
{
"path": "dvs/flownet2/networks/correlation_package/correlation_cuda_kernel.cuh",
"chars": 1409,
"preview": "#pragma once\n\n#include <ATen/ATen.h>\n#include <ATen/Context.h>\n#include <cuda_runtime.h>\n\nint correlation_forward_cuda_k"
},
{
"path": "dvs/flownet2/networks/correlation_package/setup.py",
"chars": 791,
"preview": "#!/usr/bin/env python3\nimport os\nimport torch\n\nfrom setuptools import setup, find_packages\nfrom torch.utils.cpp_extensio"
},
{
"path": "dvs/flownet2/networks/resample2d_package/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "dvs/flownet2/networks/resample2d_package/resample2d.py",
"chars": 1597,
"preview": "from torch.nn.modules.module import Module\nfrom torch.autograd import Function, Variable\nimport resample2d_cuda\n\nclass R"
},
{
"path": "dvs/flownet2/networks/resample2d_package/resample2d_cuda.cc",
"chars": 852,
"preview": "#include <ATen/ATen.h>\n#include <torch/torch.h>\n\n#include \"resample2d_kernel.cuh\"\n\nint resample2d_cuda_forward(\n at::"
},
{
"path": "dvs/flownet2/networks/resample2d_package/resample2d_kernel.cu",
"chars": 13033,
"preview": "#include <ATen/ATen.h>\n#include <ATen/Context.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#define CUDA_NUM_THREADS 512 \n#defi"
},
{
"path": "dvs/flownet2/networks/resample2d_package/resample2d_kernel.cuh",
"chars": 391,
"preview": "#pragma once\n\n#include <ATen/ATen.h>\n\nvoid resample2d_kernel_forward(\n at::Tensor& input1,\n at::Tensor& input2,\n "
},
{
"path": "dvs/flownet2/networks/resample2d_package/setup.py",
"chars": 767,
"preview": "#!/usr/bin/env python3\nimport os\nimport torch\n\nfrom setuptools import setup\nfrom torch.utils.cpp_extension import BuildE"
},
{
"path": "dvs/flownet2/networks/submodules.py",
"chars": 2820,
"preview": "# freda (todo) : \r\n\r\nimport torch.nn as nn\r\nimport torch\r\nimport numpy as np \r\n\r\ndef conv(batchNorm, in_planes, out_plan"
},
{
"path": "dvs/flownet2/run.sh",
"chars": 201,
"preview": "#!/bin/bash\npython main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \\\n\t--inference_dataset_ro"
},
{
"path": "dvs/flownet2/run_release.sh",
"chars": 424,
"preview": "#!/bin/bash\npython main.py --inference --model FlowNet2 --save_flow --inference_dataset Google \\\n\t--inference_dataset_ro"
},
{
"path": "dvs/flownet2/utils/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "dvs/flownet2/utils/flow_utils.py",
"chars": 5106,
"preview": "import numpy as np\nimport matplotlib.pyplot as plt\nimport os.path\n\nTAG_CHAR = np.array([202021.25], np.float32)\n\ndef rea"
},
{
"path": "dvs/flownet2/utils/frame_utils.py",
"chars": 531,
"preview": "import numpy as np\nfrom os.path import *\nfrom imageio import imread\nfrom . import flow_utils \n\ndef read_gen(file_name):\n"
},
{
"path": "dvs/flownet2/utils/param_utils.py",
"chars": 6878,
"preview": "import torch\nimport torch.nn as nn\nimport numpy as np\n\ndef parse_flownetc(modules, weights, biases):\n keys = [\n 'c"
},
{
"path": "dvs/flownet2/utils/tools.py",
"chars": 5388,
"preview": "# freda (todo) : \n\nimport os, time, sys, math\nimport subprocess, shutil\nfrom os.path import *\nimport numpy as np\nfrom in"
},
{
"path": "dvs/gyro/__init__.py",
"chars": 1009,
"preview": "from .gyro_function import (\n GetGyroAtTimeStamp,\n QuaternionProduct,\n QuaternionReciprocal,\n ConvertQuatern"
},
{
"path": "dvs/gyro/gyro_function.py",
"chars": 22582,
"preview": "import numpy as np\nfrom numpy import linalg as LA\nimport matplotlib.pyplot as plt\nimport torch\nfrom torch.autograd impor"
},
{
"path": "dvs/gyro/gyro_io.py",
"chars": 5633,
"preview": "import numpy as np\nfrom numpy import linalg as LA\nimport matplotlib.pyplot as plt\nimport scipy.io as sio\nfrom .gyro_func"
},
{
"path": "dvs/inference.py",
"chars": 8597,
"preview": "import os\nimport sys\nimport torch\nimport torchvision\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nimport t"
},
{
"path": "dvs/load_frame_sensor_data.py",
"chars": 4283,
"preview": "import os\nos.environ[\"CUDA_VISIBLE_DEVICES\"] = \"0\"\nimport sys\nimport torch\nimport torchvision\nimport torch.nn as nn\nfrom"
},
{
"path": "dvs/loss.py",
"chars": 7532,
"preview": "import torch\nimport numpy as np\nfrom torch.autograd import Variable\nimport operator\nimport torch.nn.functional as F\nimpo"
},
{
"path": "dvs/metrics.py",
"chars": 11587,
"preview": "import os\nimport sys\nimport numpy as np\nimport cv2\nimport math\nimport pdb\nimport matplotlib.pyplot as plt\nfrom printer i"
},
{
"path": "dvs/model.py",
"chars": 15138,
"preview": "import math\nimport torch\nfrom collections import OrderedDict\n\nimport torch.nn as nn\nimport numpy as np\nimport util\nimpor"
},
{
"path": "dvs/printer.py",
"chars": 812,
"preview": "import sys\n\nclass Printer(object):\n def __init__(self, *files):\n self.files = files\n \n #Redirect Pri"
},
{
"path": "dvs/requirements.txt",
"chars": 187,
"preview": "colorama==0.4.4\nffmpeg==1.4\nimageio==2.9.0\nmatplotlib==3.3.4\nopencv-contrib-python==4.5.1.48\nopencv-python==4.5.1.48\npyt"
},
{
"path": "dvs/train.py",
"chars": 9694,
"preview": "import os\nimport sys\nimport torch\nimport torchvision\nimport torch.nn as nn\nfrom torch.autograd import Variable\n\nimport t"
},
{
"path": "dvs/util.py",
"chars": 2687,
"preview": "import os\nimport torch\nimport cv2\nfrom itertools import chain\nfrom warp import load_video, save_video\nimport numpy as np"
},
{
"path": "dvs/warp/__init__.py",
"chars": 131,
"preview": "from .warping import (\n warp_video\n )\nfrom .read_write import (\n save_video,\n load_video,\n video2frame_on"
},
{
"path": "dvs/warp/rasterizer.py",
"chars": 6666,
"preview": "import numpy as np\nimport matplotlib.pyplot as plt\nfrom numpy import array\nimport torch\nimport cv2\nimport time\n\ndevice ="
},
{
"path": "dvs/warp/read_write.py",
"chars": 4172,
"preview": "import numpy as np\nimport cv2\nimport os\nfrom PIL import Image, ImageDraw, ImageFont\nimport matplotlib.pyplot as plt\nimpo"
},
{
"path": "dvs/warp/warping.py",
"chars": 1625,
"preview": "import numpy as np\nfrom .read_write import load_video, save_video\nimport torch\nimport cv2\nfrom .rasterizer import Raster"
}
]
// ... and 1 more file
About this extraction
This page contains the full source code of the googleinterns/deep-stabilization GitHub repository, extracted and formatted as plain text. The extraction includes 65 files (42.8 MB), approximately 79.8k tokens, and a symbol index with 342 extracted functions, classes, methods, constants, and types.