Full Code of NVlabs/Bi3D for AI

master 4b5fdb48d820 cached
20 files
69.8 KB
20.2k tokens
59 symbols
1 requests
Download .txt
Repository: NVlabs/Bi3D
Branch: master
Commit: 4b5fdb48d820
Files: 20
Total size: 69.8 KB

Directory structure:
gitextract_ulzmvk8p/

├── .gitignore
├── LICENSE.md
├── README.md
├── envs/
│   ├── bi3d_conda_env.yml
│   └── bi3d_pytorch_19_01.DockerFile
└── src/
    ├── models/
    │   ├── Bi3DNet.py
    │   ├── DispRefine2D.py
    │   ├── FeatExtractNet.py
    │   ├── GCNet.py
    │   ├── PSMNet.py
    │   ├── RefineNet2D.py
    │   ├── RefineNet3D.py
    │   ├── SegNet2D.py
    │   └── __init__.py
    ├── project.toml
    ├── run_binary_depth_estimation.py
    ├── run_continuous_depth_estimation.py
    ├── run_demo_kitti15.sh
    ├── run_demo_sf.sh
    └── util.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Add any directories, files, or patterns you don't want to be tracked by version control

*.png
*.pfm
*.pth.tar
*.npy
*.ppm
*.pyc
*.tar
*.zip
*.gif

================================================
FILE: LICENSE.md
================================================
# NVIDIA Source Code License for Bi3D

## 1. Definitions

“Licensor” means any person or entity that distributes its Work.

“Software” means the original work of authorship made available under this License.

“Work” means the Software and any additions to or derivative works of the Software that are made available under this License.

“NVIDIA Processors” means any central processing unit (CPU), graphics processing unit (GPU), field-programmable gate array (FPGA), application-specific integrated circuit (ASIC) or any combination thereof designed, made, sold, or provided by NVIDIA or its affiliates.

The terms “reproduce,” “reproduction,” “derivative works,” and “distribution” have the meaning as provided under U.S. copyright law; provided, however, that for the purposes of this License, derivative works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work.

Works, including the Software, are “made available” under this License by including in or with the Work either (a) a copyright notice referencing the applicability of this License to the Work, or (b) a copy of this License.

## 2. License Grant

### 2.1 Copyright Grant.
Subject to the terms and conditions of this License, each Licensor grants to you a perpetual, worldwide, non-exclusive, royalty-free, copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, sublicense and distribute its Work and any resulting derivative works in any form.

## 3. Limitations

### 3.1 Redistribution.
You may reproduce or distribute the Work only if (a) you do so under this License, (b) you include a complete copy of this License with your distribution, and (c) you retain without modification any copyright, patent, trademark, or attribution notices that are present in the Work.

### 3.2 Derivative Works.
You may specify that additional or different terms apply to the use, reproduction, and distribution of your derivative works of the Work (“Your Terms”) only if (a) Your Terms provide that the use limitation in Section 3.3 applies to your derivative works, and (b) you identify the specific derivative works that are subject to Your Terms. Notwithstanding Your Terms, this License (including the redistribution requirements in Section 3.1) will continue to apply to the Work itself.

### 3.3 Use Limitation.
The Work and any derivative works thereof only may be used or intended for use non-commercially and with NVIDIA Processors. Notwithstanding the foregoing, NVIDIA and its affiliates may use the Work and any derivative works commercially. As used herein, “non-commercially” means for research or evaluation purposes only.

### 3.4 Patent Claims.
If you bring or threaten to bring a patent claim against any Licensor (including any claim, cross-claim or counterclaim in a lawsuit) to enforce any patents that you allege are infringed by any Work, then your rights under this License from such Licensor (including the grant in Section 2.1) will terminate immediately.

### 3.5 Trademarks.
This License does not grant any rights to use any Licensor’s or its affiliates’ names, logos, or trademarks, except as necessary to reproduce the notices described in this License.

### 3.6 Termination.
If you violate any term of this License, then your rights under this License (including the grant in Section 2.1) will terminate immediately.

## 4. Disclaimer of Warranty.

THE WORK IS PROVIDED “AS IS” WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WARRANTIES OR CONDITIONS OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE OR NON-INFRINGEMENT. YOU BEAR THE RISK OF UNDERTAKING ANY ACTIVITIES UNDER THIS LICENSE.

## 5. Limitation of Liability.

EXCEPT AS PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE SHALL ANY LICENSOR BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF OR RELATED TO THIS LICENSE, THE USE OR INABILITY TO USE THE WORK (INCLUDING BUT NOT LIMITED TO LOSS OF GOODWILL, BUSINESS INTERRUPTION, LOST PROFITS OR DATA, COMPUTER FAILURE OR MALFUNCTION, OR ANY OTHER COMMERCIAL DAMAGES OR LOSSES), EVEN IF THE LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

================================================
FILE: README.md
================================================
## Bi3D — Official PyTorch Implementation

![Teaser image](data/teaser.png)

**Bi3D: Stereo Depth Estimation via Binary Classifications**<br>
Abhishek Badki, Alejandro Troccoli, Kihwan Kim, Jan Kautz, Pradeep Sen, and Orazio Gallo<br>
IEEE CVPR 2020<br>

## Abstract: 
*Stereo-based depth estimation is a cornerstone of computer vision, with state-of-the-art methods delivering accurate results in real time. For several applications such as autonomous navigation, however, it may be useful to trade accuracy for lower latency. We present Bi3D, a method that estimates depth via a series of binary classifications. Rather than testing if objects are* at *a particular depth D, as existing stereo methods do, it classifies them as being* closer *or* farther *than D. This property offers a powerful mechanism to balance accuracy and latency. Given a strict time budget, Bi3D can detect objects closer than a given distance in as little as a few milliseconds, or estimate depth with arbitrarily coarse quantization, with complexity linear with the number of quantization levels. Bi3D can also use the allotted quantization levels to get continuous depth, but in a specific depth range. For standard stereo (i.e., continuous depth on the whole range), our method is close to or on par with state-of-the-art, finely tuned stereo methods.*


## Paper:
https://arxiv.org/pdf/2005.07274.pdf<br>

## Videos:<br>
<a href="https://www.youtube.com/watch?v=HuEwjpw5O64&feature=youtu.be">
  <img src="https://img.youtube.com/vi/HuEwjpw5O64/0.jpg" width="300"/>
</a>
<a href="https://www.youtube.com/watch?v=UfvUny4pdMA&feature=youtu.be">
  <img src="https://img.youtube.com/vi/UfvUny4pdMA/0.jpg" width="300"/>
</a>
<a href="https://www.youtube.com/watch?v=Ifgcm6VI3NE&feature=youtu.be">
  <img src="https://img.youtube.com/vi/Ifgcm6VI3NE/0.jpg" width="300"/>
</a>

## Citing Bi3D:
    @InProceedings{badki2020Bi3D,
    author = {Badki, Abhishek and Troccoli, Alejandro and Kim, Kihwan and Kautz, Jan and Sen, Pradeep and Gallo, Orazio},
    title = {{Bi3D}: {S}tereo Depth Estimation via Binary Classifications},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2020}
    }

or the arXiv paper

    @InProceedings{badki2020Bi3D,
    author = {Badki, Abhishek and Troccoli, Alejandro and Kim, Kihwan and Kautz, Jan and Sen, Pradeep and Gallo, Orazio},
    title = {{Bi3D}: {S}tereo Depth Estimation via Binary Classifications},
    booktitle = {arXiv preprint arXiv:2005.07274},
    year = {2020}
    }


## Code:<br>

### License

Copyright (C) 2020 NVIDIA Corporation.  All rights reserved.

Licensed under the [NVIDIA Source Code License](LICENSE.md)

### Description


### Setup

We offer two ways of setting up your environemnt, through Docker or Conda.

#### Docker
For convenience, we provide a Dockerfile to build a container image to run the code. The image will contain the Python dependencies.

System requirements:

1. Docker (Tested on version 19.03.11)

2. [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker/wiki)

3. NVIDIA GPU driver.

Build the container image:
```
docker build -t bi3d . -f envs/bi3d_pytorch_19_01.DockerFile
```
To launch the container, run the following:
```
docker run --rm -it --gpus=all -v $(pwd):/bi3d -w /bi3d --net=host --ipc=host bi3d:latest /bin/bash
```

#### Conda
All dependencies will be installed automatically using the following:
```
conda env create -f envs/bi3d_conda_env.yml 
```
You can activate the environment by running:
```
conda activate bi3d
```

### Pre-trained models
Download the pre-trained models [here](https://drive.google.com/file/d/1X4Ing9WumtIxonNXXCzKJulJtPgzk61n).

### Run the demo

```
cd src
# RUN DEMO FOR SCENEFLOW DATASET 
sh run_demo_sf.sh
# RUN DEMO FOR KITTI15 DATASET
sh run_demo_kitti15.sh
```


================================================
FILE: envs/bi3d_conda_env.yml
================================================
name: bi3d
channels:
  - pytorch
  - soumith
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - blas=1.0=mkl
  - ca-certificates=2020.6.24=0
  - certifi=2020.6.20=py37_0
  - cudatoolkit=10.0.130=0
  - freetype=2.10.2=h5ab3b9f_0
  - intel-openmp=2020.1=217
  - jpeg=9b=h024ee3a_2
  - lcms2=2.11=h396b838_0
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libedit=3.1.20191231=h14c3975_1
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.37=hbc83047_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - libtiff=4.1.0=h2733197_1
  - lz4-c=1.9.2=he6710b0_0
  - mkl=2020.1=217
  - mkl-service=2.3.0=py37he904b0f_0
  - mkl_fft=1.1.0=py37h23d657b_0
  - mkl_random=1.1.1=py37h0573a6f_0
  - ncurses=6.2=he6710b0_1
  - ninja=1.9.0=py37hfd86e86_0
  - numpy=1.18.5=py37ha1c710e_0
  - numpy-base=1.18.5=py37hde5b4d6_0
  - olefile=0.46=py_0
  - openssl=1.1.1g=h7b6447c_0
  - pillow=7.2.0=py37hb39fc2d_0
  - pip=20.1.1=py37_1
  - python=3.7.7=hcff3b4d_5
  - pytorch=1.4.0=py3.7_cuda10.0.130_cudnn7.6.3_0
  - readline=8.0=h7b6447c_0
  - setuptools=49.2.0=py37_0
  - six=1.15.0=py_0
  - sqlite=3.32.3=h62c20be_0
  - tk=8.6.10=hbc83047_0
  - torchvision=0.5.0=py37_cu100
  - wheel=0.34.2=py37_0
  - xz=5.2.5=h7b6447c_0
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.4.5=h0b5b093_0
  - pip:
    - imageio==2.9.0
    - opencv-python==4.3.0.36
    - protobuf==3.12.2
    - tensorboardx==2.1



================================================
FILE: envs/bi3d_pytorch_19_01.DockerFile
================================================
FROM nvcr.io/nvidia/pytorch:19.01-py3

RUN pip install Pillow
RUN pip install imageio
RUN pip install tensorboardX
RUN pip install opencv-python


================================================
FILE: src/models/Bi3DNet.py
================================================
# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.nn.functional as F

import models.FeatExtractNet as FeatNet
import models.SegNet2D as SegNet
import models.RefineNet2D as RefineNet
import models.RefineNet3D as RefineNet3D


__all__ = ["bi3dnet_binary_depth", "bi3dnet_continuous_depth_2D", "bi3dnet_continuous_depth_3D"]


def compute_cost_volume(features_left, features_right, disp_ids, max_disp, is_disps_per_example):

    batch_size = features_left.shape[0]
    feature_size = features_left.shape[1]
    H = features_left.shape[2]
    W = features_left.shape[3]

    psv_size = disp_ids.shape[1]

    psv = Variable(features_left.new_zeros(batch_size, psv_size, feature_size * 2, H, W + max_disp)).cuda()

    if is_disps_per_example:
        for i in range(batch_size):
            psv[i, 0, :feature_size, :, 0:W] = features_left[i]
            psv[i, 0, feature_size:, :, disp_ids[i, 0] : W + disp_ids[i, 0]] = features_right[i]
        psv = psv.contiguous()
    else:
        for i in range(psv_size):
            psv[:, i, :feature_size, :, 0:W] = features_left
            psv[:, i, feature_size:, :, disp_ids[0, i] : W + disp_ids[0, i]] = features_right
        psv = psv.contiguous()

    return psv


"""
Bi3DNet for continuous depthmap generation. Doesn't use 3D regularization.
"""


class Bi3DNetContinuousDepth2D(nn.Module):
    def __init__(self, options, featnet_arch, segnet_arch, refinenet_arch=None, max_disparity=192):

        super(Bi3DNetContinuousDepth2D, self).__init__()

        self.max_disparity = max_disparity
        self.max_disparity_seg = int(self.max_disparity / 3)
        self.is_disps_per_example = False
        self.is_save_memory = False

        self.is_refine = True
        if refinenet_arch == None:
            self.is_refine = False

        self.featnet = FeatNet.__dict__[featnet_arch](options, data=None)
        self.segnet = SegNet.__dict__[segnet_arch](options, data=None)
        if self.is_refine:
            self.refinenet = RefineNet.__dict__[refinenet_arch](options, data=None)

        return

    def forward(self, img_left, img_right, disp_ids):

        batch_size = img_left.shape[0]
        psv_size = disp_ids.shape[1]

        if psv_size == 1:
            self.is_disps_per_example = True
        else:
            self.is_disps_per_example = False

        # Feature Extraction
        features_left = self.featnet(img_left)
        features_right = self.featnet(img_right)
        feature_size = features_left.shape[1]
        H = features_left.shape[2]
        W = features_left.shape[3]

        # Cost Volume Generation
        psv = compute_cost_volume(
            features_left, features_right, disp_ids, self.max_disparity_seg, self.is_disps_per_example
        )

        psv = psv.view(batch_size * psv_size, feature_size * 2, H, W + self.max_disparity_seg)

        # Segmentation Network
        seg_raw_low_res = self.segnet(psv)[:, :, :, :W]
        seg_raw_low_res = seg_raw_low_res.view(batch_size, 1, psv_size, H, W)

        # Upsampling
        seg_prob_low_res_up = torch.sigmoid(
            F.interpolate(
                seg_raw_low_res,
                size=[psv_size * 3, img_left.size()[-2], img_left.size()[-1]],
                mode="trilinear",
                align_corners=False,
            )
        )
        seg_prob_low_res_up = seg_prob_low_res_up[:, 0, 1:-1, :, :]

        # Projection
        disparity_normalized = torch.mean((seg_prob_low_res_up), dim=1, keepdim=True)

        # Refinement
        if self.is_refine:
            refine_net_input = torch.cat((disparity_normalized, img_left), dim=1)
            disparity_normalized = self.refinenet(refine_net_input)

        return seg_prob_low_res_up, disparity_normalized


def bi3dnet_continuous_depth_2D(options, data=None):

    print("==> USING Bi3DNetContinuousDepth2D")
    for key in options:
        if "bi3dnet" in key:
            print("{} : {}".format(key, options[key]))

    model = Bi3DNetContinuousDepth2D(
        options,
        featnet_arch=options["bi3dnet_featnet_arch"],
        segnet_arch=options["bi3dnet_segnet_arch"],
        refinenet_arch=options["bi3dnet_refinenet_arch"],
        max_disparity=options["bi3dnet_max_disparity"],
    )

    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


"""
Bi3DNet for continuous depthmap generation. Uses 3D regularization.
"""


class Bi3DNetContinuousDepth3D(nn.Module):
    def __init__(
        self,
        options,
        featnet_arch,
        segnet_arch,
        refinenet_arch=None,
        refinenet3d_arch=None,
        max_disparity=192,
    ):

        super(Bi3DNetContinuousDepth3D, self).__init__()

        self.max_disparity = max_disparity
        self.max_disparity_seg = int(self.max_disparity / 3)
        self.is_disps_per_example = False
        self.is_save_memory = False

        self.is_refine = True
        if refinenet_arch == None:
            self.is_refine = False

        self.featnet = FeatNet.__dict__[featnet_arch](options, data=None)
        self.segnet = SegNet.__dict__[segnet_arch](options, data=None)
        if self.is_refine:
            self.refinenet = RefineNet.__dict__[refinenet_arch](options, data=None)
            self.refinenet3d = RefineNet3D.__dict__[refinenet3d_arch](options, data=None)

        return

    def forward(self, img_left, img_right, disp_ids):

        batch_size = img_left.shape[0]
        psv_size = disp_ids.shape[1]

        if psv_size == 1:
            self.is_disps_per_example = True
        else:
            self.is_disps_per_example = False

        # Feature Extraction
        features_left = self.featnet(img_left)
        features_right = self.featnet(img_right)
        feature_size = features_left.shape[1]
        H = features_left.shape[2]
        W = features_left.shape[3]

        # Cost Volume Generation
        psv = compute_cost_volume(
            features_left, features_right, disp_ids, self.max_disparity_seg, self.is_disps_per_example
        )

        psv = psv.view(batch_size * psv_size, feature_size * 2, H, W + self.max_disparity_seg)

        # Segmentation Network
        seg_raw_low_res = self.segnet(psv)[:, :, :, :W]  # cropped to remove excess boundary
        seg_raw_low_res = seg_raw_low_res.view(batch_size, 1, psv_size, H, W)

        # Upsampling
        seg_prob_low_res_up = torch.sigmoid(
            F.interpolate(
                seg_raw_low_res,
                size=[psv_size * 3, img_left.size()[-2], img_left.size()[-1]],
                mode="trilinear",
                align_corners=False,
            )
        )

        seg_prob_low_res_up = seg_prob_low_res_up[:, 0, 1:-1, :, :]

        # Upsampling after 3D Regularization
        seg_raw_low_res_refined = seg_raw_low_res
        seg_raw_low_res_refined[:, :, 1:, :, :] = self.refinenet3d(
            features_left, seg_raw_low_res_refined[:, :, 1:, :, :]
        )

        seg_prob_low_res_refined_up = torch.sigmoid(
            F.interpolate(
                seg_raw_low_res_refined,
                size=[psv_size * 3, img_left.size()[-2], img_left.size()[-1]],
                mode="trilinear",
                align_corners=False,
            )
        )

        seg_prob_low_res_refined_up = seg_prob_low_res_refined_up[:, 0, 1:-1, :, :]

        # Projection
        disparity_normalized_noisy = torch.mean((seg_prob_low_res_refined_up), dim=1, keepdim=True)

        # Refinement
        if self.is_refine:
            refine_net_input = torch.cat((disparity_normalized_noisy, img_left), dim=1)
            disparity_normalized = self.refinenet(refine_net_input)

        return (
            seg_prob_low_res_up,
            seg_prob_low_res_refined_up,
            disparity_normalized_noisy,
            disparity_normalized,
        )


def bi3dnet_continuous_depth_3D(options, data=None):

    print("==> USING Bi3DNetContinuousDepth3D")
    for key in options:
        if "bi3dnet" in key:
            print("{} : {}".format(key, options[key]))

    model = Bi3DNetContinuousDepth3D(
        options,
        featnet_arch=options["bi3dnet_featnet_arch"],
        segnet_arch=options["bi3dnet_segnet_arch"],
        refinenet_arch=options["bi3dnet_refinenet_arch"],
        refinenet3d_arch=options["bi3dnet_regnet_arch"],
        max_disparity=options["bi3dnet_max_disparity"],
    )

    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


"""
Bi3DNet for binary depthmap generation.
"""


class Bi3DNetBinaryDepth(nn.Module):
    def __init__(
        self,
        options,
        featnet_arch,
        segnet_arch,
        refinenet_arch=None,
        featnethr_arch=None,
        max_disparity=192,
        is_disps_per_example=False,
    ):

        super(Bi3DNetBinaryDepth, self).__init__()

        self.max_disparity = max_disparity
        self.max_disparity_seg = int(max_disparity / 3)
        self.is_disps_per_example = is_disps_per_example

        self.is_refine = True
        if refinenet_arch == None:
            self.is_refine = False

        self.featnet = FeatNet.__dict__[featnet_arch](options, data=None)
        self.featnethr = FeatNet.__dict__[featnethr_arch](options, data=None)
        self.segnet = SegNet.__dict__[segnet_arch](options, data=None)
        if self.is_refine:
            self.refinenet = RefineNet.__dict__[refinenet_arch](options, data=None)

        return

    def forward(self, img_left, img_right, disp_ids):

        batch_size = img_left.shape[0]
        psv_size = disp_ids.shape[1]

        if psv_size == 1:
            self.is_disps_per_example = True
        else:
            self.is_disps_per_example = False

        # Feature Extraction
        features = self.featnet(torch.cat((img_left, img_right), dim=0))

        features_left = features[:batch_size, :, :, :]
        features_right = features[batch_size:, :, :, :]

        if self.is_refine:
            features_lefthr = self.featnethr(img_left)
        feature_size = features_left.shape[1]
        H = features_left.shape[2]
        W = features_left.shape[3]

        # Cost Volume Generation
        psv = compute_cost_volume(
            features_left, features_right, disp_ids, self.max_disparity_seg, self.is_disps_per_example
        )

        psv = psv.view(batch_size * psv_size, feature_size * 2, H, W + self.max_disparity_seg)

        # Segmentation Network
        seg_raw_low_res = self.segnet(psv)[:, :, :, :W]  # cropped to remove excess boundary
        seg_prob_low_res = torch.sigmoid(seg_raw_low_res)
        seg_prob_low_res = seg_prob_low_res.view(batch_size, psv_size, H, W)

        seg_prob_low_res_up = F.interpolate(
            seg_prob_low_res, size=img_left.size()[-2:], mode="bilinear", align_corners=False
        )
        out = []
        out.append(seg_prob_low_res_up)

        # Refinement
        if self.is_refine:
            seg_raw_high_res = F.interpolate(
                seg_raw_low_res, size=img_left.size()[-2:], mode="bilinear", align_corners=False
            )
            # Refine Net
            features_left_expand = (
                features_lefthr[:, None, :, :, :].expand(-1, psv_size, -1, -1, -1).contiguous()
            )
            features_left_expand = features_left_expand.view(
                -1, features_lefthr.size()[1], features_lefthr.size()[2], features_lefthr.size()[3]
            )
            refine_net_input = torch.cat((seg_raw_high_res, features_left_expand), dim=1)

            seg_raw_high_res = self.refinenet(refine_net_input)

            seg_prob_high_res = torch.sigmoid(seg_raw_high_res)
            seg_prob_high_res = seg_prob_high_res.view(
                batch_size, psv_size, img_left.size()[-2], img_left.size()[-1]
            )
            out.append(seg_prob_high_res)
        else:
            out.append(seg_prob_low_res_up)

        return out


def bi3dnet_binary_depth(options, data=None):

    print("==> USING Bi3DNetBinaryDepth")
    for key in options:
        if "bi3dnet" in key:
            print("{} : {}".format(key, options[key]))

    model = Bi3DNetBinaryDepth(
        options,
        featnet_arch=options["bi3dnet_featnet_arch"],
        segnet_arch=options["bi3dnet_segnet_arch"],
        refinenet_arch=options["bi3dnet_refinenet_arch"],
        featnethr_arch=options["bi3dnet_featnethr_arch"],
        max_disparity=options["bi3dnet_max_disparity"],
        is_disps_per_example=options["bi3dnet_disps_per_example_true"],
    )

    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


================================================
FILE: src/models/DispRefine2D.py
================================================
# MIT License
#
# Copyright (c) 2019 Xuanyi Li (xuanyili.edu@gmail.com)
# Copyright (c) 2020 NVIDIA
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

import torch
import torch.nn as nn
import torch.nn.functional as F
import math

from models.PSMNet import conv2d
from models.PSMNet import conv2d_lrelu

"""
The code in this file is adapted
from https://github.com/meteorshowers/StereoNet-ActiveStereoNet
"""


class BasicBlock(nn.Module):

    expansion = 1

    def __init__(self, inplanes, planes, stride, downsample, pad, dilation):

        super(BasicBlock, self).__init__()

        self.conv1 = conv2d_lrelu(inplanes, planes, 3, stride, pad, dilation)
        self.conv2 = conv2d(planes, planes, 3, 1, pad, dilation)

        self.downsample = downsample
        self.stride = stride

    def forward(self, x):

        out = self.conv1(x)
        out = self.conv2(out)

        if self.downsample is not None:
            x = self.downsample(x)

        out += x

        return out


class DispRefineNet(nn.Module):
    def __init__(self, out_planes=32):

        super(DispRefineNet, self).__init__()

        self.out_planes = out_planes

        self.conv2d_feature = conv2d_lrelu(
            in_planes=4, out_planes=self.out_planes, kernel_size=3, stride=1, pad=1, dilation=1
        )

        self.residual_astrous_blocks = nn.ModuleList()
        astrous_list = [1, 2, 4, 8, 1, 1]
        for di in astrous_list:
            self.residual_astrous_blocks.append(
                BasicBlock(self.out_planes, self.out_planes, stride=1, downsample=None, pad=1, dilation=di)
            )

        self.conv2d_out = nn.Conv2d(self.out_planes, 1, kernel_size=3, stride=1, padding=1)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.Conv3d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.kernel_size[2] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.BatchNorm3d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()

        return

    def forward(self, x):

        disp = x[:, 0, :, :][:, None, :, :]
        output = self.conv2d_feature(x)

        for astrous_block in self.residual_astrous_blocks:
            output = astrous_block(output)

        output = self.conv2d_out(output)  # residual disparity
        output = output + disp  # final disparity

        return output


================================================
FILE: src/models/FeatExtractNet.py
================================================
# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

from __future__ import print_function
import torch
import torch.nn as nn
import math

from models.PSMNet import conv2d
from models.PSMNet import conv2d_relu
from models.PSMNet import FeatExtractNetSPP

__all__ = ["featextractnetspp", "featextractnethr"]


"""
Feature extraction network. 
Generates 16D features at the image resolution.
Used for final refinement. 
"""


class FeatExtractNetHR(nn.Module):
    def __init__(self, out_planes=16):

        super(FeatExtractNetHR, self).__init__()

        self.conv1 = nn.Sequential(
            conv2d_relu(3, out_planes, kernel_size=3, stride=1, pad=1, dilation=1),
            conv2d_relu(out_planes, out_planes, kernel_size=3, stride=1, pad=1, dilation=1),
            nn.Conv2d(out_planes, out_planes, kernel_size=1, padding=0, stride=1, bias=False),
        )

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.Conv3d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.kernel_size[2] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.BatchNorm3d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()

        return

    def forward(self, input):

        output = self.conv1(input)
        return output


def featextractnethr(options, data=None):

    print("==> USING FeatExtractNetHR")
    for key in options:
        if "featextractnethr" in key:
            print("{} : {}".format(key, options[key]))

    model = FeatExtractNetHR(out_planes=options["featextractnethr_out_planes"])

    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


"""
Feature extraction network. 
Generates 32D features at 3x less resolution.
Uses Spatial Pyramid Pooling inspired by PSMNet.
"""


def featextractnetspp(options, data=None):

    print("==> USING FeatExtractNetSPP")
    for key in options:
        if "feat" in key:
            print("{} : {}".format(key, options[key]))

    model = FeatExtractNetSPP()

    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


================================================
FILE: src/models/GCNet.py
================================================
# Copyright (c) 2018 Wang Yufeng
# Copyright (c) 2020 NVIDIA
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import torch
import torch.nn as nn

"""
The code in this file is adapted from https://github.com/wyf2017/DSMnet
"""


def conv3d_relu(in_planes, out_planes, kernel_size=3, stride=1, activefun=nn.ReLU(inplace=True)):

    return nn.Sequential(
        nn.Conv3d(in_planes, out_planes, kernel_size, stride, padding=(kernel_size - 1) // 2, bias=True),
        activefun,
    )


def deconv3d_relu(in_planes, out_planes, kernel_size=4, stride=2, activefun=nn.ReLU(inplace=True)):

    assert stride > 1
    p = (kernel_size - 1) // 2
    op = stride - (kernel_size - 2 * p)
    return nn.Sequential(
        nn.ConvTranspose3d(
            in_planes, out_planes, kernel_size, stride, padding=p, output_padding=op, bias=True
        ),
        activefun,
    )


"""
GCNet style 3D regularization network
"""


class feature3d(nn.Module):
    def __init__(self, num_F):

        super(feature3d, self).__init__()
        self.F = num_F

        self.l19 = conv3d_relu(self.F + 32, self.F, kernel_size=3, stride=1)
        self.l20 = conv3d_relu(self.F, self.F, kernel_size=3, stride=1)

        self.l21 = conv3d_relu(self.F + 32, self.F * 2, kernel_size=3, stride=2)
        self.l22 = conv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=1)
        self.l23 = conv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=1)

        self.l24 = conv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=2)
        self.l25 = conv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=1)
        self.l26 = conv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=1)

        self.l27 = conv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=2)
        self.l28 = conv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=1)
        self.l29 = conv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=1)

        self.l30 = conv3d_relu(self.F * 2, self.F * 4, kernel_size=3, stride=2)
        self.l31 = conv3d_relu(self.F * 4, self.F * 4, kernel_size=3, stride=1)
        self.l32 = conv3d_relu(self.F * 4, self.F * 4, kernel_size=3, stride=1)

        self.l33 = deconv3d_relu(self.F * 4, self.F * 2, kernel_size=3, stride=2)
        self.l34 = deconv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=2)
        self.l35 = deconv3d_relu(self.F * 2, self.F * 2, kernel_size=3, stride=2)
        self.l36 = deconv3d_relu(self.F * 2, self.F, kernel_size=3, stride=2)

        self.l37 = nn.Conv3d(self.F, 1, kernel_size=3, stride=1, padding=1, bias=True)

    def forward(self, x):

        x18 = x
        x21 = self.l21(x18)
        x24 = self.l24(x21)
        x27 = self.l27(x24)
        x30 = self.l30(x27)
        x31 = self.l31(x30)
        x32 = self.l32(x31)

        x29 = self.l29(self.l28(x27))
        x33 = self.l33(x32) + x29

        x26 = self.l26(self.l25(x24))
        x34 = self.l34(x33) + x26

        x23 = self.l23(self.l22(x21))
        x35 = self.l35(x34) + x23

        x20 = self.l20(self.l19(x18))
        x36 = self.l36(x35) + x20

        x37 = self.l37(x36)

        conf_volume_wo_sig = x37

        return conf_volume_wo_sig


================================================
FILE: src/models/PSMNet.py
================================================
# MIT License
#
# Copyright (c) 2018 Jia-Ren Chang
# Copyright (c) 2020 NVIDIA
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

import torch
import torch.nn as nn
import torch.nn.functional as F
import math

"""
The code in this file is adapted from https://github.com/JiaRenChang/PSMNet
"""


def conv2d(in_planes, out_planes, kernel_size, stride, pad, dilation):

    return nn.Sequential(
        nn.Conv2d(
            in_planes,
            out_planes,
            kernel_size=kernel_size,
            stride=stride,
            padding=dilation if dilation > 1 else pad,
            dilation=dilation,
            bias=True,
        )
    )


def conv2d_relu(in_planes, out_planes, kernel_size, stride, pad, dilation):

    return nn.Sequential(
        nn.Conv2d(
            in_planes,
            out_planes,
            kernel_size=kernel_size,
            stride=stride,
            padding=dilation if dilation > 1 else pad,
            dilation=dilation,
            bias=True,
        ),
        nn.ReLU(inplace=True),
    )


def conv2d_lrelu(in_planes, out_planes, kernel_size, stride, pad, dilation=1):

    return nn.Sequential(
        nn.Conv2d(
            in_planes,
            out_planes,
            kernel_size=kernel_size,
            stride=stride,
            padding=dilation if dilation > 1 else pad,
            dilation=dilation,
            bias=True,
        ),
        nn.LeakyReLU(0.1, inplace=True),
    )


class BasicBlock(nn.Module):

    expansion = 1

    def __init__(self, inplanes, planes, stride, downsample, pad, dilation):

        super(BasicBlock, self).__init__()

        self.conv1 = conv2d_relu(inplanes, planes, 3, stride, pad, dilation)
        self.conv2 = conv2d(planes, planes, 3, 1, pad, dilation)

        self.downsample = downsample
        self.stride = stride

    def forward(self, x):

        out = self.conv1(x)
        out = self.conv2(out)

        if self.downsample is not None:
            x = self.downsample(x)

        out += x

        return out


class FeatExtractNetSPP(nn.Module):
    def __init__(self):

        super(FeatExtractNetSPP, self).__init__()

        self.align_corners = False
        self.inplanes = 32

        self.firstconv = nn.Sequential(
            conv2d_relu(3, 32, 3, 3, 1, 1), conv2d_relu(32, 32, 3, 1, 1, 1), conv2d_relu(32, 32, 3, 1, 1, 1)
        )

        self.layer1 = self._make_layer(BasicBlock, 32, 2, 1, 1, 2)

        self.branch1 = nn.Sequential(nn.AvgPool2d((64, 64), stride=(64, 64)), conv2d_relu(32, 32, 1, 1, 0, 1))

        self.branch2 = nn.Sequential(nn.AvgPool2d((32, 32), stride=(32, 32)), conv2d_relu(32, 32, 1, 1, 0, 1))

        self.branch3 = nn.Sequential(nn.AvgPool2d((16, 16), stride=(16, 16)), conv2d_relu(32, 32, 1, 1, 0, 1))

        self.branch4 = nn.Sequential(nn.AvgPool2d((8, 8), stride=(8, 8)), conv2d_relu(32, 32, 1, 1, 0, 1))

        self.lastconv = nn.Sequential(
            conv2d_relu(160, 64, 3, 1, 1, 1),
            nn.Conv2d(64, 32, kernel_size=1, padding=0, stride=1, bias=False),
        )

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.Conv3d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.kernel_size[2] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.BatchNorm3d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride, pad, dilation):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample, pad, dilation))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes, 1, None, pad, dilation))

        return nn.Sequential(*layers)

    def forward(self, input):

        output0 = self.firstconv(input)
        output1 = self.layer1(output0)

        output_branch1 = self.branch1(output1)
        output_branch1 = F.interpolate(
            output_branch1,
            (output1.size()[2], output1.size()[3]),
            mode="bilinear",
            align_corners=self.align_corners,
        )

        output_branch2 = self.branch2(output1)
        output_branch2 = F.interpolate(
            output_branch2,
            (output1.size()[2], output1.size()[3]),
            mode="bilinear",
            align_corners=self.align_corners,
        )

        output_branch3 = self.branch3(output1)
        output_branch3 = F.interpolate(
            output_branch3,
            (output1.size()[2], output1.size()[3]),
            mode="bilinear",
            align_corners=self.align_corners,
        )

        output_branch4 = self.branch4(output1)
        output_branch4 = F.interpolate(
            output_branch4,
            (output1.size()[2], output1.size()[3]),
            mode="bilinear",
            align_corners=self.align_corners,
        )

        output_feature = torch.cat(
            (output1, output_branch4, output_branch3, output_branch2, output_branch1), 1
        )

        output_feature = self.lastconv(output_feature)

        return output_feature


================================================
FILE: src/models/RefineNet2D.py
================================================
# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import math
import argparse
import time
import torch.backends.cudnn as cudnn

from models.PSMNet import conv2d
from models.PSMNet import conv2d_lrelu

from models.DispRefine2D import DispRefineNet

__all__ = ["disprefinenet", "segrefinenet"]


"""
Disparity refinement network.
Takes concatenated input image and the disparity map to generate refined disparity map.
Generates refined output using input image as guide.
"""


def disprefinenet(options, data=None):

    print("==> USING DispRefineNet")
    for key in options:
        if "disprefinenet" in key:
            print("{} : {}".format(key, options[key]))

    model = DispRefineNet(out_planes=options["disprefinenet_out_planes"])

    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


"""
Binary segmentation refinement network.
Takes as input high resolution features of input image and the disparity map.
Generates refined output using input image as guide.
"""


class SegRefineNet(nn.Module):
    def __init__(self, in_planes=17, out_planes=8):

        super(SegRefineNet, self).__init__()

        self.conv1 = nn.Sequential(conv2d_lrelu(in_planes, out_planes, kernel_size=3, stride=1, pad=1))

        self.classif1 = nn.Conv2d(out_planes, 1, kernel_size=3, padding=1, stride=1, bias=False)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.Conv3d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.kernel_size[2] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.BatchNorm3d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()

    def forward(self, input):

        output0 = self.conv1(input)
        output = self.classif1(output0)

        return output


def segrefinenet(options, data=None):

    print("==> USING SegRefineNet")
    for key in options:
        if "segrefinenet" in key:
            print("{} : {}".format(key, options[key]))

    model = SegRefineNet(
        in_planes=options["segrefinenet_in_planes"], out_planes=options["segrefinenet_out_planes"]
    )

    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


================================================
FILE: src/models/RefineNet3D.py
================================================
# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import torch
import torch.nn as nn
import numpy as np

__all__ = ["segregnet3d"]

from models.GCNet import conv3d_relu
from models.GCNet import deconv3d_relu
from models.GCNet import feature3d


def net_init(net):

    for m in net.modules():
        if isinstance(m, nn.Linear):
            m.weight.data = fanin_init(m.weight.data.size())
        elif isinstance(m, nn.Conv3d):
            n = m.kernel_size[0] * m.kernel_size[1] * m.kernel_size[2] * m.out_channels
            m.weight.data.normal_(0, np.sqrt(2.0 / n))
        elif isinstance(m, nn.Conv2d):
            n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
            m.weight.data.normal_(0, np.sqrt(2.0 / n))
        elif isinstance(m, nn.Conv1d):
            n = m.kernel_size[0] * m.out_channels
            m.weight.data.normal_(0, np.sqrt(2.0 / n))
        elif isinstance(m, nn.BatchNorm3d):
            m.weight.data.fill_(1)
            m.bias.data.zero_()
        elif isinstance(m, nn.BatchNorm2d):
            m.weight.data.fill_(1)
            m.bias.data.zero_()
        elif isinstance(m, nn.BatchNorm1d):
            m.weight.data.fill_(1)
            m.bias.data.zero_()


class SegRegNet3D(nn.Module):
    def __init__(self, F=16):

        super(SegRegNet3D, self).__init__()

        self.conf_preprocess = conv3d_relu(1, F, kernel_size=3, stride=1)
        self.layer3d = feature3d(F)

        net_init(self)

    def forward(self, fL, conf_volume):

        fL_stack = fL[:, :, None, :, :].repeat(1, 1, int(conf_volume.shape[2]), 1, 1)
        conf_vol_preprocess = self.conf_preprocess(conf_volume)
        input_volume = torch.cat((fL_stack, conf_vol_preprocess), dim=1)
        oL = self.layer3d(input_volume)

        return oL


def segregnet3d(options, data=None):

    print("==> USING SegRegNet3D")
    for key in options:
        if "regnet" in key:
            print("{} : {}".format(key, options[key]))

    model = SegRegNet3D(F=options["regnet_out_planes"])
    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


================================================
FILE: src/models/SegNet2D.py
================================================
# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import torch
import torch.nn as nn
import argparse
import math
import torch.nn.functional as F
import torch.backends.cudnn as cudnn
import time

__all__ = ["segnet2d"]

# Util Functions
def conv(in_planes, out_planes, kernel_size=3, stride=1, activefun=nn.LeakyReLU(0.1, inplace=True)):

    return nn.Sequential(
        nn.Conv2d(
            in_planes,
            out_planes,
            kernel_size=kernel_size,
            stride=stride,
            padding=(kernel_size - 1) // 2,
            bias=True,
        ),
        activefun,
    )


def deconv(in_planes, out_planes, kernel_size=4, stride=2, activefun=nn.LeakyReLU(0.1, inplace=True)):

    return nn.Sequential(
        nn.ConvTranspose2d(
            in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=1, bias=True
        ),
        activefun,
    )


class SegNet2D(nn.Module):
    def __init__(self):

        super(SegNet2D, self).__init__()

        self.activefun = nn.LeakyReLU(0.1, inplace=True)

        cps = [64, 128, 256, 512, 512, 512]
        dps = [512, 512, 256, 128, 64]

        # Encoder
        self.conv1 = conv(cps[0], cps[1], kernel_size=3, stride=2, activefun=self.activefun)
        self.conv1_1 = conv(cps[1], cps[1], kernel_size=3, stride=1, activefun=self.activefun)

        self.conv2 = conv(cps[1], cps[2], kernel_size=3, stride=2, activefun=self.activefun)
        self.conv2_1 = conv(cps[2], cps[2], kernel_size=3, stride=1, activefun=self.activefun)

        self.conv3 = conv(cps[2], cps[3], kernel_size=3, stride=2, activefun=self.activefun)
        self.conv3_1 = conv(cps[3], cps[3], kernel_size=3, stride=1, activefun=self.activefun)

        self.conv4 = conv(cps[3], cps[4], kernel_size=3, stride=2, activefun=self.activefun)
        self.conv4_1 = conv(cps[4], cps[4], kernel_size=3, stride=1, activefun=self.activefun)

        self.conv5 = conv(cps[4], cps[5], kernel_size=3, stride=2, activefun=self.activefun)
        self.conv5_1 = conv(cps[5], cps[5], kernel_size=3, stride=1, activefun=self.activefun)

        # Decoder
        self.deconv5 = deconv(cps[5], dps[0], kernel_size=4, stride=2, activefun=self.activefun)
        self.deconv5_1 = conv(dps[0] + cps[4], dps[0], kernel_size=3, stride=1, activefun=self.activefun)

        self.deconv4 = deconv(cps[4], dps[1], kernel_size=4, stride=2, activefun=self.activefun)
        self.deconv4_1 = conv(dps[1] + cps[3], dps[1], kernel_size=3, stride=1, activefun=self.activefun)

        self.deconv3 = deconv(dps[1], dps[2], kernel_size=4, stride=2, activefun=self.activefun)
        self.deconv3_1 = conv(dps[2] + cps[2], dps[2], kernel_size=3, stride=1, activefun=self.activefun)

        self.deconv2 = deconv(dps[2], dps[3], kernel_size=4, stride=2, activefun=self.activefun)
        self.deconv2_1 = conv(dps[3] + cps[1], dps[3], kernel_size=3, stride=1, activefun=self.activefun)

        self.deconv1 = deconv(dps[3], dps[4], kernel_size=4, stride=2, activefun=self.activefun)
        self.deconv1_1 = conv(dps[4] + cps[0], dps[4], kernel_size=3, stride=1, activefun=self.activefun)

        self.last_conv = nn.Conv2d(dps[4], 1, kernel_size=3, stride=1, padding=1, bias=True)

        # Init
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.Conv3d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.kernel_size[2] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2.0 / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.BatchNorm3d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()

        return

    def forward(self, x):

        out_conv0 = x
        out_conv1 = self.conv1_1(self.conv1(out_conv0))
        out_conv2 = self.conv2_1(self.conv2(out_conv1))
        out_conv3 = self.conv3_1(self.conv3(out_conv2))
        out_conv4 = self.conv4_1(self.conv4(out_conv3))
        out_conv5 = self.conv5_1(self.conv5(out_conv4))

        out_deconv5 = self.deconv5(out_conv5)
        out_deconv5_1 = self.deconv5_1(torch.cat((out_conv4, out_deconv5), 1))

        out_deconv4 = self.deconv4(out_deconv5_1)
        out_deconv4_1 = self.deconv4_1(torch.cat((out_conv3, out_deconv4), 1))

        out_deconv3 = self.deconv3(out_deconv4_1)
        out_deconv3_1 = self.deconv3_1(torch.cat((out_conv2, out_deconv3), 1))

        out_deconv2 = self.deconv2(out_deconv3_1)
        out_deconv2_1 = self.deconv2_1(torch.cat((out_conv1, out_deconv2), 1))

        out_deconv1 = self.deconv1(out_deconv2_1)
        out_deconv1_1 = self.deconv1_1(torch.cat((out_conv0, out_deconv1), 1))

        raw_seg = self.last_conv(out_deconv1_1)

        return raw_seg


def segnet2d(options, data=None):

    print("==> USING SegNet2D")
    for key in options:
        if "segnet2d" in key:
            print("{} : {}".format(key, options[key]))

    model = SegNet2D()

    if data is not None:
        model.load_state_dict(data["state_dict"])

    return model


================================================
FILE: src/models/__init__.py
================================================
from .Bi3DNet import *
from .FeatExtractNet import *
from .SegNet2D import *
from .RefineNet2D import *
from .RefineNet3D import *
from .PSMNet import *
from .GCNet import *
from .DispRefine2D import *



================================================
FILE: src/project.toml
================================================
[tool.black]
line-length = 110
target-version = ['py37']

================================================
FILE: src/run_binary_depth_estimation.py
================================================
# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import argparse
import os
import torch
import torchvision.transforms as transforms
from PIL import Image

import models
import cv2
import numpy as np

from util import disp2rgb, str2bool
import random

model_names = sorted(name for name in models.__dict__ if name.islower() and not name.startswith("__"))


# Parse arguments
parser = argparse.ArgumentParser(allow_abbrev=False)

# Model
parser.add_argument("--arch", type=str, default="bi3dnet_binary_depth")

parser.add_argument("--bi3dnet_featnet_arch", type=str, default="featextractnetspp")
parser.add_argument("--bi3dnet_featnethr_arch", type=str, default="featextractnethr")
parser.add_argument("--bi3dnet_segnet_arch", type=str, default="segnet2d")
parser.add_argument("--bi3dnet_refinenet_arch", type=str, default="segrefinenet")
parser.add_argument("--bi3dnet_max_disparity", type=int, default=192)
parser.add_argument("--bi3dnet_disps_per_example_true", type=str2bool, default=True)

parser.add_argument("--featextractnethr_out_planes", type=int, default=16)
parser.add_argument("--segrefinenet_in_planes", type=int, default=17)
parser.add_argument("--segrefinenet_out_planes", type=int, default=8)

# Input
parser.add_argument("--pretrained", type=str)
parser.add_argument("--img_left", type=str)
parser.add_argument("--img_right", type=str)
parser.add_argument("--disp_vals", type=float, nargs="*")
parser.add_argument("--crop_height", type=int)
parser.add_argument("--crop_width", type=int)

args, unknown = parser.parse_known_args()

####################################################################################################
def main():

    options = vars(args)
    print("==> ALL PARAMETERS")
    for key in options:
        print("{} : {}".format(key, options[key]))

    out_dir = "out"
    if not os.path.isdir(out_dir):
        os.mkdir(out_dir)

    base_name = os.path.splitext(os.path.basename(args.img_left))[0]

    # Model
    network_data = torch.load(args.pretrained)
    print("=> using pre-trained model '{}'".format(args.arch))
    model = models.__dict__[args.arch](options, network_data).cuda()

    # Inputs
    img_left = Image.open(args.img_left).convert("RGB")
    img_left = transforms.functional.to_tensor(img_left)
    img_left = transforms.functional.normalize(img_left, [0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    img_left = img_left.type(torch.cuda.FloatTensor)[None, :, :, :]
    img_right = Image.open(args.img_right).convert("RGB")
    img_right = transforms.functional.to_tensor(img_right)
    img_right = transforms.functional.normalize(img_right, [0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    img_right = img_right.type(torch.cuda.FloatTensor)[None, :, :, :]

    segs = []
    for disp_val in args.disp_vals:

        assert disp_val % 3 == 0, "disparity value should be a multiple of 3 as we downsample the image by 3"
        disp_long = torch.Tensor([[disp_val / 3]]).type(torch.LongTensor).cuda()

        # Pad inputs
        tw = args.crop_width
        th = args.crop_height
        assert tw % 96 == 0, "image dimensions should be a multiple of 96"
        assert th % 96 == 0, "image dimensions should be a multiple of 96"
        h = img_left.shape[2]
        w = img_left.shape[3]
        x1 = random.randint(0, max(0, w - tw))
        y1 = random.randint(0, max(0, h - th))
        pad_w = tw - w if tw - w > 0 else 0
        pad_h = th - h if th - h > 0 else 0
        pad_opr = torch.nn.ZeroPad2d((pad_w, 0, pad_h, 0))
        img_left = img_left[:, :, y1 : y1 + min(th, h), x1 : x1 + min(tw, w)]
        img_right = img_right[:, :, y1 : y1 + min(th, h), x1 : x1 + min(tw, w)]
        img_left_pad = pad_opr(img_left)
        img_right_pad = pad_opr(img_right)

        # Inference
        model.eval()
        with torch.no_grad():
            output = model(img_left_pad, img_right_pad, disp_long)[1][:, :, pad_h:, pad_w:]

        # Write binary depth results
        seg_img = output[0, 0][None, :, :].clone().cpu().detach().numpy()
        seg_img = np.transpose(seg_img * 255.0, (1, 2, 0))
        cv2.imwrite(
            os.path.join(out_dir, "%s_%s_seg_confidence_%d.png" % (base_name, args.arch, disp_val)), seg_img
        )

        segs.append(output[0, 0][None, :, :].clone().cpu().detach().numpy())

    # Generate quantized depth results
    segs = np.concatenate(segs, axis=0)
    segs = np.insert(segs, 0, np.ones((1, h, w), dtype=np.float32), axis=0)
    segs = np.append(segs, np.zeros((1, h, w), dtype=np.float32), axis=0)

    segs = 1.0 - segs

    # Get the pdf values for each segmented region
    pdf_method = segs[1:, :, :] - segs[:-1, :, :]

    # Get the labels
    labels_method = np.argmax(pdf_method, axis=0).astype(np.int)
    disp_map = labels_method.astype(np.float32)

    disp_vals = args.disp_vals
    disp_vals.insert(0, 0)
    disp_vals.append(args.bi3dnet_max_disparity)

    for i in range(len(disp_vals) - 1):
        min_disp = disp_vals[i]
        max_disp = disp_vals[i + 1]
        mid_disp = 0.5 * (min_disp + max_disp)
        disp_map[labels_method == i] = mid_disp

    disp_vals_str_list = ["%d" % disp_val for disp_val in disp_vals]
    disp_vals_str = "-".join(disp_vals_str_list)

    img_disp = np.clip(disp_map, 0, args.bi3dnet_max_disparity)
    img_disp = img_disp / args.bi3dnet_max_disparity
    img_disp = (disp2rgb(img_disp) * 255.0).astype(np.uint8)

    cv2.imwrite(
        os.path.join(out_dir, "%s_%s_quant_depth_%s.png" % (base_name, args.arch, disp_vals_str)), img_disp
    )

    return


if __name__ == "__main__":
    main()


================================================
FILE: src/run_continuous_depth_estimation.py
================================================
# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import argparse
import os
import time
import torch
import torchvision.transforms as transforms
from PIL import Image

import models
import cv2
import numpy as np
from util import disp2rgb, str2bool

import random

model_names = sorted(name for name in models.__dict__ if name.islower() and not name.startswith("__"))


# Parse Arguments
parser = argparse.ArgumentParser(allow_abbrev=False)

# Experiment Type
parser.add_argument("--arch", type=str, default="bi3dnet_continuous_depth_2D")

parser.add_argument("--bi3dnet_featnet_arch", type=str, default="featextractnetspp")
parser.add_argument("--bi3dnet_segnet_arch", type=str, default="segnet2d")
parser.add_argument("--bi3dnet_refinenet_arch", type=str, default="disprefinenet")
parser.add_argument("--bi3dnet_regnet_arch", type=str, default="segregnet3d")
parser.add_argument("--bi3dnet_max_disparity", type=int, default=192)
parser.add_argument("--regnet_out_planes", type=int, default=16)
parser.add_argument("--disprefinenet_out_planes", type=int, default=32)
parser.add_argument("--bi3dnet_disps_per_example_true", type=str2bool, default=True)

# Input
parser.add_argument("--pretrained", type=str)
parser.add_argument("--img_left", type=str)
parser.add_argument("--img_right", type=str)
parser.add_argument("--disp_range_min", type=int)
parser.add_argument("--disp_range_max", type=int)
parser.add_argument("--crop_height", type=int)
parser.add_argument("--crop_width", type=int)

args, unknown = parser.parse_known_args()

##############################################################################################################
def main():

    options = vars(args)
    print("==> ALL PARAMETERS")
    for key in options:
        print("{} : {}".format(key, options[key]))

    out_dir = "out"
    if not os.path.isdir(out_dir):
        os.mkdir(out_dir)

    base_name = os.path.splitext(os.path.basename(args.img_left))[0]

    # Model
    if args.pretrained:
        network_data = torch.load(args.pretrained)
    else:
        print("Need an input model")
        exit()

    print("=> using pre-trained model '{}'".format(args.arch))
    model = models.__dict__[args.arch](options, network_data).cuda()

    # Inputs
    img_left = Image.open(args.img_left).convert("RGB")
    img_right = Image.open(args.img_right).convert("RGB")
    img_left = transforms.functional.to_tensor(img_left)
    img_right = transforms.functional.to_tensor(img_right)
    img_left = transforms.functional.normalize(img_left, [0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    img_right = transforms.functional.normalize(img_right, [0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    img_left = img_left.type(torch.cuda.FloatTensor)[None, :, :, :]
    img_right = img_right.type(torch.cuda.FloatTensor)[None, :, :, :]

    # Prepare Disparities
    max_disparity = args.disp_range_max
    min_disparity = args.disp_range_min

    assert max_disparity % 3 == 0 and min_disparity % 3 == 0, "disparities should be divisible by 3"

    if args.arch == "bi3dnet_continuous_depth_3D":
        assert (
            max_disparity - min_disparity
        ) % 48 == 0, "for 3D regularization the difference in disparities should be divisible by 48"

    max_disp_levels = (max_disparity - min_disparity) + 1

    max_disparity_3x = int(max_disparity / 3)
    min_disparity_3x = int(min_disparity / 3)
    max_disp_levels_3x = (max_disparity_3x - min_disparity_3x) + 1
    disp_3x = np.linspace(min_disparity_3x, max_disparity_3x, max_disp_levels_3x, dtype=np.int32)
    disp_long_3x_main = torch.from_numpy(disp_3x).type(torch.LongTensor).cuda()
    disp_float_main = np.linspace(min_disparity, max_disparity, max_disp_levels, dtype=np.float32)
    disp_float_main = torch.from_numpy(disp_float_main).type(torch.float32).cuda()
    delta = 1
    d_min_GT = min_disparity - 0.5 * delta
    d_max_GT = max_disparity + 0.5 * delta
    disp_long_3x = disp_long_3x_main[None, :].expand(img_left.shape[0], -1)
    disp_float = disp_float_main[None, :].expand(img_left.shape[0], -1)

    # Pad Inputs
    tw = args.crop_width
    th = args.crop_height
    assert tw % 96 == 0, "image dimensions should be multiple of 96"
    assert th % 96 == 0, "image dimensions should be multiple of 96"
    h = img_left.shape[2]
    w = img_left.shape[3]
    x1 = random.randint(0, max(0, w - tw))
    y1 = random.randint(0, max(0, h - th))
    pad_w = tw - w if tw - w > 0 else 0
    pad_h = th - h if th - h > 0 else 0
    pad_opr = torch.nn.ZeroPad2d((pad_w, 0, pad_h, 0))
    img_left = img_left[:, :, y1 : y1 + min(th, h), x1 : x1 + min(tw, w)]
    img_right = img_right[:, :, y1 : y1 + min(th, h), x1 : x1 + min(tw, w)]
    img_left_pad = pad_opr(img_left)
    img_right_pad = pad_opr(img_right)

    # Inference
    model.eval()
    with torch.no_grad():
        if args.arch == "bi3dnet_continuous_depth_2D":
            output_seg_low_res_upsample, output_disp_normalized = model(
                img_left_pad, img_right_pad, disp_long_3x
            )
            output_seg = output_seg_low_res_upsample
        else:
            (
                output_seg_low_res_upsample,
                output_seg_low_res_upsample_refined,
                output_disp_normalized_no_reg,
                output_disp_normalized,
            ) = model(img_left_pad, img_right_pad, disp_long_3x)
            output_seg = output_seg_low_res_upsample_refined

        output_seg = output_seg[:, :, pad_h:, pad_w:]
        output_disp_normalized = output_disp_normalized[:, :, pad_h:, pad_w:]
        output_disp = torch.clamp(
            output_disp_normalized * delta * max_disp_levels + d_min_GT, min=d_min_GT, max=d_max_GT
        )

    # Write Results
    max_disparity_color = 192
    output_disp_clamp = output_disp[0, 0, :, :].cpu().clone().numpy()
    output_disp_clamp[output_disp_clamp < min_disparity] = 0
    output_disp_clamp[output_disp_clamp > max_disparity] = max_disparity_color
    disp_np_ours_color = disp2rgb(output_disp_clamp / max_disparity_color) * 255.0
    cv2.imwrite(
        os.path.join(out_dir, "%s_%s_%d_%d.png" % (base_name, args.arch, min_disparity, max_disparity)),
        disp_np_ours_color,
    )

    return


if __name__ == "__main__":
    main()


================================================
FILE: src/run_demo_kitti15.sh
================================================
#!/usr/bin/env bash

# GENERATE BINARY DEPTH SEGMENTATIONS AND COMBINE THEM TO GENERATE QUANTIZED DEPTH
CUDA_VISIBLE_DEVICES=0 python run_binary_depth_estimation.py \
    --arch bi3dnet_binary_depth \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_featnethr_arch featextractnethr \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch segrefinenet \
    --featextractnethr_out_planes 16 \
    --segrefinenet_in_planes 17 \
    --segrefinenet_out_planes 8 \
    --crop_height 384 --crop_width 1248 \
    --disp_vals 12 21 30 39 48 \
    --img_left  '../data/kitti15_img_left.jpg' \
    --img_right '../data/kitti15_img_right.jpg' \
    --pretrained '../model_weights/kitti15_binary_depth.pth.tar'


# FULL RANGE CONTINOUS DEPTH ESTIMATION WITHOUT 3D REGULARIZATION
CUDA_VISIBLE_DEVICES=0 python run_continuous_depth_estimation.py \
    --arch bi3dnet_continuous_depth_2D \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch disprefinenet \
    --disprefinenet_out_planes 32 \
    --crop_height 384 --crop_width 1248 \
    --disp_range_min 0 \
    --disp_range_max 192 \
    --bi3dnet_max_disparity 192 \
    --img_left  '../data/kitti15_img_left.jpg' \
    --img_right '../data/kitti15_img_right.jpg' \
    --pretrained '../model_weights/kitti15_continuous_depth_no_conf_reg.pth.tar'


# SELECTIVE RANGE CONTINOUS DEPTH ESTIMATION WITHOUT 3D REGULARIZATION
CUDA_VISIBLE_DEVICES=0 python run_continuous_depth_estimation.py \
    --arch bi3dnet_continuous_depth_2D \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch disprefinenet \
    --disprefinenet_out_planes 32 \
    --crop_height 384 --crop_width 1248 \
    --disp_range_min 12 \
    --disp_range_max 48 \
    --bi3dnet_max_disparity 192 \
    --img_left  '../data/kitti15_img_left.jpg' \
    --img_right '../data/kitti15_img_right.jpg' \
    --pretrained '../model_weights/kitti15_continuous_depth_no_conf_reg.pth.tar'


# FULL RANGE CONTINOUS DEPTH ESTIMATION WITH 3D REGULARIZATION 
CUDA_VISIBLE_DEVICES=0 python run_continuous_depth_estimation.py \
    --arch bi3dnet_continuous_depth_3D \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch disprefinenet \
    --bi3dnet_regnet_arch segregnet3d \
    --disprefinenet_out_planes 32 \
    --regnet_out_planes 16 \
    --crop_height 384 --crop_width 1248 \
    --disp_range_min 0 \
    --disp_range_max 192 \
    --bi3dnet_max_disparity 192 \
    --img_left  '../data/kitti15_img_left.jpg' \
    --img_right '../data/kitti15_img_right.jpg' \
    --pretrained '../model_weights/kitti15_continuous_depth_conf_reg.pth.tar'
    

================================================
FILE: src/run_demo_sf.sh
================================================
#!/usr/bin/env bash

# GENERATE BINARY DEPTH SEGMENTATIONS AND COMBINE THEM TO GENERATE QUANTIZED DEPTH
CUDA_VISIBLE_DEVICES=0 python run_binary_depth_estimation.py \
    --arch bi3dnet_binary_depth \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_featnethr_arch featextractnethr \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch segrefinenet \
    --featextractnethr_out_planes 16 \
    --segrefinenet_in_planes 17 \
    --segrefinenet_out_planes 8 \
    --crop_height 576 --crop_width 960 \
    --disp_vals 24 36 54 96 144 \
    --img_left  '../data/sf_img_left.jpg' \
    --img_right '../data/sf_img_right.jpg' \
    --pretrained '../model_weights/sf_binary_depth.pth.tar'


# FULL RANGE CONTINOUS DEPTH ESTIMATION WITHOUT 3D REGULARIZATION
CUDA_VISIBLE_DEVICES=0 python run_continuous_depth_estimation.py \
    --arch bi3dnet_continuous_depth_2D \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch disprefinenet \
    --disprefinenet_out_planes 32 \
    --crop_height 576 --crop_width 960 \
    --disp_range_min 0 \
    --disp_range_max 192 \
    --bi3dnet_max_disparity 192 \
    --img_left  '../data/sf_img_left.jpg' \
    --img_right '../data/sf_img_right.jpg' \
    --pretrained '../model_weights/sf_continuous_depth_no_conf_reg.pth.tar'


# SELECTIVE RANGE CONTINOUS DEPTH ESTIMATION WITHOUT 3D REGULARIZATION
CUDA_VISIBLE_DEVICES=0 python run_continuous_depth_estimation.py \
    --arch bi3dnet_continuous_depth_2D \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch disprefinenet \
    --disprefinenet_out_planes 32 \
    --crop_height 576 --crop_width 960 \
    --disp_range_min 18 \
    --disp_range_max 60 \
    --bi3dnet_max_disparity 192 \
    --img_left  '../data/sf_img_left.jpg' \
    --img_right '../data/sf_img_right.jpg' \
    --pretrained '../model_weights/sf_continuous_depth_no_conf_reg.pth.tar'


# FULL RANGE CONTINOUS DEPTH ESTIMATION WITH 3D REGULARIZATION 
CUDA_VISIBLE_DEVICES=0 python run_continuous_depth_estimation.py \
    --arch bi3dnet_continuous_depth_3D \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch disprefinenet \
    --bi3dnet_regnet_arch segregnet3d \
    --disprefinenet_out_planes 32 \
    --regnet_out_planes 16 \
    --crop_height 576 --crop_width 960 \
    --disp_range_min 0 \
    --disp_range_max 192 \
    --bi3dnet_max_disparity 192 \
    --img_left  '../data/sf_img_left.jpg' \
    --img_right '../data/sf_img_right.jpg' \
    --pretrained '../model_weights/sf_continuous_depth_conf_reg.pth.tar'


================================================
FILE: src/util.py
================================================
# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

import os
import numpy as np


def disp2rgb(disp):
    H = disp.shape[0]
    W = disp.shape[1]

    I = disp.flatten()

    map = np.array(
        [
            [0, 0, 0, 114],
            [0, 0, 1, 185],
            [1, 0, 0, 114],
            [1, 0, 1, 174],
            [0, 1, 0, 114],
            [0, 1, 1, 185],
            [1, 1, 0, 114],
            [1, 1, 1, 0],
        ]
    )
    bins = map[:-1, 3]
    cbins = np.cumsum(bins)
    bins = bins / cbins[-1]
    cbins = cbins[:-1] / cbins[-1]

    ind = np.minimum(
        np.sum(np.repeat(I[None, :], 6, axis=0) > np.repeat(cbins[:, None], I.shape[0], axis=1), axis=0), 6
    )
    bins = np.reciprocal(bins)
    cbins = np.append(np.array([[0]]), cbins[:, None])

    I = np.multiply(I - cbins[ind], bins[ind])
    I = np.minimum(
        np.maximum(
            np.multiply(map[ind, 0:3], np.repeat(1 - I[:, None], 3, axis=1))
            + np.multiply(map[ind + 1, 0:3], np.repeat(I[:, None], 3, axis=1)),
            0,
        ),
        1,
    )

    I = np.reshape(I, [H, W, 3]).astype(np.float32)

    return I


def str2bool(bool_input_string):
    if isinstance(bool_input_string, bool):
        return bool_input_string
    if bool_input_string.lower() in ("true"):
        return True
    elif bool_input_string.lower() in ("false"):
        return False
    else:
        raise NameError("Please provide boolean type.")
Download .txt
gitextract_ulzmvk8p/

├── .gitignore
├── LICENSE.md
├── README.md
├── envs/
│   ├── bi3d_conda_env.yml
│   └── bi3d_pytorch_19_01.DockerFile
└── src/
    ├── models/
    │   ├── Bi3DNet.py
    │   ├── DispRefine2D.py
    │   ├── FeatExtractNet.py
    │   ├── GCNet.py
    │   ├── PSMNet.py
    │   ├── RefineNet2D.py
    │   ├── RefineNet3D.py
    │   ├── SegNet2D.py
    │   └── __init__.py
    ├── project.toml
    ├── run_binary_depth_estimation.py
    ├── run_continuous_depth_estimation.py
    ├── run_demo_kitti15.sh
    ├── run_demo_sf.sh
    └── util.py
Download .txt
SYMBOL INDEX (59 symbols across 11 files)

FILE: src/models/Bi3DNet.py
  function compute_cost_volume (line 24) | def compute_cost_volume(features_left, features_right, disp_ids, max_dis...
  class Bi3DNetContinuousDepth2D (line 54) | class Bi3DNetContinuousDepth2D(nn.Module):
    method __init__ (line 55) | def __init__(self, options, featnet_arch, segnet_arch, refinenet_arch=...
    method forward (line 75) | def forward(self, img_left, img_right, disp_ids):
  function bi3dnet_continuous_depth_2D (line 125) | def bi3dnet_continuous_depth_2D(options, data=None):
  class Bi3DNetContinuousDepth3D (line 151) | class Bi3DNetContinuousDepth3D(nn.Module):
    method __init__ (line 152) | def __init__(
    method forward (line 181) | def forward(self, img_left, img_right, disp_ids):
  function bi3dnet_continuous_depth_3D (line 254) | def bi3dnet_continuous_depth_3D(options, data=None):
  class Bi3DNetBinaryDepth (line 281) | class Bi3DNetBinaryDepth(nn.Module):
    method __init__ (line 282) | def __init__(
    method forward (line 311) | def forward(self, img_left, img_right, disp_ids):
  function bi3dnet_binary_depth (line 378) | def bi3dnet_binary_depth(options, data=None):

FILE: src/models/DispRefine2D.py
  class BasicBlock (line 38) | class BasicBlock(nn.Module):
    method __init__ (line 42) | def __init__(self, inplanes, planes, stride, downsample, pad, dilation):
    method forward (line 52) | def forward(self, x):
  class DispRefineNet (line 65) | class DispRefineNet(nn.Module):
    method __init__ (line 66) | def __init__(self, out_planes=32):
    method forward (line 103) | def forward(self, x):

FILE: src/models/FeatExtractNet.py
  class FeatExtractNetHR (line 28) | class FeatExtractNetHR(nn.Module):
    method __init__ (line 29) | def __init__(self, out_planes=16):
    method forward (line 57) | def forward(self, input):
  function featextractnethr (line 63) | def featextractnethr(options, data=None):
  function featextractnetspp (line 85) | def featextractnetspp(options, data=None):

FILE: src/models/GCNet.py
  function conv3d_relu (line 24) | def conv3d_relu(in_planes, out_planes, kernel_size=3, stride=1, activefu...
  function deconv3d_relu (line 32) | def deconv3d_relu(in_planes, out_planes, kernel_size=4, stride=2, active...
  class feature3d (line 50) | class feature3d(nn.Module):
    method __init__ (line 51) | def __init__(self, num_F):
    method forward (line 82) | def forward(self, x):

FILE: src/models/PSMNet.py
  function conv2d (line 34) | def conv2d(in_planes, out_planes, kernel_size, stride, pad, dilation):
  function conv2d_relu (line 49) | def conv2d_relu(in_planes, out_planes, kernel_size, stride, pad, dilation):
  function conv2d_lrelu (line 65) | def conv2d_lrelu(in_planes, out_planes, kernel_size, stride, pad, dilati...
  class BasicBlock (line 81) | class BasicBlock(nn.Module):
    method __init__ (line 85) | def __init__(self, inplanes, planes, stride, downsample, pad, dilation):
    method forward (line 95) | def forward(self, x):
  class FeatExtractNetSPP (line 108) | class FeatExtractNetSPP(nn.Module):
    method __init__ (line 109) | def __init__(self):
    method _make_layer (line 151) | def _make_layer(self, block, planes, blocks, stride, pad, dilation):
    method forward (line 167) | def forward(self, input):

FILE: src/models/RefineNet2D.py
  function disprefinenet (line 33) | def disprefinenet(options, data=None):
  class SegRefineNet (line 55) | class SegRefineNet(nn.Module):
    method __init__ (line 56) | def __init__(self, in_planes=17, out_planes=8):
    method forward (line 80) | def forward(self, input):
  function segrefinenet (line 88) | def segrefinenet(options, data=None):

FILE: src/models/RefineNet3D.py
  function net_init (line 20) | def net_init(net):
  class SegRegNet3D (line 45) | class SegRegNet3D(nn.Module):
    method __init__ (line 46) | def __init__(self, F=16):
    method forward (line 55) | def forward(self, fL, conf_volume):
  function segregnet3d (line 65) | def segregnet3d(options, data=None):

FILE: src/models/SegNet2D.py
  function conv (line 20) | def conv(in_planes, out_planes, kernel_size=3, stride=1, activefun=nn.Le...
  function deconv (line 35) | def deconv(in_planes, out_planes, kernel_size=4, stride=2, activefun=nn....
  class SegNet2D (line 45) | class SegNet2D(nn.Module):
    method __init__ (line 46) | def __init__(self):
    method forward (line 108) | def forward(self, x):
  function segnet2d (line 137) | def segnet2d(options, data=None):

FILE: src/run_binary_depth_estimation.py
  function main (line 53) | def main():

FILE: src/run_continuous_depth_estimation.py
  function main (line 53) | def main():

FILE: src/util.py
  function disp2rgb (line 13) | def disp2rgb(disp):
  function str2bool (line 57) | def str2bool(bool_input_string):
Condensed preview — 20 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (75K chars).
[
  {
    "path": ".gitignore",
    "chars": 148,
    "preview": "# Add any directories, files, or patterns you don't want to be tracked by version control\n\n*.png\n*.pfm\n*.pth.tar\n*.npy\n*"
  },
  {
    "path": "LICENSE.md",
    "chars": 4350,
    "preview": "# NVIDIA Source Code License for Bi3D\n\n## 1. Definitions\n\n“Licensor” means any person or entity that distributes its Wor"
  },
  {
    "path": "README.md",
    "chars": 3828,
    "preview": "## Bi3D &mdash; Official PyTorch Implementation\n\n![Teaser image](data/teaser.png)\n\n**Bi3D: Stereo Depth Estimation via B"
  },
  {
    "path": "envs/bi3d_conda_env.yml",
    "chars": 1408,
    "preview": "name: bi3d\nchannels:\n  - pytorch\n  - soumith\n  - defaults\ndependencies:\n  - _libgcc_mutex=0.1=main\n  - blas=1.0=mkl\n  - "
  },
  {
    "path": "envs/bi3d_pytorch_19_01.DockerFile",
    "chars": 145,
    "preview": "FROM nvcr.io/nvidia/pytorch:19.01-py3\n\nRUN pip install Pillow\nRUN pip install imageio\nRUN pip install tensorboardX\nRUN p"
  },
  {
    "path": "src/models/Bi3DNet.py",
    "chars": 13087,
    "preview": "# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.\n#\n# NVIDIA CORPORATION and its licensors retain all inte"
  },
  {
    "path": "src/models/DispRefine2D.py",
    "chars": 3800,
    "preview": "# MIT License\n#\n# Copyright (c) 2019 Xuanyi Li (xuanyili.edu@gmail.com)\n# Copyright (c) 2020 NVIDIA\n#\n# Permission is he"
  },
  {
    "path": "src/models/FeatExtractNet.py",
    "chars": 2906,
    "preview": "# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.\n#\n# NVIDIA CORPORATION and its licensors retain all inte"
  },
  {
    "path": "src/models/GCNet.py",
    "chars": 3669,
    "preview": "# Copyright (c) 2018 Wang Yufeng\n# Copyright (c) 2020 NVIDIA\n#\n# Licensed under the Apache License, Version 2.0 (the \"Li"
  },
  {
    "path": "src/models/PSMNet.py",
    "chars": 6836,
    "preview": "# MIT License\n#\n# Copyright (c) 2018 Jia-Ren Chang\n# Copyright (c) 2020 NVIDIA\n#\n# Permission is hereby granted, free of"
  },
  {
    "path": "src/models/RefineNet2D.py",
    "chars": 3109,
    "preview": "# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.\n#\n# NVIDIA CORPORATION and its licensors retain all inte"
  },
  {
    "path": "src/models/RefineNet3D.py",
    "chars": 2488,
    "preview": "# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.\n#\n# NVIDIA CORPORATION and its licensors retain all inte"
  },
  {
    "path": "src/models/SegNet2D.py",
    "chars": 5704,
    "preview": "# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.\n#\n# NVIDIA CORPORATION and its licensors retain all inte"
  },
  {
    "path": "src/models/__init__.py",
    "chars": 203,
    "preview": "from .Bi3DNet import *\nfrom .FeatExtractNet import *\nfrom .SegNet2D import *\nfrom .RefineNet2D import *\nfrom .RefineNet3"
  },
  {
    "path": "src/project.toml",
    "chars": 56,
    "preview": "[tool.black]\nline-length = 110\ntarget-version = ['py37']"
  },
  {
    "path": "src/run_binary_depth_estimation.py",
    "chars": 5935,
    "preview": "# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.\n#\n# NVIDIA CORPORATION and its licensors retain all inte"
  },
  {
    "path": "src/run_continuous_depth_estimation.py",
    "chars": 6603,
    "preview": "# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.\n#\n# NVIDIA CORPORATION and its licensors retain all inte"
  },
  {
    "path": "src/run_demo_kitti15.sh",
    "chars": 2743,
    "preview": "#!/usr/bin/env bash\n\n# GENERATE BINARY DEPTH SEGMENTATIONS AND COMBINE THEM TO GENERATE QUANTIZED DEPTH\nCUDA_VISIBLE_DEV"
  },
  {
    "path": "src/run_demo_sf.sh",
    "chars": 2676,
    "preview": "#!/usr/bin/env bash\n\n# GENERATE BINARY DEPTH SEGMENTATIONS AND COMBINE THEM TO GENERATE QUANTIZED DEPTH\nCUDA_VISIBLE_DEV"
  },
  {
    "path": "src/util.py",
    "chars": 1821,
    "preview": "# Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.\n#\n# NVIDIA CORPORATION and its licensors retain all inte"
  }
]

About this extraction

This page contains the full source code of the NVlabs/Bi3D GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 20 files (69.8 KB), approximately 20.2k tokens, and a symbol index with 59 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Copied to clipboard!