Repository: meteorshowers/StereoNet
Branch: master
Commit: a4010bd63fe8
Files: 86
Total size: 325.0 KB

Directory structure:
gitextract_bxzusud1/

├── LICENSE
├── README.md
├── configs/
│   └── config_disp.py
├── data
├── disparity/
│   ├── __init__.py
│   ├── csrc/
│   │   ├── BuildCostVolume.h
│   │   ├── ROIAlign.h
│   │   ├── ROIPool.h
│   │   ├── SigmoidFocalLoss.h
│   │   ├── cpu/
│   │   │   ├── ROIAlign_cpu.cpp
│   │   │   ├── nms_cpu.cpp
│   │   │   └── vision.h
│   │   ├── cuda/
│   │   │   ├── BuildCostVolume_cuda.cu
│   │   │   ├── ROIAlign_cuda.cu
│   │   │   ├── ROIPool_cuda.cu
│   │   │   ├── SigmoidFocalLoss_cuda.cu
│   │   │   ├── nms.cu
│   │   │   └── vision.h
│   │   ├── nms.h
│   │   └── vision.cpp
│   ├── dataloader/
│   │   ├── DataStatistics.py
│   │   ├── KITTILoader.py
│   │   ├── KITTI_submission_loader.py
│   │   ├── KITTI_submission_loader2012.py
│   │   ├── KITTIloader2012.py
│   │   ├── KITTIloader2015.py
│   │   ├── SceneFlowLoader_demo.py
│   │   ├── SecenFlowLoader.py
│   │   ├── SecenFlowLoader1.py
│   │   ├── SecenFlowLoaderfix.py
│   │   ├── Testloader.py
│   │   ├── __init__.py
│   │   ├── listflowfile.py
│   │   ├── listflowfilefix.py
│   │   ├── preprocess.py
│   │   └── readpfm.py
│   ├── eval/
│   │   ├── __init__.py
│   │   ├── kitti/
│   │   │   ├── README.md
│   │   │   ├── compile.sh
│   │   │   ├── eval.sh
│   │   │   ├── eval_05.sh
│   │   │   ├── evaluate_object_3d_offline
│   │   │   ├── evaluate_object_3d_offline.cpp
│   │   │   └── mail.h
│   │   └── kitti-object-eval-python/
│   │       ├── .gitignore
│   │       ├── LICENSE
│   │       ├── README.md
│   │       ├── eval.py
│   │       ├── eval.sh
│   │       ├── eval_dist.sh
│   │       ├── evaluate.py
│   │       ├── kitti_common.py
│   │       └── rotate_iou.py
│   ├── layers/
│   │   ├── __init__.py
│   │   ├── _utils.py
│   │   ├── batch_norm.py
│   │   ├── build_cost_volume.py
│   │   ├── iou_loss.py
│   │   ├── misc.py
│   │   ├── nms.py
│   │   ├── roi_align.py
│   │   ├── roi_pool.py
│   │   ├── scale.py
│   │   ├── sigmoid_focal_loss.py
│   │   └── smooth_l1_loss.py
│   ├── models/
│   │   ├── ActiveStereoNet.py
│   │   ├── __init__.py
│   │   ├── stereonet.py
│   │   ├── stereonet_disp.py
│   │   └── submodule.py
│   └── utils/
│       ├── __init__.py
│       ├── logger.py
│       ├── preprocess.py
│       ├── readpfm.py
│       ├── tensorboardx.py
│       └── utils.py
├── preprocessing/
│   ├── generate_disp.py
│   ├── generate_lidar.py
│   └── kitti_util.py
├── requirement.txt
├── setup.py
└── tools/
    ├── env_utils/
    │   ├── __init__.py
    │   ├── exp.py
    │   ├── logger.py
    │   └── utils.py
    └── train_net_disp.py

================================================
FILE CONTENTS
================================================

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2020 Yilun Chen

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================
<div align="left">
 <img src="doc/log.png" width="80%">
</div>
X-StereoLab is an open-source toolbox for stereo matching and stereo 3D object detection, based on PyTorch.

## News: We released the codebase v0.0.0.
* Stereo matching and detection model results.
<div align="center">
 <img src="doc/demo.png" width="50%">
</div>


* A PyTorch implementation of Google's HITNet model will be released.
<div align="center">
 <img src="doc/hitnet.png" width="80%">
</div>

* KITTI 2015 submission of the PyTorch HITNet model: http://www.cvlibs.net/datasets/kitti/eval_scene_flow_detail.php?benchmark=stereo&result=226494ba5559e9f5f46bdbd681d1564fee78409e
  (ranked 145th, at 80 GMACs)


### Requirements
All code was tested in the following environment:
* Ubuntu 16.04
* Python 3.7
* PyTorch 1.1.0 or 1.2.0 or 1.3.0
* Torchvision 0.2.2 or 0.4.1

### Installation 

(1) Clone this repository.
```
git clone git@github.com:meteorshowers/X-StereoLab.git && cd X-StereoLab
```

(2) Setup Python environment.
```
conda create -n xstereolab python=3.7
conda activate xstereolab
pip install -r requirement.txt --user

# To leave the environment later: conda deactivate
```

<!-- (3) Compile the rotated IoU library (for 3D detection). 
```
cd X-stereoLab/utils/rotate_iou && bash compile.sh && cd ../../../
```

(4) Compile and install X-StereoLab library (for 3D detection).
```
# the following will install the lib with symbolic links, so that
# you can modify the file if you want and won't need to re-build it.
python3 setup.py build develop --user
``` -->

### Data Preparation

(1) Download the KITTI dataset, then symlink the dataset and output directories into the repository.
```
ln -s /path/to/KITTI_DATA_PATH ./data/kitti/
ln -s /path/to/OUTPUT_PATH ./outputs/
```


### Multi-GPU Training

The training scripts support [multi-processing distributed training](https://github.com/pytorch/examples/tree/master/imagenet), which is much faster than the typical PyTorch DataParallel interface.
```
python3 tools/train_net_disp.py --cfg ./configs/config_xxx.py --savemodel ./outputs/MODEL_NAME -btrain 4 -d 0-3 --multiprocessing-distributed
```
The trained models, configuration, and logs are saved in the model folder.
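
As a rough illustration of the device flag in the command above, the sketch below expands a spec such as `0-3` into explicit GPU indices. It assumes that `-d 0-3` denotes an inclusive range of GPU ids (which matches `-btrain 4` on four GPUs); the helper `expand_device_range` is hypothetical and is not taken from `tools/train_net_disp.py`.

```python
def expand_device_range(spec):
    """Expand a device spec such as "0-3" or "0,2" into a list of GPU ids.

    Hypothetical helper: assumes "a-b" means the inclusive range a..b.
    """
    devices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            devices.extend(range(int(start), int(end) + 1))
        else:
            devices.append(int(part))
    return devices


gpu_ids = expand_device_range("0-3")
# Distributed training typically launches one worker process per listed GPU.
print(gpu_ids, len(gpu_ids))
```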

To load a pretrained model, run:
```
python3 tools/train_net_disp.py --cfg xxx/config.py --loadmodel ./outputs/MODEL_NAMEx --start_epoch xxx --savemodel ./outputs/MODEL_NAME -btrain 4 -d 0-3 --multiprocessing-distributed
```
To resume training from a given epoch, set `--cfg` and `--loadmodel` to the corresponding config and checkpoint paths, and set `--start_epoch` to the epoch to resume from.

You can also start a TensorBoard session with:
```
tensorboard --logdir=./outputs/MODEL_NAME/tensorboard --port=6666
```
and visualize your training process by opening http://localhost:6666 in your browser. Some browsers block port 6666; if the page does not load, pick another port such as 6006.

### Inference and Evaluation

Work in progress.

### Stereo Matching Performance and Model Zoo

<!-- We provide several pretrained models for our experiments. -->

<table>
    <thead>
        <tr>
            <th>Methods</th>
            <th>Epochs</th>
            <!-- <th>Inference Time(s/im)</th> -->
            <th>Train Mem (GB/Img)</th>
            <th>Test Mem (GB/Img)</th>
            <th>EPE</th>
            <th>D1-all</th>
            <th>Models</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>HITNET (kitti)</td>
            <td>4200</td>
            <td></td>
            <td></td>
            <td></td>
            <td>2.43%</td>
            <td><a href=> GoogleDrive </a></td>
        </tr>
        <tr>
            <td>HITNET (sceneflow)</td>
            <td>200</td>
            <td></td>
            <td></td>
            <td>0.65</td>
            <td></td>
            <td><a href=> GoogleDrive </a></td>
        </tr>
        <tr>
            <td>stereonet (sceneflow)</td>
            <td>20</td>
            <td></td>
            <td></td>
            <td>1.10</td>
            <td></td>
            <td><a href=> GoogleDrive </a></td>
        </tr>
        <tr>
            <td>ActiveStereoNet</td>
            <td>10</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td><a href=> GoogleDrive </a></td>
        </tr>      
        <tr>
            <td>SOS</td>
            <td rowspan=2></td>
            <td rowspan=2> </td>
            <td rowspan=2></td>
            <td></td>
            <td></td>
            <td rowspan=2></td>
        </tr>
        
    </tbody>
</table>
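
The EPE and D1-all columns presumably follow the usual stereo benchmark definitions: EPE is the mean absolute disparity error in pixels, and D1-all is the share of valid pixels whose error exceeds both 3 px and 5% of the ground-truth disparity (the KITTI 2015 outlier criterion). The sketch below is a minimal NumPy illustration of those definitions. The function name `disparity_metrics` and the convention that a ground-truth value of 0 marks an invalid pixel are assumptions, not code from this repository.

```python
import numpy as np


def disparity_metrics(pred, gt):
    """Return (EPE in pixels, D1-all as a fraction) over valid pixels.

    Assumption: gt == 0 marks pixels without ground truth.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0
    err = np.abs(pred[valid] - gt[valid])
    epe = err.mean()
    # Outlier rule: error above 3 px and above 5% of the true disparity.
    outliers = (err > 3.0) & (err > 0.05 * gt[valid])
    return epe, outliers.mean()


gt = np.array([[10.0, 20.0], [100.0, 0.0]])
pred = np.array([[10.5, 24.0], [104.0, 50.0]])
epe, d1 = disparity_metrics(pred, gt)
print(f"EPE = {epe:.3f} px, D1-all = {100 * d1:.2f}%")
```

In this toy example the pixel with ground truth 0 is ignored, and only the pixel with a 4 px error on a true disparity of 20 counts as an outlier, because the 4 px error on a true disparity of 100 stays below the 5% threshold.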

### Stereo 3D Detection Performance and Model Zoo
#### PLUME: Efficient 3D Object Detection from Stereo Images

<table>
    <thead>
        <tr>
            <th>Methods</th>
            <th>Epochs</th>
            <!-- <th>Inference Time(s/im)</th> -->
            <th>Train Mem (GB/Img)</th>
            <th>Test Mem (GB/Img)</th>
            <th>3D BEV AP (Ours small plume)</th>
            <th>3D BEV AP (Paper small plume)</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>PLUME</td>
            <td></td>
            <td></td>
            <td></td>
            <td>72.9 / 62.5 / 56.9</td>
            <td>74.4 / 61.7 / 55.8</td>
        </tr>
    </tbody>
</table>


### Video Demo

We provide a video demo of X-StereoLab results. It shows the disparity map predicted by ActiveStereoNet.

<p align="center"> <a href="https://www.youtube.com/watch?v=pqKZs1b1b0Y"><img src="./doc/demo_cover.png" width="50%"></a> </p>

### TODO List
- [x] Multiprocessing GPU training
- [x] TensorboardX
- [x] Reduce training GPU memory usage
- [x] eval and test code
- [ ] Result visualization
- [ ] Still in progress



### Citations
If you find our work useful in your research, please consider citing:
```
@misc{XStereoLab2021,
    title={{X-StereoLab} stereo matching and stereo 3D object detection toolbox},
    author={X-StereoLab Contributors},
    howpublished = {\url{https://github.com/meteorshowers/X-StereoLab}},
    year={2021}
}
* reference[2]
@article{tankovich2020hitnet,
  title={HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching},
  author={Tankovich, Vladimir and H{\"a}ne, Christian and Fanello, Sean and Zhang, Yinda and Izadi, Shahram and Bouaziz, Sofien},
  journal={arXiv preprint arXiv:2007.12140},
  year={2020}
}

* reference[3]
@inproceedings{tankovich2018sos,
  title={Sos: Stereo matching in o (1) with slanted support windows},
  author={Tankovich, Vladimir and Schoenberg, Michael and Fanello, Sean Ryan and Kowdle, Adarsh and Rhemann, Christoph and Dzitsiuk, Maksym and Schmidt, Mirko and Valentin, Julien and Izadi, Shahram},
  booktitle={2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  pages={6782--6789},
  year={2018},
  organization={IEEE}
}

```


## Other contributors

<table border="0">
  <tbody>
    <tr align="center" >
      <!-- <td>
        ​ <a href="https://github.com/shenweichen"><img width="70" height="70" src="https://github.com/shenweichen.png?s=40" alt="pic"></a><br>
        ​ <a href="https://github.com/shenweichen">Shen Weichen</a> ​
        <p>
        Alibaba Group  </p>​
      </td> -->
      <td>
         <a href="https://github.com/vtankovich"><img width="70" height="70" src="https://avatars.githubusercontent.com/u/74434832?v=4" alt="pic"></a><br>
         <a href="https://github.com/vtankovich">vtankovich</a>
        <p>Google</p>
      </td>
     <td>
         <a href="https://github.com/mileyan"><img width="70" height="70" src="https://avatars.githubusercontent.com/u/3722398?v=4" alt="pic"></a><br>
         <a href="https://github.com/mileyan">Yan Wang</a>
        <p>Waymo</p>
      </td>     
    </tr>
  </tbody>
</table>


### Acknowledgment

* Thanks to <a href="https://github.com/samehkhamis">SamehKhamis (NVIDIA)</a>.

### License
The code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for NonCommercial use only. Any commercial use should get formal permission first.

### Contact
If you have any questions or suggestions about this repo, please feel free to contact me (xuanyili.edu@gmail.com).
WeChat:
<table border="0">
  <tbody>
    <tr align="center" >
      <!-- <td>
        ​ <a href="https://github.com/shenweichen"><img width="70" height="70" src="https://github.com/shenweichen.png?s=40" alt="pic"></a><br>
        ​ <a href="https://github.com/shenweichen">Shen Weichen</a> ​
        <p>
        Alibaba Group  </p>​
      </td> -->
      <td>
         <a href="https://github.com/meteorshowers"><img width="100" height="100" src='doc/wechat.png' alt="pic"></a><br>
         <a href="https://github.com/meteorshowers">XUANYILI</a> ​
        <p>  </p>​
      </td>
    </tr>
  </tbody>
</table>


================================================
FILE: configs/config_disp.py
================================================
import os
import numpy as np
from yacs.config import CfgNode as CN

cfg = CN()

cfg.cnt = 0

cfg.btrain = 4


#------------- disparity ---------------#
cfg.model = 'stereonet' # ['stereonet', 'activestereonet', 'hitnet', 'sos']
cfg.maxdisp = 192
cfg.mindisp = 0
cfg.loss_disp = True
#--------------volume--------------------------#
cfg.PlaneSweepVolume = False
cfg.DispVolume = True


#------------- depth ---------------#

#------------- detection ---------------#


#-------------- debug ----------------#
cfg.debug = False

#-------------- Parameters -----------#

#----------------- centerness --------------#

#----------------------------------------------------#









================================================
FILE: data
================================================
/media/elonli/049150C23EB4F058/DSGN/data

================================================
FILE: disparity/__init__.py
================================================


================================================
FILE: disparity/csrc/BuildCostVolume.h
================================================
#pragma once

#include "cpu/vision.h"

#ifdef WITH_CUDA
#include "cuda/vision.h"
#endif

// Interface for Python
at::Tensor BuildCostVolume_forward(const at::Tensor& left,
                            const at::Tensor& right,
                            const at::Tensor& shift) {
  if (left.type().is_cuda()) {
#ifdef WITH_CUDA
    return BuildCostVolume_forward_cuda(left, right, shift);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}

std::tuple<at::Tensor, at::Tensor> BuildCostVolume_backward(const at::Tensor& grad,
                             const at::Tensor& shift) {
  if (grad.type().is_cuda()) {
#ifdef WITH_CUDA
    return BuildCostVolume_backward_cuda(grad, shift);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}



================================================
FILE: disparity/csrc/ROIAlign.h
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#pragma once

#include "cpu/vision.h"

#ifdef WITH_CUDA
#include "cuda/vision.h"
#endif

// Interface for Python
at::Tensor ROIAlign_forward(const at::Tensor& input,
                            const at::Tensor& rois,
                            const float spatial_scale,
                            const int pooled_height,
                            const int pooled_width,
                            const int sampling_ratio) {
  if (input.type().is_cuda()) {
#ifdef WITH_CUDA
    return ROIAlign_forward_cuda(input, rois, spatial_scale, pooled_height, pooled_width, sampling_ratio);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  return ROIAlign_forward_cpu(input, rois, spatial_scale, pooled_height, pooled_width, sampling_ratio);
}

at::Tensor ROIAlign_backward(const at::Tensor& grad,
                             const at::Tensor& rois,
                             const float spatial_scale,
                             const int pooled_height,
                             const int pooled_width,
                             const int batch_size,
                             const int channels,
                             const int height,
                             const int width,
                             const int sampling_ratio) {
  if (grad.type().is_cuda()) {
#ifdef WITH_CUDA
    return ROIAlign_backward_cuda(grad, rois, spatial_scale, pooled_height, pooled_width, batch_size, channels, height, width, sampling_ratio);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}



================================================
FILE: disparity/csrc/ROIPool.h
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#pragma once

#include "cpu/vision.h"

#ifdef WITH_CUDA
#include "cuda/vision.h"
#endif


std::tuple<at::Tensor, at::Tensor> ROIPool_forward(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width) {
  if (input.type().is_cuda()) {
#ifdef WITH_CUDA
    return ROIPool_forward_cuda(input, rois, spatial_scale, pooled_height, pooled_width);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}

at::Tensor ROIPool_backward(const at::Tensor& grad,
                                 const at::Tensor& input,
                                 const at::Tensor& rois,
                                 const at::Tensor& argmax,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int batch_size,
                                 const int channels,
                                 const int height,
                                 const int width) {
  if (grad.type().is_cuda()) {
#ifdef WITH_CUDA
    return ROIPool_backward_cuda(grad, input, rois, argmax, spatial_scale, pooled_height, pooled_width, batch_size, channels, height, width);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}





================================================
FILE: disparity/csrc/SigmoidFocalLoss.h
================================================
#pragma once

#include "cpu/vision.h"

#ifdef WITH_CUDA
#include "cuda/vision.h"
#endif

// Interface for Python
at::Tensor SigmoidFocalLoss_forward(
		const at::Tensor& logits,
                const at::Tensor& targets,
		const int num_classes, 
		const float gamma, 
		const float alpha) {
  if (logits.type().is_cuda()) {
#ifdef WITH_CUDA
    return SigmoidFocalLoss_forward_cuda(logits, targets, num_classes, gamma, alpha);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}

at::Tensor SigmoidFocalLoss_backward(
			     const at::Tensor& logits,
                             const at::Tensor& targets,
			     const at::Tensor& d_losses,
			     const int num_classes,
			     const float gamma,
			     const float alpha) {
  if (logits.type().is_cuda()) {
#ifdef WITH_CUDA
    return SigmoidFocalLoss_backward_cuda(logits, targets, d_losses, num_classes, gamma, alpha);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }
  AT_ERROR("Not implemented on the CPU");
}


================================================
FILE: disparity/csrc/cpu/ROIAlign_cpu.cpp
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include "cpu/vision.h"

// implementation taken from Caffe2
template <typename T>
struct PreCalc {
  int pos1;
  int pos2;
  int pos3;
  int pos4;
  T w1;
  T w2;
  T w3;
  T w4;
};

template <typename T>
void pre_calc_for_bilinear_interpolate(
    const int height,
    const int width,
    const int pooled_height,
    const int pooled_width,
    const int iy_upper,
    const int ix_upper,
    T roi_start_h,
    T roi_start_w,
    T bin_size_h,
    T bin_size_w,
    int roi_bin_grid_h,
    int roi_bin_grid_w,
    std::vector<PreCalc<T>>& pre_calc) {
  int pre_calc_index = 0;
  for (int ph = 0; ph < pooled_height; ph++) {
    for (int pw = 0; pw < pooled_width; pw++) {
      for (int iy = 0; iy < iy_upper; iy++) {
        const T yy = roi_start_h + ph * bin_size_h +
            static_cast<T>(iy + .5f) * bin_size_h /
                static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5
        for (int ix = 0; ix < ix_upper; ix++) {
          const T xx = roi_start_w + pw * bin_size_w +
              static_cast<T>(ix + .5f) * bin_size_w /
                  static_cast<T>(roi_bin_grid_w);

          T x = xx;
          T y = yy;
          // deal with: inverse elements are out of feature map boundary
          if (y < -1.0 || y > height || x < -1.0 || x > width) {
            // empty
            PreCalc<T> pc;
            pc.pos1 = 0;
            pc.pos2 = 0;
            pc.pos3 = 0;
            pc.pos4 = 0;
            pc.w1 = 0;
            pc.w2 = 0;
            pc.w3 = 0;
            pc.w4 = 0;
            pre_calc[pre_calc_index] = pc;
            pre_calc_index += 1;
            continue;
          }

          if (y <= 0) {
            y = 0;
          }
          if (x <= 0) {
            x = 0;
          }

          int y_low = (int)y;
          int x_low = (int)x;
          int y_high;
          int x_high;

          if (y_low >= height - 1) {
            y_high = y_low = height - 1;
            y = (T)y_low;
          } else {
            y_high = y_low + 1;
          }

          if (x_low >= width - 1) {
            x_high = x_low = width - 1;
            x = (T)x_low;
          } else {
            x_high = x_low + 1;
          }

          T ly = y - y_low;
          T lx = x - x_low;
          T hy = 1. - ly, hx = 1. - lx;
          T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

          // save weights and indices
          PreCalc<T> pc;
          pc.pos1 = y_low * width + x_low;
          pc.pos2 = y_low * width + x_high;
          pc.pos3 = y_high * width + x_low;
          pc.pos4 = y_high * width + x_high;
          pc.w1 = w1;
          pc.w2 = w2;
          pc.w3 = w3;
          pc.w4 = w4;
          pre_calc[pre_calc_index] = pc;

          pre_calc_index += 1;
        }
      }
    }
  }
}

template <typename T>
void ROIAlignForward_cpu_kernel(
    const int nthreads,
    const T* bottom_data,
    const T& spatial_scale,
    const int channels,
    const int height,
    const int width,
    const int pooled_height,
    const int pooled_width,
    const int sampling_ratio,
    const T* bottom_rois,
    //int roi_cols,
    T* top_data) {
  //AT_ASSERT(roi_cols == 4 || roi_cols == 5);
  int roi_cols = 5;

  int n_rois = nthreads / channels / pooled_width / pooled_height;
  // (n, c, ph, pw) is an element in the pooled output
  // can be parallelized using omp
  // #pragma omp parallel for num_threads(32)
  for (int n = 0; n < n_rois; n++) {
    int index_n = n * channels * pooled_width * pooled_height;

    // roi could have 4 or 5 columns
    const T* offset_bottom_rois = bottom_rois + n * roi_cols;
    int roi_batch_ind = 0;
    if (roi_cols == 5) {
      roi_batch_ind = offset_bottom_rois[0];
      offset_bottom_rois++;
    }

    // Do not use rounding; this implementation detail is critical
    T roi_start_w = offset_bottom_rois[0] * spatial_scale;
    T roi_start_h = offset_bottom_rois[1] * spatial_scale;
    T roi_end_w = offset_bottom_rois[2] * spatial_scale;
    T roi_end_h = offset_bottom_rois[3] * spatial_scale;
    // T roi_start_w = round(offset_bottom_rois[0] * spatial_scale);
    // T roi_start_h = round(offset_bottom_rois[1] * spatial_scale);
    // T roi_end_w = round(offset_bottom_rois[2] * spatial_scale);
    // T roi_end_h = round(offset_bottom_rois[3] * spatial_scale);

    // Force malformed ROIs to be 1x1
    T roi_width = std::max(roi_end_w - roi_start_w, (T)1.);
    T roi_height = std::max(roi_end_h - roi_start_h, (T)1.);
    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);

    // We use roi_bin_grid to sample the grid and mimic integral
    int roi_bin_grid_h = (sampling_ratio > 0)
        ? sampling_ratio
        : ceil(roi_height / pooled_height); // e.g., = 2
    int roi_bin_grid_w =
        (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);

    // We do average (integral) pooling inside a bin
    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4

    // We precalculate the indices and weights shared by all channels;
    // this is the key optimization.
    std::vector<PreCalc<T>> pre_calc(
        roi_bin_grid_h * roi_bin_grid_w * pooled_width * pooled_height);
    pre_calc_for_bilinear_interpolate(
        height,
        width,
        pooled_height,
        pooled_width,
        roi_bin_grid_h,
        roi_bin_grid_w,
        roi_start_h,
        roi_start_w,
        bin_size_h,
        bin_size_w,
        roi_bin_grid_h,
        roi_bin_grid_w,
        pre_calc);

    for (int c = 0; c < channels; c++) {
      int index_n_c = index_n + c * pooled_width * pooled_height;
      const T* offset_bottom_data =
          bottom_data + (roi_batch_ind * channels + c) * height * width;
      int pre_calc_index = 0;

      for (int ph = 0; ph < pooled_height; ph++) {
        for (int pw = 0; pw < pooled_width; pw++) {
          int index = index_n_c + ph * pooled_width + pw;

          T output_val = 0.;
          for (int iy = 0; iy < roi_bin_grid_h; iy++) {
            for (int ix = 0; ix < roi_bin_grid_w; ix++) {
              PreCalc<T> pc = pre_calc[pre_calc_index];
              output_val += pc.w1 * offset_bottom_data[pc.pos1] +
                  pc.w2 * offset_bottom_data[pc.pos2] +
                  pc.w3 * offset_bottom_data[pc.pos3] +
                  pc.w4 * offset_bottom_data[pc.pos4];

              pre_calc_index += 1;
            }
          }
          output_val /= count;

          top_data[index] = output_val;
        } // for pw
      } // for ph
    } // for c
  } // for n
}

at::Tensor ROIAlign_forward_cpu(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width,
                                const int sampling_ratio) {
  AT_ASSERTM(!input.type().is_cuda(), "input must be a CPU tensor");
  AT_ASSERTM(!rois.type().is_cuda(), "rois must be a CPU tensor");

  auto num_rois = rois.size(0);
  auto channels = input.size(1);
  auto height = input.size(2);
  auto width = input.size(3);

  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());
  auto output_size = num_rois * pooled_height * pooled_width * channels;

  if (output.numel() == 0) {
    return output;
  }

  AT_DISPATCH_FLOATING_TYPES(input.type(), "ROIAlign_forward", [&] {
    ROIAlignForward_cpu_kernel<scalar_t>(
         output_size,
         input.data<scalar_t>(),
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         sampling_ratio,
         rois.data<scalar_t>(),
         output.data<scalar_t>());
  });
  return output;
}


================================================
FILE: disparity/csrc/cpu/nms_cpu.cpp
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include "cpu/vision.h"


template <typename scalar_t>
at::Tensor nms_cpu_kernel(const at::Tensor& dets,
                          const at::Tensor& scores,
                          const float threshold) {
  AT_ASSERTM(!dets.type().is_cuda(), "dets must be a CPU tensor");
  AT_ASSERTM(!scores.type().is_cuda(), "scores must be a CPU tensor");
  AT_ASSERTM(dets.type() == scores.type(), "dets should have the same type as scores");

  if (dets.numel() == 0) {
    return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU));
  }

  auto x1_t = dets.select(1, 0).contiguous();
  auto y1_t = dets.select(1, 1).contiguous();
  auto x2_t = dets.select(1, 2).contiguous();
  auto y2_t = dets.select(1, 3).contiguous();

  at::Tensor areas_t = (x2_t - x1_t + 1) * (y2_t - y1_t + 1);

  auto order_t = std::get<1>(scores.sort(0, /* descending=*/true));

  auto ndets = dets.size(0);
  at::Tensor suppressed_t = at::zeros({ndets}, dets.options().dtype(at::kByte).device(at::kCPU));

  auto suppressed = suppressed_t.data<uint8_t>();
  auto order = order_t.data<int64_t>();
  auto x1 = x1_t.data<scalar_t>();
  auto y1 = y1_t.data<scalar_t>();
  auto x2 = x2_t.data<scalar_t>();
  auto y2 = y2_t.data<scalar_t>();
  auto areas = areas_t.data<scalar_t>();

  for (int64_t _i = 0; _i < ndets; _i++) {
    auto i = order[_i];
    if (suppressed[i] == 1)
      continue;
    auto ix1 = x1[i];
    auto iy1 = y1[i];
    auto ix2 = x2[i];
    auto iy2 = y2[i];
    auto iarea = areas[i];

    for (int64_t _j = _i + 1; _j < ndets; _j++) {
      auto j = order[_j];
      if (suppressed[j] == 1)
        continue;
      auto xx1 = std::max(ix1, x1[j]);
      auto yy1 = std::max(iy1, y1[j]);
      auto xx2 = std::min(ix2, x2[j]);
      auto yy2 = std::min(iy2, y2[j]);

      auto w = std::max(static_cast<scalar_t>(0), xx2 - xx1 + 1);
      auto h = std::max(static_cast<scalar_t>(0), yy2 - yy1 + 1);
      auto inter = w * h;
      auto ovr = inter / (iarea + areas[j] - inter);
      if (ovr >= threshold)
        suppressed[j] = 1;
    }
  }
  return at::nonzero(suppressed_t == 0).squeeze(1);
}

at::Tensor nms_cpu(const at::Tensor& dets,
               const at::Tensor& scores,
               const float threshold) {
  at::Tensor result;
  AT_DISPATCH_FLOATING_TYPES(dets.type(), "nms", [&] {
    result = nms_cpu_kernel<scalar_t>(dets, scores, threshold);
  });
  return result;
}


================================================
FILE: disparity/csrc/cpu/vision.h
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#pragma once
#include <torch/extension.h>


at::Tensor ROIAlign_forward_cpu(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width,
                                const int sampling_ratio);


at::Tensor nms_cpu(const at::Tensor& dets,
                   const at::Tensor& scores,
                   const float threshold);


================================================
FILE: disparity/csrc/cuda/BuildCostVolume_cuda.cu
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCAtomics.cuh>
#include <THC/THCDeviceUtils.cuh>

// TODO make it in a common file
#define CUDA_1D_KERNEL_LOOP(i, n)                            \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \
       i += blockDim.x * gridDim.x)


template <typename T>
__device__ T bilinear_interpolate(const T* bottom_data,
    const int height, const int width,
    T y, T x) {

  // deal with cases that inverse elements are out of feature map boundary
  if (y < -1.0 || y > height || x < -1.0 || x > width) {
    //empty
    return 0;
  }

  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  int y_low = (int) y;
  int x_low = (int) x;
  int y_high;
  int x_high;

  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = (T) y_low;
  } else {
    y_high = y_low + 1;
  }

  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = (T) x_low;
  } else {
    x_high = x_low + 1;
  }

  T ly = y - y_low;
  T lx = x - x_low;
  T hy = 1. - ly, hx = 1. - lx;
  // do bilinear interpolation
  T v1 = bottom_data[y_low * width + x_low];
  T v2 = bottom_data[y_low * width + x_high];
  T v3 = bottom_data[y_high * width + x_low];
  T v4 = bottom_data[y_high * width + x_high];
  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);

  return val;
}

template <typename T>
__global__ void BuildCostVolumeForward(const int nthreads, 
    const T* left, const T* right, const T* shift, 
    const int num_batch, const int channels, const int height,
    const int width, const int max_disp,
    T* cost) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    int pw = index % width;
    int ph = (index / width) % height;
    int pd = (index / width / height) % max_disp;
    int c = (index / width / height/ max_disp) % channels;
    int n = index / width / height / max_disp / channels;

    int index_L = (((n * 2 * channels + c) * max_disp + pd) * height + ph) * width + pw;
    int index_R = index_L + channels * max_disp * height * width;

    T shift_pd = -shift[n * max_disp + pd];

    cost[index_L] = left[((n * channels + c) * height + ph) * width + pw];

    if (pw + shift_pd >= 0. && pw + shift_pd <= width - 1)
    {
        const T* offset_right = right + (n * channels + c) * height * width;
        cost[index_R] = bilinear_interpolate(offset_right, height, width, (T)ph, (T)pw + shift_pd);
    }
    else 
    {
        cost[index_R] = 0.;
    }
  }
}


template <typename T>
__device__ void bilinear_interpolate_gradient(
    const int height, const int width,
    T y, T x,
    T & w1, T & w2, T & w3, T & w4,
    int & x_low, int & x_high, int & y_low, int & y_high) {

  // deal with cases that inverse elements are out of feature map boundary
  if (y < -1.0 || y > height || x < -1.0 || x > width) {
    //empty
    w1 = w2 = w3 = w4 = 0.;
    x_low = x_high = y_low = y_high = -1;
    return;
  }

  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  y_low = (int) y;
  x_low = (int) x;

  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = (T) y_low;
  } else {
    y_high = y_low + 1;
  }

  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = (T) x_low;
  } else {
    x_high = x_low + 1;
  }

  T ly = y - y_low;
  T lx = x - x_low;
  T hy = 1. - ly, hx = 1. - lx;

  // reference in forward
  // T v1 = bottom_data[y_low * width + x_low];
  // T v2 = bottom_data[y_low * width + x_high];
  // T v3 = bottom_data[y_high * width + x_low];
  // T v4 = bottom_data[y_high * width + x_high];
  // T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);

  w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

  return;
}

template <typename T>
__global__ void BuildCostVolumeBackwardFeature(const int nthreads, 
    const T* grad, const T* shift, 
    const int num_batch, const int channels, const int height,
    const int width, const int max_disp,
    T* grad_left, T* grad_right) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    int pw = index % width;
    int ph = (index / width) % height;
    int pd = (index / width / height) % max_disp;
    int c = (index / width / height/ max_disp) % channels;
    int n = index / width / height / max_disp / channels;

    int index_L = (((n * 2 * channels + c) * max_disp + pd) * height + ph) * width + pw;
    int index_R = index_L + channels * max_disp * height * width;

    T shift_pd = -shift[n * max_disp + pd];

    // left
    atomicAdd(grad_left + ((n * channels + c) * height + ph) * width + pw, static_cast<T>(grad[index_L]));

    if (pw + shift_pd >= 0. && pw + shift_pd <= width - 1)
    {
        // right
        T w1, w2, w3, w4;
        int x_low, x_high, y_low, y_high;

        bilinear_interpolate_gradient(height, width, (T) ph, (T) pw + shift_pd,
            w1, w2, w3, w4,
            x_low, x_high, y_low, y_high);

        T top_diff_this_bin = grad[index_R];
        T g1 = top_diff_this_bin * w1;
        T g2 = top_diff_this_bin * w2;
        T g3 = top_diff_this_bin * w3;
        T g4 = top_diff_this_bin * w4;

        T* offset_grad_right = grad_right + (n * channels + c) * height * width;
        if (w1 >= 1e-10)
            atomicAdd(offset_grad_right + y_low * width + x_low, static_cast<T>(g1));
        if (w2 >= 1e-10)
            atomicAdd(offset_grad_right + y_low * width + x_high, static_cast<T>(g2));
        if (w3 >= 1e-10)
            atomicAdd(offset_grad_right + y_high * width + x_low, static_cast<T>(g3));
        if (w4 >= 1e-10)
            atomicAdd(offset_grad_right + y_high * width + x_high, static_cast<T>(g4));
    }
  } // CUDA_1D_KERNEL_LOOP
} // BuildCostVolumeBackwardFeature


at::Tensor BuildCostVolume_forward_cuda(const at::Tensor& left,
                                 const at::Tensor& right,
                                 const at::Tensor& shift) {
  AT_ASSERTM(left.type().is_cuda(), "left must be a CUDA tensor");
  AT_ASSERTM(right.type().is_cuda(), "right must be a CUDA tensor");
  AT_ASSERTM(shift.type().is_cuda(), "shift must be a CUDA tensor");

  AT_ASSERTM((left.size(0) == right.size(0)) && (left.size(1) == right.size(1)) && \
    (left.size(2) == right.size(2)) && (left.size(3) == right.size(3)), \
    "Left and right images must have the same size.");
  AT_ASSERTM(left.size(0) == shift.size(0), \
    "Image and shift must have the same batch size.");

  auto num_batch = left.size(0);
  auto channels = left.size(1);
  auto height = left.size(2);
  auto width = left.size(3);
  auto max_disp = shift.size(1);

  auto output = at::empty({num_batch, channels * 2, max_disp, height, width}, left.options());
  auto output_size = num_batch * channels * 2 * max_disp * height * width;
  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv((long)(output_size / 2), 512L), 4096L));
  dim3 block(512);

  if (output.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return output;
  }

  AT_DISPATCH_FLOATING_TYPES(left.type(), "BuildCostVolume_forward", [&] {
    BuildCostVolumeForward<scalar_t><<<grid, block, 0, stream>>>(
         output_size / 2,
         left.contiguous().data<scalar_t>(),
         right.contiguous().data<scalar_t>(),
         shift.contiguous().data<scalar_t>(),
         num_batch,
         channels,
         height,
         width,
         max_disp,
         output.data<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return output;
}

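// The gradients w.r.t. left and right are computed from grad and shift alone;
// the input feature maps are not needed here.
std::tuple<at::Tensor, at::Tensor> BuildCostVolume_backward_cuda(const at::Tensor& grad,
                                  const at::Tensor& shift) {
  AT_ASSERTM(grad.type().is_cuda(), "grad must be a CUDA tensor");
  AT_ASSERTM(shift.type().is_cuda(), "shift must be a CUDA tensor");
  AT_ASSERTM(grad.dim() == 5, "grad must be a 5D tensor (N, 2C, D, H, W)");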

  auto num_batch = grad.size(0);
  auto channels = grad.size(1) / 2;
  auto height = grad.size(3);
  auto width = grad.size(4);
  auto max_disp = shift.size(1);

  auto grad_left = at::zeros({num_batch, channels, height, width}, grad.options());
  auto grad_right = at::zeros({num_batch, channels, height, width}, grad.options());

  AT_ASSERTM(grad.numel() == num_batch * channels * 2 * max_disp * height * width,
      "grad shape is wrong");

  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

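  // The kernel runs grad.numel() / 2 threads (one per left/right pair), so size the grid to match.
  dim3 grid(std::min(THCCeilDiv((long)(grad.numel() / 2), 512L), 4096L));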
  dim3 block(512);

  // handle possibly empty gradients
  if (grad.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return std::make_tuple(grad_left, grad_right);
  }

  AT_DISPATCH_FLOATING_TYPES(grad.type(), "BuildCostVolume_backward", [&] {
    BuildCostVolumeBackwardFeature<scalar_t><<<grid, block, 0, stream>>>(
         grad.numel() / 2,
         grad.contiguous().data<scalar_t>(),
         shift.contiguous().data<scalar_t>(),
         num_batch,
         channels,
         height,
         width,
         max_disp,
         grad_left.data<scalar_t>(),
         grad_right.data<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return std::make_tuple(grad_left, grad_right);
}



================================================
FILE: disparity/csrc/cuda/ROIAlign_cuda.cu
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCAtomics.cuh>
#include <THC/THCDeviceUtils.cuh>

// TODO make it in a common file
#define CUDA_1D_KERNEL_LOOP(i, n)                            \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \
       i += blockDim.x * gridDim.x)


template <typename T>
__device__ T bilinear_interpolate(const T* bottom_data,
    const int height, const int width,
    T y, T x,
    const int index /* index for debug only*/) {

  // deal with cases that inverse elements are out of feature map boundary
  if (y < -1.0 || y > height || x < -1.0 || x > width) {
    //empty
    return 0;
  }

  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  int y_low = (int) y;
  int x_low = (int) x;
  int y_high;
  int x_high;

  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = (T) y_low;
  } else {
    y_high = y_low + 1;
  }

  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = (T) x_low;
  } else {
    x_high = x_low + 1;
  }

  T ly = y - y_low;
  T lx = x - x_low;
  T hy = 1. - ly, hx = 1. - lx;
  // do bilinear interpolation
  T v1 = bottom_data[y_low * width + x_low];
  T v2 = bottom_data[y_low * width + x_high];
  T v3 = bottom_data[y_high * width + x_low];
  T v4 = bottom_data[y_high * width + x_high];
  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);

  return val;
}

template <typename T>
__global__ void RoIAlignForward(const int nthreads, const T* bottom_data,
    const T spatial_scale, const int channels,
    const int height, const int width,
    const int pooled_height, const int pooled_width,
    const int sampling_ratio,
    const T* bottom_rois, T* top_data) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    // (n, c, ph, pw) is an element in the pooled output
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];

    // Do not use rounding; this implementation detail is critical
    T roi_start_w = offset_bottom_rois[1] * spatial_scale;
    T roi_start_h = offset_bottom_rois[2] * spatial_scale;
    T roi_end_w = offset_bottom_rois[3] * spatial_scale;
    T roi_end_h = offset_bottom_rois[4] * spatial_scale;
    // T roi_start_w = round(offset_bottom_rois[1] * spatial_scale);
    // T roi_start_h = round(offset_bottom_rois[2] * spatial_scale);
    // T roi_end_w = round(offset_bottom_rois[3] * spatial_scale);
    // T roi_end_h = round(offset_bottom_rois[4] * spatial_scale);

    // Force malformed ROIs to be 1x1
    T roi_width = max(roi_end_w - roi_start_w, (T)1.);
    T roi_height = max(roi_end_h - roi_start_h, (T)1.);
    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);

    const T* offset_bottom_data = bottom_data + (roi_batch_ind * channels + c) * height * width;

    // We use roi_bin_grid to sample the grid and mimic integral
    int roi_bin_grid_h = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_height / pooled_height); // e.g., = 2
    int roi_bin_grid_w = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);

    // We do average (integral) pooling inside a bin
    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4

    T output_val = 0.;
    for (int iy = 0; iy < roi_bin_grid_h; iy ++) // e.g., iy = 0, 1
    {
      const T y = roi_start_h + ph * bin_size_h + static_cast<T>(iy + .5f) * bin_size_h / static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5
      for (int ix = 0; ix < roi_bin_grid_w; ix ++)
      {
        const T x = roi_start_w + pw * bin_size_w + static_cast<T>(ix + .5f) * bin_size_w / static_cast<T>(roi_bin_grid_w);

        T val = bilinear_interpolate(offset_bottom_data, height, width, y, x, index);
        output_val += val;
      }
    }
    output_val /= count;

    top_data[index] = output_val;
  }
}


template <typename T>
__device__ void bilinear_interpolate_gradient(
    const int height, const int width,
    T y, T x,
    T & w1, T & w2, T & w3, T & w4,
    int & x_low, int & x_high, int & y_low, int & y_high,
    const int index /* index for debug only*/) {

  // deal with cases that inverse elements are out of feature map boundary
  if (y < -1.0 || y > height || x < -1.0 || x > width) {
    //empty
    w1 = w2 = w3 = w4 = 0.;
    x_low = x_high = y_low = y_high = -1;
    return;
  }

  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  y_low = (int) y;
  x_low = (int) x;

  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = (T) y_low;
  } else {
    y_high = y_low + 1;
  }

  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = (T) x_low;
  } else {
    x_high = x_low + 1;
  }

  T ly = y - y_low;
  T lx = x - x_low;
  T hy = 1. - ly, hx = 1. - lx;

  // reference in forward
  // T v1 = bottom_data[y_low * width + x_low];
  // T v2 = bottom_data[y_low * width + x_high];
  // T v3 = bottom_data[y_high * width + x_low];
  // T v4 = bottom_data[y_high * width + x_high];
  // T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);

  w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

  return;
}

template <typename T>
__global__ void RoIAlignBackwardFeature(const int nthreads, const T* top_diff,
    const int num_rois, const T spatial_scale,
    const int channels, const int height, const int width,
    const int pooled_height, const int pooled_width,
    const int sampling_ratio,
    T* bottom_diff,
    const T* bottom_rois) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    // (n, c, ph, pw) is an element in the pooled output
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];

    // Do not use rounding; this implementation detail is critical
    T roi_start_w = offset_bottom_rois[1] * spatial_scale;
    T roi_start_h = offset_bottom_rois[2] * spatial_scale;
    T roi_end_w = offset_bottom_rois[3] * spatial_scale;
    T roi_end_h = offset_bottom_rois[4] * spatial_scale;
    // T roi_start_w = round(offset_bottom_rois[1] * spatial_scale);
    // T roi_start_h = round(offset_bottom_rois[2] * spatial_scale);
    // T roi_end_w = round(offset_bottom_rois[3] * spatial_scale);
    // T roi_end_h = round(offset_bottom_rois[4] * spatial_scale);

    // Force malformed ROIs to be 1x1
    T roi_width = max(roi_end_w - roi_start_w, (T)1.);
    T roi_height = max(roi_end_h - roi_start_h, (T)1.);
    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);

    T* offset_bottom_diff = bottom_diff + (roi_batch_ind * channels + c) * height * width;

    int top_offset    = (n * channels + c) * pooled_height * pooled_width;
    const T* offset_top_diff = top_diff + top_offset;
    const T top_diff_this_bin = offset_top_diff[ph * pooled_width + pw];

    // We use roi_bin_grid to sample the grid and mimic integral
    int roi_bin_grid_h = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_height / pooled_height); // e.g., = 2
    int roi_bin_grid_w = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);

    // We do average (integral) pooling inside a bin
    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4

    for (int iy = 0; iy < roi_bin_grid_h; iy ++) // e.g., iy = 0, 1
    {
      const T y = roi_start_h + ph * bin_size_h + static_cast<T>(iy + .5f) * bin_size_h / static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5
      for (int ix = 0; ix < roi_bin_grid_w; ix ++)
      {
        const T x = roi_start_w + pw * bin_size_w + static_cast<T>(ix + .5f) * bin_size_w / static_cast<T>(roi_bin_grid_w);

        T w1, w2, w3, w4;
        int x_low, x_high, y_low, y_high;

        bilinear_interpolate_gradient(height, width, y, x,
            w1, w2, w3, w4,
            x_low, x_high, y_low, y_high,
            index);

        T g1 = top_diff_this_bin * w1 / count;
        T g2 = top_diff_this_bin * w2 / count;
        T g3 = top_diff_this_bin * w3 / count;
        T g4 = top_diff_this_bin * w4 / count;

        if (x_low >= 0 && x_high >= 0 && y_low >= 0 && y_high >= 0)
        {
          atomicAdd(offset_bottom_diff + y_low * width + x_low, static_cast<T>(g1));
          atomicAdd(offset_bottom_diff + y_low * width + x_high, static_cast<T>(g2));
          atomicAdd(offset_bottom_diff + y_high * width + x_low, static_cast<T>(g3));
          atomicAdd(offset_bottom_diff + y_high * width + x_high, static_cast<T>(g4));
        } // if
      } // ix
    } // iy
  } // CUDA_1D_KERNEL_LOOP
} // RoIAlignBackwardFeature


at::Tensor ROIAlign_forward_cuda(const at::Tensor& input,
                                 const at::Tensor& rois,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int sampling_ratio) {
  AT_ASSERTM(input.type().is_cuda(), "input must be a CUDA tensor");
  AT_ASSERTM(rois.type().is_cuda(), "rois must be a CUDA tensor");

  auto num_rois = rois.size(0);
  auto channels = input.size(1);
  auto height = input.size(2);
  auto width = input.size(3);

  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());
  auto output_size = num_rois * pooled_height * pooled_width * channels;
  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv((long)output_size, 512L), 4096L));
  dim3 block(512);

  if (output.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return output;
  }

  AT_DISPATCH_FLOATING_TYPES(input.type(), "ROIAlign_forward", [&] {
    RoIAlignForward<scalar_t><<<grid, block, 0, stream>>>(
         output_size,
         input.contiguous().data<scalar_t>(),
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         sampling_ratio,
         rois.contiguous().data<scalar_t>(),
         output.data<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return output;
}

// TODO remove the dependency on input and use instead its sizes -> save memory
at::Tensor ROIAlign_backward_cuda(const at::Tensor& grad,
                                  const at::Tensor& rois,
                                  const float spatial_scale,
                                  const int pooled_height,
                                  const int pooled_width,
                                  const int batch_size,
                                  const int channels,
                                  const int height,
                                  const int width,
                                  const int sampling_ratio) {
  AT_ASSERTM(grad.type().is_cuda(), "grad must be a CUDA tensor");
  AT_ASSERTM(rois.type().is_cuda(), "rois must be a CUDA tensor");

  auto num_rois = rois.size(0);
  auto grad_input = at::zeros({batch_size, channels, height, width}, grad.options());

  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv((long)grad.numel(), 512L), 4096L));
  dim3 block(512);

  // handle possibly empty gradients
  if (grad.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return grad_input;
  }

  AT_DISPATCH_FLOATING_TYPES(grad.type(), "ROIAlign_backward", [&] {
    RoIAlignBackwardFeature<scalar_t><<<grid, block, 0, stream>>>(
         grad.numel(),
         grad.contiguous().data<scalar_t>(),
         num_rois,
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         sampling_ratio,
         grad_input.data<scalar_t>(),
         rois.contiguous().data<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return grad_input;
}


================================================
FILE: disparity/csrc/cuda/ROIPool_cuda.cu
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCAtomics.cuh>
#include <THC/THCDeviceUtils.cuh>

#include <cfloat>  // FLT_MAX, used by RoIPoolFForward


// TODO make it in a common file
#define CUDA_1D_KERNEL_LOOP(i, n)                            \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \
       i += blockDim.x * gridDim.x)


template <typename T>
__global__ void RoIPoolFForward(const int nthreads, const T* bottom_data,
    const T spatial_scale, const int channels, const int height,
    const int width, const int pooled_height, const int pooled_width,
    const T* bottom_rois, T* top_data, int* argmax_data) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    // (n, c, ph, pw) is an element in the pooled output
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];
    int roi_start_w = round(offset_bottom_rois[1] * spatial_scale);
    int roi_start_h = round(offset_bottom_rois[2] * spatial_scale);
    int roi_end_w = round(offset_bottom_rois[3] * spatial_scale);
    int roi_end_h = round(offset_bottom_rois[4] * spatial_scale);

    // Force malformed ROIs to be 1x1
    int roi_width = max(roi_end_w - roi_start_w + 1, 1);
    int roi_height = max(roi_end_h - roi_start_h + 1, 1);
    T bin_size_h = static_cast<T>(roi_height)
                       / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width)
                       / static_cast<T>(pooled_width);

    int hstart = static_cast<int>(floor(static_cast<T>(ph)
                                        * bin_size_h));
    int wstart = static_cast<int>(floor(static_cast<T>(pw)
                                        * bin_size_w));
    int hend = static_cast<int>(ceil(static_cast<T>(ph + 1)
                                     * bin_size_h));
    int wend = static_cast<int>(ceil(static_cast<T>(pw + 1)
                                     * bin_size_w));

    // Add roi offsets and clip to input boundaries
    hstart = min(max(hstart + roi_start_h, 0), height);
    hend = min(max(hend + roi_start_h, 0), height);
    wstart = min(max(wstart + roi_start_w, 0), width);
    wend = min(max(wend + roi_start_w, 0), width);
    bool is_empty = (hend <= hstart) || (wend <= wstart);

    // Define an empty pooling region to be zero
    T maxval = is_empty ? 0 : -FLT_MAX;
    // If nothing is pooled, argmax = -1 causes nothing to be backprop'd
    int maxidx = -1;
    const T* offset_bottom_data =
        bottom_data + (roi_batch_ind * channels + c) * height * width;
    for (int h = hstart; h < hend; ++h) {
      for (int w = wstart; w < wend; ++w) {
        int bottom_index = h * width + w;
        if (offset_bottom_data[bottom_index] > maxval) {
          maxval = offset_bottom_data[bottom_index];
          maxidx = bottom_index;
        }
      }
    }
    top_data[index] = maxval;
    argmax_data[index] = maxidx;
  }
}

template <typename T>
__global__ void RoIPoolFBackward(const int nthreads, const T* top_diff,
    const int* argmax_data, const int num_rois, const T spatial_scale,
    const int channels, const int height, const int width,
    const int pooled_height, const int pooled_width, T* bottom_diff,
    const T* bottom_rois) {
  CUDA_1D_KERNEL_LOOP(index, nthreads) {
    // (n, c, ph, pw) is an element in the pooled output
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];
    int bottom_offset = (roi_batch_ind * channels + c) * height * width;
    int top_offset    = (n * channels + c) * pooled_height * pooled_width;
    const T* offset_top_diff = top_diff + top_offset;
    T* offset_bottom_diff = bottom_diff + bottom_offset;
    const int* offset_argmax_data = argmax_data + top_offset;

    int argmax = offset_argmax_data[ph * pooled_width + pw];
    if (argmax != -1) {
      atomicAdd(
          offset_bottom_diff + argmax,
          static_cast<T>(offset_top_diff[ph * pooled_width + pw]));

    }
  }
}

std::tuple<at::Tensor, at::Tensor> ROIPool_forward_cuda(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width) {
  AT_ASSERTM(input.type().is_cuda(), "input must be a CUDA tensor");
  AT_ASSERTM(rois.type().is_cuda(), "rois must be a CUDA tensor");

  auto num_rois = rois.size(0);
  auto channels = input.size(1);
  auto height = input.size(2);
  auto width = input.size(3);

  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());
  auto output_size = num_rois * pooled_height * pooled_width * channels;
  auto argmax = at::zeros({num_rois, channels, pooled_height, pooled_width}, input.options().dtype(at::kInt));

  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv((long)output_size, 512L), 4096L));
  dim3 block(512);

  if (output.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return std::make_tuple(output, argmax);
  }

  AT_DISPATCH_FLOATING_TYPES(input.type(), "ROIPool_forward", [&] {
    RoIPoolFForward<scalar_t><<<grid, block, 0, stream>>>(
         output_size,
         input.contiguous().data<scalar_t>(),
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         rois.contiguous().data<scalar_t>(),
         output.data<scalar_t>(),
         argmax.data<int>());
  });
  THCudaCheck(cudaGetLastError());
  return std::make_tuple(output, argmax);
}

// TODO remove the dependency on input and use instead its sizes -> save memory
at::Tensor ROIPool_backward_cuda(const at::Tensor& grad,
                                 const at::Tensor& input,
                                 const at::Tensor& rois,
                                 const at::Tensor& argmax,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int batch_size,
                                 const int channels,
                                 const int height,
                                 const int width) {
  AT_ASSERTM(grad.type().is_cuda(), "grad must be a CUDA tensor");
  AT_ASSERTM(rois.type().is_cuda(), "rois must be a CUDA tensor");
  // TODO add more checks

  auto num_rois = rois.size(0);
  auto grad_input = at::zeros({batch_size, channels, height, width}, grad.options());

  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv((long)grad.numel(), 512L), 4096L));
  dim3 block(512);

  // handle possibly empty gradients
  if (grad.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return grad_input;
  }

  AT_DISPATCH_FLOATING_TYPES(grad.type(), "ROIPool_backward", [&] {
    RoIPoolFBackward<scalar_t><<<grid, block, 0, stream>>>(
         grad.numel(),
         grad.contiguous().data<scalar_t>(),
         argmax.data<int>(),
         num_rois,
         spatial_scale,
         channels,
         height,
         width,
         pooled_height,
         pooled_width,
         grad_input.data<scalar_t>(),
         rois.contiguous().data<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return grad_input;
}


================================================
FILE: disparity/csrc/cuda/SigmoidFocalLoss_cuda.cu
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
// This file is modified from  https://github.com/pytorch/pytorch/blob/master/modules/detectron/sigmoid_focal_loss_op.cu
// Cheng-Yang Fu
// cyfu@cs.unc.edu
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCAtomics.cuh>
#include <THC/THCDeviceUtils.cuh>

#include <cfloat>

// TODO make it in a common file
#define CUDA_1D_KERNEL_LOOP(i, n)                            \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \
       i += blockDim.x * gridDim.x)


template <typename T>
__global__ void SigmoidFocalLossForward(const int nthreads, 
    const T* logits,
    const int* targets,
    const int num_classes,
    const float gamma, 
    const float alpha,
    const int num, 
    T* losses) {
  CUDA_1D_KERNEL_LOOP(i, nthreads) {

    int n = i / num_classes;
    int d = i % num_classes; // current class[0~79]; 
    int t = targets[n]; // target class [1~80];

    // Decide whether this is a positive or a negative case.
    // Targets with t < 0 are ignored (both c1 and c2 are zero).
    T c1 = (t == (d+1));
    T c2 = (t >= 0 && t != (d+1));

    T zn = (1.0 - alpha);
    T zp = (alpha);

    // p = sigmoid(x) = 1. / (1. + expf(-x))
    T  p = 1. / (1. + expf(-logits[i]));

    // (1-p)**gamma * log(p)
    T term1 = powf((1. - p), gamma) * logf(max(p, FLT_MIN));

    // p**gamma * log(1-p)
    T term2 = powf(p, gamma) *
            (-1. * logits[i] * (logits[i] >= 0) -   
             logf(1. + expf(logits[i] - 2. * logits[i] * (logits[i] >= 0))));

    losses[i] = 0.0;
    losses[i] += -c1 * term1 * zp;
    losses[i] += -c2 * term2 * zn;

  } // CUDA_1D_KERNEL_LOOP
} // SigmoidFocalLossForward


template <typename T>
__global__ void SigmoidFocalLossBackward(const int nthreads,
                const T* logits,
                const int* targets,
                const T* d_losses,
                const int num_classes,
                const float gamma,
                const float alpha,
                const int num,
                T* d_logits) {
  CUDA_1D_KERNEL_LOOP(i, nthreads) {

    int n = i / num_classes;
    int d = i % num_classes; // current class[0~79]; 
    int t = targets[n]; // target class [1~80], 0 is background;

    // Decide whether this is a positive or a negative case.
    // Targets with t < 0 are ignored (both c1 and c2 are zero).
    T c1 = (t == (d+1));
    T c2 = (t >= 0 && t != (d+1));

    T zn = (1.0 - alpha);
    T zp = (alpha);
    // p = sigmoid(x) = 1. / (1. + expf(-x))
    T  p = 1. / (1. + expf(-logits[i]));

    // (1-p)**g * (1 - p - g*p*log(p))
    T term1 = powf((1. - p), gamma) *
                      (1. - p - (p * gamma * logf(max(p, FLT_MIN))));

    // (p**g) * (g*(1-p)*log(1-p) - p)
    T term2 = powf(p, gamma) *
                  ((-1. * logits[i] * (logits[i] >= 0) -
                      logf(1. + expf(logits[i] - 2. * logits[i] * (logits[i] >= 0)))) *
                      (1. - p) * gamma - p);
    d_logits[i] = 0.0;
    d_logits[i] += -c1 * term1 * zp;
    d_logits[i] += -c2 * term2 * zn;
    d_logits[i] = d_logits[i] * d_losses[i];

  } // CUDA_1D_KERNEL_LOOP
} // SigmoidFocalLossBackward


at::Tensor SigmoidFocalLoss_forward_cuda(
		const at::Tensor& logits,
                const at::Tensor& targets,
		const int num_classes, 
		const float gamma, 
		const float alpha) {
  AT_ASSERTM(logits.type().is_cuda(), "logits must be a CUDA tensor");
  AT_ASSERTM(targets.type().is_cuda(), "targets must be a CUDA tensor");
  AT_ASSERTM(logits.dim() == 2, "logits should be NxClass");

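  const int num_samples = logits.size(0);
  // The kernel decomposes the flat index with num_classes, so the two must agree
  // (the backward wrapper performs the same check).
  AT_ASSERTM(logits.size(1) == num_classes, "logits.size(1) should be num_classes");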
	
  auto losses = at::empty({num_samples, logits.size(1)}, logits.options());
  auto losses_size = num_samples * logits.size(1);
  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv((long)losses_size, 512L), 4096L));
  dim3 block(512);

  if (losses.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return losses;
  }

  AT_DISPATCH_FLOATING_TYPES(logits.type(), "SigmoidFocalLoss_forward", [&] {
    SigmoidFocalLossForward<scalar_t><<<grid, block, 0, stream>>>(
         losses_size,
         logits.contiguous().data<scalar_t>(),
	 targets.contiguous().data<int>(),
         num_classes,
	 gamma,
	 alpha,
	 num_samples,
         losses.data<scalar_t>());
  });
  THCudaCheck(cudaGetLastError());
  return losses;   
}	


at::Tensor SigmoidFocalLoss_backward_cuda(
		const at::Tensor& logits,
                const at::Tensor& targets,
		const at::Tensor& d_losses,
		const int num_classes, 
		const float gamma, 
		const float alpha) {
  AT_ASSERTM(logits.type().is_cuda(), "logits must be a CUDA tensor");
  AT_ASSERTM(targets.type().is_cuda(), "targets must be a CUDA tensor");
  AT_ASSERTM(d_losses.type().is_cuda(), "d_losses must be a CUDA tensor");

  AT_ASSERTM(logits.dim() == 2, "logits should be NxClass");

  const int num_samples = logits.size(0);
  AT_ASSERTM(logits.size(1) == num_classes, "logits.size(1) should be num_classes");
	
  auto d_logits = at::zeros({num_samples, num_classes}, logits.options());
  auto d_logits_size = num_samples * logits.size(1);
  cudaStream_t stream = at::cuda::getCurrentCUDAStream();

  dim3 grid(std::min(THCCeilDiv((long)d_logits_size, 512L), 4096L));
  dim3 block(512);

  if (d_logits.numel() == 0) {
    THCudaCheck(cudaGetLastError());
    return d_logits;
  }

  AT_DISPATCH_FLOATING_TYPES(logits.type(), "SigmoidFocalLoss_backward", [&] {
    SigmoidFocalLossBackward<scalar_t><<<grid, block, 0, stream>>>(
         d_logits_size,
         logits.contiguous().data<scalar_t>(),
	 targets.contiguous().data<int>(),
	 d_losses.contiguous().data<scalar_t>(),
         num_classes,
	 gamma,
	 alpha,
	 num_samples,
         d_logits.data<scalar_t>());
  });

  THCudaCheck(cudaGetLastError());
  return d_logits;   
}	
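
// Reference sketch (not part of the extension): the two kernels above can be
// sanity-checked against a small NumPy version of the same per-element loss.
// The function name `sigmoid_focal_loss_reference` is hypothetical, and the
// sketch uses np.logaddexp for log(p) instead of the FLT_MIN clamp, so values
// may differ slightly from the kernel for extreme logits.

```python
import numpy as np


def sigmoid_focal_loss_reference(logits, targets, gamma, alpha):
    """Per-element sigmoid focal loss, mirroring SigmoidFocalLossForward.

    logits:  (N, C) float array
    targets: (N,) int array; 0 = background, k in 1..C = class k,
             negative values are ignored (loss is zero for that row)
    """
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(targets)
    num_classes = logits.shape[1]

    # Column d of the logits corresponds to class id d + 1.
    class_ids = np.arange(1, num_classes + 1)[None, :]
    t = targets[:, None]
    pos = (t == class_ids).astype(np.float64)                # c1 in the kernel
    neg = ((t >= 0) & (t != class_ids)).astype(np.float64)   # c2 in the kernel

    p = 1.0 / (1.0 + np.exp(-logits))
    log_p = -np.logaddexp(0.0, -logits)    # log(sigmoid(x)), stable
    log_1mp = -np.logaddexp(0.0, logits)   # log(1 - sigmoid(x)), stable

    loss_pos = -alpha * (1.0 - p) ** gamma * log_p
    loss_neg = -(1.0 - alpha) * p ** gamma * log_1mp
    return pos * loss_pos + neg * loss_neg
```

// Comparing this output with the CUDA forward result on a few random inputs
// is one way to check the indexing convention (class id = column + 1).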



================================================
FILE: disparity/csrc/cuda/nms.cu
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

#include <THC/THC.h>
#include <THC/THCDeviceUtils.cuh>

#include <vector>
#include <iostream>

int const threadsPerBlock = sizeof(unsigned long long) * 8;

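// IoU of two boxes stored as (x1, y1, x2, y2, ...). Coordinates are treated as
// inclusive pixel indices, hence the "+ 1" in every width and height.
__device__ inline float devIoU(float const * const a, float const * const b) {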
  float left = max(a[0], b[0]), right = min(a[2], b[2]);
  float top = max(a[1], b[1]), bottom = min(a[3], b[3]);
  float width = max(right - left + 1, 0.f), height = max(bottom - top + 1, 0.f);
  float interS = width * height;
  float Sa = (a[2] - a[0] + 1) * (a[3] - a[1] + 1);
  float Sb = (b[2] - b[0] + 1) * (b[3] - b[1] + 1);
  return interS / (Sa + Sb - interS);
}

__global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh,
                           const float *dev_boxes, unsigned long long *dev_mask) {
  const int row_start = blockIdx.y;
  const int col_start = blockIdx.x;

  // if (row_start > col_start) return;

  const int row_size =
        min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
  const int col_size =
        min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);

  __shared__ float block_boxes[threadsPerBlock * 5];
  if (threadIdx.x < col_size) {
    block_boxes[threadIdx.x * 5 + 0] =
        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0];
    block_boxes[threadIdx.x * 5 + 1] =
        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1];
    block_boxes[threadIdx.x * 5 + 2] =
        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2];
    block_boxes[threadIdx.x * 5 + 3] =
        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3];
    block_boxes[threadIdx.x * 5 + 4] =
        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4];
  }
  __syncthreads();

  if (threadIdx.x < row_size) {
    const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;
    const float *cur_box = dev_boxes + cur_box_idx * 5;
    int i = 0;
    unsigned long long t = 0;
    int start = 0;
    if (row_start == col_start) {
      start = threadIdx.x + 1;
    }
    for (i = start; i < col_size; i++) {
      if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {
        t |= 1ULL << i;
      }
    }
    const int col_blocks = THCCeilDiv(n_boxes, threadsPerBlock);
    dev_mask[cur_box_idx * col_blocks + col_start] = t;
  }
}

// boxes is an N x 5 tensor: (x1, y1, x2, y2, score) per row
at::Tensor nms_cuda(const at::Tensor boxes, float nms_overlap_thresh) {
  using scalar_t = float;
  AT_ASSERTM(boxes.type().is_cuda(), "boxes must be a CUDA tensor");
  auto scores = boxes.select(1, 4);
  auto order_t = std::get<1>(scores.sort(0, /* descending=*/true));
  auto boxes_sorted = boxes.index_select(0, order_t);

  int boxes_num = boxes.size(0);

  const int col_blocks = THCCeilDiv(boxes_num, threadsPerBlock);

  scalar_t* boxes_dev = boxes_sorted.data<scalar_t>();

  THCState *state = at::globalContext().lazyInitCUDA(); // TODO replace with getTHCState

  unsigned long long* mask_dev = NULL;
  //THCudaCheck(THCudaMalloc(state, (void**) &mask_dev,
  //                      boxes_num * col_blocks * sizeof(unsigned long long)));

  mask_dev = (unsigned long long*) THCudaMalloc(state, boxes_num * col_blocks * sizeof(unsigned long long));

  dim3 blocks(THCCeilDiv(boxes_num, threadsPerBlock),
              THCCeilDiv(boxes_num, threadsPerBlock));
  dim3 threads(threadsPerBlock);
  nms_kernel<<<blocks, threads>>>(boxes_num,
                                  nms_overlap_thresh,
                                  boxes_dev,
                                  mask_dev);

  std::vector<unsigned long long> mask_host(boxes_num * col_blocks);
  THCudaCheck(cudaMemcpy(&mask_host[0],
                        mask_dev,
                        sizeof(unsigned long long) * boxes_num * col_blocks,
                        cudaMemcpyDeviceToHost));

  std::vector<unsigned long long> remv(col_blocks);
  memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks);

  at::Tensor keep = at::empty({boxes_num}, boxes.options().dtype(at::kLong).device(at::kCPU));
  int64_t* keep_out = keep.data<int64_t>();

  int num_to_keep = 0;
  for (int i = 0; i < boxes_num; i++) {
    int nblock = i / threadsPerBlock;
    int inblock = i % threadsPerBlock;

    if (!(remv[nblock] & (1ULL << inblock))) {
      keep_out[num_to_keep++] = i;
      unsigned long long *p = &mask_host[0] + i * col_blocks;
      for (int j = nblock; j < col_blocks; j++) {
        remv[j] |= p[j];
      }
    }
  }

  THCudaFree(state, mask_dev);
  // TODO improve this part
  return std::get<0>(order_t.index({
                       keep.narrow(/*dim=*/0, /*start=*/0, /*length=*/num_to_keep).to(
                         order_t.device(), keep.scalar_type())
                     }).sort(0, false));
}
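
// Reference sketch (not part of the extension): a plain NumPy greedy NMS that
// follows the same conventions as nms_cuda above, namely inclusive pixel
// coordinates ("+ 1"), suppression when IoU > threshold, and kept indices
// returned in ascending order. The name `nms_reference` is hypothetical.

```python
import numpy as np


def nms_reference(boxes, scores, thresh):
    """Greedy non-maximum suppression on (N, 4) boxes given as x1, y1, x2, y2."""
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    # Visit boxes from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]

        # Intersection of box i with every remaining box.
        w = np.maximum(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1, 0.0)
        h = np.maximum(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1, 0.0)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)

        # Drop everything that overlaps box i by more than the threshold.
        order = rest[iou <= thresh]
    return sorted(keep)
```

// Such a reference is mainly useful for unit tests on small inputs; it is
// quadratic in the number of boxes and makes no attempt to be fast.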


================================================
FILE: disparity/csrc/cuda/vision.h
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#pragma once
#include <torch/extension.h>


at::Tensor SigmoidFocalLoss_forward_cuda(const at::Tensor& logits,
                                         const at::Tensor& targets,
                                         const int num_classes,
                                         const float gamma,
                                         const float alpha);

at::Tensor SigmoidFocalLoss_backward_cuda(const at::Tensor& logits,
                                          const at::Tensor& targets,
                                          const at::Tensor& d_losses,
                                          const int num_classes,
                                          const float gamma,
                                          const float alpha);

at::Tensor ROIAlign_forward_cuda(const at::Tensor& input,
                                 const at::Tensor& rois,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int sampling_ratio);

at::Tensor ROIAlign_backward_cuda(const at::Tensor& grad,
                                  const at::Tensor& rois,
                                  const float spatial_scale,
                                  const int pooled_height,
                                  const int pooled_width,
                                  const int batch_size,
                                  const int channels,
                                  const int height,
                                  const int width,
                                  const int sampling_ratio);

at::Tensor ROIDisp_forward_cuda(const at::Tensor& input,
                                 const at::Tensor& input_R,
                                 const at::Tensor& rois,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int max_disp);

std::tuple<at::Tensor, at::Tensor> ROIDisp_backward_cuda(const at::Tensor& grad,
                                  const at::Tensor& rois,
                                  const float spatial_scale,
                                  const int pooled_height,
                                  const int pooled_width,
                                  const int batch_size,
                                  const int channels,
                                  const int height,
                                  const int width,
                                  const int max_disp);

std::tuple<at::Tensor, at::Tensor> ROIPool_forward_cuda(const at::Tensor& input,
                                const at::Tensor& rois,
                                const float spatial_scale,
                                const int pooled_height,
                                const int pooled_width);

at::Tensor BuildCostVolume_forward_cuda(const at::Tensor& left,
                                 const at::Tensor& right,
                                 const at::Tensor& shift);

at::Tensor ROIPool_backward_cuda(const at::Tensor& grad,
                                 const at::Tensor& input,
                                 const at::Tensor& rois,
                                 const at::Tensor& argmax,
                                 const float spatial_scale,
                                 const int pooled_height,
                                 const int pooled_width,
                                 const int batch_size,
                                 const int channels,
                                 const int height,
                                 const int width);

std::tuple<at::Tensor, at::Tensor> BuildCostVolume_backward_cuda(const at::Tensor& grad,
                                 const at::Tensor& left);

at::Tensor nms_cuda(const at::Tensor boxes, float nms_overlap_thresh);


at::Tensor compute_flow_cuda(const at::Tensor& boxes,
                             const int height,
                             const int width);


================================================
FILE: disparity/csrc/nms.h
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#pragma once
#include "cpu/vision.h"

#ifdef WITH_CUDA
#include "cuda/vision.h"
#endif


at::Tensor nms(const at::Tensor& dets,
               const at::Tensor& scores,
               const float threshold) {

  if (dets.type().is_cuda()) {
#ifdef WITH_CUDA
    // TODO raise error if not compiled with CUDA
    if (dets.numel() == 0)
      return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU));
    auto b = at::cat({dets, scores.unsqueeze(1)}, 1);
    return nms_cuda(b, threshold);
#else
    AT_ERROR("Not compiled with GPU support");
#endif
  }

  at::Tensor result = nms_cpu(dets, scores, threshold);
  return result;
}


================================================
FILE: disparity/csrc/vision.cpp
================================================
// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#include "nms.h"
#include "ROIAlign.h"
#include "ROIPool.h"
#include "SigmoidFocalLoss.h"
#include "BuildCostVolume.h"

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("nms", &nms, "non-maximum suppression");
  m.def("roi_align_forward", &ROIAlign_forward, "ROIAlign_forward");
  m.def("roi_align_backward", &ROIAlign_backward, "ROIAlign_backward");
  m.def("roi_pool_forward", &ROIPool_forward, "ROIPool_forward");
  m.def("roi_pool_backward", &ROIPool_backward, "ROIPool_backward");
  m.def("sigmoid_focalloss_forward", &SigmoidFocalLoss_forward, "SigmoidFocalLoss_forward");
  m.def("sigmoid_focalloss_backward", &SigmoidFocalLoss_backward, "SigmoidFocalLoss_backward");
  m.def("build_cost_volume_forward", &BuildCostVolume_forward, "BuildCostVolume_forward");
  m.def("build_cost_volume_backward", &BuildCostVolume_backward, "BuildCostVolume_backward");
}


================================================
FILE: disparity/dataloader/DataStatistics.py
================================================
import torch
from dataloader import listflowfile as lt
from dataloader import SecenFlowLoader as DA


================================================
FILE: disparity/dataloader/KITTILoader.py
================================================
import os
import torch
import torch.utils.data as data
import torchvision.transforms as transforms
import random
from PIL import Image, ImageOps
import numpy as np
from ..utils import preprocess

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def default_loader(path):
    
    return Image.open(path).convert('RGB')


def npy_loader(path):
    return np.load(path)

def disparity_loader(path):
    return Image.open(path)


class myImageFloder(data.Dataset):
    def __init__(self, left, right, left_disparity, left_norm, training, loader=default_loader, dploader=disparity_loader):

        self.left = left
        self.right = right
        self.disp_L = left_disparity
        self.norm_L = left_norm
        self.loader = loader
        self.dploader = dploader
        self.npy_loader = npy_loader
        self.training = training

    def __getitem__(self, index):
        left = self.left[index]
        right = self.right[index]
        disp_L = self.disp_L[index]
        norm_L = self.norm_L[index]

        left_img = self.loader(left)
        right_img = self.loader(right)
        dataL = self.dploader(disp_L)
        normL = self.npy_loader(norm_L[:-3]+'npy')
        

        if self.training:
            w, h = left_img.size
            # th, tw = 320, 1152
            # th, tw = 256, 1152
            # th, tw = 311, 1178
            th, tw = 320, 1152
            # th, tw = 256, 512
            

            x1 = random.randint(0, w - tw)
            y1 = random.randint(0, h - th)

            left_img = left_img.crop((x1, y1, x1 + tw, y1 + th))
            right_img = right_img.crop((x1, y1, x1 + tw, y1 + th))

            dataL = np.ascontiguousarray(dataL, dtype=np.float32) / 256
            dataL = dataL[y1:y1 + th, x1:x1 + tw]

            
            normL = normL[y1:y1 + th, x1:x1 + tw, :]

            processed = preprocess.get_transform(augment=True)
            left_img = processed(left_img)
            right_img = processed(right_img)
            # left_img = left_img/255 - 1
            # right_img = right_img/255 - 1

            # left_img, right_img = preprocess.get_transform_unsym(left_img, right_img, [th, tw])
            # left_img, right_img = left_img-1, right_img-1

            # delta_h = np.floor(np.random.uniform(50,150))
            # delta_w = np.floor(np.random.uniform(50,200))

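            # Copy one random patch of the right image onto another random
            # location of the same image. Sizes are cast to int because
            # random.randint rejects float bounds on newer Python versions.
            delta_h = int(np.random.uniform(50, 180))
            delta_w = int(np.random.uniform(50, 250))
            # NOTE: x*_aug index rows (height) and y*_aug index columns (width).
            x1_aug = random.randint(0, th - delta_h)
            y1_aug = random.randint(0, tw - delta_w)
            x2_aug = random.randint(0, th - delta_h)
            y2_aug = random.randint(0, tw - delta_w)
            # clone() so overlapping source and destination patches do not alias.
            right_img[:, x1_aug:x1_aug + delta_h, y1_aug:y1_aug + delta_w] = \
                right_img[:, x2_aug:x2_aug + delta_h, y2_aug:y2_aug + delta_w].clone()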

            



            return [left_img.unsqueeze(0), right_img.unsqueeze(0), torch.tensor(dataL).unsqueeze(0),torch.tensor(normL)]
        else:
            w, h = left_img.size
            # left_img = left_img.crop((w - 1232, h - 368, w, h))
            # right_img = right_img.crop((w - 1232, h - 368, w, h))
            # left_img = left_img.crop((w - 1152, h - 256, w, h))
            # right_img = right_img.crop((w - 1152, h - 256, w, h))
            left_img = left_img.crop((w - 1152, h - 320, w, h))
            right_img = right_img.crop((w - 1152, h - 320, w, h))
            w1, h1 = left_img.size

            # dataL = dataL.crop((w - 1152, h - 256, w, h))
            dataL = dataL.crop((w - 1152, h - 320, w, h))
            dataL = np.ascontiguousarray(dataL, dtype=np.float32) / 256

            processed = preprocess.get_transform(augment=False)
            left_img = processed(left_img)
            right_img = processed(right_img)
            # print(left_img, right_img, dataL)

            return [left_img, right_img, dataL, dataL]

    def __len__(self):
        return len(self.left)
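
# Sketch: the `/ 256` applied to `dataL` above follows the KITTI stereo
# convention, where ground-truth disparity is stored as a 16-bit PNG holding
# disparity * 256 and a raw value of 0 marks pixels without ground truth.
# The helper name `decode_kitti_disparity` is hypothetical; it only shows how
# the scaling and the validity mask could be derived together.

```python
import numpy as np


def decode_kitti_disparity(raw):
    """Convert raw 16-bit KITTI disparity values to float disparity and a mask."""
    raw = np.asarray(raw)
    disparity = raw.astype(np.float32) / 256.0  # pixels, fractional precision 1/256
    valid = raw > 0                             # 0 means "no ground truth"
    return disparity, valid
```

# A training loss would typically be evaluated only where `valid` is True,
# since invalid pixels decode to a disparity of 0.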

================================================
FILE: disparity/dataloader/KITTI_submission_loader.py
================================================
import torch.utils.data as data

from PIL import Image
import os
import os.path
import numpy as np

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)

def dataloader(filepath):

  left_fold  = 'image_2/'
  right_fold = 'image_3/'

  # left_fold  = 'colored_0/'
  # right_fold = 'colored_1/'

  image = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]


  left_test  = [filepath+left_fold+img for img in image]
  right_test = [filepath+right_fold+img for img in image]
  


  return left_test, right_test


================================================
FILE: disparity/dataloader/KITTI_submission_loader2012.py
================================================
import torch.utils.data as data

from PIL import Image
import os
import os.path
import numpy as np

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)

def dataloader(filepath):

  left_fold  = 'colored_0/'
  right_fold = 'colored_1/'


  image = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]


  left_test  = [filepath+left_fold+img for img in image]
  right_test = [filepath+right_fold+img for img in image]

  return left_test, right_test


================================================
FILE: disparity/dataloader/KITTIloader2012.py
================================================
import torch.utils.data as data

from PIL import Image
import os
import os.path
import numpy as np
import random

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)

def dataloader(filepath, arg=False):

  left_fold  = 'colored_0/'
  right_fold = 'colored_1/'
  disp_noc   = 'disp_occ/'
  disp_norm   = 'dispnorm_occ/'

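  # Sort so the index-based validation split below is deterministic;
  # os.listdir() returns entries in arbitrary order.
  image = sorted(img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1)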
  
  valist = [1,15,39,65,101,113,134,154,175,4,16,40,66,102,118,139,156,180,5,19,52,82,104,
            119,143,157,181,9,25,56,85,105,120,145,161,186,11,29,60,89,107,122,148,167,
            188,12,31,63,95,108,128,151,170,14,32,64,97,112,132,153,171]
  # valist = []
  train = []
  val = []
  for i in range(len(image)):
    if i in valist:
      val.append(image[i])
    else:
      train.append(image[i])
  random.shuffle(train)
  
  left_train  = [filepath+left_fold+img for img in train]
  right_train = [filepath+right_fold+img for img in train]
  disp_train = [filepath+disp_noc+img for img in train]
  norm_train = [filepath+disp_norm+img for img in train]


  left_val  = [filepath+left_fold+img for img in val]
  right_val = [filepath+right_fold+img for img in val]
  disp_val = [filepath+disp_noc+img for img in val]
  norm_val = [filepath+disp_norm+img for img in val]

  return left_train, right_train, disp_train, norm_train, left_val, right_val, disp_val, norm_val


================================================
FILE: disparity/dataloader/KITTIloader2015.py
================================================
import torch.utils.data as data

from PIL import Image
import os
import os.path
import numpy as np
import random

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)

def dataloader(filepath):

  left_fold  = 'image_2/'
  right_fold = 'image_3/'
  disp_L = 'disp_occ_0/'
  disp_R = 'disp_occ_1/'
  disp_norm = 'dispnorm_occ/'

  image = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]

  all_index = np.arange(200)
  #np.random.shuffle(all_index)
  # vallist = all_index[:40]
#   val = ['{:06d}_10.png'.format(x) for x in vallist]
  

  val = []
  train = [x for x in image if x not in val]
  random.shuffle(train)


  left_train  = [filepath+left_fold+img for img in train]
  right_train = [filepath+right_fold+img for img in train]
  disp_train_L = [filepath+disp_L+img for img in train]
  disp_train_R = [filepath+disp_R+img for img in train]
  norm_train_L = [filepath+disp_norm+img for img in train]

  left_val  = [filepath+left_fold+img for img in val]
  right_val = [filepath+right_fold+img for img in val]
  disp_val_L = [filepath+disp_L+img for img in val]
  disp_val_R = [filepath+disp_R+img for img in val]
  norm_val_L = [filepath+disp_norm+img for img in val]

  return left_train, right_train, disp_train_L, norm_train_L, left_val, right_val, disp_val_L, norm_val_L


================================================
FILE: disparity/dataloader/SceneFlowLoader_demo.py
================================================
import torch.utils.data as data
import random
from PIL import Image
from . import preprocess
import numpy as np
import sys, os
sys.path.append(os.path.abspath(os.path.dirname(__file__)))
IMG_EXTENSIONS = [
    '.jpg',
    '.JPG',
    '.jpeg',
    '.JPEG',
    '.png',
    '.PNG',
    '.ppm',
    '.PPM',
    '.bmp',
    '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def default_loader(path):
    return Image.open(path).convert('RGB')


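def disparity_loader(path):
    # Cache files kept next to the source disparity:
    #   <stem>.pfm                           original disparity
    #   <stem>.npy                           raw disparity as a NumPy array
    #   <stem>_exception_assign_minus_1.npy  disparity with invalid pixels set to -1
    import os.path as ospath
    # splitext keeps dots in directory names intact (split('.')[0] would not).
    path_prefix = ospath.splitext(path)[0]
    path1 = path_prefix + '_exception_assign_minus_1.npy'
    path2 = path_prefix + '.npy'
    path3 = path_prefix + '.pfm'
    if ospath.exists(path1):
        return np.load(path1)
    if ospath.exists(path2):
        data = np.load(path2)
    else:
        from readpfm import readPFM
        data, _ = readPFM(path3)
        np.save(path2, data)
    # A left pixel at column j with disparity d matches right column j - d;
    # mark it invalid (-1) when that column falls outside the image.
    cols = np.arange(data.shape[1])
    data[cols[None, :] - data < 0] = -1
    np.save(path1, data)
    return data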


class myImageFloder(data.Dataset):
    def __init__(self,
                 left,
                 right,
                 left_disparity,
                 training,
                 normalize,
                 loader=default_loader,
                 dploader=disparity_loader):

        self.left = left
        self.right = right
        self.disp_L = left_disparity
        self.loader = loader
        self.dploader = dploader
        self.training = training
        self.normalize = normalize

    def __getitem__(self, index):
        left = self.left[index]
        right = self.right[index]
        disp_L = self.disp_L[index]

        left_img = self.loader(left)
        right_img = self.loader(right)
        dataL = self.dploader(disp_L)
        dataL = np.ascontiguousarray(dataL, dtype=np.float32)

        processed = preprocess.get_transform(
            augment=False, normalize=self.normalize)
        left_img = processed(left_img)
        right_img = processed(right_img)

        return (left_img, right_img, dataL, left,
                os.path.splitext(disp_L)[0] + '_exception_assign_minus_1.npy')

    def __len__(self):
        return len(self.left)


================================================
FILE: disparity/dataloader/SecenFlowLoader.py
================================================
import os
import torch
import torch.utils.data as data
#import torchvision.transforms as transforms
import random
from PIL import Image, ImageOps
from . import preprocess
from . import listflowfile as lt
from . import readpfm as rp
import numpy as np
import cv2
import torch.nn.functional as F

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def default_loader(path):
    image = cv2.imread(path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image


def disparity_loader(path):
    return rp.readPFM(path)

def random_replace(img,num,size):
    # Copy randomly placed patches of the image onto other random locations
    # of the same image (as in HITNet, which applies this to the right image).
    # Args:
    #     num:  number of patches to copy
    #     size: upper bound on patch width and height; each is drawn from [0, size]
    h = img.shape[0]
    w = img.shape[1]
    for i in range(num):
        size_ix = random.randint(0,size)
        size_iy = random.randint(0, size)
        x1 = random.randint(0, w - size_ix)
        y1 = random.randint(0, h - size_iy)

        x2 = random.randint(0, w - size_ix)
        y2 = random.randint(0, h - size_iy)
        #replace
        img[y1:y1 + size_iy, x1:x1 + size_ix, :] = img[y2:y2 + size_iy, x2:x2 + size_ix, :]

    return img

class myImageFloder(data.Dataset):
    def __init__(self, left, right, left_disparity,right_disparity, training, loader=default_loader, dploader=disparity_loader):

        self.left = left
        self.right = right
        self.disp_L = left_disparity
        self.loader = loader
        self.dploader = dploader
        self.training = training
        if right_disparity is not None:
            self.disp_R = right_disparity
        else:
            self.disp_R = None

        print('len', len(self.left))
    def __getitem__(self, index):
        left = self.left[index]
        right = self.right[index]
        disp_L = self.disp_L[index]
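        # Default to None so the `dataR is not None` checks below do not raise
        # when no right disparity list was supplied.
        dataR = None
        if self.disp_R is not None:
            disp_R = self.disp_R[index]
            dataR,scaleR = self.dploader(disp_R)
            dataR = np.ascontiguousarray(dataR, dtype=np.float32)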

        left_img = self.loader(left)
        right_img = self.loader(right)
        dataL, scaleL = self.dploader(disp_L)
        dataL = np.ascontiguousarray(dataL, dtype=np.float32)

        if self.training:

            h = left_img.shape[0]
            w = left_img.shape[1]

            th, tw = 320, 960

            x1 = random.randint(0, w - tw)
            y1 = random.randint(0, h - th)

            left_img = left_img[y1:y1+th,x1:x1+tw,:]
            right_img = right_img[y1:y1+th,x1:x1+tw,:]

            #left_img = random_replace(left_img,5,80)

            dataL = dataL[y1:y1 + th, x1:x1 + tw]

            if dataR is not None:
                dataR=dataR[y1:y1 + th, x1:x1 + tw]


            processed = preprocess.get_transform(augment=False)

            #random replace
            #right_img = random_replace(right_img,4,5)

            left_img_and_d= processed(image=left_img,mask=dataL,bboxes=[],category_id=[])
            left_img = left_img_and_d['image']
            dataL = left_img_and_d['mask']
            if dataR is not None:
                right_img_and_d = processed(image=right_img, mask=dataR, bboxes=[], category_id=[])
                right_img = right_img_and_d['image']
                dataR = right_img_and_d['mask']
            else:
                right_img = processed(image=right_img,mask=None,bboxes=[],category_id=[])['image']


            if dataR is not None:
                return left_img, right_img, dataL,dataR
            else:
                return left_img, right_img, dataL
        else:

            h = left_img.shape[0]
            w = left_img.shape[1]

            th, tw = 512, 960

            #x1 = random.randint(0, w - tw)
            #y1 =  random.randint(0, h - th)
            x1 = 0
            y1 = 0

            left_img = left_img[y1:y1 + th, x1:x1 + tw, :]
            right_img = right_img[y1:y1 + th, x1:x1 + tw, :]

            dataL = dataL[y1:y1 + th, x1:x1 + tw]
            if dataR is not None:
                dataR=dataR[y1:y1 + th, x1:x1 + tw]

            processed = preprocess.get_transform(augment=False)
            left_img_and_d= processed(image=left_img,mask=dataL,bboxes=[],category_id=[])
            left_img = left_img_and_d['image']
            dataL = left_img_and_d['mask']
            if dataR is not None:
                right_img_and_d = processed(image=right_img, mask=dataR, bboxes=[], category_id=[])
                right_img = right_img_and_d['image']
                dataR = right_img_and_d['mask']
            else:
                right_img = processed(image=right_img,mask=None,bboxes=[],category_id=[])['image']


            if dataR is not None:
                return left_img, right_img, dataL, dataR
            else:
                return left_img, right_img, dataL


    def __len__(self):
        return len(self.left)



================================================
FILE: disparity/dataloader/SecenFlowLoader1.py
================================================
import os
import torch
import torch.utils.data as data
import torchvision.transforms as transforms
import random
from PIL import Image, ImageOps
from . import preprocess
from . import listflowfile as lt
from . import readpfm as rp
import numpy as np

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def default_loader(path):
    return Image.open(path).convert('RGB')


def disparity_loader(path):
    return rp.readPFM(path)


class myImageFloder(data.Dataset):
    def __init__(self, left, right, left_disparity, training, loader=default_loader, dploader=disparity_loader):

        self.left = left
        self.right = right
        self.disp_L = left_disparity
        self.loader = loader
        self.dploader = dploader
        self.training = training

    def __getitem__(self, index):
        left = self.left[index]
        right = self.right[index]
        disp_L = self.disp_L[index]

        left_img = self.loader(left)
        right_img = self.loader(right)
        dataL, scaleL = self.dploader(disp_L)
        dataL = np.ascontiguousarray(dataL, dtype=np.float32)

        if self.training:
            w, h = left_img.size
            th, tw = 256, 512
            # th, tw = 544, 960

            x1 = random.randint(0, w - tw)
            y1 = random.randint(0, h - th)

            left_img = left_img.crop((x1, y1, x1 + tw, y1 + th))
            right_img = right_img.crop((x1, y1, x1 + tw, y1 + th))

            dataL = dataL[y1:y1 + th, x1:x1 + tw]

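            # get_transform() returns an albumentations pipeline, which takes a
            # NumPy array passed by keyword and returns a dict.
            processed = preprocess.get_transform(augment=False)
            left_img = processed(image=np.array(left_img))['image']
            right_img = processed(image=np.array(right_img))['image']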

            return left_img, right_img, dataL
        else:
            w, h = left_img.size
            left_img = left_img.crop((w - 960, h - 544, w, h))
            right_img = right_img.crop((w - 960, h - 544, w, h))
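            # get_transform() returns an albumentations pipeline, which takes a
            # NumPy array passed by keyword and returns a dict.
            processed = preprocess.get_transform(augment=False)
            left_img = processed(image=np.array(left_img))['image']
            right_img = processed(image=np.array(right_img))['image']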

            return left_img, right_img, dataL

    def __len__(self):
        return len(self.left)


================================================
FILE: disparity/dataloader/SecenFlowLoaderfix.py
================================================
import torch.utils.data as data
import random
from PIL import Image
from . import preprocess
# import preprocess
import numpy as np
import sys, os
sys.path.append(os.path.abspath(os.path.dirname(__file__)))
IMG_EXTENSIONS = [
    '.jpg',
    '.JPG',
    '.jpeg',
    '.JPEG',
    '.png',
    '.PNG',
    '.ppm',
    '.PPM',
    '.bmp',
    '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def default_loader(path):
    return Image.open(path).convert('RGB')


# def disparity_loader(path):
#     path_prefix = path.split('.')[0]
#     # print(path_prefix)
#     path1 = path_prefix + '_exception_assign_minus_1.npy'
#     path2 = path_prefix + '.npy'
#     path3 = path_prefix + '.pfm'
#     import os.path as ospath
#     if ospath.exists(path1):
#         return np.load(path1)
#     else:
#         if ospath.exists(path2):
#             data = np.load(path2)
#         else:
#             # from readpfm import readPFMreadPFM
#             from readpfm import readPFM
#             data, _ = readPFM(path3)
#             np.save(path2, data)
#         for i in range(data.shape[0]):
#             for j in range(data.shape[1]):
#                 if j - data[i][j] < 0:
#                     data[i][j] = -1
#         np.save(path1, data)
#         return data


def disparity_loader(path):
    """Load a disparity map, caching a copy with out-of-range pixels set to -1."""
    # splitext strips only the final extension, so dots in directory names are safe.
    path_prefix = os.path.splitext(path)[0]
    path1 = path_prefix + '_exception_assign_minus_1.npy'
    path2 = path_prefix + '.npy'
    path3 = path_prefix + '.pfm'
    if os.path.exists(path1):
        return np.load(path1)

    from readpfm import readPFM
    data, _ = readPFM(path3)
    np.save(path2, data)
    # Mark pixels whose disparity exceeds their column index (j - d < 0) as -1.
    # The comparison broadcasts the column indices across every row.
    data[data > np.arange(data.shape[1])] = -1
    np.save(path1, data)
    return data


class myImageFloder(data.Dataset):
    def __init__(self,
                 left,
                 right,
                 left_disparity,
                 right_disparity,
                 training,
                 normalize,
                 loader=default_loader,
                 dploader=disparity_loader):

        self.left = left
        self.right = right
        self.disp_L = left_disparity
        self.disp_R = right_disparity
        self.loader = loader
        self.dploader = dploader
        self.training = training
        self.normalize = normalize

    def __getitem__(self, index):
        
        left = self.left[index]
        
        right = self.right[index]
        disp_L = self.disp_L[index]
        disp_R = self.disp_R[index]
        left_img = self.loader(left)
        right_img = self.loader(right)
        dataL = self.dploader(disp_L)
        dataR = self.dploader(disp_R)
        
        dataL = np.ascontiguousarray(dataL, dtype=np.float32)
        dataR = np.ascontiguousarray(dataR, dtype=np.float32)

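        # preprocess.get_transform() accepts only `augment` and returns an
        # albumentations pipeline, which needs a NumPy array passed by keyword.
        # self.normalize is kept on the dataset, but normalization itself is
        # commented out in preprocess.py.
        processed = preprocess.get_transform(augment=False)
        left_img = processed(image=np.array(left_img))['image']
        right_img = processed(image=np.array(right_img))['image']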


        return left_img, right_img, dataL, dataR

    def __len__(self):
        return len(self.left)
if __name__ == '__main__':
    path = '/media/lxy/sdd1/stereo_coderesource/dataset_nie/SceneFlowData/frames_cleanpass/flyingthings3d_disparity/TRAIN/A/0024/left/0011.pfm'
    res = disparity_loader(path)
    print(res.shape)


================================================
FILE: disparity/dataloader/Testloader.py
================================================

import os
import torch
import torch.utils.data as data
import torchvision.transforms as transforms
import random
from PIL import Image, ImageOps
import numpy as np
#from dataloader.preprocess import preprocess
import dataloader.preprocess as preprocess

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def default_loader(path):
    return Image.open(path).convert('RGB')


def disparity_loader(path):
    return Image.open(path)



def dataloader(filepath):

  left_fold  = 'left/'
  right_fold = 'right/'


  left_test= [img for img in os.listdir(filepath+left_fold) if img.find('_left') > -1]
  left_test.sort()
  right_test= [img for img in os.listdir(filepath+right_fold) if img.find('_right') > -1]
  right_test.sort()

  left_test = [filepath+left_fold+img for img in left_test]
  right_test = [filepath+right_fold+img for img in right_test]

  return left_test, right_test

class myImageFloder(data.Dataset):
    def __init__(self, left, right, loader=default_loader):

        self.left = left
        self.right = right
        self.loader = loader


    def __getitem__(self, index):
        left = self.left[index]
        right = self.right[index]
        print('left',index,left)
        print('right',index,right)


        left_img = self.loader(left)
        right_img = self.loader(right)


        #test   not for training
        w, h = left_img.size

        left_img = left_img.crop((w - 992, h - 736, w, h))
        right_img = right_img.crop((w - 992, h - 736, w, h))
        # left_img = left_img.crop((w - 1232, h - 368, w, h))
        # right_img = right_img.crop((w - 1232, h - 368, w, h))
        w1, h1 = left_img.size

        #dataL = dataL.crop((w - 1232, h - 368, w, h))


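        # preprocess.get_transform() accepts only `augment` and returns an
        # albumentations pipeline, which needs a NumPy array passed by keyword.
        processed = preprocess.get_transform(augment=False)
        left_img = processed(image=np.array(left_img))['image']
        right_img = processed(image=np.array(right_img))['image']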

        return left_img, right_img

    def __len__(self):
        return len(self.left)

if __name__ == '__main__':
    left,right=dataloader('/disk1/hyj/test_picture/819_testpic/')
    print(left)
    print(len(left))
    print(right)
    print(len(right))

================================================
FILE: disparity/dataloader/__init__.py
================================================


================================================
FILE: disparity/dataloader/listflowfile.py
================================================
import torch.utils.data as data

from PIL import Image
import os
import os.path

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename):
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)


def dataloader(filepath):
    filepath += '/'
    classes = [d for d in os.listdir(filepath) if os.path.isdir(os.path.join(filepath, d))]
    image = [img for img in classes if img.find('frames_cleanpass') > -1]
    disp = [dsp for dsp in classes if dsp.find('disparity') > -1]
    print(classes)
    print('img',image)
    print('disp', disp)
    monkaa_path = filepath + [x for x in image if 'monkaa' in x][0]
    monkaa_disp = filepath + [x for x in disp if 'monkaa' in x][0]

    monkaa_dir = os.listdir(monkaa_path)

    all_left_img = []
    all_right_img = []
    all_left_disp = []
    all_right_disp = []
    test_left_img = []
    test_right_img = []
    test_left_disp = []

    for dd in monkaa_dir:
        for im in os.listdir(monkaa_path + '/' + dd + '/left/'):
            if is_image_file(monkaa_path + '/' + dd + '/left/' + im):
                all_left_img.append(monkaa_path + '/' + dd + '/left/' + im)
                all_left_disp.append(monkaa_disp + '/' + dd + '/left/' + im.split(".")[0] + '.pfm')
                all_right_disp.append(monkaa_disp + '/' + dd + '/right/' + im.split(".")[0] + '.pfm')

        for im in os.listdir(monkaa_path + '/' + dd + '/right/'):
            if is_image_file(monkaa_path + '/' + dd + '/right/' + im):
                all_right_img.append(monkaa_path + '/' + dd + '/right/' + im)

    flying_path = filepath + [x for x in image if x == 'frames_cleanpass'][0]
    flying_disp = filepath + [x for x in disp if x == 'frames_disparity'][0]
    flying_dir = flying_path + '/TRAIN/'
    subdir = ['A', 'B', 'C']

    for ss in subdir:
        flying = os.listdir(flying_dir + ss)

        for ff in flying:
            imm_l = os.listdir(flying_dir + ss + '/' + ff + '/left/')
            for im in imm_l:
                # Guard all four appends with one check so the lists stay
                # index-aligned even if the directory holds a non-image file.
                if is_image_file(flying_dir + ss + '/' + ff + '/left/' + im):
                    all_left_img.append(flying_dir + ss + '/' + ff + '/left/' + im)
                    all_right_img.append(flying_dir + ss + '/' + ff + '/right/' + im)
                    all_left_disp.append(flying_disp + '/TRAIN/' + ss + '/' + ff + '/left/' + im.split(".")[0] + '.pfm')
                    all_right_disp.append(flying_disp + '/TRAIN/' + ss + '/' + ff + '/right/' + im.split(".")[0] + '.pfm')

    flying_dir = flying_path + '/TEST/'

    subdir = ['A', 'B', 'C']
    # subdir = ['C']
    # print('*****************')

    for ss in subdir:
        flying = os.listdir(flying_dir + ss)

        for ff in flying:
            imm_l = os.listdir(flying_dir + ss + '/' + ff + '/left/')
            for im in imm_l:
                # Guard all three appends with one check so the lists stay
                # index-aligned even if the directory holds a non-image file.
                if is_image_file(flying_dir + ss + '/' + ff + '/left/' + im):
                    test_left_img.append(flying_dir + ss + '/' + ff + '/left/' + im)
                    test_right_img.append(flying_dir + ss + '/' + ff + '/right/' + im)
                    test_left_disp.append(flying_disp + '/TEST/' + ss + '/' + ff + '/left/' + im.split(".")[0] + '.pfm')

    driving_dir = filepath + [x for x in image if 'driving' in x][0] + '/'
    driving_disp = filepath + [x for x in disp if 'driving' in x][0]

    subdir1 = ['35mm_focallength', '15mm_focallength']
    subdir2 = ['scene_backwards', 'scene_forwards']
    subdir3 = ['fast', 'slow']

    for i in subdir1:
        for j in subdir2:
            for k in subdir3:
                imm_l = os.listdir(driving_dir + i + '/' + j + '/' + k + '/left/')
                for im in imm_l:
                    # Guard all four appends with one check so the lists stay
                    # index-aligned even if the directory holds a non-image file.
                    if is_image_file(driving_dir + i + '/' + j + '/' + k + '/left/' + im):
                        all_left_img.append(driving_dir + i + '/' + j + '/' + k + '/left/' + im)
                        all_right_img.append(driving_dir + i + '/' + j + '/' + k + '/right/' + im)
                        all_left_disp.append(
                            driving_disp + '/' + i + '/' + j + '/' + k + '/left/' + im.split(".")[0] + '.pfm')
                        all_right_disp.append(
                            driving_disp + '/' + i + '/' + j + '/' + k + '/right/' + im.split(".")[0] + '.pfm')

    return all_left_img, all_right_img, all_left_disp,all_right_disp, test_left_img, test_right_img, test_left_disp


================================================
FILE: disparity/dataloader/listflowfilefix.py
================================================
import torch.utils.data as data

from PIL import Image
import os
import os.path

IMG_EXTENSIONS = [
    '.jpg', '.JPG', '.jpeg', '.JPEG',
    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',
]


def is_image_file(filename): 
    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)

def dataloader(filepath):  # e.g. /media/hugonie/Hhome/dataset/SceneFlowData/
    # classes = [d for d in os.listdir(filepath) if os.path.isdir(os.path.join(filepath, d))]
    # image = [img for img in classes if img.find('frames_cleanpass') > -1]
    # disp  = [dsp for dsp in classes if dsp.find('disparity') > -1]

    all_left_img = []
    all_right_img = []
    all_left_disp = []
    all_right_disp = []
    test_left_img = []
    test_right_img = []
    test_left_disp = []
    test_right_disp = []

    # monkaa
    # monkaa_path = filepath + [x for x in image if 'monkaa' in x][0]
    # monkaa_disp = filepath + [x for x in disp if 'monkaa' in x][0]
    monkaa_path = filepath + '/frames_cleanpass/monkaa'
    monkaa_disp = filepath + '/disparity/monkaa'
    monkaa_dir = os.listdir(monkaa_path)

    for dd in monkaa_dir:
        for im in os.listdir(monkaa_path + '/' + dd + '/left/'):
            if is_image_file(monkaa_path + '/' + dd + '/left/' + im):
                all_left_img.append(monkaa_path + '/' + dd + '/left/' + im)
                all_left_disp.append(monkaa_disp + '/' + dd + '/left/' + im.split(".")[0] + '.pfm')
                all_right_disp.append(monkaa_disp + '/' + dd + '/right/' + im.split(".")[0] + '.pfm')

        for im in os.listdir(monkaa_path + '/' + dd + '/right/'):
            if is_image_file(monkaa_path + '/' + dd + '/right/' + im):
                all_right_img.append(monkaa_path + '/' + dd + '/right/' + im)

    # flyingthings
    # flying_path = filepath + [x for x in image if x == 'flyingthings3D'][0]
    # flying_disp = filepath + [x for x in disp if x == 'flyingthings3D'][0]
    flying_path = filepath + '/frames_cleanpass/flyingthings3D'
    flying_disp = filepath + '/disparity/flyingthings3D'
    subdir = ['A', 'B', 'C']

    flying_dir = flying_path + '/TRAIN/'
    for ss in subdir:
        for ff in os.listdir(flying_dir + ss):
            seq = ss + '/' + ff
            for im in os.listdir(flying_dir + seq + '/left/'):
                # One check guards all four appends so the lists stay
                # index-aligned even if the directory holds a non-image file.
                if is_image_file(im):
                    pfm = im.split(".")[0] + '.pfm'
                    all_left_img.append(flying_dir + seq + '/left/' + im)
                    all_right_img.append(flying_dir + seq + '/right/' + im)
                    all_left_disp.append(flying_disp + '/TRAIN/' + seq + '/left/' + pfm)
                    all_right_disp.append(flying_disp + '/TRAIN/' + seq + '/right/' + pfm)

    flying_dir = flying_path + '/TEST/'
    for ss in subdir:
        for ff in os.listdir(flying_dir + ss):
            seq = ss + '/' + ff
            for im in os.listdir(flying_dir + seq + '/left/'):
                if is_image_file(im):
                    pfm = im.split(".")[0] + '.pfm'
                    test_left_img.append(flying_dir + seq + '/left/' + im)
                    test_right_img.append(flying_dir + seq + '/right/' + im)
                    test_left_disp.append(flying_disp + '/TEST/' + seq + '/left/' + pfm)
                    test_right_disp.append(flying_disp + '/TEST/' + seq + '/right/' + pfm)

    # driving
    # driving_dir = filepath + [x for x in image if 'driving' in x][0] + '/'
    # driving_disp = filepath + [x for x in disp if 'driving' in x][0]
    driving_dir = filepath + '/frames_cleanpass/driving/'
    driving_disp = filepath + '/disparity/driving'

    subdir1 = ['15mm_focallength', '35mm_focallength']
    subdir2 = ['scene_backwards', 'scene_forwards']
    subdir3 = ['fast', 'slow']

    for i in subdir1:
        for j in subdir2:
            for k in subdir3:
                seq = i + '/' + j + '/' + k
                for im in os.listdir(driving_dir + seq + '/left/'):
                    if is_image_file(im):
                        pfm = im.split(".")[0] + '.pfm'
                        all_left_img.append(driving_dir + seq + '/left/' + im)
                        all_right_img.append(driving_dir + seq + '/right/' + im)
                        all_left_disp.append(driving_disp + '/' + seq + '/left/' + pfm)
                        all_right_disp.append(driving_disp + '/' + seq + '/right/' + pfm)

    return all_left_img, all_right_img, all_left_disp, all_right_disp, test_left_img, test_right_img, test_left_disp, test_right_disp

================================================
FILE: disparity/dataloader/preprocess.py
================================================
import torch
#import torchvision.transforms as transforms
import random

import cv2
import albumentations as A
from albumentations.pytorch import ToTensorV2

__imagenet_stats = {'mean': [0.485, 0.456, 0.406],
                   'std': [0.229, 0.224, 0.225]}

#__imagenet_stats = {'mean': [0.5, 0.5, 0.5],
#                   'std': [0.5, 0.5, 0.5]}

__imagenet_pca = {
    'eigval': torch.Tensor([0.2175, 0.0188, 0.0045]),
    'eigvec': torch.Tensor([
        [-0.5675,  0.7192,  0.4009],
        [-0.5808, -0.0045, -0.8140],
        [-0.5836, -0.6948,  0.4203],
    ])
}



def totensor_normalize():

    return A.Compose([
        # A.Normalize(
        #     mean=[0.485, 0.456, 0.406],
        #     std=[0.229, 0.224, 0.225]),
        ToTensorV2(always_apply=True)
    ],p=1)



def augmentv1():
    photometric  = [
        A.Blur(p=0.5),
        A.HueSaturationValue(20,30,20,p=0.5),
        A.RandomBrightnessContrast(0.2,p=0.5),
        A.RandomGamma(p=0.5),
        #A.ISONoise(p=1),
        A.GaussNoise(p=0.5),
        # A.Normalize(
        #     mean=[0.485, 0.456, 0.406],
        #     std=[0.229, 0.224, 0.225],
        # ),
        ToTensorV2()
    ]

    geometric = [
        # A.OpticalDistortion(distort_limit=0.3, shift_limit=0.3,p=1)
        A.ShiftScaleRotate(shift_limit=0.01,scale_limit=0.01,rotate_limit=5,p=0.5)
        #A.ShiftScaleRotate(shift_limit=0.3, scale_limit=0.3, rotate_limit=30, p=0.5)
    ]

    return A.Compose(photometric)



def get_transform(augment=True):
    if augment:
        return augmentv1()
    else:
        return totensor_normalize()








================================================
FILE: disparity/dataloader/readpfm.py
================================================
import re
import numpy as np
import sys


def readPFM(file):
    """Read a PFM file and return (data, scale), with rows in top-down order."""
    # The context manager closes the file even if a header check raises.
    with open(file, 'rb') as f:
        header = f.readline().rstrip()
        if header == b'PF':
            color = True
        elif header == b'Pf':
            color = False
        else:
            raise Exception('Not a PFM file.')

        dim_match = re.match(r'^(\d+)\s(\d+)\s$', f.readline().decode('utf-8'))
        if dim_match:
            width, height = map(int, dim_match.groups())
        else:
            raise Exception('Malformed PFM header.')

        scale = float(f.readline().rstrip())
        if scale < 0:  # little-endian
            endian = '<'
            scale = -scale
        else:
            endian = '>'  # big-endian

        data = np.fromfile(f, endian + 'f')

    shape = (height, width, 3) if color else (height, width)

    # PFM stores rows bottom-to-top, so flip vertically after reshaping.
    data = np.reshape(data, shape)
    data = np.flipud(data)
    return data, scale


================================================
FILE: disparity/eval/__init__.py
================================================


================================================
FILE: disparity/eval/kitti/README.md
================================================
Reference: <a href="https://github.com/prclibo/kitti_eval" target="_blank">https://github.com/prclibo/kitti_eval</a>

# kitti_eval

`evaluate_object_3d_offline.cpp` evaluates your KITTI detections locally, using validation data that you select from the KITTI training set, with the following metrics:

- overlap on image (AP)
- oriented overlap on image (AOS)
- overlap on ground-plane (AP)
- overlap in 3D (AP)
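
The "overlap" in these metrics is an intersection-over-union score between a detection and a ground-truth box. As a rough illustration of the 2D image case, here is a minimal Python sketch that follows the default union criterion of `imageBoxOverlap` in `evaluate_object_3d_offline.cpp`. The function name `image_box_iou` and the `(x1, y1, x2, y2)` tuple layout are assumptions made for this example and are not part of the evaluator.

```python
def image_box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Width and height of the overlapping rectangle.
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])

    # Boxes that do not overlap score zero.
    if w <= 0 or h <= 0:
        return 0.0

    inter = w * h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])

    # Union = sum of both areas minus the part counted twice.
    return inter / (area_a + area_b - inter)
```

A detection counts as a match only when this score reaches the per-class threshold in `MIN_OVERLAP`.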

Compile `evaluate_object_3d_offline.cpp`; it depends on Boost and the Linux header `dirent.h` (already present on most Linux systems).

Run the evaluation with:

    ./evaluate_object_3d_offline groundtruth_dir result_dir
    
Note that you don't have to detect over all KITTI training data. The evaluator only evaluates samples whose result files exist.
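
To check in advance which frames would be scored, you could intersect the file names in the two directories. The sketch below is only an illustration: the helper name `frames_to_evaluate` is hypothetical, and it assumes both directories directly contain one `.txt` file per frame, which may not match the layout the evaluator expects for `result_dir`.

```python
from pathlib import Path


def frames_to_evaluate(groundtruth_dir, result_dir):
    """Return the sorted frame ids that have both a label and a result file."""
    # Frame id = file name without the .txt extension (assumed naming scheme).
    gt_ids = {p.stem for p in Path(groundtruth_dir).glob("*.txt")}
    result_ids = {p.stem for p in Path(result_dir).glob("*.txt")}

    # Only frames present in both directories can be compared.
    return sorted(gt_ids & result_ids)
```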


### Updates

- June, 2017:
  * Fixed the bug of detection box filtering based on min height according to KITTI's note on 25.04.2017.


================================================
FILE: disparity/eval/kitti/compile.sh
================================================
#!/bin/bash
g++ -o evaluate_object_3d_offline evaluate_object_3d_offline.cpp


================================================
FILE: disparity/eval/kitti/eval.sh
================================================
echo "evaluating $1 ..."

./evaluate_object_3d_offline /mnt/backup/project/ylchen/dataset/KITTI_DATASET/kitti_detection/training/label_2 $1


================================================
FILE: disparity/eval/kitti/eval_05.sh
================================================
echo "evaluating $1 ..."

./evaluate_object_3d_offline_05 ../../../data/kitti/training/label_2/ $1


================================================
FILE: disparity/eval/kitti/evaluate_object_3d_offline.cpp
================================================
#include <iostream>
#include <algorithm>
#include <stdio.h>
#include <math.h>
#include <vector>
#include <numeric>
#include <strings.h>
#include <assert.h>

#include <dirent.h>

#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>

#include <boost/geometry.hpp>
#include <boost/geometry/geometries/point_xy.hpp>
#include <boost/geometry/geometries/polygon.hpp>
#include <boost/geometry/geometries/adapted/c_array.hpp>

#include "mail.h"

BOOST_GEOMETRY_REGISTER_C_ARRAY_CS(cs::cartesian)

typedef boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> > Polygon;


using namespace std;

/*=======================================================================
STATIC EVALUATION PARAMETERS
=======================================================================*/

// holds the number of test images on the server
const int32_t N_TESTIMAGES = 7518;

// easy, moderate and hard evaluation level
enum DIFFICULTY{EASY=0, MODERATE=1, HARD=2};

// evaluation metrics: image, ground or 3D
enum METRIC{IMAGE=0, GROUND=1, BOX3D=2};

// evaluation parameter
const int32_t MIN_HEIGHT[3]     = {40, 25, 25};     // minimum height for evaluated groundtruth/detections
const int32_t MAX_OCCLUSION[3]  = {0, 1, 2};        // maximum occlusion level of the groundtruth used for evaluation
const double  MAX_TRUNCATION[3] = {0.15, 0.3, 0.5}; // maximum truncation level of the groundtruth used for evaluation

// evaluated object classes
enum CLASSES{CAR=0, PEDESTRIAN=1, CYCLIST=2};
const int NUM_CLASS = 3;

// parameters varying per class
vector<string> CLASS_NAMES;
// the minimum overlap required for 2D evaluation on the image/ground plane and 3D evaluation
//const double MIN_OVERLAP[3][3] = {{0.7, 0.5, 0.5}, {0.25, 0.25, 0.25}, {0.25, 0.25, 0.25}};
//const double MIN_OVERLAP[3][3] = {{0.7, 0.5, 0.5}, {0.5, 0.25, 0.25}, {0.5, 0.25, 0.25}};
const double MIN_OVERLAP[3][3] = {{0.7, 0.5, 0.5}, {0.7, 0.5, 0.5}, {0.7, 0.5, 0.5}};

// no. of recall steps that should be evaluated (discretized)
const double N_SAMPLE_PTS = 41;


// initialize class names
void initGlobals () {
  CLASS_NAMES.push_back("car");
  CLASS_NAMES.push_back("pedestrian");
  CLASS_NAMES.push_back("cyclist");
}

/*=======================================================================
DATA TYPES FOR EVALUATION
=======================================================================*/

// holding data needed for precision-recall and precision-aos
struct tPrData {
  vector<double> v;           // detection score for computing score thresholds
  double         similarity;  // orientation similarity
  int32_t        tp;          // true positives
  int32_t        fp;          // false positives
  int32_t        fn;          // false negatives
  tPrData () :
    similarity(0), tp(0), fp(0), fn(0) {}
};

// holding bounding boxes for ground truth and detections
struct tBox {
  string  type;     // object type as car, pedestrian or cyclist,...
  double   x1;      // left corner
  double   y1;      // top corner
  double   x2;      // right corner
  double   y2;      // bottom corner
  double   alpha;   // image orientation
  tBox (string type, double x1,double y1,double x2,double y2,double alpha) :
    type(type),x1(x1),y1(y1),x2(x2),y2(y2),alpha(alpha) {}
};

// holding ground truth data
struct tGroundtruth {
  tBox    box;        // object type, box, orientation
  double  truncation; // truncation 0..1
  int32_t occlusion;  // occlusion 0,1,2 (none, partly, fully)
  double ry;
  double  t1, t2, t3;
  double h, w, l;
  tGroundtruth () :
    box(tBox("invalid",-1,-1,-1,-1,-10)),truncation(-1),occlusion(-1) {}
  tGroundtruth (tBox box,double truncation,int32_t occlusion) :
    box(box),truncation(truncation),occlusion(occlusion) {}
  tGroundtruth (string type,double x1,double y1,double x2,double y2,double alpha,double truncation,int32_t occlusion) :
    box(tBox(type,x1,y1,x2,y2,alpha)),truncation(truncation),occlusion(occlusion) {}
};

// holding detection data
struct tDetection {
  tBox    box;    // object type, box, orientation
  double  thresh; // detection score
  double  ry;
  double  t1, t2, t3;
  double  h, w, l;
  tDetection ():
    box(tBox("invalid",-1,-1,-1,-1,-10)),thresh(-1000) {}
  tDetection (tBox box,double thresh) :
    box(box),thresh(thresh) {}
  tDetection (string type,double x1,double y1,double x2,double y2,double alpha,double thresh) :
    box(tBox(type,x1,y1,x2,y2,alpha)),thresh(thresh) {}
};


/*=======================================================================
FUNCTIONS TO LOAD DETECTION AND GROUND TRUTH DATA ONCE, SAVE RESULTS
=======================================================================*/
vector<int32_t> indices;

vector<tDetection> loadDetections(string file_name, bool &compute_aos,
        vector<bool> &eval_image, vector<bool> &eval_ground,
        vector<bool> &eval_3d, bool &success) {

  // holds all detections (ignored detections are indicated by an index vector)
  vector<tDetection> detections;
  FILE *fp = fopen(file_name.c_str(),"r");
  if (!fp) {
    success = false;
    return detections;
  }
  while (!feof(fp)) {
    tDetection d;
    double trash;
    char str[255];
    // %254s bounds the read so a long token cannot overflow str[255]
    if (fscanf(fp, "%254s %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf",
                   str, &trash, &trash, &d.box.alpha, &d.box.x1, &d.box.y1,
                   &d.box.x2, &d.box.y2, &d.h, &d.w, &d.l, &d.t1, &d.t2, &d.t3,
                   &d.ry, &d.thresh)==16) {

        // d.thresh = 1;
      d.box.type = str;
      detections.push_back(d);

      // orientation=-10 is invalid, AOS is not evaluated if at least one orientation is invalid
      if(d.box.alpha == -10)
        compute_aos = false;

      // a class is only evaluated if it is detected at least once
      for (int c = 0; c < NUM_CLASS; c++) {
        if (!strcasecmp(d.box.type.c_str(), CLASS_NAMES[c].c_str())) {
          if (!eval_image[c] && d.box.x1 >= 0)
            eval_image[c] = true;
          if (!eval_ground[c] && d.t1 != -1000)
            eval_ground[c] = true;
          if (!eval_3d[c] && d.t2 != -1000)
            eval_3d[c] = true;
          break;
        }
      }
    }
  }
  fclose(fp);
  success = true;
  return detections;
}

vector<tGroundtruth> loadGroundtruth(string file_name,bool &success) {

  // holds all ground truth (ignored ground truth is indicated by an index vector)
  vector<tGroundtruth> groundtruth;
  FILE *fp = fopen(file_name.c_str(),"r");
  if (!fp) {
    success = false;
    return groundtruth;
  }
  while (!feof(fp)) {
    tGroundtruth g;
    char str[255];
    // %254s bounds the read so a long token cannot overflow str[255]
    if (fscanf(fp, "%254s %lf %d %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf",
                   str, &g.truncation, &g.occlusion, &g.box.alpha,
                   &g.box.x1,   &g.box.y1,     &g.box.x2,    &g.box.y2,
                   &g.h,      &g.w,        &g.l,       &g.t1,
                   &g.t2,      &g.t3,        &g.ry )==15) {
      g.box.type = str;
      groundtruth.push_back(g);
    }
  }
  fclose(fp);
  success = true;
  return groundtruth;
}

void saveStats (const vector<double> &precision, const vector<double> &aos, FILE *fp_det, FILE *fp_ori) {

  // save precision to file
  if(precision.empty())
    return;
  for (int32_t i=0; i<precision.size(); i++)
    fprintf(fp_det,"%f ",precision[i]);
  fprintf(fp_det,"\n");

  // save orientation similarity, only if there were no invalid orientation entries in submission (alpha=-10)
  if(aos.empty())
    return;
  for (int32_t i=0; i<aos.size(); i++)
    fprintf(fp_ori,"%f ",aos[i]);
  fprintf(fp_ori,"\n");
}

/*=======================================================================
EVALUATION HELPER FUNCTIONS
=======================================================================*/

// criterion defines whether the overlap is computed with respect to both areas (ground truth and detection)
// or with respect to box a or b (detection and "dontcare" areas)
inline double imageBoxOverlap(tBox a, tBox b, int32_t criterion=-1){

  // overlap is invalid in the beginning
  double o = -1;

  // get overlapping area
  double x1 = max(a.x1, b.x1);
  double y1 = max(a.y1, b.y1);
  double x2 = min(a.x2, b.x2);
  double y2 = min(a.y2, b.y2);

  // compute width and height of overlapping area
  double w = x2-x1;
  double h = y2-y1;

  // set invalid entries to 0 overlap
  if(w<=0 || h<=0)
    return 0;

  // get overlapping areas
  double inter = w*h;
  double a_area = (a.x2-a.x1) * (a.y2-a.y1);
  double b_area = (b.x2-b.x1) * (b.y2-b.y1);

  // intersection over union overlap depending on users choice
  if(criterion==-1)     // union
    o = inter / (a_area+b_area-inter);
  else if(criterion==0) // bbox_a
    o = inter / a_area;
  else if(criterion==1) // bbox_b
    o = inter / b_area;

  // overlap
  return o;
}

inline double imageBoxOverlap(tDetection a, tGroundtruth b, int32_t criterion=-1){
  return imageBoxOverlap(a.box, b.box, criterion);
}

// compute polygon of an oriented bounding box
template <typename T>
Polygon toPolygon(const T& g) {
    using namespace boost::numeric::ublas;
    using namespace boost::geometry;
    matrix<double> mref(2, 2);
    mref(0, 0) = cos(g.ry); mref(0, 1) = sin(g.ry);
    mref(1, 0) = -sin(g.ry); mref(1, 1) = cos(g.ry);

    static int count = 0;
    matrix<double> corners(2, 4);
    double data[] = {g.l / 2, g.l / 2, -g.l / 2, -g.l / 2,
                     g.w / 2, -g.w / 2, -g.w / 2, g.w / 2};
    std::copy(data, data + 8, corners.data().begin());
    matrix<double> gc = prod(mref, corners);
    for (int i = 0; i < 4; ++i) {
        gc(0, i) += g.t1;
        gc(1, i) += g.t3;
    }

    double points[][2] = {{gc(0, 0), gc(1, 0)},{gc(0, 1), gc(1, 1)},{gc(0, 2), gc(1, 2)},{gc(0, 3), gc(1, 3)},{gc(0, 0), gc(1, 0)}};
    Polygon poly;
    append(poly, points);
    return poly;
}

// measure overlap between bird's eye view bounding boxes, parametrized by (ry, l, w, tx, tz)
inline double groundBoxOverlap(tDetection d, tGroundtruth g, int32_t criterion = -1) {
    using namespace boost::geometry;
    Polygon gp = toPolygon(g);
    Polygon dp = toPolygon(d);

    std::vector<Polygon> in, un;
    intersection(gp, dp, in);
    union_(gp, dp, un);

    double inter_area = in.empty() ? 0 : area(in.front());
    double union_area = area(un.front());
    double o = -1; // stays invalid if criterion is not -1, 0 or 1
    if(criterion==-1)     // union
        o = inter_area / union_area;
    else if(criterion==0) // bbox_a
        o = inter_area / area(dp);
    else if(criterion==1) // bbox_b
        o = inter_area / area(gp);

    return o;
}

// measure overlap between 3D bounding boxes, parametrized by (ry, h, w, l, tx, ty, tz)
inline double box3DOverlap(tDetection d, tGroundtruth g, int32_t criterion = -1) {
    using namespace boost::geometry;
    Polygon gp = toPolygon(g);
    Polygon dp = toPolygon(d);

    std::vector<Polygon> in, un;
    intersection(gp, dp, in);
    union_(gp, dp, un);

    double ymax = min(d.t2, g.t2);
    double ymin = max(d.t2 - d.h, g.t2 - g.h);

    double inter_area = in.empty() ? 0 : area(in.front());
    double inter_vol = inter_area * max(0.0, ymax - ymin);

    double det_vol = d.h * d.l * d.w;
    double gt_vol = g.h * g.l * g.w;

    double o = -1; // stays invalid if criterion is not -1, 0 or 1
    if(criterion==-1)     // union
        o = inter_vol / (det_vol + gt_vol - inter_vol);
    else if(criterion==0) // bbox_a
        o = inter_vol / det_vol;
    else if(criterion==1) // bbox_b
        o = inter_vol / gt_vol;

    return o;
}

vector<double> getThresholds(vector<double> &v, double n_groundtruth){

  // holds scores needed to compute N_SAMPLE_PTS recall values
  vector<double> t;

  // sort scores in descending order
  // (highest score is assumed to give best/most confident detections)
  sort(v.begin(), v.end(), greater<double>());

  // get scores for linearly spaced recall
  double current_recall = 0;
  for(int32_t i=0; i<v.size(); i++){

    // check if right-hand-side recall with respect to current recall is closer than left-hand-side one
    // in this case, skip the current detection score
    double l_recall, r_recall, recall;
    l_recall = (double)(i+1)/n_groundtruth;
    if(i<(v.size()-1))
      r_recall = (double)(i+2)/n_groundtruth;
    else
      r_recall = l_recall;

    if( (r_recall-current_recall) < (current_recall-l_recall) && i<(v.size()-1))
      continue;

    // left recall is the best approximation, so use this and goto next recall step for approximation
    recall = l_recall;

    // the next recall step was reached
    t.push_back(v[i]);
    //printf("%.8f\n", v[i]);
    current_recall += 1.0/(N_SAMPLE_PTS-1.0);
  }
  return t;
}

void cleanData(CLASSES current_class, const vector<tGroundtruth> &gt, const vector<tDetection> &det, vector<int32_t> &ignored_gt, vector<tGroundtruth> &dc, vector<int32_t> &ignored_det, int32_t &n_gt, DIFFICULTY difficulty){

  // extract ground truth bounding boxes for current evaluation class
  for(int32_t i=0;i<gt.size(); i++){

    // only bounding boxes with a minimum height are used for evaluation
    double height = gt[i].box.y2 - gt[i].box.y1;

    // neighboring classes are ignored ("van" for "car" and "person_sitting" for "pedestrian")
    // (lower/upper cases are ignored)
    int32_t valid_class;

    // all classes without a neighboring class
    if(!strcasecmp(gt[i].box.type.c_str(), CLASS_NAMES[current_class].c_str()))
      valid_class = 1;

    // classes with a neighboring class
    else if(!strcasecmp(CLASS_NAMES[current_class].c_str(), "Pedestrian") && !strcasecmp("Person_sitting", gt[i].box.type.c_str()))
      valid_class = 0;
    else if(!strcasecmp(CLASS_NAMES[current_class].c_str(), "Car") && !strcasecmp("Van", gt[i].box.type.c_str()))
      valid_class = 0;

    // classes not used for evaluation
    else
      valid_class = -1;

    // ground truth is ignored, if occlusion, truncation exceeds the difficulty or ground truth is too small
    // (doesn't count as FN nor TP, although detections may be assigned)
    bool ignore = false;
    if(gt[i].occlusion>MAX_OCCLUSION[difficulty] || gt[i].truncation>MAX_TRUNCATION[difficulty] || height<MIN_HEIGHT[difficulty])
      ignore = true;

    // set ignored vector for ground truth
    // current class and not ignored (total no. of ground truth is detected for recall denominator)
    if(valid_class==1 && !ignore){
      ignored_gt.push_back(0);
      n_gt++;
    }

    // neighboring class, or current class but ignored
    else if(valid_class==0 || (ignore && valid_class==1))
      ignored_gt.push_back(1);

    // all other classes which are FN in the evaluation
    else
      ignored_gt.push_back(-1);
  }

  // extract dontcare areas
  for(int32_t i=0;i<gt.size(); i++)
    if(!strcasecmp("DontCare", gt[i].box.type.c_str()))
      dc.push_back(gt[i]);

  // extract detections bounding boxes of the current class
  for(int32_t i=0;i<det.size(); i++){

    // neighboring classes are not evaluated
    int32_t valid_class;
    if(!strcasecmp(det[i].box.type.c_str(), CLASS_NAMES[current_class].c_str()))
      valid_class = 1;
    else
      valid_class = -1;

    int32_t height = fabs(det[i].box.y1 - det[i].box.y2);

    // set ignored vector for detections
    if(height<MIN_HEIGHT[difficulty])
      ignored_det.push_back(1);
    else if(valid_class==1)
      ignored_det.push_back(0);
    else
      ignored_det.push_back(-1);
  }
}

tPrData computeStatistics(CLASSES current_class, const vector<tGroundtruth> &gt,
        const vector<tDetection> &det, const vector<tGroundtruth> &dc,
        const vector<int32_t> &ignored_gt, const vector<int32_t>  &ignored_det,
        bool compute_fp, double (*boxoverlap)(tDetection, tGroundtruth, int32_t),
        METRIC metric, bool compute_aos=false, double thresh=0, bool debug=false){

  tPrData stat = tPrData();
  const double NO_DETECTION = -10000000;
  vector<double> delta;            // holds angular difference for TPs (needed for AOS evaluation)
  vector<bool> assigned_detection; // holds whether a detection was assigned to a valid or ignored ground truth
  assigned_detection.assign(det.size(), false);
  vector<bool> ignored_threshold;
  ignored_threshold.assign(det.size(), false); // holds detections with a threshold lower than thresh if FP are computed

  // detections with a low score are ignored for computing precision (needs FP)
  if(compute_fp)
    for(int32_t i=0; i<det.size(); i++)
      if(det[i].thresh<thresh)
        ignored_threshold[i] = true;

  // evaluate all ground truth boxes
  for(int32_t i=0; i<gt.size(); i++){

    // this ground truth is not of the current or a neighboring class and therefore ignored
    if(ignored_gt[i]==-1)
      continue;

    /*=======================================================================
    find candidates (overlap with ground truth > 0.5) (logical len(det))
    =======================================================================*/
    int32_t det_idx          = -1;
    double valid_detection = NO_DETECTION;
    double max_overlap     = 0;

    // search for a possible detection
    bool assigned_ignored_det = false;
    for(int32_t j=0; j<det.size(); j++){

      // detections not of the current class, already assigned or with a low threshold are ignored
      if(ignored_det[j]==-1)
        continue;
      if(assigned_detection[j])
        continue;
      if(ignored_threshold[j])
        continue;

      // find the maximum score for the candidates and get idx of respective detection
      double overlap = boxoverlap(det[j], gt[i], -1);

      // for computing recall thresholds, the candidate with highest score is considered
      if(!compute_fp && overlap>MIN_OVERLAP[metric][current_class] && det[j].thresh>valid_detection){
        det_idx         = j;
        valid_detection = det[j].thresh;
      }

      // for computing pr curve values, the candidate with the greatest overlap is considered
      // if the greatest overlap is an ignored detection (min_height), the overlapping detection is used
      else if(compute_fp && overlap>MIN_OVERLAP[metric][current_class] && (overlap>max_overlap || assigned_ignored_det) && ignored_det[j]==0){
        max_overlap     = overlap;
        det_idx         = j;
        valid_detection = 1;
        assigned_ignored_det = false;
      }
      else if(compute_fp && overlap>MIN_OVERLAP[metric][current_class] && valid_detection==NO_DETECTION && ignored_det[j]==1){
        det_idx              = j;
        valid_detection      = 1;
        assigned_ignored_det = true;
      }
    }

    /*=======================================================================
    compute TP, FP and FN
    =======================================================================*/

    // nothing was assigned to this valid ground truth
    if(valid_detection==NO_DETECTION && ignored_gt[i]==0) {
      stat.fn++;
    }

    // only evaluate valid ground truth <=> detection assignments (considering difficulty level)
    else if(valid_detection!=NO_DETECTION && (ignored_gt[i]==1 || ignored_det[det_idx]==1))
      assigned_detection[det_idx] = true;

    // found a valid true positive
    else if(valid_detection!=NO_DETECTION){

      // write highest score to threshold vector
      stat.tp++;
      stat.v.push_back(det[det_idx].thresh);

      // compute angular difference of detection and ground truth if valid detection orientation was provided
      if(compute_aos)
        delta.push_back(gt[i].box.alpha - det[det_idx].box.alpha);

      // clean up
      assigned_detection[det_idx] = true;
    }
  }

  // if FP are requested, consider stuff area
  if(compute_fp){

    // count fp
    for(int32_t i=0; i<det.size(); i++){

      // count false positives if required (height smaller than required is ignored (ignored_det==1)
      if(!(assigned_detection[i] || ignored_det[i]==-1 || ignored_det[i]==1 || ignored_threshold[i]))
        stat.fp++;
    }

    // do not consider detections overlapping with stuff area
    int32_t nstuff = 0;
    for(int32_t i=0; i<dc.size(); i++){
      for(int32_t j=0; j<det.size(); j++){

        // detections not of the current class, already assigned, with a low threshold or a low minimum height are ignored
        if(assigned_detection[j])
          continue;
        if(ignored_det[j]==-1 || ignored_det[j]==1)
          continue;
        if(ignored_threshold[j])
          continue;

        // compute overlap and assign to stuff area, if overlap exceeds class specific value
        double overlap = boxoverlap(det[j], dc[i], 0);
        if(overlap>MIN_OVERLAP[metric][current_class]){
          assigned_detection[j] = true;
          nstuff++;
        }
      }
    }

    // FP = no. of all not to ground truth assigned detections - detections assigned to stuff areas
    stat.fp -= nstuff;

    // if all orientation values are valid, the AOS is computed
    if(compute_aos){
      vector<double> tmp;

      // FP have a similarity of 0, for all TP compute AOS
      tmp.assign(stat.fp, 0);
      for(int32_t i=0; i<delta.size(); i++)
        tmp.push_back((1.0+cos(delta[i]))/2.0);

      // be sure, that all orientation deltas are computed
      assert(tmp.size()==stat.fp+stat.tp);
      assert(delta.size()==stat.tp);

      // get the mean orientation similarity for this image
      if(stat.tp>0 || stat.fp>0)
        stat.similarity = accumulate(tmp.begin(), tmp.end(), 0.0);

      // there was neither a FP nor a TP, so the similarity is ignored in the evaluation
      else
        stat.similarity = -1;
    }
  }
  return stat;
}

/*=======================================================================
EVALUATE CLASS-WISE
=======================================================================*/

bool eval_class (FILE *fp_det, FILE *fp_ori, CLASSES current_class,
        const vector< vector<tGroundtruth> > &groundtruth,
        const vector< vector<tDetection> > &detections, bool compute_aos,
        double (*boxoverlap)(tDetection, tGroundtruth, int32_t),
        vector<double> &precision, vector<double> &aos,
        DIFFICULTY difficulty, METRIC metric) {
    assert(groundtruth.size() == detections.size());

  // init
  int32_t n_gt=0;                                     // total no. of gt (denominator of recall)
  vector<double> v, thresholds;                       // detection scores, evaluated for recall discretization
  vector< vector<int32_t> > ignored_gt, ignored_det;  // index of ignored gt detection for current class/difficulty
  vector< vector<tGroundtruth> > dontcare;            // index of dontcare areas, included in ground truth

  // for all test images do
  for (int32_t i=0; i<groundtruth.size(); i++){

    // holds ignored ground truth, ignored detections and dontcare areas for current frame
    vector<int32_t> i_gt, i_det;
    vector<tGroundtruth> dc;

    // only evaluate objects of current class and ignore occluded, truncated objects
    cleanData(current_class, groundtruth[i], detections[i], i_gt, dc, i_det, n_gt, difficulty);
    ignored_gt.push_back(i_gt);
    ignored_det.push_back(i_det);
    dontcare.push_back(dc);

    // compute statistics to get recall values
    tPrData pr_tmp = tPrData();
    pr_tmp = computeStatistics(current_class, groundtruth[i], detections[i], dc, i_gt, i_det, false, boxoverlap, metric);

    // add detection scores to vector over all images
    for(int32_t j=0; j<pr_tmp.v.size(); j++)
      v.push_back(pr_tmp.v[j]);
  }
  // get scores that must be evaluated for recall discretization
  thresholds = getThresholds(v, n_gt);

  // compute TP,FP,FN for relevant scores
  vector<tPrData> pr;
  pr.assign(thresholds.size(),tPrData());
  for (int32_t i=0; i<groundtruth.size(); i++){
    // for all scores/recall thresholds do:
    for(int32_t t=0; t<thresholds.size(); t++){
      tPrData tmp = tPrData();
      tmp = computeStatistics(current_class, groundtruth[i], detections[i], dontcare[i],
                              ignored_gt[i], ignored_det[i], true, boxoverlap, metric,
                              compute_aos, thresholds[t], t==38);

      // add no. of TP, FP, FN, AOS for current frame to total evaluation for current threshold
      pr[t].tp += tmp.tp;
      pr[t].fp += tmp.fp;
      pr[t].fn += tmp.fn;
      if(tmp.similarity!=-1)
        pr[t].similarity += tmp.similarity;
    }
  }

  // compute recall, precision and AOS
  vector<double> recall;
  precision.assign(N_SAMPLE_PTS, 0);
  if(compute_aos)
    aos.assign(N_SAMPLE_PTS, 0);
  double r=0;
  for (int32_t i=0; i<thresholds.size(); i++){
    r = pr[i].tp/(double)(pr[i].tp + pr[i].fn);
    recall.push_back(r);
    precision[i] = pr[i].tp/(double)(pr[i].tp + pr[i].fp);
    if(compute_aos)
      aos[i] = pr[i].similarity/(double)(pr[i].tp + pr[i].fp);
  }

  // filter precision and AOS using max_{i..end}(precision)
  for (int32_t i=0; i<thresholds.size(); i++){
    precision[i] = *max_element(precision.begin()+i, precision.end());
    if(compute_aos)
      aos[i] = *max_element(aos.begin()+i, aos.end());
  }

  // save statistics and finish with success
  saveStats(precision, aos, fp_det, fp_ori);
    return true;
}

void saveAndPlotPlots(string dir_name,string file_name,string obj_type,vector<double> vals[],bool is_aos, FILE* res_fp){

  char command[1024];
  // save plot data to file
  FILE *fp = fopen((dir_name + "/" + file_name + ".txt").c_str(),"w");
  printf("save %s\n", (dir_name + "/" + file_name + ".txt").c_str());
  for (int32_t i=0; i<(int)N_SAMPLE_PTS; i++)
    fprintf(fp,"%f %f %f %f\n",(double)i/(N_SAMPLE_PTS-1.0),vals[0][i],vals[1][i],vals[2][i]);
  fclose(fp);

  float sum[3] = {0, 0, 0};
  for (int v = 0; v < 3; ++v)
      for (int i = 0; i < vals[v].size(); i = i + 4)
          sum[v] += vals[v][i];
  printf("%s AP: %f %f %f\n", file_name.c_str(), sum[0] / 11 * 100, sum[1] / 11 * 100, sum[2] / 11 * 100);
  fprintf(res_fp, "%s AP: %f %f %f\n", file_name.c_str(), sum[0] / 11 * 100, sum[1] / 11 * 100, sum[2] / 11 * 100);

  // create png + eps
  for (int32_t j=0; j<2; j++) {

    // open file
    FILE *fp = fopen((dir_name + "/" + file_name + ".gp").c_str(),"w");

    // save gnuplot instructions
    if (j==0) {
      fprintf(fp,"set term png size 450,315 font \"Helvetica\" 11\n");
      fprintf(fp,"set output \"%s.png\"\n",file_name.c_str());
    } else {
      fprintf(fp,"set term postscript eps enhanced color font \"Helvetica\" 20\n");
      fprintf(fp,"set output \"%s.eps\"\n",file_name.c_str());
    }

    // set labels and ranges
    fprintf(fp,"set size ratio 0.7\n");
    fprintf(fp,"set xrange [0:1]\n");
    fprintf(fp,"set yrange [0:1]\n");
    fprintf(fp,"set xlabel \"Recall\"\n");
    if (!is_aos) fprintf(fp,"set ylabel \"Precision\"\n");
    else         fprintf(fp,"set ylabel \"Orientation Similarity\"\n");
    obj_type[0] = toupper(obj_type[0]);
    fprintf(fp,"set title \"%s\"\n",obj_type.c_str());

    // line width
    int32_t   lw = 5;
    if (j==0) lw = 3;

    // plot error curve
    fprintf(fp,"plot ");
    fprintf(fp,"\"%s.txt\" using 1:2 title 'Easy' with lines ls 1 lw %d,",file_name.c_str(),lw);
    fprintf(fp,"\"%s.txt\" using 1:3 title 'Moderate' with lines ls 2 lw %d,",file_name.c_str(),lw);
    fprintf(fp,"\"%s.txt\" using 1:4 title 'Hard' with lines ls 3 lw %d",file_name.c_str(),lw);

    // close file
    fclose(fp);

    // run gnuplot => create png + eps
    sprintf(command,"cd %s; gnuplot %s",dir_name.c_str(),(file_name + ".gp").c_str());
    system(command);
  }

  // create pdf and crop
  sprintf(command,"cd %s; ps2pdf %s.eps %s_large.pdf",dir_name.c_str(),file_name.c_str(),file_name.c_str());
  system(command);
  sprintf(command,"cd %s; pdfcrop %s_large.pdf %s.pdf",dir_name.c_str(),file_name.c_str(),file_name.c_str());
  system(command);
  sprintf(command,"cd %s; rm %s_large.pdf",dir_name.c_str(),file_name.c_str());
  system(command);
}

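vector<int32_t> getEvalIndices(const string& result_dir) {

    // collect the frame index encoded in each result file name ("%06d.txt")
    vector<int32_t> indices;
    DIR* dir;
    dirent* entity;
    dir = opendir(result_dir.c_str());
    if (dir) {
        while ((entity = readdir(dir)) != NULL) {
            string path(entity->d_name);
            int32_t len = path.size();
            if (len < 10) continue;
            int32_t index = atoi(path.substr(len - 10, 10).c_str());
            indices.push_back(index);
        }
        // release the directory handle once all entries are read
        closedir(dir);
    }
    return indices;
}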

bool eval(string gt_dir, string result_dir, Mail* mail){

  // set some global parameters
  initGlobals();

  // ground truth and result directories
  // string gt_dir         = "data/object/label_2";
  // string result_dir     = "results/" + result_sha;
  string plot_dir       = result_dir + "/plot";
  FILE* res_fp = fopen((result_dir + "/result.txt").c_str(), "w");

  // create output directories
  system(("mkdir " + plot_dir).c_str());

  // hold detections and ground truth in memory
  vector< vector<tGroundtruth> > groundtruth;
  vector< vector<tDetection> >   detections;

  // holds whether orientation similarity shall be computed (might be set to false while loading detections)
  // and which labels were provided by this submission
  bool compute_aos=true;
  vector<bool> eval_image(NUM_CLASS, false);
  vector<bool> eval_ground(NUM_CLASS, false);
  vector<bool> eval_3d(NUM_CLASS, false);

  // for all images read groundtruth and detections
  mail->msg("Loading detections...");
  std::vector<int32_t> indices = getEvalIndices(result_dir + "/data/" );
  printf("number of files for evaluation: %d\n", (int)indices.size());
  fprintf(res_fp, "number of files for evaluation: %d\n", (int)indices.size());

  for (int32_t i=0; i<indices.size(); i++) {

    // file name
    char file_name[256];
    sprintf(file_name,"%06d.txt",indices.at(i));

    // read ground truth and result poses
    bool gt_success,det_success;
    vector<tGroundtruth> gt   = loadGroundtruth(gt_dir + "/" + file_name,gt_success);
    vector<tDetection>   det  = loadDetections(result_dir + "/data/"  + file_name,
            compute_aos, eval_image, eval_ground, eval_3d, det_success);
    groundtruth.push_back(gt);
    detections.push_back(det);

    // check for errors
    if (!gt_success) {
      mail->msg("ERROR: Couldn't read: %s of ground truth. Please write me an email!", file_name);
      return false;
    }
    if (!det_success) {
      mail->msg("ERROR: Couldn't read: %s", file_name);
      return false;
    }
  }
  mail->msg("  done.");

  // holds pointers for result files
  FILE *fp_det=0, *fp_ori=0;

  // eval image 2D bounding boxes
  for (int c = 0; c < NUM_CLASS; c++) {
    CLASSES cls = (CLASSES)c;
    if (eval_image[c]) {
      fp_det = fopen((result_dir + "/stats_" + CLASS_NAMES[c] + "_detection.txt").c_str(), "w");
      if(compute_aos)
        fp_ori = fopen((result_dir + "/stats_" + CLASS_NAMES[c] + "_orientation.txt").c_str(),"w");
      vector<double> precision[3], aos[3];
      if(   !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, imageBoxOverlap, precision[0], aos[0], EASY, IMAGE)
         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, imageBoxOverlap, precision[1], aos[1], MODERATE, IMAGE)
         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, imageBoxOverlap, precision[2], aos[2], HARD, IMAGE)) {
        mail->msg("%s evaluation failed.", CLASS_NAMES[c].c_str());
        return false;
      }
      fclose(fp_det);
      saveAndPlotPlots(plot_dir, CLASS_NAMES[c] + "_detection", CLASS_NAMES[c], precision, 0, res_fp);

      if(compute_aos){
        saveAndPlotPlots(plot_dir, CLASS_NAMES[c] + "_orientation", CLASS_NAMES[c], aos, 1, res_fp);
        fclose(fp_ori);
      }
    }
  }
  printf("Finished 2D bounding box eval.\n");
  // don't evaluate AOS for birdview boxes and 3D boxes
  compute_aos = false;

  // eval bird's eye view bounding boxes
  for (int c = 0; c < NUM_CLASS; c++) {
    CLASSES cls = (CLASSES)c;
    if (eval_ground[c]) {
      fp_det = fopen((result_dir + "/stats_" + CLASS_NAMES[c] + "_detection_ground.txt").c_str(), "w");
      vector<double> precision[3], aos[3];
      printf("Going to eval ground for class: %s\n", CLASS_NAMES[c].c_str());
      if(   !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, groundBoxOverlap, precision[0], aos[0], EASY, GROUND)
         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, groundBoxOverlap, precision[1], aos[1], MODERATE, GROUND)
         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, groundBoxOverlap, precision[2], aos[2], HARD, GROUND)) {
        mail->msg("%s evaluation failed.", CLASS_NAMES[c].c_str());
        return false;
      }
      fclose(fp_det);
      saveAndPlotPlots(plot_dir, CLASS_NAMES[c] + "_detection_ground", CLASS_NAMES[c], precision, 0, res_fp);
    }
  }
  printf("Finished Birdeye eval.\n");

  // eval 3D bounding boxes
  for (int c = 0; c < NUM_CLASS; c++) {
    CLASSES cls = (CLASSES)c;
    if (eval_3d[c]) {
      fp_det = fopen((result_dir + "/stats_" + CLASS_NAMES[c] + "_detection_3d.txt").c_str(), "w");
      vector<double> precision[3], aos[3];
      printf("Going to eval 3D box for class: %s\n", CLASS_NAMES[c].c_str());
      if(   !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, box3DOverlap, precision[0], aos[0], EASY, BOX3D)
         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, box3DOverlap, precision[1], aos[1], MODERATE, BOX3D)
         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, box3DOverlap, precision[2], aos[2], HARD, BOX3D)) {
        mail->msg("%s evaluation failed.", CLASS_NAMES[c].c_str());
        return false;
      }
      fclose(fp_det);
      saveAndPlotPlots(plot_dir, CLASS_NAMES[c] + "_detection_3d", CLASS_NAMES[c], precision, 0, res_fp);
    }
  }
  printf("Finished 3D bounding box eval.\n");
  fclose(res_fp);
  // success
  return true;
}

int32_t main (int32_t argc,char *argv[]) {

  // we need exactly 2 arguments: gt_dir and result_dir
  if (argc!=3) {
    cout << "Usage: ./evaluate_object_3d_offline gt_dir result_dir" << endl;
    return 1;
  }

  // read arguments
  string gt_dir = argv[1];
  string result_dir = argv[2];

  // init notification mail
  Mail *mail;
  mail = new Mail();
  mail->msg("Thank you for participating in our evaluation!");

  // run evaluation
  if (eval(gt_dir, result_dir, mail)) {
    mail->msg("Your evaluation results are available at:");
    mail->msg(result_dir.c_str());
  } else {
    system(("rm -r " + result_dir + "/plot").c_str());
    mail->msg("An error occurred while processing your results.");
  }

  // send mail and exit
  delete mail;

  return 0;
}




================================================
FILE: disparity/eval/kitti/mail.h
================================================
#ifndef MAIL_H
#define MAIL_H

#include <stdio.h>
#include <stdarg.h>
#include <string.h>
#include <string>

class Mail {

public:

  Mail (std::string email = "") {
    if (email.compare("")) {
      mail = popen("/usr/lib/sendmail -t -f noreply@cvlibs.net","w");
      fprintf(mail,"To: %s\n", email.c_str());
      fprintf(mail,"From: noreply@cvlibs.net\n");
      fprintf(mail,"Subject: KITTI Evaluation Benchmark\n");
      fprintf(mail,"\n\n");
    } else {
      mail = 0;
    }
  }
  
  ~Mail() {
    if (mail) {
      pclose(mail);
    }
  }
  
  void msg (const char *format, ...) {
    va_list args;
    va_start(args,format);
    if (mail) {
      vfprintf(mail,format,args);
      fprintf(mail,"\n");
    }
    vprintf(format,args);
    printf("\n");
    va_end(args);
  }
    
private:

  FILE *mail;
  
};

#endif


================================================
FILE: disparity/eval/kitti-object-eval-python/.gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

================================================
FILE: disparity/eval/kitti-object-eval-python/LICENSE
================================================
MIT License

Copyright (c) 2018 

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: disparity/eval/kitti-object-eval-python/README.md
================================================
# kitti-object-eval-python
Fast KITTI object detection evaluation in Python (finishes in less than 10 seconds). Supports 2D/BEV/3D/AOS evaluation and COCO-style AP. If you use the command-line interface, numba needs some time to compile the JIT functions.

_WARNING_: The "coco" metric isn't official. Only "AP (Average Precision)" is.
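
As a rough illustration of the official AP figure, the sketch below mirrors what `get_mAP` in `eval.py` does: it assumes precision has been sampled at 41 evenly spaced recall positions and averages every fourth value (11 points), scaled to a percentage. The name `eleven_point_ap` is hypothetical and not part of this package.

```Python
def eleven_point_ap(precision):
    """Average every 4th of 41 sampled precision values, as a percentage."""
    # With 41 samples, indices 0, 4, ..., 40 correspond to recall 0.0, 0.1, ..., 1.0.
    sampled = precision[::4]
    return sum(sampled) / 11 * 100


# A detector with precision 1.0 at every recall sample reaches the maximum score.
perfect = [1.0] * 41
print(eleven_point_ap(perfect))
```

Dividing by a fixed 11 rather than by `len(sampled)` matches the source, so a shorter precision array is implicitly treated as zero precision at the missing recall points.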
## Dependencies
Requires Python 3.6+ and `numpy`, `skimage`, `numba`, `fire`, `scipy`. If you have Anaconda, just install `cudatoolkit` with conda. Otherwise, please refer to this [page](https://github.com/numba/numba#custom-python-environments) to set up LLVM and CUDA for numba.
* Install with conda (replace `x.x` with 8.0, 9.0, or 10.0, depending on your environment):
```
conda install -c numba cudatoolkit=x.x
```
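
Before running the evaluation, it may help to confirm that the listed packages can be found. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the import names are assumed from the dependency list above.

```Python
import importlib.util

# Import names assumed from the dependency list above.
REQUIRED = ["numpy", "skimage", "numba", "fire", "scipy"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", missing if missing else "none")
```

This only checks that the modules are importable by name; it does not verify that CUDA is usable by numba.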
## Usage
* Command-line interface:
```
python evaluate.py evaluate --label_path=/path/to/your_gt_label_folder --result_path=/path/to/your_result_folder --label_split_file=/path/to/val.txt --current_class=0 --coco=False
```
* Python interface:
```Python
import kitti_common as kitti
from eval import get_official_eval_result, get_coco_eval_result
def _read_imageset_file(path):
    with open(path, 'r') as f:
        lines = f.readlines()
    return [int(line) for line in lines]
det_path = "/path/to/your_result_folder"
dt_annos = kitti.get_label_annos(det_path)
gt_path = "/path/to/your_gt_label_folder"
gt_split_file = "/path/to/val.txt" # from https://xiaozhichen.github.io/files/mv3d/imagesets.tar.gz
val_image_ids = _read_imageset_file(gt_split_file)
gt_annos = kitti.get_label_annos(gt_path, val_image_ids)
print(get_official_eval_result(gt_annos, dt_annos, 0)) # about 6 s on my computer
print(get_coco_eval_result(gt_annos, dt_annos, 0)) # about 18 s on my computer
```
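
Results are evaluated at three difficulty levels (easy, moderate, hard). As a sketch of how ground-truth boxes are filtered per level, the snippet below reuses the threshold values from `clean_data` in `eval.py`; the helper `is_ignored_gt` is hypothetical and only illustrates the rule, it is not a function exported by this package.

```Python
# Thresholds copied from clean_data in eval.py, indexed by difficulty
# (0 = easy, 1 = moderate, 2 = hard).
MIN_HEIGHT = [40, 25, 25]
MAX_OCCLUSION = [0, 1, 2]
MAX_TRUNCATION = [0.15, 0.3, 0.5]


def is_ignored_gt(bbox, occluded, truncated, difficulty):
    """Hypothetical helper: True if a ground-truth box is skipped at this level."""
    height = bbox[3] - bbox[1]  # bbox is [x1, y1, x2, y2] in pixels
    return (occluded > MAX_OCCLUSION[difficulty]
            or truncated > MAX_TRUNCATION[difficulty]
            or height <= MIN_HEIGHT[difficulty])


# A fully visible box that is 35 px tall is too small for "easy" only.
box = [100.0, 50.0, 160.0, 85.0]
print([is_ignored_gt(box, 0, 0.0, d) for d in range(3)])
```

Ignored boxes of the current class count neither as true positives nor as false negatives, so the same detections can produce different AP values at each level.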


================================================
FILE: disparity/eval/kitti-object-eval-python/eval.py
================================================
import io as sysio
import time

import numba
import numpy as np
from scipy.interpolate import interp1d

from rotate_iou import rotate_iou_gpu_eval


def get_mAP(prec):
    sums = 0
    for i in range(0, len(prec), 4):
        sums += prec[i]
    return sums / 11 * 100


@numba.jit
def get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):
    scores.sort()
    scores = scores[::-1]
    current_recall = 0
    thresholds = []
    for i, score in enumerate(scores):
        l_recall = (i + 1) / num_gt
        if i < (len(scores) - 1):
            r_recall = (i + 2) / num_gt
        else:
            r_recall = l_recall
        if (((r_recall - current_recall) < (current_recall - l_recall))
                and (i < (len(scores) - 1))):
            continue
        # recall = l_recall
        thresholds.append(score)
        current_recall += 1 / (num_sample_pts - 1.0)
    # print(len(thresholds), len(scores), num_gt)
    return thresholds


def clean_data(gt_anno, dt_anno, current_class, difficulty):
    CLASS_NAMES = [
        'car', 'pedestrian', 'cyclist', 'van', 'person_sitting', 'car',
        'tractor', 'trailer'
    ]
    MIN_HEIGHT = [40, 25, 25]
    MAX_OCCLUSION = [0, 1, 2]
    MAX_TRUNCATION = [0.15, 0.3, 0.5]
    dc_bboxes, ignored_gt, ignored_dt = [], [], []
    current_cls_name = CLASS_NAMES[current_class].lower()
    num_gt = len(gt_anno["name"])
    num_dt = len(dt_anno["name"])
    num_valid_gt = 0
    for i in range(num_gt):
        bbox = gt_anno["bbox"][i]
        gt_name = gt_anno["name"][i].lower()
        height = bbox[3] - bbox[1]
        valid_class = -1
        if (gt_name == current_cls_name):
            valid_class = 1
        elif (current_cls_name == "Pedestrian".lower()
              and "Person_sitting".lower() == gt_name):
            valid_class = 0
        elif (current_cls_name == "Car".lower() and "Van".lower() == gt_name):
            valid_class = 0
        else:
            valid_class = -1
        ignore = False
        if ((gt_anno["occluded"][i] > MAX_OCCLUSION[difficulty])
                or (gt_anno["truncated"][i] > MAX_TRUNCATION[difficulty])
                or (height <= MIN_HEIGHT[difficulty])):
            # if gt_anno["difficulty"][i] > difficulty or gt_anno["difficulty"][i] == -1:
            ignore = True
        if valid_class == 1 and not ignore:
            ignored_gt.append(0)
            num_valid_gt += 1
        elif (valid_class == 0 or (ignore and (valid_class == 1))):
            ignored_gt.append(1)
        else:
            ignored_gt.append(-1)
    # for i in range(num_gt):
        if gt_anno["name"][i] == "DontCare":
            dc_bboxes.append(gt_anno["bbox"][i])
    for i in range(num_dt):
        if (dt_anno["name"][i].lower() == current_cls_name):
            valid_class = 1
        else:
            valid_class = -1
        height = abs(dt_anno["bbox"][i, 3] - dt_anno["bbox"][i, 1])
        if height < MIN_HEIGHT[difficulty]:
            ignored_dt.append(1)
        elif valid_class == 1:
            ignored_dt.append(0)
        else:
            ignored_dt.append(-1)

    return num_valid_gt, ignored_gt, ignored_dt, dc_bboxes


@numba.jit(nopython=True)
def image_box_overlap(boxes, query_boxes, criterion=-1):
    N = boxes.shape[0]
    K = query_boxes.shape[0]
    overlaps = np.zeros((N, K), dtype=boxes.dtype)
    for k in range(K):
        qbox_area = ((query_boxes[k, 2] - query_boxes[k, 0]) *
                     (query_boxes[k, 3] - query_boxes[k, 1]))
        for n in range(N):
            iw = (min(boxes[n, 2], query_boxes[k, 2]) - max(
                boxes[n, 0], query_boxes[k, 0]))
            if iw > 0:
                ih = (min(boxes[n, 3], query_boxes[k, 3]) - max(
                    boxes[n, 1], query_boxes[k, 1]))
                if ih > 0:
                    if criterion == -1:
                        ua = (
                            (boxes[n, 2] - boxes[n, 0]) *
                            (boxes[n, 3] - boxes[n, 1]) + qbox_area - iw * ih)
                    elif criterion == 0:
                        ua = ((boxes[n, 2] - boxes[n, 0]) *
                              (boxes[n, 3] - boxes[n, 1]))
                    elif criterion == 1:
                        ua = qbox_area
                    else:
                        ua = 1.0
                    overlaps[n, k] = iw * ih / ua
    return overlaps


def bev_box_overlap(boxes, qboxes, criterion=-1):
    riou = rotate_iou_gpu_eval(boxes, qboxes, criterion)
    return riou


@numba.jit(nopython=True, parallel=True)
def d3_box_overlap_kernel(boxes,
                          qboxes,
                          rinc,
                          criterion=-1,
                          z_axis=1,
                          z_center=1.0):
    """
        z_axis: the z (height) axis.
        z_center: unified z (height) center of box.
    """
    N, K = boxes.shape[0], qboxes.shape[0]
    for i in range(N):
        for j in range(K):
            if rinc[i, j] > 0:
                min_z = min(
                    boxes[i, z_axis] + boxes[i, z_axis + 3] * (1 - z_center),
                    qboxes[j, z_axis] + qboxes[j, z_axis + 3] * (1 - z_center))
                max_z = max(
                    boxes[i, z_axis] - boxes[i, z_axis + 3] * z_center,
                    qboxes[j, z_axis] - qboxes[j, z_axis + 3] * z_center)
                iw = min_z - max_z
                if iw > 0:
                    area1 = boxes[i, 3] * boxes[i, 4] * boxes[i, 5]
                    area2 = qboxes[j, 3] * qboxes[j, 4] * qboxes[j, 5]
                    inc = iw * rinc[i, j]
                    if criterion == -1:
                        ua = (area1 + area2 - inc)
                    elif criterion == 0:
                        ua = area1
                    elif criterion == 1:
                        ua = area2
                    else:
                        ua = 1.0
                    rinc[i, j] = inc / ua
                else:
                    rinc[i, j] = 0.0


def d3_box_overlap(boxes, qboxes, criterion=-1, z_axis=1, z_center=1.0):
    """kitti camera format z_axis=1.
    """
    bev_axes = list(range(7))
    bev_axes.pop(z_axis + 3)
    bev_axes.pop(z_axis)
    rinc = rotate_iou_gpu_eval(boxes[:, bev_axes], qboxes[:, bev_axes], 2)
    d3_box_overlap_kernel(boxes, qboxes, rinc, criterion, z_axis, z_center)
    return rinc


@numba.jit(nopython=True)
def compute_statistics_jit(overlaps,
                           gt_datas,
                           dt_datas,
                           ignored_gt,
                           ignored_det,
                           dc_bboxes,
                           metric,
                           min_overlap,
                           thresh=0,
                           compute_fp=False,
                           compute_aos=False):

    det_size = dt_datas.shape[0]
    gt_size = gt_datas.shape[0]
    dt_scores = dt_datas[:, -1]
    dt_alphas = dt_datas[:, 4]
    gt_alphas = gt_datas[:, 4]
    dt_bboxes = dt_datas[:, :4]
    # gt_bboxes = gt_datas[:, :4]

    assigned_detection = [False] * det_size
    ignored_threshold = [False] * det_size
    if compute_fp:
        for i in range(det_size):
            if (dt_scores[i] < thresh):
                ignored_threshold[i] = True
    NO_DETECTION = -10000000
    tp, fp, fn, similarity = 0, 0, 0, 0
    # thresholds = [0.0]
    # delta = [0.0]
    thresholds = np.zeros((gt_size, ))
    thresh_idx = 0
    delta = np.zeros((gt_size, ))
    delta_idx = 0
    for i in range(gt_size):
        if ignored_gt[i] == -1:
            continue
        det_idx = -1
        valid_detection = NO_DETECTION
        max_overlap = 0
        assigned_ignored_det = False

        for j in range(det_size):
            if (ignored_det[j] == -1):
                continue
            if (assigned_detection[j]):
                continue
            if (ignored_threshold[j]):
                continue
            overlap = overlaps[j, i]
            dt_score = dt_scores[j]
            if (not compute_fp and (overlap > min_overlap)
                    and dt_score > valid_detection):
                det_idx = j
                valid_detection = dt_score
            elif (compute_fp and (overlap > min_overlap)
                  and (overlap > max_overlap or assigned_ignored_det)
                  and ignored_det[j] == 0):
                max_overlap = overlap
                det_idx = j
                valid_detection = 1
                assigned_ignored_det = False
            elif (compute_fp and (overlap > min_overlap)
                  and (valid_detection == NO_DETECTION)
                  and ignored_det[j] == 1):
                det_idx = j
                valid_detection = 1
                assigned_ignored_det = True

        if (valid_detection == NO_DETECTION) and ignored_gt[i] == 0:
            fn += 1
        elif ((valid_detection != NO_DETECTION)
              and (ignored_gt[i] == 1 or ignored_det[det_idx] == 1)):
            assigned_detection[det_idx] = True
        elif valid_detection != NO_DETECTION:
            # only a tp add a threshold.
            tp += 1
            # thresholds.append(dt_scores[det_idx])
            thresholds[thresh_idx] = dt_scores[det_idx]
            thresh_idx += 1
            if compute_aos:
                # delta.append(gt_alphas[i] - dt_alphas[det_idx])
                delta[delta_idx] = gt_alphas[i] - dt_alphas[det_idx]
                delta_idx += 1

            assigned_detection[det_idx] = True
    if compute_fp:
        for i in range(det_size):
            if (not (assigned_detection[i] or ignored_det[i] == -1
                     or ignored_det[i] == 1 or ignored_threshold[i])):
                fp += 1
        nstuff = 0
        if metric == 0:
            overlaps_dt_dc = image_box_overlap(dt_bboxes, dc_bboxes, 0)
            for i in range(dc_bboxes.shape[0]):
                for j in range(det_size):
                    if (assigned_detection[j]):
                        continue
                    if (ignored_det[j] == -1 or ignored_det[j] == 1):
                        continue
                    if (ignored_threshold[j]):
                        continue
                    if overlaps_dt_dc[j, i] > min_overlap:
                        assigned_detection[j] = True
                        nstuff += 1
        fp -= nstuff
        if compute_aos:
            tmp = np.zeros((fp + delta_idx, ))
            # tmp = [0] * fp
            for i in range(delta_idx):
                tmp[i + fp] = (1.0 + np.cos(delta[i])) / 2.0
                # tmp.append((1.0 + np.cos(delta[i])) / 2.0)
            # assert len(tmp) == fp + tp
            # assert len(delta) == tp
            if tp > 0 or fp > 0:
                similarity = np.sum(tmp)
            else:
                similarity = -1
    return tp, fp, fn, similarity, thresholds[:thresh_idx]


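def get_split_parts(num, num_part):
    same_part = num // num_part
    remain_num = num % num_part
    if same_part == 0:
        # Fewer examples than parts: use a single part so that no part is
        # empty (np.concatenate raises on an empty list of arrays).
        return [num]
    if remain_num == 0:
        return [same_part] * num_part
    else:
        return [same_part] * num_part + [remain_num]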


@numba.jit(nopython=True)
def fused_compute_statistics(overlaps,
                             pr,
                             gt_nums,
                             dt_nums,
                             dc_nums,
                             gt_datas,
                             dt_datas,
                             dontcares,
                             ignored_gts,
                             ignored_dets,
                             metric,
                             min_overlap,
                             thresholds,
                             compute_aos=False):
    gt_num = 0
    dt_num = 0
    dc_num = 0
    for i in range(gt_nums.shape[0]):
        for t, thresh in enumerate(thresholds):
            overlap = overlaps[dt_num:dt_num + dt_nums[i], gt_num:gt_num +
                               gt_nums[i]]

            gt_data = gt_datas[gt_num:gt_num + gt_nums[i]]
            dt_data = dt_datas[dt_num:dt_num + dt_nums[i]]
            ignored_gt = ignored_gts[gt_num:gt_num + gt_nums[i]]
            ignored_det = ignored_dets[dt_num:dt_num + dt_nums[i]]
            dontcare = dontcares[dc_num:dc_num + dc_nums[i]]
            tp, fp, fn, similarity, _ = compute_statistics_jit(
                overlap,
                gt_data,
                dt_data,
                ignored_gt,
                ignored_det,
                dontcare,
                metric,
                min_overlap=min_overlap,
                thresh=thresh,
                compute_fp=True,
                compute_aos=compute_aos)
            pr[t, 0] += tp
            pr[t, 1] += fp
            pr[t, 2] += fn
            if similarity != -1:
                pr[t, 3] += similarity
        gt_num += gt_nums[i]
        dt_num += dt_nums[i]
        dc_num += dc_nums[i]


def calculate_iou_partly(gt_annos,
                         dt_annos,
                         metric,
                         num_parts=50,
                         z_axis=1,
                         z_center=1.0):
    """Fast IoU computation. This function can be used independently for
    result analysis.
    Args:
        gt_annos: list of dicts, must come from get_label_annos() in kitti_common.py
        dt_annos: list of dicts, must come from get_label_annos() in kitti_common.py
        metric: eval type. 0: bbox, 1: bev, 2: 3d
        num_parts: int. number of chunks the examples are split into; overlaps
            are computed once per chunk and then sliced per example
        z_axis: height axis. kitti camera uses 1, lidar uses 2.
        z_center: unified z (height) center of box, as in d3_box_overlap_kernel.
    """
    assert len(gt_annos) == len(dt_annos)
    total_dt_num = np.stack([len(a["name"]) for a in dt_annos], 0)
    total_gt_num = np.stack([len(a["name"]) for a in gt_annos], 0)
    num_examples = len(gt_annos)
    split_parts = get_split_parts(num_examples, num_parts)
    parted_overlaps = []
    example_idx = 0
    bev_axes = list(range(3))
    bev_axes.pop(z_axis)
    for num_part in split_parts:
        gt_annos_part = gt_annos[example_idx:example_idx + num_part]
        dt_annos_part = dt_annos[example_idx:example_idx + num_part]
        if metric == 0:
            gt_boxes = np.concatenate([a["bbox"] for a in gt_annos_part], 0)
            dt_boxes = np.concatenate([a["bbox"] for a in dt_annos_part], 0)
            overlap_part = image_box_overlap(gt_boxes, dt_boxes)
        elif metric == 1:
            loc = np.concatenate(
                [a["location"][:, bev_axes] for a in gt_annos_part], 0)
            dims = np.concatenate(
                [a["dimensions"][:, bev_axes] for a in gt_annos_part], 0)
            rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0)
            gt_boxes = np.concatenate([loc, dims, rots[..., np.newaxis]],
                                      axis=1)
            loc = np.concatenate(
                [a["location"][:, bev_axes] for a in dt_annos_part], 0)
            dims = np.concatenate(
                [a["dimensions"][:, bev_axes] for a in dt_annos_part], 0)
            rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0)
            dt_boxes = np.concatenate([loc, dims, rots[..., np.newaxis]],
                                      axis=1)
            overlap_part = bev_box_overlap(gt_boxes,
                                           dt_boxes).astype(np.float64)
        elif metric == 2:
            loc = np.concatenate([a["location"] for a in gt_annos_part], 0)
            dims = np.concatenate([a["dimensions"] for a in gt_annos_part], 0)
            rots = np.concatenate([a["rotation_y"] for a in gt_annos_part], 0)
            gt_boxes = np.concatenate([loc, dims, rots[..., np.newaxis]],
                                      axis=1)
            loc = np.concatenate([a["location"] for a in dt_annos_part], 0)
            dims = np.concatenate([a["dimensions"] for a in dt_annos_part], 0)
            rots = np.concatenate([a["rotation_y"] for a in dt_annos_part], 0)
            dt_boxes = np.concatenate([loc, dims, rots[..., np.newaxis]],
                                      axis=1)
            overlap_part = d3_box_overlap(
                gt_boxes, dt_boxes, z_axis=z_axis,
                z_center=z_center).astype(np.float64)
        else:
            raise ValueError("unknown metric")
        parted_overlaps.append(overlap_part)
        example_idx += num_part
    overlaps = []
    example_idx = 0
    for j, num_part in enumerate(split_parts):
        gt_annos_part = gt_annos[example_idx:example_idx + num_part]
        dt_annos_part = dt_annos[example_idx:example_idx + num_part]
        gt_num_idx, dt_num_idx = 0, 0
        for i in range(num_part):
            gt_box_num = total_gt_num[example_idx + i]
            dt_box_num = total_dt_num[example_idx + i]
            overlaps.append(
                parted_overlaps[j][gt_num_idx:gt_num_idx +
                                   gt_box_num, dt_num_idx:dt_num_idx +
                                   dt_box_num])
            gt_num_idx += gt_box_num
            dt_num_idx += dt_box_num
        example_idx += num_part

    return overlaps, parted_overlaps, total_gt_num, total_dt_num


def _prepare_data(gt_annos, dt_annos, current_class, difficulty):
    gt_datas_list = []
    dt_datas_list = []
    total_dc_num = []
    ignored_gts, ignored_dets, dontcares = [], [], []
    total_num_valid_gt = 0
    for i in range(len(gt_annos)):
        rets = clean_data(gt_annos[i], dt_annos[i], current_class, difficulty)
        num_valid_gt, ignored_gt, ignored_det, dc_bboxes = rets
        ignored_gts.append(np.array(ignored_gt, dtype=np.int64))
        ignored_dets.append(np.array(ignored_det, dtype=np.int64))
        if len(dc_bboxes) == 0:
            dc_bboxes = np.zeros((0, 4)).astype(np.float64)
        else:
            dc_bboxes = np.stack(dc_bboxes, 0).astype(np.float64)
        total_dc_num.append(dc_bboxes.shape[0])
        dontcares.append(dc_bboxes)
        total_num_valid_gt += num_valid_gt
        gt_datas = np.concatenate(
            [gt_annos[i]["bbox"], gt_annos[i]["alpha"][..., np.newaxis]], 1)
        dt_datas = np.concatenate([
            dt_annos[i]["bbox"], dt_annos[i]["alpha"][..., np.newaxis],
            dt_annos[i]["score"][..., np.newaxis]
        ], 1)
        gt_datas_list.append(gt_datas)
        dt_datas_list.append(dt_datas)
    total_dc_num = np.stack(total_dc_num, axis=0)
    return (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets, dontcares,
            total_dc_num, total_num_valid_gt)


def eval_class(gt_annos,
               dt_annos,
               current_classes,
               difficultys,
               metric,
               min_overlaps,
               compute_aos=False,
               z_axis=1,
               z_center=1.0,
               num_parts=50):
    """Kitti eval. Supports 2d/bev/3d/aos eval and 0.5:0.05:0.95 coco AP.
    Args:
        gt_annos: list of dicts, must come from get_label_annos() in kitti_common.py
        dt_annos: list of dicts, must come from get_label_annos() in kitti_common.py
        current_classes: list of int, 0: car, 1: pedestrian, 2: cyclist
        difficultys: list of int. eval difficulties, 0: easy, 1: normal, 2: hard
        metric: eval type. 0: bbox, 1: bev, 2: 3d
        min_overlaps: float array of shape [num_minoverlap, metric, num_class].
            official values for car, pedestrian, cyclist:
            [[0.7, 0.5, 0.5], [0.7, 0.5, 0.5], [0.7, 0.5, 0.5]]
        compute_aos: bool. whether to also accumulate orientation similarity
        num_parts: int. a parameter for fast calculate algorithm

    Returns:
        dict with keys "precision", "orientation" (aos), "thresholds" and
        "min_overlaps"
    """
    assert len(gt_annos) == len(dt_annos)
    num_examples = len(gt_annos)
    split_parts = get_split_parts(num_examples, num_parts)

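    # NOTE: dt_annos and gt_annos are passed in swapped order, so each overlap
    # matrix is indexed [detection, ground truth] (which is how
    # compute_statistics_jit reads it) and the two count arrays come back in
    # (dt, gt) order.
    rets = calculate_iou_partly(
        dt_annos,
        gt_annos,
        metric,
        num_parts,
        z_axis=z_axis,
        z_center=z_center)
    overlaps, parted_overlaps, total_dt_num, total_gt_num = rets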
    N_SAMPLE_PTS = 41
    num_minoverlap = len(min_overlaps)
    num_class = len(current_classes)
    num_difficulty = len(difficultys)
    precision = np.zeros(
        [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
    recall = np.zeros(
        [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
    aos = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
    all_thresholds = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])
    for m, current_class in enumerate(current_classes):
        for l, difficulty in enumerate(difficultys):
            rets = _prepare_data(gt_annos, dt_annos, current_class, difficulty)
            (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets,
             dontcares, total_dc_num, total_num_valid_gt) = rets
            for k, min_overlap in enumerate(min_overlaps[:, metric, m]):
                thresholdss = []
                for i in range(len(gt_annos)):
                    rets = compute_statistics_jit(
                        overlaps[i],
                        gt_datas_list[i],
                        dt_datas_list[i],
                        ignored_gts[i],
                        ignored_dets[i],
                        dontcares[i],
                        metric,
                        min_overlap=min_overlap,
                        thresh=0.0,
                        compute_fp=False)
                    tp, fp, fn, similarity, thresholds = rets
                    thresholdss += thresholds.tolist()
                thresholdss = np.array(thresholdss)
                thresholds = get_thresholds(thresholdss, total_num_valid_gt)
                thresholds = np.array(thresholds)
                all_thresholds[m, l, k, :len(thresholds)] = thresholds
                pr = np.zeros([len(thresholds), 4])
                idx = 0
                for j, num_part in enumerate(split_parts):
                    gt_datas_part = np.concatenate(
                        gt_datas_list[idx:idx + num_part], 0)
                    dt_datas_part = np.concatenate(
                        dt_datas_list[idx:idx + num_part], 0)
                    dc_datas_part = np.concatenate(
                        dontcares[idx:idx + num_part], 0)
                    ignored_dets_part = np.concatenate(
                        ignored_dets[idx:idx + num_part], 0)
                    ignored_gts_part = np.concatenate(
                        ignored_gts[idx:idx + num_part], 0)
                    fused_compute_statistics(
                        parted_overlaps[j],
                        pr,
                        total_gt_num[idx:idx + num_part],
                        total_dt_num[idx:idx + num_part],
                        total_dc_num[idx:idx + num_part],
                        gt_datas_part,
                        dt_datas_part,
                        dc_datas_part,
                        ignored_gts_part,
                        ignored_dets_part,
                        metric,
                        min_overlap=min_overlap,
                        thresholds=thresholds,
                        compute_aos=compute_aos)
                    idx += num_part
                for i in range(len(thresholds)):
                    precision[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 1])
                    if compute_aos:
                        aos[m, l, k, i] = pr[i, 3] / (pr[i, 0] + pr[i, 1])
                for i in range(len(thresholds)):
                    precision[m, l, k, i] = np.max(
                        precision[m, l, k, i:], axis=-1)
                    if compute_aos:
                        aos[m, l, k, i] = np.max(aos[m, l, k, i:], axis=-1)

    ret_dict = {
        # "recall": recall, # [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS]
        "precision": precision,
        "orientation": aos,
        "thresholds": all_thresholds,
        "min_overlaps": min_overlaps,
    }
    return ret_dict


def get_mAP_v2(prec):
    sums = 0
    for i in range(0, prec.shape[-1], 4):
        sums = sums + prec[..., i]
    return sums / 11 * 100


def do_eval_v2(gt_annos,
               dt_annos,
               current_classes,
               min_overlaps,
               compute_aos=False,
               difficultys=(0, 1, 2),
               z_axis=1,
               z_center=1.0):
    # min_overlaps: [num_minoverlap, metric, num_class]
    ret = eval_class(
        gt_annos,
        dt_annos,
        current_classes,
        difficultys,
        0,
        min_overlaps,
        compute_aos,
        z_axis=z_axis,
        z_center=z_center)
    # ret: [num_class, num_diff, num_minoverlap, num_sample_points]
    mAP_bbox = get_mAP_v2(ret["precision"])
    mAP_aos = None
    if compute_aos:
        mAP_aos = get_mAP_v2(ret["orientation"])
    ret = eval_class(
        gt_annos,
        dt_annos,
        current_classes,
        difficultys,
        1,
        min_overlaps,
        z_axis=z_axis,
        z_center=z_center)
    mAP_bev = get_mAP_v2(ret["precision"])
    ret = eval_class(
        gt_annos,
        dt_annos,
        current_classes,
        difficultys,
        2,
        min_overlaps,
        z_axis=z_axis,
        z_center=z_center)
    mAP_3d = get_mAP_v2(ret["precision"])
    return mAP_bbox, mAP_bev, mAP_3d, mAP_aos

def do_eval_v3(gt_annos,
               dt_annos,
               current_classes,
               min_overlaps,
               compute_aos=False,
               difficultys=(0, 1, 2),
               z_axis=1,
               z_center=1.0):
    # min_overlaps: [num_minoverlap, metric, num_class]
    types = ["bbox", "bev", "3d"]
    metrics = {}
    for i in range(3):
        ret = eval_class(
            gt_annos,
            dt_annos,
            current_classes,
            difficultys,
            i,
            min_overlaps,
            compute_aos,
            z_axis=z_axis,
            z_center=z_center)
        metrics[types[i]] = ret
    return metrics


def do_coco_style_eval(gt_annos,
                       dt_annos,
                       current_classes,
                       overlap_ranges,
                       compute_aos,
                       z_axis=1,
                       z_center=1.0):
    # overlap_ranges: [range, metric, num_class]
    min_overlaps = np.zeros([10, *overlap_ranges.shape[1:]])
    for i in range(overlap_ranges.shape[1]):
        for j in range(overlap_ranges.shape[2]):
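            # overlap_ranges holds floats, but np.linspace needs an integer num.
            start, stop, num = overlap_ranges[:, i, j]
            min_overlaps[:, i, j] = np.linspace(start, stop, int(num))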
    mAP_bbox, mAP_bev, mAP_3d, mAP_aos = do_eval_v2(
        gt_annos,
        dt_annos,
        current_classes,
        min_overlaps,
        compute_aos,
        z_axis=z_axis,
        z_center=z_center)
    # ret: [num_class, num_diff, num_minoverlap]
    mAP_bbox = mAP_bbox.mean(-1)
    mAP_bev = mAP_bev.mean(-1)
    mAP_3d = mAP_3d.mean(-1)
    if mAP_aos is not None:
        mAP_aos = mAP_aos.mean(-1)
    return mAP_bbox, mAP_bev, mAP_3d, mAP_aos


def print_str(value, *arg, sstream=None):
    if sstream is None:
        sstream = sysio.StringIO()
    sstream.truncate(0)
    sstream.seek(0)
    print(value, *arg, file=sstream)
    return sstream.getvalue()

def get_official_eval_result(gt_annos,
                             dt_annos,
                             current_classes,
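                             difficultys=(0, 1, 2),
                             z_axis=1,
                             z_center=1.0):
    """
        gt_annos and dt_annos must contain the following keys:
        [name, bbox, alpha, location, dimensions, rotation_y];
        gt_annos additionally needs [occluded, truncated] and
        dt_annos additionally needs [score].
    """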
    overlap_mod = np.array([[0.7, 0.5, 0.5, 0.7, 0.5, 0.7, 0.7, 0.7],
                            [0.7, 0.5, 0.5, 0.7, 0.5, 0.7, 0.7, 0.7],
                            [0.7, 0.5, 0.5, 0.7, 0.5, 0.7, 0.7, 0.7]])
    overlap_easy = np.array([[0.5, 0.5, 0.5, 0.7, 0.5, 0.5, 0.5, 0.5],
                            [0.5, 0.25, 0.25, 0.5, 0.25, 0.5, 0.5, 0.5],
                            [0.5, 0.25, 0.25, 0.5, 0.25, 0.5, 0.5, 0.5]])
    min_overlaps = np.stack([overlap_mod, overlap_easy], axis=0)  # [2, 3, 8]
    class_to_name = {
        0: 'Car',
        1: 'Pedestrian',
        2: 'Cyclist',
        3: 'Van',
        4: 'Person_sitting',
        5: 'car',
        6: 'tractor',
        7: 'trailer',
    }
    name_to_class = {v: n for n, v in class_to_name.items()}
    if not isinstance(current_classes, (list, tuple)):
        current_classes = [current_classes]
    current_classes_int = []
    for curcls in current_classes:
        if isinstance(curcls, str):
            current_classes_int.append(name_to_class[curcls])
        else:
            current_classes_int.append(curcls)
    current_classes = current_classes_int
    min_overlaps = min_overlaps[:, :, current_classes]
    result = ''
    # check whether alpha is valid
    compute_aos = False
    for anno in dt_annos:
        if anno['alpha'].shape[0] != 0:
            if anno['alpha'][0] != -10:
                compute_aos = True
            break
    metrics = do_eval_v3(
        gt_annos,
        dt_annos,
        current_classes,
        min_overlaps,
        compute_aos,
        difficultys,
        z_axis=z_axis,
        z_center=z_center)
    for j, curcls in enumerate(current_classes):
        # mAP threshold array: [num_minoverlap, metric, class]
        # mAP result: [num_class, num_diff, num_minoverlap]
        for i in range(min_overlaps.shape[0]):
            mAPbbox = get_mAP_v2(metrics["bbox"]["precision"][j, :, i])
            mAPbbox = ", ".join(f"{v:.2f}" for v in mAPbbox)
            mAPbev = get_mAP_v2(metrics["bev"]["precision"][j, :, i])
            mAPbev = ", ".join(f"{v:.2f}" for v in mAPbev)
            mAP3d = get_mAP_v2(metrics["3d"]["precision"][j, :, i])
            mAP3d = ", ".join(f"{v:.2f}" for v in mAP3d)
            result += print_str(
                (f"{class_to_name[curcls]} "
                 "AP(Average Precision)@{:.2f}, {:.2f}, {:.2f}:".format(*min_overlaps[i, :, j])))
            result += print_str(f"bbox AP:{mAPbbox}")
            result += print_str(f"bev  AP:{mAPbev}")
            result += print_str(f"3d   AP:{mAP3d}")
            if compute_aos:
                mAPaos = get_mAP_v2(metrics["bbox"]["orientation"][j, :, i])
                mAPaos = ", ".join(f"{v:.2f}" for v in mAPaos)
                result += print_str(f"aos  AP:{mAPaos}")


    return result


def get_coco_eval_result(gt_annos,
                         dt_annos,
                         current_classes,
                         z_axis=1,
                         z_center=1.0):
    class_to_name = {
        0: 'Car',
        1: 'Pedestrian',
        2: 'Cyclist',
        3: 'Van',
        4: 'Person_sitting',
        5: 'car',
        6: 'tractor',
        7: 'trailer',
    }
    # Per-class [start, stop, num] arguments for np.linspace over IoU thresholds.
    class_to_range = {
        0: [0.5, 0.95, 10],
        1: [0.25, 0.7, 10],
        2: [0.25, 0.7, 10],
        3: [0.5, 0.95, 10],
        4: [0.25, 0.7, 10],
        5: [0.5, 0.95, 10],
        6: [0.5, 0.95, 10],
        7: [0.5, 0.95, 10],
    }

    name_to_class = {v: n for n, v in class_to_name.items()}
    if not isinstance(current_classes, (list, tuple)):
        current_classes = [current_classes]
    current_classes_int = []
    for curcls in current_classes:
        if isinstance(curcls, str):
            current_classes_int.append(name_to_class[curcls])
        else:
            current_classes_int.append(curcls)
    current_classes = current_classes_int
    overlap_ranges = np.zeros([3, 3, len(current_classes)])
    for i, curcls in enumerate(current_classes):
        overlap_ranges[:, :, i] = np.array(
            class_to_range[curcls])[:, np.newaxis]
    result = ''
    # check whether alpha is valid
    compute_aos = False
    for anno in dt_annos:
        if anno['alpha'].shape[0] != 0:
            if anno['alpha'][0] != -10:
                compute_aos = True
            break
    mAPbbox, mAPbev, mAP3d, mAPaos = do_coco_style_eval(
        gt_annos,
        dt_annos,
        current_classes,
        overlap_ranges,
        compute_aos,
        z_axis=z_axis,
        z_center=z_center)
    for j, curcls in enumerate(current_classes):
        # mAP threshold array: [num_minoverlap, metric, class]
        # mAP result: [num_class, num_diff, num_minoverlap]
        o_range = np.array(class_to_range[curcls])[[0, 2, 1]]
        o_range[1] = (o_range[2] - o_range[0]) / (o_range[1] - 1)
        result += print_str((f"{class_to_name[curcls]} "
                             "coco AP@{:.2f}:{:.2f}:{:.2f}:".format(*o_range)))
        result += print_str((f"bbox AP:{mAPbbox[j, 0]:.2f}, "
                             f"{mAPbbox[j, 1]:.2f}, "
                             f"{mAPbbox[j, 2]:.2f}"))
        result += print_str((f"bev  AP:{mAPbev[j, 0]:.2f}, "
                             f"{mAPbev[j, 1]:.2f}, "
                             f"{mAPbev[j, 2]:.2f}"))
        result += print_str((f"3d   AP:{mAP3d[j, 0]:.2f}, "
                             f"{mAP3d[j, 1]:.2f}, "
                             f"{mAP3d[j, 2]:.2f}"))
        if compute_aos:
            result += print_str((f"aos  AP:{mAPaos[j, 0]:.2f}, "
                                 f"{mAPaos[j, 1]:.2f}, "
                                 f"{mAPaos[j, 2]:.2f}"))
    return result


================================================
FILE: disparity/eval/kitti-object-eval-python/eval.sh
================================================
#!/bin/bash
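# Usage: ./eval.sh <result_path> [class]   (class defaults to 0, i.e. Car)
echo "$1"
if [ -z "$2" ] ; then
    class="0"
else
    class=$2
fi
echo "$class"
# NOTE: label_path is a machine-specific absolute path; adjust it to your setup.
python3 evaluate.py evaluate \
    --label_path=/mnt/home/ylchen/ylchen/dataset/KITTI_DATASET/kitti_detection/training/label_2/ \
    --result_path="$1" \
    --current_class="$class" --coco=False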




================================================
FILE: disparity/eval/kitti-object-eval-python/eval_dist.sh
================================================
#!/bin/bash
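# Usage: ./eval_dist.sh <result_path> [class]   (class defaults to 0, i.e. Car)
echo "$1"
if [ -z "$2" ] ; then
    class="0"
else
    class=$2
fi
echo "$class"

# Evaluate separately for each 5-meter distance bin from 0 to 50 meters.
# NOTE: label_path is a machine-specific absolute path; adjust it to your setup.
for i in $(seq 0 5 45)
do
    echo "eval $i,$((i + 5)) meters"
    python3.6 evaluate.py evaluate \
        --label_path=/home/yilunchen/data/kitti/training/label_2/ \
        --result_path="$1" \
        --current_class="$class" --coco=False \
        --eval_dist="$i,$((i + 5))"
done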



================================================
FILE: disparity/eval/kitti-object-eval-python/evaluate.py
================================================
import time
import fire
import kitti_common as kitti
from eval import get_official_eval_result, get_coco_eval_result


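def _read_imageset_file(path):
    # One image index per line; blank lines are skipped so int() does not fail on them.
    with open(path, 'r') as f:
        lines = f.readlines()
    return [int(line) for line in lines if line.strip()]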


def evaluate(label_path,
             result_path,
             current_class=0,
             coco=False,
             score_thresh=-1,
             eval_dist=None):
    """Evaluate KITTI-format detections in result_path against labels in label_path.

    Detections scoring below score_thresh are dropped when score_thresh > 0.
    eval_dist is forwarded unchanged to kitti.get_label_annos.
    """
    # Load detections first so ground truth is read only for the same image ids.
    dt_annos, image_ids = kitti.get_label_annos(
        result_path, return_image_ids=True, eval_dist=eval_dist)
    print('Eval {} images'.format(len(dt_annos)))
    if score_thresh > 0:
        dt_annos = kitti.filter_annos_low_score(dt_annos, score_thresh)
    #val_image_ids = _read_imageset_file(label_split_file)
    gt_annos = kitti.get_label_annos(label_path, image_ids, eval_dist=eval_dist)
    if coco:
        print(get_coco_eval_result(gt_annos, dt_annos, current_class))
    else:
        print(get_official_eval_result(gt_annos, dt_annos, current_class))


if __name__ == '__main__':
    fire.Fire()
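

# Illustrative sketch, not part of the original script: reading an image-set
# split file the way _read_imageset_file does, then mapping each index to a
# six-digit KITTI label file name. The file name val.txt and its contents are
# hypothetical.

```python
import pathlib
import tempfile


def read_imageset_file(path):
    # One integer image index per line; blank lines are ignored.
    with open(path, 'r') as f:
        return [int(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    split = pathlib.Path(tmp) / "val.txt"
    split.write_text("000001\n000002\n000004\n")
    image_ids = read_imageset_file(split)

# KITTI names files by zero-padded six-digit index.
label_names = ["{:06d}.txt".format(i) for i in image_ids]
print(image_ids, label_names)
```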


================================================
FILE: disparity/eval/kitti-object-eval-python/kitti_common.py
================================================
import concurrent.futures as futures
import os
import pathlib
import re
from collections import OrderedDict

import numpy as np
from skimage import io

def get_image_index_str(img_idx):
    return "{:06d}".format(img_idx)


def get_kitti_info_path(idx,
                        prefix,
                        info_type='image_2',
                        file_tail='.png',
                        training=True,
                        relative_path=True):
    img_idx_str = get_image_index_str(idx)
    img_idx_str += file_tail
    prefix = pathlib.Path(prefix)
    if training:
        file_path = pathlib.Path('training') / info_type / img_idx_str
    else:
        file_path = pathlib.Path('testing') / info_type / img_idx_str
    if not (prefix / file_path).exists():
        raise ValueError("file does not exist: {}".format(file_path))
    if relative_path:
        return str(file_path)
    else:
        return str(prefix / file_path)
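

# Illustrative sketch, not part of the original module: the relative path
# that get_kitti_info_path builds, without the existence check. The helper
# name kitti_relative_path is hypothetical.

```python
import pathlib


def kitti_relative_path(idx, info_type='image_2', file_tail='.png',
                        training=True):
    # <split>/<info_type>/<six-digit index><extension>
    split = 'training' if training else 'testing'
    return pathlib.Path(split) / info_type / ("{:06d}".format(idx) + file_tail)


rel = kitti_relative_path(7, 'label_2', '.txt')
print(rel.as_posix())
```

# Joining this onto the dataset root gives the absolute path that the
# original function returns when relative_path is False.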


def get_image_path(idx, prefix, training=True, relative_path=True):
    return get_kitti_info_path(idx, prefix, 'image_2', '.png', training,
                               relative_path)


def get_label_path(idx, prefix, training=True, relative_path=True):
    return get_kitti_info_path(idx, prefix, 'label_2', '.txt', training,
                               relative_path)


def get_velodyne_path(idx, prefix, training=True, relative_path=True):
    return get_kitti_info_path(idx, prefix, 'velodyne', '.bin', training,
                               relative_path)


def get_calib_path(idx, prefix, training=True, relative_path=True):
    return get_kitti_info_path(idx, prefix, 'calib', '.txt', training,
                               relative_path)


def _extend_matrix(mat):
    mat = np.concatenate([mat, np.array([[0., 0., 0., 1.]])], axis=0)
    return mat
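

# Illustrative sketch, not part of the original module: _extend_matrix
# appends the row [0, 0, 0, 1] so a 3x4 matrix becomes a 4x4 homogeneous
# transform that can be chained with other 4x4 transforms by plain matrix
# multiplication. The sample matrix values are hypothetical.

```python
import numpy as np


def extend_matrix(mat):
    # Append the homogeneous row to a 3x4 matrix.
    return np.concatenate([mat, np.array([[0., 0., 0., 1.]])], axis=0)


P = np.arange(12, dtype=np.float64).reshape(3, 4)
P4 = extend_matrix(P)

# A homogeneous point keeps w == 1 after the extended transform.
point = np.array([1., 2., 3., 1.])
transformed = P4 @ point
print(P4.shape, transformed[3])
```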


def get_kitti_image_info(path,
                         training=True,
                         label_info=True,
                         velodyne=False,
                         calib=False,
                         image_ids=7481,
                         extend_matrix=True,
                         num_worker=8,
                         relative_path=True,
                         with_imageshape=True):
    # image_infos = []
    root_path = pathlib.Path(path)
    if not isinstance(image_ids, list):
        image_ids = list(range(image_ids))

    def map_func(idx):
        image_info = {'image_idx': idx}
        annotations = None
        if velodyne:
            image_info['velodyne_path'] = get_velodyne_path(
                idx, path, training, relative_path)
        image_info['img_path'] = get_image_path(idx, path, training,
                                                relative_path)
        if with_imageshape:
            img_path = image_info['img_path']
            if relative_path:
                img_path = str(root_path / img_path)
            image_info['img_shape'] = np.array(
                io.imread(img_path).shape[:2], dtype=np.int32)
        if label_info:
            label_path = get_label_path(idx, path, training, relative_path)
            if relative_path:
                label_path = str(root_path / label_path)
            annotations = get_label_anno(label_path)
        if calib:
            calib_path = get_calib_path(
                idx, path, training, relative_path=False)
            with open(calib_path, 'r') as f:
                lines = f.readlines()
            P0 = np.array(
                [float(info) for info in lines[0].split(' ')[1:13]]).reshape(
                    [3, 4])
            P1 = np.array(
                [float(info) for info in lines[1].split(' ')[1:13]]).reshape(
                    [3, 4])
            P2 = np.array(
                [float(info) for info in lines[2].split(' ')[1:13]]).reshape(
                    [3, 4])
            P3 = np.array(
                [float(info) for info in lines[3].split(' ')[1:13]]).reshape(
                    [3, 4])
            if extend_matrix:
                P0 = _extend_matrix(P0)
                P1 = _extend_matrix(P1)
                P2 = _extend_matrix(P2)
                P3 = _extend_matrix(P3)
            image_info['calib/P0'] = P0
            image_info['calib/P1'] = P1
            image_info['calib/P2'] = P2
            image_info['calib/P3'] = P3
            R0_rect = np.array([
                float(info) for info in lines[4].split(' ')[1:10]
            ]).reshape([3, 3])
            if extend_matrix:
                rect_4x4 = np.zeros([4, 4], dtype=R0_rect.dtype)
                rect_4x4[3, 3] = 1.
                rect_4x4[:3, :3] = R0_rect
            else:
                rect_4x4 = R0_rect
            image_info['calib/R0_rect'] = rect_4x4
            Tr_velo_to_cam = np.array([
                float(info) for info in lines[5].split(' ')[1:13]
            ]).reshape([3, 4])
            Tr_imu_to_velo = np.array([
                float(info) for info in lines[6].split(' ')[1:13]
            ]).reshape([3, 4])
            if extend_matrix:
                Tr_velo_to_cam = _extend_matrix(Tr_velo_to_cam)
                Tr_imu_to_velo = _extend_matrix(Tr_imu_to_velo)
            image_info['calib/Tr_velo_to_cam'] = Tr_velo_to_cam
            image_info['calib/Tr_imu_to_velo'] = Tr_imu_to_velo
        if annotations is not None:
            image_info['annos'] = annotations
            add_difficulty_to_annos(image_info)
        return image_info

    with futures.ThreadPoolExecutor(num_worker) as executor:
        image_infos = executor.map(map_func, image_ids)
    return list(image_infos)
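

# Illustrative sketch, not part of the original module: how one row of a
# KITTI calib file is parsed in map_func above. The first token is the key
# (for example "P2:"), followed by the matrix values in row-major order. The
# helper name parse_calib_row and the sample values are hypothetical.

```python
import numpy as np


def parse_calib_row(line, rows, cols):
    # Skip the leading key token and take rows * cols values.
    values = [float(tok) for tok in line.split(' ')[1:1 + rows * cols]]
    return np.array(values).reshape([rows, cols])


sample = "P2: 700.0 0.0 600.0 45.0 0.0 700.0 180.0 -0.3 0.0 0.0 1.0 0.005"
P2 = parse_calib_row(sample, 3, 4)

# Same extension step as _extend_matrix: append [0, 0, 0, 1].
P2_4x4 = np.concatenate([P2, np.array([[0., 0., 0., 1.]])], axis=0)
print(P2.shape, P2_4x4.shape)
```

# The 3x3 R0_rect row is read the same way with rows=3, cols=3, which matches
# the [1:10] slice used above.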


def filter_kitti_anno(image_anno,
                      used_classes,
                      used_difficulty=None,
                      dontcare_iou=None):
    if not isinstance(used_classes, (list, tuple)):
        used_classes = [used_classes]
    img_filtered_annotations = {}
    relevant_annotation_indices = [
        i for i, x in enumerate(image_anno['name']) if x in used_classes
    ]
    for key in image_anno.keys():
        img_filtered_annotations[key] = (
            image_anno[key][relevant_annotation_indices])
    if used_difficulty is not None:
        relevant_annotation_indices = [
            i for i, x in enumerate(img_filtered_annotations['difficulty'])
            if x in used_difficulty
        ]
        for key in image_anno.keys():
            img_filtered_annotations[key] = (
                img_filtered_annotations[key][relevant_annotation_indices])

    if 'DontCare' in used_classes and dontcare_iou is not None:
        dont_care_indices = [
            i for i, x in enumerate(img_filtered_annotations['name'])
            if x == 'DontCare'
        ]
        # bounding box format [y_min, x_min, y_max, x_max]
        all_boxes = img_filtered_annotations['bbox']
        ious = iou(all_boxes, all_boxes[dont_care_indices])

        # Remove all bounding boxes that overlap with a dontcare region.
        if ious.size > 0:
            boxes_to_remove = np.amax(ious, axis=1) > dontcare_iou
            for key in image_anno.keys():
                img_filtered_annotations[key] = (img_filtered_annotations[key][
                    np.logical_not(boxes_to_remove)])
    return img_filtered_annotations

def filter_annos_low_score(image_annos, thresh):
    new_image_annos = []
    for anno in image_annos:
        img_filtered_annotations = {}
        relevant_annotation_indices = [
            i for i, s in enumerate(anno['score']) if s >= thresh
        ]
        for key in anno.keys():
            img_filtered_annotations[key] = (
                anno[key][relevant_annotation_indices])
        new_image_annos.append(img_filtered_annotations)
    return new_image_annos
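

# Illustrative sketch, not part of the original module: the per-image
# filtering done by filter_annos_low_score, applied to a toy annotation. It
# assumes every value in an annotation dict is a NumPy array with one entry
# per detection; the sample names and scores are hypothetical.

```python
import numpy as np

anno = {
    'name': np.array(['Car', 'Car', 'Pedestrian']),
    'score': np.array([0.9, 0.2, 0.5]),
}
thresh = 0.5

# Keep the detections whose score is at least the threshold.
keep = [i for i, s in enumerate(anno['score']) if s >= thresh]
# Index every field with the same list so the fields stay aligned.
filtered = {key: value[keep] for key, value in anno.items()}
print(filtered['name'], filtered['score'])
```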

def kitti_result_line(result_dict, precision=4):
    prec_float = "{" + ":.{}f".format(precision) + "}"
    res_line = []
    all_field_default = OrderedDict([
        ('name', None),
        ('truncated', -1),
        ('occluded', -1),
        ('alpha', -10),
        ('bbox', None),
        ('dimensions', [-1, -1, -1]),
        ('location', [-1000, -1000, -1000]),
        ('rotation_y', -10),
        ('score', None),
    ])
    res_dict = [(key, None) for key, val in all_field_default.items()]
    res_dict = OrderedDict(res_dict)
    for key, val in result_dict.items():
        if all_field_default[key] is None and val is None:
            raise ValueError("you must specify a value for {}".format(key))
        res_dict[key] = val

    for key, val in res_dict.items():
        if key == 'name':
            res_line.append(val)
        elif key in ['truncated', 'alpha', 'rotation_y', 'score']:
            if val is None:
                res_line.append(str(all_field_default[key]))
            else:
                res_line.append(prec_float.format(val))
        elif key == 'occluded':
            if val is None:
                res_line.append(str(all_field_default[key]))
            else:
                res_line.append('{}'.format(val))
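        elif key in ['bbox', 'dimensions', 'location']:
            if val is None:
                default = all_field_default[key]
                # 'bbox' has no default, so a missing value cannot be filled in.
                if default is None:
                    raise ValueError(
                        "you must specify a value for {}".format(key))
                res_line += [str(v) for v in default]
            else:
                res_line += [prec_float.format(v) for v in val]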
        else:
            raise ValueError("unknown key. supported ke
SYMBOL INDEX (357 symbols across 50 files)

FILE: disparity/csrc/cpu/ROIAlign_cpu.cpp
  type PreCalc (line 6) | struct PreCalc {
  function pre_calc_for_bilinear_interpolate (line 18) | void pre_calc_for_bilinear_interpolate(
  function ROIAlignForward_cpu_kernel (line 114) | void ROIAlignForward_cpu_kernel(
  function ROIAlign_forward_cpu (line 221) | at::Tensor ROIAlign_forward_cpu(const at::Tensor& input,

FILE: disparity/csrc/cpu/nms_cpu.cpp
  function nms_cpu_kernel (line 6) | at::Tensor nms_cpu_kernel(const at::Tensor& dets,
  function nms_cpu (line 67) | at::Tensor nms_cpu(const at::Tensor& dets,

FILE: disparity/csrc/nms.h
  function threshold (line 12) | float threshold) {

FILE: disparity/csrc/vision.cpp
  function PYBIND11_MODULE (line 8) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: disparity/dataloader/KITTILoader.py
  function is_image_file (line 17) | def is_image_file(filename):
  function default_loader (line 21) | def default_loader(path):
  function npy_loader (line 26) | def npy_loader(path):
  function disparity_loader (line 29) | def disparity_loader(path):
  class myImageFloder (line 33) | class myImageFloder(data.Dataset):
    method __init__ (line 34) | def __init__(self, left, right, left_disparity, left_norm, training, l...
    method __getitem__ (line 45) | def __getitem__(self, index):
    method __len__ (line 124) | def __len__(self):

FILE: disparity/dataloader/KITTI_submission_loader.py
  function is_image_file (line 14) | def is_image_file(filename):
  function dataloader (line 17) | def dataloader(filepath):

FILE: disparity/dataloader/KITTI_submission_loader2012.py
  function is_image_file (line 14) | def is_image_file(filename):
  function dataloader (line 17) | def dataloader(filepath):

FILE: disparity/dataloader/KITTIloader2012.py
  function is_image_file (line 15) | def is_image_file(filename):
  function dataloader (line 18) | def dataloader(filepath, arg=False):

FILE: disparity/dataloader/KITTIloader2015.py
  function is_image_file (line 15) | def is_image_file(filename):
  function dataloader (line 18) | def dataloader(filepath):

FILE: disparity/dataloader/SceneFlowLoader_demo.py
  function is_image_file (line 22) | def is_image_file(filename):
  function default_loader (line 26) | def default_loader(path):
  function disparity_loader (line 30) | def disparity_loader(path):
  class myImageFloder (line 53) | class myImageFloder(data.Dataset):
    method __init__ (line 54) | def __init__(self,
    method __getitem__ (line 71) | def __getitem__(self, index):
    method __len__ (line 89) | def __len__(self):

FILE: disparity/dataloader/SecenFlowLoader.py
  function is_image_file (line 21) | def is_image_file(filename):
  function default_loader (line 25) | def default_loader(path):
  function disparity_loader (line 31) | def disparity_loader(path):
  function random_replace (line 34) | def random_replace(img,num,size):
  class myImageFloder (line 54) | class myImageFloder(data.Dataset):
    method __init__ (line 55) | def __init__(self, left, right, left_disparity,right_disparity, traini...
    method __getitem__ (line 69) | def __getitem__(self, index):
    method __len__ (line 162) | def __len__(self):

FILE: disparity/dataloader/SecenFlowLoader1.py
  function is_image_file (line 19) | def is_image_file(filename):
  function default_loader (line 23) | def default_loader(path):
  function disparity_loader (line 27) | def disparity_loader(path):
  class myImageFloder (line 31) | class myImageFloder(data.Dataset):
    method __init__ (line 32) | def __init__(self, left, right, left_disparity, training, loader=defau...
    method __getitem__ (line 41) | def __getitem__(self, index):
    method __len__ (line 79) | def __len__(self):

FILE: disparity/dataloader/SecenFlowLoaderfix.py
  function is_image_file (line 23) | def is_image_file(filename):
  function default_loader (line 27) | def default_loader(path):
  function disparity_loader (line 56) | def disparity_loader(path):
  class myImageFloder (line 78) | class myImageFloder(data.Dataset):
    method __init__ (line 79) | def __init__(self,
    method __getitem__ (line 98) | def __getitem__(self, index):
    method __len__ (line 121) | def __len__(self):

FILE: disparity/dataloader/Testloader.py
  function is_image_file (line 19) | def is_image_file(filename):
  function default_loader (line 23) | def default_loader(path):
  function disparity_loader (line 27) | def disparity_loader(path):
  function dataloader (line 32) | def dataloader(filepath):
  class myImageFloder (line 48) | class myImageFloder(data.Dataset):
    method __init__ (line 49) | def __init__(self, left, right, loader=default_loader):
    method __getitem__ (line 56) | def __getitem__(self, index):
    method __len__ (line 86) | def __len__(self):

FILE: disparity/dataloader/listflowfile.py
  function is_image_file (line 13) | def is_image_file(filename):
  function dataloader (line 17) | def dataloader(filepath):

FILE: disparity/dataloader/listflowfilefix.py
  function is_image_file (line 13) | def is_image_file(filename):
  function dataloader (line 16) | def dataloader(filepath): # /media/hugonie/Hhome/dataset/SceneFlowData/

FILE: disparity/dataloader/preprocess.py
  function totensor_normalize (line 26) | def totensor_normalize():
  function augmentv1 (line 37) | def augmentv1():
  function get_transform (line 62) | def get_transform(augment=True):

FILE: disparity/dataloader/readpfm.py
  function readPFM (line 6) | def readPFM(file):

FILE: disparity/eval/kitti-object-eval-python/eval.py
  function get_mAP (line 11) | def get_mAP(prec):
  function get_thresholds (line 19) | def get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):
  function clean_data (line 40) | def clean_data(gt_anno, dt_anno, current_class, difficulty):
  function image_box_overlap (line 100) | def image_box_overlap(boxes, query_boxes, criterion=-1):
  function bev_box_overlap (line 129) | def bev_box_overlap(boxes, qboxes, criterion=-1):
  function d3_box_overlap_kernel (line 135) | def d3_box_overlap_kernel(boxes,
  function d3_box_overlap (line 173) | def d3_box_overlap(boxes, qboxes, criterion=-1, z_axis=1, z_center=1.0):
  function compute_statistics_jit (line 185) | def compute_statistics_jit(overlaps,
  function get_split_parts (line 306) | def get_split_parts(num, num_part):
  function fused_compute_statistics (line 316) | def fused_compute_statistics(overlaps,
  function calculate_iou_partly (line 365) | def calculate_iou_partly(gt_annos,
  function _prepare_data (line 451) | def _prepare_data(gt_annos, dt_annos, current_class, difficulty):
  function eval_class (line 482) | def eval_class(gt_annos,
  function get_mAP_v2 (line 603) | def get_mAP_v2(prec):
  function do_eval_v2 (line 610) | def do_eval_v2(gt_annos,
  function do_eval_v3 (line 656) | def do_eval_v3(gt_annos,
  function do_coco_style_eval (line 682) | def do_coco_style_eval(gt_annos,
  function print_str (line 711) | def print_str(value, *arg, sstream=None):
  function get_official_eval_result (line 719) | def get_official_eval_result(gt_annos,
  function get_coco_eval_result (line 799) | def get_coco_eval_result(gt_annos,

FILE: disparity/eval/kitti-object-eval-python/evaluate.py
  function _read_imageset_file (line 7) | def _read_imageset_file(path):
  function evaluate (line 13) | def evaluate(label_path,

FILE: disparity/eval/kitti-object-eval-python/kitti_common.py
  function get_image_index_str (line 10) | def get_image_index_str(img_idx):
  function get_kitti_info_path (line 14) | def get_kitti_info_path(idx,
  function get_image_path (line 35) | def get_image_path(idx, prefix, training=True, relative_path=True):
  function get_label_path (line 40) | def get_label_path(idx, prefix, training=True, relative_path=True):
  function get_velodyne_path (line 45) | def get_velodyne_path(idx, prefix, training=True, relative_path=True):
  function get_calib_path (line 50) | def get_calib_path(idx, prefix, training=True, relative_path=True):
  function _extend_matrix (line 55) | def _extend_matrix(mat):
  function get_kitti_image_info (line 60) | def get_kitti_image_info(path,
  function filter_kitti_anno (line 151) | def filter_kitti_anno(image_anno,
  function filter_annos_low_score (line 190) | def filter_annos_low_score(image_annos, thresh):
  function kitti_result_line (line 203) | def kitti_result_line(result_dict, precision=4):
  function add_difficulty_to_annos (line 248) | def add_difficulty_to_annos(info):
  function get_label_anno (line 293) | def get_label_anno(label_path, eval_dist=None):
  function get_label_annos (line 336) | def get_label_annos(label_folder, image_ids=None, return_image_ids=False...
  function area (line 355) | def area(boxes, add1=False):
  function intersection (line 371) | def intersection(boxes1, boxes2, add1=False):
  function iou (line 402) | def iou(boxes1, boxes2, add1=False):

FILE: disparity/eval/kitti-object-eval-python/rotate_iou.py
  function div_up (line 13) | def div_up(m, n):
  function trangle_area (line 17) | def trangle_area(a, b, c):
  function area (line 23) | def area(int_pts, num_of_inter):
  function sort_vertex_in_convex_polygon (line 33) | def sort_vertex_in_convex_polygon(int_pts, num_of_inter):
  function line_segment_intersection (line 76) | def line_segment_intersection(pts1, pts2, i, j, temp_pts):
  function line_segment_intersection_v1 (line 122) | def line_segment_intersection_v1(pts1, pts2, i, j, temp_pts):
  function point_in_quadrilateral (line 161) | def point_in_quadrilateral(pt_x, pt_y, corners):
  function quadrilateral_intersection (line 180) | def quadrilateral_intersection(pts1, pts2, int_pts):
  function rbbox_to_corners (line 204) | def rbbox_to_corners(corners, rbbox):
  function inter (line 231) | def inter(rbbox1, rbbox2):
  function devRotateIoUEval (line 248) | def devRotateIoUEval(rbox1, rbox2, criterion=-1):
  function rotate_iou_kernel_eval (line 263) | def rotate_iou_kernel_eval(N, K, dev_boxes, dev_query_boxes, dev_iou, cr...
  function rotate_iou_gpu_eval (line 336) | def rotate_iou_gpu_eval(boxes, query_boxes, criterion=-1, device_id=0):

FILE: disparity/eval/kitti/evaluate_object_3d_offline.cpp
  type DIFFICULTY (line 37) | enum DIFFICULTY{EASY=0, MODERATE=1, HARD=2}
  type METRIC (line 40) | enum METRIC{IMAGE=0, GROUND=1, BOX3D=2}
  type CLASSES (line 48) | enum CLASSES{CAR=0, PEDESTRIAN=1, CYCLIST=2}
  function initGlobals (line 63) | void initGlobals () {
  type tPrData (line 74) | struct tPrData {
    method tPrData (line 80) | tPrData () :
  type tBox (line 85) | struct tBox {
    method tBox (line 92) | tBox (string type, double x1,double y1,double x2,double y2,double alph...
  type tGroundtruth (line 97) | struct tGroundtruth {
    method tGroundtruth (line 104) | tGroundtruth () :
    method tGroundtruth (line 106) | tGroundtruth (tBox box,double truncation,int32_t occlusion) :
    method tGroundtruth (line 108) | tGroundtruth (string type,double x1,double y1,double x2,double y2,doub...
  type tDetection (line 113) | struct tDetection {
    method tDetection (line 119) | tDetection ():
    method tDetection (line 121) | tDetection (tBox box,double thresh) :
    method tDetection (line 123) | tDetection (string type,double x1,double y1,double x2,double y2,double...
  function loadDetections (line 133) | vector<tDetection> loadDetections(string file_name, bool &compute_aos,
  function loadGroundtruth (line 180) | vector<tGroundtruth> loadGroundtruth(string file_name,bool &success) {
  function saveStats (line 206) | void saveStats (const vector<double> &precision, const vector<double> &a...
  function imageBoxOverlap (line 229) | inline double imageBoxOverlap(tBox a, tBox b, int32_t criterion=-1){
  function imageBoxOverlap (line 265) | inline double imageBoxOverlap(tDetection a, tGroundtruth b, int32_t crit...
  function Polygon (line 271) | Polygon toPolygon(const T& g) {
  function groundBoxOverlap (line 296) | inline double groundBoxOverlap(tDetection d, tGroundtruth g, int32_t cri...
  function box3DOverlap (line 319) | inline double box3DOverlap(tDetection d, tGroundtruth g, int32_t criteri...
  function getThresholds (line 348) | vector<double> getThresholds(vector<double> &v, double n_groundtruth){
  function cleanData (line 384) | void cleanData(CLASSES current_class, const vector<tGroundtruth> &gt, co...
  function tPrData (line 459) | tPrData computeStatistics(CLASSES current_class, const vector<tGroundtru...
    method tPrData (line 80) | tPrData () :
  function eval_class (line 623) | bool eval_class (FILE *fp_det, FILE *fp_ori, CLASSES current_class,
  function saveAndPlotPlots (line 707) | void saveAndPlotPlots(string dir_name,string file_name,string obj_type,v...
  function getEvalIndices (line 776) | vector<int32_t> getEvalIndices(const string& result_dir) {
  function eval (line 793) | bool eval(string gt_dir, string result_dir, Mail* mail){
  function main (line 922) | int32_t main (int32_t argc,char *argv[]) {

FILE: disparity/eval/kitti/mail.h
  function class (line 8) | class Mail {

FILE: disparity/layers/_utils.py
  function _load_C_extensions (line 14) | def _load_C_extensions():

FILE: disparity/layers/batch_norm.py
  class FrozenBatchNorm2d (line 6) | class FrozenBatchNorm2d(nn.Module):
    method __init__ (line 12) | def __init__(self, n):
    method forward (line 19) | def forward(self, x):

FILE: disparity/layers/build_cost_volume.py
  class _BuildCostVolume (line 10) | class _BuildCostVolume(Function):
    method forward (line 12) | def forward(ctx, left, right, shift):
    method backward (line 22) | def backward(ctx, grad_output):
  class BuildCostVolume (line 34) | class BuildCostVolume(nn.Module):
    method __init__ (line 35) | def __init__(self):
    method forward (line 38) | def forward(self, left, right, shift):
    method __repr__ (line 43) | def __repr__(self):

FILE: disparity/layers/iou_loss.py
  class IOULoss (line 5) | class IOULoss(nn.Module):
    method forward (line 6) | def forward(self, pred, target, weight=None):

FILE: disparity/layers/misc.py
  class _NewEmptyTensorOp (line 17) | class _NewEmptyTensorOp(torch.autograd.Function):
    method forward (line 19) | def forward(ctx, x, new_shape):
    method backward (line 24) | def backward(ctx, grad):
  class Conv2d (line 29) | class Conv2d(torch.nn.Conv2d):
    method forward (line 30) | def forward(self, x):
  class ConvTranspose2d (line 45) | class ConvTranspose2d(torch.nn.ConvTranspose2d):
    method forward (line 46) | def forward(self, x):
  class BatchNorm2d (line 66) | class BatchNorm2d(torch.nn.BatchNorm2d):
    method forward (line 67) | def forward(self, x):
  function interpolate (line 75) | def interpolate(

FILE: disparity/layers/roi_align.py
  class _ROIAlign (line 11) | class _ROIAlign(Function):
    method forward (line 13) | def forward(ctx, input, roi, output_size, spatial_scale, sampling_ratio):
    method backward (line 26) | def backward(ctx, grad_output):
  class ROIAlign (line 50) | class ROIAlign(nn.Module):
    method __init__ (line 51) | def __init__(self, output_size, spatial_scale, sampling_ratio):
    method forward (line 57) | def forward(self, input, rois):
    method __repr__ (line 62) | def __repr__(self):

FILE: disparity/layers/roi_pool.py
  class _ROIPool (line 11) | class _ROIPool(Function):
    method forward (line 13) | def forward(ctx, input, roi, output_size, spatial_scale):
    method backward (line 25) | def backward(ctx, grad_output):
  class ROIPool (line 49) | class ROIPool(nn.Module):
    method __init__ (line 50) | def __init__(self, output_size, spatial_scale):
    method forward (line 55) | def forward(self, input, rois):
    method __repr__ (line 58) | def __repr__(self):

FILE: disparity/layers/scale.py
  class Scale (line 5) | class Scale(nn.Module):
    method __init__ (line 6) | def __init__(self, init_value=1.0):
    method forward (line 10) | def forward(self, input):
  class ScaleShift (line 13) | class ScaleShift(nn.Module):
    method __init__ (line 14) | def __init__(self, scale_value, shift_value, exp=False):
    method forward (line 20) | def forward(self, input):

FILE: disparity/layers/sigmoid_focal_loss.py
  class _SigmoidFocalLoss (line 9) | class _SigmoidFocalLoss(Function):
    method forward (line 11) | def forward(ctx, logits, targets, gamma, alpha):
    method backward (line 25) | def backward(ctx, d_loss):
  function sigmoid_focal_loss_cpu (line 40) | def sigmoid_focal_loss_cpu(logits, targets, gamma, alpha):
  class SigmoidFocalLoss (line 55) | class SigmoidFocalLoss(nn.Module):
    method __init__ (line 56) | def __init__(self, gamma, alpha):
    method forward (line 61) | def forward(self, logits, targets, weights=None):
    method __repr__ (line 73) | def __repr__(self):

FILE: disparity/layers/smooth_l1_loss.py
  function smooth_l1_loss (line 6) | def smooth_l1_loss(input, target, beta=1. / 9, size_average=True):
  function l1_loss (line 18) | def l1_loss(input, target, beta=1., sum_last_dim=False):
  function l2_loss (line 25) | def l2_loss(input, target, beta=1., sum_last_dim=False):
  function ordinal_loss (line 34) | def ordinal_loss(input, target):
  function dorn_decode (line 46) | def dorn_decode(cls, reg, alpha, beta):
  function dorn_encode (line 59) | def dorn_encode(depth, alpha, beta, dorn_dim):
  function bce_loss (line 64) | def bce_loss(score, target):

FILE: disparity/models/ActiveStereoNet.py
  function convbn (line 7) | def convbn(in_channel, out_channel, kernel_size, stride, pad, dilation):
  function convbn_3d (line 19) | def convbn_3d(in_channel, out_channel, kernel_size, stride, pad):
  class ConvolutionBlock (line 30) | class ConvolutionBlock(nn.Module):
    method __init__ (line 31) | def __init__(self, in_channel, out_channel, stride, downsample, pad, d...
    method forward (line 39) | def forward(self, x):
  class ResNetBlock (line 44) | class ResNetBlock(nn.Module):
    method __init__ (line 45) | def __init__(self, in_channel, out_channel, stride, downsample, pad, d...
    method forward (line 51) | def forward(self, x):
  class Siamese_Tower (line 56) | class Siamese_Tower(nn.Module):
    method __init__ (line 57) | def __init__(self):
    method forward (line 77) | def forward(self, rgb_img):
  class Disparity_Refinement (line 91) | class Disparity_Refinement(nn.Module):
    method __init__ (line 93) | def __init__(self, in_channel):
    method forward (line 126) | def forward(self, low_disparity, corresponding_rgb):
  class Invalidation_Net (line 137) | class Invalidation_Net(nn.Module):
    method __init__ (line 139) | def __init__(self):
    method forward (line 160) | def forward(self, left_tower, right_tower, input_img, full_res_dispari...
  class disparityregression (line 187) | class disparityregression(nn.Module):
    method __init__ (line 188) | def __init__(self, maxdisp):
    method forward (line 193) | def forward(self, x):
  class Active_StereoNet (line 199) | class Active_StereoNet(nn.Module):
    method __init__ (line 200) | def __init__(self, maxdisp=144):
    method forward (line 217) | def forward(self, left, right):

FILE: disparity/models/stereonet.py
  function project_rect_to_image (line 15) | def project_rect_to_image(pts_3d_rect, P):
  class StereoNet (line 26) | class StereoNet(nn.Module):
    method __init__ (line 27) | def __init__(self, cfg=None):
    method forward (line 224) | def forward(self, left, right, calibs_fu, calibs_baseline, calibs_Proj...

FILE: disparity/models/stereonet_disp.py
  function convbn (line 12) | def convbn(in_channel, out_channel, kernel_size, stride, pad, dilation):
  function convbn_3d (line 24) | def convbn_3d(in_channel, out_channel, kernel_size, stride, pad):
  class BasicBlock (line 35) | class BasicBlock(nn.Module):
    method __init__ (line 36) | def __init__(self, in_channel, out_channel, stride, downsample, pad, d...
    method forward (line 45) | def forward(self, x):
  class FeatureExtraction (line 56) | class FeatureExtraction(nn.Module):
    method __init__ (line 57) | def __init__(self, k):
    method forward (line 79) | def forward(self, rgb_img):
  class EdgeAwareRefinement (line 87) | class EdgeAwareRefinement(nn.Module):
    method __init__ (line 88) | def __init__(self, in_channel):
    method forward (line 102) | def forward(self, low_disparity, corresponding_rgb):
  class disparityregression (line 119) | class disparityregression(nn.Module):
    method __init__ (line 120) | def __init__(self, maxdisp):
    method forward (line 125) | def forward(self, x):
  class StereoNet (line 131) | class StereoNet(nn.Module):
    method __init__ (line 132) | def __init__(self, k=3, r=3, maxdisp=192):
    method forward (line 151) | def forward(self, left, right):

FILE: disparity/models/submodule.py
  function convbn (line 11) | def convbn(in_planes, out_planes, kernel_size, stride, pad, dilation, gn...
  function convbn_3d (line 16) | def convbn_3d(in_planes, out_planes, kernel_size, stride, pad, gn=False,...
  class BasicBlock (line 20) | class BasicBlock(nn.Module):
    method __init__ (line 22) | def __init__(self, inplanes, planes, stride, downsample, pad, dilation...
    method forward (line 33) | def forward(self, x):
  class disparityregression (line 44) | class disparityregression(nn.Module):
    method __init__ (line 45) | def __init__(self, maxdisp, cfg):
    method forward (line 49) | def forward(self, x, depth):
  class hourglass (line 53) | class hourglass(nn.Module):
    method __init__ (line 54) | def __init__(self, inplanes, gn=False):
    method forward (line 78) | def forward(self, x, presqu, postsqu):
  class hourglass2d (line 99) | class hourglass2d(nn.Module):
    method __init__ (line 100) | def __init__(self, inplanes, gn=False):
    method forward (line 124) | def forward(self, x, presqu, postsqu):
  class feature_extraction (line 145) | class feature_extraction(nn.Module):
    method __init__ (line 146) | def __init__(self, cfg):
    method _make_layer (line 244) | def _make_layer(self, block, planes, blocks, stride, pad, dilation, gn...
    method forward (line 260) | def forward(self, x):

FILE: disparity/utils/logger.py
  function setup_logger (line 5) | def setup_logger(filepath):

FILE: disparity/utils/preprocess.py
  function scale_crop (line 33) | def scale_crop(input_size, scale_size=None, normalize=__imagenet_stats):
  function scale_random_crop (line 40) | def scale_random_crop(input_size, scale_size=None, normalize=__imagenet_...
  function pad_random_crop (line 52) | def pad_random_crop(input_size, scale_size=None, normalize=__imagenet_st...
  function inception_preproccess (line 62) | def inception_preproccess(input_size, normalize=__imagenet_stats):
  function inception_color_preproccess (line 69) | def inception_color_preproccess(input_size, normalize=__imagenet_stats):
  function get_transform (line 95) | def get_transform(name='imagenet', input_size=None,
  class Lighting (line 110) | class Lighting(object):
    method __init__ (line 113) | def __init__(self, alphastd, eigval, eigvec):
    method __call__ (line 118) | def __call__(self, img):
  class Grayscale (line 131) | class Grayscale(object):
    method __call__ (line 133) | def __call__(self, img):
  class Saturation (line 141) | class Saturation(object):
    method __init__ (line 143) | def __init__(self, var):
    method __call__ (line 146) | def __call__(self, img):
  class Brightness (line 152) | class Brightness(object):
    method __init__ (line 154) | def __init__(self, var):
    method __call__ (line 157) | def __call__(self, img):
  class Contrast (line 164) | class Contrast(object):
    method __init__ (line 166) | def __init__(self, var):
    method __call__ (line 169) | def __call__(self, img):
  class RandomOrder (line 177) | class RandomOrder(object):
    method __init__ (line 181) | def __init__(self, transforms):
    method __call__ (line 184) | def __call__(self, img):
  class ColorJitter (line 193) | class ColorJitter(RandomOrder):
    method __init__ (line 195) | def __init__(self, brightness=0.4, contrast=0.4, saturation=0.4):
  function get_transform_unsym (line 209) | def get_transform_unsym(left_img, right_img, size=[512, 960]):

FILE: disparity/utils/readpfm.py
  function readPFM (line 6) | def readPFM(file):

FILE: disparity/utils/utils.py
  function GERF_loss (line 11) | def GERF_loss(GT, pred, args):
  function smooth_L1_loss (line 22) | def smooth_L1_loss(GT, pred, args):

FILE: preprocessing/generate_disp.py
  function generate_dispariy_from_velo (line 12) | def generate_dispariy_from_velo(pc_velo, height, width, calib, depth_as_...

FILE: preprocessing/generate_lidar.py
  function project_disp_to_depth (line 10) | def project_disp_to_depth(calib, disp, max_high, baseline=0.54):

FILE: preprocessing/kitti_util.py
  class Calibration (line 11) | class Calibration(object):
    method __init__ (line 44) | def __init__(self, calib_filepath, right_calib=False):
    method read_calib_file (line 71) | def read_calib_file(self, filepath):
    method cart2hom (line 90) | def cart2hom(self, pts_3d):
    method project_velo_to_ref (line 101) | def project_velo_to_ref(self, pts_3d_velo):
    method project_ref_to_velo (line 105) | def project_ref_to_velo(self, pts_3d_ref):
    method project_rect_to_ref (line 109) | def project_rect_to_ref(self, pts_3d_rect):
    method project_ref_to_rect (line 113) | def project_ref_to_rect(self, pts_3d_ref):
    method project_rect_to_velo (line 117) | def project_rect_to_velo(self, pts_3d_rect):
    method project_velo_to_rect (line 124) | def project_velo_to_rect(self, pts_3d_velo):
    method project_rect_to_image (line 131) | def project_rect_to_image(self, pts_3d_rect):
    method project_velo_to_image (line 141) | def project_velo_to_image(self, pts_3d_velo):
    method project_image_to_rect (line 151) | def project_image_to_rect(self, uv_depth):
    method project_image_to_velo (line 165) | def project_image_to_velo(self, uv_depth):
  function inverse_rigid_trans (line 170) | def inverse_rigid_trans(Tr):

FILE: setup.py
  function get_extensions (line 17) | def get_extensions():

FILE: tools/env_utils/exp.py
  class Experimenter (line 8) | class Experimenter:
    method __init__ (line 9) | def __init__(self, model_dir, cfg_path=None):
    method config (line 36) | def config(self):
    method logger (line 40) | def logger(self):
    method writer (line 48) | def writer(self):

FILE: tools/env_utils/logger.py
  class colorlogger (line 16) | class colorlogger():
    method __init__ (line 17) | def __init__(self, log_dir, log_name='training.log'):
    method debug (line 36) | def debug(self, msg):
    method info (line 39) | def info(self, msg):
    method warning (line 42) | def warning(self, msg):
    method critical (line 45) | def critical(self, msg):
    method error (line 48) | def error(self, msg):
  function yellow (line 51) | def yellow(msg):
  function green (line 54) | def green(msg):
  function red (line 57) | def red(msg):
  function blue (line 60) | def blue(msg):
  function print_yellow (line 63) | def print_yellow(msg, **kwargs):
  function print_green (line 66) | def print_green(msg, **kwargs):
  function print_red (line 69) | def print_red(msg, **kwargs):
  function print_blue (line 72) | def print_blue(msg, **kwargs):
  function error (line 75) | def error(msg):
  function warning (line 78) | def warning(msg):

FILE: tools/env_utils/utils.py
  function mem_info (line 12) | def mem_info():
  function random_int (line 21) | def random_int(obj=None):
  function cmd (line 24) | def cmd(command):
  function reset_seed (line 30) | def reset_seed(seed):

FILE: tools/train_net_disp.py
  function get_parser (line 36) | def get_parser():
  function main (line 90) | def main():
  function main_process (line 129) | def main_process(args):
  function main_worker (line 132) | def main_worker(gpu, ngpus_per_node, args, cfg, exp):
  function train (line 276) | def train(model, cfg, args, optimizer, imgL, imgR, disp_L, norm_L,
  class BatchCollator (line 342) | class BatchCollator(object):
    method __init__ (line 343) | def __init__(self, cfg):
    method __call__ (line 347) | def __call__(self, batch):
  function adjust_learning_rate (line 361) | def adjust_learning_rate(optimizer, epoch, step=None, args=None):

Condensed preview: 86 files, each showing path, character count, and a content snippet. The full structured content is 349K chars.
[
  {
    "path": "LICENSE",
    "chars": 1067,
    "preview": "MIT License\n\nCopyright (c) 2020 Yilun Chen\n\nPermission is hereby granted, free of charge, to any person obtaining a copy"
  },
  {
    "path": "README.md",
    "chars": 8614,
    "preview": "<div align=\"left\">\n <img src=\"doc/log.png\" width=\"80%\">\n</div>\nX-StereoLab is an open source stereo matching and stereo "
  },
  {
    "path": "configs/config_disp.py",
    "chars": 677,
    "preview": "import os\nimport numpy as np\nfrom yacs.config import CfgNode as CN\n\ncfg = CN()\n\ncfg.cnt = 0\n\ncfg.btrain = 4\n\n\n#---------"
  },
  {
    "path": "data",
    "chars": 40,
    "preview": "/media/elonli/049150C23EB4F058/DSGN/data"
  },
  {
    "path": "disparity/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "disparity/csrc/BuildCostVolume.h",
    "chars": 850,
    "preview": "#pragma once\n\n#include \"cpu/vision.h\"\n\n#ifdef WITH_CUDA\n#include \"cuda/vision.h\"\n#endif\n\n// Interface for Python\nat::Ten"
  },
  {
    "path": "disparity/csrc/ROIAlign.h",
    "chars": 1654,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n\n#include \"cpu/vision.h\"\n\n#ifdef W"
  },
  {
    "path": "disparity/csrc/ROIPool.h",
    "chars": 1630,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n\n#include \"cpu/vision.h\"\n\n#ifdef W"
  },
  {
    "path": "disparity/csrc/SigmoidFocalLoss.h",
    "chars": 1043,
    "preview": "#pragma once\n\n#include \"cpu/vision.h\"\n\n#ifdef WITH_CUDA\n#include \"cuda/vision.h\"\n#endif\n\n// Interface for Python\nat::Ten"
  },
  {
    "path": "disparity/csrc/cpu/ROIAlign_cpu.cpp",
    "chars": 7939,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include \"cpu/vision.h\"\n\n// implementation take"
  },
  {
    "path": "disparity/csrc/cpu/nms_cpu.cpp",
    "chars": 2461,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include \"cpu/vision.h\"\n\n\ntemplate <typename sc"
  },
  {
    "path": "disparity/csrc/cpu/vision.h",
    "chars": 594,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n#include <torch/extension.h>\n\n\nat:"
  },
  {
    "path": "disparity/csrc/cuda/BuildCostVolume_cuda.cu",
    "chars": 9092,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDA"
  },
  {
    "path": "disparity/csrc/cuda/ROIAlign_cuda.cu",
    "chars": 12327,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDA"
  },
  {
    "path": "disparity/csrc/cuda/ROIPool_cuda.cu",
    "chars": 7855,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDA"
  },
  {
    "path": "disparity/csrc/cuda/SigmoidFocalLoss_cuda.cu",
    "chars": 5728,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n// This file is modified from  https://github.c"
  },
  {
    "path": "disparity/csrc/cuda/nms.cu",
    "chars": 4850,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDA"
  },
  {
    "path": "disparity/csrc/cuda/vision.h",
    "chars": 3947,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n#include <torch/extension.h>\n\n\nat:"
  },
  {
    "path": "disparity/csrc/nms.h",
    "chars": 716,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n#include \"cpu/vision.h\"\n\n#ifdef WI"
  },
  {
    "path": "disparity/csrc/vision.cpp",
    "chars": 937,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include \"nms.h\"\n#include \"ROIAlign.h\"\n#include"
  },
  {
    "path": "disparity/dataloader/DataStatistics.py",
    "chars": 100,
    "preview": "import torch\nfrom dataloader import listflowfile as lt\nfrom dataloader import SecenFlowLoader as DA\n"
  },
  {
    "path": "disparity/dataloader/KITTILoader.py",
    "chars": 4154,
    "preview": "import os\nimport torch\nimport torch.utils.data as data\nimport torch\nimport torchvision.transforms as transforms\nimport r"
  },
  {
    "path": "disparity/dataloader/KITTI_submission_loader.py",
    "chars": 696,
    "preview": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\nimport numpy as np\n\nIMG_EXTENSIONS = [\n "
  },
  {
    "path": "disparity/dataloader/KITTI_submission_loader2012.py",
    "chars": 636,
    "preview": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\nimport numpy as np\n\nIMG_EXTENSIONS = [\n "
  },
  {
    "path": "disparity/dataloader/KITTIloader2012.py",
    "chars": 1552,
    "preview": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\nimport numpy as np\nimport random\n\nIMG_EX"
  },
  {
    "path": "disparity/dataloader/KITTIloader2015.py",
    "chars": 1472,
    "preview": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\nimport numpy as np\nimport random\n\nIMG_EX"
  },
  {
    "path": "disparity/dataloader/SceneFlowLoader_demo.py",
    "chars": 2361,
    "preview": "import torch.utils.data as data\nimport random\nfrom PIL import Image\nfrom . import preprocess\nimport numpy as np\nimport s"
  },
  {
    "path": "disparity/dataloader/SecenFlowLoader.py",
    "chars": 5137,
    "preview": "import os\nimport torch\nimport torch.utils.data as data\nimport torch\n#import torchvision.transforms as transforms\nimport "
  },
  {
    "path": "disparity/dataloader/SecenFlowLoader1.py",
    "chars": 2290,
    "preview": "import os\nimport torch\nimport torch.utils.data as data\nimport torch\nimport torchvision.transforms as transforms\nimport r"
  },
  {
    "path": "disparity/dataloader/SecenFlowLoaderfix.py",
    "chars": 3556,
    "preview": "import torch.utils.data as data\nimport random\nfrom PIL import Image\nfrom . import preprocess\n# import preprocess\nimport "
  },
  {
    "path": "disparity/dataloader/Testloader.py",
    "chars": 2380,
    "preview": "\nimport os\nimport torch\nimport torch.utils.data as data\nimport torch\nimport torchvision.transforms as transforms\nimport "
  },
  {
    "path": "disparity/dataloader/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "disparity/dataloader/listflowfile.py",
    "chars": 4671,
    "preview": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', "
  },
  {
    "path": "disparity/dataloader/listflowfilefix.py",
    "chars": 4363,
    "preview": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', "
  },
  {
    "path": "disparity/dataloader/preprocess.py",
    "chars": 1609,
    "preview": "import torch\n#import torchvision.transforms as transforms\nimport random\n\nimport cv2\nimport albumentations as A\nfrom albu"
  },
  {
    "path": "disparity/dataloader/readpfm.py",
    "chars": 949,
    "preview": "import re\nimport numpy as np\nimport sys\n\n\ndef readPFM(file):\n    file = open(file, 'rb')\n\n    color = None\n    width = N"
  },
  {
    "path": "disparity/eval/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "disparity/eval/kitti/README.md",
    "chars": 916,
    "preview": "Reference: <a href=\"https://github.com/prclibo/kitti_eval\" target=\"_blank\">https://github.com/prclibo/kitti_eval</a>\n\n# "
  },
  {
    "path": "disparity/eval/kitti/compile.sh",
    "chars": 76,
    "preview": "#/bin/bash\ng++ -o evaluate_object_3d_offline evaluate_object_3d_offline.cpp\n"
  },
  {
    "path": "disparity/eval/kitti/eval.sh",
    "chars": 141,
    "preview": "echo \"evalutating $1 ...\"\n\n./evaluate_object_3d_offline /mnt/backup/project/ylchen/dataset/KITTI_DATASET/kitti_detection"
  },
  {
    "path": "disparity/eval/kitti/eval_05.sh",
    "chars": 100,
    "preview": "echo \"evalutating $1 ...\"\n\n./evaluate_object_3d_offline_05 ../../../data/kitti/training/label_2/ $1\n"
  },
  {
    "path": "disparity/eval/kitti/evaluate_object_3d_offline.cpp",
    "chars": 34511,
    "preview": "#include <iostream>\n#include <algorithm>\n#include <stdio.h>\n#include <math.h>\n#include <vector>\n#include <numeric>\n#incl"
  },
  {
    "path": "disparity/eval/kitti/mail.h",
    "chars": 811,
    "preview": "#ifndef MAIL_H\n#define MAIL_H\n\n#include <stdio.h>\n#include <stdarg.h>\n#include <string.h>\n\nclass Mail {\n\npublic:\n\n  Mail"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/.gitignore",
    "chars": 1202,
    "preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/LICENSE",
    "chars": 1057,
    "preview": "MIT License\n\nCopyright (c) 2018 \n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this s"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/README.md",
    "chars": 1686,
    "preview": "# kitti-object-eval-python\nFast kitti object detection eval in python(finish eval in less than 10 second), support 2d/be"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/eval.py",
    "chars": 33402,
    "preview": "import io as sysio\nimport time\n\nimport numba\nimport numpy as np\nfrom scipy.interpolate import interp1d\n\nfrom rotate_iou "
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/eval.sh",
    "chars": 286,
    "preview": "#!/bin/bash\necho $1\nif [ ! -n \"$2\" ] ; then\n    class=\"0\"\nelse\n    class=$2\nfi\necho $class\npython3 evaluate.py evaluate "
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/eval_dist.sh",
    "chars": 353,
    "preview": "#!/bin/bash\necho $1\nif [ ! -n \"$2\" ] ; then\n    class=\"0\"\nelse\n    class=$2\nfi\necho $class\n\nfor i in $(seq 0 5 45)\ndo\n\te"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/evaluate.py",
    "chars": 1026,
    "preview": "import time\nimport fire\nimport kitti_common as kitti\nfrom eval import get_official_eval_result, get_coco_eval_result\n\n\nd"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/kitti_common.py",
    "chars": 15578,
    "preview": "import concurrent.futures as futures\nimport os\nimport pathlib\nimport re\nfrom collections import OrderedDict\n\nimport nump"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/rotate_iou.py",
    "chars": 13474,
    "preview": "#####################\n# Based on https://github.com/hongzhenwang/RRPN-revise\n# Licensed under The MIT License\n# Author: "
  },
  {
    "path": "disparity/layers/__init__.py",
    "chars": 927,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\n\nfrom .batch_norm import FrozenBatc"
  },
  {
    "path": "disparity/layers/_utils.py",
    "chars": 1165,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport glob\nimport os.path\n\nimport torch\n\ntry:\n "
  },
  {
    "path": "disparity/layers/batch_norm.py",
    "chars": 799,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nfrom torch import nn\n\n\nclass Frozen"
  },
  {
    "path": "disparity/layers/build_cost_volume.py",
    "chars": 1104,
    "preview": "import torch\nfrom torch import nn\nfrom torch.autograd import Function\nfrom torch.autograd.function import once_different"
  },
  {
    "path": "disparity/layers/iou_loss.py",
    "chars": 1204,
    "preview": "import torch\nfrom torch import nn\n\n\nclass IOULoss(nn.Module):\n    def forward(self, pred, target, weight=None):\n        "
  },
  {
    "path": "disparity/layers/misc.py",
    "chars": 3504,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n\"\"\"\nhelper class that supports empty tensors on "
  },
  {
    "path": "disparity/layers/nms.py",
    "chars": 202,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n# from ._utils import _C\nfrom dsgn import _C\n\nnm"
  },
  {
    "path": "disparity/layers/roi_align.py",
    "chars": 2096,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nfrom torch import nn\nfrom torch.aut"
  },
  {
    "path": "disparity/layers/roi_pool.py",
    "chars": 1841,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nfrom torch import nn\nfrom torch.aut"
  },
  {
    "path": "disparity/layers/scale.py",
    "chars": 752,
    "preview": "import torch\nfrom torch import nn\n\n\nclass Scale(nn.Module):\n    def __init__(self, init_value=1.0):\n        super(Scale,"
  },
  {
    "path": "disparity/layers/sigmoid_focal_loss.py",
    "chars": 2423,
    "preview": "import torch\nfrom torch import nn\nfrom torch.autograd import Function\nfrom torch.autograd.function import once_different"
  },
  {
    "path": "disparity/layers/smooth_l1_loss.py",
    "chars": 2062,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nimport numpy as np\n\n# TODO maybe pu"
  },
  {
    "path": "disparity/models/ActiveStereoNet.py",
    "chars": 10405,
    "preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport numpy as np\nimport torch.backends.cudnn as cud"
  },
  {
    "path": "disparity/models/__init__.py",
    "chars": 115,
    "preview": "from .stereonet import StereoNet\nfrom .hitnet import HitNet\nfrom .stereonet_disp import StereoNet as stereonet_disp"
  },
  {
    "path": "disparity/models/stereonet.py",
    "chars": 24336,
    "preview": "from __future__ import print_function\n\nfrom .submodule import *\nimport torch\nimport torch.nn as nn\nimport torch.utils.da"
  },
  {
    "path": "disparity/models/stereonet_disp.py",
    "chars": 7681,
    "preview": "# ------------------------------------------------------------------------------\n# Copyright (c) NKU\n# Licensed under th"
  },
  {
    "path": "disparity/models/submodule.py",
    "chars": 14242,
    "preview": "from __future__ import print_function\nimport torch\nimport torch.nn as nn\nimport torch.utils.data\nfrom torch.autograd imp"
  },
  {
    "path": "disparity/utils/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "disparity/utils/logger.py",
    "chars": 919,
    "preview": "import logging\nimport os\n\n\ndef setup_logger(filepath):\n    file_formatter = logging.Formatter(\n        \"[%(asctime)s %(f"
  },
  {
    "path": "disparity/utils/preprocess.py",
    "chars": 6626,
    "preview": "import torch\nimport torchvision.transforms as transforms\nimport torchvision\nimport random\nimport numpy as np\n\n__imagenet"
  },
  {
    "path": "disparity/utils/readpfm.py",
    "chars": 914,
    "preview": "import re\nimport numpy as np\nimport sys\n \n\ndef readPFM(file):\n    file = open(file, 'rb')\n\n    color = None\n    width = "
  },
  {
    "path": "disparity/utils/tensorboardx.py",
    "chars": 526,
    "preview": "from tensorboardX import SummaryWriter\nimport numpy as np\nwriter = SummaryWriter(log_dir='/disk1/hyj/DFAStereo/ver2.0/ru"
  },
  {
    "path": "disparity/utils/utils.py",
    "chars": 1713,
    "preview": "# ------------------------------------------------------------------------------\n# Copyright (c) NKU\n# Licensed under th"
  },
  {
    "path": "preprocessing/generate_disp.py",
    "chars": 2986,
    "preview": "import argparse\nimport os\n\nimport numpy as np\nimport scipy.misc as ssc\n\nimport kitti_util\nimport imageio\n\nDEPTH_AS_DISP "
  },
  {
    "path": "preprocessing/generate_lidar.py",
    "chars": 2015,
    "preview": "import argparse\nimport os\n\nimport numpy as np\nimport scipy.misc as ssc\n\nimport kitti_util\nimport imageio\n\ndef project_di"
  },
  {
    "path": "preprocessing/kitti_util.py",
    "chars": 6291,
    "preview": "\"\"\" Helper methods for loading and parsing KITTI data.\n\nAuthor: Charles R. Qi\nDate: September 2017\n\"\"\"\nfrom __future__ i"
  },
  {
    "path": "requirement.txt",
    "chars": 100,
    "preview": "torch==1.3.0\ntorchvision==0.4.1\n\ntensorboardX\nyacs\nopencv-python\nfire    \n\nscipy\nscikit-image\nnumba\n"
  },
  {
    "path": "setup.py",
    "chars": 2067,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#!/usr/bin/env python\n\nimport glob\nimport os\n\nim"
  },
  {
    "path": "tools/env_utils/__init__.py",
    "chars": 82,
    "preview": "from .logger import colorlogger\nfrom .utils import *\nfrom .exp import Experimenter"
  },
  {
    "path": "tools/env_utils/exp.py",
    "chars": 2113,
    "preview": "import os\nimport sys\nimport numpy as np\n\nfrom .logger import colorlogger\nfrom tensorboardX import SummaryWriter \n\nclass "
  },
  {
    "path": "tools/env_utils/logger.py",
    "chars": 1957,
    "preview": "import logging\nimport os\n\nOK = '\\033[92m'\nWARNING = '\\033[93m'\nFAIL = '\\033[91m'\nEND = '\\033[0m'\n\nPINK = '\\033[95m'\nBLUE"
  },
  {
    "path": "tools/env_utils/utils.py",
    "chars": 915,
    "preview": "import os\nimport os.path as osp\nimport shutil\nimport sys\nimport numpy as np\nfrom datetime import datetime\nfrom glob impo"
  },
  {
    "path": "tools/train_net_disp.py",
    "chars": 16134,
    "preview": "from __future__ import print_function\n\nimport argparse\nimport os\nimport time\n\nimport numpy as np\nimport torch\nimport tor"
  }
]

// ... and 1 more file

About this extraction

This page contains the full source code of the meteorshowers/StereoNet GitHub repository, extracted and formatted as plain text. The extraction includes 86 files (325.0 KB), approximately 92.1k tokens, and a symbol index with 357 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
