[
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2020 Yilun Chen\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "<div align=\"left\">\n <img src=\"doc/log.png\" width=\"80%\">\n</div>\nX-StereoLab is an open source stereo matching and stereo 3D object detection toolbox based on PyTorch.\n\n## News: We released the codebase v0.0.0.\n* matching and detection model  result.\n<div align=\"center\">\n <img src=\"doc/demo.png\" width=\"50%\">\n</div>\n\n\n* GOOGLE HITNET model pytorch model will be released.\n<div align=\"center\">\n <img src=\"doc/hitnet.png\" width=\"80%\">\n</div>\n\n* GOOGLE HITNET model pytorch  KITTI2015 submission: http://www.cvlibs.net/datasets/kitti/eval_scene_flow_detail.php?benchmark=stereo&result=226494ba5559e9f5f46bdbd681d1564fee78409e\n  ranking 145 with 80GMAC\n\n\n### Requirements\nAll the codes are tested in the following environment:\n* Ubuntu 16.04\n* Python 3.7\n* PyTorch 1.1.0 or 1.2.0 or 1.3.0\n* Torchvision 0.2.2 or 0.4.1\n\n### Installation \n\n(1) Clone this repository.\n```\ngit clone git@github.com:meteorshowers/X-StereoLab.git && cd X-StereoLab\n```\n\n(2) Setup Python environment.\n```\nconda activate -n xstereolab\npip install -r requirements.txt --user\n\n## conda deactivate xstereolab\n```\n\n<!-- (3) Compile the rotated IoU library (for 3D detection). 
\n```\ncd X-stereoLab/utils/rotate_iou && bash compile.sh && cd ../../../\n```\n\n(4) Compile and install the X-StereoLab library (for 3D detection).\n```\n# The following installs the library with symbolic links, so\n# you can modify the files without rebuilding.\npython3 setup.py build develop --user\n``` -->\n\n### Data Preparation\n\n(1) Download the KITTI dataset, then link the data and output directories into the repository.\n```\nln -s /path/to/KITTI_DATA_PATH ./data/kitti/\nln -s /path/to/OUTPUT_PATH ./outputs/\n```\n\n\n### Multi-GPU Training\n\nThe training scripts support [multi-processing distributed training](https://github.com/pytorch/examples/tree/master/imagenet), which is typically much faster than the PyTorch DataParallel interface.\n```\npython3 tools/train_net_disp.py --cfg ./configs/config_xxx.py --savemodel ./outputs/MODEL_NAME -btrain 4 -d 0-3 --multiprocessing-distributed\n```\nThe trained models, configuration, and logs are saved in the model folder.\n\nTo load a pretrained model, run\n```\npython3 tools/train_net_disp.py --cfg xxx/config.py --loadmodel ./outputs/MODEL_NAMEx --start_epoch xxx --savemodel ./outputs/MODEL_NAME -btrain 4 -d 0-3 --multiprocessing-distributed\n```\nTo resume training from a given epoch, set `--cfg`, `--loadmodel`, and `--start_epoch` to the config file, checkpoint path, and epoch of the model you are resuming.\n\nYou can also start a TensorBoard session with\n```\ntensorboard --logdir=./outputs/MODEL_NAME/tensorboard --port=6666\n```\nand monitor training by opening http://localhost:6666 in your browser.\n\n### Inference and Evaluation\n\nWork in progress.\n\n### Stereo Matching Performance and Model Zoo\n\n<!-- We provide several pretrained models for our experiments. 
-->\n\n<table>\n    <thead>\n        <tr>\n            <th>Methods</th>\n            <th>Epochs</th>\n            <!-- <th>Inference Time(s/im)</th> -->\n            <th>Train Mem (GB/Img)</th>\n            <th>Test Mem (GB/Img)</th>\n            <th>EPE</th>\n            <th>D1-all</th>\n            <th>Models</th>\n        </tr>\n    </thead>\n    <tbody>\n        <tr>\n            <td>HITNET (kitti)</td>\n            <td>4200</td>\n            <td></td>\n            <td></td>\n            <td></td>\n            <td>2.43%</td>\n            <td><a href=> GoogleDrive </a></td>\n        </tr>\n        <tr>\n            <td>HITNET (sceneflow)</td>\n            <td>200</td>\n            <td></td>\n            <td></td>\n            <td>0.65</td>\n            <td></td>\n            <td><a href=> GoogleDrive </a></td>\n        </tr>\n        <tr>\n            <td>stereonet (sceneflow)</td>\n            <td>20</td>\n            <td></td>\n            <td></td>\n            <td>1.10</td>\n            <td></td>\n            <td><a href=> GoogleDrive </a></td>\n        </tr>\n        <tr>\n            <td>ActiveStereoNet</td>\n            <td>10</td>\n            <td></td>\n            <td></td>\n            <td></td>\n            <td></td>\n            <td><a href=> GoogleDrive </a></td>\n        </tr>\n        <tr>\n            <td>SOS</td>\n            <td></td>\n            <td></td>\n            <td></td>\n            <td></td>\n            <td></td>\n            <td></td>\n        </tr>\n    </tbody>\n</table>\n\n### Stereo 3D Detection Performance and Model Zoo\n#### PLUME: Efficient 3D Object Detection from Stereo Images\n\n<table>\n    <thead>\n        <tr>\n            <th>Methods</th>\n            <th>Epochs</th>\n            <!-- <th>Inference Time(s/im)</th> -->\n            <th>Train Mem (GB/Img)</th>\n            <th>Test Mem (GB/Img)</th>\n            <th>3D BEV AP (Ours small 
plume)</th>\n            <th>3D BEV AP (Paper small plume)</th>\n        </tr>\n    </thead>\n    <tbody>\n        <tr>\n            <td>PLUME</td>\n            <td></td>\n            <td></td>\n            <td></td>\n            <td>72.9 / 62.5 / 56.9</td>\n            <td>74.4 / 61.7 / 55.8</td>\n        </tr>\n    </tbody>\n</table>\n\n\n### Video Demo\n\nWe provide a video demo showing the results of X-StereoLab. It shows the disparity map predicted by ActiveStereoNet.\n\n<p align=\"center\"> <a href=\"https://www.youtube.com/watch?v=pqKZs1b1b0Y\"><img src=\"./doc/demo_cover.png\" width=\"50%\"></a> </p>\n\n### TODO List\n- [x] Multiprocessing GPU training\n- [x] TensorboardX\n- [x] Reduce training GPU memory usage\n- [x] Evaluation and test code\n- [ ] Result visualization\n- [ ] Still in progress\n\n\n\n### Citations\nIf you find our work useful in your research, please consider citing:\n```\n@misc{XStereoLab2021,\n    title={{X-StereoLab} stereo matching and stereo 3D object detection toolbox},\n    author={X-StereoLab Contributors},\n    howpublished = {\\url{https://github.com/meteorshowers/X-StereoLab}},\n    year={2021}\n}\n\n* reference [2]\n@article{tankovich2020hitnet,\n  title={HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching},\n  author={Tankovich, Vladimir and H{\\\"a}ne, Christian and Fanello, Sean and Zhang, Yinda and Izadi, Shahram and Bouaziz, Sofien},\n  journal={arXiv preprint arXiv:2007.12140},\n  year={2020}\n}\n\n* reference [3]\n@inproceedings{tankovich2018sos,\n  title={SOS: Stereo Matching in O(1) with Slanted Support Windows},\n  author={Tankovich, Vladimir and Schoenberg, Michael and Fanello, Sean Ryan and Kowdle, Adarsh and Rhemann, Christoph and Dzitsiuk, Maksym and Schmidt, Mirko and Valentin, Julien and Izadi, Shahram},\n  booktitle={2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},\n  pages={6782--6789},\n  year={2018},\n  organization={IEEE}\n}\n\n```\n\n\n## 
Other contributors\n\n<table border=\"0\">\n  <tbody>\n    <tr align=\"center\" >\n      <!-- <td>\n         <a href=\"https://github.com/shenweichen\"><img width=\"70\" height=\"70\" src=\"https://github.com/shenweichen.png?s=40\" alt=\"pic\"></a><br>\n         <a href=\"https://github.com/shenweichen\">Shen Weichen</a>\n        <p>\n        Alibaba Group  </p>\n      </td> -->\n      <td>\n         <a href=\"https://github.com/vtankovich\"><img width=\"70\" height=\"70\" src=\"https://avatars.githubusercontent.com/u/74434832?v=4\" alt=\"pic\"></a><br>\n         <a href=\"https://github.com/vtankovich\">vtankovich</a>\n        <p>Google</p>\n      </td>\n      <td>\n         <a href=\"https://github.com/mileyan\"><img width=\"70\" height=\"70\" src=\"https://avatars.githubusercontent.com/u/3722398?v=4\" alt=\"pic\"></a><br>\n         <a href=\"https://github.com/mileyan\">Yan Wang</a>\n        <p>Waymo</p>\n      </td>\n    </tr>\n  </tbody>\n</table>\n\n\n### Acknowledgment\n\n* Thanks to <a href=\"https://github.com/samehkhamis\">SamehKhamis</a> (NVIDIA).\n\n### License\nThe code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for non-commercial use only. 
Any commercial use requires formal permission first.\n\n### Contact\nIf you have any questions or suggestions about this repo, please feel free to contact me (xuanyili.edu@gmail.com).\nWeChat:\n<table border=\"0\">\n  <tbody>\n    <tr align=\"center\" >\n      <!-- <td>\n         <a href=\"https://github.com/shenweichen\"><img width=\"70\" height=\"70\" src=\"https://github.com/shenweichen.png?s=40\" alt=\"pic\"></a><br>\n         <a href=\"https://github.com/shenweichen\">Shen Weichen</a>\n        <p>\n        Alibaba Group  </p>\n      </td> -->\n      <td>\n         <a href=\"https://github.com/meteorshowers\"><img width=\"100\" height=\"100\" src='doc/wechat.png' alt=\"pic\"></a><br>\n         <a href=\"https://github.com/meteorshowers\">XUANYILI</a>\n        <p>  </p>\n      </td>\n    </tr>\n  </tbody>\n</table>\n"
  },
  {
    "path": "configs/config_disp.py",
    "content": "import os\nimport numpy as np\nfrom yacs.config import CfgNode as CN\n\ncfg = CN()\n\ncfg.cnt = 0\n\ncfg.btrain = 4\n\n\n#------------- disparity ---------------#\ncfg.model = 'stereonet' # ['stereonet', 'activestereonet', 'hitnet', 'sos']\ncfg.maxdisp = 192\ncfg.mindisp = 0\ncfg.loss_disp = True\n#--------------volume--------------------------#\ncfg.PlaneSweepVolume = False\ncfg.DispVolume = True\n\n\n#------------- depth ---------------#\n\n#------------- detection ---------------#\n\n\n#-------------- debug ----------------#\ncfg.debug = False\n\n#-------------- Parameters -----------#\n\n#----------------- centerness --------------#\n\n#----------------------------------------------------#\n\n\n\n\n\n\n\n"
  },
  {
    "path": "data",
    "content": "/media/elonli/049150C23EB4F058/DSGN/data"
  },
  {
    "path": "disparity/__init__.py",
    "content": ""
  },
  {
    "path": "disparity/csrc/BuildCostVolume.h",
    "content": "#pragma once\n\n#include \"cpu/vision.h\"\n\n#ifdef WITH_CUDA\n#include \"cuda/vision.h\"\n#endif\n\n// Interface for Python\nat::Tensor BuildCostVolume_forward(const at::Tensor& left,\n                            const at::Tensor& right,\n                            const at::Tensor& shift) {\n  if (left.type().is_cuda()) {\n#ifdef WITH_CUDA\n    return BuildCostVolume_forward_cuda(left, right, shift);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n  AT_ERROR(\"Not implemented on the CPU\");\n}\n\nstd::tuple<at::Tensor, at::Tensor> BuildCostVolume_backward(const at::Tensor& grad,\n                             const at::Tensor& shift) {\n  if (grad.type().is_cuda()) {\n#ifdef WITH_CUDA\n    return BuildCostVolume_backward_cuda(grad, shift);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n  AT_ERROR(\"Not implemented on the CPU\");\n}\n\n"
  },
  {
    "path": "disparity/csrc/ROIAlign.h",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n\n#include \"cpu/vision.h\"\n\n#ifdef WITH_CUDA\n#include \"cuda/vision.h\"\n#endif\n\n// Interface for Python\nat::Tensor ROIAlign_forward(const at::Tensor& input,\n                            const at::Tensor& rois,\n                            const float spatial_scale,\n                            const int pooled_height,\n                            const int pooled_width,\n                            const int sampling_ratio) {\n  if (input.type().is_cuda()) {\n#ifdef WITH_CUDA\n    return ROIAlign_forward_cuda(input, rois, spatial_scale, pooled_height, pooled_width, sampling_ratio);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n  return ROIAlign_forward_cpu(input, rois, spatial_scale, pooled_height, pooled_width, sampling_ratio);\n}\n\nat::Tensor ROIAlign_backward(const at::Tensor& grad,\n                             const at::Tensor& rois,\n                             const float spatial_scale,\n                             const int pooled_height,\n                             const int pooled_width,\n                             const int batch_size,\n                             const int channels,\n                             const int height,\n                             const int width,\n                             const int sampling_ratio) {\n  if (grad.type().is_cuda()) {\n#ifdef WITH_CUDA\n    return ROIAlign_backward_cuda(grad, rois, spatial_scale, pooled_height, pooled_width, batch_size, channels, height, width, sampling_ratio);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n  AT_ERROR(\"Not implemented on the CPU\");\n}\n\n"
  },
  {
    "path": "disparity/csrc/ROIPool.h",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n\n#include \"cpu/vision.h\"\n\n#ifdef WITH_CUDA\n#include \"cuda/vision.h\"\n#endif\n\n\nstd::tuple<at::Tensor, at::Tensor> ROIPool_forward(const at::Tensor& input,\n                                const at::Tensor& rois,\n                                const float spatial_scale,\n                                const int pooled_height,\n                                const int pooled_width) {\n  if (input.type().is_cuda()) {\n#ifdef WITH_CUDA\n    return ROIPool_forward_cuda(input, rois, spatial_scale, pooled_height, pooled_width);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n  AT_ERROR(\"Not implemented on the CPU\");\n}\n\nat::Tensor ROIPool_backward(const at::Tensor& grad,\n                                 const at::Tensor& input,\n                                 const at::Tensor& rois,\n                                 const at::Tensor& argmax,\n                                 const float spatial_scale,\n                                 const int pooled_height,\n                                 const int pooled_width,\n                                 const int batch_size,\n                                 const int channels,\n                                 const int height,\n                                 const int width) {\n  if (grad.type().is_cuda()) {\n#ifdef WITH_CUDA\n    return ROIPool_backward_cuda(grad, input, rois, argmax, spatial_scale, pooled_height, pooled_width, batch_size, channels, height, width);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n  AT_ERROR(\"Not implemented on the CPU\");\n}\n\n\n\n"
  },
  {
    "path": "disparity/csrc/SigmoidFocalLoss.h",
    "content": "#pragma once\n\n#include \"cpu/vision.h\"\n\n#ifdef WITH_CUDA\n#include \"cuda/vision.h\"\n#endif\n\n// Interface for Python\nat::Tensor SigmoidFocalLoss_forward(\n\t\tconst at::Tensor& logits,\n                const at::Tensor& targets,\n\t\tconst int num_classes, \n\t\tconst float gamma, \n\t\tconst float alpha) {\n  if (logits.type().is_cuda()) {\n#ifdef WITH_CUDA\n    return SigmoidFocalLoss_forward_cuda(logits, targets, num_classes, gamma, alpha);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n  AT_ERROR(\"Not implemented on the CPU\");\n}\n\nat::Tensor SigmoidFocalLoss_backward(\n\t\t\t     const at::Tensor& logits,\n                             const at::Tensor& targets,\n\t\t\t     const at::Tensor& d_losses,\n\t\t\t     const int num_classes,\n\t\t\t     const float gamma,\n\t\t\t     const float alpha) {\n  if (logits.type().is_cuda()) {\n#ifdef WITH_CUDA\n    return SigmoidFocalLoss_backward_cuda(logits, targets, d_losses, num_classes, gamma, alpha);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n  AT_ERROR(\"Not implemented on the CPU\");\n}\n"
  },
  {
    "path": "disparity/csrc/cpu/ROIAlign_cpu.cpp",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include \"cpu/vision.h\"\n\n// implementation taken from Caffe2\ntemplate <typename T>\nstruct PreCalc {\n  int pos1;\n  int pos2;\n  int pos3;\n  int pos4;\n  T w1;\n  T w2;\n  T w3;\n  T w4;\n};\n\ntemplate <typename T>\nvoid pre_calc_for_bilinear_interpolate(\n    const int height,\n    const int width,\n    const int pooled_height,\n    const int pooled_width,\n    const int iy_upper,\n    const int ix_upper,\n    T roi_start_h,\n    T roi_start_w,\n    T bin_size_h,\n    T bin_size_w,\n    int roi_bin_grid_h,\n    int roi_bin_grid_w,\n    std::vector<PreCalc<T>>& pre_calc) {\n  int pre_calc_index = 0;\n  for (int ph = 0; ph < pooled_height; ph++) {\n    for (int pw = 0; pw < pooled_width; pw++) {\n      for (int iy = 0; iy < iy_upper; iy++) {\n        const T yy = roi_start_h + ph * bin_size_h +\n            static_cast<T>(iy + .5f) * bin_size_h /\n                static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5\n        for (int ix = 0; ix < ix_upper; ix++) {\n          const T xx = roi_start_w + pw * bin_size_w +\n              static_cast<T>(ix + .5f) * bin_size_w /\n                  static_cast<T>(roi_bin_grid_w);\n\n          T x = xx;\n          T y = yy;\n          // deal with: inverse elements are out of feature map boundary\n          if (y < -1.0 || y > height || x < -1.0 || x > width) {\n            // empty\n            PreCalc<T> pc;\n            pc.pos1 = 0;\n            pc.pos2 = 0;\n            pc.pos3 = 0;\n            pc.pos4 = 0;\n            pc.w1 = 0;\n            pc.w2 = 0;\n            pc.w3 = 0;\n            pc.w4 = 0;\n            pre_calc[pre_calc_index] = pc;\n            pre_calc_index += 1;\n            continue;\n          }\n\n          if (y <= 0) {\n            y = 0;\n          }\n          if (x <= 0) {\n            x = 0;\n          }\n\n          int y_low = (int)y;\n          int x_low = (int)x;\n          int y_high;\n          
int x_high;\n\n          if (y_low >= height - 1) {\n            y_high = y_low = height - 1;\n            y = (T)y_low;\n          } else {\n            y_high = y_low + 1;\n          }\n\n          if (x_low >= width - 1) {\n            x_high = x_low = width - 1;\n            x = (T)x_low;\n          } else {\n            x_high = x_low + 1;\n          }\n\n          T ly = y - y_low;\n          T lx = x - x_low;\n          T hy = 1. - ly, hx = 1. - lx;\n          T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;\n\n          // save weights and indices\n          PreCalc<T> pc;\n          pc.pos1 = y_low * width + x_low;\n          pc.pos2 = y_low * width + x_high;\n          pc.pos3 = y_high * width + x_low;\n          pc.pos4 = y_high * width + x_high;\n          pc.w1 = w1;\n          pc.w2 = w2;\n          pc.w3 = w3;\n          pc.w4 = w4;\n          pre_calc[pre_calc_index] = pc;\n\n          pre_calc_index += 1;\n        }\n      }\n    }\n  }\n}\n\ntemplate <typename T>\nvoid ROIAlignForward_cpu_kernel(\n    const int nthreads,\n    const T* bottom_data,\n    const T& spatial_scale,\n    const int channels,\n    const int height,\n    const int width,\n    const int pooled_height,\n    const int pooled_width,\n    const int sampling_ratio,\n    const T* bottom_rois,\n    //int roi_cols,\n    T* top_data) {\n  //AT_ASSERT(roi_cols == 4 || roi_cols == 5);\n  int roi_cols = 5;\n\n  int n_rois = nthreads / channels / pooled_width / pooled_height;\n  // (n, c, ph, pw) is an element in the pooled output\n  // can be parallelized using omp\n  // #pragma omp parallel for num_threads(32)\n  for (int n = 0; n < n_rois; n++) {\n    int index_n = n * channels * pooled_width * pooled_height;\n\n    // roi could have 4 or 5 columns\n    const T* offset_bottom_rois = bottom_rois + n * roi_cols;\n    int roi_batch_ind = 0;\n    if (roi_cols == 5) {\n      roi_batch_ind = offset_bottom_rois[0];\n      offset_bottom_rois++;\n    }\n\n    // Do not use rounding; 
this implementation detail is critical\n    T roi_start_w = offset_bottom_rois[0] * spatial_scale;\n    T roi_start_h = offset_bottom_rois[1] * spatial_scale;\n    T roi_end_w = offset_bottom_rois[2] * spatial_scale;\n    T roi_end_h = offset_bottom_rois[3] * spatial_scale;\n    // T roi_start_w = round(offset_bottom_rois[0] * spatial_scale);\n    // T roi_start_h = round(offset_bottom_rois[1] * spatial_scale);\n    // T roi_end_w = round(offset_bottom_rois[2] * spatial_scale);\n    // T roi_end_h = round(offset_bottom_rois[3] * spatial_scale);\n\n    // Force malformed ROIs to be 1x1\n    T roi_width = std::max(roi_end_w - roi_start_w, (T)1.);\n    T roi_height = std::max(roi_end_h - roi_start_h, (T)1.);\n    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);\n    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);\n\n    // We use roi_bin_grid to sample the grid and mimic integral\n    int roi_bin_grid_h = (sampling_ratio > 0)\n        ? sampling_ratio\n        : ceil(roi_height / pooled_height); // e.g., = 2\n    int roi_bin_grid_w =\n        (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);\n\n    // We do average (integral) pooling inside a bin\n    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. 
= 4\n\n    // We precalculate the indices and weights shared by all channels;\n    // this is the key optimization.\n    std::vector<PreCalc<T>> pre_calc(\n        roi_bin_grid_h * roi_bin_grid_w * pooled_width * pooled_height);\n    pre_calc_for_bilinear_interpolate(\n        height,\n        width,\n        pooled_height,\n        pooled_width,\n        roi_bin_grid_h,\n        roi_bin_grid_w,\n        roi_start_h,\n        roi_start_w,\n        bin_size_h,\n        bin_size_w,\n        roi_bin_grid_h,\n        roi_bin_grid_w,\n        pre_calc);\n\n    for (int c = 0; c < channels; c++) {\n      int index_n_c = index_n + c * pooled_width * pooled_height;\n      const T* offset_bottom_data =\n          bottom_data + (roi_batch_ind * channels + c) * height * width;\n      int pre_calc_index = 0;\n\n      for (int ph = 0; ph < pooled_height; ph++) {\n        for (int pw = 0; pw < pooled_width; pw++) {\n          int index = index_n_c + ph * pooled_width + pw;\n\n          T output_val = 0.;\n          for (int iy = 0; iy < roi_bin_grid_h; iy++) {\n            for (int ix = 0; ix < roi_bin_grid_w; ix++) {\n              PreCalc<T> pc = pre_calc[pre_calc_index];\n              output_val += pc.w1 * offset_bottom_data[pc.pos1] +\n                  pc.w2 * offset_bottom_data[pc.pos2] +\n                  pc.w3 * offset_bottom_data[pc.pos3] +\n                  pc.w4 * offset_bottom_data[pc.pos4];\n\n              pre_calc_index += 1;\n            }\n          }\n          output_val /= count;\n\n          top_data[index] = output_val;\n        } // for pw\n      } // for ph\n    } // for c\n  } // for n\n}\n\nat::Tensor ROIAlign_forward_cpu(const at::Tensor& input,\n                                const at::Tensor& rois,\n                                const float spatial_scale,\n                                const int pooled_height,\n                                const int pooled_width,\n                                const int sampling_ratio) {\n  
AT_ASSERTM(!input.type().is_cuda(), \"input must be a CPU tensor\");\n  AT_ASSERTM(!rois.type().is_cuda(), \"rois must be a CPU tensor\");\n\n  auto num_rois = rois.size(0);\n  auto channels = input.size(1);\n  auto height = input.size(2);\n  auto width = input.size(3);\n\n  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());\n  auto output_size = num_rois * pooled_height * pooled_width * channels;\n\n  if (output.numel() == 0) {\n    return output;\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(input.type(), \"ROIAlign_forward\", [&] {\n    ROIAlignForward_cpu_kernel<scalar_t>(\n         output_size,\n         input.data<scalar_t>(),\n         spatial_scale,\n         channels,\n         height,\n         width,\n         pooled_height,\n         pooled_width,\n         sampling_ratio,\n         rois.data<scalar_t>(),\n         output.data<scalar_t>());\n  });\n  return output;\n}\n"
  },
  {
    "path": "disparity/csrc/cpu/nms_cpu.cpp",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include \"cpu/vision.h\"\n\n\ntemplate <typename scalar_t>\nat::Tensor nms_cpu_kernel(const at::Tensor& dets,\n                          const at::Tensor& scores,\n                          const float threshold) {\n  AT_ASSERTM(!dets.type().is_cuda(), \"dets must be a CPU tensor\");\n  AT_ASSERTM(!scores.type().is_cuda(), \"scores must be a CPU tensor\");\n  AT_ASSERTM(dets.type() == scores.type(), \"dets should have the same type as scores\");\n\n  if (dets.numel() == 0) {\n    return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU));\n  }\n\n  auto x1_t = dets.select(1, 0).contiguous();\n  auto y1_t = dets.select(1, 1).contiguous();\n  auto x2_t = dets.select(1, 2).contiguous();\n  auto y2_t = dets.select(1, 3).contiguous();\n\n  at::Tensor areas_t = (x2_t - x1_t + 1) * (y2_t - y1_t + 1);\n\n  auto order_t = std::get<1>(scores.sort(0, /* descending=*/true));\n\n  auto ndets = dets.size(0);\n  at::Tensor suppressed_t = at::zeros({ndets}, dets.options().dtype(at::kByte).device(at::kCPU));\n\n  auto suppressed = suppressed_t.data<uint8_t>();\n  auto order = order_t.data<int64_t>();\n  auto x1 = x1_t.data<scalar_t>();\n  auto y1 = y1_t.data<scalar_t>();\n  auto x2 = x2_t.data<scalar_t>();\n  auto y2 = y2_t.data<scalar_t>();\n  auto areas = areas_t.data<scalar_t>();\n\n  for (int64_t _i = 0; _i < ndets; _i++) {\n    auto i = order[_i];\n    if (suppressed[i] == 1)\n      continue;\n    auto ix1 = x1[i];\n    auto iy1 = y1[i];\n    auto ix2 = x2[i];\n    auto iy2 = y2[i];\n    auto iarea = areas[i];\n\n    for (int64_t _j = _i + 1; _j < ndets; _j++) {\n      auto j = order[_j];\n      if (suppressed[j] == 1)\n        continue;\n      auto xx1 = std::max(ix1, x1[j]);\n      auto yy1 = std::max(iy1, y1[j]);\n      auto xx2 = std::min(ix2, x2[j]);\n      auto yy2 = std::min(iy2, y2[j]);\n\n      auto w = std::max(static_cast<scalar_t>(0), xx2 - xx1 + 1);\n      auto 
h = std::max(static_cast<scalar_t>(0), yy2 - yy1 + 1);\n      auto inter = w * h;\n      auto ovr = inter / (iarea + areas[j] - inter);\n      if (ovr >= threshold)\n        suppressed[j] = 1;\n   }\n  }\n  return at::nonzero(suppressed_t == 0).squeeze(1);\n}\n\nat::Tensor nms_cpu(const at::Tensor& dets,\n               const at::Tensor& scores,\n               const float threshold) {\n  at::Tensor result;\n  AT_DISPATCH_FLOATING_TYPES(dets.type(), \"nms\", [&] {\n    result = nms_cpu_kernel<scalar_t>(dets, scores, threshold);\n  });\n  return result;\n}\n"
  },
  {
    "path": "disparity/csrc/cpu/vision.h",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n#include <torch/extension.h>\n\n\nat::Tensor ROIAlign_forward_cpu(const at::Tensor& input,\n                                const at::Tensor& rois,\n                                const float spatial_scale,\n                                const int pooled_height,\n                                const int pooled_width,\n                                const int sampling_ratio);\n\n\nat::Tensor nms_cpu(const at::Tensor& dets,\n                   const at::Tensor& scores,\n                   const float threshold);\n"
  },
  {
    "path": "disparity/csrc/cuda/BuildCostVolume_cuda.cu",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#include <THC/THC.h>\n#include <THC/THCAtomics.cuh>\n#include <THC/THCDeviceUtils.cuh>\n\n// TODO make it in a common file\n#define CUDA_1D_KERNEL_LOOP(i, n)                            \\\n  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \\\n       i += blockDim.x * gridDim.x)\n\n\ntemplate <typename T>\n__device__ T bilinear_interpolate(const T* bottom_data,\n    const int height, const int width,\n    T y, T x) {\n\n  // deal with cases that inverse elements are out of feature map boundary\n  if (y < -1.0 || y > height || x < -1.0 || x > width) {\n    //empty\n    return 0;\n  }\n\n  if (y <= 0) y = 0;\n  if (x <= 0) x = 0;\n\n  int y_low = (int) y;\n  int x_low = (int) x;\n  int y_high;\n  int x_high;\n\n  if (y_low >= height - 1) {\n    y_high = y_low = height - 1;\n    y = (T) y_low;\n  } else {\n    y_high = y_low + 1;\n  }\n\n  if (x_low >= width - 1) {\n    x_high = x_low = width - 1;\n    x = (T) x_low;\n  } else {\n    x_high = x_low + 1;\n  }\n\n  T ly = y - y_low;\n  T lx = x - x_low;\n  T hy = 1. - ly, hx = 1. 
- lx;\n  // do bilinear interpolation\n  T v1 = bottom_data[y_low * width + x_low];\n  T v2 = bottom_data[y_low * width + x_high];\n  T v3 = bottom_data[y_high * width + x_low];\n  T v4 = bottom_data[y_high * width + x_high];\n  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;\n\n  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);\n\n  return val;\n}\n\ntemplate <typename T>\n__global__ void BuildCostVolumeForward(const int nthreads, \n    const T* left, const T* right, const T* shift, \n    const int num_batch, const int channels, const int height,\n    const int width, const int max_disp,\n    T* cost) {\n  CUDA_1D_KERNEL_LOOP(index, nthreads) {\n    int pw = index % width;\n    int ph = (index / width) % height;\n    int pd = (index / width / height) % max_disp;\n    int c = (index / width / height/ max_disp) % channels;\n    int n = index / width / height / max_disp / channels;\n\n    int index_L = (((n * 2 * channels + c) * max_disp + pd) * height + ph) * width + pw;\n    int index_R = index_L + channels * max_disp * height * width;\n\n    T shift_pd = -shift[n * max_disp + pd];\n\n    cost[index_L] = left[((n * channels + c) * height + ph) * width + pw];\n\n    if (pw + shift_pd >= 0. 
&& pw + shift_pd <= width - 1)\n    {\n        const T* offset_right = right + (n * channels + c) * height * width;\n        cost[index_R] = bilinear_interpolate(offset_right, height, width, (T)ph, (T)pw + shift_pd);\n    }\n    else \n    {\n        cost[index_R] = 0.;\n    }\n  }\n}\n\n\ntemplate <typename T>\n__device__ void bilinear_interpolate_gradient(\n    const int height, const int width,\n    T y, T x,\n    T & w1, T & w2, T & w3, T & w4,\n    int & x_low, int & x_high, int & y_low, int & y_high) {\n\n  // deal with cases that inverse elements are out of feature map boundary\n  if (y < -1.0 || y > height || x < -1.0 || x > width) {\n    //empty\n    w1 = w2 = w3 = w4 = 0.;\n    x_low = x_high = y_low = y_high = -1;\n    return;\n  }\n\n  if (y <= 0) y = 0;\n  if (x <= 0) x = 0;\n\n  y_low = (int) y;\n  x_low = (int) x;\n\n  if (y_low >= height - 1) {\n    y_high = y_low = height - 1;\n    y = (T) y_low;\n  } else {\n    y_high = y_low + 1;\n  }\n\n  if (x_low >= width - 1) {\n    x_high = x_low = width - 1;\n    x = (T) x_low;\n  } else {\n    x_high = x_low + 1;\n  }\n\n  T ly = y - y_low;\n  T lx = x - x_low;\n  T hy = 1. - ly, hx = 1. 
- lx;\n\n  // reference in forward\n  // T v1 = bottom_data[y_low * width + x_low];\n  // T v2 = bottom_data[y_low * width + x_high];\n  // T v3 = bottom_data[y_high * width + x_low];\n  // T v4 = bottom_data[y_high * width + x_high];\n  // T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);\n\n  w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;\n\n  return;\n}\n\ntemplate <typename T>\n__global__ void BuildCostVolumeBackwardFeature(const int nthreads, \n    const T* grad, const T* shift, \n    const int num_batch, const int channels, const int height,\n    const int width, const int max_disp,\n    T* grad_left, T* grad_right) {\n  CUDA_1D_KERNEL_LOOP(index, nthreads) {\n    int pw = index % width;\n    int ph = (index / width) % height;\n    int pd = (index / width / height) % max_disp;\n    int c = (index / width / height/ max_disp) % channels;\n    int n = index / width / height / max_disp / channels;\n\n    int index_L = (((n * 2 * channels + c) * max_disp + pd) * height + ph) * width + pw;\n    int index_R = index_L + channels * max_disp * height * width;\n\n    T shift_pd = -shift[n * max_disp + pd];\n\n    // left\n    atomicAdd(grad_left + ((n * channels + c) * height + ph) * width + pw, static_cast<T>(grad[index_L]));\n\n    if (pw + shift_pd >= 0. 
&& pw + shift_pd <= width - 1)\n    {\n        // right\n        T w1, w2, w3, w4;\n        int x_low, x_high, y_low, y_high;\n\n        bilinear_interpolate_gradient(height, width, (T) ph, (T) pw + shift_pd,\n            w1, w2, w3, w4,\n            x_low, x_high, y_low, y_high);\n\n        T top_diff_this_bin = grad[index_R];\n        T g1 = top_diff_this_bin * w1;\n        T g2 = top_diff_this_bin * w2;\n        T g3 = top_diff_this_bin * w3;\n        T g4 = top_diff_this_bin * w4;\n\n        T* offset_grad_right = grad_right + (n * channels + c) * height * width;\n        if (w1 >= 1e-10)\n            atomicAdd(offset_grad_right + y_low * width + x_low, static_cast<T>(g1));\n        if (w2 >= 1e-10)\n            atomicAdd(offset_grad_right + y_low * width + x_high, static_cast<T>(g2));\n        if (w3 >= 1e-10)\n            atomicAdd(offset_grad_right + y_high * width + x_low, static_cast<T>(g3));\n        if (w4 >= 1e-10)\n            atomicAdd(offset_grad_right + y_high * width + x_high, static_cast<T>(g4));\n    }\n  } // CUDA_1D_KERNEL_LOOP\n} // BuildCostVolumeBackwardFeature\n\n\nat::Tensor BuildCostVolume_forward_cuda(const at::Tensor& left,\n                                 const at::Tensor& right,\n                                 const at::Tensor& shift) {\n  AT_ASSERTM(left.type().is_cuda(), \"left must be a CUDA tensor\");\n  AT_ASSERTM(right.type().is_cuda(), \"right must be a CUDA tensor\");\n  AT_ASSERTM(shift.type().is_cuda(), \"shift must be a CUDA tensor\");\n\n  AT_ASSERTM((left.size(0) == right.size(0)) && (left.size(1) == right.size(1)) && \\\n    (left.size(2) == right.size(2)) && (left.size(3) == right.size(3)), \\\n    \"Left and right images must have the same size.\");\n  AT_ASSERTM(left.size(0) == shift.size(0), \\\n    \"Images and shift must have the same batch size.\");\n\n  auto num_batch = left.size(0);\n  auto channels = left.size(1);\n  auto height = left.size(2);\n  auto width = left.size(3);\n  auto max_disp = shift.size(1);\n\n  auto 
output = at::empty({num_batch, channels * 2, max_disp, height, width}, left.options());\n  auto output_size = num_batch * channels * 2 * max_disp * height * width;\n  cudaStream_t stream = at::cuda::getCurrentCUDAStream();\n\n  dim3 grid(std::min(THCCeilDiv((long)(output_size / 2), 512L), 4096L));\n  dim3 block(512);\n\n  if (output.numel() == 0) {\n    THCudaCheck(cudaGetLastError());\n    return output;\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(left.type(), \"BuildCostVolume_forward\", [&] {\n    BuildCostVolumeForward<scalar_t><<<grid, block, 0, stream>>>(\n         output_size / 2,\n         left.contiguous().data<scalar_t>(),\n         right.contiguous().data<scalar_t>(),\n         shift.contiguous().data<scalar_t>(),\n         num_batch,\n         channels,\n         height,\n         width,\n         max_disp,\n         output.data<scalar_t>());\n  });\n  THCudaCheck(cudaGetLastError());\n  return output;\n}\n\n// TODO remove the dependency on input and use instead its sizes -> save memory\nstd::tuple<at::Tensor, at::Tensor> BuildCostVolume_backward_cuda(const at::Tensor& grad,\n                                  const at::Tensor& shift) {\n  AT_ASSERTM(shift.type().is_cuda(), \"shift must be a CUDA tensor\");\n\n  auto num_batch = grad.size(0);\n  auto channels = grad.size(1) / 2;\n  auto height = grad.size(3);\n  auto width = grad.size(4);\n  auto max_disp = shift.size(1);\n\n  auto grad_left = at::zeros({num_batch, channels, height, width}, grad.options());\n  auto grad_right = at::zeros({num_batch, channels, height, width}, grad.options());\n\n  AT_ASSERTM(grad.numel() == num_batch * channels * 2 * max_disp * height * width,\n      \"grad shape is wrong\");\n\n  cudaStream_t stream = at::cuda::getCurrentCUDAStream();\n\n  dim3 grid(std::min(THCCeilDiv((long)grad.numel(), 512L), 4096L));\n  dim3 block(512);\n\n  // handle possibly empty gradients\n  if (grad.numel() == 0) {\n    THCudaCheck(cudaGetLastError());\n    return std::make_tuple(grad_left, 
grad_right);\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(grad.type(), \"BuildCostVolume_backward\", [&] {\n    BuildCostVolumeBackwardFeature<scalar_t><<<grid, block, 0, stream>>>(\n         grad.numel() / 2,\n         grad.contiguous().data<scalar_t>(),\n         shift.contiguous().data<scalar_t>(),\n         num_batch,\n         channels,\n         height,\n         width,\n         max_disp,\n         grad_left.data<scalar_t>(),\n         grad_right.data<scalar_t>());\n  });\n  THCudaCheck(cudaGetLastError());\n  return std::make_tuple(grad_left, grad_right);\n}\n\n"
  },
  {
    "path": "disparity/csrc/cuda/ROIAlign_cuda.cu",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#include <THC/THC.h>\n#include <THC/THCAtomics.cuh>\n#include <THC/THCDeviceUtils.cuh>\n\n// TODO make it in a common file\n#define CUDA_1D_KERNEL_LOOP(i, n)                            \\\n  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \\\n       i += blockDim.x * gridDim.x)\n\n\ntemplate <typename T>\n__device__ T bilinear_interpolate(const T* bottom_data,\n    const int height, const int width,\n    T y, T x,\n    const int index /* index for debug only*/) {\n\n  // deal with cases that inverse elements are out of feature map boundary\n  if (y < -1.0 || y > height || x < -1.0 || x > width) {\n    //empty\n    return 0;\n  }\n\n  if (y <= 0) y = 0;\n  if (x <= 0) x = 0;\n\n  int y_low = (int) y;\n  int x_low = (int) x;\n  int y_high;\n  int x_high;\n\n  if (y_low >= height - 1) {\n    y_high = y_low = height - 1;\n    y = (T) y_low;\n  } else {\n    y_high = y_low + 1;\n  }\n\n  if (x_low >= width - 1) {\n    x_high = x_low = width - 1;\n    x = (T) x_low;\n  } else {\n    x_high = x_low + 1;\n  }\n\n  T ly = y - y_low;\n  T lx = x - x_low;\n  T hy = 1. - ly, hx = 1. 
- lx;\n  // do bilinear interpolation\n  T v1 = bottom_data[y_low * width + x_low];\n  T v2 = bottom_data[y_low * width + x_high];\n  T v3 = bottom_data[y_high * width + x_low];\n  T v4 = bottom_data[y_high * width + x_high];\n  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;\n\n  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);\n\n  return val;\n}\n\ntemplate <typename T>\n__global__ void RoIAlignForward(const int nthreads, const T* bottom_data,\n    const T spatial_scale, const int channels,\n    const int height, const int width,\n    const int pooled_height, const int pooled_width,\n    const int sampling_ratio,\n    const T* bottom_rois, T* top_data) {\n  CUDA_1D_KERNEL_LOOP(index, nthreads) {\n    // (n, c, ph, pw) is an element in the pooled output\n    int pw = index % pooled_width;\n    int ph = (index / pooled_width) % pooled_height;\n    int c = (index / pooled_width / pooled_height) % channels;\n    int n = index / pooled_width / pooled_height / channels;\n\n    const T* offset_bottom_rois = bottom_rois + n * 5;\n    int roi_batch_ind = offset_bottom_rois[0];\n\n    // Do not use rounding; this implementation detail is critical\n    T roi_start_w = offset_bottom_rois[1] * spatial_scale;\n    T roi_start_h = offset_bottom_rois[2] * spatial_scale;\n    T roi_end_w = offset_bottom_rois[3] * spatial_scale;\n    T roi_end_h = offset_bottom_rois[4] * spatial_scale;\n    // T roi_start_w = round(offset_bottom_rois[1] * spatial_scale);\n    // T roi_start_h = round(offset_bottom_rois[2] * spatial_scale);\n    // T roi_end_w = round(offset_bottom_rois[3] * spatial_scale);\n    // T roi_end_h = round(offset_bottom_rois[4] * spatial_scale);\n\n    // Force malformed ROIs to be 1x1\n    T roi_width = max(roi_end_w - roi_start_w, (T)1.);\n    T roi_height = max(roi_end_h - roi_start_h, (T)1.);\n    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);\n    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);\n\n   
 const T* offset_bottom_data = bottom_data + (roi_batch_ind * channels + c) * height * width;\n\n    // We use roi_bin_grid to sample the grid and mimic integral\n    int roi_bin_grid_h = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_height / pooled_height); // e.g., = 2\n    int roi_bin_grid_w = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);\n\n    // We do average (integral) pooling inside a bin\n    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4\n\n    T output_val = 0.;\n    for (int iy = 0; iy < roi_bin_grid_h; iy ++) // e.g., iy = 0, 1\n    {\n      const T y = roi_start_h + ph * bin_size_h + static_cast<T>(iy + .5f) * bin_size_h / static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5\n      for (int ix = 0; ix < roi_bin_grid_w; ix ++)\n      {\n        const T x = roi_start_w + pw * bin_size_w + static_cast<T>(ix + .5f) * bin_size_w / static_cast<T>(roi_bin_grid_w);\n\n        T val = bilinear_interpolate(offset_bottom_data, height, width, y, x, index);\n        output_val += val;\n      }\n    }\n    output_val /= count;\n\n    top_data[index] = output_val;\n  }\n}\n\n\ntemplate <typename T>\n__device__ void bilinear_interpolate_gradient(\n    const int height, const int width,\n    T y, T x,\n    T & w1, T & w2, T & w3, T & w4,\n    int & x_low, int & x_high, int & y_low, int & y_high,\n    const int index /* index for debug only*/) {\n\n  // deal with cases that inverse elements are out of feature map boundary\n  if (y < -1.0 || y > height || x < -1.0 || x > width) {\n    //empty\n    w1 = w2 = w3 = w4 = 0.;\n    x_low = x_high = y_low = y_high = -1;\n    return;\n  }\n\n  if (y <= 0) y = 0;\n  if (x <= 0) x = 0;\n\n  y_low = (int) y;\n  x_low = (int) x;\n\n  if (y_low >= height - 1) {\n    y_high = y_low = height - 1;\n    y = (T) y_low;\n  } else {\n    y_high = y_low + 1;\n  }\n\n  if (x_low >= width - 1) {\n    x_high = x_low = width - 1;\n    x = (T) x_low;\n  } else {\n    x_high = x_low + 1;\n  }\n\n  T ly = y 
- y_low;\n  T lx = x - x_low;\n  T hy = 1. - ly, hx = 1. - lx;\n\n  // reference in forward\n  // T v1 = bottom_data[y_low * width + x_low];\n  // T v2 = bottom_data[y_low * width + x_high];\n  // T v3 = bottom_data[y_high * width + x_low];\n  // T v4 = bottom_data[y_high * width + x_high];\n  // T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);\n\n  w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;\n\n  return;\n}\n\ntemplate <typename T>\n__global__ void RoIAlignBackwardFeature(const int nthreads, const T* top_diff,\n    const int num_rois, const T spatial_scale,\n    const int channels, const int height, const int width,\n    const int pooled_height, const int pooled_width,\n    const int sampling_ratio,\n    T* bottom_diff,\n    const T* bottom_rois) {\n  CUDA_1D_KERNEL_LOOP(index, nthreads) {\n    // (n, c, ph, pw) is an element in the pooled output\n    int pw = index % pooled_width;\n    int ph = (index / pooled_width) % pooled_height;\n    int c = (index / pooled_width / pooled_height) % channels;\n    int n = index / pooled_width / pooled_height / channels;\n\n    const T* offset_bottom_rois = bottom_rois + n * 5;\n    int roi_batch_ind = offset_bottom_rois[0];\n\n    // Do not use rounding; this implementation detail is critical\n    T roi_start_w = offset_bottom_rois[1] * spatial_scale;\n    T roi_start_h = offset_bottom_rois[2] * spatial_scale;\n    T roi_end_w = offset_bottom_rois[3] * spatial_scale;\n    T roi_end_h = offset_bottom_rois[4] * spatial_scale;\n    // T roi_start_w = round(offset_bottom_rois[1] * spatial_scale);\n    // T roi_start_h = round(offset_bottom_rois[2] * spatial_scale);\n    // T roi_end_w = round(offset_bottom_rois[3] * spatial_scale);\n    // T roi_end_h = round(offset_bottom_rois[4] * spatial_scale);\n\n    // Force malformed ROIs to be 1x1\n    T roi_width = max(roi_end_w - roi_start_w, (T)1.);\n    T roi_height = max(roi_end_h - roi_start_h, (T)1.);\n    T bin_size_h = static_cast<T>(roi_height) / 
static_cast<T>(pooled_height);\n    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);\n\n    T* offset_bottom_diff = bottom_diff + (roi_batch_ind * channels + c) * height * width;\n\n    int top_offset    = (n * channels + c) * pooled_height * pooled_width;\n    const T* offset_top_diff = top_diff + top_offset;\n    const T top_diff_this_bin = offset_top_diff[ph * pooled_width + pw];\n\n    // We use roi_bin_grid to sample the grid and mimic integral\n    int roi_bin_grid_h = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_height / pooled_height); // e.g., = 2\n    int roi_bin_grid_w = (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);\n\n    // We do average (integral) pooling inside a bin\n    const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4\n\n    for (int iy = 0; iy < roi_bin_grid_h; iy ++) // e.g., iy = 0, 1\n    {\n      const T y = roi_start_h + ph * bin_size_h + static_cast<T>(iy + .5f) * bin_size_h / static_cast<T>(roi_bin_grid_h); // e.g., 0.5, 1.5\n      for (int ix = 0; ix < roi_bin_grid_w; ix ++)\n      {\n        const T x = roi_start_w + pw * bin_size_w + static_cast<T>(ix + .5f) * bin_size_w / static_cast<T>(roi_bin_grid_w);\n\n        T w1, w2, w3, w4;\n        int x_low, x_high, y_low, y_high;\n\n        bilinear_interpolate_gradient(height, width, y, x,\n            w1, w2, w3, w4,\n            x_low, x_high, y_low, y_high,\n            index);\n\n        T g1 = top_diff_this_bin * w1 / count;\n        T g2 = top_diff_this_bin * w2 / count;\n        T g3 = top_diff_this_bin * w3 / count;\n        T g4 = top_diff_this_bin * w4 / count;\n\n        if (x_low >= 0 && x_high >= 0 && y_low >= 0 && y_high >= 0)\n        {\n          atomicAdd(offset_bottom_diff + y_low * width + x_low, static_cast<T>(g1));\n          atomicAdd(offset_bottom_diff + y_low * width + x_high, static_cast<T>(g2));\n          atomicAdd(offset_bottom_diff + y_high * width + x_low, static_cast<T>(g3));\n          
atomicAdd(offset_bottom_diff + y_high * width + x_high, static_cast<T>(g4));\n        } // if\n      } // ix\n    } // iy\n  } // CUDA_1D_KERNEL_LOOP\n} // RoIAlignBackward\n\n\nat::Tensor ROIAlign_forward_cuda(const at::Tensor& input,\n                                 const at::Tensor& rois,\n                                 const float spatial_scale,\n                                 const int pooled_height,\n                                 const int pooled_width,\n                                 const int sampling_ratio) {\n  AT_ASSERTM(input.type().is_cuda(), \"input must be a CUDA tensor\");\n  AT_ASSERTM(rois.type().is_cuda(), \"rois must be a CUDA tensor\");\n\n  auto num_rois = rois.size(0);\n  auto channels = input.size(1);\n  auto height = input.size(2);\n  auto width = input.size(3);\n\n  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());\n  auto output_size = num_rois * pooled_height * pooled_width * channels;\n  cudaStream_t stream = at::cuda::getCurrentCUDAStream();\n\n  dim3 grid(std::min(THCCeilDiv((long)output_size, 512L), 4096L));\n  dim3 block(512);\n\n  if (output.numel() == 0) {\n    THCudaCheck(cudaGetLastError());\n    return output;\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(input.type(), \"ROIAlign_forward\", [&] {\n    RoIAlignForward<scalar_t><<<grid, block, 0, stream>>>(\n         output_size,\n         input.contiguous().data<scalar_t>(),\n         spatial_scale,\n         channels,\n         height,\n         width,\n         pooled_height,\n         pooled_width,\n         sampling_ratio,\n         rois.contiguous().data<scalar_t>(),\n         output.data<scalar_t>());\n  });\n  THCudaCheck(cudaGetLastError());\n  return output;\n}\n\n// TODO remove the dependency on input and use instead its sizes -> save memory\nat::Tensor ROIAlign_backward_cuda(const at::Tensor& grad,\n                                  const at::Tensor& rois,\n                                  const float spatial_scale,\n       
                           const int pooled_height,\n                                  const int pooled_width,\n                                  const int batch_size,\n                                  const int channels,\n                                  const int height,\n                                  const int width,\n                                  const int sampling_ratio) {\n  AT_ASSERTM(grad.type().is_cuda(), \"grad must be a CUDA tensor\");\n  AT_ASSERTM(rois.type().is_cuda(), \"rois must be a CUDA tensor\");\n\n  auto num_rois = rois.size(0);\n  auto grad_input = at::zeros({batch_size, channels, height, width}, grad.options());\n\n  cudaStream_t stream = at::cuda::getCurrentCUDAStream();\n\n  dim3 grid(std::min(THCCeilDiv((long)grad.numel(), 512L), 4096L));\n  dim3 block(512);\n\n  // handle possibly empty gradients\n  if (grad.numel() == 0) {\n    THCudaCheck(cudaGetLastError());\n    return grad_input;\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(grad.type(), \"ROIAlign_backward\", [&] {\n    RoIAlignBackwardFeature<scalar_t><<<grid, block, 0, stream>>>(\n         grad.numel(),\n         grad.contiguous().data<scalar_t>(),\n         num_rois,\n         spatial_scale,\n         channels,\n         height,\n         width,\n         pooled_height,\n         pooled_width,\n         sampling_ratio,\n         grad_input.data<scalar_t>(),\n         rois.contiguous().data<scalar_t>());\n  });\n  THCudaCheck(cudaGetLastError());\n  return grad_input;\n}\n"
  },
  {
    "path": "disparity/csrc/cuda/ROIPool_cuda.cu",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#include <THC/THC.h>\n#include <THC/THCAtomics.cuh>\n#include <THC/THCDeviceUtils.cuh>\n\n\n// TODO make it in a common file\n#define CUDA_1D_KERNEL_LOOP(i, n)                            \\\n  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \\\n       i += blockDim.x * gridDim.x)\n\n\ntemplate <typename T>\n__global__ void RoIPoolFForward(const int nthreads, const T* bottom_data,\n    const T spatial_scale, const int channels, const int height,\n    const int width, const int pooled_height, const int pooled_width,\n    const T* bottom_rois, T* top_data, int* argmax_data) {\n  CUDA_1D_KERNEL_LOOP(index, nthreads) {\n    // (n, c, ph, pw) is an element in the pooled output\n    int pw = index % pooled_width;\n    int ph = (index / pooled_width) % pooled_height;\n    int c = (index / pooled_width / pooled_height) % channels;\n    int n = index / pooled_width / pooled_height / channels;\n\n    const T* offset_bottom_rois = bottom_rois + n * 5;\n    int roi_batch_ind = offset_bottom_rois[0];\n    int roi_start_w = round(offset_bottom_rois[1] * spatial_scale);\n    int roi_start_h = round(offset_bottom_rois[2] * spatial_scale);\n    int roi_end_w = round(offset_bottom_rois[3] * spatial_scale);\n    int roi_end_h = round(offset_bottom_rois[4] * spatial_scale);\n\n    // Force malformed ROIs to be 1x1\n    int roi_width = max(roi_end_w - roi_start_w + 1, 1);\n    int roi_height = max(roi_end_h - roi_start_h + 1, 1);\n    T bin_size_h = static_cast<T>(roi_height)\n                       / static_cast<T>(pooled_height);\n    T bin_size_w = static_cast<T>(roi_width)\n                       / static_cast<T>(pooled_width);\n\n    int hstart = static_cast<int>(floor(static_cast<T>(ph)\n                                        * bin_size_h));\n    int wstart = static_cast<int>(floor(static_cast<T>(pw)\n                   
                     * bin_size_w));\n    int hend = static_cast<int>(ceil(static_cast<T>(ph + 1)\n                                     * bin_size_h));\n    int wend = static_cast<int>(ceil(static_cast<T>(pw + 1)\n                                     * bin_size_w));\n\n    // Add roi offsets and clip to input boundaries\n    hstart = min(max(hstart + roi_start_h, 0), height);\n    hend = min(max(hend + roi_start_h, 0), height);\n    wstart = min(max(wstart + roi_start_w, 0), width);\n    wend = min(max(wend + roi_start_w, 0), width);\n    bool is_empty = (hend <= hstart) || (wend <= wstart);\n\n    // Define an empty pooling region to be zero\n    T maxval = is_empty ? 0 : -FLT_MAX;\n    // If nothing is pooled, argmax = -1 causes nothing to be backprop'd\n    int maxidx = -1;\n    const T* offset_bottom_data =\n        bottom_data + (roi_batch_ind * channels + c) * height * width;\n    for (int h = hstart; h < hend; ++h) {\n      for (int w = wstart; w < wend; ++w) {\n        int bottom_index = h * width + w;\n        if (offset_bottom_data[bottom_index] > maxval) {\n          maxval = offset_bottom_data[bottom_index];\n          maxidx = bottom_index;\n        }\n      }\n    }\n    top_data[index] = maxval;\n    argmax_data[index] = maxidx;\n  }\n}\n\ntemplate <typename T>\n__global__ void RoIPoolFBackward(const int nthreads, const T* top_diff,\n    const int* argmax_data, const int num_rois, const T spatial_scale,\n    const int channels, const int height, const int width,\n    const int pooled_height, const int pooled_width, T* bottom_diff,\n    const T* bottom_rois) {\n  CUDA_1D_KERNEL_LOOP(index, nthreads) {\n    // (n, c, ph, pw) is an element in the pooled output\n    int pw = index % pooled_width;\n    int ph = (index / pooled_width) % pooled_height;\n    int c = (index / pooled_width / pooled_height) % channels;\n    int n = index / pooled_width / pooled_height / channels;\n\n    const T* offset_bottom_rois = bottom_rois + n * 5;\n    int roi_batch_ind = 
offset_bottom_rois[0];\n    int bottom_offset = (roi_batch_ind * channels + c) * height * width;\n    int top_offset    = (n * channels + c) * pooled_height * pooled_width;\n    const T* offset_top_diff = top_diff + top_offset;\n    T* offset_bottom_diff = bottom_diff + bottom_offset;\n    const int* offset_argmax_data = argmax_data + top_offset;\n\n    int argmax = offset_argmax_data[ph * pooled_width + pw];\n    if (argmax != -1) {\n      atomicAdd(\n          offset_bottom_diff + argmax,\n          static_cast<T>(offset_top_diff[ph * pooled_width + pw]));\n\n    }\n  }\n}\n\nstd::tuple<at::Tensor, at::Tensor> ROIPool_forward_cuda(const at::Tensor& input,\n                                const at::Tensor& rois,\n                                const float spatial_scale,\n                                const int pooled_height,\n                                const int pooled_width) {\n  AT_ASSERTM(input.type().is_cuda(), \"input must be a CUDA tensor\");\n  AT_ASSERTM(rois.type().is_cuda(), \"rois must be a CUDA tensor\");\n\n  auto num_rois = rois.size(0);\n  auto channels = input.size(1);\n  auto height = input.size(2);\n  auto width = input.size(3);\n\n  auto output = at::empty({num_rois, channels, pooled_height, pooled_width}, input.options());\n  auto output_size = num_rois * pooled_height * pooled_width * channels;\n  auto argmax = at::zeros({num_rois, channels, pooled_height, pooled_width}, input.options().dtype(at::kInt));\n\n  cudaStream_t stream = at::cuda::getCurrentCUDAStream();\n\n  dim3 grid(std::min(THCCeilDiv((long)output_size, 512L), 4096L));\n  dim3 block(512);\n\n  if (output.numel() == 0) {\n    THCudaCheck(cudaGetLastError());\n    return std::make_tuple(output, argmax);\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(input.type(), \"ROIPool_forward\", [&] {\n    RoIPoolFForward<scalar_t><<<grid, block, 0, stream>>>(\n         output_size,\n         input.contiguous().data<scalar_t>(),\n         spatial_scale,\n         channels,\n         height,\n    
     width,\n         pooled_height,\n         pooled_width,\n         rois.contiguous().data<scalar_t>(),\n         output.data<scalar_t>(),\n         argmax.data<int>());\n  });\n  THCudaCheck(cudaGetLastError());\n  return std::make_tuple(output, argmax);\n}\n\n// TODO remove the dependency on input and use instead its sizes -> save memory\nat::Tensor ROIPool_backward_cuda(const at::Tensor& grad,\n                                 const at::Tensor& input,\n                                 const at::Tensor& rois,\n                                 const at::Tensor& argmax,\n                                 const float spatial_scale,\n                                 const int pooled_height,\n                                 const int pooled_width,\n                                 const int batch_size,\n                                 const int channels,\n                                 const int height,\n                                 const int width) {\n  AT_ASSERTM(grad.type().is_cuda(), \"grad must be a CUDA tensor\");\n  AT_ASSERTM(rois.type().is_cuda(), \"rois must be a CUDA tensor\");\n  // TODO add more checks\n\n  auto num_rois = rois.size(0);\n  auto grad_input = at::zeros({batch_size, channels, height, width}, grad.options());\n\n  cudaStream_t stream = at::cuda::getCurrentCUDAStream();\n\n  dim3 grid(std::min(THCCeilDiv((long)grad.numel(), 512L), 4096L));\n  dim3 block(512);\n\n  // handle possibly empty gradients\n  if (grad.numel() == 0) {\n    THCudaCheck(cudaGetLastError());\n    return grad_input;\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(grad.type(), \"ROIPool_backward\", [&] {\n    RoIPoolFBackward<scalar_t><<<grid, block, 0, stream>>>(\n         grad.numel(),\n         grad.contiguous().data<scalar_t>(),\n         argmax.data<int>(),\n         num_rois,\n         spatial_scale,\n         channels,\n         height,\n         width,\n         pooled_height,\n         pooled_width,\n         grad_input.data<scalar_t>(),\n         
rois.contiguous().data<scalar_t>());\n  });\n  THCudaCheck(cudaGetLastError());\n  return grad_input;\n}\n"
  },
  {
    "path": "disparity/csrc/cuda/SigmoidFocalLoss_cuda.cu",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n// This file is modified from  https://github.com/pytorch/pytorch/blob/master/modules/detectron/sigmoid_focal_loss_op.cu\n// Cheng-Yang Fu\n// cyfu@cs.unc.edu\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#include <THC/THC.h>\n#include <THC/THCAtomics.cuh>\n#include <THC/THCDeviceUtils.cuh>\n\n#include <cfloat>\n\n// TODO make it in a common file\n#define CUDA_1D_KERNEL_LOOP(i, n)                            \\\n  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; \\\n       i += blockDim.x * gridDim.x)\n\n\ntemplate <typename T>\n__global__ void SigmoidFocalLossForward(const int nthreads, \n    const T* logits,\n    const int* targets,\n    const int num_classes,\n    const float gamma, \n    const float alpha,\n    const int num, \n    T* losses) {\n  CUDA_1D_KERNEL_LOOP(i, nthreads) {\n\n    int n = i / num_classes;\n    int d = i % num_classes; // current class[0~79]; \n    int t = targets[n]; // target class [1~80];\n\n    // Decide it is positive or negative case. \n    T c1 = (t == (d+1)); \n    T c2 = (t>=0 & t != (d+1));\n\n    T zn = (1.0 - alpha);\n    T zp = (alpha);\n\n    // p = 1. / 1. + expf(-x); p = sigmoid(x)\n    T  p = 1. / (1. + expf(-logits[i]));\n\n    // (1-p)**gamma * log(p) where\n    T term1 = powf((1. - p), gamma) * logf(max(p, FLT_MIN));\n\n    // p**gamma * log(1-p)\n    T term2 = powf(p, gamma) *\n            (-1. * logits[i] * (logits[i] >= 0) -   \n             logf(1. + expf(logits[i] - 2. 
* logits[i] * (logits[i] >= 0))));\n\n    losses[i] = 0.0;\n    losses[i] += -c1 * term1 * zp;\n    losses[i] += -c2 * term2 * zn;\n\n  } // CUDA_1D_KERNEL_LOOP\n} // SigmoidFocalLossForward\n\n\ntemplate <typename T>\n__global__ void SigmoidFocalLossBackward(const int nthreads,\n                const T* logits,\n                const int* targets,\n                const T* d_losses,\n                const int num_classes,\n                const float gamma,\n                const float alpha,\n                const int num,\n                T* d_logits) {\n  CUDA_1D_KERNEL_LOOP(i, nthreads) {\n\n    int n = i / num_classes;\n    int d = i % num_classes; // current class[0~79]; \n    int t = targets[n]; // target class [1~80], 0 is background;\n\n    // Decide whether this is a positive or a negative case. \n    T c1 = (t == (d+1));\n    T c2 = (t >= 0 && t != (d+1));\n\n    T zn = (1.0 - alpha);\n    T zp = (alpha);\n    // p = 1. / (1. + expf(-x)); p = sigmoid(x)\n    T  p = 1. / (1. + expf(-logits[i]));\n\n    // (1-p)**g * (1 - p - g*p*log(p))\n    T term1 = powf((1. - p), gamma) *\n                      (1. - p - (p * gamma * logf(max(p, FLT_MIN))));\n\n    // (p**g) * (g*(1-p)*log(1-p) - p)\n    T term2 = powf(p, gamma) *\n                  ((-1. * logits[i] * (logits[i] >= 0) -\n                      logf(1. + expf(logits[i] - 2. * logits[i] * (logits[i] >= 0)))) *\n                      (1. 
- p) * gamma - p);\n    d_logits[i] = 0.0;\n    d_logits[i] += -c1 * term1 * zp;\n    d_logits[i] += -c2 * term2 * zn;\n    d_logits[i] = d_logits[i] * d_losses[i];\n\n  } // CUDA_1D_KERNEL_LOOP\n} // SigmoidFocalLossBackward\n\n\nat::Tensor SigmoidFocalLoss_forward_cuda(\n\t\tconst at::Tensor& logits,\n                const at::Tensor& targets,\n\t\tconst int num_classes, \n\t\tconst float gamma, \n\t\tconst float alpha) {\n  AT_ASSERTM(logits.type().is_cuda(), \"logits must be a CUDA tensor\");\n  AT_ASSERTM(targets.type().is_cuda(), \"targets must be a CUDA tensor\");\n  AT_ASSERTM(logits.dim() == 2, \"logits should be NxClass\");\n\n  const int num_samples = logits.size(0);\n\t\n  auto losses = at::empty({num_samples, logits.size(1)}, logits.options());\n  auto losses_size = num_samples * logits.size(1);\n  cudaStream_t stream = at::cuda::getCurrentCUDAStream();\n\n  dim3 grid(std::min(THCCeilDiv((long)losses_size, 512L), 4096L));\n  dim3 block(512);\n\n  if (losses.numel() == 0) {\n    THCudaCheck(cudaGetLastError());\n    return losses;\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(logits.type(), \"SigmoidFocalLoss_forward\", [&] {\n    SigmoidFocalLossForward<scalar_t><<<grid, block, 0, stream>>>(\n         losses_size,\n         logits.contiguous().data<scalar_t>(),\n\t targets.contiguous().data<int>(),\n         num_classes,\n\t gamma,\n\t alpha,\n\t num_samples,\n         losses.data<scalar_t>());\n  });\n  THCudaCheck(cudaGetLastError());\n  return losses;   \n}\t\n\n\nat::Tensor SigmoidFocalLoss_backward_cuda(\n\t\tconst at::Tensor& logits,\n                const at::Tensor& targets,\n\t\tconst at::Tensor& d_losses,\n\t\tconst int num_classes, \n\t\tconst float gamma, \n\t\tconst float alpha) {\n  AT_ASSERTM(logits.type().is_cuda(), \"logits must be a CUDA tensor\");\n  AT_ASSERTM(targets.type().is_cuda(), \"targets must be a CUDA tensor\");\n  AT_ASSERTM(d_losses.type().is_cuda(), \"d_losses must be a CUDA tensor\");\n\n  AT_ASSERTM(logits.dim() == 2, \"logits 
should be NxClass\");\n\n  const int num_samples = logits.size(0);\n  AT_ASSERTM(logits.size(1) == num_classes, \"logits.size(1) should be num_classes\");\n\t\n  auto d_logits = at::zeros({num_samples, num_classes}, logits.options());\n  auto d_logits_size = num_samples * logits.size(1);\n  cudaStream_t stream = at::cuda::getCurrentCUDAStream();\n\n  dim3 grid(std::min(THCCeilDiv((long)d_logits_size, 512L), 4096L));\n  dim3 block(512);\n\n  if (d_logits.numel() == 0) {\n    THCudaCheck(cudaGetLastError());\n    return d_logits;\n  }\n\n  AT_DISPATCH_FLOATING_TYPES(logits.type(), \"SigmoidFocalLoss_backward\", [&] {\n    SigmoidFocalLossBackward<scalar_t><<<grid, block, 0, stream>>>(\n         d_logits_size,\n         logits.contiguous().data<scalar_t>(),\n\t targets.contiguous().data<int>(),\n\t d_losses.contiguous().data<scalar_t>(),\n         num_classes,\n\t gamma,\n\t alpha,\n\t num_samples,\n         d_logits.data<scalar_t>());\n  });\n\n  THCudaCheck(cudaGetLastError());\n  return d_logits;   \n}\t\n\n"
  },
  {
    "path": "disparity/csrc/cuda/nms.cu",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDAContext.h>\n\n#include <THC/THC.h>\n#include <THC/THCDeviceUtils.cuh>\n\n#include <vector>\n#include <iostream>\n\nint const threadsPerBlock = sizeof(unsigned long long) * 8;\n\n__device__ inline float devIoU(float const * const a, float const * const b) {\n  float left = max(a[0], b[0]), right = min(a[2], b[2]);\n  float top = max(a[1], b[1]), bottom = min(a[3], b[3]);\n  float width = max(right - left + 1, 0.f), height = max(bottom - top + 1, 0.f);\n  float interS = width * height;\n  float Sa = (a[2] - a[0] + 1) * (a[3] - a[1] + 1);\n  float Sb = (b[2] - b[0] + 1) * (b[3] - b[1] + 1);\n  return interS / (Sa + Sb - interS);\n}\n\n__global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh,\n                           const float *dev_boxes, unsigned long long *dev_mask) {\n  const int row_start = blockIdx.y;\n  const int col_start = blockIdx.x;\n\n  // if (row_start > col_start) return;\n\n  const int row_size =\n        min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);\n  const int col_size =\n        min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);\n\n  __shared__ float block_boxes[threadsPerBlock * 5];\n  if (threadIdx.x < col_size) {\n    block_boxes[threadIdx.x * 5 + 0] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0];\n    block_boxes[threadIdx.x * 5 + 1] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1];\n    block_boxes[threadIdx.x * 5 + 2] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2];\n    block_boxes[threadIdx.x * 5 + 3] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3];\n    block_boxes[threadIdx.x * 5 + 4] =\n        dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4];\n  }\n  __syncthreads();\n\n  if (threadIdx.x < row_size) {\n    const int cur_box_idx = 
threadsPerBlock * row_start + threadIdx.x;\n    const float *cur_box = dev_boxes + cur_box_idx * 5;\n    int i = 0;\n    unsigned long long t = 0;\n    int start = 0;\n    if (row_start == col_start) {\n      start = threadIdx.x + 1;\n    }\n    for (i = start; i < col_size; i++) {\n      if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {\n        t |= 1ULL << i;\n      }\n    }\n    const int col_blocks = THCCeilDiv(n_boxes, threadsPerBlock);\n    dev_mask[cur_box_idx * col_blocks + col_start] = t;\n  }\n}\n\n// boxes is a N x 5 tensor\nat::Tensor nms_cuda(const at::Tensor boxes, float nms_overlap_thresh) {\n  using scalar_t = float;\n  AT_ASSERTM(boxes.type().is_cuda(), \"boxes must be a CUDA tensor\");\n  auto scores = boxes.select(1, 4);\n  auto order_t = std::get<1>(scores.sort(0, /* descending=*/true));\n  auto boxes_sorted = boxes.index_select(0, order_t);\n\n  int boxes_num = boxes.size(0);\n\n  const int col_blocks = THCCeilDiv(boxes_num, threadsPerBlock);\n\n  scalar_t* boxes_dev = boxes_sorted.data<scalar_t>();\n\n  THCState *state = at::globalContext().lazyInitCUDA(); // TODO replace with getTHCState\n\n  unsigned long long* mask_dev = NULL;\n  //THCudaCheck(THCudaMalloc(state, (void**) &mask_dev,\n  //                      boxes_num * col_blocks * sizeof(unsigned long long)));\n\n  mask_dev = (unsigned long long*) THCudaMalloc(state, boxes_num * col_blocks * sizeof(unsigned long long));\n\n  dim3 blocks(THCCeilDiv(boxes_num, threadsPerBlock),\n              THCCeilDiv(boxes_num, threadsPerBlock));\n  dim3 threads(threadsPerBlock);\n  nms_kernel<<<blocks, threads>>>(boxes_num,\n                                  nms_overlap_thresh,\n                                  boxes_dev,\n                                  mask_dev);\n\n  std::vector<unsigned long long> mask_host(boxes_num * col_blocks);\n  THCudaCheck(cudaMemcpy(&mask_host[0],\n                        mask_dev,\n                        sizeof(unsigned long long) * boxes_num * 
col_blocks,\n                        cudaMemcpyDeviceToHost));\n\n  std::vector<unsigned long long> remv(col_blocks);\n  memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks);\n\n  at::Tensor keep = at::empty({boxes_num}, boxes.options().dtype(at::kLong).device(at::kCPU));\n  int64_t* keep_out = keep.data<int64_t>();\n\n  int num_to_keep = 0;\n  for (int i = 0; i < boxes_num; i++) {\n    int nblock = i / threadsPerBlock;\n    int inblock = i % threadsPerBlock;\n\n    if (!(remv[nblock] & (1ULL << inblock))) {\n      keep_out[num_to_keep++] = i;\n      unsigned long long *p = &mask_host[0] + i * col_blocks;\n      for (int j = nblock; j < col_blocks; j++) {\n        remv[j] |= p[j];\n      }\n    }\n  }\n\n  THCudaFree(state, mask_dev);\n  // TODO improve this part\n  return std::get<0>(order_t.index({\n                       keep.narrow(/*dim=*/0, /*start=*/0, /*length=*/num_to_keep).to(\n                         order_t.device(), keep.scalar_type())\n                     }).sort(0, false));\n}\n"
  },
  {
    "path": "disparity/csrc/cuda/vision.h",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n#include <torch/extension.h>\n\n\nat::Tensor SigmoidFocalLoss_forward_cuda(\n\t\tconst at::Tensor& logits,\n                const at::Tensor& targets,\n\t\tconst int num_classes, \n\t\tconst float gamma, \n\t\tconst float alpha); \n\nat::Tensor SigmoidFocalLoss_backward_cuda(\n\t\t\t     const at::Tensor& logits,\n                             const at::Tensor& targets,\n\t\t\t     const at::Tensor& d_losses,\n\t\t\t     const int num_classes,\n\t\t\t     const float gamma,\n\t\t\t     const float alpha);\n\nat::Tensor ROIAlign_forward_cuda(const at::Tensor& input,\n                                 const at::Tensor& rois,\n                                 const float spatial_scale,\n                                 const int pooled_height,\n                                 const int pooled_width,\n                                 const int sampling_ratio);\n\nat::Tensor ROIAlign_backward_cuda(const at::Tensor& grad,\n                                  const at::Tensor& rois,\n                                  const float spatial_scale,\n                                  const int pooled_height,\n                                  const int pooled_width,\n                                  const int batch_size,\n                                  const int channels,\n                                  const int height,\n                                  const int width,\n                                  const int sampling_ratio);\n\nat::Tensor ROIDisp_forward_cuda(const at::Tensor& input,\n                                 const at::Tensor& input_R,\n                                 const at::Tensor& rois,\n                                 const float spatial_scale,\n                                 const int pooled_height,\n                                 const int pooled_width,\n                                 const int max_disp);\n\nstd::tuple<at::Tensor, 
at::Tensor> ROIDisp_backward_cuda(const at::Tensor& grad,\n                                  const at::Tensor& rois,\n                                  const float spatial_scale,\n                                  const int pooled_height,\n                                  const int pooled_width,\n                                  const int batch_size,\n                                  const int channels,\n                                  const int height,\n                                  const int width,\n                                  const int max_disp);\n\nstd::tuple<at::Tensor, at::Tensor> ROIPool_forward_cuda(const at::Tensor& input,\n                                const at::Tensor& rois,\n                                const float spatial_scale,\n                                const int pooled_height,\n                                const int pooled_width);\n\nat::Tensor BuildCostVolume_forward_cuda(const at::Tensor& left,\n                                 const at::Tensor& right,\n                                 const at::Tensor& shift);\n\nat::Tensor ROIPool_backward_cuda(const at::Tensor& grad,\n                                 const at::Tensor& input,\n                                 const at::Tensor& rois,\n                                 const at::Tensor& argmax,\n                                 const float spatial_scale,\n                                 const int pooled_height,\n                                 const int pooled_width,\n                                 const int batch_size,\n                                 const int channels,\n                                 const int height,\n                                 const int width);\n\nstd::tuple<at::Tensor, at::Tensor> BuildCostVolume_backward_cuda(const at::Tensor& grad,\n                                 const at::Tensor& left);\n\nat::Tensor nms_cuda(const at::Tensor boxes, float nms_overlap_thresh);\n\n\nat::Tensor compute_flow_cuda(const at::Tensor& boxes,\n            
                 const int height,\n                             const int width);\n"
  },
  {
    "path": "disparity/csrc/nms.h",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#pragma once\n#include \"cpu/vision.h\"\n\n#ifdef WITH_CUDA\n#include \"cuda/vision.h\"\n#endif\n\n\nat::Tensor nms(const at::Tensor& dets,\n               const at::Tensor& scores,\n               const float threshold) {\n\n  if (dets.type().is_cuda()) {\n#ifdef WITH_CUDA\n    // TODO raise error if not compiled with CUDA\n    if (dets.numel() == 0)\n      return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU));\n    auto b = at::cat({dets, scores.unsqueeze(1)}, 1);\n    return nms_cuda(b, threshold);\n#else\n    AT_ERROR(\"Not compiled with GPU support\");\n#endif\n  }\n\n  at::Tensor result = nms_cpu(dets, scores, threshold);\n  return result;\n}\n"
  },
  {
    "path": "disparity/csrc/vision.cpp",
    "content": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include \"nms.h\"\n#include \"ROIAlign.h\"\n#include \"ROIPool.h\"\n#include \"SigmoidFocalLoss.h\"\n#include \"BuildCostVolume.h\"\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n  m.def(\"nms\", &nms, \"non-maximum suppression\");\n  m.def(\"roi_align_forward\", &ROIAlign_forward, \"ROIAlign_forward\");\n  m.def(\"roi_align_backward\", &ROIAlign_backward, \"ROIAlign_backward\");\n  m.def(\"roi_pool_forward\", &ROIPool_forward, \"ROIPool_forward\");\n  m.def(\"roi_pool_backward\", &ROIPool_backward, \"ROIPool_backward\");\n  m.def(\"sigmoid_focalloss_forward\", &SigmoidFocalLoss_forward, \"SigmoidFocalLoss_forward\");\n  m.def(\"sigmoid_focalloss_backward\", &SigmoidFocalLoss_backward, \"SigmoidFocalLoss_backward\");\n  m.def(\"build_cost_volume_forward\", &BuildCostVolume_forward, \"BuildCostVolume_forward\");\n  m.def(\"build_cost_volume_backward\", &BuildCostVolume_backward, \"BuildCostVolume_backward\");\n}\n"
  },
  {
    "path": "disparity/dataloader/DataStatistics.py",
    "content": "import torch\nfrom dataloader import listflowfile as lt\nfrom dataloader import SecenFlowLoader as DA\n"
  },
  {
    "path": "disparity/dataloader/KITTILoader.py",
    "content": "import os\nimport torch\nimport torch.utils.data as data\nimport torchvision.transforms as transforms\nimport random\nfrom PIL import Image, ImageOps\nimport numpy as np\nfrom ..utils import preprocess\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\n\ndef default_loader(path):\n    \n    return Image.open(path).convert('RGB')\n\n\ndef npy_loader(path):\n    return np.load(path)\n\ndef disparity_loader(path):\n    return Image.open(path)\n\n\nclass myImageFloder(data.Dataset):\n    def __init__(self, left, right, left_disparity, left_norm, training, loader=default_loader, dploader=disparity_loader):\n\n        self.left = left\n        self.right = right\n        self.disp_L = left_disparity\n        self.norm_L = left_norm\n        self.loader = loader\n        self.dploader = dploader\n        self.npy_loader = npy_loader\n        self.training = training\n\n    def __getitem__(self, index):\n        left = self.left[index]\n        right = self.right[index]\n        disp_L = self.disp_L[index]\n        norm_L = self.norm_L[index]\n\n        left_img = self.loader(left)\n        right_img = self.loader(right)\n        dataL = self.dploader(disp_L)\n        normL = self.npy_loader(norm_L[:-3]+'npy')\n        \n\n        if self.training:\n            w, h = left_img.size\n            # th, tw = 320, 1152\n            # th, tw = 256, 1152\n            # th, tw = 311, 1178\n            th, tw = 320, 1152\n            # th, tw = 256, 512\n            \n\n            x1 = random.randint(0, w - tw)\n            y1 = random.randint(0, h - th)\n\n            left_img = left_img.crop((x1, y1, x1 + tw, y1 + th))\n            right_img = right_img.crop((x1, y1, x1 + tw, y1 + th))\n\n            dataL = np.ascontiguousarray(dataL, dtype=np.float32) / 256\n           
 dataL = dataL[y1:y1 + th, x1:x1 + tw]\n\n            \n            normL = normL[y1:y1 + th, x1:x1 + tw, :]\n\n            processed = preprocess.get_transform(augment=True)\n            left_img = processed(left_img)\n            right_img = processed(right_img)\n            # left_img = left_img/255 - 1\n            # right_img = right_img/255 - 1\n\n            # left_img, right_img = preprocess.get_transform_unsym(left_img, right_img, [th, tw])\n            # left_img, right_img = left_img-1, right_img-1\n\n            # delta_h = np.floor(np.random.uniform(50,150))\n            # delta_w = np.floor(np.random.uniform(50,200))\n\n            # Copy one random patch of the right image over another (occlusion-style augmentation).\n            # The patch size is cast to int because random.randint expects integer bounds.\n            delta_h = int(np.random.uniform(50,180))\n            delta_w = int(np.random.uniform(50,250))\n            x1_aug = random.randint(0, th - delta_h)\n            y1_aug = random.randint(0, tw - delta_w)\n            x2_aug = random.randint(0, th - delta_h)\n            y2_aug = random.randint(0, tw - delta_w)\n            right_img[:,int(x1_aug):int(x1_aug+delta_h), int(y1_aug):int(y1_aug+delta_w)]  = right_img[:,int(x2_aug):int(x2_aug+delta_h), int(y2_aug):int(y2_aug+delta_w)]\n\n            \n\n\n\n            return [left_img.unsqueeze(0), right_img.unsqueeze(0), torch.tensor(dataL).unsqueeze(0),torch.tensor(normL)]\n        else:\n            w, h = left_img.size\n            # left_img = left_img.crop((w - 1232, h - 368, w, h))\n            # right_img = right_img.crop((w - 1232, h - 368, w, h))\n            # left_img = left_img.crop((w - 1152, h - 256, w, h))\n            # right_img = right_img.crop((w - 1152, h - 256, w, h))\n            left_img = left_img.crop((w - 1152, h - 320, w, h))\n            right_img = right_img.crop((w - 1152, h - 320, w, h))\n            w1, h1 = left_img.size\n\n            # dataL = dataL.crop((w - 1152, h - 256, w, h))\n            dataL = dataL.crop((w - 1152, h - 320, w, h))\n            dataL = np.ascontiguousarray(dataL, dtype=np.float32) / 256\n\n            processed 
= preprocess.get_transform(augment=False)\n            left_img = processed(left_img)\n            right_img = processed(right_img)\n            # print(left_img, right_img, dataL)\n\n            return [left_img, right_img, dataL, dataL]\n\n    def __len__(self):\n        return len(self.left)"
  },
  {
    "path": "disparity/dataloader/KITTI_submission_loader.py",
    "content": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\nimport numpy as np\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\ndef dataloader(filepath):\n\n  left_fold  = 'image_2/'\n  right_fold = 'image_3/'\n\n  # left_fold  = 'colored_0/'\n  # right_fold = 'colored_1/'\n\n  image = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]\n\n\n  left_test  = [filepath+left_fold+img for img in image]\n  right_test = [filepath+right_fold+img for img in image]\n  \n\n\n  return left_test, right_test\n"
  },
  {
    "path": "disparity/dataloader/KITTI_submission_loader2012.py",
    "content": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\nimport numpy as np\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\ndef dataloader(filepath):\n\n  left_fold  = 'colored_0/'\n  right_fold = 'colored_1/'\n\n\n  image = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]\n\n\n  left_test  = [filepath+left_fold+img for img in image]\n  right_test = [filepath+right_fold+img for img in image]\n\n  return left_test, right_test\n"
  },
  {
    "path": "disparity/dataloader/KITTIloader2012.py",
    "content": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\nimport numpy as np\nimport random\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\ndef dataloader(filepath, arg=False):\n\n  left_fold  = 'colored_0/'\n  right_fold = 'colored_1/'\n  disp_noc   = 'disp_occ/'\n  disp_norm   = 'dispnorm_occ/'\n\n  # os.listdir returns entries in arbitrary order; sort them so the\n  # index-based validation split below is reproducible across machines.\n  image = sorted(img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1)\n  \n  valist = [1,15,39,65,101,113,134,154,175,4,16,40,66,102,118,139,156,180,5,19,52,82,104,\n            119,143,157,181,9,25,56,85,105,120,145,161,186,11,29,60,89,107,122,148,167,\n            188,12,31,63,95,108,128,151,170,14,32,64,97,112,132,153,171]\n  # valist = []\n  train = []\n  val = []\n  for i in range(len(image)):\n    if i in valist:\n      val.append(image[i])\n    else:\n      train.append(image[i])\n  random.shuffle(train)\n  \n  left_train  = [filepath+left_fold+img for img in train]\n  right_train = [filepath+right_fold+img for img in train]\n  disp_train = [filepath+disp_noc+img for img in train]\n  norm_train = [filepath+disp_norm+img for img in train]\n\n\n  left_val  = [filepath+left_fold+img for img in val]\n  right_val = [filepath+right_fold+img for img in val]\n  disp_val = [filepath+disp_noc+img for img in val]\n  norm_val = [filepath+disp_norm+img for img in val]\n\n  return left_train, right_train, disp_train, norm_train, left_val, right_val, disp_val, norm_val\n"
  },
  {
    "path": "disparity/dataloader/KITTIloader2015.py",
    "content": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\nimport numpy as np\nimport random\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\ndef dataloader(filepath):\n\n  left_fold  = 'image_2/'\n  right_fold = 'image_3/'\n  disp_L = 'disp_occ_0/'\n  disp_R = 'disp_occ_1/'\n  disp_norm = 'dispnorm_occ/'\n\n  image = [img for img in os.listdir(filepath+left_fold) if img.find('_10') > -1]\n\n  all_index = np.arange(200)\n  #np.random.shuffle(all_index)\n  # vallist = all_index[:40]\n#   val = ['{:06d}_10.png'.format(x) for x in vallist]\n  \n\n  val = []\n  train = [x for x in image if x not in val]\n  random.shuffle(train)\n\n\n  left_train  = [filepath+left_fold+img for img in train]\n  right_train = [filepath+right_fold+img for img in train]\n  disp_train_L = [filepath+disp_L+img for img in train]\n  disp_train_R = [filepath+disp_R+img for img in train]\n  norm_train_L = [filepath+disp_norm+img for img in train]\n\n  left_val  = [filepath+left_fold+img for img in val]\n  right_val = [filepath+right_fold+img for img in val]\n  disp_val_L = [filepath+disp_L+img for img in val]\n  disp_val_R = [filepath+disp_R+img for img in val]\n  norm_val_L = [filepath+disp_norm+img for img in val]\n\n  return left_train, right_train, disp_train_L, norm_train_L, left_val, right_val, disp_val_L, norm_val_L\n"
  },
  {
    "path": "disparity/dataloader/SceneFlowLoader_demo.py",
    "content": "import torch.utils.data as data\nimport random\nfrom PIL import Image\nfrom . import preprocess\nimport numpy as np\nimport sys, os\nsys.path.append(os.path.abspath(os.path.dirname(__file__)))\nIMG_EXTENSIONS = [\n    '.jpg',\n    '.JPG',\n    '.jpeg',\n    '.JPEG',\n    '.png',\n    '.PNG',\n    '.ppm',\n    '.PPM',\n    '.bmp',\n    '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\n\ndef default_loader(path):\n    return Image.open(path).convert('RGB')\n\n\ndef disparity_loader(path):\n    path_prefix = path.split('.')[0]\n    path1 = path_prefix + '_exception_assign_minus_1.npy'\n    path2 = path_prefix + '.npy'\n    path3 = path_prefix + '.pfm'\n    import os.path as ospath\n    if ospath.exists(path1):\n        return np.load(path1)\n    else:\n        if ospath.exists(path2):\n            data = np.load(path2)\n        else:\n            from readpfm import readPFM\n            data, _ = readPFM(path3)\n            np.save(path2, data)\n        for i in range(data.shape[0]):\n            for j in range(data.shape[1]):\n                if j - data[i][j] < 0:\n                    data[i][j] = -1\n        np.save(path1, data)\n        return data\n\n\nclass myImageFloder(data.Dataset):\n    def __init__(self,\n                 left,\n                 right,\n                 left_disparity,\n                 training,\n                 normalize,\n                 loader=default_loader,\n                 dploader=disparity_loader):\n\n        self.left = left\n        self.right = right\n        self.disp_L = left_disparity\n        self.loader = loader\n        self.dploader = dploader\n        self.training = training\n        self.normalize = normalize\n\n    def __getitem__(self, index):\n        left = self.left[index]\n        right = self.right[index]\n        disp_L = self.disp_L[index]\n\n        left_img = self.loader(left)\n        right_img = 
self.loader(right)\n        dataL = self.dploader(disp_L)\n        dataL = np.ascontiguousarray(dataL, dtype=np.float32)\n\n        processed = preprocess.get_transform(\n            augment=False, normalize=self.normalize)\n        left_img = processed(left_img)\n        right_img = processed(right_img)\n\n        return left_img, right_img, dataL, left, disp_L.split(\n            '.')[0] + '_exception_assign_minus_1.npy'\n\n    def __len__(self):\n        return len(self.left)\n"
  },
  {
    "path": "disparity/dataloader/SecenFlowLoader.py",
    "content": "import os\nimport torch\nimport torch.utils.data as data\nimport torch\n#import torchvision.transforms as transforms\nimport random\nfrom PIL import Image, ImageOps\nfrom . import preprocess\nfrom . import listflowfile as lt\nfrom . import readpfm as rp\nimport numpy as np\nimport cv2\nimport torch.nn.functional as F\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\n\ndef default_loader(path):\n    image = cv2.imread(path)\n    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n    return image\n\n\ndef disparity_loader(path):\n    return rp.readPFM(path)\n\ndef random_replace(img,num,size):\n    #random crop areas and replace to the same size random crop from the image self.\n    #from HITNet ,it random crop the right image.\n    #args num:num of areas to crop and replace\n    #     size: random [0,size]*[0,size]\n    h = img.shape[0]\n    w = img.shape[1]\n    for i in range(num):\n        size_ix = random.randint(0,size)\n        size_iy = random.randint(0, size)\n        x1 = random.randint(0, w - size_ix)\n        y1 = random.randint(0, h - size_iy)\n\n        x2 = random.randint(0, w - size_ix)\n        y2 = random.randint(0, h - size_iy)\n        #replace\n        img[y1:y1 + size_iy, x1:x1 + size_ix, :] = img[y2:y2 + size_iy, x2:x2 + size_ix, :]\n\n    return img\n\nclass myImageFloder(data.Dataset):\n    def __init__(self, left, right, left_disparity,right_disparity, training, loader=default_loader, dploader=disparity_loader):\n\n        self.left = left\n        self.right = right\n        self.disp_L = left_disparity\n        self.loader = loader\n        self.dploader = dploader\n        self.training = training\n        if right_disparity is not None:\n            self.disp_R = right_disparity\n        else:\n            self.disp_R = None\n\n        
print('len', len(self.left))\n    def __getitem__(self, index):\n        left = self.left[index]\n        right = self.right[index]\n        disp_L = self.disp_L[index]\n        # Default to None so the dataR checks below also work when no right disparity is given.\n        dataR = None\n        if self.disp_R is not None:\n            disp_R = self.disp_R[index]\n            dataR,scaleR = self.dploader(disp_R)\n            dataR = np.ascontiguousarray(dataR, dtype=np.float32)\n\n        left_img = self.loader(left)\n        right_img = self.loader(right)\n        dataL, scaleL = self.dploader(disp_L)\n        dataL = np.ascontiguousarray(dataL, dtype=np.float32)\n\n        if self.training:\n\n            h = left_img.shape[0]\n            w = left_img.shape[1]\n\n            th, tw = 320, 960\n\n            x1 = random.randint(0, w - tw)\n            y1 = random.randint(0, h - th)\n\n            left_img = left_img[y1:y1+th,x1:x1+tw,:]\n            right_img = right_img[y1:y1+th,x1:x1+tw,:]\n\n            #left_img = random_replace(left_img,5,80)\n\n            dataL = dataL[y1:y1 + th, x1:x1 + tw]\n\n            if dataR is not None:\n                dataR=dataR[y1:y1 + th, x1:x1 + tw]\n\n\n            processed = preprocess.get_transform(augment=False)\n\n            #random replace\n            #right_img = random_replace(right_img,4,5)\n\n            left_img_and_d= processed(image=left_img,mask=dataL,bboxes=[],category_id=[])\n            left_img = left_img_and_d['image']\n            dataL = left_img_and_d['mask']\n            if dataR is not None:\n                right_img_and_d = processed(image=right_img, mask=dataR, bboxes=[], category_id=[])\n                right_img = right_img_and_d['image']\n                dataR = right_img_and_d['mask']\n            else:\n                right_img = processed(image=right_img,mask=None,bboxes=[],category_id=[])['image']\n\n\n            if dataR is not None:\n                return left_img, right_img, dataL,dataR\n            else:\n                return left_img, right_img, dataL\n        else:\n\n        
    h = left_img.shape[0]\n            w = left_img.shape[1]\n\n            th, tw = 512, 960\n\n            #x1 = random.randint(0, w - tw)\n            #y1 =  random.randint(0, h - th)\n            x1 = 0\n            y1 = 0\n\n            left_img = left_img[y1:y1 + th, x1:x1 + tw, :]\n            right_img = right_img[y1:y1 + th, x1:x1 + tw, :]\n\n            dataL = dataL[y1:y1 + th, x1:x1 + tw]\n            if dataR is not None:\n                dataR=dataR[y1:y1 + th, x1:x1 + tw]\n\n            processed = preprocess.get_transform(augment=False)\n            left_img_and_d= processed(image=left_img,mask=dataL,bboxes=[],category_id=[])\n            left_img = left_img_and_d['image']\n            dataL = left_img_and_d['mask']\n            if dataR is not None:\n                right_img_and_d = processed(image=right_img, mask=dataR, bboxes=[], category_id=[])\n                right_img = right_img_and_d['image']\n                dataR = right_img_and_d['mask']\n            else:\n                right_img = processed(image=right_img,mask=None,bboxes=[],category_id=[])['image']\n\n\n            if dataR is not None:\n                return left_img, right_img, dataL, dataR\n            else:\n                return left_img, right_img, dataL\n\n\n    def __len__(self):\n        return len(self.left)\n\n"
  },
  {
    "path": "disparity/dataloader/SecenFlowLoader1.py",
    "content": "import os\nimport torch\nimport torch.utils.data as data\nimport torch\nimport torchvision.transforms as transforms\nimport random\nfrom PIL import Image, ImageOps\nfrom . import preprocess\nfrom . import listflowfile as lt\nfrom . import readpfm as rp\nimport numpy as np\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\n\ndef default_loader(path):\n    return Image.open(path).convert('RGB')\n\n\ndef disparity_loader(path):\n    return rp.readPFM(path)\n\n\nclass myImageFloder(data.Dataset):\n    def __init__(self, left, right, left_disparity, training, loader=default_loader, dploader=disparity_loader):\n\n        self.left = left\n        self.right = right\n        self.disp_L = left_disparity\n        self.loader = loader\n        self.dploader = dploader\n        self.training = training\n\n    def __getitem__(self, index):\n        left = self.left[index]\n        right = self.right[index]\n        disp_L = self.disp_L[index]\n\n        left_img = self.loader(left)\n        right_img = self.loader(right)\n        dataL, scaleL = self.dploader(disp_L)\n        dataL = np.ascontiguousarray(dataL, dtype=np.float32)\n\n        if self.training:\n            w, h = left_img.size\n            th, tw = 256, 512\n            # th, tw = 544, 960\n\n            x1 = random.randint(0, w - tw)\n            y1 = random.randint(0, h - th)\n\n            left_img = left_img.crop((x1, y1, x1 + tw, y1 + th))\n            right_img = right_img.crop((x1, y1, x1 + tw, y1 + th))\n\n            dataL = dataL[y1:y1 + th, x1:x1 + tw]\n\n            processed = preprocess.get_transform(augment=False)\n            left_img = processed(left_img)\n            right_img = processed(right_img)\n\n            return left_img, right_img, dataL\n        else:\n            w, h = left_img.size\n     
       left_img = left_img.crop((w - 960, h - 544, w, h))\n            right_img = right_img.crop((w - 960, h - 544, w, h))\n            processed = preprocess.get_transform(augment=False)\n            left_img = processed(left_img)\n            right_img = processed(right_img)\n\n            return left_img, right_img, dataL\n\n    def __len__(self):\n        return len(self.left)\n"
  },
  {
    "path": "disparity/dataloader/SecenFlowLoaderfix.py",
    "content": "import torch.utils.data as data\nimport random\nfrom PIL import Image\nfrom . import preprocess\n# import preprocess\nimport numpy as np\nimport sys, os\nsys.path.append(os.path.abspath(os.path.dirname(__file__)))\nIMG_EXTENSIONS = [\n    '.jpg',\n    '.JPG',\n    '.jpeg',\n    '.JPEG',\n    '.png',\n    '.PNG',\n    '.ppm',\n    '.PPM',\n    '.bmp',\n    '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\n\ndef default_loader(path):\n    return Image.open(path).convert('RGB')\n\n\n# def disparity_loader(path):\n#     path_prefix = path.split('.')[0]\n#     # print(path_prefix)\n#     path1 = path_prefix + '_exception_assign_minus_1.npy'\n#     path2 = path_prefix + '.npy'\n#     path3 = path_prefix + '.pfm'\n#     import os.path as ospath\n#     if ospath.exists(path1):\n#         return np.load(path1)\n#     else:\n#         if ospath.exists(path2):\n#             data = np.load(path2)\n#         else:\n#             # from readpfm import readPFMreadPFM\n#             from readpfm import readPFM\n#             data, _ = readPFM(path3)\n#             np.save(path2, data)\n#         for i in range(data.shape[0]):\n#             for j in range(data.shape[1]):\n#                 if j - data[i][j] < 0:\n#                     data[i][j] = -1\n#         np.save(path1, data)\n#         return data\n\n\ndef disparity_loader(path):\n    path_prefix = path.split('.')[0]\n    # print(path_prefix)\n    path1 = path_prefix + '_exception_assign_minus_1.npy'\n    path2 = path_prefix + '.npy'\n    path3 = path_prefix + '.pfm'\n    import os.path as ospath\n    if ospath.exists(path1):\n        return np.load(path1)\n    else:\n\n        # from readpfm import readPFMreadPFM\n        from readpfm import readPFM\n        data, _ = readPFM(path3)\n        np.save(path2, data)\n        for i in range(data.shape[0]):\n            for j in range(data.shape[1]):\n                if j - data[i][j] < 
0:\n                    data[i][j] = -1\n        np.save(path1, data)\n        return data\n\nclass myImageFloder(data.Dataset):\n    def __init__(self,\n                 left,\n                 right,\n                 left_disparity,\n                 right_disparity,\n                 training,\n                 normalize,\n                 loader=default_loader,\n                 dploader=disparity_loader):\n\n        self.left = left\n        self.right = right\n        self.disp_L = left_disparity\n        self.disp_R = right_disparity\n        self.loader = loader\n        self.dploader = dploader\n        self.training = training\n        self.normalize = normalize\n\n    def __getitem__(self, index):\n        \n        left = self.left[index]\n        \n        right = self.right[index]\n        disp_L = self.disp_L[index]\n        disp_R = self.disp_R[index]\n        left_img = self.loader(left)\n        right_img = self.loader(right)\n        dataL = self.dploader(disp_L)\n        dataR = self.dploader(disp_R)\n        \n        dataL = np.ascontiguousarray(dataL, dtype=np.float32)\n        dataR = np.ascontiguousarray(dataR, dtype=np.float32)\n\n        processed = preprocess.get_transform(\n            augment=False, normalize=self.normalize)\n        left_img = processed(left_img)\n        right_img = processed(right_img)\n\n\n        return left_img, right_img, dataL, dataR\n\n    def __len__(self):\n        return len(self.left)\nif __name__ == '__main__':\n    path = '/media/lxy/sdd1/stereo_coderesource/dataset_nie/SceneFlowData/frames_cleanpass/flyingthings3d_disparity/TRAIN/A/0024/left/0011.pfm'\n    res = disparity_loader(path)\n    print(res.shape)\n"
  },
  {
    "path": "disparity/dataloader/Testloader.py",
    "content": "\nimport os\nimport torch\nimport torch.utils.data as data\nimport torch\nimport torchvision.transforms as transforms\nimport random\nfrom PIL import Image, ImageOps\nimport numpy as np\n#from dataloader.preprocess import preprocess\nimport dataloader.preprocess as preprocess\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\n\ndef default_loader(path):\n    return Image.open(path).convert('RGB')\n\n\ndef disparity_loader(path):\n    return Image.open(path)\n\n\n\ndef dataloader(filepath):\n\n  left_fold  = 'left/'\n  right_fold = 'right/'\n\n\n  left_test= [img for img in os.listdir(filepath+left_fold) if img.find('_left') > -1]\n  left_test.sort()\n  right_test= [img for img in os.listdir(filepath+right_fold) if img.find('_right') > -1]\n  right_test.sort()\n\n  left_test = [filepath+left_fold+img for img in left_test]\n  right_test = [filepath+right_fold+img for img in right_test]\n\n  return left_test, right_test\n\nclass myImageFloder(data.Dataset):\n    def __init__(self, left, right, loader=default_loader):\n\n        self.left = left\n        self.right = right\n        self.loader = loader\n\n\n    def __getitem__(self, index):\n        left = self.left[index]\n        right = self.right[index]\n        print('left',index,left)\n        print('right',index,right)\n\n\n        left_img = self.loader(left)\n        right_img = self.loader(right)\n\n\n        #test   not for training\n        w, h = left_img.size\n\n        left_img = left_img.crop((w - 992, h - 736, w, h))\n        right_img = right_img.crop((w - 992, h - 736, w, h))\n        # left_img = left_img.crop((w - 1232, h - 368, w, h))\n        # right_img = right_img.crop((w - 1232, h - 368, w, h))\n        w1, h1 = left_img.size\n\n        #dataL = dataL.crop((w - 1232, h - 368, w, h))\n\n\n        
processedL = preprocess.get_transform(augment=False,camera=None)\n        processedR = preprocess.get_transform(augment=False,camera=None)\n        left_img = processedL(left_img)\n        right_img = processedR(right_img)\n\n        return left_img, right_img\n\n    def __len__(self):\n        return len(self.left)\n\nif __name__ == '__main__':\n    left,right=dataloader('/disk1/hyj/test_picture/819_testpic/')\n    print(left)\n    print(len(left))\n    print(right)\n    print(len(right))"
  },
  {
    "path": "disparity/dataloader/__init__.py",
    "content": ""
  },
  {
    "path": "disparity/dataloader/listflowfile.py",
    "content": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename):\n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\n\ndef dataloader(filepath):\n    filepath += '/'\n    classes = [d for d in os.listdir(filepath) if os.path.isdir(os.path.join(filepath, d))]\n    image = [img for img in classes if img.find('frames_cleanpass') > -1]\n    disp = [dsp for dsp in classes if dsp.find('disparity') > -1]\n    print(classes)\n    print('img',image)\n    print('disp', disp)\n    monkaa_path = filepath + [x for x in image if 'monkaa' in x][0]\n    monkaa_disp = filepath + [x for x in disp if 'monkaa' in x][0]\n\n    monkaa_dir = os.listdir(monkaa_path)\n\n    all_left_img = []\n    all_right_img = []\n    all_left_disp = []\n    all_right_disp = []\n    test_left_img = []\n    test_right_img = []\n    test_left_disp = []\n\n    for dd in monkaa_dir:\n        for im in os.listdir(monkaa_path + '/' + dd + '/left/'):\n            if is_image_file(monkaa_path + '/' + dd + '/left/' + im):\n                all_left_img.append(monkaa_path + '/' + dd + '/left/' + im)\n                all_left_disp.append(monkaa_disp + '/' + dd + '/left/' + im.split(\".\")[0] + '.pfm')\n                all_right_disp.append(monkaa_disp + '/' + dd + '/right/' + im.split(\".\")[0] + '.pfm')\n\n        for im in os.listdir(monkaa_path + '/' + dd + '/right/'):\n            if is_image_file(monkaa_path + '/' + dd + '/right/' + im):\n                all_right_img.append(monkaa_path + '/' + dd + '/right/' + im)\n\n    flying_path = filepath + [x for x in image if x == 'frames_cleanpass'][0]\n    flying_disp = filepath + [x for x in disp if x == 'frames_disparity'][0]\n    flying_dir = flying_path + '/TRAIN/'\n    subdir = ['A', 'B', 'C']\n\n    for ss in subdir:\n        flying = 
os.listdir(flying_dir + ss)\n\n        for ff in flying:\n            imm_l = os.listdir(flying_dir + ss + '/' + ff + '/left/')\n            for im in imm_l:\n                if is_image_file(flying_dir + ss + '/' + ff + '/left/' + im):\n                    all_left_img.append(flying_dir + ss + '/' + ff + '/left/' + im)\n\n                all_left_disp.append(flying_disp + '/TRAIN/' + ss + '/' + ff + '/left/' + im.split(\".\")[0] + '.pfm')\n                all_right_disp.append(flying_disp + '/TRAIN/' + ss + '/' + ff + '/right/' + im.split(\".\")[0] + '.pfm')\n\n                if is_image_file(flying_dir + ss + '/' + ff + '/right/' + im):\n                    all_right_img.append(flying_dir + ss + '/' + ff + '/right/' + im)\n\n    flying_dir = flying_path + '/TEST/'\n\n    subdir = ['A', 'B', 'C']\n    # subdir = ['C']\n    # print('*****************')\n\n    for ss in subdir:\n        flying = os.listdir(flying_dir + ss)\n\n        for ff in flying:\n            imm_l = os.listdir(flying_dir + ss + '/' + ff + '/left/')\n            for im in imm_l:\n                if is_image_file(flying_dir + ss + '/' + ff + '/left/' + im):\n                    test_left_img.append(flying_dir + ss + '/' + ff + '/left/' + im)\n\n                test_left_disp.append(flying_disp + '/TEST/' + ss + '/' + ff + '/left/' + im.split(\".\")[0] + '.pfm')\n\n                if is_image_file(flying_dir + ss + '/' + ff + '/right/' + im):\n                    test_right_img.append(flying_dir + ss + '/' + ff + '/right/' + im)\n\n    driving_dir = filepath + [x for x in image if 'driving' in x][0] + '/'\n    driving_disp = filepath + [x for x in disp if 'driving' in x][0]\n\n    subdir1 = ['35mm_focallength', '15mm_focallength']\n    subdir2 = ['scene_backwards', 'scene_forwards']\n    subdir3 = ['fast', 'slow']\n\n    for i in subdir1:\n        for j in subdir2:\n            for k in subdir3:\n                imm_l = os.listdir(driving_dir + i + '/' + j + '/' + k + '/left/')\n                
for im in imm_l:\n                    if is_image_file(driving_dir + i + '/' + j + '/' + k + '/left/' + im):\n                        all_left_img.append(driving_dir + i + '/' + j + '/' + k + '/left/' + im)\n                    all_left_disp.append(\n                        driving_disp + '/' + i + '/' + j + '/' + k + '/left/' + im.split(\".\")[0] + '.pfm')\n                    all_right_disp.append(\n                        driving_disp + '/' + i + '/' + j + '/' + k + '/right/' + im.split(\".\")[0] + '.pfm')\n\n                    if is_image_file(driving_dir + i + '/' + j + '/' + k + '/right/' + im):\n                        all_right_img.append(driving_dir + i + '/' + j + '/' + k + '/right/' + im)\n\n    return all_left_img, all_right_img, all_left_disp,all_right_disp, test_left_img, test_right_img, test_left_disp\n"
  },
  {
    "path": "disparity/dataloader/listflowfilefix.py",
    "content": "import torch.utils.data as data\n\nfrom PIL import Image\nimport os\nimport os.path\n\nIMG_EXTENSIONS = [\n    '.jpg', '.JPG', '.jpeg', '.JPEG',\n    '.png', '.PNG', '.ppm', '.PPM', '.bmp', '.BMP',\n]\n\n\ndef is_image_file(filename): \n    return any(filename.endswith(extension) for extension in IMG_EXTENSIONS)\n\ndef dataloader(filepath): # /media/hugonie/Hhome/dataset/SceneFlowData/\n\n # classes = [d for d in os.listdir(filepath) if os.path.isdir(os.path.join(filepath, d))]\n # print(classes)\n # image = [img for img in classes if img.find('frames_cleanpass') > -1]\n # print(image)\n # disp  = [dsp for dsp in classes if dsp.find('disparity') > -1]\n # print(disp)\n # monkaa\n \n # monkaa_path = filepath + [x for x in image if 'monkaa' in x][0]\n # monkaa_disp = filepath + [x for x in disp if 'monkaa' in x][0]\n monkaa_path = filepath + '/frames_cleanpass/monkaa'\n monkaa_disp = filepath + '/disparity/monkaa'\n monkaa_dir  = os.listdir(monkaa_path)\n\n all_left_img=[]\n all_right_img=[]\n all_left_disp = []\n all_right_disp = []\n test_left_img=[]\n test_right_img=[]\n test_left_disp = []\n test_right_disp = []\n\n\n for dd in monkaa_dir:\n   for im in os.listdir(monkaa_path+'/'+dd+'/left/'):\n    if is_image_file(monkaa_path+'/'+dd+'/left/'+im):\n     all_left_img.append(monkaa_path+'/'+dd+'/left/'+im)\n     all_left_disp.append(monkaa_disp+'/'+dd+'/left/'+im.split(\".\")[0]+'.pfm')\n     all_right_disp.append(monkaa_disp+'/'+dd+'/right/'+im.split(\".\")[0]+'.pfm')\n\n   for im in os.listdir(monkaa_path+'/'+dd+'/right/'):\n    if is_image_file(monkaa_path+'/'+dd+'/right/'+im):\n     all_right_img.append(monkaa_path+'/'+dd+'/right/'+im)\n\n # flyingthings\n # flying_path = filepath + [x for x in image if x == 'flyingthings3D'][0]\n # flying_disp = filepath + [x for x in disp if x == 'flyingthings3D'][0]\n flying_path = filepath + '/frames_cleanpass/flyingthings3D'\n flying_disp = filepath + '/disparity/flyingthings3D'\n flying_dir = 
flying_path+'/TRAIN/'\n subdir = ['A','B','C']\n\n for ss in subdir:\n    flying = os.listdir(flying_dir+ss)\n\n    for ff in flying:\n      imm_l = os.listdir(flying_dir+ss+'/'+ff+'/left/')\n      for im in imm_l:\n       if is_image_file(flying_dir+ss+'/'+ff+'/left/'+im):\n         all_left_img.append(flying_dir+ss+'/'+ff+'/left/'+im)\n\n       all_left_disp.append(flying_disp+'/TRAIN/'+ss+'/'+ff+'/left/'+im.split(\".\")[0]+'.pfm')\n       all_right_disp.append(flying_disp+'/TRAIN/'+ss+'/'+ff+'/right/'+im.split(\".\")[0]+'.pfm')\n\n       if is_image_file(flying_dir+ss+'/'+ff+'/right/'+im):\n         all_right_img.append(flying_dir+ss+'/'+ff+'/right/'+im)\n\n flying_dir = flying_path+'/TEST/'\n\n subdir = ['A','B','C']\n\n for ss in subdir:\n    flying = os.listdir(flying_dir+ss)\n\n    for ff in flying:\n      imm_l = os.listdir(flying_dir+ss+'/'+ff+'/left/')\n      for im in imm_l:\n       if is_image_file(flying_dir+ss+'/'+ff+'/left/'+im):\n         test_left_img.append(flying_dir+ss+'/'+ff+'/left/'+im)\n\n       test_left_disp.append(flying_disp+'/TEST/'+ss+'/'+ff+'/left/'+im.split(\".\")[0]+'.pfm')\n       test_right_disp.append(flying_disp+'/TEST/'+ss+'/'+ff+'/right/'+im.split(\".\")[0]+'.pfm')\n\n       if is_image_file(flying_dir+ss+'/'+ff+'/right/'+im):\n         test_right_img.append(flying_dir+ss+'/'+ff+'/right/'+im)\n\n\n # driving\n # driving_dir = filepath + [x for x in image if 'driving' in x][0] + '/'\n # driving_disp = filepath + [x for x in disp if 'driving' in x][0]\n driving_dir = filepath + '/frames_cleanpass/driving/'\n driving_disp = filepath + '/disparity/driving'\n\n subdir1 = ['15mm_focallength','35mm_focallength']\n subdir2 = ['scene_backwards','scene_forwards']\n subdir3 = ['fast','slow']\n\n for i in subdir1:\n   for j in subdir2:\n    for k in subdir3:\n        imm_l = os.listdir(driving_dir+i+'/'+j+'/'+k+'/left/')    \n        for im in imm_l:\n          if is_image_file(driving_dir+i+'/'+j+'/'+k+'/left/'+im):\n            
all_left_img.append(driving_dir+i+'/'+j+'/'+k+'/left/'+im)\n          all_left_disp.append(driving_disp+'/'+i+'/'+j+'/'+k+'/left/'+im.split(\".\")[0]+'.pfm')\n          all_right_disp.append(driving_disp+'/'+i+'/'+j+'/'+k+'/right/'+im.split(\".\")[0]+'.pfm')\n\n          if is_image_file(driving_dir+i+'/'+j+'/'+k+'/right/'+im):\n            all_right_img.append(driving_dir+i+'/'+j+'/'+k+'/right/'+im)\n\n\n return all_left_img, all_right_img, all_left_disp,all_right_disp, test_left_img, test_right_img, test_left_disp, test_right_disp"
  },
  {
    "path": "disparity/dataloader/preprocess.py",
    "content": "import torch\n#import torchvision.transforms as transforms\nimport random\n\nimport cv2\nimport albumentations as A\nfrom albumentations.pytorch import ToTensorV2\n\n__imagenet_stats = {'mean': [0.485, 0.456, 0.406],\n                   'std': [0.229, 0.224, 0.225]}\n\n#__imagenet_stats = {'mean': [0.5, 0.5, 0.5],\n#                   'std': [0.5, 0.5, 0.5]}\n\n__imagenet_pca = {\n    'eigval': torch.Tensor([0.2175, 0.0188, 0.0045]),\n    'eigvec': torch.Tensor([\n        [-0.5675,  0.7192,  0.4009],\n        [-0.5808, -0.0045, -0.8140],\n        [-0.5836, -0.6948,  0.4203],\n    ])\n}\n\n\n\ndef totensor_normalize():\n\n    return A.Compose([\n        # A.Normalize(\n        #     mean=[0.485, 0.456, 0.406],\n        #     std=[0.229, 0.224, 0.225]),\n        ToTensorV2(always_apply=True)\n    ],p=1)\n\n\n\ndef augmentv1():\n    photometric  = [\n        A.Blur(p=0.5),\n        A.HueSaturationValue(20,30,20,p=0.5),\n        A.RandomBrightnessContrast(0.2,p=0.5),\n        A.RandomGamma(p=0.5),\n        #A.ISONoise(p=1),\n        A.GaussNoise(p=0.5),\n        # A.Normalize(\n        #     mean=[0.485, 0.456, 0.406],\n        #     std=[0.229, 0.224, 0.225],\n        # ),\n        ToTensorV2()\n    ]\n\n    geometric = [\n        # A.OpticalDistortion(distort_limit=0.3, shift_limit=0.3,p=1)\n        A.ShiftScaleRotate(shift_limit=0.01,scale_limit=0.01,rotate_limit=5,p=0.5)\n        #A.ShiftScaleRotate(shift_limit=0.3, scale_limit=0.3, rotate_limit=30, p=0.5)\n    ]\n\n    return A.Compose(photometric)\n\n\n\ndef get_transform(augment=True):\n\n\n    if augment:\n            return augmentv1()\n    else:\n            return totensor_normalize()\n\n\n\n\n\n\n"
  },
  {
    "path": "disparity/dataloader/readpfm.py",
    "content": "import re\nimport numpy as np\nimport sys\n\n\ndef readPFM(file):\n    file = open(file, 'rb')\n\n    color = None\n    width = None\n    height = None\n    scale = None\n    endian = None\n\n    header = file.readline().rstrip()\n    if header == b'PF':\n        color = True\n    elif header == b'Pf':\n        color = False\n    else:\n        raise Exception('Not a PFM file.')\n\n    dim_match = re.match(r'^(\\d+)\\s(\\d+)\\s$', file.readline().decode('utf-8'))\n    if dim_match:\n        width, height = map(int, dim_match.groups())\n    else:\n        raise Exception('Malformed PFM header.')\n\n    scale = float(file.readline().rstrip())\n    if scale < 0:  # little-endian\n        endian = '<'\n        scale = -scale\n    else:\n        endian = '>'  # big-endian\n\n    data = np.fromfile(file, endian + 'f')\n    shape = (height, width, 3) if color else (height, width)\n\n    data = np.reshape(data, shape)\n    data = np.flipud(data)\n    file.close()\n    return data, scale\n"
  },
  {
    "path": "disparity/eval/__init__.py",
    "content": ""
  },
  {
    "path": "disparity/eval/kitti/README.md",
    "content": "Reference: <a href=\"https://github.com/prclibo/kitti_eval\" target=\"_blank\">https://github.com/prclibo/kitti_eval</a>\n\n# kitti_eval\n\n`evaluate_object_3d_offline.cpp`evaluates your KITTI detection locally on your own computer using your validation data selected from KITTI training dataset, with the following metrics:\n\n- overlap on image (AP)\n- oriented overlap on image (AOS)\n- overlap on ground-plane (AP)\n- overlap in 3D (AP)\n\nCompile `evaluate_object_3d_offline.cpp` with dependency of Boost and Linux `dirent.h` (You should already have it under most Linux).\n\nRun the evalutaion by:\n\n    ./evaluate_object_3d_offline groundtruth_dir result_dir\n    \nNote that you don't have to detect over all KITTI training data. The evaluator only evaluates samples whose result files exist.\n\n\n### Updates\n\n- June, 2017:\n  * Fixed the bug of detection box filtering based on min height according to KITTI's note on 25.04.2017.\n"
  },
  {
    "path": "disparity/eval/kitti/compile.sh",
    "content": "#/bin/bash\ng++ -o evaluate_object_3d_offline evaluate_object_3d_offline.cpp\n"
  },
  {
    "path": "disparity/eval/kitti/eval.sh",
    "content": "echo \"evalutating $1 ...\"\n\n./evaluate_object_3d_offline /mnt/backup/project/ylchen/dataset/KITTI_DATASET/kitti_detection/training/label_2 $1\n"
  },
  {
    "path": "disparity/eval/kitti/eval_05.sh",
    "content": "echo \"evalutating $1 ...\"\n\n./evaluate_object_3d_offline_05 ../../../data/kitti/training/label_2/ $1\n"
  },
  {
    "path": "disparity/eval/kitti/evaluate_object_3d_offline.cpp",
    "content": "#include <iostream>\n#include <algorithm>\n#include <stdio.h>\n#include <math.h>\n#include <vector>\n#include <numeric>\n#include <strings.h>\n#include <assert.h>\n\n#include <dirent.h>\n\n#include <boost/numeric/ublas/matrix.hpp>\n#include <boost/numeric/ublas/io.hpp>\n\n#include <boost/geometry.hpp>\n#include <boost/geometry/geometries/point_xy.hpp>\n#include <boost/geometry/geometries/polygon.hpp>\n#include <boost/geometry/geometries/adapted/c_array.hpp>\n\n#include \"mail.h\"\n\nBOOST_GEOMETRY_REGISTER_C_ARRAY_CS(cs::cartesian)\n\ntypedef boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double> > Polygon;\n\n\nusing namespace std;\n\n/*=======================================================================\nSTATIC EVALUATION PARAMETERS\n=======================================================================*/\n\n// holds the number of test images on the server\nconst int32_t N_TESTIMAGES = 7518;\n\n// easy, moderate and hard evaluation level\nenum DIFFICULTY{EASY=0, MODERATE=1, HARD=2};\n\n// evaluation metrics: image, ground or 3D\nenum METRIC{IMAGE=0, GROUND=1, BOX3D=2};\n\n// evaluation parameter\nconst int32_t MIN_HEIGHT[3]     = {40, 25, 25};     // minimum height for evaluated groundtruth/detections\nconst int32_t MAX_OCCLUSION[3]  = {0, 1, 2};        // maximum occlusion level of the groundtruth used for evaluation\nconst double  MAX_TRUNCATION[3] = {0.15, 0.3, 0.5}; // maximum truncation level of the groundtruth used for evaluation\n\n// evaluated object classes\nenum CLASSES{CAR=0, PEDESTRIAN=1, CYCLIST=2};\nconst int NUM_CLASS = 3;\n\n// parameters varying per class\nvector<string> CLASS_NAMES;\n// the minimum overlap required for 2D evaluation on the image/ground plane and 3D evaluation\n//const double MIN_OVERLAP[3][3] = {{0.7, 0.5, 0.5}, {0.25, 0.25, 0.25}, {0.25, 0.25, 0.25}};\n//const double MIN_OVERLAP[3][3] = {{0.7, 0.5, 0.5}, {0.5, 0.25, 0.25}, {0.5, 0.25, 0.25}};\nconst double MIN_OVERLAP[3][3] = {{0.7, 
0.5, 0.5}, {0.7, 0.5, 0.5}, {0.7, 0.5, 0.5}};\n\n// no. of recall steps that should be evaluated (discretized)\nconst double N_SAMPLE_PTS = 41;\n\n\n// initialize class names\nvoid initGlobals () {\n  CLASS_NAMES.push_back(\"car\");\n  CLASS_NAMES.push_back(\"pedestrian\");\n  CLASS_NAMES.push_back(\"cyclist\");\n}\n\n/*=======================================================================\nDATA TYPES FOR EVALUATION\n=======================================================================*/\n\n// holding data needed for precision-recall and precision-aos\nstruct tPrData {\n  vector<double> v;           // detection score for computing score thresholds\n  double         similarity;  // orientation similarity\n  int32_t        tp;          // true positives\n  int32_t        fp;          // false positives\n  int32_t        fn;          // false negatives\n  tPrData () :\n    similarity(0), tp(0), fp(0), fn(0) {}\n};\n\n// holding bounding boxes for ground truth and detections\nstruct tBox {\n  string  type;     // object type as car, pedestrian or cyclist,...\n  double   x1;      // left corner\n  double   y1;      // top corner\n  double   x2;      // right corner\n  double   y2;      // bottom corner\n  double   alpha;   // image orientation\n  tBox (string type, double x1,double y1,double x2,double y2,double alpha) :\n    type(type),x1(x1),y1(y1),x2(x2),y2(y2),alpha(alpha) {}\n};\n\n// holding ground truth data\nstruct tGroundtruth {\n  tBox    box;        // object type, box, orientation\n  double  truncation; // truncation 0..1\n  int32_t occlusion;  // occlusion 0,1,2 (non, partly, fully)\n  double ry;\n  double  t1, t2, t3;\n  double h, w, l;\n  tGroundtruth () :\n    box(tBox(\"invalild\",-1,-1,-1,-1,-10)),truncation(-1),occlusion(-1) {}\n  tGroundtruth (tBox box,double truncation,int32_t occlusion) :\n    box(box),truncation(truncation),occlusion(occlusion) {}\n  tGroundtruth (string type,double x1,double y1,double x2,double y2,double alpha,double 
truncation,int32_t occlusion) :\n    box(tBox(type,x1,y1,x2,y2,alpha)),truncation(truncation),occlusion(occlusion) {}\n};\n\n// holding detection data\nstruct tDetection {\n  tBox    box;    // object type, box, orientation\n  double  thresh; // detection score\n  double  ry;\n  double  t1, t2, t3;\n  double  h, w, l;\n  tDetection ():\n    box(tBox(\"invalid\",-1,-1,-1,-1,-10)),thresh(-1000) {}\n  tDetection (tBox box,double thresh) :\n    box(box),thresh(thresh) {}\n  tDetection (string type,double x1,double y1,double x2,double y2,double alpha,double thresh) :\n    box(tBox(type,x1,y1,x2,y2,alpha)),thresh(thresh) {}\n};\n\n\n/*=======================================================================\nFUNCTIONS TO LOAD DETECTION AND GROUND TRUTH DATA ONCE, SAVE RESULTS\n=======================================================================*/\nvector<int32_t> indices;\n\nvector<tDetection> loadDetections(string file_name, bool &compute_aos,\n        vector<bool> &eval_image, vector<bool> &eval_ground,\n        vector<bool> &eval_3d, bool &success) {\n\n  // holds all detections (ignored detections are indicated by an index vector\n  vector<tDetection> detections;\n  FILE *fp = fopen(file_name.c_str(),\"r\");\n  if (!fp) {\n    success = false;\n    return detections;\n  }\n  while (!feof(fp)) {\n    tDetection d;\n    double trash;\n    char str[255];\n    if (fscanf(fp, \"%s %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf\",\n                   str, &trash, &trash, &d.box.alpha, &d.box.x1, &d.box.y1,\n                   &d.box.x2, &d.box.y2, &d.h, &d.w, &d.l, &d.t1, &d.t2, &d.t3,\n                   &d.ry, &d.thresh)==16) {\n\n        // d.thresh = 1;\n      d.box.type = str;\n      detections.push_back(d);\n\n      // orientation=-10 is invalid, AOS is not evaluated if at least one orientation is invalid\n      if(d.box.alpha == -10)\n        compute_aos = false;\n\n      // a class is only evaluated if it is detected at least once\n      for (int c = 
0; c < NUM_CLASS; c++) {\n        if (!strcasecmp(d.box.type.c_str(), CLASS_NAMES[c].c_str())) {\n          if (!eval_image[c] && d.box.x1 >= 0)\n            eval_image[c] = true;\n          if (!eval_ground[c] && d.t1 != -1000)\n            eval_ground[c] = true;\n          if (!eval_3d[c] && d.t2 != -1000)\n            eval_3d[c] = true;\n          break;\n        }\n      }\n    }\n  }\n  fclose(fp);\n  success = true;\n  return detections;\n}\n\nvector<tGroundtruth> loadGroundtruth(string file_name,bool &success) {\n\n  // holds all ground truth (ignored ground truth is indicated by an index vector\n  vector<tGroundtruth> groundtruth;\n  FILE *fp = fopen(file_name.c_str(),\"r\");\n  if (!fp) {\n    success = false;\n    return groundtruth;\n  }\n  while (!feof(fp)) {\n    tGroundtruth g;\n    char str[255];\n    if (fscanf(fp, \"%s %lf %d %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf\",\n                   str, &g.truncation, &g.occlusion, &g.box.alpha,\n                   &g.box.x1,   &g.box.y1,     &g.box.x2,    &g.box.y2,\n                   &g.h,      &g.w,        &g.l,       &g.t1,\n                   &g.t2,      &g.t3,        &g.ry )==15) {\n      g.box.type = str;\n      groundtruth.push_back(g);\n    }\n  }\n  fclose(fp);\n  success = true;\n  return groundtruth;\n}\n\nvoid saveStats (const vector<double> &precision, const vector<double> &aos, FILE *fp_det, FILE *fp_ori) {\n\n  // save precision to file\n  if(precision.empty())\n    return;\n  for (int32_t i=0; i<precision.size(); i++)\n    fprintf(fp_det,\"%f \",precision[i]);\n  fprintf(fp_det,\"\\n\");\n\n  // save orientation similarity, only if there were no invalid orientation entries in submission (alpha=-10)\n  if(aos.empty())\n    return;\n  for (int32_t i=0; i<aos.size(); i++)\n    fprintf(fp_ori,\"%f \",aos[i]);\n  fprintf(fp_ori,\"\\n\");\n}\n\n/*=======================================================================\nEVALUATION HELPER 
FUNCTIONS\n=======================================================================*/\n\n// criterion defines whether the overlap is computed with respect to both areas (ground truth and detection)\n// or with respect to box a or b (detection and \"dontcare\" areas)\ninline double imageBoxOverlap(tBox a, tBox b, int32_t criterion=-1){\n\n  // overlap is invalid in the beginning\n  double o = -1;\n\n  // get overlapping area\n  double x1 = max(a.x1, b.x1);\n  double y1 = max(a.y1, b.y1);\n  double x2 = min(a.x2, b.x2);\n  double y2 = min(a.y2, b.y2);\n\n  // compute width and height of overlapping area\n  double w = x2-x1;\n  double h = y2-y1;\n\n  // set invalid entries to 0 overlap\n  if(w<=0 || h<=0)\n    return 0;\n\n  // get overlapping areas\n  double inter = w*h;\n  double a_area = (a.x2-a.x1) * (a.y2-a.y1);\n  double b_area = (b.x2-b.x1) * (b.y2-b.y1);\n\n  // intersection over union overlap depending on users choice\n  if(criterion==-1)     // union\n    o = inter / (a_area+b_area-inter);\n  else if(criterion==0) // bbox_a\n    o = inter / a_area;\n  else if(criterion==1) // bbox_b\n    o = inter / b_area;\n\n  // overlap\n  return o;\n}\n\ninline double imageBoxOverlap(tDetection a, tGroundtruth b, int32_t criterion=-1){\n  return imageBoxOverlap(a.box, b.box, criterion);\n}\n\n// compute polygon of an oriented bounding box\ntemplate <typename T>\nPolygon toPolygon(const T& g) {\n    using namespace boost::numeric::ublas;\n    using namespace boost::geometry;\n    matrix<double> mref(2, 2);\n    mref(0, 0) = cos(g.ry); mref(0, 1) = sin(g.ry);\n    mref(1, 0) = -sin(g.ry); mref(1, 1) = cos(g.ry);\n\n    static int count = 0;\n    matrix<double> corners(2, 4);\n    double data[] = {g.l / 2, g.l / 2, -g.l / 2, -g.l / 2,\n                     g.w / 2, -g.w / 2, -g.w / 2, g.w / 2};\n    std::copy(data, data + 8, corners.data().begin());\n    matrix<double> gc = prod(mref, corners);\n    for (int i = 0; i < 4; ++i) {\n        gc(0, i) += g.t1;\n        gc(1, i) 
+= g.t3;\n    }\n\n    double points[][2] = {{gc(0, 0), gc(1, 0)},{gc(0, 1), gc(1, 1)},{gc(0, 2), gc(1, 2)},{gc(0, 3), gc(1, 3)},{gc(0, 0), gc(1, 0)}};\n    Polygon poly;\n    append(poly, points);\n    return poly;\n}\n\n// measure overlap between bird's eye view bounding boxes, parametrized by (ry, l, w, tx, tz)\ninline double groundBoxOverlap(tDetection d, tGroundtruth g, int32_t criterion = -1) {\n    using namespace boost::geometry;\n    Polygon gp = toPolygon(g);\n    Polygon dp = toPolygon(d);\n\n    std::vector<Polygon> in, un;\n    intersection(gp, dp, in);\n    union_(gp, dp, un);\n\n    double inter_area = in.empty() ? 0 : area(in.front());\n    double union_area = area(un.front());\n    double o;\n    if(criterion==-1)     // union\n        o = inter_area / union_area;\n    else if(criterion==0) // bbox_a\n        o = inter_area / area(dp);\n    else if(criterion==1) // bbox_b\n        o = inter_area / area(gp);\n\n    return o;\n}\n\n// measure overlap between 3D bounding boxes, parametrized by (ry, h, w, l, tx, ty, tz)\ninline double box3DOverlap(tDetection d, tGroundtruth g, int32_t criterion = -1) {\n    using namespace boost::geometry;\n    Polygon gp = toPolygon(g);\n    Polygon dp = toPolygon(d);\n\n    std::vector<Polygon> in, un;\n    intersection(gp, dp, in);\n    union_(gp, dp, un);\n\n    double ymax = min(d.t2, g.t2);\n    double ymin = max(d.t2 - d.h, g.t2 - g.h);\n\n    double inter_area = in.empty() ? 
0 : area(in.front());\n    double inter_vol = inter_area * max(0.0, ymax - ymin);\n\n    double det_vol = d.h * d.l * d.w;\n    double gt_vol = g.h * g.l * g.w;\n\n    double o;\n    if(criterion==-1)     // union\n        o = inter_vol / (det_vol + gt_vol - inter_vol);\n    else if(criterion==0) // bbox_a\n        o = inter_vol / det_vol;\n    else if(criterion==1) // bbox_b\n        o = inter_vol / gt_vol;\n\n    return o;\n}\n\nvector<double> getThresholds(vector<double> &v, double n_groundtruth){\n\n  // holds scores needed to compute N_SAMPLE_PTS recall values\n  vector<double> t;\n\n  // sort scores in descending order\n  // (highest score is assumed to give best/most confident detections)\n  sort(v.begin(), v.end(), greater<double>());\n\n  // get scores for linearly spaced recall\n  double current_recall = 0;\n  for(int32_t i=0; i<v.size(); i++){\n\n    // check if right-hand-side recall with respect to current recall is close than left-hand-side one\n    // in this case, skip the current detection score\n    double l_recall, r_recall, recall;\n    l_recall = (double)(i+1)/n_groundtruth;\n    if(i<(v.size()-1))\n      r_recall = (double)(i+2)/n_groundtruth;\n    else\n      r_recall = l_recall;\n\n    if( (r_recall-current_recall) < (current_recall-l_recall) && i<(v.size()-1))\n      continue;\n\n    // left recall is the best approximation, so use this and goto next recall step for approximation\n    recall = l_recall;\n\n    // the next recall step was reached\n    t.push_back(v[i]);\n    //printf(\"%.8f\\n\", v[i]);\n    current_recall += 1.0/(N_SAMPLE_PTS-1.0);\n  }\n  return t;\n}\n\nvoid cleanData(CLASSES current_class, const vector<tGroundtruth> &gt, const vector<tDetection> &det, vector<int32_t> &ignored_gt, vector<tGroundtruth> &dc, vector<int32_t> &ignored_det, int32_t &n_gt, DIFFICULTY difficulty){\n\n  // extract ground truth bounding boxes for current evaluation class\n  for(int32_t i=0;i<gt.size(); i++){\n\n    // only bounding boxes with a 
minimum height are used for evaluation\n    double height = gt[i].box.y2 - gt[i].box.y1;\n\n    // neighboring classes are ignored (\"van\" for \"car\" and \"person_sitting\" for \"pedestrian\")\n    // (lower/upper cases are ignored)\n    int32_t valid_class;\n\n    // all classes without a neighboring class\n    if(!strcasecmp(gt[i].box.type.c_str(), CLASS_NAMES[current_class].c_str()))\n      valid_class = 1;\n\n    // classes with a neighboring class\n    else if(!strcasecmp(CLASS_NAMES[current_class].c_str(), \"Pedestrian\") && !strcasecmp(\"Person_sitting\", gt[i].box.type.c_str()))\n      valid_class = 0;\n    else if(!strcasecmp(CLASS_NAMES[current_class].c_str(), \"Car\") && !strcasecmp(\"Van\", gt[i].box.type.c_str()))\n      valid_class = 0;\n\n    // classes not used for evaluation\n    else\n      valid_class = -1;\n\n    // ground truth is ignored, if occlusion, truncation exceeds the difficulty or ground truth is too small\n    // (doesn't count as FN nor TP, although detections may be assigned)\n    bool ignore = false;\n    if(gt[i].occlusion>MAX_OCCLUSION[difficulty] || gt[i].truncation>MAX_TRUNCATION[difficulty] || height<MIN_HEIGHT[difficulty])\n      ignore = true;\n\n    // set ignored vector for ground truth\n    // current class and not ignored (total no. 
of ground truth is detected for recall denominator)\n    if(valid_class==1 && !ignore){\n      ignored_gt.push_back(0);\n      n_gt++;\n    }\n\n    // neighboring class, or current class but ignored\n    else if(valid_class==0 || (ignore && valid_class==1))\n      ignored_gt.push_back(1);\n\n    // all other classes which are FN in the evaluation\n    else\n      ignored_gt.push_back(-1);\n  }\n\n  // extract dontcare areas\n  for(int32_t i=0;i<gt.size(); i++)\n    if(!strcasecmp(\"DontCare\", gt[i].box.type.c_str()))\n      dc.push_back(gt[i]);\n\n  // extract detections bounding boxes of the current class\n  for(int32_t i=0;i<det.size(); i++){\n\n    // neighboring classes are not evaluated\n    int32_t valid_class;\n    if(!strcasecmp(det[i].box.type.c_str(), CLASS_NAMES[current_class].c_str()))\n      valid_class = 1;\n    else\n      valid_class = -1;\n\n    int32_t height = fabs(det[i].box.y1 - det[i].box.y2);\n\n    // set ignored vector for detections\n    if(height<MIN_HEIGHT[difficulty])\n      ignored_det.push_back(1);\n    else if(valid_class==1)\n      ignored_det.push_back(0);\n    else\n      ignored_det.push_back(-1);\n  }\n}\n\ntPrData computeStatistics(CLASSES current_class, const vector<tGroundtruth> &gt,\n        const vector<tDetection> &det, const vector<tGroundtruth> &dc,\n        const vector<int32_t> &ignored_gt, const vector<int32_t>  &ignored_det,\n        bool compute_fp, double (*boxoverlap)(tDetection, tGroundtruth, int32_t),\n        METRIC metric, bool compute_aos=false, double thresh=0, bool debug=false){\n\n  tPrData stat = tPrData();\n  const double NO_DETECTION = -10000000;\n  vector<double> delta;            // holds angular difference for TPs (needed for AOS evaluation)\n  vector<bool> assigned_detection; // holds wether a detection was assigned to a valid or ignored ground truth\n  assigned_detection.assign(det.size(), false);\n  vector<bool> ignored_threshold;\n  ignored_threshold.assign(det.size(), false); // holds 
detections with a threshold lower than thresh if FP are computed\n\n  // detections with a low score are ignored for computing precision (needs FP)\n  if(compute_fp)\n    for(int32_t i=0; i<det.size(); i++)\n      if(det[i].thresh<thresh)\n        ignored_threshold[i] = true;\n\n  // evaluate all ground truth boxes\n  for(int32_t i=0; i<gt.size(); i++){\n\n    // this ground truth is not of the current or a neighboring class and therefore ignored\n    if(ignored_gt[i]==-1)\n      continue;\n\n    /*=======================================================================\n    find candidates (overlap with ground truth > 0.5) (logical len(det))\n    =======================================================================*/\n    int32_t det_idx          = -1;\n    double valid_detection = NO_DETECTION;\n    double max_overlap     = 0;\n\n    // search for a possible detection\n    bool assigned_ignored_det = false;\n    for(int32_t j=0; j<det.size(); j++){\n\n      // detections not of the current class, already assigned or with a low threshold are ignored\n      if(ignored_det[j]==-1)\n        continue;\n      if(assigned_detection[j])\n        continue;\n      if(ignored_threshold[j])\n        continue;\n\n      // find the maximum score for the candidates and get idx of respective detection\n      double overlap = boxoverlap(det[j], gt[i], -1);\n\n      // for computing recall thresholds, the candidate with highest score is considered\n      if(!compute_fp && overlap>MIN_OVERLAP[metric][current_class] && det[j].thresh>valid_detection){\n        det_idx         = j;\n        valid_detection = det[j].thresh;\n      }\n\n      // for computing pr curve values, the candidate with the greatest overlap is considered\n      // if the greatest overlap is an ignored detection (min_height), the overlapping detection is used\n      else if(compute_fp && overlap>MIN_OVERLAP[metric][current_class] && (overlap>max_overlap || assigned_ignored_det) && ignored_det[j]==0){\n        
max_overlap     = overlap;\n        det_idx         = j;\n        valid_detection = 1;\n        assigned_ignored_det = false;\n      }\n      else if(compute_fp && overlap>MIN_OVERLAP[metric][current_class] && valid_detection==NO_DETECTION && ignored_det[j]==1){\n        det_idx              = j;\n        valid_detection      = 1;\n        assigned_ignored_det = true;\n      }\n    }\n\n    /*=======================================================================\n    compute TP, FP and FN\n    =======================================================================*/\n\n    // nothing was assigned to this valid ground truth\n    if(valid_detection==NO_DETECTION && ignored_gt[i]==0) {\n      stat.fn++;\n    }\n\n    // only evaluate valid ground truth <=> detection assignments (considering difficulty level)\n    else if(valid_detection!=NO_DETECTION && (ignored_gt[i]==1 || ignored_det[det_idx]==1))\n      assigned_detection[det_idx] = true;\n\n    // found a valid true positive\n    else if(valid_detection!=NO_DETECTION){\n\n      // write highest score to threshold vector\n      stat.tp++;\n      stat.v.push_back(det[det_idx].thresh);\n\n      // compute angular difference of detection and ground truth if valid detection orientation was provided\n      if(compute_aos)\n        delta.push_back(gt[i].box.alpha - det[det_idx].box.alpha);\n\n      // clean up\n      assigned_detection[det_idx] = true;\n    }\n  }\n\n  // if FP are requested, consider stuff area\n  if(compute_fp){\n\n    // count fp\n    for(int32_t i=0; i<det.size(); i++){\n\n      // count false positives if required (height smaller than required is ignored (ignored_det==1)\n      if(!(assigned_detection[i] || ignored_det[i]==-1 || ignored_det[i]==1 || ignored_threshold[i]))\n        stat.fp++;\n    }\n\n    // do not consider detections overlapping with stuff area\n    int32_t nstuff = 0;\n    for(int32_t i=0; i<dc.size(); i++){\n      for(int32_t j=0; j<det.size(); j++){\n\n        // detections not 
of the current class, already assigned, with a low threshold or a low minimum height are ignored\n        if(assigned_detection[j])\n          continue;\n        if(ignored_det[j]==-1 || ignored_det[j]==1)\n          continue;\n        if(ignored_threshold[j])\n          continue;\n\n        // compute overlap and assign to stuff area, if overlap exceeds class specific value\n        double overlap = boxoverlap(det[j], dc[i], 0);\n        if(overlap>MIN_OVERLAP[metric][current_class]){\n          assigned_detection[j] = true;\n          nstuff++;\n        }\n      }\n    }\n\n    // FP = no. of all not to ground truth assigned detections - detections assigned to stuff areas\n    stat.fp -= nstuff;\n\n    // if all orientation values are valid, the AOS is computed\n    if(compute_aos){\n      vector<double> tmp;\n\n      // FP have a similarity of 0, for all TP compute AOS\n      tmp.assign(stat.fp, 0);\n      for(int32_t i=0; i<delta.size(); i++)\n        tmp.push_back((1.0+cos(delta[i]))/2.0);\n\n      // be sure, that all orientation deltas are computed\n      assert(tmp.size()==stat.fp+stat.tp);\n      assert(delta.size()==stat.tp);\n\n      // get the mean orientation similarity for this image\n      if(stat.tp>0 || stat.fp>0)\n        stat.similarity = accumulate(tmp.begin(), tmp.end(), 0.0);\n\n      // there was neither a FP nor a TP, so the similarity is ignored in the evaluation\n      else\n        stat.similarity = -1;\n    }\n  }\n  return stat;\n}\n\n/*=======================================================================\nEVALUATE CLASS-WISE\n=======================================================================*/\n\nbool eval_class (FILE *fp_det, FILE *fp_ori, CLASSES current_class,\n        const vector< vector<tGroundtruth> > &groundtruth,\n        const vector< vector<tDetection> > &detections, bool compute_aos,\n        double (*boxoverlap)(tDetection, tGroundtruth, int32_t),\n        vector<double> &precision, vector<double> &aos,\n        
DIFFICULTY difficulty, METRIC metric) {\n    assert(groundtruth.size() == detections.size());\n\n  // init\n  int32_t n_gt=0;                                     // total no. of gt (denominator of recall)\n  vector<double> v, thresholds;                       // detection scores, evaluated for recall discretization\n  vector< vector<int32_t> > ignored_gt, ignored_det;  // index of ignored gt detection for current class/difficulty\n  vector< vector<tGroundtruth> > dontcare;            // index of dontcare areas, included in ground truth\n\n  // for all test images do\n  for (int32_t i=0; i<groundtruth.size(); i++){\n\n    // holds ignored ground truth, ignored detections and dontcare areas for current frame\n    vector<int32_t> i_gt, i_det;\n    vector<tGroundtruth> dc;\n\n    // only evaluate objects of current class and ignore occluded, truncated objects\n    cleanData(current_class, groundtruth[i], detections[i], i_gt, dc, i_det, n_gt, difficulty);\n    ignored_gt.push_back(i_gt);\n    ignored_det.push_back(i_det);\n    dontcare.push_back(dc);\n\n    // compute statistics to get recall values\n    tPrData pr_tmp = tPrData();\n    pr_tmp = computeStatistics(current_class, groundtruth[i], detections[i], dc, i_gt, i_det, false, boxoverlap, metric);\n\n    // add detection scores to vector over all images\n    for(int32_t j=0; j<pr_tmp.v.size(); j++)\n      v.push_back(pr_tmp.v[j]);\n  }\n  // get scores that must be evaluated for recall discretization\n  thresholds = getThresholds(v, n_gt);\n\n  // compute TP,FP,FN for relevant scores\n  vector<tPrData> pr;\n  pr.assign(thresholds.size(),tPrData());\n  for (int32_t i=0; i<groundtruth.size(); i++){\n    // for all scores/recall thresholds do:\n    for(int32_t t=0; t<thresholds.size(); t++){\n      tPrData tmp = tPrData();\n      tmp = computeStatistics(current_class, groundtruth[i], detections[i], dontcare[i],\n                              ignored_gt[i], ignored_det[i], true, boxoverlap, metric,\n                    
          compute_aos, thresholds[t], t==38);\n\n      // add no. of TP, FP, FN, AOS for current frame to total evaluation for current threshold\n      pr[t].tp += tmp.tp;\n      pr[t].fp += tmp.fp;\n      pr[t].fn += tmp.fn;\n      if(tmp.similarity!=-1)\n        pr[t].similarity += tmp.similarity;\n    }\n  }\n\n  // compute recall, precision and AOS\n  vector<double> recall;\n  precision.assign(N_SAMPLE_PTS, 0);\n  if(compute_aos)\n    aos.assign(N_SAMPLE_PTS, 0);\n  double r=0;\n  for (int32_t i=0; i<thresholds.size(); i++){\n    r = pr[i].tp/(double)(pr[i].tp + pr[i].fn);\n    recall.push_back(r);\n    precision[i] = pr[i].tp/(double)(pr[i].tp + pr[i].fp);\n    if(compute_aos)\n      aos[i] = pr[i].similarity/(double)(pr[i].tp + pr[i].fp);\n  }\n\n  // filter precision and AOS using max_{i..end}(precision)\n  for (int32_t i=0; i<thresholds.size(); i++){\n    precision[i] = *max_element(precision.begin()+i, precision.end());\n    if(compute_aos)\n      aos[i] = *max_element(aos.begin()+i, aos.end());\n  }\n\n  // save statisics and finish with success\n  saveStats(precision, aos, fp_det, fp_ori);\n    return true;\n}\n\nvoid saveAndPlotPlots(string dir_name,string file_name,string obj_type,vector<double> vals[],bool is_aos, FILE* res_fp){\n\n  char command[1024];\n  // save plot data to file\n  FILE *fp = fopen((dir_name + \"/\" + file_name + \".txt\").c_str(),\"w\");\n  printf(\"save %s\\n\", (dir_name + \"/\" + file_name + \".txt\").c_str());\n  for (int32_t i=0; i<(int)N_SAMPLE_PTS; i++)\n    fprintf(fp,\"%f %f %f %f\\n\",(double)i/(N_SAMPLE_PTS-1.0),vals[0][i],vals[1][i],vals[2][i]);\n  fclose(fp);\n\n  float sum[3] = {0, 0, 0};\n  for (int v = 0; v < 3; ++v)\n      for (int i = 0; i < vals[v].size(); i = i + 4)\n          sum[v] += vals[v][i];\n  printf(\"%s AP: %f %f %f\\n\", file_name.c_str(), sum[0] / 11 * 100, sum[1] / 11 * 100, sum[2] / 11 * 100);\n  fprintf(res_fp, \"%s AP: %f %f %f\\n\", file_name.c_str(), sum[0] / 11 * 100, sum[1] / 11 * 100, 
sum[2] / 11 * 100);\n\n  // create png + eps\n  for (int32_t j=0; j<2; j++) {\n\n    // open file\n    FILE *fp = fopen((dir_name + \"/\" + file_name + \".gp\").c_str(),\"w\");\n\n    // save gnuplot instructions\n    if (j==0) {\n      fprintf(fp,\"set term png size 450,315 font \\\"Helvetica\\\" 11\\n\");\n      fprintf(fp,\"set output \\\"%s.png\\\"\\n\",file_name.c_str());\n    } else {\n      fprintf(fp,\"set term postscript eps enhanced color font \\\"Helvetica\\\" 20\\n\");\n      fprintf(fp,\"set output \\\"%s.eps\\\"\\n\",file_name.c_str());\n    }\n\n    // set labels and ranges\n    fprintf(fp,\"set size ratio 0.7\\n\");\n    fprintf(fp,\"set xrange [0:1]\\n\");\n    fprintf(fp,\"set yrange [0:1]\\n\");\n    fprintf(fp,\"set xlabel \\\"Recall\\\"\\n\");\n    if (!is_aos) fprintf(fp,\"set ylabel \\\"Precision\\\"\\n\");\n    else         fprintf(fp,\"set ylabel \\\"Orientation Similarity\\\"\\n\");\n    obj_type[0] = toupper(obj_type[0]);\n    fprintf(fp,\"set title \\\"%s\\\"\\n\",obj_type.c_str());\n\n    // line width\n    int32_t   lw = 5;\n    if (j==0) lw = 3;\n\n    // plot error curve\n    fprintf(fp,\"plot \");\n    fprintf(fp,\"\\\"%s.txt\\\" using 1:2 title 'Easy' with lines ls 1 lw %d,\",file_name.c_str(),lw);\n    fprintf(fp,\"\\\"%s.txt\\\" using 1:3 title 'Moderate' with lines ls 2 lw %d,\",file_name.c_str(),lw);\n    fprintf(fp,\"\\\"%s.txt\\\" using 1:4 title 'Hard' with lines ls 3 lw %d\",file_name.c_str(),lw);\n\n    // close file\n    fclose(fp);\n\n    // run gnuplot => create png + eps\n    sprintf(command,\"cd %s; gnuplot %s\",dir_name.c_str(),(file_name + \".gp\").c_str());\n    system(command);\n  }\n\n  // create pdf and crop\n  sprintf(command,\"cd %s; ps2pdf %s.eps %s_large.pdf\",dir_name.c_str(),file_name.c_str(),file_name.c_str());\n  system(command);\n  sprintf(command,\"cd %s; pdfcrop %s_large.pdf %s.pdf\",dir_name.c_str(),file_name.c_str(),file_name.c_str());\n  system(command);\n  sprintf(command,\"cd %s; rm 
%s_large.pdf\",dir_name.c_str(),file_name.c_str());\n  system(command);\n}\n\nvector<int32_t> getEvalIndices(const string& result_dir) {\n\n    DIR* dir;\n    dirent* entity;\n    dir = opendir(result_dir.c_str());\n    if (dir) {\n        while (entity = readdir(dir)) {\n            string path(entity->d_name);\n            int32_t len = path.size();\n            if (len < 10) continue;\n            int32_t index = atoi(path.substr(len - 10, 10).c_str());\n            indices.push_back(index);\n        }\n    }\n    return indices;\n}\n\nbool eval(string gt_dir, string result_dir, Mail* mail){\n\n  // set some global parameters\n  initGlobals();\n\n  // ground truth and result directories\n  // string gt_dir         = \"data/object/label_2\";\n  // string result_dir     = \"results/\" + result_sha;\n  string plot_dir       = result_dir + \"/plot\";\n  FILE* res_fp = fopen((result_dir + \"/result.txt\").c_str(), \"w\");\n\n  // create output directories\n  system((\"mkdir \" + plot_dir).c_str());\n\n  // hold detections and ground truth in memory\n  vector< vector<tGroundtruth> > groundtruth;\n  vector< vector<tDetection> >   detections;\n\n  // holds wether orientation similarity shall be computed (might be set to false while loading detections)\n  // and which labels where provided by this submission\n  bool compute_aos=true;\n  vector<bool> eval_image(NUM_CLASS, false);\n  vector<bool> eval_ground(NUM_CLASS, false);\n  vector<bool> eval_3d(NUM_CLASS, false);\n\n  // for all images read groundtruth and detections\n  mail->msg(\"Loading detections...\");\n  std::vector<int32_t> indices = getEvalIndices(result_dir + \"/data/\" );\n  printf(\"number of files for evaluation: %d\\n\", (int)indices.size());\n  fprintf(res_fp, \"number of files for evaluation: %d\\n\", (int)indices.size());\n\n  for (int32_t i=0; i<indices.size(); i++) {\n\n    // file name\n    char file_name[256];\n    sprintf(file_name,\"%06d.txt\",indices.at(i));\n\n    // read ground truth and 
result poses\n    bool gt_success,det_success;\n    vector<tGroundtruth> gt   = loadGroundtruth(gt_dir + \"/\" + file_name,gt_success);\n    vector<tDetection>   det  = loadDetections(result_dir + \"/data/\"  + file_name,\n            compute_aos, eval_image, eval_ground, eval_3d, det_success);\n    groundtruth.push_back(gt);\n    detections.push_back(det);\n\n    // check for errors\n    if (!gt_success) {\n      mail->msg(\"ERROR: Couldn't read: %s of ground truth. Please write me an email!\", file_name);\n      return false;\n    }\n    if (!det_success) {\n      mail->msg(\"ERROR: Couldn't read: %s\", file_name);\n      return false;\n    }\n  }\n  mail->msg(\"  done.\");\n\n  // holds pointers for result files\n  FILE *fp_det=0, *fp_ori=0;\n\n  // eval image 2D bounding boxes\n  for (int c = 0; c < NUM_CLASS; c++) {\n    CLASSES cls = (CLASSES)c;\n    if (eval_image[c]) {\n      fp_det = fopen((result_dir + \"/stats_\" + CLASS_NAMES[c] + \"_detection.txt\").c_str(), \"w\");\n      if(compute_aos)\n        fp_ori = fopen((result_dir + \"/stats_\" + CLASS_NAMES[c] + \"_orientation.txt\").c_str(),\"w\");\n      vector<double> precision[3], aos[3];\n      if(   !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, imageBoxOverlap, precision[0], aos[0], EASY, IMAGE)\n         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, imageBoxOverlap, precision[1], aos[1], MODERATE, IMAGE)\n         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, imageBoxOverlap, precision[2], aos[2], HARD, IMAGE)) {\n        mail->msg(\"%s evaluation failed.\", CLASS_NAMES[c].c_str());\n        return false;\n      }\n      fclose(fp_det);\n      saveAndPlotPlots(plot_dir, CLASS_NAMES[c] + \"_detection\", CLASS_NAMES[c], precision, 0, res_fp);\n\n      if(compute_aos){\n        saveAndPlotPlots(plot_dir, CLASS_NAMES[c] + \"_orientation\", CLASS_NAMES[c], aos, 1, res_fp);\n        fclose(fp_ori);\n      }\n    }\n  }\n  
printf(\"Finished 2D bounding box eval.\\n\");\n  // don't evaluate AOS for birdview boxes and 3D boxes\n  compute_aos = false;\n\n  // eval bird's eye view bounding boxes\n  for (int c = 0; c < NUM_CLASS; c++) {\n    CLASSES cls = (CLASSES)c;\n    if (eval_ground[c]) {\n      fp_det = fopen((result_dir + \"/stats_\" + CLASS_NAMES[c] + \"_detection_ground.txt\").c_str(), \"w\");\n      vector<double> precision[3], aos[3];\n      printf(\"Going to eval ground for class: %s\\n\", CLASS_NAMES[c].c_str());\n      if(   !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, groundBoxOverlap, precision[0], aos[0], EASY, GROUND)\n         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, groundBoxOverlap, precision[1], aos[1], MODERATE, GROUND)\n         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, groundBoxOverlap, precision[2], aos[2], HARD, GROUND)) {\n        mail->msg(\"%s evaluation failed.\", CLASS_NAMES[c].c_str());\n        return false;\n      }\n      fclose(fp_det);\n      saveAndPlotPlots(plot_dir, CLASS_NAMES[c] + \"_detection_ground\", CLASS_NAMES[c], precision, 0, res_fp);\n    }\n  }\n  printf(\"Finished Birdeye eval.\\n\");\n\n  // eval 3D bounding boxes\n  for (int c = 0; c < NUM_CLASS; c++) {\n    CLASSES cls = (CLASSES)c;\n    if (eval_3d[c]) {\n      fp_det = fopen((result_dir + \"/stats_\" + CLASS_NAMES[c] + \"_detection_3d.txt\").c_str(), \"w\");\n      vector<double> precision[3], aos[3];\n      printf(\"Going to eval 3D box for class: %s\\n\", CLASS_NAMES[c].c_str());\n      if(   !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, box3DOverlap, precision[0], aos[0], EASY, BOX3D)\n         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, box3DOverlap, precision[1], aos[1], MODERATE, BOX3D)\n         || !eval_class(fp_det, fp_ori, cls, groundtruth, detections, compute_aos, box3DOverlap, precision[2], aos[2], HARD, BOX3D)) {\n        
mail->msg(\"%s evaluation failed.\", CLASS_NAMES[c].c_str());\n        return false;\n      }\n      fclose(fp_det);\n      saveAndPlotPlots(plot_dir, CLASS_NAMES[c] + \"_detection_3d\", CLASS_NAMES[c], precision, 0, res_fp);\n    }\n  }\n  printf(\"Finished 3D bounding box eval.\\n\");\n  fclose(res_fp);\n  // success\n  return true;\n}\n\nint32_t main (int32_t argc,char *argv[]) {\n\n  // we need exactly 2 arguments: gt_dir and result_dir\n  if (argc!=3) {\n    cout << \"Usage: ./eval_detection_3d_offline gt_dir result_dir\" << endl;\n    return 1;\n  }\n\n  // read arguments\n  string gt_dir = argv[1];\n  string result_dir = argv[2];\n\n  // init notification mail\n  Mail *mail;\n  mail = new Mail();\n  mail->msg(\"Thank you for participating in our evaluation!\");\n\n  // run evaluation\n  if (eval(gt_dir, result_dir, mail)) {\n    mail->msg(\"Your evaluation results are available at:\");\n    // pass the path as an argument, never as the format string\n    mail->msg(\"%s\", result_dir.c_str());\n  } else {\n    system((\"rm -r \" + result_dir + \"/plot\").c_str());\n    mail->msg(\"An error occurred while processing your results.\");\n  }\n\n  // send mail and exit\n  delete mail;\n\n  return 0;\n}\n\n\n"
  },
  {
    "path": "disparity/eval/kitti/mail.h",
    "content": "#ifndef MAIL_H\n#define MAIL_H\n\n#include <stdio.h>\n#include <stdarg.h>\n#include <string.h>\n\nclass Mail {\n\npublic:\n\n  Mail (std::string email = \"\") {\n    if (email.compare(\"\")) {\n      mail = popen(\"/usr/lib/sendmail -t -f noreply@cvlibs.net\",\"w\");\n      fprintf(mail,\"To: %s\\n\", email.c_str());\n      fprintf(mail,\"From: noreply@cvlibs.net\\n\");\n      fprintf(mail,\"Subject: KITTI Evaluation Benchmark\\n\");\n      fprintf(mail,\"\\n\\n\");\n    } else {\n      mail = 0;\n    }\n  }\n  \n  ~Mail() {\n    if (mail) {\n      pclose(mail);\n    }\n  }\n  \n  void msg (const char *format, ...) {\n    va_list args;\n    va_start(args,format);\n    if (mail) {\n      vfprintf(mail,format,args);\n      fprintf(mail,\"\\n\");\n    }\n    vprintf(format,args);\n    printf(\"\\n\");\n    va_end(args);\n  }\n    \nprivate:\n\n  FILE *mail;\n  \n};\n\n#endif\n"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/.gitignore",
    "content": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n.hypothesis/\n.pytest_cache/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\ndb.sqlite3\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# pyenv\n.python-version\n\n# celery beat schedule file\ncelerybeat-schedule\n\n# SageMath parsed files\n*.sage.py\n\n# Environments\n.env\n.venv\nenv/\nvenv/\nENV/\nenv.bak/\nvenv.bak/\n\n# Spyder project settings\n.spyderproject\n.spyproject\n\n# Rope project settings\n.ropeproject\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/LICENSE",
    "content": "MIT License\n\nCopyright (c) 2018 \n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/README.md",
    "content": "# kitti-object-eval-python\nFast kitti object detection eval in python(finish eval in less than 10 second), support 2d/bev/3d/aos. , support coco-style AP. If you use command line interface, numba need some time to compile jit functions.\n\n_WARNING_: The \"coco\" isn't official metrics. Only \"AP(Average Precision)\" is.\n## Dependencies\nOnly support python 3.6+, need `numpy`, `skimage`, `numba`, `fire`, `scipy`. If you have Anaconda, just install `cudatoolkit` in anaconda. Otherwise, please reference to this [page](https://github.com/numba/numba#custom-python-environments) to set up llvm and cuda for numba.\n* Install by conda:\n```\nconda install -c numba cudatoolkit=x.x  (8.0, 9.0, 10.0, depend on your environment) \n```\n## Usage\n* commandline interface:\n```\npython evaluate.py evaluate --label_path=/path/to/your_gt_label_folder --result_path=/path/to/your_result_folder --label_split_file=/path/to/val.txt --current_class=0 --coco=False\n```\n* python interface:\n```Python\nimport kitti_common as kitti\nfrom eval import get_official_eval_result, get_coco_eval_result\ndef _read_imageset_file(path):\n    with open(path, 'r') as f:\n        lines = f.readlines()\n    return [int(line) for line in lines]\ndet_path = \"/path/to/your_result_folder\"\ndt_annos = kitti.get_label_annos(det_path)\ngt_path = \"/path/to/your_gt_label_folder\"\ngt_split_file = \"/path/to/val.txt\" # from https://xiaozhichen.github.io/files/mv3d/imagesets.tar.gz\nval_image_ids = _read_imageset_file(gt_split_file)\ngt_annos = kitti.get_label_annos(gt_path, val_image_ids)\nprint(get_official_eval_result(gt_annos, dt_annos, 0)) # 6s in my computer\nprint(get_coco_eval_result(gt_annos, dt_annos, 0)) # 18s in my computer\n```\n"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/eval.py",
    "content": "import io as sysio\nimport time\n\nimport numba\nimport numpy as np\nfrom scipy.interpolate import interp1d\n\nfrom rotate_iou import rotate_iou_gpu_eval\n\n\ndef get_mAP(prec):\n    sums = 0\n    for i in range(0, len(prec), 4):\n        sums += prec[i]\n    return sums / 11 * 100\n\n\n@numba.jit\ndef get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):\n    scores.sort()\n    scores = scores[::-1]\n    current_recall = 0\n    thresholds = []\n    for i, score in enumerate(scores):\n        l_recall = (i + 1) / num_gt\n        if i < (len(scores) - 1):\n            r_recall = (i + 2) / num_gt\n        else:\n            r_recall = l_recall\n        if (((r_recall - current_recall) < (current_recall - l_recall))\n                and (i < (len(scores) - 1))):\n            continue\n        # recall = l_recall\n        thresholds.append(score)\n        current_recall += 1 / (num_sample_pts - 1.0)\n    # print(len(thresholds), len(scores), num_gt)\n    return thresholds\n\n\ndef clean_data(gt_anno, dt_anno, current_class, difficulty):\n    CLASS_NAMES = [\n        'car', 'pedestrian', 'cyclist', 'van', 'person_sitting', 'car',\n        'tractor', 'trailer'\n    ]\n    MIN_HEIGHT = [40, 25, 25]\n    MAX_OCCLUSION = [0, 1, 2]\n    MAX_TRUNCATION = [0.15, 0.3, 0.5]\n    dc_bboxes, ignored_gt, ignored_dt = [], [], []\n    current_cls_name = CLASS_NAMES[current_class].lower()\n    num_gt = len(gt_anno[\"name\"])\n    num_dt = len(dt_anno[\"name\"])\n    num_valid_gt = 0\n    for i in range(num_gt):\n        bbox = gt_anno[\"bbox\"][i]\n        gt_name = gt_anno[\"name\"][i].lower()\n        height = bbox[3] - bbox[1]\n        valid_class = -1\n        if (gt_name == current_cls_name):\n            valid_class = 1\n        elif (current_cls_name == \"Pedestrian\".lower()\n              and \"Person_sitting\".lower() == gt_name):\n            valid_class = 0\n        elif (current_cls_name == \"Car\".lower() and \"Van\".lower() == gt_name):\n        
    valid_class = 0\n        else:\n            valid_class = -1\n        ignore = False\n        if ((gt_anno[\"occluded\"][i] > MAX_OCCLUSION[difficulty])\n                or (gt_anno[\"truncated\"][i] > MAX_TRUNCATION[difficulty])\n                or (height <= MIN_HEIGHT[difficulty])):\n            # if gt_anno[\"difficulty\"][i] > difficulty or gt_anno[\"difficulty\"][i] == -1:\n            ignore = True\n        if valid_class == 1 and not ignore:\n            ignored_gt.append(0)\n            num_valid_gt += 1\n        elif (valid_class == 0 or (ignore and (valid_class == 1))):\n            ignored_gt.append(1)\n        else:\n            ignored_gt.append(-1)\n    # for i in range(num_gt):\n        if gt_anno[\"name\"][i] == \"DontCare\":\n            dc_bboxes.append(gt_anno[\"bbox\"][i])\n    for i in range(num_dt):\n        if (dt_anno[\"name\"][i].lower() == current_cls_name):\n            valid_class = 1\n        else:\n            valid_class = -1\n        height = abs(dt_anno[\"bbox\"][i, 3] - dt_anno[\"bbox\"][i, 1])\n        if height < MIN_HEIGHT[difficulty]:\n            ignored_dt.append(1)\n        elif valid_class == 1:\n            ignored_dt.append(0)\n        else:\n            ignored_dt.append(-1)\n\n    return num_valid_gt, ignored_gt, ignored_dt, dc_bboxes\n\n\n@numba.jit(nopython=True)\ndef image_box_overlap(boxes, query_boxes, criterion=-1):\n    N = boxes.shape[0]\n    K = query_boxes.shape[0]\n    overlaps = np.zeros((N, K), dtype=boxes.dtype)\n    for k in range(K):\n        qbox_area = ((query_boxes[k, 2] - query_boxes[k, 0]) *\n                     (query_boxes[k, 3] - query_boxes[k, 1]))\n        for n in range(N):\n            iw = (min(boxes[n, 2], query_boxes[k, 2]) - max(\n                boxes[n, 0], query_boxes[k, 0]))\n            if iw > 0:\n                ih = (min(boxes[n, 3], query_boxes[k, 3]) - max(\n                    boxes[n, 1], query_boxes[k, 1]))\n                if ih > 0:\n                    if criterion 
== -1:\n                        ua = (\n                            (boxes[n, 2] - boxes[n, 0]) *\n                            (boxes[n, 3] - boxes[n, 1]) + qbox_area - iw * ih)\n                    elif criterion == 0:\n                        ua = ((boxes[n, 2] - boxes[n, 0]) *\n                              (boxes[n, 3] - boxes[n, 1]))\n                    elif criterion == 1:\n                        ua = qbox_area\n                    else:\n                        ua = 1.0\n                    overlaps[n, k] = iw * ih / ua\n    return overlaps\n\n\ndef bev_box_overlap(boxes, qboxes, criterion=-1):\n    riou = rotate_iou_gpu_eval(boxes, qboxes, criterion)\n    return riou\n\n\n@numba.jit(nopython=True, parallel=True)\ndef d3_box_overlap_kernel(boxes,\n                          qboxes,\n                          rinc,\n                          criterion=-1,\n                          z_axis=1,\n                          z_center=1.0):\n    \"\"\"\n        z_axis: the z (height) axis.\n        z_center: unified z (height) center of box.\n    \"\"\"\n    N, K = boxes.shape[0], qboxes.shape[0]\n    for i in range(N):\n        for j in range(K):\n            if rinc[i, j] > 0:\n                min_z = min(\n                    boxes[i, z_axis] + boxes[i, z_axis + 3] * (1 - z_center),\n                    qboxes[j, z_axis] + qboxes[j, z_axis + 3] * (1 - z_center))\n                max_z = max(\n                    boxes[i, z_axis] - boxes[i, z_axis + 3] * z_center,\n                    qboxes[j, z_axis] - qboxes[j, z_axis + 3] * z_center)\n                iw = min_z - max_z\n                if iw > 0:\n                    area1 = boxes[i, 3] * boxes[i, 4] * boxes[i, 5]\n                    area2 = qboxes[j, 3] * qboxes[j, 4] * qboxes[j, 5]\n                    inc = iw * rinc[i, j]\n                    if criterion == -1:\n                        ua = (area1 + area2 - inc)\n                    elif criterion == 0:\n                        ua = area1\n               
     elif criterion == 1:\n                        ua = area2\n                    else:\n                        ua = 1.0\n                    rinc[i, j] = inc / ua\n                else:\n                    rinc[i, j] = 0.0\n\n\ndef d3_box_overlap(boxes, qboxes, criterion=-1, z_axis=1, z_center=1.0):\n    \"\"\"kitti camera format z_axis=1.\n    \"\"\"\n    bev_axes = list(range(7))\n    bev_axes.pop(z_axis + 3)\n    bev_axes.pop(z_axis)\n    rinc = rotate_iou_gpu_eval(boxes[:, bev_axes], qboxes[:, bev_axes], 2)\n    d3_box_overlap_kernel(boxes, qboxes, rinc, criterion, z_axis, z_center)\n    return rinc\n\n\n@numba.jit(nopython=True)\ndef compute_statistics_jit(overlaps,\n                           gt_datas,\n                           dt_datas,\n                           ignored_gt,\n                           ignored_det,\n                           dc_bboxes,\n                           metric,\n                           min_overlap,\n                           thresh=0,\n                           compute_fp=False,\n                           compute_aos=False):\n\n    det_size = dt_datas.shape[0]\n    gt_size = gt_datas.shape[0]\n    dt_scores = dt_datas[:, -1]\n    dt_alphas = dt_datas[:, 4]\n    gt_alphas = gt_datas[:, 4]\n    dt_bboxes = dt_datas[:, :4]\n    # gt_bboxes = gt_datas[:, :4]\n\n    assigned_detection = [False] * det_size\n    ignored_threshold = [False] * det_size\n    if compute_fp:\n        for i in range(det_size):\n            if (dt_scores[i] < thresh):\n                ignored_threshold[i] = True\n    NO_DETECTION = -10000000\n    tp, fp, fn, similarity = 0, 0, 0, 0\n    # thresholds = [0.0]\n    # delta = [0.0]\n    thresholds = np.zeros((gt_size, ))\n    thresh_idx = 0\n    delta = np.zeros((gt_size, ))\n    delta_idx = 0\n    for i in range(gt_size):\n        if ignored_gt[i] == -1:\n            continue\n        det_idx = -1\n        valid_detection = NO_DETECTION\n        max_overlap = 0\n        assigned_ignored_det = 
False\n\n        for j in range(det_size):\n            if (ignored_det[j] == -1):\n                continue\n            if (assigned_detection[j]):\n                continue\n            if (ignored_threshold[j]):\n                continue\n            overlap = overlaps[j, i]\n            dt_score = dt_scores[j]\n            if (not compute_fp and (overlap > min_overlap)\n                    and dt_score > valid_detection):\n                det_idx = j\n                valid_detection = dt_score\n            elif (compute_fp and (overlap > min_overlap)\n                  and (overlap > max_overlap or assigned_ignored_det)\n                  and ignored_det[j] == 0):\n                max_overlap = overlap\n                det_idx = j\n                valid_detection = 1\n                assigned_ignored_det = False\n            elif (compute_fp and (overlap > min_overlap)\n                  and (valid_detection == NO_DETECTION)\n                  and ignored_det[j] == 1):\n                det_idx = j\n                valid_detection = 1\n                assigned_ignored_det = True\n\n        if (valid_detection == NO_DETECTION) and ignored_gt[i] == 0:\n            fn += 1\n        elif ((valid_detection != NO_DETECTION)\n              and (ignored_gt[i] == 1 or ignored_det[det_idx] == 1)):\n            assigned_detection[det_idx] = True\n        elif valid_detection != NO_DETECTION:\n            # only a tp add a threshold.\n            tp += 1\n            # thresholds.append(dt_scores[det_idx])\n            thresholds[thresh_idx] = dt_scores[det_idx]\n            thresh_idx += 1\n            if compute_aos:\n                # delta.append(gt_alphas[i] - dt_alphas[det_idx])\n                delta[delta_idx] = gt_alphas[i] - dt_alphas[det_idx]\n                delta_idx += 1\n\n            assigned_detection[det_idx] = True\n    if compute_fp:\n        for i in range(det_size):\n            if (not (assigned_detection[i] or ignored_det[i] == -1\n                  
   or ignored_det[i] == 1 or ignored_threshold[i])):\n                fp += 1\n        nstuff = 0\n        if metric == 0:\n            overlaps_dt_dc = image_box_overlap(dt_bboxes, dc_bboxes, 0)\n            for i in range(dc_bboxes.shape[0]):\n                for j in range(det_size):\n                    if (assigned_detection[j]):\n                        continue\n                    if (ignored_det[j] == -1 or ignored_det[j] == 1):\n                        continue\n                    if (ignored_threshold[j]):\n                        continue\n                    if overlaps_dt_dc[j, i] > min_overlap:\n                        assigned_detection[j] = True\n                        nstuff += 1\n        fp -= nstuff\n        if compute_aos:\n            tmp = np.zeros((fp + delta_idx, ))\n            # tmp = [0] * fp\n            for i in range(delta_idx):\n                tmp[i + fp] = (1.0 + np.cos(delta[i])) / 2.0\n                # tmp.append((1.0 + np.cos(delta[i])) / 2.0)\n            # assert len(tmp) == fp + tp\n            # assert len(delta) == tp\n            if tp > 0 or fp > 0:\n                similarity = np.sum(tmp)\n            else:\n                similarity = -1\n    return tp, fp, fn, similarity, thresholds[:thresh_idx]\n\n\ndef get_split_parts(num, num_part):\n    same_part = num // num_part\n    remain_num = num % num_part\n    if remain_num == 0:\n        return [same_part] * num_part\n    else:\n        return [same_part] * num_part + [remain_num]\n\n\n@numba.jit(nopython=True)\ndef fused_compute_statistics(overlaps,\n                             pr,\n                             gt_nums,\n                             dt_nums,\n                             dc_nums,\n                             gt_datas,\n                             dt_datas,\n                             dontcares,\n                             ignored_gts,\n                             ignored_dets,\n                             metric,\n                            
 min_overlap,\n                             thresholds,\n                             compute_aos=False):\n    gt_num = 0\n    dt_num = 0\n    dc_num = 0\n    for i in range(gt_nums.shape[0]):\n        for t, thresh in enumerate(thresholds):\n            overlap = overlaps[dt_num:dt_num + dt_nums[i], gt_num:gt_num +\n                               gt_nums[i]]\n\n            gt_data = gt_datas[gt_num:gt_num + gt_nums[i]]\n            dt_data = dt_datas[dt_num:dt_num + dt_nums[i]]\n            ignored_gt = ignored_gts[gt_num:gt_num + gt_nums[i]]\n            ignored_det = ignored_dets[dt_num:dt_num + dt_nums[i]]\n            dontcare = dontcares[dc_num:dc_num + dc_nums[i]]\n            tp, fp, fn, similarity, _ = compute_statistics_jit(\n                overlap,\n                gt_data,\n                dt_data,\n                ignored_gt,\n                ignored_det,\n                dontcare,\n                metric,\n                min_overlap=min_overlap,\n                thresh=thresh,\n                compute_fp=True,\n                compute_aos=compute_aos)\n            pr[t, 0] += tp\n            pr[t, 1] += fp\n            pr[t, 2] += fn\n            if similarity != -1:\n                pr[t, 3] += similarity\n        gt_num += gt_nums[i]\n        dt_num += dt_nums[i]\n        dc_num += dc_nums[i]\n\n\ndef calculate_iou_partly(gt_annos,\n                         dt_annos,\n                         metric,\n                         num_parts=50,\n                         z_axis=1,\n                         z_center=1.0):\n    \"\"\"fast iou algorithm. this function can be used independently to\n    do result analysis. \n    Args:\n        gt_annos: dict, must from get_label_annos() in kitti_common.py\n        dt_annos: dict, must from get_label_annos() in kitti_common.py\n        metric: eval type. 0: bbox, 1: bev, 2: 3d\n        num_parts: int. a parameter for fast calculate algorithm\n        z_axis: height axis. 
kitti camera use 1, lidar use 2.\n    \"\"\"\n    assert len(gt_annos) == len(dt_annos)\n    total_dt_num = np.stack([len(a[\"name\"]) for a in dt_annos], 0)\n    total_gt_num = np.stack([len(a[\"name\"]) for a in gt_annos], 0)\n    num_examples = len(gt_annos)\n    split_parts = get_split_parts(num_examples, num_parts)\n    parted_overlaps = []\n    example_idx = 0\n    bev_axes = list(range(3))\n    bev_axes.pop(z_axis)\n    for num_part in split_parts:\n        gt_annos_part = gt_annos[example_idx:example_idx + num_part]\n        dt_annos_part = dt_annos[example_idx:example_idx + num_part]\n        if metric == 0:\n            gt_boxes = np.concatenate([a[\"bbox\"] for a in gt_annos_part], 0)\n            dt_boxes = np.concatenate([a[\"bbox\"] for a in dt_annos_part], 0)\n            overlap_part = image_box_overlap(gt_boxes, dt_boxes)\n        elif metric == 1:\n            loc = np.concatenate(\n                [a[\"location\"][:, bev_axes] for a in gt_annos_part], 0)\n            dims = np.concatenate(\n                [a[\"dimensions\"][:, bev_axes] for a in gt_annos_part], 0)\n            rots = np.concatenate([a[\"rotation_y\"] for a in gt_annos_part], 0)\n            gt_boxes = np.concatenate([loc, dims, rots[..., np.newaxis]],\n                                      axis=1)\n            loc = np.concatenate(\n                [a[\"location\"][:, bev_axes] for a in dt_annos_part], 0)\n            dims = np.concatenate(\n                [a[\"dimensions\"][:, bev_axes] for a in dt_annos_part], 0)\n            rots = np.concatenate([a[\"rotation_y\"] for a in dt_annos_part], 0)\n            dt_boxes = np.concatenate([loc, dims, rots[..., np.newaxis]],\n                                      axis=1)\n            overlap_part = bev_box_overlap(gt_boxes,\n                                           dt_boxes).astype(np.float64)\n        elif metric == 2:\n            loc = np.concatenate([a[\"location\"] for a in gt_annos_part], 0)\n            dims = 
np.concatenate([a[\"dimensions\"] for a in gt_annos_part], 0)\n            rots = np.concatenate([a[\"rotation_y\"] for a in gt_annos_part], 0)\n            gt_boxes = np.concatenate([loc, dims, rots[..., np.newaxis]],\n                                      axis=1)\n            loc = np.concatenate([a[\"location\"] for a in dt_annos_part], 0)\n            dims = np.concatenate([a[\"dimensions\"] for a in dt_annos_part], 0)\n            rots = np.concatenate([a[\"rotation_y\"] for a in dt_annos_part], 0)\n            dt_boxes = np.concatenate([loc, dims, rots[..., np.newaxis]],\n                                      axis=1)\n            overlap_part = d3_box_overlap(\n                gt_boxes, dt_boxes, z_axis=z_axis,\n                z_center=z_center).astype(np.float64)\n        else:\n            raise ValueError(\"unknown metric\")\n        parted_overlaps.append(overlap_part)\n        example_idx += num_part\n    overlaps = []\n    example_idx = 0\n    for j, num_part in enumerate(split_parts):\n        gt_annos_part = gt_annos[example_idx:example_idx + num_part]\n        dt_annos_part = dt_annos[example_idx:example_idx + num_part]\n        gt_num_idx, dt_num_idx = 0, 0\n        for i in range(num_part):\n            gt_box_num = total_gt_num[example_idx + i]\n            dt_box_num = total_dt_num[example_idx + i]\n            overlaps.append(\n                parted_overlaps[j][gt_num_idx:gt_num_idx +\n                                   gt_box_num, dt_num_idx:dt_num_idx +\n                                   dt_box_num])\n            gt_num_idx += gt_box_num\n            dt_num_idx += dt_box_num\n        example_idx += num_part\n\n    return overlaps, parted_overlaps, total_gt_num, total_dt_num\n\n\ndef _prepare_data(gt_annos, dt_annos, current_class, difficulty):\n    gt_datas_list = []\n    dt_datas_list = []\n    total_dc_num = []\n    ignored_gts, ignored_dets, dontcares = [], [], []\n    total_num_valid_gt = 0\n    for i in range(len(gt_annos)):\n        
rets = clean_data(gt_annos[i], dt_annos[i], current_class, difficulty)\n        num_valid_gt, ignored_gt, ignored_det, dc_bboxes = rets\n        ignored_gts.append(np.array(ignored_gt, dtype=np.int64))\n        ignored_dets.append(np.array(ignored_det, dtype=np.int64))\n        if len(dc_bboxes) == 0:\n            dc_bboxes = np.zeros((0, 4)).astype(np.float64)\n        else:\n            dc_bboxes = np.stack(dc_bboxes, 0).astype(np.float64)\n        total_dc_num.append(dc_bboxes.shape[0])\n        dontcares.append(dc_bboxes)\n        total_num_valid_gt += num_valid_gt\n        gt_datas = np.concatenate(\n            [gt_annos[i][\"bbox\"], gt_annos[i][\"alpha\"][..., np.newaxis]], 1)\n        dt_datas = np.concatenate([\n            dt_annos[i][\"bbox\"], dt_annos[i][\"alpha\"][..., np.newaxis],\n            dt_annos[i][\"score\"][..., np.newaxis]\n        ], 1)\n        gt_datas_list.append(gt_datas)\n        dt_datas_list.append(dt_datas)\n    total_dc_num = np.stack(total_dc_num, axis=0)\n    return (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets, dontcares,\n            total_dc_num, total_num_valid_gt)\n\n\ndef eval_class(gt_annos,\n                  dt_annos,\n                  current_classes,\n                  difficultys,\n                  metric,\n                  min_overlaps,\n                  compute_aos=False,\n                  z_axis=1,\n                  z_center=1.0,\n                  num_parts=50):\n    \"\"\"Kitti eval. support 2d/bev/3d/aos eval. support 0.5:0.05:0.95 coco AP.\n    Args:\n        gt_annos: dict, must from get_label_annos() in kitti_common.py\n        dt_annos: dict, must from get_label_annos() in kitti_common.py\n        current_class: int, 0: car, 1: pedestrian, 2: cyclist\n        difficulty: int. eval difficulty, 0: easy, 1: normal, 2: hard\n        metric: eval type. 0: bbox, 1: bev, 2: 3d\n        min_overlap: float, min overlap. 
official: \n            [[0.7, 0.5, 0.5], [0.7, 0.5, 0.5], [0.7, 0.5, 0.5]] \n            format: [metric, class]. choose one from matrix above.\n        num_parts: int. a parameter for fast calculate algorithm\n\n    Returns:\n        dict of recall, precision and aos\n    \"\"\"\n    assert len(gt_annos) == len(dt_annos)\n    num_examples = len(gt_annos)\n    split_parts = get_split_parts(num_examples, num_parts)\n\n    rets = calculate_iou_partly(\n        dt_annos,\n        gt_annos,\n        metric,\n        num_parts,\n        z_axis=z_axis,\n        z_center=z_center)\n    overlaps, parted_overlaps, total_dt_num, total_gt_num = rets\n    N_SAMPLE_PTS = 41\n    num_minoverlap = len(min_overlaps)\n    num_class = len(current_classes)\n    num_difficulty = len(difficultys)\n    precision = np.zeros(\n        [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])\n    recall = np.zeros(\n        [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])\n    aos = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])\n    all_thresholds = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])\n    for m, current_class in enumerate(current_classes):\n        for l, difficulty in enumerate(difficultys):\n            rets = _prepare_data(gt_annos, dt_annos, current_class, difficulty)\n            (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets,\n             dontcares, total_dc_num, total_num_valid_gt) = rets\n            for k, min_overlap in enumerate(min_overlaps[:, metric, m]):\n                thresholdss = []\n                for i in range(len(gt_annos)):\n                    rets = compute_statistics_jit(\n                        overlaps[i],\n                        gt_datas_list[i],\n                        dt_datas_list[i],\n                        ignored_gts[i],\n                        ignored_dets[i],\n                        dontcares[i],\n                        metric,\n                        
min_overlap=min_overlap,\n                        thresh=0.0,\n                        compute_fp=False)\n                    tp, fp, fn, similarity, thresholds = rets\n                    thresholdss += thresholds.tolist()\n                thresholdss = np.array(thresholdss)\n                thresholds = get_thresholds(thresholdss, total_num_valid_gt)\n                thresholds = np.array(thresholds)\n                all_thresholds[m, l, k, :len(thresholds)] = thresholds\n                pr = np.zeros([len(thresholds), 4])\n                idx = 0\n                for j, num_part in enumerate(split_parts):\n                    gt_datas_part = np.concatenate(\n                        gt_datas_list[idx:idx + num_part], 0)\n                    dt_datas_part = np.concatenate(\n                        dt_datas_list[idx:idx + num_part], 0)\n                    dc_datas_part = np.concatenate(\n                        dontcares[idx:idx + num_part], 0)\n                    ignored_dets_part = np.concatenate(\n                        ignored_dets[idx:idx + num_part], 0)\n                    ignored_gts_part = np.concatenate(\n                        ignored_gts[idx:idx + num_part], 0)\n                    fused_compute_statistics(\n                        parted_overlaps[j],\n                        pr,\n                        total_gt_num[idx:idx + num_part],\n                        total_dt_num[idx:idx + num_part],\n                        total_dc_num[idx:idx + num_part],\n                        gt_datas_part,\n                        dt_datas_part,\n                        dc_datas_part,\n                        ignored_gts_part,\n                        ignored_dets_part,\n                        metric,\n                        min_overlap=min_overlap,\n                        thresholds=thresholds,\n                        compute_aos=compute_aos)\n                    idx += num_part\n                for i in range(len(thresholds)):\n                    
precision[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 1])\n                    if compute_aos:\n                        aos[m, l, k, i] = pr[i, 3] / (pr[i, 0] + pr[i, 1])\n                for i in range(len(thresholds)):\n                    precision[m, l, k, i] = np.max(\n                        precision[m, l, k, i:], axis=-1)\n                    if compute_aos:\n                        aos[m, l, k, i] = np.max(aos[m, l, k, i:], axis=-1)\n\n    ret_dict = {\n        # \"recall\": recall, # [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS]\n        \"precision\": precision,\n        \"orientation\": aos,\n        \"thresholds\": all_thresholds,\n        \"min_overlaps\": min_overlaps,\n    }\n    return ret_dict\n\n\ndef get_mAP_v2(prec):\n    sums = 0\n    for i in range(0, prec.shape[-1], 4):\n        sums = sums + prec[..., i]\n    return sums / 11 * 100\n\n\ndef do_eval_v2(gt_annos,\n               dt_annos,\n               current_classes,\n               min_overlaps,\n               compute_aos=False,\n               difficultys=(0, 1, 2),\n               z_axis=1,\n               z_center=1.0):\n    # min_overlaps: [num_minoverlap, metric, num_class]\n    ret = eval_class(\n        gt_annos,\n        dt_annos,\n        current_classes,\n        difficultys,\n        0,\n        min_overlaps,\n        compute_aos,\n        z_axis=z_axis,\n        z_center=z_center)\n    # ret: [num_class, num_diff, num_minoverlap, num_sample_points]\n    mAP_bbox = get_mAP_v2(ret[\"precision\"])\n    mAP_aos = None\n    if compute_aos:\n        mAP_aos = get_mAP_v2(ret[\"orientation\"])\n    ret = eval_class(\n        gt_annos,\n        dt_annos,\n        current_classes,\n        difficultys,\n        1,\n        min_overlaps,\n        z_axis=z_axis,\n        z_center=z_center)\n    mAP_bev = get_mAP_v2(ret[\"precision\"])\n    ret = eval_class(\n        gt_annos,\n        dt_annos,\n        current_classes,\n        difficultys,\n        2,\n        
min_overlaps,\n        z_axis=z_axis,\n        z_center=z_center)\n    mAP_3d = get_mAP_v2(ret[\"precision\"])\n    return mAP_bbox, mAP_bev, mAP_3d, mAP_aos\n\ndef do_eval_v3(gt_annos,\n               dt_annos,\n               current_classes,\n               min_overlaps,\n               compute_aos=False,\n               difficultys=(0, 1, 2),\n               z_axis=1,\n               z_center=1.0):\n    # min_overlaps: [num_minoverlap, metric, num_class]\n    types = [\"bbox\", \"bev\", \"3d\"]\n    metrics = {}\n    for i in range(3):\n        ret = eval_class(\n            gt_annos,\n            dt_annos,\n            current_classes,\n            difficultys,\n            i,\n            min_overlaps,\n            compute_aos,\n            z_axis=z_axis,\n            z_center=z_center)\n        metrics[types[i]] = ret\n    return metrics\n\n\ndef do_coco_style_eval(gt_annos,\n                       dt_annos,\n                       current_classes,\n                       overlap_ranges,\n                       compute_aos,\n                       z_axis=1,\n                       z_center=1.0):\n    # overlap_ranges: [range, metric, num_class]\n    min_overlaps = np.zeros([10, *overlap_ranges.shape[1:]])\n    for i in range(overlap_ranges.shape[1]):\n        for j in range(overlap_ranges.shape[2]):\n            min_overlaps[:, i, j] = np.linspace(*overlap_ranges[:, i, j])\n    mAP_bbox, mAP_bev, mAP_3d, mAP_aos = do_eval_v2(\n        gt_annos,\n        dt_annos,\n        current_classes,\n        min_overlaps,\n        compute_aos,\n        z_axis=z_axis,\n        z_center=z_center)\n    # ret: [num_class, num_diff, num_minoverlap]\n    mAP_bbox = mAP_bbox.mean(-1)\n    mAP_bev = mAP_bev.mean(-1)\n    mAP_3d = mAP_3d.mean(-1)\n    if mAP_aos is not None:\n        mAP_aos = mAP_aos.mean(-1)\n    return mAP_bbox, mAP_bev, mAP_3d, mAP_aos\n\n\ndef print_str(value, *arg, sstream=None):\n    if sstream is None:\n        sstream = sysio.StringIO()\n    
sstream.truncate(0)\n    sstream.seek(0)\n    print(value, *arg, file=sstream)\n    return sstream.getvalue()\n\n\ndef get_official_eval_result(gt_annos,\n                             dt_annos,\n                             current_classes,\n                             difficultys=(0, 1, 2),\n                             z_axis=1,\n                             z_center=1.0):\n    \"\"\"\n        gt_annos and dt_annos must contain the following keys:\n        [name, bbox, alpha, location, dimensions, rotation_y];\n        dt_annos must also contain score.\n    \"\"\"\n    # Rows are metrics (bbox, bev, 3d); columns are the classes in class_to_name.\n    overlap_mod = np.array([[0.7, 0.5, 0.5, 0.7, 0.5, 0.7, 0.7, 0.7],\n                            [0.7, 0.5, 0.5, 0.7, 0.5, 0.7, 0.7, 0.7],\n                            [0.7, 0.5, 0.5, 0.7, 0.5, 0.7, 0.7, 0.7]])\n    overlap_easy = np.array([[0.5, 0.5, 0.5, 0.7, 0.5, 0.5, 0.5, 0.5],\n                            [0.5, 0.25, 0.25, 0.5, 0.25, 0.5, 0.5, 0.5],\n                            [0.5, 0.25, 0.25, 0.5, 0.25, 0.5, 0.5, 0.5]])\n    # min_overlaps: [num_minoverlap=2, metric=3, num_class=8]\n    min_overlaps = np.stack([overlap_mod, overlap_easy], axis=0)\n    class_to_name = {\n        0: 'Car',\n        1: 'Pedestrian',\n        2: 'Cyclist',\n        3: 'Van',\n        4: 'Person_sitting',\n        5: 'car',\n        6: 'tractor',\n        7: 'trailer',\n    }\n    name_to_class = {v: n for n, v in class_to_name.items()}\n    if not isinstance(current_classes, (list, tuple)):\n        current_classes = [current_classes]\n    current_classes_int = []\n    for curcls in current_classes:\n        if isinstance(curcls, str):\n            current_classes_int.append(name_to_class[curcls])\n        else:\n            current_classes_int.append(curcls)\n    current_classes = current_classes_int\n    min_overlaps = min_overlaps[:, :, current_classes]\n    result = ''\n    # check whether alpha is valid\n    compute_aos = False\n    for anno in dt_annos:\n        if anno['alpha'].shape[0] != 0:\n            if anno['alpha'][0] != -10:\n                compute_aos = True\n            break\n    metrics = 
do_eval_v3(\n        gt_annos,\n        dt_annos,\n        current_classes,\n        min_overlaps,\n        compute_aos,\n        difficultys,\n        z_axis=z_axis,\n        z_center=z_center)\n    for j, curcls in enumerate(current_classes):\n        # mAP threshold array: [num_minoverlap, metric, class]\n        # mAP result: [num_class, num_diff, num_minoverlap]\n        for i in range(min_overlaps.shape[0]):\n            mAPbbox = get_mAP_v2(metrics[\"bbox\"][\"precision\"][j, :, i])\n            mAPbbox = \", \".join(f\"{v:.2f}\" for v in mAPbbox)\n            mAPbev = get_mAP_v2(metrics[\"bev\"][\"precision\"][j, :, i])\n            mAPbev = \", \".join(f\"{v:.2f}\" for v in mAPbev)\n            mAP3d = get_mAP_v2(metrics[\"3d\"][\"precision\"][j, :, i])\n            mAP3d = \", \".join(f\"{v:.2f}\" for v in mAP3d)\n            result += print_str(\n                (f\"{class_to_name[curcls]} \"\n                 \"AP(Average Precision)@{:.2f}, {:.2f}, {:.2f}:\".format(*min_overlaps[i, :, j])))\n            result += print_str(f\"bbox AP:{mAPbbox}\")\n            result += print_str(f\"bev  AP:{mAPbev}\")\n            result += print_str(f\"3d   AP:{mAP3d}\")\n            if compute_aos:\n                mAPaos = get_mAP_v2(metrics[\"bbox\"][\"orientation\"][j, :, i])\n                mAPaos = \", \".join(f\"{v:.2f}\" for v in mAPaos)\n                result += print_str(f\"aos  AP:{mAPaos}\")\n\n\n    return result\n\n\ndef get_coco_eval_result(gt_annos,\n                         dt_annos,\n                         current_classes,\n                         z_axis=1,\n                         z_center=1.0):\n    class_to_name = {\n        0: 'Car',\n        1: 'Pedestrian',\n        2: 'Cyclist',\n        3: 'Van',\n        4: 'Person_sitting',\n        5: 'car',\n        6: 'tractor',\n        7: 'trailer',\n    }\n    class_to_range = {\n        0: [0.5, 1.0, 0.05],\n        1: [0.25, 0.75, 0.05],\n        2: [0.25, 0.75, 0.05],\n        3: [0.5, 1.0, 
0.05],\n        4: [0.25, 0.75, 0.05],\n        5: [0.5, 1.0, 0.05],\n        6: [0.5, 1.0, 0.05],\n        7: [0.5, 1.0, 0.05],\n    }\n    class_to_range = {\n        0: [0.5, 0.95, 10],\n        1: [0.25, 0.7, 10],\n        2: [0.25, 0.7, 10],\n        3: [0.5, 0.95, 10],\n        4: [0.25, 0.7, 10],\n        5: [0.5, 0.95, 10],\n        6: [0.5, 0.95, 10],\n        7: [0.5, 0.95, 10],\n    }\n\n    name_to_class = {v: n for n, v in class_to_name.items()}\n    if not isinstance(current_classes, (list, tuple)):\n        current_classes = [current_classes]\n    current_classes_int = []\n    for curcls in current_classes:\n        if isinstance(curcls, str):\n            current_classes_int.append(name_to_class[curcls])\n        else:\n            current_classes_int.append(curcls)\n    current_classes = current_classes_int\n    overlap_ranges = np.zeros([3, 3, len(current_classes)])\n    for i, curcls in enumerate(current_classes):\n        overlap_ranges[:, :, i] = np.array(\n            class_to_range[curcls])[:, np.newaxis]\n    result = ''\n    # check whether alpha is valid\n    compute_aos = False\n    for anno in dt_annos:\n        if anno['alpha'].shape[0] != 0:\n            if anno['alpha'][0] != -10:\n                compute_aos = True\n            break\n    mAPbbox, mAPbev, mAP3d, mAPaos = do_coco_style_eval(\n        gt_annos,\n        dt_annos,\n        current_classes,\n        overlap_ranges,\n        compute_aos,\n        z_axis=z_axis,\n        z_center=z_center)\n    for j, curcls in enumerate(current_classes):\n        # mAP threshold array: [num_minoverlap, metric, class]\n        # mAP result: [num_class, num_diff, num_minoverlap]\n        o_range = np.array(class_to_range[curcls])[[0, 2, 1]]\n        o_range[1] = (o_range[2] - o_range[0]) / (o_range[1] - 1)\n        result += print_str((f\"{class_to_name[curcls]} \"\n                             \"coco AP@{:.2f}:{:.2f}:{:.2f}:\".format(*o_range)))\n        result += print_str((f\"bbox 
AP:{mAPbbox[j, 0]:.2f}, \"\n                             f\"{mAPbbox[j, 1]:.2f}, \"\n                             f\"{mAPbbox[j, 2]:.2f}\"))\n        result += print_str((f\"bev  AP:{mAPbev[j, 0]:.2f}, \"\n                             f\"{mAPbev[j, 1]:.2f}, \"\n                             f\"{mAPbev[j, 2]:.2f}\"))\n        result += print_str((f\"3d   AP:{mAP3d[j, 0]:.2f}, \"\n                             f\"{mAP3d[j, 1]:.2f}, \"\n                             f\"{mAP3d[j, 2]:.2f}\"))\n        if compute_aos:\n            result += print_str((f\"aos  AP:{mAPaos[j, 0]:.2f}, \"\n                                 f\"{mAPaos[j, 1]:.2f}, \"\n                                 f\"{mAPaos[j, 2]:.2f}\"))\n    return result\n"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/eval.sh",
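    "content": "#!/bin/bash\n# Usage: ./eval.sh <result_path> [class]\n# <result_path>: path to the detection results.\n# [class]: class index passed to evaluate.py (defaults to 0, i.e. Car).\necho \"$1\"\nif [ -z \"$2\" ] ; then\n    class=\"0\"\nelse\n    class=\"$2\"\nfi\necho \"$class\"\n# NOTE: label_path is machine-specific; point it at your own KITTI label_2 directory.\npython3 evaluate.py evaluate \\\n    --label_path=/mnt/home/ylchen/ylchen/dataset/KITTI_DATASET/kitti_detection/training/label_2/ \\\n    --result_path=\"$1\" \\\n    --current_class=\"$class\" --coco=False\n"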
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/eval_dist.sh",
    "content": "#!/bin/bash\n# Usage: ./eval_dist.sh <result_path> [class]\n# Runs the evaluation once per 5-meter distance bin from 0 to 50 meters.\necho \"$1\"\nif [ -z \"$2\" ] ; then\n    class=\"0\"\nelse\n    class=\"$2\"\nfi\necho \"$class\"\n\nfor i in $(seq 0 5 45)\ndo\n\techo \"eval $i,$((i+5)) meters\"\n\tpython3 evaluate.py evaluate \\\n\t    --label_path=/home/yilunchen/data/kitti/training/label_2/ \\\n\t    --result_path=\"$1\" \\\n\t    --current_class=\"$class\" --coco=False \\\n\t    --eval_dist=\"$i,$((i+5))\"\ndone\n"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/evaluate.py",
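    "content": "import fire\nimport kitti_common as kitti\nfrom eval import get_official_eval_result, get_coco_eval_result\n\n\ndef _read_imageset_file(path):\n    \"\"\"Read an image-set file with one integer image id per line.\"\"\"\n    with open(path, 'r') as f:\n        lines = f.readlines()\n    return [int(line) for line in lines]\n\n\ndef evaluate(label_path,\n             result_path,\n             current_class=0,\n             coco=False,\n             score_thresh=-1,\n             eval_dist=None):\n    \"\"\"Evaluate KITTI-format detection results against ground-truth labels.\n\n    Args:\n        label_path: path to the ground-truth labels.\n        result_path: path to the detection results.\n        current_class: class index or name (0: Car, 1: Pedestrian, 2: Cyclist).\n        coco: if True, print COCO-style AP instead of the official KITTI AP.\n        score_thresh: if positive, filter out low-score detections first.\n        eval_dist: optional distance range, forwarded to get_label_annos.\n    \"\"\"\n    # Load detections first; their image ids select the matching ground truth.\n    dt_annos, image_ids = kitti.get_label_annos(\n        result_path, return_image_ids=True, eval_dist=eval_dist)\n    print('Eval {} images'.format(len(dt_annos)))\n    if score_thresh > 0:\n        dt_annos = kitti.filter_annos_low_score(dt_annos, score_thresh)\n    # val_image_ids = _read_imageset_file(label_split_file)\n    gt_annos = kitti.get_label_annos(\n        label_path, image_ids, eval_dist=eval_dist)\n    if coco:\n        print(get_coco_eval_result(gt_annos, dt_annos, current_class))\n    else:\n        print(get_official_eval_result(gt_annos, dt_annos, current_class))\n\n\nif __name__ == '__main__':\n    fire.Fire()\n"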
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/kitti_common.py",
    "content": "import concurrent.futures as futures\nimport os\nimport pathlib\nimport re\nfrom collections import OrderedDict\n\nimport numpy as np\nfrom skimage import io\n\ndef get_image_index_str(img_idx):\n    return \"{:06d}\".format(img_idx)\n\n\ndef get_kitti_info_path(idx,\n                        prefix,\n                        info_type='image_2',\n                        file_tail='.png',\n                        training=True,\n                        relative_path=True):\n    img_idx_str = get_image_index_str(idx)\n    img_idx_str += file_tail\n    prefix = pathlib.Path(prefix)\n    if training:\n        file_path = pathlib.Path('training') / info_type / img_idx_str\n    else:\n        file_path = pathlib.Path('testing') / info_type / img_idx_str\n    if not (prefix / file_path).exists():\n        raise ValueError(\"file not exist: {}\".format(file_path))\n    if relative_path:\n        return str(file_path)\n    else:\n        return str(prefix / file_path)\n\n\ndef get_image_path(idx, prefix, training=True, relative_path=True):\n    return get_kitti_info_path(idx, prefix, 'image_2', '.png', training,\n                               relative_path)\n\n\ndef get_label_path(idx, prefix, training=True, relative_path=True):\n    return get_kitti_info_path(idx, prefix, 'label_2', '.txt', training,\n                               relative_path)\n\n\ndef get_velodyne_path(idx, prefix, training=True, relative_path=True):\n    return get_kitti_info_path(idx, prefix, 'velodyne', '.bin', training,\n                               relative_path)\n\n\ndef get_calib_path(idx, prefix, training=True, relative_path=True):\n    return get_kitti_info_path(idx, prefix, 'calib', '.txt', training,\n                               relative_path)\n\n\ndef _extend_matrix(mat):\n    mat = np.concatenate([mat, np.array([[0., 0., 0., 1.]])], axis=0)\n    return mat\n\n\ndef get_kitti_image_info(path,\n                         training=True,\n                         
label_info=True,\n                         velodyne=False,\n                         calib=False,\n                         image_ids=7481,\n                         extend_matrix=True,\n                         num_worker=8,\n                         relative_path=True,\n                         with_imageshape=True):\n    # image_infos = []\n    root_path = pathlib.Path(path)\n    if not isinstance(image_ids, list):\n        image_ids = list(range(image_ids))\n\n    def map_func(idx):\n        image_info = {'image_idx': idx}\n        annotations = None\n        if velodyne:\n            image_info['velodyne_path'] = get_velodyne_path(\n                idx, path, training, relative_path)\n        image_info['img_path'] = get_image_path(idx, path, training,\n                                                relative_path)\n        if with_imageshape:\n            img_path = image_info['img_path']\n            if relative_path:\n                img_path = str(root_path / img_path)\n            image_info['img_shape'] = np.array(\n                io.imread(img_path).shape[:2], dtype=np.int32)\n        if label_info:\n            label_path = get_label_path(idx, path, training, relative_path)\n            if relative_path:\n                label_path = str(root_path / label_path)\n            annotations = get_label_anno(label_path)\n        if calib:\n            calib_path = get_calib_path(\n                idx, path, training, relative_path=False)\n            with open(calib_path, 'r') as f:\n                lines = f.readlines()\n            P0 = np.array(\n                [float(info) for info in lines[0].split(' ')[1:13]]).reshape(\n                    [3, 4])\n            P1 = np.array(\n                [float(info) for info in lines[1].split(' ')[1:13]]).reshape(\n                    [3, 4])\n            P2 = np.array(\n                [float(info) for info in lines[2].split(' ')[1:13]]).reshape(\n                    [3, 4])\n            P3 = np.array(\n        
        [float(info) for info in lines[3].split(' ')[1:13]]).reshape(\n                    [3, 4])\n            if extend_matrix:\n                P0 = _extend_matrix(P0)\n                P1 = _extend_matrix(P1)\n                P2 = _extend_matrix(P2)\n                P3 = _extend_matrix(P3)\n            image_info['calib/P0'] = P0\n            image_info['calib/P1'] = P1\n            image_info['calib/P2'] = P2\n            image_info['calib/P3'] = P3\n            R0_rect = np.array([\n                float(info) for info in lines[4].split(' ')[1:10]\n            ]).reshape([3, 3])\n            if extend_matrix:\n                rect_4x4 = np.zeros([4, 4], dtype=R0_rect.dtype)\n                rect_4x4[3, 3] = 1.\n                rect_4x4[:3, :3] = R0_rect\n            else:\n                rect_4x4 = R0_rect\n            image_info['calib/R0_rect'] = rect_4x4\n            Tr_velo_to_cam = np.array([\n                float(info) for info in lines[5].split(' ')[1:13]\n            ]).reshape([3, 4])\n            Tr_imu_to_velo = np.array([\n                float(info) for info in lines[6].split(' ')[1:13]\n            ]).reshape([3, 4])\n            if extend_matrix:\n                Tr_velo_to_cam = _extend_matrix(Tr_velo_to_cam)\n                Tr_imu_to_velo = _extend_matrix(Tr_imu_to_velo)\n            image_info['calib/Tr_velo_to_cam'] = Tr_velo_to_cam\n            image_info['calib/Tr_imu_to_velo'] = Tr_imu_to_velo\n        if annotations is not None:\n            image_info['annos'] = annotations\n            add_difficulty_to_annos(image_info)\n        return image_info\n\n    with futures.ThreadPoolExecutor(num_worker) as executor:\n        image_infos = executor.map(map_func, image_ids)\n    return list(image_infos)\n\n\ndef filter_kitti_anno(image_anno,\n                      used_classes,\n                      used_difficulty=None,\n                      dontcare_iou=None):\n    if not isinstance(used_classes, (list, tuple)):\n        used_classes = 
[used_classes]\n    img_filtered_annotations = {}\n    relevant_annotation_indices = [\n        i for i, x in enumerate(image_anno['name']) if x in used_classes\n    ]\n    for key in image_anno.keys():\n        img_filtered_annotations[key] = (\n            image_anno[key][relevant_annotation_indices])\n    if used_difficulty is not None:\n        relevant_annotation_indices = [\n            i for i, x in enumerate(img_filtered_annotations['difficulty'])\n            if x in used_difficulty\n        ]\n        for key in image_anno.keys():\n            img_filtered_annotations[key] = (\n                img_filtered_annotations[key][relevant_annotation_indices])\n\n    if 'DontCare' in used_classes and dontcare_iou is not None:\n        dont_care_indices = [\n            i for i, x in enumerate(img_filtered_annotations['name'])\n            if x == 'DontCare'\n        ]\n        # bounding box format [y_min, x_min, y_max, x_max]\n        all_boxes = img_filtered_annotations['bbox']\n        ious = iou(all_boxes, all_boxes[dont_care_indices])\n\n        # Remove all bounding boxes that overlap with a dontcare region.\n        if ious.size > 0:\n            boxes_to_remove = np.amax(ious, axis=1) > dontcare_iou\n            for key in image_anno.keys():\n                img_filtered_annotations[key] = (img_filtered_annotations[key][\n                    np.logical_not(boxes_to_remove)])\n    return img_filtered_annotations\n\ndef filter_annos_low_score(image_annos, thresh):\n    new_image_annos = []\n    for anno in image_annos:\n        img_filtered_annotations = {}\n        relevant_annotation_indices = [\n            i for i, s in enumerate(anno['score']) if s >= thresh\n        ]\n        for key in anno.keys():\n            img_filtered_annotations[key] = (\n                anno[key][relevant_annotation_indices])\n        new_image_annos.append(img_filtered_annotations)\n    return new_image_annos\n\ndef kitti_result_line(result_dict, precision=4):\n    
prec_float = \"{\" + \":.{}f\".format(precision) + \"}\"\n    res_line = []\n    all_field_default = OrderedDict([\n        ('name', None),\n        ('truncated', -1),\n        ('occluded', -1),\n        ('alpha', -10),\n        ('bbox', None),\n        ('dimensions', [-1, -1, -1]),\n        ('location', [-1000, -1000, -1000]),\n        ('rotation_y', -10),\n        ('score', None),\n    ])\n    res_dict = [(key, None) for key, val in all_field_default.items()]\n    res_dict = OrderedDict(res_dict)\n    for key, val in result_dict.items():\n        if all_field_default[key] is None and val is None:\n            raise ValueError(\"you must specify a value for {}\".format(key))\n        res_dict[key] = val\n\n    for key, val in res_dict.items():\n        if key == 'name':\n            res_line.append(val)\n        elif key in ['truncated', 'alpha', 'rotation_y', 'score']:\n            if val is None:\n                res_line.append(str(all_field_default[key]))\n            else:\n                res_line.append(prec_float.format(val))\n        elif key == 'occluded':\n            if val is None:\n                res_line.append(str(all_field_default[key]))\n            else:\n                res_line.append('{}'.format(val))\n        elif key in ['bbox', 'dimensions', 'location']:\n            if val is None:\n                res_line += [str(v) for v in all_field_default[key]]\n            else:\n                res_line += [prec_float.format(v) for v in val]\n        else:\n            raise ValueError(\"unknown key. 
supported key:{}\".format(\n                res_dict.keys()))\n    return ' '.join(res_line)\n\n\ndef add_difficulty_to_annos(info):\n    min_height = [40, 25,\n                  25]  # minimum height for evaluated groundtruth/detections\n    max_occlusion = [\n        0, 1, 2\n    ]  # maximum occlusion level of the groundtruth used for evaluation\n    max_trunc = [\n        0.15, 0.3, 0.5\n    ]  # maximum truncation level of the groundtruth used for evaluation\n    annos = info['annos']\n    dims = annos['dimensions']  # lhw format\n    bbox = annos['bbox']\n    height = bbox[:, 3] - bbox[:, 1]\n    occlusion = annos['occluded']\n    truncation = annos['truncated']\n    diff = []\n    # Use the builtin bool: the np.bool alias was deprecated in NumPy 1.20\n    # and removed in NumPy 1.24.\n    easy_mask = np.ones((len(dims), ), dtype=bool)\n    moderate_mask = np.ones((len(dims), ), dtype=bool)\n    hard_mask = np.ones((len(dims), ), dtype=bool)\n    i = 0\n    for h, o, t in zip(height, occlusion, truncation):\n        if o > max_occlusion[0] or h <= min_height[0] or t > max_trunc[0]:\n            easy_mask[i] = False\n        if o > max_occlusion[1] or h <= min_height[1] or t > max_trunc[1]:\n            moderate_mask[i] = False\n        if o > max_occlusion[2] or h <= min_height[2] or t > max_trunc[2]:\n            hard_mask[i] = False\n        i += 1\n    is_easy = easy_mask\n    is_moderate = np.logical_xor(easy_mask, moderate_mask)\n    is_hard = np.logical_xor(hard_mask, moderate_mask)\n\n    for i in range(len(dims)):\n        if is_easy[i]:\n            diff.append(0)\n        elif is_moderate[i]:\n            diff.append(1)\n        elif is_hard[i]:\n            diff.append(2)\n        else:\n            diff.append(-1)\n    annos[\"difficulty\"] = np.array(diff, np.int32)\n    return diff\n\n\ndef get_label_anno(label_path, eval_dist=None):\n    annotations = {}\n    annotations.update({\n        'name': [],\n        'truncated': [],\n        'occluded': [],\n        'alpha': [],\n        'bbox': [],\n        'dimensions': [],\n        'location': 
[],\n        'rotation_y': []\n    })\n    with open(label_path, 'r') as f:\n        lines = f.readlines()\n    # if len(lines) == 0 or len(lines[0]) < 15:\n    #     content = []\n    # else:\n\n    content = [line.strip().split(' ') for line in lines]\n\n    if eval_dist is not None:\n        content = [x for x in content if float(x[13]) >= eval_dist[0] and float(x[13]) < eval_dist[1]]\n\n    annotations['name'] = np.array([x[0] for x in content])\n    annotations['truncated'] = np.array([float(x[1]) for x in content])\n    annotations['occluded'] = np.array([int(x[2]) for x in content])\n    annotations['alpha'] = np.array([float(x[3]) for x in content])\n    annotations['bbox'] = np.array(\n        [[float(info) for info in x[4:8]] for x in content]).reshape(-1, 4)\n    # dimensions will convert hwl format to standard lhw(camera) format.\n    annotations['dimensions'] = np.array(\n        [[float(info) for info in x[8:11]] for x in content]).reshape(\n            -1, 3)[:, [2, 0, 1]]\n    annotations['location'] = np.array(\n        [[float(info) for info in x[11:14]] for x in content]).reshape(-1, 3)\n    annotations['rotation_y'] = np.array(\n        [float(x[14]) for x in content]).reshape(-1)\n    if len(content) != 0 and len(content[0]) == 16:  # have score\n        annotations['score'] = np.array([float(x[15]) for x in content])\n    else:\n        annotations['score'] = np.zeros([len(annotations['bbox'])])\n    return annotations\n\ndef get_label_annos(label_folder, image_ids=None, return_image_ids=False, eval_dist=None):\n    if image_ids is None:\n        filepaths = pathlib.Path(label_folder).glob('*.txt')\n        prog = re.compile(r'^\\d{6}.txt$')\n        filepaths = filter(lambda f: prog.match(f.name), filepaths)\n        image_ids = [int(p.stem) for p in filepaths]\n        image_ids = sorted(image_ids)\n    if not isinstance(image_ids, list):\n        image_ids = list(range(image_ids))\n    annos = []\n    label_folder = 
pathlib.Path(label_folder)\n    for idx in image_ids:\n        image_idx = get_image_index_str(idx)\n        label_filename = label_folder / (image_idx + '.txt')\n        annos.append(get_label_anno(label_filename, eval_dist=eval_dist))\n    if return_image_ids:\n        return annos, image_ids\n    return annos\n\ndef area(boxes, add1=False):\n    \"\"\"Computes area of boxes.\n\n    Args:\n        boxes: Numpy array with shape [N, 4] holding N boxes\n\n    Returns:\n        a numpy array with shape [N] representing box areas\n    \"\"\"\n    if add1:\n        return (boxes[:, 2] - boxes[:, 0] + 1.0) * (\n            boxes[:, 3] - boxes[:, 1] + 1.0)\n    else:\n        return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])\n\n\ndef intersection(boxes1, boxes2, add1=False):\n    \"\"\"Compute pairwise intersection areas between boxes.\n\n    Args:\n        boxes1: a numpy array with shape [N, 4] holding N boxes\n        boxes2: a numpy array with shape [M, 4] holding M boxes\n\n    Returns:\n        a numpy array with shape [N, M] representing pairwise intersection area\n    \"\"\"\n    [y_min1, x_min1, y_max1, x_max1] = np.split(boxes1, 4, axis=1)\n    [y_min2, x_min2, y_max2, x_max2] = np.split(boxes2, 4, axis=1)\n\n    all_pairs_min_ymax = np.minimum(y_max1, np.transpose(y_max2))\n    all_pairs_max_ymin = np.maximum(y_min1, np.transpose(y_min2))\n    if add1:\n        all_pairs_min_ymax += 1.0\n    intersect_heights = np.maximum(\n        np.zeros(all_pairs_max_ymin.shape),\n        all_pairs_min_ymax - all_pairs_max_ymin)\n\n    all_pairs_min_xmax = np.minimum(x_max1, np.transpose(x_max2))\n    all_pairs_max_xmin = np.maximum(x_min1, np.transpose(x_min2))\n    if add1:\n        all_pairs_min_xmax += 1.0\n    intersect_widths = np.maximum(\n        np.zeros(all_pairs_max_xmin.shape),\n        all_pairs_min_xmax - all_pairs_max_xmin)\n    return intersect_heights * intersect_widths\n\n\ndef iou(boxes1, boxes2, add1=False):\n    \"\"\"Computes pairwise intersection-over-union between box collections.\n\n    Args:\n        boxes1: a numpy array with shape [N, 4] holding N boxes.\n        boxes2: a numpy array with shape [M, 4] holding M boxes.\n        add1: if True, add 1 to every box width and height before computing\n            areas (inclusive pixel coordinates).\n\n    Returns:\n        a numpy array with shape [N, M] representing pairwise iou scores.\n    \"\"\"\n    intersect = intersection(boxes1, boxes2, add1)\n    area1 = area(boxes1, add1)\n    area2 = area(boxes2, add1)\n    union = np.expand_dims(\n        area1, axis=1) + np.expand_dims(\n            area2, axis=0) - intersect\n    return intersect / union\n"
  },
  {
    "path": "disparity/eval/kitti-object-eval-python/rotate_iou.py",
    "content": "#####################\n# Based on https://github.com/hongzhenwang/RRPN-revise\n# Licensed under The MIT License\n# Author: yanyan, scrin@foxmail.com\n#####################\nimport math\n\nimport numba\nimport numpy as np\nfrom numba import cuda\n\n@numba.jit(nopython=True)\ndef div_up(m, n):\n    return m // n + (m % n > 0)\n\n@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)\ndef trangle_area(a, b, c):\n    return ((a[0] - c[0]) * (b[1] - c[1]) - (a[1] - c[1]) *\n            (b[0] - c[0])) / 2.0\n\n\n@cuda.jit('(float32[:], int32)', device=True, inline=True)\ndef area(int_pts, num_of_inter):\n    area_val = 0.0\n    for i in range(num_of_inter - 2):\n        area_val += abs(\n            trangle_area(int_pts[:2], int_pts[2 * i + 2:2 * i + 4],\n                         int_pts[2 * i + 4:2 * i + 6]))\n    return area_val\n\n\n@cuda.jit('(float32[:], int32)', device=True, inline=True)\ndef sort_vertex_in_convex_polygon(int_pts, num_of_inter):\n    if num_of_inter > 0:\n        center = cuda.local.array((2, ), dtype=numba.float32)\n        center[:] = 0.0\n        for i in range(num_of_inter):\n            center[0] += int_pts[2 * i]\n            center[1] += int_pts[2 * i + 1]\n        center[0] /= num_of_inter\n        center[1] /= num_of_inter\n        v = cuda.local.array((2, ), dtype=numba.float32)\n        vs = cuda.local.array((16, ), dtype=numba.float32)\n        for i in range(num_of_inter):\n            v[0] = int_pts[2 * i] - center[0]\n            v[1] = int_pts[2 * i + 1] - center[1]\n            d = math.sqrt(v[0] * v[0] + v[1] * v[1])\n            v[0] = v[0] / d\n            v[1] = v[1] / d\n            if v[1] < 0:\n                v[0] = -2 - v[0]\n            vs[i] = v[0]\n        j = 0\n        temp = 0\n        for i in range(1, num_of_inter):\n            if vs[i - 1] > vs[i]:\n                temp = vs[i]\n                tx = int_pts[2 * i]\n                ty = int_pts[2 * i + 1]\n                j = 
i\n                while j > 0 and vs[j - 1] > temp:\n                    vs[j] = vs[j - 1]\n                    int_pts[j * 2] = int_pts[j * 2 - 2]\n                    int_pts[j * 2 + 1] = int_pts[j * 2 - 1]\n                    j -= 1\n\n                vs[j] = temp\n                int_pts[j * 2] = tx\n                int_pts[j * 2 + 1] = ty\n\n\n@cuda.jit(\n    '(float32[:], float32[:], int32, int32, float32[:])',\n    device=True,\n    inline=True)\ndef line_segment_intersection(pts1, pts2, i, j, temp_pts):\n    A = cuda.local.array((2, ), dtype=numba.float32)\n    B = cuda.local.array((2, ), dtype=numba.float32)\n    C = cuda.local.array((2, ), dtype=numba.float32)\n    D = cuda.local.array((2, ), dtype=numba.float32)\n\n    A[0] = pts1[2 * i]\n    A[1] = pts1[2 * i + 1]\n\n    B[0] = pts1[2 * ((i + 1) % 4)]\n    B[1] = pts1[2 * ((i + 1) % 4) + 1]\n\n    C[0] = pts2[2 * j]\n    C[1] = pts2[2 * j + 1]\n\n    D[0] = pts2[2 * ((j + 1) % 4)]\n    D[1] = pts2[2 * ((j + 1) % 4) + 1]\n    BA0 = B[0] - A[0]\n    BA1 = B[1] - A[1]\n    DA0 = D[0] - A[0]\n    CA0 = C[0] - A[0]\n    DA1 = D[1] - A[1]\n    CA1 = C[1] - A[1]\n    acd = DA1 * CA0 > CA1 * DA0\n    bcd = (D[1] - B[1]) * (C[0] - B[0]) > (C[1] - B[1]) * (D[0] - B[0])\n    if acd != bcd:\n        abc = CA1 * BA0 > BA1 * CA0\n        abd = DA1 * BA0 > BA1 * DA0\n        if abc != abd:\n            DC0 = D[0] - C[0]\n            DC1 = D[1] - C[1]\n            ABBA = A[0] * B[1] - B[0] * A[1]\n            CDDC = C[0] * D[1] - D[0] * C[1]\n            DH = BA1 * DC0 - BA0 * DC1\n            Dx = ABBA * DC0 - BA0 * CDDC\n            Dy = ABBA * DC1 - BA1 * CDDC\n            temp_pts[0] = Dx / DH\n            temp_pts[1] = Dy / DH\n            return True\n    return False\n\n\n@cuda.jit(\n    '(float32[:], float32[:], int32, int32, float32[:])',\n    device=True,\n    inline=True)\ndef line_segment_intersection_v1(pts1, pts2, i, j, temp_pts):\n    a = cuda.local.array((2, ), dtype=numba.float32)\n    b = 
cuda.local.array((2, ), dtype=numba.float32)\n    c = cuda.local.array((2, ), dtype=numba.float32)\n    d = cuda.local.array((2, ), dtype=numba.float32)\n\n    a[0] = pts1[2 * i]\n    a[1] = pts1[2 * i + 1]\n\n    b[0] = pts1[2 * ((i + 1) % 4)]\n    b[1] = pts1[2 * ((i + 1) % 4) + 1]\n\n    c[0] = pts2[2 * j]\n    c[1] = pts2[2 * j + 1]\n\n    d[0] = pts2[2 * ((j + 1) % 4)]\n    d[1] = pts2[2 * ((j + 1) % 4) + 1]\n\n    area_abc = trangle_area(a, b, c)\n    area_abd = trangle_area(a, b, d)\n\n    if area_abc * area_abd >= 0:\n        return False\n\n    area_cda = trangle_area(c, d, a)\n    area_cdb = area_cda + area_abc - area_abd\n\n    if area_cda * area_cdb >= 0:\n        return False\n    t = area_cda / (area_abd - area_abc)\n\n    dx = t * (b[0] - a[0])\n    dy = t * (b[1] - a[1])\n    temp_pts[0] = a[0] + dx\n    temp_pts[1] = a[1] + dy\n    return True\n\n\n@cuda.jit('(float32, float32, float32[:])', device=True, inline=True)\ndef point_in_quadrilateral(pt_x, pt_y, corners):\n    ab0 = corners[2] - corners[0]\n    ab1 = corners[3] - corners[1]\n\n    ad0 = corners[6] - corners[0]\n    ad1 = corners[7] - corners[1]\n\n    ap0 = pt_x - corners[0]\n    ap1 = pt_y - corners[1]\n\n    abab = ab0 * ab0 + ab1 * ab1\n    abap = ab0 * ap0 + ab1 * ap1\n    adad = ad0 * ad0 + ad1 * ad1\n    adap = ad0 * ap0 + ad1 * ap1\n\n    return abab >= abap and abap >= 0 and adad >= adap and adap >= 0\n\n\n@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)\ndef quadrilateral_intersection(pts1, pts2, int_pts):\n    num_of_inter = 0\n    for i in range(4):\n        if point_in_quadrilateral(pts1[2 * i], pts1[2 * i + 1], pts2):\n            int_pts[num_of_inter * 2] = pts1[2 * i]\n            int_pts[num_of_inter * 2 + 1] = pts1[2 * i + 1]\n            num_of_inter += 1\n        if point_in_quadrilateral(pts2[2 * i], pts2[2 * i + 1], pts1):\n            int_pts[num_of_inter * 2] = pts2[2 * i]\n            int_pts[num_of_inter * 2 + 1] = pts2[2 * i + 1]\n     
       num_of_inter += 1\n    temp_pts = cuda.local.array((2, ), dtype=numba.float32)\n    for i in range(4):\n        for j in range(4):\n            has_pts = line_segment_intersection(pts1, pts2, i, j, temp_pts)\n            if has_pts:\n                int_pts[num_of_inter * 2] = temp_pts[0]\n                int_pts[num_of_inter * 2 + 1] = temp_pts[1]\n                num_of_inter += 1\n\n    return num_of_inter\n\n\n@cuda.jit('(float32[:], float32[:])', device=True, inline=True)\ndef rbbox_to_corners(corners, rbbox):\n    # generate clockwise corners and rotate it clockwise\n    angle = rbbox[4]\n    a_cos = math.cos(angle)\n    a_sin = math.sin(angle)\n    center_x = rbbox[0]\n    center_y = rbbox[1]\n    x_d = rbbox[2]\n    y_d = rbbox[3]\n    corners_x = cuda.local.array((4, ), dtype=numba.float32)\n    corners_y = cuda.local.array((4, ), dtype=numba.float32)\n    corners_x[0] = -x_d / 2\n    corners_x[1] = -x_d / 2\n    corners_x[2] = x_d / 2\n    corners_x[3] = x_d / 2\n    corners_y[0] = -y_d / 2\n    corners_y[1] = y_d / 2\n    corners_y[2] = y_d / 2\n    corners_y[3] = -y_d / 2\n    for i in range(4):\n        corners[2 *\n                i] = a_cos * corners_x[i] + a_sin * corners_y[i] + center_x\n        corners[2 * i\n                + 1] = -a_sin * corners_x[i] + a_cos * corners_y[i] + center_y\n\n\n@cuda.jit('(float32[:], float32[:])', device=True, inline=True)\ndef inter(rbbox1, rbbox2):\n    corners1 = cuda.local.array((8, ), dtype=numba.float32)\n    corners2 = cuda.local.array((8, ), dtype=numba.float32)\n    intersection_corners = cuda.local.array((16, ), dtype=numba.float32)\n\n    rbbox_to_corners(corners1, rbbox1)\n    rbbox_to_corners(corners2, rbbox2)\n\n    num_intersection = quadrilateral_intersection(corners1, corners2,\n                                                  intersection_corners)\n    sort_vertex_in_convex_polygon(intersection_corners, num_intersection)\n    # print(intersection_corners.reshape([-1, 
2])[:num_intersection])\n\n    return area(intersection_corners, num_intersection)\n\n\n@cuda.jit('(float32[:], float32[:], int32)', device=True, inline=True)\ndef devRotateIoUEval(rbox1, rbox2, criterion=-1):\n    area1 = rbox1[2] * rbox1[3]\n    area2 = rbox2[2] * rbox2[3]\n    area_inter = inter(rbox1, rbox2)\n    if criterion == -1:\n        return area_inter / (area1 + area2 - area_inter)\n    elif criterion == 0:\n        return area_inter / area1\n    elif criterion == 1:\n        return area_inter / area2\n    elif criterion == 2:\n        return area_inter\n\n# (gt dt)\n@cuda.jit('(int64, int64, float32[:], float32[:], float32[:], int32)', fastmath=False)\ndef rotate_iou_kernel_eval(N, K, dev_boxes, dev_query_boxes, dev_iou, criterion=-1):\n    threadsPerBlock = 8 * 8\n    row_start = cuda.blockIdx.x\n    col_start = cuda.blockIdx.y\n    tx = cuda.threadIdx.x\n    row_size = min(N - row_start * threadsPerBlock, threadsPerBlock)\n    col_size = min(K - col_start * threadsPerBlock, threadsPerBlock)\n    block_boxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)\n    block_qboxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)\n\n    dev_query_box_idx = threadsPerBlock * col_start + tx\n    dev_box_idx = threadsPerBlock * row_start + tx\n    if (tx < col_size):\n        block_qboxes[tx * 5 + 0] = dev_query_boxes[dev_query_box_idx * 5 + 0]\n        block_qboxes[tx * 5 + 1] = dev_query_boxes[dev_query_box_idx * 5 + 1]\n        block_qboxes[tx * 5 + 2] = dev_query_boxes[dev_query_box_idx * 5 + 2]\n        block_qboxes[tx * 5 + 3] = dev_query_boxes[dev_query_box_idx * 5 + 3]\n        block_qboxes[tx * 5 + 4] = dev_query_boxes[dev_query_box_idx * 5 + 4]\n    if (tx < row_size):\n        block_boxes[tx * 5 + 0] = dev_boxes[dev_box_idx * 5 + 0]\n        block_boxes[tx * 5 + 1] = dev_boxes[dev_box_idx * 5 + 1]\n        block_boxes[tx * 5 + 2] = dev_boxes[dev_box_idx * 5 + 2]\n        block_boxes[tx * 5 + 3] = dev_boxes[dev_box_idx * 5 + 3]\n    
    block_boxes[tx * 5 + 4] = dev_boxes[dev_box_idx * 5 + 4]\n    cuda.syncthreads()\n\n    tmp_boxes = cuda.local.array(shape=(5,), dtype=numba.float32)\n    tmp_qboxes = cuda.local.array(shape=(5,), dtype=numba.float32)\n\n    if tx < row_size:\n        for i in range(col_size):\n            offset = row_start * threadsPerBlock * K + col_start * threadsPerBlock + tx * K + i\n\n            tmp_boxes[0] = block_boxes[tx * 5]\n            tmp_boxes[1] = block_boxes[tx * 5 + 1]\n            tmp_boxes[2] = block_boxes[tx * 5 + 2]\n            tmp_boxes[3] = block_boxes[tx * 5 + 3]\n            tmp_boxes[4] = block_boxes[tx * 5 + 4]\n\n            tmp_qboxes[0] = block_qboxes[i * 5]\n            tmp_qboxes[1] = block_qboxes[i * 5 + 1]\n            tmp_qboxes[2] = block_qboxes[i * 5 + 2]\n            tmp_qboxes[3] = block_qboxes[i * 5 + 3]\n            tmp_qboxes[4] = block_qboxes[i * 5 + 4]\n\n            tmp_criterion = criterion\n            if criterion == 3 or criterion == 4 or criterion == 5 or \\\n                criterion == 9 or criterion == 10 or criterion == 11 or criterion == 12 or criterion == 18:\n                tmp_criterion = -1\n            elif criterion == 6 or criterion == 7 or criterion == 8 or \\\n                criterion == 13 or criterion == 14 or criterion == 15 or criterion == 16 or criterion == 19:\n                tmp_criterion = 2\n\n            if criterion == 3 or criterion == 6:\n                tmp_qboxes[0] = tmp_boxes[0]\n            elif criterion == 5 or criterion == 8:\n                tmp_qboxes[1] = tmp_boxes[1]\n            elif criterion == 9 or criterion == 13:\n                tmp_qboxes[2] = tmp_boxes[2]\n            elif criterion == 11 or criterion == 15:\n                tmp_qboxes[3] = tmp_boxes[3]\n            elif criterion == 12 or criterion == 16:\n                tmp_qboxes[4] = tmp_boxes[4]\n            elif criterion == 18 or criterion == 19:\n                # it's suppose not to fix (x, y) since bev overlap 
between all boxes is 1\n                # tmp_qboxes[0] = tmp_boxes[0]+1e-3\n                # tmp_qboxes[1] = tmp_boxes[1]+1e-3\n                tmp_qboxes[2] = tmp_boxes[2]\n                tmp_qboxes[3] = tmp_boxes[3]\n                tmp_qboxes[4] = tmp_boxes[4]\n\n            dev_iou[offset] = devRotateIoUEval(tmp_boxes, tmp_qboxes, tmp_criterion)\n\ndef rotate_iou_gpu_eval(boxes, query_boxes, criterion=-1, device_id=0):\n    \"\"\"Rotated-box IoU computed on the GPU. 500x faster than the CPU version\n    (takes 5 ms in one example with numba.cuda code).\n    Converted from [this project](\n        https://github.com/hongzhenwang/RRPN-revise/tree/master/lib/rotation).\n\n    Args:\n        boxes (float array: [N, 5]): rotated boxes. Format: centers, dims,\n            angles (clockwise when positive).\n        query_boxes (float array: [K, 5]): rotated boxes in the same format.\n        criterion (int, optional): Defaults to -1 (intersection over union).\n            0 and 1 divide the intersection by the area of the box or of the\n            query box, 2 returns the intersection area itself. Other codes\n            are remapped inside rotate_iou_kernel_eval.\n        device_id (int, optional): Defaults to 0. CUDA device to run on.\n\n    Returns:\n        float32 array [N, K]: pairwise overlap values.\n    \"\"\"\n    box_dtype = boxes.dtype\n    boxes = boxes.astype(np.float32)\n    query_boxes = query_boxes.astype(np.float32)\n    N = boxes.shape[0]\n    K = query_boxes.shape[0]\n    iou = np.zeros((N, K), dtype=np.float32)\n    if N == 0 or K == 0:\n        return iou\n    threadsPerBlock = 8 * 8\n    cuda.select_device(device_id)\n    blockspergrid = (div_up(N, threadsPerBlock), div_up(K, threadsPerBlock))\n\n    stream = cuda.stream()\n    with stream.auto_synchronize():\n        boxes_dev = cuda.to_device(boxes.reshape([-1]), stream)\n        query_boxes_dev = cuda.to_device(query_boxes.reshape([-1]), stream)\n        iou_dev = cuda.to_device(iou.reshape([-1]), stream)\n        rotate_iou_kernel_eval[blockspergrid, threadsPerBlock, stream](\n            N, K, boxes_dev, query_boxes_dev, iou_dev, criterion)\n        iou_dev.copy_to_host(iou.reshape([-1]), stream=stream)\n    return iou.astype(boxes.dtype)\n"
  },
  {
    "path": "disparity/layers/__init__.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\n\nfrom .batch_norm import FrozenBatchNorm2d\nfrom .misc import Conv2d\nfrom .misc import ConvTranspose2d\nfrom .misc import BatchNorm2d\nfrom .misc import interpolate\nfrom .nms import nms\nfrom .roi_align import ROIAlign\nfrom .roi_align import roi_align\nfrom .roi_pool import ROIPool\nfrom .roi_pool import roi_pool\nfrom .smooth_l1_loss import smooth_l1_loss, l1_loss, l2_loss, ordinal_loss, dorn_encode, dorn_decode, bce_loss\nfrom .sigmoid_focal_loss import SigmoidFocalLoss\nfrom .iou_loss import IOULoss\nfrom .scale import Scale, ScaleShift\nfrom .build_cost_volume import BuildCostVolume\n\n\n__all__ = [\"nms\", \"roi_align\", \"ROIAlign\", \"roi_pool\", \"ROIPool\",\n           \"smooth_l1_loss\", \"Conv2d\", \"ConvTranspose2d\", \"interpolate\",\n           \"BatchNorm2d\", \"FrozenBatchNorm2d\", \"SigmoidFocalLoss\", \"IOULoss\",\n           \"Scale\", \"BuildCostVolume\"]\n\n"
  },
  {
    "path": "disparity/layers/_utils.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport glob\nimport os.path\n\nimport torch\n\ntry:\n    from torch.utils.cpp_extension import load as load_ext\n    from torch.utils.cpp_extension import CUDA_HOME\nexcept ImportError:\n    raise ImportError(\"The cpp layer extensions requires PyTorch 0.4 or higher\")\n\n\ndef _load_C_extensions():\n    this_dir = os.path.dirname(os.path.abspath(__file__))\n    this_dir = os.path.dirname(this_dir)\n    this_dir = os.path.join(this_dir, \"csrc\")\n\n    main_file = glob.glob(os.path.join(this_dir, \"*.cpp\"))\n    source_cpu = glob.glob(os.path.join(this_dir, \"cpu\", \"*.cpp\"))\n    source_cuda = glob.glob(os.path.join(this_dir, \"cuda\", \"*.cu\"))\n\n    source = main_file + source_cpu\n\n    extra_cflags = []\n    if torch.cuda.is_available() and CUDA_HOME is not None:\n        source.extend(source_cuda)\n        extra_cflags = [\"-DWITH_CUDA\"]\n    source = [os.path.join(this_dir, s) for s in source]\n    extra_include_paths = [this_dir]\n    return load_ext(\n        \"torchvision\",\n        source,\n        extra_cflags=extra_cflags,\n        extra_include_paths=extra_include_paths,\n    )\n\n\n_C = _load_C_extensions()\n"
  },
  {
    "path": "disparity/layers/batch_norm.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nfrom torch import nn\n\n\nclass FrozenBatchNorm2d(nn.Module):\n    \"\"\"\n    BatchNorm2d where the batch statistics and the affine parameters\n    are fixed\n    \"\"\"\n\n    def __init__(self, n):\n        super(FrozenBatchNorm2d, self).__init__()\n        self.register_buffer(\"weight\", torch.ones(n))\n        self.register_buffer(\"bias\", torch.zeros(n))\n        self.register_buffer(\"running_mean\", torch.zeros(n))\n        self.register_buffer(\"running_var\", torch.ones(n))\n\n    def forward(self, x):\n        scale = self.weight * self.running_var.rsqrt()\n        bias = self.bias - self.running_mean * scale\n        scale = scale.reshape(1, -1, 1, 1)\n        bias = bias.reshape(1, -1, 1, 1)\n        return x * scale + bias\n"
  },
  {
    "path": "disparity/layers/build_cost_volume.py",
    "content": "import torch\nfrom torch import nn\nfrom torch.autograd import Function\nfrom torch.autograd.function import once_differentiable\nfrom torch.nn.modules.utils import _pair\n\nfrom dsgn import _C\n\n\nclass _BuildCostVolume(Function):\n    @staticmethod\n    def forward(ctx, left, right, shift):\n        ctx.save_for_backward(shift)\n        assert torch.all(shift >= 0.)\n        output = _C.build_cost_volume_forward(\n            left, right, shift\n        )\n        return output\n\n    @staticmethod\n    @once_differentiable\n    def backward(ctx, grad_output):\n        shift, = ctx.saved_tensors\n        grad_left, grad_right = _C.build_cost_volume_backward(\n            grad_output,\n            shift\n        )\n        return grad_left, grad_right, None\n\n\nbuild_cost_volume = _BuildCostVolume.apply\n\n\nclass BuildCostVolume(nn.Module):\n    def __init__(self):\n        super(BuildCostVolume, self).__init__()\n\n    def forward(self, left, right, shift):\n        return build_cost_volume(\n            left, right, shift\n        )\n\n    def __repr__(self):\n        tmpstr = self.__class__.__name__ \n        return tmpstr\n"
  },
  {
    "path": "disparity/layers/iou_loss.py",
    "content": "import torch\nfrom torch import nn\n\n\nclass IOULoss(nn.Module):\n    def forward(self, pred, target, weight=None):\n        pred_left = pred[:, 0]\n        pred_top = pred[:, 1]\n        pred_right = pred[:, 2]\n        pred_bottom = pred[:, 3]\n\n        target_left = target[:, 0]\n        target_top = target[:, 1]\n        target_right = target[:, 2]\n        target_bottom = target[:, 3]\n\n        target_aera = (target_left + target_right) * \\\n                      (target_top + target_bottom)\n        pred_aera = (pred_left + pred_right) * \\\n                    (pred_top + pred_bottom)\n\n        w_intersect = torch.min(pred_left, target_left) + \\\n                      torch.min(pred_right, target_right)\n        h_intersect = torch.min(pred_bottom, target_bottom) + \\\n                      torch.min(pred_top, target_top)\n\n        area_intersect = w_intersect * h_intersect\n        area_union = target_aera + pred_aera - area_intersect\n\n        losses = -torch.log((area_intersect + 1.0) / (area_union + 1.0))\n\n        if weight is not None and weight.sum() > 0:\n            return (losses * weight).sum() / weight.sum()\n        else:\n            assert losses.numel() != 0\n            return losses.mean()\n"
  },
  {
    "path": "disparity/layers/misc.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n\"\"\"\nhelper class that supports empty tensors on some nn functions.\n\nIdeally, add support directly in PyTorch to empty tensors in\nthose functions.\n\nThis can be removed once https://github.com/pytorch/pytorch/issues/12013\nis implemented\n\"\"\"\n\nimport math\nimport torch\nfrom torch.nn.modules.utils import _ntuple\n\n\nclass _NewEmptyTensorOp(torch.autograd.Function):\n    @staticmethod\n    def forward(ctx, x, new_shape):\n        ctx.shape = x.shape\n        return x.new_empty(new_shape)\n\n    @staticmethod\n    def backward(ctx, grad):\n        shape = ctx.shape\n        return _NewEmptyTensorOp.apply(grad, shape), None\n\n\nclass Conv2d(torch.nn.Conv2d):\n    def forward(self, x):\n        if x.numel() > 0:\n            return super(Conv2d, self).forward(x)\n        # get output shape\n\n        output_shape = [\n            (i + 2 * p - (di * (k - 1) + 1)) // d + 1\n            for i, p, di, k, d in zip(\n                x.shape[-2:], self.padding, self.dilation, self.kernel_size, self.stride\n            )\n        ]\n        output_shape = [x.shape[0], self.weight.shape[0]] + output_shape\n        return _NewEmptyTensorOp.apply(x, output_shape)\n\n\nclass ConvTranspose2d(torch.nn.ConvTranspose2d):\n    def forward(self, x):\n        if x.numel() > 0:\n            return super(ConvTranspose2d, self).forward(x)\n        # get output shape\n\n        output_shape = [\n            (i - 1) * d - 2 * p + (di * (k - 1) + 1) + op\n            for i, p, di, k, d, op in zip(\n                x.shape[-2:],\n                self.padding,\n                self.dilation,\n                self.kernel_size,\n                self.stride,\n                self.output_padding,\n            )\n        ]\n        output_shape = [x.shape[0], self.bias.shape[0]] + output_shape\n        return _NewEmptyTensorOp.apply(x, output_shape)\n\n\nclass BatchNorm2d(torch.nn.BatchNorm2d):\n    
def forward(self, x):\n        if x.numel() > 0:\n            return super(BatchNorm2d, self).forward(x)\n        # get output shape\n        output_shape = x.shape\n        return _NewEmptyTensorOp.apply(x, output_shape)\n\n\ndef interpolate(\n    input, size=None, scale_factor=None, mode=\"nearest\", align_corners=None\n):\n    if input.numel() > 0:\n        return torch.nn.functional.interpolate(\n            input, size, scale_factor, mode, align_corners\n        )\n\n    def _check_size_scale_factor(dim):\n        if size is None and scale_factor is None:\n            raise ValueError(\"either size or scale_factor should be defined\")\n        if size is not None and scale_factor is not None:\n            raise ValueError(\"only one of size or scale_factor should be defined\")\n        if (\n            scale_factor is not None\n            and isinstance(scale_factor, tuple)\n            and len(scale_factor) != dim\n        ):\n            raise ValueError(\n                \"scale_factor shape must match input shape. \"\n                \"Input is {}D, scale_factor size is {}\".format(dim, len(scale_factor))\n            )\n\n    def _output_size(dim):\n        _check_size_scale_factor(dim)\n        if size is not None:\n            return size\n        scale_factors = _ntuple(dim)(scale_factor)\n        # math.floor might return float in py2.7\n        return [\n            int(math.floor(input.size(i + 2) * scale_factors[i])) for i in range(dim)\n        ]\n\n    output_shape = tuple(_output_size(2))\n    output_shape = input.shape[:-2] + output_shape\n    return _NewEmptyTensorOp.apply(input, output_shape)\n"
  },
  {
    "path": "disparity/layers/nms.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n# from ._utils import _C\nfrom dsgn import _C\n\nnms = _C.nms\n# nms.__doc__ = \"\"\"\n# This function performs Non-maximum suppresion\"\"\"\n"
  },
  {
    "path": "disparity/layers/roi_align.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nfrom torch import nn\nfrom torch.autograd import Function\nfrom torch.autograd.function import once_differentiable\nfrom torch.nn.modules.utils import _pair\n\nfrom dsgn import _C\n\n\nclass _ROIAlign(Function):\n    @staticmethod\n    def forward(ctx, input, roi, output_size, spatial_scale, sampling_ratio):\n        ctx.save_for_backward(roi)\n        ctx.output_size = _pair(output_size)\n        ctx.spatial_scale = spatial_scale\n        ctx.sampling_ratio = sampling_ratio\n        ctx.input_shape = input.size()\n        output = _C.roi_align_forward(\n            input, roi, spatial_scale, output_size[0], output_size[1], sampling_ratio\n        )\n        return output\n\n    @staticmethod\n    @once_differentiable\n    def backward(ctx, grad_output):\n        rois, = ctx.saved_tensors\n        output_size = ctx.output_size\n        spatial_scale = ctx.spatial_scale\n        sampling_ratio = ctx.sampling_ratio\n        bs, ch, h, w = ctx.input_shape\n        grad_input = _C.roi_align_backward(\n            grad_output,\n            rois,\n            spatial_scale,\n            output_size[0],\n            output_size[1],\n            bs,\n            ch,\n            h,\n            w,\n            sampling_ratio,\n        )\n        return grad_input, None, None, None, None\n\n\nroi_align = _ROIAlign.apply\n\n\nclass ROIAlign(nn.Module):\n    def __init__(self, output_size, spatial_scale, sampling_ratio):\n        super(ROIAlign, self).__init__()\n        self.output_size = output_size\n        self.spatial_scale = spatial_scale\n        self.sampling_ratio = sampling_ratio\n\n    def forward(self, input, rois):\n        return roi_align(\n            input, rois, self.output_size, self.spatial_scale, self.sampling_ratio\n        )\n\n    def __repr__(self):\n        tmpstr = self.__class__.__name__ + \"(\"\n        tmpstr += \"output_size=\" + 
str(self.output_size)\n        tmpstr += \", spatial_scale=\" + str(self.spatial_scale)\n        tmpstr += \", sampling_ratio=\" + str(self.sampling_ratio)\n        tmpstr += \")\"\n        return tmpstr\n"
  },
  {
    "path": "disparity/layers/roi_pool.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nfrom torch import nn\nfrom torch.autograd import Function\nfrom torch.autograd.function import once_differentiable\nfrom torch.nn.modules.utils import _pair\n\nfrom dsgn import _C\n\n\nclass _ROIPool(Function):\n    @staticmethod\n    def forward(ctx, input, roi, output_size, spatial_scale):\n        ctx.output_size = _pair(output_size)\n        ctx.spatial_scale = spatial_scale\n        ctx.input_shape = input.size()\n        output, argmax = _C.roi_pool_forward(\n            input, roi, spatial_scale, output_size[0], output_size[1]\n        )\n        ctx.save_for_backward(input, roi, argmax)\n        return output\n\n    @staticmethod\n    @once_differentiable\n    def backward(ctx, grad_output):\n        input, rois, argmax = ctx.saved_tensors\n        output_size = ctx.output_size\n        spatial_scale = ctx.spatial_scale\n        bs, ch, h, w = ctx.input_shape\n        grad_input = _C.roi_pool_backward(\n            grad_output,\n            input,\n            rois,\n            argmax,\n            spatial_scale,\n            output_size[0],\n            output_size[1],\n            bs,\n            ch,\n            h,\n            w,\n        )\n        return grad_input, None, None, None\n\n\nroi_pool = _ROIPool.apply\n\n\nclass ROIPool(nn.Module):\n    def __init__(self, output_size, spatial_scale):\n        super(ROIPool, self).__init__()\n        self.output_size = output_size\n        self.spatial_scale = spatial_scale\n\n    def forward(self, input, rois):\n        return roi_pool(input, rois, self.output_size, self.spatial_scale)\n\n    def __repr__(self):\n        tmpstr = self.__class__.__name__ + \"(\"\n        tmpstr += \"output_size=\" + str(self.output_size)\n        tmpstr += \", spatial_scale=\" + str(self.spatial_scale)\n        tmpstr += \")\"\n        return tmpstr\n"
  },
  {
    "path": "disparity/layers/scale.py",
    "content": "import torch\nfrom torch import nn\n\n\nclass Scale(nn.Module):\n    def __init__(self, init_value=1.0):\n        super(Scale, self).__init__()\n        self.scale = nn.Parameter(torch.FloatTensor([init_value]))\n\n    def forward(self, input):\n        return input * self.scale\n\nclass ScaleShift(nn.Module):\n    def __init__(self, scale_value, shift_value, exp=False):\n        super(ScaleShift, self).__init__()\n        self.scale = nn.Parameter(torch.FloatTensor([scale_value]))\n        self.shift = nn.Parameter(torch.FloatTensor([shift_value]))\n        self.exp = exp\n\n    def forward(self, input):\n        if not self.exp:\n            return input * self.scale + self.shift\n        else:\n            return torch.exp(input / 10.) * self.scale + self.shift\n"
  },
  {
    "path": "disparity/layers/sigmoid_focal_loss.py",
    "content": "import torch\nfrom torch import nn\nfrom torch.autograd import Function\nfrom torch.autograd.function import once_differentiable\n\nfrom dsgn import _C\n\n# TODO: Use JIT to replace CUDA implementation in the future.\nclass _SigmoidFocalLoss(Function):\n    @staticmethod\n    def forward(ctx, logits, targets, gamma, alpha):\n        ctx.save_for_backward(logits, targets)\n        num_classes = logits.shape[1]\n        ctx.num_classes = num_classes\n        ctx.gamma = gamma\n        ctx.alpha = alpha\n\n        losses = _C.sigmoid_focalloss_forward(\n            logits, targets, num_classes, gamma, alpha\n        )\n        return losses\n\n    @staticmethod\n    @once_differentiable\n    def backward(ctx, d_loss):\n        logits, targets = ctx.saved_tensors\n        num_classes = ctx.num_classes\n        gamma = ctx.gamma\n        alpha = ctx.alpha\n        d_loss = d_loss.contiguous()\n        d_logits = _C.sigmoid_focalloss_backward(\n            logits, targets, d_loss, num_classes, gamma, alpha\n        )\n        return d_logits, None, None, None, None\n\n\nsigmoid_focal_loss_cuda = _SigmoidFocalLoss.apply\n\n\ndef sigmoid_focal_loss_cpu(logits, targets, gamma, alpha):\n    num_classes = logits.shape[1]\n    gamma = gamma[0]\n    alpha = alpha[0]\n    dtype = targets.dtype\n    device = targets.device\n    class_range = torch.arange(1, num_classes+1, dtype=dtype, device=device).unsqueeze(0)\n\n    t = targets.unsqueeze(1)\n    p = torch.sigmoid(logits)\n    term1 = (1 - p) ** gamma * torch.log(p)\n    term2 = p ** gamma * torch.log(1 - p)\n    return -(t == class_range).float() * term1 * alpha - ((t != class_range) * (t >= 0)).float() * term2 * (1 - alpha)\n\n\nclass SigmoidFocalLoss(nn.Module):\n    def __init__(self, gamma, alpha):\n        super(SigmoidFocalLoss, self).__init__()\n        self.gamma = gamma\n        self.alpha = alpha\n\n    def forward(self, logits, targets, weights=None):\n        device = logits.device\n        if 
logits.is_cuda:\n            loss_func = sigmoid_focal_loss_cuda\n        else:\n            loss_func = sigmoid_focal_loss_cpu\n\n        loss = loss_func(logits, targets, self.gamma, self.alpha)\n        if weights is not None:\n            loss = loss * weights.reshape(-1, 1)\n        return loss.sum()\n\n    def __repr__(self):\n        tmpstr = self.__class__.__name__ + \"(\"\n        tmpstr += \"gamma=\" + str(self.gamma)\n        tmpstr += \", alpha=\" + str(self.alpha)\n        tmpstr += \")\"\n        return tmpstr\n"
  },
  {
    "path": "disparity/layers/smooth_l1_loss.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nimport numpy as np\n\n# TODO maybe push this to nn?\ndef smooth_l1_loss(input, target, beta=1. / 9, size_average=True):\n    \"\"\"\n    very similar to the smooth_l1_loss from pytorch, but with\n    the extra beta parameter\n    \"\"\"\n    n = torch.abs(input - target)\n    cond = n < beta\n    loss = torch.where(cond, 0.5 * n ** 2 / beta, n - 0.5 * beta)\n    if size_average:\n        return loss.mean()\n    return loss.sum()\n\ndef l1_loss(input, target, beta=1., sum_last_dim=False):\n    n = torch.abs(input - target) \n    loss = n * beta\n    if sum_last_dim:\n        loss = loss.sum(dim=-1)\n    return loss.mean()\n\ndef l2_loss(input, target, beta=1., sum_last_dim=False):\n    diff = input - target\n    n = diff * diff\n    loss = n * beta\n    if sum_last_dim:\n        loss = loss.sum(dim=-1)\n    return loss.mean()\n\n\ndef ordinal_loss(input, target):\n    N, C = input.shape\n\n    ranges = torch.arange(C, dtype=torch.int32).cuda() \n    mask = ranges[None, :] < target[:, None]\n\n    loss = -(torch.sum(torch.log( input[mask] + 1e-6 )) \\\n        + torch.sum(torch.log( 1. 
- input[~mask] + 1e-6 )))\n\n    loss = loss / N / C\n    return loss\n\ndef dorn_decode(cls, reg, alpha, beta):\n    dorn_dim = cls.shape[1]\n\n    depth_discretization = torch.sum((cls > 0.5), dim=1, keepdim=True)\n    if reg is not None:\n        depth_residual = torch.gather(reg, dim=1, index=depth_discretization)\n        depth_continuity = depth_discretization.float() + 0.5 + depth_residual\n    else:\n        depth_continuity = depth_discretization.float()\n    depth = alpha * (beta / alpha) ** (depth_continuity / dorn_dim)\n\n    return depth\n\ndef dorn_encode(depth, alpha, beta, dorn_dim):\n    depth = dorn_dim * torch.log(depth / alpha + 1e-6) / np.log(beta / alpha + 1e-6)\n    depth = depth.clamp(0, dorn_dim)\n    return depth.int(), depth - depth.int().float() - 0.5\n\ndef bce_loss(score, target):\n    loss = - (target * torch.log(score + 1e-6) + (1 - target) * torch.log( 1 - score + 1e-6))\n\n    return loss.mean()\n\n"
  },
  {
    "path": "disparity/models/ActiveStereoNet.py",
    "content": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport numpy as np\nimport torch.backends.cudnn as cudnn\n\ndef convbn(in_channel, out_channel, kernel_size, stride, pad, dilation):\n    \n    return nn.Sequential(\n        nn.Conv2d(\n            in_channel,\n            out_channel,\n            kernel_size=kernel_size,\n            stride=stride,\n            padding=dilation if dilation>1 else pad,\n            dilation=dilation),\n       nn.BatchNorm2d(out_channel))\n\ndef convbn_3d(in_channel, out_channel, kernel_size, stride, pad):\n\n    return nn.Sequential(\n        nn.Conv3d(\n            in_channel,\n            out_channel,\n            kernel_size=kernel_size,\n            padding=pad,\n            stride=stride),\n       nn.BatchNorm3d(out_channel))\n\nclass ConvolutionBlock(nn.Module):\n    def __init__(self, in_channel, out_channel, stride, downsample, pad, dilation):\n        super(ConvolutionBlock, self).__init__()\n        self.conv1 = nn.Sequential(\n            convbn(in_channel, out_channel, 3, stride, pad, dilation),\n            nn.LeakyReLU(negative_slope=0.2, inplace=True))\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x):\n        out = self.conv1(x)\n        # out = x + out\n        return out\n\nclass ResNetBlock(nn.Module):\n    def __init__(self, in_channel, out_channel, stride, downsample, pad, dilation):\n        super(ResNetBlock, self).__init__()\n        self.conv1 = nn.Conv2d(in_channel, out_channel, kernel_size=3, stride=1, padding=1)\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x):\n        out = self.conv1(x)\n        out = x + out\n        return out\n\nclass Siamese_Tower(nn.Module):\n    def __init__(self):\n        super(Siamese_Tower, self).__init__()\n\n        self.conv_begin = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)\n\n        self.residual_blocks = nn.ModuleList()\n        
for _ in range(3):\n            self.residual_blocks.append(\n                ResNetBlock(\n                    32, 32, stride=1, downsample=None, pad=1, dilation=1))\n\n\n        self.downsample = nn.ModuleList()\n        for _ in range(3):\n            self.downsample.append(\n                ConvolutionBlock(\n                    32, 32, stride=2, downsample=None, pad=1, dilation=1))\n\n        self.conv_end = nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1)\n\n    def forward(self, rgb_img):\n        output = rgb_img\n        output = self.conv_begin(output)\n\n        for block in self.residual_blocks:\n            output = block(output)\n\n        for block in self.downsample:\n            output = block(output)\n        \n        output = self.conv_end (output)\n\n        return output\n\nclass Disparity_Refinement(nn.Module):\n    #return: full_res disparity\n    def __init__(self, in_channel):\n        super(Disparity_Refinement, self).__init__()\n\n\n        self.conv2d_feature_img = nn.Sequential(\n            convbn(in_channel, 16, kernel_size=3, stride=1, pad=1, dilation=1),\n            nn.LeakyReLU(negative_slope=0.2, inplace=True))\n        self.residual_astrous_blocks_img = nn.ModuleList()\n        astrous_list = [1, 2]\n        for di in astrous_list:\n            self.residual_astrous_blocks_img.append(\n                ResNetBlock(\n                    16, 16, stride=1, downsample=None, pad=1, dilation=di))\n\n        self.conv2d_feature_disp = nn.Sequential(\n            convbn(in_channel, 16, kernel_size=3, stride=1, pad=1, dilation=1),\n            nn.LeakyReLU(negative_slope=0.2, inplace=True))\n        self.residual_astrous_blocks_disp = nn.ModuleList()\n        astrous_list = [1, 2]\n        for di in astrous_list:\n            self.residual_astrous_blocks_disp.append(\n                ResNetBlock(\n                    16, 16, stride=1, downsample=None, pad=1, dilation=di))\n\n        self.residual_astrous_blocks_cated = 
nn.ModuleList()\n        astrous_list = [4, 8, 1, 1]\n        for di in astrous_list:\n            self.residual_astrous_blocks_cated.append(\n                ResNetBlock(\n                    32, 32, stride=1, downsample=None, pad=1, dilation=di))\n\n        self.conv_end = nn.Conv2d(32, 1, kernel_size=3, stride=1, padding=1)\n\n    def forward(self, low_disparity, corresponding_rgb):\n\n        feature_disp = self.conv2d_feature_disp(low_disparity)\n        feature_img = self.conv2d_feature_img(corresponding_rgb)\n        feature_cated = torch.cat([feature_disp, feature_img], dim=1)\n\n        Disparity_Residual = self.conv_end(feature_cated)\n\n        \n        return Disparity_Residual + low_disparity \n\nclass Invalidation_Net(nn.Module):\n    #return: full_res Invalidation\n    def __init__(self):\n        super(Invalidation_Net, self).__init__()\n        \n        self.residual_blocks1 = nn.ModuleList()\n        for _ in range(5):\n            self.residual_blocks1.append(\n                ResNetBlock(\n                    64, 64, stride=1, downsample=None, pad=1, dilation=1))\n        self.conv_end1 = nn.Conv2d(64, 1, kernel_size=3, stride=1, padding=1)\n\n        self.conv_begin = ConvolutionBlock(\n                    3, 32, stride=1, downsample=None, pad=1, dilation=1)\n\n        self.residual_blocks2 = nn.ModuleList()\n        for _ in range(4):\n            self.residual_blocks2.append(\n                ResNetBlock(\n                    32, 32, stride=1, downsample=None, pad=1, dilation=1))\n        \n        self.conv_end2 = nn.Conv2d(32, 1, kernel_size=3, stride=1, padding=1)\n\n    def forward(self, left_tower, right_tower, input_img, full_res_disparity):\n\n        output = torch.cat([left_tower, right_tower], dim=1)\n\n        for block in self.residual_blocks1:\n            \n            output = block(output)\n        \n        Low_Res_invalidation_small = self.conv_end1(output)\n        \n        pred = Low_Res_invalidation_small * 
input_img.size()[-1] / Low_Res_invalidation_small.size()[-1]\n\n        Low_Res_invalidation = F.upsample(\n                pred,\n                size=input_img.size()[-2:],\n                mode='bilinear',\n                align_corners=False)\n\n\n        output = torch.cat([input_img, Low_Res_invalidation,full_res_disparity], dim=1)\n        output = self.conv_begin(output)\n        for block in self.residual_blocks2:\n            output = block(output)\n        Invalidation_Residual = self.conv_end2(output)\n\n        return Invalidation_Residual + Low_Res_invalidation\n\nclass disparityregression(nn.Module):\n    def __init__(self, maxdisp):\n        super(disparityregression, self).__init__()\n        self.disp = torch.FloatTensor(\n            np.reshape(np.array(range(maxdisp)), [1, maxdisp, 1, 1])).cuda()\n\n    def forward(self, x):\n        disp = self.disp.repeat(x.size()[0], 1, x.size()[2], x.size()[3])\n        out = torch.sum(x * disp, 1)\n        return out\n\n\nclass Active_StereoNet(nn.Module):\n    def __init__(self, maxdisp=144):\n        super(Active_StereoNet, self).__init__()\n        self.maxdisp = maxdisp\n        self.Siamese_Tower = Siamese_Tower()\n        self.filter = nn.ModuleList()\n        for _ in range(4):\n            self.filter.append(\n                nn.Sequential(\n                    convbn_3d(32, 32, kernel_size=3, stride=1, pad=1),\n                    nn.LeakyReLU(negative_slope=0.2, inplace=True)))\n        self.conv3d_alone = nn.Conv3d(\n            32, 1, kernel_size=3, stride=1, padding=1)\n        \n        self.Disparity_Refinement = Disparity_Refinement(in_channel=1)\n        self.Invalidation_Net = Invalidation_Net()\n\n    \n    def forward(self, left, right):\n        disp = (self.maxdisp + 1) // 8\n        refimg_feature = self.Siamese_Tower(left)\n        targetimg_feature = self.Siamese_Tower(right)\n        \n        def calculate(refimg_feature, targetimg_feature, img, type):\n\n            # matching\n  
          cost = torch.FloatTensor(refimg_feature.size()[0],\n                                    refimg_feature.size()[1],\n                                    disp,\n                                    refimg_feature.size()[2],\n                                    refimg_feature.size()[3]).zero_().cuda()\n\n            if type == 'left':\n                for i in range(disp):\n                    if i > 0:\n                        cost[:, :, i, :, i:] = refimg_feature[ :, :, :, i:] - targetimg_feature[:, :, :, :-i]\n                    else: \n                        cost[:, :, i, :, :] = refimg_feature - targetimg_feature\n\n            if type == 'right':\n                for i in range(disp):\n                    if i > 0:\n                        cost[:, :, i, :, :-i] = refimg_feature[ :, :, :, :-i] - targetimg_feature[:, :, :, i:]\n                    else: \n                        cost[:, :, i, :, :] = refimg_feature - targetimg_feature\n            cost = cost.contiguous()\n\n            for f in self.filter:\n                cost = f(cost)\n            cost = self.conv3d_alone(cost)\n            cost = torch.squeeze(cost, 1)\n            pred = F.softmax(cost, dim=1)\n            pred = disparityregression(disp)(pred)\n\n            pred = pred * img.size()[-1] / pred.size()[-1]\n\n            res_disparity = F.upsample(\n                    torch.unsqueeze(pred, dim=1),\n                    size=img.size()[-2:],\n                    mode='bilinear',\n                    align_corners=False)\n\n            return res_disparity\n\n        res_disparityL = calculate(refimg_feature, targetimg_feature, left, 'left')\n        res_disparityR = calculate(targetimg_feature, refimg_feature, right, 'right')\n\n        Full_res_disparityL = self.Disparity_Refinement(res_disparityL, left)\n        Full_res_disparityR = self.Disparity_Refinement(res_disparityR, right)\n        \n        \n        # Full_res_invalidation = self.Invalidation_Net(refimg_feature, 
targetimg_feature, left, Full_res_disparityL)\n\n\n\n        return Full_res_disparityL, Full_res_disparityR,  Full_res_disparityR\n        # return Full_res_disparityL, Full_res_disparityL,Full_res_disparityL\n\n\n\nif __name__ == '__main__':\n    model = Active_StereoNet().cuda()\n    # model.eval()\n    import time\n    import datetime\n    import torch\n    # torch.backends.cudnn.benchmark = True    \n    input = torch.FloatTensor(1,1,720,1280).zero_().cuda()\n    with torch.no_grad():\n\n        from thop import clever_format\n        from thop import profile\n\n\n        flops, params = profile(model, inputs=(input, input))\n        flops, params = clever_format([flops, params], \"%.3f\")\n        print(flops, params)\n\n    \n\n\n\n\n\n    \n\n\n"
  },
  {
    "path": "disparity/models/__init__.py",
    "content": "from .stereonet import StereoNet\nfrom .hitnet import HitNet\nfrom .stereonet_disp import StereoNet as stereonet_disp"
  },
  {
    "path": "disparity/models/stereonet.py",
    "content": "from __future__ import print_function\n\nfrom .submodule import *\nimport torch\nimport torch.nn as nn\nimport torch.utils.data\nfrom torch.autograd import Variable\nimport torch.nn.functional as F\nimport math\n\nfrom dsgn.utils.bounding_box import compute_corners, quan_to_angle, \\\n    angle_to_quan, quan_to_rotation, compute_corners_sc\nfrom dsgn.layers import BuildCostVolume\n\ndef project_rect_to_image(pts_3d_rect, P):\n    n = pts_3d_rect.shape[0]\n    ones = torch.ones((n,1))\n    if pts_3d_rect.is_cuda:\n        ones = ones.cuda()\n    pts_3d_rect = torch.cat([pts_3d_rect, ones], dim=1)\n    pts_2d = torch.mm(pts_3d_rect, torch.transpose(P, 0, 1)) # nx3\n    pts_2d[:,0] /= pts_2d[:,2]\n    pts_2d[:,1] /= pts_2d[:,2]\n    return pts_2d[:,0:2]\n\nclass StereoNet(nn.Module):\n    def __init__(self, cfg=None):\n        super(StereoNet, self).__init__()\n        self.maxdisp = cfg.maxdisp\n        self.downsample_disp = cfg.downsample_disp\n        self.cfg = cfg\n        self.num_classes = self.cfg.num_classes\n        self.hg_rpn_conv3d = getattr(self.cfg, 'hg_rpn_conv3d', False)\n        self.hg_rpn_conv = getattr(self.cfg, 'hg_rpn_conv', False)\n        self.centerness4class = getattr(self.cfg, 'centerness4class', False)\n        self.img_feature_attentionbydisp = getattr(self.cfg, 'img_feature_attentionbydisp', False)\n        self.voxel_attentionbydisp = getattr(self.cfg, 'voxel_attentionbydisp', False)\n        self.valid_classes = getattr(self.cfg, 'valid_classes', None)\n        self.class4angles = getattr(self.cfg, 'class4angles', True)\n        self.box_corner_parameters = getattr(self.cfg, 'box_corner_parameters', True)\n        self.PlaneSweepVolume = getattr(self.cfg, 'PlaneSweepVolume', True)\n        self.loss_disp = getattr(self.cfg, 'loss_disp', True)\n        self.fix_centerness_bug = getattr(self.cfg, 'fix_centerness_bug', False)\n        self.hg_firstconv = getattr(self.cfg, 'hg_firstconv', False)\n        
self.rpn3d_conv_kernel = getattr(self.cfg, 'rpn3d_conv_kernel', 3)\n\n        if self.PlaneSweepVolume:\n            self.build_cost = BuildCostVolume()\n\n        self.anchor_angles = torch.as_tensor(self.cfg.ANCHOR_ANGLES)\n        self.num_angles = self.cfg.num_angles\n\n        self.feature_extraction = feature_extraction(cfg)\n\n        res_dim = 64\n        if self.PlaneSweepVolume:\n            if not self.hg_firstconv:\n                self.dres0 = nn.Sequential(convbn_3d(res_dim, res_dim, 3, 1, 1, gn=cfg.GN),\n                                           nn.ReLU(inplace=True),\n                                           convbn_3d(res_dim, res_dim, 3, 1, 1, gn=cfg.GN),\n                                           nn.ReLU(inplace=True))\n\n                self.dres1 = nn.Sequential(convbn_3d(res_dim, res_dim, 3, 1, 1, gn=cfg.GN),\n                                           nn.ReLU(inplace=True),\n                                           convbn_3d(res_dim, res_dim, 3, 1, 1, gn=cfg.GN))\n            else:\n                self.dres0 = hourglass(res_dim, gn=cfg.GN)\n\n            self.hg_cv = self.cfg.hg_cv\n\n            if self.hg_cv:\n                self.dres2 = hourglass(res_dim, gn=cfg.GN)\n\n            if self.loss_disp:\n                self.classif1 = nn.Sequential(convbn_3d(res_dim, res_dim, 3, 1, 1, gn=cfg.GN),\n                                              nn.ReLU(inplace=True),\n                                              nn.Conv3d(res_dim, 1, kernel_size=3, padding=1, stride=1, bias=False))\n\n        self.cat_disp = getattr(self.cfg, 'cat_disp', False)\n        self.cat_img_feature = getattr(self.cfg, 'cat_img_feature', False)\n        self.cat_right_img_feature = getattr(self.cfg, 'cat_right_img_feature', False)\n        self.num_convs = getattr(self.cfg.RPN3D, 'NUM_CONVS', 4)\n        self.num_3dconvs = getattr(self.cfg.RPN3D, 'NUM_3DCONVS', 1)\n        assert self.num_3dconvs > 0\n\n        RPN3D_INPUT_DIM = 0\n        if 
self.PlaneSweepVolume: RPN3D_INPUT_DIM += res_dim\n        if self.cat_disp: RPN3D_INPUT_DIM += 1\n        if self.cat_img_feature: RPN3D_INPUT_DIM += self.cfg.RPN_CONVDIM\n        if self.cat_right_img_feature: RPN3D_INPUT_DIM += self.cfg.RPN_CONVDIM\n\n        if self.cfg.RPN3D_ENABLE:\n            conv3d_dim = getattr(self.cfg, 'conv3d_dim', 64)\n\n            self.rpn3d_conv = nn.Sequential(convbn_3d(RPN3D_INPUT_DIM, conv3d_dim, self.rpn3d_conv_kernel, 1, \n                1 if self.rpn3d_conv_kernel == 3 else 0, gn=cfg.GN), nn.ReLU(inplace=True))\n\n            if self.num_3dconvs > 1:\n                self.rpn_3dconv1 = nn.Sequential(convbn_3d(conv3d_dim, conv3d_dim, 3, 1, 1, gn=cfg.GN),\n                    nn.ReLU(inplace=True))\n            if self.num_3dconvs > 2:\n                self.rpn_3dconv2 = nn.Sequential(convbn_3d(conv3d_dim, conv3d_dim, 3, 1, 1, gn=cfg.GN),\n                    nn.ReLU(inplace=True))\n            if self.num_3dconvs > 3:\n                self.rpn_3dconv3 = nn.Sequential(convbn_3d(conv3d_dim, conv3d_dim, 3, 1, 1, gn=cfg.GN),\n                    nn.ReLU(inplace=True))\n\n            if self.hg_rpn_conv3d:\n                self.hg_rpn3d_conv = hourglass(conv3d_dim, gn=cfg.GN)\n\n            self.rpn3d_pool = torch.nn.AvgPool3d((1, 4, 1), stride=(1, 4, 1))\n            self.rpn3d_conv2 = nn.Sequential(convbn(conv3d_dim * 5, conv3d_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                    nn.ReLU(inplace=True))\n\n            if not self.hg_rpn_conv:\n                self.rpn3d_conv3 = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                        nn.ReLU(inplace=True))\n            else:\n                self.rpn3d_conv3 = hourglass2d(res_dim * 2, gn=cfg.GN)\n\n            self.rpn3d_cls_convs = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                    nn.ReLU(inplace=True))\n            self.rpn3d_bbox_convs = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, 
gn=cfg.GN),\n                    nn.ReLU(inplace=True))\n            if self.num_convs > 1:\n                self.rpn3d_cls_convs2 = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                        nn.ReLU(inplace=True))\n                self.rpn3d_bbox_convs2 = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                        nn.ReLU(inplace=True))\n            if self.num_convs > 2:\n                self.rpn3d_cls_convs3 = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                        nn.ReLU(inplace=True))\n                self.rpn3d_bbox_convs3 = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                        nn.ReLU(inplace=True))\n            if self.num_convs > 3:\n                self.rpn3d_cls_convs4 = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                        nn.ReLU(inplace=True))\n                self.rpn3d_bbox_convs4 = nn.Sequential(convbn(res_dim * 2, res_dim * 2, 3, 1, 1, 1, gn=cfg.GN),\n                        nn.ReLU(inplace=True))\n\n            if self.class4angles:\n                self.bbox_cls = nn.Conv2d(res_dim * 2, self.num_angles * self.num_classes, kernel_size=3, padding=1, stride=1)\n            else:\n                self.bbox_cls = nn.Conv2d(res_dim * 2, self.num_classes, kernel_size=3, padding=1, stride=1)\n\n            centerness_dim = 1\n            centerness_dim *= self.num_angles\n            if self.centerness4class:\n                centerness_dim *= self.num_classes\n            self.bbox_centerness = nn.Conv2d(res_dim * 2, centerness_dim, kernel_size=3, padding=1, stride=1)\n\n            self.each_angle_dim = 1\n\n            self.hwl_dim = 3\n            self.xyz_dim = 3\n            # dx,dy,dz dh,dw,dl, [s,c, cls]xnum_angles\n            self.bbox_reg = nn.Conv2d(res_dim * 2, self.num_classes * (self.xyz_dim + self.hwl_dim + self.num_angles * 
self.each_angle_dim), kernel_size=3, padding=1, stride=1)\n            self.anchor_size = torch.as_tensor([cfg.RPN3D.ANCHORS_HEIGHT, cfg.RPN3D.ANCHORS_WIDTH, cfg.RPN3D.ANCHORS_LENGTH]).transpose(1, 0)\n\n        for m in self.modules():\n            if isinstance(m, nn.Conv2d):\n                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels\n                m.weight.data.normal_(0, math.sqrt(2. / n))\n            elif isinstance(m, nn.Conv3d):\n                n = m.kernel_size[0] * m.kernel_size[1] * m.kernel_size[2] * m.out_channels\n                m.weight.data.normal_(0, math.sqrt(2. / n))\n            elif isinstance(m, nn.BatchNorm2d):\n                m.weight.data.fill_(1)\n                m.bias.data.zero_()\n            elif isinstance(m, nn.BatchNorm3d):\n                m.weight.data.fill_(1)\n                m.bias.data.zero_()\n            elif isinstance(m, nn.Linear):\n                m.bias.data.zero_()\n\n        if self.cfg.RPN3D_ENABLE:\n            torch.nn.init.normal_(self.bbox_cls.weight, std=0.1)\n            torch.nn.init.constant_(self.bbox_cls.bias, 0)\n            torch.nn.init.normal_(self.bbox_centerness.weight, std=0.1)\n            torch.nn.init.constant_(self.bbox_centerness.bias, 0)\n            torch.nn.init.normal_(self.bbox_reg.weight, std=0.02)\n            torch.nn.init.constant_(self.bbox_reg.bias, 0)\n\n            prior_prob = cfg.RPN3D.PRIOR_PROB\n            bias_value = -math.log((1 - prior_prob) / prior_prob)\n            torch.nn.init.constant_(self.bbox_cls.bias, bias_value)\n\n        default_baseline = 0.54\n        default_fu = 721.5377\n        default_scale = default_baseline * default_fu\n        self.default_scale = default_scale\n\n        affine_mat = torch.as_tensor([[[1., 0., 0.], [0., 1., 0.]]])\n        affine_mat = affine_mat.repeat(self.maxdisp // self.downsample_disp, 1, 1)\n\n        for i in range(self.maxdisp // self.downsample_disp):\n            depth = ((i + 0.5) * 
self.downsample_disp + self.cfg.depth_min_intervals) * self.cfg.depth_interval\n            affine_mat[self.maxdisp // self.downsample_disp - 1 - i, 0, 2] = default_scale / depth / self.downsample_disp\n        self.affine_mat = affine_mat\n        # depth: 2.0 -> 40.2 # interval 0.2m\n        # disp : about 194.8 -> 9.69\n\n        depth = torch.zeros((self.maxdisp))\n        for i in range(self.maxdisp):\n            depth[self.maxdisp - 1 - i] = (i+self.cfg.depth_min_intervals) * self.cfg.depth_interval\n        self.depth = depth\n\n        self.dispregression = disparityregression(self.maxdisp, cfg=self.cfg)\n\n        self.CV_X_MIN, self.CV_Y_MIN, self.CV_Z_MIN = cfg.CV_X_MIN, cfg.CV_Y_MIN, cfg.CV_Z_MIN\n        self.CV_X_MAX, self.CV_Y_MAX, self.CV_Z_MAX = cfg.CV_X_MAX, cfg.CV_Y_MAX, cfg.CV_Z_MAX\n        self.X_MIN, self.Y_MIN, self.Z_MIN = cfg.X_MIN, cfg.Y_MIN, cfg.Z_MIN\n        self.X_MAX, self.Y_MAX, self.Z_MAX = cfg.X_MAX, cfg.Y_MAX, cfg.Z_MAX\n        self.VOXEL_X_SIZE, self.VOXEL_Y_SIZE, self.VOXEL_Z_SIZE = cfg.VOXEL_X_SIZE, cfg.VOXEL_Y_SIZE, cfg.VOXEL_Z_SIZE\n        self.GRID_SIZE = cfg.GRID_SIZE\n\n        zs = torch.arange(self.Z_MIN, self.Z_MAX, self.VOXEL_Z_SIZE) + self.VOXEL_Z_SIZE / 2.\n        ys = torch.arange(self.Y_MIN, self.Y_MAX, self.VOXEL_Y_SIZE) + self.VOXEL_Y_SIZE / 2.\n        xs = torch.arange(self.X_MIN, self.X_MAX, self.VOXEL_X_SIZE) + self.VOXEL_X_SIZE / 2.\n        zs, ys, xs = torch.meshgrid(zs, ys, xs)\n        coord_rect = torch.stack([xs, ys, zs], dim=-1)\n        self.coord_rect = coord_rect\n\n    def forward(self, left, right, calibs_fu, calibs_baseline, calibs_Proj, calibs_Proj_R=None):\n        N = left.shape[0]\n\n        refimg_fea, left_rpn_feature = self.feature_extraction(left)\n        targetimg_fea, right_rpn_feature = self.feature_extraction(right)\n\n        outputs = dict()\n\n        if self.PlaneSweepVolume:\n            affine_mat = self.affine_mat.cuda().clone().unsqueeze(0).repeat(N, 1, 1, 1)\n          
  affine_mat[:, :, 0, 2] = affine_mat[:, :, 0, 2] * calibs_fu[:,None].cuda().float() * calibs_baseline[:,None].cuda().float() / self.default_scale\n            cost = self.build_cost(refimg_fea, targetimg_fea, affine_mat[:,:,0,2])\n            cost = cost.contiguous()\n\n            if not self.hg_firstconv:\n                cost0 = self.dres0(cost)\n                cost0 = self.dres1(cost0) + cost0\n            else:\n                out0, pre0, post0 = self.dres0(cost, None, None)\n                cost0 = out0\n\n            if self.hg_cv:\n                out1, pre1, post1 = self.dres2(cost0, None, None)\n                out1 = out1 + cost0\n                if self.loss_disp:\n                    cost1 = self.classif1(out1)\n                else:\n                    cost1 = None\n\n                out, cost = out1, cost1\n            else:\n                out0 = cost0\n                if self.loss_disp:\n                    cost0 = self.classif1(out0)\n                else:\n                    cost0 = None\n\n                out, cost = out0, cost0\n            \n        outputs['depth_preds'] = []\n\n        if self.PlaneSweepVolume and self.loss_disp:\n            if self.hg_cv:\n                cost1 = F.upsample(cost1, [self.maxdisp, left.size()[2], left.size()[3]], mode='trilinear', align_corners=self.cfg.align_corners)\n                cost1 = torch.squeeze(cost1, 1)\n                pred1_softmax = F.softmax(cost1, dim=1)\n                pred1 = self.dispregression(pred1_softmax, depth=self.depth.cuda())\n                if self.training:\n                    outputs['depth_preds'].append( pred1 )\n                else:\n                    outputs['depth_preds'] = pred1\n            else:\n                cost0 = F.upsample(cost0, [self.maxdisp, left.size()[2], left.size()[3]], mode='trilinear', align_corners=self.cfg.align_corners)\n                cost0 = torch.squeeze(cost0, 1)\n                pred1_softmax = F.softmax(cost0, dim=1)\n             
   pred1 = self.dispregression(pred1_softmax, depth=self.depth.cuda())\n                if self.training:\n                    outputs['depth_preds'].append( pred1 )\n                else:\n                    outputs['depth_preds'] = pred1\n\n        if self.cfg.RPN3D_ENABLE:\n            coord_rect = self.coord_rect.cuda()\n\n            norm_coord_imgs = []\n            for i in range(N):\n                coord_img = torch.as_tensor(\n                    project_rect_to_image(\n                        coord_rect.reshape(-1, 3), \n                        calibs_Proj[i].float().cuda()\n                    ).reshape(*self.coord_rect.shape[:3], 2), \n                dtype=torch.float32)\n\n                coord_img = torch.cat([coord_img, self.coord_rect[..., 2:]], dim=-1)\n                norm_coord_img = (coord_img - torch.as_tensor([self.CV_X_MIN, self.CV_Y_MIN, self.CV_Z_MIN])[None, None, None, :]) / \\\n                    (torch.as_tensor([self.CV_X_MAX, self.CV_Y_MAX, self.CV_Z_MAX]) - torch.as_tensor([self.CV_X_MIN, self.CV_Y_MIN, self.CV_Z_MIN]))[None, None, None, :]\n                norm_coord_img = norm_coord_img * 2. - 1.\n                norm_coord_imgs.append(norm_coord_img)\n            norm_coord_imgs = torch.stack(norm_coord_imgs, dim=0)\n            norm_coord_imgs = norm_coord_imgs.cuda()\n\n            outputs['norm_coord_imgs'] = norm_coord_imgs\n            outputs['coord_rect'] = coord_rect\n\n            valids = (norm_coord_imgs[..., 0] >= -1.) & (norm_coord_imgs[..., 0] <= 1.) & \\\n                (norm_coord_imgs[..., 1] >= -1.) & (norm_coord_imgs[..., 1] <= 1.) & \\\n                (norm_coord_imgs[..., 2] >= -1.) 
& (norm_coord_imgs[..., 2] <= 1.)\n            outputs['valids'] = valids\n            valids = valids.float()\n\n            if self.PlaneSweepVolume:\n                # Retrieve Voxel Feature from Cost Volume Feature\n                if self.cat_disp:\n                    CV_feature = torch.cat([out, cost.detach()], dim= 1)\n                else:\n                    CV_feature = out\n\n                Voxel = F.grid_sample(CV_feature, norm_coord_imgs)\n                Voxel = Voxel * valids[:, None, :, :, :]\n\n                if (self.voxel_attentionbydisp or (self.img_feature_attentionbydisp and self.cat_img_feature)):\n                    pred_disp = F.grid_sample(pred1_softmax.detach()[:, None], norm_coord_imgs)\n                    pred_disp = pred_disp * valids[:, None, :, :, :]\n\n                if self.voxel_attentionbydisp:\n                    Voxel = Voxel * pred_disp\n            else:\n                Voxel = None\n\n            # Retrieve Voxel Feature from 2D Img Feature\n            if self.cat_img_feature:\n                RPN_feature = left_rpn_feature\n\n                valids = (norm_coord_imgs[..., 0] >= -1.) & (norm_coord_imgs[..., 0] <= 1.) & \\\n                    (norm_coord_imgs[..., 1] >= -1.) 
& (norm_coord_imgs[..., 1] <= 1.)\n                valids = valids.float() \n\n                Voxel_2D = []\n                for i in range(N):\n                    RPN_feature_per_im = RPN_feature[i:i+1]\n                    for j in range(len(norm_coord_imgs[i])):\n                        Voxel_2D_feature = F.grid_sample(RPN_feature_per_im, norm_coord_imgs[i, j:j+1, :, :, :2])\n                        Voxel_2D.append(Voxel_2D_feature)\n                Voxel_2D = torch.cat(Voxel_2D, dim=0)\n                Voxel_2D = Voxel_2D.reshape(N, self.GRID_SIZE[0], -1, self.GRID_SIZE[1], self.GRID_SIZE[2]).transpose(1,2)\n                Voxel_2D = Voxel_2D * valids[:, None, :, :, :]\n\n                if self.img_feature_attentionbydisp:\n                    Voxel_2D = Voxel_2D * pred_disp\n\n                if Voxel is not None:\n                    Voxel = torch.cat([Voxel, Voxel_2D], dim=1)\n                else:\n                    Voxel = Voxel_2D\n\n            if self.cat_right_img_feature:\n                RPN_feature = right_rpn_feature\n\n                norm_coord_right_imgs = []\n                for i in range(N):\n                    coord_right_img = torch.as_tensor(\n                        project_rect_to_image(\n                            coord_rect.reshape(-1, 3), \n                            calibs_Proj_R[i].float().cuda()\n                        ).reshape(*self.coord_rect.shape[:3], 2), \n                    dtype=torch.float32)\n\n                    coord_right_img = torch.cat([coord_right_img, self.coord_rect[..., 2:]], dim=-1)\n                    norm_coord_img = (coord_right_img - torch.as_tensor([self.CV_X_MIN, self.CV_Y_MIN, self.CV_Z_MIN])[None, None, None, :]) / \\\n                        (torch.as_tensor([self.CV_X_MAX, self.CV_Y_MAX, self.CV_Z_MAX]) - torch.as_tensor([self.CV_X_MIN, self.CV_Y_MIN, self.CV_Z_MIN]))[None, None, None, :]\n                    norm_coord_img = norm_coord_img * 2. 
- 1.\n                    norm_coord_right_imgs.append(norm_coord_img)\n                norm_coord_right_imgs = torch.stack(norm_coord_right_imgs, dim=0)\n                norm_coord_right_imgs = norm_coord_right_imgs.cuda()\n\n                valids_R = (norm_coord_right_imgs[..., 0] >= -1.) & (norm_coord_right_imgs[..., 0] <= 1.) & \\\n                    (norm_coord_right_imgs[..., 1] >= -1.) & (norm_coord_right_imgs[..., 1] <= 1.) \n                valids_R = valids_R.float()\n\n                Voxel_2D_R = []\n                for i in range(N):\n                    RPN_feature_per_im = RPN_feature[i:i+1]\n                    for j in range(len(norm_coord_right_imgs[i])):\n                        Voxel_2D_feature = F.grid_sample(RPN_feature_per_im, norm_coord_right_imgs[i, j:j+1, :, :, :2])\n                        Voxel_2D_R.append(Voxel_2D_feature)\n                Voxel_2D_R = torch.cat(Voxel_2D_R, dim=0)\n                Voxel_2D_R = Voxel_2D_R.reshape(N, self.GRID_SIZE[0], -1, self.GRID_SIZE[1], self.GRID_SIZE[2]).transpose(1,2)\n                Voxel_2D_R = Voxel_2D_R * valids_R[:, None, :, :, :]\n\n                if self.img_feature_attentionbydisp:\n                    Voxel_2D_R = Voxel_2D_R * pred_disp\n\n                if Voxel is not None:\n                    Voxel = torch.cat([Voxel, Voxel_2D_R], dim=1)\n                else:\n                    Voxel = Voxel_2D_R\n\n            # (64, 190, 20, 300)\n            Voxel = self.rpn3d_conv(Voxel) # (64, 190, 20, 300)\n\n            if self.num_3dconvs > 1:\n                Voxel = self.rpn_3dconv1(Voxel)\n            if self.num_3dconvs > 2:\n                Voxel = self.rpn_3dconv2(Voxel)\n            if self.num_3dconvs > 3:\n                Voxel = self.rpn_3dconv3(Voxel)\n\n            if self.hg_rpn_conv3d:\n                Voxel1, pre_Voxel, post_Voxel = self.hg_rpn3d_conv(Voxel, None, None)\n                Voxel = Voxel1 + Voxel\n\n            Voxel = self.rpn3d_pool(Voxel) # (64, 190, 5, 
300)\n            Voxel = Voxel.permute(0, 1, 3, 2, 4).reshape(N, -1, self.GRID_SIZE[0], self.GRID_SIZE[2]).contiguous()\n\n            Voxel_BEV = self.rpn3d_conv2(Voxel)\n\n            if not self.hg_rpn_conv:\n                Voxel_BEV = self.rpn3d_conv3(Voxel_BEV)\n            else:\n                Voxel_BEV1, pre_BEV, post_BEV = self.rpn3d_conv3(Voxel_BEV, None, None)\n                Voxel_BEV = Voxel_BEV1 # some bug\n\n            Voxel_BEV_cls = self.rpn3d_cls_convs(Voxel_BEV)\n            Voxel_BEV_bbox = self.rpn3d_bbox_convs(Voxel_BEV)\n            if self.num_convs > 1:\n                Voxel_BEV_cls = self.rpn3d_cls_convs2(Voxel_BEV_cls)\n                Voxel_BEV_bbox = self.rpn3d_bbox_convs2(Voxel_BEV_bbox)\n            if self.num_convs > 2:\n                Voxel_BEV_cls = self.rpn3d_cls_convs3(Voxel_BEV_cls)\n                Voxel_BEV_bbox = self.rpn3d_bbox_convs3(Voxel_BEV_bbox)\n            if self.num_convs > 3:\n                Voxel_BEV_cls = self.rpn3d_cls_convs4(Voxel_BEV_cls)\n                Voxel_BEV_bbox = self.rpn3d_bbox_convs4(Voxel_BEV_bbox)\n\n            bbox_cls = self.bbox_cls(Voxel_BEV_cls)\n            if not self.fix_centerness_bug:\n                bbox_reg = self.bbox_reg(Voxel_BEV_cls)\n                bbox_centerness = self.bbox_centerness(Voxel_BEV_bbox)\n            else:\n                bbox_reg = self.bbox_reg(Voxel_BEV_bbox)\n                bbox_centerness = self.bbox_centerness(Voxel_BEV_bbox)\n\n            # dx, dy, h, w, l, q1, q2, q3, q4, dz\n            N, C, H, W = bbox_reg.shape\n\n            dxyz, dhwl, angle_reg = torch.split(bbox_reg.reshape(N, self.num_classes, C // self.num_classes, H, W), \\\n                    [self.xyz_dim, self.hwl_dim, self.each_angle_dim * self.num_angles], dim=2)\n\n            # angle / orientation\n            angle_reg = angle_reg.permute(0, 3, 4, 2, 1).reshape(-1, self.each_angle_dim * self.num_angles, self.num_classes)\n\n            angle_range = np.pi * 2 / 
self.num_angles\n            q = angle_reg.tanh() * angle_range / 2.\n            q = q + self.anchor_angles.cuda()[None, :, None]\n            sin_d, cos_d = torch.sin(q), torch.cos(q)\n\n            # XYZ\n            dxyz = dxyz[:, None, :].repeat(1, self.num_angles, 1, 1, 1, 1)\n\n            dhwl = dhwl.permute(0, 3, 4, 1, 2).reshape(-1, self.num_classes, self.hwl_dim)\n            dhwl = dhwl[:, None, :, :].repeat(1, self.num_angles, 1, 1)\n            hwl = self.anchor_size.cuda().reshape(1, 1, self.num_classes, 3) * torch.exp(dhwl)\n            hwl = hwl.reshape(-1, self.num_angles, self.num_classes, 3)\n\n            if not self.box_corner_parameters:\n                hwl = hwl.reshape(N, H, W, self.num_angles, self.num_classes, 3)\n                hwl = hwl.permute(0, 3, 4, 5, 1, 2)\n\n                q = q.reshape(N, H, W, self.num_angles, self.num_classes)\n                q = q.permute(0, 3, 4, 1, 2)\n\n                # N, num_angles, num_classes, dim, H, W\n                bbox_reg = torch.cat( [dxyz, hwl, q[:, :, :, None]], dim=3)\n                bbox_reg = bbox_reg.reshape(N, self.num_angles * self.num_classes * 7, H, W)\n            else:\n                box_corners = compute_corners_sc(\n                    hwl.reshape(-1, 3), \n                    sin_d.reshape(-1), \n                    cos_d.reshape(-1)\n                ).reshape(N, H, W, self.num_angles, self.num_classes, 3, 8)\n                box_corners[:, :, :, :, :, 1, :] += hwl.reshape(N, H, W, self.num_angles, self.num_classes, 3)[:, :, :, :, :, 0:1] / 2.\n                box_corners = box_corners.permute(0, 3, 4, 6, 5, 1, 2) \n                # (N, num_classes, num_angles, 8, 3, H, W)\n\n                # (N, num_classes, num_angles, )\n                bbox_reg = box_corners + dxyz[:, :, :, None]\n                bbox_reg = bbox_reg.reshape(N, self.num_angles * self.num_classes * 24, H, W)\n\n            outputs['bbox_cls'] = bbox_cls\n            outputs['bbox_reg'] = bbox_reg\n    
        outputs['bbox_centerness'] = bbox_centerness\n\n        return outputs\n"
  },
  {
    "path": "disparity/models/stereonet_disp.py",
    "content": "# ------------------------------------------------------------------------------\n# Copyright (c) NKU\n# Licensed under the MIT License.\n# Written by Xuanyi Li (xuanyili.edu@gmail.com)\n# ------------------------------------------------------------------------------\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport numpy as np\nimport torch.backends.cudnn as cudnn\n\ndef convbn(in_channel, out_channel, kernel_size, stride, pad, dilation):\n    \n    return nn.Sequential(\n        nn.Conv2d(\n            in_channel,\n            out_channel,\n            kernel_size=kernel_size,\n            stride=stride,\n            padding=dilation if dilation>1 else pad,\n            dilation=dilation),\n       nn.BatchNorm2d(out_channel))\n\ndef convbn_3d(in_channel, out_channel, kernel_size, stride, pad):\n\n    return nn.Sequential(\n        nn.Conv3d(\n            in_channel,\n            out_channel,\n            kernel_size=kernel_size,\n            padding=pad,\n            stride=stride),\n       nn.BatchNorm3d(out_channel))\n\nclass BasicBlock(nn.Module):\n    def __init__(self, in_channel, out_channel, stride, downsample, pad, dilation):\n        super().__init__()\n        self.conv1 = nn.Sequential(\n            convbn(in_channel, out_channel, 3, stride, pad, dilation),\n            nn.LeakyReLU(negative_slope=0.2, inplace=True))\n        self.conv2 = convbn(out_channel, out_channel, 3, 1, pad, dilation)\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x):\n        out = self.conv1(x)\n\n        # out = self.conv2(out)\n\n        if self.downsample is not None:\n            x = self.downsample(x)\n        ### bug?\n        out = x + out\n        return out\n\nclass FeatureExtraction(nn.Module):\n    def __init__(self, k):\n        super().__init__()\n        self.k = k\n        self.downsample = nn.ModuleList()\n        in_channel = 3\n        out_channel = 32\n        for _ in 
range(k):\n            self.downsample.append(\n                nn.Conv2d(\n                    in_channel,\n                    out_channel,\n                    kernel_size=5,\n                    stride=2,\n                    padding=2))\n            in_channel = out_channel\n            out_channel = 32\n        self.residual_blocks = nn.ModuleList()\n        for _ in range(6):\n            self.residual_blocks.append(\n                BasicBlock(\n                    32, 32, stride=1, downsample=None, pad=1, dilation=1))\n        self.conv_alone = nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1)\n    def forward(self, rgb_img):\n        output = rgb_img\n        for i in range(self.k):\n            output = self.downsample[i](output)\n        for block in self.residual_blocks:\n            output = block(output)\n        return self.conv_alone(output)\n\nclass EdgeAwareRefinement(nn.Module):\n    def __init__(self, in_channel):\n        super().__init__()\n        self.conv2d_feature = nn.Sequential(\n            convbn(in_channel, 32, kernel_size=3, stride=1, pad=1, dilation=1),\n            nn.LeakyReLU(negative_slope=0.2, inplace=True))\n        self.residual_astrous_blocks = nn.ModuleList()\n        astrous_list = [1, 2, 4, 8 , 1 , 1]\n        for di in astrous_list:\n            self.residual_astrous_blocks.append(\n                BasicBlock(\n                    32, 32, stride=1, downsample=None, pad=1, dilation=di))\n                \n        self.conv2d_out = nn.Conv2d(32, 1, kernel_size=3, stride=1, padding=1)\n\n    def forward(self, low_disparity, corresponding_rgb):\n        output = torch.unsqueeze(low_disparity, dim=1)\n        twice_disparity = F.interpolate(\n            output,\n            size = corresponding_rgb.size()[-2:],\n            mode='bilinear',\n            align_corners=False)\n        if corresponding_rgb.size()[-1]/ low_disparity.size()[-1] >= 1.5:\n            twice_disparity *= 8   \n        output = 
self.conv2d_feature(\n            torch.cat([twice_disparity, corresponding_rgb], dim=1))\n        for astrous_block in self.residual_astrous_blocks:\n            output = astrous_block(output)\n        \n        return nn.ReLU(inplace=True)(torch.squeeze(\n            twice_disparity + self.conv2d_out(output), dim=1))\n\nclass disparityregression(nn.Module):\n    def __init__(self, maxdisp):\n        super().__init__()\n        self.disp = torch.FloatTensor(\n            np.reshape(np.array(range(maxdisp)), [1, maxdisp, 1, 1])).cuda()\n\n    def forward(self, x):\n        disp = self.disp.repeat(x.size()[0], 1, x.size()[2], x.size()[3])\n        out = torch.sum(x * disp, 1)\n        return out\n\n\nclass StereoNet(nn.Module):\n    def __init__(self, k=3, r=3, maxdisp=192):\n        super().__init__()\n        self.maxdisp = maxdisp\n        self.k = k\n        self.r = r\n        self.feature_extraction = FeatureExtraction(k)\n        self.filter = nn.ModuleList()\n        for _ in range(4):\n            self.filter.append(\n                nn.Sequential(\n                    convbn_3d(32, 32, kernel_size=3, stride=1, pad=1),\n                    nn.LeakyReLU(negative_slope=0.2, inplace=True)))\n        self.conv3d_alone = nn.Conv3d(\n            32, 1, kernel_size=3, stride=1, padding=1)\n        \n        self.edge_aware_refinements = nn.ModuleList()\n        for _ in range(1):\n            self.edge_aware_refinements.append(EdgeAwareRefinement(4))\n    \n    def forward(self, left, right):\n        disp = (self.maxdisp + 1) // pow(2, self.k)\n        refimg_feature = self.feature_extraction(left)\n        targetimg_feature = self.feature_extraction(right)\n\n        # matching\n        cost = torch.FloatTensor(refimg_feature.size()[0],\n                                 refimg_feature.size()[1],\n                                 disp,\n                                 refimg_feature.size()[2],\n                                 
refimg_feature.size()[3]).zero_().cuda()\n        # Build the difference cost volume: slice i compares left features\n        # with right features shifted right by i pixels.\n        for i in range(disp):\n            if i > 0:\n                cost[:, :, i, :, i:] = refimg_feature[:, :, :, i:] - targetimg_feature[:, :, :, :-i]\n            else:\n                cost[:, :, i, :, :] = refimg_feature - targetimg_feature\n        cost = cost.contiguous()\n\n        # Filter the cost volume with 3D convolutions, then regress disparity.\n        for f in self.filter:\n            cost = f(cost)\n        cost = self.conv3d_alone(cost)\n        cost = torch.squeeze(cost, 1)\n        pred = F.softmax(cost, dim=1)\n        pred = disparityregression(disp)(pred)\n\n        img_pyramid_list = [left]\n        pred_pyramid_list = [pred]\n\n        # Refine the coarse prediction at full resolution, guided by the left image.\n        pred_pyramid_list.append(self.edge_aware_refinements[0](\n                    pred_pyramid_list[0], img_pyramid_list[0]))\n\n        # Scale the coarse disparity to full-resolution units, then upsample it to the input size.\n        for i in range(1):\n            pred_pyramid_list[i] = pred_pyramid_list[i] * (\n                left.size()[-1] / pred_pyramid_list[i].size()[-1])\n            pred_pyramid_list[i] = torch.squeeze(\n                F.interpolate(\n                    torch.unsqueeze(pred_pyramid_list[i], dim=1),\n                    size=left.size()[-2:],\n                    mode='bilinear',\n                    align_corners=False),\n                dim=1)\n\n        return pred_pyramid_list\n\n\nif __name__ == '__main__':\n    # Rough timing benchmark on a dummy input; requires a CUDA device.\n    import datetime\n    model = StereoNet(k=3, r=4).cuda()\n    # model.eval()\n    # torch.backends.cudnn.benchmark = True\n    dummy = torch.FloatTensor(1, 3, 540, 960).zero_().cuda()\n    # dummy = torch.FloatTensor(1, 3, 960, 512).zero_().cuda()\n    # Warm-up passes so that one-time CUDA setup is not timed.\n    for _ in range(100):\n        out = model(dummy, dummy)\n    # CUDA kernels launch asynchronously; synchronize before reading the clock.\n    torch.cuda.synchronize()\n    start = datetime.datetime.now()\n    for _ in range(100):\n        out = model(dummy, dummy)\n    torch.cuda.synchronize()\n    end = datetime.datetime.now()\n    print((end-start).total_seconds())\n"
  },
  {
    "path": "disparity/models/submodule.py",
    "content": "from __future__ import print_function\nimport torch\nimport torch.nn as nn\nimport torch.utils.data\nfrom torch.autograd import Variable\nimport torch.nn.functional as F\nimport math\nimport numpy as np\nfrom torch.nn import BatchNorm2d\n\ndef convbn(in_planes, out_planes, kernel_size, stride, pad, dilation, gn=False, groups=32):\n    return nn.Sequential(nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=dilation if dilation > 1 else pad, dilation = dilation, bias=False),\n                         nn.BatchNorm2d(out_planes) if not gn else nn.GroupNorm(groups, out_planes))\n\n\ndef convbn_3d(in_planes, out_planes, kernel_size, stride, pad, gn=False, groups=32):\n    return nn.Sequential(nn.Conv3d(in_planes, out_planes, kernel_size=kernel_size, padding=pad, stride=stride,bias=False),\n                         nn.BatchNorm3d(out_planes) if not gn else nn.GroupNorm(groups, out_planes))\n\nclass BasicBlock(nn.Module):\n    expansion = 1\n    def __init__(self, inplanes, planes, stride, downsample, pad, dilation, gn=False):\n        super(BasicBlock, self).__init__()\n\n        self.conv1 = nn.Sequential(convbn(inplanes, planes, 3, stride, pad, dilation, gn=gn),\n                                   nn.ReLU(inplace=True))\n\n        self.conv2 = convbn(planes, planes, 3, 1, pad, dilation, gn=gn)\n\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x):\n        out = self.conv1(x)\n        out = self.conv2(out)\n\n        if self.downsample is not None:\n            x = self.downsample(x)\n\n        out += x\n\n        return out\n\nclass disparityregression(nn.Module):\n    def __init__(self, maxdisp, cfg):\n        super(disparityregression, self).__init__()\n        self.disp = Variable(torch.Tensor(np.array(range(maxdisp))).cuda(), requires_grad=False)\n\n    def forward(self, x, depth):\n        out = torch.sum(x * depth[None, :, None, None],1)\n        return out\n\nclass 
hourglass(nn.Module):\n    def __init__(self, inplanes, gn=False):\n        super(hourglass, self).__init__()\n\n        self.conv1 = nn.Sequential(convbn_3d(inplanes, inplanes * 2, kernel_size=3, stride=2, pad=1, gn=gn),\n                                   nn.ReLU(inplace=True))\n\n        self.conv2 = convbn_3d(inplanes * 2, inplanes * 2, kernel_size=3, stride=1, pad=1, gn=gn)\n\n        self.conv3 = nn.Sequential(convbn_3d(inplanes * 2, inplanes * 2, kernel_size=3, stride=2, pad=1, gn=gn),\n                                   nn.ReLU(inplace=True))\n\n        self.conv4 = nn.Sequential(convbn_3d(inplanes * 2, inplanes * 2, kernel_size=3, stride=1, pad=1, gn=gn),\n                                   nn.ReLU(inplace=True))\n\n        self.conv5 = nn.Sequential(\n            nn.ConvTranspose3d(inplanes * 2, inplanes * 2, kernel_size=3, padding=1, output_padding=1, stride=2,\n                               bias=False),\n            nn.BatchNorm3d(inplanes * 2) if not gn else nn.GroupNorm(32, inplanes * 2))  # +conv2\n\n        self.conv6 = nn.Sequential(\n            nn.ConvTranspose3d(inplanes * 2, inplanes, kernel_size=3, padding=1, output_padding=1, stride=2,\n                               bias=False),\n            nn.BatchNorm3d(inplanes) if not gn else nn.GroupNorm(32, inplanes))  # +x\n\n    def forward(self, x, presqu, postsqu):\n\n        out = self.conv1(x)  # in:1/4 out:1/8\n        pre = self.conv2(out)  # in:1/8 out:1/8\n        if postsqu is not None:\n            pre = F.relu(pre + postsqu, inplace=True)\n        else:\n            pre = F.relu(pre, inplace=True)\n\n        out = self.conv3(pre)  # in:1/8 out:1/16\n        out = self.conv4(out)  # in:1/16 out:1/16\n\n        if presqu is not None:\n            post = F.relu(self.conv5(out) + presqu, inplace=True)  # in:1/16 out:1/8\n        else:\n            post = F.relu(self.conv5(out) + pre, inplace=True)\n\n        out = self.conv6(post)  # in:1/8 out:1/4\n\n        return out, pre, post\n\nclass 
hourglass2d(nn.Module):\n    def __init__(self, inplanes, gn=False):\n        super(hourglass2d, self).__init__()\n\n        self.conv1 = nn.Sequential(convbn(inplanes, inplanes * 2, kernel_size=3, stride=2, pad=1, dilation=1, gn=gn),\n                                   nn.ReLU(inplace=True))\n\n        self.conv2 = convbn(inplanes * 2, inplanes * 2, kernel_size=3, stride=1, pad=1, dilation=1, gn=gn)\n\n        self.conv3 = nn.Sequential(convbn(inplanes * 2, inplanes * 2, kernel_size=3, stride=2, pad=1, dilation=1, gn=gn),\n                                   nn.ReLU(inplace=True))\n\n        self.conv4 = nn.Sequential(convbn(inplanes * 2, inplanes * 2, kernel_size=3, stride=1, pad=1, dilation=1, gn=gn),\n                                   nn.ReLU(inplace=True))\n\n        self.conv5 = nn.Sequential(\n            nn.ConvTranspose2d(inplanes * 2, inplanes * 2, kernel_size=3, padding=1, output_padding=1, stride=2,\n                               bias=False),\n            nn.BatchNorm2d(inplanes * 2) if not gn else nn.GroupNorm(32, inplanes * 2))  # +conv2\n\n        self.conv6 = nn.Sequential(\n            nn.ConvTranspose2d(inplanes * 2, inplanes, kernel_size=3, padding=1, output_padding=1, stride=2,\n                               bias=False),\n            nn.BatchNorm2d(inplanes) if not gn else nn.GroupNorm(32, inplanes))  # +x\n\n    def forward(self, x, presqu, postsqu):\n\n        out = self.conv1(x)  # in:1/4 out:1/8\n        pre = self.conv2(out)  # in:1/8 out:1/8\n        if postsqu is not None:\n            pre = F.relu(pre + postsqu, inplace=True)\n        else:\n            pre = F.relu(pre, inplace=True)\n\n        out = self.conv3(pre)  # in:1/8 out:1/16\n        out = self.conv4(out)  # in:1/16 out:1/16\n\n        if presqu is not None:\n            post = F.relu(self.conv5(out) + presqu, inplace=True)  # in:1/16 out:1/8\n        else:\n            post = F.relu(self.conv5(out) + pre, inplace=True)\n\n        out = self.conv6(post)  # in:1/8 out:1/4\n\n 
       return out, pre, post\n\nclass feature_extraction(nn.Module):\n    def __init__(self, cfg):\n        super(feature_extraction, self).__init__()\n\n        self.cfg = cfg\n        self.RPN3D_ENABLE = self.cfg.RPN3D_ENABLE\n        self.cat_img_feature = getattr(self.cfg, 'cat_img_feature', False)\n        self.rpn_onemore_conv = getattr(self.cfg, 'RPN_ONEMORE_CONV', False)\n        self.rpn_onemore_dim = getattr(self.cfg, 'RPN_ONEMORE_DIM', 256)\n        self.img_feature_relu = getattr(self.cfg, 'img_feature_relu', True)\n        self.branch = getattr(self.cfg, 'branch', True)\n\n        self.backbone = getattr(self.cfg, 'backbone', 'reslike-det-small')\n        if self.backbone == 'reslike-det':\n            first_dim = 64\n            dims = [64, 128, 192, 256]\n            nr_convs = [3, 6, 12, 4]\n            branch_dim = 32\n            lastconv_dim = [256, 32]\n        elif self.backbone == 'reslike-det-small':\n            first_dim = 64\n            dims = [32, 64, 128, 192]\n            nr_convs = [3, 6, 12, 4]\n            branch_dim = 32\n            lastconv_dim = [256, 32]\n        elif self.backbone == 'reslike-det-small-fixfirst':\n            first_dim = 16\n            dims = [32, 64, 128, 192]\n            nr_convs = [3, 6, 12, 4]\n            branch_dim = 32\n            lastconv_dim = [256, 32]\n        elif self.backbone == 'reslike50-det-small-fixfirst':\n            first_dim = 16\n            dims = [32, 64, 128, 256]\n            nr_convs = [3, 4, 6, 3]\n            branch_dim = 32\n            lastconv_dim = [256, 32]\n        elif self.backbone == 'reslike50-det-tiny':\n            first_dim = 8\n            dims = [16, 32, 64, 128]\n            nr_convs = [3, 4, 6, 3]\n            branch_dim = 32\n            lastconv_dim = [128, 32]\n        else:\n            raise ValueError('Invalid backbone {}.'.format(self.backbone))\n\n        self.inplanes = first_dim\n\n        self.firstconv = nn.Sequential(convbn(3, first_dim, 3, 2, 1, 
1, gn=cfg.GN if first_dim >= 32 else False),\n                                       nn.ReLU(inplace=True),\n                                       convbn(first_dim, first_dim, 3, 1, 1, 1, gn=cfg.GN if first_dim >= 32 else False),\n                                       nn.ReLU(inplace=True),\n                                       convbn(first_dim, first_dim, 3, 1, 1, 1, gn=cfg.GN if first_dim >= 32 else False),\n                                       nn.ReLU(inplace=True))\n\n        self.layer1 = self._make_layer(BasicBlock, dims[0], nr_convs[0], 1,1,1, gn=cfg.GN if dims[0] >= 32 else False)\n        self.layer2 = self._make_layer(BasicBlock, dims[1], nr_convs[1], 2,1,1, gn=cfg.GN) \n        self.layer3 = self._make_layer(BasicBlock, dims[2], nr_convs[2], 1,1,1, gn=cfg.GN)\n        self.layer4 = self._make_layer(BasicBlock, dims[3], nr_convs[3], 1,1,2, gn=cfg.GN)\n\n        if self.branch:\n            self.branch1 = nn.Sequential(nn.AvgPool2d((64, 64), stride=(64,64)),\n                                         convbn(dims[3], branch_dim, 1, 1, 0, 1, gn=cfg.GN, groups=min(32, branch_dim)),\n                                         nn.ReLU(inplace=True))\n\n            self.branch2 = nn.Sequential(nn.AvgPool2d((32, 32), stride=(32,32)),\n                                         convbn(dims[3], branch_dim, 1, 1, 0, 1, gn=cfg.GN, groups=min(32, branch_dim)),\n                                         nn.ReLU(inplace=True))\n\n            self.branch3 = nn.Sequential(nn.AvgPool2d((16, 16), stride=(16,16)),\n                                         convbn(dims[3], branch_dim, 1, 1, 0, 1, gn=cfg.GN, groups=min(32, branch_dim)),\n                                         nn.ReLU(inplace=True))\n\n            self.branch4 = nn.Sequential(nn.AvgPool2d((8, 8), stride=(8,8)),\n                                         convbn(dims[3], branch_dim, 1, 1, 0, 1, gn=cfg.GN, groups=min(32, branch_dim)),\n                                         nn.ReLU(inplace=True))\n\n        if 
self.branch:\n            concat_dim = branch_dim * 4 + dims[1] + dims[3] + dims[2]\n        else:\n            concat_dim = dims[1] + dims[3] + dims[2]\n\n        self.PlaneSweepVolume = getattr(cfg, 'PlaneSweepVolume', True)\n        if self.PlaneSweepVolume:\n            self.lastconv = nn.Sequential(convbn(concat_dim, lastconv_dim[0], 3, 1, 1, 1, gn=cfg.GN),\n                                          nn.ReLU(inplace=True),\n                                          nn.Conv2d(lastconv_dim[0], lastconv_dim[1], kernel_size=1, padding=0, stride = 1, bias=False))\n\n        if self.cfg.RPN3D_ENABLE and self.cat_img_feature:\n            if self.rpn_onemore_conv:\n                rpnconvs = [convbn(concat_dim, self.rpn_onemore_dim, 3, 1, 1, 1, gn=cfg.GN),\n                                          nn.ReLU(inplace=True),\n                                          convbn(self.rpn_onemore_dim, self.cfg.RPN_CONVDIM, 3, 1, 1, 1, gn=cfg.GN, groups=(32 if self.cfg.RPN_CONVDIM % 32 == 0 else 16))]\n            else:\n                rpnconvs = [convbn(concat_dim, self.cfg.RPN_CONVDIM, 3, 1, 1, 1, gn=cfg.GN, groups=(32 if self.cfg.RPN_CONVDIM % 32 == 0 else 16))]\n            if self.img_feature_relu:\n                rpnconvs.append( nn.ReLU(inplace=True) )\n            self.rpnconv = nn.Sequential( *rpnconvs )\n\n    def _make_layer(self, block, planes, blocks, stride, pad, dilation, gn=False):\n        downsample = None\n        if stride != 1 or self.inplanes != planes * block.expansion:\n           downsample = nn.Sequential(\n                nn.Conv2d(self.inplanes, planes * block.expansion,\n                          kernel_size=1, stride=stride, bias=False),\n                nn.BatchNorm2d(planes * block.expansion) if not gn else nn.GroupNorm(32, planes * block.expansion))\n\n        layers = []\n        layers.append(block(self.inplanes, planes, stride, downsample, pad, dilation, gn=gn))\n        self.inplanes = planes * block.expansion\n        for i in range(1, 
blocks):\n            layers.append(block(self.inplanes, planes,1,None,pad,dilation, gn=gn))\n\n        return nn.Sequential(*layers)\n\n    def forward(self, x):\n        output      = self.firstconv(x)         ; #print('conv1', output.shape)           # (1, 32, 192, 624)\n        output      = self.layer1(output)       ; #print('conv2', output.shape)           # (1, 32, 192, 624)\n        output_raw  = self.layer2(output)       ; #print('conv3', output_raw.shape)       # (1, 64, 96, 312)\n        output_mid  = self.layer3(output_raw)   ; #print('conv4', output.shape)           # (1, 128, 96, 312)\n        output_skip = self.layer4(output_mid)   ; #print('conv5', output_skip.shape)      # (1, 128, 96, 312)\n\n        if self.branch:\n            output_branch1 = self.branch1(output_skip) ; #print('b1', output_branch1.shape) # (1, 32, 1, 4) # avgpool 64\n            output_branch1 = F.interpolate(output_branch1, (output_skip.size()[2],output_skip.size()[3]),mode='bilinear', align_corners=self.cfg.align_corners) # (1, 32, 96, 312)\n\n            output_branch2 = self.branch2(output_skip) ; #print('b2', output_branch2.shape)# (1, 32, 3, 9)\n            output_branch2 = F.interpolate(output_branch2, (output_skip.size()[2],output_skip.size()[3]),mode='bilinear', align_corners=self.cfg.align_corners)\n\n            output_branch3 = self.branch3(output_skip) ; #print('b3', output_branch3.shape)# (1, 32, 6, 19)\n            output_branch3 = F.interpolate(output_branch3, (output_skip.size()[2],output_skip.size()[3]),mode='bilinear', align_corners=self.cfg.align_corners)\n\n            output_branch4 = self.branch4(output_skip) ; #print('b4', output_branch4.shape)# (1, 32, 12, 39)\n            output_branch4 = F.interpolate(output_branch4, (output_skip.size()[2],output_skip.size()[3]),mode='bilinear', align_corners=self.cfg.align_corners)\n\n        if self.branch:\n            concat_feature = torch.cat((output_raw, output_mid, output_skip, output_branch4, output_branch3, 
output_branch2, output_branch1), 1) ; #print('cat', concat_feature.shape)\n        else:\n            concat_feature = torch.cat((output_raw, output_mid, output_skip), 1)\n        \n        if self.RPN3D_ENABLE and self.cat_img_feature:\n            rpn_feature = self.rpnconv(concat_feature)\n        else:\n            rpn_feature = None\n\n        if self.PlaneSweepVolume:\n            output_feature = self.lastconv(concat_feature) ; #print('last', output_feature.shape)\n        else:\n            output_feature = None\n\n        return output_feature, rpn_feature\n"
  },
  {
    "path": "disparity/utils/__init__.py",
    "content": ""
  },
  {
    "path": "disparity/utils/logger.py",
    "content": "import logging\nimport os\n\n\ndef setup_logger(filepath):\n    file_formatter = logging.Formatter(\n        \"[%(asctime)s %(filename)s:%(lineno)s] %(levelname)-8s %(message)s\",\n        datefmt='%Y-%m-%d %H:%M:%S',\n    )\n    logger = logging.getLogger('example')\n    print(logger)\n    handler = logging.StreamHandler()\n    handler.setFormatter(file_formatter)\n    logger.addHandler(handler)\n\n    file_handle_name = \"file\"\n    if file_handle_name in [h.name for h in logger.handlers]:\n        print(logger.handlers)\n        #return\n    if os.path.dirname(filepath) is not '':\n        if not os.path.isdir(os.path.dirname(filepath)):\n            os.makedirs(os.path.dirname(filepath))\n    file_handle = logging.FileHandler(filename=filepath, mode=\"a\")\n    file_handle.set_name(file_handle_name)\n    file_handle.setFormatter(file_formatter)\n    logger.addHandler(file_handle)\n    logger.setLevel(logging.DEBUG)\n    return logger"
  },
  {
    "path": "disparity/utils/preprocess.py",
    "content": "import torch\nimport torchvision.transforms as transforms\nimport torchvision\nimport random\nimport numpy as np\n\n__imagenet_stats = {'mean': [0.485, 0.456, 0.406],\n                   'std': [0.229, 0.224, 0.225]}\n\n#__imagenet_stats = {'mean': [0.5, 0.5, 0.5],\n#                   'std': [0.5, 0.5, 0.5]}\n\n__imagenet_pca = {\n    'eigval': torch.Tensor([0.2175, 0.0188, 0.0045]),\n    'eigvec': torch.Tensor([\n        [-0.5675,  0.7192,  0.4009],\n        [-0.5808, -0.0045, -0.8140],\n        [-0.5836, -0.6948,  0.4203],\n    ])\n}\n\n\n# def scale_crop(input_size, scale_size=None, normalize=__imagenet_stats):\n#     t_list = [\n#         transforms.ToTensor(),\n#         transforms.Normalize(**normalize),\n#     ]\n#     #if scale_size != input_size:\n#     #t_list = [transforms.Scale((960,540))] + t_list\n\n#     return transforms.Compose(t_list)\n\ndef scale_crop(input_size, scale_size=None, normalize=__imagenet_stats):\n    t_list = [\n        transforms.ToTensor(),\n        transforms.Normalize(**normalize),\n    ]\n    return transforms.Compose(t_list)\n\ndef scale_random_crop(input_size, scale_size=None, normalize=__imagenet_stats):\n    t_list = [\n        transforms.RandomCrop(input_size),\n        transforms.ToTensor(),\n        transforms.Normalize(**normalize),\n    ]\n    if scale_size != input_size:\n        t_list = [transforms.Scale(scale_size)] + t_list\n\n    transforms.Compose(t_list)\n\n\ndef pad_random_crop(input_size, scale_size=None, normalize=__imagenet_stats):\n    padding = int((scale_size - input_size) / 2)\n    return transforms.Compose([\n        transforms.RandomCrop(input_size, padding=padding),\n        transforms.RandomHorizontalFlip(),\n        transforms.ToTensor(),\n        transforms.Normalize(**normalize),\n    ])\n\n\ndef inception_preproccess(input_size, normalize=__imagenet_stats):\n    return transforms.Compose([\n        transforms.RandomSizedCrop(input_size),\n        
transforms.RandomHorizontalFlip(),\n        transforms.ToTensor(),\n        transforms.Normalize(**normalize)\n    ])\ndef inception_color_preproccess(input_size, normalize=__imagenet_stats):\n\n    return transforms.Compose([\n        transforms.ToTensor(),\n        ColorJitter(\n            brightness=0.4,\n            contrast=0.4,\n            saturation=0.4,\n        ),\n        Lighting(0.1, __imagenet_pca['eigval'], __imagenet_pca['eigvec']),\n        transforms.Normalize(**normalize)\n    ])\n    \n    # bright = np.random.uniform(0.8, 1.2)\n    # contrast = np.random.uniform(0.8, 1.2)\n    # return transforms.Compose([\n    #     #transforms.RandomSizedCrop(input_size),\n    #     #transforms.RandomHorizontalFlip(),\n    #     transforms.ToTensor(),\n    #     ColorJitter(\n    #         brightness=bright,\n    #         contrast=contrast,\n    #         saturation=0,\n    #     ),\n    # ])\n\ndef get_transform(name='imagenet', input_size=None,\n                  scale_size=None, normalize=None, augment=True):\n    normalize = __imagenet_stats\n    # normalize={'mean': [0., 0., 0.], 'std': [1, 1, 1]}\n    # normalize={'mean': [1., 1., 1.], 'std': [1, 1, 1]}\n    input_size = 256\n    if augment:\n            return inception_color_preproccess(input_size, normalize=normalize)\n    else:\n            return scale_crop(input_size=input_size,\n                              scale_size=scale_size, normalize=normalize)\n\n\n\n\nclass Lighting(object):\n    \"\"\"Lighting noise(AlexNet - style PCA - based noise)\"\"\"\n\n    def __init__(self, alphastd, eigval, eigvec):\n        self.alphastd = alphastd\n        self.eigval = eigval\n        self.eigvec = eigvec\n\n    def __call__(self, img):\n        if self.alphastd == 0:\n            return img\n\n        alpha = img.new().resize_(3).normal_(0, self.alphastd)\n        rgb = self.eigvec.type_as(img).clone()\\\n            .mul(alpha.view(1, 3).expand(3, 3))\\\n            .mul(self.eigval.view(1, 3).expand(3, 
3))\\\n            .sum(1).squeeze()\n\n        return img.add(rgb.view(3, 1, 1).expand_as(img))\n\n\nclass Grayscale(object):\n\n    def __call__(self, img):\n        gs = img.clone()\n        gs[0].mul_(0.299).add_(0.587, gs[1]).add_(0.114, gs[2])\n        gs[1].copy_(gs[0])\n        gs[2].copy_(gs[0])\n        return gs\n\n\nclass Saturation(object):\n\n    def __init__(self, var):\n        self.var = var\n\n    def __call__(self, img):\n        gs = Grayscale()(img)\n        alpha = random.uniform(0, self.var)\n        return img.lerp(gs, alpha)\n\n\nclass Brightness(object):\n\n    def __init__(self, var):\n        self.var = var\n\n    def __call__(self, img):\n        img = img*255\n        gs = img.new().resize_as_(img).zero_()\n        alpha = random.uniform(0, self.var)\n        return img.lerp(gs, alpha)\n\n\nclass Contrast(object):\n\n    def __init__(self, var):\n        self.var = var\n\n    def __call__(self, img):\n        # img = img*255\n        gs = Grayscale()(img)\n        gs.fill_(gs.mean())\n        alpha = random.uniform(0, self.var)\n        return img.lerp(gs, alpha)\n\n\nclass RandomOrder(object):\n    \"\"\" Composes several transforms together in random order.\n    \"\"\"\n\n    def __init__(self, transforms):\n        self.transforms = transforms\n\n    def __call__(self, img):\n        if self.transforms is None:\n            return img\n        order = torch.randperm(len(self.transforms))\n        for i in order:\n            img = self.transforms[i](img)\n        return img\n\n\nclass ColorJitter(RandomOrder):\n\n    def __init__(self, brightness=0.4, contrast=0.4, saturation=0.4):\n        self.transforms = []\n        if brightness != 0:\n            self.transforms.append(Brightness(brightness))\n        if contrast != 0:\n            self.transforms.append(Contrast(contrast))\n        if saturation != 0:\n            self.transforms.append(Saturation(saturation))\n\n\n\n\n\n\ndef get_transform_unsym(left_img, right_img, 
size=[512, 960]):\n    \n    # photometric unsymmetric-augmentation\n    random_brightness = np.random.uniform(0.95, 1.05,2)\n    # random_gamma = np.random.uniform(0.8, 1.2,2)\n    random_contrast = np.random.uniform(0.95, 1.05,2)\n    left_img = torchvision.transforms.functional.adjust_brightness(left_img, random_brightness[0])\n    # left_img = torchvision.transforms.functional.adjust_gamma(left_img, random_gamma[0])\n    left_img = torchvision.transforms.functional.adjust_contrast(left_img, random_contrast[0])\n    right_img = torchvision.transforms.functional.adjust_brightness(right_img, random_brightness[1])\n    # right_img = torchvision.transforms.functional.adjust_gamma(right_img, random_gamma[1])\n    right_img = torchvision.transforms.functional.adjust_contrast(right_img, random_contrast[1])\n    right_img = np.asarray(right_img)\n    left_img = np.asarray(left_img)\n\n\n    return left_img, right_img"
  },
  {
    "path": "disparity/utils/readpfm.py",
    "content": "import re\nimport numpy as np\nimport sys\n \n\ndef readPFM(file):\n    file = open(file, 'rb')\n\n    color = None\n    width = None\n    height = None\n    scale = None\n    endian = None\n\n    header = file.readline().rstrip()\n    if header == 'PF':\n        color = True\n    elif header == 'Pf':\n        color = False\n    else:\n        raise Exception('Not a PFM file.')\n\n    dim_match = re.match(r'^(\\d+)\\s(\\d+)\\s$', file.readline())\n    if dim_match:\n        width, height = map(int, dim_match.groups())\n    else:\n        raise Exception('Malformed PFM header.')\n\n    scale = float(file.readline().rstrip())\n    if scale < 0: # little-endian\n        endian = '<'\n        scale = -scale\n    else:\n        endian = '>' # big-endian\n\n    data = np.fromfile(file, endian + 'f')\n    shape = (height, width, 3) if color else (height, width)\n\n    data = np.reshape(data, shape)\n    data = np.flipud(data)\n    return data, scale\n\n"
  },
  {
    "path": "disparity/utils/tensorboardx.py",
    "content": "from tensorboardX import SummaryWriter\nimport numpy as np\nwriter = SummaryWriter(log_dir='/disk1/hyj/DFAStereo/ver2.0/runs')\nfor epoch in range(100):\n    writer.add_scalar('/scalar/test',np.random.rand(),epoch)\n    writer.add_scalars('/scalar/scalars_test',{'stage0 test':epoch*np.sin(epoch),'stage0 train':epoch*np.cos(epoch),\n                                               'stage1 test': epoch * np.sin(epoch)+20,\n                                               'stage1 train': epoch * np.cos(epoch)+20},epoch)\nwriter.close()"
  },
  {
    "path": "disparity/utils/utils.py",
    "content": "# ------------------------------------------------------------------------------\n# Copyright (c) NKU\n# Licensed under the MIT License.\n# Written by Xuanyi Li (xuanyili.edu@gmail.com)\n# ------------------------------------------------------------------------------\nimport os\nimport torch\nimport torch.nn.functional as F\nimport cv2 as cv\nimport numpy as np\ndef GERF_loss(GT, pred, args):\n    # mask = (GT < args.maxdisp) & (GT >= 0)\n    mask = GT > 0 \n    mask.detach_()\n    # print(mask.size(), GT.size(), pred.size())\n    count = len(torch.nonzero(mask))\n    # print(count)\n    if count == 0:\n        count = 1\n    return torch.sum(torch.sqrt(torch.pow(GT[mask] - pred[mask], 2) + 4) /2 - 1) / count\n\ndef smooth_L1_loss(GT, pred, args):\n\n    mask = GT < args.maxdisp\n    mask.detach_()\n    # loss = F.smooth_l1_loss(pred[mask], GT[mask], size_average=True)\n    loss = (pred[mask] - GT[mask]).abs().mean()\n    return loss\n\n\n\nif __name__ == '__main__':\n\n    # import matplotlib.pyplot as plt\n    # image = cv.imread('/media/lxy/sdd1/ActiveStereoNet/StereoNet_pytorch/results/forvideo/iter-122.jpg')\n\n    im_gray = cv.imread('/media/lxy/sdd1/ActiveStereoNet/StereoNet_pytorch/results/forvideo/iter-133.jpg', cv.IMREAD_GRAYSCALE)\n    # print(im_gray.shape)\n    im_color = cv.applyColorMap(im_gray*2, cv.COLORMAP_JET)\n    # cv.imshow('test', im_color)\n    # cv.waitKey(0)\n    cv.imwrite('test.png',im_color)\n    # print(image.shape)\n    # plt.figure('Image')\n    # sc =plt.imshow(image)\n    # sc.set_cmap('hsv')\n    # plt.colorbar()\n    # plt.axis('off')\n    # plt.show()\n    # print('end')\n    # image[:,:,0].save('/media/lxy/sdd1/ActiveStereoNet/StereoNet_pytorch/results/pretrained_StereoNet_single/it1er-151.jpg')\n"
  },
  {
    "path": "preprocessing/generate_disp.py",
    "content": "import argparse\nimport os\n\nimport numpy as np\nimport scipy.misc as ssc\n\nimport kitti_util\nimport imageio\n\nDEPTH_AS_DISP = True\n\ndef generate_dispariy_from_velo(pc_velo, height, width, calib, depth_as_disp=False, baseline=0.54):\n    pts_2d = calib.project_velo_to_image(pc_velo)\n    fov_inds = (pts_2d[:, 0] < width - 1) & (pts_2d[:, 0] >= 0) & \\\n               (pts_2d[:, 1] < height - 1) & (pts_2d[:, 1] >= 0)\n    fov_inds = fov_inds & (pc_velo[:, 0] > 2)\n    imgfov_pc_velo = pc_velo[fov_inds, :]\n    imgfov_pts_2d = pts_2d[fov_inds, :]\n    imgfov_pc_rect = calib.project_velo_to_rect(imgfov_pc_velo)\n    depth_map = np.zeros((height, width)) - 1\n    imgfov_pts_2d = np.round(imgfov_pts_2d).astype(int)\n    for i in range(imgfov_pts_2d.shape[0]):\n        depth = imgfov_pc_rect[i, 2]\n        depth_map[int(imgfov_pts_2d[i, 1]), int(imgfov_pts_2d[i, 0])] = depth\n    if depth_as_disp:\n        return depth_map\n    disp_map = (calib.f_u * baseline) / depth_map\n    return disp_map\n\nif __name__ == '__main__':\n    parser = argparse.ArgumentParser(description='Generate Disparity')\n    parser.add_argument('--data_path', type=str, default='~/Kitti/object/training/')\n    parser.add_argument('--split_file', type=str, default='~/Kitti/object/train.txt')\n    parser.add_argument('--right_calib', action='store_true', default=False)\n    args = parser.parse_args()\n\n    assert os.path.isdir(args.data_path)\n    lidar_dir = args.data_path + '/velodyne/'\n    calib_dir = args.data_path + '/calib/'\n    image_dir = args.data_path + '/image_2/'\n    if DEPTH_AS_DISP:\n        disparity_dir = args.data_path + '/depth/'\n    else:\n        disparity_dir = args.data_path + '/disparity/'\n\n    assert os.path.isdir(lidar_dir)\n    assert os.path.isdir(calib_dir)\n    assert os.path.isdir(image_dir)\n\n    if not os.path.isdir(disparity_dir):\n        os.makedirs(disparity_dir)\n\n    lidar_files = [x for x in os.listdir(lidar_dir) if x[-3:] == 
'bin']\n    lidar_files = sorted(lidar_files)\n\n    assert os.path.isfile(args.split_file)\n    with open(args.split_file, 'r') as f:\n        file_names = [x.strip() for x in f.readlines()]\n\n    for fn in lidar_files:\n        predix = fn[:-4]\n        if predix not in file_names:\n            continue\n        calib_file = '{}/{}.txt'.format(calib_dir, predix)\n        calib = kitti_util.Calibration(calib_file, right_calib=args.right_calib)\n        # load point cloud\n        lidar = np.fromfile(lidar_dir + '/' + fn, dtype=np.float32).reshape((-1, 4))[:, :3]\n        image_file = '{}/{}.png'.format(image_dir, predix)\n        image = imageio.imread(image_file)\n        height, width = image.shape[:2]\n        print('calib baseline {}'.format(calib.baseline))\n        disp = generate_dispariy_from_velo(lidar, height, width, calib, depth_as_disp=DEPTH_AS_DISP, baseline=calib.baseline)\n        np.save(disparity_dir + '/' + predix + ('_r' if args.right_calib else ''), disp)\n        print('Finish Disparity {}'.format(predix + ('_r' if args.right_calib else '')))\n"
  },
  {
    "path": "preprocessing/generate_lidar.py",
    "content": "import argparse\nimport os\n\nimport numpy as np\nimport scipy.misc as ssc\n\nimport kitti_util\nimport imageio\n\ndef project_disp_to_depth(calib, disp, max_high, baseline=0.54):\n    disp[disp < 0] = 0\n    mask = disp > 0\n    depth = calib.f_u * baseline / (disp + 1. - mask)\n    rows, cols = depth.shape\n    c, r = np.meshgrid(np.arange(cols), np.arange(rows))\n    points = np.stack([c, r, depth])\n    points = points.reshape((3, -1))\n    points = points.T\n    points = points[mask.reshape(-1)]\n    cloud = calib.project_image_to_velo(points)\n    valid = (cloud[:, 0] >= 0) & (cloud[:, 2] < max_high)\n    return cloud[valid]\n\n\nif __name__ == '__main__':\n    parser = argparse.ArgumentParser(description='Generate Libar')\n    parser.add_argument('--calib_dir', type=str,\n                        default='~/Kitti/object/training/calib')\n    parser.add_argument('--disparity_dir', type=str,\n                        default='~/Kitti/object/training/predicted_disparity')\n    parser.add_argument('--save_dir', type=str,\n                        default='~/Kitti/object/training/predicted_velodyne')\n    parser.add_argument('--max_high', type=int, default=1)\n    args = parser.parse_args()\n\n    assert os.path.isdir(args.disparity_dir)\n    assert os.path.isdir(args.calib_dir)\n\n    if not os.path.isdir(args.save_dir):\n        os.makedirs(args.save_dir)\n\n    disps = [x for x in os.listdir(args.disparity_dir) if x[-3:] == 'png']\n    disps = sorted(disps)\n\n    for fn in disps:\n        predix = fn[:-4]\n        calib_file = '{}/{}.txt'.format(args.calib_dir, predix)\n        calib = kitti_util.Calibration(calib_file)\n        disp_map = imageio.imread(args.disparity_dir + '/' + fn) / 256.\n        lidar = project_disp_to_depth(calib, disp_map, args.max_high)\n        # pad 1 in the indensity dimension\n        lidar = np.concatenate([lidar, np.ones((lidar.shape[0], 1))], 1)\n        lidar = lidar.astype(np.float32)\n        
lidar.tofile('{}/{}.bin'.format(args.save_dir, predix))\n        print('Finish Depth {}'.format(predix))\n"
  },
  {
    "path": "preprocessing/kitti_util.py",
    "content": "\"\"\" Helper methods for loading and parsing KITTI data.\n\nAuthor: Charles R. Qi\nDate: September 2017\n\"\"\"\nfrom __future__ import print_function\n\nimport numpy as np\n\n\nclass Calibration(object):\n    ''' Calibration matrices and utils\n        3d XYZ in <label>.txt are in rect camera coord.\n        2d box xy are in image2 coord\n        Points in <lidar>.bin are in Velodyne coord.\n\n        y_image2 = P^2_rect * x_rect\n        y_image2 = P^2_rect * R0_rect * Tr_velo_to_cam * x_velo\n        x_ref = Tr_velo_to_cam * x_velo\n        x_rect = R0_rect * x_ref\n\n        P^2_rect = [f^2_u,  0,      c^2_u,  -f^2_u b^2_x;\n                    0,      f^2_v,  c^2_v,  -f^2_v b^2_y;\n                    0,      0,      1,      0]\n                 = K * [1|t]\n\n        image2 coord:\n         ----> x-axis (u)\n        |\n        |\n        v y-axis (v)\n\n        velodyne coord:\n        front x, left y, up z\n\n        rect/ref camera coord:\n        right x, down y, front z\n\n        Ref (KITTI paper): http://www.cvlibs.net/publications/Geiger2013IJRR.pdf\n\n        TODO(rqi): do matrix multiplication only once for each projection.\n    '''\n\n    def __init__(self, calib_filepath, right_calib=False):\n\n        calibs = self.read_calib_file(calib_filepath)\n        # Projection matrix from rect camera coord to image2 coord\n        P2 = calibs['P2'].reshape(3, 4)\n        P3 = calibs['P3'].reshape(3, 4)\n        if not right_calib:\n            self.P = P2\n        else:\n            self.P = P3\n        self.baseline = np.fabs(P2[0,3]-P3[0,3])/P2[0,0] # ~ 0.54\n        # Rigid transform from Velodyne coord to reference camera coord\n        self.V2C = calibs['Tr_velo_to_cam']\n        self.V2C = np.reshape(self.V2C, [3, 4])\n        self.C2V = inverse_rigid_trans(self.V2C)\n        # Rotation from reference camera coord to rect camera coord\n        self.R0 = calibs['R0_rect']\n        self.R0 = np.reshape(self.R0, [3, 3])\n\n        # 
Camera intrinsics and extrinsics\n        self.c_u = self.P[0, 2]\n        self.c_v = self.P[1, 2]\n        self.f_u = self.P[0, 0]\n        self.f_v = self.P[1, 1]\n        self.b_x = self.P[0, 3] / (-self.f_u)  # relative\n        self.b_y = self.P[1, 3] / (-self.f_v)\n\n    def read_calib_file(self, filepath):\n        ''' Read in a calibration file and parse into a dictionary.\n        Ref: https://github.com/utiasSTARS/pykitti/blob/master/pykitti/utils.py\n        '''\n        data = {}\n        with open(filepath, 'r') as f:\n            for line in f.readlines():\n                line = line.rstrip()\n                if len(line) == 0: continue\n                key, value = line.split(':', 1)\n                # The only non-float values in these files are dates, which\n                # we don't care about anyway\n                try:\n                    data[key] = np.array([float(x) for x in value.split()])\n                except ValueError:\n                    pass\n\n        return data\n\n    def cart2hom(self, pts_3d):\n        ''' Input: nx3 points in Cartesian\n            Output: nx4 points in Homogeneous by appending 1\n        '''\n        n = pts_3d.shape[0]\n        pts_3d_hom = np.hstack((pts_3d, np.ones((n, 1))))\n        return pts_3d_hom\n\n    # =========================== \n    # ------- 3d to 3d ---------- \n    # =========================== \n    def project_velo_to_ref(self, pts_3d_velo):\n        pts_3d_velo = self.cart2hom(pts_3d_velo)  # nx4\n        return np.dot(pts_3d_velo, np.transpose(self.V2C))\n\n    def project_ref_to_velo(self, pts_3d_ref):\n        pts_3d_ref = self.cart2hom(pts_3d_ref)  # nx4\n        return np.dot(pts_3d_ref, np.transpose(self.C2V))\n\n    def project_rect_to_ref(self, pts_3d_rect):\n        ''' Input and Output are nx3 points '''\n        return np.transpose(np.dot(np.linalg.inv(self.R0), np.transpose(pts_3d_rect)))\n\n    def project_ref_to_rect(self, pts_3d_ref):\n        ''' Input and Output are nx3 
points '''\n        return np.transpose(np.dot(self.R0, np.transpose(pts_3d_ref)))\n\n    def project_rect_to_velo(self, pts_3d_rect):\n        ''' Input: nx3 points in rect camera coord.\n            Output: nx3 points in velodyne coord.\n        '''\n        pts_3d_ref = self.project_rect_to_ref(pts_3d_rect)\n        return self.project_ref_to_velo(pts_3d_ref)\n\n    def project_velo_to_rect(self, pts_3d_velo):\n        pts_3d_ref = self.project_velo_to_ref(pts_3d_velo)\n        return self.project_ref_to_rect(pts_3d_ref)\n\n    # =========================== \n    # ------- 3d to 2d ---------- \n    # =========================== \n    def project_rect_to_image(self, pts_3d_rect):\n        ''' Input: nx3 points in rect camera coord.\n            Output: nx2 points in image2 coord.\n        '''\n        pts_3d_rect = self.cart2hom(pts_3d_rect)\n        pts_2d = np.dot(pts_3d_rect, np.transpose(self.P))  # nx3\n        pts_2d[:, 0] /= pts_2d[:, 2]\n        pts_2d[:, 1] /= pts_2d[:, 2]\n        return pts_2d[:, 0:2]\n\n    def project_velo_to_image(self, pts_3d_velo):\n        ''' Input: nx3 points in velodyne coord.\n            Output: nx2 points in image2 coord.\n        '''\n        pts_3d_rect = self.project_velo_to_rect(pts_3d_velo)\n        return self.project_rect_to_image(pts_3d_rect)\n\n    # =========================== \n    # ------- 2d to 3d ---------- \n    # =========================== \n    def project_image_to_rect(self, uv_depth):\n        ''' Input: nx3 first two channels are uv, 3rd channel\n                   is depth in rect camera coord.\n            Output: nx3 points in rect camera coord.\n        '''\n        n = uv_depth.shape[0]\n        x = ((uv_depth[:, 0] - self.c_u) * uv_depth[:, 2]) / self.f_u + self.b_x\n        y = ((uv_depth[:, 1] - self.c_v) * uv_depth[:, 2]) / self.f_v + self.b_y\n        pts_3d_rect = np.zeros((n, 3))\n        pts_3d_rect[:, 0] = x\n        pts_3d_rect[:, 1] = y\n        pts_3d_rect[:, 2] = uv_depth[:, 2]\n      
  return pts_3d_rect\n\n    def project_image_to_velo(self, uv_depth):\n        pts_3d_rect = self.project_image_to_rect(uv_depth)\n        return self.project_rect_to_velo(pts_3d_rect)\n\n\ndef inverse_rigid_trans(Tr):\n    ''' Inverse a rigid body transform matrix (3x4 as [R|t])\n        [R'|-R't; 0|1]\n    '''\n    inv_Tr = np.zeros_like(Tr)  # 3x4\n    inv_Tr[0:3, 0:3] = np.transpose(Tr[0:3, 0:3])\n    inv_Tr[0:3, 3] = np.dot(-np.transpose(Tr[0:3, 0:3]), Tr[0:3, 3])\n    return inv_Tr\n"
  },
  {
    "path": "requirement.txt",
    "content": "torch==1.3.0\ntorchvision==0.4.1\n\ntensorboardX\nyacs\nopencv-python\nfire    \n\nscipy\nscikit-image\nnumba\n"
  },
  {
    "path": "setup.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#!/usr/bin/env python\n\nimport glob\nimport os\n\nimport torch\nfrom setuptools import find_packages\nfrom setuptools import setup\nfrom torch.utils.cpp_extension import CUDA_HOME\nfrom torch.utils.cpp_extension import CppExtension\nfrom torch.utils.cpp_extension import CUDAExtension\n\nrequirements = [\"torch\", \"torchvision\"]\n\n\ndef get_extensions():\n    this_dir = os.path.dirname(os.path.abspath(__file__))\n    extensions_dir = os.path.join(this_dir, \"x-stereolab\", \"csrc\")\n\n    main_file = glob.glob(os.path.join(extensions_dir, \"*.cpp\"))\n    source_cpu = glob.glob(os.path.join(extensions_dir, \"cpu\", \"*.cpp\"))\n    source_cuda = glob.glob(os.path.join(extensions_dir, \"cuda\", \"*.cu\"))\n\n    sources = main_file + source_cpu\n    extension = CppExtension\n\n    extra_compile_args = {\"cxx\": []}\n    define_macros = []\n\n    if (torch.cuda.is_available() and CUDA_HOME is not None) or os.getenv(\"FORCE_CUDA\", \"0\") == \"1\":\n        extension = CUDAExtension\n        sources += source_cuda\n        define_macros += [(\"WITH_CUDA\", None)]\n        extra_compile_args[\"nvcc\"] = [\n            \"-DCUDA_HAS_FP16=1\",\n            \"-D__CUDA_NO_HALF_OPERATORS__\",\n            \"-D__CUDA_NO_HALF_CONVERSIONS__\",\n            \"-D__CUDA_NO_HALF2_OPERATORS__\",\n        ]\n\n    sources = [os.path.join(extensions_dir, s) for s in sources]\n\n    include_dirs = [extensions_dir]\n\n    ext_modules = [\n        extension(\n            \"x-stereolab._C\",\n            sources,\n            include_dirs=include_dirs,\n            define_macros=define_macros,\n            extra_compile_args=extra_compile_args,\n        )\n    ]\n\n    return ext_modules\n\nsetup(\n    name=\"x-stereolab\",\n    version=\"0.1\",\n    author=\"meteorshowers\",\n    url=\"https://github.com/meteorshowers/X-StereoLab\",\n    description=\"X-StereoLab pytorch\",\n    
packages=find_packages(exclude=(\"configs\", \"tools\", \"preprocessing\")),\n    # install_requires=requirements,\n    ext_modules=get_extensions(),\n    cmdclass={\"build_ext\": torch.utils.cpp_extension.BuildExtension},\n)\n"
  },
  {
    "path": "tools/env_utils/__init__.py",
    "content": "from .logger import colorlogger\nfrom .utils import *\nfrom .exp import Experimenter"
  },
  {
    "path": "tools/env_utils/exp.py",
    "content": "import os\nimport shutil\nimport sys\nimport numpy as np\n\nfrom .logger import colorlogger\nfrom tensorboardX import SummaryWriter \n\nclass Experimenter:\n    def __init__(self, model_dir, cfg_path=None):\n        self.model_dir = model_dir\n        self.cfg_path = cfg_path\n\n        # Always use the config file in the experiment directory.\n        if self.cfg_path:\n            ## Update the config file in the experiment directory with the provided config\n            if not os.path.isdir(self.model_dir):\n                os.makedirs(self.model_dir)\n            assert os.path.exists(self.cfg_path), 'Found no config file in cfg path {}'.format(self.cfg_path)\n            save_path = '{}/save_config.py'.format(self.model_dir)\n            if os.path.normpath(self.cfg_path) == os.path.normpath(save_path):\n                pass\n            else:\n                # Use shutil instead of shell commands so paths with spaces work.\n                if os.path.exists(save_path):\n                    shutil.move(save_path, save_path + '.tmp')\n                shutil.copy(self.cfg_path, save_path)\n            print('configuration: {} --> {}/save_config.py'.format(self.cfg_path, self.model_dir))\n        else:\n            ## If cfg_path is None, then there should be a config file in the model_dir.\n            assert os.path.exists('{}/save_config.py'.format(self.model_dir)), 'Found no config in the model_dir: {}'.format(self.model_dir)\n\n        sys.path.insert(0, self.model_dir)\n        from save_config import cfg\n        self.cfg = cfg\n\n    @property\n    def config(self):\n        return self.cfg\n\n    @property\n    def logger(self):\n        if not hasattr(self, '_logger'):\n            print('Log -->', os.path.join(self.model_dir, 'training.log'))\n            self._logger = colorlogger(self.model_dir)\n\n        return self._logger\n\n    @property\n    def writer(self):\n        if not hasattr(self, '_writer'):\n            self.tensorboard_dir = 
os.path.join(self.model_dir, 'tensorboard')\n            print('Tensorboard -->', self.tensorboard_dir)\n            self._writer = SummaryWriter(self.tensorboard_dir)\n\n        return self._writer\n\n\n"
  },
  {
    "path": "tools/env_utils/logger.py",
    "content": "import logging\nimport os\n\nOK = '\\033[92m'\nWARNING = '\\033[93m'\nFAIL = '\\033[91m'\nEND = '\\033[0m'\n\nPINK = '\\033[95m'\nBLUE = '\\033[94m'\nGREEN = OK\nRED = FAIL\nWHITE = END\nYELLOW = WARNING\n\nclass colorlogger():\n    def __init__(self, log_dir, log_name='training.log'):\n        # set log\n        self._logger = logging.getLogger(log_name)\n        self._logger.setLevel(logging.INFO)\n        log_file = os.path.join(log_dir, log_name)\n        if not os.path.exists(log_dir):\n            os.makedirs(log_dir)\n        file_log = logging.FileHandler(log_file, mode='a')\n        file_log.setLevel(logging.INFO)\n        console_log = logging.StreamHandler()\n        console_log.setLevel(logging.INFO)\n        formatter = logging.Formatter(\n            \"{}%(asctime)s{} %(message)s\".format(GREEN, END),\n            \"%m-%d %H:%M:%S\")\n        file_log.setFormatter(formatter)\n        console_log.setFormatter(formatter)\n        self._logger.addHandler(file_log)\n        self._logger.addHandler(console_log)\n\n    def debug(self, msg):\n        self._logger.debug(str(msg))\n\n    def info(self, msg):\n        self._logger.info(str(msg))\n\n    def warning(self, msg):\n        self._logger.warning(WARNING + 'WRN: ' + str(msg) + END)\n\n    def critical(self, msg):\n        self._logger.critical(RED + 'CRI: ' + str(msg) + END)\n\n    def error(self, msg):\n        self._logger.error(RED + 'ERR: ' + str(msg) + END)\n\ndef yellow(msg):\n    return YELLOW + str(msg) + END\n\ndef green(msg):\n    return GREEN + str(msg) + END\n\ndef red(msg):\n    return RED + str(msg) + END\n\ndef blue(msg):\n    return BLUE + str(msg) + END\n\ndef print_yellow(msg, **kwargs):\n    print(yellow(msg), **kwargs)\n\ndef print_green(msg, **kwargs):\n    print(green(msg), **kwargs)\n\ndef print_red(msg, **kwargs):\n    print(red(msg), **kwargs)\n\ndef print_blue(msg, **kwargs):\n    print(blue(msg), **kwargs)\n\ndef error(msg):\n    print(red('ERR: ' + 
str(msg)))\n\ndef warning(msg):\n    print(yellow('WRN: ' + str(msg)))\n\n"
  },
  {
    "path": "tools/env_utils/utils.py",
    "content": "import os\nimport os.path as osp\nimport shutil\nimport sys\nimport numpy as np\nfrom datetime import datetime\nfrom glob import glob\nfrom itertools import chain\nimport gc\nimport torch\n\ndef mem_info():\n    import subprocess\n    dev = subprocess.check_output(\n        \"nvidia-smi | grep MiB | awk -F '|' '{print $3}' | awk -F '/' '{print $1}' | grep -Eo '[0-9]{1,10}'\",\n        shell=True)\n    dev = dev.decode()\n    dev_mem = list(map(lambda x: int(x), dev.split('\\n')[:-1]))\n    return dev_mem\n\ndef random_int(obj=None):\n    return (id(obj) + os.getpid() + int(datetime.now().strftime(\"%Y%m%d%H%M%S%f\"))) % 4294967295\n\ndef cmd(command):\n    import subprocess\n    output = subprocess.check_output(command, shell=True)\n    output = output.decode()\n    return output\n\ndef reset_seed(seed):\n    torch.manual_seed(seed)\n    torch.cuda.manual_seed(seed)\n    np.random.seed(seed)\n    torch.backends.cudnn.deterministic = True\n\n"
  },
  {
    "path": "tools/train_net_disp.py",
    "content": "from __future__ import print_function\n\nimport argparse\nimport os\nimport time\n\nimport numpy as np\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport torch.nn.parallel\nimport torch.optim as optim\nimport torch.utils.data\nfrom torch.autograd import Variable\n\ntorch.backends.cudnn.benchmark = True\n\nfrom disparity.dataloader import KITTIloader2012 as ls_2012\nfrom disparity.dataloader import KITTIloader2015 as ls_2015\n\n\nfrom disparity.dataloader import KITTILoader as DA\n# from disparity.dataloader import KITTILoader3D as ls\n# from disparity.dataloader import KITTILoader_dataset3d as DA\nfrom disparity.models import *\nfrom disparity.models.loss import My_loss\nfrom disparity.models.loss import GERF_loss \nfrom env_utils import *\n\n\n# multiprocessing distributed training\nimport torch.distributed as dist\nimport torch.utils.data.distributed\nimport torch.multiprocessing as mp\n\ndef get_parser():\n    parser = argparse.ArgumentParser(description='PSMNet')\n    parser.add_argument('-cfg', '--cfg', '--config', default='./configs/default/config_car.py', help='config path')\n    # parser.add_argument('--data_path', default='./data/kitti/training/', help='data_path')\n    parser.add_argument('--datapath', default='./data/kitti2015/training/',\n                    help='datapath')\n    parser.add_argument('--datapath12', default='./data/kitti2012/training/',\n                    help='datapath')\n    parser.add_argument('--epochs', type=int, default=4200, help='number of epochs to train')\n    parser.add_argument('--loadmodel', default=None, help='load model')\n    parser.add_argument('--savemodel', default=None, help='save model')\n    parser.add_argument('--debug', action='store_true', default=False, help='debug mode')\n    parser.add_argument('--seed', type=int, default=1, metavar='S', help='random seed (default: 1)')\n    parser.add_argument('--devices', '-d', type=str, default=None)\n    
parser.add_argument('--lr_scale', type=int, default=40, metavar='S', help='lr scale')\n    parser.add_argument('--split_file', default='./data/kitti/train.txt', help='split file')\n    parser.add_argument('--btrain', '-btrain', type=int, default=None)\n    parser.add_argument('--start_epoch', type=int, default=None)\n\n    parser.add_argument('-j', '--workers', default=4, type=int, metavar='N',\n                        help='number of data loading workers (default: 4)')\n    ## for distributed training\n    parser.add_argument('--world-size', default=1, type=int,\n                        help='number of nodes for distributed training')\n    parser.add_argument('--rank', default=0, type=int,\n                        help='node rank for distributed training')\n    parser.add_argument('--dist-url', type=str,\n                        help='url used to set up distributed training')\n    parser.add_argument('--dist-backend', default='nccl', type=str,\n                        help='distributed backend')\n    parser.add_argument('--multiprocessing-distributed', action='store_true',\n                        help='Use multi-processing distributed training to launch '\n                             'N processes per node, which has N GPUs. 
This is the '\n                             'fastest way to use PyTorch for either single node or '\n                             'multi node data parallel training')\n    args = parser.parse_args()\n\n    if not args.devices:\n        args.devices = str(np.argmin(mem_info()))\n\n    if args.devices is not None and '-' in args.devices:\n        gpus = args.devices.split('-')\n        gpus[0] = 0 if not gpus[0].isdigit() else int(gpus[0])\n        gpus[1] = len(mem_info()) if not gpus[1].isdigit() else int(gpus[1]) + 1\n        args.devices = ','.join(map(lambda x: str(x), list(range(*gpus))))\n\n    if not args.dist_url:\n        args.dist_url = \"tcp://127.0.0.1:{}\".format(random_int() % 30000)\n\n    print('Using GPU:{}'.format(args.devices))\n    os.environ['CUDA_VISIBLE_DEVICES'] = args.devices\n\n    return args\n\ndef main():\n    args = get_parser()\n\n    if args.debug:\n        args.savemodel = './outputs/debug/'\n        args.btrain = 1\n        args.workers = 0\n\n    global cfg\n    exp = Experimenter(args.savemodel, cfg_path=args.cfg)\n    cfg = exp.config\n    \n    reset_seed(args.seed)\n\n    cfg.debug = args.debug\n    cfg.warmup = getattr(cfg, 'warmup', True) if not args.debug else False\n\n    ### distributed training ###\n    if args.dist_url == \"env://\" and args.world_size == -1:\n        args.world_size = int(os.environ[\"WORLD_SIZE\"])\n\n    ngpus_per_node = torch.cuda.device_count()\n    print('ngpus_per_node: {}'.format(ngpus_per_node))\n    args.ngpus_per_node = ngpus_per_node\n\n    args.distributed = ngpus_per_node > 0 and (args.world_size > 1 or args.multiprocessing_distributed)\n    args.multiprocessing_distributed = args.distributed\n\n    if args.distributed and args.multiprocessing_distributed:\n        # Since we have ngpus_per_node processes per node, the total world_size\n        # needs to be adjusted accordingly\n        args.world_size = ngpus_per_node * args.world_size\n        # Use torch.multiprocessing.spawn to launch 
distributed processes: the\n        # main_worker process function\n        mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args, cfg, exp))\n    else:\n        # Simply call main_worker function\n        main_worker(0, ngpus_per_node, args, cfg, exp)\n\ndef main_process(args):\n    return not args.multiprocessing_distributed or (args.multiprocessing_distributed and args.rank % args.ngpus_per_node == 0)\n\ndef main_worker(gpu, ngpus_per_node, args, cfg, exp):\n    print(\"Using GPU: {} for training\".format(gpu))\n    if args.distributed:\n        if args.dist_url == \"env://\" and args.rank == -1:\n            args.rank = int(os.environ[\"RANK\"])\n        if args.multiprocessing_distributed:\n            # For multiprocessing distributed training, rank needs to be the\n            # global rank among all the processes\n            args.rank = args.rank * ngpus_per_node + gpu\n        dist.init_process_group(backend=args.dist_backend, init_method=args.dist_url,\n                                world_size=args.world_size, rank=args.rank)\n\n    #------------------- Model -----------------------\n    # model = hitnet(cfg)\n    if cfg.model == 'hitnet':\n        model = HitNet()\n    if cfg.model == 'stereonet':\n        model = stereonet_disp()\n    optimizer = optim.Adam(model.parameters(), lr=0.1, betas=(0.9, 0.999))\n\n    if args.distributed:\n        # For multiprocessing distributed, DistributedDataParallel constructor\n        # should always set the single device scope, otherwise,\n        # DistributedDataParallel will use all available devices.\n        torch.cuda.set_device(gpu)\n        model.cuda(gpu)\n        # When using a single GPU per process and per\n        # DistributedDataParallel, we need to divide the batch size\n        # ourselves based on the total number of GPUs we have\n        args.btrain = int(args.btrain / ngpus_per_node)\n        args.workers = int((args.workers + ngpus_per_node - 1) / ngpus_per_node)\n        
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[gpu], find_unused_parameters=True)\n    elif ngpus_per_node > 1:\n        model = torch.nn.DataParallel(model).cuda()\n    else:\n        torch.cuda.set_device(gpu)\n        model = model.cuda(gpu)\n\n    #------------------- Data Loader -----------------------\n    # all_left_img, all_right_img, all_left_disp, = ls.dataloader(args.data_path,\n    #                                                             args.split_file,\n    #                                                             depth_disp=True,\n    #                                                             cfg=cfg,\n    #                                                             is_train=True)\n\n    train_left_img, train_right_img, train_left_disp,train_left_norm, test_left_img, test_right_img, test_left_disp, test_left_norm = ls_2015.dataloader(args.datapath)\n    train_left_img12, train_right_img12, train_left_disp12,train_left_norm12, test_left_img12, test_right_img12, test_left_disp12, test_left_norm12 = ls_2012.dataloader(args.datapath12)\n\n\n\n    # ImageFloader = DA.myImageFloder(all_left_img, all_right_img, all_left_disp, True, split=args.split_file, cfg=cfg)\n\n    # ImageFloader = torch.utils.data.DataLoader(\n    #     DA.myImageFloder(train_left_img+train_left_img12, train_right_img+train_right_img12, train_left_disp+train_left_disp12,train_left_norm+train_left_norm12, True),\n    #     batch_size=args.btrain, shuffle=True, num_workers=2, drop_last=False,pin_memory=True)\n\n    ImageFloader = DA.myImageFloder(train_left_img+train_left_img12, train_right_img+train_right_img12, train_left_disp+train_left_disp12,train_left_norm+train_left_norm12, True)\n    \n\n    if args.distributed:\n        train_sampler = torch.utils.data.distributed.DistributedSampler(ImageFloader)\n    else:\n        train_sampler = None\n\n    TrainImgLoader = torch.utils.data.DataLoader(\n        ImageFloader, \n        
batch_size=args.btrain, shuffle=(train_sampler is None), num_workers=args.workers, drop_last=True,\n        collate_fn=BatchCollator(cfg), \n        sampler=train_sampler)\n\n    args.max_warmup_step = min(len(TrainImgLoader), 500)\n\n    #------------------ Logger -------------------------------------\n    if main_process(args):\n        logger = exp.logger\n        logger.info('Number of model parameters: {}'.format(sum([p.data.nelement() for p in model.parameters()])))\n        writer = exp.writer\n\n    # ------------------------ Resume ------------------------------\n    if args.loadmodel is not None:\n        if main_process(args):\n            logger.info('load model ' + args.loadmodel)\n        state_dict = torch.load(args.loadmodel)\n        model.load_state_dict(state_dict['state_dict'], strict=False)\n        if 'optimizer' in state_dict:\n            try:\n                optimizer.load_state_dict(state_dict['optimizer'])\n                if main_process(args):\n                    logger.info('Optimizer Restored.')\n            except Exception as e:\n                if main_process(args):\n                    logger.error(str(e))\n                    logger.info('Failed to restore Optimizer')\n        else:\n            if main_process(args):\n                logger.info('No saved optimizer.')\n        if args.start_epoch is None:\n            args.start_epoch = state_dict['epoch'] + 1\n\n    if args.start_epoch is None:\n        args.start_epoch = 1\n\n    # ------------------------ Training ------------------------------\n    for epoch in range(args.start_epoch, args.epochs + 1):\n        if args.distributed:\n            train_sampler.set_epoch(epoch)\n\n        total_train_loss = 0\n        adjust_learning_rate(optimizer, epoch, args=args)\n\n        for batch_idx, data_batch in enumerate(TrainImgLoader):\n            start_time = time.time()\n            if epoch == 1 and cfg.warmup and batch_idx < args.max_warmup_step:\n                
adjust_learning_rate(optimizer, epoch, batch_idx, args=args)\n\n            losses = train(model, cfg, args, optimizer, **data_batch)\n            loss = losses.pop('loss')\n\n            if main_process(args):\n                logger.info('%s: %s' % (args.savemodel.strip('/').split('/')[-1], args.devices))\n                logger.info('Epoch %d Iter %d/%d training loss = %.3f , time = %.2f; Epoch time: %.3fs, Left time: %.3fs, lr: %.6f' % (\n                    epoch, \n                    batch_idx, len(TrainImgLoader), loss, time.time() - start_time, (time.time() - start_time) * len(TrainImgLoader), \n                    (time.time() - start_time) * (len(TrainImgLoader) * (args.epochs - epoch) - batch_idx), optimizer.param_groups[0][\"lr\"]) )\n                logger.info('losses: {}'.format(list(losses.items())))\n                for lk, lv in losses.items():\n                    writer.add_scalar(lk, lv, epoch * len(TrainImgLoader) + batch_idx)\n                total_train_loss += loss\n\n            if batch_idx == 100 and cfg.debug:\n                break\n\n        if main_process(args):\n            logger.info('epoch %d total training loss = %.3f' % (epoch, total_train_loss / len(TrainImgLoader)))\n            savefilename = args.savemodel + '/finetune_' + str(epoch) + '.tar'\n            torch.save({\n                'epoch': epoch,\n                'state_dict': model.state_dict(),\n                'train_loss': total_train_loss / len(TrainImgLoader),\n                'optimizer': optimizer.state_dict()\n            }, savefilename)\n            logger.info('Snapshot {} epoch in {}'.format(epoch, args.savemodel))\n\n\ndef train(model, cfg, args, optimizer, imgL, imgR, disp_L, norm_L,\n    calib=None, calib_R=None, image_indexes=None, targets=None, ious=None, labels_map=None):\n    get_loss= My_loss(10, 5, 2, 3)\n    model.train()\n    imgL = Variable(torch.FloatTensor(imgL))\n    imgR = Variable(torch.FloatTensor(imgR))\n    disp_L = 
Variable(torch.FloatTensor(disp_L))\n    norm_L = Variable(torch.FloatTensor(norm_L))\n\n    imgL, imgR, disp_true, norm_true = imgL.cuda(), imgR.cuda(), disp_L.cuda(), norm_L.cuda()\n\n\n    # ---------\n    mask = (disp_true > cfg.mindisp) & (disp_true <= cfg.maxdisp)\n    mask.detach_()\n    # ---------\n\n    losses = dict()\n\n    # outputs = model(imgL, imgR, disp_L)\n    if cfg.model == 'hitnet':\n        out, h_new, w = model(imgL, imgR, disp_true)\n        loss = get_loss(out, h_new, w, imgL, disp_true.squeeze(1), norm_true)\n\n    if cfg.model == 'stereonet':\n        outputs = model(imgL, imgR)\n        outputs = [torch.unsqueeze(output, 1) for output in outputs]\n\n        loss1 = [GERF_loss(disp_true, outputs[0])]\n        for i in range(len(outputs)-1):\n            loss1.append(GERF_loss(disp_true, outputs[i+1]))\n        loss = sum(loss1)\n\n        # loss = 0.\n        # if getattr(cfg, 'DispVolume', True) and cfg.loss_disp:\n        #     pass\n            # # depth_preds = [torch.squeeze(o, 1) for o in outputs['depth_preds']]\n\n            # disp_loss = 0.\n            # # weight = [0.5, 0.7, 1.0]\n            # # for i, o in enumerate(depth_preds):\n            # #     disp_loss += weight[3 - len(depth_preds) + i]  * F.smooth_l1_loss(o[mask], disp_true[mask], size_average=True)\n            # losses.update(disp_loss=disp_loss)\n            # loss += disp_loss\n    losses.update(loss=loss)\n\n    optimizer.zero_grad()\n    loss.backward()\n    optimizer.step()\n\n    if args.multiprocessing_distributed:\n        with torch.no_grad():\n            loss_names = []\n            all_losses = []\n            for k in sorted(losses.keys()):\n                loss_names.append(k)\n                all_losses.append(losses[k])\n            all_losses = torch.stack(all_losses, dim=0)\n            dist.all_reduce(all_losses)\n            all_losses /= args.ngpus_per_node\n            reduced_losses = {k: v.item() for k, v in zip(loss_names, all_losses)}\n   
 else:\n        reduced_losses = {k: v.item() for k, v in losses.items()}\n\n    return reduced_losses\n\nclass BatchCollator(object):\n    def __init__(self, cfg):\n        super(BatchCollator, self).__init__()\n        self.cfg = cfg\n\n    def __call__(self, batch):\n        transpose_batch = list(zip(*batch))\n        ret = dict()\n\n\n        ret['imgL'] = torch.cat(transpose_batch[0], dim=0)\n        \n        ret['imgR'] = torch.cat(transpose_batch[1], dim=0)\n        ret['disp_L'] = torch.stack(transpose_batch[2], dim=0)\n        # print(ret['disp_L'].size())\n        ret['norm_L'] = torch.stack(transpose_batch[3], dim=0)\n        # print(ret['norm_L'].size())\n        return ret\n\ndef adjust_learning_rate(optimizer, epoch, step=None, args=None):\n    ''' Step schedule: 4e-4 up to epoch 4000, 1e-4 up to epoch 4080, then 4e-5.\n        step and args are currently unused, so the warmup call in the\n        training loop sets the same rate as the per-epoch call.\n    '''\n    # if epoch > 1 or step is None or step > args.max_warmup_step:\n    #     if epoch <= args.lr_scale:\n    #         lr = 0.001 / args.ngpus_per_node\n    #     else:\n    #         lr = 0.0001 / args.ngpus_per_node\n    # else:\n    #     lr = 0.001 / args.ngpus_per_node\n    #     warmup_pro = float(step) / args.max_warmup_step\n    #     lr = lr * (warmup_pro + 1./3. * (1. - warmup_pro))\n    lr = 4e-4\n    if epoch > 4000:\n        lr = 1e-4\n    if epoch > 4080:\n        lr = 4e-5\n    for param_group in optimizer.param_groups:\n        param_group['lr'] = lr\n\nif __name__ == '__main__':\n    main()\n\n"
  }
]