Full Code of NVlabs/PoseCNN-PyTorch for AI

Showing preview only (1,260K chars total); the listing below is truncated.
Repository: NVlabs/PoseCNN-PyTorch
Branch: main
Commit: f7d28f2abd38
Files: 121
Total size: 1.2 MB

Directory structure:
PoseCNN-PyTorch/

├── .gitignore
├── .gitmodules
├── LICENSE.md
├── README.md
├── build.sh
├── experiments/
│   ├── cfgs/
│   │   ├── dex_ycb.yml
│   │   ├── ycb_object.yml
│   │   ├── ycb_object_detection.yml
│   │   ├── ycb_object_self_supervision.yml
│   │   └── ycb_video.yml
│   └── scripts/
│       ├── demo.sh
│       ├── dex_ycb_test_s0.sh
│       ├── dex_ycb_test_s1.sh
│       ├── dex_ycb_test_s2.sh
│       ├── dex_ycb_test_s3.sh
│       ├── dex_ycb_train_s0.sh
│       ├── dex_ycb_train_s1.sh
│       ├── dex_ycb_train_s2.sh
│       ├── dex_ycb_train_s3.sh
│       ├── ros_ycb_object_test.sh
│       ├── ros_ycb_object_test_detection.sh
│       ├── ycb_object_test.sh
│       ├── ycb_object_train.sh
│       ├── ycb_object_train_detection.sh
│       ├── ycb_object_train_self_supervision.sh
│       ├── ycb_video_test.sh
│       └── ycb_video_train.sh
├── lib/
│   ├── datasets/
│   │   ├── __init__.py
│   │   ├── background.py
│   │   ├── dex_ycb.py
│   │   ├── factory.py
│   │   ├── imdb.py
│   │   ├── ycb_object.py
│   │   ├── ycb_self_supervision.py
│   │   └── ycb_video.py
│   ├── fcn/
│   │   ├── __init__.py
│   │   ├── config.py
│   │   ├── render_utils.py
│   │   ├── test_common.py
│   │   ├── test_dataset.py
│   │   ├── test_imageset.py
│   │   └── train.py
│   ├── layers/
│   │   ├── ROIAlign_cuda.cu
│   │   ├── __init__.py
│   │   ├── backproject_kernel.cu
│   │   ├── hard_label.py
│   │   ├── hard_label_kernel.cu
│   │   ├── hough_voting.py
│   │   ├── hough_voting_kernel.cu
│   │   ├── point_matching_loss.py
│   │   ├── point_matching_loss_kernel.cu
│   │   ├── pose_target_layer.py
│   │   ├── posecnn_layers.cpp
│   │   ├── roi_align.py
│   │   ├── roi_pooling.py
│   │   ├── roi_pooling_kernel.cu
│   │   ├── roi_target_layer.py
│   │   ├── sdf_matching_loss.py
│   │   ├── sdf_matching_loss_kernel.cu
│   │   └── setup.py
│   ├── networks/
│   │   ├── PoseCNN.py
│   │   └── __init__.py
│   ├── sdf/
│   │   ├── __init__.py
│   │   ├── _init_paths.py
│   │   ├── multi_sdf_optimizer.py
│   │   ├── sdf_optimizer.py
│   │   ├── sdf_utils.py
│   │   └── test_sdf_optimizer.py
│   └── utils/
│       ├── __init__.py
│       ├── bbox.pyx
│       ├── bbox_transform.py
│       ├── blob.py
│       ├── nms.py
│       ├── pose_error.py
│       ├── se3.py
│       ├── segmentation_evaluation.py
│       └── setup.py
├── requirement.txt
├── ros/
│   ├── _init_paths.py
│   ├── collect_images_realsense.py
│   ├── posecnn.rviz
│   └── test_images.py
├── tools/
│   ├── _init_paths.py
│   ├── test_images.py
│   ├── test_net.py
│   └── train_net.py
└── ycb_render/
    ├── CMakeLists.txt
    ├── __init__.py
    ├── cpp/
    │   ├── query_devices.cpp
    │   ├── test_device.cpp
    │   └── ycb_renderer.cpp
    ├── get_available_devices.py
    ├── glad/
    │   ├── EGL/
    │   │   └── eglplatform.h
    │   ├── KHR/
    │   │   └── khrplatform.h
    │   ├── egl.c
    │   ├── gl.c
    │   ├── glad/
    │   │   ├── egl.h
    │   │   ├── gl.h
    │   │   └── glx.h
    │   ├── glx_dyn.c
    │   └── linmath.h
    ├── glutils/
    │   ├── __init__.py
    │   ├── _trackball.py
    │   ├── glcontext.py
    │   ├── glrenderer.py
    │   ├── meshutil.py
    │   ├── trackball.py
    │   └── utils.py
    ├── setup.py
    ├── shaders/
    │   ├── frag.shader
    │   ├── frag_blinnphong.shader
    │   ├── frag_mat.shader
    │   ├── frag_simple.shader
    │   ├── frag_textureless.shader
    │   ├── vert.shader
    │   ├── vert_blinnphong.shader
    │   ├── vert_mat.shader
    │   ├── vert_simple.shader
    │   └── vert_textureless.shader
    ├── visualize_sim.py
    └── ycb_renderer.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
*.mex*
*.pyc
*.tgz
*.so
*.o
output*
lib/synthesize/build/*
lib/utils/bbox.c
data/
data_self/
docker/
ngc/

.idea/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

.idea/
.

*.png

results/


================================================
FILE: .gitmodules
================================================
[submodule "ycb_render/pybind11"]
	path = ycb_render/pybind11
	url = https://github.com/pybind/pybind11.git


================================================
FILE: LICENSE.md
================================================
# NVIDIA Source Code License for PoseCNN-PyTorch: A PyTorch Implementation of the PoseCNN Framework for 6D Object Pose Estimation

## 1. Definitions

“Licensor” means any person or entity that distributes its Work.

“Software” means the original work of authorship made available under this License.
“Work” means the Software and any additions to or derivative works of the Software that are made available under this License.

“Nvidia Processors” means any central processing unit (CPU), graphics processing unit (GPU), field-programmable gate array (FPGA), application-specific integrated circuit (ASIC) or any combination thereof designed, made, sold, or provided by Nvidia or its affiliates.

The terms “reproduce,” “reproduction,” “derivative works,” and “distribution” have the meaning as provided under U.S. copyright law; provided, however, that for the purposes of this License, derivative works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work.

Works, including the Software, are “made available” under this License by including in or with the Work either (a) a copyright notice referencing the applicability of this License to the Work, or (b) a copy of this License.

## 2. License Grants

### 2.1 Copyright Grant.
Subject to the terms and conditions of this License, each Licensor grants to you a perpetual, worldwide, non-exclusive, royalty-free, copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, sublicense and distribute its Work and any resulting derivative works in any form.

## 3. Limitations

### 3.1 Redistribution.
 You may reproduce or distribute the Work only if (a) you do so under this License, (b) you include a complete copy of this License with your distribution, and (c) you retain without modification any copyright, patent, trademark, or attribution notices that are present in the Work.

### 3.2 Derivative Works.
 You may specify that additional or different terms apply to the use, reproduction, and distribution of your derivative works of the Work (“Your Terms”) only if (a) Your Terms provide that the use limitation in Section 3.3 applies to your derivative works, and (b) you identify the specific derivative works that are subject to Your Terms. Notwithstanding Your Terms, this License (including the redistribution requirements in Section 3.1) will continue to apply to the Work itself.

### 3.3 Use Limitation.
 The Work and any derivative works thereof only may be used or intended for use non-commercially.  The Work or derivative works thereof may be used or intended for use by Nvidia or its affiliates commercially or non-commercially.  As used herein, “non-commercially” means for research or evaluation purposes only.

### 3.4 Patent Claims.
 If you bring or threaten to bring a patent claim against any Licensor (including any claim, cross-claim or counterclaim in a lawsuit) to enforce any patents that you allege are infringed by any Work, then your rights under this License from such Licensor (including the grants in Sections 2.1 and 2.2) will terminate immediately.

### 3.5 Trademarks.
 This License does not grant any rights to use any Licensor’s or its affiliates’ names, logos, or trademarks, except as necessary to reproduce the notices described in this License.

### 3.6 Termination.
 If you violate any term of this License, then your rights under this License (including the grants in Sections 2.1 and 2.2) will terminate immediately.

## 4. Disclaimer of Warranty.

THE WORK IS PROVIDED “AS IS” WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WARRANTIES OR CONDITIONS OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE OR NON-INFRINGEMENT. YOU BEAR THE RISK OF UNDERTAKING ANY ACTIVITIES UNDER THIS LICENSE.

## 5. Limitation of Liability.

EXCEPT AS PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE SHALL ANY LICENSOR BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF OR RELATED TO THIS LICENSE, THE USE OR INABILITY TO USE THE WORK (INCLUDING BUT NOT LIMITED TO LOSS OF GOODWILL, BUSINESS INTERRUPTION, LOST PROFITS OR DATA, COMPUTER FAILURE OR MALFUNCTION, OR ANY OTHER COMMERCIAL DAMAGES OR LOSSES), EVEN IF THE LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.



================================================
FILE: README.md
================================================
# PoseCNN-PyTorch: A PyTorch Implementation of the PoseCNN Framework for 6D Object Pose Estimation

### Introduction

This project implements PoseCNN in PyTorch.

PoseCNN is an end-to-end Convolutional Neural Network for 6D object pose estimation. PoseCNN estimates the 3D translation of an object by localizing its center in the image and predicting its distance from the camera. The 3D rotation of the object is estimated by regressing to a quaternion representation. [arXiv](https://arxiv.org/abs/1711.00199), [Project](https://rse-lab.cs.washington.edu/projects/posecnn/)
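
As a minimal sketch (not from this repository) of the translation step described above: once the object center is localized in the image and its distance `Tz` is predicted, the full 3D translation follows from the pinhole camera model. The intrinsics below are the standard YCB-Video values; the function name is ours.

```python
import numpy as np

def center_to_translation(cx, cy, Tz, K):
    """Backproject a predicted object center (cx, cy) in pixels and a
    predicted camera distance Tz (in meters) to a 3D translation."""
    fx, fy = K[0, 0], K[1, 1]   # focal lengths
    px, py = K[0, 2], K[1, 2]   # principal point
    return np.array([(cx - px) * Tz / fx, (cy - py) * Tz / fy, Tz])

# standard YCB-Video camera intrinsics
K = np.array([[1066.778, 0.0, 312.9869],
              [0.0, 1067.487, 241.3109],
              [0.0, 0.0, 1.0]])
print(center_to_translation(320.0, 240.0, 1.0, K))  # -> [ 0.0066 -0.0012  1.0 ]
```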

Rotation regression in PoseCNN cannot handle symmetric objects very well. Check [PoseRBPF](https://github.com/NVlabs/PoseRBPF) for a better solution for symmetric objects.

The code also supports pose refinement by matching the segmented 3D point cloud of an object to its signed distance field (SDF).
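
As a toy sketch of that refinement idea (the repository implements it with custom CUDA layers under lib/layers/sdf_matching_loss*; the differentiable `sdf` callable and the translation-only parameterization here are simplifying assumptions): transform the segmented points into the object frame and push them toward the zero level set of the SDF.

```python
import torch

def refine_translation(points, sdf, t_init, iters=50, lr=1e-2):
    """Toy SDF-based pose refinement over translation only.

    points : (N, 3) segmented point cloud in the camera frame
    sdf    : differentiable callable mapping (N, 3) object-frame points
             to (N,) signed distances (an assumption for this sketch)
    t_init : (3,) initial translation estimate
    """
    t = t_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([t], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        # points on the object surface should have zero signed distance
        loss = sdf(points - t).abs().mean()
        loss.backward()
        opt.step()
    return t.detach()
```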

<p align="center"><img src="./data/pics/intro.png" width="640" height="320"/></p>

### License

PoseCNN-PyTorch is released under the NVIDIA Source Code License (refer to the LICENSE file for details).

### Citation

If you find this package useful in your research, please consider citing:

    @inproceedings{xiang2018posecnn,
        Author = {Yu Xiang and Tanner Schmidt and Venkatraman Narayanan and Dieter Fox},
        Title = {{PoseCNN}: A Convolutional Neural Network for {6D} Object Pose Estimation in Cluttered Scenes},
        booktitle = {Robotics: Science and Systems (RSS)},
        Year = {2018}
    }

### Required environment

- Ubuntu 16.04 or above
- PyTorch 0.4.1 or above
- CUDA 9.1 or above

### Installation

Use Python 3. If ROS is needed, compile with Python 2.

1. Install [PyTorch](https://pytorch.org/)

2. Install Eigen from the GitHub source code [here](https://github.com/eigenteam/eigen-git-mirror)

3. Install Sophus from the GitHub source code [here](https://github.com/yuxng/Sophus)

4. Install python packages
   ```Shell
   pip install -r requirement.txt
   ```

5. Initialize the submodules in ycb_render
   ```Shell
   git submodule update --init --recursive
   ```

6. Compile the new layers we introduce in PoseCNN under $ROOT/lib/layers
    ```Shell
    cd $ROOT/lib/layers
    sudo python setup.py install
    ```

7. Compile cython components
    ```Shell
    cd $ROOT/lib/utils
    python setup.py build_ext --inplace
    ```

8. Compile the ycb_render in $ROOT/ycb_render
    ```Shell
    cd $ROOT/ycb_render
    sudo python setup.py develop
    ```

### Download

- 3D models of the YCB objects we used [here](https://drive.google.com/file/d/1PTNmhd-eSq0fwSPv0nvQN8h_scR1v-UJ/view?usp=sharing) (3G). Save under $ROOT/data or use a symbolic link.

- Our pre-trained checkpoints [here](https://drive.google.com/file/d/1-ECAkkTRfa1jJ9YBTzf04wxCGw6-m5d4/view?usp=sharing) (4G). Save under $ROOT/data or use a symbolic link.

- Our real-world images with pose annotations for 20 YCB objects collected via robot interaction [here](https://drive.google.com/file/d/1cQH_dnDzyrI0MWNx8st4lht_q0F6cUrE/view?usp=sharing) (53G). Check our ICRA 2020 [paper](https://arxiv.org/abs/1909.10159) for details.


### Running the demo

1. Download 3D models and our pre-trained checkpoints first.

2. Run the following script
    ```Shell
    ./experiments/scripts/demo.sh
    ```

<p align="center"><img src="./data/pics/posecnn.png" width="640" height="360"/></p>

### Training your own models with synthetic data for YCB objects

1. Download background images and save them to $ROOT/data, or use symbolic links.

    - Our own images [here](https://drive.google.com/file/d/1Q5VTKHEEejT2lAKwefG00eWcrnNnpieC/view?usp=sharing) (7G)
    - COCO 2014 images [here](https://cocodataset.org/#download)
    - Or use your own background images

2. Download the pretrained VGG16 weights [here](https://drive.google.com/file/d/1tTd64s1zNnjONlXvTFDZAf4E68Pupc_S/view?usp=sharing) (528M). Put the weight file in $ROOT/data/checkpoints. If our pre-trained models are already downloaded, the VGG16 checkpoint should already be in $ROOT/data/checkpoints.

3. Training and testing for 20 YCB objects with synthetic data. To train on a subset of these objects, modify the configuration file (see the example after the commands below).
    ```Shell
    cd $ROOT

    # multi-gpu training, use 1 GPU or 2 GPUs since batch size is set to 2
    ./experiments/scripts/ycb_object_train.sh

    # testing on synthetic data, $GPU_ID can be 0, 1, etc.
    ./experiments/scripts/ycb_object_test.sh $GPU_ID

    ```
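
    For example (illustrative values, not a shipped config), to train on only the background class plus the first three objects, one would shorten the CLASSES and SYMMETRY tuples consistently in experiments/cfgs/ycb_object.yml; the commented-out CLASSES line in experiments/cfgs/ycb_video.yml shows the same pattern:
    ```yaml
    # illustrative subset: background, 002_master_chef_can, 003_cracker_box, 004_sugar_box
    CLASSES: !!python/tuple [0, 1, 2, 3]
    SYMMETRY: !!python/tuple [0, 0, 0, 0]
    ```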

### Training and testing on the YCB-Video dataset
1. Download the YCB-Video dataset from [here](https://rse-lab.cs.washington.edu/projects/posecnn/).

2. Create a symlink for the YCB-Video dataset
    ```Shell
    cd $ROOT/data/YCB_Video
    ln -s $ycb_data data
    ```

3. Training and testing on the YCB-Video dataset
    ```Shell
    cd $ROOT

    # multi-gpu training, use 1 GPU or 2 GPUs since batch size is set to 2
    ./experiments/scripts/ycb_video_train.sh

    # testing, $GPU_ID can be 0, 1, etc.
    ./experiments/scripts/ycb_video_test.sh $GPU_ID

    ```

### Training and testing on the DexYCB dataset
1. Download the DexYCB dataset from [here](https://dex-ycb.github.io/).

2. Create a symlink for the DexYCB dataset
    ```Shell
    cd $ROOT/data/DEX_YCB
    ln -s $dex_ycb_data data
    ```

3. Training and testing on the DexYCB dataset
    ```Shell
    cd $ROOT

    # multi-gpu training for different splits, use 1 GPU or 2 GPUs since batch size is set to 2
    ./experiments/scripts/dex_ycb_train_s0.sh
    ./experiments/scripts/dex_ycb_train_s1.sh
    ./experiments/scripts/dex_ycb_train_s2.sh
    ./experiments/scripts/dex_ycb_train_s3.sh

    # testing, $GPU_ID can be 0, 1, etc.
    # our trained models are in checkpoints.zip
    ./experiments/scripts/dex_ycb_test_s0.sh $GPU_ID $EPOCH
    ./experiments/scripts/dex_ycb_test_s1.sh $GPU_ID $EPOCH
    ./experiments/scripts/dex_ycb_test_s2.sh $GPU_ID $EPOCH
    ./experiments/scripts/dex_ycb_test_s3.sh $GPU_ID $EPOCH

    ```

### Running with ROS on a RealSense camera for real-world pose estimation

- Python 2 is needed for ROS.

- Make sure our pretrained checkpoints are downloaded.

```Shell
# start realsense
roslaunch realsense2_camera rs_aligned_depth.launch tf_prefix:=measured/camera

# start rviz
rosrun rviz rviz -d ./ros/posecnn.rviz

# run posecnn for detection only (20 objects), $GPU_ID can be 0, 1, etc.
./experiments/scripts/ros_ycb_object_test_detection.sh $GPU_ID

# run full posecnn (20 objects), $GPU_ID can be 0, 1, etc.
./experiments/scripts/ros_ycb_object_test.sh $GPU_ID
```

Our example:
<p align="center"><img src="./data/pics/posecnn.gif"/></p>


================================================
FILE: build.sh
================================================
cd lib/layers/;
python3 setup.py build develop;
cd ../utils;
python3 setup.py build_ext --inplace;
cd ../../ycb_render;
python3 setup.py develop



================================================
FILE: experiments/cfgs/dex_ycb.yml
================================================
EXP_DIR: dex_ycb
INPUT: COLOR
TRAIN:
  TRAINABLE: True
  WEIGHT_DECAY: 0.0001
  LEARNING_RATE: 0.001
  MILESTONES: !!python/tuple [12]
  MOMENTUM: 0.9
  BETA: 0.999
  GAMMA: 0.1
  SCALES_BASE: !!python/tuple [1.0]
  IMS_PER_BATCH: 2
  MAX_ITERS_PER_EPOCH: 20000
  NUM_UNITS: 64
  HARD_LABEL_THRESHOLD: 0.9
  HARD_LABEL_SAMPLING: 0.0
  HARD_ANGLE: 5.0
  HOUGH_LABEL_THRESHOLD: 100
  HOUGH_VOTING_THRESHOLD: 10
  HOUGH_SKIP_PIXELS: 10
  FG_THRESH: 0.5
  FG_THRESH_POSE: 0.5
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21] # no large clamp
  SNAPSHOT_INFIX: dex_ycb
  SNAPSHOT_EPOCHS: 1
  SNAPSHOT_PREFIX: vgg16
  USE_FLIPPED: False
  CHROMATIC: True
  ADD_NOISE: True
  VISUALIZE: False
  VERTEX_REG: True
  POSE_REG: True
  FREEZE_LAYERS: False
  VERTEX_W: 10.0
  VERTEX_W_INSIDE: 10.0
  # synthetic data
  SYNTHESIZE: False
  SYN_RATIO: 5
  SYN_BACKGROUND_SPECIFIC: False
  SYN_BACKGROUND_SUBTRACT_MEAN: True
  SYN_SAMPLE_OBJECT: True
  SYN_SAMPLE_POSE: False
  SYN_MIN_OBJECT: 5
  SYN_MAX_OBJECT: 8
  SYN_TNEAR: 0.5
  SYN_TFAR: 1.6
  SYN_BOUND: 0.15
  SYN_STD_ROTATION: 15
  SYN_STD_TRANSLATION: 0.05
TEST:
  SINGLE_FRAME: True
  HOUGH_LABEL_THRESHOLD: 200
  HOUGH_VOTING_THRESHOLD: 10
  NUM_SDF_ITERATIONS_TRACKING: 50
  SDF_TRANSLATION_REG: 1000.0
  SDF_ROTATION_REG: 10.0
  IMS_PER_BATCH: 1
  HOUGH_SKIP_PIXELS: 10
  DET_THRESHOLD: 0.1
  SCALES_BASE: !!python/tuple [1.0]
  VISUALIZE: False
  SYNTHESIZE: False
  POSE_REFINE: True
  ROS_CAMERA: camera
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21] # no large clamp
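
A minimal sketch (not part of this repo; the actual config merging lives in lib/fcn/config.py) of reading a config like the one above: the `!!python/tuple` tags require a PyYAML loader that accepts the simple `python/*` tags, e.g. `yaml.FullLoader`.

```python
import yaml

# assumption: executed from the repository root
with open('experiments/cfgs/dex_ycb.yml') as f:
    cfg_file = yaml.load(f, Loader=yaml.FullLoader)

# tuples are rebuilt from the !!python/tuple tags
print(cfg_file['TRAIN']['MILESTONES'])  # (12,)
print(cfg_file['TRAIN']['CLASSES'])     # (0, 1, ..., 18, 20, 21) -- 19 (large clamp) excluded
```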


================================================
FILE: experiments/cfgs/ycb_object.yml
================================================
EXP_DIR: ycb_object
INPUT: COLOR
TRAIN:
  TRAINABLE: True
  WEIGHT_DECAY: 0.0001
  LEARNING_RATE: 0.001
  MILESTONES: !!python/tuple [3]
  MOMENTUM: 0.9
  BETA: 0.999
  GAMMA: 0.1
  SCALES_BASE: !!python/tuple [1.0]
  IMS_PER_BATCH: 2
  NUM_UNITS: 64
  HARD_LABEL_THRESHOLD: 0.9
  HARD_LABEL_SAMPLING: 0.0
  HARD_ANGLE: 5.0
  HOUGH_LABEL_THRESHOLD: 100
  HOUGH_VOTING_THRESHOLD: 10
  HOUGH_SKIP_PIXELS: 10
  FG_THRESH: 0.5
  FG_THRESH_POSE: 0.5
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21] # no large clamp
  SYMMETRY: !!python/tuple [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
  SNAPSHOT_INFIX: ycb_object
  SNAPSHOT_EPOCHS: 1
  SNAPSHOT_PREFIX: vgg16
  USE_FLIPPED: False
  CHROMATIC: True
  ADD_NOISE: True
  VISUALIZE: False
  VERTEX_REG: True
  POSE_REG: True
  SLIM: False
  # synthetic data
  SYNTHESIZE: True
  SYNNUM: 40000
  SYN_RATIO: 5
  SYN_BACKGROUND_SPECIFIC: True
  SYN_BACKGROUND_SUBTRACT_MEAN: True
  SYN_SAMPLE_OBJECT: True
  SYN_SAMPLE_POSE: False
  SYN_MIN_OBJECT: 5
  SYN_MAX_OBJECT: 8
  SYN_TNEAR: 0.5
  SYN_TFAR: 1.6
  SYN_BOUND: 0.3
  SYN_STD_ROTATION: 15
  SYN_STD_TRANSLATION: 0.05
TEST:
  SINGLE_FRAME: True
  HOUGH_LABEL_THRESHOLD: 400
  HOUGH_VOTING_THRESHOLD: 10
  NUM_SDF_ITERATIONS_TRACKING: 50
  SDF_TRANSLATION_REG: 1000.0
  SDF_ROTATION_REG: 10.0
  IMS_PER_BATCH: 1
  HOUGH_SKIP_PIXELS: 10
  DET_THRESHOLD: 0.2
  SCALES_BASE: !!python/tuple [1.0]
  VISUALIZE: True
  SYNTHESIZE: True
  POSE_REFINE: True
  ROS_CAMERA: D435
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21] # no large clamp
  SYMMETRY: !!python/tuple [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]


================================================
FILE: experiments/cfgs/ycb_object_detection.yml
================================================
EXP_DIR: ycb_object
INPUT: COLOR
TRAIN:
  TRAINABLE: True
  WEIGHT_DECAY: 0.0001
  LEARNING_RATE: 0.001
  MILESTONES: !!python/tuple [3]
  MOMENTUM: 0.9
  BETA: 0.999
  GAMMA: 0.1
  SCALES_BASE: !!python/tuple [1.0]
  IMS_PER_BATCH: 2
  NUM_UNITS: 64
  HARD_LABEL_THRESHOLD: 0.9
  HARD_LABEL_SAMPLING: 0.0
  HARD_ANGLE: 5.0
  HOUGH_LABEL_THRESHOLD: 100
  HOUGH_VOTING_THRESHOLD: 10
  HOUGH_SKIP_PIXELS: 10
  FG_THRESH: 0.5
  FG_THRESH_POSE: 0.5
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21] # no large clamp
  SYMMETRY: !!python/tuple [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
  SNAPSHOT_INFIX: ycb_object_detection
  SNAPSHOT_EPOCHS: 1
  SNAPSHOT_PREFIX: vgg16
  USE_FLIPPED: False
  CHROMATIC: True
  ADD_NOISE: True
  VISUALIZE: False
  VERTEX_REG: True
  POSE_REG: False       # no rotation regression
  SLIM: True
  # synthetic data
  SYNTHESIZE: True
  SYNNUM: 40000
  SYN_RATIO: 5
  SYN_BACKGROUND_SPECIFIC: True
  SYN_BACKGROUND_SUBTRACT_MEAN: True
  SYN_SAMPLE_OBJECT: True
  SYN_SAMPLE_POSE: False
  SYN_MIN_OBJECT: 5
  SYN_MAX_OBJECT: 8
  SYN_TNEAR: 0.5
  SYN_TFAR: 1.6
  SYN_BOUND: 0.3
  SYN_STD_ROTATION: 15
  SYN_STD_TRANSLATION: 0.05
TEST:
  SINGLE_FRAME: True
  HOUGH_LABEL_THRESHOLD: 400
  HOUGH_VOTING_THRESHOLD: 10
  IMS_PER_BATCH: 1
  HOUGH_SKIP_PIXELS: 10
  DET_THRESHOLD: 0.2
  SCALES_BASE: !!python/tuple [1.0]
  VISUALIZE: False
  SYNTHESIZE: True
  POSE_REFINE: False
  ROS_CAMERA: D435
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21] # no large clamp
  SYMMETRY: !!python/tuple [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]


================================================
FILE: experiments/cfgs/ycb_object_self_supervision.yml
================================================
EXP_DIR: ycb_self_supervision
INPUT: COLOR
TRAIN:
  TRAINABLE: True
  WEIGHT_DECAY: 0.0001
  LEARNING_RATE: 0.0001
  MILESTONES: !!python/tuple [10000000]
  MOMENTUM: 0.9
  BETA: 0.999
  GAMMA: 0.1
  SCALES_BASE: !!python/tuple [1.0]
  IMS_PER_BATCH: 2
  NUM_UNITS: 64
  HARD_LABEL_THRESHOLD: 0.9
  HARD_LABEL_SAMPLING: 0.0
  HARD_ANGLE: 5.0
  HOUGH_LABEL_THRESHOLD: 100
  HOUGH_VOTING_THRESHOLD: 10
  HOUGH_SKIP_PIXELS: 10
  FG_THRESH: 0.5
  FG_THRESH_POSE: 0.5
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21] # no large clamp
  SYMMETRY: !!python/tuple [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
  SNAPSHOT_INFIX: ycb_object_self_supervision
  SNAPSHOT_EPOCHS: 1
  SNAPSHOT_PREFIX: vgg16
  USE_FLIPPED: False
  CHROMATIC: True
  ADD_NOISE: True
  VISUALIZE: False
  VERTEX_REG: True
  POSE_REG: True
  SLIM: False
  MAX_ITERS_PER_EPOCH: 20000
  # synthetic data
  SYNTHESIZE: True
  SYNNUM: 40000
  SYN_RATIO: 3
  SYN_BACKGROUND_SPECIFIC: True
  SYN_BACKGROUND_SUBTRACT_MEAN: True
  SYN_SAMPLE_OBJECT: True
  SYN_SAMPLE_POSE: False
  SYN_MIN_OBJECT: 5
  SYN_MAX_OBJECT: 8
  SYN_TNEAR: 0.5
  SYN_TFAR: 1.6
  SYN_BOUND: 0.3
  SYN_STD_ROTATION: 15
  SYN_STD_TRANSLATION: 0.05
TEST:
  SINGLE_FRAME: True
  HOUGH_LABEL_THRESHOLD: 100
  HOUGH_VOTING_THRESHOLD: 10
  NUM_SDF_ITERATIONS_TRACKING: 50
  SDF_TRANSLATION_REG: 1000.0
  SDF_ROTATION_REG: 10.0
  IMS_PER_BATCH: 1
  HOUGH_SKIP_PIXELS: 10
  DET_THRESHOLD: 0.2
  SCALES_BASE: !!python/tuple [1.0]
  VISUALIZE: True
  SYNTHESIZE: False
  POSE_REFINE: True
  ROS_CAMERA: D435
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21]
  SYMMETRY: !!python/tuple [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]


================================================
FILE: experiments/cfgs/ycb_video.yml
================================================
EXP_DIR: ycb_video
INPUT: COLOR
TRAIN:
  TRAINABLE: True
  WEIGHT_DECAY: 0.0001
  LEARNING_RATE: 0.001
  MILESTONES: !!python/tuple [12]
  MOMENTUM: 0.9
  BETA: 0.999
  GAMMA: 0.1
  SCALES_BASE: !!python/tuple [1.0]
  IMS_PER_BATCH: 2
  MAX_ITERS_PER_EPOCH: 20000
  NUM_UNITS: 64
  HARD_LABEL_THRESHOLD: 0.9
  HARD_LABEL_SAMPLING: 0.0
  HARD_ANGLE: 5.0
  HOUGH_LABEL_THRESHOLD: 100
  HOUGH_VOTING_THRESHOLD: 10
  HOUGH_SKIP_PIXELS: 10
  FG_THRESH: 0.5
  FG_THRESH_POSE: 0.5
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
  SNAPSHOT_INFIX: ycb_video
  SNAPSHOT_EPOCHS: 1
  SNAPSHOT_PREFIX: vgg16
  USE_FLIPPED: False
  CHROMATIC: True
  ADD_NOISE: True
  VISUALIZE: False
  VERTEX_REG: True
  POSE_REG: True
  FREEZE_LAYERS: False
  VERTEX_W: 10.0
  VERTEX_W_INSIDE: 10.0
  # synthetic data
  SYNTHESIZE: False
  SYN_RATIO: 5
  SYN_BACKGROUND_SPECIFIC: False
  SYN_BACKGROUND_SUBTRACT_MEAN: True
  SYN_SAMPLE_OBJECT: True
  SYN_SAMPLE_POSE: False
  SYN_MIN_OBJECT: 5
  SYN_MAX_OBJECT: 8
  SYN_TNEAR: 0.5
  SYN_TFAR: 1.6
  SYN_BOUND: 0.15
  SYN_STD_ROTATION: 15
  SYN_STD_TRANSLATION: 0.05
TEST:
  SINGLE_FRAME: True
  HOUGH_LABEL_THRESHOLD: 200
  HOUGH_VOTING_THRESHOLD: 10
  NUM_SDF_ITERATIONS_TRACKING: 50
  SDF_TRANSLATION_REG: 100.0
  SDF_ROTATION_REG: 1.0
  IMS_PER_BATCH: 1
  HOUGH_SKIP_PIXELS: 10
  DET_THRESHOLD: 0.1
  SCALES_BASE: !!python/tuple [1.0]
  VISUALIZE: False
  SYNTHESIZE: False
  POSE_REFINE: True
  ROS_CAMERA: camera
  CLASSES: !!python/tuple [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
#  CLASSES: !!python/tuple [0, 1, 2, 3, 4]


================================================
FILE: experiments/scripts/demo.sh
================================================
#!/bin/bash
	
set -x
set -e

export PYTHONUNBUFFERED="True"
export CUDA_VISIBLE_DEVICES=0

time ./tools/test_images.py --gpu 0 \
  --imgdir data/demo/ \
  --meta data/demo/meta.yml \
  --color *color.png \
  --network posecnn \
  --pretrained data/checkpoints/ycb_object/vgg16_ycb_object_self_supervision_epoch_8.checkpoint.pth \
  --dataset ycb_object_test \
  --cfg experiments/cfgs/ycb_object.yml


================================================
FILE: experiments/scripts/dex_ycb_test_s0.sh
================================================
#!/bin/bash

set -x
set -e
export CUDA_VISIBLE_DEVICES=$1

time ./tools/test_net.py --gpu $1 \
  --network posecnn \
  --pretrained output/dex_ycb/dex_ycb_s0_train/vgg16_dex_ycb_epoch_$2.checkpoint.pth \
  --dataset dex_ycb_s0_test \
  --cfg experiments/cfgs/dex_ycb.yml


================================================
FILE: experiments/scripts/dex_ycb_test_s1.sh
================================================
#!/bin/bash

set -x
set -e
export CUDA_VISIBLE_DEVICES=$1

time ./tools/test_net.py --gpu $1 \
  --network posecnn \
  --pretrained output/dex_ycb/dex_ycb_s1_train/vgg16_dex_ycb_epoch_$2.checkpoint.pth \
  --dataset dex_ycb_s1_test \
  --cfg experiments/cfgs/dex_ycb.yml


================================================
FILE: experiments/scripts/dex_ycb_test_s2.sh
================================================
#!/bin/bash

set -x
set -e
export CUDA_VISIBLE_DEVICES=$1

time ./tools/test_net.py --gpu $1 \
  --network posecnn \
  --pretrained output/dex_ycb/dex_ycb_s2_train/vgg16_dex_ycb_epoch_$2.checkpoint.pth \
  --dataset dex_ycb_s2_test \
  --cfg experiments/cfgs/dex_ycb.yml


================================================
FILE: experiments/scripts/dex_ycb_test_s3.sh
================================================
#!/bin/bash

set -x
set -e
export CUDA_VISIBLE_DEVICES=$1

time ./tools/test_net.py --gpu $1 \
  --network posecnn \
  --pretrained output/dex_ycb/dex_ycb_s3_train/vgg16_dex_ycb_epoch_$2.checkpoint.pth \
  --dataset dex_ycb_s3_test \
  --cfg experiments/cfgs/dex_ycb.yml


================================================
FILE: experiments/scripts/dex_ycb_train_s0.sh
================================================
#!/bin/bash

set -x
set -e

time ./tools/train_net.py \
  --network posecnn \
  --pretrained data/checkpoints/vgg16-397923af.pth \
  --dataset dex_ycb_s0_train \
  --cfg experiments/cfgs/dex_ycb.yml \
  --solver sgd \
  --epochs 16


================================================
FILE: experiments/scripts/dex_ycb_train_s1.sh
================================================
#!/bin/bash

set -x
set -e

time ./tools/train_net.py \
  --network posecnn \
  --pretrained data/checkpoints/vgg16-397923af.pth \
  --dataset dex_ycb_s1_train \
  --cfg experiments/cfgs/dex_ycb.yml \
  --solver sgd \
  --epochs 16


================================================
FILE: experiments/scripts/dex_ycb_train_s2.sh
================================================
#!/bin/bash

set -x
set -e

time ./tools/train_net.py \
  --network posecnn \
  --pretrained data/checkpoints/vgg16-397923af.pth \
  --dataset dex_ycb_s2_train \
  --cfg experiments/cfgs/dex_ycb.yml \
  --solver sgd \
  --epochs 16


================================================
FILE: experiments/scripts/dex_ycb_train_s3.sh
================================================
#!/bin/bash

set -x
set -e

time ./tools/train_net.py \
  --network posecnn \
  --pretrained data/checkpoints/vgg16-397923af.pth \
  --dataset dex_ycb_s3_train \
  --cfg experiments/cfgs/dex_ycb.yml \
  --solver sgd \
  --epochs 16


================================================
FILE: experiments/scripts/ros_ycb_object_test.sh
================================================
#!/bin/bash
	
set -x
set -e

export PYTHONUNBUFFERED="True"
export CUDA_VISIBLE_DEVICES=$1

time ./ros/test_images.py --gpu $1 \
  --instance 0 \
  --network posecnn \
  --pretrained data/checkpoints/ycb_object/vgg16_ycb_object_self_supervision_epoch_8.checkpoint.pth \
  --dataset ycb_object_test \
  --cfg experiments/cfgs/ycb_object.yml


================================================
FILE: experiments/scripts/ros_ycb_object_test_detection.sh
================================================
#!/bin/bash
	
set -x
set -e

export PYTHONUNBUFFERED="True"
export CUDA_VISIBLE_DEVICES=$1

time ./ros/test_images.py --gpu $1 \
  --instance 0 \
  --network posecnn \
  --pretrained data/checkpoints/ycb_object/vgg16_ycb_object_detection_self_supervision_epoch_8.checkpoint.pth \
  --dataset ycb_object_test \
  --cfg experiments/cfgs/ycb_object_detection.yml


================================================
FILE: experiments/scripts/ycb_object_test.sh
================================================
#!/bin/bash

set -x
set -e

export PYTHONUNBUFFERED="True"
export CUDA_VISIBLE_DEVICES=$1

time ./tools/test_net.py --gpu $1 \
  --network posecnn \
  --pretrained output/ycb_object/ycb_object_train/vgg16_ycb_object_epoch_$2.checkpoint.pth \
  --dataset ycb_object_test \
  --cfg experiments/cfgs/ycb_object.yml


================================================
FILE: experiments/scripts/ycb_object_train.sh
================================================
#!/bin/bash

set -x
set -e

./tools/train_net.py \
  --network posecnn \
  --pretrained data/checkpoints/vgg16-397923af.pth \
  --dataset ycb_object_train \
  --cfg experiments/cfgs/ycb_object.yml \
  --solver sgd \
  --epochs 16


================================================
FILE: experiments/scripts/ycb_object_train_detection.sh
================================================
#!/bin/bash

set -x
set -e

./tools/train_net.py \
  --network posecnn \
  --pretrained data/checkpoints/vgg16-397923af.pth \
  --dataset ycb_object_train \
  --cfg experiments/cfgs/ycb_object_detection.yml \
  --solver sgd \
  --epochs 16


================================================
FILE: experiments/scripts/ycb_object_train_self_supervision.sh
================================================
#!/bin/bash

set -x
set -e

./tools/train_net.py \
  --network posecnn \
  --pretrained output/ycb_object/ycb_object_train/vgg16_ycb_object_epoch_16.checkpoint.pth \
  --dataset ycb_self_supervision_all \
  --cfg experiments/cfgs/ycb_object_self_supervision.yml \
  --solver sgd \
  --epochs 8


================================================
FILE: experiments/scripts/ycb_video_test.sh
================================================
#!/bin/bash

set -x
set -e

export PYTHONUNBUFFERED="True"
export CUDA_VISIBLE_DEVICES=$1

time ./tools/test_net.py --gpu $1 \
  --network posecnn \
  --pretrained output/ycb_video/ycb_video_train/vgg16_ycb_video_epoch_$2.checkpoint.pth \
  --dataset ycb_video_keyframe \
  --cfg experiments/cfgs/ycb_video.yml


================================================
FILE: experiments/scripts/ycb_video_train.sh
================================================
#!/bin/bash

set -x
set -e

time ./tools/train_net.py \
  --network posecnn \
  --pretrained data/checkpoints/vgg16-397923af.pth \
  --dataset ycb_video_train \
  --cfg experiments/cfgs/ycb_video.yml \
  --solver sgd \
  --epochs 16


================================================
FILE: lib/datasets/__init__.py
================================================
# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.
# This work is licensed under the NVIDIA Source Code License - Non-commercial. Full
# text can be found in LICENSE.md

from .imdb import imdb
from .ycb_video import YCBVideo
from .ycb_self_supervision import YCBSelfSupervision
from .ycb_object import YCBObject
from .dex_ycb import DexYCBDataset
from .background import BackgroundDataset

import os.path as osp
ROOT_DIR = osp.join(osp.dirname(__file__), '..', '..')


================================================
FILE: lib/datasets/background.py
================================================
# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.
# This work is licensed under the NVIDIA Source Code License - Non-commercial. Full
# text can be found in LICENSE.md

import torch
import torchvision
import torch.utils.data as data
import os, math
import os.path as osp
from os.path import *
import numpy as np
import numpy.random as npr
import cv2
import datasets
from fcn.config import cfg
from utils.blob import chromatic_transform, add_noise, add_noise_depth

class BackgroundDataset(data.Dataset, datasets.imdb):

    def __init__(self, name):

        self._name = name
        self.files_color = []
        self.files_depth = []

        if name == 'coco':
            background_dir = os.path.join(self.cache_path, '../coco/train2014/train2014')
            for filename in os.listdir(background_dir):
                self.files_color.append(os.path.join(background_dir, filename))
            self.files_color.sort()

        elif name == 'texture':
            background_dir = os.path.join(self.cache_path, '../textures')
            for filename in os.listdir(background_dir):
                self.files_color.append(os.path.join(background_dir, filename))
            self.files_color.sort()

        elif name == 'nvidia':
            allencenter = os.path.join(self.cache_path, '../backgrounds/nvidia')
            subdirs = os.listdir(allencenter)
            for i in range(len(subdirs)):
                subdir = subdirs[i]
                files = os.listdir(os.path.join(allencenter, subdir))
                for j in range(len(files)):
                    filename = os.path.join(allencenter, subdir, files[j])
                    self.files_color.append(filename)
            self.files_color.sort()

        elif name == 'table':
            allencenter = os.path.join(self.cache_path, '../backgrounds/table')
            subdirs = os.listdir(allencenter)
            for i in range(len(subdirs)):
                subdir = subdirs[i]
                files = os.listdir(os.path.join(allencenter, subdir))
                for j in range(len(files)):
                    filename = os.path.join(allencenter, subdir, files[j])
                    self.files_color.append(filename)
            self.files_color.sort()

        elif name == 'isaac':
            allencenter = os.path.join(self.cache_path, '../backgrounds/isaac')
            subdirs = os.listdir(allencenter)
            for i in range(len(subdirs)):
                subdir = subdirs[i]
                files = os.listdir(os.path.join(allencenter, subdir))
                for j in range(len(files)):
                    filename = os.path.join(allencenter, subdir, files[j])
                    self.files_color.append(filename)
            self.files_color.sort()

        elif name == 'rgbd':
            comotion = os.path.join(self.cache_path, '../backgrounds/rgbd')
            subdirs = os.listdir(comotion)
            for i in range(len(subdirs)):
                subdir = subdirs[i]
                files = os.listdir(os.path.join(comotion, subdir))
                for j in range(len(files)):
                    filename = os.path.join(comotion, subdir, files[j])
                    if 'depth.png' in filename:
                        self.files_depth.append(filename)
                    else:
                        self.files_color.append(filename)

            self.files_color.sort()
            self.files_depth.sort()

        self._intrinsic_matrix = np.array([[524.7917885754071, 0, 332.5213232846151],
                                          [0, 489.3563960810721, 281.2339855172282],
                                          [0, 0, 1]])

        self.num = len(self.files_color)
        self.subtract_mean = cfg.TRAIN.SYN_BACKGROUND_SUBTRACT_MEAN
        if cfg.TRAIN.SYN_CROP:
            self._height = cfg.TRAIN.SYN_CROP_SIZE
            self._width = cfg.TRAIN.SYN_CROP_SIZE
        else:
            self._height = cfg.TRAIN.SYN_HEIGHT
            self._width = cfg.TRAIN.SYN_WIDTH
        self._pixel_mean = cfg.PIXEL_MEANS
        print('{} background images'.format(self.num))


    def __len__(self):
        return self.num

    def __getitem__(self, idx):
        filename_color = self.files_color[idx]
        if self.name == 'rgbd':
            filename_depth = self.files_depth[idx]
        else:
            filename_depth = None
        return self.load(filename_color, filename_depth)

    def load(self, filename_color, filename_depth):
        if filename_depth is None:
            background_depth = np.zeros((3, self._height, self._width), dtype=np.float32)
            mask_depth = np.zeros((self._height, self._width), dtype=np.float32)

        if filename_depth is None and np.random.rand(1) < cfg.TRAIN.SYN_BACKGROUND_CONSTANT_PROB:  # only for rgb cases
            # constant background image
            background_color = np.ones((self._height, self._width, 3), dtype=np.uint8)
            color = np.random.randint(256, size=3)
            background_color[:, :, 0] = color[0]
            background_color[:, :, 1] = color[1]
            background_color[:, :, 2] = color[2]
        else:
            background_color = cv2.imread(filename_color, cv2.IMREAD_UNCHANGED)

            if filename_depth is not None:
                background_depth = cv2.imread(filename_depth, cv2.IMREAD_UNCHANGED)

            try:
                # randomly crop a region as background
                bw = background_color.shape[1]
                bh = background_color.shape[0]
                x1 = npr.randint(0, int(bw/3))
                y1 = npr.randint(0, int(bh/3))
                x2 = npr.randint(int(2*bw/3), bw)
                y2 = npr.randint(int(2*bh/3), bh)
                background_color = background_color[y1:y2, x1:x2]
                background_color = cv2.resize(background_color, (self._width, self._height), interpolation=cv2.INTER_LINEAR)
                if len(background_color.shape) != 3:
                    background_color = cv2.cvtColor(background_color, cv2.COLOR_GRAY2RGB)

                if filename_depth is not None:
                    background_depth = background_depth[y1:y2, x1:x2]
                    background_depth = cv2.resize(background_depth, (self._width, self._height), interpolation=cv2.INTER_NEAREST)
                    background_depth = self.backproject(background_depth, self._intrinsic_matrix, 1000.0)

            except:
                background_color = np.zeros((self._height, self._width, 3), dtype=np.uint8)
                print('bad background_color image', filename_color)
                if filename_depth is not None:
                    background_depth = np.zeros((self._height, self._width, 3), dtype=np.float32)
                    print('bad depth background image')

            if len(background_color.shape) != 3:
                background_color = np.zeros((self._height, self._width, 3), dtype=np.uint8)
                print('bad background_color image', filename_color)

            if filename_depth is not None:
                if len(background_depth.shape) != 3:
                    background_depth = np.zeros((self._height, self._width, 3), dtype=np.float32)
                    print('bad depth background image')

                z_im = background_depth[:, :, 2]
                mask_depth = z_im > 0.0
                mask_depth = mask_depth.astype(np.float32)

                if np.random.rand(1) > 0.1:
                    background_depth = add_noise_depth(background_depth)

                background_depth = background_depth.transpose(2, 0, 1).astype(np.float32)

            if np.random.rand(1) > 0.1:
                background_color = chromatic_transform(background_color)

        if np.random.rand(1) > 0.1:
            background_color = add_noise(background_color)

        background_color = background_color.astype(np.float32)

        if self.subtract_mean:
            background_color -= self._pixel_mean
        background_color = background_color.transpose(2, 0, 1) / 255.0

        sample = {'background_color': background_color,
                  'background_depth': background_depth,
                  'mask_depth': mask_depth}

        return sample


================================================
FILE: lib/datasets/dex_ycb.py
================================================
# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.
# This work is licensed under the NVIDIA Source Code License - Non-commercial. Full
# text can be found in LICENSE.md

import os
import sys
import yaml
import numpy as np
import torch
import torch.utils.data as data
import numpy.random as npr
import cv2
import copy
import glob
import scipy

import datasets
from fcn.config import cfg
from utils.blob import pad_im, chromatic_transform, add_noise
from transforms3d.quaternions import mat2quat, quat2mat
from utils.se3 import *
from utils.pose_error import *
from utils.cython_bbox import bbox_overlaps

_SUBJECTS = [
    '20200709-subject-01',
    '20200813-subject-02',
    '20200820-subject-03',
    '20200903-subject-04',
    '20200908-subject-05',
    '20200918-subject-06',
    '20200928-subject-07',
    '20201002-subject-08',
    '20201015-subject-09',
    '20201022-subject-10',
]

_SERIALS = [
    '836212060125',
    '839512060362',
    '840412060917',
    '841412060263',
    '932122060857',
    '932122060861',
    '932122061900',
    '932122062010',
]

_YCB_CLASSES = {
     1: '002_master_chef_can',
     2: '003_cracker_box',
     3: '004_sugar_box',
     4: '005_tomato_soup_can',
     5: '006_mustard_bottle',
     6: '007_tuna_fish_can',
     7: '008_pudding_box',
     8: '009_gelatin_box',
     9: '010_potted_meat_can',
    10: '011_banana',
    11: '019_pitcher_base',
    12: '021_bleach_cleanser',
    13: '024_bowl',
    14: '025_mug',
    15: '035_power_drill',
    16: '036_wood_block',
    17: '037_scissors',
    18: '040_large_marker',
    19: '051_large_clamp',
    20: '052_extra_large_clamp',
    21: '061_foam_brick',
}

_MANO_JOINTS = [
    'wrist',
    'thumb_mcp',
    'thumb_pip',
    'thumb_dip',
    'thumb_tip',
    'index_mcp',
    'index_pip',
    'index_dip',
    'index_tip',
    'middle_mcp',
    'middle_pip',
    'middle_dip',
    'middle_tip',
    'ring_mcp',
    'ring_pip',
    'ring_dip',
    'ring_tip',
    'little_mcp',
    'little_pip',
    'little_dip',
    'little_tip'
]

_MANO_JOINT_CONNECT = [
    [0,  1], [ 1,  2], [ 2,  3], [ 3,  4],
    [0,  5], [ 5,  6], [ 6,  7], [ 7,  8],
    [0,  9], [ 9, 10], [10, 11], [11, 12],
    [0, 13], [13, 14], [14, 15], [15, 16],
    [0, 17], [17, 18], [18, 19], [19, 20],
]

_BOP_EVAL_SUBSAMPLING_FACTOR = 4

class DexYCBDataset(data.Dataset, datasets.imdb):

    def __init__(self, setup, split):
        self._setup = setup
        self._split = split
        self._color_format = "color_{:06d}.jpg"
        self._depth_format = "aligned_depth_to_color_{:06d}.png"
        self._label_format = "labels_{:06d}.npz"
        self._height = 480
        self._width = 640

        # paths
        self._name = 'dex_ycb_' + setup + '_' + split
        self._image_set = split
        self._dex_ycb_path = self._get_default_path()
        path = os.path.join(self._dex_ycb_path, 'data')
        self._data_dir = path
        self._calib_dir = os.path.join(self._data_dir, "calibration")
        self._model_dir = os.path.join(self._data_dir, "models")

        self._obj_file = {
            k: os.path.join(self._model_dir, v, "textured_simple.obj")
            for k, v in _YCB_CLASSES.items()
        }

        # define all the classes
        self._classes_all = ('__background__', '002_master_chef_can', '003_cracker_box', '004_sugar_box', '005_tomato_soup_can', '006_mustard_bottle', \
                         '007_tuna_fish_can', '008_pudding_box', '009_gelatin_box', '010_potted_meat_can', '011_banana', '019_pitcher_base', \
                         '021_bleach_cleanser', '024_bowl', '025_mug', '035_power_drill', '036_wood_block', '037_scissors', '040_large_marker', \
                         '051_large_clamp', '052_extra_large_clamp', '061_foam_brick')
        self._num_classes_all = len(self._classes_all)
        self._class_colors_all = [(255, 255, 255), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (255, 0, 255), (0, 255, 255), \
                              (128, 0, 0), (0, 128, 0), (0, 0, 128), (128, 128, 0), (128, 0, 128), (0, 128, 128), \
                              (64, 0, 0), (0, 64, 0), (0, 0, 64), (64, 64, 0), (64, 0, 64), (0, 64, 64), 
                              (192, 0, 0), (0, 192, 0), (0, 0, 192)]
        self._symmetry_all = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]).astype(np.float32)
        self._extents_all = self._load_object_extents()

        # select a subset of classes
        self._classes = [self._classes_all[i] for i in cfg.TRAIN.CLASSES]
        self._classes_test = [self._classes_all[i] for i in cfg.TEST.CLASSES]
        self._num_classes = len(self._classes)
        self._class_colors = [self._class_colors_all[i] for i in cfg.TRAIN.CLASSES]
        self._symmetry = self._symmetry_all[cfg.TRAIN.CLASSES]
        self._symmetry_test = self._symmetry_all[cfg.TEST.CLASSES]
        self._extents = self._extents_all[cfg.TRAIN.CLASSES]
        self._extents_test = self._extents_all[cfg.TEST.CLASSES]
        self._pixel_mean = cfg.PIXEL_MEANS / 255.0

        # train classes
        self._points, self._points_all, self._point_blob = \
            self._load_object_points(self._classes, self._extents, self._symmetry)

        # test classes
        self._points_test, self._points_all_test, self._point_blob_test = \
            self._load_object_points(self._classes_test, self._extents_test, self._symmetry_test)

        # 3D model paths
        self.model_mesh_paths = ['{}/{}/textured_simple.obj'.format(self._model_dir, cls) for cls in self._classes_all[1:]]
        self.model_sdf_paths = ['{}/{}/textured_simple_low_res.pth'.format(self._model_dir, cls) for cls in self._classes_all[1:]]
        self.model_texture_paths = ['{}/{}/texture_map.png'.format(self._model_dir, cls) for cls in self._classes_all[1:]]
        self.model_colors = [np.array(self._class_colors_all[i]) / 255.0 for i in range(1, len(self._classes_all))]

        self.model_mesh_paths_target = ['{}/{}/textured_simple.obj'.format(self._model_dir, cls) for cls in self._classes[1:]]
        self.model_sdf_paths_target = ['{}/{}/textured_simple.sdf'.format(self._model_dir, cls) for cls in self._classes[1:]]
        self.model_texture_paths_target = ['{}/{}/texture_map.png'.format(self._model_dir, cls) for cls in self._classes[1:]]
        self.model_colors_target = [np.array(self._class_colors_all[i]) / 255.0 for i in cfg.TRAIN.CLASSES[1:]]

        # Seen subjects, camera views, grasped objects.
        if self._setup == 's0':
            if self._split == 'train':
                subject_ind = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = [i for i in range(100) if i % 5 != 4]
            if self._split == 'val':
                subject_ind = [0, 1]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = [i for i in range(100) if i % 5 == 4]
            if self._split == 'test':
                subject_ind = [2, 3, 4, 5, 6, 7, 8, 9]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = [i for i in range(100) if i % 5 == 4]

        # Unseen subjects.
        if self._setup == 's1':
            if self._split == 'train':
                subject_ind = [0, 1, 2, 3, 4, 5, 9]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = list(range(100))
            if self._split == 'val':
                subject_ind = [6]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = list(range(100))
            if self._split == 'test':
                subject_ind = [7, 8]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = list(range(100))

        # Unseen camera views.
        if self._setup == 's2':
            if self._split == 'train':
                subject_ind = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
                serial_ind = [0, 1, 2, 3, 4, 5]
                sequence_ind = list(range(100))
            if self._split == 'val':
                subject_ind = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
                serial_ind = [6]
                sequence_ind = list(range(100))
            if self._split == 'test':
                subject_ind = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
                serial_ind = [7]
                sequence_ind = list(range(100))

        # Unseen grasped objects.
        if self._setup == 's3':
            if self._split == 'train':
                subject_ind = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = [
                    i for i in range(100) if i // 5 not in (3, 7, 11, 15, 19)
                ]
            if self._split == 'val':
                subject_ind = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = [i for i in range(100) if i // 5 in (3, 19)]
            if self._split == 'test':
                subject_ind = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
                serial_ind = [0, 1, 2, 3, 4, 5, 6, 7]
                sequence_ind = [i for i in range(100) if i // 5 in (7, 11, 15)]

        self._subjects = [_SUBJECTS[i] for i in subject_ind]
        self._serials = [_SERIALS[i] for i in serial_ind]
        self._intrinsics = []
        for s in self._serials:
            intr_file = os.path.join(self._calib_dir, "intrinsics", "{}_{}x{}.yml".format(s, self._width, self._height))
            with open(intr_file, 'r') as f:
                intr = yaml.load(f, Loader=yaml.FullLoader)
            intr = intr['color']
            self._intrinsics.append(intr)

        # build mapping
        self._sequences = []
        self._mapping = []
        self._ycb_ids = []
        offset = 0
        for n in self._subjects:
            seq = sorted(os.listdir(os.path.join(self._data_dir, n)))
            seq = [os.path.join(n, s) for s in seq]
            assert len(seq) == 100
            seq = [seq[i] for i in sequence_ind]
            self._sequences += seq
            for i, q in enumerate(seq):
                meta_file = os.path.join(self._data_dir, q, "meta.yml")
                with open(meta_file, 'r') as f:
                    meta = yaml.load(f, Loader=yaml.FullLoader)
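                # enumerate every (sequence, camera, frame) index triple for this sequence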
                c = np.arange(len(self._serials))
                f = np.arange(meta['num_frames'])
                f, c = np.meshgrid(f, c)
                c = c.ravel()
                f = f.ravel()
                s = (offset + i) * np.ones_like(c)
                m = np.vstack((s, c, f)).T
                self._mapping.append(m)
                self._ycb_ids.append(meta['ycb_ids'])
            offset += len(seq)
        self._mapping = np.vstack(self._mapping)

        # sample a subset for training
        if split == 'train':
            self._mapping = self._mapping[::10]

        # dataset size
        self._size = len(self._mapping)
        print('dataset %s with images %d' % (self._name, self._size))
        if cfg.MODE == 'TRAIN' and self._size > cfg.TRAIN.MAX_ITERS_PER_EPOCH * cfg.TRAIN.IMS_PER_BATCH:
            self._size = cfg.TRAIN.MAX_ITERS_PER_EPOCH * cfg.TRAIN.IMS_PER_BATCH


    def __len__(self):
        return self._size


    def get_bop_id_from_idx(self, idx):
        s, c, f = map(lambda x: x.item(), self._mapping[idx])
        scene_id = s * len(self._serials) + c
        im_id = f
        return scene_id, im_id


    def __getitem__(self, idx):
        s, c, f = self._mapping[idx]

        is_testing = f % _BOP_EVAL_SUBSAMPLING_FACTOR == 0
        d = os.path.join(self._data_dir, self._sequences[s], self._serials[c])
        roidb = {
            'color_file': os.path.join(d, self._color_format.format(f)),
            'depth_file': os.path.join(d, self._depth_format.format(f)),
            'label_file': os.path.join(d, self._label_format.format(f)),
            'intrinsics': self._intrinsics[c],
            'ycb_ids': self._ycb_ids[s],
        }

        # Get the input image blob
        random_scale_ind = npr.randint(0, high=len(cfg.TRAIN.SCALES_BASE))
        im_blob, im_depth, im_scale, height, width = self._get_image_blob(roidb['color_file'], roidb['depth_file'], random_scale_ind)

        # build the label blob
        label_blob, mask, meta_data_blob, pose_blob, gt_boxes, vertex_targets, vertex_weights \
            = self._get_label_blob(roidb, self._num_classes, im_scale, height, width)

        is_syn = 0
        im_info = np.array([im_blob.shape[1], im_blob.shape[2], im_scale, is_syn], dtype=np.float32)
        scene_id, im_id = self.get_bop_id_from_idx(idx)
        video_id = '%04d' % (scene_id)
        image_id = '%06d' % (im_id)

        sample = {'image_color': im_blob,
                  'im_depth': im_depth,
                  'label': label_blob,
                  'mask': mask,
                  'meta_data': meta_data_blob,
                  'poses': pose_blob,
                  'extents': self._extents,
                  'points': self._point_blob,
                  'symmetry': self._symmetry,
                  'gt_boxes': gt_boxes,
                  'im_info': im_info,
                  'video_id': video_id,
                  'image_id': image_id}

        if cfg.TRAIN.VERTEX_REG:
            sample['vertex_targets'] = vertex_targets
            sample['vertex_weights'] = vertex_weights

        if self._split == 'test':
            sample['is_testing'] = is_testing

        return sample


    def _get_image_blob(self, color_file, depth_file, scale_ind):    

        # rgba
        rgba = pad_im(cv2.imread(color_file, cv2.IMREAD_UNCHANGED), 16)
        if rgba.shape[2] == 4:
            im = np.copy(rgba[:,:,:3])
            alpha = rgba[:,:,3]
            I = np.where(alpha == 0)
            im[I[0], I[1], :] = 0
        else:
            im = rgba

        im_scale = cfg.TRAIN.SCALES_BASE[scale_ind]
        if im_scale != 1.0:
            im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_LINEAR)
        height = im.shape[0]
        width = im.shape[1]

        # chromatic transform
        if cfg.TRAIN.CHROMATIC and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im = chromatic_transform(im)
        if cfg.TRAIN.ADD_NOISE and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im = add_noise(im)
        im_tensor = torch.from_numpy(im) / 255.0
        im_tensor -= self._pixel_mean
        image_blob = im_tensor.permute(2, 0, 1).float()

        # depth image
        im_depth = pad_im(cv2.imread(depth_file, cv2.IMREAD_UNCHANGED), 16)
        if im_scale != 1.0:
            im_depth = cv2.resize(im_depth, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_NEAREST)
        im_depth = im_depth.astype('float') / 1000.0

        return image_blob, im_depth, im_scale, height, width


    def _get_label_blob(self, roidb, num_classes, im_scale, height, width):
        """ build the label blob """

        # parse data
        cls_indexes = roidb['ycb_ids']
        classes = np.array(cfg.TRAIN.CLASSES)
        fx = roidb['intrinsics']['fx']
        fy = roidb['intrinsics']['fy']
        px = roidb['intrinsics']['ppx']
        py = roidb['intrinsics']['ppy']
        intrinsic_matrix = np.eye(3, dtype=np.float32)
        intrinsic_matrix[0, 0] = fx
        intrinsic_matrix[1, 1] = fy
        intrinsic_matrix[0, 2] = px
        intrinsic_matrix[1, 2] = py
        label = np.load(roidb['label_file'])

        # read label image
        im_label = label['seg']
        if im_scale != 1.0:
            im_label = cv2.resize(im_label, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_NEAREST)

        label_blob = np.zeros((num_classes, height, width), dtype=np.float32)
        label_blob[0, :, :] = 1.0
        for i in range(1, num_classes):
            I = np.where(im_label == classes[i])
            if len(I[0]) > 0:
                label_blob[i, I[0], I[1]] = 1.0
                label_blob[0, I[0], I[1]] = 0.0

        # foreground mask
        seg = torch.from_numpy((im_label != 0).astype(np.float32))
        mask = seg.unsqueeze(0).repeat((3, 1, 1)).float()

        # poses
        poses = label['pose_y']
        if len(poses.shape) == 2:
            poses = np.reshape(poses, (1, 3, 4))
        num = poses.shape[0]
        assert num == len(cls_indexes), 'number of poses not equal to number of objects'

        pose_blob = np.zeros((num_classes, 9), dtype=np.float32)
        gt_boxes = np.zeros((num_classes, 5), dtype=np.float32)
        center = np.zeros((num, 2), dtype=np.float32)
        count = 0
        for i in range(num):
            cls = int(cls_indexes[i])
            ind = np.where(classes == cls)[0]
            if len(ind) > 0:
                R = poses[i, :, :3]
                T = poses[i, :, 3]
                pose_blob[count, 0] = 1
                pose_blob[count, 1] = ind
                qt = mat2quat(R)

                # egocentric to allocentric
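                # the allocentric rotation factors out the viewpoint change
                # induced by the object's position in the image, so objects
                # with the same appearance share the same rotation target
                # regardless of where they project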
                qt_allocentric = egocentric2allocentric(qt, T)
                if qt_allocentric[0] < 0:
                    qt_allocentric = -1 * qt_allocentric
                pose_blob[count, 2:6] = qt_allocentric
                pose_blob[count, 6:] = T

                # compute center
                center[i, 0] = fx * T[0] / T[2] + px
                center[i, 1] = fy * T[1] / T[2] + py

                # compute box
                x3d = np.ones((4, self._points_all.shape[1]), dtype=np.float32)
                x3d[0, :] = self._points_all[ind,:,0]
                x3d[1, :] = self._points_all[ind,:,1]
                x3d[2, :] = self._points_all[ind,:,2]
                RT = np.zeros((3, 4), dtype=np.float32)
                RT[:3, :3] = quat2mat(qt)
                RT[:, 3] = T
                x2d = np.matmul(intrinsic_matrix, np.matmul(RT, x3d))
                x2d[0, :] = np.divide(x2d[0, :], x2d[2, :])
                x2d[1, :] = np.divide(x2d[1, :], x2d[2, :])
        
                gt_boxes[count, 0] = np.min(x2d[0, :]) * im_scale
                gt_boxes[count, 1] = np.min(x2d[1, :]) * im_scale
                gt_boxes[count, 2] = np.max(x2d[0, :]) * im_scale
                gt_boxes[count, 3] = np.max(x2d[1, :]) * im_scale
                gt_boxes[count, 4] = ind
                count += 1

        # construct the meta data
        """
        format of the meta_data
        intrinsic matrix: meta_data[0 ~ 8]
        inverse intrinsic matrix: meta_data[9 ~ 17]
        """
        K = np.matrix(intrinsic_matrix) * im_scale
        K[2, 2] = 1
        Kinv = np.linalg.pinv(K)
        meta_data_blob = np.zeros(18, dtype=np.float32)
        meta_data_blob[0:9] = K.flatten()
        meta_data_blob[9:18] = Kinv.flatten()

        # vertex regression target
        if cfg.TRAIN.VERTEX_REG:
            vertex_targets, vertex_weights = self._generate_vertex_targets(im_label,
                cls_indexes, center, poses, classes, num_classes)
        else:
            vertex_targets = []
            vertex_weights = []

        return label_blob, mask, meta_data_blob, pose_blob, gt_boxes, vertex_targets, vertex_weights


    # compute the voting label image in 2D
    def _generate_vertex_targets(self, im_label, cls_indexes, center, poses, classes, num_classes):

        width = im_label.shape[1]
        height = im_label.shape[0]
        vertex_targets = np.zeros((3 * num_classes, height, width), dtype=np.float32)
        vertex_weights = np.zeros((3 * num_classes, height, width), dtype=np.float32)

        c = np.zeros((2, 1), dtype=np.float32)
        for i in range(1, num_classes):
            y, x = np.where(im_label == classes[i])
            ind = np.where(cls_indexes == classes[i])[0]
            if len(x) > 0 and len(ind) > 0:
                c[0] = center[ind, 0]
                c[1] = center[ind, 1]
                if isinstance(poses, list):
                    z = poses[int(ind)][2]
                else:
                    if len(poses.shape) == 3:
                        z = poses[ind, 2, 3]
                    else:
                        z = poses[ind, -1]
                R = np.tile(c, (1, len(x))) - np.vstack((x, y))
                # compute the norm
                N = np.linalg.norm(R, axis=0) + 1e-10
                # normalization
                R = np.divide(R, np.tile(N, (2,1)))
                # assignment
                vertex_targets[3*i+0, y, x] = R[0,:]
                vertex_targets[3*i+1, y, x] = R[1,:]
                vertex_targets[3*i+2, y, x] = math.log(z)

                vertex_weights[3*i+0, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3*i+1, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3*i+2, y, x] = cfg.TRAIN.VERTEX_W_INSIDE

        return vertex_targets, vertex_weights
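    # Decoding sketch (illustrative, not part of the original code): for a
    # foreground pixel (x, y) of class i, the targets store the unit vector
    # pointing from the pixel toward the 2D object center plus log depth:
    #
    #   nx = vertex_targets[3 * i + 0, y, x]          # x-component toward center
    #   ny = vertex_targets[3 * i + 1, y, x]          # y-component toward center
    #   z  = np.exp(vertex_targets[3 * i + 2, y, x])  # center depth in meters
    #
    # The Hough voting layer accumulates these per-pixel directions to
    # localize the object center in the image.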


    def _get_default_path(self):
        """
        Return the default path where the DexYCB dataset is expected to be installed.
        """
        return os.path.join(datasets.ROOT_DIR, 'data', 'DEX_YCB')


    def _load_object_extents(self):
        extents = np.zeros((self._num_classes_all, 3), dtype=np.float32)
        for i in range(1, self._num_classes_all):
            point_file = os.path.join(self._model_dir, self._classes_all[i], 'points.xyz')
            print(point_file)
            assert os.path.exists(point_file), 'Path does not exist: {}'.format(point_file)
            points = np.loadtxt(point_file)
            extents[i, :] = 2 * np.max(np.absolute(points), axis=0)
        return extents


    def _load_object_points(self, classes, extents, symmetry):

        points = [[] for _ in range(len(classes))]
        num = np.inf
        num_classes = len(classes)
        for i in range(1, num_classes):
            point_file = os.path.join(self._model_dir, classes[i], 'points.xyz')
            print(point_file)
            assert os.path.exists(point_file), 'Path does not exist: {}'.format(point_file)
            points[i] = np.loadtxt(point_file)
            if points[i].shape[0] < num:
                num = points[i].shape[0]

        points_all = np.zeros((num_classes, num, 3), dtype=np.float32)
        for i in range(1, num_classes):
            points_all[i, :, :] = points[i][:num, :]

        # rescale the points
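        # the rescaling factor (roughly 10 / object size) normalizes point
        # cloud scale across objects before the point matching loss; points of
        # symmetric objects are additionally scaled 4x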
        point_blob = points_all.copy()
        for i in range(1, num_classes):
            # compute the rescaling factor for the points
            weight = 10.0 / np.amax(extents[i, :])
            if weight < 10:
                weight = 10
            if symmetry[i] > 0:
                point_blob[i, :, :] = 4 * weight * point_blob[i, :, :]
            else:
                point_blob[i, :, :] = weight * point_blob[i, :, :]
        return points, points_all, point_blob


    def write_dop_results(self, output_dir):
        # only write the result file
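        # the CSV follows the BOP toolkit result format: one detection per
        # line, R as 9 space-separated row-major values, t in millimeters,
        # and time = -1 when the runtime is not reported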
        filename = os.path.join(output_dir, 'posecnn_' + self.name + '.csv')
        f = open(filename, 'w')
        f.write('scene_id,im_id,obj_id,score,R,t,time\n')

        if cfg.TEST.POSE_REFINE:
            filename_refined = os.path.join(output_dir, 'posecnn_' + self.name + '_refined.csv')
            f1 = open(filename_refined, 'w')
            f1.write('scene_id,im_id,obj_id,score,R,t,time\n')

        # list the mat files
        filename = os.path.join(output_dir, '*.mat')
        files = sorted(glob.glob(filename))

        # for each image
        for i in range(len(files)):
            filename = os.path.basename(files[i])

            # parse filename
            pos = filename.find('_')
            scene_id = int(filename[:pos])
            im_id = int(filename[pos+1:-4])

            # load result
            print(files[i])
            result = scipy.io.loadmat(files[i])
            if len(result['rois']) == 0:
                continue

            rois = result['rois']
            num = rois.shape[0]
            for j in range(num):
                obj_id = cfg.TRAIN.CLASSES[int(rois[j, 1])]
                if obj_id == 0:
                    continue
                score = rois[j, -1]
                run_time = -1

                # pose from network
                R = quat2mat(result['poses'][j, :4].flatten())
                t = result['poses'][j, 4:] * 1000
                line = '{scene_id},{im_id},{obj_id},{score},{R},{t},{time}\n'.format(
                    scene_id=scene_id,
                    im_id=im_id,
                    obj_id=obj_id,
                    score=score,
                    R=' '.join(map(str, R.flatten().tolist())),
                    t=' '.join(map(str, t.flatten().tolist())),
                    time=run_time)
                f.write(line)

                if cfg.TEST.POSE_REFINE:
                    R = quat2mat(result['poses_refined'][j, :4].flatten())
                    t = result['poses_refined'][j, 4:] * 1000
                    line = '{scene_id},{im_id},{obj_id},{score},{R},{t},{time}\n'.format(
                        scene_id=scene_id,
                        im_id=im_id,
                        obj_id=obj_id,
                        score=score,
                        R=' '.join(map(str, R.flatten().tolist())),
                        t=' '.join(map(str, t.flatten().tolist())),
                        time=run_time)
                    f1.write(line)

        # close file
        f.close()
        if cfg.TEST.POSE_REFINE:
            f1.close()


    # compute box
    def compute_box(self, cls, intrinsic_matrix, RT):
        classes = np.array(cfg.TRAIN.CLASSES)
        ind = np.where(classes == cls)[0]
        x3d = np.ones((4, self._points_all.shape[1]), dtype=np.float32)
        x3d[0, :] = self._points_all[ind,:,0]
        x3d[1, :] = self._points_all[ind,:,1]
        x3d[2, :] = self._points_all[ind,:,2]
        x2d = np.matmul(intrinsic_matrix, np.matmul(RT, x3d))
        x2d[0, :] = np.divide(x2d[0, :], x2d[2, :])
        x2d[1, :] = np.divide(x2d[1, :], x2d[2, :])
        x1 = np.min(x2d[0, :])
        y1 = np.min(x2d[1, :])
        x2 = np.max(x2d[0, :])
        y2 = np.max(x2d[1, :])
        return [x1, y1, x2, y2]
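    # Usage sketch (illustrative; `dataset` and `K` are hypothetical names):
    #
    #   RT = np.zeros((3, 4), dtype=np.float32)
    #   RT[:3, :3] = np.eye(3)                   # identity rotation
    #   RT[2, 3] = 1.0                           # 1 m in front of the camera
    #   x1, y1, x2, y2 = dataset.compute_box(cls, K, RT)
    #
    # which returns the tight 2D bounding box of the projected model points
    # for YCB class id `cls`.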


    def evaluation(self, output_dir):
        self.write_dop_results(output_dir)

        filename = os.path.join(output_dir, 'results_posecnn.mat')
        if os.path.exists(filename):
            results_all = scipy.io.loadmat(filename)
            print('load results from file')
            print(filename)
            distances_sys = results_all['distances_sys']
            distances_non = results_all['distances_non']
            errors_rotation = results_all['errors_rotation']
            errors_translation = results_all['errors_translation']
            results_seq_id = results_all['results_seq_id'].flatten()
            results_frame_id = results_all['results_frame_id'].flatten()
            results_object_id = results_all['results_object_id'].flatten()
            results_cls_id = results_all['results_cls_id'].flatten()
        else:
            # no cached results: compute them and save to disk
            num_max = 200000
            num_results = 2
            distances_sys = np.zeros((num_max, num_results), dtype=np.float32)
            distances_non = np.zeros((num_max, num_results), dtype=np.float32)
            errors_rotation = np.zeros((num_max, num_results), dtype=np.float32)
            errors_translation = np.zeros((num_max, num_results), dtype=np.float32)
            results_seq_id = np.zeros((num_max, ), dtype=np.float32)
            results_frame_id = np.zeros((num_max, ), dtype=np.float32)
            results_object_id = np.zeros((num_max, ), dtype=np.float32)
            results_cls_id = np.zeros((num_max, ), dtype=np.float32)

            # for each image
            count = -1
            for i in range(len(self._mapping)):

                s, c, f = self._mapping[i]
                is_testing = f % _BOP_EVAL_SUBSAMPLING_FACTOR == 0
                if not is_testing:
                    continue

                # intrinsics
                intrinsics = self._intrinsics[c]
                intrinsic_matrix = np.eye(3, dtype=np.float32)
                intrinsic_matrix[0, 0] = intrinsics['fx']
                intrinsic_matrix[1, 1] = intrinsics['fy']
                intrinsic_matrix[0, 2] = intrinsics['ppx']
                intrinsic_matrix[1, 2] = intrinsics['ppy']

                # parse keyframe name
                scene_id, im_id = self.get_bop_id_from_idx(i)

                # load result
                filename = os.path.join(output_dir, '%04d_%06d.mat' % (scene_id, im_id))
                print(filename)
                result = scipy.io.loadmat(filename)

                # load gt
                d = os.path.join(self._data_dir, self._sequences[s], self._serials[c])
                label_file = os.path.join(d, self._label_format.format(f))
                label = np.load(label_file)
                cls_indexes = np.array(self._ycb_ids[s]).flatten()

                # poses
                poses = label['pose_y']
                if len(poses.shape) == 2:
                    poses = np.reshape(poses, (1, 3, 4))
                num = poses.shape[0]
                assert num == len(cls_indexes), 'number of poses not equal to number of objects'

                # instance label
                im_label = label['seg']
                instance_ids = np.unique(im_label)
                if instance_ids[0] == 0:
                    instance_ids = instance_ids[1:]
                if instance_ids[-1] == 255:
                    instance_ids = instance_ids[:-1]

                # for each gt pose
                for j in range(len(instance_ids)):
                    cls = instance_ids[j]

                    # find the number of pixels of the object
                    pixels = np.sum(im_label == cls)
                    if pixels < 200:
                        continue
                    count += 1

                    # find the pose
                    object_index = np.where(cls_indexes == cls)[0][0]
                    RT_gt = poses[object_index, :, :]
                    box_gt = self.compute_box(cls, intrinsic_matrix, RT_gt)

                    results_seq_id[count] = scene_id
                    results_frame_id[count] = im_id
                    results_object_id[count] = object_index
                    results_cls_id[count] = cls

                    # network result
                    roi_index = []
                    if len(result['rois']) > 0:
                        for k in range(result['rois'].shape[0]):
                            ind = int(result['rois'][k, 1])
                            if cls == cfg.TRAIN.CLASSES[ind]:
                                roi_index.append(k)

                    # select the roi
                    if len(roi_index) > 1:
                        # overlaps: (rois x gt_boxes)
                        roi_blob = result['rois'][roi_index, :]
                        roi_blob = roi_blob[:, (0, 2, 3, 4, 5, 1)]
                        gt_box_blob = np.zeros((1, 5), dtype=np.float32)
                        gt_box_blob[0, 1:] = box_gt
                        overlaps = bbox_overlaps(
                            np.ascontiguousarray(roi_blob[:, :5], dtype=np.float64),
                            np.ascontiguousarray(gt_box_blob, dtype=np.float64)).flatten()
                        assignment = overlaps.argmax()
                        roi_index = [roi_index[assignment]]

                    if len(roi_index) > 0:
                        RT = np.zeros((3, 4), dtype=np.float32)
                        ind = int(result['rois'][roi_index, 1])
                        points = self._points[ind]

                        # pose from network
                        RT[:3, :3] = quat2mat(result['poses'][roi_index, :4].flatten())
                        RT[:, 3] = result['poses'][roi_index, 4:]
                        distances_sys[count, 0] = adi(RT[:3, :3], RT[:, 3],  RT_gt[:3, :3], RT_gt[:, 3], points)
                        distances_non[count, 0] = add(RT[:3, :3], RT[:, 3],  RT_gt[:3, :3], RT_gt[:, 3], points)
                        errors_rotation[count, 0] = re(RT[:3, :3], RT_gt[:3, :3])
                        errors_translation[count, 0] = te(RT[:, 3], RT_gt[:, 3])

                        # pose after depth refinement
                        if cfg.TEST.POSE_REFINE:
                            RT[:3, :3] = quat2mat(result['poses_refined'][roi_index, :4].flatten())
                            RT[:, 3] = result['poses_refined'][roi_index, 4:]
                            distances_sys[count, 1] = adi(RT[:3, :3], RT[:, 3],  RT_gt[:3, :3], RT_gt[:, 3], points)
                            distances_non[count, 1] = add(RT[:3, :3], RT[:, 3],  RT_gt[:3, :3], RT_gt[:, 3], points)
                            errors_rotation[count, 1] = re(RT[:3, :3], RT_gt[:3, :3])
                            errors_translation[count, 1] = te(RT[:, 3], RT_gt[:, 3])
                        else:
                            distances_sys[count, 1] = np.inf
                            distances_non[count, 1] = np.inf
                            errors_rotation[count, 1] = np.inf
                            errors_translation[count, 1] = np.inf
                    else:
                        distances_sys[count, :] = np.inf
                        distances_non[count, :] = np.inf
                        errors_rotation[count, :] = np.inf
                        errors_translation[count, :] = np.inf

            distances_sys = distances_sys[:count+1, :]
            distances_non = distances_non[:count+1, :]
            errors_rotation = errors_rotation[:count+1, :]
            errors_translation = errors_translation[:count+1, :]
            results_seq_id = results_seq_id[:count+1]
            results_frame_id = results_frame_id[:count+1]
            results_object_id = results_object_id[:count+1]
            results_cls_id = results_cls_id[:count+1]

            results_all = {'distances_sys': distances_sys,
                       'distances_non': distances_non,
                       'errors_rotation': errors_rotation,
                       'errors_translation': errors_translation,
                       'results_seq_id': results_seq_id,
                       'results_frame_id': results_frame_id,
                       'results_object_id': results_object_id,
                       'results_cls_id': results_cls_id }

            filename = os.path.join(output_dir, 'results_posecnn.mat')
            scipy.io.savemat(filename, results_all)

        # print the results
        # for each class
        import matplotlib.pyplot as plt
        max_distance = 0.1
        index_plot = [0, 1]
        color = ['r', 'b']
        leng = ['PoseCNN', 'PoseCNN refined']
        num = len(leng)
        ADD = np.zeros((self._num_classes_all, num), dtype=np.float32)
        ADDS = np.zeros((self._num_classes_all, num), dtype=np.float32)
        TS = np.zeros((self._num_classes_all, num), dtype=np.float32)
        classes = list(copy.copy(self._classes_all))
        classes[0] = 'all'
        for k in range(self._num_classes_all):
            fig = plt.figure(figsize=(16.0, 10.0))
            if k == 0:
                index = range(len(results_cls_id))
            else:
                index = np.where(results_cls_id == k)[0]

            if len(index) == 0:
                continue
            print('%s: %d objects' % (classes[k], len(index)))

            # distance symmetry
            ax = fig.add_subplot(2, 3, 1)
            lengs = []
            for i in index_plot:
                D = distances_sys[index, i]
                ind = np.where(D > max_distance)[0]
                D[ind] = np.inf
                d = np.sort(D)
                n = len(d)
                accuracy = np.cumsum(np.ones((n, ), np.float32)) / n
                plt.plot(d, accuracy, color[i], linewidth=2)
                ADDS[k, i] = VOCap(d, accuracy)
                lengs.append('%s (%.2f)' % (leng[i], ADDS[k, i] * 100))
                print('%s, %s: %d objects missed' % (classes[k], leng[i], np.sum(np.isinf(D))))

            ax.legend(lengs)
            plt.xlabel('Average distance threshold in meter (symmetry)')
            plt.ylabel('accuracy')
            ax.set_title(classes[k])

            # distance non-symmetry
            ax = fig.add_subplot(2, 3, 2)
            lengs = []
            for i in index_plot:
                D = distances_non[index, i]
                ind = np.where(D > max_distance)[0]
                D[ind] = np.inf
                d = np.sort(D)
                n = len(d)
                accuracy = np.cumsum(np.ones((n, ), np.float32)) / n
                plt.plot(d, accuracy, color[i], linewidth=2)
                ADD[k, i] = VOCap(d, accuracy)
                lengs.append('%s (%.2f)' % (leng[i], ADD[k, i] * 100))
                print('%s, %s: %d objects missed' % (classes[k], leng[i], np.sum(np.isinf(D))))

            ax.legend(lengs)
            plt.xlabel('Average distance threshold in meter (non-symmetry)')
            plt.ylabel('accuracy')
            ax.set_title(classes[k])

            # translation
            ax = fig.add_subplot(2, 3, 3)
            lengs = []
            for i in index_plot:
                D = errors_translation[index, i]
                ind = np.where(D > max_distance)[0]
                D[ind] = np.inf
                d = np.sort(D)
                n = len(d)
                accuracy = np.cumsum(np.ones((n, ), np.float32)) / n
                plt.plot(d, accuracy, color[i], linewidth=2)
                TS[k, i] = VOCap(d, accuracy)
                lengs.append('%s (%.2f)' % (leng[i], TS[k, i] * 100))
                print('%s, %s: %d objects missed' % (classes[k], leng[i], np.sum(np.isinf(D))))

            ax.legend(lengs)
            plt.xlabel('Translation threshold in meter')
            plt.ylabel('accuracy')
            ax.set_title(classes[k])

            # rotation histogram
            count = 4
            for i in index_plot:
                ax = fig.add_subplot(2, 3, count)
                D = errors_rotation[index, i]
                ind = np.where(np.isfinite(D))[0]
                D = D[ind]
                ax.hist(D, bins=range(0, 190, 10), range=(0, 180))
                plt.xlabel('Rotation angle error')
                plt.ylabel('count')
                ax.set_title(leng[i])
                count += 1

            # mng = plt.get_current_fig_manager()
            # mng.full_screen_toggle()
            filename = output_dir + '/' + classes[k] + '.png'
            # plt.show()
            plt.savefig(filename)

        # print ADD
        print('==================ADD======================')
        for k in range(len(classes)):
            print('%s: %f' % (classes[k], ADD[k, 0]))
        for k in range(len(classes)-1):
            print('%f' % (ADD[k+1, 0]))
        print('%f' % (ADD[0, 0]))
        print(cfg.TRAIN.SNAPSHOT_INFIX)
        print('===========================================')

        # print ADD-S
        print('==================ADD-S====================')
        for k in range(len(classes)):
            print('%s: %f' % (classes[k], ADDS[k, 0]))
        for k in range(len(classes)-1):
            print('%f' % (ADDS[k+1, 0]))
        print('%f' % (ADDS[0, 0]))
        print(cfg.TRAIN.SNAPSHOT_INFIX)
        print('===========================================')

        # print ADD
        print('==================ADD refined======================')
        for k in range(len(classes)):
            print('%s: %f' % (classes[k], ADD[k, 1]))
        for k in range(len(classes)-1):
            print('%f' % (ADD[k+1, 1]))
        print('%f' % (ADD[0, 1]))
        print(cfg.TRAIN.SNAPSHOT_INFIX)
        print('===========================================')

        # print ADD-S
        print('==================ADD-S refined====================')
        for k in range(len(classes)):
            print('%s: %f' % (classes[k], ADDS[k, 1]))
        for k in range(len(classes)-1):
            print('%f' % (ADDS[k+1, 1]))
        print('%f' % (ADDS[0, 1]))
        print(cfg.TRAIN.SNAPSHOT_INFIX)
        print('===========================================')


================================================
FILE: lib/datasets/factory.py
================================================
# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.
# This work is licensed under the NVIDIA Source Code License - Non-commercial. Full
# text can be found in LICENSE.md

"""Factory method for easily getting imdbs by name."""

__sets = {}

import datasets.ycb_video
import datasets.ycb_object
import datasets.ycb_self_supervision
import datasets.dex_ycb
import datasets.background
import numpy as np

# ycb video dataset
for split in ['train', 'val', 'keyframe', 'trainval', 'debug']:
    name = 'ycb_video_{}'.format(split)
    print(name)
    __sets[name] = (lambda split=split:
            datasets.YCBVideo(split))
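# Note: the `split=split` default argument captures the current loop value in
# each lambda; a plain `lambda: datasets.YCBVideo(split)` would late-bind and
# every registered entry would construct the last split in the list.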

# ycb object dataset
for split in ['train', 'test']:
    name = 'ycb_object_{}'.format(split)
    print(name)
    __sets[name] = (lambda split=split:
            datasets.YCBObject(split))

# ycb self supervision dataset
for split in ['train_1', 'train_2', 'train_3', 'train_4', 'train_5', 'test', 'all', 'train_block_median', 'train_block_median_azure', 'train_block_median_demo', 'train_block_median_azure_demo', 'train_table',
              'debug', 'train_block', 'train_block_azure', 'train_block_big_sim', 'train_block_median_sim', 'train_block_small_sim']:
    name = 'ycb_self_supervision_{}'.format(split)
    print(name)
    __sets[name] = (lambda split=split:
            datasets.YCBSelfSupervision(split))

# background dataset
for split in ['coco', 'rgbd', 'nvidia', 'table', 'isaac', 'texture']:
    name = 'background_{}'.format(split)
    print(name)
    __sets[name] = (lambda split=split:
            datasets.BackgroundDataset(split))


# DEX YCB dataset
for setup in ('s0', 's1', 's2', 's3'):
    for split in ('train', 'val', 'test'):
        name = 'dex_ycb_{}_{}'.format(setup, split)
        __sets[name] = (lambda setup=setup, split=split: datasets.DexYCBDataset(setup, split))


def get_dataset(name):
    """Get an imdb (image database) by name."""
    if name not in __sets:
        raise KeyError('Unknown dataset: {}'.format(name))
    return __sets[name]()

def list_datasets():
    """List all registered imdbs."""
    return __sets.keys()
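# Example usage (illustrative): get_dataset('dex_ycb_s0_train') constructs the
# DexYCB s0 training set registered above, and list_datasets() returns all
# registered names.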


================================================
FILE: lib/datasets/imdb.py
================================================
# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.
# This work is licensed under the NVIDIA Source Code License - Non-commercial. Full
# text can be found in LICENSE.md

import os
import os.path as osp
import numpy as np
import datasets
import math
import glob
from fcn.config import cfg

class imdb(object):
    """Image database."""

    def __init__(self):
        self._name = ''
        self._num_classes = 0
        self._classes = []
        self._class_colors = []

    @property
    def name(self):
        return self._name

    @property
    def num_classes(self):
        return len(self._classes)

    @property
    def classes(self):
        return self._classes

    @property
    def class_colors(self):
        return self._class_colors

    @property
    def cache_path(self):
        cache_path = osp.abspath(osp.join(datasets.ROOT_DIR, 'data', 'cache'))
        if not os.path.exists(cache_path):
            os.makedirs(cache_path)
        return cache_path


    # backproject pixels into 3D points in the camera's coordinate system
    def backproject(self, depth_cv, intrinsic_matrix, factor):

        depth = depth_cv.astype(np.float32, copy=True) / factor

        index = np.where(~np.isfinite(depth))
        depth[index[0], index[1]] = 0

        # get intrinsic matrix
        K = intrinsic_matrix
        Kinv = np.linalg.inv(K)

        # compute the 3D points
        width = depth.shape[1]
        height = depth.shape[0]

        # construct the 2D points matrix
        x, y = np.meshgrid(np.arange(width), np.arange(height))
        ones = np.ones((height, width), dtype=np.float32)
        x2d = np.stack((x, y, ones), axis=2).reshape(width*height, 3)

        # backprojection
        R = np.dot(Kinv, x2d.transpose())

        # compute the 3D points
        X = np.multiply(np.tile(depth.reshape(1, width*height), (3, 1)), R)
        return np.array(X).transpose().reshape((height, width, 3))
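    # Usage sketch (illustrative, not part of the original code): with a depth
    # map stored in millimeters, factor=1000 converts to meters:
    #
    #   K = np.array([[600., 0., 320.],
    #                 [0., 600., 240.],
    #                 [0., 0., 1.]])
    #   depth = np.full((480, 640), 1000, dtype=np.uint16)  # flat 1 m plane
    #   xyz = self.backproject(depth, K, factor=1000.0)     # (480, 640, 3)
    #
    # xyz[..., 2] is the metric depth and xyz[..., :2] are the camera-frame
    # X/Y coordinates of each pixel.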


    def _build_uniform_poses(self):

        self.eulers = []
        interval = cfg.TRAIN.UNIFORM_POSE_INTERVAL
        for yaw in range(-180, 180, interval):
            for pitch in range(-90, 90, interval):
                for roll in range(-180, 180, interval):
                    self.eulers.append([yaw, pitch, roll])

        # sample indexes
        num_poses = len(self.eulers)
        num_classes = len(self._classes_all) - 1 # no background
        self.pose_indexes = np.zeros((num_classes, ), dtype=np.int32)
        self.pose_lists = []
        for i in range(num_classes):
            self.pose_lists.append(np.random.permutation(np.arange(num_poses)))
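        # e.g. an interval of 30 degrees yields 12 * 6 * 12 = 864 Euler
        # triples, shuffled independently for each class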


    def evaluation(self, output_dir):
        print('evaluation function not implemented for dataset %s' % self._name)


================================================
FILE: lib/datasets/ycb_object.py
================================================
# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.
# This work is licensed under the NVIDIA Source Code License - Non-commercial. Full
# text can be found in LICENSE.md

import torch
import torch.utils.data as data

import os, math
import sys
import os.path as osp
from os.path import *
import numpy as np
import numpy.random as npr
import cv2
try:
    import cPickle  # Use cPickle on Python 2.7
except ImportError:
    import pickle as cPickle
import scipy.io
import glob

import datasets
from fcn.config import cfg
from utils.blob import pad_im, chromatic_transform, add_noise, add_noise_cuda, add_noise_depth_cuda
from transforms3d.quaternions import mat2quat, quat2mat
from transforms3d.euler import euler2quat
from utils.se3 import *
from scipy.optimize import minimize
import matplotlib.pyplot as plt


class YCBObject(data.Dataset, datasets.imdb):
    def __init__(self, image_set, ycb_object_path = None):

        self._name = 'ycb_object_' + image_set
        self._image_set = image_set
        self._ycb_object_path = self._get_default_path() if ycb_object_path is None \
                            else ycb_object_path
        self._data_path = os.path.join(self._ycb_object_path, 'data')
        self._model_path = os.path.join(datasets.ROOT_DIR, 'data', 'models')
        self.root_path = self._ycb_object_path

        # define all the classes
        self._classes_all = ('__background__', '002_master_chef_can', '003_cracker_box', '004_sugar_box', '005_tomato_soup_can', '006_mustard_bottle', \
                         '007_tuna_fish_can', '008_pudding_box', '009_gelatin_box', '010_potted_meat_can', '011_banana', '019_pitcher_base', \
                         '021_bleach_cleanser', '024_bowl', '025_mug', '035_power_drill', '036_wood_block', '037_scissors', '040_large_marker', \
                         '051_large_clamp', '052_extra_large_clamp', '061_foam_brick', 'holiday_cup1', 'holiday_cup2', 'sanning_mug', \
                         '001_chips_can', 'block_red_big', 'block_green_big', 'block_blue_big', 'block_yellow_big', \
                         'block_red_small', 'block_green_small', 'block_blue_small', 'block_yellow_small', \
                         'block_red_median', 'block_green_median', 'block_blue_median', 'block_yellow_median',
                         'fusion_duplo_dude', 'cabinet_handle')
        self._num_classes_all = len(self._classes_all)
        self._class_colors_all = [(255, 255, 255), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (255, 0, 255), (0, 255, 255), \
                              (0, 0, 128), (0, 128, 0), (128, 0, 0), (128, 128, 0), (128, 0, 128), (0, 128, 128), \
                              (0, 64, 0), (64, 0, 0), (0, 0, 64), (64, 64, 0), (64, 0, 64), (0, 64, 64), \
                              (192, 0, 0), (0, 192, 0), (0, 0, 192), (192, 192, 0), (192, 0, 192), (0, 192, 192), (32, 0, 0), \
                              (150, 0, 0), (0, 150, 0), (0, 0, 150), (150, 150, 0), (75, 0, 0), (0, 75, 0), (0, 0, 75), (75, 75, 0), \
                              (200, 0, 0), (0, 200, 0), (0, 0, 200), (200, 200, 0), (16, 16, 0), (16, 16, 16)]
        self._extents_all = self._load_object_extents()

        self._width = cfg.TRAIN.SYN_WIDTH
        self._height = cfg.TRAIN.SYN_HEIGHT
        self._intrinsic_matrix = np.array([[524.7917885754071, 0, 332.5213232846151],
                                          [0, 489.3563960810721, 281.2339855172282],
                                          [0, 0, 1]])

        # select a subset of classes
        self._classes = [self._classes_all[i] for i in cfg.TRAIN.CLASSES]
        self._classes_test = [self._classes_all[i] for i in cfg.TEST.CLASSES]
        self._num_classes = len(self._classes)
        self._class_colors = [self._class_colors_all[i] for i in cfg.TRAIN.CLASSES]
        self._class_colors_test = [self._class_colors_all[i] for i in cfg.TEST.CLASSES]
        self._symmetry = np.array(cfg.TRAIN.SYMMETRY).astype(np.float32)
        self._symmetry_test = np.array(cfg.TEST.SYMMETRY).astype(np.float32)
        self._extents = self._extents_all[cfg.TRAIN.CLASSES]
        self._extents_test = self._extents_all[cfg.TEST.CLASSES]

        # train classes
        self._points, self._points_all, self._point_blob = \
            self._load_object_points(self._classes, self._extents, self._symmetry)

        # test classes
        self._points_test, self._points_all_test, self._point_blob_test = \
            self._load_object_points(self._classes_test, self._extents_test, self._symmetry_test)

        self._pixel_mean = torch.tensor(cfg.PIXEL_MEANS / 255.0).cuda().float()

        self._classes_other = []
        for i in range(self._num_classes_all):
            if i not in cfg.TRAIN.CLASSES:
                # the two clamps look alike, so do not use one as a distractor for the other
                if i == 19 and 20 in cfg.TRAIN.CLASSES:
                    continue
                if i == 20 and 19 in cfg.TRAIN.CLASSES:
                    continue
                self._classes_other.append(i)
        self._num_classes_other = len(self._classes_other)

        # 3D model paths
        self.model_sdf_paths = ['{}/{}/textured_simple_low_res.pth'.format(self._model_path, cls) for cls in self._classes_all[1:]]
        self.model_colors = [np.array(self._class_colors_all[i]) / 255.0 for i in range(1, len(self._classes_all))]

        self.model_mesh_paths = []
        for cls in self._classes_all[1:]:
            filename = '{}/{}/textured_simple.ply'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_mesh_paths.append(filename)
                continue
            filename = '{}/{}/textured_simple.obj'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_mesh_paths.append(filename)
                continue

        self.model_texture_paths = []
        for cls in self._classes_all[1:]:
            filename = '{}/{}/texture_map.png'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_texture_paths.append(filename)
            else:
                self.model_texture_paths.append('')

        # target meshes
        self.model_colors_target = [np.array(self._class_colors_all[i]) / 255.0 for i in cfg.TRAIN.CLASSES[1:]]
        self.model_mesh_paths_target = []
        for cls in self._classes[1:]:
            filename = '{}/{}/textured_simple.obj'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_mesh_paths_target.append(filename)
                continue
            filename = '{}/{}/textured_simple.ply'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_mesh_paths_target.append(filename)

        self.model_texture_paths_target = []
        for cls in self._classes[1:]:
            filename = '{}/{}/texture_map.png'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_texture_paths_target.append(filename)
            else:
                self.model_texture_paths_target.append('')

        self._class_to_ind = dict(zip(self._classes, range(self._num_classes)))
        self._size = cfg.TRAIN.SYNNUM
        self._build_uniform_poses()

        # sample indexes real for ycb object
        num_poses = 600
        num_classes = len(self._classes_all) - 1 # no background
        self.pose_indexes_real = np.zeros((num_classes, ), dtype=np.int32)
        self.pose_lists_real = []
        self.pose_images = []
        for i in range(num_classes):
            self.pose_lists_real.append(np.random.permutation(np.arange(num_poses)))
            dirname = osp.join(self._data_path, self._classes_all[i+1], '*.jpg')
            files = glob.glob(dirname)
            self.pose_images.append(files)

        # construct fake inputs
        label_blob = np.zeros((1, self._num_classes, self._height, self._width), dtype=np.float32)
        pose_blob = np.zeros((1, self._num_classes, 9), dtype=np.float32)
        gt_boxes = np.zeros((1, self._num_classes, 5), dtype=np.float32)

        # construct the meta data
        K = self._intrinsic_matrix
        Kinv = np.linalg.pinv(K)
        meta_data_blob = np.zeros((1, 18), dtype=np.float32)
        meta_data_blob[0, 0:9] = K.flatten()
        meta_data_blob[0, 9:18] = Kinv.flatten()

        self.input_labels = torch.from_numpy(label_blob).cuda()
        self.input_meta_data = torch.from_numpy(meta_data_blob).cuda()
        self.input_extents = torch.from_numpy(self._extents).cuda()
        self.input_gt_boxes = torch.from_numpy(gt_boxes).cuda()
        self.input_poses = torch.from_numpy(pose_blob).cuda()
        self.input_points = torch.from_numpy(self._point_blob).cuda()
        self.input_symmetry = torch.from_numpy(self._symmetry).cuda()


    def _render_item(self):

        height = cfg.TRAIN.SYN_HEIGHT
        width = cfg.TRAIN.SYN_WIDTH
        fx = self._intrinsic_matrix[0, 0]
        fy = self._intrinsic_matrix[1, 1]
        px = self._intrinsic_matrix[0, 2]
        py = self._intrinsic_matrix[1, 2]
        zfar = 6.0
        znear = 0.01

        # sample target objects
        if cfg.TRAIN.SYN_SAMPLE_OBJECT:
            maxnum = np.minimum(self.num_classes-1, cfg.TRAIN.SYN_MAX_OBJECT)
            num = np.random.randint(cfg.TRAIN.SYN_MIN_OBJECT, maxnum+1)
            perm = np.random.permutation(np.arange(self.num_classes-1))
            indexes_target = perm[:num] + 1
        else:
            num = self.num_classes - 1
            indexes_target = np.arange(num) + 1
        num_target = num
        cls_indexes = [cfg.TRAIN.CLASSES[i]-1 for i in indexes_target]

        # sample other objects as distractors
        if cfg.TRAIN.SYN_SAMPLE_DISTRACTOR:
            num_other = min(5, self._num_classes_other)
            num_selected = np.random.randint(0, num_other+1)
            perm = np.random.permutation(np.arange(self._num_classes_other))
            indexes = perm[:num_selected]
            for i in range(num_selected):
                cls_indexes.append(self._classes_other[indexes[i]]-1)
        else:
            num_selected = 0

        # sample poses
        num = num_target + num_selected
        poses_all = []
        for i in range(num):
            qt = np.zeros((7, ), dtype=np.float32)
            # rotation
            cls = int(cls_indexes[i])
            if self.pose_indexes[cls] >= len(self.pose_lists[cls]):
                self.pose_indexes[cls] = 0
                self.pose_lists[cls] = np.random.permutation(np.arange(len(self.eulers)))
            yaw = self.eulers[self.pose_lists[cls][self.pose_indexes[cls]]][0] + 15 * np.random.randn()
            pitch = self.eulers[self.pose_lists[cls][self.pose_indexes[cls]]][1] + 15 * np.random.randn()
            pitch = np.clip(pitch, -90, 90)
            roll = self.eulers[self.pose_lists[cls][self.pose_indexes[cls]]][2] + 15 * np.random.randn()
            qt[3:] = euler2quat(yaw * math.pi / 180.0, pitch * math.pi / 180.0, roll * math.pi / 180.0, 'syxz')
            self.pose_indexes[cls] += 1

            # translation
            bound = cfg.TRAIN.SYN_BOUND
            if i == 0 or i >= num_target or np.random.rand(1) > 0.5:
                qt[0] = np.random.uniform(-bound, bound)
                qt[1] = np.random.uniform(-bound, bound)
                qt[2] = np.random.uniform(cfg.TRAIN.SYN_TNEAR, cfg.TRAIN.SYN_TFAR)
            else:
                # sample an object nearby
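                # place this object near a previously sampled one: offset by
                # 1-1.5x the object extent in x/y and by 2-4x in depth, to
                # create cluttered, partially occluding arrangements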
                object_id = np.random.randint(0, i, size=1)[0]
                extent = 2 * np.mean(self._extents_all[cls+1, :])

                flag = np.random.randint(0, 2)
                if flag == 0:
                    flag = -1
                qt[0] = poses_all[object_id][0] + flag * extent * np.random.uniform(1.0, 1.5)
                if np.absolute(qt[0]) > bound:
                    qt[0] = poses_all[object_id][0] - flag * extent * np.random.uniform(1.0, 1.5)
                if np.absolute(qt[0]) > bound:
                    qt[0] = np.random.uniform(-bound, bound)

                flag = np.random.randint(0, 2)
                if flag == 0:
                    flag = -1
                qt[1] = poses_all[object_id][1] + flag * extent * np.random.uniform(1.0, 1.5)
                if np.absolute(qt[1]) > bound:
                    qt[1] = poses_all[object_id][1] - flag * extent * np.random.uniform(1.0, 1.5)
                if np.absolute(qt[1]) > bound:
                    qt[1] = np.random.uniform(-bound, bound)

                qt[2] = poses_all[object_id][2] - extent * np.random.uniform(2.0, 4.0)
                if qt[2] < cfg.TRAIN.SYN_TNEAR:
                    qt[2] = poses_all[object_id][2] + extent * np.random.uniform(2.0, 4.0)

            poses_all.append(qt)
        cfg.renderer.set_poses(poses_all)

        # sample lighting
        cfg.renderer.set_light_pos(np.random.uniform(-0.5, 0.5, 3))

        intensity = np.random.uniform(0.8, 2)
        light_color = intensity * np.random.uniform(0.9, 1.1, 3)
        cfg.renderer.set_light_color(light_color)
            
        # rendering
        cfg.renderer.set_projection_matrix(width, height, fx, fy, px, py, znear, zfar)
        image_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        seg_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        pc_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        cfg.renderer.render(cls_indexes, image_tensor, seg_tensor, pc2_tensor=pc_tensor)
        image_tensor = image_tensor.flip(0)
        seg_tensor = seg_tensor.flip(0)
        pc_tensor = pc_tensor.flip(0)

        # foreground mask
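        # pack the three color channels of the rendered segmentation into one
        # scalar so every instance color maps to a unique id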
        seg = seg_tensor[:,:,2] + 256*seg_tensor[:,:,1] + 256*256*seg_tensor[:,:,0]
        mask = (seg != 0).unsqueeze(0).repeat((3, 1, 1)).float()

        # RGB to BGR order
        im = image_tensor.cpu().numpy()
        im = np.clip(im, 0, 1)
        im = im[:, :, (2, 1, 0)] * 255
        im = im.astype(np.uint8)

        # XYZ coordinates in camera frame
        im_depth = pc_tensor.cpu().numpy()
        im_depth = im_depth[:, :, :3]
        im_depth_return = im_depth[:, :, 2].copy()

        im_label = seg_tensor.cpu().numpy()
        im_label = im_label[:, :, (2, 1, 0)] * 255
        im_label = np.round(im_label).astype(np.uint8)
        im_label = np.clip(im_label, 0, 255)
        im_label, im_label_all = self.process_label_image(im_label)

        centers = np.zeros((num, 2), dtype=np.float32)
        rcenters = cfg.renderer.get_centers()
        for i in range(num):
            centers[i, 0] = rcenters[i][1] * width
            centers[i, 1] = rcenters[i][0] * height
        centers = centers[:num_target, :]

        '''
        import matplotlib.pyplot as plt
        fig = plt.figure()
        ax = fig.add_subplot(3, 2, 1)
        plt.imshow(im[:, :, (2, 1, 0)])
        for i in range(num_target):
            plt.plot(centers[i, 0], centers[i, 1], 'yo')
        ax = fig.add_subplot(3, 2, 2)
        plt.imshow(im_label)
        ax = fig.add_subplot(3, 2, 3)
        plt.imshow(im_depth[:, :, 0])
        ax = fig.add_subplot(3, 2, 4)
        plt.imshow(im_depth[:, :, 1])
        ax = fig.add_subplot(3, 2, 5)
        plt.imshow(im_depth[:, :, 2])
        plt.show()
        #'''

        # chromatic transform
        if cfg.TRAIN.CHROMATIC and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im = chromatic_transform(im)

        im_cuda = torch.from_numpy(im).cuda().float() / 255.0
        if cfg.TRAIN.ADD_NOISE and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im_cuda = add_noise_cuda(im_cuda)
        im_cuda -= self._pixel_mean
        im_cuda = im_cuda.permute(2, 0, 1)

        if cfg.INPUT == 'DEPTH' or cfg.INPUT == 'RGBD':

            # depth mask
            z_im = im_depth[:, :, 2]
            mask_depth = z_im > 0.0
            mask_depth = mask_depth.astype('float')
            mask_depth_cuda = torch.from_numpy(mask_depth).cuda().float()
            mask_depth_cuda.unsqueeze_(0)

            im_cuda_depth = torch.from_numpy(im_depth).cuda().float()
            if cfg.TRAIN.ADD_NOISE and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
                im_cuda_depth = add_noise_depth_cuda(im_cuda_depth)
            im_cuda_depth = im_cuda_depth.permute(2, 0, 1)
        else:
            im_cuda_depth = im_cuda.clone()
            mask_depth_cuda = torch.cuda.FloatTensor(1, height, width).fill_(0)

        # label blob
        classes = np.array(range(self.num_classes))
        label_blob = np.zeros((self.num_classes, self._height, self._width), dtype=np.float32)
        label_blob[0, :, :] = 1.0
        for i in range(1, self.num_classes):
            I = np.where(im_label == classes[i])
            if len(I[0]) > 0:
                label_blob[i, I[0], I[1]] = 1.0
                label_blob[0, I[0], I[1]] = 0.0

        # poses and boxes
        pose_blob = np.zeros((self.num_classes, 9), dtype=np.float32)
        gt_boxes = np.zeros((self.num_classes, 5), dtype=np.float32)
        for i in range(num_target):
            cls = int(indexes_target[i])
            pose_blob[i, 0] = 1
            pose_blob[i, 1] = cls
            T = poses_all[i][:3]
            qt = poses_all[i][3:]

            # egocentric to allocentric
            qt_allocentric = egocentric2allocentric(qt, T)
            if qt_allocentric[0] < 0:
                qt_allocentric = -1 * qt_allocentric
            pose_blob[i, 2:6] = qt_allocentric
            pose_blob[i, 6:] = T

            # compute box
            x3d = np.ones((4, self._points_all.shape[1]), dtype=np.float32)
            x3d[0, :] = self._points_all[cls,:,0]
            x3d[1, :] = self._points_all[cls,:,1]
            x3d[2, :] = self._points_all[cls,:,2]
            RT = np.zeros((3, 4), dtype=np.float32)
            RT[:3, :3] = quat2mat(qt)
            RT[:, 3] = T
            x2d = np.matmul(self._intrinsic_matrix, np.matmul(RT, x3d))
            x2d[0, :] = np.divide(x2d[0, :], x2d[2, :])
            x2d[1, :] = np.divide(x2d[1, :], x2d[2, :])
        
            gt_boxes[i, 0] = np.min(x2d[0, :])
            gt_boxes[i, 1] = np.min(x2d[1, :])
            gt_boxes[i, 2] = np.max(x2d[0, :])
            gt_boxes[i, 3] = np.max(x2d[1, :])
            gt_boxes[i, 4] = cls


        # construct the meta data
        """
        format of the meta_data
        intrinsic matrix: meta_data[0 ~ 8]
        inverse intrinsic matrix: meta_data[9 ~ 17]
        """
        K = self._intrinsic_matrix
        K[2, 2] = 1
        Kinv = np.linalg.pinv(K)
        meta_data_blob = np.zeros(18, dtype=np.float32)
        meta_data_blob[0:9] = K.flatten()
        meta_data_blob[9:18] = Kinv.flatten()

        # vertex regression target
        if cfg.TRAIN.VERTEX_REG:
            vertex_targets, vertex_weights = self._generate_vertex_targets(im_label, indexes_target, centers, poses_all, classes, self.num_classes)
        elif cfg.TRAIN.VERTEX_REG_DELTA and (cfg.INPUT == 'DEPTH' or cfg.INPUT == 'RGBD'):
            vertex_targets, vertex_weights = self._generate_vertex_deltas(im_label, indexes_target, centers, poses_all,
                                                                           classes, self.num_classes, im_depth)
        else:
            vertex_targets = []
            vertex_weights = []

        # im is HxWxC here, so record (height, width, scale, is_syn)
        im_info = np.array([im.shape[0], im.shape[1], cfg.TRAIN.SCALES_BASE[0], 1], dtype=np.float32)

        sample = {'image_color': im_cuda,
                  'image_depth': im_cuda_depth,
                  'im_depth': im_depth_return,
                  'label': label_blob,
                  'mask': mask,
                  'mask_depth': mask_depth_cuda,
                  'meta_data': meta_data_blob,
                  'poses': pose_blob,
                  'extents': self._extents,
                  'points': self._point_blob,
                  'symmetry': self._symmetry,
                  'gt_boxes': gt_boxes,
                  'im_info': im_info}

        if cfg.TRAIN.VERTEX_REG or cfg.TRAIN.VERTEX_REG_DELTA:
            sample['vertex_targets'] = vertex_targets
            sample['vertex_weights'] = vertex_weights

        return sample


    def __getitem__(self, index):
        return self._render_item()


    def __len__(self):
        return self._size


    # compute the voting label image in 2D
    def _generate_vertex_targets(self, im_label, cls_indexes, center, poses, classes, num_classes):

        width = im_label.shape[1]
        height = im_label.shape[0]
        vertex_targets = np.zeros((3 * num_classes, height, width), dtype=np.float32)
        vertex_weights = np.zeros((3 * num_classes, height, width), dtype=np.float32)

        c = np.zeros((2, 1), dtype=np.float32)
        for i in range(1, num_classes):
            y, x = np.where(im_label == classes[i])
            ind = np.where(cls_indexes == classes[i])[0]
            if len(x) > 0 and len(ind) > 0:
                c[0] = center[ind, 0]
                c[1] = center[ind, 1]
                z = poses[int(ind)][2]
                R = np.tile(c, (1, len(x))) - np.vstack((x, y))
                # compute the norm
                N = np.linalg.norm(R, axis=0) + 1e-10
                # normalization
                R = np.divide(R, np.tile(N, (2,1)))
                # assignment
                vertex_targets[3*i+0, y, x] = R[0,:]
                vertex_targets[3*i+1, y, x] = R[1,:]
                vertex_targets[3*i+2, y, x] = math.log(z)

                vertex_weights[3*i+0, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3*i+1, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3*i+2, y, x] = cfg.TRAIN.VERTEX_W_INSIDE

        return vertex_targets, vertex_weights


    def _generate_vertex_deltas(self, im_label, cls_indexes, center, poses, classes, num_classes, im_depth):

        x_image = im_depth[:, :, 0]
        y_image = im_depth[:, :, 1]
        z_image = im_depth[:, :, 2]

        width = im_label.shape[1]
        height = im_label.shape[0]
        vertex_targets = np.zeros((3 * num_classes, height, width), dtype=np.float32)
        vertex_weights = np.zeros((3 * num_classes, height, width), dtype=np.float32)

        c = np.zeros((2, 1), dtype=np.float32)
        for i in range(1, num_classes):

            valid_mask = (z_image != 0.0)
            label_mask = (im_label == classes[i])
            fin_mask = valid_mask * label_mask

            y, x = np.where(fin_mask)
            ind = np.where(cls_indexes == classes[i])[0]
            if len(x) > 0 and len(ind) > 0:

                extents_here = self._extents[i, :]
                largest_dim = np.sqrt(np.sum(extents_here * extents_here))
                half_diameter = largest_dim / 2.0
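                # normalizing the XYZ offsets by half the model diameter keeps
                # the regression targets roughly within [-1, 1]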

                c[0] = center[ind, 0]
                c[1] = center[ind, 1]

                if isinstance(poses, list):
                    x_center_coord = poses[int(ind)][0]
                    y_center_coord = poses[int(ind)][1]
                    z_center_coord = poses[int(ind)][2]
                else:
                    if len(poses.shape) == 3:
                        x_center_coord = poses[ind, 0, 3]
                        y_center_coord = poses[ind, 1, 3]
                        z_center_coord = poses[ind, 2, 3]
                    else:
                        x_center_coord = poses[ind, -3]
                        y_center_coord = poses[ind, -2]
                        z_center_coord = poses[ind, -1]

                targets_x = (x_image[y, x] - x_center_coord) / half_diameter
                targets_y = (y_image[y, x] - y_center_coord) / half_diameter
                targets_z = (z_image[y, x] - z_center_coord) / half_diameter

                vertex_targets[3 * i + 0, y, x] = targets_x
                vertex_targets[3 * i + 1, y, x] = targets_y
                vertex_targets[3 * i + 2, y, x] = targets_z

                vertex_weights[3 * i + 0, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3 * i + 1, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3 * i + 2, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
        
        return vertex_targets, vertex_weights


    def _get_default_path(self):
        """
        Return the default path where ycb_object is expected to be installed.
        """
        return os.path.join(datasets.ROOT_DIR, 'data', 'YCB_Object')


    def _load_object_extents(self):

        extents = np.zeros((self._num_classes_all, 3), dtype=np.float32)
        for i in range(1, self._num_classes_all):
            point_file = os.path.join(self._model_path, self._classes_all[i], 'points.xyz')
            print(point_file)
            assert os.path.exists(point_file), 'Path does not exist: {}'.format(point_file)
            points = np.loadtxt(point_file)
            extents[i, :] = 2 * np.max(np.absolute(points), axis=0)

        return extents


    def _load_object_points(self, classes, extents, symmetry):

        points = [[] for _ in range(len(classes))]
        num = np.inf
        num_classes = len(classes)
        for i in range(1, num_classes):
            point_file = os.path.join(self._model_path, classes[i], 'points.xyz')
            print(point_file)
            assert os.path.exists(point_file), 'Path does not exist: {}'.format(point_file)
            points[i] = np.loadtxt(point_file)
            if points[i].shape[0] < num:
                num = points[i].shape[0]

        points_all = np.zeros((num_classes, num, 3), dtype=np.float32)
        for i in range(1, num_classes):
            points_all[i, :, :] = points[i][:num, :]

        # rescale the points
        point_blob = points_all.copy()
        for i in range(1, num_classes):
            # compute the rescaling factor for the points
            weight = 10.0 / np.amax(extents[i, :])
            if weight < 10:
                weight = 10
            if symmetry[i] > 0:
                point_blob[i, :, :] = 4 * weight * point_blob[i, :, :]
            else:
                point_blob[i, :, :] = weight * point_blob[i, :, :]

        return points, points_all, point_blob
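    # Illustrative note, not part of the original file: point_blob rescales the
    # model points before they feed the point-matching loss so that all classes
    # contribute at a comparable magnitude. With the clamp above, the factor is
    # effectively max(10.0 / max_extent, 10), and symmetric classes receive an
    # extra factor of 4. For example, a 0.2 m object gets 10.0 / 0.2 = 50, while
    # a hypothetical 2 m object would be clamped from 5 up to 10.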


    def labels_to_image(self, labels):

        height = labels.shape[0]
        width = labels.shape[1]
        im_label = np.zeros((height, width, 3), dtype=np.uint8)
        for i in range(self.num_classes):
            I = np.where(labels == i)
            im_label[I[0], I[1], :] = self._class_colors[i]

        return im_label


    def process_label_image(self, label_image):
        """
        convert a color label image into per-pixel class indexes
        """
        height = label_image.shape[0]
        width = label_image.shape[1]
        labels = np.zeros((height, width), dtype=np.int32)
        labels_all = np.zeros((height, width), dtype=np.int32)

        # label image is in BGR order
        index = label_image[:,:,2] + 256*label_image[:,:,1] + 256*256*label_image[:,:,0]
        for i in range(1, len(self._class_colors_all)):
            color = self._class_colors_all[i]
            ind = color[0] + 256*color[1] + 256*256*color[2]
            I = np.where(index == ind)
            labels_all[I[0], I[1]] = i

            ind = np.where(np.array(cfg.TRAIN.CLASSES) == i)[0]
            if len(ind) > 0:
                labels[I[0], I[1]] = ind

        return labels, labels_all
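    # Illustrative note, not part of the original file: process_label_image
    # packs each BGR pixel into a single 24-bit integer so the per-class color
    # lookup becomes one vectorized comparison per class:
    #
    #   packed = img[:, :, 2].astype(np.int32) + 256 * img[:, :, 1] + 65536 * img[:, :, 0]
    #   labels_all[packed == (r + 256 * g + 65536 * b)] = i   # (r, g, b) = class color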


================================================
FILE: lib/datasets/ycb_self_supervision.py
================================================
# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.
# This work is licensed under the NVIDIA Source Code License - Non-commercial. Full
# text can be found in LICENSE.md

import torch
import torch.utils.data as data

import os, math
import sys
import os.path as osp
from os.path import *
import numpy as np
import numpy.random as npr
import cv2
import scipy.io
import copy
import glob
try:
    import cPickle  # Use cPickle on Python 2.7
except ImportError:
    import pickle as cPickle

import datasets
from fcn.config import cfg
from utils.blob import pad_im, chromatic_transform, add_noise, add_noise_cuda, add_noise_depth_cuda
from transforms3d.quaternions import mat2quat, quat2mat
from transforms3d.euler import euler2quat  # used by _render_item
from utils.se3 import *
from utils.pose_error import *
from utils.cython_bbox import bbox_overlaps
from utils.segmentation_evaluation import multilabel_metrics

def VOCap(rec, prec):
    index = np.where(np.isfinite(rec))[0]
    rec = rec[index]
    prec = prec[index]
    if len(rec) == 0 or len(prec) == 0:
        ap = 0
    else:
        mrec = np.insert(rec, 0, 0)
        mrec = np.append(mrec, 0.1)
        mpre = np.insert(prec, 0, 0)
        mpre = np.append(mpre, prec[-1])
        for i in range(1, len(mpre)):
            mpre[i] = max(mpre[i], mpre[i-1])
        i = np.where(mrec[1:] != mrec[:-1])[0] + 1
        ap = np.sum(np.multiply(mrec[i] - mrec[i-1], mpre[i])) * 10
    return ap
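# Illustrative note, not part of the original file: VOCap integrates the
# accuracy-vs-threshold curve only up to 0.1 m (the YCB convention) and
# multiplies the area by 10 so that a perfect curve scores 1.0. A minimal
# usage sketch with assumed inputs:
#
#   d = np.sort(per_object_add_distances)        # ascending thresholds
#   acc = np.cumsum(np.ones(len(d))) / len(d)    # fraction of objects below d
#   ap = VOCap(d, acc)                           # area under the curve on [0, 0.1]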

class YCBSelfSupervision(data.Dataset, datasets.imdb):
    def __init__(self, image_set, ycb_self_supervision_path = None):

        self._name = 'ycb_self_supervision_' + image_set
        self._image_set = image_set
        self._ycb_self_supervision_path = self._get_default_path() if ycb_self_supervision_path is None \
                            else ycb_self_supervision_path
        self._data_path = os.path.join(self._ycb_self_supervision_path, 'data')
        self._model_path = os.path.join(datasets.ROOT_DIR, 'data', 'models')

        # define all the classes
        self._classes_all = ('__background__', '002_master_chef_can', '003_cracker_box', '004_sugar_box', '005_tomato_soup_can', '006_mustard_bottle', \
                         '007_tuna_fish_can', '008_pudding_box', '009_gelatin_box', '010_potted_meat_can', '011_banana', '019_pitcher_base', \
                         '021_bleach_cleanser', '024_bowl', '025_mug', '035_power_drill', '036_wood_block', '037_scissors', '040_large_marker', \
                         '051_large_clamp', '052_extra_large_clamp', '061_foam_brick', 'holiday_cup1', 'holiday_cup2', 'sanning_mug', \
                         '001_chips_can', 'block_red_big', 'block_green_big', 'block_blue_big', 'block_yellow_big', \
                         'block_red_small', 'block_green_small', 'block_blue_small', 'block_yellow_small', \
                         'block_red_median', 'block_green_median', 'block_blue_median', 'block_yellow_median', 'fusion_duplo_dude', 'cabinet_handle')
        self._num_classes_all = len(self._classes_all)
        self._class_colors_all = [(255, 255, 255), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (255, 0, 255), (0, 255, 255), \
                              (0, 0, 128), (0, 128, 0), (128, 0, 0), (128, 128, 0), (128, 0, 128), (0, 128, 128), \
                              (0, 64, 0), (64, 0, 0), (0, 0, 64), (64, 64, 0), (64, 0, 64), (0, 64, 64), \
                              (192, 0, 0), (0, 192, 0), (0, 0, 192), (192, 192, 0), (192, 0, 192), (0, 192, 192), (32, 0, 0), \
                              (150, 0, 0), (0, 150, 0), (0, 0, 150), (150, 150, 0), (75, 0, 0), (0, 75, 0), (0, 0, 75), (75, 75, 0), \
                              (200, 0, 0), (0, 200, 0), (0, 0, 200), (200, 200, 0), (16, 16, 0), (16, 16, 16)]
        self._extents_all = self._load_object_extents()

        self._width = cfg.TRAIN.SYN_WIDTH
        self._height = cfg.TRAIN.SYN_HEIGHT
        self._intrinsic_matrix = np.array([[616.3653,    0.,      310.25882],
                                           [  0.,      616.20294, 236.59981],
                                           [  0.,        0.,        1.     ]])

        if self._width == 1280:
            self._intrinsic_matrix = np.array([[599.48681641,   0.,         639.84338379],
                                               [  0.,         599.24389648, 366.09042358],
                                               [  0.,           0.,           1.        ]])


        # select a subset of classes
        self._classes = [self._classes_all[i] for i in cfg.TRAIN.CLASSES]
        self._classes_test = [self._classes_all[i] for i in cfg.TEST.CLASSES]
        self._num_classes = len(self._classes)
        self._class_colors = [self._class_colors_all[i] for i in cfg.TRAIN.CLASSES]
        self._class_colors_test = [self._class_colors_all[i] for i in cfg.TEST.CLASSES]
        self._symmetry = np.array(cfg.TRAIN.SYMMETRY).astype(np.float32)
        self._symmetry_test = np.array(cfg.TEST.SYMMETRY).astype(np.float32)
        self._extents = self._extents_all[cfg.TRAIN.CLASSES]
        self._extents_test = self._extents_all[cfg.TEST.CLASSES]
        self._points, self._points_all, self._point_blob = self._load_object_points(self._classes, self._extents, self._symmetry)
        self._points_test, self._points_all_test, self._point_blob_test = \
            self._load_object_points(self._classes_test, self._extents_test, self._symmetry_test)
        self._pixel_mean = torch.tensor(cfg.PIXEL_MEANS / 255.0).cuda().float()

        self._classes_other = []
        for i in range(self._num_classes_all):
            if i not in cfg.TRAIN.CLASSES:
                # '051_large_clamp' and '052_extra_large_clamp' look alike; skip one when the other is a target class
                if i == 19 and 20 in cfg.TRAIN.CLASSES:
                    continue
                if i == 20 and 19 in cfg.TRAIN.CLASSES:
                    continue
                self._classes_other.append(i)
        self._num_classes_other = len(self._classes_other)

        # 3D model paths
        self.model_sdf_paths = ['{}/{}/textured_simple_low_res.pth'.format(self._model_path, cls) for cls in self._classes_all[1:]]
        self.model_colors = [np.array(self._class_colors_all[i]) / 255.0 for i in range(1, len(self._classes_all))]

        self.model_mesh_paths = []
        for cls in self._classes_all[1:]:
            filename = '{}/{}/textured_simple.ply'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_mesh_paths.append(filename)
                continue
            filename = '{}/{}/textured_simple.obj'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_mesh_paths.append(filename)

        self.model_texture_paths = []
        for cls in self._classes_all[1:]:
            filename = '{}/{}/texture_map.png'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_texture_paths.append(filename)
            else:
                self.model_texture_paths.append('')

        # target meshes
        self.model_colors_target = [np.array(self._class_colors_all[i]) / 255.0 for i in cfg.TRAIN.CLASSES[1:]]
        self.model_mesh_paths_target = []
        for cls in self._classes[1:]:
            filename = '{}/{}/textured_simple.ply'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_mesh_paths_target.append(filename)
                continue
            filename = '{}/{}/textured_simple.obj'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_mesh_paths_target.append(filename)

        self.model_texture_paths_target = []
        for cls in self._classes[1:]:
            filename = '{}/{}/texture_map.png'.format(self._model_path, cls)
            if osp.exists(filename):
                self.model_texture_paths_target.append(filename)
            else:
                self.model_texture_paths_target.append('')

        self._class_to_ind = dict(zip(self._classes, range(self._num_classes)))
        self._image_ext = '.png'
        self._image_index = self._load_image_set_index(image_set)

        if (cfg.MODE == 'TRAIN' and cfg.TRAIN.SYNTHESIZE) or (cfg.MODE == 'TEST' and cfg.TEST.SYNTHESIZE):
            self._size = len(self._image_index) * (cfg.TRAIN.SYN_RATIO+1)
        else:
            self._size = len(self._image_index)

        if self._size > cfg.TRAIN.MAX_ITERS_PER_EPOCH * cfg.TRAIN.IMS_PER_BATCH:
            self._size = cfg.TRAIN.MAX_ITERS_PER_EPOCH * cfg.TRAIN.IMS_PER_BATCH
        self._roidb = self.gt_roidb()
        if cfg.MODE == 'TRAIN' or cfg.TEST.VISUALIZE:
            self._perm = np.random.permutation(np.arange(len(self._roidb)))
        else:
            self._perm = np.arange(len(self._roidb))
        self._cur = 0
        self._build_uniform_poses()
        self.lb_shift = -0.2
        self.ub_shift = 0.2
        self.lb_scale = 0.8
        self.ub_scale = 2.0

        assert os.path.exists(self._ycb_self_supervision_path), \
                'ycb_self_supervision path does not exist: {}'.format(self._ycb_self_supervision_path)
        assert os.path.exists(self._data_path), \
                'Data path does not exist: {}'.format(self._data_path)

        # construct fake inputs
        label_blob = np.zeros((1, self._num_classes, self._height, self._width), dtype=np.float32)
        pose_blob = np.zeros((1, self._num_classes, 9), dtype=np.float32)
        gt_boxes = np.zeros((1, self._num_classes, 5), dtype=np.float32)

        # construct the meta data
        K = self._intrinsic_matrix
        Kinv = np.linalg.pinv(K)
        meta_data_blob = np.zeros((1, 18), dtype=np.float32)
        meta_data_blob[0, 0:9] = K.flatten()
        meta_data_blob[0, 9:18] = Kinv.flatten()

        self.input_labels = torch.from_numpy(label_blob).cuda()
        self.input_meta_data = torch.from_numpy(meta_data_blob).cuda()
        self.input_extents = torch.from_numpy(self._extents).cuda()
        self.input_gt_boxes = torch.from_numpy(gt_boxes).cuda()
        self.input_poses = torch.from_numpy(pose_blob).cuda()
        self.input_points = torch.from_numpy(self._point_blob).cuda()
        self.input_symmetry = torch.from_numpy(self._symmetry).cuda()


    def _render_item(self):

        height = cfg.TRAIN.SYN_HEIGHT
        width = cfg.TRAIN.SYN_WIDTH
        fx = self._intrinsic_matrix[0, 0]
        fy = self._intrinsic_matrix[1, 1]
        px = self._intrinsic_matrix[0, 2]
        py = self._intrinsic_matrix[1, 2]
        zfar = 6.0
        znear = 0.01

        # sample target objects
        if cfg.TRAIN.SYN_SAMPLE_OBJECT:
            maxnum = np.minimum(self.num_classes-1, cfg.TRAIN.SYN_MAX_OBJECT)
            num = np.random.randint(cfg.TRAIN.SYN_MIN_OBJECT, maxnum+1)
            perm = np.random.permutation(np.arange(self.num_classes-1))
            indexes_target = perm[:num] + 1
        else:
            num = self.num_classes - 1
            indexes_target = np.arange(num) + 1
        num_target = num
        cls_indexes = [cfg.TRAIN.CLASSES[i]-1 for i in indexes_target]

        # sample other objects as distractors
        if cfg.TRAIN.SYN_SAMPLE_DISTRACTOR:
            num_other = min(5, self._num_classes_other)
            num_selected = np.random.randint(0, num_other+1)
            perm = np.random.permutation(np.arange(self._num_classes_other))
            indexes = perm[:num_selected]
            for i in range(num_selected):
                cls_indexes.append(self._classes_other[indexes[i]]-1)
        else:
            num_selected = 0

        # sample poses
        num = num_target + num_selected
        poses_all = []
        for i in range(num):
            qt = np.zeros((7, ), dtype=np.float32)
            # rotation
            cls = int(cls_indexes[i])
            if self.pose_indexes[cls] >= len(self.pose_lists[cls]):
                self.pose_indexes[cls] = 0
                self.pose_lists[cls] = np.random.permutation(np.arange(len(self.eulers)))
            yaw = self.eulers[self.pose_lists[cls][self.pose_indexes[cls]]][0] + 15 * np.random.randn()
            pitch = self.eulers[self.pose_lists[cls][self.pose_indexes[cls]]][1] + 15 * np.random.randn()
            pitch = np.clip(pitch, -90, 90)
            roll = self.eulers[self.pose_lists[cls][self.pose_indexes[cls]]][2] + 15 * np.random.randn()
            qt[3:] = euler2quat(yaw * math.pi / 180.0, pitch * math.pi / 180.0, roll * math.pi / 180.0, 'syxz')
            self.pose_indexes[cls] += 1

            # translation
            bound = cfg.TRAIN.SYN_BOUND
            if i == 0 or i >= num_target or np.random.rand(1) > 0.5:
                qt[0] = np.random.uniform(-bound, bound)
                qt[1] = np.random.uniform(-bound, bound)
                qt[2] = np.random.uniform(cfg.TRAIN.SYN_TNEAR, cfg.TRAIN.SYN_TFAR)
            else:
                # sample an object nearby
                object_id = np.random.randint(0, i, size=1)[0]
                extent = np.mean(self._extents_all[cls+1, :])

                flag = np.random.randint(0, 2)
                if flag == 0:
                    flag = -1
                qt[0] = poses_all[object_id][0] + flag * extent * np.random.uniform(1.0, 1.5)
                if np.absolute(qt[0]) > bound:
                    qt[0] = poses_all[object_id][0] - flag * extent * np.random.uniform(1.0, 1.5)
                if np.absolute(qt[0]) > bound:
                    qt[0] = np.random.uniform(-bound, bound)

                flag = np.random.randint(0, 2)
                if flag == 0:
                    flag = -1
                qt[1] = poses_all[object_id][1] + flag * extent * np.random.uniform(1.0, 1.5)
                if np.absolute(qt[1]) > bound:
                    qt[1] = poses_all[object_id][1] - flag * extent * np.random.uniform(1.0, 1.5)
                if np.absolute(qt[1]) > bound:
                    qt[1] = np.random.uniform(-bound, bound)

                qt[2] = poses_all[object_id][2] - extent * np.random.uniform(2.0, 4.0)
                if qt[2] < cfg.TRAIN.SYN_TNEAR:
                    qt[2] = poses_all[object_id][2] + extent * np.random.uniform(2.0, 4.0)

            poses_all.append(qt)
        cfg.renderer.set_poses(poses_all)

        # sample lighting
        cfg.renderer.set_light_pos(np.random.uniform(-0.5, 0.5, 3))

        intensity = np.random.uniform(0.8, 2)
        light_color = intensity * np.random.uniform(0.9, 1.1, 3)
        cfg.renderer.set_light_color(light_color)
            
        # rendering
        cfg.renderer.set_projection_matrix(width, height, fx, fy, px, py, znear, zfar)
        image_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        seg_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        pc_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        cfg.renderer.render(cls_indexes, image_tensor, seg_tensor, pc2_tensor=pc_tensor)
        image_tensor = image_tensor.flip(0)
        seg_tensor = seg_tensor.flip(0)
        pc_tensor = pc_tensor.flip(0)

        # foreground mask
        seg = seg_tensor[:,:,2] + 256*seg_tensor[:,:,1] + 256*256*seg_tensor[:,:,0]
        mask = (seg != 0).unsqueeze(0).repeat((3, 1, 1)).float()

        # RGB to BGR order
        im = image_tensor.cpu().numpy()
        im = np.clip(im, 0, 1)
        im = im[:, :, (2, 1, 0)] * 255
        im = im.astype(np.uint8)

        # XYZ coordinates in camera frame
        im_depth = pc_tensor.cpu().numpy()
        im_depth = im_depth[:, :, :3]
        im_depth_return = im_depth[:, :, 2].copy()

        im_label = seg_tensor.cpu().numpy()
        im_label = im_label[:, :, (2, 1, 0)] * 255
        im_label = np.round(im_label).astype(np.uint8)
        im_label = np.clip(im_label, 0, 255)
        im_label, im_label_all = self.process_label_image(im_label)

        centers = np.zeros((num, 2), dtype=np.float32)
        rcenters = cfg.renderer.get_centers()
        for i in range(num):
            centers[i, 0] = rcenters[i][1] * width
            centers[i, 1] = rcenters[i][0] * height
        centers = centers[:num_target, :]

        '''
        import matplotlib.pyplot as plt
        fig = plt.figure()
        ax = fig.add_subplot(3, 2, 1)
        plt.imshow(im[:, :, (2, 1, 0)])
        for i in range(num_target):
            plt.plot(centers[i, 0], centers[i, 1], 'yo')
        ax = fig.add_subplot(3, 2, 2)
        plt.imshow(im_label)
        ax = fig.add_subplot(3, 2, 3)
        plt.imshow(im_depth[:, :, 0])
        ax = fig.add_subplot(3, 2, 4)
        plt.imshow(im_depth[:, :, 1])
        ax = fig.add_subplot(3, 2, 5)
        plt.imshow(im_depth[:, :, 2])
        plt.show()
        #'''

        # chromatic transform
        if cfg.TRAIN.CHROMATIC and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im = chromatic_transform(im)

        im_cuda = torch.from_numpy(im).cuda().float() / 255.0
        if cfg.TRAIN.ADD_NOISE and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im_cuda = add_noise_cuda(im_cuda)
        im_cuda -= self._pixel_mean
        im_cuda = im_cuda.permute(2, 0, 1)

        if cfg.INPUT == 'DEPTH' or cfg.INPUT == 'RGBD':

            # depth mask
            z_im = im_depth[:, :, 2]
            mask_depth = z_im > 0.0
            mask_depth = mask_depth.astype('float')
            mask_depth_cuda = torch.from_numpy(mask_depth).cuda().float()
            mask_depth_cuda.unsqueeze_(0)

            im_cuda_depth = torch.from_numpy(im_depth).cuda().float()
            if cfg.TRAIN.ADD_NOISE and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
                im_cuda_depth = add_noise_depth_cuda(im_cuda_depth)
            im_cuda_depth = im_cuda_depth.permute(2, 0, 1)
        else:
            im_cuda_depth = im_cuda.clone()
            mask_depth_cuda = torch.cuda.FloatTensor(1, height, width).fill_(0)

        # label blob
        classes = np.array(range(self.num_classes))
        label_blob = np.zeros((self.num_classes, self._height, self._width), dtype=np.float32)
        label_blob[0, :, :] = 1.0
        for i in range(1, self.num_classes):
            I = np.where(im_label == classes[i])
            if len(I[0]) > 0:
                label_blob[i, I[0], I[1]] = 1.0
                label_blob[0, I[0], I[1]] = 0.0

        # poses and boxes
        pose_blob = np.zeros((self.num_classes, 9), dtype=np.float32)
        gt_boxes = np.zeros((self.num_classes, 5), dtype=np.float32)
        count = 0
        for i in range(num_target):
            cls = int(indexes_target[i])
            T = poses_all[i][:3]
            qt = poses_all[i][3:]

            I = np.where(im_label == cls)
            if len(I[0]) == 0:
                continue

            # compute box
            x3d = np.ones((4, self._points_all.shape[1]), dtype=np.float32)
            x3d[0, :] = self._points_all[cls,:,0]
            x3d[1, :] = self._points_all[cls,:,1]
            x3d[2, :] = self._points_all[cls,:,2]
            RT = np.zeros((3, 4), dtype=np.float32)
            RT[:3, :3] = quat2mat(qt)
            RT[:, 3] = T
            x2d = np.matmul(self._intrinsic_matrix, np.matmul(RT, x3d))
            x2d[0, :] = np.divide(x2d[0, :], x2d[2, :])
            x2d[1, :] = np.divide(x2d[1, :], x2d[2, :])

            x1 = np.min(x2d[0, :])
            y1 = np.min(x2d[1, :])
            x2 = np.max(x2d[0, :])
            y2 = np.max(x2d[1, :])
            if x1 > width or y1 > height or x2 < 0 or y2 < 0:
                continue

            gt_boxes[count, 0] = x1
            gt_boxes[count, 1] = y1
            gt_boxes[count, 2] = x2
            gt_boxes[count, 3] = y2
            gt_boxes[count, 4] = cls

            pose_blob[count, 0] = 1
            pose_blob[count, 1] = cls
            # egocentric to allocentric, so the regressed rotation is independent of image location
            qt_allocentric = egocentric2allocentric(qt, T)
            if qt_allocentric[0] < 0:
                qt_allocentric = -1 * qt_allocentric
            pose_blob[count, 2:6] = qt_allocentric
            pose_blob[count, 6:] = T
            count += 1


        # construct the meta data
        """
        format of the meta_data
        intrinsic matrix: meta_data[0 ~ 8]
        inverse intrinsic matrix: meta_data[9 ~ 17]
        """
        K = self._intrinsic_matrix
        K[2, 2] = 1
        Kinv = np.linalg.pinv(K)
        meta_data_blob = np.zeros(18, dtype=np.float32)
        meta_data_blob[0:9] = K.flatten()
        meta_data_blob[9:18] = Kinv.flatten()

        # vertex regression target
        if cfg.TRAIN.VERTEX_REG:
            vertex_targets, vertex_weights = self._generate_vertex_targets(im_label, indexes_target, centers, poses_all, classes, self.num_classes)
        elif cfg.TRAIN.VERTEX_REG_DELTA and (cfg.INPUT == 'DEPTH' or cfg.INPUT == 'RGBD'):
            vertex_targets, vertex_weights = self._generate_vertex_deltas(im_label, indexes_target, centers, poses_all,
                                                                           classes, self.num_classes, im_depth)
        else:
            vertex_targets = []
            vertex_weights = []

        im_info = np.array([height, width, cfg.TRAIN.SCALES_BASE[0], 1], dtype=np.float32)

        sample = {'image_color': im_cuda,
                  'im_depth': im_depth_return,
                  'label': label_blob,
                  'mask': mask,
                  'meta_data': meta_data_blob,
                  'poses': pose_blob,
                  'extents': self._extents,
                  'points': self._point_blob,
                  'symmetry': self._symmetry,
                  'gt_boxes': gt_boxes,
                  'im_info': im_info,
                  'meta_data_path': ''}

        if cfg.TRAIN.VERTEX_REG or cfg.TRAIN.VERTEX_REG_DELTA:
            sample['vertex_targets'] = vertex_targets
            sample['vertex_weights'] = vertex_weights

        # affine transformation
        if cfg.TRAIN.AFFINE:
            shift = np.float32([np.random.uniform(self.lb_shift, self.ub_shift), np.random.uniform(self.lb_shift, self.ub_shift)])
            scale = np.random.uniform(self.lb_scale, self.ub_scale)
            affine_matrix = np.float32([[scale, 0, shift[0]], [0, scale, shift[1]]])

            affine_1 = np.eye(3, dtype=np.float32)
            affine_1[0, 2] = -0.5 * self._width
            affine_1[1, 2] = -0.5 * self._height

            affine_2 = np.eye(3, dtype=np.float32)
            affine_2[0, 0] = 1.0 / scale
            affine_2[0, 2] = -shift[0] * 0.5 * self._width / scale
            affine_2[1, 1] = 1.0 / scale
            affine_2[1, 2] = -shift[1] * 0.5 * self._height / scale

            affine_3 = np.matmul(affine_2, affine_1)
            affine_4 = np.matmul(np.linalg.inv(affine_1), affine_3)
            affine_matrix_coordinate = affine_4[:3, :]

            sample['affine_matrix'] = torch.from_numpy(affine_matrix).cuda()
            sample['affine_matrix_coordinate'] = torch.from_numpy(affine_matrix_coordinate).cuda()

        return sample
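    # Illustrative note, not part of the original file: the affine augmentation
    # above samples a shift in [-0.2, 0.2] (normalized units) and a scale in
    # [0.8, 2.0]. 'affine_matrix' is a 2x3 matrix in the layout consumed by
    # grid-based warping (e.g. torch.nn.functional.affine_grid; assumed
    # consumer, not shown in this file), while 'affine_matrix_coordinate' is
    # the composed inverse mapping conjugated about the image center, for
    # transforming pixel coordinates consistently with the warped image.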


    def __getitem__(self, index):

        is_syn = 0
        if ((cfg.MODE == 'TRAIN' and cfg.TRAIN.SYNTHESIZE) or (cfg.MODE == 'TEST' and cfg.TEST.SYNTHESIZE)) and (index % (cfg.TRAIN.SYN_RATIO+1) != 0):
            is_syn = 1

        if is_syn:
            return self._render_item()

        if self._cur >= len(self._roidb):
            self._perm = np.random.permutation(np.arange(len(self._roidb)))
            self._cur = 0
        db_ind = self._perm[self._cur]
        roidb = self._roidb[db_ind]
        self._cur += 1

        # Get the input image blob
        random_scale_ind = npr.randint(0, high=len(cfg.TRAIN.SCALES_BASE))
        im_blob, im_depth, im_scale, height, width = self._get_image_blob(roidb, random_scale_ind)

        # build the label blob
        label_blob, mask, meta_data_blob, pose_blob, gt_boxes, vertex_targets, vertex_weights \
            = self._get_label_blob(roidb, self._num_classes, im_scale, height, width)

        im_info = np.array([im_blob.shape[1], im_blob.shape[2], im_scale, is_syn], dtype=np.float32)
        mask_depth_cuda = torch.cuda.FloatTensor(1, height, width).fill_(0)

        sample = {'image_color': im_blob,
                  'im_depth': im_depth,
                  'label': label_blob,
                  'mask': mask,
                  'meta_data': meta_data_blob,
                  'poses': pose_blob,
                  'extents': self._extents,
                  'points': self._point_blob,
                  'symmetry': self._symmetry,
                  'gt_boxes': gt_boxes,
                  'im_info': im_info,
                  'meta_data_path': roidb['meta_data']}

        if cfg.TRAIN.VERTEX_REG:
            sample['vertex_targets'] = vertex_targets
            sample['vertex_weights'] = vertex_weights

        return sample
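    # Illustrative note, not part of the original file: with SYNTHESIZE enabled,
    # __getitem__ interleaves real and rendered frames at cfg.TRAIN.SYN_RATIO
    # synthetic images per real image. For example, SYN_RATIO = 3 makes indices
    # 0, 4, 8, ... read the real roidb while every other index calls
    # _render_item().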


    def _get_image_blob(self, roidb, scale_ind):    

        # rgba
        rgba = pad_im(cv2.imread(roidb['image'], cv2.IMREAD_UNCHANGED), 16)
        if rgba.shape[2] == 4:
            im = np.copy(rgba[:,:,:3])
            alpha = rgba[:,:,3]
            I = np.where(alpha == 0)
            im[I[0], I[1], :] = 0
        else:
            im = rgba

        im_scale = cfg.TRAIN.SCALES_BASE[scale_ind]
        if im_scale != 1.0:
            im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_LINEAR)
        height = im.shape[0]
        width = im.shape[1]

        if roidb['flipped']:
            im = im[:, ::-1, :]

        # chromatic transform
        if cfg.TRAIN.CHROMATIC and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im = chromatic_transform(im)

        im_cuda = torch.from_numpy(im).cuda().float() / 255.0
        if cfg.TRAIN.ADD_NOISE and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im_cuda = add_noise_cuda(im_cuda)
        im_cuda -= self._pixel_mean
        im_cuda = im_cuda.permute(2, 0, 1)

        # depth image
        im_depth = pad_im(cv2.imread(roidb['depth'], cv2.IMREAD_UNCHANGED), 16)
        if im_scale != 1.0:
            im_depth = cv2.resize(im_depth, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_NEAREST)
        im_depth = im_depth.astype(np.float32) / 1000.0

        return im_cuda, im_depth, im_scale, height, width


    def _get_label_blob(self, roidb, num_classes, im_scale, height, width):
        """ build the label blob """

        meta_data = scipy.io.loadmat(roidb['meta_data'])
        meta_data['cls_indexes'] = meta_data['cls_indexes'].flatten()
        classes = np.array(cfg.TRAIN.CLASSES)
        classes_test = np.array(cfg.TEST.CLASSES).flatten()

        intrinsic_matrix = np.matrix(meta_data['intrinsic_matrix'])
        fx = intrinsic_matrix[0, 0]
        fy = intrinsic_matrix[1, 1]
        px = intrinsic_matrix[0, 2]
        py = intrinsic_matrix[1, 2]
        zfar = 6.0
        znear = 0.01

        # poses
        poses = meta_data['poses']
        if len(poses.shape) == 2:
            poses = np.reshape(poses, (3, 4, 1))
        if roidb['flipped']:
            poses = _flip_poses(poses, meta_data['intrinsic_matrix'], width)
        num = poses.shape[2]

        # render poses to get the label image
        cls_indexes = []
        poses_all = []
        qt = np.zeros((7, ), dtype=np.float32)
        for i in range(num):
            RT = poses[:, :, i]
            qt[:3] = RT[:, 3]
            qt[3:] = mat2quat(RT[:, :3])
            if cfg.MODE == 'TEST':
                index = np.where(classes_test == meta_data['cls_indexes'][i])[0]
                cls_indexes.append(index[0])
            else:
                cls_indexes.append(meta_data['cls_indexes'][i] - 1)
            poses_all.append(qt.copy())
            
        # rendering
        cfg.renderer.set_poses(poses_all)
        cfg.renderer.set_light_pos([0, 0, 0])
        cfg.renderer.set_light_color([1, 1, 1])
        cfg.renderer.set_projection_matrix(width, height, fx, fy, px, py, znear, zfar)
        image_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        seg_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        cfg.renderer.render(cls_indexes, image_tensor, seg_tensor)
        image_tensor = image_tensor.flip(0)
        seg_tensor = seg_tensor.flip(0)

        # semantic labels
        im_label = seg_tensor.cpu().numpy()
        im_label = im_label[:, :, (2, 1, 0)] * 255
        im_label = np.round(im_label).astype(np.uint8)
        im_label = np.clip(im_label, 0, 255)
        im_label, im_label_all = self.process_label_image(im_label)

        centers = np.zeros((num, 2), dtype=np.float32)
        rcenters = cfg.renderer.get_centers()
        for i in range(num):
            centers[i, 0] = rcenters[i][1] * width
            centers[i, 1] = rcenters[i][0] * height

        # label blob
        label_blob = np.zeros((num_classes, height, width), dtype=np.float32)
        label_blob[0, :, :] = 1.0
        for i in range(1, num_classes):
            I = np.where(im_label_all == classes[i])
            if len(I[0]) > 0:
                label_blob[i, I[0], I[1]] = 1.0
                label_blob[0, I[0], I[1]] = 0.0

        '''
        import matplotlib.pyplot as plt
        fig = plt.figure()
        ax = fig.add_subplot(1, 2, 1)
        plt.imshow(im_label)
        for i in range(num):
            plt.plot(centers[i, 0], centers[i, 1], 'yo')
        ax = fig.add_subplot(1, 2, 2)
        plt.imshow(im_label_all)
        plt.show()
        #'''

        # foreground mask
        seg = torch.from_numpy((im_label != 0).astype(np.float32))
        mask = seg.unsqueeze(0).repeat((3, 1, 1)).float().cuda()

        # gt poses
        pose_blob = np.zeros((num_classes, 9), dtype=np.float32)
        gt_boxes = np.zeros((num_classes, 5), dtype=np.float32)
        count = 0
        for i in range(num):
            cls = int(meta_data['cls_indexes'][i])
            ind = np.where(classes == cls)[0]
            if len(ind) > 0:

                I = np.where(im_label == ind[0])
                if len(I[0]) == 0:
                    continue

                R = poses[:, :3, i]
                T = poses[:, 3, i]

                # compute box
                x3d = np.ones((4, self._points_all.shape[1]), dtype=np.float32)
                x3d[0, :] = self._points_all[ind,:,0]
                x3d[1, :] = self._points_all[ind,:,1]
                x3d[2, :] = self._points_all[ind,:,2]
                RT = np.zeros((3, 4), dtype=np.float32)
                RT[:3, :3] = R
                RT[:, 3] = T
                x2d = np.matmul(meta_data['intrinsic_matrix'], np.matmul(RT, x3d))
                x2d[0, :] = np.divide(x2d[0, :], x2d[2, :])
                x2d[1, :] = np.divide(x2d[1, :], x2d[2, :])

                x1 = np.min(x2d[0, :]) * im_scale
                y1 = np.min(x2d[1, :]) * im_scale
                x2 = np.max(x2d[0, :]) * im_scale
                y2 = np.max(x2d[1, :]) * im_scale
                if x1 > width or y1 > height or x2 < 0 or y2 < 0:
                    continue
                gt_boxes[count, 0] = x1
                gt_boxes[count, 1] = y1
                gt_boxes[count, 2] = x2
                gt_boxes[count, 3] = y2
                gt_boxes[count, 4] = ind

                # pose
                pose_blob[count, 0] = 1
                pose_blob[count, 1] = ind
                qt = mat2quat(R)

                # egocentric to allocentric
                qt_allocentric = egocentric2allocentric(qt, T)
                if qt_allocentric[0] < 0:
                    qt_allocentric = -1 * qt_allocentric
                pose_blob[count, 2:6] = qt_allocentric
                pose_blob[count, 6:] = T
                count += 1
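                # Illustrative note, not part of the original file:
                # egocentric2allocentric (from utils.se3) is assumed to factor
                # out the viewpoint rotation that aligns the camera optical
                # axis with the ray towards T, so the stored quaternion
                # describes object orientation independent of image location;
                # the sign flip above picks the unique q/-q representative
                # with a non-negative scalar part.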

        # construct the meta data
        """
        format of the meta_data
        intrinsic matrix: meta_data[0 ~ 8]
        inverse intrinsic matrix: meta_data[9 ~ 17]
        """
        K = np.matrix(meta_data['intrinsic_matrix']) * im_scale
        K[2, 2] = 1
        Kinv = np.linalg.pinv(K)
        meta_data_blob = np.zeros(18, dtype=np.float32)
        meta_data_blob[0:9] = K.flatten()
        meta_data_blob[9:18] = Kinv.flatten()

        # vertex regression target
        if cfg.TRAIN.VERTEX_REG:
            if roidb['flipped']:
                centers[:, 0] = width - centers[:, 0]
            vertex_targets, vertex_weights = self._generate_vertex_targets(im_label_all, meta_data['cls_indexes'], \
                centers, poses_all, classes, num_classes)
        else:
            vertex_targets = []
            vertex_weights = []

        return label_blob, mask, meta_data_blob, pose_blob, gt_boxes, vertex_targets, vertex_weights


    # compute the voting label image in 2D
    def _generate_vertex_targets(self, im_label, cls_indexes, center, poses, classes, num_classes):

        width = im_label.shape[1]
        height = im_label.shape[0]
        vertex_targets = np.zeros((3 * num_classes, height, width), dtype=np.float32)
        vertex_weights = np.zeros((3 * num_classes, height, width), dtype=np.float32)

        c = np.zeros((2, 1), dtype=np.float32)
        for i in range(1, num_classes):
            y, x = np.where(im_label == classes[i])
            I = np.where(im_label == classes[i])
            ind = np.where(cls_indexes == classes[i])[0]
            if len(x) > 0 and len(ind) > 0:
                c[0] = center[ind, 0]
                c[1] = center[ind, 1]
                z = poses[int(ind)][2]
                R = np.tile(c, (1, len(x))) - np.vstack((x, y))
                # compute the norm
                N = np.linalg.norm(R, axis=0) + 1e-10
                # normalization
                R = np.divide(R, np.tile(N, (2,1)))
                # assignment
                vertex_targets[3*i+0, y, x] = R[0,:]
                vertex_targets[3*i+1, y, x] = R[1,:]
                vertex_targets[3*i+2, y, x] = math.log(z)

                vertex_weights[3*i+0, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3*i+1, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3*i+2, y, x] = cfg.TRAIN.VERTEX_W_INSIDE

        return vertex_targets, vertex_weights
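    # Illustrative note, not part of the original file: these targets are the
    # input encoding for the Hough voting layer. Every foreground pixel stores
    # the unit vector pointing from the pixel to the projected object center,
    # plus log(z). The center can be recovered by intersecting two such rays;
    # a minimal sketch for pixels p1, p2 with unit directions n1, n2:
    #
    #   A = np.stack([n1, -n2], axis=1)   # solve p1 + t1*n1 == p2 + t2*n2
    #   t = np.linalg.solve(A, p2 - p1)
    #   center = p1 + t[0] * n1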


    def __len__(self):
        return self._size


    def _get_default_path(self):
        """
        Return the default path where ycb_self_supervision is expected to be installed.
        """
        return os.path.join(datasets.ROOT_DIR, 'data', 'YCB_Self_Supervision')


    def _load_image_set_index(self, image_set):
        """
        Load the indexes of images in the data folder
        """

        image_set_file = os.path.join(self._ycb_self_supervision_path, image_set + '.txt')
        assert os.path.exists(image_set_file), \
                'Path does not exist: {}'.format(image_set_file)

        subdirs = []
        with open(image_set_file) as f:
            for x in f.readlines():
                subdirs.append(x.rstrip('\n'))

        image_index = []
        for i in range(len(subdirs)):
            subdir = subdirs[i]
            folder = osp.join(self._data_path, subdir)
            filename = os.path.join(folder, '*.mat')
            files = glob.glob(filename)
            print(subdir, len(files))
            for k in range(len(files)):
                filename = files[k]
                head, name = os.path.split(filename)
                index = subdir + '/' + name[:-9]  # strip the '_meta.mat' suffix
                image_index.append(index)

        print('=======================================================')
        print('%d images in %s' % (len(image_index), self._data_path))
        print('=======================================================')
        return image_index


    def _load_object_points(self, classes, extents, symmetry):

        points = [[] for _ in range(len(classes))]
        num = np.inf
        num_classes = len(classes)
        for i in range(1, num_classes):
            point_file = os.path.join(self._model_path, classes[i], 'points.xyz')
            print(point_file)
            assert os.path.exists(point_file), 'Path does not exist: {}'.format(point_file)
            points[i] = np.loadtxt(point_file)
            if points[i].shape[0] < num:
                num = points[i].shape[0]

        points_all = np.zeros((num_classes, num, 3), dtype=np.float32)
        for i in range(1, num_classes):
            points_all[i, :, :] = points[i][:num, :]

        # rescale the points
        point_blob = points_all.copy()
        for i in range(1, num_classes):
            # compute the rescaling factor for the points
            weight = 10.0 / np.amax(extents[i, :])
            if weight < 10:
                weight = 10
            if symmetry[i] > 0:
                point_blob[i, :, :] = 4 * weight * point_blob[i, :, :]
            else:
                point_blob[i, :, :] = weight * point_blob[i, :, :]

        return points, points_all, point_blob


    def _load_object_extents(self):

        extents = np.zeros((self._num_classes_all, 3), dtype=np.float32)
        for i in range(1, self._num_classes_all):
            point_file = os.path.join(self._model_path, self._classes_all[i], 'points.xyz')
            print(point_file)
            assert os.path.exists(point_file), 'Path does not exist: {}'.format(point_file)
            points = np.loadtxt(point_file)
            extents[i, :] = 2 * np.max(np.absolute(points), axis=0)

        return extents


    # image
    def image_path_at(self, i):
        """
        Return the absolute path to image i in the image sequence.
        """
        return self.image_path_from_index(self.image_index[i])

    def image_path_from_index(self, index):
        """
        Construct an image path from the image's "index" identifier.
        """

        image_path_jpg = os.path.join(self._data_path, index + '_color.jpg')
        image_path_png = os.path.join(self._data_path, index + '_color.png')
        if os.path.exists(image_path_jpg):
            return image_path_jpg
        elif os.path.exists(image_path_png):
            return image_path_png

        assert os.path.exists(image_path_jpg) or os.path.exists(image_path_png), \
                'Path does not exist: {} or {}'.format(image_path_jpg, image_path_png)

    # depth
    def depth_path_at(self, i):
        """
        Return the absolute path to depth i in the image sequence.
        """
        return self.depth_path_from_index(self.image_index[i])

    def depth_path_from_index(self, index):
        """
        Construct a depth path from the image's "index" identifier.
        """
        depth_path = os.path.join(self._data_path, index + '_depth' + self._image_ext)
        assert os.path.exists(depth_path), \
                'Path does not exist: {}'.format(depth_path)
        return depth_path

    # camera pose
    def metadata_path_at(self, i):
        """
        Return the absolute path to metadata i in the image sequence.
        """
        return self.metadata_path_from_index(self.image_index[i])

    def metadata_path_from_index(self, index):
        """
        Construct a metadata path from the image's "index" identifier.
        """
        metadata_path = os.path.join(self._data_path, index + '_meta.mat')
        assert os.path.exists(metadata_path), \
                'Path does not exist: {}'.format(metadata_path)
        return metadata_path

    def gt_roidb(self):
        """
        Return the database of ground-truth regions of interest.
        """

        gt_roidb = [self._load_ycb_self_supervision_annotation(index)
                    for index in self._image_index]

        return gt_roidb


    def _load_ycb_self_supervision_annotation(self, index):
        """
        Load class name and meta data
        """
        # image path
        image_path = self.image_path_from_index(index)

        # depth path
        depth_path = self.depth_path_from_index(index)

        # metadata path
        metadata_path = self.metadata_path_from_index(index)
        
        return {'image': image_path,
                'depth': depth_path,
                'meta_data': metadata_path,
                'flipped': False}


    def labels_to_image(self, labels):

        height = labels.shape[0]
        width = labels.shape[1]
        im_label = np.zeros((height, width, 3), dtype=np.uint8)
        for i in range(self.num_classes):
            I = np.where(labels == i)
            im_label[I[0], I[1], :] = self._class_colors[i]

        return im_label


    def process_label_image(self, label_image):
        """
        convert a color label image into per-pixel class indexes
        """
        height = label_image.shape[0]
        width = label_image.shape[1]
        labels = np.zeros((height, width), dtype=np.int32)
        labels_all = np.zeros((height, width), dtype=np.int32)

        # label image is in BGR order
        index = label_image[:,:,2] + 256*label_image[:,:,1] + 256*256*label_image[:,:,0]
        for i in range(1, len(self._class_colors_all)):
            color = self._class_colors_all[i]
            ind = color[0] + 256*color[1] + 256*256*color[2]
            I = np.where(index == ind)
            labels_all[I[0], I[1]] = i

            ind = np.where(np.array(cfg.TRAIN.CLASSES) == i)[0]
            if len(ind) > 0:
                labels[I[0], I[1]] = ind

        return labels, labels_all


    def render_gt_pose(self, meta_data):

        width = self._width
        height = self._height
        meta_data['cls_indexes'] = meta_data['cls_indexes'].flatten()
        classes = np.array(cfg.TRAIN.CLASSES)
        classes_test = np.array(cfg.TEST.CLASSES).flatten()

        intrinsic_matrix = np.matrix(meta_data['intrinsic_matrix'])
        fx = intrinsic_matrix[0, 0]
        fy = intrinsic_matrix[1, 1]
        px = intrinsic_matrix[0, 2]
        py = intrinsic_matrix[1, 2]
        zfar = 6.0
        znear = 0.01

        # poses
        poses = meta_data['poses']
        if len(poses.shape) == 2:
            poses = np.reshape(poses, (3, 4, 1))
        num = poses.shape[2]

        # render poses to get the label image
        cls_indexes = []
        poses_all = []
        qt = np.zeros((7, ), dtype=np.float32)
        for i in range(num):
            RT = poses[:, :, i]
            qt[:3] = RT[:, 3]
            qt[3:] = mat2quat(RT[:, :3])
            if cfg.MODE == 'TEST':
                index = np.where(classes_test == meta_data['cls_indexes'][i])[0]
                cls_indexes.append(index[0])
            else:
                cls_indexes.append(meta_data['cls_indexes'][i] - 1)
            poses_all.append(qt.copy())
            
        # rendering
        cfg.renderer.set_poses(poses_all)
        cfg.renderer.set_light_pos([0, 0, 0])
        cfg.renderer.set_light_color([1, 1, 1])
        cfg.renderer.set_projection_matrix(width, height, fx, fy, px, py, znear, zfar)
        image_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        seg_tensor = torch.cuda.FloatTensor(height, width, 4).detach()
        cfg.renderer.render(cls_indexes, image_tensor, seg_tensor)
        image_tensor = image_tensor.flip(0)
        seg_tensor = seg_tensor.flip(0)

        # semantic labels
        im_label = seg_tensor.cpu().numpy()
        im_label = im_label[:, :, (2, 1, 0)] * 255
        im_label = np.round(im_label).astype(np.uint8)
        im_label = np.clip(im_label, 0, 255)
        im_label, im_label_all = self.process_label_image(im_label)

        # foreground mask
        seg = torch.from_numpy((im_label != 0).astype(np.float32))
        mask = seg.unsqueeze(0).repeat((3, 1, 1)).float().cuda()

        # gt boxes
        gt_boxes = np.zeros((self._num_classes, 5), dtype=np.float32)
        count = 0
        selected = []
        for i in range(num):
            cls = int(meta_data['cls_indexes'][i])
            ind = np.where(classes == cls)[0]
            if len(ind) > 0:
                R = poses[:, :3, i]
                T = poses[:, 3, i]

                # compute box
                x3d = np.ones((4, self._points_all.shape[1]), dtype=np.float32)
                x3d[0, :] = self._points_all[ind,:,0]
                x3d[1, :] = self._points_all[ind,:,1]
                x3d[2, :] = self._points_all[ind,:,2]
                RT = np.zeros((3, 4), dtype=np.float32)
                RT[:3, :3] = R
                RT[:, 3] = T
                x2d = np.matmul(meta_data['intrinsic_matrix'], np.matmul(RT, x3d))
                x2d[0, :] = np.divide(x2d[0, :], x2d[2, :])
                x2d[1, :] = np.divide(x2d[1, :], x2d[2, :])

                x1 = np.min(x2d[0, :])
                y1 = np.min(x2d[1, :])
                x2 = np.max(x2d[0, :])
                y2 = np.max(x2d[1, :])
                if x1 > width or y1 > height or x2 < 0 or y2 < 0:
                    continue
                gt_boxes[count, 0] = x1
                gt_boxes[count, 1] = y1
                gt_boxes[count, 2] = x2
                gt_boxes[count, 3] = y2
                selected.append(i)
                count += 1

        meta_data['cls_indexes'] = meta_data['cls_indexes'][selected]
        meta_data['poses'] = poses[:, :, selected]
        meta_data['im_label'] = im_label
        meta_data['box'] = gt_boxes[:count, :4]
        return meta_data
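    # Illustrative note, not part of the original file: the evaluation below
    # uses add/adi/re/te from utils.pose_error. Under the standard YCB protocol
    # these are assumed to compute, for model points p and estimated /
    # ground-truth poses (R, t) and (R*, t*):
    #
    #   ADD   = mean_p || (R p + t) - (R* p + t*) ||          (non-symmetric)
    #   ADD-S = mean_p min_q || (R p + t) - (R* q + t*) ||    (symmetric, adi)
    #
    # plus the rotation geodesic error (re) and translation error (te).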


    def evaluation(self, output_dir):

        filename = os.path.join(output_dir, 'results_posecnn.mat')
        if os.path.exists(filename):
            results_all = scipy.io.loadmat(filename)
            print('load results from file')
            print(filename)
            distances_sys = results_all['distances_sys']
            distances_non = results_all['distances_non']
            errors_rotation = results_all['errors_rotation']
            errors_translation = results_all['errors_translation']
            results_frame_id = results_all['results_frame_id'].flatten()
            results_object_id = results_all['results_object_id'].flatten()
            results_cls_id = results_all['results_cls_id'].flatten()
            segmentation_precision = results_all['segmentation_precision']
            segmentation_recall = results_all['segmentation_recall']
            segmentation_f1 = results_all['segmentation_f1']
            segmentation_count = results_all['segmentation_count']
        else:
            # save results
            num_max = 100000
            num_results = 2
            distances_sys = np.zeros((num_max, num_results), dtype=np.float32)
            distances_non = np.zeros((num_max, num_results), dtype=np.float32)
            errors_rotation = np.zeros((num_max, num_results), dtype=np.float32)
            errors_translation = np.zeros((num_max, num_results), dtype=np.float32)
            results_frame_id = np.zeros((num_max, ), dtype=np.float32)
            results_object_id = np.zeros((num_max, ), dtype=np.float32)
            results_cls_id = np.zeros((num_max, ), dtype=np.float32)
            segmentation_precision = np.zeros((num_max, self._num_classes), dtype=np.float32)
            segmentation_recall = np.zeros((num_max, self._num_classes), dtype=np.float32)
            segmentation_f1 = np.zeros((num_max, self._num_classes), dtype=np.float32)
            segmentation_count = np.zeros((num_max, self._num_classes), dtype=np.float32)

            # for each image
            count = -1
            count_file = -1
            filename = os.path.join(output_dir, '*.mat')
            files = glob.glob(filename)
            for i in range(len(files)):

                # load result
                filename = files[i]
                print(filename)
                result_posecnn = scipy.io.loadmat(filename)

                # load gt poses
                filename = result_posecnn['meta_data_path'][0]
                print(filename)
                gt = scipy.io.loadmat(filename)

                # render gt poses
                gt = self.render_gt_pose(gt)

                # compute segmentation metrics
                metrics_dict = multilabel_metrics(result_posecnn['labels'].astype(np.int32), gt['im_label'].astype(np.int32), self._num_classes)
                count_file += 1
                segmentation_precision[count_file, :] = metrics_dict['Precision']
                segmentation_recall[count_file, :] = metrics_dict['Recall']
                segmentation_f1[count_file, :] = metrics_dict['F-measure']
                segmentation_count[count_file, :] = metrics_dict['Count']

                '''
                import matplotlib.pyplot as plt
                fig = plt.figure()
                im_file = filename.replace('_meta.mat', '_color.png')
                im = cv2.imread(im_file)
                ax = fig.add_subplot(2, 2, 1)
                plt.imshow(im[:, :, (2, 1, 0)])
                for i in range(gt['box'].shape[0]):
                    x1 = gt['box'][i, 0]
                    y1 = gt['box'][i, 1]
                    x2 = gt['box'][i, 2]
                    y2 = gt['box'][i, 3]
                    plt.gca().add_patch(plt.Rectangle((x1, y1), x2-x1, y2-y1, fill=False, edgecolor='g', linewidth=3))
                ax = fig.add_subplot(2, 2, 2)
                plt.imshow(gt['im_label'])
                ax = fig.add_subplot(2, 2, 3)
                plt.imshow(result_posecnn['labels'].astype(np.int32))
                plt.show()
                #'''

                # for each gt poses
                cls_indexes = gt['cls_indexes'].flatten()
                for j in range(len(cls_indexes)):
                    count += 1
                    cls_index = cls_indexes[j]
                    RT_gt = gt['poses'][:, :, j]

                    results_frame_id[count] = i
                    results_object_id[count] = j
                    results_cls_id[count] = cls_index

                    # network result
                    result = result_posecnn
                    roi_index = []
                    if len(result['rois']) > 0:     
                        for k in range(result['rois'].shape[0]):
                            ind = int(result['rois'][k, 1])
                            cls = cfg.TRAIN.CLASSES[ind]
                            if cls == cls_index:
                                roi_index.append(k)                   

                    # select the roi
                    if len(roi_index) > 1:
                        # overlaps: (rois x gt_boxes)
                        roi_blob = result['rois'][roi_index, :]
                        roi_blob = roi_blob[:, (0, 2, 3, 4, 5, 1)]
                        gt_box_blob = np.zeros((1, 5), dtype=np.float32)
                        gt_box_blob[0, 1:] = gt['box'][j, :]
                        overlaps = bbox_overlaps(
                            np.ascontiguousarray(roi_blob[:, :5], dtype=np.float64),
                            np.ascontiguousarray(gt_box_blob, dtype=np.float64)).flatten()
                        assignment = overlaps.argmax()
                        roi_index = [roi_index[assignment]]

                    if len(roi_index) > 0:
                        RT = np.zeros((3, 4), dtype=np.float32)
                        ind = int(result['rois'][roi_index, 1])
                        if ind == -1:
                            points = self._points_clamp
                        else:
                            points = self._points[ind]

                        # pose from network
                        RT[:3, :3] = quat2mat(result['poses'][roi_index, :4].flatten())
                        RT[:, 3] = result['poses'][roi_index, 4:]
                        distances_sys[count, 0] = adi(RT[:3, :3], RT[:, 3],  RT_gt[:3, :3], RT_gt[:, 3], points)
                        distances_non[count, 0] = add(RT[:3, :3], RT[:, 3],  RT_gt[:3, :3], RT_gt[:, 3], points)
                        errors_rotation[count, 0] = re(RT[:3, :3], RT_gt[:3, :3])
                        errors_translation[count, 0] = te(RT[:, 3], RT_gt[:, 3])

                        # pose after depth refinement
                        if cfg.TEST.POSE_REFINE:
                            RT[:3, :3] = quat2mat(result['poses_refined'][roi_index, :4].flatten())
                            RT[:, 3] = result['poses_refined'][roi_index, 4:]
                            distances_sys[count, 1] = adi(RT[:3, :3], RT[:, 3],  RT_gt[:3, :3], RT_gt[:, 3], points)
                            distances_non[count, 1] = add(RT[:3, :3], RT[:, 3],  RT_gt[:3, :3], RT_gt[:, 3], points)
                            errors_rotation[count, 1] = re(RT[:3, :3], RT_gt[:3, :3])
                            errors_translation[count, 1] = te(RT[:, 3], RT_gt[:, 3])
                        else:
                            distances_sys[count, 1] = np.inf
                            distances_non[count, 1] = np.inf
                            errors_rotation[count, 1] = np.inf
                            errors_translation[count, 1] = np.inf
                    else:
                        distances_sys[count, :] = np.inf
                        distances_non[count, :] = np.inf
                        errors_rotation[count, :] = np.inf
                        errors_translation[count, :] = np.inf

            distances_sys = distances_sys[:count+1, :]
            distances_non = distances_non[:count+1, :]
            errors_rotation = errors_rotation[:count+1, :]
            errors_translation = errors_translation[:count+1, :]
            results_frame_id = results_frame_id[:count+1]
            results_object_id = results_object_id[:count+1]
            results_cls_id = results_cls_id[:count+1]
            segmentation_precision = segmentation_precision[:count_file+1, :]
            segmentation_recall = segmentation_recall[:count_file+1, :]
            segmentation_f1 = segmentation_f1[:count_file+1, :]
            segmentation_count = segmentation_count[:count_file+1, :]

            results_all = {'distances_sys': distances_sys,
                       'distances_non': distances_non,
                       'errors_rotation': errors_rotation,
                       'errors_translation': errors_translation,
                       'results_frame_id': results_frame_id,
                       'results_object_id': results_object_id,
                       'results_cls_id': results_cls_id,
                       'segmentation_precision': segmentation_precision, 
                       'segmentation_recall': segmentation_recall,
                       'segmentation_f1': segmentation_f1,
                       'segmentation_count': segmentation_count}

            filename = os.path.join(output_dir, 'results_posecnn.mat')
            scipy.io.savemat(filename, results_all)
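        # Hedged sketch of the accuracy-threshold computation that follows:
        # per-object distances beyond max_distance are treated as misses (inf),
        # the rest are sorted, and the cumulative fraction of objects within
        # each threshold forms the accuracy curve that VOCap (from
        # utils.pose_error) integrates. Standalone, with hypothetical inputs:
        #
        #   import numpy as np
        #   dists = np.array([0.002, 0.015, 0.04, 0.2], np.float32)  # meters
        #   dists[dists > 0.1] = np.inf          # misses beyond 10 cm
        #   d = np.sort(dists)
        #   accuracy = np.cumsum(np.ones(len(d), np.float32)) / len(d)
        #   # the area under (d, accuracy) up to 0.1 m is the reported score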

        # for each class
        import matplotlib.pyplot as plt
        max_distance = 0.1
        index_plot = [0, 1]
        color = ['r', 'b']
        leng = ['PoseCNN', 'refined']
        num = len(leng)
        ADD = np.zeros((self._num_classes_all, num), dtype=np.float32)
        ADDS = np.zeros((self._num_classes_all, num), dtype=np.float32)
        TS = np.zeros((self._num_classes_all, num), dtype=np.float32)
        classes = list(copy.copy(self._classes_all))
        classes[0] = 'all'
        for k in range(self._num_classes_all):
            fig = plt.figure()
            if k == 0:
                index = range(len(results_cls_id))
            else:
                index = np.where(results_cls_id == k)[0]

            if len(index) == 0:
                continue
            print('%s: %d objects' % (classes[k], len(index)))

            # distance symmetry
            ax = fig.add_subplot(2, 3, 1)
            lengs = []
            for i in index_plot:
                D = distances_sys[index, i]
                ind = np.where(D > max_distance)[0]
                D[ind] = np.inf
                d = np.sort(D)
                n = len(d)
                accuracy = np.cumsum(np.ones((n, ), np.float32)) / n
                plt.plot(d, accuracy, color[i], linewidth=2)
                ADDS[k, i] = VOCap(d, accuracy)
                lengs.append('%s (%.2f)' % (leng[i], ADDS[k, i] * 100))
                print('%s, %s: %d objects missed' % (classes[k], leng[i], np.sum(np.isinf(D))))

            ax.legend(lengs)
            plt.xlabel('Average distance threshold in meters (symmetry)')
            plt.ylabel('accuracy')
            ax.set_title(classes[k])

            # distance non-symmetry
            ax = fig.add_subplot(2, 3, 2)
            lengs = []
            for i in index_plot:
                D = distances_non[index, i]
                ind = np.where(D > max_distance)[0]
                D[ind] = np.inf
                d = np.sort(D)
                n = len(d)
                accuracy = np.cumsum(np.ones((n, ), np.float32)) / n
                plt.plot(d, accuracy, color[i], linewidth=2)
                ADD[k, i] = VOCap(d, accuracy)
                lengs.append('%s (%.2f)' % (leng[i], ADD[k, i] * 100))
                print('%s, %s: %d objects missed' % (classes[k], leng[i], np.sum(np.isinf(D))))

            ax.legend(lengs)
            plt.xlabel('Average distance threshold in meters (non-symmetry)')
            plt.ylabel('accuracy')
            ax.set_title(classes[k])

            # translation
            ax = fig.add_subplot(2, 3, 3)
            lengs = []
            for i in index_plot:
                D = errors_translation[index, i]
                ind = np.where(D > max_distance)[0]
                D[ind] = np.inf
                d = np.sort(D)
                n = len(d)
                accuracy = np.cumsum(np.ones((n, ), np.float32)) / n
                plt.plot(d, accuracy, color[i], linewidth=2)
                TS[k, i] = VOCap(d, accuracy)
                lengs.append('%s (%.2f)' % (leng[i], TS[k, i] * 100))
                print('%s, %s: %d objects missed' % (classes[k], leng[i], np.sum(np.isinf(D))))

            ax.legend(lengs)
            plt.xlabel('Translation threshold in meters')
            plt.ylabel('accuracy')
            ax.set_title(classes[k])

            # rotation histogram
            count = 4
            for i in index_plot:
                ax = fig.add_subplot(2, 3, count)
                D = errors_rotation[index, i]
                ind = np.where(np.isfinite(D))[0]
                D = D[ind]
                ax.hist(D, bins=range(0, 190, 10), range=(0, 180))
                plt.xlabel('Rotation angle error in degrees')
                plt.ylabel('count')
                ax.set_title(leng[i])
                count += 1

            # mng = plt.get_current_fig_manager()
            # mng.full_screen_toggle()
            filename = output_dir + '/' + classes[k] + '.png'
            plt.savefig(filename)
            # plt.show()

        # print ADD
        print('==================ADD======================')
        for k in range(len(classes)):
            print('%s: %f' % (classes[k], ADD[k, 0]))
        for k in range(len(classes)-1):
            print('%f' % (ADD[k+1, 0]))
        print('%f' % (ADD[0, 0]))
        print(cfg.TRAIN.SNAPSHOT_INFIX)
        print('===========================================')

        # print ADD-S
        print('==================ADD-S====================')
        for k in range(len(classes)):
            print('%s: %f' % (classes[k], ADDS[k, 0]))
        for k in range(len(classes)-1):
            print('%f' % (ADDS[k+1, 0]))
        print('%f' % (ADDS[0, 0]))
        print(cfg.TRAIN.SNAPSHOT_INFIX)
        print('===========================================')

        # print ADD after refinement
        print('==================ADD refined======================')
        for k in range(len(classes)):
            print('%s: %f' % (classes[k], ADD[k, 1]))
        for k in range(len(classes)-1):
            print('%f' % (ADD[k+1, 1]))
        print('%f' % (ADD[0, 1]))
        print(cfg.TRAIN.SNAPSHOT_INFIX)
        print('===========================================')

        # print ADD-S after refinement
        print('==================ADD-S refined====================')
        for k in range(len(classes)):
            print('%s: %f' % (classes[k], ADDS[k, 1]))
        for k in range(len(classes)-1):
            print('%f' % (ADDS[k+1, 1]))
        print('%f' % (ADDS[0, 1]))
        print(cfg.TRAIN.SNAPSHOT_INFIX)
        print('===========================================')

        # print segmentation precision
        print('==================segmentation precision====================')
        for i in range(self._num_classes):
            count = np.sum(segmentation_count[:, i])
            if count > 0:
                precision = np.sum(segmentation_precision[:, i]) / count
            else:
                precision = 0
            print('%s: %d objects, %f' % (self._classes[i], count, precision))
        for i in range(self._num_classes):
            count = np.sum(segmentation_count[:, i])
            if count > 0:
                precision = np.sum(segmentation_precision[:, i]) / count
            else:
                precision = 0
            print('%f' % (precision))
        print('===========================================')

        # print segmentation recall
        print('==================segmentation recall====================')
        for i in range(self._num_classes):
            count = np.sum(segmentation_count[:, i])
            if count > 0:
                recall = np.sum(segmentation_recall[:, i]) / count
            else:
                recall = 0
            print('%s: %d objects, %f' % (self._classes[i], count, recall))
        for i in range(self._num_classes):
            count = np.sum(segmentation_count[:, i])
            if count > 0:
                recall = np.sum(segmentation_recall[:, i]) / count
            else:
                recall = 0
            print('%f' % (recall))
        print('===========================================')

        # print segmentation f1
        print('==================segmentation f1====================')
        for i in range(self._num_classes):
            count = np.sum(segmentation_count[:, i])
            if count > 0:
                f1 = np.sum(segmentation_f1[:, i]) / count
            else:
                f1 = 0
            print('%s: %d objects, %f' % (self._classes[i], count, f1))
        for i in range(self._num_classes):
            count = np.sum(segmentation_count[:, i])
            if count > 0:
                f1 = np.sum(segmentation_f1[:, i]) / count
            else:
                f1 = 0
            print('%f' % (f1))
        print('===========================================')
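        # Hedged usage sketch: evaluation() is normally called after a test run
        # has written per-frame results into output_dir, e.g.
        #
        #   dataset = YCBSelfSupervision('test')  # hypothetical image set name
        #   dataset.evaluation('output/ycb_self_supervision/test')
        #
        # If output_dir already contains results_posecnn.mat from an earlier
        # call, the cached results are typically reloaded rather than
        # recomputed, and only the plots and summary tables are regenerated.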


================================================
FILE: lib/datasets/ycb_video.py
================================================
# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.
# This work is licensed under the NVIDIA Source Code License - Non-commercial. Full
# text can be found in LICENSE.md

import torch
import torch.utils.data as data

import os, math
import sys
import os.path as osp
from os.path import *
import numpy as np
import numpy.random as npr
import cv2
import scipy.io
import copy
import glob
try:
    import cPickle  # Use cPickle on Python 2.7
except ImportError:
    import pickle as cPickle

import datasets
from fcn.config import cfg
from utils.blob import pad_im, chromatic_transform, add_noise, add_noise_cuda
from transforms3d.quaternions import mat2quat, quat2mat
from utils.se3 import *
from utils.pose_error import *
from utils.cython_bbox import bbox_overlaps


class YCBVideo(data.Dataset, datasets.imdb):
    def __init__(self, image_set, ycb_video_path = None):

        self._name = 'ycb_video_' + image_set
        self._image_set = image_set
        self._ycb_video_path = self._get_default_path() if ycb_video_path is None \
                            else ycb_video_path

        path = os.path.join(self._ycb_video_path, 'data')
        if not os.path.exists(path):
            path = os.path.join(self._ycb_video_path, 'YCB_Video_Dataset/YCB_Video_Dataset/YCB_Video_Dataset/data')
        self._data_path = path

        self._model_path = os.path.join(datasets.ROOT_DIR, 'data', 'models')

        # define all the classes
        self._classes_all = ('__background__', '002_master_chef_can', '003_cracker_box', '004_sugar_box', '005_tomato_soup_can', '006_mustard_bottle', \
                         '007_tuna_fish_can', '008_pudding_box', '009_gelatin_box', '010_potted_meat_can', '011_banana', '019_pitcher_base', \
                         '021_bleach_cleanser', '024_bowl', '025_mug', '035_power_drill', '036_wood_block', '037_scissors', '040_large_marker', \
                         '051_large_clamp', '052_extra_large_clamp', '061_foam_brick')
        self._num_classes_all = len(self._classes_all)
        self._class_colors_all = [(255, 255, 255), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0), (255, 0, 255), (0, 255, 255), \
                              (128, 0, 0), (0, 128, 0), (0, 0, 128), (128, 128, 0), (128, 0, 128), (0, 128, 128), \
                              (64, 0, 0), (0, 64, 0), (0, 0, 64), (64, 64, 0), (64, 0, 64), (0, 64, 64), 
                              (192, 0, 0), (0, 192, 0), (0, 0, 192)]
        self._symmetry_all = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]).astype(np.float32)
        self._extents_all = self._load_object_extents()

        self._width = 640
        self._height = 480
        self._intrinsic_matrix = np.array([[1.066778e+03, 0.000000e+00, 3.129869e+02],
                                          [0.000000e+00, 1.067487e+03, 2.413109e+02],
                                          [0.000000e+00, 0.000000e+00, 1.000000e+00]])

        # select a subset of classes
        self._classes = [self._classes_all[i] for i in cfg.TRAIN.CLASSES]
        self._classes_test = [self._classes_all[i] for i in cfg.TEST.CLASSES]
        self._num_classes = len(self._classes)
        self._class_colors = [self._class_colors_all[i] for i in cfg.TRAIN.CLASSES]
        self._symmetry = self._symmetry_all[cfg.TRAIN.CLASSES]
        self._symmetry_test = self._symmetry_all[cfg.TEST.CLASSES]
        self._extents = self._extents_all[cfg.TRAIN.CLASSES]
        self._extents_test = self._extents_all[cfg.TEST.CLASSES]
        self._pixel_mean = cfg.PIXEL_MEANS / 255.0

        # train classes
        self._points, self._points_all, self._point_blob = \
            self._load_object_points(self._classes, self._extents, self._symmetry)

        # test classes
        self._points_test, self._points_all_test, self._point_blob_test = \
            self._load_object_points(self._classes_test, self._extents_test, self._symmetry_test)

        # 3D model paths
        self.model_mesh_paths = ['{}/{}/textured_simple.obj'.format(self._model_path, cls) for cls in self._classes_all[1:]]
        self.model_sdf_paths = ['{}/{}/textured_simple_low_res.pth'.format(self._model_path, cls) for cls in self._classes_all[1:]]
        self.model_texture_paths = ['{}/{}/texture_map.png'.format(self._model_path, cls) for cls in self._classes_all[1:]]
        self.model_colors = [np.array(self._class_colors_all[i]) / 255.0 for i in range(1, len(self._classes_all))]

        self.model_mesh_paths_target = ['{}/{}/textured_simple.obj'.format(self._model_path, cls) for cls in self._classes[1:]]
        self.model_sdf_paths_target = ['{}/{}/textured_simple.sdf'.format(self._model_path, cls) for cls in self._classes[1:]]
        self.model_texture_paths_target = ['{}/{}/texture_map.png'.format(self._model_path, cls) for cls in self._classes[1:]]
        self.model_colors_target = [np.array(self._class_colors_all[i]) / 255.0 for i in cfg.TRAIN.CLASSES[1:]]

        self._class_to_ind = dict(zip(self._classes, range(self._num_classes)))
        self._image_index = self._load_image_set_index(image_set)

        self._size = len(self._image_index)
        if self._size > cfg.TRAIN.MAX_ITERS_PER_EPOCH * cfg.TRAIN.IMS_PER_BATCH:
            self._size = cfg.TRAIN.MAX_ITERS_PER_EPOCH * cfg.TRAIN.IMS_PER_BATCH
        self._roidb = self.gt_roidb()

        assert os.path.exists(self._ycb_video_path), \
                'ycb_video path does not exist: {}'.format(self._ycb_video_path)
        assert os.path.exists(self._data_path), \
                'Data path does not exist: {}'.format(self._data_path)


    def __getitem__(self, index):

        is_syn = 0
        roidb = self._roidb[index]

        # Get the input image blob
        random_scale_ind = npr.randint(0, high=len(cfg.TRAIN.SCALES_BASE))
        im_blob, im_depth, im_scale, height, width = self._get_image_blob(roidb, random_scale_ind)

        # build the label blob
        label_blob, mask, meta_data_blob, pose_blob, gt_boxes, vertex_targets, vertex_weights \
            = self._get_label_blob(roidb, self._num_classes, im_scale, height, width)

        is_syn = roidb['is_syn']
        im_info = np.array([im_blob.shape[1], im_blob.shape[2], im_scale, is_syn], dtype=np.float32)

        sample = {'image_color': im_blob,
                  'im_depth': im_depth,
                  'label': label_blob,
                  'mask': mask,
                  'meta_data': meta_data_blob,
                  'poses': pose_blob,
                  'extents': self._extents,
                  'points': self._point_blob,
                  'symmetry': self._symmetry,
                  'gt_boxes': gt_boxes,
                  'im_info': im_info,
                  'video_id': roidb['video_id'],
                  'image_id': roidb['image_id']}

        if cfg.TRAIN.VERTEX_REG:
            sample['vertex_targets'] = vertex_targets
            sample['vertex_weights'] = vertex_weights

        return sample
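    # Hedged example of consuming the sample dict above through a standard
    # PyTorch DataLoader (batch size and worker count are hypothetical):
    #
    #   loader = torch.utils.data.DataLoader(dataset, batch_size=2,
    #                                        shuffle=True, num_workers=4)
    #   for sample in loader:
    #       images = sample['image_color']  # (B, 3, H, W), mean-subtracted
    #       labels = sample['label']        # (B, num_classes, H, W) one-hot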


    def _get_image_blob(self, roidb, scale_ind):    

        # rgba
        rgba = pad_im(cv2.imread(roidb['image'], cv2.IMREAD_UNCHANGED), 16)
        if rgba.shape[2] == 4:
            im = np.copy(rgba[:,:,:3])
            alpha = rgba[:,:,3]
            I = np.where(alpha == 0)
            im[I[0], I[1], :] = 0
        else:
            im = rgba

        im_scale = cfg.TRAIN.SCALES_BASE[scale_ind]
        if im_scale != 1.0:
            im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_LINEAR)
        height = im.shape[0]
        width = im.shape[1]

        if roidb['flipped']:
            im = im[:, ::-1, :]

        # chromatic transform
        if cfg.TRAIN.CHROMATIC and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im = chromatic_transform(im)
        if cfg.TRAIN.ADD_NOISE and cfg.MODE == 'TRAIN' and np.random.rand(1) > 0.1:
            im = add_noise(im)
        im_tensor = torch.from_numpy(im) / 255.0
        im_tensor -= self._pixel_mean
        image_blob = im_tensor.permute(2, 0, 1).float()

        # depth image
        im_depth = pad_im(cv2.imread(roidb['depth'], cv2.IMREAD_UNCHANGED), 16)
        if im_scale != 1.0:
            im_depth = cv2.resize(im_depth, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_NEAREST)
        im_depth = im_depth.astype('float') / 10000.0

        return image_blob, im_depth, im_scale, height, width
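    # Hedged sketch of the color normalization performed above, for a
    # hypothetical uint8 BGR image `im` of shape (H, W, 3):
    #
    #   im_tensor = torch.from_numpy(im) / 255.0   # scale to [0, 1]
    #   im_tensor -= self._pixel_mean              # cfg.PIXEL_MEANS / 255.0
    #   blob = im_tensor.permute(2, 0, 1).float()  # HWC -> CHW for the network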


    def _get_label_blob(self, roidb, num_classes, im_scale, height, width):
        """ build the label blob """

        meta_data = scipy.io.loadmat(roidb['meta_data'])
        meta_data['cls_indexes'] = meta_data['cls_indexes'].flatten()
        classes = np.array(cfg.TRAIN.CLASSES)

        # read label image
        im_label = pad_im(cv2.imread(roidb['label'], cv2.IMREAD_UNCHANGED), 16)
        if roidb['flipped']:
            if len(im_label.shape) == 2:
                im_label = im_label[:, ::-1]
            else:
                im_label = im_label[:, ::-1, :]
        if im_scale != 1.0:
            im_label = cv2.resize(im_label, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_NEAREST)

        label_blob = np.zeros((num_classes, height, width), dtype=np.float32)
        label_blob[0, :, :] = 1.0
        for i in range(1, num_classes):
            I = np.where(im_label == classes[i])
            if len(I[0]) > 0:
                label_blob[i, I[0], I[1]] = 1.0
                label_blob[0, I[0], I[1]] = 0.0

        # foreground mask
        seg = torch.from_numpy((im_label != 0).astype(np.float32))
        mask = seg.unsqueeze(0).repeat((3, 1, 1)).float()

        # poses
        poses = meta_data['poses']
        if len(poses.shape) == 2:
            poses = np.reshape(poses, (3, 4, 1))
        if roidb['flipped']:
            poses = _flip_poses(poses, meta_data['intrinsic_matrix'], width)

        num = poses.shape[2]
        pose_blob = np.zeros((num_classes, 9), dtype=np.float32)
        gt_boxes = np.zeros((num_classes, 5), dtype=np.float32)
        count = 0
        for i in range(num):
            cls = int(meta_data['cls_indexes'][i])
            ind = np.where(classes == cls)[0]
            if len(ind) > 0:
                R = poses[:, :3, i]
                T = poses[:, 3, i]
                pose_blob[count, 0] = 1
                pose_blob[count, 1] = ind
                qt = mat2quat(R)

                # egocentric to allocentric
                qt_allocentric = egocentric2allocentric(qt, T)
                if qt_allocentric[0] < 0:
                    qt_allocentric = -1 * qt_allocentric
                pose_blob[count, 2:6] = qt_allocentric
                pose_blob[count, 6:] = T

                # compute box
                x3d = np.ones((4, self._points_all.shape[1]), dtype=np.float32)
                x3d[0, :] = self._points_all[ind,:,0]
                x3d[1, :] = self._points_all[ind,:,1]
                x3d[2, :] = self._points_all[ind,:,2]
                RT = np.zeros((3, 4), dtype=np.float32)
                RT[:3, :3] = quat2mat(qt)
                RT[:, 3] = T
                x2d = np.matmul(meta_data['intrinsic_matrix'], np.matmul(RT, x3d))
                x2d[0, :] = np.divide(x2d[0, :], x2d[2, :])
                x2d[1, :] = np.divide(x2d[1, :], x2d[2, :])
        
                gt_boxes[count, 0] = np.min(x2d[0, :]) * im_scale
                gt_boxes[count, 1] = np.min(x2d[1, :]) * im_scale
                gt_boxes[count, 2] = np.max(x2d[0, :]) * im_scale
                gt_boxes[count, 3] = np.max(x2d[1, :]) * im_scale
                gt_boxes[count, 4] = ind
                count += 1

        # construct the meta data
        """
        format of the meta_data
        intrinsic matrix: meta_data[0 ~ 8]
        inverse intrinsic matrix: meta_data[9 ~ 17]
        """
        K = np.matrix(meta_data['intrinsic_matrix']) * im_scale
        K[2, 2] = 1
        Kinv = np.linalg.pinv(K)
        meta_data_blob = np.zeros(18, dtype=np.float32)
        meta_data_blob[0:9] = K.flatten()
        meta_data_blob[9:18] = Kinv.flatten()

        # vertex regression target
        if cfg.TRAIN.VERTEX_REG:
            center = meta_data['center']
            if roidb['flipped']:
                center[:, 0] = width - center[:, 0]
            vertex_targets, vertex_weights = self._generate_vertex_targets(im_label,
                meta_data['cls_indexes'], center, poses, classes, num_classes)
        else:
            vertex_targets = []
            vertex_weights = []

        return label_blob, mask, meta_data_blob, pose_blob, gt_boxes, vertex_targets, vertex_weights
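    # Hedged sketch of the ground-truth box computation above: homogeneous
    # model points are transformed by the ground-truth pose and projected
    # with the camera intrinsics; the 2-D box is the extremes of the
    # projections. With x3d (4, N), RT (3, 4) and K (3, 3) numpy arrays:
    #
    #   x2d = K @ (RT @ x3d)                   # (3, N) homogeneous pixels
    #   x2d[:2] /= x2d[2]                      # perspective divide
    #   box = (x2d[0].min(), x2d[1].min(), x2d[0].max(), x2d[1].max())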


    # compute the voting label image in 2D
    def _generate_vertex_targets(self, im_label, cls_indexes, center, poses, classes, num_classes):

        width = im_label.shape[1]
        height = im_label.shape[0]
        vertex_targets = np.zeros((3 * num_classes, height, width), dtype=np.float32)
        vertex_weights = np.zeros((3 * num_classes, height, width), dtype=np.float32)

        c = np.zeros((2, 1), dtype=np.float32)
        for i in range(1, num_classes):
            y, x = np.where(im_label == classes[i])
            ind = np.where(cls_indexes == classes[i])[0]
            if len(x) > 0 and len(ind) > 0:
                c[0] = center[ind, 0]
                c[1] = center[ind, 1]
                if isinstance(poses, list):
                    z = poses[int(ind)][2]
                else:
                    if len(poses.shape) == 3:
                        z = poses[2, 3, ind]
                    else:
                        z = poses[ind, -1]
                R = np.tile(c, (1, len(x))) - np.vstack((x, y))
                # compute the norm
                N = np.linalg.norm(R, axis=0) + 1e-10
                # normalization
                R = np.divide(R, np.tile(N, (2,1)))
                # assignment
                vertex_targets[3*i+0, y, x] = R[0,:]
                vertex_targets[3*i+1, y, x] = R[1,:]
                vertex_targets[3*i+2, y, x] = math.log(z)

                vertex_weights[3*i+0, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3*i+1, y, x] = cfg.TRAIN.VERTEX_W_INSIDE
                vertex_weights[3*i+2, y, x] = cfg.TRAIN.VERTEX_W_INSIDE

        return vertex_targets, vertex_weights
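    # Hedged sketch of the voting encoding above: each foreground pixel
    # stores the unit vector pointing toward its object center plus the log
    # of the center depth; the Hough voting layer decodes these at test time.
    # For a hypothetical pixel (x, y), center (cx, cy) and depth z:
    #
    #   v = np.array([cx - x, cy - y], dtype=np.float32)
    #   v /= (np.linalg.norm(v) + 1e-10)       # unit direction to center
    #   target = (v[0], v[1], math.log(z))     # the 3 channels per class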


    def __len__(self):
        return self._size


    def _get_default_path(self):
        """
        Return the default path where YCB_Video is expected to be installed.
        """
        return os.path.join(datasets.ROOT_DIR, 'data', 'YCB_Video')


    def _load_image_set_index(self, image_set):
        """
        Load the indexes listed in this dataset's image set file.
        """
        image_set_file = os.path.join(self._ycb_video_path, image_set + '.txt')
        assert os.path.exists(image_set_file), \
                'Path does not exist: {}'.format(image_set_file)

        image_index = []
        video_ids_selected = set([])
        video_ids_not = set([])
        count = np.zeros((self.num_classes, ), dtype=np.int32)

        with open(image_set_file) as f:
            for x in f.readlines():
                index = x.rstrip('\n')
                pos = index.find('/')
                video_id = index[:pos]

                if video_id not in video_ids_selected and video_id not in video_ids_not:
                    filename = os.path.join(self._data_path, video_id, '000001-meta.mat')
                    meta_data = scipy.io.loadmat(filename)
                    cls_indexes = meta_data['cls_indexes'].flatten()
                    flag = 0
                    for i in range(len(cls_indexes)):
                        cls_index = int(cls_indexes[i])
                        ind = np.where(np.array(cfg.TRAIN.CLASSES) == cls_index)[0]
                        if len(ind) > 0:
                            count[ind] += 1
                            flag = 1
                    if flag:
                        video_ids_selected.add(video_id)
                    else:
                        video_ids_not.add(video_id)

                if video_id in video_ids_selected:
                    image_index.append(index)

        for i in range(1, self.num_classes):
            print('%d %s [%d/%d]' % (i, self.classes[i], count[i], len(list(video_ids_selected))))

        # sample a subset for training
        if image_set == 'train':
            image_index = image_index[::5]

            # add synthetic data
            filename = os.path.join(self._data_path + '_syn', '*.mat')
            files = glob.glob(filename)
            print('adding %d synthetic images' % len(files))
            for i in range(len(files)):
                filename = files[i].replace(self._data_path, '../data')[:-9]
                image_index.append(filename)

        return image_index


    def _load_object_points(self, classes, extents, symmetry):

        points = [[] for _ in range(len(classes))]
        num = np.inf
        num_classes = len(classes)
        for i in range(1, num_classes):
            point_file = os.path.join(self._model_path, classes[i], 'points.xyz')
            print(point_file)
            assert os.path.exists(point_file), 'Path does not exist: {}'.format(point_file)
            points[i] = np.loadtxt(point_file)
            if points[i].shape[0] < num:
                num = points[i].shape[0]

        points_all = np.zeros((num_classes, num, 3), dtype=np.float32)
        for i in range(1, num_classes):
            points_all[i, :, :] = points[i][:num, :]

        # rescale the points
        point_blob = points_all.copy()
        for i in range(1, num_classes):
            # compute the rescaling factor for the points
            weight = 10.0 / np.amax(extents[i, :])
            if weight < 10:
                weight = 10
            if symmetry[i] > 0:
                point_blob[i, :, :] = 4 * weight * point_blob[i, :, :]
            else:
                point_blob[i, :, :] = weight * point_blob[i, :, :]

        return points, points_all, point_blob


    def _load_object_extents(self):

        extents = np.zeros((self._num_classes_all, 3), dtype=np.float32)
        for i in range(1, self._num_classes_all):
            point_file = os.path.join(self._model_path, self._classes_all[i], 'points.xyz')
            print(point_file)
            assert os.path.exists(point_file), 'Path does not exist: {}'.format(point_file)
            points = np.loadtxt(point_file)
            extents[i, :] = 2 * np.max(np.absolute(points), axis=0)

        return extents
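    # The extent is twice the largest absolute coordinate per axis, i.e. the
    # side lengths of an origin-centered bounding box around the model.
    # Sketch for a hypothetical points array of shape (N, 3):
    #
    #   extent = 2 * np.max(np.absolute(points), axis=0)   # (dx, dy, dz)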


    # image
    def image_path_at(self, i):
        """
        Return the absolute path to image i in the image sequence.
        """
        return self.image_path_from_index(self.image_index[i])

    def image_path_from_index(self, index):
        """
        Construct an image path from the image's "index" identifier.
        """

        image_path = os.path.join(self._data_path, index + '-color.jpg')
        if not os.path.exists(image_path):
            image_path = os.path.join(self._data_path, index + '-color.png')

        assert os.path.exists(image_path), \
                'Path does not exist: {}'.format(image_path)
        return image_path

    # depth
    def depth_path_at(self, i):
        """
        Return the absolute path to depth i in the image sequence.
        """
        return self.depth_path_from_index(self.image_index[i])

    def depth_path_from_index(self, index):
        """
        Construct a depth path from the image's "index" identifier.
        """
        depth_path = os.path.join(self._data_path, index + '-depth.png')
        assert os.path.exists(depth_path), \
                'Path does not exist: {}'.format(depth_path)
        return depth_path

    # label
    def label_path_at(self, i):
        """
        Return the absolute path to label image i in the image sequence.
        """
        return self.label_path_from_index(self.image_index[i])

    def label_path_from_index(self, index):
        """
        Construct a label path from the image's "index" identifier.
        """
        label_path = os.path.join(self._data_path, index + '-label.png')
        assert os.path.exists(label_path), \
                'Path does not exist: {}'.format(label_path)
        return label_path

    # meta data (poses, intrinsics)
    def metadata_path_at(self, i):
        """
        Return the absolute path to metadata i in the image sequence.
        """
        return self.metadata_path_from_index(self.image_index[i])

    def metadata_path_from_index(self, index):
        """
        Construct a metadata path from the image's "index" identifier.
        """
        metadata_path = os.path.join(self._data_path, index + '-meta.mat')
        assert os.path.exists(metadata_path), \
                'Path does not exist: {}'.format(metadata_path)
        return metadata_path

    def gt_roidb(self):
        """
        Return the database of ground-truth regions of interest.

        This function loads/saves from/to a cache file to speed up future calls.
        """

        prefix = '_class'
        for i in range(len(cfg.TRAIN.CLASSES)):
            prefix += '_%d' % cfg.TRAIN.CLASSES[i]
        cache_file = os.path.join(self.cache_path, self.name + prefix + '_gt_roidb.pkl')
        if os.path.exists(cache_file):
            with open(cache_file, 'rb') as fid:
                roidb = cPickle.load(fid)
            print('{} gt roidb loaded from {}'.format(self.name, cache_file))
            return roidb

        print('loading gt...')
        gt_roidb = [self._load_ycb_video_annotation(index)
                    for index in self._image_index]

        with open(cache_file, 'wb') as fid:
            cPickle.dump(gt_roidb, fid, cPickle.HIGHEST_PROTOCOL)
        print('wrote gt roidb to {}'.format(cache_file))

        return gt_roidb
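    # Hedged note: the cache file name embeds the selected class ids
    # (prefix '_class_1_2_...'), so changing cfg.TRAIN.CLASSES produces a
    # fresh cache instead of silently reusing a stale one; deleting the
    # cached .pkl under self.cache_path forces a rebuild.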


    def _load_ycb_video_annotation(self, index):
        """
        Load class name and meta data
        """
        # image path
        image_path = self.image_path_from_index(index)

        # depth path
        depth_path = self.depth_path_from_index(index)

        # label path
        label_path = self.label_path_from_index(index)

        # metadata path
        metadata_path = self.metadata_path_from_index(index)

        # is synthetic image or not
        if 'data_syn' in image_path:
            is_syn = 1
            video_id = ''
            image_id = ''
        else:
            is_syn = 0
            # parse image name
            pos = index.find('/')
            video_id = index[:pos]
            image_id = index[pos+1:]
        
        return {'image': image_path,
                'depth': depth_path,
                'label': label_path,
                'meta_data': metadata_path,
                'video_id': video_id,
                'image_id': image_id,
                'is_syn': is_syn,
                'flipped': False}


    def labels_to_image(self, labels):

        height = labels.shape[0]
        width = labels.shape[1]
        im_label = np.zeros((height, width, 3), dtype=np.uint8)
        for i in range(self.num_classes):
            I = np.where(labels == i)
            im_label[I[0], I[1], :] = self._class_colors[i]

        return im_label


    def process_label_image(self, label_image):
        """
        Convert a color label image into class index maps.
        """
        height = label_image.shape[0]
        width = label_image.shape[1]
        labels = np.zeros((height, width), dtype=np.int32)
        labels_all = np.zeros((height, width), dtype=np.int32)

        # label image is in BGR order
        index = label_image[:,:,2] + 256*label_image[:,:,1] + 256*256*label_image[:,:,0]
        for i in range(1, len(self._class_colors_all)):
            color = self._class_colors_all[i]
            ind = color[0] + 256*color[1] + 256*256*color[2]
            I = np.where(index == ind)
            labels_all[I[0], I[1]] = i

            ind = np.where(np.array(cfg.TRAIN.CLASSES) == i)[0]
            if len(ind) > 0:
                labels[I[0], I[1]] = ind

        return labels, labels_all
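    # Hedged sketch of the color matching above: each (r, g, b) class color
    # is packed into one 24-bit key so the whole BGR label image can be
    # matched with a single vectorized comparison:
    #
    #   key = r + 256 * g + 256 * 256 * b      # matches the BGR packing above
    #   mask = (index == key)                  # boolean per-pixel match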


    def evaluation(self, output_dir):

        filename = os.path.join(output_dir, 'results_posecnn.mat')
        if os.path.exists(filename):
            results_all = scipy.io.loadmat(filename)
            print('load results from file')
            print(filename)
            distances_sys = results_all['distances_sys']
            distances_non = results_all['distances_non']
            errors_rotation = results_all['errors_rotation']
            errors_translation = results_all['errors_translation']
            results_seq_id = results_all['results_seq_id'].flatten()
            results_frame_id = results_all['results_frame_id'].flatten()
            results_object_id = results_all['results_object_id'].flatten()
            results_cls_id = results_all['results_cls_id'].flatten()
        else:
            # save results
            num_max = 100000
            num_results = 2
            distances_sys = np.zeros((num_max, num_results), dtype=np.float32)
            distances_non = np.zeros((num_max, num_results), dtype=np.float32)
            errors_rotation = np.zeros((num_max, num_results), dtype=np.float32)
            errors_translation = np.zeros((num_max, num_results), dtype=np.float32)

SYMBOL INDEX (730 symbols across 62 files)

FILE: lib/datasets/background.py
  class BackgroundDataset (line 18) | class BackgroundDataset(data.Dataset, datasets.imdb):
    method __init__ (line 20) | def __init__(self, name):
    method __len__ (line 103) | def __len__(self):
    method __getitem__ (line 106) | def __getitem__(self, idx):
    method load (line 114) | def load(self, filename_color, filename_depth):

FILE: lib/datasets/dex_ycb.py
  class DexYCBDataset (line 108) | class DexYCBDataset(data.Dataset, datasets.imdb):
    method __init__ (line 110) | def __init__(self, setup, split):
    method __len__ (line 286) | def __len__(self):
    method get_bop_id_from_idx (line 290) | def get_bop_id_from_idx(self, idx):
    method __getitem__ (line 297) | def __getitem__(self, idx):
    method _get_image_blob (line 348) | def _get_image_blob(self, color_file, depth_file, scale_ind):
    method _get_label_blob (line 384) | def _get_label_blob(self, roidb, num_classes, im_scale, height, width):
    method _generate_vertex_targets (line 494) | def _generate_vertex_targets(self, im_label, cls_indexes, center, pose...
    method _get_default_path (line 533) | def _get_default_path(self):
    method _load_object_extents (line 540) | def _load_object_extents(self):
    method _load_object_points (line 551) | def _load_object_points(self, classes, extents, symmetry):
    method write_dop_results (line 582) | def write_dop_results(self, output_dir):
    method compute_box (line 655) | def compute_box(self, cls, intrinsic_matrix, RT):
    method evaluation (line 672) | def evaluation(self, output_dir):

FILE: lib/datasets/factory.py
  function get_dataset (line 53) | def get_dataset(name):
  function list_datasets (line 59) | def list_datasets():

FILE: lib/datasets/imdb.py
  class imdb (line 13) | class imdb(object):
    method __init__ (line 16) | def __init__(self):
    method name (line 23) | def name(self):
    method num_classes (line 27) | def num_classes(self):
    method classes (line 31) | def classes(self):
    method class_colors (line 35) | def class_colors(self):
    method cache_path (line 39) | def cache_path(self):
    method backproject (line 47) | def backproject(self, depth_cv, intrinsic_matrix, factor):
    method _build_uniform_poses (line 75) | def _build_uniform_poses(self):
    method evaluation (line 93) | def evaluation(self, output_dir):

FILE: lib/datasets/ycb_object.py
  class YCBObject (line 32) | class YCBObject(data.Dataset, datasets.imdb):
    method __init__ (line 33) | def __init__(self, image_set, ycb_object_path = None):
    method _render_item (line 179) | def _render_item(self):
    method __getitem__ (line 448) | def __getitem__(self, index):
    method __len__ (line 452) | def __len__(self):
    method _generate_vertex_targets (line 457) | def _generate_vertex_targets(self, im_label, cls_indexes, center, pose...
    method _generate_vertex_deltas (line 490) | def _generate_vertex_deltas(self, im_label, cls_indexes, center, poses...
    method _get_default_path (line 548) | def _get_default_path(self):
    method _load_object_extents (line 555) | def _load_object_extents(self):
    method _load_object_points (line 568) | def _load_object_points(self, classes, extents, symmetry):
    method labels_to_image (line 600) | def labels_to_image(self, labels):
    method process_label_image (line 612) | def process_label_image(self, label_image):

FILE: lib/datasets/ycb_self_supervision.py
  function VOCap (line 32) | def VOCap(rec, prec):
  class YCBSelfSupervision (line 49) | class YCBSelfSupervision(data.Dataset, datasets.imdb):
    method __init__ (line 50) | def __init__(self, image_set, ycb_self_supervision_path = None):
    method _render_item (line 205) | def _render_item(self):
    method __getitem__ (line 509) | def __getitem__(self, index):
    method _get_image_blob (line 556) | def _get_image_blob(self, roidb, scale_ind):
    method _get_label_blob (line 596) | def _get_label_blob(self, roidb, num_classes, im_scale, height, width):
    method _generate_vertex_targets (line 764) | def _generate_vertex_targets(self, im_label, cls_indexes, center, pose...
    method __len__ (line 797) | def __len__(self):
    method _get_default_path (line 801) | def _get_default_path(self):
    method _load_image_set_index (line 808) | def _load_image_set_index(self, image_set):
    method _load_object_points (line 841) | def _load_object_points(self, classes, extents, symmetry):
    method _load_object_extents (line 873) | def _load_object_extents(self):
    method image_path_at (line 887) | def image_path_at(self, i):
    method image_path_from_index (line 893) | def image_path_from_index(self, index):
    method depth_path_at (line 909) | def depth_path_at(self, i):
    method depth_path_from_index (line 915) | def depth_path_from_index(self, index):
    method metadata_path_at (line 925) | def metadata_path_at(self, i):
    method metadata_path_from_index (line 931) | def metadata_path_from_index(self, index):
    method gt_roidb (line 940) | def gt_roidb(self):
    method _load_ycb_self_supervision_annotation (line 953) | def _load_ycb_self_supervision_annotation(self, index):
    method labels_to_image (line 972) | def labels_to_image(self, labels):
    method process_label_image (line 984) | def process_label_image(self, label_image):
    method render_gt_pose (line 1008) | def render_gt_pose(self, meta_data):
    method evaluation (line 1110) | def evaluation(self, output_dir):

FILE: lib/datasets/ycb_video.py
  class YCBVideo (line 32) | class YCBVideo(data.Dataset, datasets.imdb):
    method __init__ (line 33) | def __init__(self, image_set, ycb_video_path = None):
    method __getitem__ (line 110) | def __getitem__(self, index):
    method _get_image_blob (line 147) | def _get_image_blob(self, roidb, scale_ind):
    method _get_label_blob (line 186) | def _get_label_blob(self, roidb, num_classes, im_scale, height, width):
    method _generate_vertex_targets (line 290) | def _generate_vertex_targets(self, im_label, cls_indexes, center, pose...
    method __len__ (line 329) | def __len__(self):
    method _get_default_path (line 333) | def _get_default_path(self):
    method _load_image_set_index (line 340) | def _load_image_set_index(self, image_set):
    method _load_object_points (line 396) | def _load_object_points(self, classes, extents, symmetry):
    method _load_object_extents (line 428) | def _load_object_extents(self):
    method image_path_at (line 442) | def image_path_at(self, i):
    method image_path_from_index (line 448) | def image_path_from_index(self, index):
    method depth_path_at (line 462) | def depth_path_at(self, i):
    method depth_path_from_index (line 468) | def depth_path_from_index(self, index):
    method label_path_at (line 478) | def label_path_at(self, i):
    method label_path_from_index (line 484) | def label_path_from_index(self, index):
    method metadata_path_at (line 494) | def metadata_path_at(self, i):
    method metadata_path_from_index (line 500) | def metadata_path_from_index(self, index):
    method gt_roidb (line 509) | def gt_roidb(self):
    method _load_ycb_video_annotation (line 537) | def _load_ycb_video_annotation(self, index):
    method labels_to_image (line 575) | def labels_to_image(self, labels):
    method process_label_image (line 587) | def process_label_image(self, label_image):
    method evaluation (line 611) | def evaluation(self, output_dir):

FILE: lib/fcn/config.py
  function get_output_dir (line 333) | def get_output_dir(imdb, net):
  function _merge_a_into_b (line 345) | def _merge_a_into_b(a, b):
  function cfg_from_file (line 373) | def cfg_from_file(filename):
  function yaml_from_file (line 382) | def yaml_from_file(filename):

FILE: lib/fcn/render_utils.py
  function render_image (line 15) | def render_image(dataset, im, rois, poses, poses_refine, labels):

FILE: lib/fcn/test_common.py
  function compute_index_sdf (line 15) | def compute_index_sdf(rois):
  function refine_pose (line 30) | def refine_pose(im_label, im_depth, rois, poses, meta_data, dataset, vis...

FILE: lib/fcn/test_dataset.py
  class AverageMeter (line 21) | class AverageMeter(object):
    method __init__ (line 24) | def __init__(self):
    method reset (line 27) | def reset(self):
    method update (line 33) | def update(self, val, n=1):
    method __repr__ (line 39) | def __repr__(self):
  function test (line 43) | def test(test_loader, background_loader, network, output_dir):
  function _vis_test (line 166) | def _vis_test(inputs, labels, out_label, out_vertex, rois, poses, poses_...

FILE: lib/fcn/test_imageset.py
  function test_image (line 22) | def test_image(network, dataset, im_color, im_depth=None, im_index=None):
  function vis_test (line 127) | def vis_test(dataset, im, im_depth, label, rois, poses, poses_refined, i...

FILE: lib/fcn/train.py
  class AverageMeter (line 18) | class AverageMeter(object):
    method __init__ (line 21) | def __init__(self):
    method reset (line 24) | def reset(self):
    method update (line 30) | def update(self, val, n=1):
    method __repr__ (line 36) | def __repr__(self):
  function loss_cross_entropy (line 40) | def loss_cross_entropy(scores, labels):
  function smooth_l1_loss (line 52) | def smooth_l1_loss(vertex_pred, vertex_targets, vertex_weights, sigma=1.0):
  function train (line 83) | def train(train_loader, background_loader, network, optimizer, epoch):
  function _get_bb3D (line 197) | def _get_bb3D(extent):
  function _vis_minibatch (line 216) | def _vis_minibatch(inputs, background, labels, vertex_targets, sample, c...

FILE: lib/layers/hard_label.py
  class HardLabelFunction (line 12) | class HardLabelFunction(Function):
    method forward (line 14) | def forward(ctx, prob, label, rand, threshold, sample_percentage):
    method backward (line 20) | def backward(ctx, top_diff):
  class HardLabel (line 26) | class HardLabel(nn.Module):
    method __init__ (line 27) | def __init__(self, threshold, sample_percentage):
    method forward (line 32) | def forward(self, prob, label, rand):

FILE: lib/layers/hough_voting.py
  class HoughVotingFunction (line 12) | class HoughVotingFunction(Function):
    method forward (line 14) | def forward(ctx, label, vertex, meta_data, extents, is_train, skip_pix...
    method backward (line 25) | def backward(ctx, top_diff_box, top_diff_pose):
  class HoughVoting (line 29) | class HoughVoting(nn.Module):
    method __init__ (line 30) | def __init__(self, is_train=0, skip_pixels=10, label_threshold=100, in...
    method forward (line 39) | def forward(self, label, vertex, meta_data, extents):

FILE: lib/layers/point_matching_loss.py
  class PMLossFunction (line 11) | class PMLossFunction(Function):
    method forward (line 13) | def forward(ctx, prediction, target, weight, points, symmetry, hard_an...
    method backward (line 22) | def backward(ctx, grad_loss):
  class PMLoss (line 29) | class PMLoss(nn.Module):
    method __init__ (line 30) | def __init__(self, hard_angle=15):
    method forward (line 34) | def forward(self, prediction, target, weight, points, symmetry):

FILE: lib/layers/pose_target_layer.py
  function pose_target_layer (line 17) | def pose_target_layer(rois, bbox_prob, bbox_pred, gt_boxes, poses, is_tr...
  function _compute_pose_targets (line 92) | def _compute_pose_targets(quaternions, labels, num_classes):

FILE: lib/layers/posecnn_layers.cpp
  function backproject_forward (line 24) | std::vector<at::Tensor> backproject_forward(
  function hard_label_forward (line 47) | std::vector<at::Tensor> hard_label_forward(
  function hard_label_backward (line 61) | std::vector<at::Tensor> hard_label_backward(
  function hough_voting_forward (line 84) | std::vector<at::Tensor> hough_voting_forward(
  function roi_pool_forward (line 125) | std::vector<at::Tensor> roi_pool_forward(
  function roi_pool_backward (line 138) | std::vector<at::Tensor> roi_pool_backward(
  function ROIAlign_forward (line 178) | at::Tensor ROIAlign_forward(const at::Tensor& input,
  function ROIAlign_backward (line 188) | at::Tensor ROIAlign_backward(const at::Tensor& grad,
  function pml_forward (line 219) | std::vector<at::Tensor> pml_forward(
  function pml_backward (line 236) | std::vector<at::Tensor> pml_backward(
  function sdf_loss_forward (line 263) | std::vector<at::Tensor> sdf_loss_forward(
  function sdf_loss_backward (line 281) | std::vector<at::Tensor> sdf_loss_backward(
  function PYBIND11_MODULE (line 293) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: lib/layers/roi_align.py
  class _ROIAlign (line 10) | class _ROIAlign(Function):
    method forward (line 12) | def forward(ctx, input, roi, output_size, spatial_scale, sampling_ratio):
    method backward (line 25) | def backward(ctx, grad_output):
  class ROIAlign (line 49) | class ROIAlign(nn.Module):
    method __init__ (line 50) | def __init__(self, output_size, spatial_scale, sampling_ratio):
    method forward (line 56) | def forward(self, input, rois):
    method __repr__ (line 60) | def __repr__(self):

FILE: lib/layers/roi_pooling.py
  class RoIPoolFunction (line 12) | class RoIPoolFunction(Function):
    method forward (line 14) | def forward(ctx, features, rois, pool_height, pool_width, spatial_scale):
    method backward (line 25) | def backward(ctx, top_diff):
  class RoIPool (line 34) | class RoIPool(nn.Module):
    method __init__ (line 35) | def __init__(self, pool_height, pool_width, spatial_scale):
    method forward (line 42) | def forward(self, features, rois):

FILE: lib/layers/roi_target_layer.py
  function roi_target_layer (line 17) | def roi_target_layer(rpn_rois, gt_boxes):
  function _get_bbox_regression_labels (line 54) | def _get_bbox_regression_labels(bbox_target_data, num_classes):
  function _compute_targets (line 79) | def _compute_targets(ex_rois, gt_rois, labels):
  function _sample_rois (line 95) | def _sample_rois(all_rois, gt_boxes, num_classes):

FILE: lib/layers/sdf_matching_loss.py
  class SDFLossFunction (line 11) | class SDFLossFunction(Function):
    method forward (line 13) | def forward(ctx, pose_delta, pose_init, sdf_grids, sdf_limits, points,...
    method backward (line 26) | def backward(ctx, grad_loss, grad_sdf_values, grad_se3, grad_JTJ, grad...
  class SDFLoss (line 33) | class SDFLoss(nn.Module):
    method __init__ (line 34) | def __init__(self):
    method forward (line 37) | def forward(self, pose_delta, pose_init, sdf_grids, sdf_limits, points...

FILE: lib/networks/PoseCNN.py
  function log_softmax_high_dimension (line 26) | def log_softmax_high_dimension(input):
  function softmax_high_dimension (line 42) | def softmax_high_dimension(input):
  function conv (line 56) | def conv(in_planes, out_planes, kernel_size=3, stride=1, relu=True):
  function fc (line 65) | def fc(in_planes, out_planes, relu=True):
  function upsample (line 74) | def upsample(scale_factor):
  class PoseCNN (line 78) | class PoseCNN(nn.Module):
    method __init__ (line 80) | def __init__(self, num_classes, num_units):
    method forward (line 152) | def forward(self, x, label_gt, meta_data, extents, gt_boxes, poses, po...
    method weight_parameters (line 244) | def weight_parameters(self):
    method bias_parameters (line 247) | def bias_parameters(self):
  function posecnn (line 251) | def posecnn(num_classes, num_units, data=None):

FILE: lib/sdf/_init_paths.py
  function add_path (line 10) | def add_path(path):

FILE: lib/sdf/multi_sdf_optimizer.py
  class multi_sdf_optimizer (line 7) | class multi_sdf_optimizer():
    method __init__ (line 8) | def __init__(self, sdf_file, lr=0.01, online_calib=True, use_gpu=False):
    method look_up (line 60) | def look_up(self, samples_x, samples_y, samples_z):
    method compute_dist (line 78) | def compute_dist(self, d_pose, T_oc_0, ps_c):
    method compute_dist_multiview (line 86) | def compute_dist_multiview(self, d_pose, d_pose_ext, T_oc_0, T_rc_0, p...
    method refine_pose_singleview (line 102) | def refine_pose_singleview(self, T_co_0, ps_c, steps=100):
    method refine_pose_multiview (line 137) | def refine_pose_multiview(self, T_co_0, T_rc_0, ps_c, T_r0rv, steps=100):

FILE: lib/sdf/sdf_optimizer.py
  class sdf_optimizer (line 13) | class sdf_optimizer():
    method __init__ (line 14) | def __init__(self, classes, sdf_files, lr=0.01, optimizer='Adam', use_...
    method look_up (line 79) | def look_up(self, samples_x, samples_y, samples_z):
    method compute_dist (line 97) | def compute_dist(self, d_pose, T_oc_0, ps_c):
    method refine_pose (line 105) | def refine_pose(self, T_co_0, ps_c, steps=100):
    method refine_pose_layer (line 166) | def refine_pose_layer(self, T_oc_0, points, steps=100):

FILE: lib/sdf/sdf_utils.py
  function read_sdf (line 18) | def read_sdf(sdf_file):
  function skew (line 34) | def skew(w, gpu=False):
  function Exp (line 49) | def Exp(dq, gpu):
  function Oplus (line 73) | def Oplus(T, v, gpu=False):

FILE: lib/sdf/test_sdf_optimizer.py
  function Twc_np (line 9) | def Twc_np(pose):
  class SignedDensityField (line 26) | class SignedDensityField(object):
    method __init__ (line 31) | def __init__(self, data, origin, delta):
    method _rel_pos_to_idxes (line 38) | def _rel_pos_to_idxes(self, rel_pos):
    method get_distance (line 43) | def get_distance(self, rel_pos):
    method dump (line 48) | def dump(self, pkl_file):
    method visualize (line 55) | def visualize(self, max_dist=0.1):
    method from_sdf (line 79) | def from_sdf(cls, sdf_file):
    method from_pkl (line 100) | def from_pkl(cls, pkl_file):

FILE: lib/utils/bbox_transform.py
  function bbox_transform (line 11) | def bbox_transform(ex_rois, gt_rois):
  function bbox_transform_inv (line 32) | def bbox_transform_inv(boxes, deltas):
  function clip_boxes (line 65) | def clip_boxes(boxes, im_shape):

FILE: lib/utils/blob.py
  function im_list_to_blob (line 13) | def im_list_to_blob(ims, num_channels):
  function prep_im_for_blob (line 31) | def prep_im_for_blob(im, pixel_means, target_size, max_size):
  function pad_im (line 48) | def pad_im(im, factor, value=0):
  function unpad_im (line 61) | def unpad_im(im, factor):
  function chromatic_transform (line 74) | def chromatic_transform(im, label=None, d_h=None, d_s=None, d_l=None):
  function add_noise (line 102) | def add_noise(image, level = 0.1):
  function add_noise_depth (line 132) | def add_noise_depth(image, level = 0.1):
  function add_noise_depth_cuda (line 141) | def add_noise_depth_cuda(image, level = 0.1):
  function add_gaussian_noise_cuda (line 147) | def add_gaussian_noise_cuda(image, level = 0.1):
  function add_noise_cuda (line 157) | def add_noise_cuda(image, level = 0.1):

FILE: lib/utils/nms.py
  function nms (line 7) | def nms(dets, thresh):

FILE: lib/utils/pose_error.py
  function VOCap (line 13) | def VOCap(rec, prec):
  function transform_pts_Rt (line 31) | def transform_pts_Rt(pts, R, t):
  function reproj (line 44) | def reproj(K, R_est, t_est, R_gt, t_gt, pts):
  function add (line 74) | def add(R_est, t_est, R_gt, t_gt, pts):
  function adi (line 90) | def adi(R_est, t_est, R_gt, t_gt, pts):
  function re (line 111) | def re(R_est, R_gt):
  function te (line 126) | def te(t_est, t_gt):

FILE: lib/utils/se3.py
  function se3_inverse (line 10) | def se3_inverse(RT):
  function se3_mul (line 18) | def se3_mul(RT1, RT2):
  function egocentric2allocentric (line 32) | def egocentric2allocentric(qt, T):
  function allocentric2egocentric (line 40) | def allocentric2egocentric(qt, T):
  function T_inv_transform (line 48) | def T_inv_transform(T_src, T_tgt):
  function rotation_x (line 63) | def rotation_x(theta):
  function rotation_y (line 73) | def rotation_y(theta):
  function rotation_z (line 83) | def rotation_z(theta):

FILE: lib/utils/segmentation_evaluation.py
  function multilabel_metrics (line 10) | def multilabel_metrics(prediction, gt, num_classes):

FILE: lib/utils/setup.py
  function find_in_path (line 12) | def find_in_path(name, path):
  function locate_cuda (line 21) | def locate_cuda():
  function customize_compiler_for_nvcc (line 60) | def customize_compiler_for_nvcc(self):
  class custom_build_ext (line 99) | class custom_build_ext(build_ext):
    method build_extensions (line 100) | def build_extensions(self):
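
Note: customize_compiler_for_nvcc is the widely used Fast R-CNN-era distutils patch that routes .cu sources to nvcc while leaving other files with the default compiler. The core of that pattern (CUDA discovery and flag dictionaries omitted):

    # Route .cu files to nvcc inside distutils' unix compiler (sketch;
    # extra_postargs is assumed to be a {'gcc': [...], 'nvcc': [...]} dict).
    import os

    def customize_compiler_for_nvcc(self, nvcc_path='/usr/local/cuda/bin/nvcc'):
        self.src_extensions.append('.cu')
        default_compiler_so = self.compiler_so
        super_compile = self._compile

        def _compile(obj, src, ext, cc_args, extra_postargs, pp_opts):
            if os.path.splitext(src)[1] == '.cu':
                self.set_executable('compiler_so', nvcc_path)
                postargs = extra_postargs['nvcc']
            else:
                postargs = extra_postargs['gcc']
            super_compile(obj, src, ext, cc_args, postargs, pp_opts)
            self.compiler_so = default_compiler_so   # restore after each file

        self._compile = _compile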

FILE: ros/_init_paths.py
  function add_path (line 10) | def add_path(path):

FILE: ros/collect_images_realsense.py
  class ImageListener (line 21) | class ImageListener:
    method __init__ (line 23) | def __init__(self):
    method callback (line 54) | def callback(self, rgb, depth):

FILE: ros/test_images.py
  class ImageListener (line 49) | class ImageListener:
    method __init__ (line 51) | def __init__(self, network, dataset):
    method callback_rgbd (line 113) | def callback_rgbd(self, rgb, depth):
    method run_network (line 133) | def run_network(self):
  function parse_args (line 225) | def parse_args():

FILE: tools/_init_paths.py
  function add_path (line 10) | def add_path(path):

FILE: tools/test_images.py
  function parse_args (line 32) | def parse_args():

FILE: tools/test_net.py
  function parse_args (line 30) | def parse_args():

FILE: tools/train_net.py
  function parse_args (line 30) | def parse_args():

FILE: ycb_render/cpp/query_devices.cpp
  type EGLInternalData2 (line 16) | struct EGLInternalData2 {
    method EGLInternalData2 (line 31) | EGLInternalData2()
  function main (line 37) | int main(){

FILE: ycb_render/cpp/test_device.cpp
  type EGLInternalData2 (line 16) | struct EGLInternalData2 {
    method EGLInternalData2 (line 30) | EGLInternalData2()
  function main (line 36) | int main(int argc, char ** argv){

FILE: ycb_render/cpp/ycb_renderer.cpp
  type EGLInternalData2 (line 28) | struct EGLInternalData2 {
    method EGLInternalData2 (line 42) | EGLInternalData2()
  class CppYCBRenderer (line 49) | class CppYCBRenderer{
    method CppYCBRenderer (line 51) | CppYCBRenderer(int w, int h, int d):m_windowHeight(h),m_windowWidth(w)...
    method init (line 70) | int init() {
    method query (line 220) | void query() {
    method release (line 231) | void release()
    method draw (line 248) | void draw(py::array_t<float> x) {
    method draw_py (line 273) | void draw_py(py::array_t<float> x) {
    method map_tensor (line 284) | void map_tensor(GLuint tid, int width, int height, std::size_t data)
  function PYBIND11_MODULE (line 325) | PYBIND11_MODULE(CppYCBRenderer, m) {

FILE: ycb_render/get_available_devices.py
  function get_available_devices (line 9) | def get_available_devices():

FILE: ycb_render/glad/EGL/eglplatform.h
  type HDC (line 76) | typedef HDC     EGLNativeDisplayType;
  type HBITMAP (line 77) | typedef HBITMAP EGLNativePixmapType;
  type HWND (line 78) | typedef HWND    EGLNativeWindowType;
  type EGLNativeDisplayType (line 82) | typedef int   EGLNativeDisplayType;
  type wl_display (line 88) | struct wl_display
  type wl_egl_pixmap (line 89) | struct wl_egl_pixmap
  type wl_egl_window (line 90) | struct wl_egl_window
  type gbm_device (line 94) | struct gbm_device
  type gbm_bo (line 95) | struct gbm_bo
  type ANativeWindow (line 100) | struct ANativeWindow
  type egl_native_pixmap_t (line 101) | struct egl_native_pixmap_t
  type ANativeWindow (line 103) | struct ANativeWindow
  type egl_native_pixmap_t (line 104) | struct egl_native_pixmap_t
  type khronos_uintptr_t (line 112) | typedef khronos_uintptr_t EGLNativePixmapType;
  type khronos_uintptr_t (line 113) | typedef khronos_uintptr_t EGLNativeWindowType;
  type Display (line 121) | typedef Display *EGLNativeDisplayType;
  type Pixmap (line 122) | typedef Pixmap   EGLNativePixmapType;
  type Window (line 123) | typedef Window   EGLNativeWindowType;
  type khronos_uintptr_t (line 130) | typedef khronos_uintptr_t	 EGLNativePixmapType;
  type khronos_uintptr_t (line 131) | typedef khronos_uintptr_t	 EGLNativeWindowType;
  type EGLNativeDisplayType (line 138) | typedef EGLNativeDisplayType NativeDisplayType;
  type EGLNativePixmapType (line 139) | typedef EGLNativePixmapType  NativePixmapType;
  type EGLNativeWindowType (line 140) | typedef EGLNativeWindowType  NativeWindowType;
  type khronos_int32_t (line 150) | typedef khronos_int32_t EGLint;

FILE: ycb_render/glad/KHR/khrplatform.h
  type khronos_int32_t (line 144) | typedef int32_t                 khronos_int32_t;
  type khronos_uint32_t (line 145) | typedef uint32_t                khronos_uint32_t;
  type khronos_int64_t (line 146) | typedef int64_t                 khronos_int64_t;
  type khronos_uint64_t (line 147) | typedef uint64_t                khronos_uint64_t;
  type khronos_int32_t (line 157) | typedef int32_t                 khronos_int32_t;
  type khronos_uint32_t (line 158) | typedef uint32_t                khronos_uint32_t;
  type khronos_int64_t (line 159) | typedef int64_t                 khronos_int64_t;
  type khronos_uint64_t (line 160) | typedef uint64_t                khronos_uint64_t;
  type __int32 (line 169) | typedef __int32                 khronos_int32_t;
  type khronos_uint32_t (line 170) | typedef unsigned __int32        khronos_uint32_t;
  type __int64 (line 171) | typedef __int64                 khronos_int64_t;
  type khronos_uint64_t (line 172) | typedef unsigned __int64        khronos_uint64_t;
  type khronos_int32_t (line 181) | typedef int                     khronos_int32_t;
  type khronos_uint32_t (line 182) | typedef unsigned int            khronos_uint32_t;
  type khronos_int64_t (line 184) | typedef long int                khronos_int64_t;
  type khronos_uint64_t (line 185) | typedef unsigned long int       khronos_uint64_t;
  type khronos_int64_t (line 187) | typedef long long int           khronos_int64_t;
  type khronos_uint64_t (line 188) | typedef unsigned long long int  khronos_uint64_t;
  type khronos_int32_t (line 198) | typedef int                     khronos_int32_t;
  type khronos_uint32_t (line 199) | typedef unsigned int            khronos_uint32_t;
  type khronos_int32_t (line 209) | typedef int32_t                 khronos_int32_t;
  type khronos_uint32_t (line 210) | typedef uint32_t                khronos_uint32_t;
  type khronos_int64_t (line 211) | typedef int64_t                 khronos_int64_t;
  type khronos_uint64_t (line 212) | typedef uint64_t                khronos_uint64_t;
  type khronos_int8_t (line 222) | typedef signed   char          khronos_int8_t;
  type khronos_uint8_t (line 223) | typedef unsigned char          khronos_uint8_t;
  type khronos_int16_t (line 224) | typedef signed   short int     khronos_int16_t;
  type khronos_uint16_t (line 225) | typedef unsigned short int     khronos_uint16_t;
  type khronos_intptr_t (line 233) | typedef signed   long long int khronos_intptr_t;
  type khronos_uintptr_t (line 234) | typedef unsigned long long int khronos_uintptr_t;
  type khronos_ssize_t (line 235) | typedef signed   long long int khronos_ssize_t;
  type khronos_usize_t (line 236) | typedef unsigned long long int khronos_usize_t;
  type khronos_intptr_t (line 238) | typedef signed   long  int     khronos_intptr_t;
  type khronos_uintptr_t (line 239) | typedef unsigned long  int     khronos_uintptr_t;
  type khronos_ssize_t (line 240) | typedef signed   long  int     khronos_ssize_t;
  type khronos_usize_t (line 241) | typedef unsigned long  int     khronos_usize_t;
  type khronos_float_t (line 248) | typedef          float         khronos_float_t;
  type khronos_uint64_t (line 261) | typedef khronos_uint64_t       khronos_utime_nanoseconds_t;
  type khronos_int64_t (line 262) | typedef khronos_int64_t        khronos_stime_nanoseconds_t;
  type khronos_boolean_enum_t (line 278) | typedef enum {

FILE: ycb_render/glad/egl.c
  function load_EGL_VERSION_1_0 (line 94) | static void load_EGL_VERSION_1_0( GLADuserptrloadfunc load, void* userpt...
  function load_EGL_VERSION_1_1 (line 121) | static void load_EGL_VERSION_1_1( GLADuserptrloadfunc load, void* userpt...
  function load_EGL_VERSION_1_2 (line 128) | static void load_EGL_VERSION_1_2( GLADuserptrloadfunc load, void* userpt...
  function load_EGL_VERSION_1_4 (line 136) | static void load_EGL_VERSION_1_4( GLADuserptrloadfunc load, void* userpt...
  function load_EGL_VERSION_1_5 (line 140) | static void load_EGL_VERSION_1_5( GLADuserptrloadfunc load, void* userpt...
  function load_EGL_EXT_platform_base (line 153) | static void load_EGL_EXT_platform_base( GLADuserptrloadfunc load, void* ...
  function load_EGL_EXT_device_enumeration (line 159) | static void load_EGL_EXT_device_enumeration( GLADuserptrloadfunc load, v...
  function load_EGL_EXT_device_query (line 163) | static void load_EGL_EXT_device_query( GLADuserptrloadfunc load, void* u...
  function load_EGL_EXT_device_base (line 170) | static void load_EGL_EXT_device_base( GLADuserptrloadfunc load, void* us...
  function get_exts (line 181) | static int get_exts(EGLDisplay display, const char **extensions) {
  function has_ext (line 187) | static int has_ext(const char *extensions, const char *ext) {
  function GLADapiproc (line 207) | static GLADapiproc glad_egl_get_proc_from_userptr(const char *name, void...
  function find_extensionsEGL (line 211) | static int find_extensionsEGL(EGLDisplay display) {
  function find_coreEGL (line 226) | static int find_coreEGL(EGLDisplay display) {
  function gladLoadEGLUserPtr (line 263) | int gladLoadEGLUserPtr(EGLDisplay display, GLADuserptrloadfunc load, voi...
  function gladLoadEGL (line 288) | int gladLoadEGL(EGLDisplay display, GLADloadfunc load) {
  function glad_close_dlopen_handle (line 338) | static void glad_close_dlopen_handle(void* handle) {
  function GLADapiproc (line 348) | static GLADapiproc glad_dlsym_handle(void* handle, const char *name) {
  type _glad_egl_userptr (line 362) | struct _glad_egl_userptr {
  function GLADapiproc (line 367) | static GLADapiproc glad_egl_get_proc(const char* name, void *vuserptr) {
  function gladLoaderLoadEGL (line 381) | int gladLoaderLoadEGL(EGLDisplay display) {
  function gladLoaderUnloadEGL (line 415) | void gladLoaderUnloadEGL() {

FILE: ycb_render/glad/gl.c
  function load_GL_VERSION_1_0 (line 1097) | static void load_GL_VERSION_1_0( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_1_1 (line 1406) | static void load_GL_VERSION_1_1( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_1_2 (line 1439) | static void load_GL_VERSION_1_2( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_1_3 (line 1446) | static void load_GL_VERSION_1_3( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_1_4 (line 1495) | static void load_GL_VERSION_1_4( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_1_5 (line 1545) | static void load_GL_VERSION_1_5( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_2_0 (line 1567) | static void load_GL_VERSION_2_0( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_2_1 (line 1663) | static void load_GL_VERSION_2_1( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_3_0 (line 1672) | static void load_GL_VERSION_3_0( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_3_1 (line 1759) | static void load_GL_VERSION_3_1( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_3_2 (line 1777) | static void load_GL_VERSION_3_2( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_3_3 (line 1799) | static void load_GL_VERSION_3_3( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_4_0 (line 1860) | static void load_GL_VERSION_4_0( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_4_1 (line 1909) | static void load_GL_VERSION_4_1( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_4_2 (line 2000) | static void load_GL_VERSION_4_2( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_4_3 (line 2015) | static void load_GL_VERSION_4_3( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_4_4 (line 2062) | static void load_GL_VERSION_4_4( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_4_5 (line 2074) | static void load_GL_VERSION_4_5( GLADuserptrloadfunc load, void* userptr) {
  function load_GL_VERSION_4_6 (line 2199) | static void load_GL_VERSION_4_6( GLADuserptrloadfunc load, void* userptr) {
  function get_exts (line 2215) | static int get_exts( int version, const char **out_exts, unsigned int *o...
  function free_exts (line 2260) | static void free_exts(char **exts_i, unsigned int num_exts_i) {
  function has_ext (line 2270) | static int has_ext(int version, const char *exts, unsigned int num_exts_...
  function GLADapiproc (line 2303) | static GLADapiproc glad_gl_get_proc_from_userptr(const char* name, void ...
  function find_extensionsGL (line 2307) | static int find_extensionsGL( int version) {
  function find_coreGL (line 2319) | static int find_coreGL(void) {
  function gladLoadGLUserPtr (line 2363) | int gladLoadGLUserPtr( GLADuserptrloadfunc load, void *userptr) {
  function gladLoadGL (line 2399) | int gladLoadGL( GLADloadfunc load) {
  function glad_close_dlopen_handle (line 2451) | static void glad_close_dlopen_handle(void* handle) {
  function GLADapiproc (line 2461) | static GLADapiproc glad_dlsym_handle(void* handle, const char *name) {
  type _glad_gl_userptr (line 2476) | struct _glad_gl_userptr {
  function GLADapiproc (line 2481) | static GLADapiproc glad_gl_get_proc(const char *name, void *vuserptr) {
  function gladLoaderLoadGL (line 2497) | int gladLoaderLoadGL(void) {

FILE: ycb_render/glad/glad/egl.h
  type GLADapiproc (line 118) | typedef GLADapiproc (*GLADloadfunc)(const char *name);
  type GLADapiproc (line 119) | typedef GLADapiproc (*GLADuserptrloadfunc)(const char *name, void *userp...
  type AHardwareBuffer (line 314) | struct AHardwareBuffer
  type EGLBoolean (line 316) | typedef unsigned int EGLBoolean;
  type EGLenum (line 317) | typedef unsigned int EGLenum;
  type EGLAttribKHR (line 318) | typedef intptr_t EGLAttribKHR;
  type EGLAttrib (line 319) | typedef intptr_t EGLAttrib;
  type khronos_utime_nanoseconds_t (line 337) | typedef khronos_utime_nanoseconds_t EGLTimeKHR;
  type khronos_utime_nanoseconds_t (line 338) | typedef khronos_utime_nanoseconds_t EGLTime;
  type khronos_utime_nanoseconds_t (line 339) | typedef khronos_utime_nanoseconds_t EGLTimeNV;
  type khronos_utime_nanoseconds_t (line 340) | typedef khronos_utime_nanoseconds_t EGLuint64NV;
  type khronos_uint64_t (line 341) | typedef khronos_uint64_t EGLuint64KHR;
  type khronos_stime_nanoseconds_t (line 342) | typedef khronos_stime_nanoseconds_t EGLnsecsANDROID;
  type EGLNativeFileDescriptorKHR (line 343) | typedef int EGLNativeFileDescriptorKHR;
  type khronos_ssize_t (line 344) | typedef khronos_ssize_t EGLsizeiANDROID;
  type EGLsizeiANDROID (line 346) | typedef EGLsizeiANDROID (*EGLGetBlobFuncANDROID) (const void *key, EGLsi...
  type EGLClientPixmapHI (line 347) | struct EGLClientPixmapHI {
  type const (line 390) | typedef EGLSurface (GLAD_API_PTR *PFNEGLCREATEPLATFORMPIXMAPSURFACEPROC)...
  type const (line 396) | typedef EGLContext (GLAD_API_PTR *PFNEGLCREATECONTEXTPROC)(EGLDisplay   ...
  type EGLint (line 397) | typedef EGLBoolean (GLAD_API_PTR *PFNEGLCHOOSECONFIGPROC)(EGLDisplay   d...
  type const (line 401) | typedef EGLSurface (GLAD_API_PTR *PFNEGLCREATEPLATFORMPIXMAPSURFACEEXTPR...
  type const (line 403) | typedef EGLSurface (GLAD_API_PTR *PFNEGLCREATEPLATFORMWINDOWSURFACEPROC)...
  type const (line 407) | typedef EGLDisplay (GLAD_API_PTR *PFNEGLGETPLATFORMDISPLAYEXTPROC)(EGLen...
  type const (line 411) | typedef EGLDisplay (GLAD_API_PTR *PFNEGLGETPLATFORMDISPLAYPROC)(EGLenum ...
  type const (line 426) | typedef EGLSurface (GLAD_API_PTR *PFNEGLCREATEPLATFORMWINDOWSURFACEEXTPR...

FILE: ycb_render/glad/glad/gl.h
  type GLADapiproc (line 123) | typedef GLADapiproc (*GLADloadfunc)(const char *name);
  type GLADapiproc (line 124) | typedef GLADapiproc (*GLADuserptrloadfunc)(const char *name, void *userp...
  type __int32 (line 1962) | typedef __int32 int32_t;
  type __int64 (line 1963) | typedef __int64 int64_t;
  type GLenum (line 1970) | typedef unsigned int GLenum;
  type GLboolean (line 1971) | typedef unsigned char GLboolean;
  type GLbitfield (line 1972) | typedef unsigned int GLbitfield;
  type GLvoid (line 1973) | typedef void GLvoid;
  type GLbyte (line 1974) | typedef signed char GLbyte;
  type GLshort (line 1975) | typedef short GLshort;
  type GLint (line 1976) | typedef int GLint;
  type GLclampx (line 1977) | typedef int GLclampx;
  type GLubyte (line 1978) | typedef unsigned char GLubyte;
  type GLushort (line 1979) | typedef unsigned short GLushort;
  type GLuint (line 1980) | typedef unsigned int GLuint;
  type GLsizei (line 1981) | typedef int GLsizei;
  type GLfloat (line 1982) | typedef float GLfloat;
  type GLclampf (line 1983) | typedef float GLclampf;
  type GLdouble (line 1984) | typedef double GLdouble;
  type GLclampd (line 1985) | typedef double GLclampd;
  type GLchar (line 1988) | typedef char GLchar;
  type GLcharARB (line 1989) | typedef char GLcharARB;
  type GLhandleARB (line 1993) | typedef unsigned int GLhandleARB;
  type GLhalfARB (line 1995) | typedef unsigned short GLhalfARB;
  type GLhalf (line 1996) | typedef unsigned short GLhalf;
  type GLint (line 1997) | typedef GLint GLfixed;
  type khronos_intptr_t (line 1999) | typedef khronos_intptr_t GLintptr;
  type khronos_intptr_t (line 2001) | typedef khronos_intptr_t GLintptr;
  type khronos_ssize_t (line 2004) | typedef khronos_ssize_t GLsizeiptr;
  type khronos_ssize_t (line 2006) | typedef khronos_ssize_t GLsizeiptr;
  type GLint64 (line 2008) | typedef int64_t GLint64;
  type GLuint64 (line 2009) | typedef uint64_t GLuint64;
  type GLintptrARB (line 2011) | typedef long GLintptrARB;
  type GLintptrARB (line 2013) | typedef ptrdiff_t GLintptrARB;
  type GLsizeiptrARB (line 2016) | typedef long GLsizeiptrARB;
  type GLsizeiptrARB (line 2018) | typedef ptrdiff_t GLsizeiptrARB;
  type GLint64EXT (line 2020) | typedef int64_t GLint64EXT;
  type GLuint64EXT (line 2021) | typedef uint64_t GLuint64EXT;
  type __GLsync (line 2022) | struct __GLsync
  type _cl_context (line 2023) | struct _cl_context
  type _cl_event (line 2024) | struct _cl_event
  type GLhalfNV (line 2029) | typedef unsigned short GLhalfNV;
  type GLintptr (line 2030) | typedef GLintptr GLvdpauSurfaceNV;
  type GLubyte (line 2200) | typedef const GLubyte * (GLAD_API_PTR *PFNGLGETSTRINGIPROC)(GLenum   nam...
  type GLuint (line 2325) | typedef GLboolean (GLAD_API_PTR *PFNGLARETEXTURESRESIDENTPROC)(GLsizei  ...
  type const (line 2613) | typedef GLuint (GLAD_API_PTR *PFNGLGETSUBROUTINEINDEXPROC)(GLuint   prog...
  type const (line 2798) | typedef GLint (GLAD_API_PTR *PFNGLGETPROGRAMRESOURCELOCATIONINDEXPROC)(G...
  type const (line 2850) | typedef GLint (GLAD_API_PTR *PFNGLGETSUBROUTINEUNIFORMLOCATIONPROC)(GLui...
  type const (line 2863) | typedef GLuint (GLAD_API_PTR *PFNGLGETPROGRAMRESOURCEINDEXPROC)(GLuint  ...
  type GLubyte (line 2893) | typedef const GLubyte * (GLAD_API_PTR *PFNGLGETSTRINGPROC)(GLenum   name);
  type const (line 3110) | typedef GLint (GLAD_API_PTR *PFNGLGETPROGRAMRESOURCELOCATIONPROC)(GLuint...

FILE: ycb_render/glad/glad/glx.h
  type GLADapiproc (line 129) | typedef GLADapiproc (*GLADloadfunc)(const char *name);
  type GLADapiproc (line 130) | typedef GLADapiproc (*GLADuserptrloadfunc)(const char *name, void *userp...
  type __int32 (line 453) | typedef __int32 int32_t;
  type __int64 (line 454) | typedef __int64 int64_t;
  type XID (line 495) | typedef XID GLXFBConfigID;
  type __GLXFBConfigRec (line 496) | struct __GLXFBConfigRec
  type XID (line 497) | typedef XID GLXContextID;
  type __GLXcontextRec (line 498) | struct __GLXcontextRec
  type XID (line 499) | typedef XID GLXPixmap;
  type XID (line 500) | typedef XID GLXDrawable;
  type XID (line 501) | typedef XID GLXWindow;
  type XID (line 502) | typedef XID GLXPbuffer;
  type XID (line 504) | typedef XID GLXVideoCaptureDeviceNV;
  type GLXVideoDeviceNV (line 505) | typedef unsigned int GLXVideoDeviceNV;
  type XID (line 506) | typedef XID GLXVideoSourceSGIX;
  type XID (line 507) | typedef XID GLXFBConfigIDSGIX;
  type __GLXFBConfigRec (line 508) | struct __GLXFBConfigRec
  type XID (line 509) | typedef XID GLXPbufferSGIX;
  type GLXPbufferClobberEvent (line 510) | typedef struct {
  type GLXBufferSwapComplete (line 523) | typedef struct {
  type GLXEvent (line 534) | typedef union __GLXEvent {
  type GLXStereoNotifyEventEXT (line 539) | typedef struct {
  type GLXBufferClobberEventSGIX (line 549) | typedef struct {
  type GLXHyperpipeNetworkSGIX (line 562) | typedef struct {
  type GLXHyperpipeConfigSGIX (line 566) | typedef struct {
  type GLXPipeRect (line 572) | typedef struct {
  type GLXPipeRectLimits (line 577) | typedef struct {
  type const (line 725) | typedef GLXContext (GLAD_API_PTR *PFNGLXCREATECONTEXTATTRIBSARBPROC)(Dis...
  type GLXHyperpipeConfigSGIX (line 728) | typedef GLXHyperpipeConfigSGIX * (GLAD_API_PTR *PFNGLXQUERYHYPERPIPECONF...
  type GLXHyperpipeNetworkSGIX (line 731) | typedef GLXHyperpipeNetworkSGIX * (GLAD_API_PTR *PFNGLXQUERYHYPERPIPENET...
  type GLXFBConfigSGIX (line 744) | typedef GLXFBConfigSGIX * (GLAD_API_PTR *PFNGLXCHOOSEFBCONFIGSGIXPROC)(D...
  type XVisualInfo (line 771) | typedef XVisualInfo * (GLAD_API_PTR *PFNGLXGETVISUALFROMFBCONFIGPROC)(Di...
  type Display (line 797) | typedef Display * (GLAD_API_PTR *PFNGLXGETCURRENTDISPLAYPROC)(void);
  type Display (line 808) | typedef Display * (GLAD_API_PTR *PFNGLXGETCURRENTDISPLAYEXTPROC)(void);
  type GLXFBConfig (line 809) | typedef GLXFBConfig * (GLAD_API_PTR *PFNGLXGETFBCONFIGSPROC)(Display  * ...
  type XVisualInfo (line 814) | typedef XVisualInfo * (GLAD_API_PTR *PFNGLXCHOOSEVISUALPROC)(Display  * ...
  type XVisualInfo (line 824) | typedef XVisualInfo * (GLAD_API_PTR *PFNGLXGETVISUALFROMFBCONFIGSGIXPROC...
  type GLXVideoCaptureDeviceNV (line 837) | typedef GLXVideoCaptureDeviceNV * (GLAD_API_PTR *PFNGLXENUMERATEVIDEOCAP...
  type const (line 838) | typedef GLXContext (GLAD_API_PTR *PFNGLXCREATEASSOCIATEDCONTEXTATTRIBSAM...
  type GLXFBConfig (line 845) | typedef GLXFBConfig * (GLAD_API_PTR *PFNGLXCHOOSEFBCONFIGPROC)(Display  ...

FILE: ycb_render/glad/glx_dyn.c
  type Display (line 9) | typedef Display* (* PFNXOPENDISPLAY) (_Xconst char* a);
  type Screen (line 10) | typedef Screen* (* PFNXDEFAULTSCREENOFDISPLAY) (Display*);
  type X11Struct (line 12) | typedef struct
  function initX11Struct (line 20) | void initX11Struct(X11Struct *x11) {
  function initX11Struct (line 34) | void initX11Struct(X11Struct *x11) {
  function load_GLX_VERSION_1_0 (line 265) | static void load_GLX_VERSION_1_0( GLADuserptrloadfunc load, void* userpt...
  function load_GLX_VERSION_1_1 (line 285) | static void load_GLX_VERSION_1_1( GLADuserptrloadfunc load, void* userpt...
  function load_GLX_VERSION_1_2 (line 291) | static void load_GLX_VERSION_1_2( GLADuserptrloadfunc load, void* userpt...
  function load_GLX_VERSION_1_3 (line 295) | static void load_GLX_VERSION_1_3( GLADuserptrloadfunc load, void* userpt...
  function load_GLX_VERSION_1_4 (line 315) | static void load_GLX_VERSION_1_4( GLADuserptrloadfunc load, void* userpt...
  function load_GLX_MESA_copy_sub_buffer (line 319) | static void load_GLX_MESA_copy_sub_buffer( GLADuserptrloadfunc load, voi...
  function load_GLX_SGIX_pbuffer (line 323) | static void load_GLX_SGIX_pbuffer( GLADuserptrloadfunc load, void* userp...
  function load_GLX_SGI_make_current_read (line 331) | static void load_GLX_SGI_make_current_read( GLADuserptrloadfunc load, vo...
  function load_GLX_OML_sync_control (line 336) | static void load_GLX_OML_sync_control( GLADuserptrloadfunc load, void* u...
  function load_GLX_SGIX_hyperpipe (line 344) | static void load_GLX_SGIX_hyperpipe( GLADuserptrloadfunc load, void* use...
  function load_GLX_EXT_swap_control (line 355) | static void load_GLX_EXT_swap_control( GLADuserptrloadfunc load, void* u...
  function load_GLX_MESA_pixmap_colormap (line 359) | static void load_GLX_MESA_pixmap_colormap( GLADuserptrloadfunc load, voi...
  function load_GLX_NV_video_capture (line 363) | static void load_GLX_NV_video_capture( GLADuserptrloadfunc load, void* u...
  function load_GLX_NV_swap_group (line 371) | static void load_GLX_NV_swap_group( GLADuserptrloadfunc load, void* user...
  function load_GLX_EXT_texture_from_pixmap (line 380) | static void load_GLX_EXT_texture_from_pixmap( GLADuserptrloadfunc load, ...
  function load_GLX_SUN_get_transparent_index (line 385) | static void load_GLX_SUN_get_transparent_index( GLADuserptrloadfunc load...
  function load_GLX_MESA_release_buffers (line 389) | static void load_GLX_MESA_release_buffers( GLADuserptrloadfunc load, voi...
  function load_GLX_NV_delay_before_swap (line 393) | static void load_GLX_NV_delay_before_swap( GLADuserptrloadfunc load, voi...
  function load_GLX_MESA_agp_offset (line 397) | static void load_GLX_MESA_agp_offset( GLADuserptrloadfunc load, void* us...
  function load_GLX_SGI_swap_control (line 401) | static void load_GLX_SGI_swap_control( GLADuserptrloadfunc load, void* u...
  function load_GLX_EXT_import_context (line 405) | static void load_GLX_EXT_import_context( GLADuserptrloadfunc load, void*...
  function load_GLX_SGI_video_sync (line 413) | static void load_GLX_SGI_video_sync( GLADuserptrloadfunc load, void* use...
  function load_GLX_SGI_cushion (line 418) | static void load_GLX_SGI_cushion( GLADuserptrloadfunc load, void* userpt...
  function load_GLX_SGIX_fbconfig (line 422) | static void load_GLX_SGIX_fbconfig( GLADuserptrloadfunc load, void* user...
  function load_GLX_NV_copy_buffer (line 431) | static void load_GLX_NV_copy_buffer( GLADuserptrloadfunc load, void* use...
  function load_GLX_ARB_create_context (line 436) | static void load_GLX_ARB_create_context( GLADuserptrloadfunc load, void*...
  function load_GLX_AMD_gpu_association (line 440) | static void load_GLX_AMD_gpu_association( GLADuserptrloadfunc load, void...
  function load_GLX_MESA_query_renderer (line 452) | static void load_GLX_MESA_query_renderer( GLADuserptrloadfunc load, void...
  function load_GLX_MESA_swap_control (line 459) | static void load_GLX_MESA_swap_control( GLADuserptrloadfunc load, void* ...
  function load_GLX_SGIX_video_resize (line 464) | static void load_GLX_SGIX_video_resize( GLADuserptrloadfunc load, void* ...
  function load_GLX_NV_video_out (line 472) | static void load_GLX_NV_video_out( GLADuserptrloadfunc load, void* userp...
  function load_GLX_MESA_set_3dfx_mode (line 481) | static void load_GLX_MESA_set_3dfx_mode( GLADuserptrloadfunc load, void*...
  function load_GLX_ARB_get_proc_address (line 485) | static void load_GLX_ARB_get_proc_address( GLADuserptrloadfunc load, voi...
  function load_GLX_NV_copy_image (line 489) | static void load_GLX_NV_copy_image( GLADuserptrloadfunc load, void* user...
  function load_GLX_NV_present_video (line 493) | static void load_GLX_NV_present_video( GLADuserptrloadfunc load, void* u...
  function load_GLX_SGIX_swap_barrier (line 498) | static void load_GLX_SGIX_swap_barrier( GLADuserptrloadfunc load, void* ...
  function load_GLX_SGIX_swap_group (line 503) | static void load_GLX_SGIX_swap_group( GLADuserptrloadfunc load, void* us...
  function has_ext (line 510) | static int has_ext(Display *display, int screen, const char *ext) {
  function GLADapiproc (line 540) | static GLADapiproc glad_glx_get_proc_from_userptr(const char* name, void...
  function find_extensionsGLX (line 544) | static int find_extensionsGLX(Display *display, int screen) {
  function find_coreGLX (line 611) | static int find_coreGLX(Display **display, int *screen) {
  function gladLoadGLXUserPtr (line 631) | int gladLoadGLXUserPtr(Display *display, int screen, GLADuserptrloadfunc...
  function gladLoadGLX (line 680) | int gladLoadGLX(Display *display, int screen, GLADloadfunc load) {
  function glad_close_dlopen_handle (line 730) | static void glad_close_dlopen_handle(void* handle) {
  function GLADapiproc (line 740) | static GLADapiproc glad_dlsym_handle(void* handle, const char *name) {
  function GLADapiproc (line 756) | static GLADapiproc glad_glx_get_proc(const char *name, void *userptr) {
  function gladLoaderLoadGLX (line 762) | int gladLoaderLoadGLX(Display *display, int screen) {
  function gladLoaderUnloadGLX (line 795) | void gladLoaderUnloadGLX() {

FILE: ycb_render/glad/linmath.h
  function vec3_mul_cross (line 52) | static inline void vec3_mul_cross(vec3 r, vec3 const a, vec3 const b)
  function vec3_reflect (line 59) | static inline void vec3_reflect(vec3 r, vec3 const v, vec3 const n)
  function vec4_mul_cross (line 67) | static inline void vec4_mul_cross(vec4 r, vec4 a, vec4 b)
  function vec4_reflect (line 75) | static inline void vec4_reflect(vec4 r, vec4 v, vec4 n)
  type vec4 (line 83) | typedef vec4 mat4x4[4];
  function mat4x4_identity (line 84) | static inline void mat4x4_identity(mat4x4 M)
  function mat4x4_dup (line 91) | static inline void mat4x4_dup(mat4x4 M, mat4x4 N)
  function mat4x4_row (line 98) | static inline void mat4x4_row(vec4 r, mat4x4 M, int i)
  function mat4x4_col (line 104) | static inline void mat4x4_col(vec4 r, mat4x4 M, int i)
  function mat4x4_transpose (line 110) | static inline void mat4x4_transpose(mat4x4 M, mat4x4 N)
  function mat4x4_add (line 117) | static inline void mat4x4_add(mat4x4 M, mat4x4 a, mat4x4 b)
  function mat4x4_sub (line 123) | static inline void mat4x4_sub(mat4x4 M, mat4x4 a, mat4x4 b)
  function mat4x4_scale (line 129) | static inline void mat4x4_scale(mat4x4 M, mat4x4 a, float k)
  function mat4x4_scale_aniso (line 135) | static inline void mat4x4_scale_aniso(mat4x4 M, mat4x4 a, float x, float...
  function mat4x4_mul (line 145) | static inline void mat4x4_mul(mat4x4 M, mat4x4 a, mat4x4 b)
  function mat4x4_mul_vec4 (line 156) | static inline void mat4x4_mul_vec4(vec4 r, mat4x4 M, vec4 v)
  function mat4x4_translate (line 165) | static inline void mat4x4_translate(mat4x4 T, float x, float y, float z)
  function mat4x4_translate_in_place (line 172) | static inline void mat4x4_translate_in_place(mat4x4 M, float x, float y,...
  function mat4x4_from_vec3_mul_outer (line 182) | static inline void mat4x4_from_vec3_mul_outer(mat4x4 M, vec3 a, vec3 b)
  function mat4x4_rotate (line 188) | static inline void mat4x4_rotate(mat4x4 R, mat4x4 M, float x, float y, f...
  function mat4x4_rotate_X (line 223) | static inline void mat4x4_rotate_X(mat4x4 Q, mat4x4 M, float angle)
  function mat4x4_rotate_Y (line 235) | static inline void mat4x4_rotate_Y(mat4x4 Q, mat4x4 M, float angle)
  function mat4x4_rotate_Z (line 247) | static inline void mat4x4_rotate_Z(mat4x4 Q, mat4x4 M, float angle)
  function mat4x4_invert (line 259) | static inline void mat4x4_invert(mat4x4 T, mat4x4 M)
  function mat4x4_orthonormalize (line 301) | static inline void mat4x4_orthonormalize(mat4x4 R, mat4x4 M)
  function mat4x4_frustum (line 325) | static inline void mat4x4_frustum(mat4x4 M, float l, float r, float b, f...
  function mat4x4_ortho (line 341) | static inline void mat4x4_ortho(mat4x4 M, float l, float r, float b, flo...
  function mat4x4_perspective (line 357) | static inline void mat4x4_perspective(mat4x4 m, float y_fov, float aspec...
  function mat4x4_look_at (line 383) | static inline void mat4x4_look_at(mat4x4 m, vec3 eye, vec3 center, vec3 up)
  function quat_identity (line 427) | static inline void quat_identity(quat q)
  function quat_add (line 432) | static inline void quat_add(quat r, quat a, quat b)
  function quat_sub (line 438) | static inline void quat_sub(quat r, quat a, quat b)
  function quat_mul (line 444) | static inline void quat_mul(quat r, quat p, quat q)
  function quat_scale (line 454) | static inline void quat_scale(quat r, quat v, float s)
  function quat_inner_product (line 460) | static inline float quat_inner_product(quat a, quat b)
  function quat_conj (line 468) | static inline void quat_conj(quat r, quat q)
  function quat_rotate (line 475) | static inline void quat_rotate(quat r, float angle, vec3 axis) {
  function quat_mul_vec3 (line 484) | static inline void quat_mul_vec3(vec3 r, quat q, vec3 v)
  function mat4x4_from_quat (line 503) | static inline void mat4x4_from_quat(mat4x4 M, quat q)
  function mat4x4o_mul_quat (line 533) | static inline void mat4x4o_mul_quat(mat4x4 R, mat4x4 M, quat q)
  function quat_from_mat4x4 (line 544) | static inline void quat_from_mat4x4(quat q, mat4x4 M)
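
Note: linmath.h stores quaternions as (x, y, z, w). quat_mul is the Hamilton product in that layout, and quat_mul_vec3 rotates a vector via v' = v + 2w(u x v) + 2u x (u x v). numpy equivalents:

    # numpy equivalents of linmath.h's quat_mul / quat_mul_vec3,
    # using its (x, y, z, w) component order.
    import numpy as np

    def quat_mul(p, q):
        pv, pw = p[:3], p[3]
        qv, qw = q[:3], q[3]
        r = np.empty(4)
        r[:3] = np.cross(pv, qv) + pw * qv + qw * pv
        r[3] = pw * qw - pv @ qv
        return r

    def quat_mul_vec3(q, v):
        """Rotate v by unit quaternion q: v + 2w(u x v) + 2u x (u x v)."""
        u, w = q[:3], q[3]
        t = 2.0 * np.cross(u, v)
        return v + w * t + np.cross(u, t)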

FILE: ycb_render/glutils/_trackball.py
  function _v_add (line 80) | def _v_add(v1, v2):
  function _v_sub (line 82) | def _v_sub(v1, v2):
  function _v_mul (line 84) | def _v_mul(v, s):
  function _v_dot (line 86) | def _v_dot(v1, v2):
  function _v_cross (line 88) | def _v_cross(v1, v2):
  function _v_length (line 92) | def _v_length(v):
  function _v_normalize (line 94) | def _v_normalize(v):
  function _q_add (line 100) | def _q_add(q1,q2):
  function _q_mul (line 108) | def _q_mul(q, s):
  function _q_dot (line 110) | def _q_dot(q1, q2):
  function _q_length (line 112) | def _q_length(q):
  function _q_normalize (line 114) | def _q_normalize(q):
  function _q_from_axis_angle (line 117) | def _q_from_axis_angle(v, phi):
  function _q_rotmatrix (line 121) | def _q_rotmatrix(q):
  class Trackball (line 139) | class Trackball(object):
    method __init__ (line 142) | def __init__(self, theta=0, phi=0):
    method drag_to (line 154) | def drag_to (self, x, y, dx, dy):
    method model (line 166) | def model(self):
    method theta (line 171) | def theta(self):
    method theta (line 177) | def theta(self, theta):
    method phi (line 182) | def phi(self):
    method phi (line 188) | def phi(self, phi):
    method _get_orientation (line 193) | def _get_orientation(self):
    method _set_orientation (line 201) | def _set_orientation(self, theta, phi):
    method _project (line 216) | def _project(self, r, x, y):
    method _rotate (line 230) | def _rotate(self, x, y, dx, dy):
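
Note: _project maps a 2D mouse position onto the classic Shoemake trackball surface, a sphere near the center that switches to a hyperbolic sheet beyond r/sqrt(2) so drags stay continuous at the rim. The standard projection:

    # Classic trackball projection (sphere inside r/sqrt(2), hyperbola outside).
    import math

    def project(r, x, y):
        d = math.sqrt(x * x + y * y)
        if d < r / math.sqrt(2.0):
            return math.sqrt(r * r - d * d)   # inside: on the sphere
        return (r * r / 2.0) / d              # outside: hyperbola z = r^2 / 2d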

FILE: ycb_render/glutils/glcontext.py
  function _find_library_new (line 64) | def _find_library_new(name):
  class Context (line 82) | class Context:
    method __init__ (line 83) | def __init__(self):
    method create_opengl_context (line 86) | def create_opengl_context(self, surface_size=(640, 480)):
    method destroy (line 131) | def destroy(self):

FILE: ycb_render/glutils/glrenderer.py
  class GLObject (line 13) | class GLObject(object):
    method __del__ (line 14) | def __del__(self):
    method __enter__ (line 17) | def __enter__(self):
    method __exit__ (line 21) | def __exit__(self, *args):
  class FBO (line 26) | class FBO(GLObject):
    method __init__ (line 29) | def __init__(self):
    method release (line 32) | def release(self):
  class Texture (line 36) | class Texture(GLObject):
    method __init__ (line 39) | def __init__(self):
    method release (line 42) | def release(self):
  class Shader (line 46) | class Shader(GLObject):
    method __init__ (line 48) | def __init__(self, vp_code, fp_code):
    method release (line 58) | def release(self):
    method __getitem__ (line 61) | def __getitem__(self, uniform_name):
    method __enter__ (line 66) | def __enter__(self):
    method __exit__ (line 69) | def __exit__(self, *args):
  class MeshRenderer (line 73) | class MeshRenderer(object):
    method __init__ (line 74) | def __init__(self, size):
    method _bind_attrib (line 123) | def _bind_attrib(self, i, arr):
    method proj_matrix (line 134) | def proj_matrix(self):
    method render_mesh (line 137) | def render_mesh(self, position, uv, face=None,
    method loadTexture (line 169) | def loadTexture(self, path):

FILE: ycb_render/glutils/meshutil.py
  function frustum (line 9) | def frustum(left, right, bottom, top, znear, zfar):
  function perspective (line 26) | def perspective(fovy, aspect, znear, zfar):
  function anorm (line 34) | def anorm(x, axis=None, keepdims=False):
  function normalize (line 39) | def normalize(v, axis=None, eps=1e-10):
  function lookat (line 44) | def lookat(eye, target=[0, 0, 0], up=[0, 1, 0]):
  function sample_view (line 57) | def sample_view(min_dist, max_dist=None):
  function homotrans (line 71) | def homotrans(M, p):
  function _parse_vertex_tuple (line 79) | def _parse_vertex_tuple(s):
  function _unify_rows (line 88) | def _unify_rows(a):
  function load_obj (line 100) | def load_obj(fn):
  function normalize_mesh (line 161) | def normalize_mesh(mesh):
  function quat2rotmat (line 170) | def quat2rotmat(quat):
  function mat2rotmat (line 175) | def mat2rotmat(mat):
  function quat2rotmat (line 180) | def quat2rotmat(quat):
  function xyz2mat (line 185) | def xyz2mat(xyz):
  function mat2xyz (line 190) | def mat2xyz(mat):
  function safemat2quat (line 195) | def safemat2quat(mat):
  function unpack_pose (line 204) | def unpack_pose(pose):
  function pack_pose (line 210) | def pack_pose(pose):
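
Note: frustum and perspective build the usual OpenGL projection matrix; perspective derives the near-plane extents from the vertical field of view and delegates to frustum. A numpy sketch of those conventions:

    # Standard OpenGL-style projection matrices matching the signatures above.
    import numpy as np

    def frustum(left, right, bottom, top, znear, zfar):
        M = np.zeros((4, 4))
        M[0, 0] = 2.0 * znear / (right - left)
        M[1, 1] = 2.0 * znear / (top - bottom)
        M[0, 2] = (right + left) / (right - left)
        M[1, 2] = (top + bottom) / (top - bottom)
        M[2, 2] = -(zfar + znear) / (zfar - znear)
        M[2, 3] = -2.0 * zfar * znear / (zfar - znear)
        M[3, 2] = -1.0
        return M

    def perspective(fovy, aspect, znear, zfar):
        h = znear * np.tan(np.radians(fovy) / 2.0)   # half-height at znear
        w = h * aspect
        return frustum(-w, w, -h, h, znear, zfar)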

FILE: ycb_render/glutils/trackball.py
  class Trackball (line 12) | class Trackball():
    method __init__ (line 73) | def __init__(self, width, height, cam_pos=[0,0,2.0]):
    method distance (line 101) | def distance(self):
    method theta (line 116) | def theta(self):
    method theta (line 122) | def theta(self, theta):
    method phi (line 130) | def phi(self):
    method phi (line 136) | def phi(self, phi):
    method zoom (line 144) | def zoom(self):
    method zoom (line 151) | def zoom(self, value):
    method aspect (line 160) | def aspect(self):
    method aspect (line 167) | def aspect(self, value):
    method on_attach (line 175) | def on_attach(self, program):
    method on_resize (line 181) | def on_resize(self, width, height):
    method on_mouse_drag (line 191) | def on_mouse_drag(self, x, y, dx, dy, button=None):
    method on_mouse_scroll (line 202) | def on_mouse_scroll(self, x, y, dx, dy):
    method reinit (line 210) | def reinit(self, cam_pos):

FILE: ycb_render/setup.py
  class CMakeExtension (line 16) | class CMakeExtension(Extension):
    method __init__ (line 17) | def __init__(self, name, sourcedir=''):
  class CMakeBuild (line 22) | class CMakeBuild(build_ext):
    method run (line 23) | def run(self):
    method build_extension (line 39) | def build_extension(self, ext):

FILE: ycb_render/ycb_renderer.py
  function loadTexture (line 37) | def loadTexture(path):
  class YCBRenderer (line 63) | class YCBRenderer:
    method __init__ (line 64) | def __init__(self, width=512, height=512, gpu_id=0, render_marker=Fals...
    method generate_grid (line 204) | def generate_grid(self):
    method load_object (line 231) | def load_object(self, obj_path, texture_path, scale=1.0):
    method load_offset (line 323) | def load_offset(self):
    method load_mesh (line 339) | def load_mesh(self, path, scale=1.0):
    method recursive_load (line 348) | def recursive_load(self, node, vertices, faces, materials,
    method load_objects (line 391) | def load_objects(self, obj_paths, texture_paths, colors=[[0.9, 0, 0], ...
    method set_camera (line 403) | def set_camera(self, camera, target, up):
    method set_camera_default (line 413) | def set_camera_default(self):
    method set_fov (line 416) | def set_fov(self, fov):
    method set_projection_matrix (line 424) | def set_projection_matrix(self, w, h, fu, fv, u0, v0, znear, zfar):
    method set_light_color (line 440) | def set_light_color(self, color):
    method render (line 443) | def render(self, cls_indexes, image_tensor, seg_tensor, normal_tensor=...
    method set_light_pos (line 554) | def set_light_pos(self, light):
    method get_num_objects (line 557) | def get_num_objects(self):
    method set_poses (line 560) | def set_poses(self, poses):
    method set_allocentric_poses (line 566) | def set_allocentric_poses(self, poses):
    method release (line 580) | def release(self):
    method clean (line 585) | def clean(self):
    method transform_vector (line 607) | def transform_vector(self, vec):
    method transform_point (line 617) | def transform_point(self, vec):
    method transform_pose (line 628) | def transform_pose(self, pose):
    method get_num_instances (line 634) | def get_num_instances(self):
    method get_poses (line 637) | def get_poses(self):
    method get_egocentric_poses (line 644) | def get_egocentric_poses(self):
    method get_allocentric_poses (line 647) | def get_allocentric_poses(self):
    method get_centers (line 659) | def get_centers(self):
    method vis (line 674) | def vis(self, poses, cls_indexes, color_idx=None, color_list=None, cam...
  function parse_args (line 815) | def parse_args():
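
Note: the YCBRenderer methods above suggest the intended call sequence: construct, load meshes, set camera, light and poses, then render into preallocated CUDA tensors. A hypothetical usage sketch pieced together from the listed signatures only; the import path, asset paths and the 7-dim pose layout (translation followed by quaternion) are assumptions:

    # Hypothetical YCBRenderer usage (paths and pose layout are assumptions).
    import torch
    from ycb_render.ycb_renderer import YCBRenderer   # import path assumed

    w, h = 640, 480
    renderer = YCBRenderer(width=w, height=h, gpu_id=0, render_marker=False)
    renderer.load_objects(['models/003_cracker_box/textured.obj'],       # hypothetical
                          ['models/003_cracker_box/texture_map.png'])    # hypothetical
    renderer.set_camera_default()
    renderer.set_light_pos([0, 0, 0])
    renderer.set_poses([[0, 0, 0.8, 1, 0, 0, 0]])     # (x, y, z, qw, qx, qy, qz) assumed

    image_tensor = torch.cuda.FloatTensor(h, w, 4)    # RGBA render target
    seg_tensor = torch.cuda.FloatTensor(h, w, 4)      # segmentation target
    renderer.render([0], image_tensor, seg_tensor)    # cls_indexes = [0]
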
Condensed preview — 121 files, each showing path, character count, and a content snippet.
[
  {
    "path": ".gitignore",
    "chars": 1849,
    "preview": "*.mex*\n*.pyc\n*.tgz\n*.so\n*.o\noutput*\nlib/synthesize/build/*\nlib/utils/bbox.c\ndata/\ndata_self/\ndocker/\nngc/\n\n.idea/\n# Byte"
  },
  {
    "path": ".gitmodules",
    "chars": 108,
    "preview": "[submodule \"ycb_render/pybind11\"]\n\tpath = ycb_render/pybind11\n\turl = https://github.com/pybind/pybind11.git\n"
  },
  {
    "path": "LICENSE.md",
    "chars": 4466,
    "preview": "# NVIDIA Source Code License for PoseCNN-PyTorch: A PyTorch Implementation of the PoseCNN Framework for 6D Object Pose E"
  },
  {
    "path": "README.md",
    "chars": 6568,
    "preview": "# PoseCNN-PyTorch: A PyTorch Implementation of the PoseCNN Framework for 6D Object Pose Estimation\n\n### Introduction\n\nWe"
  },
  {
    "path": "build.sh",
    "chars": 146,
    "preview": "cd lib/layers/;\npython3 setup.py build develop;\ncd ../utils;\npython3 setup.py build_ext --inplace;\ncd ../../ycb_render;\n"
  },
  {
    "path": "experiments/cfgs/dex_ycb.yml",
    "chars": 1627,
    "preview": "EXP_DIR: dex_ycb\nINPUT: COLOR\nTRAIN:\n  TRAINABLE: True\n  WEIGHT_DECAY: 0.0001\n  LEARNING_RATE: 0.001\n  MILESTONES: !!pyt"
  },
  {
    "path": "experiments/cfgs/ycb_object.yml",
    "chars": 1744,
    "preview": "EXP_DIR: ycb_object\nINPUT: COLOR\nTRAIN:\n  TRAINABLE: True\n  WEIGHT_DECAY: 0.0001\n  LEARNING_RATE: 0.001\n  MILESTONES: !!"
  },
  {
    "path": "experiments/cfgs/ycb_object_detection.yml",
    "chars": 1698,
    "preview": "EXP_DIR: ycb_object\nINPUT: COLOR\nTRAIN:\n  TRAINABLE: True\n  WEIGHT_DECAY: 0.0001\n  LEARNING_RATE: 0.001\n  MILESTONES: !!"
  },
  {
    "path": "experiments/cfgs/ycb_object_self_supervision.yml",
    "chars": 1792,
    "preview": "EXP_DIR: ycb_self_supervision\nINPUT: COLOR\nTRAIN:\n  TRAINABLE: True\n  WEIGHT_DECAY: 0.0001\n  LEARNING_RATE: 0.0001\n  MIL"
  },
  {
    "path": "experiments/cfgs/ycb_video.yml",
    "chars": 1646,
    "preview": "EXP_DIR: ycb_video\nINPUT: COLOR\nTRAIN:\n  TRAINABLE: True\n  WEIGHT_DECAY: 0.0001\n  LEARNING_RATE: 0.001\n  MILESTONES: !!p"
  },
  {
    "path": "experiments/scripts/demo.sh",
    "chars": 400,
    "preview": "#!/bin/bash\n\t\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\nexport CUDA_VISIBLE_DEVICES=0\n\ntime ./tools/test_images.py -"
  },
  {
    "path": "experiments/scripts/dex_ycb_test_s0.sh",
    "chars": 271,
    "preview": "#!/bin/bash\n\nset -x\nset -e\nexport CUDA_VISIBLE_DEVICES=$1\n\ntime ./tools/test_net.py --gpu $1 \\\n  --network posecnn \\\n  -"
  },
  {
    "path": "experiments/scripts/dex_ycb_test_s1.sh",
    "chars": 271,
    "preview": "#!/bin/bash\n\nset -x\nset -e\nexport CUDA_VISIBLE_DEVICES=$1\n\ntime ./tools/test_net.py --gpu $1 \\\n  --network posecnn \\\n  -"
  },
  {
    "path": "experiments/scripts/dex_ycb_test_s2.sh",
    "chars": 271,
    "preview": "#!/bin/bash\n\nset -x\nset -e\nexport CUDA_VISIBLE_DEVICES=$1\n\ntime ./tools/test_net.py --gpu $1 \\\n  --network posecnn \\\n  -"
  },
  {
    "path": "experiments/scripts/dex_ycb_test_s3.sh",
    "chars": 271,
    "preview": "#!/bin/bash\n\nset -x\nset -e\nexport CUDA_VISIBLE_DEVICES=$1\n\ntime ./tools/test_net.py --gpu $1 \\\n  --network posecnn \\\n  -"
  },
  {
    "path": "experiments/scripts/dex_ycb_train_s0.sh",
    "chars": 232,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\ntime ./tools/train_net.py \\\n  --network posecnn \\\n  --pretrained data/checkpoints/vgg16-3979"
  },
  {
    "path": "experiments/scripts/dex_ycb_train_s1.sh",
    "chars": 232,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\ntime ./tools/train_net.py \\\n  --network posecnn \\\n  --pretrained data/checkpoints/vgg16-3979"
  },
  {
    "path": "experiments/scripts/dex_ycb_train_s2.sh",
    "chars": 232,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\ntime ./tools/train_net.py \\\n  --network posecnn \\\n  --pretrained data/checkpoints/vgg16-3979"
  },
  {
    "path": "experiments/scripts/dex_ycb_train_s3.sh",
    "chars": 232,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\ntime ./tools/train_net.py \\\n  --network posecnn \\\n  --pretrained data/checkpoints/vgg16-3979"
  },
  {
    "path": "experiments/scripts/ros_ycb_object_test.sh",
    "chars": 340,
    "preview": "#!/bin/bash\n\t\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\nexport CUDA_VISIBLE_DEVICES=$1\n\ntime ./ros/test_images.py --"
  },
  {
    "path": "experiments/scripts/ros_ycb_object_test_detection.sh",
    "chars": 360,
    "preview": "#!/bin/bash\n\t\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\nexport CUDA_VISIBLE_DEVICES=$1\n\ntime ./ros/test_images.py --"
  },
  {
    "path": "experiments/scripts/ycb_object_test.sh",
    "chars": 312,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\nexport CUDA_VISIBLE_DEVICES=$1\n\ntime ./tools/test_net.py --gp"
  },
  {
    "path": "experiments/scripts/ycb_object_train.sh",
    "chars": 230,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\n./tools/train_net.py \\\n  --network posecnn \\\n  --pretrained data/checkpoints/vgg16-397923af."
  },
  {
    "path": "experiments/scripts/ycb_object_train_detection.sh",
    "chars": 240,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\n./tools/train_net.py \\\n  --network posecnn \\\n  --pretrained data/checkpoints/vgg16-397923af."
  },
  {
    "path": "experiments/scripts/ycb_object_train_self_supervision.sh",
    "chars": 294,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\n./tools/train_net.py \\\n  --network posecnn \\\n  --pretrained output/ycb_object/ycb_object_tra"
  },
  {
    "path": "experiments/scripts/ycb_video_test.sh",
    "chars": 311,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\nexport PYTHONUNBUFFERED=\"True\"\nexport CUDA_VISIBLE_DEVICES=$1\n\ntime ./tools/test_net.py --gp"
  },
  {
    "path": "experiments/scripts/ycb_video_train.sh",
    "chars": 233,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\ntime ./tools/train_net.py \\\n  --network posecnn \\\n  --pretrained data/checkpoints/vgg16-3979"
  },
  {
    "path": "lib/datasets/__init__.py",
    "chars": 478,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/datasets/background.py",
    "chars": 8209,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/datasets/dex_ycb.py",
    "chars": 40279,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/datasets/factory.py",
    "chars": 2103,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/datasets/imdb.py",
    "chars": 2735,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/datasets/ycb_object.py",
    "chars": 27341,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/datasets/ycb_self_supervision.py",
    "chars": 62426,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/datasets/ycb_video.py",
    "chars": 36882,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/fcn/__init__.py",
    "chars": 180,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/fcn/config.py",
    "chars": 10489,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/fcn/render_utils.py",
    "chars": 4124,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/fcn/test_common.py",
    "chars": 9359,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/fcn/test_dataset.py",
    "chars": 15060,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/fcn/test_imageset.py",
    "chars": 9679,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/fcn/train.py",
    "chars": 14679,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/ROIAlign_cuda.cu",
    "chars": 13076,
    "preview": "// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDA"
  },
  {
    "path": "lib/layers/__init__.py",
    "chars": 180,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/backproject_kernel.cu",
    "chars": 1746,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "lib/layers/hard_label.py",
    "chars": 1111,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/hard_label_kernel.cu",
    "chars": 2174,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "lib/layers/hough_voting.py",
    "chars": 1675,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/hough_voting_kernel.cu",
    "chars": 21258,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "lib/layers/point_matching_loss.py",
    "chars": 1166,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/point_matching_loss_kernel.cu",
    "chars": 12731,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "lib/layers/pose_target_layer.py",
    "chars": 4198,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/posecnn_layers.cpp",
    "chars": 9634,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "lib/layers/roi_align.py",
    "chars": 2172,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.\nimport torch\nfrom torch import nn\nfrom torch.aut"
  },
  {
    "path": "lib/layers/roi_pooling.py",
    "chars": 1647,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/roi_pooling_kernel.cu",
    "chars": 9001,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "lib/layers/roi_target_layer.py",
    "chars": 5092,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/sdf_matching_loss.py",
    "chars": 1352,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/layers/sdf_matching_loss_kernel.cu",
    "chars": 14326,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "lib/layers/setup.py",
    "chars": 853,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/networks/PoseCNN.py",
    "chars": 11985,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/networks/__init__.py",
    "chars": 204,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/sdf/__init__.py",
    "chars": 180,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/sdf/_init_paths.py",
    "chars": 422,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/sdf/multi_sdf_optimizer.py",
    "chars": 7412,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/sdf/sdf_optimizer.py",
    "chars": 8126,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/sdf/sdf_utils.py",
    "chars": 3331,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/sdf/test_sdf_optimizer.py",
    "chars": 6378,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/utils/__init__.py",
    "chars": 180,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/utils/bbox.pyx",
    "chars": 1955,
    "preview": "# --------------------------------------------------------\n# Fast R-CNN\n# Copyright (c) 2015 Microsoft\n# Licensed under "
  },
  {
    "path": "lib/utils/bbox_transform.py",
    "chars": 2477,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/utils/blob.py",
    "chars": 6632,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/utils/nms.py",
    "chars": 1037,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/utils/pose_error.py",
    "chars": 4707,
    "preview": "# Author: Tomas Hodan (hodantom@cmp.felk.cvut.cz)\n# Center for Machine Perception, Czech Technical University in Prague\n"
  },
  {
    "path": "lib/utils/se3.py",
    "chars": 2351,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/utils/segmentation_evaluation.py",
    "chars": 2044,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "lib/utils/setup.py",
    "chars": 4272,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "requirement.txt",
    "chars": 150,
    "preview": "pyassimp == 4.1.3\nprogressbar2\npyopengl >= 3.1.0\nopencv-python == 4.2.0.34\ntransforms3d\npillow\nIPython\nmatplotlib\neasydi"
  },
  {
    "path": "ros/_init_paths.py",
    "chars": 524,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "ros/collect_images_realsense.py",
    "chars": 2715,
    "preview": "#!/usr/bin/env python\n\n# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the N"
  },
  {
    "path": "ros/posecnn.rviz",
    "chars": 12890,
    "preview": "Panels:\n  - Class: rviz/Displays\n    Help Height: 0\n    Name: Displays\n    Property Tree Widget:\n      Expanded:\n       "
  },
  {
    "path": "ros/test_images.py",
    "chars": 12752,
    "preview": "#!/usr/bin/env python\n\n# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the N"
  },
  {
    "path": "tools/_init_paths.py",
    "chars": 524,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "tools/test_images.py",
    "chars": 8810,
    "preview": "#!/usr/bin/env python3\n\n# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the "
  },
  {
    "path": "tools/test_net.py",
    "chars": 6133,
    "preview": "#!/usr/bin/env python3\n\n# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the "
  },
  {
    "path": "tools/train_net.py",
    "chars": 6867,
    "preview": "#!/usr/bin/env python3\n\n# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the "
  },
  {
    "path": "ycb_render/CMakeLists.txt",
    "chars": 747,
    "preview": "cmake_minimum_required(VERSION 2.8.12)\nproject(CppYCBRenderer)\n\nfind_package(CUDA REQUIRED)\nset(CUDA_LIBRARIES PUBLIC ${"
  },
  {
    "path": "ycb_render/__init__.py",
    "chars": 180,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "ycb_render/cpp/query_devices.cpp",
    "chars": 2216,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "ycb_render/cpp/test_device.cpp",
    "chars": 5656,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "ycb_render/cpp/ycb_renderer.cpp",
    "chars": 10687,
    "preview": "// Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n// This work is licensed under the NVIDIA Source Code Lic"
  },
  {
    "path": "ycb_render/get_available_devices.py",
    "chars": 804,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "ycb_render/glad/EGL/eglplatform.h",
    "chars": 5483,
    "preview": "#ifndef __eglplatform_h_\n#define __eglplatform_h_\n\n/*\n** Copyright (c) 2007-2016 The Khronos Group Inc.\n**\n** Permission"
  },
  {
    "path": "ycb_render/glad/KHR/khrplatform.h",
    "chars": 10114,
    "preview": "#ifndef __khrplatform_h_\n#define __khrplatform_h_\n\n/*\n** Copyright (c) 2008-2009 The Khronos Group Inc.\n**\n** Permission"
  },
  {
    "path": "ycb_render/glad/egl.c",
    "chars": 16930,
    "preview": "#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n#include <glad/egl.h>\n\n#ifndef GLAD_IMPL_UTIL_C_\n#define GLAD"
  },
  {
    "path": "ycb_render/glad/gl.c",
    "chars": 154883,
    "preview": "#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n#include <glad/gl.h>\n\n#ifndef GLAD_IMPL_UTIL_C_\n#define GLAD_"
  },
  {
    "path": "ycb_render/glad/glad/egl.h",
    "chars": 24893,
    "preview": "/**\n * Loader generated by glad 0.1.11a0 on Wed Jun 13 07:59:54 2018\n *\n * Generator: C/C++\n * Specification: egl\n * Ext"
  },
  {
    "path": "ycb_render/glad/glad/gl.h",
    "chars": 306342,
    "preview": "/**\n * Loader generated by glad 0.1.11a0 on Wed Jun 13 07:59:51 2018\n *\n * Generator: C/C++\n * Specification: gl\n * Exte"
  },
  {
    "path": "ycb_render/glad/glad/glx.h",
    "chars": 60923,
    "preview": "/**\n * Loader generated by glad 0.1.11a0 on Wed Jun 13 07:59:53 2018\n *\n * Generator: C/C++\n * Specification: glx\n * Ext"
  },
  {
    "path": "ycb_render/glad/glx_dyn.c",
    "chars": 42255,
    "preview": "#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n#include <glad/glx.h>\n#ifdef DYNAMIC_LOAD_X11_FUNCTIONS\n#incl"
  },
  {
    "path": "ycb_render/glad/linmath.h",
    "chars": 12708,
    "preview": "#ifndef LINMATH_H\n#define LINMATH_H\n\n#include <math.h>\n\n#ifdef _MSC_VER\n#define inline __inline\n#endif\n\n#define LINMATH_"
  },
  {
    "path": "ycb_render/glutils/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "ycb_render/glutils/_trackball.py",
    "chars": 9258,
    "preview": "# -*- coding: utf-8 -*-\n#\n# Copyright (c)  2014 Nicolas Rougier\n#                2008 Roger Allen\n#                1993,"
  },
  {
    "path": "ycb_render/glutils/glcontext.py",
    "chars": 4339,
    "preview": "\"\"\"Headless GPU-accelerated OpenGL context creation on Google Colaboratory.\n\nTypical usage:\n\n    # Optional PyOpenGL con"
  },
  {
    "path": "ycb_render/glutils/glrenderer.py",
    "chars": 6838,
    "preview": "\"\"\"OpenGL Mesh rendering utils.\"\"\"\n\nfrom contextlib import contextmanager\nimport numpy as np\n\nimport OpenGL.GL as gl\n\nfr"
  },
  {
    "path": "ycb_render/glutils/meshutil.py",
    "chars": 6155,
    "preview": "\"\"\"3D mesh manipulation utilities.\"\"\"\n\nfrom builtins import str\nfrom collections import OrderedDict\nimport numpy as np\nf"
  },
  {
    "path": "ycb_render/glutils/trackball.py",
    "chars": 6838,
    "preview": "# -----------------------------------------------------------------------------\n# Copyright (c) 2009-2016 Nicolas P. Rou"
  },
  {
    "path": "ycb_render/glutils/utils.py",
    "chars": 39,
    "preview": "colormap = [[1,0,0], [0,1,0], [0,0,1]]\n"
  },
  {
    "path": "ycb_render/setup.py",
    "chars": 2960,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "ycb_render/shaders/frag.shader",
    "chars": 968,
    "preview": "#version 460\nuniform sampler2D texUnit;\nin vec2 theCoords;\nin vec3 Normal;\nin vec3 Normal_cam;\nin vec3 FragPos;\nin vec3 "
  },
  {
    "path": "ycb_render/shaders/frag_blinnphong.shader",
    "chars": 2950,
    "preview": "#version 460\nuniform sampler2D texUnit;\nin vec2 theCoords;\nin vec3 Normal;\nin vec3 Normal_cam;\nin vec3 FragPos;\nin vec3 "
  },
  {
    "path": "ycb_render/shaders/frag_mat.shader",
    "chars": 1385,
    "preview": "#version 460\nin vec3 Normal;\nin vec3 Normal_cam;\nin vec3 FragPos;\nin vec3 Instance_color;\nin vec3 Pos_cam;\nin vec3 Pos_o"
  },
  {
    "path": "ycb_render/shaders/frag_simple.shader",
    "chars": 352,
    "preview": "#version 460\nlayout (location = 0) out vec4 outputColour;\nlayout (location = 1) out vec4 NormalColour;\nlayout (location "
  },
  {
    "path": "ycb_render/shaders/frag_textureless.shader",
    "chars": 1618,
    "preview": "#version 460\nin vec3 theColor;\nin vec3 Normal;\nin vec3 Normal_cam;\nin vec3 FragPos;\nin vec3 Instance_color;\nin vec3 Pos_"
  },
  {
    "path": "ycb_render/shaders/vert.shader",
    "chars": 985,
    "preview": "#version 460\nuniform mat4 V;\nuniform mat4 P;\nuniform mat4 pose_rot;\nuniform mat4 pose_trans;\nuniform vec3 instance_color"
  },
  {
    "path": "ycb_render/shaders/vert_blinnphong.shader",
    "chars": 1092,
    "preview": "#version 460\nuniform mat4 V;\nuniform mat4 P;\nuniform mat4 pose_rot;\nuniform mat4 pose_trans;\nuniform vec3 instance_color"
  },
  {
    "path": "ycb_render/shaders/vert_mat.shader",
    "chars": 1005,
    "preview": "#version 460\nuniform mat4 V;\nuniform mat4 P;\nuniform mat4 pose_rot;\nuniform mat4 pose_trans;\nuniform vec3 instance_color"
  },
  {
    "path": "ycb_render/shaders/vert_simple.shader",
    "chars": 219,
    "preview": "#version 460\nuniform mat4 V;\nuniform mat4 P;\n\nlayout (location=0) in vec3 position;\nlayout (location=1) in vec3 normal;\n"
  },
  {
    "path": "ycb_render/shaders/vert_textureless.shader",
    "chars": 1087,
    "preview": "#version 460\nuniform mat4 V;\nuniform mat4 P;\nuniform mat4 pose_rot;\nuniform mat4 pose_trans;\nuniform vec3 instance_color"
  },
  {
    "path": "ycb_render/visualize_sim.py",
    "chars": 3913,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  },
  {
    "path": "ycb_render/ycb_renderer.py",
    "chars": 42781,
    "preview": "# Copyright (c) 2020 NVIDIA Corporation. All rights reserved.\n# This work is licensed under the NVIDIA Source Code Licen"
  }
]

About this extraction

This page contains the full source code of the NVlabs/PoseCNN-PyTorch GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 121 files (1.2 MB), approximately 368.3k tokens, and a symbol index with 730 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.
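For reference, here is a minimal sketch of how the file manifest above could be consumed programmatically, for example to decide which files to trim before pasting the dump into a context-limited model. It assumes the JSON array has been saved to a file named manifest.json (a hypothetical name chosen for the example; the page itself ships the manifest inline):

    # Minimal sketch: load the GitExtract manifest and report the largest
    # files. Assumes the JSON array above was saved as "manifest.json"
    # (hypothetical filename, not part of the extraction itself).
    import json

    with open("manifest.json") as f:
        manifest = json.load(f)  # list of {"path", "chars", "preview"} records

    total_chars = sum(entry["chars"] for entry in manifest)
    print(f"{len(manifest)} files, {total_chars:,} chars total")

    # Ten largest files by character count; generated OpenGL loaders such
    # as ycb_render/glad/glad/gl.h dominate the size and can usually be
    # dropped for code-reading tasks.
    for entry in sorted(manifest, key=lambda e: e["chars"], reverse=True)[:10]:
        print(f"{entry['chars']:>8,}  {entry['path']}")

Since the full dump weighs in at roughly 1.2 MB for about 368.3k tokens, the average works out to a little over 3 characters per token, so the chars field is a workable proxy for per-file token cost; the generated GL loaders under ycb_render/glad/ alone account for over 600K characters.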

Extracted by GitExtract, a free GitHub-repo-to-text converter for AI, built by Nikandr Surkov.