[
  {
    "path": ".gitattributes",
    "content": "# Auto detect text files and perform LF normalization\n* text=auto\n"
  },
  {
    "path": ".gitignore",
    "content": ".DS_Store\n.idea/\n*/.DS_Store\n*.pyc\nevents.*\nlightning_logs"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2021 Kangel Zenn\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# Open3DSOT\nA general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning.\n\nThe official code release of **[BAT](https://arxiv.org/abs/2108.04728)** and **[M2 Track](https://ghostish.github.io/MM-Track/)**.\n\n\n### Features\n+ Modular design. It is easy to config the model and training/testing behaviors through just a `.yaml` file.\n+ DDP support for both training and testing.\n+ Support all common tracking datasets (KITTI, NuScenes, Waymo Open Dataset).\n### :mega:  The extension of M2-Track is accepted by TPAMI! :point_down:\n+ [An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds\n](https://arxiv.org/abs/2303.12535)\n\n+ Codes are coming soon.\n\n### :mega:  One tracking paper is accepted by CVPR2022 (Oral)! :point_down:\n+ [Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds](https://arxiv.org/abs/2203.01730)\n\n### :mega: The codes for M2-Track is now available.\n+ The checkpoints we provide here achieve **better** performances than those reported in our main paper. Check below for more details.\n+ The supplementary material is now out. Please check [this](https://openaccess.thecvf.com/content/CVPR2022/supplemental/Zheng_Beyond_3D_Siamese_CVPR_2022_supplemental.pdf) for more implementation details.\n## Trackers\nThis repository includes the implementation of the following models:\n\n### M2-Track (CVPR2022 Oral)\n**[[Paper]](http://arxiv.org/abs/2203.01730)** **[[Project Page]](https://ghostish.github.io/MM-Track/)**\n\n**M2-Track** is the first **motion-centric tracker** in LiDAR SOT, which robustly handles distractors and drastic appearance changes in complex driving scenes.  Unlike previous methods, M2-Track is a **matching-free** two-stage tracker which localizes the targets by explicitly modeling the \"relative target motion\" among frames.\n\n<p align=\"center\">\n<img src=\"figures/mmtrack.png\" width=\"800\"/>\n</p>\n\n<p align=\"center\">\n<img src=\"figures/results_mmtrack.gif\" width=\"800\"/>\n</p>\n\n### BAT (ICCV2021)\n**[[Paper]](https://arxiv.org/abs/2108.04728) [[Results]](./README.md#Reproduction)**\n\nOfficial implementation of **BAT**. BAT uses the BBox information to compensate the information loss of incomplete scans. It augments the target template with box-aware features that efficiently and effectively improve appearance matching.\n\n<p align=\"center\">\n<img src=\"figures/bat.png\" width=\"800\"/>\n</p>\n<p align=\"center\">\n<img src=\"figures/results.gif\" width=\"800\"/>\n</p>\n\n### P2B (CVPR2020)\n**[[Paper]](https://arxiv.org/abs/2005.13888) [[Official implementation]](https://github.com/HaozheQi/P2B)**\n\nThird party implementation of **P2B**. Our implementation achieves better results than the official code release. P2B adapts SiamRPN to 3D point clouds by integrating a pointwise correlation operator with a point-based RPN (VoteNet).\n\n<p align=\"center\">\n<img src=\"figures/p2b.png\" width=\"800\"/>\n</p>\n\n## Setup\nInstallation\n+ Create the environment\n  ```\n  git clone https://github.com/Ghostish/Open3DSOT.git\n  cd Open3DSOT\n  conda create -n Open3DSOT  python=3.8\n  conda activate Open3DSOT\n  ```\n+ Install pytorch\n  ```\n  conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch\n  ```\n  Our code is compatible with other PyTorch/CUDA versions. You can follow [this](https://pytorch.org/get-started/locally/) to install another version of pytorch. 
**Note: In order to reproduce the reported results with the provided checkpoints of BAT, please use CUDA 10.x.** \n\n+ Install other dependencies:\n  ```\n  pip install -r requirement.txt\n  ```\n\n\nKITTI dataset\n+ Download the data for [velodyne](http://www.cvlibs.net/download.php?file=data_tracking_velodyne.zip), [calib](http://www.cvlibs.net/download.php?file=data_tracking_calib.zip) and [label_02](http://www.cvlibs.net/download.php?file=data_tracking_label_2.zip) from [KITTI Tracking](http://www.cvlibs.net/datasets/kitti/eval_tracking.php).\n+ Unzip the downloaded files.\n+ Put the unzipped files under the same folder as following.\n  ```\n  [Parent Folder]\n  --> [calib]\n      --> {0000-0020}.txt\n  --> [label_02]\n      --> {0000-0020}.txt\n  --> [velodyne]\n      --> [0000-0020] folders with velodynes .bin files\n  ```\n\nNuScenes dataset\n+ Download the dataset from the [download page](https://www.nuscenes.org/download)\n+ Extract the downloaded files and make sure you have the following structure:\n  ```\n  [Parent Folder]\n    samples\t-\tSensor data for keyframes.\n    sweeps\t-\tSensor data for intermediate frames.\n    maps\t        -\tFolder for all map files: rasterized .png images and vectorized .json files.\n    v1.0-*\t-\tJSON tables that include all the meta data and annotations. Each split (trainval, test, mini) is provided in a separate folder.\n  ```\n>Note: We use the **train_track** split to train our model and test it with the **val** split. Both splits are officially provided by NuScenes. During testing, we ignore the sequences where there is no point in the first given bbox.\n\nWaymo dataset\n+ Download and prepare dataset by the instruction of [CenterPoint](https://github.com/tianweiy/CenterPoint/blob/master/docs/WAYMO.md).\n  ```\n  [Parent Folder]\n    tfrecord_training\t                    \n    tfrecord_validation\t                 \n    train \t                                    -\tall training frames and annotations \n    val   \t                                    -\tall validation frames and annotations \n    infos_train_01sweeps_filter_zero_gt.pkl\n    infos_val_01sweeps_filter_zero_gt.pkl\n  ```\n+ Prepare SOT dataset. Data from specific category and split will be merged (e.g., sot_infos_vehicle_train.pkl).\n```bash\n  python datasets/generate_waymo_sot.py\n```\n\n## Quick Start\n### Training\nTo train a model, you must specify the `.yaml` file with `--cfg` argument. The `.yaml` file contains all the configurations of the dataset and the model. We provide `.yaml` files under the [*cfgs*](./cfgs) directory. **Note:** Before running the code, you will need to edit the `.yaml` file by setting the `path` argument as the correct root of the dataset.\n```bash\nCUDA_VISIBLE_DEVICES=0,1 python main.py  --cfg cfgs/M2_track_kitti.yaml  --batch_size 64 --epoch 60 --preloading\n```\nFor M2-Track, we use the same configuration for all categories. By default, the `.yaml` is used to trained a Car tracker. You need to change the `category_name` in the `.yaml` file to train for another category.\n\nIn this version, we remove the `--gpus` flag. And all the available GPUs will be used by default. You can use `CUDA_VISIBLE_DEVICES` to select specific GPUs.\n\nAfter you start training, you can start Tensorboard to monitor the training process:\n```\ntensorboard --logdir=./ --port=6006\n```\nBy default, the trainer runs a full evaluation on the full test split after training every epoch. 
You can set `--check_val_every_n_epoch` to a larger number to speed up the training. The `--preloading` flag is used to preload the training samples into the memory to save traning time. Remove this flag if you don't have enough memory.\n### Testing\nTo test a trained model, specify the checkpoint location with `--checkpoint` argument and send the `--test` flag to the command.\n```bash\npython main.py  --cfg cfgs/M2_track_kitti.yaml  --checkpoint /path/to/checkpoint/xxx.ckpt --test\n```\n\n## Reproduction\n| Model | Category | Success| Precision| Checkpoint\n|--|--|--|--|--|\n| BAT-KITTI | Car\t|65.37 | 78.88|pretrained_models/bat_kitti_car.ckpt\n| BAT-NuScenes | Car\t|40.73 | 43.29|pretrained_models/bat_nuscenes_car.ckpt\n| BAT-KITTI | Pedestrian | 45.74| 74.53| pretrained_models/bat_kitti_pedestrian.ckpt\n| M2Track-KITTI | Car | **67.43**| **81.04**| pretrained_models/mmtrack_kitti_car.ckpt\n| M2Track-KITTI | Pedestrian | **60.61**| **89.39**| pretrained_models/mmtrack_kitti_pedestrian.ckpt\n| M2Track-NuScenes | Car | **57.22**| **65.72**| pretrained_models/mmtrack_nuscenes_car.ckpt\n\nTrained models are provided in the  [*pretrained_models*](./pretrained_models) directory. To reproduce the results, simply run the code with the corresponding `.yaml` file and checkpoint. For example, to reproduce the tracking results on KITTI Car of M2-Track, just run:\n```bash\npython main.py  --cfg cfgs/M2_track_kitti.yaml  --checkpoint ./pretrained_models/mmtrack_kitti_car.ckpt --test\n```\nThe reported results of M2-Track checkpoints are produced on 3090/3080ti GPUs. Due to the precision issues, there could be minor differences if you test them with other GPUs.\n\n## Acknowledgment\n+ This repo is built upon [P2B](https://github.com/HaozheQi/P2B) and [SC3D](https://github.com/SilvioGiancola/ShapeCompletion3DTracking).\n+ Thank Erik Wijmans for his pytorch implementation of [PointNet++](https://github.com/erikwijmans/Pointnet2_PyTorch)\n\n## License\nThis repository is released under MIT License (see LICENSE file for details).\n"
  },
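  {
    "path": "docs/examples/config_override_sketch.py",
    "content": "\"\"\"\nIllustrative sketch added alongside this snapshot; not part of the original release.\n\nThe README drives every run from a flat `.yaml` file under `cfgs/` whose keys\n(`path`, `category_name`, `batch_size`, ...) become attributes on a config\nobject. This sketch shows one plausible way to load such a file and apply CLI\noverrides. The real loading logic lives in `main.py`, which is not included in\nthis snapshot, so the helper names here are assumptions.\n\"\"\"\nimport argparse\nfrom types import SimpleNamespace\n\nimport yaml  # PyYAML\n\n\ndef load_yaml_config(cfg_path, **overrides):\n    \"\"\"Load a flat cfgs/*.yaml file into an attribute-style namespace.\"\"\"\n    with open(cfg_path, 'r') as f:\n        cfg = yaml.safe_load(f)\n    # apply CLI overrides such as --batch_size, skipping unset ones\n    cfg.update({k: v for k, v in overrides.items() if v is not None})\n    return SimpleNamespace(**cfg)\n\n\nif __name__ == '__main__':\n    parser = argparse.ArgumentParser()\n    parser.add_argument('--cfg', required=True)\n    parser.add_argument('--batch_size', type=int, default=None)\n    args = parser.parse_args()\n    config = load_yaml_config(args.cfg, batch_size=args.batch_size)\n    # The README requires `path` to be set to the dataset root beforehand.\n    print(config.dataset, config.category_name, config.path)\n"
  },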
  {
    "path": "cfgs/BAT_CAR_NUSCENES.yaml",
    "content": "#data\ndataset: nuscenes\npath: #put data root here\nversion: v1.0-trainval\ncategory_name: Car\nsearch_bb_scale: 1.25\nsearch_bb_offset: 2\nmodel_bb_scale: 1.25\nmodel_bb_offset: 0\ntemplate_size: 512\nsearch_size: 1024\nrandom_sample: False\nsample_per_epoch: -1\ndegrees: True # use degrees or radians\nbox_aware: True\nnum_candidates: 4\nup_axis: [ 0,0,1 ]\npreload_offset: 10\nkey_frame_only: True\ntrain_split: train_track\nval_split: val\ntest_split: val\nmin_points: 1\ntrain_type: train_siamese\ndata_limit_box: False\n\n#model configuration\nnet_model: BAT\nuse_fps: True\nnormalize_xyz: False\nfeature_channel: 256 #the output channel of backbone\nhidden_channel: 256 #the hidden channel of xcorr\nout_channel: 256 #the output channel of xcorr\nvote_channel: 256 #the channel for vote aggregation\nnum_proposal: 64\nk: 4\nuse_search_bc: False\nuse_search_feature: False\nbc_channel: 9\n\n#loss configuration\nobjectiveness_weight: 1.5\nbox_weight: 0.2\nvote_weight: 1.0\nseg_weight: 0.2\nbc_weight: 1.0\n\n# testing config\nreference_BB: previous_result\nshape_aggregation: firstandprevious\nuse_z: False\nlimit_box: True\nIoU_space: 3\n\n#training\nbatch_size: 100\nworkers: 10\nepoch: 60\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 20\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/BAT_Car.yaml",
    "content": "#data\ndataset: kitti\npath:  #put data root here\ncategory_name: Car # [Car, Van, Pedestrian, Cyclist, All]\nsearch_bb_scale: 1.25\nsearch_bb_offset: 2\nmodel_bb_scale: 1.25\nmodel_bb_offset: 0\ntemplate_size: 512\nsearch_size: 1024\nrandom_sample: False\nsample_per_epoch: -1\ndegrees: True # use degrees or radians\nbox_aware: True\nnum_candidates: 4\ncoordinate_mode: velodyne\nup_axis: [0,0,1]\ntrain_split: train\nval_split: test\ntest_split: test\npreload_offset: 10\ntrain_type: train_siamese\ndata_limit_box: False\n\n#model configuration\nnet_model: BAT\nuse_fps: True\nnormalize_xyz: False\nfeature_channel: 256 #the output channel of backbone\nhidden_channel: 256 #the hidden channel of xcorr\nout_channel: 256 #the output channel of xcorr\nvote_channel: 256 #the channel for vote aggregation\nnum_proposal: 64\nk: 4\nuse_search_bc: False\nuse_search_feature: False\nbc_channel: 9\n\n#loss configuration\nobjectiveness_weight: 1.5\nbox_weight: 0.2\nvote_weight: 1.0\nseg_weight: 0.2\nbc_weight: 1.0\n\n# testing config\nreference_BB: previous_result\nshape_aggregation: firstandprevious\nuse_z: True\nlimit_box: False\nIoU_space: 3\n\n#training\nbatch_size: 50 #batch_size per gpu\nworkers: 10\nepoch: 60\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 12\nlr_decay_rate: 0.2\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/BAT_Car_Waymo.yaml",
    "content": "#data\ndataset: waymo\npath:  #put the root of the dataset here\ncategory_name: Vehicle # [Vehicle, Pedestrian, Cyclist]\nsearch_bb_scale: 1.25\nsearch_bb_offset: 2\nmodel_bb_scale: 1.25\nmodel_bb_offset: 0\ntemplate_size: 512\nsearch_size: 1024\nrandom_sample: False\nsample_per_epoch: -1\ndegrees: True # use degrees or radians\nbox_aware: True\nnum_candidates: 4\ncoordinate_mode: velodyne\nup_axis: [0,0,1]\ntrain_split: train\nval_split: test\ntest_split: test\npreload_offset: 10\ntiny: False # for debug only\ntrain_type: train_siamese\ndata_limit_box: False\n\n#model configuration\nnet_model: BAT\nuse_fps: True\nnormalize_xyz: False\nfeature_channel: 256 #the output channel of backbone\nhidden_channel: 256 #the hidden channel of xcorr\nout_channel: 256 #the output channel of xcorr\nvote_channel: 256 #the channel for vote aggregation\nnum_proposal: 64\nk: 4\nuse_search_bc: False\nuse_search_feature: False\nbc_channel: 9\n\n#loss configuration\nobjectiveness_weight: 1.5\nbox_weight: 0.2\nvote_weight: 1.0\nseg_weight: 0.2\nbc_weight: 1.0\n\n# testing config\nreference_BB: previous_result\nshape_aggregation: firstandprevious\nuse_z: False\nlimit_box: True\nIoU_space: 3\n\n#training\nbatch_size: 50 #batch_size per gpu\nworkers: 10\nepoch: 60\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 5\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/BAT_PEDESTRIAN_NUSCENES.yaml",
    "content": "#data\ndataset: nuscenes\npath: #put data root here\nversion: v1.0-trainval\ncategory_name: Pedestrian\nsearch_bb_scale: 1.25\nsearch_bb_offset: 2\nmodel_bb_scale: 1.25\nmodel_bb_offset: 0\ntemplate_size: 512\nsearch_size: 1024\nrandom_sample: False\nsample_per_epoch: -1\ndegrees: True # use degrees or radians\nbox_aware: True\nnum_candidates: 4\nup_axis: [ 0,0,1 ]\npreload_offset: 10\nkey_frame_only: False\ntrain_split: train_track\nval_split: val\ntest_split: val\nmin_points: 1\ntrain_type: train_siamese\n\n#model configuration\nnet_model: BAT\nuse_fps: True\nnormalize_xyz: False\nfeature_channel: 256 #the output channel of backbone\nhidden_channel: 256 #the hidden channel of xcorr\nout_channel: 256 #the output channel of xcorr\nvote_channel: 256 #the channel for vote aggregation\nnum_proposal: 64\nk: 4\nuse_search_bc: False\nuse_search_feature: False\nbc_channel: 9\n\n#loss configuration\nobjectiveness_weight: 1.5\nbox_weight: 0.2\nvote_weight: 1.0\nseg_weight: 0.2\nbc_weight: 1.0\n\n# testing config\nreference_BB: previous_result\nshape_aggregation: firstandprevious\nuse_z: False\nlimit_box: True\nIoU_space: 3\n\n#training\nbatch_size: 100\nworkers: 10\nepoch: 60\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 20\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/BAT_Pedestrian.yaml",
    "content": "#data\ndataset: kitti\npath: #put the root of the dataset here\ncategory_name: Pedestrian # [Car, Van, Pedestrian, Cyclist, All]\nsearch_bb_scale: 1.25\nsearch_bb_offset: 2\nmodel_bb_scale: 1.25\nmodel_bb_offset: 0\ntemplate_size: 512\nsearch_size: 1024\nrandom_sample: False\nsample_per_epoch: -1\ndegrees: True # use degrees or radians\nbox_aware: True\nnum_candidates: 4\ncoordinate_mode: velodyne\nup_axis: [0,0,1]\ntrain_split: train\nval_split: test\ntest_split: test\npreload_offset: 10\ntrain_type: train_siamese\ndata_limit_box: True\n\n#model configuration\nnet_model: BAT\nuse_fps: True\nnormalize_xyz: False\nfeature_channel: 256 #the output channel of backbone\nhidden_channel: 256 #the hidden channel of xcorr\nout_channel: 256 #the output channel of xcorr\nvote_channel: 256 #the channel for vote aggregation\nnum_proposal: 64\nk: 4\nuse_search_bc: False\nuse_search_feature: False\nbc_channel: 9\n\n#loss configuration\nobjectiveness_weight: 1.5\nbox_weight: 0.2\nvote_weight: 1.0\nseg_weight: 0.2\nbc_weight: 1.0\n\n# testing config\nreference_BB: previous_result\nshape_aggregation: firstandprevious\nuse_z: False\nlimit_box: True\nIoU_space: 3\n\n#training\nbatch_size: 50 #batch_size per gpu\nworkers: 10\nepoch: 60\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 12\nlr_decay_rate: 0.2\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/M2_Track_nuscene.yaml",
    "content": "#data\ndataset: nuscenes\npath:  #put data root here\nversion: v1.0-trainval\ncategory_name: Car\nbb_scale: 1.25\nbb_offset: 2\npoint_sample_size: 1024\ndegrees: False\ncoordinate_mode: velodyne\nup_axis: [0,0,1]\npreload_offset: 10\ndata_limit_box: True\nkey_frame_only: True\ntrain_split: train_track\nval_split: val\ntest_split: val\ntrain_type: train_motion\nnum_candidates: 4\nmotion_threshold: 0.15\nuse_augmentation: True\n\n\n#model configuration\nnet_model: m2track\nbox_aware: True\n\n#loss configuration\ncenter_weight: 2\nangle_weight: 10.0\nseg_weight: 0.1\nbc_weight: 1\n\nmotion_cls_seg_weight: 0.1\n\n\n# testing config\nuse_z: True\nlimit_box: False\nIoU_space: 3\n\n#training\nbatch_size: 100\nworkers: 10\nepoch: 180\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 20\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/M2_Track_waymo.yaml",
    "content": "#data\ndataset: waymo\npath: #put data root here\ncategory_name: Vehicle # [Vehicle, Pedestrian, Cyclist]\nbb_scale: 1.25\nbb_offset: 2\npoint_sample_size: 1024\ndegrees: False\ncoordinate_mode: velodyne\nup_axis: [ 0,0,1 ]\n\npreload_offset: 60\ndata_limit_box: True\ntrain_split: train\nval_split: test\ntest_split: test\ntrain_type: train_motion\nnum_candidates: 4\nmotion_threshold: 0.15\nuse_augmentation: True\n\n\nbox_aware: True\n\ntiny: False # for debug only\n\n#model configuration\nnet_model: m2track\n\n#loss configuration\ncenter_weight: 2\nangle_weight: 10.0\nseg_weight: 0.1\nbc_weight: 1.0\nmotion_cls_seg_weight: 0.1\n\n\n# testing config\nuse_z: True\nlimit_box: False\nIoU_space: 3\n\n#training\nbatch_size: 100\nworkers: 10\nepoch: 180\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 20\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/M2_track_kitti.yaml",
    "content": "#data\ndataset: kitti\npath: #put data root here\ncategory_name: Pedestrian # [Car, Van, Pedestrian, Cyclist, All]\nbb_scale: 1.25\nbb_offset: 2\npoint_sample_size: 1024\ndegrees: False\ncoordinate_mode: velodyne\nup_axis: [ 0,0,1 ]\npreload_offset: 10\ndata_limit_box: True\ntrain_split: train\nval_split: test\ntest_split: test\ntrain_type: train_motion\nnum_candidates: 4\nmotion_threshold: 0.15\nuse_augmentation: True\n\n\n#model configuration\nnet_model: m2track\nbox_aware: True\n\n#loss configuration\ncenter_weight: 2\nangle_weight: 10.0\nseg_weight: 0.1\nbc_weight: 1\n\nmotion_cls_seg_weight: 0.1\n\n# testing config\nuse_z: True\nlimit_box: False\nIoU_space: 3\n\n#training\nbatch_size: 100\nworkers: 10\nepoch: 180\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 20\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/P2B_Car.yaml",
    "content": "#data\ndataset: kitti\npath:  #put the root of the dataset here\ncategory_name: Car # [Car, Van, Pedestrian, Cyclist, All]\nsearch_bb_scale: 1.25\nsearch_bb_offset: 2\nmodel_bb_scale: 1.25\nmodel_bb_offset: 0\ntemplate_size: 512\nsearch_size: 1024\nrandom_sample: False\nsample_per_epoch: -1\ndegrees: True # use degrees or radians\nnum_candidates: 4\ncoordinate_mode: camera\nup_axis: [0,-1,0]\ntrain_split: train\nval_split: test\ntest_split: test\npreload_offset: 10\ntrain_type: train_siamese\ndata_limit_box: True\n\n#model configuration\nnet_model: P2B\nuse_fps: False\nnormalize_xyz: False\nfeature_channel: 256 #the output channel of backbone\nhidden_channel: 256 #the hidden channel of xcorr\nout_channel: 256 #the output channel of xcorr\nvote_channel: 256 #the channel for vote aggregation\nnum_proposal: 64\n\n#loss configuration\nobjectiveness_weight: 1.5\nbox_weight: 0.2\nvote_weight: 1.0\nseg_weight: 0.2\n\n# testing config\nreference_BB: previous_result\nshape_aggregation: firstandprevious\nuse_z: False\nlimit_box: True\nIoU_space: 3\n\n#training\nbatch_size: 50\nworkers: 10\nepoch: 60\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 20\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/P2B_Car_NuScenes.yaml",
    "content": "#data\ndataset: nuscenes\npath: #put data root here\nversion: v1.0-trainval\ncategory_name: Car # [Car, Van, Pedestrian, Cyclist, All]\nsearch_bb_scale: 1.25\nsearch_bb_offset: 2\nmodel_bb_scale: 1.25\nmodel_bb_offset: 0\ntemplate_size: 512\nsearch_size: 1024\nrandom_sample: False\nsample_per_epoch: -1\ndegrees: True # use degrees or radians\nnum_candidates: 4\nup_axis: [ 0,0,1 ]\npreload_offset: 10\nkey_frame_only: True\ntrain_split: train_track\nval_split: val\ntest_split: val\nmin_points: 1\ntrain_type: train_siamese\ndata_limit_box: True\n\n#model configuration\nnet_model: P2B\nuse_fps: False\nnormalize_xyz: False\nfeature_channel: 256 #the output channel of backbone\nhidden_channel: 256 #the hidden channel of xcorr\nout_channel: 256 #the output channel of xcorr\nvote_channel: 256 #the channel for vote aggregation\nnum_proposal: 64\n\n#loss configuration\nobjectiveness_weight: 1.5\nbox_weight: 0.2\nvote_weight: 1.0\nseg_weight: 0.2\n\n# testing config\nreference_BB: previous_result\nshape_aggregation: firstandprevious\nuse_z: False\nlimit_box: True\nIoU_space: 3\n\n#training\nbatch_size: 50\nworkers: 10\nepoch: 60\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 20\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "cfgs/P2B_Car_Waymo.yaml",
    "content": "#data\ndataset: waymo\npath:  #put the root of the dataset here\ncategory_name: Vehicle # [Vehicle, Pedestrian, Cyclist]\nsearch_bb_scale: 1.25\nsearch_bb_offset: 2\nmodel_bb_scale: 1.25\nmodel_bb_offset: 0\ntemplate_size: 512\nsearch_size: 1024\nrandom_sample: False\nsample_per_epoch: -1\ndegrees: True # use degrees or radians\nbox_aware: True\nnum_candidates: 4\ncoordinate_mode: velodyne\nup_axis: [0,0,1]\ntrain_split: train\nval_split: test\ntest_split: test\npreload_offset: 10\ntiny: False # for debug only\ntrain_type: train_siamese\ndata_limit_box: True\n\n#model configuration\nnet_model: P2B\nuse_fps: False\nnormalize_xyz: False\nfeature_channel: 256 #the output channel of backbone\nhidden_channel: 256 #the hidden channel of xcorr\nout_channel: 256 #the output channel of xcorr\nvote_channel: 256 #the channel for vote aggregation\nnum_proposal: 64\nk: 4\nuse_search_bc: False\nuse_search_feature: False\nbc_channel: 9\n\n#loss configuration\nobjectiveness_weight: 1.5\nbox_weight: 0.2\nvote_weight: 1.0\nseg_weight: 0.2\nbc_weight: 1.0\n\n# testing config\nreference_BB: previous_result\nshape_aggregation: firstandprevious\nuse_z: False\nlimit_box: True\nIoU_space: 3\n\n#training\nbatch_size: 50 #batch_size per gpu\nworkers: 10\nepoch: 60\nfrom_epoch: 0\nlr: 0.001\noptimizer: Adam\nlr_decay_step: 5\nlr_decay_rate: 0.1\nwd: 0\ngradient_clip_val: 0.0"
  },
  {
    "path": "datasets/__init__.py",
    "content": "\"\"\"\n___init__.py\nCreated by zenn at 2021/7/18 15:50\n\"\"\"\nfrom datasets import kitti, sampler, nuscenes_data, waymo_data\n\n\ndef get_dataset(config, type='train', **kwargs):\n    if config.dataset == 'kitti':\n        data = kitti.kittiDataset(path=config.path,\n                                  split=kwargs.get('split', 'train'),\n                                  category_name=config.category_name,\n                                  coordinate_mode=config.coordinate_mode,\n                                  preloading=config.preloading,\n                                  preload_offset=config.preload_offset if type != 'test' else -1)\n    elif config.dataset == 'nuscenes':\n        data = nuscenes_data.NuScenesDataset(path=config.path,\n                                             split=kwargs.get('split', 'train_track'),\n                                             category_name=config.category_name,\n                                             version=config.version,\n                                             key_frame_only=True if type != 'test' else config.key_frame_only,\n                                             # can only use keyframes for training\n                                             preloading=config.preloading,\n                                             preload_offset=config.preload_offset if type != 'test' else -1,\n                                             min_points=1 if kwargs.get('split', 'train_track') in\n                                                             [config.val_split, config.test_split] else -1)\n    elif config.dataset == 'waymo':\n        data = waymo_data.WaymoDataset(path=config.path,\n                                       split=kwargs.get('split', 'train'),\n                                       category_name=config.category_name,\n                                       preloading=config.preloading,\n                                       preload_offset=config.preload_offset,\n                                       tiny=config.tiny)\n    else:\n        data = None\n\n    if type == 'train_siamese':\n        return sampler.PointTrackingSampler(dataset=data,\n                                            random_sample=config.random_sample,\n                                            sample_per_epoch=config.sample_per_epoch,\n                                            config=config)\n    elif type.lower() == 'train_motion':\n        return sampler.MotionTrackingSampler(dataset=data,\n                                             config=config)\n    else:\n        return sampler.TestTrackingSampler(dataset=data, config=config)\n"
  },
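  {
    "path": "docs/examples/get_dataset_sketch.py",
    "content": "\"\"\"\nIllustrative sketch added alongside this snapshot; not part of the original release.\n\n`datasets.get_dataset` is a factory: `config.dataset` selects the dataset class\n(kitti / nuscenes / waymo) and the `type` argument selects the wrapper sampler\n('train_siamese', 'train_motion', or anything else for testing). The fields\nbelow mirror a subset of `cfgs/BAT_Car.yaml`; a real config carries the full\nkey set, and the dataset path is a placeholder.\n\"\"\"\nfrom types import SimpleNamespace\n\nfrom datasets import get_dataset\n\nconfig = SimpleNamespace(\n    dataset='kitti',\n    path='/path/to/kitti',  # dataset root, as required by the .yaml files\n    category_name='Car',\n    coordinate_mode='velodyne',\n    preloading=False,\n    preload_offset=10,\n    random_sample=False,\n    sample_per_epoch=-1,\n    val_split='test',\n    test_split='test',\n)\n\n# A test-time sampler wrapping the KITTI test split.\ntest_data = get_dataset(config, type='test', split=config.test_split)\nprint(type(test_data).__name__)  # TestTrackingSampler\n"
  },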
  {
    "path": "datasets/base_dataset.py",
    "content": "\"\"\" \nbase_dataset.py\nCreated by zenn at 2021/9/1 22:16\n\"\"\"\n\n\nclass BaseDataset:\n    def __init__(self, path, split, category_name=\"Car\", **kwargs):\n        self.path = path\n        self.split = split\n        self.category_name = category_name\n        self.preloading = kwargs.get('preloading', False)\n\n\n    def get_num_tracklets(self):\n        raise NotImplementedError\n\n    def get_num_frames_total(self):\n        raise NotImplementedError\n\n    def get_num_frames_tracklet(self, tracklet_id):\n        raise NotImplementedError\n\n    def get_frames(self, seq_id, frame_ids):\n        raise NotImplementedError\n"
  },
  {
    "path": "datasets/data_classes.py",
    "content": "# nuScenes dev-kit.\n# Code written by Oscar Beijbom, 2018.\n# Licensed under the Creative Commons [see licence.txt]\n\n#from __future__ import annotations\nimport torch\nimport numpy as np\nfrom pyquaternion import Quaternion\n\n\nclass PointCloud:\n\n    def __init__(self, points):\n        \"\"\"\n        Class for manipulating and viewing point clouds.\n        :param points: <np.float: 4, n>. Input point cloud matrix.\n        \"\"\"\n        self.points = points\n        if self.points.shape[0] > 3:\n            self.points = self.points[0:3, :]\n\n    @staticmethod\n    def load_pcd_bin(file_name):\n        \"\"\"\n        Loads from binary format. Data is stored as (x, y, z, intensity, ring index).\n        :param file_name: <str>.\n        :return: <np.float: 4, n>. Point cloud matrix (x, y, z, intensity).\n        \"\"\"\n        scan = np.fromfile(file_name, dtype=np.float32)\n        points = scan.reshape((-1, 5))[:, :4]\n        return points.T\n\n    @classmethod\n    def from_file(cls, file_name):\n        \"\"\"\n        Instantiate from a .pcl, .pdc, .npy, or .bin file.\n        :param file_name: <str>. Path of the pointcloud file on disk.\n        :return: <PointCloud>.\n        \"\"\"\n\n        if file_name.endswith('.bin'):\n            points = cls.load_pcd_bin(file_name)\n        elif file_name.endswith('.npy'):\n            points = np.load(file_name)\n        else:\n            raise ValueError('Unsupported filetype {}'.format(file_name))\n\n        return cls(points)\n\n    def nbr_points(self):\n        \"\"\"\n        Returns the number of points.\n        :return: <int>. Number of points.\n        \"\"\"\n        return self.points.shape[1]\n\n    def subsample(self, ratio):\n        \"\"\"\n        Sub-samples the pointcloud.\n        :param ratio: <float>. Fraction to keep.\n        :return: <None>.\n        \"\"\"\n        selected_ind = np.random.choice(np.arange(0, self.nbr_points()),\n                                        size=int(self.nbr_points() * ratio))\n        self.points = self.points[:, selected_ind]\n\n    def remove_close(self, radius):\n        \"\"\"\n        Removes point too close within a certain radius from origin.\n        :param radius: <float>.\n        :return: <None>.\n        \"\"\"\n\n        x_filt = np.abs(self.points[0, :]) < radius\n        y_filt = np.abs(self.points[1, :]) < radius\n        not_close = np.logical_not(np.logical_and(x_filt, y_filt))\n        self.points = self.points[:, not_close]\n\n    def translate(self, x):\n        \"\"\"\n        Applies a translation to the point cloud.\n        :param x: <np.float: 3, 1>. Translation in x, y, z.\n        :return: <None>.\n        \"\"\"\n        for i in range(3):\n            self.points[i, :] = self.points[i, :] + x[i]\n\n    def rotate(self, rot_matrix):\n        \"\"\"\n        Applies a rotation.\n        :param rot_matrix: <np.float: 3, 3>. Rotation matrix.\n        :return: <None>.\n        \"\"\"\n        self.points[:3, :] = np.dot(rot_matrix, self.points[:3, :])\n\n    def transform(self, transf_matrix):\n        \"\"\"\n        Applies a homogeneous transform.\n        :param transf_matrix: <np.float: 4, 4>. 
Homogenous transformation matrix.\n        :return: <None>.\n        \"\"\"\n        self.points[:3, :] = transf_matrix.dot(\n            np.vstack((self.points[:3, :], np.ones(self.nbr_points()))))[:3, :]\n\n    def convertToPytorch(self):\n        \"\"\"\n        Helper from pytorch.\n        :return: Pytorch array of points.\n        \"\"\"\n        return torch.from_numpy(self.points)\n\n    @staticmethod\n    def fromPytorch(cls, pytorchTensor):\n        \"\"\"\n        Loads from binary format. Data is stored as (x, y, z, intensity, ring index).\n        :param pyttorchTensor: <Tensor>.\n        :return: <np.float: 4, n>. Point cloud matrix (x, y, z, intensity).\n        \"\"\"\n        points = pytorchTensor.numpy()\n        # points = points.reshape((-1, 5))[:, :4]\n        return cls(points)\n\n    def normalize(self, wlh):\n        normalizer = [wlh[1], wlh[0], wlh[2]]\n        self.points = self.points / np.atleast_2d(normalizer).T\n\n\nclass Box:\n    \"\"\" Simple data class representing a 3d box including, label, score and velocity. \"\"\"\n\n    def __init__(self, center, size, orientation, label=np.nan, score=np.nan, velocity=(np.nan, np.nan, np.nan),\n                 name=None):\n        \"\"\"\n        :param center: [<float>: 3]. Center of box given as x, y, z.\n        :param size: [<float>: 3]. Size of box in width, length, height.\n        :param orientation: <Quaternion>. Box orientation.\n        :param label: <int>. Integer label, optional.\n        :param score: <float>. Classification score, optional.\n        :param velocity: [<float>: 3]. Box velocity in x, y, z direction.\n        :param name: <str>. Box name, optional. Can be used e.g. for denote category name.\n        \"\"\"\n        assert not np.any(np.isnan(center))\n        assert not np.any(np.isnan(size))\n        assert len(center) == 3\n        assert len(size) == 3\n        # assert type(orientation) == Quaternion\n\n        self.center = np.array(center)\n        self.wlh = np.array(size)\n        self.orientation = orientation\n        self.label = int(label) if not np.isnan(label) else label\n        self.score = float(score) if not np.isnan(score) else score\n        self.velocity = np.array(velocity)\n        self.name = name\n\n    def __eq__(self, other):\n        center = np.allclose(self.center, other.center)\n        wlh = np.allclose(self.wlh, other.wlh)\n        orientation = np.allclose(self.orientation.elements, other.orientation.elements)\n        label = (self.label == other.label) or (np.isnan(self.label) and np.isnan(other.label))\n        score = (self.score == other.score) or (np.isnan(self.score) and np.isnan(other.score))\n        vel = (np.allclose(self.velocity, other.velocity) or\n               (np.all(np.isnan(self.velocity)) and np.all(np.isnan(other.velocity))))\n\n        return center and wlh and orientation and label and score and vel\n\n    def __repr__(self):\n        repr_str = 'label: {}, score: {:.2f}, xyz: [{:.2f}, {:.2f}, {:.2f}], wlh: [{:.2f}, {:.2f}, {:.2f}], ' \\\n                   'rot axis: [{:.2f}, {:.2f}, {:.2f}], ang(degrees): {:.2f}, ang(rad): {:.2f}, ' \\\n                   'vel: {:.2f}, {:.2f}, {:.2f}, name: {}'\n\n        return repr_str.format(self.label, self.score, self.center[0], self.center[1], self.center[2], self.wlh[0],\n                               self.wlh[1], self.wlh[2], self.orientation.axis[0], self.orientation.axis[1],\n                               self.orientation.axis[2], self.orientation.degrees, self.orientation.radians,\n 
                              self.velocity[0], self.velocity[1], self.velocity[2], self.name)\n\n    def encode(self):\n        \"\"\"\n        Encodes the box instance to a JSON-friendly vector representation.\n        :return: [<float>: 16]. List of floats encoding the box.\n        \"\"\"\n        return self.center.tolist() + self.wlh.tolist() + self.orientation.elements.tolist() + [self.label] + [self.score] + self.velocity.tolist() + [self.name]\n\n    @classmethod\n    def decode(cls, data):\n        \"\"\"\n        Instantiates a Box instance from encoded vector representation.\n        :param data: [<float>: 16]. Output from encode.\n        :return: <Box>.\n        \"\"\"\n        return Box(data[0:3], data[3:6], Quaternion(data[6:10]), label=data[10], score=data[11], velocity=data[12:15],\n                   name=data[15])\n\n    @property\n    def rotation_matrix(self):\n        \"\"\"\n        Return a rotation matrix.\n        :return: <np.float: (3, 3)>.\n        \"\"\"\n        return self.orientation.rotation_matrix\n\n    def translate(self, x):\n        \"\"\"\n        Applies a translation.\n        :param x: <np.float: 3, 1>. Translation in x, y, z direction.\n        :return: <None>.\n        \"\"\"\n        self.center += x\n\n    def rotate(self, quaternion):\n        \"\"\"\n        Rotates box.\n        :param quaternion: <Quaternion>. Rotation to apply.\n        :return: <None>.\n        \"\"\"\n        self.center = np.dot(quaternion.rotation_matrix, self.center)\n        self.orientation = quaternion * self.orientation\n        self.velocity = np.dot(quaternion.rotation_matrix, self.velocity)\n\n    def transform(self, transf_matrix):\n        transformed = np.dot(transf_matrix[0:3,0:4].T, self.center)\n        self.center = transformed[0:3]/transformed[3]\n        self.orientation = self.orientation* Quaternion(matrix = transf_matrix[0:3,0:3])\n        self.velocity = np.dot(transf_matrix[0:3,0:3], self.velocity)\n\n    def corners(self, wlh_factor=1.0):\n        \"\"\"\n        Returns the bounding box corners.\n        :param wlh_factor: <float>. Multiply w, l, h by a factor to inflate or deflate the box.\n        :return: <np.float: 3, 8>. First four corners are the ones facing forward.\n            The last four are the ones facing backwards.\n        \"\"\"\n        w, l, h = self.wlh * wlh_factor\n\n        # 3D bounding box corners. (Convention: x points forward, y to the left, z up.)\n        x_corners = l / 2 * np.array([1,  1,  1,  1, -1, -1, -1, -1])\n        y_corners = w / 2 * np.array([1, -1, -1,  1,  1, -1, -1,  1])\n        z_corners = h / 2 * np.array([1,  1, -1, -1,  1,  1, -1, -1])\n        corners = np.vstack((x_corners, y_corners, z_corners))\n\n        # Rotate\n        corners = np.dot(self.orientation.rotation_matrix, corners)\n\n        # Translate\n        x, y, z = self.center\n        corners[0, :] = corners[0, :] + x\n        corners[1, :] = corners[1, :] + y\n        corners[2, :] = corners[2, :] + z\n\n        return corners\n\n    def bottom_corners(self):\n        \"\"\"\n        Returns the four bottom corners.\n        :return: <np.float: 3, 4>. Bottom corners. First two face forward, last two face backwards.\n        \"\"\"\n        return self.corners()[:, [2, 3, 7, 6]]\n"
  },
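  {
    "path": "docs/examples/box_pointcloud_sketch.py",
    "content": "\"\"\"\nIllustrative sketch added alongside this snapshot; not part of the original release.\n\nShows the basic geometry API of `datasets.data_classes`: `PointCloud` stores\npoints column-wise as a <3, n> array and is transformed in place, while `Box`\nis parameterized by a center, a (w, l, h) size and a pyquaternion orientation.\nAll numbers below are made up for illustration.\n\"\"\"\nimport numpy as np\nfrom pyquaternion import Quaternion\n\nfrom datasets.data_classes import Box, PointCloud\n\n# A toy cloud of 100 points, stored column-wise as the class expects.\npc = PointCloud(np.random.rand(3, 100).astype(np.float32))\npc.translate(np.array([1.0, 0.0, 0.0]))  # shift along x\npc.rotate(Quaternion(axis=[0, 0, 1], degrees=90).rotation_matrix)  # yaw by 90 degrees\n\n# A 2m x 4m x 1.5m (w, l, h) box, yawed 30 degrees around the up axis.\nbox = Box(center=[1.0, 2.0, 0.5], size=[2.0, 4.0, 1.5],\n          orientation=Quaternion(axis=[0, 0, 1], degrees=30))\ncorners = box.corners()  # <3, 8> matrix of corner coordinates\nprint(pc.nbr_points(), corners.shape)\n"
  },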
  {
    "path": "datasets/generate_waymo_sot.py",
    "content": "#!/usr/bin/env python\n# encoding: utf-8\n'''\n@author: Xu Yan\n@file: generate_waymo_sot.py\n@time: 2021/6/17 13:17\n'''\nimport os\nimport pickle\nfrom collections import defaultdict\n\nfrom tqdm import tqdm\n\n\ndef lood_pickle(root):\n    with open(root, \"rb\") as f:\n        file = pickle.load(f)\n    return file\n\n\ndef generate_waymo_data(root, cla, split):\n    TYPE_LIST = ['UNKNOWN', 'VEHICLE', 'PEDESTRIAN', 'SIGN', 'CYCLIST']\n\n    print('Generate %s class for %s set' % (cla, split))\n    waymo_infos_all = lood_pickle(os.path.join(root, 'infos_%s_01sweeps_filter_zero_gt.pkl' % split))\n\n    DATA = defaultdict(list)\n\n    for idx, frame in tqdm(enumerate(waymo_infos_all), total=len(waymo_infos_all)):\n        anno = lood_pickle(os.path.join(root, frame['anno_path']))\n        \n        for obj in anno['objects']:\n            if TYPE_LIST[obj['label']] == cla:\n                if not obj['name'] in DATA:\n                    DATA[obj['name']] = [\n                        {\n                            'PC': frame['path'],\n                            'Box': obj['box'],\n                            'Class': cla\n                        }\n                    ]\n                else:\n                    DATA[obj['name']] += [\n                        {\n                            'PC': frame['path'],\n                            'Box': obj['box'],\n                            'Class': cla\n                        }\n                    ]\n\n    print('Save data...')\n    with open(os.path.join(root, 'sot_infos_%s_%s.pkl' % (cla.lower(), split)), \"wb\") as f:\n        pickle.dump(DATA, f)\n\n\nif __name__ == '__main__':\n    splits = ['train', 'val']\n    classes = ['VEHICLE', 'PEDESTRIAN', 'CYCLIST']\n    root = '/raid/databases/Waymo/'\n    for split in splits:\n        for cla in classes:\n            generate_waymo_data(root, cla, split)\n"
  },
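  {
    "path": "docs/examples/read_waymo_sot_infos_sketch.py",
    "content": "\"\"\"\nIllustrative sketch added alongside this snapshot; not part of the original release.\n\n`datasets/generate_waymo_sot.py` writes one `sot_infos_<class>_<split>.pkl` per\ncategory/split: a dict mapping each object name (tracklet id) to its list of\nframes, where each frame stores the lidar 'PC' path, the 'Box' annotation and\nthe 'Class'. This sketch only inspects such a file; the root is a placeholder.\n\"\"\"\nimport os\nimport pickle\n\nroot = '/path/to/waymo'  # placeholder dataset root\n\nwith open(os.path.join(root, 'sot_infos_vehicle_train.pkl'), 'rb') as f:\n    tracklets = pickle.load(f)\n\nprint('%d vehicle tracklets' % len(tracklets))\nname, frames = next(iter(tracklets.items()))\nprint(name, len(frames), frames[0].keys())  # dict_keys(['PC', 'Box', 'Class'])\n"
  },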
  {
    "path": "datasets/kitti.py",
    "content": "# Created by zenn at 2021/4/27\n\nimport copy\nimport random\n\nfrom torch.utils.data import Dataset\nfrom datasets.data_classes import PointCloud, Box\nfrom pyquaternion import Quaternion\nimport numpy as np\nimport pandas as pd\nimport os\nimport warnings\nimport pickle\nfrom collections import defaultdict\nfrom datasets import points_utils, base_dataset\n\n\nclass kittiDataset(base_dataset.BaseDataset):\n    def __init__(self, path, split, category_name=\"Car\", **kwargs):\n        super().__init__(path, split, category_name, **kwargs)\n        self.KITTI_Folder = path\n        self.KITTI_velo = os.path.join(self.KITTI_Folder, \"velodyne\")\n        self.KITTI_image = os.path.join(self.KITTI_Folder, \"image_02\")\n        self.KITTI_label = os.path.join(self.KITTI_Folder, \"label_02\")\n        self.KITTI_calib = os.path.join(self.KITTI_Folder, \"calib\")\n        self.scene_list = self._build_scene_list(split)\n        self.velos = defaultdict(dict)\n        self.calibs = {}\n        self.tracklet_anno_list, self.tracklet_len_list = self._build_tracklet_anno()\n        self.coordinate_mode = kwargs.get('coordinate_mode', 'velodyne')\n        self.preload_offset = kwargs.get('preload_offset', -1)\n        if self.preloading:\n            self.training_samples = self._load_data()\n\n    @staticmethod\n    def _build_scene_list(split):\n        if \"TRAIN\" in split.upper():  # Training SET\n            if \"TINY\" in split.upper():\n                scene_names = [0]\n            else:\n                scene_names = list(range(0, 17))\n        elif \"VALID\" in split.upper():  # Validation Set\n            if \"TINY\" in split.upper():\n                scene_names = [18]\n            else:\n                scene_names = list(range(17, 19))\n        elif \"TEST\" in split.upper():  # Testing Set\n            if \"TINY\" in split.upper():\n                scene_names = [19]\n            else:\n                scene_names = list(range(19, 21))\n\n        else:  # Full Dataset\n            scene_names = list(range(21))\n        scene_names = ['%04d' % scene_name for scene_name in scene_names]\n        return scene_names\n\n    def _load_data(self):\n        print('preloading data into memory')\n        preload_data_path = os.path.join(self.KITTI_Folder,\n                                         f\"preload_kitti_{self.category_name}_{self.split}_{self.coordinate_mode}_{self.preload_offset}.dat\")\n        if os.path.isfile(preload_data_path):\n            print(f'loading from saved file {preload_data_path}.')\n            with open(preload_data_path, 'rb') as f:\n                training_samples = pickle.load(f)\n        else:\n            print('reading from annos')\n            training_samples = []\n            for i in range(len(self.tracklet_anno_list)):\n                frames = []\n                for anno in self.tracklet_anno_list[i]:\n                    frames.append(self._get_frame_from_anno(anno))\n                training_samples.append(frames)\n            with open(preload_data_path, 'wb') as f:\n                print(f'saving loaded data to {preload_data_path}')\n                pickle.dump(training_samples, f)\n        return training_samples\n\n    def get_num_scenes(self):\n        return len(self.scene_list)\n\n    def get_num_tracklets(self):\n        return len(self.tracklet_anno_list)\n\n    def get_num_frames_total(self):\n        return sum(self.tracklet_len_list)\n\n    def get_num_frames_tracklet(self, tracklet_id):\n        return 
self.tracklet_len_list[tracklet_id]\n\n    def _build_tracklet_anno(self):\n\n        list_of_tracklet_anno = []\n        list_of_tracklet_len = []\n        for scene in self.scene_list:\n\n            label_file = os.path.join(self.KITTI_label, scene + \".txt\")\n\n            df = pd.read_csv(\n                label_file,\n                sep=' ',\n                names=[\n                    \"frame\", \"track_id\", \"type\", \"truncated\", \"occluded\",\n                    \"alpha\", \"bbox_left\", \"bbox_top\", \"bbox_right\",\n                    \"bbox_bottom\", \"height\", \"width\", \"length\", \"x\", \"y\", \"z\",\n                    \"rotation_y\"\n                ])\n            if self.category_name in ['Car', 'Van', 'Truck',\n                                      'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram',\n                                      'Misc']:\n                df = df[df[\"type\"] == self.category_name]\n            elif self.category_name == 'All':\n                df = df[(df[\"type\"] == 'Car') |\n                        (df[\"type\"] == 'Van') |\n                        (df[\"type\"] == 'Pedestrian') |\n                        (df[\"type\"] == 'Cyclist')]\n            else:\n                df = df[df[\"type\"] != 'DontCare']\n            df.insert(loc=0, column=\"scene\", value=scene)\n            for track_id in df.track_id.unique():\n                df_tracklet = df[df[\"track_id\"] == track_id]\n                df_tracklet = df_tracklet.sort_values(by=['frame'])\n                df_tracklet = df_tracklet.reset_index(drop=True)\n                tracklet_anno = [anno for index, anno in df_tracklet.iterrows()]\n                list_of_tracklet_anno.append(tracklet_anno)\n                list_of_tracklet_len.append((len(tracklet_anno)))\n\n        return list_of_tracklet_anno, list_of_tracklet_len\n\n    def get_frames(self, seq_id, frame_ids):\n        if self.preloading:\n            frames = [self.training_samples[seq_id][f_id] for f_id in frame_ids]\n        else:\n            seq_annos = self.tracklet_anno_list[seq_id]\n            frames = [self._get_frame_from_anno(seq_annos[f_id]) for f_id in frame_ids]\n\n        return frames\n\n    def _get_frame_from_anno(self, anno):\n        scene_id = anno['scene']\n        frame_id = anno['frame']\n        try:\n            calib = self.calibs[scene_id]\n        except KeyError:\n            calib_path = os.path.join(self.KITTI_calib, scene_id + \".txt\")\n            calib = self._read_calib_file(calib_path)\n            self.calibs[scene_id] = calib\n        velo_to_cam = np.vstack((calib[\"Tr_velo_cam\"], np.array([0, 0, 0, 1])))\n\n        if self.coordinate_mode == 'velodyne':\n            box_center_cam = np.array([anno[\"x\"], anno[\"y\"] - anno[\"height\"] / 2, anno[\"z\"], 1])\n            # transform bb from camera coordinate into velo coordinates\n            box_center_velo = np.dot(np.linalg.inv(velo_to_cam), box_center_cam)\n            box_center_velo = box_center_velo[:3]\n            size = [anno[\"width\"], anno[\"length\"], anno[\"height\"]]\n            orientation = Quaternion(\n                axis=[0, 0, -1], radians=anno[\"rotation_y\"]) * Quaternion(axis=[0, 0, -1], degrees=90)\n            bb = Box(box_center_velo, size, orientation)\n        else:\n            center = [anno[\"x\"], anno[\"y\"] - anno[\"height\"] / 2, anno[\"z\"]]\n            size = [anno[\"width\"], anno[\"length\"], anno[\"height\"]]\n            orientation = Quaternion(\n                axis=[0, 1, 0], 
radians=anno[\"rotation_y\"]) * Quaternion(\n                axis=[1, 0, 0], radians=np.pi / 2)\n            bb = Box(center, size, orientation)\n\n        try:\n            try:\n                pc = self.velos[scene_id][frame_id]\n            except KeyError:\n                # VELODYNE PointCloud\n                velodyne_path = os.path.join(self.KITTI_velo, scene_id,\n                                             '{:06}.bin'.format(frame_id))\n\n                pc = PointCloud(\n                    np.fromfile(velodyne_path, dtype=np.float32).reshape(-1, 4).T)\n                if self.coordinate_mode == \"camera\":\n                    pc.transform(velo_to_cam)\n                self.velos[scene_id][frame_id] = pc\n            if self.preload_offset > 0:\n                pc = points_utils.crop_pc_axis_aligned(pc, bb, offset=self.preload_offset)\n        except:\n            # in case the Point cloud is missing\n            # (0001/[000177-000180].bin)\n            # msg = f\"The point cloud at scene {scene_id} frame {frame_id} is missing.\"\n            # warnings.warn(msg)\n            pc = PointCloud(np.array([[0, 0, 0]]).T)\n        # todo add image\n        return {\"pc\": pc, \"3d_bbox\": bb, 'meta': anno}\n\n    @staticmethod\n    def _read_calib_file(filepath):\n        \"\"\"Read in a calibration file and parse into a dictionary.\"\"\"\n        data = {}\n        with open(filepath, 'r') as f:\n            for line in f.readlines():\n                values = line.split()\n                # The only non-float values in these files are dates, which\n                # we don't care about anyway\n                try:\n                    data[values[0]] = np.array(\n                        [float(x) for x in values[1:]]).reshape(3, 4)\n                except ValueError:\n                    pass\n        return data\n"
  },
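  {
    "path": "docs/examples/kitti_dataset_sketch.py",
    "content": "\"\"\"\nIllustrative sketch added alongside this snapshot; not part of the original release.\n\n`kittiDataset` assigns the 21 KITTI tracking scenes by substring matching on\nthe split name: 'train' maps to scenes 0-16, 'valid' to 17-18, 'test' to 19-20,\nand adding 'tiny' (e.g. 'train tiny') keeps a single scene for debugging. The\npath below is a placeholder.\n\"\"\"\nfrom datasets.kitti import kittiDataset\n\ndata = kittiDataset(path='/path/to/kitti', split='train tiny',\n                    category_name='Car', coordinate_mode='velodyne',\n                    preloading=False, preload_offset=-1)\nprint(data.get_num_tracklets(), 'tracklets,', data.get_num_frames_total(), 'frames')\nframe = data.get_frames(0, [0])[0]  # first frame of the first tracklet\nprint(frame['pc'].nbr_points(), frame['3d_bbox'].wlh)\n"
  },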
  {
    "path": "datasets/nuscenes_data.py",
    "content": "\"\"\"\nnuscenes.py\nCreated by zenn at 2021/9/1 15:05\n\"\"\"\nimport os\n\nimport numpy as np\nimport pickle\nimport nuscenes\nfrom nuscenes.nuscenes import NuScenes\nfrom nuscenes.utils.data_classes import LidarPointCloud, Box\nfrom nuscenes.utils.splits import create_splits_scenes\n\nfrom pyquaternion import Quaternion\n\nfrom datasets import points_utils, base_dataset\nfrom datasets.data_classes import PointCloud\n\ngeneral_to_tracking_class = {\"animal\": \"void / ignore\",\n                             \"human.pedestrian.personal_mobility\": \"void / ignore\",\n                             \"human.pedestrian.stroller\": \"void / ignore\",\n                             \"human.pedestrian.wheelchair\": \"void / ignore\",\n                             \"movable_object.barrier\": \"void / ignore\",\n                             \"movable_object.debris\": \"void / ignore\",\n                             \"movable_object.pushable_pullable\": \"void / ignore\",\n                             \"movable_object.trafficcone\": \"void / ignore\",\n                             \"static_object.bicycle_rack\": \"void / ignore\",\n                             \"vehicle.emergency.ambulance\": \"void / ignore\",\n                             \"vehicle.emergency.police\": \"void / ignore\",\n                             \"vehicle.construction\": \"void / ignore\",\n                             \"vehicle.bicycle\": \"bicycle\",\n                             \"vehicle.bus.bendy\": \"bus\",\n                             \"vehicle.bus.rigid\": \"bus\",\n                             \"vehicle.car\": \"car\",\n                             \"vehicle.motorcycle\": \"motorcycle\",\n                             \"human.pedestrian.adult\": \"pedestrian\",\n                             \"human.pedestrian.child\": \"pedestrian\",\n                             \"human.pedestrian.construction_worker\": \"pedestrian\",\n                             \"human.pedestrian.police_officer\": \"pedestrian\",\n                             \"vehicle.trailer\": \"trailer\",\n                             \"vehicle.truck\": \"truck\", }\n\ntracking_to_general_class = {\n    'void / ignore': ['animal', 'human.pedestrian.personal_mobility', 'human.pedestrian.stroller',\n                      'human.pedestrian.wheelchair', 'movable_object.barrier', 'movable_object.debris',\n                      'movable_object.pushable_pullable', 'movable_object.trafficcone', 'static_object.bicycle_rack',\n                      'vehicle.emergency.ambulance', 'vehicle.emergency.police', 'vehicle.construction'],\n    'bicycle': ['vehicle.bicycle'],\n    'bus': ['vehicle.bus.bendy', 'vehicle.bus.rigid'],\n    'car': ['vehicle.car'],\n    'motorcycle': ['vehicle.motorcycle'],\n    'pedestrian': ['human.pedestrian.adult', 'human.pedestrian.child', 'human.pedestrian.construction_worker',\n                   'human.pedestrian.police_officer'],\n    'trailer': ['vehicle.trailer'],\n    'truck': ['vehicle.truck']}\n\n\nclass NuScenesDataset(base_dataset.BaseDataset):\n    def __init__(self, path, split, category_name=\"Car\", version='v1.0-trainval', **kwargs):\n        super().__init__(path, split, category_name, **kwargs)\n        self.nusc = NuScenes(version=version, dataroot=path, verbose=False)\n        self.version = version\n        self.key_frame_only = kwargs.get('key_frame_only', False)\n        self.min_points = kwargs.get('min_points', False)\n        self.preload_offset = kwargs.get('preload_offset', -1)\n        
self.track_instances = self.filter_instance(split, category_name.lower(), self.min_points)\n        self.tracklet_anno_list, self.tracklet_len_list = self._build_tracklet_anno()\n        if self.preloading:\n            self.training_samples = self._load_data()\n\n    def filter_instance(self, split, category_name=None, min_points=-1):\n        \"\"\"\n        This function is used to filter the tracklets.\n\n        split: the dataset split\n        category_name:\n        min_points: the minimum number of points in the first bbox\n        \"\"\"\n        if category_name is not None:\n            general_classes = tracking_to_general_class[category_name]\n        instances = []\n        scene_splits = nuscenes.utils.splits.create_splits_scenes()\n        for instance in self.nusc.instance:\n            anno = self.nusc.get('sample_annotation', instance['first_annotation_token'])\n            sample = self.nusc.get('sample', anno['sample_token'])\n            scene = self.nusc.get('scene', sample['scene_token'])\n            instance_category = self.nusc.get('category', instance['category_token'])['name']\n            if scene['name'] in scene_splits[split] and anno['num_lidar_pts'] >= min_points and \\\n                    (category_name is None or category_name is not None and instance_category in general_classes):\n                instances.append(instance)\n        return instances\n\n    def _build_tracklet_anno(self):\n        list_of_tracklet_anno = []\n        list_of_tracklet_len = []\n        for instance in self.track_instances:\n            track_anno = []\n            curr_anno_token = instance['first_annotation_token']\n\n            while curr_anno_token != '':\n\n                ann_record = self.nusc.get('sample_annotation', curr_anno_token)\n                sample = self.nusc.get('sample', ann_record['sample_token'])\n                sample_data_lidar = self.nusc.get('sample_data', sample['data']['LIDAR_TOP'])\n\n                curr_anno_token = ann_record['next']\n                if self.key_frame_only and not sample_data_lidar['is_key_frame']:\n                    continue\n                track_anno.append({\"sample_data_lidar\": sample_data_lidar, \"box_anno\": ann_record})\n\n            list_of_tracklet_anno.append(track_anno)\n            list_of_tracklet_len.append(len(track_anno))\n        return list_of_tracklet_anno, list_of_tracklet_len\n\n    def _load_data(self):\n        print('preloading data into memory')\n        preload_data_path = os.path.join(self.path,\n                                         f\"preload_nuscenes_{self.category_name}_{self.split}_{self.version}_{self.preload_offset}_{self.min_points}.dat\")\n        if os.path.isfile(preload_data_path):\n            print(f'loading from saved file {preload_data_path}.')\n            with open(preload_data_path, 'rb') as f:\n                training_samples = pickle.load(f)\n        else:\n            print('reading from annos')\n            training_samples = []\n            for i in range(len(self.tracklet_anno_list)):\n                frames = []\n                for anno in self.tracklet_anno_list[i]:\n                    frames.append(self._get_frame_from_anno_data(anno))\n                training_samples.append(frames)\n            with open(preload_data_path, 'wb') as f:\n                print(f'saving loaded data to {preload_data_path}')\n                pickle.dump(training_samples, f)\n        return training_samples\n\n    def get_num_tracklets(self):\n        return 
len(self.tracklet_anno_list)\n\n    def get_num_frames_total(self):\n        return sum(self.tracklet_len_list)\n\n    def get_num_frames_tracklet(self, tracklet_id):\n        return self.tracklet_len_list[tracklet_id]\n\n    def get_frames(self, seq_id, frame_ids):\n        if self.preloading:\n            frames = [self.training_samples[seq_id][f_id] for f_id in frame_ids]\n        else:\n            seq_annos = self.tracklet_anno_list[seq_id]\n            frames = [self._get_frame_from_anno_data(seq_annos[f_id]) for f_id in frame_ids]\n\n        return frames\n\n    def _get_frame_from_anno_data(self, anno):\n        sample_data_lidar = anno['sample_data_lidar']\n        box_anno = anno['box_anno']\n        bb = Box(box_anno['translation'], box_anno['size'], Quaternion(box_anno['rotation']),\n                 name=box_anno['category_name'], token=box_anno['token'])\n        pcl_path = os.path.join(self.path, sample_data_lidar['filename'])\n        pc = LidarPointCloud.from_file(pcl_path)\n\n        cs_record = self.nusc.get('calibrated_sensor', sample_data_lidar['calibrated_sensor_token'])\n        pc.rotate(Quaternion(cs_record['rotation']).rotation_matrix)\n        pc.translate(np.array(cs_record['translation']))\n\n        poserecord = self.nusc.get('ego_pose', sample_data_lidar['ego_pose_token'])\n        pc.rotate(Quaternion(poserecord['rotation']).rotation_matrix)\n        pc.translate(np.array(poserecord['translation']))\n\n        pc = PointCloud(points=pc.points)\n        if self.preload_offset > 0:\n            pc = points_utils.crop_pc_axis_aligned(pc, bb, offset=self.preload_offset)\n        return {\"pc\": pc, \"3d_bbox\": bb, 'meta': anno}\n"
  },
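  {
    "path": "examples/nuscenes_dataset_demo.py",
    "content": "'''Hypothetical example file, not part of the original release: a minimal\nsketch of the tracklet API of the nuScenes dataset class above. It assumes\n`dataset` is an already-constructed instance exposing get_num_tracklets /\nget_num_frames_tracklet / get_frames as defined above.\n'''\n\n\ndef summarize_tracklets(dataset, max_tracklets=3):\n    # walk the first few tracklets and inspect each frame dict\n    for seq_id in range(min(dataset.get_num_tracklets(), max_tracklets)):\n        num_frames = dataset.get_num_frames_tracklet(seq_id)\n        frames = dataset.get_frames(seq_id, frame_ids=list(range(num_frames)))\n        for t, frame in enumerate(frames):\n            pc, bb = frame['pc'], frame['3d_bbox']  # PointCloud, Box\n            print(seq_id, t, pc.points.shape, bb.center)\n"
  },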
  {
    "path": "datasets/points_utils.py",
    "content": "import nuscenes.utils.geometry_utils\nimport torch\nimport os\nimport copy\nimport numpy as np\nfrom pyquaternion import Quaternion\nfrom datasets.data_classes import PointCloud, Box\nfrom scipy.spatial.distance import cdist\n\n\ndef random_choice(num_samples, size, replacement=False, seed=None):\n    if seed is not None:\n        generator = torch.random.manual_seed(seed)\n    else:\n        generator = None\n    return torch.multinomial(\n        torch.ones((size), dtype=torch.float32),\n        num_samples=num_samples,\n        replacement=replacement,\n        generator=generator\n    )\n\n\ndef regularize_pc(points, sample_size, seed=None):\n    # random sampling from points\n    num_points = points.shape[0]\n    new_pts_idx = None\n    rng = np.random if seed is None else np.random.default_rng(seed)\n    if num_points > 2:\n        if num_points != sample_size:\n            new_pts_idx = rng.choice(num_points, size=sample_size, replace=sample_size > num_points)\n            # new_pts_idx = random_choice(num_samples=sample_size, size=num_points,\n            #                             replacement=sample_size > num_points, seed=seed).numpy()\n        else:\n            new_pts_idx = np.arange(num_points)\n    if new_pts_idx is not None:\n        points = points[new_pts_idx, :]\n    else:\n        points = np.zeros((sample_size, 3), dtype='float32')\n    return points, new_pts_idx\n\n\ndef getOffsetBB(box, offset, degrees=True, use_z=False, limit_box=True, inplace=False):\n    rot_quat = Quaternion(matrix=box.rotation_matrix)\n    trans = np.array(box.center)\n    if not inplace:\n        new_box = copy.deepcopy(box)\n    else:\n        new_box = box\n\n    new_box.translate(-trans)\n    new_box.rotate(rot_quat.inverse)\n    if len(offset) == 3:\n        use_z = False\n    # REMOVE TRANSfORM\n    if degrees:\n        if len(offset) == 3:\n            new_box.rotate(\n                Quaternion(axis=[0, 0, 1], degrees=offset[2]))\n        elif len(offset) == 4:\n            new_box.rotate(\n                Quaternion(axis=[0, 0, 1], degrees=offset[3]))\n    else:\n        if len(offset) == 3:\n            new_box.rotate(\n                Quaternion(axis=[0, 0, 1], radians=offset[2]))\n        elif len(offset) == 4:\n            new_box.rotate(\n                Quaternion(axis=[0, 0, 1], radians=offset[3]))\n    if limit_box:\n        if offset[0] > new_box.wlh[0]:\n            offset[0] = np.random.uniform(-1, 1)\n        if offset[1] > min(new_box.wlh[1], 2):\n            offset[1] = np.random.uniform(-1, 1)\n        if use_z and offset[2] > new_box.wlh[2]:\n            offset[2] = 0\n    if use_z:\n        new_box.translate(np.array([offset[0], offset[1], offset[2]]))\n    else:\n        new_box.translate(np.array([offset[0], offset[1], 0]))\n\n    # APPLY PREVIOUS TRANSFORMATION\n    new_box.rotate(rot_quat)\n    new_box.translate(trans)\n    return new_box\n\n\ndef getModel(PCs, boxes, offset=0, scale=1.0, normalize=False):\n    \"\"\"center and merge the object pcs in boxes\"\"\"\n    if len(PCs) == 0:\n        return PointCloud(np.ones((3, 0)))\n    points = [np.ones((PCs[0].points.shape[0], 0), dtype='float32')]\n    for PC, box in zip(PCs, boxes):\n        cropped_PC, new_box = cropAndCenterPC(PC, box, offset=offset, scale=scale, normalize=normalize)\n        # try:\n        if cropped_PC.nbr_points() > 0:\n            points.append(cropped_PC.points)\n\n    PC = PointCloud(np.concatenate(points, axis=1))\n    return PC, new_box\n\n\ndef cropAndCenterPC(PC, 
box, offset=0, scale=1.0, normalize=False):\n    \"\"\"\n    crop and center the pc using the given box\n    \"\"\"\n    new_PC = crop_pc_axis_aligned(PC, box, offset=2 * offset, scale=4 * scale)\n\n    new_box = copy.deepcopy(box)\n\n    rot_mat = np.transpose(new_box.rotation_matrix)\n    trans = -new_box.center\n\n    new_PC.translate(trans)\n    new_box.translate(trans)\n    new_PC.rotate(rot_mat)\n    new_box.rotate(Quaternion(matrix=rot_mat))\n\n    # crop around box\n    new_PC = crop_pc_axis_aligned(new_PC, new_box, offset=offset, scale=scale)\n\n    if normalize:\n        new_PC.normalize(box.wlh)\n    return new_PC, new_box\n\n\ndef get_point_to_box_distance(pc, box, wlh_factor=1.0):\n    \"\"\"\n    generate the BoxCloud for the given pc and box\n    :param pc: PointCloud object or numpy array\n    :param box: the reference Box\n    :return: (N, 9) distances from each point to the box center and its eight corners\n    \"\"\"\n    if isinstance(pc, PointCloud):\n        points = pc.points.T  # N,3\n    else:\n        points = pc  # N,3\n        assert points.shape[1] == 3\n    box_corners = box.corners(wlh_factor=wlh_factor)  # 3,8\n    box_centers = box.center.reshape(-1, 1)  # 3,1\n    box_points = np.concatenate([box_centers, box_corners], axis=1)  # 3,9\n    points2cc_dist = cdist(points, box_points.T)  # N,9\n    return points2cc_dist\n\n\ndef crop_pc_axis_aligned(PC, box, offset=0, scale=1.0, return_mask=False):\n    \"\"\"\n    crop the pc using the box in the axis-aligned manner\n    \"\"\"\n    box_tmp = copy.deepcopy(box)\n    box_tmp.wlh = box_tmp.wlh * scale\n    maxi = np.max(box_tmp.corners(), 1) + offset\n    mini = np.min(box_tmp.corners(), 1) - offset\n\n    x_filt_max = PC.points[0, :] < maxi[0]\n    x_filt_min = PC.points[0, :] > mini[0]\n    y_filt_max = PC.points[1, :] < maxi[1]\n    y_filt_min = PC.points[1, :] > mini[1]\n    z_filt_max = PC.points[2, :] < maxi[2]\n    z_filt_min = PC.points[2, :] > mini[2]\n\n    close = np.logical_and(x_filt_min, x_filt_max)\n    close = np.logical_and(close, y_filt_min)\n    close = np.logical_and(close, y_filt_max)\n    close = np.logical_and(close, z_filt_min)\n    close = np.logical_and(close, z_filt_max)\n\n    new_PC = PointCloud(PC.points[:, close])\n    if return_mask:\n        return new_PC, close\n    return new_PC\n\n\ndef crop_pc_oriented(PC, box, offset=0, scale=1.0, return_mask=False):\n    \"\"\"\n    crop the pc using the exact box.\n    slower than 'crop_pc_axis_aligned' but more accurate\n    \"\"\"\n\n    box_tmp = copy.deepcopy(box)\n    new_PC = PointCloud(PC.points.copy())\n    rot_mat = np.transpose(box_tmp.rotation_matrix)\n    trans = -box_tmp.center\n\n    # align data\n    new_PC.translate(trans)\n    box_tmp.translate(trans)\n    new_PC.rotate(rot_mat)\n    box_tmp.rotate(Quaternion(matrix=rot_mat))\n\n    box_tmp.wlh = box_tmp.wlh * scale\n    maxi = np.max(box_tmp.corners(), 1) + offset\n    mini = np.min(box_tmp.corners(), 1) - offset\n\n    x_filt_max = new_PC.points[0, :] < maxi[0]\n    x_filt_min = new_PC.points[0, :] > mini[0]\n    y_filt_max = new_PC.points[1, :] < maxi[1]\n    y_filt_min = new_PC.points[1, :] > mini[1]\n    z_filt_max = new_PC.points[2, :] < maxi[2]\n    z_filt_min = new_PC.points[2, :] > mini[2]\n\n    close = np.logical_and(x_filt_min, x_filt_max)\n    close = np.logical_and(close, y_filt_min)\n    close = np.logical_and(close, y_filt_max)\n    close = np.logical_and(close, z_filt_min)\n    close = np.logical_and(close, z_filt_max)\n\n    new_PC = PointCloud(new_PC.points[:, close])\n\n    # transform back to the original coordinate system\n    
new_PC.rotate(np.transpose(rot_mat))\n    new_PC.translate(-trans)\n    if return_mask:\n        return new_PC, close\n    return new_PC\n\n\ndef generate_subwindow(pc, sample_bb, scale, offset=2, oriented=True):\n    \"\"\"\n    generate the search area around the sample_bb\n\n    :param pc: the full point cloud\n    :param sample_bb: the reference box around which the search area is cropped\n    :param scale: the scale factor applied to the box size\n    :param offset: the extra margin around the scaled box\n    :param oriented: use oriented or axis-aligned cropping\n    :return: the cropped point cloud in the coordinate system of sample_bb\n    \"\"\"\n    rot_mat = np.transpose(sample_bb.rotation_matrix)\n    trans = -sample_bb.center\n    if oriented:\n        new_pc = PointCloud(pc.points.copy())\n        box_tmp = copy.deepcopy(sample_bb)\n\n        # transform to the coordinate system of sample_bb\n        new_pc.translate(trans)\n        box_tmp.translate(trans)\n        new_pc.rotate(rot_mat)\n        box_tmp.rotate(Quaternion(matrix=rot_mat))\n        new_pc = crop_pc_axis_aligned(new_pc, box_tmp, scale=scale, offset=offset)\n    else:\n        new_pc = crop_pc_axis_aligned(pc, sample_bb, scale=scale, offset=offset)\n\n        # transform to the coordinate system of sample_bb\n        new_pc.translate(trans)\n        new_pc.rotate(rot_mat)\n\n    return new_pc\n\n\ndef transform_box(box, ref_box, inplace=False):\n    if not inplace:\n        box = copy.deepcopy(box)\n    box.translate(-ref_box.center)\n    box.rotate(Quaternion(matrix=ref_box.rotation_matrix.T))\n    return box\n\n\ndef transform_pc(pc, ref_box, inplace=False):\n    if not inplace:\n        pc = copy.deepcopy(pc)\n    pc.translate(-ref_box.center)\n    pc.rotate(ref_box.rotation_matrix.T)\n    return pc\n\n\ndef get_in_box_mask(PC, box):\n    \"\"\"check which points of PC are inside the box\"\"\"\n    box_tmp = copy.deepcopy(box)\n    new_PC = PointCloud(PC.points.copy())\n    rot_mat = np.transpose(box_tmp.rotation_matrix)\n    trans = -box_tmp.center\n\n    # align data\n    new_PC.translate(trans)\n    box_tmp.translate(trans)\n    new_PC.rotate(rot_mat)\n    box_tmp.rotate(Quaternion(matrix=rot_mat))\n    maxi = np.max(box_tmp.corners(), 1)\n    mini = np.min(box_tmp.corners(), 1)\n\n    x_filt_max = new_PC.points[0, :] < maxi[0]\n    x_filt_min = new_PC.points[0, :] > mini[0]\n    y_filt_max = new_PC.points[1, :] < maxi[1]\n    y_filt_min = new_PC.points[1, :] > mini[1]\n    z_filt_max = new_PC.points[2, :] < maxi[2]\n    z_filt_min = new_PC.points[2, :] > mini[2]\n\n    close = np.logical_and(x_filt_min, x_filt_max)\n    close = np.logical_and(close, y_filt_min)\n    close = np.logical_and(close, y_filt_max)\n    close = np.logical_and(close, z_filt_min)\n    close = np.logical_and(close, z_filt_max)\n    return close\n\n\ndef apply_transform(in_box_pc, box, translation, rotation, flip_x, flip_y, rotation_axis=(0, 0, 1)):\n    \"\"\"\n    Apply a transformation to the box and the points inside it. in_box_pc should only contain points inside the given box.\n    :param in_box_pc: PointCloud object\n    :param box: Box object\n    :param flip_y: boolean\n    :param flip_x: boolean\n    :param rotation_axis: 3-element tuple. The rotation axis\n    :param translation: <np.float: 3, 1>. Translation in x, y, z direction.\n    :param rotation: float. 
rotation in degrees\n    :return:\n    \"\"\"\n\n    # get inverse transform\n    rot_mat = box.rotation_matrix\n    trans = box.center\n\n    new_box = copy.deepcopy(box)\n    new_pc = copy.deepcopy(in_box_pc)\n\n    new_pc.translate(-trans)\n    new_box.translate(-trans)\n    new_pc.rotate(rot_mat.T)\n    new_box.rotate(Quaternion(matrix=rot_mat.T))\n\n    if flip_x:\n        new_pc.points[0, :] = -new_pc.points[0, :]\n        # rotate the box to make sure that the x-axis points to the head\n        new_box.rotate(Quaternion(axis=[0, 0, 1], degrees=180))\n    if flip_y:\n        new_pc.points[1, :] = -new_pc.points[1, :]\n\n    # apply rotation\n    rot_quat = Quaternion(axis=rotation_axis, degrees=rotation)\n    new_box.rotate(rot_quat)\n    new_pc.rotate(rot_quat.rotation_matrix)\n\n    # apply translation\n    new_box.translate(translation)\n    new_pc.translate(translation)\n\n    # transform back\n    new_box.rotate(Quaternion(matrix=rot_mat))\n    new_pc.rotate(rot_mat)\n    new_box.translate(trans)\n    new_pc.translate(trans)\n    return new_pc, new_box\n\n\ndef apply_augmentation(pc, box, wlh_factor=1.25):\n    in_box_mask = nuscenes.utils.geometry_utils.points_in_box(box, pc.points, wlh_factor=wlh_factor)\n    # in_box_mask = get_in_box_mask(pc, box)\n    in_box_pc = PointCloud(pc.points[:, in_box_mask])\n\n    rand_trans = np.random.uniform(low=-0.3, high=0.3, size=3)\n    rand_rot = np.random.uniform(low=-10, high=10)\n    flip_x, flip_y = np.random.choice([True, False], size=2, replace=True)\n\n    new_in_box_pc, new_box = apply_transform(in_box_pc, box, rand_trans, rand_rot, flip_x, flip_y)\n\n    new_pc = copy.deepcopy(pc)\n    new_pc.points[:, in_box_mask] = new_in_box_pc.points\n    return new_pc, new_box\n\n\ndef roty_batch_tensor(t):\n    input_shape = t.shape\n    output = torch.zeros(tuple(list(input_shape) + [3, 3]), dtype=torch.float32, device=t.device)\n    c = torch.cos(t)\n    s = torch.sin(t)\n    output[..., 0, 0] = c\n    output[..., 0, 2] = s\n    output[..., 1, 1] = 1\n    output[..., 2, 0] = -s\n    output[..., 2, 2] = c\n    return output\n\n\ndef rotz_batch_tensor(t):\n    input_shape = t.shape\n    output = torch.zeros(tuple(list(input_shape) + [3, 3]), dtype=torch.float32, device=t.device)\n    c = torch.cos(t)\n    s = torch.sin(t)\n    output[..., 0, 0] = c\n    output[..., 0, 1] = -s\n    output[..., 1, 0] = s\n    output[..., 1, 1] = c\n    output[..., 2, 2] = 1\n    return output\n\n\ndef get_offset_points_tensor(points, ref_box_params, offset_box_params):\n    \"\"\"\n\n    :param points: B,N,3\n    :param ref_box_params: B,4\n    :param offset_box_params: B,4\n    :return: B,N,3\n    \"\"\"\n    ref_center = ref_box_params[:, :3]\n    ref_rot_angles = ref_box_params[:, -1]\n    offset_center = offset_box_params[:, :3]\n    offset_rot_angles = offset_box_params[:, -1]\n\n    # transform to object coordinate system defined by the ref_box_params\n    rot_mat = rotz_batch_tensor(-ref_rot_angles)  # B,3,3\n    points -= ref_center[:, None, :]  # B,N,3\n    points = torch.matmul(points, rot_mat.transpose(1, 2))\n\n    # apply the offset\n    rot_mat_offset = rotz_batch_tensor(offset_rot_angles)\n    points = torch.matmul(points, rot_mat_offset.transpose(1, 2))\n    points += offset_center[:, None, :]\n\n    # transform back to the world coordinate system\n    points = torch.matmul(points, rot_mat)\n    points += ref_center[:, None, :]\n    return points\n\n\ndef get_offset_box_tensor(ref_box_params, offset_box_params):\n    \"\"\"\n    transform the ref_box with the given offset\n    :param 
ref_box_params: B,4\n    :param offset_box_params: B,4\n    :return: B,4\n    \"\"\"\n    ref_center = ref_box_params[:, :3]  # B,3\n    ref_rot_angles = ref_box_params[:, -1]  # B,\n    offset_center = offset_box_params[:, :3]\n    offset_rot_angles = offset_box_params[:, -1]\n    rot_mat = rotz_batch_tensor(ref_rot_angles)  # B,3,3\n\n    new_center = torch.matmul(rot_mat, offset_center[..., None]).squeeze(dim=-1)  # B,3\n    new_center += ref_center\n    new_angle = ref_rot_angles + offset_rot_angles\n    return torch.cat([new_center, new_angle[:, None]], dim=-1)\n\n\ndef remove_transform_points_tensor(points, ref_box_params):\n    \"\"\"\n\n    :param points: B,N,3\n    :param ref_box_params: B,4\n    :return:\n    \"\"\"\n    ref_center = ref_box_params[:, :3]\n    ref_rot_angles = ref_box_params[:, -1]\n\n    # transform to object coordinate system defined by the ref_box_params\n    rot_mat = rotz_batch_tensor(-ref_rot_angles)  # B,3,3\n    points -= ref_center[:, None, :]  # B,N,3\n    points = torch.matmul(points, rot_mat.transpose(1, 2))\n    return points\n\n\ndef np_to_torch_tensor(data, device=None):\n    return torch.tensor(data, device=device).unsqueeze(dim=0)\n\n"
  },
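  {
    "path": "examples/points_utils_demo.py",
    "content": "'''Hypothetical example file, not part of the original release: a runnable\nsketch of the box/point-cloud helpers in datasets/points_utils.py, applied\nto synthetic data.\n'''\nimport numpy as np\nfrom pyquaternion import Quaternion\n\nfrom datasets import points_utils\nfrom datasets.data_classes import Box, PointCloud\n\npc = PointCloud(np.random.uniform(-3, 3, size=(3, 2048)))  # points are stored as 3xN\nbox = Box([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], Quaternion(axis=[0, 0, 1], degrees=30))\n\n# axis-aligned crop with a 1m margin around the box\ncropped = points_utils.crop_pc_axis_aligned(pc, box, offset=1.0)\n\n# perturb the box by (dx, dy, d_theta); with degrees=True the angle is in degrees\nnoisy_box = points_utils.getOffsetBB(box, [0.2, -0.1, 5.0], degrees=True)\n\n# BoxCloud: distance of every point to the box center and its 8 corners -> (N, 9)\nbc = points_utils.get_point_to_box_distance(cropped.points.T, box)\nprint(cropped.points.shape, bc.shape)\n"
  },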
  {
    "path": "datasets/sampler.py",
    "content": "# Created by zenn at 2021/4/27\n\nimport numpy as np\nimport torch\nfrom easydict import EasyDict\nfrom nuscenes.utils import geometry_utils\n\nimport datasets.points_utils as points_utils\nfrom datasets.searchspace import KalmanFiltering\n\n\ndef no_processing(data, *args):\n    return data\n\n\ndef siamese_processing(data, config, template_transform=None, search_transform=None):\n    \"\"\"\n\n    :param data:\n    :param config: {model_bb_scale,model_bb_offset,search_bb_scale, search_bb_offset}\n    :return:\n    \"\"\"\n    first_frame = data['first_frame']\n    template_frame = data['template_frame']\n    search_frame = data['search_frame']\n    candidate_id = data['candidate_id']\n    first_pc, first_box = first_frame['pc'], first_frame['3d_bbox']\n    template_pc, template_box = template_frame['pc'], template_frame['3d_bbox']\n    search_pc, search_box = search_frame['pc'], search_frame['3d_bbox']\n    if template_transform is not None:\n        template_pc, template_box = template_transform(template_pc, template_box)\n        first_pc, first_box = template_transform(first_pc, first_box)\n    if search_transform is not None:\n        search_pc, search_box = search_transform(search_pc, search_box)\n    # generating template. Merging the object from previous and the first frames.\n    if candidate_id == 0:\n        samplegt_offsets = np.zeros(3)\n    else:\n        samplegt_offsets = np.random.uniform(low=-0.3, high=0.3, size=3)\n        samplegt_offsets[2] = samplegt_offsets[2] * (5 if config.degrees else np.deg2rad(5))\n    template_box = points_utils.getOffsetBB(template_box, samplegt_offsets, limit_box=config.data_limit_box,\n                                            degrees=config.degrees)\n    model_pc, model_box = points_utils.getModel([first_pc, template_pc], [first_box, template_box],\n                                                scale=config.model_bb_scale, offset=config.model_bb_offset)\n\n    assert model_pc.nbr_points() > 20, 'not enough template points'\n\n    # generating search area. 
Use the current gt box to select the nearby region as the search area.\n\n    if candidate_id == 0 and config.num_candidates > 1:\n        sample_offset = np.zeros(3)\n    else:\n        gaussian = KalmanFiltering(bnd=[1, 1, (5 if config.degrees else np.deg2rad(5))])\n        sample_offset = gaussian.sample(1)[0]\n    sample_bb = points_utils.getOffsetBB(search_box, sample_offset, limit_box=config.data_limit_box,\n                                         degrees=config.degrees)\n    search_pc_crop = points_utils.generate_subwindow(search_pc, sample_bb,\n                                                     scale=config.search_bb_scale, offset=config.search_bb_offset)\n    assert search_pc_crop.nbr_points() > 20, 'not enough search points'\n    search_box = points_utils.transform_box(search_box, sample_bb)\n    seg_label = points_utils.get_in_box_mask(search_pc_crop, search_box).astype(int)\n    search_bbox_reg = [search_box.center[0], search_box.center[1], search_box.center[2], -sample_offset[2]]\n\n    template_points, idx_t = points_utils.regularize_pc(model_pc.points.T, config.template_size)\n    search_points, idx_s = points_utils.regularize_pc(search_pc_crop.points.T, config.search_size)\n    seg_label = seg_label[idx_s]\n    data_dict = {\n        'template_points': template_points.astype('float32'),\n        'search_points': search_points.astype('float32'),\n        'box_label': np.array(search_bbox_reg).astype('float32'),\n        'bbox_size': search_box.wlh,\n        'seg_label': seg_label.astype('float32'),\n    }\n    if getattr(config, 'box_aware', False):\n        template_bc = points_utils.get_point_to_box_distance(template_points, model_box)\n        search_bc = points_utils.get_point_to_box_distance(search_points, search_box)\n        data_dict.update({'points2cc_dist_t': template_bc.astype('float32'),\n                          'points2cc_dist_s': search_bc.astype('float32'), })\n    return data_dict\n\n\ndef motion_processing(data, config, template_transform=None, search_transform=None):\n    \"\"\"\n\n    :param data:\n    :param config: {model_bb_scale,model_bb_offset,search_bb_scale, search_bb_offset}\n    :return:\n    point_sample_size\n    bb_scale\n    bb_offset\n    \"\"\"\n    prev_frame = data['prev_frame']\n    this_frame = data['this_frame']\n    candidate_id = data['candidate_id']\n    prev_pc, prev_box = prev_frame['pc'], prev_frame['3d_bbox']\n    this_pc, this_box = this_frame['pc'], this_frame['3d_bbox']\n\n    num_points_in_prev_box = geometry_utils.points_in_box(prev_box, prev_pc.points).sum()\n    assert num_points_in_prev_box > 10, 'not enough target points'\n\n    if template_transform is not None:\n        prev_pc, prev_box = template_transform(prev_pc, prev_box)\n    if search_transform is not None:\n        this_pc, this_box = search_transform(this_pc, this_box)\n\n    if candidate_id == 0:\n        sample_offsets = np.zeros(3)\n    else:\n        sample_offsets = np.random.uniform(low=-0.3, high=0.3, size=3)\n        sample_offsets[2] = sample_offsets[2] * (5 if config.degrees else np.deg2rad(5))\n    ref_box = points_utils.getOffsetBB(prev_box, sample_offsets, limit_box=config.data_limit_box,\n                                       degrees=config.degrees)\n    prev_frame_pc = points_utils.generate_subwindow(prev_pc, ref_box,\n                                                    scale=config.bb_scale,\n                                                    offset=config.bb_offset)\n\n    this_frame_pc = points_utils.generate_subwindow(this_pc, 
ref_box,\n                                                    scale=config.bb_scale,\n                                                    offset=config.bb_offset)\n    assert this_frame_pc.nbr_points() > 20, 'not enough search points'\n\n    this_box = points_utils.transform_box(this_box, ref_box)\n    prev_box = points_utils.transform_box(prev_box, ref_box)\n    ref_box = points_utils.transform_box(ref_box, ref_box)\n    motion_box = points_utils.transform_box(this_box, prev_box)\n\n    prev_points, idx_prev = points_utils.regularize_pc(prev_frame_pc.points.T, config.point_sample_size)\n    this_points, idx_this = points_utils.regularize_pc(this_frame_pc.points.T, config.point_sample_size)\n\n    seg_label_this = geometry_utils.points_in_box(this_box, this_points.T, 1.25).astype(int)\n    seg_label_prev = geometry_utils.points_in_box(prev_box, prev_points.T, 1.25).astype(int)\n    seg_mask_prev = geometry_utils.points_in_box(ref_box, prev_points.T, 1.25).astype(float)\n    if candidate_id != 0:\n        # Here we use 0.2/0.8 instead of 0/1 to indicate that the previous box is not GT.\n        # When boxcloud is used, the actual value of prior-targetness mask doesn't really matter.\n        seg_mask_prev[seg_mask_prev == 0] = 0.2\n        seg_mask_prev[seg_mask_prev == 1] = 0.8\n    seg_mask_this = np.full(seg_mask_prev.shape, fill_value=0.5)\n\n    timestamp_prev = np.full((config.point_sample_size, 1), fill_value=0)\n    timestamp_this = np.full((config.point_sample_size, 1), fill_value=0.1)\n\n    prev_points = np.concatenate([prev_points, timestamp_prev, seg_mask_prev[:, None]], axis=-1)\n    this_points = np.concatenate([this_points, timestamp_this, seg_mask_this[:, None]], axis=-1)\n\n    stack_points = np.concatenate([prev_points, this_points], axis=0)\n    stack_seg_label = np.hstack([seg_label_prev, seg_label_this])\n    theta_this = this_box.orientation.degrees * this_box.orientation.axis[-1] if config.degrees else \\\n        this_box.orientation.radians * this_box.orientation.axis[-1]\n    box_label = np.append(this_box.center, theta_this).astype('float32')\n    theta_prev = prev_box.orientation.degrees * prev_box.orientation.axis[-1] if config.degrees else \\\n        prev_box.orientation.radians * prev_box.orientation.axis[-1]\n    box_label_prev = np.append(prev_box.center, theta_prev).astype('float32')\n    theta_motion = motion_box.orientation.degrees * motion_box.orientation.axis[-1] if config.degrees else \\\n        motion_box.orientation.radians * motion_box.orientation.axis[-1]\n    motion_label = np.append(motion_box.center, theta_motion).astype('float32')\n\n    motion_state_label = np.sqrt(np.sum((this_box.center - prev_box.center) ** 2)) > config.motion_threshold\n\n    data_dict = {\n        'points': stack_points.astype('float32'),\n        'box_label': box_label,\n        'box_label_prev': box_label_prev,\n        'motion_label': motion_label,\n        'motion_state_label': motion_state_label.astype('int'),\n        'bbox_size': this_box.wlh,\n        'seg_label': stack_seg_label.astype('int'),\n    }\n\n    if getattr(config, 'box_aware', False):\n        prev_bc = points_utils.get_point_to_box_distance(stack_points[:config.point_sample_size, :3], prev_box)\n        this_bc = points_utils.get_point_to_box_distance(stack_points[config.point_sample_size:, :3], this_box)\n        candidate_bc_prev = points_utils.get_point_to_box_distance(stack_points[:config.point_sample_size, :3], ref_box)\n        candidate_bc_this = np.zeros_like(candidate_bc_prev)\n        
candidate_bc = np.concatenate([candidate_bc_prev, candidate_bc_this], axis=0)\n\n        data_dict.update({'prev_bc': prev_bc.astype('float32'),\n                          'this_bc': this_bc.astype('float32'),\n                          'candidate_bc': candidate_bc.astype('float32')})\n    return data_dict\n\n\nclass PointTrackingSampler(torch.utils.data.Dataset):\n    def __init__(self, dataset, random_sample, sample_per_epoch=10000, processing=siamese_processing, config=None,\n                 **kwargs):\n        if config is None:\n            config = EasyDict(kwargs)\n        self.sample_per_epoch = sample_per_epoch\n        self.dataset = dataset\n        self.processing = processing\n        self.config = config\n        self.random_sample = random_sample\n        self.num_candidates = getattr(config, 'num_candidates', 1)\n        if getattr(self.config, \"use_augmentation\", False):\n            print('using augmentation')\n            self.transform = points_utils.apply_augmentation\n        else:\n            self.transform = None\n        if not self.random_sample:\n            num_frames_total = 0\n            self.tracklet_start_ids = [num_frames_total]\n            for i in range(dataset.get_num_tracklets()):\n                num_frames_total += dataset.get_num_frames_tracklet(i)\n                self.tracklet_start_ids.append(num_frames_total)\n\n    def get_anno_index(self, index):\n        return index // self.num_candidates\n\n    def get_candidate_index(self, index):\n        return index % self.num_candidates\n\n    def __len__(self):\n        if self.random_sample:\n            return self.sample_per_epoch * self.num_candidates\n        else:\n            return self.dataset.get_num_frames_total() * self.num_candidates\n\n    def __getitem__(self, index):\n        anno_id = self.get_anno_index(index)\n        candidate_id = self.get_candidate_index(index)\n        try:\n            if self.random_sample:\n                tracklet_id = torch.randint(0, self.dataset.get_num_tracklets(), size=(1,)).item()\n                tracklet_annos = self.dataset.tracklet_anno_list[tracklet_id]\n                frame_ids = [0] + points_utils.random_choice(num_samples=2, size=len(tracklet_annos)).tolist()\n            else:\n                for i in range(0, self.dataset.get_num_tracklets()):\n                    if self.tracklet_start_ids[i] <= anno_id < self.tracklet_start_ids[i + 1]:\n                        tracklet_id = i\n                        this_frame_id = anno_id - self.tracklet_start_ids[i]\n                        prev_frame_id = max(this_frame_id - 1, 0)\n                        frame_ids = (0, prev_frame_id, this_frame_id)\n            first_frame, template_frame, search_frame = self.dataset.get_frames(tracklet_id, frame_ids=frame_ids)\n            data = {\"first_frame\": first_frame,\n                    \"template_frame\": template_frame,\n                    \"search_frame\": search_frame,\n                    \"candidate_id\": candidate_id}\n\n            return self.processing(data, self.config,\n                                   template_transform=None,\n                                   search_transform=self.transform)\n        except AssertionError:\n            return self[torch.randint(0, len(self), size=(1,)).item()]\n\n\nclass TestTrackingSampler(torch.utils.data.Dataset):\n    def __init__(self, dataset, config=None, **kwargs):\n        if config is None:\n            config = EasyDict(kwargs)\n        self.dataset = dataset\n        self.config = 
config\n\n    def __len__(self):\n        return self.dataset.get_num_tracklets()\n\n    def __getitem__(self, index):\n        tracklet_annos = self.dataset.tracklet_anno_list[index]\n        frame_ids = list(range(len(tracklet_annos)))\n        return self.dataset.get_frames(index, frame_ids)\n\n\nclass MotionTrackingSampler(PointTrackingSampler):\n    def __init__(self, dataset, config=None, **kwargs):\n        super().__init__(dataset, random_sample=False, config=config, **kwargs)\n        self.processing = motion_processing\n\n    def __getitem__(self, index):\n        anno_id = self.get_anno_index(index)\n        candidate_id = self.get_candidate_index(index)\n        try:\n\n            for i in range(0, self.dataset.get_num_tracklets()):\n                if self.tracklet_start_ids[i] <= anno_id < self.tracklet_start_ids[i + 1]:\n                    tracklet_id = i\n                    this_frame_id = anno_id - self.tracklet_start_ids[i]\n                    prev_frame_id = max(this_frame_id - 1, 0)\n                    frame_ids = (0, prev_frame_id, this_frame_id)\n            first_frame, prev_frame, this_frame = self.dataset.get_frames(tracklet_id, frame_ids=frame_ids)\n            data = {\n                \"first_frame\": first_frame,\n                \"prev_frame\": prev_frame,\n                \"this_frame\": this_frame,\n                \"candidate_id\": candidate_id}\n            return self.processing(data, self.config,\n                                   template_transform=self.transform,\n                                   search_transform=self.transform)\n        except AssertionError:\n            return self[torch.randint(0, len(self), size=(1,)).item()]\n"
  },
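  {
    "path": "examples/sampler_config_demo.py",
    "content": "'''Hypothetical example file, not part of the original release: a sketch of\nthe config keys that siamese_processing / PointTrackingSampler above consume.\nThe values here are illustrative; real values come from the .yaml config files.\n'''\nfrom easydict import EasyDict\n\nfrom datasets.sampler import PointTrackingSampler\n\ncfg = EasyDict(\n    model_bb_scale=1.25, model_bb_offset=2.0,    # template crop\n    search_bb_scale=1.25, search_bb_offset=2.0,  # search-area crop\n    template_size=512, search_size=1024,         # points kept after resampling\n    degrees=True, data_limit_box=True,\n    num_candidates=4,                            # candidates per annotated frame\n)\n\n# `dataset` stands for any dataset object from datasets/ exposing get_frames etc.:\n# sampler = PointTrackingSampler(dataset, random_sample=False, config=cfg)\n# each item is a dict with template_points / search_points / box_label / seg_label\n"
  },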
  {
    "path": "datasets/searchspace.py",
    "content": "import numpy as np\nfrom pomegranate import MultivariateGaussianDistribution, GeneralMixtureModel\nimport logging\n\n\nclass SearchSpace(object):\n\n    def reset(self):\n        raise NotImplementedError\n\n    def sample(self):\n        raise NotImplementedError\n\n    def addData(self, data, score):\n        return\n\n\nclass ExhaustiveSearch(SearchSpace):\n\n    def __init__(self,\n                 search_space=[[-3.0, 3.0], [-3.0, 3.0], [-10.0, 10.0]],\n                 search_dims=[7, 7, 3]):\n\n        x_space = np.linspace(\n            search_space[0][0], search_space[0][1],\n            search_dims[0])\n\n        y_space = np.linspace(\n            search_space[1][0], search_space[1][1],\n            search_dims[1])\n\n        a_space = np.linspace(\n            search_space[2][0], search_space[2][1],\n            search_dims[2])\n\n        X, Y, A = np.meshgrid(x_space, y_space, a_space)  # create mesh grid\n\n        self.search_grid = np.array([X.flatten(), Y.flatten(), A.flatten()]).T\n\n        self.reset()\n\n    def reset(self):\n        return\n\n    def sample(self, n=0):\n        return self.search_grid\n\n\nclass ParticleFiltering(SearchSpace):\n    def __init__(self, bnd=[1, 1, 10]):\n        self.bnd = bnd\n        self.reset()\n\n    def sample(self, n=10):\n        samples = []\n        for i in range(n):\n            if len(self.data) > 0:\n                i_mean = np.random.choice(\n                    list(range(len(self.data))),\n                    p=self.score / np.linalg.norm(self.score, ord=1))\n                sample = np.random.multivariate_normal(\n                    mean=self.data[i_mean], cov=np.diag(np.array(self.bnd)))\n            else:\n                sample = np.random.multivariate_normal(\n                    mean=np.zeros(len(self.bnd)),\n                    cov=np.diag(np.array(self.bnd) * 3))\n\n            samples.append(sample)\n        return np.array(samples)\n\n    def addData(self, data, score):\n        score = score.clip(min=1e-5)  # prevent sum=0 in case of bad scores\n        self.data = data\n        self.score = score\n\n    def reset(self):\n        if len(self.bnd) == 2:\n            self.data = np.array([[], []]).T\n        else:\n            self.data = np.array([[], [], []]).T\n        self.score = np.ones(np.shape(self.data)[0])\n        self.score = self.score / np.linalg.norm(self.score, ord=1)\n\n\nclass KalmanFiltering(SearchSpace):\n    def __init__(self, bnd=[1, 1, 10]):\n        self.bnd = bnd\n        self.reset()\n\n    def sample(self, n=10):\n        return np.random.multivariate_normal(self.mean, self.cov, size=n)\n\n    def addData(self, data, score):\n        score = score.clip(min=1e-5)  # prevent sum=0 in case of bad scores\n        self.data = np.concatenate((self.data, data))\n        self.score = np.concatenate((self.score, score))\n        self.mean = np.average(self.data, weights=self.score, axis=0)\n        self.cov = np.cov(self.data.T, ddof=0, aweights=self.score)\n\n    def reset(self):\n        self.mean = np.zeros(len(self.bnd))\n        self.cov = np.diag(self.bnd)\n        if len(self.bnd) == 2:\n            self.data = np.array([[], []]).T\n        else:\n            self.data = np.array([[], [], []]).T\n        self.score = np.array([])\n\n\nclass GaussianMixtureModel(SearchSpace):\n\n    def __init__(self, n_comp=5, dim=3):\n        self.dim = dim\n        self.reset(n_comp)\n\n    def sample(self, n=10):\n        try:\n            X1 = 
np.stack(self.model.sample(int(np.round(0.8 * n))))\n            if self.dim == 2:\n                mean = np.mean(X1, axis=0)\n                std = np.diag([1.0, 1.0])\n                gmm = MultivariateGaussianDistribution(mean, std)\n                X2 = np.stack(gmm.sample(int(np.round(0.1 * n))))\n\n                mean = np.mean(X1, axis=0)\n                std = np.diag([1e-3, 1e-3])\n                gmm = MultivariateGaussianDistribution(mean, std)\n                X3 = np.stack(gmm.sample(int(np.round(0.1 * n))))\n\n            else:\n                mean = np.mean(X1, axis=0)\n                std = np.diag([1.0, 1.0, 1e-3])\n                gmm = MultivariateGaussianDistribution(mean, std)\n                X2 = np.stack(gmm.sample(int(np.round(0.1 * n))))\n\n                mean = np.mean(X1, axis=0)\n                std = np.diag([1e-3, 1e-3, 10.0])\n                gmm = MultivariateGaussianDistribution(mean, std)\n                X3 = np.stack(gmm.sample(int(np.round(0.1 * n))))\n\n            X = np.concatenate((X1, X2, X3))\n\n        except ValueError:\n            print(\"exception caught on sampling\")\n            if self.dim == 2:\n                mean = np.zeros(self.dim)\n                std = np.diag([1.0, 1.0])\n                gmm = MultivariateGaussianDistribution(mean, std)\n                X = gmm.sample(int(n))\n            else:\n                mean = np.zeros(self.dim)\n                std = np.diag([1.0, 1.0, 5.0])\n                gmm = MultivariateGaussianDistribution(mean, std)\n                X = gmm.sample(int(n))\n        return X\n\n    def addData(self, data, score):\n        score = score.clip(min=1e-5)\n        self.data = data\n        self.score = score\n\n        score_normed = self.score / np.linalg.norm(self.score, ord=1)\n        try:\n            model = GeneralMixtureModel.from_samples(\n                MultivariateGaussianDistribution,\n                n_components=self.n_comp,\n                X=self.data,\n                weights=score_normed)\n            self.model = model\n        except Exception:\n            logging.info(\"caught an exception\")\n\n    def reset(self, n_comp=5):\n        self.n_comp = n_comp\n\n        if self.dim == 2:\n            self.data = np.array([[], []]).T\n        else:\n            self.data = np.array([[], [], []]).T\n        self.score = np.ones(np.shape(self.data)[0])\n        self.score = self.score / np.linalg.norm(self.score, ord=1)\n        if self.dim == 2:\n            self.model = MultivariateGaussianDistribution(\n                np.zeros(self.dim), np.diag([1.0, 1.0]))\n        else:\n            self.model = MultivariateGaussianDistribution(\n                np.zeros(self.dim), np.diag([1.0, 1.0, 5.0]))\n"
  },
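  {
    "path": "examples/searchspace_demo.py",
    "content": "'''Hypothetical example file, not part of the original release: a runnable\nsketch of the KalmanFiltering search space above, which adapts a Gaussian\nover (x, y, angle) offsets from scored samples.\n'''\nimport numpy as np\n\nfrom datasets.searchspace import KalmanFiltering\n\nss = KalmanFiltering(bnd=[1, 1, 10])  # bnd sets the initial diagonal covariance\noffsets = ss.sample(n=16)             # (16, 3) Gaussian offset candidates\n\n# pretend that offsets closer to the origin scored higher\nscores = np.exp(-np.linalg.norm(offsets, axis=1))\nss.addData(offsets, scores)           # refits the weighted mean/covariance\n\nrefined = ss.sample(n=16)             # now drawn from the fitted Gaussian\nprint(offsets.shape, refined.shape)\n"
  },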
  {
    "path": "datasets/utils.py",
    "content": "#!/usr/bin/env python\n# encoding: utf-8\n'''\n@author: Xu Yan\n@file: utils.py\n@time: 2021/10/21 21:45\n'''\nimport numpy as np\n\ndef roty(t):\n    \"\"\"Rotation about the y-axis.\"\"\"\n    c = np.cos(t)\n    s = np.sin(t)\n    return np.array([[c, -s,  0],\n                     [s,  c,  0],\n                     [0,  0,  1]])\n\ndef get_3d_box(box_size, heading_angle, center):\n    ''' box_size is array(l,w,h), heading_angle is radius clockwise from pos x axis, center is xyz of box center\n        output (8,3) array for 3D box cornders\n        Similar to utils/compute_orientation_3d\n    '''\n    R = roty(heading_angle)\n    l,w,h = box_size\n    # x_corners = [l/2,l/2,-l/2,-l/2,l/2,l/2,-l/2,-l/2]\n    # y_corners = [h/2,h/2,h/2,h/2,-h/2,-h/2,-h/2,-h/2]\n    # z_corners = [w/2,-w/2,-w/2,w/2,w/2,-w/2,-w/2,w/2]\n    x_corners = [l/2,l/2,-l/2,-l/2,l/2,l/2,-l/2,-l/2]\n    y_corners = [w/2,-w/2,-w/2,w/2,w/2,-w/2,-w/2,w/2]\n    z_corners = [h/2,h/2,h/2,h/2,-h/2,-h/2,-h/2,-h/2]\n    corners_3d = np.dot(R, np.vstack([x_corners,y_corners,z_corners]))\n    corners_3d[0,:] = corners_3d[0,:] + center[0]\n    corners_3d[1,:] = corners_3d[1,:] + center[1]\n    corners_3d[2,:] = corners_3d[2,:] + center[2]\n    corners_3d = np.transpose(corners_3d)\n    return corners_3d\n\n\ndef write_ply(verts, colors, indices, output_file):\n    if colors is None:\n        colors = np.zeros_like(verts)\n    if indices is None:\n        indices = []\n\n    file = open(output_file, 'w')\n    file.write('ply \\n')\n    file.write('format ascii 1.0\\n')\n    file.write('element vertex {:d}\\n'.format(len(verts)))\n    file.write('property float x\\n')\n    file.write('property float y\\n')\n    file.write('property float z\\n')\n    file.write('property uchar red\\n')\n    file.write('property uchar green\\n')\n    file.write('property uchar blue\\n')\n    file.write('element face {:d}\\n'.format(len(indices)))\n    file.write('property list uchar uint vertex_indices\\n')\n    file.write('end_header\\n')\n    for vert, color in zip(verts, colors):\n        file.write(\"{:f} {:f} {:f} {:d} {:d} {:d}\\n\".format(vert[0], vert[1], vert[2], int(color[0] * 255),\n                                                            int(color[1] * 255), int(color[2] * 255)))\n    for ind in indices:\n        file.write('3 {:d} {:d} {:d}\\n'.format(ind[0], ind[1], ind[2]))\n    file.close()\n\n\ndef box2obj(box, objname):\n    corners = box.corners().T\n    with open(objname, 'w') as f:\n        for corner in corners:\n            f.write('v %f %f %f\\n' % (corner[0], corner[1], corner[2]))\n        f.write('f %d %d %d %d\\n' % (1, 2, 3, 4))\n        f.write('f %d %d %d %d\\n' % (5, 6, 7, 8))\n        f.write('f %d %d %d %d\\n' % (1, 5, 8, 4))\n        f.write('f %d %d %d %d\\n' % (2, 6, 7, 3))\n        f.write('f %d %d %d %d\\n' % (1, 2, 6, 5))\n        f.write('f %d %d %d %d\\n' % (4, 3, 7, 8))\n\n\ndef write_bbox(corners, mode, output_file):\n    \"\"\"\n    bbox: (cx, cy, cz, lx, ly, lz, r), center and length in three axis, the last is the rotation\n    output_file: string\n    \"\"\"\n\n    def create_cylinder_mesh(radius, p0, p1, stacks=10, slices=10):\n\n        import math\n\n        def compute_length_vec3(vec3):\n            return math.sqrt(vec3[0] * vec3[0] + vec3[1] * vec3[1] + vec3[2] * vec3[2])\n\n        def rotation(axis, angle):\n            rot = np.eye(4)\n            c = np.cos(-angle)\n            s = np.sin(-angle)\n            t = 1.0 - c\n            axis /= compute_length_vec3(axis)\n       
     x = axis[0]\n            y = axis[1]\n            z = axis[2]\n            rot[0, 0] = 1 + t * (x * x - 1)\n            rot[0, 1] = z * s + t * x * y\n            rot[0, 2] = -y * s + t * x * z\n            rot[1, 0] = -z * s + t * x * y\n            rot[1, 1] = 1 + t * (y * y - 1)\n            rot[1, 2] = x * s + t * y * z\n            rot[2, 0] = y * s + t * x * z\n            rot[2, 1] = -x * s + t * y * z\n            rot[2, 2] = 1 + t * (z * z - 1)\n            return rot\n\n        verts = []\n        indices = []\n        diff = (p1 - p0).astype(np.float32)\n        height = compute_length_vec3(diff)\n        for i in range(stacks + 1):\n            for i2 in range(slices):\n                theta = i2 * 2.0 * math.pi / slices\n                pos = np.array([radius * math.cos(theta), radius * math.sin(theta), height * i / stacks])\n                verts.append(pos)\n        for i in range(stacks):\n            for i2 in range(slices):\n                i2p1 = math.fmod(i2 + 1, slices)\n                indices.append(np.array([(i + 1) * slices + i2, i * slices + i2, i * slices + i2p1], dtype=np.uint32))\n                indices.append(\n                    np.array([(i + 1) * slices + i2, i * slices + i2p1, (i + 1) * slices + i2p1], dtype=np.uint32))\n        transform = np.eye(4)\n        va = np.array([0, 0, 1], dtype=np.float32)\n        vb = diff\n        vb /= compute_length_vec3(vb)\n        axis = np.cross(vb, va)\n        angle = np.arccos(np.clip(np.dot(va, vb), -1, 1))\n        if angle != 0:\n            if compute_length_vec3(axis) == 0:\n                dotx = va[0]\n                if (math.fabs(dotx) != 1.0):\n                    axis = np.array([1, 0, 0]) - dotx * va\n                else:\n                    axis = np.array([0, 1, 0]) - va[1] * va\n                axis /= compute_length_vec3(axis)\n            transform = rotation(axis, -angle)\n        transform[:3, 3] += p0\n        verts = [np.dot(transform, np.array([v[0], v[1], v[2], 1.0])) for v in verts]\n        verts = [np.array([v[0], v[1], v[2]]) / v[3] for v in verts]\n\n        return verts, indices\n\n    def get_bbox_edges(bbox_min, bbox_max):\n        def get_bbox_verts(bbox_min, bbox_max):\n            verts = [\n                np.array([bbox_min[0], bbox_min[1], bbox_min[2]]),\n                np.array([bbox_max[0], bbox_min[1], bbox_min[2]]),\n                np.array([bbox_max[0], bbox_max[1], bbox_min[2]]),\n                np.array([bbox_min[0], bbox_max[1], bbox_min[2]]),\n\n                np.array([bbox_min[0], bbox_min[1], bbox_max[2]]),\n                np.array([bbox_max[0], bbox_min[1], bbox_max[2]]),\n                np.array([bbox_max[0], bbox_max[1], bbox_max[2]]),\n                np.array([bbox_min[0], bbox_max[1], bbox_max[2]])\n            ]\n            return verts\n\n        box_verts = get_bbox_verts(bbox_min, bbox_max)\n        edges = [\n            (box_verts[0], box_verts[1]),\n            (box_verts[1], box_verts[2]),\n            (box_verts[2], box_verts[3]),\n            (box_verts[3], box_verts[0]),\n\n            (box_verts[4], box_verts[5]),\n            (box_verts[5], box_verts[6]),\n            (box_verts[6], box_verts[7]),\n            (box_verts[7], box_verts[4]),\n\n            (box_verts[0], box_verts[4]),\n            (box_verts[1], box_verts[5]),\n            (box_verts[2], box_verts[6]),\n            (box_verts[3], box_verts[7])\n        ]\n        return edges\n\n    radius = 0.03\n    offset = [0, 0, 0]\n    verts = []\n    indices = []\n    colors = 
[]\n\n    box_min = np.min(corners, axis=0)\n    box_max = np.max(corners, axis=0)\n    palette = {\n        0: [0, 255, 0],  # gt\n        1: [0, 0, 255]  # pred\n    }\n    chosen_color = palette[mode]\n    edges = get_bbox_edges(box_min, box_max)\n    for k in range(len(edges)):\n        cyl_verts, cyl_ind = create_cylinder_mesh(radius, edges[k][0], edges[k][1])\n        cur_num_verts = len(verts)\n        cyl_color = [[c / 255 for c in chosen_color] for _ in cyl_verts]\n        cyl_verts = [x + offset for x in cyl_verts]\n        cyl_ind = [x + cur_num_verts for x in cyl_ind]\n        verts.extend(cyl_verts)\n        indices.extend(cyl_ind)\n        colors.extend(cyl_color)\n\n    write_ply(verts, colors, indices, output_file)\n\n\ndef write_obj(points, file, rgb=False):\n    fout = open('%s.obj' % file, 'w')\n    for i in range(points.shape[0]):\n        if not rgb:\n            fout.write('v %f %f %f %d %d %d\\n' % (\n                points[i, 0], points[i, 1], points[i, 2], 255, 255, 0))\n        else:\n            fout.write('v %f %f %f %d %d %d\\n' % (\n                points[i, 0], points[i, 1], points[i, 2], points[i, -3] * 255, points[i, -2] * 255,\n                points[i, -1] * 255))\n"
  },
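  {
    "path": "examples/box_corners_demo.py",
    "content": "'''Hypothetical example file, not part of the original release: a runnable\nsketch of datasets.utils.get_3d_box, which turns (size, heading, center)\ninto the (8, 3) corner array used by the visualization helpers above.\n'''\nimport numpy as np\n\nfrom datasets.utils import get_3d_box\n\n# a 4m x 2m x 1.5m box centered at (10, 5, 0) with a 90-degree heading\ncorners = get_3d_box((4.0, 2.0, 1.5), np.pi / 2, (10.0, 5.0, 0.0))\nprint(corners.shape)         # (8, 3)\nprint(corners.mean(axis=0))  # recovers the box center\n"
  },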
  {
    "path": "datasets/waymo_data.py",
    "content": "# Created by Xu Yan at 2021/10/17\n\nimport copy\nimport random\n\nfrom torch.utils.data import Dataset\nfrom datasets.data_classes import PointCloud, Box\nfrom pyquaternion import Quaternion\nimport numpy as np\nimport pandas as pd\nimport os\nimport warnings\nimport pickle\nfrom functools import reduce\nfrom tqdm import tqdm\nfrom datasets.generate_waymo_sot import generate_waymo_data\nfrom collections import defaultdict\nfrom datasets import points_utils, base_dataset\n\n\nclass WaymoDataset(base_dataset.BaseDataset):\n    def __init__(self, path, split, category_name=\"VEHICLE\", **kwargs):\n        super().__init__(path, split, category_name, **kwargs)\n        self.Waymo_Folder = path\n        self.category_name = category_name\n        self.Waymo_velo = os.path.join(self.Waymo_Folder, split, \"velodyne\")\n        self.Waymo_label = os.path.join(self.Waymo_Folder, split, \"label_02\")\n        self.Waymo_calib = os.path.join(self.Waymo_Folder, split, \"calib\")\n        self.velos = defaultdict(dict)\n        self.calibs = {}\n\n        self.split = self.split.lower()\n        self.category_name = self.category_name.lower()\n        self.split = 'val' if self.split == 'test' else self.split\n        assert self.split in ['train', 'val']\n        assert self.category_name in ['vehicle', 'pedestrian', 'cyclist']\n\n        self.tiny = kwargs.get('tiny', False)\n        self.tracklet_anno_list, self.tracklet_len_list = self._build_tracklet_anno()\n        if self.tiny:\n            self.tracklet_anno_list = self.tracklet_anno_list[:100]\n            self.tracklet_len_list = self.tracklet_len_list[:100]\n\n        self.preload_offset = kwargs.get('preload_offset', 10)\n        if self.preloading:\n            self.training_samples = self._load_data()\n\n    def _load_data(self):\n        print('preloading data into memory')\n        if self.tiny:\n            preload_data_path = os.path.join(self.Waymo_Folder,\n                                             f\"preload_{self.split}_{self.category_name}_{self.preload_offset}_tiny.dat\")\n        else:\n            preload_data_path = os.path.join(self.Waymo_Folder,\n                                             f\"preload_{self.split}_{self.category_name}_{self.preload_offset}.dat\")\n\n        print(preload_data_path)\n\n        if os.path.isfile(preload_data_path):\n            print(f'loading from saved file {preload_data_path}.')\n            with open(preload_data_path, 'rb') as f:\n                training_samples = pickle.load(f)\n        else:\n            print('reading from annos')\n            training_samples = []\n            for i in tqdm(range(len(self.tracklet_anno_list)), total=len(self.tracklet_anno_list)):\n                frames = []\n                for anno in self.tracklet_anno_list[i]:\n                    frames.append(self._get_frame_from_anno(anno, i))\n\n                training_samples.append(frames)\n            with open(preload_data_path, 'wb') as f:\n                print(f'saving loaded data to {preload_data_path}')\n                pickle.dump(training_samples, f)\n        return training_samples\n\n    def get_num_scenes(self):\n        return len(self.scene_list)\n\n    def get_num_tracklets(self):\n        return len(self.tracklet_anno_list)\n\n    def get_num_frames_total(self):\n        return sum(self.tracklet_len_list)\n\n    def get_num_frames_tracklet(self, tracklet_id):\n        return self.tracklet_len_list[tracklet_id]\n\n    def _build_tracklet_anno(self):\n        
preload_data_path = os.path.join(self.Waymo_Folder,\n                                         f\"sot_infos_{self.category_name.lower()}_{self.split.lower()}.pkl\")\n        if not os.path.exists(preload_data_path):\n            print('Preparing %s' % preload_data_path)\n            generate_waymo_data(self.Waymo_Folder, self.category_name, self.split)\n\n        with open(preload_data_path, 'rb') as f:\n            infos = pickle.load(f)\n\n        list_of_tracklet_anno = []\n        list_of_tracklet_len = []\n\n        for scene in list(infos.keys()):\n            anno = infos[scene]\n            list_of_tracklet_anno.append(anno)\n            list_of_tracklet_len.append(len(anno))\n\n        return list_of_tracklet_anno, list_of_tracklet_len\n\n    def get_frames(self, seq_id, frame_ids):\n        if self.preloading:\n            frames = [self.training_samples[seq_id][f_id] for f_id in frame_ids]\n        else:\n            seq_annos = self.tracklet_anno_list[seq_id]\n            frames = [self._get_frame_from_anno(seq_annos[f_id]) for f_id in frame_ids]\n\n        return frames\n\n    def _get_frame_from_anno(self, anno, track_id=None, check=False):\n        '''\n        'box': np.array([box.center_x, box.center_y, box.center_z,\n                         box.length, box.width, box.height, ref_velocity[0],\n                         ref_velocity[1], box.heading], dtype=np.float32),\n        '''\n        sample_data_lidar = anno['PC']\n        gt_boxes = anno['Box']\n\n        with open(sample_data_lidar, 'rb') as f:\n            pc_info = pickle.load(f)\n\n        pointcloud = pc_info['lidars']['points_xyz'].transpose((1, 0))\n\n        with open(sample_data_lidar.replace('lidar', 'annos'), 'rb') as f:\n            ref_obj = pickle.load(f)\n\n        ref_pose = np.reshape(ref_obj['veh_to_global'], [4, 4])\n        global_from_car, _ = self.veh_pos_to_transform(ref_pose)\n        nbr_points = pointcloud.shape[1]\n        pointcloud[:3, :] = global_from_car.dot(\n            np.vstack((pointcloud[:3, :], np.ones(nbr_points)))\n        )[:3, :]\n\n        # transform from Waymo to KITTI coordinates\n        # Waymo: x, y, z, length, width, height, rotation from positive x axis clockwise\n        # KITTI: x, y, z, width, length, height, rotation from negative y axis counterclockwise\n        gt_boxes[[3, 4]] = gt_boxes[[4, 3]]\n\n        pc = PointCloud(pointcloud)\n        bb = Box(gt_boxes[0:3], gt_boxes[3:6], Quaternion(axis=[0, 0, 1], radians=-gt_boxes[-1]),\n                 velocity=gt_boxes[6:9], name=anno['Class'])\n        bb.rotate(Quaternion(matrix=global_from_car))\n        bb.translate(global_from_car[:3, -1])\n        if self.preload_offset > 0:\n            pc = points_utils.crop_pc_axis_aligned(pc, bb, offset=self.preload_offset)\n\n        if check:\n            from datasets.utils import write_bbox, write_obj, get_3d_box, box2obj\n            print('check', pc_info['frame_id'])\n            path = 'visual_%s_track%d/' % (pc_info['scene_name'], track_id)\n            os.makedirs(path, exist_ok=True)\n            if pc_info['frame_id'] % 50 == 0:\n                write_obj(pc.points.transpose((1, 0)), path + 'frames_%d' % pc_info['frame_id'])\n                # write_bbox(get_3d_box(bb.wlh, bb.orientation.radians * bb.orientation.axis[-1], bb.center), 0, path + 'box_%d.ply' % pc_info['frame_id'])\n                box2obj(bb, path + 'box_%d.obj' % pc_info['frame_id'])\n            print(path + 'box_%d.obj' % pc_info['frame_id'])\n            # exit()\n\n        return 
{\"pc\": pc, \"3d_bbox\": bb, 'meta': anno}\n\n    @staticmethod\n    def veh_pos_to_transform(veh_pos):\n        def transform_matrix(translation: np.ndarray = np.array([0, 0, 0]),\n                             rotation: Quaternion = Quaternion([1, 0, 0, 0]),\n                             inverse: bool = False) -> np.ndarray:\n            \"\"\"\n            Convert pose to transformation matrix.\n            :param translation: <np.float32: 3>. Translation in x, y, z.\n            :param rotation: Rotation in quaternions (w ri rj rk).\n            :param inverse: Whether to compute inverse transform matrix.\n            :return: <np.float32: 4, 4>. Transformation matrix.\n            \"\"\"\n            tm = np.eye(4)\n\n            if inverse:\n                rot_inv = rotation.rotation_matrix.T\n                trans = np.transpose(-np.array(translation))\n                tm[:3, :3] = rot_inv\n                tm[:3, 3] = rot_inv.dot(trans)\n            else:\n                tm[:3, :3] = rotation.rotation_matrix\n                tm[:3, 3] = np.transpose(np.array(translation))\n\n            return tm\n\n        \"convert vehicle pose to two transformation matrix\"\n        rotation = veh_pos[:3, :3]\n        tran = veh_pos[:3, 3]\n\n        global_from_car = transform_matrix(\n            tran, Quaternion(matrix=rotation), inverse=False\n        )\n\n        car_from_global = transform_matrix(\n            tran, Quaternion(matrix=rotation), inverse=True\n        )\n\n        return global_from_car, car_from_global\n\n\n"
  },
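  {
    "path": "examples/waymo_pose_demo.py",
    "content": "'''Hypothetical example file, not part of the original release: a sketch of\nWaymoDataset.veh_pos_to_transform, assuming the waymo-related imports in\ndatasets/waymo_data.py resolve in your environment.\n'''\nimport numpy as np\nfrom pyquaternion import Quaternion\n\nfrom datasets.waymo_data import WaymoDataset\n\n# a vehicle pose: 30-degree yaw plus a translation\npose = np.eye(4)\npose[:3, :3] = Quaternion(axis=[0, 0, 1], degrees=30).rotation_matrix\npose[:3, 3] = [5.0, 2.0, 0.0]\n\nglobal_from_car, car_from_global = WaymoDataset.veh_pos_to_transform(pose)\n# the two matrices are inverses of each other\nassert np.allclose(global_from_car @ car_from_global, np.eye(4), atol=1e-6)\n"
  },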
  {
    "path": "main.py",
    "content": "\"\"\"\nmain.py\nCreated by zenn at 2021/7/18 15:08\n\"\"\"\nimport pytorch_lightning as pl\nimport argparse\n\nimport pytorch_lightning.utilities.distributed\nimport torch\nimport yaml\nfrom easydict import EasyDict\nimport os\n\nfrom pytorch_lightning.callbacks import ModelCheckpoint\nfrom torch.utils.data import DataLoader\n\nfrom datasets import get_dataset\nfrom models import get_model\n\n\n# os.environ[\"NCCL_DEBUG\"] = \"INFO\"\n\ndef load_yaml(file_name):\n    with open(file_name, 'r') as f:\n        try:\n            config = yaml.load(f, Loader=yaml.FullLoader)\n        except:\n            config = yaml.load(f)\n    return config\n\n\ndef parse_config():\n    parser = argparse.ArgumentParser()\n    parser.add_argument('--batch_size', type=int, default=100, help='input batch size')\n    parser.add_argument('--epoch', type=int, default=60, help='number of epochs')\n    parser.add_argument('--save_top_k', type=int, default=-1, help='save top k checkpoints')\n    parser.add_argument('--check_val_every_n_epoch', type=int, default=1, help='check_val_every_n_epoch')\n    parser.add_argument('--workers', type=int, default=10, help='number of data loading workers')\n    parser.add_argument('--cfg', type=str, help='the config_file')\n    parser.add_argument('--checkpoint', type=str, default=None, help='checkpoint location')\n    parser.add_argument('--log_dir', type=str, default=None, help='log location')\n    parser.add_argument('--test', action='store_true', default=False, help='test mode')\n    parser.add_argument('--preloading', action='store_true', default=False, help='preload dataset into memory')\n\n    args = parser.parse_args()\n    config = load_yaml(args.cfg)\n    config.update(vars(args))  # override the configuration using the value in args\n\n    return EasyDict(config)\n\n\ncfg = parse_config()\nenv_cp = os.environ.copy()\ntry:\n    node_rank, local_rank, world_size = env_cp['NODE_RANK'], env_cp['LOCAL_RANK'], env_cp['WORLD_SIZE']\n\n    is_in_ddp_subprocess = env_cp['PL_IN_DDP_SUBPROCESS']\n    pl_trainer_gpus = env_cp['PL_TRAINER_GPUS']\n    print(node_rank, local_rank, world_size, is_in_ddp_subprocess, pl_trainer_gpus)\n\n    if int(local_rank) == int(world_size) - 1:\n        print(cfg)\nexcept KeyError:\n    pass\n\n# init model\nif cfg.checkpoint is None:\n    net = get_model(cfg.net_model)(cfg)\nelse:\n    net = get_model(cfg.net_model).load_from_checkpoint(cfg.checkpoint, config=cfg)\nif not cfg.test:\n    # dataset and dataloader\n    train_data = get_dataset(cfg, type=cfg.train_type, split=cfg.train_split)\n    val_data = get_dataset(cfg, type='test', split=cfg.val_split)\n    train_loader = DataLoader(train_data, batch_size=cfg.batch_size, num_workers=cfg.workers, shuffle=True,drop_last=True,\n                              pin_memory=True)\n    val_loader = DataLoader(val_data, batch_size=1, num_workers=cfg.workers, collate_fn=lambda x: x, pin_memory=True)\n    checkpoint_callback = ModelCheckpoint(monitor='precision/test', mode='max', save_last=True,\n                                          save_top_k=cfg.save_top_k)\n\n    # init trainer\n    trainer = pl.Trainer(gpus=-1, accelerator='ddp', max_epochs=cfg.epoch, resume_from_checkpoint=cfg.checkpoint,\n                         callbacks=[checkpoint_callback], default_root_dir=cfg.log_dir,\n                         check_val_every_n_epoch=cfg.check_val_every_n_epoch, num_sanity_val_steps=2,\n                         gradient_clip_val=cfg.gradient_clip_val)\n    trainer.fit(net, train_loader, 
val_loader)\nelse:\n    test_data = get_dataset(cfg, type='test', split=cfg.test_split)\n    test_loader = DataLoader(test_data, batch_size=1, num_workers=cfg.workers, collate_fn=lambda x: x, pin_memory=True)\n\n    trainer = pl.Trainer(gpus=-1, accelerator='ddp', default_root_dir=cfg.log_dir,\n                         resume_from_checkpoint=cfg.checkpoint)\n    trainer.test(net, test_loader)\n"
  },
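  {
    "path": "examples/config_override_demo.py",
    "content": "'''Hypothetical example file, not part of the original release: a sketch of\nthe precedence used by parse_config in main.py, where CLI arguments override\nsame-named keys loaded from the .yaml config.\n'''\nimport yaml\nfrom easydict import EasyDict\n\nyaml_text = 'batch_size: 64\\nepoch: 60\\nnet_model: BAT\\n'\nconfig = yaml.load(yaml_text, Loader=yaml.FullLoader)\n\ncli_args = {'batch_size': 100, 'test': False}  # stands in for vars(parser.parse_args())\nconfig.update(cli_args)                        # the CLI value wins for batch_size\n\ncfg = EasyDict(config)\nprint(cfg.batch_size, cfg.net_model)           # 100 BAT\n"
  },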
  {
    "path": "models/__init__.py",
    "content": "\"\"\" \n__init__.py\nCreated by zenn at 2021/7/15 21:40\n\"\"\"\n\nimport importlib\n# import pkgutil\n# import os\n# import inspect\n# __all__ = []\n# for loader, module_name, is_pkg in pkgutil.walk_packages(os.path.abspath(__file__)):\n#     print(loader, module_name, is_pkg)\n#     module = loader.find_module(module_name).load_module(module_name)\n\n\nfrom models import p2b, bat, m2track\n\n\ndef get_model(name):\n    model = globals()[name.lower()].__getattribute__(name.upper())\n    return model\n"
  },
  {
    "path": "models/backbone/pointnet.py",
    "content": "\"\"\"\npointnet.py\nCreated by zenn at 2021/5/9 13:41\n\"\"\"\n\nimport torch\nimport torch.nn as nn\n\nfrom pointnet2.utils.pointnet2_modules import PointnetSAModule\n\n\nclass Pointnet_Backbone(nn.Module):\n    r\"\"\"\n        PointNet2 with single-scale grouping\n        Semantic segmentation network that uses feature propogation layers\n\n        Parameters\n        ----------\n        num_classes: int\n            Number of semantics classes to predict over -- size of softmax classifier that run for each point\n        input_channels: int = 6\n            Number of input channels in the feature descriptor for each point.  If the point cloud is Nx9, this\n            value should be 6 as in an Nx9 point cloud, 3 of the channels are xyz, and 6 are feature descriptors\n        use_xyz: bool = True\n            Whether or not to use the xyz position of a point as a feature\n    \"\"\"\n\n    def __init__(self, use_fps=False, normalize_xyz=False, return_intermediate=False, input_channels=0):\n        super(Pointnet_Backbone, self).__init__()\n        self.return_intermediate = return_intermediate\n        self.SA_modules = nn.ModuleList()\n        self.SA_modules.append(\n            PointnetSAModule(\n                radius=0.3,\n                nsample=32,\n                mlp=[input_channels, 64, 64, 128],\n                use_xyz=True,\n                use_fps=use_fps, normalize_xyz=normalize_xyz\n            )\n        )\n        self.SA_modules.append(\n            PointnetSAModule(\n                radius=0.5,\n                nsample=32,\n                mlp=[128, 128, 128, 256],\n                use_xyz=True,\n                use_fps=False, normalize_xyz=normalize_xyz\n            )\n        )\n        self.SA_modules.append(\n            PointnetSAModule(\n                radius=0.7,\n                nsample=32,\n                mlp=[256, 256, 256, 256],\n                use_xyz=True,\n                use_fps=False, normalize_xyz=normalize_xyz\n            )\n        )\n\n    def _break_up_pc(self, pc):\n        xyz = pc[..., 0:3].contiguous()\n        features = pc[..., 3:].transpose(1, 2).contiguous() if pc.size(-1) > 3 else None\n\n        return xyz, features\n\n    def forward(self, pointcloud, numpoints):\n        r\"\"\"\n            Forward pass of the network\n\n            Parameters\n            ----------\n            pointcloud: Variable(torch.cuda.FloatTensor)\n                (B, N, 3 + input_channels) tensor\n                Point cloud to run predicts on\n                Each point in the point-cloud MUST\n                be formated as (x, y, z, features...)\n        \"\"\"\n        xyz, features = self._break_up_pc(pointcloud)\n\n        l_xyz, l_features, l_idxs = [xyz], [features], []\n        for i in range(len(self.SA_modules)):\n            li_xyz, li_features, sample_idxs = self.SA_modules[i](l_xyz[i], l_features[i], numpoints[i], True)\n            l_xyz.append(li_xyz)\n            l_features.append(li_features)\n            l_idxs.append(sample_idxs)\n        if self.return_intermediate:\n            return l_xyz[1:], l_features[1:], l_idxs[0]\n        return l_xyz[-1], l_features[-1], l_idxs[0]\n\n\nclass MiniPointNet(nn.Module):\n\n    def __init__(self, input_channel, per_point_mlp, hidden_mlp, output_size=0):\n        \"\"\"\n\n        :param input_channel: int\n        :param per_point_mlp: list\n        :param hidden_mlp: list\n        :param output_size: int, if output_size <=0, then the final fc will not be used\n        
\"\"\"\n        super(MiniPointNet, self).__init__()\n        seq_per_point = []\n        in_channel = input_channel\n        for out_channel in per_point_mlp:\n            seq_per_point.append(nn.Conv1d(in_channel, out_channel, 1))\n            seq_per_point.append(nn.BatchNorm1d(out_channel))\n            seq_per_point.append(nn.ReLU())\n            in_channel = out_channel\n        seq_hidden = []\n        for out_channel in hidden_mlp:\n            seq_hidden.append(nn.Linear(in_channel, out_channel))\n            seq_hidden.append(nn.BatchNorm1d(out_channel))\n            seq_hidden.append(nn.ReLU())\n            in_channel = out_channel\n\n        # self.per_point_mlp = nn.Sequential(*seq)\n        # self.pooling = nn.AdaptiveMaxPool1d(output_size=1)\n        # self.hidden_mlp = nn.Sequential(*seq_hidden)\n\n        self.features = nn.Sequential(*seq_per_point,\n                                      nn.AdaptiveMaxPool1d(output_size=1),\n                                      nn.Flatten(),\n                                      *seq_hidden)\n        self.output_size = output_size\n        if output_size >= 0:\n            self.fc = nn.Linear(in_channel, output_size)\n\n    def forward(self, x):\n        \"\"\"\n\n        :param x: B,C,N\n        :return: B,output_size\n        \"\"\"\n\n        # x = self.per_point_mlp(x)\n        # x = self.pooling(x)\n        # x = self.hidden_mlp(x)\n        x = self.features(x)\n        if self.output_size > 0:\n            x = self.fc(x)\n        return x\n\n\nclass SegPointNet(nn.Module):\n\n    def __init__(self, input_channel, per_point_mlp1, per_point_mlp2, output_size=0, return_intermediate=False):\n        \"\"\"\n\n        :param input_channel: int\n        :param per_point_mlp: list\n        :param hidden_mlp: list\n        :param output_size: int, if output_size <=0, then the final fc will not be used\n        \"\"\"\n        super(SegPointNet, self).__init__()\n        self.return_intermediate = return_intermediate\n        self.seq_per_point = nn.ModuleList()\n        in_channel = input_channel\n        for out_channel in per_point_mlp1:\n            self.seq_per_point.append(\n                nn.Sequential(\n                    nn.Conv1d(in_channel, out_channel, 1),\n                    nn.BatchNorm1d(out_channel),\n                    nn.ReLU()\n                ))\n            in_channel = out_channel\n\n        self.pool = nn.AdaptiveMaxPool1d(output_size=1)\n\n        self.seq_per_point2 = nn.ModuleList()\n        in_channel = in_channel + per_point_mlp1[1]\n        for out_channel in per_point_mlp2:\n            self.seq_per_point2.append(\n                nn.Sequential(\n                    nn.Conv1d(in_channel, out_channel, 1),\n                    nn.BatchNorm1d(out_channel),\n                    nn.ReLU()\n                ))\n            in_channel = out_channel\n\n        self.output_size = output_size\n        if output_size >= 0:\n            self.fc = nn.Conv1d(in_channel, output_size, 1)\n\n    def forward(self, x):\n        \"\"\"\n\n        :param x: B,C,N\n        :return: B,output_size,N\n        \"\"\"\n        second_layer_out = None\n        for i, mlp in enumerate(self.seq_per_point):\n            x = mlp(x)\n            if i == 1:\n                second_layer_out = x\n        pooled_feature = self.pool(x)  # B,C,1\n        pooled_feature_expand = pooled_feature.expand_as(x)\n        x = torch.cat([second_layer_out, pooled_feature_expand], dim=1)\n        for mlp in self.seq_per_point2:\n            x = mlp(x)\n   
     if self.output_size > 0:\n            x = self.fc(x)\n        if self.return_intermediate:\n            return x, pooled_feature.squeeze(dim=-1)\n        return x\n\n"
  },
  {
    "path": "models/base_model.py",
    "content": "\"\"\" \nbaseModel.py\nCreated by zenn at 2021/5/9 14:40\n\"\"\"\n\nimport torch\nfrom easydict import EasyDict\nimport pytorch_lightning as pl\nfrom datasets import points_utils\nfrom utils.metrics import TorchSuccess, TorchPrecision\nfrom utils.metrics import estimateOverlap, estimateAccuracy\nimport torch.nn.functional as F\nimport numpy as np\nfrom nuscenes.utils import geometry_utils\n\n\nclass BaseModel(pl.LightningModule):\n    def __init__(self, config=None, **kwargs):\n        super().__init__()\n        if config is None:\n            config = EasyDict(kwargs)\n        self.config = config\n\n        # testing metrics\n        self.prec = TorchPrecision()\n        self.success = TorchSuccess()\n\n    def configure_optimizers(self):\n        if self.config.optimizer.lower() == 'sgd':\n            optimizer = torch.optim.SGD(self.parameters(), lr=self.config.lr, momentum=0.9, weight_decay=self.config.wd)\n        else:\n            optimizer = torch.optim.Adam(self.parameters(), lr=self.config.lr, weight_decay=self.config.wd,\n                                         betas=(0.5, 0.999), eps=1e-06)\n        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=self.config.lr_decay_step,\n                                                    gamma=self.config.lr_decay_rate)\n        return {\"optimizer\": optimizer, \"lr_scheduler\": scheduler}\n\n    def compute_loss(self, data, output):\n        raise NotImplementedError\n\n    def build_input_dict(self, sequence, frame_id, results_bbs, **kwargs):\n        raise NotImplementedError\n\n    def evaluate_one_sample(self, data_dict, ref_box):\n        end_points = self(data_dict)\n\n        estimation_box = end_points['estimation_boxes']\n        estimation_box_cpu = estimation_box.squeeze(0).detach().cpu().numpy()\n\n        if len(estimation_box.shape) == 3:\n            best_box_idx = estimation_box_cpu[:, 4].argmax()\n            estimation_box_cpu = estimation_box_cpu[best_box_idx, 0:4]\n\n        candidate_box = points_utils.getOffsetBB(ref_box, estimation_box_cpu, degrees=self.config.degrees,\n                                                 use_z=self.config.use_z,\n                                                 limit_box=self.config.limit_box)\n        return candidate_box\n\n    def evaluate_one_sequence(self, sequence):\n        \"\"\"\n        :param sequence: a sequence of annos {\"pc\": pc, \"3d_bbox\": bb, 'meta': anno}\n        :return:\n        \"\"\"\n        ious = []\n        distances = []\n        results_bbs = []\n        for frame_id in range(len(sequence)):  # tracklet\n            this_bb = sequence[frame_id][\"3d_bbox\"]\n            if frame_id == 0:\n                # the first frame\n                results_bbs.append(this_bb)\n            else:\n\n                # construct input dict\n                data_dict, ref_bb = self.build_input_dict(sequence, frame_id, results_bbs)\n                # run the tracker\n                candidate_box = self.evaluate_one_sample(data_dict, ref_box=ref_bb)\n                results_bbs.append(candidate_box)\n\n            this_overlap = estimateOverlap(this_bb, results_bbs[-1], dim=self.config.IoU_space,\n                                           up_axis=self.config.up_axis)\n            this_accuracy = estimateAccuracy(this_bb, results_bbs[-1], dim=self.config.IoU_space,\n                                             up_axis=self.config.up_axis)\n            ious.append(this_overlap)\n            distances.append(this_accuracy)\n        
return ious, distances, results_bbs\n\n    def validation_step(self, batch, batch_idx):\n        sequence = batch[0]  # unwrap the batch with batch size = 1\n        ious, distances, *_ = self.evaluate_one_sequence(sequence)\n        # update metrics\n        self.success(torch.tensor(ious, device=self.device))\n        self.prec(torch.tensor(distances, device=self.device))\n        self.log('success/test', self.success, on_step=True, on_epoch=True)\n        self.log('precision/test', self.prec, on_step=True, on_epoch=True)\n\n    def validation_epoch_end(self, outputs):\n        self.logger.experiment.add_scalars('metrics/test',\n                                           {'success': self.success.compute(),\n                                            'precision': self.prec.compute()},\n                                           global_step=self.global_step)\n\n    def test_step(self, batch, batch_idx):\n        sequence = batch[0]  # unwrap the batch with batch size = 1\n        ious, distances, result_bbs = self.evaluate_one_sequence(sequence)\n        # update metrics\n        self.success(torch.tensor(ious, device=self.device))\n        self.prec(torch.tensor(distances, device=self.device))\n        self.log('success/test', self.success, on_step=True, on_epoch=True)\n        self.log('precision/test', self.prec, on_step=True, on_epoch=True)\n        return result_bbs\n\n    def test_epoch_end(self, outputs):\n        self.logger.experiment.add_scalars('metrics/test',\n                                           {'success': self.success.compute(),\n                                            'precision': self.prec.compute()},\n                                           global_step=self.global_step)\n\n\nclass MatchingBaseModel(BaseModel):\n\n    def compute_loss(self, data, output):\n        \"\"\"\n\n        :param data: input data\n        :param output:\n        :return:\n        \"\"\"\n        estimation_boxes = output['estimation_boxes']  # B,num_proposal,5\n        estimation_cla = output['estimation_cla']  # B,N\n        seg_label = data['seg_label']\n        box_label = data['box_label']  # B,4\n        proposal_center = output[\"center_xyz\"]  # B,num_proposal,3\n        vote_xyz = output[\"vote_xyz\"]\n\n        loss_seg = F.binary_cross_entropy_with_logits(estimation_cla, seg_label)\n\n        loss_vote = F.smooth_l1_loss(vote_xyz, box_label[:, None, :3].expand_as(vote_xyz), reduction='none')  # B,N,3\n        loss_vote = (loss_vote.mean(2) * seg_label).sum() / (seg_label.sum() + 1e-06)\n\n        dist = torch.sum((proposal_center - box_label[:, None, :3]) ** 2, dim=-1)\n\n        dist = torch.sqrt(dist + 1e-6)  # B, K\n        objectness_label = torch.zeros_like(dist, dtype=torch.float)\n        objectness_label[dist < 0.3] = 1\n        objectness_score = estimation_boxes[:, :, 4]  # B, K\n        objectness_mask = torch.zeros_like(objectness_label, dtype=torch.float)\n        objectness_mask[dist < 0.3] = 1\n        objectness_mask[dist > 0.6] = 1\n        loss_objective = F.binary_cross_entropy_with_logits(objectness_score, objectness_label,\n                                                            pos_weight=torch.tensor([2.0]).cuda())\n        loss_objective = torch.sum(loss_objective * objectness_mask) / (\n                torch.sum(objectness_mask) + 1e-6)\n        loss_box = F.smooth_l1_loss(estimation_boxes[:, :, :4],\n                                    box_label[:, None, :4].expand_as(estimation_boxes[:, :, :4]),\n                                    
reduction='none')\n        loss_box = torch.sum(loss_box.mean(2) * objectness_label) / (objectness_label.sum() + 1e-6)\n\n        return {\n            \"loss_objective\": loss_objective,\n            \"loss_box\": loss_box,\n            \"loss_seg\": loss_seg,\n            \"loss_vote\": loss_vote,\n        }\n\n    def generate_template(self, sequence, current_frame_id, results_bbs):\n        \"\"\"\n        generate the template for evaluation.\n        the template can be updated using the previous predictions.\n        :param sequence: the list of the whole sequence\n        :param current_frame_id:\n        :param results_bbs: predicted box for previous frames\n        :return:\n        \"\"\"\n        first_pc = sequence[0]['pc']\n        previous_pc = sequence[current_frame_id - 1]['pc']\n        if \"firstandprevious\".upper() in self.config.shape_aggregation.upper():\n            template_pc, canonical_box = points_utils.getModel([first_pc, previous_pc],\n                                                               [results_bbs[0], results_bbs[current_frame_id - 1]],\n                                                               scale=self.config.model_bb_scale,\n                                                               offset=self.config.model_bb_offset)\n        elif \"first\".upper() in self.config.shape_aggregation.upper():\n            template_pc, canonical_box = points_utils.cropAndCenterPC(first_pc, results_bbs[0],\n                                                                      scale=self.config.model_bb_scale,\n                                                                      offset=self.config.model_bb_offset)\n        elif \"previous\".upper() in self.config.shape_aggregation.upper():\n            template_pc, canonical_box = points_utils.cropAndCenterPC(previous_pc, results_bbs[current_frame_id - 1],\n                                                                      scale=self.config.model_bb_scale,\n                                                                      offset=self.config.model_bb_offset)\n        elif \"all\".upper() in self.config.shape_aggregation.upper():\n            template_pc, canonical_box = points_utils.getModel([frame[\"pc\"] for frame in sequence[:current_frame_id]],\n                                                               results_bbs,\n                                                               scale=self.config.model_bb_scale,\n                                                               offset=self.config.model_bb_offset)\n        return template_pc, canonical_box\n\n    def generate_search_area(self, sequence, current_frame_id, results_bbs):\n        \"\"\"\n        generate the search area for evaluation.\n\n        :param sequence:\n        :param current_frame_id:\n        :param results_bbs:\n        :return:\n        \"\"\"\n        this_bb = sequence[current_frame_id][\"3d_bbox\"]\n        this_pc = sequence[current_frame_id][\"pc\"]\n        if (\"previous_result\".upper() in self.config.reference_BB.upper()):\n            ref_bb = results_bbs[-1]\n        elif (\"previous_gt\".upper() in self.config.reference_BB.upper()):\n            previous_bb = sequence[current_frame_id - 1][\"3d_bbox\"]\n            ref_bb = previous_bb\n        elif (\"current_gt\".upper() in self.config.reference_BB.upper()):\n            ref_bb = this_bb\n        search_pc_crop = points_utils.generate_subwindow(this_pc, ref_bb,\n                                                         scale=self.config.search_bb_scale,\n         
                                                offset=self.config.search_bb_offset)\n        return search_pc_crop, ref_bb\n\n    def prepare_input(self, template_pc, search_pc, template_box, *args, **kwargs):\n        \"\"\"\n        construct input dict for evaluating\n        :param template_pc:\n        :param search_pc:\n        :param template_box:\n        :return:\n        \"\"\"\n        template_points, idx_t = points_utils.regularize_pc(template_pc.points.T, self.config.template_size,\n                                                            seed=1)\n        search_points, idx_s = points_utils.regularize_pc(search_pc.points.T, self.config.search_size,\n                                                          seed=1)\n        template_points_torch = torch.tensor(template_points, device=self.device, dtype=torch.float32)\n        search_points_torch = torch.tensor(search_points, device=self.device, dtype=torch.float32)\n        data_dict = {\n            'template_points': template_points_torch[None, ...],\n            'search_points': search_points_torch[None, ...],\n        }\n        return data_dict\n\n    def build_input_dict(self, sequence, frame_id, results_bbs, **kwargs):\n        # preparing search area\n        search_pc_crop, ref_bb = self.generate_search_area(sequence, frame_id, results_bbs)\n        # update template\n        template_pc, canonical_box = self.generate_template(sequence, frame_id, results_bbs)\n        # construct input dict\n        data_dict = self.prepare_input(template_pc, search_pc_crop, canonical_box)\n        return data_dict, ref_bb\n\n\nclass MotionBaseModel(BaseModel):\n    def __init__(self, config, **kwargs):\n        super().__init__(config, **kwargs)\n        self.save_hyperparameters()\n\n    def build_input_dict(self, sequence, frame_id, results_bbs):\n        assert frame_id > 0, \"no need to construct an input_dict at frame 0\"\n\n        prev_frame = sequence[frame_id - 1]\n        this_frame = sequence[frame_id]\n        prev_pc = prev_frame['pc']\n        this_pc = this_frame['pc']\n        ref_box = results_bbs[-1]\n        prev_frame_pc = points_utils.generate_subwindow(prev_pc, ref_box,\n                                                        scale=self.config.bb_scale,\n                                                        offset=self.config.bb_offset)\n        this_frame_pc = points_utils.generate_subwindow(this_pc, ref_box,\n                                                        scale=self.config.bb_scale,\n                                                        offset=self.config.bb_offset)\n\n        canonical_box = points_utils.transform_box(ref_box, ref_box)\n        prev_points, idx_prev = points_utils.regularize_pc(prev_frame_pc.points.T,\n                                                           self.config.point_sample_size,\n                                                           seed=1)\n\n        this_points, idx_this = points_utils.regularize_pc(this_frame_pc.points.T,\n                                                           self.config.point_sample_size,\n                                                           seed=1)\n        seg_mask_prev = geometry_utils.points_in_box(canonical_box, prev_points.T, 1.25).astype(float)\n\n        # Here we use 0.2/0.8 instead of 0/1 to indicate that the previous box is not GT.\n        # When boxcloud is used, the actual value of prior-targetness mask doesn't really matter.\n        if frame_id != 1:\n            seg_mask_prev[seg_mask_prev == 0] = 0.2\n         
   seg_mask_prev[seg_mask_prev == 1] = 0.8\n        seg_mask_this = np.full(seg_mask_prev.shape, fill_value=0.5)\n\n        timestamp_prev = np.full((self.config.point_sample_size, 1), fill_value=0)\n        timestamp_this = np.full((self.config.point_sample_size, 1), fill_value=0.1)\n        prev_points = np.concatenate([prev_points, timestamp_prev, seg_mask_prev[:, None]], axis=-1)\n        this_points = np.concatenate([this_points, timestamp_this, seg_mask_this[:, None]], axis=-1)\n\n        stack_points = np.concatenate([prev_points, this_points], axis=0)\n\n        data_dict = {\"points\": torch.tensor(stack_points[None, :], device=self.device, dtype=torch.float32),\n                     }\n        if getattr(self.config, 'box_aware', False):\n            candidate_bc_prev = points_utils.get_point_to_box_distance(\n                stack_points[:self.config.point_sample_size, :3], canonical_box)\n            candidate_bc_this = np.zeros_like(candidate_bc_prev)\n            candidate_bc = np.concatenate([candidate_bc_prev, candidate_bc_this], axis=0)\n            data_dict.update({'candidate_bc': points_utils.np_to_torch_tensor(candidate_bc.astype('float32'),\n                                                                              device=self.device)})\n        return data_dict, results_bbs[-1]\n"
  },
  {
    "path": "models/bat.py",
    "content": "\"\"\" \nbat.py\nCreated by zenn at 2021/7/21 14:16\n\"\"\"\n\nimport torch\nfrom torch import nn\nfrom models.backbone.pointnet import Pointnet_Backbone\nfrom models.head.xcorr import BoxAwareXCorr\nfrom models.head.rpn import P2BVoteNetRPN\nfrom models import base_model\nimport torch.nn.functional as F\nfrom datasets import points_utils\nfrom pointnet2.utils import pytorch_utils as pt_utils\n\n\nclass BAT(base_model.MatchingBaseModel):\n    def __init__(self, config=None, **kwargs):\n        super().__init__(config, **kwargs)\n        self.save_hyperparameters()\n        self.backbone = Pointnet_Backbone(self.config.use_fps, self.config.normalize_xyz, return_intermediate=False)\n        self.conv_final = nn.Conv1d(256, self.config.feature_channel, kernel_size=1)\n        self.mlp_bc = (pt_utils.Seq(3 + self.config.feature_channel)\n                       .conv1d(self.config.feature_channel, bn=True)\n                       .conv1d(self.config.feature_channel, bn=True)\n                       .conv1d(self.config.bc_channel, activation=None))\n\n        self.xcorr = BoxAwareXCorr(feature_channel=self.config.feature_channel,\n                                   hidden_channel=self.config.hidden_channel,\n                                   out_channel=self.config.out_channel,\n                                   k=self.config.k,\n                                   use_search_bc=self.config.use_search_bc,\n                                   use_search_feature=self.config.use_search_feature,\n                                   bc_channel=self.config.bc_channel)\n        self.rpn = P2BVoteNetRPN(self.config.feature_channel,\n                                 vote_channel=self.config.vote_channel,\n                                 num_proposal=self.config.num_proposal,\n                                 normalize_xyz=self.config.normalize_xyz)\n\n\n    def prepare_input(self, template_pc, search_pc, template_box):\n        template_points, idx_t = points_utils.regularize_pc(template_pc.points.T, self.config.template_size,\n                                                            seed=1)\n        search_points, idx_s = points_utils.regularize_pc(search_pc.points.T, self.config.search_size,\n                                                          seed=1)\n        template_points_torch = torch.tensor(template_points, device=self.device, dtype=torch.float32)\n        search_points_torch = torch.tensor(search_points, device=self.device, dtype=torch.float32)\n        template_bc = points_utils.get_point_to_box_distance(template_points, template_box)\n        template_bc_torch = torch.tensor(template_bc, device=self.device, dtype=torch.float32)\n        data_dict = {\n            'template_points': template_points_torch[None, ...],\n            'search_points': search_points_torch[None, ...],\n            'points2cc_dist_t': template_bc_torch[None, ...]\n        }\n        return data_dict\n\n    def compute_loss(self, data, output):\n        out_dict = super(BAT, self).compute_loss(data, output)\n        search_bc = data['points2cc_dist_s']\n        pred_search_bc = output['pred_search_bc']\n        seg_label = data['seg_label']\n        loss_bc = F.smooth_l1_loss(pred_search_bc, search_bc, reduction='none')\n        loss_bc = torch.sum(loss_bc.mean(2) * seg_label) / (seg_label.sum() + 1e-6)\n        out_dict[\"loss_bc\"] = loss_bc\n        return out_dict\n\n    def forward(self, input_dict):\n        \"\"\"\n        :param input_dict:\n        {\n        'template_points': 
template_points.astype('float32'),\n        'search_points': search_points.astype('float32'),\n        'box_label': np.array(search_bbox_reg).astype('float32'),\n        'bbox_size': search_box.wlh,\n        'seg_label': seg_label.astype('float32'),\n        'points2cc_dist_t': template_bc,\n        'points2cc_dist_s': search_bc,\n        }\n\n        :return:\n        \"\"\"\n        template = input_dict['template_points']\n        search = input_dict['search_points']\n        template_bc = input_dict['points2cc_dist_t']\n        M = template.shape[1]\n        N = search.shape[1]\n\n        # backbone\n        template_xyz, template_feature, sample_idxs_t = self.backbone(template, [M // 2, M // 4, M // 8])\n        search_xyz, search_feature, sample_idxs = self.backbone(search, [N // 2, N // 4, N // 8])\n        template_feature = self.conv_final(template_feature)\n        search_feature = self.conv_final(search_feature)\n        # prepare bc\n        pred_search_bc = self.mlp_bc(torch.cat([search_xyz.transpose(1, 2), search_feature], dim=1))  # B, 9, N // 8\n        pred_search_bc = pred_search_bc.transpose(1, 2)\n        sample_idxs_t = sample_idxs_t[:, :M // 8, None]\n        template_bc = template_bc.gather(dim=1, index=sample_idxs_t.repeat(1, 1, self.config.bc_channel).long())\n        # box-aware xcorr\n        fusion_feature = self.xcorr(template_feature, search_feature, template_xyz, search_xyz, template_bc,\n                                    pred_search_bc)\n        # proposal generation\n        estimation_boxes, estimation_cla, vote_xyz, center_xyzs = self.rpn(search_xyz, fusion_feature)\n        end_points = {\"estimation_boxes\": estimation_boxes,\n                      \"vote_center\": vote_xyz,\n                      \"pred_seg_score\": estimation_cla,\n                      \"center_xyz\": center_xyzs,\n                      'sample_idxs': sample_idxs,\n                      'estimation_cla': estimation_cla,\n                      \"vote_xyz\": vote_xyz,\n                      \"pred_search_bc\": pred_search_bc\n                      }\n        return end_points\n\n    def training_step(self, batch, batch_idx):\n        \"\"\"\n        {\"estimation_boxes\": estimation_boxs.transpose(1, 2).contiguous(),\n                  \"vote_center\": vote_xyz,\n                  \"pred_seg_score\": estimation_cla,\n                  \"center_xyz\": center_xyzs,\n                  \"seed_idxs\":\n                  \"seg_label\"\n                  \"pred_search_bc\": pred_search_bc\n        }\n        \"\"\"\n        end_points = self(batch)\n\n        search_pc = batch['points2cc_dist_s']\n        estimation_cla = end_points['estimation_cla']  # B,N\n        N = estimation_cla.shape[1]\n        seg_label = batch['seg_label']\n        sample_idxs = end_points['sample_idxs']  # B,N\n        seg_label = seg_label.gather(dim=1, index=sample_idxs[:, :N].long())  # B,N\n        search_pc = search_pc.gather(dim=1, index=sample_idxs[:, :N, None].repeat(1, 1, self.config.bc_channel).long())\n        # update label\n        batch['seg_label'] = seg_label\n        batch['points2cc_dist_s'] = search_pc\n        # compute loss\n        loss_dict = self.compute_loss(batch, end_points)\n        loss = loss_dict['loss_objective'] * self.config.objectiveness_weight \\\n               + loss_dict['loss_box'] * self.config.box_weight \\\n               + loss_dict['loss_seg'] * self.config.seg_weight \\\n               + loss_dict['loss_vote'] * self.config.vote_weight \\\n               + 
loss_dict['loss_bc'] * self.config.bc_weight\n\n        # log\n        self.log('loss/train', loss.item(), on_step=True, on_epoch=True, prog_bar=True, logger=False)\n        self.log('loss_box/train', loss_dict['loss_box'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n        self.log('loss_seg/train', loss_dict['loss_seg'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n        self.log('loss_vote/train', loss_dict['loss_vote'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n        self.log('loss_bc/train', loss_dict['loss_bc'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n        self.log('loss_objective/train', loss_dict['loss_objective'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n\n        self.logger.experiment.add_scalars('loss', {'loss_total': loss.item(),\n                                                    'loss_box': loss_dict['loss_box'].item(),\n                                                    'loss_seg': loss_dict['loss_seg'].item(),\n                                                    'loss_vote': loss_dict['loss_vote'].item(),\n                                                    'loss_objective': loss_dict['loss_objective'].item(),\n                                                    'loss_bc': loss_dict['loss_bc'].item()},\n                                           global_step=self.global_step)\n\n        return loss\n"
  },
  {
    "path": "models/head/rpn.py",
    "content": "\"\"\" \nrpn.py\nCreated by zenn at 2021/5/8 20:55\n\"\"\"\nimport torch\nfrom torch import nn\nfrom pointnet2.utils import pytorch_utils as pt_utils\n\nfrom pointnet2.utils.pointnet2_modules import PointnetSAModule\n\n\nclass P2BVoteNetRPN(nn.Module):\n\n    def __init__(self, feature_channel, vote_channel=256, num_proposal=64, normalize_xyz=False):\n        super().__init__()\n        self.num_proposal = num_proposal\n        self.FC_layer_cla = (\n            pt_utils.Seq(feature_channel)\n                .conv1d(feature_channel, bn=True)\n                .conv1d(feature_channel, bn=True)\n                .conv1d(1, activation=None))\n        self.vote_layer = (\n            pt_utils.Seq(3 + feature_channel)\n                .conv1d(feature_channel, bn=True)\n                .conv1d(feature_channel, bn=True)\n                .conv1d(3 + feature_channel, activation=None))\n\n        self.vote_aggregation = PointnetSAModule(\n            radius=0.3,\n            nsample=16,\n            mlp=[1 + feature_channel, vote_channel, vote_channel, vote_channel],\n            use_xyz=True,\n            normalize_xyz=normalize_xyz)\n\n        self.FC_proposal = (\n            pt_utils.Seq(vote_channel)\n                .conv1d(vote_channel, bn=True)\n                .conv1d(vote_channel, bn=True)\n                .conv1d(3 + 1 + 1, activation=None))\n\n    def forward(self, xyz, feature):\n        \"\"\"\n\n        :param xyz: B,N,3\n        :param feature: B,f,N\n        :return: B,N,4+1 (xyz,theta,targetnessscore)\n        \"\"\"\n        estimation_cla = self.FC_layer_cla(feature).squeeze(1)\n        score = estimation_cla.sigmoid()\n\n        xyz_feature = torch.cat((xyz.transpose(1, 2).contiguous(), feature), dim=1)\n\n        offset = self.vote_layer(xyz_feature)\n        vote = xyz_feature + offset\n        vote_xyz = vote[:, 0:3, :].transpose(1, 2).contiguous()\n        vote_feature = vote[:, 3:, :]\n\n        vote_feature = torch.cat((score.unsqueeze(1), vote_feature), dim=1)\n\n        center_xyzs, proposal_features = self.vote_aggregation(vote_xyz, vote_feature, self.num_proposal)\n        proposal_offsets = self.FC_proposal(proposal_features)\n        estimation_boxes = torch.cat(\n            (proposal_offsets[:, 0:3, :] + center_xyzs.transpose(1, 2).contiguous(), proposal_offsets[:, 3:5, :]),\n            dim=1)\n\n        estimation_boxes = estimation_boxes.transpose(1, 2).contiguous()\n        return estimation_boxes, estimation_cla, vote_xyz, center_xyzs\n"
  },
  {
    "path": "models/head/xcorr.py",
    "content": "# Created by zenn at 2021/5/8\nimport torch\nfrom torch import nn\nfrom pointnet2.utils import pytorch_utils as pt_utils\nfrom pointnet2.utils import pointnet2_utils\n\nimport torch.nn.functional as F\n\n\nclass BaseXCorr(nn.Module):\n    def __init__(self, in_channel, hidden_channel, out_channel):\n        super().__init__()\n        self.cosine = nn.CosineSimilarity(dim=1)\n        self.mlp = pt_utils.SharedMLP([in_channel, hidden_channel, hidden_channel, hidden_channel], bn=True)\n        self.fea_layer = (pt_utils.Seq(hidden_channel)\n                          .conv1d(hidden_channel, bn=True)\n                          .conv1d(out_channel, activation=None))\n\n\nclass P2B_XCorr(BaseXCorr):\n    def __init__(self, feature_channel, hidden_channel, out_channel):\n        mlp_in_channel = feature_channel + 4\n        super().__init__(mlp_in_channel, hidden_channel, out_channel)\n\n    def forward(self, template_feature, search_feature, template_xyz):\n        \"\"\"\n\n        :param template_feature: B,f,M\n        :param search_feature: B,f,N\n        :param template_xyz: B,M,3\n        :return:\n        \"\"\"\n        B = template_feature.size(0)\n        f = template_feature.size(1)\n        n1 = template_feature.size(2)\n        n2 = search_feature.size(2)\n        final_out_cla = self.cosine(template_feature.unsqueeze(-1).expand(B, f, n1, n2),\n                                    search_feature.unsqueeze(2).expand(B, f, n1, n2))  # B,n1,n2\n\n        fusion_feature = torch.cat(\n            (final_out_cla.unsqueeze(1), template_xyz.transpose(1, 2).contiguous().unsqueeze(-1).expand(B, 3, n1, n2)),\n            dim=1)  # B,1+3,n1,n2\n\n        fusion_feature = torch.cat((fusion_feature, template_feature.unsqueeze(-1).expand(B, f, n1, n2)),\n                                   dim=1)  # B,1+3+f,n1,n2\n\n        fusion_feature = self.mlp(fusion_feature)\n\n        fusion_feature = F.max_pool2d(fusion_feature, kernel_size=[fusion_feature.size(2), 1])  # B, f, 1, n2\n        fusion_feature = fusion_feature.squeeze(2)  # B, f, n2\n        fusion_feature = self.fea_layer(fusion_feature)\n\n        return fusion_feature\n\n\nclass BoxAwareXCorr(BaseXCorr):\n    def __init__(self, feature_channel, hidden_channel, out_channel, k=8, use_search_bc=False, use_search_feature=False,\n                 bc_channel=9):\n        self.k = k\n        self.use_search_bc = use_search_bc\n        self.use_search_feature = use_search_feature\n        mlp_in_channel = feature_channel + 3 + bc_channel\n        if use_search_bc: mlp_in_channel += bc_channel\n        if use_search_feature: mlp_in_channel += feature_channel\n        super(BoxAwareXCorr, self).__init__(mlp_in_channel, hidden_channel, out_channel)\n\n    def forward(self, template_feature, search_feature, template_xyz,\n                search_xyz=None, template_bc=None, search_bc=None):\n        \"\"\"\n\n        :param template_feature: B,f,M\n        :param search_feature: B,f,N\n        :param template_xyz: B,M,3\n        :param search_xyz: B.N,3\n        :param template_bc: B,M,9\n        :param search_bc: B.N,9\n        :param args:\n        :param kwargs:\n        :return:\n        \"\"\"\n        dist_matrix = torch.cdist(template_bc, search_bc)  # B, M, N\n        template_xyz_feature_box = torch.cat([template_xyz.transpose(1, 2).contiguous(),\n                                              template_bc.transpose(1, 2).contiguous(),\n                                              template_feature], dim=1)\n        # 
search_xyz_feature = torch.cat([search_xyz.transpose(1, 2).contiguous(), search_feature], dim=1)\n\n        top_k_nearest_idx_b = torch.argsort(dist_matrix, dim=1)[:, :self.k, :]  # B, K, N\n        top_k_nearest_idx_b = top_k_nearest_idx_b.transpose(1, 2).contiguous().int()  # B, N, K\n        correspondences_b = pointnet2_utils.grouping_operation(template_xyz_feature_box,\n                                                               top_k_nearest_idx_b)  # B,3+9+D,N,K\n        if self.use_search_bc:\n            search_bc_expand = search_bc.transpose(1, 2).unsqueeze(dim=-1).repeat(1, 1, 1, self.K)  # B,9,N,K\n            correspondences_b = torch.cat([search_bc_expand, correspondences_b], dim=1)\n        if self.use_search_feature:\n            search_feature_expand = search_feature.unsqueeze(dim=-1).repeat(1, 1, 1, self.K)  # B,D,N,K\n            correspondences_b = torch.cat([search_feature_expand, correspondences_b], dim=1)\n\n        ## correspondences fusion head\n        fusion_feature = self.mlp(correspondences_b)  # B,D,N,K\n        fusion_feature, _ = torch.max(fusion_feature, dim=-1)  # B,D,N,1\n        fusion_feature = self.fea_layer(fusion_feature.squeeze(dim=-1))  # B,D,N\n\n        return fusion_feature\n"
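\n\n# Usage sketch (shapes assumed for illustration; grouping_operation expects CUDA tensors):\n#   xcorr = BoxAwareXCorr(feature_channel=256, hidden_channel=256, out_channel=256, k=4)\n#   fusion = xcorr(t_feat, s_feat, t_xyz, s_xyz, t_bc, s_bc)  # -> B,256,N\n"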
  },
  {
    "path": "models/m2track.py",
    "content": "\"\"\"\nm2track.py\nCreated by zenn at 2021/11/24 13:10\n\"\"\"\nfrom datasets import points_utils\nfrom models import base_model\nfrom models.backbone.pointnet import MiniPointNet, SegPointNet\n\nimport torch\nfrom torch import nn\nimport torch.nn.functional as F\n\nfrom utils.metrics import estimateOverlap, estimateAccuracy\nfrom torchmetrics import Accuracy\n\n\nclass M2TRACK(base_model.MotionBaseModel):\n    def __init__(self, config, **kwargs):\n        super().__init__(config, **kwargs)\n        self.seg_acc = Accuracy(num_classes=2, average='none')\n\n        self.box_aware = getattr(config, 'box_aware', False)\n        self.use_motion_cls = getattr(config, 'use_motion_cls', True)\n        self.use_second_stage = getattr(config, 'use_second_stage', True)\n        self.use_prev_refinement = getattr(config, 'use_prev_refinement', True)\n        self.seg_pointnet = SegPointNet(input_channel=3 + 1 + 1 + (9 if self.box_aware else 0),\n                                        per_point_mlp1=[64, 64, 64, 128, 1024],\n                                        per_point_mlp2=[512, 256, 128, 128],\n                                        output_size=2 + (9 if self.box_aware else 0))\n        self.mini_pointnet = MiniPointNet(input_channel=3 + 1 + (9 if self.box_aware else 0),\n                                          per_point_mlp=[64, 128, 256, 512],\n                                          hidden_mlp=[512, 256],\n                                          output_size=-1)\n        if self.use_second_stage:\n            self.mini_pointnet2 = MiniPointNet(input_channel=3 + (9 if self.box_aware else 0),\n                                               per_point_mlp=[64, 128, 256, 512],\n                                               hidden_mlp=[512, 256],\n                                               output_size=-1)\n\n            self.box_mlp = nn.Sequential(nn.Linear(256, 128),\n                                         nn.BatchNorm1d(128),\n                                         nn.ReLU(),\n                                         nn.Linear(128, 128),\n                                         nn.BatchNorm1d(128),\n                                         nn.ReLU(),\n                                         nn.Linear(128, 4))\n        if self.use_prev_refinement:\n            self.final_mlp = nn.Sequential(nn.Linear(256, 128),\n                                           nn.BatchNorm1d(128),\n                                           nn.ReLU(),\n                                           nn.Linear(128, 128),\n                                           nn.BatchNorm1d(128),\n                                           nn.ReLU(),\n                                           nn.Linear(128, 4))\n        if self.use_motion_cls:\n            self.motion_state_mlp = nn.Sequential(nn.Linear(256, 128),\n                                                  nn.BatchNorm1d(128),\n                                                  nn.ReLU(),\n                                                  nn.Linear(128, 128),\n                                                  nn.BatchNorm1d(128),\n                                                  nn.ReLU(),\n                                                  nn.Linear(128, 2))\n            self.motion_acc = Accuracy(num_classes=2, average='none')\n\n        self.motion_mlp = nn.Sequential(nn.Linear(256, 128),\n                                        nn.BatchNorm1d(128),\n                                        nn.ReLU(),\n                                       
 nn.Linear(128, 128),\n                                        nn.BatchNorm1d(128),\n                                        nn.ReLU(),\n                                        nn.Linear(128, 4))\n\n    def forward(self, input_dict):\n        \"\"\"\n        Args:\n            input_dict: {\n            \"points\": (B,N,3+1+1)\n            \"candidate_bc\": (B,N,9)\n\n        }\n\n        Returns: B,4\n\n        \"\"\"\n        output_dict = {}\n        x = input_dict[\"points\"].transpose(1, 2)\n        if self.box_aware:\n            candidate_bc = input_dict[\"candidate_bc\"].transpose(1, 2)\n            x = torch.cat([x, candidate_bc], dim=1)\n\n        B, _, N = x.shape\n\n        seg_out = self.seg_pointnet(x)\n        seg_logits = seg_out[:, :2, :]  # B,2,N\n        pred_cls = torch.argmax(seg_logits, dim=1, keepdim=True)  # B,1,N\n        mask_points = x[:, :4, :] * pred_cls\n        mask_xyz_t0 = mask_points[:, :3, :N // 2]  # B,3,N//2\n        mask_xyz_t1 = mask_points[:, :3, N // 2:]\n        if self.box_aware:\n            pred_bc = seg_out[:, 2:, :]\n            mask_pred_bc = pred_bc * pred_cls\n            # mask_pred_bc_t0 = mask_pred_bc[:, :, :N // 2]  # B,9,N//2\n            # mask_pred_bc_t1 = mask_pred_bc[:, :, N // 2:]\n            mask_points = torch.cat([mask_points, mask_pred_bc], dim=1)\n            output_dict['pred_bc'] = pred_bc.transpose(1, 2)\n\n        point_feature = self.mini_pointnet(mask_points)\n\n        # motion state prediction\n        motion_pred = self.motion_mlp(point_feature)  # B,4\n        if self.use_motion_cls:\n            motion_state_logits = self.motion_state_mlp(point_feature)  # B,2\n            motion_mask = torch.argmax(motion_state_logits, dim=1, keepdim=True)  # B,1\n            motion_pred_masked = motion_pred * motion_mask\n            output_dict['motion_cls'] = motion_state_logits\n        else:\n            motion_pred_masked = motion_pred\n        # previous bbox refinement\n        if self.use_prev_refinement:\n            prev_boxes = self.final_mlp(point_feature)  # previous bb, B,4\n            output_dict[\"estimation_boxes_prev\"] = prev_boxes[:, :4]\n        else:\n            prev_boxes = torch.zeros_like(motion_pred)\n\n        # 1st stage prediction\n        aux_box = points_utils.get_offset_box_tensor(prev_boxes, motion_pred_masked)\n\n        # 2nd stage refinement\n        if self.use_second_stage:\n            mask_xyz_t0_2_t1 = points_utils.get_offset_points_tensor(mask_xyz_t0.transpose(1, 2),\n                                                                     prev_boxes[:, :4],\n                                                                     motion_pred_masked).transpose(1, 2)  # B,3,N//2\n            mask_xyz_t01 = torch.cat([mask_xyz_t0_2_t1, mask_xyz_t1], dim=-1)  # B,3,N\n\n            # transform to the aux_box coordinate system\n            mask_xyz_t01 = points_utils.remove_transform_points_tensor(mask_xyz_t01.transpose(1, 2),\n                                                                       aux_box).transpose(1, 2)\n\n            if self.box_aware:\n                mask_xyz_t01 = torch.cat([mask_xyz_t01, mask_pred_bc], dim=1)\n            output_offset = self.box_mlp(self.mini_pointnet2(mask_xyz_t01))  # B,4\n            output = points_utils.get_offset_box_tensor(aux_box, output_offset)\n            output_dict[\"estimation_boxes\"] = output\n        else:\n            output_dict[\"estimation_boxes\"] = aux_box\n        output_dict.update({\"seg_logits\": seg_logits,\n                     
       \"motion_pred\": motion_pred,\n                            'aux_estimation_boxes': aux_box,\n                            })\n\n        return output_dict\n\n    def compute_loss(self, data, output):\n        loss_total = 0.0\n        loss_dict = {}\n        aux_estimation_boxes = output['aux_estimation_boxes']  # B,4\n        motion_pred = output['motion_pred']  # B,4\n        seg_logits = output['seg_logits']\n        with torch.no_grad():\n            seg_label = data['seg_label']\n            box_label = data['box_label']\n            box_label_prev = data['box_label_prev']\n            motion_label = data['motion_label']\n            motion_state_label = data['motion_state_label']\n            center_label = box_label[:, :3]\n            angle_label = torch.sin(box_label[:, 3])\n            center_label_prev = box_label_prev[:, :3]\n            angle_label_prev = torch.sin(box_label_prev[:, 3])\n            center_label_motion = motion_label[:, :3]\n            angle_label_motion = torch.sin(motion_label[:, 3])\n\n        loss_seg = F.cross_entropy(seg_logits, seg_label, weight=torch.tensor([0.5, 2.0]).cuda())\n        if self.use_motion_cls:\n            motion_cls = output['motion_cls']  # B,2\n            loss_motion_cls = F.cross_entropy(motion_cls, motion_state_label)\n            loss_total += loss_motion_cls * self.config.motion_cls_seg_weight\n            loss_dict['loss_motion_cls'] = loss_motion_cls\n\n            loss_center_motion = F.smooth_l1_loss(motion_pred[:, :3], center_label_motion, reduction='none')\n            loss_center_motion = (motion_state_label * loss_center_motion.mean(dim=1)).sum() / (\n                    motion_state_label.sum() + 1e-6)\n            loss_angle_motion = F.smooth_l1_loss(torch.sin(motion_pred[:, 3]), angle_label_motion, reduction='none')\n            loss_angle_motion = (motion_state_label * loss_angle_motion).sum() / (motion_state_label.sum() + 1e-6)\n        else:\n            loss_center_motion = F.smooth_l1_loss(motion_pred[:, :3], center_label_motion)\n            loss_angle_motion = F.smooth_l1_loss(torch.sin(motion_pred[:, 3]), angle_label_motion)\n\n        if self.use_second_stage:\n            estimation_boxes = output['estimation_boxes']  # B,4\n            loss_center = F.smooth_l1_loss(estimation_boxes[:, :3], center_label)\n            loss_angle = F.smooth_l1_loss(torch.sin(estimation_boxes[:, 3]), angle_label)\n            loss_total += 1 * (loss_center * self.config.center_weight + loss_angle * self.config.angle_weight)\n            loss_dict[\"loss_center\"] = loss_center\n            loss_dict[\"loss_angle\"] = loss_angle\n        if self.use_prev_refinement:\n            estimation_boxes_prev = output['estimation_boxes_prev']  # B,4\n            loss_center_prev = F.smooth_l1_loss(estimation_boxes_prev[:, :3], center_label_prev)\n            loss_angle_prev = F.smooth_l1_loss(torch.sin(estimation_boxes_prev[:, 3]), angle_label_prev)\n            loss_total += (loss_center_prev * self.config.center_weight + loss_angle_prev * self.config.angle_weight)\n            loss_dict[\"loss_center_prev\"] = loss_center_prev\n            loss_dict[\"loss_angle_prev\"] = loss_angle_prev\n\n        loss_center_aux = F.smooth_l1_loss(aux_estimation_boxes[:, :3], center_label)\n\n        loss_angle_aux = F.smooth_l1_loss(torch.sin(aux_estimation_boxes[:, 3]), angle_label)\n\n        loss_total += loss_seg * self.config.seg_weight \\\n                      + 1 * (loss_center_aux * self.config.center_weight + loss_angle_aux * 
self.config.angle_weight) \\\n                      + 1 * (\n                              loss_center_motion * self.config.center_weight + loss_angle_motion * self.config.angle_weight)\n        loss_dict.update({\n            \"loss_total\": loss_total,\n            \"loss_seg\": loss_seg,\n            \"loss_center_aux\": loss_center_aux,\n            \"loss_center_motion\": loss_center_motion,\n            \"loss_angle_aux\": loss_angle_aux,\n            \"loss_angle_motion\": loss_angle_motion,\n        })\n        if self.box_aware:\n            prev_bc = data['prev_bc']\n            this_bc = data['this_bc']\n            bc_label = torch.cat([prev_bc, this_bc], dim=1)\n            pred_bc = output['pred_bc']\n            loss_bc = F.smooth_l1_loss(pred_bc, bc_label)\n            loss_total += loss_bc * self.config.bc_weight\n            loss_dict.update({\n                \"loss_total\": loss_total,\n                \"loss_bc\": loss_bc\n            })\n\n        return loss_dict\n\n    def training_step(self, batch, batch_idx):\n        \"\"\"\n        Args:\n            batch: {\n            \"points\": stack_frames, (B,N,3+9+1)\n            \"seg_label\": stack_label,\n            \"box_label\": np.append(this_gt_bb_transform.center, theta),\n            \"box_size\": this_gt_bb_transform.wlh\n        }\n        Returns:\n\n        \"\"\"\n        output = self(batch)\n        loss_dict = self.compute_loss(batch, output)\n        loss = loss_dict['loss_total']\n\n        # log\n        seg_acc = self.seg_acc(torch.argmax(output['seg_logits'], dim=1, keepdim=False), batch['seg_label'])\n        self.log('seg_acc_background/train', seg_acc[0], on_step=True, on_epoch=True, prog_bar=False, logger=True)\n        self.log('seg_acc_foreground/train', seg_acc[1], on_step=True, on_epoch=True, prog_bar=False, logger=True)\n        if self.use_motion_cls:\n            motion_acc = self.motion_acc(torch.argmax(output['motion_cls'], dim=1, keepdim=False),\n                                         batch['motion_state_label'])\n            self.log('motion_acc_static/train', motion_acc[0], on_step=True, on_epoch=True, prog_bar=False, logger=True)\n            self.log('motion_acc_dynamic/train', motion_acc[1], on_step=True, on_epoch=True, prog_bar=False,\n                     logger=True)\n\n        log_dict = {k: v.item() for k, v in loss_dict.items()}\n\n        self.logger.experiment.add_scalars('loss', log_dict,\n                                           global_step=self.global_step)\n        return loss\n\n\n"
  },
  {
    "path": "models/p2b.py",
    "content": "\"\"\" \np2b.py\nCreated by zenn at 2021/5/9 13:47\n\"\"\"\n\nfrom torch import nn\nfrom models.backbone.pointnet import Pointnet_Backbone\nfrom models.head.xcorr import P2B_XCorr\nfrom models.head.rpn import P2BVoteNetRPN\nfrom models import base_model\n\n\nclass P2B(base_model.MatchingBaseModel):\n    def __init__(self, config=None, **kwargs):\n        super().__init__(config, **kwargs)\n        self.save_hyperparameters()\n        self.backbone = Pointnet_Backbone(self.config.use_fps, self.config.normalize_xyz, return_intermediate=False)\n        self.conv_final = nn.Conv1d(256, self.config.feature_channel, kernel_size=1)\n\n        self.xcorr = P2B_XCorr(feature_channel=self.config.feature_channel,\n                               hidden_channel=self.config.hidden_channel,\n                               out_channel=self.config.out_channel)\n        self.rpn = P2BVoteNetRPN(self.config.feature_channel,\n                                 vote_channel=self.config.vote_channel,\n                                 num_proposal=self.config.num_proposal,\n                                 normalize_xyz=self.config.normalize_xyz)\n\n    def forward(self, input_dict):\n        \"\"\"\n        :param input_dict:\n        {\n        'template_points': template_points.astype('float32'),\n        'search_points': search_points.astype('float32'),\n        'box_label': np.array(search_bbox_reg).astype('float32'),\n        'bbox_size': search_box.wlh,\n        'seg_label': seg_label.astype('float32'),\n        }\n\n        :return:\n        \"\"\"\n        template = input_dict['template_points']\n        search = input_dict['search_points']\n        M = template.shape[1]\n        N = search.shape[1]\n        template_xyz, template_feature, _ = self.backbone(template, [M // 2, M // 4, M // 8])\n        search_xyz, search_feature, sample_idxs = self.backbone(search, [N // 2, N // 4, N // 8])\n        template_feature = self.conv_final(template_feature)\n        search_feature = self.conv_final(search_feature)\n        fusion_feature = self.xcorr(template_feature, search_feature, template_xyz)\n        estimation_boxes, estimation_cla, vote_xyz, center_xyzs = self.rpn(search_xyz, fusion_feature)\n        end_points = {\"estimation_boxes\": estimation_boxes,\n                      \"vote_center\": vote_xyz,\n                      \"pred_seg_score\": estimation_cla,\n                      \"center_xyz\": center_xyzs,\n                      'sample_idxs': sample_idxs,\n                      'estimation_cla': estimation_cla,\n                      \"vote_xyz\": vote_xyz,\n                      }\n        return end_points\n\n    def training_step(self, batch, batch_idx):\n        \"\"\"\n        {\"estimation_boxes\": estimation_boxs.transpose(1, 2).contiguous(),\n                  \"vote_center\": vote_xyz,\n                  \"pred_seg_score\": estimation_cla,\n                  \"center_xyz\": center_xyzs,\n                  \"seed_idxs\":\n                  \"seg_label\"\n        }\n        \"\"\"\n        end_points = self(batch)\n        estimation_cla = end_points['estimation_cla']  # B,N\n        N = estimation_cla.shape[1]\n        seg_label = batch['seg_label']\n        sample_idxs = end_points['sample_idxs']  # B,N\n        # update label\n        seg_label = seg_label.gather(dim=1, index=sample_idxs[:, :N].long())  # B,N\n        batch[\"seg_label\"] = seg_label\n        # compute loss\n        loss_dict = self.compute_loss(batch, end_points)\n        loss = 
loss_dict['loss_objective'] * self.config.objectiveness_weight \\\n               + loss_dict['loss_box'] * self.config.box_weight \\\n               + loss_dict['loss_seg'] * self.config.seg_weight \\\n               + loss_dict['loss_vote'] * self.config.vote_weight\n        self.log('loss/train', loss.item(), on_step=True, on_epoch=True, prog_bar=True, logger=False)\n        self.log('loss_box/train', loss_dict['loss_box'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n        self.log('loss_seg/train', loss_dict['loss_seg'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n        self.log('loss_vote/train', loss_dict['loss_vote'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n        self.log('loss_objective/train', loss_dict['loss_objective'].item(), on_step=True, on_epoch=True, prog_bar=True,\n                 logger=False)\n        self.logger.experiment.add_scalars('loss', {'loss_total': loss.item(),\n                                                    'loss_box': loss_dict['loss_box'].item(),\n                                                    'loss_seg': loss_dict['loss_seg'].item(),\n                                                    'loss_vote': loss_dict['loss_vote'].item(),\n                                                    'loss_objective': loss_dict['loss_objective'].item()},\n                                           global_step=self.global_step)\n\n        return loss\n"
  },
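  {
    "path": "models/p2b_seg_label_demo.py",
    "content": "\"\"\" \np2b_seg_label_demo.py\nIllustrative sketch, not part of the original release: shows the torch.gather\nre-indexing that P2B.training_step uses to keep the per-point segmentation\nlabels aligned with the points kept by the backbone's sampling. All shapes\nand values here are made up.\n\"\"\"\nimport torch\n\nif __name__ == '__main__':\n    B, N_in, N_out = 2, 8, 4\n    # one 0/1 label per input point, standing in for batch['seg_label']\n    seg_label = torch.randint(0, 2, (B, N_in)).float()\n    # indices of the kept points, standing in for end_points['sample_idxs']\n    sample_idxs = torch.stack([torch.randperm(N_in)[:N_out] for _ in range(B)])\n    # keep only the labels of the sampled points, matching estimation_cla (B, N_out)\n    sampled_label = seg_label.gather(dim=1, index=sample_idxs[:, :N_out].long())\n    print(sampled_label.shape)  # torch.Size([2, 4])\n"
  },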
  {
    "path": "pointnet2/__init__.py",
    "content": "from __future__ import (\n    division,\n    absolute_import,\n    with_statement,\n    print_function,\n    unicode_literals,\n)\n\n__version__ = \"2.1.1\"\n\ntry:\n    __POINTNET2_SETUP__\nexcept NameError:\n    __POINTNET2_SETUP__ = False\n\nif not __POINTNET2_SETUP__:\n    from pointnet2 import utils\n"
  },
  {
    "path": "pointnet2/utils/__init__.py",
    "content": "from __future__ import (\n    division,\n    absolute_import,\n    with_statement,\n    print_function,\n    unicode_literals,\n)\nfrom . import pointnet2_utils\nfrom . import pointnet2_modules\n"
  },
  {
    "path": "pointnet2/utils/linalg_utils.py",
    "content": "from __future__ import (\n    division,\n    absolute_import,\n    with_statement,\n    print_function,\n    unicode_literals,\n)\nimport torch\nfrom enum import Enum\nimport numpy as np\n\nPDist2Order = Enum(\"PDist2Order\", \"d_first d_second\")\n\n\ndef pdist2(X, Z=None, order=PDist2Order.d_second):\n    # type: (torch.Tensor, torch.Tensor, PDist2Order) -> torch.Tensor\n    r\"\"\" Calculates the pairwise distance between X and Z\n\n    D[b, i, j] = l2 distance X[b, i] and Z[b, j]\n\n    Parameters\n    ---------\n    X : torch.Tensor\n        X is a (B, N, d) tensor.  There are B batches, and N vectors of dimension d\n    Z: torch.Tensor\n        Z is a (B, M, d) tensor.  If Z is None, then Z = X\n\n    Returns\n    -------\n    torch.Tensor\n        Distance matrix is size (B, N, M)\n    \"\"\"\n\n    if order == PDist2Order.d_second:\n        if X.dim() == 2:\n            X = X.unsqueeze(0)\n        if Z is None:\n            Z = X\n            G = np.matmul(X, Z.transpose(-2, -1))\n            S = (X * X).sum(-1, keepdim=True)\n            R = S.transpose(-2, -1)\n        else:\n            if Z.dim() == 2:\n                Z = Z.unsqueeze(0)\n            G = np.matmul(X, Z.transpose(-2, -1))\n            S = (X * X).sum(-1, keepdim=True)\n            R = (Z * Z).sum(-1, keepdim=True).transpose(-2, -1)\n    else:\n        if X.dim() == 2:\n            X = X.unsqueeze(0)\n        if Z is None:\n            Z = X\n            G = np.matmul(X.transpose(-2, -1), Z)\n            R = (X * X).sum(-2, keepdim=True)\n            S = R.transpose(-2, -1)\n        else:\n            if Z.dim() == 2:\n                Z = Z.unsqueeze(0)\n            G = np.matmul(X.transpose(-2, -1), Z)\n            S = (X * X).sum(-2, keepdim=True).transpose(-2, -1)\n            R = (Z * Z).sum(-2, keepdim=True)\n\n    return torch.abs(R + S - 2 * G).squeeze(0)\n\n\ndef pdist2_slow(X, Z=None):\n    if Z is None:\n        Z = X\n    D = torch.zeros(X.size(0), X.size(2), Z.size(2))\n\n    for b in range(D.size(0)):\n        for i in range(D.size(1)):\n            for j in range(D.size(2)):\n                D[b, i, j] = torch.dist(X[b, :, i], Z[b, :, j])\n    return D\n\n\nif __name__ == \"__main__\":\n    X = torch.randn(2, 3, 5)\n    Z = torch.randn(2, 3, 3)\n\n    print(pdist2(X, order=PDist2Order.d_first))\n    print(pdist2_slow(X))\n    print(torch.dist(pdist2(X, order=PDist2Order.d_first), pdist2_slow(X)))\n"
  },
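  {
    "path": "pointnet2/utils/linalg_utils_demo.py",
    "content": "\"\"\" \nlinalg_utils_demo.py\nIllustrative sketch, not part of the original release: sanity-checks pdist2\nagainst torch.cdist on random CPU tensors. Importing anything under\npointnet2.utils pulls in the compiled pointnet2_ops extension, so the\npackage must be installed first.\n\"\"\"\nimport torch\n\nfrom pointnet2.utils.linalg_utils import PDist2Order, pdist2\n\nif __name__ == '__main__':\n    X = torch.randn(2, 4, 3)  # (B, N, d) for order=d_second\n    Z = torch.randn(2, 5, 3)  # (B, M, d)\n    D = pdist2(X, Z, order=PDist2Order.d_second)  # (B, N, M) l2 distances\n    D_ref = torch.cdist(X, Z)\n    print(torch.allclose(D, D_ref, atol=1e-4))  # True up to numerical error\n"
  },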
  {
    "path": "pointnet2/utils/pointnet2_modules.py",
    "content": "\"\"\" PointNet++ Layers\nModified by Zenn\nDate: Feb 2021\n\"\"\"\nfrom __future__ import (\n    division,\n    absolute_import,\n    with_statement,\n    print_function,\n    unicode_literals,\n)\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom pointnet2.utils import pytorch_utils as pt_utils\n\nfrom pointnet2.utils import pointnet2_utils\n\nif False:\n    # Workaround for type hints without depending on the `typing` module\n    from typing import *\n\n\nclass _PointnetSAModuleBase(nn.Module):\n    def __init__(self, use_fps=False):\n        super(_PointnetSAModuleBase, self).__init__()\n        self.groupers = None\n        self.mlps = None\n        self.use_fps = use_fps\n\n    def forward(self, xyz, features, npoint, return_idx=False):\n        # modified to return sample idxs\n        r\"\"\"\n        Parameters\n        ----------\n        xyz : torch.Tensor\n            (B, N, 3) tensor of the xyz coordinates of the features\n        features : torch.Tensor\n            (B, C, N) tensor of the descriptors of the the features\n\n        Returns\n        -------\n        new_xyz : torch.Tensor\n            (B, npoint, 3) tensor of the new features' xyz\n        new_features : torch.Tensor\n            (B,  \\sum_k(mlps[k][-1]), npoint) tensor of the new_features descriptors\n        \"\"\"\n\n        self.npoint = npoint\n        new_features_list = []\n\n        xyz_flipped = xyz.transpose(1, 2).contiguous()\n        if self.use_fps:\n            sample_idxs = pointnet2_utils.furthest_point_sample(xyz, self.npoint)\n        else:\n            sample_idxs = torch.arange(self.npoint).repeat(xyz.size(0), 1).int().cuda()\n\n        new_xyz = (\n            pointnet2_utils.gather_operation(xyz_flipped, sample_idxs)\n                .transpose(1, 2)\n                .contiguous()\n        )\n\n        for i in range(len(self.groupers)):\n            new_features = self.groupers[i](\n                xyz, new_xyz, features\n            )  # (B, C, npoint, nsample)\n\n            new_features = self.mlps[i](new_features)  # (B, mlp[-1], npoint, nsample)\n            new_features = F.max_pool2d(\n                new_features, kernel_size=[1, new_features.size(3)]\n            )  # (B, mlp[-1], npoint, 1)\n            new_features = new_features.squeeze(-1)  # (B, mlp[-1], npoint)\n\n            new_features_list.append(new_features)\n        if return_idx:\n            return new_xyz, torch.cat(new_features_list, dim=1), sample_idxs\n        else:\n            return new_xyz, torch.cat(new_features_list, dim=1)\n\n\nclass PointnetSAModuleMSG(_PointnetSAModuleBase):\n    r\"\"\"Pointnet set abstrction layer with multiscale grouping\n\n    Parameters\n    ----------\n    npoint : int\n        Number of features\n    radii : list of float32\n        list of radii to group with\n    nsamples : list of int32\n        Number of samples in each ball query\n    mlps : list of list of int32\n        Spec of the pointnet before the global max_pool for each scale\n    bn : bool\n        Use batchnorm\n    \"\"\"\n\n    def __init__(self, radii, nsamples, mlps, bn=True, use_xyz=True, use_fps=False, normalize_xyz=False):\n        # type: (PointnetSAModuleMSG, List[float],List[int], List[List[int]], bool, bool,bool) -> None\n        super(PointnetSAModuleMSG, self).__init__(use_fps=use_fps)\n\n        assert len(radii) == len(nsamples) == len(mlps)\n\n        self.groupers = nn.ModuleList()\n        self.mlps = nn.ModuleList()\n        for i in 
range(len(radii)):\n            radius = radii[i]\n            nsample = nsamples[i]\n            self.groupers.append(\n                pointnet2_utils.QueryAndGroup(radius, nsample, use_xyz=use_xyz, normalize_xyz=normalize_xyz))\n\n            mlp_spec = mlps[i]\n            if use_xyz:\n                mlp_spec[0] += 3\n\n            self.mlps.append(pt_utils.SharedMLP(mlp_spec, bn=bn))\n\n\nclass PointnetSAModule(PointnetSAModuleMSG):\n    r\"\"\"Pointnet set abstrction layer\n\n    Parameters\n    ----------\n    npoint : int\n        Number of features\n    radius : float\n        Radius of ball\n    nsample : int\n        Number of samples in the ball query\n    mlp : list\n        Spec of the pointnet before the global max_pool\n    bn : bool\n        Use batchnorm\n    \"\"\"\n\n    def __init__(\n            self, mlp, radius=None, nsample=None, bn=True, use_xyz=True, use_fps=False, normalize_xyz=False\n    ):\n        # type: (PointnetSAModule, List[int], float, int, bool, bool, bool,bool) -> None\n        super(PointnetSAModule, self).__init__(\n            mlps=[mlp],\n            radii=[radius],\n            nsamples=[nsample],\n            bn=bn,\n            use_xyz=use_xyz,\n            use_fps=use_fps,\n            normalize_xyz=normalize_xyz\n        )\n\n\nclass PointnetFPModule(nn.Module):\n    r\"\"\"Propigates the features of one set to another\n\n    Parameters\n    ----------\n    mlp : list\n        Pointnet module parameters\n    bn : bool\n        Use batchnorm\n    \"\"\"\n\n    def __init__(self, mlp, bn=True):\n        # type: (PointnetFPModule, List[int], bool) -> None\n        super(PointnetFPModule, self).__init__()\n        self.mlp = pt_utils.SharedMLP(mlp, bn=bn)\n\n    def forward(self, unknown, known, unknow_feats, known_feats):\n        # type: (PointnetFPModule, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor) -> torch.Tensor\n        r\"\"\"\n        Parameters\n        ----------\n        unknown : torch.Tensor\n            (B, n, 3) tensor of the xyz positions of the unknown features\n        known : torch.Tensor\n            (B, m, 3) tensor of the xyz positions of the known features\n        unknow_feats : torch.Tensor\n            (B, C1, n) tensor of the features to be propigated to\n        known_feats : torch.Tensor\n            (B, C2, m) tensor of features to be propigated\n\n        Returns\n        -------\n        new_features : torch.Tensor\n            (B, mlp[-1], n) tensor of the features of the unknown features\n        \"\"\"\n\n        if known is not None:\n            dist, idx = pointnet2_utils.three_nn(unknown, known)\n            dist_recip = 1.0 / (dist + 1e-8)\n            norm = torch.sum(dist_recip, dim=2, keepdim=True)\n            weight = dist_recip / norm\n\n            interpolated_feats = pointnet2_utils.three_interpolate(\n                known_feats, idx, weight\n            )\n        else:\n            interpolated_feats = known_feats.expand(\n                *(known_feats.size()[0:2] + [unknown.size(1)])\n            )\n\n        if unknow_feats is not None:\n            new_features = torch.cat(\n                [interpolated_feats, unknow_feats], dim=1\n            )  # (B, C2 + C1, n)\n        else:\n            new_features = interpolated_feats\n\n        new_features = new_features.unsqueeze(-1)\n        new_features = self.mlp(new_features)\n\n        return new_features.squeeze(-1)\n\n\nclass FlowEmbedding(nn.Module):\n    \"\"\"Modified from 
https://github.com/hyangwinter/flownet3d_pytorch/blob/master/util.py\"\"\"\n\n    def __init__(self, radius, nsample, in_channel, mlp, pooling='max', corr_func='concat', knn=True):\n        super(FlowEmbedding, self).__init__()\n        self.radius = radius\n        self.nsample = nsample\n        self.knn = knn\n        self.pooling = pooling\n        self.corr_func = corr_func\n        self.mlp_convs = nn.ModuleList()\n        self.mlp_bns = nn.ModuleList()\n        if corr_func is 'concat':\n            last_channel = in_channel * 2 + 3\n        for out_channel in mlp:\n            self.mlp_convs.append(nn.Conv2d(last_channel, out_channel, 1, bias=False))\n            self.mlp_bns.append(nn.BatchNorm2d(out_channel))\n            last_channel = out_channel\n\n    def forward(self, xyz1, xyz2, feature1, feature2):\n        \"\"\"\n        Input:\n            xyz1: (batch_size, npoint, 3)\n            xyz2: (batch_size, npoint, 3)\n            feat1: (batch_size, channel, npoint)\n            feat2: (batch_size, channel, npoint)\n        Output:\n            xyz1: (batch_size, npoint, 3)\n            feat1_new: (batch_size, mlp[-1], npoint)\n        \"\"\"\n        xyz1_t = xyz1.permute(0, 2, 1).contiguous()\n        xyz2_t = xyz2.permute(0, 2, 1).contiguous()\n        # feature1 = feature1.permute(0, 2, 1).contiguous()\n        # feature2 = feature2.permute(0, 2, 1).contiguous()\n\n        B, N, C = xyz1.shape\n        if self.knn:\n            idx = pointnet2_utils.knn_point(self.nsample, xyz1, xyz2)  # (B, npoint, nsample)\n        else:\n            idx, cnt = pointnet2_utils.ball_query(self.radius, self.nsample, xyz2, xyz1)  # (B, npoint, nsample)\n\n        xyz2_grouped = pointnet2_utils.grouping_operation(xyz2_t, idx)  # (B, 3, npoint, nsample)\n        pos_diff = xyz2_grouped - xyz1_t.view(B, -1, N, 1)  # (B, 3, npoint, nsample)\n\n        feat2_grouped = pointnet2_utils.grouping_operation(feature2, idx)  # [B, C, npoint, nsample]\n        if self.corr_func == 'concat':\n            feat_diff = torch.cat([feat2_grouped, feature1.view(B, -1, N, 1).repeat(1, 1, 1, self.nsample)], dim=1)\n\n        feat1_new = torch.cat([pos_diff, feat_diff], dim=1)  # [B, 2*C+3,npoint, nsample]\n        for i, conv in enumerate(self.mlp_convs):\n            bn = self.mlp_bns[i]\n            feat1_new = F.relu(bn(conv(feat1_new)))\n\n        feat1_new = torch.max(feat1_new, -1)[0]  # [B, mlp[-1], npoint]\n        return xyz1, feat1_new\n\n\nclass PointNetSetUpConv(nn.Module):\n    def __init__(self, nsample, radius, f1_channel, f2_channel, mlp, mlp2, knn=True):\n        super(PointNetSetUpConv, self).__init__()\n        self.nsample = nsample\n        self.radius = radius\n        self.knn = knn\n        self.mlp1_convs = nn.ModuleList()\n        self.mlp2_convs = nn.ModuleList()\n        last_channel = f2_channel + 3\n        for out_channel in mlp:\n            self.mlp1_convs.append(nn.Sequential(nn.Conv2d(last_channel, out_channel, 1, bias=False),\n                                                 nn.BatchNorm2d(out_channel),\n                                                 nn.ReLU(inplace=False)))\n            last_channel = out_channel\n        if len(mlp) is not 0:\n            last_channel = mlp[-1] + f1_channel\n        else:\n            last_channel = last_channel + f1_channel\n        for out_channel in mlp2:\n            self.mlp2_convs.append(nn.Sequential(nn.Conv1d(last_channel, out_channel, 1, bias=False),\n                                                 nn.BatchNorm1d(out_channel),\n 
                                                nn.ReLU(inplace=False)))\n            last_channel = out_channel\n\n    def forward(self, xyz1, xyz2, feature1, feature2):\n        \"\"\"\n            Feature propagation from xyz2 (less points) to xyz1 (more points)\n        Inputs:\n            xyz1: (batch_size, npoint1, 3)\n            xyz2: (batch_size, npoint2, 3)\n            feat1: (batch_size, channel1, npoint1) features for xyz1 points (earlier layers)\n            feat2: (batch_size, channel2, npoint2) features for xyz2 points\n        Output:\n            feat1_new: (batch_size, mlp[-1] or mlp2[-1] or channel1+3, npoint2)\n            TODO: Add support for skip links. Study how delta(XYZ) plays a role in feature updating.\n        \"\"\"\n        xyz1_t = xyz1.permute(0, 2, 1).contiguous()\n        xyz2_t = xyz2.permute(0, 2, 1).contiguous()\n        # feature1 = feature1.permute(0, 2, 1).contiguous()\n        # feature2 = feature2.permute(0, 2, 1).contiguous()\n        B, C, N = xyz1_t.shape\n        if self.knn:\n            idx = pointnet2_utils.knn_point(self.nsample, xyz1, xyz2)  # (B, npoint1, nsample)\n        else:\n            idx, cnt = pointnet2_utils.ball_query(self.radius, self.nsample, xyz2, xyz1)  # (B, npoint1, nsample)\n\n        xyz2_grouped = pointnet2_utils.grouping_operation(xyz2_t, idx)\n        pos_diff = xyz2_grouped - xyz1_t.view(B, -1, N, 1)  # [B,3,N1,S]\n\n        feat2_grouped = pointnet2_utils.grouping_operation(feature2, idx)\n        feat_new = torch.cat([feat2_grouped, pos_diff], dim=1)  # [B,C1+3,N1,S]\n        for conv in self.mlp1_convs:\n            feat_new = conv(feat_new)\n        # max pooling\n        feat_new = feat_new.max(-1)[0]  # [B,mlp1[-1],N1]\n        # concatenate feature in early layer\n        if feature1 is not None:\n            feat_new = torch.cat([feat_new, feature1], dim=1)\n        # feat_new = feat_new.view(B,-1,N,1)\n        for conv in self.mlp2_convs:\n            feat_new = conv(feat_new)\n\n        return feat_new\n\n\nif __name__ == \"__main__\":\n    from torch.autograd import Variable\n\n    torch.manual_seed(1)\n    torch.cuda.manual_seed_all(1)\n    xyz = Variable(torch.randn(2, 9, 3).cuda(), requires_grad=True)\n    xyz_feats = Variable(torch.randn(2, 9, 6).cuda(), requires_grad=True)\n\n    test_module = PointnetSAModuleMSG(\n        npoint=2, radii=[5.0, 10.0], nsamples=[6, 3], mlps=[[9, 3], [9, 6]]\n    )\n    test_module.cuda()\n    print(test_module(xyz, xyz_feats))\n\n    #  test_module = PointnetFPModule(mlp=[6, 6])\n    #  test_module.cuda()\n    #  from torch.autograd import gradcheck\n    #  inputs = (xyz, xyz, None, xyz_feats)\n    #  test = gradcheck(test_module, inputs, eps=1e-6, atol=1e-4)\n    #  print(test)\n\n    for _ in range(1):\n        _, new_features = test_module(xyz, xyz_feats)\n        new_features.backward(torch.cuda.FloatTensor(*new_features.size()).fill_(1))\n        print(new_features)\n        print(xyz.grad)\n"
  },
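  {
    "path": "pointnet2/utils/pointnet2_modules_demo.py",
    "content": "\"\"\" \npointnet2_modules_demo.py\nIllustrative sketch, not part of the original release: a pure-PyTorch\nre-statement of the interpolation step inside PointnetFPModule.forward\n(inverse-distance weighting over the 3 nearest known points), which the\nCUDA ops three_nn / three_interpolate implement on the GPU.\n\"\"\"\nimport torch\n\n\ndef three_interpolate_torch(unknown, known, known_feats):\n    \"\"\"unknown: (B, n, 3); known: (B, m, 3); known_feats: (B, C, m) -> (B, C, n)\"\"\"\n    dist, idx = torch.topk(torch.cdist(unknown, known), k=3, dim=-1, largest=False)\n    dist_recip = 1.0 / (dist + 1e-8)  # (B, n, 3)\n    weight = dist_recip / dist_recip.sum(dim=2, keepdim=True)\n    B, C, m = known_feats.shape\n    n = unknown.shape[1]\n    # gather the features of the 3 neighbors: (B, C, n, 3)\n    idx_exp = idx.unsqueeze(1).expand(B, C, n, 3)\n    feats = known_feats.unsqueeze(2).expand(B, C, n, m).gather(3, idx_exp)\n    return (feats * weight.unsqueeze(1)).sum(-1)  # weighted sum over neighbors\n\n\nif __name__ == '__main__':\n    unknown = torch.randn(2, 16, 3)\n    known = torch.randn(2, 8, 3)\n    known_feats = torch.randn(2, 32, 8)\n    print(three_interpolate_torch(unknown, known, known_feats).shape)  # (2, 32, 16)\n"
  },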
  {
    "path": "pointnet2/utils/pointnet2_utils.py",
    "content": "\"\"\" PointNet++ utils\nModified by Zenn\nDate: Feb 2021\n\"\"\"\nfrom __future__ import (\n    division,\n    absolute_import,\n    with_statement,\n    print_function,\n    unicode_literals,\n)\nimport torch\nfrom torch.autograd import Function\nimport torch.nn as nn\nfrom pointnet2.utils import pytorch_utils as pt_utils\n\nimport pointnet2_ops._ext as _ext\n\nif False:\n    # Workaround for type hints without depending on the `typing` module\n    from typing import *\n\n\nclass RandomDropout(nn.Module):\n    def __init__(self, p=0.5, inplace=False):\n        super(RandomDropout, self).__init__()\n        self.p = p\n        self.inplace = inplace\n\n    def forward(self, X):\n        theta = torch.Tensor(1).uniform_(0, self.p)[0]\n        return pt_utils.feature_dropout_no_scaling(X, theta, self.train, self.inplace)\n\n\nclass FurthestPointSampling(Function):\n    @staticmethod\n    def forward(ctx, xyz, npoint):\n        # type: (Any, torch.Tensor, int) -> torch.Tensor\n        r\"\"\"\n        Uses iterative furthest point sampling to select a set of npoint features that have the largest\n        minimum distance\n\n        Parameters\n        ----------\n        xyz : torch.Tensor\n            (B, N, 3) tensor where N > npoint\n        npoint : int32\n            number of features in the sampled set\n\n        Returns\n        -------\n        torch.Tensor\n            (B, npoint) tensor containing the set\n        \"\"\"\n        # return _ext.furthest_point_sampling(xyz, npoint)\n        fps_inds = _ext.furthest_point_sampling(xyz, npoint)\n        ctx.mark_non_differentiable(fps_inds)\n        return fps_inds\n\n    @staticmethod\n    def backward(xyz, a=None):\n        return None, None\n\n\nfurthest_point_sample = FurthestPointSampling.apply\n\n\nclass GatherOperation(Function):\n    @staticmethod\n    def forward(ctx, features, idx):\n        # type: (Any, torch.Tensor, torch.Tensor) -> torch.Tensor\n        r\"\"\"\n\n        Parameters\n        ----------\n        features : torch.Tensor\n            (B, C, N) tensor\n\n        idx : torch.Tensor\n            (B, npoint) tensor of the features to gather\n\n        Returns\n        -------\n        torch.Tensor\n            (B, C, npoint) tensor\n        \"\"\"\n\n        _, C, N = features.size()\n\n        ctx.for_backwards = (idx, C, N)\n\n        return _ext.gather_points(features, idx)\n\n    @staticmethod\n    def backward(ctx, grad_out):\n        idx, C, N = ctx.for_backwards\n\n        grad_features = _ext.gather_points_grad(grad_out.contiguous(), idx, N)\n        return grad_features, None\n\n\ngather_operation = GatherOperation.apply\n\n\nclass ThreeNN(Function):\n    @staticmethod\n    def forward(ctx, unknown, known):\n        # type: (Any, torch.Tensor, torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]\n        r\"\"\"\n            Find the three nearest neighbors of unknown in known\n        Parameters\n        ----------\n        unknown : torch.Tensor\n            (B, n, 3) tensor of known features\n        known : torch.Tensor\n            (B, m, 3) tensor of unknown features\n\n        Returns\n        -------\n        dist : torch.Tensor\n            (B, n, 3) l2 distance to the three nearest neighbors\n        idx : torch.Tensor\n            (B, n, 3) index of 3 nearest neighbors\n        \"\"\"\n        dist2, idx = _ext.three_nn(unknown, known)\n\n        return torch.sqrt(dist2), idx\n\n    @staticmethod\n    def backward(ctx, a=None, b=None):\n        return None, None\n\n\nthree_nn = 
ThreeNN.apply\n\n\nclass ThreeInterpolate(Function):\n    @staticmethod\n    def forward(ctx, features, idx, weight):\n        # type(Any, torch.Tensor, torch.Tensor, torch.Tensor) -> Torch.Tensor\n        r\"\"\"\n            Performs weight linear interpolation on 3 features\n        Parameters\n        ----------\n        features : torch.Tensor\n            (B, c, m) Features descriptors to be interpolated from\n        idx : torch.Tensor\n            (B, n, 3) three nearest neighbors of the target features in features\n        weight : torch.Tensor\n            (B, n, 3) weights\n\n        Returns\n        -------\n        torch.Tensor\n            (B, c, n) tensor of the interpolated features\n        \"\"\"\n        B, c, m = features.size()\n        n = idx.size(1)\n\n        ctx.three_interpolate_for_backward = (idx, weight, m)\n\n        return _ext.three_interpolate(features, idx, weight)\n\n    @staticmethod\n    def backward(ctx, grad_out):\n        # type: (Any, torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]\n        r\"\"\"\n        Parameters\n        ----------\n        grad_out : torch.Tensor\n            (B, c, n) tensor with gradients of ouputs\n\n        Returns\n        -------\n        grad_features : torch.Tensor\n            (B, c, m) tensor with gradients of features\n\n        None\n\n        None\n        \"\"\"\n        idx, weight, m = ctx.three_interpolate_for_backward\n\n        grad_features = _ext.three_interpolate_grad(\n            grad_out.contiguous(), idx, weight, m\n        )\n\n        return grad_features, None, None\n\n\nthree_interpolate = ThreeInterpolate.apply\n\n\nclass GroupingOperation(Function):\n    @staticmethod\n    def forward(ctx, features, idx):\n        # type: (Any, torch.Tensor, torch.Tensor) -> torch.Tensor\n        r\"\"\"\n\n        Parameters\n        ----------\n        features : torch.Tensor\n            (B, C, N) tensor of features to group\n        idx : torch.Tensor\n            (B, npoint, nsample) tensor containing the indicies of features to group with\n\n        Returns\n        -------\n        torch.Tensor\n            (B, C, npoint, nsample) tensor\n        \"\"\"\n        B, nfeatures, nsample = idx.size()\n        _, C, N = features.size()\n\n        ctx.for_backwards = (idx, N)\n\n        return _ext.group_points(features, idx)\n\n    @staticmethod\n    def backward(ctx, grad_out):\n        # type: (Any, torch.tensor) -> Tuple[torch.Tensor, torch.Tensor]\n        r\"\"\"\n\n        Parameters\n        ----------\n        grad_out : torch.Tensor\n            (B, C, npoint, nsample) tensor of the gradients of the output from forward\n\n        Returns\n        -------\n        torch.Tensor\n            (B, C, N) gradient of the features\n        None\n        \"\"\"\n        idx, N = ctx.for_backwards\n\n        grad_features = _ext.group_points_grad(grad_out.contiguous(), idx, N)\n\n        return grad_features, None\n\n\ngrouping_operation = GroupingOperation.apply\n\n\nclass BallQuery(Function):\n    @staticmethod\n    def forward(ctx, radius, nsample, xyz, new_xyz):\n        # type: (Any, float, int, torch.Tensor, torch.Tensor) -> torch.Tensor\n        r\"\"\"\n\n        Parameters\n        ----------\n        radius : float\n            radius of the balls\n        nsample : int\n            maximum number of features in the balls\n        xyz : torch.Tensor\n            (B, N, 3) xyz coordinates of the features\n        new_xyz : torch.Tensor\n            (B, npoint, 3) centers of the ball 
query\n\n        Returns\n        -------\n        torch.Tensor\n            (B, npoint, nsample) tensor with the indicies of the features that form the query balls\n        \"\"\"\n        # return _ext.ball_query(new_xyz, xyz, radius, nsample)\n        inds = _ext.ball_query(new_xyz, xyz, radius, nsample)\n        ctx.mark_non_differentiable(inds)\n        return inds\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None\n\n\nball_query = BallQuery.apply\n\n\nclass QueryAndGroup(nn.Module):\n    r\"\"\"\n    Groups with a ball query of radius\n\n    Parameters\n    ---------\n    radius : float32\n        Radius of ball\n    nsample : int32\n        Maximum number of features to gather in the ball\n    \"\"\"\n\n    def __init__(self, radius, nsample, use_xyz=True, return_idx=False, normalize_xyz=False):\n        # type: (QueryAndGroup, float, int, bool,bool,bool) -> None\n        super(QueryAndGroup, self).__init__()\n        self.radius, self.nsample, self.use_xyz = radius, nsample, use_xyz\n        self.return_idx = return_idx\n        self.normalize_xyz = normalize_xyz\n\n    def forward(self, xyz, new_xyz, features=None):\n        # type: (QueryAndGroup, torch.Tensor. torch.Tensor, torch.Tensor) -> Tuple[Torch.Tensor]\n        r\"\"\"\n        Parameters\n        ----------\n        xyz : torch.Tensor\n            xyz coordinates of the features (B, N, 3)\n        new_xyz : torch.Tensor\n            centriods (B, npoint, 3)\n        features : torch.Tensor\n            Descriptors of the features (B, C, N)\n\n        Returns\n        -------\n        new_features : torch.Tensor\n            (B, 3 + C, npoint, nsample) tensor\n        \"\"\"\n\n        idx = ball_query(self.radius, self.nsample, xyz, new_xyz)\n        xyz_trans = xyz.transpose(1, 2).contiguous()\n        grouped_xyz = grouping_operation(xyz_trans, idx)  # (B, 3, npoint, nsample)\n        grouped_xyz -= new_xyz.transpose(1, 2).unsqueeze(-1)\n        if self.normalize_xyz:\n            grouped_xyz /= self.radius\n\n        if features is not None:\n            grouped_features = grouping_operation(features, idx)\n            if self.use_xyz:\n                new_features = torch.cat(\n                    [grouped_xyz, grouped_features], dim=1\n                )  # (B, C + 3, npoint, nsample)\n            else:\n                new_features = grouped_features\n        else:\n            assert (\n                self.use_xyz\n            ), \"Cannot have not features and not use xyz as a feature!\"\n            new_features = grouped_xyz\n        if self.return_idx:\n            return new_features, idx\n        return new_features\n\n\nclass GroupAll(nn.Module):\n    r\"\"\"\n    Groups all features\n\n    Parameters\n    ---------\n    \"\"\"\n\n    def __init__(self, use_xyz=True):\n        # type: (GroupAll, bool) -> None\n        super(GroupAll, self).__init__()\n        self.use_xyz = use_xyz\n\n    def forward(self, xyz, new_xyz, features=None):\n        # type: (GroupAll, torch.Tensor, torch.Tensor, torch.Tensor) -> Tuple[torch.Tensor]\n        r\"\"\"\n        Parameters\n        ----------\n        xyz : torch.Tensor\n            xyz coordinates of the features (B, N, 3)\n        new_xyz : torch.Tensor\n            Ignored\n        features : torch.Tensor\n            Descriptors of the features (B, C, N)\n\n        Returns\n        -------\n        new_features : torch.Tensor\n            (B, C + 3, 1, N) tensor\n        \"\"\"\n\n        grouped_xyz = xyz.transpose(1, 
2).unsqueeze(2)\n        if features is not None:\n            grouped_features = features.unsqueeze(2)\n            if self.use_xyz:\n                new_features = torch.cat(\n                    [grouped_xyz, grouped_features], dim=1\n                )  # (B, 3 + C, 1, N)\n            else:\n                new_features = grouped_features\n        else:\n            new_features = grouped_xyz\n\n        return new_features\n\n\ndef knn_point(k, points1, points2):\n    \"\"\"\n    find for each point in points1 the knn in points2\n    Args:\n        k: k for kNN\n        points1: B x npoint1 x d\n        points2: B x npoint2 x d\n\n    Returns:\n        top_k_neareast_idx: (batch_size, npoint1, k) int32 array, indices to input points\n    \"\"\"\n    dist_matrix = torch.cdist(points1, points2)  # B, npoint1, npoint2\n    top_k_neareast_idx = torch.argsort(dist_matrix, dim=-1)[:, :, :k]  # B, npoint1, K\n    top_k_neareast_idx = top_k_neareast_idx.int().contiguous()\n    return top_k_neareast_idx\n"
  },
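  {
    "path": "pointnet2/utils/pointnet2_utils_demo.py",
    "content": "\"\"\" \npointnet2_utils_demo.py\nIllustrative sketch, not part of the original release: exercises the\npure-PyTorch knn_point helper (the same call FlowEmbedding and\nPointNetSetUpConv use when knn=True). Importing the module requires the\ncompiled pointnet2_ops extension to be installed.\n\"\"\"\nimport torch\n\nfrom pointnet2.utils.pointnet2_utils import knn_point\n\nif __name__ == '__main__':\n    points1 = torch.randn(2, 128, 3)  # query points (B, npoint1, 3)\n    points2 = torch.randn(2, 64, 3)   # reference points (B, npoint2, 3)\n    idx = knn_point(4, points1, points2)  # (B, 128, 4) indices into points2\n    print(idx.shape, idx.dtype)  # torch.Size([2, 128, 4]) torch.int32\n"
  },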
  {
    "path": "pointnet2/utils/pytorch_utils.py",
    "content": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n# LICENSE file in the root directory of this source tree.\n\n''' Modified based on Ref: https://github.com/erikwijmans/Pointnet2_PyTorch '''\nimport torch\nimport torch.nn as nn\nfrom typing import List, Tuple\n\n\nclass SharedMLP(nn.Sequential):\n\n    def __init__(\n            self,\n            args: List[int],\n            *,\n            bn: bool = False,\n            activation=nn.ReLU(inplace=True),\n            preact: bool = False,\n            first: bool = False,\n            name: str = \"\"\n    ):\n        super().__init__()\n\n        for i in range(len(args) - 1):\n            self.add_module(\n                name + 'layer{}'.format(i),\n                Conv2d(\n                    args[i],\n                    args[i + 1],\n                    bn=(not first or not preact or (i != 0)) and bn,\n                    activation=activation\n                    if (not first or not preact or (i != 0)) else None,\n                    preact=preact\n                )\n            )\n\n\nclass _BNBase(nn.Sequential):\n\n    def __init__(self, in_size, batch_norm=None, name=\"\"):\n        super().__init__()\n        self.add_module(name + \"bn\", batch_norm(in_size))\n\n        nn.init.constant_(self[0].weight, 1.0)\n        nn.init.constant_(self[0].bias, 0)\n\n\nclass BatchNorm1d(_BNBase):\n\n    def __init__(self, in_size: int, *, name: str = \"\"):\n        super().__init__(in_size, batch_norm=nn.BatchNorm1d, name=name)\n\n\nclass BatchNorm2d(_BNBase):\n\n    def __init__(self, in_size: int, name: str = \"\"):\n        super().__init__(in_size, batch_norm=nn.BatchNorm2d, name=name)\n\n\nclass BatchNorm3d(_BNBase):\n\n    def __init__(self, in_size: int, name: str = \"\"):\n        super().__init__(in_size, batch_norm=nn.BatchNorm3d, name=name)\n\n\nclass _ConvBase(nn.Sequential):\n\n    def __init__(\n            self,\n            in_size,\n            out_size,\n            kernel_size,\n            stride,\n            padding,\n            activation,\n            bn,\n            init,\n            conv=None,\n            batch_norm=None,\n            bias=True,\n            preact=False,\n            name=\"\"\n    ):\n        super().__init__()\n\n        bias = bias and (not bn)\n        conv_unit = conv(\n            in_size,\n            out_size,\n            kernel_size=kernel_size,\n            stride=stride,\n            padding=padding,\n            bias=bias\n        )\n        init(conv_unit.weight)\n        if bias:\n            nn.init.constant_(conv_unit.bias, 0)\n\n        if bn:\n            if not preact:\n                bn_unit = batch_norm(out_size)\n            else:\n                bn_unit = batch_norm(in_size)\n\n        if preact:\n            if bn:\n                self.add_module(name + 'bn', bn_unit)\n\n            if activation is not None:\n                self.add_module(name + 'activation', activation)\n\n        self.add_module(name + 'conv', conv_unit)\n\n        if not preact:\n            if bn:\n                self.add_module(name + 'bn', bn_unit)\n\n            if activation is not None:\n                self.add_module(name + 'activation', activation)\n\n\nclass Conv1d(_ConvBase):\n\n    def __init__(\n            self,\n            in_size: int,\n            out_size: int,\n            *,\n            kernel_size: int = 1,\n            stride: int = 1,\n            padding: int = 0,\n           
 activation=nn.ReLU(inplace=True),\n            bn: bool = False,\n            init=nn.init.kaiming_normal_,\n            bias: bool = True,\n            preact: bool = False,\n            name: str = \"\"\n    ):\n        super().__init__(\n            in_size,\n            out_size,\n            kernel_size,\n            stride,\n            padding,\n            activation,\n            bn,\n            init,\n            conv=nn.Conv1d,\n            batch_norm=BatchNorm1d,\n            bias=bias,\n            preact=preact,\n            name=name\n        )\n\n\nclass Conv2d(_ConvBase):\n\n    def __init__(\n            self,\n            in_size: int,\n            out_size: int,\n            *,\n            kernel_size: Tuple[int, int] = (1, 1),\n            stride: Tuple[int, int] = (1, 1),\n            padding: Tuple[int, int] = (0, 0),\n            activation=nn.ReLU(inplace=True),\n            bn: bool = False,\n            init=nn.init.kaiming_normal_,\n            bias: bool = True,\n            preact: bool = False,\n            name: str = \"\"\n    ):\n        super().__init__(\n            in_size,\n            out_size,\n            kernel_size,\n            stride,\n            padding,\n            activation,\n            bn,\n            init,\n            conv=nn.Conv2d,\n            batch_norm=BatchNorm2d,\n            bias=bias,\n            preact=preact,\n            name=name\n        )\n\n\nclass Conv3d(_ConvBase):\n\n    def __init__(\n            self,\n            in_size: int,\n            out_size: int,\n            *,\n            kernel_size: Tuple[int, int, int] = (1, 1, 1),\n            stride: Tuple[int, int, int] = (1, 1, 1),\n            padding: Tuple[int, int, int] = (0, 0, 0),\n            activation=nn.ReLU(inplace=True),\n            bn: bool = False,\n            init=nn.init.kaiming_normal_,\n            bias: bool = True,\n            preact: bool = False,\n            name: str = \"\"\n    ):\n        super().__init__(\n            in_size,\n            out_size,\n            kernel_size,\n            stride,\n            padding,\n            activation,\n            bn,\n            init,\n            conv=nn.Conv3d,\n            batch_norm=BatchNorm3d,\n            bias=bias,\n            preact=preact,\n            name=name\n        )\n\n\nclass FC(nn.Sequential):\n\n    def __init__(\n            self,\n            in_size: int,\n            out_size: int,\n            *,\n            activation=nn.ReLU(inplace=True),\n            bn: bool = False,\n            init=None,\n            preact: bool = False,\n            name: str = \"\"\n    ):\n        super().__init__()\n\n        fc = nn.Linear(in_size, out_size, bias=not bn)\n        if init is not None:\n            init(fc.weight)\n        if not bn:\n            nn.init.constant_(fc.bias, 0)\n\n        if preact:\n            if bn:\n                self.add_module(name + 'bn', BatchNorm1d(in_size))\n\n            if activation is not None:\n                self.add_module(name + 'activation', activation)\n\n        self.add_module(name + 'fc', fc)\n\n        if not preact:\n            if bn:\n                self.add_module(name + 'bn', BatchNorm1d(out_size))\n\n            if activation is not None:\n                self.add_module(name + 'activation', activation)\n\ndef set_bn_momentum_default(bn_momentum):\n\n    def fn(m):\n        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):\n            m.momentum = bn_momentum\n\n    return fn\n\n\nclass 
BNMomentumScheduler(object):\n\n    def __init__(\n            self, model, bn_lambda, last_epoch=-1,\n            setter=set_bn_momentum_default\n    ):\n        if not isinstance(model, nn.Module):\n            raise RuntimeError(\n                \"Class '{}' is not a PyTorch nn Module\".format(\n                    type(model).__name__\n                )\n            )\n\n        self.model = model\n        self.setter = setter\n        self.lmbd = bn_lambda\n\n        self.step(last_epoch + 1)\n        self.last_epoch = last_epoch\n\n    def step(self, epoch=None):\n        if epoch is None:\n            epoch = self.last_epoch + 1\n\n        self.last_epoch = epoch\n        self.model.apply(self.setter(self.lmbd(epoch)))\n\n\nclass Seq(nn.Sequential):\n\n    def __init__(self, input_channels):\n        super().__init__()\n        self.count = 0\n        self.current_channels = input_channels\n\n    def conv1d(self,\n               out_size: int,\n               *,\n               kernel_size: int = 1,\n               stride: int = 1,\n               padding: int = 0,\n               dilation: int = 1,\n               activation=nn.ReLU(inplace=True),\n               bn: bool = False,\n               init=nn.init.kaiming_normal_,\n               bias: bool = True,\n               preact: bool = False,\n               name: str = \"\",\n               norm_layer=BatchNorm1d):\n\n        self.add_module(\n            str(self.count),\n            Conv1d(\n                self.current_channels,\n                out_size,\n                kernel_size=kernel_size,\n                stride=stride,\n                padding=padding,\n                activation=activation,\n                bn=bn,\n                init=init,\n                bias=bias,\n                preact=preact,\n                name=name))\n        self.count += 1\n        self.current_channels = out_size\n\n        return self\n\n    def conv2d(self,\n               out_size: int,\n               *,\n               kernel_size: Tuple[int, int] = (1, 1),\n               stride: Tuple[int, int] = (1, 1),\n               padding: Tuple[int, int] = (0, 0),\n               dilation: Tuple[int, int] = (1, 1),\n               activation=nn.ReLU(inplace=True),\n               bn: bool = False,\n               init=nn.init.kaiming_normal_,\n               bias: bool = True,\n               preact: bool = False,\n               name: str = \"\",\n               norm_layer=BatchNorm2d):\n\n        self.add_module(\n            str(self.count),\n            Conv2d(\n                self.current_channels,\n                out_size,\n                kernel_size=kernel_size,\n                stride=stride,\n                padding=padding,\n                activation=activation,\n                bn=bn,\n                init=init,\n                bias=bias,\n                preact=preact,\n                name=name))\n        self.count += 1\n        self.current_channels = out_size\n\n        return self\n\n    def conv3d(self,\n               out_size: int,\n               *,\n               kernel_size: Tuple[int, int, int] = (1, 1, 1),\n               stride: Tuple[int, int, int] = (1, 1, 1),\n               padding: Tuple[int, int, int] = (0, 0, 0),\n               dilation: Tuple[int, int, int] = (1, 1, 1),\n               activation=nn.ReLU(inplace=True),\n               bn: bool = False,\n               init=nn.init.kaiming_normal_,\n               bias: bool = True,\n               preact: bool = False,\n               name: 
str = \"\",\n               norm_layer=BatchNorm3d):\n\n        self.add_module(\n            str(self.count),\n            Conv3d(\n                self.current_channels,\n                out_size,\n                kernel_size=kernel_size,\n                stride=stride,\n                padding=padding,\n                activation=activation,\n                bn=bn,\n                init=init,\n                bias=bias,\n                preact=preact,\n                name=name))\n        self.count += 1\n        self.current_channels = out_size\n\n        return self\n\n    def fc(self,\n           out_size: int,\n           *,\n           activation=nn.ReLU(inplace=True),\n           bn: bool = False,\n           init=None,\n           preact: bool = False,\n           name: str = \"\"):\n\n        self.add_module(\n            str(self.count),\n            FC(self.current_channels,\n               out_size,\n               activation=activation,\n               bn=bn,\n               init=init,\n               preact=preact,\n               name=name))\n        self.count += 1\n        self.current_channels = out_size\n\n        return self\n\n    def dropout(self, p=0.5):\n\n        self.add_module(str(self.count), nn.Dropout(p=0.5))\n        self.count += 1\n\n        return self\n\n    def maxpool2d(self,\n                  kernel_size,\n                  stride=None,\n                  padding=0,\n                  dilation=1,\n                  return_indices=False,\n                  ceil_mode=False):\n        self.add_module(\n            str(self.count),\n            nn.MaxPool2d(\n                kernel_size=kernel_size,\n                stride=stride,\n                padding=padding,\n                dilation=dilation,\n                return_indices=return_indices,\n                ceil_mode=ceil_mode))\n        self.count += 1\n\n        return self\n\n"
  },
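  {
    "path": "pointnet2/utils/pytorch_utils_demo.py",
    "content": "\"\"\" \npytorch_utils_demo.py\nIllustrative sketch, not part of the original release: builds a small\nSharedMLP and a fluent Seq head from pytorch_utils and pushes random CPU\ntensors through them to show the expected shapes (importing via the\npackage requires the compiled pointnet2_ops extension).\n\"\"\"\nimport torch\n\nfrom pointnet2.utils.pytorch_utils import Seq, SharedMLP\n\nif __name__ == '__main__':\n    # SharedMLP stacks 1x1 Conv2d layers over (B, C, npoint, nsample) tensors\n    mlp = SharedMLP([6, 16, 32], bn=True)\n    grouped = torch.randn(4, 6, 10, 8)\n    print(mlp(grouped).shape)  # torch.Size([4, 32, 10, 8])\n\n    # Seq is a builder: each call appends a layer and tracks the channel count\n    head = Seq(32).conv1d(64, bn=True).conv1d(1, activation=None)\n    feats = torch.randn(4, 32, 10)\n    print(head(feats).shape)  # torch.Size([4, 1, 10])\n"
  },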
  {
    "path": "requirement.txt",
    "content": "protobuf==3.19\neasydict==1.9\nnumpy==1.22\npandas==1.1.5\ngit+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib\npomegranate\npyquaternion==0.9.9\npytorch-lightning==1.3.8\nPyYAML==5.4.1\nscipy==1.5.4\nShapely==1.7.1\ntdqm==0.0.1\ntensorboard==2.4.1\ntorchmetrics==0.4.1\ntqdm==4.61.1\nnuscenes-devkit==1.1.9"
  },
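  {
    "path": "tests/metrics_demo.py",
    "content": "\"\"\" \nmetrics_demo.py\nIllustrative sketch, not part of the original release: feeds synthetic\nper-frame center errors and IoU overlaps into the Precision / Success\nmetrics from utils/metrics.py and prints the aggregated scores.\n\"\"\"\nimport torch\n\nfrom utils.metrics import TorchPrecision, TorchSuccess\n\nif __name__ == '__main__':\n    precision = TorchPrecision()  # area under the precision curve (0..2m error thresholds)\n    success = TorchSuccess()  # area under the success curve (0..1 IoU thresholds)\n    for _ in range(10):\n        precision.update(torch.rand(8) * 2.0)  # fake center distances in meters\n        success.update(torch.rand(8))  # fake 3D IoU overlaps\n    print('Precision: %.2f' % precision.compute())\n    print('Success: %.2f' % success.compute())\n"
  },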
  {
    "path": "utils/__init__.py",
    "content": "\"\"\" \n__init__.py\nCreated by zenn at 2021/7/16 14:13\n\"\"\"\n"
  },
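  {
    "path": "utils/box_overlap_demo.py",
    "content": "\"\"\" \nbox_overlap_demo.py\nIllustrative sketch, not part of the original release: builds two nuscenes\nBox objects (pyquaternion / nuscenes-devkit are pinned in requirement.txt)\nand runs the estimateAccuracy / estimateOverlap helpers from utils/metrics.py\non them, using the default KITTI-style up_axis=(0, -1, 0).\n\"\"\"\nfrom nuscenes.utils.data_classes import Box\nfrom pyquaternion import Quaternion\n\nfrom utils.metrics import estimateAccuracy, estimateOverlap\n\nif __name__ == '__main__':\n    # two car-sized boxes (wlh) with a small offset between their centers\n    box_a = Box([0.0, 0.0, 0.0], [1.8, 4.2, 1.6], Quaternion(axis=[0, -1, 0], angle=0.0))\n    box_b = Box([0.5, 0.0, 0.3], [1.8, 4.2, 1.6], Quaternion(axis=[0, -1, 0], angle=0.1))\n    print('center error (m): %.3f' % estimateAccuracy(box_a, box_b, dim=3))\n    print('BEV IoU: %.3f' % estimateOverlap(box_a, box_b, dim=2))\n    print('3D IoU: %.3f' % estimateOverlap(box_a, box_b, dim=3))\n"
  },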
  {
    "path": "utils/metrics.py",
    "content": "import numpy as np\nimport torch\nimport torchmetrics.utilities.data\nfrom shapely.geometry import Polygon\nfrom torchmetrics import Metric\n\n\nclass AverageMeter(object):\n    \"\"\"Computes and stores the average and current value\"\"\"\n\n    def __init__(self):\n        self.reset()\n\n    def reset(self):\n        self.val = 0\n        self.avg = 0\n        self.sum = 0\n        self.count = 0\n\n    def update(self, val, n=1):\n        self.val = val\n        self.sum += val * n\n        self.count += n\n        self.avg = self.sum / self.count\n\n\ndef estimateAccuracy(box_a, box_b, dim=3, up_axis=(0, -1, 0)):\n    if dim == 3:\n        return np.linalg.norm(box_a.center - box_b.center, ord=2)\n    elif dim == 2:\n        up_axis = np.array(up_axis)\n        return np.linalg.norm(\n            box_a.center[up_axis != 0] - box_b.center[up_axis != 0], ord=2)\n\n\ndef fromBoxToPoly(box, up_axis=(0, -1, 0)):\n    \"\"\"\n\n    :param box:\n    :param up_axis: the up axis must contain only one non-zero component\n    :return:\n    \"\"\"\n    if up_axis[1] != 0:\n        return Polygon(tuple(box.corners()[[0, 2]].T[[0, 1, 5, 4]]))\n    elif up_axis[2] != 0:\n        return Polygon(tuple(box.bottom_corners().T))\n\n\ndef estimateOverlap(box_a, box_b, dim=2, up_axis=(0, -1, 0)):\n    # if box_a == box_b:\n    #     return 1.0\n    try:\n        Poly_anno = fromBoxToPoly(box_a, up_axis)\n        Poly_subm = fromBoxToPoly(box_b, up_axis)\n\n        box_inter = Poly_anno.intersection(Poly_subm)\n        box_union = Poly_anno.union(Poly_subm)\n        if dim == 2:\n            return box_inter.area / box_union.area\n\n        else:\n            up_axis = np.array(up_axis)\n            up_max = min(box_a.center[up_axis != 0], box_b.center[up_axis != 0])\n            up_min = max(box_a.center[up_axis != 0] - box_a.wlh[2], box_b.center[up_axis != 0] - box_b.wlh[2])\n            inter_vol = box_inter.area * max(0, up_max[0] - up_min[0])\n            anno_vol = box_a.wlh[0] * box_a.wlh[1] * box_a.wlh[2]\n            subm_vol = box_b.wlh[0] * box_b.wlh[1] * box_b.wlh[2]\n\n            overlap = inter_vol * 1.0 / (anno_vol + subm_vol - inter_vol)\n            return overlap\n    except ValueError:\n        return 0.0\n\n\nclass TorchPrecision(Metric):\n    \"\"\"Computes and stores the Precision using torchMetrics\"\"\"\n\n    def __init__(self, n=21, max_accuracy=2, dist_sync_on_step=False):\n        super().__init__(dist_sync_on_step=dist_sync_on_step)\n        self.max_accuracy = max_accuracy\n        self.Xaxis = torch.linspace(0, self.max_accuracy, steps=n)\n        self.add_state(\"accuracies\", default=[])\n\n    def value(self, accs):\n        prec = [\n            torch.sum((accs <= thres).float()) / len(accs)\n            for thres in self.Xaxis\n        ]\n        return torch.tensor(prec)\n\n    def update(self, val):\n        self.accuracies.append(val)\n\n    def compute(self):\n        accs = torchmetrics.utilities.data.dim_zero_cat(self.accuracies)\n        if accs.numel() == 0:\n            return 0\n        return torch.trapz(self.value(accs), x=self.Xaxis) * 100 / self.max_accuracy\n\n\nclass TorchSuccess(Metric):\n    \"\"\"Computes and stores the Success using torchMetrics\"\"\"\n\n    def __init__(self, n=21, max_overlap=1, dist_sync_on_step=False):\n        super().__init__(dist_sync_on_step=dist_sync_on_step)\n        self.max_overlap = max_overlap\n        self.Xaxis = torch.linspace(0, self.max_overlap, steps=n)\n        self.add_state(\"overlaps\", 
default=[])\n\n    def value(self, overlaps):\n        succ = [\n            torch.sum((overlaps >= thres).float()) / len(overlaps)\n            for thres in self.Xaxis\n        ]\n        return torch.tensor(succ)\n\n    def compute(self):\n        overlaps = torchmetrics.utilities.data.dim_zero_cat(self.overlaps)\n\n        if overlaps.numel() == 0:\n            return 0\n        return torch.trapz(self.value(overlaps), x=self.Xaxis) * 100 / self.max_overlap\n\n    def update(self, val):\n        self.overlaps.append(val)\n"
  }
]