[
  {
    "path": "LICENSE",
    "content": "                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      \"License\" shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      \"Licensor\" shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      \"Legal Entity\" shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      \"control\" means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      \"You\" (or \"Your\") shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      \"Source\" form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      \"Object\" form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      \"Work\" shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      \"Derivative Works\" shall mean any work, whether in Source or Object\n      
form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      \"Contribution\" shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, \"submitted\"\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as \"Not a Contribution.\"\n\n      \"Contributor\" shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a \"NOTICE\" text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an \"AS IS\" BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets \"[]\"\n      replaced with your own identifying information. 
(Don't include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same \"printed page\" as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n"
  },
  {
    "path": "README.md",
    "content": "\n# Transformation-Equivariant 3D Object Detection for Autonomous Driving\nThis is an improved version of [TED](https://arxiv.org/abs/2211.11962) with a multiple-refinement design. \nThis code is mainly based on [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) and [CasA](https://github.com/hailanyi/CasA); some code is from \n[PENet](https://github.com/JUGGHM/PENet_ICRA2021) and [SFD](https://github.com/LittlePey/SFD).\n\n## Detection Framework\nThe overall detection framework is shown below. It has three parts:\n(1) Transformation-equivariant Sparse Convolution (TeSpConv) backbone; (2) Transformation-equivariant Bird's Eye View (TeBEV) pooling; \n(3) Multi-grid pooling and multi-refinement. \nTeSpConv applies shared weights on multiple transformed point clouds to record the transformation-equivariant voxel features. \nTeBEV pooling aligns and aggregates the scene-level equivariant features into lightweight representations for proposal generation.\n Multi-grid pooling and multi-refinement align and aggregate the instance-level invariant features for proposal refinement.\n \n![](./tools/images/framework.png)\n\n## Model Zoo\nWe release two models, based on LiDAR-only and multi-modal data respectively, denoted TED-S and TED-M.\n\n* All models are trained with 8 V100 GPUs and are available for download. \n\n* The models are trained on the train split (3712 samples) of the KITTI dataset.\n\n* The results are the 3D AP(R40) of Car on the *val* set of the KITTI dataset.\n\n* These models are not suitable for directly reporting results on the KITTI test set; please use a slightly lower score threshold and train the models on all or 80% of the training data to achieve desirable performance on the KITTI test set.\n\n|                                             |Modality|GPU memory of training| Easy | Mod. 
| Hard  | download | \n|---------------------------------------------|----------:|----------:|:-------:|:-------:|:-------:|:---------:|\n| [TED-S](tools/cfgs/models/kitti/TED-S.yaml)|LiDAR only|~12 GB |93.25 |87.99| 86.28| [google](https://drive.google.com/file/d/1hqoj-lV4Cr3m7U3EphdCSjHmhBlekRm8/view?usp=sharing) / [baidu(p91t)](https://pan.baidu.com/s/1ecobwO673ScrGYOHbooGIw) / 36M | \n| [TED-M](tools/cfgs/models/kitti/TED-M.yaml)|LiDAR+RGB |~15 GB| 95.62 |89.24 |86.77 | [google](https://drive.google.com/file/d/1hXe1at-LKogTfWorALmq6djjYqhKX7nD/view?usp=sharing) / [baidu(nkr5)](https://pan.baidu.com/s/1FP80452dfM09YtE8DBaicQ) / 65M|\n\n## Getting Started\n\n```\nconda create -n spconv2 python=3.9\nconda activate spconv2\npip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html\npip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 waymo-open-dataset-tf-2-5-0 nuscenes-devkit==1.0.5 spconv-cu111 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython prefetch-generator\n```\n\n### Dependency\nOur released implementation is tested on.\n+ Ubuntu 18.04\n+ Python 3.6.9 \n+ PyTorch 1.8.1\n+ Spconv 1.2.1\n+ NVIDIA CUDA 11.1\n+ 8x Tesla V100 GPUs\n\nWe also tested on.\n+ Ubuntu 18.04\n+ Python 3.9.13\n+ PyTorch 1.8.1\n+ Spconv 2.1.22 # pip install spconv-cu111\n+ NVIDIA CUDA 11.1\n+ 2x 3090 GPUs\n\n### Prepare dataset\n\nPlease download the official [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the downloaded files as follows (the road planes could be downloaded from [[road plane]](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing), which are optional for data augmentation in the training):\n\n```\nTED\n├── data\n│   ├── kitti\n│   │   │── ImageSets\n│   │   │── training\n│   │   │   
├──calib & velodyne & label_2 & image_2 & (optional: planes)\n│   │   │── testing\n│   │   │   ├──calib & velodyne & image_2\n├── pcdet\n├── tools\n```\n\nYou need to create a 'velodyne_depth' dataset to run our multimodal detector.\nYou can download our preprocessed data from [google (13GB)](https://drive.google.com/file/d/1xki9v_zsQMM8vMVNo0ENi1Mh_GNMjHUg/view?usp=sharing) or [baidu (a20o)](https://pan.baidu.com/s/1OH4KIVoSSH7ea3-3CqkZRQ), or generate the data yourself:\n* [Install this project](#installation).\n* Download the PENet depth completion model [here (500M)](https://drive.google.com/file/d/1RDdKlKJcas-G5OA49x8OoqcUDiYYZgeM/view?usp=sharing) and put it into ```tools/PENet```.\n* Then run the following code to generate RGB pseudo points.\n```\ncd tools/PENet\npython3 main.py --detpath [your path like: ../../data/kitti/training]\n```\n\nAfter generating 'velodyne_depth', run the following commands to create the dataset infos:\n```\ncd ../..\npython3 -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml\npython3 -m pcdet.datasets.kitti.kitti_dataset_mm create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml\n```\n\nIn either case, the data structure should be: \n```\nTED\n├── data\n│   ├── kitti\n│   │   │── ImageSets\n│   │   │── training\n│   │   │   ├──calib & velodyne & label_2 & image_2 & (optional: planes) & velodyne_depth\n│   │   │── testing\n│   │   │   ├──calib & velodyne & image_2 & velodyne_depth\n│   │   │── gt_database\n│   │   │── gt_database_mm\n│   │   │── kitti_dbinfos_train_mm.pkl\n│   │   │── kitti_dbinfos_train.pkl\n│   │   │── kitti_infos_test.pkl\n│   │   │── kitti_infos_train.pkl\n│   │   │── kitti_infos_trainval.pkl\n│   │   │── kitti_infos_val.pkl\n├── pcdet\n├── tools\n```\n\n### Installation\n\n```\ngit clone https://github.com/hailanyi/TED.git\ncd TED\npython3 setup.py develop\n```\n\n### Training\n\nSingle-GPU training:\n```\ncd tools\npython3 train.py --cfg_file ${CONFIG_FILE}\n```\nFor 
example, to train the TED-S model:\n```\ncd tools\npython3 train.py --cfg_file cfgs/models/kitti/TED-S.yaml\n```\n\nMulti-GPU training: \n\nModify the GPU number in dist_train.sh and run\n```\ncd tools\nsh dist_train.sh\n```\nThe log is saved to log.txt.\nYou can run ```cat log.txt``` to view the training process.\n\n### Evaluation\n\n```\ncd tools\npython3 test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --ckpt ${CKPT}\n```\n\nFor example, to test the TED-S model:\n\n```\ncd tools\npython3 test.py --cfg_file cfgs/models/kitti/TED-S.yaml --ckpt TED-S.pth\n```\n\nMulti-GPU testing: modify the GPU number in dist_test.sh and run\n```\nsh dist_test.sh \n```\nThe log is saved to log-test.txt.\nYou can run ```cat log-test.txt``` to view the test results.\n\n## License\n\nThis code is released under the [Apache 2.0 license](LICENSE).\n\n## Acknowledgement\n\n[CasA](https://github.com/hailanyi/CasA)\n\n[OpenPCDet](https://github.com/open-mmlab/OpenPCDet)\n\n[PENet](https://github.com/JUGGHM/PENet_ICRA2021)\n\n[SFD](https://github.com/LittlePey/SFD)\n\n## Citation\n    @inproceedings{TED,\n        title={Transformation-Equivariant 3D Object Detection for Autonomous Driving},\n        author={Wu, Hai and Wen, Chenglu and Li, Wei and Yang, Ruigang and Wang, Cheng},\n        year={2023},\n        booktitle={AAAI}\n    }\n"
  },
  {
    "path": "data/kitti/ImageSets/test.txt",
    "content": "000000\n000001\n000002\n000003\n000004\n000005\n000006\n000007\n000008\n000009\n000010\n000011\n000012\n000013\n000014\n000015\n000016\n000017\n000018\n000019\n000020\n000021\n000022\n000023\n000024\n000025\n000026\n000027\n000028\n000029\n000030\n000031\n000032\n000033\n000034\n000035\n000036\n000037\n000038\n000039\n000040\n000041\n000042\n000043\n000044\n000045\n000046\n000047\n000048\n000049\n000050\n000051\n000052\n000053\n000054\n000055\n000056\n000057\n000058\n000059\n000060\n000061\n000062\n000063\n000064\n000065\n000066\n000067\n000068\n000069\n000070\n000071\n000072\n000073\n000074\n000075\n000076\n000077\n000078\n000079\n000080\n000081\n000082\n000083\n000084\n000085\n000086\n000087\n000088\n000089\n000090\n000091\n000092\n000093\n000094\n000095\n000096\n000097\n000098\n000099\n000100\n000101\n000102\n000103\n000104\n000105\n000106\n000107\n000108\n000109\n000110\n000111\n000112\n000113\n000114\n000115\n000116\n000117\n000118\n000119\n000120\n000121\n000122\n000123\n000124\n000125\n000126\n000127\n000128\n000129\n000130\n000131\n000132\n000133\n000134\n000135\n000136\n000137\n000138\n000139\n000140\n000141\n000142\n000143\n000144\n000145\n000146\n000147\n000148\n000149\n000150\n000151\n000152\n000153\n000154\n000155\n000156\n000157\n000158\n000159\n000160\n000161\n000162\n000163\n000164\n000165\n000166\n000167\n000168\n000169\n000170\n000171\n000172\n000173\n000174\n000175\n000176\n000177\n000178\n000179\n000180\n000181\n000182\n000183\n000184\n000185\n000186\n000187\n000188\n000189\n000190\n000191\n000192\n000193\n000194\n000195\n000196\n000197\n000198\n000199\n000200\n000201\n000202\n000203\n000204\n000205\n000206\n000207\n000208\n000209\n000210\n000211\n000212\n000213\n000214\n000215\n000216\n000217\n000218\n000219\n000220\n000221\n000222\n000223\n000224\n000225\n000226\n000227\n000228\n000229\n000230\n000231\n000232\n000233\n000234\n000235\n000236\n000237\n000238\n000239\n000240\n000241\n000242\n000243\n000244\n000245\n000246\n000247\n
000248\n000249\n000250\n000251\n000252\n000253\n000254\n000255\n000256\n000257\n000258\n000259\n000260\n000261\n000262\n000263\n000264\n000265\n000266\n000267\n000268\n000269\n000270\n000271\n000272\n000273\n000274\n000275\n000276\n000277\n000278\n000279\n000280\n000281\n000282\n000283\n000284\n000285\n000286\n000287\n000288\n000289\n000290\n000291\n000292\n000293\n000294\n000295\n000296\n000297\n000298\n000299\n000300\n000301\n000302\n000303\n000304\n000305\n000306\n000307\n000308\n000309\n000310\n000311\n000312\n000313\n000314\n000315\n000316\n000317\n000318\n000319\n000320\n000321\n000322\n000323\n000324\n000325\n000326\n000327\n000328\n000329\n000330\n000331\n000332\n000333\n000334\n000335\n000336\n000337\n000338\n000339\n000340\n000341\n000342\n000343\n000344\n000345\n000346\n000347\n000348\n000349\n000350\n000351\n000352\n000353\n000354\n000355\n000356\n000357\n000358\n000359\n000360\n000361\n000362\n000363\n000364\n000365\n000366\n000367\n000368\n000369\n000370\n000371\n000372\n000373\n000374\n000375\n000376\n000377\n000378\n000379\n000380\n000381\n000382\n000383\n000384\n000385\n000386\n000387\n000388\n000389\n000390\n000391\n000392\n000393\n000394\n000395\n000396\n000397\n000398\n000399\n000400\n000401\n000402\n000403\n000404\n000405\n000406\n000407\n000408\n000409\n000410\n000411\n000412\n000413\n000414\n000415\n000416\n000417\n000418\n000419\n000420\n000421\n000422\n000423\n000424\n000425\n000426\n000427\n000428\n000429\n000430\n000431\n000432\n000433\n000434\n000435\n000436\n000437\n000438\n000439\n000440\n000441\n000442\n000443\n000444\n000445\n000446\n000447\n000448\n000449\n000450\n000451\n000452\n000453\n000454\n000455\n000456\n000457\n000458\n000459\n000460\n000461\n000462\n000463\n000464\n000465\n000466\n000467\n000468\n000469\n000470\n000471\n000472\n000473\n000474\n000475\n000476\n000477\n000478\n000479\n000480\n000481\n000482\n000483\n000484\n000485\n000486\n000487\n000488\n000489\n000490\n000491\n000492\n000493\n000494\n000495\n000496\n000497\n
000498\n000499\n000500\n000501\n000502\n000503\n000504\n000505\n000506\n000507\n000508\n000509\n000510\n000511\n000512\n000513\n000514\n000515\n000516\n000517\n000518\n000519\n000520\n000521\n000522\n000523\n000524\n000525\n000526\n000527\n000528\n000529\n000530\n000531\n000532\n000533\n000534\n000535\n000536\n000537\n000538\n000539\n000540\n000541\n000542\n000543\n000544\n000545\n000546\n000547\n000548\n000549\n000550\n000551\n000552\n000553\n000554\n000555\n000556\n000557\n000558\n000559\n000560\n000561\n000562\n000563\n000564\n000565\n000566\n000567\n000568\n000569\n000570\n000571\n000572\n000573\n000574\n000575\n000576\n000577\n000578\n000579\n000580\n000581\n000582\n000583\n000584\n000585\n000586\n000587\n000588\n000589\n000590\n000591\n000592\n000593\n000594\n000595\n000596\n000597\n000598\n000599\n000600\n000601\n000602\n000603\n000604\n000605\n000606\n000607\n000608\n000609\n000610\n000611\n000612\n000613\n000614\n000615\n000616\n000617\n000618\n000619\n000620\n000621\n000622\n000623\n000624\n000625\n000626\n000627\n000628\n000629\n000630\n000631\n000632\n000633\n000634\n000635\n000636\n000637\n000638\n000639\n000640\n000641\n000642\n000643\n000644\n000645\n000646\n000647\n000648\n000649\n000650\n000651\n000652\n000653\n000654\n000655\n000656\n000657\n000658\n000659\n000660\n000661\n000662\n000663\n000664\n000665\n000666\n000667\n000668\n000669\n000670\n000671\n000672\n000673\n000674\n000675\n000676\n000677\n000678\n000679\n000680\n000681\n000682\n000683\n000684\n000685\n000686\n000687\n000688\n000689\n000690\n000691\n000692\n000693\n000694\n000695\n000696\n000697\n000698\n000699\n000700\n000701\n000702\n000703\n000704\n000705\n000706\n000707\n000708\n000709\n000710\n000711\n000712\n000713\n000714\n000715\n000716\n000717\n000718\n000719\n000720\n000721\n000722\n000723\n000724\n000725\n000726\n000727\n000728\n000729\n000730\n000731\n000732\n000733\n000734\n000735\n000736\n000737\n000738\n000739\n000740\n000741\n000742\n000743\n000744\n000745\n000746\n000747\n
000748\n000749\n000750\n000751\n000752\n000753\n000754\n000755\n000756\n000757\n000758\n000759\n000760\n000761\n000762\n000763\n000764\n000765\n000766\n000767\n000768\n000769\n000770\n000771\n000772\n000773\n000774\n000775\n000776\n000777\n000778\n000779\n000780\n000781\n000782\n000783\n000784\n000785\n000786\n000787\n000788\n000789\n000790\n000791\n000792\n000793\n000794\n000795\n000796\n000797\n000798\n000799\n000800\n000801\n000802\n000803\n000804\n000805\n000806\n000807\n000808\n000809\n000810\n000811\n000812\n000813\n000814\n000815\n000816\n000817\n000818\n000819\n000820\n000821\n000822\n000823\n000824\n000825\n000826\n000827\n000828\n000829\n000830\n000831\n000832\n000833\n000834\n000835\n000836\n000837\n000838\n000839\n000840\n000841\n000842\n000843\n000844\n000845\n000846\n000847\n000848\n000849\n000850\n000851\n000852\n000853\n000854\n000855\n000856\n000857\n000858\n000859\n000860\n000861\n000862\n000863\n000864\n000865\n000866\n000867\n000868\n000869\n000870\n000871\n000872\n000873\n000874\n000875\n000876\n000877\n000878\n000879\n000880\n000881\n000882\n000883\n000884\n000885\n000886\n000887\n000888\n000889\n000890\n000891\n000892\n000893\n000894\n000895\n000896\n000897\n000898\n000899\n000900\n000901\n000902\n000903\n000904\n000905\n000906\n000907\n000908\n000909\n000910\n000911\n000912\n000913\n000914\n000915\n000916\n000917\n000918\n000919\n000920\n000921\n000922\n000923\n000924\n000925\n000926\n000927\n000928\n000929\n000930\n000931\n000932\n000933\n000934\n000935\n000936\n000937\n000938\n000939\n000940\n000941\n000942\n000943\n000944\n000945\n000946\n000947\n000948\n000949\n000950\n000951\n000952\n000953\n000954\n000955\n000956\n000957\n000958\n000959\n000960\n000961\n000962\n000963\n000964\n000965\n000966\n000967\n000968\n000969\n000970\n000971\n000972\n000973\n000974\n000975\n000976\n000977\n000978\n000979\n000980\n000981\n000982\n000983\n000984\n000985\n000986\n000987\n000988\n000989\n000990\n000991\n000992\n000993\n000994\n000995\n000996\n000997\n
000998\n000999\n001000\n001001\n001002\n001003\n001004\n001005\n001006\n001007\n001008\n001009\n001010\n001011\n001012\n001013\n001014\n001015\n001016\n001017\n001018\n001019\n001020\n001021\n001022\n001023\n001024\n001025\n001026\n001027\n001028\n001029\n001030\n001031\n001032\n001033\n001034\n001035\n001036\n001037\n001038\n001039\n001040\n001041\n001042\n001043\n001044\n001045\n001046\n001047\n001048\n001049\n001050\n001051\n001052\n001053\n001054\n001055\n001056\n001057\n001058\n001059\n001060\n001061\n001062\n001063\n001064\n001065\n001066\n001067\n001068\n001069\n001070\n001071\n001072\n001073\n001074\n001075\n001076\n001077\n001078\n001079\n001080\n001081\n001082\n001083\n001084\n001085\n001086\n001087\n001088\n001089\n001090\n001091\n001092\n001093\n001094\n001095\n001096\n001097\n001098\n001099\n001100\n001101\n001102\n001103\n001104\n001105\n001106\n001107\n001108\n001109\n001110\n001111\n001112\n001113\n001114\n001115\n001116\n001117\n001118\n001119\n001120\n001121\n001122\n001123\n001124\n001125\n001126\n001127\n001128\n001129\n001130\n001131\n001132\n001133\n001134\n001135\n001136\n001137\n001138\n001139\n001140\n001141\n001142\n001143\n001144\n001145\n001146\n001147\n001148\n001149\n001150\n001151\n001152\n001153\n001154\n001155\n001156\n001157\n001158\n001159\n001160\n001161\n001162\n001163\n001164\n001165\n001166\n001167\n001168\n001169\n001170\n001171\n001172\n001173\n001174\n001175\n001176\n001177\n001178\n001179\n001180\n001181\n001182\n001183\n001184\n001185\n001186\n001187\n001188\n001189\n001190\n001191\n001192\n001193\n001194\n001195\n001196\n001197\n001198\n001199\n001200\n001201\n001202\n001203\n001204\n001205\n001206\n001207\n001208\n001209\n001210\n001211\n001212\n001213\n001214\n001215\n001216\n001217\n001218\n001219\n001220\n001221\n001222\n001223\n001224\n001225\n001226\n001227\n001228\n001229\n001230\n001231\n001232\n001233\n001234\n001235\n001236\n001237\n001238\n001239\n001240\n001241\n001242\n001243\n001244\n001245\n001246\n001247\n
001248\n001249\n001250\n001251\n001252\n001253\n001254\n001255\n001256\n001257\n001258\n001259\n001260\n001261\n001262\n001263\n001264\n001265\n001266\n001267\n001268\n001269\n001270\n001271\n001272\n001273\n001274\n001275\n001276\n001277\n001278\n001279\n001280\n001281\n001282\n001283\n001284\n001285\n001286\n001287\n001288\n001289\n001290\n001291\n001292\n001293\n001294\n001295\n001296\n001297\n001298\n001299\n001300\n001301\n001302\n001303\n001304\n001305\n001306\n001307\n001308\n001309\n001310\n001311\n001312\n001313\n001314\n001315\n001316\n001317\n001318\n001319\n001320\n001321\n001322\n001323\n001324\n001325\n001326\n001327\n001328\n001329\n001330\n001331\n001332\n001333\n001334\n001335\n001336\n001337\n001338\n001339\n001340\n001341\n001342\n001343\n001344\n001345\n001346\n001347\n001348\n001349\n001350\n001351\n001352\n001353\n001354\n001355\n001356\n001357\n001358\n001359\n001360\n001361\n001362\n001363\n001364\n001365\n001366\n001367\n001368\n001369\n001370\n001371\n001372\n001373\n001374\n001375\n001376\n001377\n001378\n001379\n001380\n001381\n001382\n001383\n001384\n001385\n001386\n001387\n001388\n001389\n001390\n001391\n001392\n001393\n001394\n001395\n001396\n001397\n001398\n001399\n001400\n001401\n001402\n001403\n001404\n001405\n001406\n001407\n001408\n001409\n001410\n001411\n001412\n001413\n001414\n001415\n001416\n001417\n001418\n001419\n001420\n001421\n001422\n001423\n001424\n001425\n001426\n001427\n001428\n001429\n001430\n001431\n001432\n001433\n001434\n001435\n001436\n001437\n001438\n001439\n001440\n001441\n001442\n001443\n001444\n001445\n001446\n001447\n001448\n001449\n001450\n001451\n001452\n001453\n001454\n001455\n001456\n001457\n001458\n001459\n001460\n001461\n001462\n001463\n001464\n001465\n001466\n001467\n001468\n001469\n001470\n001471\n001472\n001473\n001474\n001475\n001476\n001477\n001478\n001479\n001480\n001481\n001482\n001483\n001484\n001485\n001486\n001487\n001488\n001489\n001490\n001491\n001492\n001493\n001494\n001495\n001496\n001497\n
001498\n001499\n001500\n001501\n001502\n001503\n001504\n001505\n001506\n001507\n001508\n001509\n001510\n001511\n001512\n001513\n001514\n001515\n001516\n001517\n001518\n001519\n001520\n001521\n001522\n001523\n001524\n001525\n001526\n001527\n001528\n001529\n001530\n001531\n001532\n001533\n001534\n001535\n001536\n001537\n001538\n001539\n001540\n001541\n001542\n001543\n001544\n001545\n001546\n001547\n001548\n001549\n001550\n001551\n001552\n001553\n001554\n001555\n001556\n001557\n001558\n001559\n001560\n001561\n001562\n001563\n001564\n001565\n001566\n001567\n001568\n001569\n001570\n001571\n001572\n001573\n001574\n001575\n001576\n001577\n001578\n001579\n001580\n001581\n001582\n001583\n001584\n001585\n001586\n001587\n001588\n001589\n001590\n001591\n001592\n001593\n001594\n001595\n001596\n001597\n001598\n001599\n001600\n001601\n001602\n001603\n001604\n001605\n001606\n001607\n001608\n001609\n001610\n001611\n001612\n001613\n001614\n001615\n001616\n001617\n001618\n001619\n001620\n001621\n001622\n001623\n001624\n001625\n001626\n001627\n001628\n001629\n001630\n001631\n001632\n001633\n001634\n001635\n001636\n001637\n001638\n001639\n001640\n001641\n001642\n001643\n001644\n001645\n001646\n001647\n001648\n001649\n001650\n001651\n001652\n001653\n001654\n001655\n001656\n001657\n001658\n001659\n001660\n001661\n001662\n001663\n001664\n001665\n001666\n001667\n001668\n001669\n001670\n001671\n001672\n001673\n001674\n001675\n001676\n001677\n001678\n001679\n001680\n001681\n001682\n001683\n001684\n001685\n001686\n001687\n001688\n001689\n001690\n001691\n001692\n001693\n001694\n001695\n001696\n001697\n001698\n001699\n001700\n001701\n001702\n001703\n001704\n001705\n001706\n001707\n001708\n001709\n001710\n001711\n001712\n001713\n001714\n001715\n001716\n001717\n001718\n001719\n001720\n001721\n001722\n001723\n001724\n001725\n001726\n001727\n001728\n001729\n001730\n001731\n001732\n001733\n001734\n001735\n001736\n001737\n001738\n001739\n001740\n001741\n001742\n001743\n001744\n001745\n001746\n001747\n
001748\n001749\n001750\n001751\n001752\n001753\n001754\n001755\n001756\n001757\n001758\n001759\n001760\n001761\n001762\n001763\n001764\n001765\n001766\n001767\n001768\n001769\n001770\n001771\n001772\n001773\n001774\n001775\n001776\n001777\n001778\n001779\n001780\n001781\n001782\n001783\n001784\n001785\n001786\n001787\n001788\n001789\n001790\n001791\n001792\n001793\n001794\n001795\n001796\n001797\n001798\n001799\n001800\n001801\n001802\n001803\n001804\n001805\n001806\n001807\n001808\n001809\n001810\n001811\n001812\n001813\n001814\n001815\n001816\n001817\n001818\n001819\n001820\n001821\n001822\n001823\n001824\n001825\n001826\n001827\n001828\n001829\n001830\n001831\n001832\n001833\n001834\n001835\n001836\n001837\n001838\n001839\n001840\n001841\n001842\n001843\n001844\n001845\n001846\n001847\n001848\n001849\n001850\n001851\n001852\n001853\n001854\n001855\n001856\n001857\n001858\n001859\n001860\n001861\n001862\n001863\n001864\n001865\n001866\n001867\n001868\n001869\n001870\n001871\n001872\n001873\n001874\n001875\n001876\n001877\n001878\n001879\n001880\n001881\n001882\n001883\n001884\n001885\n001886\n001887\n001888\n001889\n001890\n001891\n001892\n001893\n001894\n001895\n001896\n001897\n001898\n001899\n001900\n001901\n001902\n001903\n001904\n001905\n001906\n001907\n001908\n001909\n001910\n001911\n001912\n001913\n001914\n001915\n001916\n001917\n001918\n001919\n001920\n001921\n001922\n001923\n001924\n001925\n001926\n001927\n001928\n001929\n001930\n001931\n001932\n001933\n001934\n001935\n001936\n001937\n001938\n001939\n001940\n001941\n001942\n001943\n001944\n001945\n001946\n001947\n001948\n001949\n001950\n001951\n001952\n001953\n001954\n001955\n001956\n001957\n001958\n001959\n001960\n001961\n001962\n001963\n001964\n001965\n001966\n001967\n001968\n001969\n001970\n001971\n001972\n001973\n001974\n001975\n001976\n001977\n001978\n001979\n001980\n001981\n001982\n001983\n001984\n001985\n001986\n001987\n001988\n001989\n001990\n001991\n001992\n001993\n001994\n001995\n001996\n001997\n
001998\n001999\n002000\n002001\n002002\n002003\n002004\n002005\n002006\n002007\n002008\n002009\n002010\n002011\n002012\n002013\n002014\n002015\n002016\n002017\n002018\n002019\n002020\n002021\n002022\n002023\n002024\n002025\n002026\n002027\n002028\n002029\n002030\n002031\n002032\n002033\n002034\n002035\n002036\n002037\n002038\n002039\n002040\n002041\n002042\n002043\n002044\n002045\n002046\n002047\n002048\n002049\n002050\n002051\n002052\n002053\n002054\n002055\n002056\n002057\n002058\n002059\n002060\n002061\n002062\n002063\n002064\n002065\n002066\n002067\n002068\n002069\n002070\n002071\n002072\n002073\n002074\n002075\n002076\n002077\n002078\n002079\n002080\n002081\n002082\n002083\n002084\n002085\n002086\n002087\n002088\n002089\n002090\n002091\n002092\n002093\n002094\n002095\n002096\n002097\n002098\n002099\n002100\n002101\n002102\n002103\n002104\n002105\n002106\n002107\n002108\n002109\n002110\n002111\n002112\n002113\n002114\n002115\n002116\n002117\n002118\n002119\n002120\n002121\n002122\n002123\n002124\n002125\n002126\n002127\n002128\n002129\n002130\n002131\n002132\n002133\n002134\n002135\n002136\n002137\n002138\n002139\n002140\n002141\n002142\n002143\n002144\n002145\n002146\n002147\n002148\n002149\n002150\n002151\n002152\n002153\n002154\n002155\n002156\n002157\n002158\n002159\n002160\n002161\n002162\n002163\n002164\n002165\n002166\n002167\n002168\n002169\n002170\n002171\n002172\n002173\n002174\n002175\n002176\n002177\n002178\n002179\n002180\n002181\n002182\n002183\n002184\n002185\n002186\n002187\n002188\n002189\n002190\n002191\n002192\n002193\n002194\n002195\n002196\n002197\n002198\n002199\n002200\n002201\n002202\n002203\n002204\n002205\n002206\n002207\n002208\n002209\n002210\n002211\n002212\n002213\n002214\n002215\n002216\n002217\n002218\n002219\n002220\n002221\n002222\n002223\n002224\n002225\n002226\n002227\n002228\n002229\n002230\n002231\n002232\n002233\n002234\n002235\n002236\n002237\n002238\n002239\n002240\n002241\n002242\n002243\n002244\n002245\n002246\n002247\n
002248\n002249\n002250\n002251\n002252\n002253\n002254\n002255\n002256\n002257\n002258\n002259\n002260\n002261\n002262\n002263\n002264\n002265\n002266\n002267\n002268\n002269\n002270\n002271\n002272\n002273\n002274\n002275\n002276\n002277\n002278\n002279\n002280\n002281\n002282\n002283\n002284\n002285\n002286\n002287\n002288\n002289\n002290\n002291\n002292\n002293\n002294\n002295\n002296\n002297\n002298\n002299\n002300\n002301\n002302\n002303\n002304\n002305\n002306\n002307\n002308\n002309\n002310\n002311\n002312\n002313\n002314\n002315\n002316\n002317\n002318\n002319\n002320\n002321\n002322\n002323\n002324\n002325\n002326\n002327\n002328\n002329\n002330\n002331\n002332\n002333\n002334\n002335\n002336\n002337\n002338\n002339\n002340\n002341\n002342\n002343\n002344\n002345\n002346\n002347\n002348\n002349\n002350\n002351\n002352\n002353\n002354\n002355\n002356\n002357\n002358\n002359\n002360\n002361\n002362\n002363\n002364\n002365\n002366\n002367\n002368\n002369\n002370\n002371\n002372\n002373\n002374\n002375\n002376\n002377\n002378\n002379\n002380\n002381\n002382\n002383\n002384\n002385\n002386\n002387\n002388\n002389\n002390\n002391\n002392\n002393\n002394\n002395\n002396\n002397\n002398\n002399\n002400\n002401\n002402\n002403\n002404\n002405\n002406\n002407\n002408\n002409\n002410\n002411\n002412\n002413\n002414\n002415\n002416\n002417\n002418\n002419\n002420\n002421\n002422\n002423\n002424\n002425\n002426\n002427\n002428\n002429\n002430\n002431\n002432\n002433\n002434\n002435\n002436\n002437\n002438\n002439\n002440\n002441\n002442\n002443\n002444\n002445\n002446\n002447\n002448\n002449\n002450\n002451\n002452\n002453\n002454\n002455\n002456\n002457\n002458\n002459\n002460\n002461\n002462\n002463\n002464\n002465\n002466\n002467\n002468\n002469\n002470\n002471\n002472\n002473\n002474\n002475\n002476\n002477\n002478\n002479\n002480\n002481\n002482\n002483\n002484\n002485\n002486\n002487\n002488\n002489\n002490\n002491\n002492\n002493\n002494\n002495\n002496\n002497\n
002498\n002499\n002500\n002501\n002502\n002503\n002504\n002505\n002506\n002507\n002508\n002509\n002510\n002511\n002512\n002513\n002514\n002515\n002516\n002517\n002518\n002519\n002520\n002521\n002522\n002523\n002524\n002525\n002526\n002527\n002528\n002529\n002530\n002531\n002532\n002533\n002534\n002535\n002536\n002537\n002538\n002539\n002540\n002541\n002542\n002543\n002544\n002545\n002546\n002547\n002548\n002549\n002550\n002551\n002552\n002553\n002554\n002555\n002556\n002557\n002558\n002559\n002560\n002561\n002562\n002563\n002564\n002565\n002566\n002567\n002568\n002569\n002570\n002571\n002572\n002573\n002574\n002575\n002576\n002577\n002578\n002579\n002580\n002581\n002582\n002583\n002584\n002585\n002586\n002587\n002588\n002589\n002590\n002591\n002592\n002593\n002594\n002595\n002596\n002597\n002598\n002599\n002600\n002601\n002602\n002603\n002604\n002605\n002606\n002607\n002608\n002609\n002610\n002611\n002612\n002613\n002614\n002615\n002616\n002617\n002618\n002619\n002620\n002621\n002622\n002623\n002624\n002625\n002626\n002627\n002628\n002629\n002630\n002631\n002632\n002633\n002634\n002635\n002636\n002637\n002638\n002639\n002640\n002641\n002642\n002643\n002644\n002645\n002646\n002647\n002648\n002649\n002650\n002651\n002652\n002653\n002654\n002655\n002656\n002657\n002658\n002659\n002660\n002661\n002662\n002663\n002664\n002665\n002666\n002667\n002668\n002669\n002670\n002671\n002672\n002673\n002674\n002675\n002676\n002677\n002678\n002679\n002680\n002681\n002682\n002683\n002684\n002685\n002686\n002687\n002688\n002689\n002690\n002691\n002692\n002693\n002694\n002695\n002696\n002697\n002698\n002699\n002700\n002701\n002702\n002703\n002704\n002705\n002706\n002707\n002708\n002709\n002710\n002711\n002712\n002713\n002714\n002715\n002716\n002717\n002718\n002719\n002720\n002721\n002722\n002723\n002724\n002725\n002726\n002727\n002728\n002729\n002730\n002731\n002732\n002733\n002734\n002735\n002736\n002737\n002738\n002739\n002740\n002741\n002742\n002743\n002744\n002745\n002746\n002747\n
002748\n002749\n002750\n002751\n002752\n002753\n002754\n002755\n002756\n002757\n002758\n002759\n002760\n002761\n002762\n002763\n002764\n002765\n002766\n002767\n002768\n002769\n002770\n002771\n002772\n002773\n002774\n002775\n002776\n002777\n002778\n002779\n002780\n002781\n002782\n002783\n002784\n002785\n002786\n002787\n002788\n002789\n002790\n002791\n002792\n002793\n002794\n002795\n002796\n002797\n002798\n002799\n002800\n002801\n002802\n002803\n002804\n002805\n002806\n002807\n002808\n002809\n002810\n002811\n002812\n002813\n002814\n002815\n002816\n002817\n002818\n002819\n002820\n002821\n002822\n002823\n002824\n002825\n002826\n002827\n002828\n002829\n002830\n002831\n002832\n002833\n002834\n002835\n002836\n002837\n002838\n002839\n002840\n002841\n002842\n002843\n002844\n002845\n002846\n002847\n002848\n002849\n002850\n002851\n002852\n002853\n002854\n002855\n002856\n002857\n002858\n002859\n002860\n002861\n002862\n002863\n002864\n002865\n002866\n002867\n002868\n002869\n002870\n002871\n002872\n002873\n002874\n002875\n002876\n002877\n002878\n002879\n002880\n002881\n002882\n002883\n002884\n002885\n002886\n002887\n002888\n002889\n002890\n002891\n002892\n002893\n002894\n002895\n002896\n002897\n002898\n002899\n002900\n002901\n002902\n002903\n002904\n002905\n002906\n002907\n002908\n002909\n002910\n002911\n002912\n002913\n002914\n002915\n002916\n002917\n002918\n002919\n002920\n002921\n002922\n002923\n002924\n002925\n002926\n002927\n002928\n002929\n002930\n002931\n002932\n002933\n002934\n002935\n002936\n002937\n002938\n002939\n002940\n002941\n002942\n002943\n002944\n002945\n002946\n002947\n002948\n002949\n002950\n002951\n002952\n002953\n002954\n002955\n002956\n002957\n002958\n002959\n002960\n002961\n002962\n002963\n002964\n002965\n002966\n002967\n002968\n002969\n002970\n002971\n002972\n002973\n002974\n002975\n002976\n002977\n002978\n002979\n002980\n002981\n002982\n002983\n002984\n002985\n002986\n002987\n002988\n002989\n002990\n002991\n002992\n002993\n002994\n002995\n002996\n002997\n
002998\n002999\n003000\n003001\n003002\n003003\n003004\n003005\n003006\n003007\n003008\n003009\n003010\n003011\n003012\n003013\n003014\n003015\n003016\n003017\n003018\n003019\n003020\n003021\n003022\n003023\n003024\n003025\n003026\n003027\n003028\n003029\n003030\n003031\n003032\n003033\n003034\n003035\n003036\n003037\n003038\n003039\n003040\n003041\n003042\n003043\n003044\n003045\n003046\n003047\n003048\n003049\n003050\n003051\n003052\n003053\n003054\n003055\n003056\n003057\n003058\n003059\n003060\n003061\n003062\n003063\n003064\n003065\n003066\n003067\n003068\n003069\n003070\n003071\n003072\n003073\n003074\n003075\n003076\n003077\n003078\n003079\n003080\n003081\n003082\n003083\n003084\n003085\n003086\n003087\n003088\n003089\n003090\n003091\n003092\n003093\n003094\n003095\n003096\n003097\n003098\n003099\n003100\n003101\n003102\n003103\n003104\n003105\n003106\n003107\n003108\n003109\n003110\n003111\n003112\n003113\n003114\n003115\n003116\n003117\n003118\n003119\n003120\n003121\n003122\n003123\n003124\n003125\n003126\n003127\n003128\n003129\n003130\n003131\n003132\n003133\n003134\n003135\n003136\n003137\n003138\n003139\n003140\n003141\n003142\n003143\n003144\n003145\n003146\n003147\n003148\n003149\n003150\n003151\n003152\n003153\n003154\n003155\n003156\n003157\n003158\n003159\n003160\n003161\n003162\n003163\n003164\n003165\n003166\n003167\n003168\n003169\n003170\n003171\n003172\n003173\n003174\n003175\n003176\n003177\n003178\n003179\n003180\n003181\n003182\n003183\n003184\n003185\n003186\n003187\n003188\n003189\n003190\n003191\n003192\n003193\n003194\n003195\n003196\n003197\n003198\n003199\n003200\n003201\n003202\n003203\n003204\n003205\n003206\n003207\n003208\n003209\n003210\n003211\n003212\n003213\n003214\n003215\n003216\n003217\n003218\n003219\n003220\n003221\n003222\n003223\n003224\n003225\n003226\n003227\n003228\n003229\n003230\n003231\n003232\n003233\n003234\n003235\n003236\n003237\n003238\n003239\n003240\n003241\n003242\n003243\n003244\n003245\n003246\n003247\n
003248\n003249\n003250\n003251\n003252\n003253\n003254\n003255\n003256\n003257\n003258\n003259\n003260\n003261\n003262\n003263\n003264\n003265\n003266\n003267\n003268\n003269\n003270\n003271\n003272\n003273\n003274\n003275\n003276\n003277\n003278\n003279\n003280\n003281\n003282\n003283\n003284\n003285\n003286\n003287\n003288\n003289\n003290\n003291\n003292\n003293\n003294\n003295\n003296\n003297\n003298\n003299\n003300\n003301\n003302\n003303\n003304\n003305\n003306\n003307\n003308\n003309\n003310\n003311\n003312\n003313\n003314\n003315\n003316\n003317\n003318\n003319\n003320\n003321\n003322\n003323\n003324\n003325\n003326\n003327\n003328\n003329\n003330\n003331\n003332\n003333\n003334\n003335\n003336\n003337\n003338\n003339\n003340\n003341\n003342\n003343\n003344\n003345\n003346\n003347\n003348\n003349\n003350\n003351\n003352\n003353\n003354\n003355\n003356\n003357\n003358\n003359\n003360\n003361\n003362\n003363\n003364\n003365\n003366\n003367\n003368\n003369\n003370\n003371\n003372\n003373\n003374\n003375\n003376\n003377\n003378\n003379\n003380\n003381\n003382\n003383\n003384\n003385\n003386\n003387\n003388\n003389\n003390\n003391\n003392\n003393\n003394\n003395\n003396\n003397\n003398\n003399\n003400\n003401\n003402\n003403\n003404\n003405\n003406\n003407\n003408\n003409\n003410\n003411\n003412\n003413\n003414\n003415\n003416\n003417\n003418\n003419\n003420\n003421\n003422\n003423\n003424\n003425\n003426\n003427\n003428\n003429\n003430\n003431\n003432\n003433\n003434\n003435\n003436\n003437\n003438\n003439\n003440\n003441\n003442\n003443\n003444\n003445\n003446\n003447\n003448\n003449\n003450\n003451\n003452\n003453\n003454\n003455\n003456\n003457\n003458\n003459\n003460\n003461\n003462\n003463\n003464\n003465\n003466\n003467\n003468\n003469\n003470\n003471\n003472\n003473\n003474\n003475\n003476\n003477\n003478\n003479\n003480\n003481\n003482\n003483\n003484\n003485\n003486\n003487\n003488\n003489\n003490\n003491\n003492\n003493\n003494\n003495\n003496\n003497\n
003498\n003499\n003500\n003501\n003502\n003503\n003504\n003505\n003506\n003507\n003508\n003509\n003510\n003511\n003512\n003513\n003514\n003515\n003516\n003517\n003518\n003519\n003520\n003521\n003522\n003523\n003524\n003525\n003526\n003527\n003528\n003529\n003530\n003531\n003532\n003533\n003534\n003535\n003536\n003537\n003538\n003539\n003540\n003541\n003542\n003543\n003544\n003545\n003546\n003547\n003548\n003549\n003550\n003551\n003552\n003553\n003554\n003555\n003556\n003557\n003558\n003559\n003560\n003561\n003562\n003563\n003564\n003565\n003566\n003567\n003568\n003569\n003570\n003571\n003572\n003573\n003574\n003575\n003576\n003577\n003578\n003579\n003580\n003581\n003582\n003583\n003584\n003585\n003586\n003587\n003588\n003589\n003590\n003591\n003592\n003593\n003594\n003595\n003596\n003597\n003598\n003599\n003600\n003601\n003602\n003603\n003604\n003605\n003606\n003607\n003608\n003609\n003610\n003611\n003612\n003613\n003614\n003615\n003616\n003617\n003618\n003619\n003620\n003621\n003622\n003623\n003624\n003625\n003626\n003627\n003628\n003629\n003630\n003631\n003632\n003633\n003634\n003635\n003636\n003637\n003638\n003639\n003640\n003641\n003642\n003643\n003644\n003645\n003646\n003647\n003648\n003649\n003650\n003651\n003652\n003653\n003654\n003655\n003656\n003657\n003658\n003659\n003660\n003661\n003662\n003663\n003664\n003665\n003666\n003667\n003668\n003669\n003670\n003671\n003672\n003673\n003674\n003675\n003676\n003677\n003678\n003679\n003680\n003681\n003682\n003683\n003684\n003685\n003686\n003687\n003688\n003689\n003690\n003691\n003692\n003693\n003694\n003695\n003696\n003697\n003698\n003699\n003700\n003701\n003702\n003703\n003704\n003705\n003706\n003707\n003708\n003709\n003710\n003711\n003712\n003713\n003714\n003715\n003716\n003717\n003718\n003719\n003720\n003721\n003722\n003723\n003724\n003725\n003726\n003727\n003728\n003729\n003730\n003731\n003732\n003733\n003734\n003735\n003736\n003737\n003738\n003739\n003740\n003741\n003742\n003743\n003744\n003745\n003746\n003747\n
003748\n003749\n003750\n003751\n003752\n003753\n003754\n003755\n003756\n003757\n003758\n003759\n003760\n003761\n003762\n003763\n003764\n003765\n003766\n003767\n003768\n003769\n003770\n003771\n003772\n003773\n003774\n003775\n003776\n003777\n003778\n003779\n003780\n003781\n003782\n003783\n003784\n003785\n003786\n003787\n003788\n003789\n003790\n003791\n003792\n003793\n003794\n003795\n003796\n003797\n003798\n003799\n003800\n003801\n003802\n003803\n003804\n003805\n003806\n003807\n003808\n003809\n003810\n003811\n003812\n003813\n003814\n003815\n003816\n003817\n003818\n003819\n003820\n003821\n003822\n003823\n003824\n003825\n003826\n003827\n003828\n003829\n003830\n003831\n003832\n003833\n003834\n003835\n003836\n003837\n003838\n003839\n003840\n003841\n003842\n003843\n003844\n003845\n003846\n003847\n003848\n003849\n003850\n003851\n003852\n003853\n003854\n003855\n003856\n003857\n003858\n003859\n003860\n003861\n003862\n003863\n003864\n003865\n003866\n003867\n003868\n003869\n003870\n003871\n003872\n003873\n003874\n003875\n003876\n003877\n003878\n003879\n003880\n003881\n003882\n003883\n003884\n003885\n003886\n003887\n003888\n003889\n003890\n003891\n003892\n003893\n003894\n003895\n003896\n003897\n003898\n003899\n003900\n003901\n003902\n003903\n003904\n003905\n003906\n003907\n003908\n003909\n003910\n003911\n003912\n003913\n003914\n003915\n003916\n003917\n003918\n003919\n003920\n003921\n003922\n003923\n003924\n003925\n003926\n003927\n003928\n003929\n003930\n003931\n003932\n003933\n003934\n003935\n003936\n003937\n003938\n003939\n003940\n003941\n003942\n003943\n003944\n003945\n003946\n003947\n003948\n003949\n003950\n003951\n003952\n003953\n003954\n003955\n003956\n003957\n003958\n003959\n003960\n003961\n003962\n003963\n003964\n003965\n003966\n003967\n003968\n003969\n003970\n003971\n003972\n003973\n003974\n003975\n003976\n003977\n003978\n003979\n003980\n003981\n003982\n003983\n003984\n003985\n003986\n003987\n003988\n003989\n003990\n003991\n003992\n003993\n003994\n003995\n003996\n003997\n
003998\n003999\n004000\n004001\n004002\n004003\n004004\n004005\n004006\n004007\n004008\n004009\n004010\n004011\n004012\n004013\n004014\n004015\n004016\n004017\n004018\n004019\n004020\n004021\n004022\n004023\n004024\n004025\n004026\n004027\n004028\n004029\n004030\n004031\n004032\n004033\n004034\n004035\n004036\n004037\n004038\n004039\n004040\n004041\n004042\n004043\n004044\n004045\n004046\n004047\n004048\n004049\n004050\n004051\n004052\n004053\n004054\n004055\n004056\n004057\n004058\n004059\n004060\n004061\n004062\n004063\n004064\n004065\n004066\n004067\n004068\n004069\n004070\n004071\n004072\n004073\n004074\n004075\n004076\n004077\n004078\n004079\n004080\n004081\n004082\n004083\n004084\n004085\n004086\n004087\n004088\n004089\n004090\n004091\n004092\n004093\n004094\n004095\n004096\n004097\n004098\n004099\n004100\n004101\n004102\n004103\n004104\n004105\n004106\n004107\n004108\n004109\n004110\n004111\n004112\n004113\n004114\n004115\n004116\n004117\n004118\n004119\n004120\n004121\n004122\n004123\n004124\n004125\n004126\n004127\n004128\n004129\n004130\n004131\n004132\n004133\n004134\n004135\n004136\n004137\n004138\n004139\n004140\n004141\n004142\n004143\n004144\n004145\n004146\n004147\n004148\n004149\n004150\n004151\n004152\n004153\n004154\n004155\n004156\n004157\n004158\n004159\n004160\n004161\n004162\n004163\n004164\n004165\n004166\n004167\n004168\n004169\n004170\n004171\n004172\n004173\n004174\n004175\n004176\n004177\n004178\n004179\n004180\n004181\n004182\n004183\n004184\n004185\n004186\n004187\n004188\n004189\n004190\n004191\n004192\n004193\n004194\n004195\n004196\n004197\n004198\n004199\n004200\n004201\n004202\n004203\n004204\n004205\n004206\n004207\n004208\n004209\n004210\n004211\n004212\n004213\n004214\n004215\n004216\n004217\n004218\n004219\n004220\n004221\n004222\n004223\n004224\n004225\n004226\n004227\n004228\n004229\n004230\n004231\n004232\n004233\n004234\n004235\n004236\n004237\n004238\n004239\n004240\n004241\n004242\n004243\n004244\n004245\n004246\n004247\n
004248\n004249\n004250\n004251\n004252\n004253\n004254\n004255\n004256\n004257\n004258\n004259\n004260\n004261\n004262\n004263\n004264\n004265\n004266\n004267\n004268\n004269\n004270\n004271\n004272\n004273\n004274\n004275\n004276\n004277\n004278\n004279\n004280\n004281\n004282\n004283\n004284\n004285\n004286\n004287\n004288\n004289\n004290\n004291\n004292\n004293\n004294\n004295\n004296\n004297\n004298\n004299\n004300\n004301\n004302\n004303\n004304\n004305\n004306\n004307\n004308\n004309\n004310\n004311\n004312\n004313\n004314\n004315\n004316\n004317\n004318\n004319\n004320\n004321\n004322\n004323\n004324\n004325\n004326\n004327\n004328\n004329\n004330\n004331\n004332\n004333\n004334\n004335\n004336\n004337\n004338\n004339\n004340\n004341\n004342\n004343\n004344\n004345\n004346\n004347\n004348\n004349\n004350\n004351\n004352\n004353\n004354\n004355\n004356\n004357\n004358\n004359\n004360\n004361\n004362\n004363\n004364\n004365\n004366\n004367\n004368\n004369\n004370\n004371\n004372\n004373\n004374\n004375\n004376\n004377\n004378\n004379\n004380\n004381\n004382\n004383\n004384\n004385\n004386\n004387\n004388\n004389\n004390\n004391\n004392\n004393\n004394\n004395\n004396\n004397\n004398\n004399\n004400\n004401\n004402\n004403\n004404\n004405\n004406\n004407\n004408\n004409\n004410\n004411\n004412\n004413\n004414\n004415\n004416\n004417\n004418\n004419\n004420\n004421\n004422\n004423\n004424\n004425\n004426\n004427\n004428\n004429\n004430\n004431\n004432\n004433\n004434\n004435\n004436\n004437\n004438\n004439\n004440\n004441\n004442\n004443\n004444\n004445\n004446\n004447\n004448\n004449\n004450\n004451\n004452\n004453\n004454\n004455\n004456\n004457\n004458\n004459\n004460\n004461\n004462\n004463\n004464\n004465\n004466\n004467\n004468\n004469\n004470\n004471\n004472\n004473\n004474\n004475\n004476\n004477\n004478\n004479\n004480\n004481\n004482\n004483\n004484\n004485\n004486\n004487\n004488\n004489\n004490\n004491\n004492\n004493\n004494\n004495\n004496\n004497\n
004498\n004499\n004500\n004501\n004502\n004503\n004504\n004505\n004506\n004507\n004508\n004509\n004510\n004511\n004512\n004513\n004514\n004515\n004516\n004517\n004518\n004519\n004520\n004521\n004522\n004523\n004524\n004525\n004526\n004527\n004528\n004529\n004530\n004531\n004532\n004533\n004534\n004535\n004536\n004537\n004538\n004539\n004540\n004541\n004542\n004543\n004544\n004545\n004546\n004547\n004548\n004549\n004550\n004551\n004552\n004553\n004554\n004555\n004556\n004557\n004558\n004559\n004560\n004561\n004562\n004563\n004564\n004565\n004566\n004567\n004568\n004569\n004570\n004571\n004572\n004573\n004574\n004575\n004576\n004577\n004578\n004579\n004580\n004581\n004582\n004583\n004584\n004585\n004586\n004587\n004588\n004589\n004590\n004591\n004592\n004593\n004594\n004595\n004596\n004597\n004598\n004599\n004600\n004601\n004602\n004603\n004604\n004605\n004606\n004607\n004608\n004609\n004610\n004611\n004612\n004613\n004614\n004615\n004616\n004617\n004618\n004619\n004620\n004621\n004622\n004623\n004624\n004625\n004626\n004627\n004628\n004629\n004630\n004631\n004632\n004633\n004634\n004635\n004636\n004637\n004638\n004639\n004640\n004641\n004642\n004643\n004644\n004645\n004646\n004647\n004648\n004649\n004650\n004651\n004652\n004653\n004654\n004655\n004656\n004657\n004658\n004659\n004660\n004661\n004662\n004663\n004664\n004665\n004666\n004667\n004668\n004669\n004670\n004671\n004672\n004673\n004674\n004675\n004676\n004677\n004678\n004679\n004680\n004681\n004682\n004683\n004684\n004685\n004686\n004687\n004688\n004689\n004690\n004691\n004692\n004693\n004694\n004695\n004696\n004697\n004698\n004699\n004700\n004701\n004702\n004703\n004704\n004705\n004706\n004707\n004708\n004709\n004710\n004711\n004712\n004713\n004714\n004715\n004716\n004717\n004718\n004719\n004720\n004721\n004722\n004723\n004724\n004725\n004726\n004727\n004728\n004729\n004730\n004731\n004732\n004733\n004734\n004735\n004736\n004737\n004738\n004739\n004740\n004741\n004742\n004743\n004744\n004745\n004746\n004747\n
004748\n004749\n004750\n004751\n004752\n004753\n004754\n004755\n004756\n004757\n004758\n004759\n004760\n004761\n004762\n004763\n004764\n004765\n004766\n004767\n004768\n004769\n004770\n004771\n004772\n004773\n004774\n004775\n004776\n004777\n004778\n004779\n004780\n004781\n004782\n004783\n004784\n004785\n004786\n004787\n004788\n004789\n004790\n004791\n004792\n004793\n004794\n004795\n004796\n004797\n004798\n004799\n004800\n004801\n004802\n004803\n004804\n004805\n004806\n004807\n004808\n004809\n004810\n004811\n004812\n004813\n004814\n004815\n004816\n004817\n004818\n004819\n004820\n004821\n004822\n004823\n004824\n004825\n004826\n004827\n004828\n004829\n004830\n004831\n004832\n004833\n004834\n004835\n004836\n004837\n004838\n004839\n004840\n004841\n004842\n004843\n004844\n004845\n004846\n004847\n004848\n004849\n004850\n004851\n004852\n004853\n004854\n004855\n004856\n004857\n004858\n004859\n004860\n004861\n004862\n004863\n004864\n004865\n004866\n004867\n004868\n004869\n004870\n004871\n004872\n004873\n004874\n004875\n004876\n004877\n004878\n004879\n004880\n004881\n004882\n004883\n004884\n004885\n004886\n004887\n004888\n004889\n004890\n004891\n004892\n004893\n004894\n004895\n004896\n004897\n004898\n004899\n004900\n004901\n004902\n004903\n004904\n004905\n004906\n004907\n004908\n004909\n004910\n004911\n004912\n004913\n004914\n004915\n004916\n004917\n004918\n004919\n004920\n004921\n004922\n004923\n004924\n004925\n004926\n004927\n004928\n004929\n004930\n004931\n004932\n004933\n004934\n004935\n004936\n004937\n004938\n004939\n004940\n004941\n004942\n004943\n004944\n004945\n004946\n004947\n004948\n004949\n004950\n004951\n004952\n004953\n004954\n004955\n004956\n004957\n004958\n004959\n004960\n004961\n004962\n004963\n004964\n004965\n004966\n004967\n004968\n004969\n004970\n004971\n004972\n004973\n004974\n004975\n004976\n004977\n004978\n004979\n004980\n004981\n004982\n004983\n004984\n004985\n004986\n004987\n004988\n004989\n004990\n004991\n004992\n004993\n004994\n004995\n004996\n004997\n
004998\n004999\n005000\n005001\n005002\n005003\n005004\n005005\n005006\n005007\n005008\n005009\n005010\n005011\n005012\n005013\n005014\n005015\n005016\n005017\n005018\n005019\n005020\n005021\n005022\n005023\n005024\n005025\n005026\n005027\n005028\n005029\n005030\n005031\n005032\n005033\n005034\n005035\n005036\n005037\n005038\n005039\n005040\n005041\n005042\n005043\n005044\n005045\n005046\n005047\n005048\n005049\n005050\n005051\n005052\n005053\n005054\n005055\n005056\n005057\n005058\n005059\n005060\n005061\n005062\n005063\n005064\n005065\n005066\n005067\n005068\n005069\n005070\n005071\n005072\n005073\n005074\n005075\n005076\n005077\n005078\n005079\n005080\n005081\n005082\n005083\n005084\n005085\n005086\n005087\n005088\n005089\n005090\n005091\n005092\n005093\n005094\n005095\n005096\n005097\n005098\n005099\n005100\n005101\n005102\n005103\n005104\n005105\n005106\n005107\n005108\n005109\n005110\n005111\n005112\n005113\n005114\n005115\n005116\n005117\n005118\n005119\n005120\n005121\n005122\n005123\n005124\n005125\n005126\n005127\n005128\n005129\n005130\n005131\n005132\n005133\n005134\n005135\n005136\n005137\n005138\n005139\n005140\n005141\n005142\n005143\n005144\n005145\n005146\n005147\n005148\n005149\n005150\n005151\n005152\n005153\n005154\n005155\n005156\n005157\n005158\n005159\n005160\n005161\n005162\n005163\n005164\n005165\n005166\n005167\n005168\n005169\n005170\n005171\n005172\n005173\n005174\n005175\n005176\n005177\n005178\n005179\n005180\n005181\n005182\n005183\n005184\n005185\n005186\n005187\n005188\n005189\n005190\n005191\n005192\n005193\n005194\n005195\n005196\n005197\n005198\n005199\n005200\n005201\n005202\n005203\n005204\n005205\n005206\n005207\n005208\n005209\n005210\n005211\n005212\n005213\n005214\n005215\n005216\n005217\n005218\n005219\n005220\n005221\n005222\n005223\n005224\n005225\n005226\n005227\n005228\n005229\n005230\n005231\n005232\n005233\n005234\n005235\n005236\n005237\n005238\n005239\n005240\n005241\n005242\n005243\n005244\n005245\n005246\n005247\n
005248\n005249\n005250\n005251\n005252\n005253\n005254\n005255\n005256\n005257\n005258\n005259\n005260\n005261\n005262\n005263\n005264\n005265\n005266\n005267\n005268\n005269\n005270\n005271\n005272\n005273\n005274\n005275\n005276\n005277\n005278\n005279\n005280\n005281\n005282\n005283\n005284\n005285\n005286\n005287\n005288\n005289\n005290\n005291\n005292\n005293\n005294\n005295\n005296\n005297\n005298\n005299\n005300\n005301\n005302\n005303\n005304\n005305\n005306\n005307\n005308\n005309\n005310\n005311\n005312\n005313\n005314\n005315\n005316\n005317\n005318\n005319\n005320\n005321\n005322\n005323\n005324\n005325\n005326\n005327\n005328\n005329\n005330\n005331\n005332\n005333\n005334\n005335\n005336\n005337\n005338\n005339\n005340\n005341\n005342\n005343\n005344\n005345\n005346\n005347\n005348\n005349\n005350\n005351\n005352\n005353\n005354\n005355\n005356\n005357\n005358\n005359\n005360\n005361\n005362\n005363\n005364\n005365\n005366\n005367\n005368\n005369\n005370\n005371\n005372\n005373\n005374\n005375\n005376\n005377\n005378\n005379\n005380\n005381\n005382\n005383\n005384\n005385\n005386\n005387\n005388\n005389\n005390\n005391\n005392\n005393\n005394\n005395\n005396\n005397\n005398\n005399\n005400\n005401\n005402\n005403\n005404\n005405\n005406\n005407\n005408\n005409\n005410\n005411\n005412\n005413\n005414\n005415\n005416\n005417\n005418\n005419\n005420\n005421\n005422\n005423\n005424\n005425\n005426\n005427\n005428\n005429\n005430\n005431\n005432\n005433\n005434\n005435\n005436\n005437\n005438\n005439\n005440\n005441\n005442\n005443\n005444\n005445\n005446\n005447\n005448\n005449\n005450\n005451\n005452\n005453\n005454\n005455\n005456\n005457\n005458\n005459\n005460\n005461\n005462\n005463\n005464\n005465\n005466\n005467\n005468\n005469\n005470\n005471\n005472\n005473\n005474\n005475\n005476\n005477\n005478\n005479\n005480\n005481\n005482\n005483\n005484\n005485\n005486\n005487\n005488\n005489\n005490\n005491\n005492\n005493\n005494\n005495\n005496\n005497\n
005498\n005499\n005500\n005501\n005502\n005503\n005504\n005505\n005506\n005507\n005508\n005509\n005510\n005511\n005512\n005513\n005514\n005515\n005516\n005517\n005518\n005519\n005520\n005521\n005522\n005523\n005524\n005525\n005526\n005527\n005528\n005529\n005530\n005531\n005532\n005533\n005534\n005535\n005536\n005537\n005538\n005539\n005540\n005541\n005542\n005543\n005544\n005545\n005546\n005547\n005548\n005549\n005550\n005551\n005552\n005553\n005554\n005555\n005556\n005557\n005558\n005559\n005560\n005561\n005562\n005563\n005564\n005565\n005566\n005567\n005568\n005569\n005570\n005571\n005572\n005573\n005574\n005575\n005576\n005577\n005578\n005579\n005580\n005581\n005582\n005583\n005584\n005585\n005586\n005587\n005588\n005589\n005590\n005591\n005592\n005593\n005594\n005595\n005596\n005597\n005598\n005599\n005600\n005601\n005602\n005603\n005604\n005605\n005606\n005607\n005608\n005609\n005610\n005611\n005612\n005613\n005614\n005615\n005616\n005617\n005618\n005619\n005620\n005621\n005622\n005623\n005624\n005625\n005626\n005627\n005628\n005629\n005630\n005631\n005632\n005633\n005634\n005635\n005636\n005637\n005638\n005639\n005640\n005641\n005642\n005643\n005644\n005645\n005646\n005647\n005648\n005649\n005650\n005651\n005652\n005653\n005654\n005655\n005656\n005657\n005658\n005659\n005660\n005661\n005662\n005663\n005664\n005665\n005666\n005667\n005668\n005669\n005670\n005671\n005672\n005673\n005674\n005675\n005676\n005677\n005678\n005679\n005680\n005681\n005682\n005683\n005684\n005685\n005686\n005687\n005688\n005689\n005690\n005691\n005692\n005693\n005694\n005695\n005696\n005697\n005698\n005699\n005700\n005701\n005702\n005703\n005704\n005705\n005706\n005707\n005708\n005709\n005710\n005711\n005712\n005713\n005714\n005715\n005716\n005717\n005718\n005719\n005720\n005721\n005722\n005723\n005724\n005725\n005726\n005727\n005728\n005729\n005730\n005731\n005732\n005733\n005734\n005735\n005736\n005737\n005738\n005739\n005740\n005741\n005742\n005743\n005744\n005745\n005746\n005747\n
005748\n005749\n005750\n005751\n005752\n005753\n005754\n005755\n005756\n005757\n005758\n005759\n005760\n005761\n005762\n005763\n005764\n005765\n005766\n005767\n005768\n005769\n005770\n005771\n005772\n005773\n005774\n005775\n005776\n005777\n005778\n005779\n005780\n005781\n005782\n005783\n005784\n005785\n005786\n005787\n005788\n005789\n005790\n005791\n005792\n005793\n005794\n005795\n005796\n005797\n005798\n005799\n005800\n005801\n005802\n005803\n005804\n005805\n005806\n005807\n005808\n005809\n005810\n005811\n005812\n005813\n005814\n005815\n005816\n005817\n005818\n005819\n005820\n005821\n005822\n005823\n005824\n005825\n005826\n005827\n005828\n005829\n005830\n005831\n005832\n005833\n005834\n005835\n005836\n005837\n005838\n005839\n005840\n005841\n005842\n005843\n005844\n005845\n005846\n005847\n005848\n005849\n005850\n005851\n005852\n005853\n005854\n005855\n005856\n005857\n005858\n005859\n005860\n005861\n005862\n005863\n005864\n005865\n005866\n005867\n005868\n005869\n005870\n005871\n005872\n005873\n005874\n005875\n005876\n005877\n005878\n005879\n005880\n005881\n005882\n005883\n005884\n005885\n005886\n005887\n005888\n005889\n005890\n005891\n005892\n005893\n005894\n005895\n005896\n005897\n005898\n005899\n005900\n005901\n005902\n005903\n005904\n005905\n005906\n005907\n005908\n005909\n005910\n005911\n005912\n005913\n005914\n005915\n005916\n005917\n005918\n005919\n005920\n005921\n005922\n005923\n005924\n005925\n005926\n005927\n005928\n005929\n005930\n005931\n005932\n005933\n005934\n005935\n005936\n005937\n005938\n005939\n005940\n005941\n005942\n005943\n005944\n005945\n005946\n005947\n005948\n005949\n005950\n005951\n005952\n005953\n005954\n005955\n005956\n005957\n005958\n005959\n005960\n005961\n005962\n005963\n005964\n005965\n005966\n005967\n005968\n005969\n005970\n005971\n005972\n005973\n005974\n005975\n005976\n005977\n005978\n005979\n005980\n005981\n005982\n005983\n005984\n005985\n005986\n005987\n005988\n005989\n005990\n005991\n005992\n005993\n005994\n005995\n005996\n005997\n
005998\n005999\n006000\n006001\n006002\n006003\n006004\n006005\n006006\n006007\n006008\n006009\n006010\n006011\n006012\n006013\n006014\n006015\n006016\n006017\n006018\n006019\n006020\n006021\n006022\n006023\n006024\n006025\n006026\n006027\n006028\n006029\n006030\n006031\n006032\n006033\n006034\n006035\n006036\n006037\n006038\n006039\n006040\n006041\n006042\n006043\n006044\n006045\n006046\n006047\n006048\n006049\n006050\n006051\n006052\n006053\n006054\n006055\n006056\n006057\n006058\n006059\n006060\n006061\n006062\n006063\n006064\n006065\n006066\n006067\n006068\n006069\n006070\n006071\n006072\n006073\n006074\n006075\n006076\n006077\n006078\n006079\n006080\n006081\n006082\n006083\n006084\n006085\n006086\n006087\n006088\n006089\n006090\n006091\n006092\n006093\n006094\n006095\n006096\n006097\n006098\n006099\n006100\n006101\n006102\n006103\n006104\n006105\n006106\n006107\n006108\n006109\n006110\n006111\n006112\n006113\n006114\n006115\n006116\n006117\n006118\n006119\n006120\n006121\n006122\n006123\n006124\n006125\n006126\n006127\n006128\n006129\n006130\n006131\n006132\n006133\n006134\n006135\n006136\n006137\n006138\n006139\n006140\n006141\n006142\n006143\n006144\n006145\n006146\n006147\n006148\n006149\n006150\n006151\n006152\n006153\n006154\n006155\n006156\n006157\n006158\n006159\n006160\n006161\n006162\n006163\n006164\n006165\n006166\n006167\n006168\n006169\n006170\n006171\n006172\n006173\n006174\n006175\n006176\n006177\n006178\n006179\n006180\n006181\n006182\n006183\n006184\n006185\n006186\n006187\n006188\n006189\n006190\n006191\n006192\n006193\n006194\n006195\n006196\n006197\n006198\n006199\n006200\n006201\n006202\n006203\n006204\n006205\n006206\n006207\n006208\n006209\n006210\n006211\n006212\n006213\n006214\n006215\n006216\n006217\n006218\n006219\n006220\n006221\n006222\n006223\n006224\n006225\n006226\n006227\n006228\n006229\n006230\n006231\n006232\n006233\n006234\n006235\n006236\n006237\n006238\n006239\n006240\n006241\n006242\n006243\n006244\n006245\n006246\n006247\n
006248\n006249\n006250\n006251\n006252\n006253\n006254\n006255\n006256\n006257\n006258\n006259\n006260\n006261\n006262\n006263\n006264\n006265\n006266\n006267\n006268\n006269\n006270\n006271\n006272\n006273\n006274\n006275\n006276\n006277\n006278\n006279\n006280\n006281\n006282\n006283\n006284\n006285\n006286\n006287\n006288\n006289\n006290\n006291\n006292\n006293\n006294\n006295\n006296\n006297\n006298\n006299\n006300\n006301\n006302\n006303\n006304\n006305\n006306\n006307\n006308\n006309\n006310\n006311\n006312\n006313\n006314\n006315\n006316\n006317\n006318\n006319\n006320\n006321\n006322\n006323\n006324\n006325\n006326\n006327\n006328\n006329\n006330\n006331\n006332\n006333\n006334\n006335\n006336\n006337\n006338\n006339\n006340\n006341\n006342\n006343\n006344\n006345\n006346\n006347\n006348\n006349\n006350\n006351\n006352\n006353\n006354\n006355\n006356\n006357\n006358\n006359\n006360\n006361\n006362\n006363\n006364\n006365\n006366\n006367\n006368\n006369\n006370\n006371\n006372\n006373\n006374\n006375\n006376\n006377\n006378\n006379\n006380\n006381\n006382\n006383\n006384\n006385\n006386\n006387\n006388\n006389\n006390\n006391\n006392\n006393\n006394\n006395\n006396\n006397\n006398\n006399\n006400\n006401\n006402\n006403\n006404\n006405\n006406\n006407\n006408\n006409\n006410\n006411\n006412\n006413\n006414\n006415\n006416\n006417\n006418\n006419\n006420\n006421\n006422\n006423\n006424\n006425\n006426\n006427\n006428\n006429\n006430\n006431\n006432\n006433\n006434\n006435\n006436\n006437\n006438\n006439\n006440\n006441\n006442\n006443\n006444\n006445\n006446\n006447\n006448\n006449\n006450\n006451\n006452\n006453\n006454\n006455\n006456\n006457\n006458\n006459\n006460\n006461\n006462\n006463\n006464\n006465\n006466\n006467\n006468\n006469\n006470\n006471\n006472\n006473\n006474\n006475\n006476\n006477\n006478\n006479\n006480\n006481\n006482\n006483\n006484\n006485\n006486\n006487\n006488\n006489\n006490\n006491\n006492\n006493\n006494\n006495\n006496\n006497\n
006498\n006499\n006500\n006501\n006502\n006503\n006504\n006505\n006506\n006507\n006508\n006509\n006510\n006511\n006512\n006513\n006514\n006515\n006516\n006517\n006518\n006519\n006520\n006521\n006522\n006523\n006524\n006525\n006526\n006527\n006528\n006529\n006530\n006531\n006532\n006533\n006534\n006535\n006536\n006537\n006538\n006539\n006540\n006541\n006542\n006543\n006544\n006545\n006546\n006547\n006548\n006549\n006550\n006551\n006552\n006553\n006554\n006555\n006556\n006557\n006558\n006559\n006560\n006561\n006562\n006563\n006564\n006565\n006566\n006567\n006568\n006569\n006570\n006571\n006572\n006573\n006574\n006575\n006576\n006577\n006578\n006579\n006580\n006581\n006582\n006583\n006584\n006585\n006586\n006587\n006588\n006589\n006590\n006591\n006592\n006593\n006594\n006595\n006596\n006597\n006598\n006599\n006600\n006601\n006602\n006603\n006604\n006605\n006606\n006607\n006608\n006609\n006610\n006611\n006612\n006613\n006614\n006615\n006616\n006617\n006618\n006619\n006620\n006621\n006622\n006623\n006624\n006625\n006626\n006627\n006628\n006629\n006630\n006631\n006632\n006633\n006634\n006635\n006636\n006637\n006638\n006639\n006640\n006641\n006642\n006643\n006644\n006645\n006646\n006647\n006648\n006649\n006650\n006651\n006652\n006653\n006654\n006655\n006656\n006657\n006658\n006659\n006660\n006661\n006662\n006663\n006664\n006665\n006666\n006667\n006668\n006669\n006670\n006671\n006672\n006673\n006674\n006675\n006676\n006677\n006678\n006679\n006680\n006681\n006682\n006683\n006684\n006685\n006686\n006687\n006688\n006689\n006690\n006691\n006692\n006693\n006694\n006695\n006696\n006697\n006698\n006699\n006700\n006701\n006702\n006703\n006704\n006705\n006706\n006707\n006708\n006709\n006710\n006711\n006712\n006713\n006714\n006715\n006716\n006717\n006718\n006719\n006720\n006721\n006722\n006723\n006724\n006725\n006726\n006727\n006728\n006729\n006730\n006731\n006732\n006733\n006734\n006735\n006736\n006737\n006738\n006739\n006740\n006741\n006742\n006743\n006744\n006745\n006746\n006747\n
006748\n006749\n006750\n006751\n006752\n006753\n006754\n006755\n006756\n006757\n006758\n006759\n006760\n006761\n006762\n006763\n006764\n006765\n006766\n006767\n006768\n006769\n006770\n006771\n006772\n006773\n006774\n006775\n006776\n006777\n006778\n006779\n006780\n006781\n006782\n006783\n006784\n006785\n006786\n006787\n006788\n006789\n006790\n006791\n006792\n006793\n006794\n006795\n006796\n006797\n006798\n006799\n006800\n006801\n006802\n006803\n006804\n006805\n006806\n006807\n006808\n006809\n006810\n006811\n006812\n006813\n006814\n006815\n006816\n006817\n006818\n006819\n006820\n006821\n006822\n006823\n006824\n006825\n006826\n006827\n006828\n006829\n006830\n006831\n006832\n006833\n006834\n006835\n006836\n006837\n006838\n006839\n006840\n006841\n006842\n006843\n006844\n006845\n006846\n006847\n006848\n006849\n006850\n006851\n006852\n006853\n006854\n006855\n006856\n006857\n006858\n006859\n006860\n006861\n006862\n006863\n006864\n006865\n006866\n006867\n006868\n006869\n006870\n006871\n006872\n006873\n006874\n006875\n006876\n006877\n006878\n006879\n006880\n006881\n006882\n006883\n006884\n006885\n006886\n006887\n006888\n006889\n006890\n006891\n006892\n006893\n006894\n006895\n006896\n006897\n006898\n006899\n006900\n006901\n006902\n006903\n006904\n006905\n006906\n006907\n006908\n006909\n006910\n006911\n006912\n006913\n006914\n006915\n006916\n006917\n006918\n006919\n006920\n006921\n006922\n006923\n006924\n006925\n006926\n006927\n006928\n006929\n006930\n006931\n006932\n006933\n006934\n006935\n006936\n006937\n006938\n006939\n006940\n006941\n006942\n006943\n006944\n006945\n006946\n006947\n006948\n006949\n006950\n006951\n006952\n006953\n006954\n006955\n006956\n006957\n006958\n006959\n006960\n006961\n006962\n006963\n006964\n006965\n006966\n006967\n006968\n006969\n006970\n006971\n006972\n006973\n006974\n006975\n006976\n006977\n006978\n006979\n006980\n006981\n006982\n006983\n006984\n006985\n006986\n006987\n006988\n006989\n006990\n006991\n006992\n006993\n006994\n006995\n006996\n006997\n
006998\n006999\n007000\n007001\n007002\n007003\n007004\n007005\n007006\n007007\n007008\n007009\n007010\n007011\n007012\n007013\n007014\n007015\n007016\n007017\n007018\n007019\n007020\n007021\n007022\n007023\n007024\n007025\n007026\n007027\n007028\n007029\n007030\n007031\n007032\n007033\n007034\n007035\n007036\n007037\n007038\n007039\n007040\n007041\n007042\n007043\n007044\n007045\n007046\n007047\n007048\n007049\n007050\n007051\n007052\n007053\n007054\n007055\n007056\n007057\n007058\n007059\n007060\n007061\n007062\n007063\n007064\n007065\n007066\n007067\n007068\n007069\n007070\n007071\n007072\n007073\n007074\n007075\n007076\n007077\n007078\n007079\n007080\n007081\n007082\n007083\n007084\n007085\n007086\n007087\n007088\n007089\n007090\n007091\n007092\n007093\n007094\n007095\n007096\n007097\n007098\n007099\n007100\n007101\n007102\n007103\n007104\n007105\n007106\n007107\n007108\n007109\n007110\n007111\n007112\n007113\n007114\n007115\n007116\n007117\n007118\n007119\n007120\n007121\n007122\n007123\n007124\n007125\n007126\n007127\n007128\n007129\n007130\n007131\n007132\n007133\n007134\n007135\n007136\n007137\n007138\n007139\n007140\n007141\n007142\n007143\n007144\n007145\n007146\n007147\n007148\n007149\n007150\n007151\n007152\n007153\n007154\n007155\n007156\n007157\n007158\n007159\n007160\n007161\n007162\n007163\n007164\n007165\n007166\n007167\n007168\n007169\n007170\n007171\n007172\n007173\n007174\n007175\n007176\n007177\n007178\n007179\n007180\n007181\n007182\n007183\n007184\n007185\n007186\n007187\n007188\n007189\n007190\n007191\n007192\n007193\n007194\n007195\n007196\n007197\n007198\n007199\n007200\n007201\n007202\n007203\n007204\n007205\n007206\n007207\n007208\n007209\n007210\n007211\n007212\n007213\n007214\n007215\n007216\n007217\n007218\n007219\n007220\n007221\n007222\n007223\n007224\n007225\n007226\n007227\n007228\n007229\n007230\n007231\n007232\n007233\n007234\n007235\n007236\n007237\n007238\n007239\n007240\n007241\n007242\n007243\n007244\n007245\n007246\n007247\n
007248\n007249\n007250\n007251\n007252\n007253\n007254\n007255\n007256\n007257\n007258\n007259\n007260\n007261\n007262\n007263\n007264\n007265\n007266\n007267\n007268\n007269\n007270\n007271\n007272\n007273\n007274\n007275\n007276\n007277\n007278\n007279\n007280\n007281\n007282\n007283\n007284\n007285\n007286\n007287\n007288\n007289\n007290\n007291\n007292\n007293\n007294\n007295\n007296\n007297\n007298\n007299\n007300\n007301\n007302\n007303\n007304\n007305\n007306\n007307\n007308\n007309\n007310\n007311\n007312\n007313\n007314\n007315\n007316\n007317\n007318\n007319\n007320\n007321\n007322\n007323\n007324\n007325\n007326\n007327\n007328\n007329\n007330\n007331\n007332\n007333\n007334\n007335\n007336\n007337\n007338\n007339\n007340\n007341\n007342\n007343\n007344\n007345\n007346\n007347\n007348\n007349\n007350\n007351\n007352\n007353\n007354\n007355\n007356\n007357\n007358\n007359\n007360\n007361\n007362\n007363\n007364\n007365\n007366\n007367\n007368\n007369\n007370\n007371\n007372\n007373\n007374\n007375\n007376\n007377\n007378\n007379\n007380\n007381\n007382\n007383\n007384\n007385\n007386\n007387\n007388\n007389\n007390\n007391\n007392\n007393\n007394\n007395\n007396\n007397\n007398\n007399\n007400\n007401\n007402\n007403\n007404\n007405\n007406\n007407\n007408\n007409\n007410\n007411\n007412\n007413\n007414\n007415\n007416\n007417\n007418\n007419\n007420\n007421\n007422\n007423\n007424\n007425\n007426\n007427\n007428\n007429\n007430\n007431\n007432\n007433\n007434\n007435\n007436\n007437\n007438\n007439\n007440\n007441\n007442\n007443\n007444\n007445\n007446\n007447\n007448\n007449\n007450\n007451\n007452\n007453\n007454\n007455\n007456\n007457\n007458\n007459\n007460\n007461\n007462\n007463\n007464\n007465\n007466\n007467\n007468\n007469\n007470\n007471\n007472\n007473\n007474\n007475\n007476\n007477\n007478\n007479\n007480\n007481\n007482\n007483\n007484\n007485\n007486\n007487\n007488\n007489\n007490\n007491\n007492\n007493\n007494\n007495\n007496\n007497\n
007498\n007499\n007500\n007501\n007502\n007503\n007504\n007505\n007506\n007507\n007508\n007509\n007510\n007511\n007512\n007513\n007514\n007515\n007516\n007517"
  },
  {
    "path": "data/kitti/ImageSets/train.txt",
    "content": "000000\n000003\n000007\n000009\n000010\n000011\n000012\n000013\n000014\n000016\n000017\n000018\n000022\n000026\n000029\n000030\n000032\n000034\n000036\n000038\n000041\n000043\n000044\n000045\n000046\n000049\n000051\n000054\n000055\n000056\n000057\n000060\n000064\n000067\n000068\n000069\n000070\n000071\n000072\n000073\n000074\n000075\n000079\n000080\n000082\n000083\n000084\n000085\n000086\n000087\n000088\n000091\n000092\n000095\n000096\n000097\n000099\n000100\n000101\n000103\n000105\n000109\n000110\n000111\n000112\n000113\n000114\n000115\n000119\n000120\n000121\n000123\n000125\n000127\n000129\n000130\n000131\n000133\n000136\n000138\n000141\n000142\n000144\n000145\n000146\n000148\n000149\n000150\n000154\n000155\n000157\n000158\n000160\n000162\n000163\n000164\n000165\n000166\n000171\n000172\n000176\n000177\n000178\n000179\n000180\n000184\n000185\n000189\n000193\n000198\n000200\n000202\n000205\n000206\n000208\n000209\n000210\n000214\n000215\n000217\n000219\n000220\n000221\n000222\n000225\n000227\n000228\n000232\n000233\n000238\n000240\n000241\n000243\n000244\n000245\n000253\n000254\n000255\n000256\n000257\n000258\n000259\n000261\n000264\n000267\n000271\n000274\n000275\n000276\n000277\n000280\n000282\n000285\n000286\n000287\n000288\n000292\n000294\n000295\n000296\n000298\n000299\n000300\n000303\n000304\n000306\n000310\n000313\n000316\n000317\n000318\n000322\n000325\n000326\n000330\n000331\n000334\n000337\n000338\n000339\n000342\n000344\n000348\n000349\n000353\n000358\n000363\n000364\n000367\n000368\n000371\n000374\n000375\n000380\n000384\n000387\n000389\n000390\n000400\n000405\n000406\n000410\n000411\n000412\n000416\n000417\n000418\n000421\n000423\n000424\n000425\n000426\n000431\n000432\n000433\n000434\n000435\n000438\n000439\n000441\n000442\n000444\n000445\n000447\n000449\n000456\n000458\n000460\n000461\n000462\n000464\n000465\n000466\n000467\n000470\n000471\n000474\n000482\n000483\n000484\n000487\n000488\n000490\n000497\n000500\n000501\n000502\n000505\n
000507\n000511\n000513\n000514\n000516\n000518\n000520\n000522\n000523\n000525\n000526\n000529\n000531\n000532\n000534\n000535\n000537\n000538\n000539\n000540\n000544\n000547\n000549\n000550\n000552\n000553\n000556\n000557\n000562\n000563\n000565\n000570\n000573\n000574\n000575\n000576\n000577\n000578\n000579\n000580\n000582\n000584\n000585\n000586\n000587\n000592\n000593\n000594\n000596\n000597\n000598\n000599\n000602\n000603\n000605\n000606\n000607\n000608\n000609\n000616\n000617\n000621\n000622\n000623\n000627\n000629\n000631\n000632\n000633\n000637\n000638\n000640\n000641\n000643\n000646\n000649\n000651\n000652\n000653\n000654\n000656\n000661\n000662\n000663\n000664\n000665\n000666\n000668\n000671\n000672\n000673\n000675\n000676\n000678\n000680\n000681\n000685\n000686\n000687\n000688\n000689\n000690\n000693\n000695\n000697\n000701\n000703\n000705\n000707\n000709\n000710\n000711\n000712\n000713\n000714\n000715\n000719\n000720\n000723\n000724\n000726\n000730\n000732\n000733\n000735\n000738\n000739\n000742\n000743\n000744\n000747\n000749\n000753\n000755\n000757\n000758\n000759\n000760\n000762\n000763\n000764\n000770\n000775\n000776\n000777\n000780\n000781\n000783\n000784\n000785\n000786\n000787\n000788\n000789\n000791\n000793\n000794\n000796\n000797\n000799\n000808\n000813\n000814\n000815\n000817\n000818\n000820\n000821\n000822\n000824\n000825\n000827\n000828\n000829\n000830\n000832\n000833\n000834\n000835\n000836\n000839\n000842\n000845\n000846\n000851\n000853\n000855\n000856\n000857\n000858\n000860\n000861\n000864\n000865\n000866\n000867\n000868\n000870\n000871\n000872\n000880\n000882\n000883\n000886\n000887\n000888\n000890\n000891\n000892\n000895\n000896\n000898\n000900\n000901\n000902\n000903\n000905\n000906\n000908\n000910\n000913\n000914\n000918\n000919\n000921\n000924\n000925\n000927\n000929\n000933\n000934\n000935\n000936\n000937\n000941\n000945\n000946\n000947\n000950\n000951\n000954\n000955\n000957\n000959\n000960\n000962\n000965\n000968\n000972\n000975\n
000977\n000978\n000980\n000982\n000987\n000989\n000990\n000992\n000993\n000994\n000995\n000996\n000997\n000998\n001000\n001001\n001003\n001004\n001005\n001009\n001016\n001017\n001020\n001023\n001024\n001028\n001029\n001030\n001031\n001032\n001033\n001034\n001036\n001038\n001040\n001041\n001044\n001045\n001047\n001048\n001049\n001052\n001056\n001057\n001059\n001060\n001061\n001062\n001064\n001072\n001073\n001074\n001079\n001080\n001081\n001082\n001085\n001087\n001090\n001091\n001092\n001093\n001098\n001100\n001103\n001105\n001109\n001110\n001112\n001117\n001119\n001121\n001122\n001124\n001126\n001128\n001130\n001137\n001142\n001146\n001151\n001156\n001157\n001159\n001160\n001161\n001164\n001165\n001166\n001168\n001169\n001170\n001171\n001174\n001175\n001181\n001184\n001185\n001186\n001190\n001196\n001197\n001200\n001201\n001202\n001204\n001205\n001208\n001209\n001210\n001211\n001212\n001215\n001219\n001220\n001223\n001227\n001229\n001231\n001233\n001238\n001240\n001247\n001248\n001250\n001256\n001258\n001262\n001264\n001276\n001277\n001278\n001279\n001280\n001282\n001283\n001285\n001288\n001290\n001293\n001297\n001298\n001299\n001300\n001301\n001302\n001309\n001310\n001311\n001312\n001313\n001315\n001316\n001319\n001320\n001321\n001322\n001323\n001324\n001325\n001326\n001327\n001328\n001335\n001338\n001340\n001341\n001343\n001348\n001349\n001351\n001354\n001357\n001358\n001360\n001361\n001362\n001364\n001366\n001367\n001368\n001369\n001370\n001371\n001373\n001378\n001379\n001383\n001385\n001390\n001392\n001393\n001394\n001396\n001399\n001400\n001401\n001402\n001403\n001404\n001405\n001406\n001408\n001409\n001413\n001414\n001417\n001418\n001420\n001422\n001423\n001425\n001426\n001428\n001429\n001430\n001433\n001434\n001436\n001440\n001444\n001447\n001449\n001452\n001453\n001454\n001455\n001456\n001457\n001459\n001460\n001462\n001464\n001465\n001467\n001468\n001470\n001472\n001473\n001474\n001475\n001476\n001479\n001482\n001483\n001484\n001486\n001490\n001491\n001492\n
001493\n001494\n001496\n001498\n001499\n001500\n001503\n001504\n001505\n001506\n001509\n001510\n001512\n001515\n001518\n001519\n001520\n001523\n001529\n001530\n001531\n001532\n001534\n001539\n001540\n001541\n001543\n001544\n001548\n001550\n001551\n001553\n001554\n001556\n001558\n001559\n001561\n001563\n001566\n001568\n001570\n001571\n001572\n001575\n001578\n001580\n001581\n001584\n001593\n001595\n001598\n001599\n001601\n001604\n001607\n001608\n001609\n001611\n001612\n001614\n001618\n001620\n001622\n001623\n001624\n001626\n001628\n001630\n001632\n001636\n001637\n001638\n001639\n001641\n001642\n001644\n001646\n001648\n001649\n001651\n001652\n001653\n001655\n001657\n001659\n001661\n001663\n001668\n001669\n001671\n001672\n001673\n001674\n001676\n001677\n001678\n001679\n001681\n001685\n001686\n001687\n001688\n001690\n001691\n001692\n001695\n001696\n001698\n001700\n001703\n001708\n001715\n001716\n001720\n001723\n001724\n001725\n001728\n001730\n001731\n001734\n001735\n001736\n001737\n001738\n001739\n001743\n001744\n001747\n001748\n001753\n001754\n001756\n001757\n001759\n001760\n001761\n001763\n001766\n001767\n001769\n001770\n001773\n001775\n001777\n001779\n001784\n001785\n001788\n001789\n001790\n001791\n001792\n001793\n001796\n001798\n001799\n001803\n001805\n001806\n001809\n001810\n001811\n001812\n001815\n001816\n001819\n001821\n001826\n001827\n001829\n001830\n001832\n001833\n001834\n001836\n001837\n001838\n001839\n001841\n001842\n001843\n001845\n001847\n001849\n001850\n001857\n001860\n001864\n001865\n001866\n001870\n001871\n001873\n001874\n001876\n001879\n001882\n001883\n001889\n001891\n001894\n001895\n001896\n001899\n001901\n001902\n001903\n001906\n001907\n001908\n001910\n001911\n001912\n001913\n001914\n001915\n001916\n001917\n001918\n001921\n001922\n001930\n001935\n001938\n001939\n001944\n001947\n001948\n001949\n001950\n001951\n001953\n001955\n001956\n001957\n001958\n001961\n001962\n001963\n001964\n001965\n001968\n001970\n001971\n001973\n001974\n001975\n001976\n001981\n
001987\n001988\n001990\n001992\n001993\n001994\n001998\n002003\n002005\n002006\n002007\n002009\n002015\n002016\n002018\n002020\n002023\n002024\n002026\n002030\n002031\n002032\n002033\n002039\n002040\n002041\n002047\n002051\n002053\n002055\n002059\n002060\n002061\n002063\n002064\n002065\n002066\n002067\n002069\n002070\n002072\n002077\n002080\n002083\n002084\n002088\n002090\n002092\n002095\n002096\n002097\n002098\n002099\n002104\n002105\n002106\n002109\n002110\n002114\n002116\n002117\n002119\n002122\n002125\n002126\n002129\n002132\n002133\n002134\n002141\n002143\n002144\n002145\n002146\n002147\n002148\n002149\n002150\n002154\n002155\n002156\n002157\n002162\n002164\n002167\n002171\n002172\n002174\n002175\n002176\n002178\n002180\n002181\n002184\n002186\n002189\n002190\n002191\n002192\n002194\n002195\n002197\n002198\n002199\n002203\n002204\n002205\n002208\n002210\n002211\n002212\n002213\n002214\n002217\n002221\n002222\n002223\n002226\n002227\n002230\n002231\n002235\n002236\n002237\n002238\n002240\n002241\n002242\n002244\n002247\n002249\n002252\n002253\n002256\n002259\n002261\n002263\n002264\n002265\n002267\n002268\n002269\n002270\n002271\n002273\n002274\n002275\n002278\n002281\n002285\n002288\n002289\n002296\n002297\n002301\n002302\n002305\n002309\n002311\n002312\n002313\n002316\n002317\n002318\n002321\n002322\n002323\n002324\n002326\n002328\n002331\n002333\n002335\n002339\n002342\n002343\n002349\n002350\n002351\n002352\n002354\n002355\n002358\n002360\n002361\n002363\n002364\n002368\n002371\n002373\n002374\n002375\n002377\n002379\n002381\n002388\n002389\n002390\n002394\n002395\n002396\n002400\n002401\n002402\n002403\n002406\n002407\n002408\n002409\n002410\n002412\n002413\n002416\n002417\n002421\n002426\n002427\n002430\n002431\n002435\n002436\n002437\n002438\n002441\n002443\n002444\n002445\n002447\n002448\n002449\n002451\n002452\n002453\n002456\n002459\n002464\n002465\n002466\n002467\n002468\n002469\n002470\n002471\n002472\n002475\n002480\n002481\n002482\n002484\n002485\n
002487\n002489\n002491\n002493\n002494\n002496\n002498\n002501\n002507\n002508\n002510\n002512\n002513\n002514\n002515\n002517\n002518\n002522\n002523\n002524\n002527\n002533\n002535\n002536\n002537\n002542\n002544\n002545\n002547\n002549\n002550\n002551\n002553\n002554\n002555\n002559\n002560\n002561\n002566\n002567\n002571\n002573\n002576\n002578\n002579\n002582\n002587\n002588\n002589\n002591\n002592\n002593\n002595\n002596\n002597\n002605\n002607\n002608\n002609\n002610\n002611\n002614\n002616\n002617\n002618\n002620\n002622\n002623\n002624\n002627\n002629\n002632\n002634\n002637\n002639\n002642\n002643\n002647\n002648\n002649\n002650\n002652\n002654\n002655\n002658\n002659\n002660\n002662\n002664\n002665\n002667\n002668\n002670\n002671\n002672\n002676\n002678\n002679\n002682\n002683\n002684\n002687\n002688\n002689\n002691\n002697\n002698\n002700\n002701\n002703\n002704\n002705\n002708\n002714\n002716\n002718\n002719\n002723\n002731\n002732\n002733\n002734\n002736\n002738\n002739\n002741\n002743\n002750\n002751\n002754\n002756\n002759\n002762\n002766\n002768\n002769\n002770\n002771\n002774\n002776\n002777\n002778\n002779\n002780\n002781\n002782\n002784\n002785\n002788\n002790\n002791\n002792\n002795\n002798\n002799\n002802\n002803\n002807\n002808\n002813\n002816\n002817\n002819\n002821\n002822\n002823\n002824\n002825\n002829\n002832\n002834\n002835\n002837\n002838\n002842\n002843\n002849\n002850\n002851\n002852\n002854\n002855\n002857\n002859\n002860\n002862\n002864\n002865\n002868\n002869\n002870\n002871\n002872\n002873\n002874\n002882\n002884\n002886\n002887\n002888\n002897\n002898\n002899\n002904\n002906\n002907\n002909\n002910\n002912\n002913\n002915\n002918\n002920\n002921\n002922\n002923\n002926\n002927\n002929\n002931\n002932\n002933\n002936\n002938\n002939\n002940\n002941\n002943\n002946\n002949\n002950\n002952\n002954\n002956\n002965\n002967\n002968\n002969\n002970\n002972\n002973\n002975\n002980\n002981\n002983\n002986\n002987\n002989\n002990\n002992\n
002996\n002998\n003002\n003008\n003009\n003012\n003013\n003014\n003015\n003016\n003017\n003018\n003020\n003021\n003023\n003026\n003028\n003036\n003037\n003039\n003040\n003041\n003044\n003045\n003049\n003051\n003057\n003059\n003060\n003063\n003064\n003068\n003069\n003070\n003072\n003075\n003077\n003078\n003079\n003081\n003083\n003084\n003085\n003086\n003089\n003091\n003092\n003093\n003095\n003097\n003098\n003100\n003104\n003105\n003108\n003111\n003113\n003115\n003117\n003119\n003120\n003121\n003122\n003123\n003125\n003128\n003130\n003132\n003138\n003139\n003140\n003143\n003147\n003149\n003151\n003152\n003154\n003155\n003157\n003158\n003160\n003163\n003164\n003166\n003168\n003169\n003171\n003173\n003176\n003178\n003184\n003185\n003186\n003188\n003189\n003191\n003193\n003195\n003196\n003198\n003200\n003201\n003205\n003206\n003208\n003209\n003212\n003213\n003215\n003218\n003220\n003223\n003227\n003230\n003234\n003235\n003237\n003238\n003241\n003243\n003244\n003245\n003246\n003248\n003249\n003253\n003256\n003258\n003260\n003261\n003262\n003263\n003264\n003267\n003268\n003270\n003271\n003273\n003274\n003277\n003278\n003279\n003282\n003284\n003285\n003286\n003287\n003289\n003290\n003291\n003293\n003294\n003297\n003299\n003303\n003307\n003309\n003311\n003314\n003317\n003320\n003321\n003326\n003327\n003328\n003329\n003332\n003333\n003334\n003335\n003336\n003339\n003340\n003342\n003344\n003345\n003348\n003349\n003354\n003356\n003359\n003360\n003361\n003362\n003363\n003369\n003371\n003372\n003374\n003376\n003377\n003378\n003380\n003381\n003382\n003383\n003384\n003387\n003388\n003389\n003390\n003391\n003392\n003398\n003400\n003413\n003414\n003415\n003416\n003418\n003420\n003423\n003424\n003427\n003431\n003433\n003436\n003437\n003438\n003439\n003440\n003441\n003442\n003444\n003445\n003446\n003451\n003452\n003454\n003455\n003457\n003458\n003459\n003460\n003462\n003463\n003468\n003472\n003473\n003475\n003476\n003477\n003479\n003485\n003486\n003493\n003494\n003498\n003499\n003500\n
003501\n003505\n003507\n003508\n003509\n003510\n003512\n003513\n003514\n003516\n003518\n003522\n003523\n003525\n003526\n003532\n003533\n003534\n003536\n003537\n003538\n003540\n003541\n003542\n003545\n003546\n003548\n003549\n003551\n003555\n003556\n003560\n003561\n003564\n003565\n003566\n003567\n003569\n003570\n003572\n003575\n003576\n003577\n003578\n003579\n003581\n003585\n003586\n003587\n003589\n003590\n003591\n003592\n003593\n003594\n003595\n003596\n003597\n003598\n003599\n003602\n003603\n003606\n003610\n003612\n003613\n003615\n003617\n003619\n003625\n003626\n003628\n003636\n003637\n003638\n003639\n003640\n003641\n003642\n003644\n003646\n003648\n003650\n003651\n003654\n003656\n003657\n003660\n003663\n003664\n003665\n003666\n003670\n003672\n003673\n003674\n003675\n003680\n003681\n003685\n003686\n003687\n003693\n003694\n003695\n003696\n003697\n003698\n003699\n003700\n003701\n003704\n003706\n003709\n003710\n003713\n003714\n003717\n003720\n003721\n003722\n003724\n003725\n003727\n003729\n003730\n003731\n003732\n003733\n003734\n003740\n003741\n003742\n003743\n003744\n003745\n003749\n003752\n003754\n003757\n003758\n003759\n003760\n003761\n003765\n003766\n003767\n003768\n003770\n003772\n003773\n003774\n003776\n003780\n003783\n003784\n003785\n003786\n003789\n003790\n003791\n003792\n003795\n003796\n003797\n003799\n003801\n003803\n003806\n003810\n003813\n003815\n003816\n003817\n003818\n003819\n003821\n003823\n003824\n003825\n003829\n003831\n003832\n003833\n003836\n003838\n003839\n003840\n003842\n003843\n003844\n003845\n003846\n003848\n003849\n003850\n003851\n003853\n003855\n003857\n003858\n003861\n003862\n003863\n003865\n003867\n003868\n003871\n003875\n003876\n003877\n003882\n003884\n003887\n003888\n003889\n003893\n003895\n003896\n003900\n003903\n003904\n003906\n003908\n003910\n003911\n003912\n003913\n003917\n003918\n003919\n003921\n003922\n003925\n003927\n003928\n003929\n003930\n003933\n003935\n003936\n003939\n003940\n003941\n003942\n003944\n003947\n003949\n003951\n003952\n
003953\n003954\n003955\n003957\n003959\n003960\n003963\n003966\n003967\n003968\n003971\n003973\n003974\n003976\n003978\n003979\n003983\n003985\n003987\n003988\n003989\n003990\n003991\n003993\n003994\n003995\n003997\n003999\n004005\n004006\n004012\n004013\n004014\n004015\n004017\n004018\n004019\n004020\n004022\n004023\n004024\n004025\n004029\n004030\n004031\n004035\n004037\n004039\n004043\n004044\n004046\n004047\n004050\n004052\n004053\n004054\n004056\n004057\n004058\n004060\n004062\n004066\n004067\n004069\n004070\n004071\n004073\n004075\n004076\n004078\n004080\n004084\n004086\n004088\n004090\n004093\n004094\n004097\n004099\n004102\n004103\n004106\n004112\n004114\n004115\n004123\n004127\n004133\n004134\n004135\n004139\n004141\n004144\n004145\n004146\n004147\n004151\n004159\n004165\n004166\n004167\n004169\n004170\n004176\n004177\n004178\n004179\n004180\n004181\n004182\n004183\n004184\n004186\n004192\n004193\n004194\n004197\n004198\n004199\n004200\n004201\n004203\n004204\n004208\n004211\n004212\n004216\n004217\n004218\n004219\n004225\n004227\n004229\n004230\n004231\n004233\n004234\n004235\n004236\n004238\n004240\n004244\n004245\n004247\n004252\n004253\n004257\n004258\n004261\n004262\n004264\n004265\n004266\n004267\n004268\n004269\n004272\n004273\n004274\n004276\n004279\n004283\n004286\n004287\n004292\n004296\n004297\n004302\n004304\n004308\n004310\n004313\n004315\n004316\n004317\n004320\n004322\n004325\n004328\n004331\n004332\n004333\n004334\n004339\n004341\n004344\n004346\n004347\n004351\n004354\n004355\n004356\n004357\n004358\n004359\n004361\n004365\n004366\n004371\n004372\n004375\n004376\n004378\n004379\n004380\n004381\n004382\n004386\n004387\n004389\n004390\n004394\n004395\n004399\n004400\n004405\n004408\n004409\n004410\n004411\n004412\n004413\n004416\n004417\n004427\n004428\n004431\n004432\n004436\n004441\n004442\n004445\n004446\n004448\n004449\n004451\n004453\n004455\n004457\n004459\n004461\n004463\n004464\n004466\n004467\n004468\n004471\n004473\n004476\n004477\n
004478\n004479\n004484\n004488\n004492\n004495\n004497\n004498\n004499\n004500\n004503\n004504\n004505\n004506\n004507\n004509\n004510\n004512\n004514\n004515\n004518\n004522\n004523\n004524\n004525\n004533\n004535\n004536\n004537\n004538\n004539\n004543\n004544\n004545\n004546\n004550\n004552\n004554\n004555\n004558\n004559\n004560\n004561\n004563\n004564\n004565\n004571\n004572\n004575\n004577\n004579\n004580\n004583\n004584\n004586\n004590\n004592\n004593\n004594\n004595\n004597\n004600\n004601\n004602\n004604\n004605\n004606\n004607\n004613\n004614\n004616\n004617\n004619\n004621\n004623\n004625\n004627\n004628\n004631\n004635\n004637\n004639\n004641\n004642\n004643\n004645\n004646\n004653\n004654\n004656\n004659\n004661\n004662\n004663\n004664\n004670\n004671\n004674\n004675\n004676\n004677\n004678\n004681\n004684\n004690\n004696\n004701\n004702\n004703\n004704\n004707\n004712\n004719\n004723\n004727\n004728\n004729\n004731\n004733\n004736\n004741\n004747\n004749\n004750\n004751\n004754\n004755\n004757\n004758\n004760\n004761\n004765\n004767\n004771\n004772\n004774\n004775\n004778\n004779\n004780\n004781\n004784\n004785\n004786\n004789\n004793\n004794\n004795\n004796\n004798\n004801\n004802\n004803\n004805\n004808\n004809\n004812\n004818\n004819\n004820\n004823\n004824\n004826\n004827\n004828\n004833\n004834\n004836\n004837\n004838\n004840\n004841\n004842\n004844\n004845\n004847\n004853\n004854\n004855\n004856\n004857\n004865\n004866\n004869\n004870\n004872\n004876\n004877\n004878\n004879\n004880\n004882\n004883\n004884\n004886\n004889\n004890\n004894\n004897\n004899\n004900\n004901\n004906\n004908\n004910\n004911\n004912\n004913\n004915\n004916\n004919\n004922\n004923\n004925\n004930\n004933\n004936\n004937\n004939\n004940\n004945\n004950\n004951\n004952\n004955\n004957\n004961\n004964\n004965\n004967\n004968\n004969\n004970\n004971\n004972\n004973\n004975\n004977\n004978\n004980\n004982\n004984\n004987\n004991\n004992\n004997\n005000\n005003\n005005\n005006\n
005007\n005009\n005011\n005012\n005016\n005018\n005020\n005022\n005023\n005025\n005027\n005029\n005030\n005031\n005033\n005035\n005039\n005042\n005043\n005044\n005046\n005047\n005048\n005051\n005059\n005060\n005061\n005066\n005069\n005071\n005076\n005083\n005084\n005085\n005087\n005088\n005089\n005091\n005092\n005096\n005097\n005098\n005099\n005100\n005102\n005104\n005106\n005107\n005111\n005114\n005115\n005116\n005117\n005118\n005119\n005123\n005126\n005129\n005130\n005131\n005132\n005134\n005137\n005142\n005146\n005148\n005150\n005151\n005152\n005154\n005159\n005160\n005165\n005169\n005171\n005173\n005177\n005178\n005183\n005186\n005187\n005192\n005193\n005195\n005196\n005200\n005202\n005203\n005204\n005205\n005207\n005208\n005209\n005210\n005211\n005212\n005215\n005216\n005220\n005223\n005224\n005225\n005228\n005231\n005232\n005235\n005238\n005239\n005243\n005245\n005247\n005248\n005250\n005252\n005253\n005254\n005257\n005258\n005259\n005261\n005263\n005264\n005265\n005266\n005269\n005270\n005272\n005277\n005278\n005281\n005283\n005285\n005286\n005288\n005290\n005291\n005293\n005294\n005295\n005300\n005301\n005302\n005303\n005305\n005306\n005310\n005314\n005317\n005320\n005324\n005326\n005327\n005331\n005332\n005339\n005340\n005344\n005346\n005348\n005351\n005352\n005353\n005354\n005355\n005356\n005357\n005358\n005361\n005362\n005364\n005367\n005370\n005373\n005374\n005376\n005380\n005382\n005383\n005384\n005387\n005388\n005392\n005393\n005394\n005395\n005396\n005397\n005398\n005399\n005400\n005401\n005402\n005403\n005406\n005407\n005408\n005409\n005410\n005411\n005412\n005414\n005416\n005417\n005418\n005419\n005420\n005421\n005424\n005425\n005428\n005432\n005433\n005435\n005436\n005438\n005439\n005440\n005442\n005446\n005451\n005454\n005455\n005456\n005457\n005462\n005463\n005464\n005468\n005469\n005470\n005475\n005478\n005480\n005483\n005485\n005488\n005490\n005491\n005492\n005493\n005496\n005497\n005499\n005500\n005501\n005502\n005503\n005504\n005506\n005507\n
005508\n005509\n005512\n005513\n005516\n005517\n005518\n005519\n005520\n005521\n005522\n005524\n005526\n005527\n005529\n005530\n005533\n005535\n005537\n005539\n005541\n005543\n005547\n005548\n005549\n005550\n005553\n005554\n005561\n005562\n005563\n005564\n005567\n005568\n005569\n005574\n005575\n005578\n005579\n005583\n005585\n005591\n005592\n005593\n005594\n005597\n005598\n005599\n005604\n005605\n005606\n005607\n005608\n005609\n005611\n005612\n005614\n005615\n005620\n005621\n005622\n005624\n005626\n005627\n005628\n005629\n005632\n005636\n005637\n005641\n005644\n005645\n005646\n005647\n005648\n005651\n005654\n005655\n005657\n005661\n005663\n005665\n005666\n005667\n005670\n005671\n005674\n005675\n005678\n005679\n005681\n005682\n005684\n005686\n005688\n005690\n005691\n005692\n005693\n005694\n005696\n005697\n005701\n005702\n005705\n005710\n005711\n005715\n005716\n005718\n005719\n005720\n005721\n005722\n005723\n005726\n005730\n005732\n005733\n005734\n005737\n005738\n005742\n005748\n005749\n005750\n005752\n005753\n005755\n005756\n005758\n005759\n005761\n005764\n005766\n005767\n005768\n005769\n005770\n005771\n005772\n005773\n005774\n005775\n005776\n005778\n005779\n005780\n005781\n005788\n005789\n005791\n005792\n005795\n005797\n005798\n005799\n005802\n005804\n005808\n005809\n005810\n005813\n005814\n005815\n005816\n005817\n005823\n005824\n005825\n005828\n005830\n005831\n005832\n005833\n005835\n005836\n005837\n005838\n005842\n005844\n005845\n005846\n005847\n005848\n005849\n005850\n005851\n005853\n005858\n005860\n005861\n005862\n005863\n005865\n005866\n005867\n005868\n005870\n005871\n005872\n005874\n005875\n005877\n005880\n005884\n005886\n005888\n005890\n005891\n005895\n005896\n005897\n005898\n005902\n005904\n005908\n005915\n005920\n005924\n005928\n005929\n005930\n005932\n005934\n005936\n005937\n005940\n005941\n005942\n005943\n005945\n005946\n005950\n005951\n005953\n005954\n005956\n005957\n005959\n005960\n005964\n005966\n005967\n005968\n005971\n005973\n005974\n005976\n005977\n
005979\n005980\n005983\n005987\n005989\n005990\n005991\n005992\n005993\n005995\n005998\n006000\n006004\n006006\n006007\n006011\n006015\n006017\n006018\n006019\n006020\n006021\n006022\n006025\n006032\n006035\n006037\n006040\n006049\n006051\n006053\n006055\n006056\n006059\n006064\n006065\n006069\n006072\n006073\n006076\n006079\n006080\n006081\n006082\n006084\n006089\n006090\n006091\n006092\n006094\n006099\n006101\n006104\n006105\n006108\n006109\n006111\n006112\n006113\n006119\n006120\n006124\n006128\n006129\n006131\n006132\n006134\n006135\n006137\n006138\n006140\n006141\n006142\n006143\n006145\n006147\n006149\n006150\n006153\n006155\n006157\n006158\n006159\n006160\n006162\n006164\n006166\n006170\n006171\n006172\n006174\n006175\n006178\n006179\n006180\n006181\n006183\n006184\n006188\n006189\n006191\n006192\n006193\n006197\n006199\n006200\n006201\n006203\n006205\n006206\n006207\n006209\n006211\n006212\n006214\n006216\n006217\n006218\n006220\n006221\n006223\n006224\n006225\n006226\n006230\n006231\n006234\n006235\n006236\n006237\n006239\n006241\n006242\n006243\n006245\n006248\n006251\n006252\n006253\n006254\n006255\n006256\n006257\n006259\n006260\n006261\n006262\n006264\n006268\n006271\n006277\n006279\n006281\n006283\n006284\n006285\n006289\n006290\n006291\n006292\n006293\n006294\n006295\n006296\n006298\n006299\n006303\n006304\n006307\n006308\n006309\n006310\n006311\n006313\n006318\n006319\n006320\n006323\n006325\n006326\n006327\n006328\n006329\n006330\n006335\n006336\n006337\n006341\n006346\n006347\n006350\n006352\n006358\n006359\n006361\n006362\n006363\n006365\n006367\n006373\n006374\n006375\n006376\n006378\n006382\n006383\n006384\n006387\n006389\n006390\n006392\n006397\n006398\n006399\n006400\n006401\n006402\n006404\n006408\n006412\n006413\n006414\n006418\n006419\n006421\n006422\n006428\n006429\n006430\n006431\n006432\n006438\n006443\n006447\n006448\n006449\n006450\n006455\n006456\n006457\n006458\n006459\n006460\n006461\n006463\n006466\n006467\n006471\n006476\n006479\n
006480\n006485\n006487\n006489\n006490\n006492\n006494\n006495\n006499\n006500\n006501\n006502\n006504\n006509\n006510\n006511\n006513\n006518\n006522\n006523\n006526\n006527\n006528\n006536\n006538\n006539\n006541\n006543\n006544\n006545\n006546\n006547\n006550\n006552\n006554\n006557\n006559\n006562\n006564\n006566\n006567\n006571\n006572\n006573\n006575\n006579\n006580\n006584\n006585\n006587\n006589\n006591\n006594\n006598\n006599\n006600\n006601\n006605\n006606\n006607\n006608\n006609\n006610\n006615\n006616\n006617\n006619\n006620\n006621\n006622\n006627\n006630\n006631\n006635\n006639\n006640\n006642\n006644\n006645\n006646\n006648\n006652\n006653\n006654\n006657\n006661\n006662\n006663\n006665\n006668\n006671\n006672\n006673\n006675\n006680\n006681\n006683\n006684\n006687\n006688\n006689\n006690\n006691\n006697\n006699\n006700\n006702\n006704\n006705\n006706\n006707\n006708\n006716\n006717\n006718\n006721\n006722\n006724\n006727\n006728\n006730\n006735\n006736\n006739\n006740\n006742\n006743\n006746\n006748\n006749\n006750\n006757\n006763\n006766\n006769\n006774\n006775\n006776\n006779\n006784\n006787\n006788\n006790\n006793\n006795\n006799\n006801\n006802\n006805\n006809\n006810\n006814\n006817\n006820\n006821\n006823\n006824\n006825\n006826\n006827\n006830\n006831\n006834\n006835\n006838\n006839\n006840\n006842\n006845\n006846\n006848\n006851\n006857\n006859\n006861\n006864\n006865\n006867\n006869\n006871\n006875\n006877\n006878\n006880\n006883\n006886\n006888\n006890\n006892\n006893\n006894\n006896\n006902\n006904\n006905\n006909\n006911\n006912\n006915\n006916\n006918\n006919\n006920\n006921\n006923\n006924\n006926\n006927\n006929\n006931\n006932\n006933\n006934\n006935\n006939\n006940\n006941\n006946\n006947\n006949\n006951\n006952\n006957\n006958\n006961\n006963\n006965\n006966\n006967\n006969\n006970\n006972\n006974\n006975\n006976\n006979\n006983\n006984\n006985\n006986\n006988\n006991\n006993\n006995\n006996\n006998\n007001\n007002\n007004\n007007\n
007009\n007013\n007017\n007018\n007020\n007021\n007024\n007025\n007035\n007036\n007039\n007040\n007041\n007044\n007045\n007046\n007050\n007051\n007054\n007057\n007058\n007060\n007062\n007064\n007066\n007070\n007073\n007075\n007077\n007086\n007090\n007092\n007093\n007094\n007096\n007097\n007099\n007101\n007102\n007104\n007105\n007106\n007107\n007108\n007111\n007113\n007114\n007116\n007118\n007121\n007123\n007124\n007126\n007127\n007128\n007129\n007134\n007137\n007140\n007141\n007142\n007143\n007147\n007148\n007150\n007151\n007152\n007153\n007155\n007156\n007159\n007160\n007167\n007170\n007171\n007173\n007175\n007179\n007181\n007184\n007185\n007186\n007188\n007189\n007190\n007191\n007192\n007193\n007195\n007196\n007197\n007203\n007206\n007209\n007211\n007213\n007216\n007218\n007220\n007222\n007223\n007224\n007226\n007228\n007231\n007234\n007236\n007237\n007239\n007241\n007243\n007245\n007248\n007249\n007250\n007251\n007254\n007257\n007259\n007263\n007264\n007268\n007269\n007270\n007276\n007281\n007282\n007285\n007286\n007293\n007295\n007296\n007297\n007298\n007301\n007305\n007306\n007307\n007308\n007312\n007313\n007314\n007316\n007317\n007320\n007321\n007324\n007328\n007332\n007333\n007334\n007335\n007338\n007340\n007341\n007346\n007348\n007354\n007355\n007356\n007357\n007358\n007361\n007362\n007363\n007365\n007366\n007367\n007368\n007370\n007372\n007373\n007378\n007379\n007386\n007387\n007388\n007390\n007392\n007393\n007394\n007399\n007400\n007404\n007406\n007408\n007414\n007417\n007418\n007425\n007427\n007428\n007429\n007431\n007432\n007438\n007441\n007443\n007444\n007446\n007451\n007452\n007454\n007455\n007457\n007459\n007460\n007461\n007465\n007471\n007472\n007474\n007476\n007479"
  },
  {
    "path": "data/kitti/ImageSets/val.txt",
    "content": "000001\n000002\n000004\n000005\n000006\n000008\n000015\n000019\n000020\n000021\n000023\n000024\n000025\n000027\n000028\n000031\n000033\n000035\n000037\n000039\n000040\n000042\n000047\n000048\n000050\n000052\n000053\n000058\n000059\n000061\n000062\n000063\n000065\n000066\n000076\n000077\n000078\n000081\n000089\n000090\n000093\n000094\n000098\n000102\n000104\n000106\n000107\n000108\n000116\n000117\n000118\n000122\n000124\n000126\n000128\n000132\n000134\n000135\n000137\n000139\n000140\n000143\n000147\n000151\n000152\n000153\n000156\n000159\n000161\n000167\n000168\n000169\n000170\n000173\n000174\n000175\n000181\n000182\n000183\n000186\n000187\n000188\n000190\n000191\n000192\n000194\n000195\n000196\n000197\n000199\n000201\n000203\n000204\n000207\n000211\n000212\n000213\n000216\n000218\n000223\n000224\n000226\n000229\n000230\n000231\n000234\n000235\n000236\n000237\n000239\n000242\n000246\n000247\n000248\n000249\n000250\n000251\n000252\n000260\n000262\n000263\n000265\n000266\n000268\n000269\n000270\n000272\n000273\n000278\n000279\n000281\n000283\n000284\n000289\n000290\n000291\n000293\n000297\n000301\n000302\n000305\n000307\n000308\n000309\n000311\n000312\n000314\n000315\n000319\n000320\n000321\n000323\n000324\n000327\n000328\n000329\n000332\n000333\n000335\n000336\n000340\n000341\n000343\n000345\n000346\n000347\n000350\n000351\n000352\n000354\n000355\n000356\n000357\n000359\n000360\n000361\n000362\n000365\n000366\n000369\n000370\n000372\n000373\n000376\n000377\n000378\n000379\n000381\n000382\n000383\n000385\n000386\n000388\n000391\n000392\n000393\n000394\n000395\n000396\n000397\n000398\n000399\n000401\n000402\n000403\n000404\n000407\n000408\n000409\n000413\n000414\n000415\n000419\n000420\n000422\n000427\n000428\n000429\n000430\n000436\n000437\n000440\n000443\n000446\n000448\n000450\n000451\n000452\n000453\n000454\n000455\n000457\n000459\n000463\n000468\n000469\n000472\n000473\n000475\n000476\n000477\n000478\n000479\n000480\n000481\n000485\n000486\n000489\n
000491\n000492\n000493\n000494\n000495\n000496\n000498\n000499\n000503\n000504\n000506\n000508\n000509\n000510\n000512\n000515\n000517\n000519\n000521\n000524\n000527\n000528\n000530\n000533\n000536\n000541\n000542\n000543\n000545\n000546\n000548\n000551\n000554\n000555\n000558\n000559\n000560\n000561\n000564\n000566\n000567\n000568\n000569\n000571\n000572\n000581\n000583\n000588\n000589\n000590\n000591\n000595\n000600\n000601\n000604\n000610\n000611\n000612\n000613\n000614\n000615\n000618\n000619\n000620\n000624\n000625\n000626\n000628\n000630\n000634\n000635\n000636\n000639\n000642\n000644\n000645\n000647\n000648\n000650\n000655\n000657\n000658\n000659\n000660\n000667\n000669\n000670\n000674\n000677\n000679\n000682\n000683\n000684\n000691\n000692\n000694\n000696\n000698\n000699\n000700\n000702\n000704\n000706\n000708\n000716\n000717\n000718\n000721\n000722\n000725\n000727\n000728\n000729\n000731\n000734\n000736\n000737\n000740\n000741\n000745\n000746\n000748\n000750\n000751\n000752\n000754\n000756\n000761\n000765\n000766\n000767\n000768\n000769\n000771\n000772\n000773\n000774\n000778\n000779\n000782\n000790\n000792\n000795\n000798\n000800\n000801\n000802\n000803\n000804\n000805\n000806\n000807\n000809\n000810\n000811\n000812\n000816\n000819\n000823\n000826\n000831\n000837\n000838\n000840\n000841\n000843\n000844\n000847\n000848\n000849\n000850\n000852\n000854\n000859\n000862\n000863\n000869\n000873\n000874\n000875\n000876\n000877\n000878\n000879\n000881\n000884\n000885\n000889\n000893\n000894\n000897\n000899\n000904\n000907\n000909\n000911\n000912\n000915\n000916\n000917\n000920\n000922\n000923\n000926\n000928\n000930\n000931\n000932\n000938\n000939\n000940\n000942\n000943\n000944\n000948\n000949\n000952\n000953\n000956\n000958\n000961\n000963\n000964\n000966\n000967\n000969\n000970\n000971\n000973\n000974\n000976\n000979\n000981\n000983\n000984\n000985\n000986\n000988\n000991\n000999\n001002\n001006\n001007\n001008\n001010\n001011\n001012\n001013\n001014\n001015\n
001018\n001019\n001021\n001022\n001025\n001026\n001027\n001035\n001037\n001039\n001042\n001043\n001046\n001050\n001051\n001053\n001054\n001055\n001058\n001063\n001065\n001066\n001067\n001068\n001069\n001070\n001071\n001075\n001076\n001077\n001078\n001083\n001084\n001086\n001088\n001089\n001094\n001095\n001096\n001097\n001099\n001101\n001102\n001104\n001106\n001107\n001108\n001111\n001113\n001114\n001115\n001116\n001118\n001120\n001123\n001125\n001127\n001129\n001131\n001132\n001133\n001134\n001135\n001136\n001138\n001139\n001140\n001141\n001143\n001144\n001145\n001147\n001148\n001149\n001150\n001152\n001153\n001154\n001155\n001158\n001162\n001163\n001167\n001172\n001173\n001176\n001177\n001178\n001179\n001180\n001182\n001183\n001187\n001188\n001189\n001191\n001192\n001193\n001194\n001195\n001198\n001199\n001203\n001206\n001207\n001213\n001214\n001216\n001217\n001218\n001221\n001222\n001224\n001225\n001226\n001228\n001230\n001232\n001234\n001235\n001236\n001237\n001239\n001241\n001242\n001243\n001244\n001245\n001246\n001249\n001251\n001252\n001253\n001254\n001255\n001257\n001259\n001260\n001261\n001263\n001265\n001266\n001267\n001268\n001269\n001270\n001271\n001272\n001273\n001274\n001275\n001281\n001284\n001286\n001287\n001289\n001291\n001292\n001294\n001295\n001296\n001303\n001304\n001305\n001306\n001307\n001308\n001314\n001317\n001318\n001329\n001330\n001331\n001332\n001333\n001334\n001336\n001337\n001339\n001342\n001344\n001345\n001346\n001347\n001350\n001352\n001353\n001355\n001356\n001359\n001363\n001365\n001372\n001374\n001375\n001376\n001377\n001380\n001381\n001382\n001384\n001386\n001387\n001388\n001389\n001391\n001395\n001397\n001398\n001407\n001410\n001411\n001412\n001415\n001416\n001419\n001421\n001424\n001427\n001431\n001432\n001435\n001437\n001438\n001439\n001441\n001442\n001443\n001445\n001446\n001448\n001450\n001451\n001458\n001461\n001463\n001466\n001469\n001471\n001477\n001478\n001480\n001481\n001485\n001487\n001488\n001489\n001495\n001497\n001501\n
001502\n001507\n001508\n001511\n001513\n001514\n001516\n001517\n001521\n001522\n001524\n001525\n001526\n001527\n001528\n001533\n001535\n001536\n001537\n001538\n001542\n001545\n001546\n001547\n001549\n001552\n001555\n001557\n001560\n001562\n001564\n001565\n001567\n001569\n001573\n001574\n001576\n001577\n001579\n001582\n001583\n001585\n001586\n001587\n001588\n001589\n001590\n001591\n001592\n001594\n001596\n001597\n001600\n001602\n001603\n001605\n001606\n001610\n001613\n001615\n001616\n001617\n001619\n001621\n001625\n001627\n001629\n001631\n001633\n001634\n001635\n001640\n001643\n001645\n001647\n001650\n001654\n001656\n001658\n001660\n001662\n001664\n001665\n001666\n001667\n001670\n001675\n001680\n001682\n001683\n001684\n001689\n001693\n001694\n001697\n001699\n001701\n001702\n001704\n001705\n001706\n001707\n001709\n001710\n001711\n001712\n001713\n001714\n001717\n001718\n001719\n001721\n001722\n001726\n001727\n001729\n001732\n001733\n001740\n001741\n001742\n001745\n001746\n001749\n001750\n001751\n001752\n001755\n001758\n001762\n001764\n001765\n001768\n001771\n001772\n001774\n001776\n001778\n001780\n001781\n001782\n001783\n001786\n001787\n001794\n001795\n001797\n001800\n001801\n001802\n001804\n001807\n001808\n001813\n001814\n001817\n001818\n001820\n001822\n001823\n001824\n001825\n001828\n001831\n001835\n001840\n001844\n001846\n001848\n001851\n001852\n001853\n001854\n001855\n001856\n001858\n001859\n001861\n001862\n001863\n001867\n001868\n001869\n001872\n001875\n001877\n001878\n001880\n001881\n001884\n001885\n001886\n001887\n001888\n001890\n001892\n001893\n001897\n001898\n001900\n001904\n001905\n001909\n001919\n001920\n001923\n001924\n001925\n001926\n001927\n001928\n001929\n001931\n001932\n001933\n001934\n001936\n001937\n001940\n001941\n001942\n001943\n001945\n001946\n001952\n001954\n001959\n001960\n001966\n001967\n001969\n001972\n001977\n001978\n001979\n001980\n001982\n001983\n001984\n001985\n001986\n001989\n001991\n001995\n001996\n001997\n001999\n002000\n002001\n002002\n
002004\n002008\n002010\n002011\n002012\n002013\n002014\n002017\n002019\n002021\n002022\n002025\n002027\n002028\n002029\n002034\n002035\n002036\n002037\n002038\n002042\n002043\n002044\n002045\n002046\n002048\n002049\n002050\n002052\n002054\n002056\n002057\n002058\n002062\n002068\n002071\n002073\n002074\n002075\n002076\n002078\n002079\n002081\n002082\n002085\n002086\n002087\n002089\n002091\n002093\n002094\n002100\n002101\n002102\n002103\n002107\n002108\n002111\n002112\n002113\n002115\n002118\n002120\n002121\n002123\n002124\n002127\n002128\n002130\n002131\n002135\n002136\n002137\n002138\n002139\n002140\n002142\n002151\n002152\n002153\n002158\n002159\n002160\n002161\n002163\n002165\n002166\n002168\n002169\n002170\n002173\n002177\n002179\n002182\n002183\n002185\n002187\n002188\n002193\n002196\n002200\n002201\n002202\n002206\n002207\n002209\n002215\n002216\n002218\n002219\n002220\n002224\n002225\n002228\n002229\n002232\n002233\n002234\n002239\n002243\n002245\n002246\n002248\n002250\n002251\n002254\n002255\n002257\n002258\n002260\n002262\n002266\n002272\n002276\n002277\n002279\n002280\n002282\n002283\n002284\n002286\n002287\n002290\n002291\n002292\n002293\n002294\n002295\n002298\n002299\n002300\n002303\n002304\n002306\n002307\n002308\n002310\n002314\n002315\n002319\n002320\n002325\n002327\n002329\n002330\n002332\n002334\n002336\n002337\n002338\n002340\n002341\n002344\n002345\n002346\n002347\n002348\n002353\n002356\n002357\n002359\n002362\n002365\n002366\n002367\n002369\n002370\n002372\n002376\n002378\n002380\n002382\n002383\n002384\n002385\n002386\n002387\n002391\n002392\n002393\n002397\n002398\n002399\n002404\n002405\n002411\n002414\n002415\n002418\n002419\n002420\n002422\n002423\n002424\n002425\n002428\n002429\n002432\n002433\n002434\n002439\n002440\n002442\n002446\n002450\n002454\n002455\n002457\n002458\n002460\n002461\n002462\n002463\n002473\n002474\n002476\n002477\n002478\n002479\n002483\n002486\n002488\n002490\n002492\n002495\n002497\n002499\n002500\n002502\n002503\n
002504\n002505\n002506\n002509\n002511\n002516\n002519\n002520\n002521\n002525\n002526\n002528\n002529\n002530\n002531\n002532\n002534\n002538\n002539\n002540\n002541\n002543\n002546\n002548\n002552\n002556\n002557\n002558\n002562\n002563\n002564\n002565\n002568\n002569\n002570\n002572\n002574\n002575\n002577\n002580\n002581\n002583\n002584\n002585\n002586\n002590\n002594\n002598\n002599\n002600\n002601\n002602\n002603\n002604\n002606\n002612\n002613\n002615\n002619\n002621\n002625\n002626\n002628\n002630\n002631\n002633\n002635\n002636\n002638\n002640\n002641\n002644\n002645\n002646\n002651\n002653\n002656\n002657\n002661\n002663\n002666\n002669\n002673\n002674\n002675\n002677\n002680\n002681\n002685\n002686\n002690\n002692\n002693\n002694\n002695\n002696\n002699\n002702\n002706\n002707\n002709\n002710\n002711\n002712\n002713\n002715\n002717\n002720\n002721\n002722\n002724\n002725\n002726\n002727\n002728\n002729\n002730\n002735\n002737\n002740\n002742\n002744\n002745\n002746\n002747\n002748\n002749\n002752\n002753\n002755\n002757\n002758\n002760\n002761\n002763\n002764\n002765\n002767\n002772\n002773\n002775\n002783\n002786\n002787\n002789\n002793\n002794\n002796\n002797\n002800\n002801\n002804\n002805\n002806\n002809\n002810\n002811\n002812\n002814\n002815\n002818\n002820\n002826\n002827\n002828\n002830\n002831\n002833\n002836\n002839\n002840\n002841\n002844\n002845\n002846\n002847\n002848\n002853\n002856\n002858\n002861\n002863\n002866\n002867\n002875\n002876\n002877\n002878\n002879\n002880\n002881\n002883\n002885\n002889\n002890\n002891\n002892\n002893\n002894\n002895\n002896\n002900\n002901\n002902\n002903\n002905\n002908\n002911\n002914\n002916\n002917\n002919\n002924\n002925\n002928\n002930\n002934\n002935\n002937\n002942\n002944\n002945\n002947\n002948\n002951\n002953\n002955\n002957\n002958\n002959\n002960\n002961\n002962\n002963\n002964\n002966\n002971\n002974\n002976\n002977\n002978\n002979\n002982\n002984\n002985\n002988\n002991\n002993\n002994\n002995\n
002997\n002999\n003000\n003001\n003003\n003004\n003005\n003006\n003007\n003010\n003011\n003019\n003022\n003024\n003025\n003027\n003029\n003030\n003031\n003032\n003033\n003034\n003035\n003038\n003042\n003043\n003046\n003047\n003048\n003050\n003052\n003053\n003054\n003055\n003056\n003058\n003061\n003062\n003065\n003066\n003067\n003071\n003073\n003074\n003076\n003080\n003082\n003087\n003088\n003090\n003094\n003096\n003099\n003101\n003102\n003103\n003106\n003107\n003109\n003110\n003112\n003114\n003116\n003118\n003124\n003126\n003127\n003129\n003131\n003133\n003134\n003135\n003136\n003137\n003141\n003142\n003144\n003145\n003146\n003148\n003150\n003153\n003156\n003159\n003161\n003162\n003165\n003167\n003170\n003172\n003174\n003175\n003177\n003179\n003180\n003181\n003182\n003183\n003187\n003190\n003192\n003194\n003197\n003199\n003202\n003203\n003204\n003207\n003210\n003211\n003214\n003216\n003217\n003219\n003221\n003222\n003224\n003225\n003226\n003228\n003229\n003231\n003232\n003233\n003236\n003239\n003240\n003242\n003247\n003250\n003251\n003252\n003254\n003255\n003257\n003259\n003265\n003266\n003269\n003272\n003275\n003276\n003280\n003281\n003283\n003288\n003292\n003295\n003296\n003298\n003300\n003301\n003302\n003304\n003305\n003306\n003308\n003310\n003312\n003313\n003315\n003316\n003318\n003319\n003322\n003323\n003324\n003325\n003330\n003331\n003337\n003338\n003341\n003343\n003346\n003347\n003350\n003351\n003352\n003353\n003355\n003357\n003358\n003364\n003365\n003366\n003367\n003368\n003370\n003373\n003375\n003379\n003385\n003386\n003393\n003394\n003395\n003396\n003397\n003399\n003401\n003402\n003403\n003404\n003405\n003406\n003407\n003408\n003409\n003410\n003411\n003412\n003417\n003419\n003421\n003422\n003425\n003426\n003428\n003429\n003430\n003432\n003434\n003435\n003443\n003447\n003448\n003449\n003450\n003453\n003456\n003461\n003464\n003465\n003466\n003467\n003469\n003470\n003471\n003474\n003478\n003480\n003481\n003482\n003483\n003484\n003487\n003488\n003489\n003490\n
003491\n003492\n003495\n003496\n003497\n003502\n003503\n003504\n003506\n003511\n003515\n003517\n003519\n003520\n003521\n003524\n003527\n003528\n003529\n003530\n003531\n003535\n003539\n003543\n003544\n003547\n003550\n003552\n003553\n003554\n003557\n003558\n003559\n003562\n003563\n003568\n003571\n003573\n003574\n003580\n003582\n003583\n003584\n003588\n003600\n003601\n003604\n003605\n003607\n003608\n003609\n003611\n003614\n003616\n003618\n003620\n003621\n003622\n003623\n003624\n003627\n003629\n003630\n003631\n003632\n003633\n003634\n003635\n003643\n003645\n003647\n003649\n003652\n003653\n003655\n003658\n003659\n003661\n003662\n003667\n003668\n003669\n003671\n003676\n003677\n003678\n003679\n003682\n003683\n003684\n003688\n003689\n003690\n003691\n003692\n003702\n003703\n003705\n003707\n003708\n003711\n003712\n003715\n003716\n003718\n003719\n003723\n003726\n003728\n003735\n003736\n003737\n003738\n003739\n003746\n003747\n003748\n003750\n003751\n003753\n003755\n003756\n003762\n003763\n003764\n003769\n003771\n003775\n003777\n003778\n003779\n003781\n003782\n003787\n003788\n003793\n003794\n003798\n003800\n003802\n003804\n003805\n003807\n003808\n003809\n003811\n003812\n003814\n003820\n003822\n003826\n003827\n003828\n003830\n003834\n003835\n003837\n003841\n003847\n003852\n003854\n003856\n003859\n003860\n003864\n003866\n003869\n003870\n003872\n003873\n003874\n003878\n003879\n003880\n003881\n003883\n003885\n003886\n003890\n003891\n003892\n003894\n003897\n003898\n003899\n003901\n003902\n003905\n003907\n003909\n003914\n003915\n003916\n003920\n003923\n003924\n003926\n003931\n003932\n003934\n003937\n003938\n003943\n003945\n003946\n003948\n003950\n003956\n003958\n003961\n003962\n003964\n003965\n003969\n003970\n003972\n003975\n003977\n003980\n003981\n003982\n003984\n003986\n003992\n003996\n003998\n004000\n004001\n004002\n004003\n004004\n004007\n004008\n004009\n004010\n004011\n004016\n004021\n004026\n004027\n004028\n004032\n004033\n004034\n004036\n004038\n004040\n004041\n004042\n004045\n
004048\n004049\n004051\n004055\n004059\n004061\n004063\n004064\n004065\n004068\n004072\n004074\n004077\n004079\n004081\n004082\n004083\n004085\n004087\n004089\n004091\n004092\n004095\n004096\n004098\n004100\n004101\n004104\n004105\n004107\n004108\n004109\n004110\n004111\n004113\n004116\n004117\n004118\n004119\n004120\n004121\n004122\n004124\n004125\n004126\n004128\n004129\n004130\n004131\n004132\n004136\n004137\n004138\n004140\n004142\n004143\n004148\n004149\n004150\n004152\n004153\n004154\n004155\n004156\n004157\n004158\n004160\n004161\n004162\n004163\n004164\n004168\n004171\n004172\n004173\n004174\n004175\n004185\n004187\n004188\n004189\n004190\n004191\n004195\n004196\n004202\n004205\n004206\n004207\n004209\n004210\n004213\n004214\n004215\n004220\n004221\n004222\n004223\n004224\n004226\n004228\n004232\n004237\n004239\n004241\n004242\n004243\n004246\n004248\n004249\n004250\n004251\n004254\n004255\n004256\n004259\n004260\n004263\n004270\n004271\n004275\n004277\n004278\n004280\n004281\n004282\n004284\n004285\n004288\n004289\n004290\n004291\n004293\n004294\n004295\n004298\n004299\n004300\n004301\n004303\n004305\n004306\n004307\n004309\n004311\n004312\n004314\n004318\n004319\n004321\n004323\n004324\n004326\n004327\n004329\n004330\n004335\n004336\n004337\n004338\n004340\n004342\n004343\n004345\n004348\n004349\n004350\n004352\n004353\n004360\n004362\n004363\n004364\n004367\n004368\n004369\n004370\n004373\n004374\n004377\n004383\n004384\n004385\n004388\n004391\n004392\n004393\n004396\n004397\n004398\n004401\n004402\n004403\n004404\n004406\n004407\n004414\n004415\n004418\n004419\n004420\n004421\n004422\n004423\n004424\n004425\n004426\n004429\n004430\n004433\n004434\n004435\n004437\n004438\n004439\n004440\n004443\n004444\n004447\n004450\n004452\n004454\n004456\n004458\n004460\n004462\n004465\n004469\n004470\n004472\n004474\n004475\n004480\n004481\n004482\n004483\n004485\n004486\n004487\n004489\n004490\n004491\n004493\n004494\n004496\n004501\n004502\n004508\n004511\n004513\n
004516\n004517\n004519\n004520\n004521\n004526\n004527\n004528\n004529\n004530\n004531\n004532\n004534\n004540\n004541\n004542\n004547\n004548\n004549\n004551\n004553\n004556\n004557\n004562\n004566\n004567\n004568\n004569\n004570\n004573\n004574\n004576\n004578\n004581\n004582\n004585\n004587\n004588\n004589\n004591\n004596\n004598\n004599\n004603\n004608\n004609\n004610\n004611\n004612\n004615\n004618\n004620\n004622\n004624\n004626\n004629\n004630\n004632\n004633\n004634\n004636\n004638\n004640\n004644\n004647\n004648\n004649\n004650\n004651\n004652\n004655\n004657\n004658\n004660\n004665\n004666\n004667\n004668\n004669\n004672\n004673\n004679\n004680\n004682\n004683\n004685\n004686\n004687\n004688\n004689\n004691\n004692\n004693\n004694\n004695\n004697\n004698\n004699\n004700\n004705\n004706\n004708\n004709\n004710\n004711\n004713\n004714\n004715\n004716\n004717\n004718\n004720\n004721\n004722\n004724\n004725\n004726\n004730\n004732\n004734\n004735\n004737\n004738\n004739\n004740\n004742\n004743\n004744\n004745\n004746\n004748\n004752\n004753\n004756\n004759\n004762\n004763\n004764\n004766\n004768\n004769\n004770\n004773\n004776\n004777\n004782\n004783\n004787\n004788\n004790\n004791\n004792\n004797\n004799\n004800\n004804\n004806\n004807\n004810\n004811\n004813\n004814\n004815\n004816\n004817\n004821\n004822\n004825\n004829\n004830\n004831\n004832\n004835\n004839\n004843\n004846\n004848\n004849\n004850\n004851\n004852\n004858\n004859\n004860\n004861\n004862\n004863\n004864\n004867\n004868\n004871\n004873\n004874\n004875\n004881\n004885\n004887\n004888\n004891\n004892\n004893\n004895\n004896\n004898\n004902\n004903\n004904\n004905\n004907\n004909\n004914\n004917\n004918\n004920\n004921\n004924\n004926\n004927\n004928\n004929\n004931\n004932\n004934\n004935\n004938\n004941\n004942\n004943\n004944\n004946\n004947\n004948\n004949\n004953\n004954\n004956\n004958\n004959\n004960\n004962\n004963\n004966\n004974\n004976\n004979\n004981\n004983\n004985\n004986\n004988\n
004989\n004990\n004993\n004994\n004995\n004996\n004998\n004999\n005001\n005002\n005004\n005008\n005010\n005013\n005014\n005015\n005017\n005019\n005021\n005024\n005026\n005028\n005032\n005034\n005036\n005037\n005038\n005040\n005041\n005045\n005049\n005050\n005052\n005053\n005054\n005055\n005056\n005057\n005058\n005062\n005063\n005064\n005065\n005067\n005068\n005070\n005072\n005073\n005074\n005075\n005077\n005078\n005079\n005080\n005081\n005082\n005086\n005090\n005093\n005094\n005095\n005101\n005103\n005105\n005108\n005109\n005110\n005112\n005113\n005120\n005121\n005122\n005124\n005125\n005127\n005128\n005133\n005135\n005136\n005138\n005139\n005140\n005141\n005143\n005144\n005145\n005147\n005149\n005153\n005155\n005156\n005157\n005158\n005161\n005162\n005163\n005164\n005166\n005167\n005168\n005170\n005172\n005174\n005175\n005176\n005179\n005180\n005181\n005182\n005184\n005185\n005188\n005189\n005190\n005191\n005194\n005197\n005198\n005199\n005201\n005206\n005213\n005214\n005217\n005218\n005219\n005221\n005222\n005226\n005227\n005229\n005230\n005233\n005234\n005236\n005237\n005240\n005241\n005242\n005244\n005246\n005249\n005251\n005255\n005256\n005260\n005262\n005267\n005268\n005271\n005273\n005274\n005275\n005276\n005279\n005280\n005282\n005284\n005287\n005289\n005292\n005296\n005297\n005298\n005299\n005304\n005307\n005308\n005309\n005311\n005312\n005313\n005315\n005316\n005318\n005319\n005321\n005322\n005323\n005325\n005328\n005329\n005330\n005333\n005334\n005335\n005336\n005337\n005338\n005341\n005342\n005343\n005345\n005347\n005349\n005350\n005359\n005360\n005363\n005365\n005366\n005368\n005369\n005371\n005372\n005375\n005377\n005378\n005379\n005381\n005385\n005386\n005389\n005390\n005391\n005404\n005405\n005413\n005415\n005422\n005423\n005426\n005427\n005429\n005430\n005431\n005434\n005437\n005441\n005443\n005444\n005445\n005447\n005448\n005449\n005450\n005452\n005453\n005458\n005459\n005460\n005461\n005465\n005466\n005467\n005471\n005472\n005473\n005474\n005476\n
005477\n005479\n005481\n005482\n005484\n005486\n005487\n005489\n005494\n005495\n005498\n005505\n005510\n005511\n005514\n005515\n005523\n005525\n005528\n005531\n005532\n005534\n005536\n005538\n005540\n005542\n005544\n005545\n005546\n005551\n005552\n005555\n005556\n005557\n005558\n005559\n005560\n005565\n005566\n005570\n005571\n005572\n005573\n005576\n005577\n005580\n005581\n005582\n005584\n005586\n005587\n005588\n005589\n005590\n005595\n005596\n005600\n005601\n005602\n005603\n005610\n005613\n005616\n005617\n005618\n005619\n005623\n005625\n005630\n005631\n005633\n005634\n005635\n005638\n005639\n005640\n005642\n005643\n005649\n005650\n005652\n005653\n005656\n005658\n005659\n005660\n005662\n005664\n005668\n005669\n005672\n005673\n005676\n005677\n005680\n005683\n005685\n005687\n005689\n005695\n005698\n005699\n005700\n005703\n005704\n005706\n005707\n005708\n005709\n005712\n005713\n005714\n005717\n005724\n005725\n005727\n005728\n005729\n005731\n005735\n005736\n005739\n005740\n005741\n005743\n005744\n005745\n005746\n005747\n005751\n005754\n005757\n005760\n005762\n005763\n005765\n005777\n005782\n005783\n005784\n005785\n005786\n005787\n005790\n005793\n005794\n005796\n005800\n005801\n005803\n005805\n005806\n005807\n005811\n005812\n005818\n005819\n005820\n005821\n005822\n005826\n005827\n005829\n005834\n005839\n005840\n005841\n005843\n005852\n005854\n005855\n005856\n005857\n005859\n005864\n005869\n005873\n005876\n005878\n005879\n005881\n005882\n005883\n005885\n005887\n005889\n005892\n005893\n005894\n005899\n005900\n005901\n005903\n005905\n005906\n005907\n005909\n005910\n005911\n005912\n005913\n005914\n005916\n005917\n005918\n005919\n005921\n005922\n005923\n005925\n005926\n005927\n005931\n005933\n005935\n005938\n005939\n005944\n005947\n005948\n005949\n005952\n005955\n005958\n005961\n005962\n005963\n005965\n005969\n005970\n005972\n005975\n005978\n005981\n005982\n005984\n005985\n005986\n005988\n005994\n005996\n005997\n005999\n006001\n006002\n006003\n006005\n006008\n006009\n006010\n
006012\n006013\n006014\n006016\n006023\n006024\n006026\n006027\n006028\n006029\n006030\n006031\n006033\n006034\n006036\n006038\n006039\n006041\n006042\n006043\n006044\n006045\n006046\n006047\n006048\n006050\n006052\n006054\n006057\n006058\n006060\n006061\n006062\n006063\n006066\n006067\n006068\n006070\n006071\n006074\n006075\n006077\n006078\n006083\n006085\n006086\n006087\n006088\n006093\n006095\n006096\n006097\n006098\n006100\n006102\n006103\n006106\n006107\n006110\n006114\n006115\n006116\n006117\n006118\n006121\n006122\n006123\n006125\n006126\n006127\n006130\n006133\n006136\n006139\n006144\n006146\n006148\n006151\n006152\n006154\n006156\n006161\n006163\n006165\n006167\n006168\n006169\n006173\n006176\n006177\n006182\n006185\n006186\n006187\n006190\n006194\n006195\n006196\n006198\n006202\n006204\n006208\n006210\n006213\n006215\n006219\n006222\n006227\n006228\n006229\n006232\n006233\n006238\n006240\n006244\n006246\n006247\n006249\n006250\n006258\n006263\n006265\n006266\n006267\n006269\n006270\n006272\n006273\n006274\n006275\n006276\n006278\n006280\n006282\n006286\n006287\n006288\n006297\n006300\n006301\n006302\n006305\n006306\n006312\n006314\n006315\n006316\n006317\n006321\n006322\n006324\n006331\n006332\n006333\n006334\n006338\n006339\n006340\n006342\n006343\n006344\n006345\n006348\n006349\n006351\n006353\n006354\n006355\n006356\n006357\n006360\n006364\n006366\n006368\n006369\n006370\n006371\n006372\n006377\n006379\n006380\n006381\n006385\n006386\n006388\n006391\n006393\n006394\n006395\n006396\n006403\n006405\n006406\n006407\n006409\n006410\n006411\n006415\n006416\n006417\n006420\n006423\n006424\n006425\n006426\n006427\n006433\n006434\n006435\n006436\n006437\n006439\n006440\n006441\n006442\n006444\n006445\n006446\n006451\n006452\n006453\n006454\n006462\n006464\n006465\n006468\n006469\n006470\n006472\n006473\n006474\n006475\n006477\n006478\n006481\n006482\n006483\n006484\n006486\n006488\n006491\n006493\n006496\n006497\n006498\n006503\n006505\n006506\n006507\n006508\n
006512\n006514\n006515\n006516\n006517\n006519\n006520\n006521\n006524\n006525\n006529\n006530\n006531\n006532\n006533\n006534\n006535\n006537\n006540\n006542\n006548\n006549\n006551\n006553\n006555\n006556\n006558\n006560\n006561\n006563\n006565\n006568\n006569\n006570\n006574\n006576\n006577\n006578\n006581\n006582\n006583\n006586\n006588\n006590\n006592\n006593\n006595\n006596\n006597\n006602\n006603\n006604\n006611\n006612\n006613\n006614\n006618\n006623\n006624\n006625\n006626\n006628\n006629\n006632\n006633\n006634\n006636\n006637\n006638\n006641\n006643\n006647\n006649\n006650\n006651\n006655\n006656\n006658\n006659\n006660\n006664\n006666\n006667\n006669\n006670\n006674\n006676\n006677\n006678\n006679\n006682\n006685\n006686\n006692\n006693\n006694\n006695\n006696\n006698\n006701\n006703\n006709\n006710\n006711\n006712\n006713\n006714\n006715\n006719\n006720\n006723\n006725\n006726\n006729\n006731\n006732\n006733\n006734\n006737\n006738\n006741\n006744\n006745\n006747\n006751\n006752\n006753\n006754\n006755\n006756\n006758\n006759\n006760\n006761\n006762\n006764\n006765\n006767\n006768\n006770\n006771\n006772\n006773\n006777\n006778\n006780\n006781\n006782\n006783\n006785\n006786\n006789\n006791\n006792\n006794\n006796\n006797\n006798\n006800\n006803\n006804\n006806\n006807\n006808\n006811\n006812\n006813\n006815\n006816\n006818\n006819\n006822\n006828\n006829\n006832\n006833\n006836\n006837\n006841\n006843\n006844\n006847\n006849\n006850\n006852\n006853\n006854\n006855\n006856\n006858\n006860\n006862\n006863\n006866\n006868\n006870\n006872\n006873\n006874\n006876\n006879\n006881\n006882\n006884\n006885\n006887\n006889\n006891\n006895\n006897\n006898\n006899\n006900\n006901\n006903\n006906\n006907\n006908\n006910\n006913\n006914\n006917\n006922\n006925\n006928\n006930\n006936\n006937\n006938\n006942\n006943\n006944\n006945\n006948\n006950\n006953\n006954\n006955\n006956\n006959\n006960\n006962\n006964\n006968\n006971\n006973\n006977\n006978\n006980\n006981\n
006982\n006987\n006989\n006990\n006992\n006994\n006997\n006999\n007000\n007003\n007005\n007006\n007008\n007010\n007011\n007012\n007014\n007015\n007016\n007019\n007022\n007023\n007026\n007027\n007028\n007029\n007030\n007031\n007032\n007033\n007034\n007037\n007038\n007042\n007043\n007047\n007048\n007049\n007052\n007053\n007055\n007056\n007059\n007061\n007063\n007065\n007067\n007068\n007069\n007071\n007072\n007074\n007076\n007078\n007079\n007080\n007081\n007082\n007083\n007084\n007085\n007087\n007088\n007089\n007091\n007095\n007098\n007100\n007103\n007109\n007110\n007112\n007115\n007117\n007119\n007120\n007122\n007125\n007130\n007131\n007132\n007133\n007135\n007136\n007138\n007139\n007144\n007145\n007146\n007149\n007154\n007157\n007158\n007161\n007162\n007163\n007164\n007165\n007166\n007168\n007169\n007172\n007174\n007176\n007177\n007178\n007180\n007182\n007183\n007187\n007194\n007198\n007199\n007200\n007201\n007202\n007204\n007205\n007207\n007208\n007210\n007212\n007214\n007215\n007217\n007219\n007221\n007225\n007227\n007229\n007230\n007232\n007233\n007235\n007238\n007240\n007242\n007244\n007246\n007247\n007252\n007253\n007255\n007256\n007258\n007260\n007261\n007262\n007265\n007266\n007267\n007271\n007272\n007273\n007274\n007275\n007277\n007278\n007279\n007280\n007283\n007284\n007287\n007288\n007289\n007290\n007291\n007292\n007294\n007299\n007300\n007302\n007303\n007304\n007309\n007310\n007311\n007315\n007318\n007319\n007322\n007323\n007325\n007326\n007327\n007329\n007330\n007331\n007336\n007337\n007339\n007342\n007343\n007344\n007345\n007347\n007349\n007350\n007351\n007352\n007353\n007359\n007360\n007364\n007369\n007371\n007374\n007375\n007376\n007377\n007380\n007381\n007382\n007383\n007384\n007385\n007389\n007391\n007395\n007396\n007397\n007398\n007401\n007402\n007403\n007405\n007407\n007409\n007410\n007411\n007412\n007413\n007415\n007416\n007419\n007420\n007421\n007422\n007423\n007424\n007426\n007430\n007433\n007434\n007435\n007436\n007437\n007439\n007440\n007442\n
007445\n007447\n007448\n007449\n007450\n007453\n007456\n007458\n007462\n007463\n007464\n007466\n007467\n007468\n007469\n007470\n007473\n007475\n007477\n007478\n007480"
  },
  {
    "path": "pcdet/__init__.py",
    "content": "import subprocess\nfrom pathlib import Path\n\nfrom .version import __version__\n\n__all__ = [\n    '__version__'\n]\n\n\ndef get_git_commit_number():\n    if not (Path(__file__).parent / '../.git').exists():\n        return '0000000'\n\n    cmd_out = subprocess.run(['git', 'rev-parse', 'HEAD'], stdout=subprocess.PIPE)\n    git_commit_number = cmd_out.stdout.decode('utf-8')[:7]\n    return git_commit_number\n\n\nscript_version = get_git_commit_number()\n\n\nif script_version not in __version__:\n    __version__ = __version__ + '+py%s' % script_version\n"
  },
  {
    "path": "pcdet/config.py",
    "content": "from pathlib import Path\n\nimport yaml\nfrom easydict import EasyDict\n\n\ndef log_config_to_file(cfg, pre='cfg', logger=None):\n    for key, val in cfg.items():\n        if isinstance(cfg[key], EasyDict):\n            logger.info('\\n%s.%s = edict()' % (pre, key))\n            log_config_to_file(cfg[key], pre=pre + '.' + key, logger=logger)\n            continue\n        logger.info('%s.%s: %s' % (pre, key, val))\n\n\ndef cfg_from_list(cfg_list, config):\n    \"\"\"Set config keys via list (e.g., from command line).\"\"\"\n    from ast import literal_eval\n    assert len(cfg_list) % 2 == 0\n    for k, v in zip(cfg_list[0::2], cfg_list[1::2]):\n        key_list = k.split('.')\n        d = config\n        for subkey in key_list[:-1]:\n            assert subkey in d, 'NotFoundKey: %s' % subkey\n            d = d[subkey]\n        subkey = key_list[-1]\n        assert subkey in d, 'NotFoundKey: %s' % subkey\n        try:\n            value = literal_eval(v)\n        except Exception:\n            # Not a Python literal: keep the raw string.\n            value = v\n\n        if type(value) != type(d[subkey]) and isinstance(d[subkey], EasyDict):\n            key_val_list = value.split(',')\n            for src in key_val_list:\n                cur_key, cur_val = src.split(':')\n                val_type = type(d[subkey][cur_key])\n                cur_val = val_type(cur_val)\n                d[subkey][cur_key] = cur_val\n        elif type(value) != type(d[subkey]) and isinstance(d[subkey], list):\n            val_list = value.split(',')\n            for k, x in enumerate(val_list):\n                val_list[k] = type(d[subkey][0])(x)\n            d[subkey] = val_list\n        else:\n            assert type(value) == type(d[subkey]), \\\n                'type {} does not match original type {}'.format(type(value), type(d[subkey]))\n            d[subkey] = value\n\n\ndef merge_new_config(config, new_config):\n    if '_BASE_CONFIG_' in new_config:\n        with open(new_config['_BASE_CONFIG_'], 'r') as f:\n            try:\n  
              yaml_config = yaml.load(f, Loader=yaml.FullLoader)\n            except:\n                yaml_config = yaml.load(f)\n        config.update(EasyDict(yaml_config))\n\n    for key, val in new_config.items():\n        if not isinstance(val, dict):\n            config[key] = val\n            continue\n        if key not in config:\n            config[key] = EasyDict()\n        merge_new_config(config[key], val)\n\n    return config\n\n\ndef cfg_from_yaml_file(cfg_file, config):\n    with open(cfg_file, 'r') as f:\n        try:\n            new_config = yaml.load(f, Loader=yaml.FullLoader)\n        except:\n            new_config = yaml.load(f)\n\n        merge_new_config(config=config, new_config=new_config)\n\n    return config\n\n\ncfg = EasyDict()\ncfg.ROOT_DIR = (Path(__file__).resolve().parent / '../').resolve()\ncfg.LOCAL_RANK = 0\n"
  },
  {
    "path": "pcdet/datasets/__init__.py",
    "content": "import torch\nfrom torch.utils.data import DataLoader\nfrom torch.utils.data import DistributedSampler as _DistributedSampler\n\nfrom pcdet.utils import common_utils\n\nfrom .dataset import DatasetTemplate\nfrom .kitti.kitti_dataset import KittiDataset\nfrom .kitti.kitti_dataset_mm import KittiDatasetMM\n\nfrom prefetch_generator import BackgroundGenerator\n\n\n__all__ = {\n    'DatasetTemplate': DatasetTemplate,\n    'KittiDataset': KittiDataset,\n    'KittiDatasetMM': KittiDatasetMM\n}\n\n\nclass DataLoaderX(DataLoader):\n\n    def __iter__(self):\n        return BackgroundGenerator(super().__iter__())\n\nclass DistributedSampler(_DistributedSampler):\n\n    def __init__(self, dataset, num_replicas=None, rank=None, shuffle=True):\n        super().__init__(dataset, num_replicas=num_replicas, rank=rank)\n        self.shuffle = shuffle\n\n    def __iter__(self):\n        if self.shuffle:\n            g = torch.Generator()\n            g.manual_seed(self.epoch)\n            indices = torch.randperm(len(self.dataset), generator=g).tolist()\n        else:\n            indices = torch.arange(len(self.dataset)).tolist()\n\n        indices += indices[:(self.total_size - len(indices))]\n        assert len(indices) == self.total_size\n\n        indices = indices[self.rank:self.total_size:self.num_replicas]\n        assert len(indices) == self.num_samples\n\n        return iter(indices)\n\n\ndef build_dataloader(dataset_cfg, class_names, batch_size, dist, root_path=None, workers=4,\n                     logger=None, training=True, merge_all_iters_to_one_epoch=False, total_epochs=0):\n\n    dataset = __all__[dataset_cfg.DATASET](\n        dataset_cfg=dataset_cfg,\n        class_names=class_names,\n        root_path=root_path,\n        training=training,\n        logger=logger,\n    )\n\n    if merge_all_iters_to_one_epoch:\n        assert hasattr(dataset, 'merge_all_iters_to_one_epoch')\n        dataset.merge_all_iters_to_one_epoch(merge=True, 
epochs=total_epochs)\n\n    if dist:\n        if training:\n            sampler = torch.utils.data.distributed.DistributedSampler(dataset)\n        else:\n            rank, world_size = common_utils.get_dist_info()\n            sampler = DistributedSampler(dataset, world_size, rank, shuffle=False)\n    else:\n        sampler = None\n    dataloader = DataLoaderX(\n        dataset, batch_size=batch_size, pin_memory=True, num_workers=workers,\n        shuffle=(sampler is None) and training, collate_fn=dataset.collate_batch,\n        drop_last=False, sampler=sampler, timeout=0\n    )\n\n    return dataset, dataloader, sampler\n"
  },
  {
    "path": "pcdet/datasets/augmentor/X_transform.py",
    "content": "from functools import partial\n\nimport numpy as np\n\nfrom ...utils import common_utils\nfrom . import augmentor_utils\nimport copy\n\n\nclass X_TRANS(object):\n    def __init__(self, augmentor_configs=None, rot_num=1):\n        self.rot_num = rot_num\n        self.data_augmentor_queue = []\n        self.test_back_queue = []\n        if augmentor_configs is None:\n            augmentor_configs=[{'NAME': 'world_rotation',\n                                'WORLD_ROT_ANGLE': [-0.78539816, 0.78539816]},\n                               {'NAME': 'world_flip',\n                                'ALONG_AXIS_LIST': [0, 1]},\n                               {'NAME': 'world_scaling',\n                               'WORLD_SCALE_RANGE': [0.95, 1.05]}]\n            self.augmentor_configs = augmentor_configs\n        else:\n            self.augmentor_configs = augmentor_configs\n        self.aug_config_list = augmentor_configs if isinstance(augmentor_configs, list) \\\n            else augmentor_configs.AUG_CONFIG_LIST\n\n        for i, cur_cfg in enumerate(self.aug_config_list):\n            cur_augmentor = getattr(self, cur_cfg['NAME'])(config=cur_cfg)\n            self.data_augmentor_queue.append(cur_augmentor)\n            back_config = self.aug_config_list[-(i+1)]\n            cur_augmentor = getattr(self, back_config['NAME'])(config=back_config)\n            self.test_back_queue.append(cur_augmentor)\n\n        self.backward_flag = False\n\n    def get_params(self):\n        # Key access works for both EasyDict configs and the plain-dict defaults above.\n        transform_param = np.zeros(shape=(self.rot_num, len(self.aug_config_list)))\n        for s in range(self.rot_num):\n            for i, config in enumerate(self.aug_config_list):\n                if config['NAME'] == 'world_rotation':\n                    transform_param[s][i] = config['WORLD_ROT_ANGLE'][s]\n                if config['NAME'] == 'world_flip':\n                    transform_param[s][i] = config['ALONG_AXIS_LIST'][s]\n                if config['NAME'] == 'world_scaling':\n                   
 transform_param[s][i] = config['WORLD_SCALE_RANGE'][s]\n        return transform_param\n\n    def world_rotation(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.world_rotation, config=config)\n\n        rot_factor = data_dict['transform_param'][0]\n        if isinstance(rot_factor, np.float64):\n            rot_factor = np.array([rot_factor])\n        else:\n            rot_factor = rot_factor.unsqueeze(0)\n\n        if 'points' in data_dict:\n            points = data_dict['points']\n            if self.backward_flag:\n                points[:, 0:3] = common_utils.rotate_points_along_z(points[np.newaxis, :, 0:3], -rot_factor)[0]\n            else:\n                points[:, 0:3] = common_utils.rotate_points_along_z(points[np.newaxis, :, 0:3], rot_factor)[0]\n            data_dict['points'] = points\n\n        if 'boxes' in data_dict:\n            boxes_lidar = data_dict['boxes']\n            if self.backward_flag:\n                boxes_lidar[:, 0:3] = common_utils.rotate_points_along_z(boxes_lidar[np.newaxis, :, 0:3], -rot_factor)[0]\n                boxes_lidar[:, 6] += -rot_factor\n            else:\n                boxes_lidar[:, 0:3] = common_utils.rotate_points_along_z(boxes_lidar[np.newaxis, :, 0:3], rot_factor)[0]\n                boxes_lidar[:, 6] += rot_factor\n            data_dict['boxes'] = boxes_lidar\n\n        return data_dict\n\n    def world_flip(self, data_dict=None, config=None):\n\n        if data_dict is None:\n            return partial(self.world_flip, config=config)\n\n        if 'points' in data_dict:\n            points = getattr(augmentor_utils, 'random_flip_with_param')(\n                data_dict['points'], data_dict['transform_param'][1], ax=1)\n            data_dict['points'] = points\n\n        if 'boxes' in data_dict:\n            boxes = getattr(augmentor_utils, 'random_flip_with_param')(\n                data_dict['boxes'], data_dict['transform_param'][1], ax=1)\n            boxes 
= getattr(augmentor_utils, 'random_flip_with_param')(\n                boxes, data_dict['transform_param'][1], ax=6)\n            data_dict['boxes'] = boxes\n\n        return data_dict\n\n    def world_scaling(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.world_scaling, config=config)\n        scale_factor = data_dict['transform_param'][2]\n\n        if 'points' in data_dict:\n            points = data_dict['points']\n            if self.backward_flag:\n                points[:, 0:3] /= scale_factor\n            else:\n                points[:, 0:3] *= scale_factor\n\n            data_dict['points'] = points\n\n        if 'boxes' in data_dict:\n            boxes_lidar = data_dict['boxes']\n            if self.backward_flag:\n                boxes_lidar[:, 0:6] /= scale_factor\n            else:\n                boxes_lidar[:, 0:6] *= scale_factor\n            data_dict['boxes'] = boxes_lidar\n\n        return data_dict\n\n    def forward_with_param(self, data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                points: (N, 3 + C_in)\n                gt_boxes: optional, (N, 7) [x, y, z, dx, dy, dz, heading]\n                gt_names: optional, (N), string\n                ...\n\n        Returns:\n        \"\"\"\n\n        for cur_augmentor in self.data_augmentor_queue:\n            data_dict = cur_augmentor(data_dict=data_dict)\n\n        return data_dict\n\n    def backward_with_param(self, data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                points: (N, 3 + C_in)\n                gt_boxes: optional, (N, 7) [x, y, z, dx, dy, dz, heading]\n                gt_names: optional, (N), string\n                ...\n\n        Returns:\n        \"\"\"\n        self.backward_flag = True\n        for cur_augmentor in self.test_back_queue:\n            data_dict = cur_augmentor(data_dict=data_dict)\n        self.backward_flag = False\n        return data_dict\n\n    
def input_transform(self, data_dict, trans_boxes=False):\n        \"\"\"\n        Args:\n            data_dict:\n                points: (N, 3 + C_in)\n                gt_boxes: optional, (N, 7) [x, y, z, dx, dy, dz, heading]\n                gt_names: optional, (N), string\n                ...\n\n        Returns:\n        \"\"\"\n        params = self.get_params()\n\n        src_points = copy.deepcopy(data_dict['points'])\n        if trans_boxes:\n            src_gt_boxes = copy.deepcopy(data_dict['gt_boxes'])\n\n        for i in range(self.rot_num):\n            if i == 0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n            ini_data_dict = {}\n            ini_data_dict['points'] = copy.deepcopy(src_points)\n            if trans_boxes:\n                ini_data_dict['boxes'] = copy.deepcopy(src_gt_boxes)\n\n            ini_data_dict['transform_param'] = copy.deepcopy(params[i])\n\n            transformed_data = self.forward_with_param(ini_data_dict)\n\n            data_dict['points'+rot_num_id] = transformed_data['points']\n            if trans_boxes:\n                data_dict['gt_boxes'+rot_num_id] = transformed_data['boxes']\n\n        data_dict['transform_param'] = params\n\n        return data_dict\n"
  },
  {
    "path": "pcdet/datasets/augmentor/augmentor_utils.py",
    "content": "import numpy as np\nimport math\nimport copy\nfrom ...utils import common_utils,box_np_ops\nfrom ...utils import box_utils\nimport numba\n\ndef random_flip_along_x(gt_boxes, points):\n    \"\"\"\n    Args:\n        gt_boxes: (N, 7 + C), [x, y, z, dx, dy, dz, heading, [vx], [vy]]\n        points: (M, 3 + C)\n    Returns:\n    \"\"\"\n    enable = np.random.choice([False, True], replace=False, p=[0.5, 0.5])\n    if enable:\n        gt_boxes[:, 1] = -gt_boxes[:, 1]\n        gt_boxes[:, 6] = -gt_boxes[:, 6]\n        points[:, 1] = -points[:, 1]\n\n        if gt_boxes.shape[1] > 7:\n            gt_boxes[:, 8] = -gt_boxes[:, 8]\n\n    return gt_boxes, points,enable\n\ndef random_flip_with_param(points, enable, ax=1,offset = 0):\n\n    if enable and points is not None:\n\n        points[:, ax] = -(points[:, ax]+offset)\n\n    return points\n\ndef random_flip_along_y(gt_boxes, points):\n    \"\"\"\n    Args:\n        gt_boxes: (N, 7 + C), [x, y, z, dx, dy, dz, heading, [vx], [vy]]\n        points: (M, 3 + C)\n    Returns:\n    \"\"\"\n    enable = np.random.choice([False, True], replace=False, p=[0.5, 0.5])\n    if enable:\n        gt_boxes[:, 0] = -gt_boxes[:, 0]\n        gt_boxes[:, 6] = -(gt_boxes[:, 6] + np.pi)\n        points[:, 0] = -points[:, 0]\n\n        if gt_boxes.shape[1] > 7:\n            gt_boxes[:, 7] = -gt_boxes[:, 7]\n\n    return gt_boxes, points, enable\n\n\ndef global_rotation(gt_boxes, points, rot_range):\n    \"\"\"\n    Args:\n        gt_boxes: (N, 7 + C), [x, y, z, dx, dy, dz, heading, [vx], [vy]]\n        points: (M, 3 + C),\n        rot_range: [min, max]\n    Returns:\n    \"\"\"\n    noise_rotation = np.random.uniform(rot_range[0], rot_range[1])\n    points = common_utils.rotate_points_along_z(points[np.newaxis, :, :], np.array([noise_rotation]))[0]\n    gt_boxes[:, 0:3] = common_utils.rotate_points_along_z(gt_boxes[np.newaxis, :, 0:3], np.array([noise_rotation]))[0]\n    gt_boxes[:, 6] += noise_rotation\n    if gt_boxes.shape[1] 
> 7:\n        gt_boxes[:, 7:9] = common_utils.rotate_points_along_z(\n            np.hstack((gt_boxes[:, 7:9], np.zeros((gt_boxes.shape[0], 1))))[np.newaxis, :, :],\n            np.array([noise_rotation])\n        )[0][:, 0:2]\n\n    return gt_boxes, points, noise_rotation\n\ndef global_rotation_with_param(batch_dict, noise_rotation=None,num_frames=2):\n\n    for i in range(num_frames):\n        if i == 0:\n            batch_dict['points'] = common_utils.rotate_points_along_z(batch_dict['points'][np.newaxis, :, :], np.array([noise_rotation]))[0]\n            batch_dict['gt_boxes'][:,0:3] =  common_utils.rotate_points_along_z(batch_dict['gt_boxes'][np.newaxis, :, 0:3], np.array([noise_rotation]))[0]\n            batch_dict['gt_boxes'][:, 6]+=noise_rotation\n            batch_dict['gt_tracklets'][:, 0:3] = common_utils.rotate_points_along_z(batch_dict['gt_tracklets'][np.newaxis, :, 0:3], np.array([noise_rotation]))[0]\n            batch_dict['gt_tracklets'][:, 6] += noise_rotation\n        if 'points'+str(-i) in batch_dict:\n            batch_dict['points'+str(-i)] = common_utils.rotate_points_along_z(batch_dict['points'+str(-i)][np.newaxis, :, :], np.array([noise_rotation]))[0]\n            begin_id = 7+(i-1)*4\n            batch_dict['gt_tracklets'][:, begin_id:begin_id+3] = common_utils.rotate_points_along_z(batch_dict['gt_tracklets'][np.newaxis, :, begin_id:begin_id+3],np.array([noise_rotation]))[0]\n            batch_dict['gt_tracklets'][:, begin_id + 3]+=noise_rotation\n\n        if 'gt_boxes'+str(-i) in batch_dict:\n            batch_dict['gt_boxes'+str(-i)][:, :3] = \\\n            common_utils.rotate_points_along_z(batch_dict['gt_boxes'+str(-i)][np.newaxis, :, :3],\n                                               np.array([noise_rotation]))[0]\n            batch_dict['gt_boxes' + str(-i)][:, 6] +=noise_rotation\n\n    return batch_dict\n\ndef boxes_rotation_with_param(boxes, noise_rotation=None):\n    boxes[:,0:3] =  
common_utils.rotate_points_along_z(boxes[np.newaxis, :, 0:3], np.array([noise_rotation]))[0]\n    boxes[:, 6] += noise_rotation\n    return boxes\n\n\ndef global_scaling(gt_boxes, points, scale_range):\n    \"\"\"\n    Args:\n        gt_boxes: (N, 7), [x, y, z, dx, dy, dz, heading]\n        points: (M, 3 + C),\n        scale_range: [min, max]\n    Returns:\n    \"\"\"\n    if scale_range[1] - scale_range[0] < 1e-3:\n        # Degenerate range: skip scaling and report the identity scale so the\n        # return shape matches the normal path below.\n        return gt_boxes, points, 1.0\n    noise_scale = np.random.uniform(scale_range[0], scale_range[1])\n    points[:, :3] *= noise_scale\n    gt_boxes[:, :6] *= noise_scale\n    return gt_boxes, points, noise_scale\n\ndef global_scaling_with_param(batch_dict, noise_scale=None,num_frames=2):\n\n\n    for i in range(num_frames):\n        if i==0:\n            batch_dict['points'][:,0:3]*=noise_scale\n            batch_dict['gt_boxes'][:, 0:6] *= noise_scale\n            batch_dict['gt_tracklets'][:, 0:6]*=noise_scale\n        if 'points'+str(-i) in batch_dict:\n            begin_id = 7 + (i - 1) * 4\n            batch_dict['points'+str(-i)][:, 0:3] *= noise_scale\n            batch_dict['gt_tracklets'][:, begin_id:begin_id+3] *= noise_scale\n\n        if 'gt_boxes' + str(-i) in batch_dict:\n            batch_dict['gt_boxes' + str(-i)][:, 0:6] *= noise_scale\n\n\n    return batch_dict\n\n\ndef get_points_in_box(points, gt_box):\n    x, y, z = points[:, 0], points[:, 1], points[:, 2]\n    cx, cy, cz = gt_box[0], gt_box[1], gt_box[2]\n    dx, dy, dz, rz = gt_box[3], gt_box[4], gt_box[5], gt_box[6]\n    shift_x, shift_y, shift_z = x - cx, y - cy, z - cz\n\n    MARGIN = 1e-1\n    cosa, sina = math.cos(-rz), math.sin(-rz)\n    local_x = shift_x * cosa + shift_y * (-sina)\n    local_y = shift_x * sina + shift_y * cosa\n\n    mask = np.logical_and(abs(shift_z) <= dz / 2.0,\n                          np.logical_and(abs(local_x) <= dx / 2.0 + MARGIN,\n                                         abs(local_y) <= dy / 2.0 + MARGIN))\n\n    points = points[mask]\n\n    return 
points, mask\n\n\ndef get_pyramids(boxes):\n    pyramid_orders = np.array([\n        [0, 1, 5, 4],\n        [4, 5, 6, 7],\n        [7, 6, 2, 3],\n        [3, 2, 1, 0],\n        [1, 2, 6, 5],\n        [0, 4, 7, 3]\n    ])\n    boxes_corners = box_utils.boxes_to_corners_3d(boxes).reshape(-1, 24)\n\n    pyramid_list = []\n    for order in pyramid_orders:\n        # pyramid: box center (apex) plus the 4 corners of one face, i.e. 5 vertices\n        pyramid = np.concatenate((\n            boxes[:, 0:3],\n            boxes_corners[:, 3 * order[0]: 3 * order[0] + 3],\n            boxes_corners[:, 3 * order[1]: 3 * order[1] + 3],\n            boxes_corners[:, 3 * order[2]: 3 * order[2] + 3],\n            boxes_corners[:, 3 * order[3]: 3 * order[3] + 3]), axis=1)\n        pyramid_list.append(pyramid[:, None, :])\n    pyramids = np.concatenate(pyramid_list, axis=1)  # [N, 6, 15], 15=5*3\n    return pyramids\n\n\ndef one_hot(x, num_class=1):\n    if num_class is None:\n        num_class = 1\n    ohx = np.zeros((len(x), num_class))\n    ohx[range(len(x)), x] = 1\n    return ohx\n\n\ndef points_in_pyramids_mask(points, pyramids):\n    pyramids = pyramids.reshape(-1, 5, 3)\n    # Use the builtin bool: the np.bool alias was removed in NumPy 1.24.\n    flags = np.zeros((points.shape[0], pyramids.shape[0]), dtype=bool)\n    for i, pyramid in enumerate(pyramids):\n        flags[:, i] = np.logical_or(flags[:, i], box_utils.in_hull(points[:, 0:3], pyramid))\n    return flags\n\n\ndef local_pyramid_dropout(gt_boxes, points, dropout_prob, pyramids=None):\n    if pyramids is None:\n        pyramids = get_pyramids(gt_boxes).reshape([-1, 6, 5, 3])  # six surface pyramids per box: [num_boxes, 6, 5, 3]\n    drop_pyramid_indices = np.random.randint(0, 6, (pyramids.shape[0]))\n    drop_pyramid_one_hot = one_hot(drop_pyramid_indices, num_class=6)\n    drop_box_mask = np.random.uniform(0, 1, (pyramids.shape[0])) <= dropout_prob\n    if np.sum(drop_box_mask) != 0:\n        drop_pyramid_mask = (np.tile(drop_box_mask[:, None], [1, 6]) * drop_pyramid_one_hot) > 0\n        drop_pyramids = 
pyramids[drop_pyramid_mask]\n        point_masks = points_in_pyramids_mask(points, drop_pyramids)\n        points = points[np.logical_not(point_masks.any(-1))]\n    # print(drop_box_mask)\n    pyramids = pyramids[np.logical_not(drop_box_mask)]\n    return gt_boxes, points, pyramids\n\n\ndef local_pyramid_sparsify(gt_boxes, points, prob, max_num_pts, pyramids=None):\n    if pyramids is None:\n        pyramids = get_pyramids(gt_boxes).reshape([-1, 6, 5, 3])  # six surface pyramids per box: [num_boxes, 6, 5, 3]\n    if pyramids.shape[0] > 0:\n        sparsity_prob, sparsity_num = prob, max_num_pts\n        sparsify_pyramid_indices = np.random.randint(0, 6, (pyramids.shape[0]))\n        sparsify_pyramid_one_hot = one_hot(sparsify_pyramid_indices, num_class=6)\n        sparsify_box_mask = np.random.uniform(0, 1, (pyramids.shape[0])) <= sparsity_prob\n        sparsify_pyramid_mask = (np.tile(sparsify_box_mask[:, None], [1, 6]) * sparsify_pyramid_one_hot) > 0\n        # print(sparsify_box_mask)\n\n        pyramid_sampled = pyramids[sparsify_pyramid_mask]  # (-1,6,5,3)[(num_sample,6)]\n        # print(pyramid_sampled.shape)\n        pyramid_sampled_point_masks = points_in_pyramids_mask(points, pyramid_sampled)\n        pyramid_sampled_points_num = pyramid_sampled_point_masks.sum(0)  # the number of points in each surface pyramid\n        valid_pyramid_sampled_mask = pyramid_sampled_points_num > sparsity_num  # sparsify only pyramids holding more than sparsity_num points\n\n        sparsify_pyramids = pyramid_sampled[valid_pyramid_sampled_mask]\n        if sparsify_pyramids.shape[0] > 0:\n            point_masks = pyramid_sampled_point_masks[:, valid_pyramid_sampled_mask]\n            remain_points = points[\n                np.logical_not(point_masks.any(-1))]  # points outside every pyramid being down-sampled\n            to_sparsify_points = [points[point_masks[:, i]] for i in range(point_masks.shape[1])]\n\n            sparsified_points = []\n            for sample in 
to_sparsify_points:\n                sampled_indices = np.random.choice(sample.shape[0], size=sparsity_num, replace=False)\n                sparsified_points.append(sample[sampled_indices])\n            sparsified_points = np.concatenate(sparsified_points, axis=0)\n            points = np.concatenate([remain_points, sparsified_points], axis=0)\n        pyramids = pyramids[np.logical_not(sparsify_box_mask)]\n    return gt_boxes, points, pyramids\n\n\ndef local_pyramid_swap(gt_boxes, points, prob, max_num_pts, pyramids=None):\n    def get_points_ratio(points, pyramid):\n        surface_center = (pyramid[3:6] + pyramid[6:9] + pyramid[9:12] + pyramid[12:]) / 4.0\n        vector_0, vector_1, vector_2 = pyramid[6:9] - pyramid[3:6], pyramid[12:] - pyramid[3:6], pyramid[\n                                                                                                 0:3] - surface_center\n        alphas = ((points[:, 0:3] - pyramid[3:6]) * vector_0).sum(-1) / np.power(vector_0, 2).sum()\n        betas = ((points[:, 0:3] - pyramid[3:6]) * vector_1).sum(-1) / np.power(vector_1, 2).sum()\n        gammas = ((points[:, 0:3] - surface_center) * vector_2).sum(-1) / np.power(vector_2, 2).sum()\n        return [alphas, betas, gammas]\n\n    def recover_points_by_ratio(points_ratio, pyramid):\n        alphas, betas, gammas = points_ratio\n        surface_center = (pyramid[3:6] + pyramid[6:9] + pyramid[9:12] + pyramid[12:]) / 4.0\n        vector_0, vector_1, vector_2 = pyramid[6:9] - pyramid[3:6], pyramid[12:] - pyramid[3:6], pyramid[\n                                                                                                 0:3] - surface_center\n        points = (alphas[:, None] * vector_0 + betas[:, None] * vector_1) + pyramid[3:6] + gammas[:, None] * vector_2\n        return points\n\n    def recover_points_intensity_by_ratio(points_intensity_ratio, max_intensity, min_intensity):\n        return points_intensity_ratio * (max_intensity - min_intensity) + min_intensity\n\n   
 # swap partition\n    if pyramids is None:\n        pyramids = get_pyramids(gt_boxes).reshape([-1, 6, 5, 3])  # each six surface of boxes: [num_boxes, 6, 15=3*5]\n    swap_prob, num_thres = prob, max_num_pts\n    swap_pyramid_mask = np.random.uniform(0, 1, (pyramids.shape[0])) <= swap_prob\n\n    if swap_pyramid_mask.sum() > 0:\n        point_masks = points_in_pyramids_mask(points, pyramids)\n        point_nums = point_masks.sum(0).reshape(pyramids.shape[0], -1)  # [N, 6]\n        non_zero_pyramids_mask = point_nums > num_thres  # ingore dropout pyramids or highly occluded pyramids\n        selected_pyramids = non_zero_pyramids_mask * swap_pyramid_mask[:,\n                                                     None]  # selected boxes and all their valid pyramids\n        # print(selected_pyramids)\n        if selected_pyramids.sum() > 0:\n            # get to_swap pyramids\n            index_i, index_j = np.nonzero(selected_pyramids)\n            selected_pyramid_indices = [np.random.choice(index_j[index_i == i]) \\\n                                            if e and (index_i == i).any() else 0 for i, e in\n                                        enumerate(swap_pyramid_mask)]\n            selected_pyramids_mask = selected_pyramids * one_hot(selected_pyramid_indices, num_class=6) == 1\n            to_swap_pyramids = pyramids[selected_pyramids_mask]\n\n            # get swapped pyramids\n            index_i, index_j = np.nonzero(selected_pyramids_mask)\n            non_zero_pyramids_mask[selected_pyramids_mask] = False\n            swapped_index_i = np.array([np.random.choice(np.where(non_zero_pyramids_mask[:, j])[0]) if \\\n                                            np.where(non_zero_pyramids_mask[:, j])[0].shape[0] > 0 else\n                                        index_i[i] for i, j in enumerate(index_j.tolist())])\n            swapped_indicies = np.concatenate([swapped_index_i[:, None], index_j[:, None]], axis=1)\n            swapped_pyramids = pyramids[\n      
          swapped_indicies[:, 0].astype(np.int32), swapped_indicies[:, 1].astype(np.int32)]\n\n            # concat to_swap&swapped pyramids\n            swap_pyramids = np.concatenate([to_swap_pyramids, swapped_pyramids], axis=0)\n            swap_point_masks = points_in_pyramids_mask(points, swap_pyramids)\n            remain_points = points[np.logical_not(swap_point_masks.any(-1))]\n\n            # swap pyramids\n            points_res = []\n            num_swapped_pyramids = swapped_pyramids.shape[0]\n            for i in range(num_swapped_pyramids):\n                to_swap_pyramid = to_swap_pyramids[i]\n                swapped_pyramid = swapped_pyramids[i]\n\n                to_swap_points = points[swap_point_masks[:, i]]\n                swapped_points = points[swap_point_masks[:, i + num_swapped_pyramids]]\n                # for intensity transform\n                to_swap_points_intensity_ratio = (to_swap_points[:, 3:] - to_swap_points[:, 3:].min()) / \\\n                                                 np.clip(\n                                                     (to_swap_points[:, 3:].max() - to_swap_points[:, 3:].min()),\n                                                     1e-6, 1)\n                swapped_points_intensity_ratio = (swapped_points[:, 3:] - swapped_points[:, 3:].min()) / \\\n                                                 np.clip(\n                                                     (swapped_points[:, 3:].max() - swapped_points[:, 3:].min()),\n                                                     1e-6, 1)\n\n                to_swap_points_ratio = get_points_ratio(to_swap_points, to_swap_pyramid.reshape(15))\n                swapped_points_ratio = get_points_ratio(swapped_points, swapped_pyramid.reshape(15))\n                new_to_swap_points = recover_points_by_ratio(swapped_points_ratio, to_swap_pyramid.reshape(15))\n                new_swapped_points = recover_points_by_ratio(to_swap_points_ratio, swapped_pyramid.reshape(15))\n      
          # for intensity transform\n                new_to_swap_points_intensity = recover_points_intensity_by_ratio(\n                    swapped_points_intensity_ratio, to_swap_points[:, 3:].max(),\n                    to_swap_points[:, 3:].min())\n                new_swapped_points_intensity = recover_points_intensity_by_ratio(\n                    to_swap_points_intensity_ratio, swapped_points[:, 3:].max(),\n                    swapped_points[:, 3:].min())\n\n                # new_to_swap_points = np.concatenate([new_to_swap_points, swapped_points[:, -1:]], axis=1)\n                # new_swapped_points = np.concatenate([new_swapped_points, to_swap_points[:, -1:]], axis=1)\n\n                new_to_swap_points = np.concatenate([new_to_swap_points, new_to_swap_points_intensity], axis=1)\n                new_swapped_points = np.concatenate([new_swapped_points, new_swapped_points_intensity], axis=1)\n\n                points_res.append(new_to_swap_points)\n                points_res.append(new_swapped_points)\n\n            points_res = np.concatenate(points_res, axis=0)\n\n            points = np.concatenate([remain_points, points_res], axis=0)\n    return gt_boxes, points\n\ndef noise_per_object_v3_(gt_boxes,\n                         points=None,\n                         points_pseudo=None,\n                         valid_mask=None,\n                         rotation_perturb=np.pi / 4,\n                         center_noise_std=1.0,\n                         global_random_rot_range=np.pi / 4,\n                         data_aug_with_context=-1.0,\n                         num_try=100):\n    \"\"\"Random rotate or remove each groundtruth independently. 
Use the KITTI viewer\n    to test this function; see points_transform_.\n\n    gt_boxes, points and points_pseudo are modified in place; nothing is\n    returned.\n\n    Args:\n        gt_boxes (np.ndarray): Ground truth boxes with shape (N, 7).\n        points (np.ndarray | None): Input point cloud with shape (M, 4).\n            Default: None.\n        points_pseudo (np.ndarray | None): Second point cloud that receives\n            the same per-box transforms as points. Default: None.\n        valid_mask (np.ndarray | None): Mask to indicate which boxes are valid.\n            Default: None.\n        rotation_perturb (float): Rotation perturbation. Default: pi / 4.\n        center_noise_std (float): Center noise standard deviation.\n            Default: 1.0.\n        global_random_rot_range (float): Global random rotation range.\n            Default: pi/4.\n        data_aug_with_context (sequence of 3 floats): Extra size added to\n            the box dimensions (dx, dy, dz) before the point-in-box and\n            collision tests. It is indexed as [0], [1], [2] below, so the\n            scalar default of -1.0 must be overridden by the caller.\n        num_try (int): Number of attempts per box. Default: 100.\n    \"\"\"\n    num_boxes = gt_boxes.shape[0]\n    if not isinstance(rotation_perturb, (list, tuple, np.ndarray)):\n        rotation_perturb = [-rotation_perturb, rotation_perturb]\n    if not isinstance(global_random_rot_range, (list, tuple, np.ndarray)):\n        global_random_rot_range = [\n            -global_random_rot_range, global_random_rot_range\n        ]\n    enable_grot = np.abs(global_random_rot_range[0] -\n                         global_random_rot_range[1]) >= 1e-3\n\n    if not isinstance(center_noise_std, (list, tuple, np.ndarray)):\n        center_noise_std = [\n            center_noise_std, center_noise_std, center_noise_std\n        ]\n    if valid_mask is None:\n        valid_mask = np.ones((num_boxes, ), dtype=np.bool_)\n    center_noise_std = np.array(center_noise_std, dtype=gt_boxes.dtype)\n\n    loc_noises = np.random.normal(\n        scale=center_noise_std, size=[num_boxes, num_try, 3])\n    rot_noises = np.random.uniform(\n        rotation_perturb[0], rotation_perturb[1], size=[num_boxes, num_try])\n    global_rot_noises = np.random.uniform(\n        global_random_rot_range[0],\n        global_random_rot_range[1],\n        size=[num_boxes, num_try])\n\n    origin = (0.5, 0.5, 0.5)\n    offset = np.array([0.0, 0.0, 0.0, data_aug_with_context[0], data_aug_with_context[1], 
data_aug_with_context[2], 0.0])\n\n    gt_box_corners = box_np_ops.center_to_corner_box3d(\n        gt_boxes[:, :3],\n        gt_boxes[:, 3:6] + offset[3:6],\n        gt_boxes[:, 6],\n        origin=origin,\n        axis=2)\n\n    if not enable_grot:\n        selected_noise = noise_per_box(gt_boxes[:, [0, 1, 3, 4, 6]] + offset[[0, 1, 3, 4, 6]],\n                                       valid_mask, loc_noises, rot_noises)\n    else:\n        selected_noise = noise_per_box_v2_(gt_boxes[:, [0, 1, 3, 4, 6]] + offset[[0, 1, 3, 4, 6]],\n                                           valid_mask, loc_noises, rot_noises,\n                                           global_rot_noises)\n\n    loc_transforms = _select_transform(loc_noises, selected_noise)\n    rot_transforms = _select_transform(rot_noises, selected_noise)\n\n    surfaces = box_np_ops.corner_to_surfaces_3d_jit(gt_box_corners)\n    if points is not None:\n        point_masks = box_np_ops.points_in_convex_polygon_3d_jit(\n            points[:, :3], surfaces)\n        points_transform_(points, gt_boxes[:, :3], point_masks, loc_transforms,\n                          rot_transforms, valid_mask)\n    if points_pseudo is not None:\n        point_pseudo_masks = box_np_ops.points_in_convex_polygon_3d_jit(\n            points_pseudo[:, :3], surfaces)\n\n\n        points_transform_(points_pseudo, gt_boxes[:, :3], point_pseudo_masks, loc_transforms,\n                          rot_transforms, valid_mask)\n\n    box3d_transform_(gt_boxes, loc_transforms, rot_transforms, valid_mask)\n\n@numba.njit\ndef _rotation_box2d_jit_(corners, angle, rot_mat_T):\n    \"\"\"Rotate 2D boxes.\n\n    Args:\n        corners (np.ndarray): Corners of boxes.\n        angle (float): Rotation angle.\n        rot_mat_T (np.ndarray): Transposed rotation matrix.\n    \"\"\"\n    rot_sin = np.sin(angle)\n    rot_cos = np.cos(angle)\n    rot_mat_T[0, 0] = rot_cos\n    rot_mat_T[0, 1] = -rot_sin\n    rot_mat_T[1, 0] = rot_sin\n    rot_mat_T[1, 1] = rot_cos\n   
 corners[:] = corners @ rot_mat_T\n\n\n@numba.jit(nopython=True)\ndef box_collision_test(boxes, qboxes, clockwise=True):\n    \"\"\"Box collision test.\n\n    Args:\n        boxes (np.ndarray): Corners of current boxes.\n        qboxes (np.ndarray): Boxes to be avoid colliding.\n        clockwise (bool): Whether the corners are in clockwise order.\n            Default: True.\n    \"\"\"\n    N = boxes.shape[0]\n    K = qboxes.shape[0]\n    ret = np.zeros((N, K), dtype=np.bool_)\n    slices = np.array([1, 2, 3, 0])\n    lines_boxes = np.stack((boxes, boxes[:, slices, :]),\n                           axis=2)  # [N, 4, 2(line), 2(xy)]\n    lines_qboxes = np.stack((qboxes, qboxes[:, slices, :]), axis=2)\n    # vec = np.zeros((2,), dtype=boxes.dtype)\n    boxes_standup = box_np_ops.corner_to_standup_nd_jit(boxes)\n    qboxes_standup = box_np_ops.corner_to_standup_nd_jit(qboxes)\n    for i in range(N):\n        for j in range(K):\n            # calculate standup first\n            iw = (\n                min(boxes_standup[i, 2], qboxes_standup[j, 2]) -\n                max(boxes_standup[i, 0], qboxes_standup[j, 0]))\n            if iw > 0:\n                ih = (\n                    min(boxes_standup[i, 3], qboxes_standup[j, 3]) -\n                    max(boxes_standup[i, 1], qboxes_standup[j, 1]))\n                if ih > 0:\n                    for k in range(4):\n                        for box_l in range(4):\n                            A = lines_boxes[i, k, 0]\n                            B = lines_boxes[i, k, 1]\n                            C = lines_qboxes[j, box_l, 0]\n                            D = lines_qboxes[j, box_l, 1]\n                            acd = (D[1] - A[1]) * (C[0] -\n                                                   A[0]) > (C[1] - A[1]) * (\n                                                       D[0] - A[0])\n                            bcd = (D[1] - B[1]) * (C[0] -\n                                                   B[0]) > (C[1] - B[1]) * 
(\n                                                       D[0] - B[0])\n                            if acd != bcd:\n                                abc = (C[1] - A[1]) * (B[0] - A[0]) > (\n                                    B[1] - A[1]) * (\n                                        C[0] - A[0])\n                                abd = (D[1] - A[1]) * (B[0] - A[0]) > (\n                                    B[1] - A[1]) * (\n                                        D[0] - A[0])\n                                if abc != abd:\n                                    ret[i, j] = True  # collision.\n                                    break\n                        if ret[i, j] is True:\n                            break\n                    if ret[i, j] is False:\n                        # now check complete overlap.\n                        # box overlap qbox:\n                        box_overlap_qbox = True\n                        for box_l in range(4):  # point l in qboxes\n                            for k in range(4):  # corner k in boxes\n                                vec = boxes[i, k] - boxes[i, (k + 1) % 4]\n                                if clockwise:\n                                    vec = -vec\n                                cross = vec[1] * (\n                                    boxes[i, k, 0] - qboxes[j, box_l, 0])\n                                cross -= vec[0] * (\n                                    boxes[i, k, 1] - qboxes[j, box_l, 1])\n                                if cross >= 0:\n                                    box_overlap_qbox = False\n                                    break\n                            if box_overlap_qbox is False:\n                                break\n\n                        if box_overlap_qbox is False:\n                            qbox_overlap_box = True\n                            for box_l in range(4):  # point box_l in boxes\n                                for k in range(4):  # corner k in qboxes\n              
                      vec = qboxes[j, k] - qboxes[j, (k + 1) % 4]\n                                    if clockwise:\n                                        vec = -vec\n                                    cross = vec[1] * (\n                                        qboxes[j, k, 0] - boxes[i, box_l, 0])\n                                    cross -= vec[0] * (\n                                        qboxes[j, k, 1] - boxes[i, box_l, 1])\n                                    if cross >= 0:  #\n                                        qbox_overlap_box = False\n                                        break\n                                if qbox_overlap_box is False:\n                                    break\n                            if qbox_overlap_box:\n                                ret[i, j] = True  # collision.\n                        else:\n                            ret[i, j] = True  # collision.\n    return ret\n\n\n@numba.njit\ndef noise_per_box(boxes, valid_mask, loc_noises, rot_noises):\n    \"\"\"Add noise to every box (only on the horizontal plane).\n\n    Args:\n        boxes (np.ndarray): Input boxes with shape (N, 5).\n        valid_mask (np.ndarray): Mask to indicate which boxes are valid\n            with shape (N).\n        loc_noises (np.ndarray): Location noises with shape (N, M, 3).\n        rot_noises (np.ndarray): Rotation noises with shape (N, M).\n\n    Returns:\n        np.ndarray: Mask to indicate whether the noise is\n            added successfully (pass the collision test).\n    \"\"\"\n    num_boxes = boxes.shape[0]\n    num_tests = loc_noises.shape[1]\n    box_corners = box_np_ops.box2d_to_corner_jit(boxes)\n    current_corners = np.zeros((4, 2), dtype=boxes.dtype)\n    rot_mat_T = np.zeros((2, 2), dtype=boxes.dtype)\n    success_mask = -np.ones((num_boxes, ), dtype=np.int64)\n    for i in range(num_boxes):\n        if valid_mask[i]:\n            for j in range(num_tests):\n                current_corners[:] = box_corners[i]\n      
          current_corners -= boxes[i, :2]\n                _rotation_box2d_jit_(current_corners, rot_noises[i, j],\n                                     rot_mat_T)\n                current_corners += boxes[i, :2] + loc_noises[i, j, :2]\n                coll_mat = box_collision_test(\n                    current_corners.reshape(1, 4, 2), box_corners)\n                coll_mat[0, i] = False\n                # print(coll_mat)\n                if not coll_mat.any():\n                    success_mask[i] = j\n                    box_corners[i] = current_corners\n                    break\n    return success_mask\n\n\n@numba.njit\ndef noise_per_box_v2_(boxes, valid_mask, loc_noises, rot_noises,\n                      global_rot_noises):\n    \"\"\"Add noise to every box (only on the horizontal plane). Version 2 used\n    when enable global rotations.\n\n    Args:\n        boxes (np.ndarray): Input boxes with shape (N, 5).\n        valid_mask (np.ndarray): Mask to indicate which boxes are valid\n            with shape (N).\n        loc_noises (np.ndarray): Location noises with shape (N, M, 3).\n        rot_noises (np.ndarray): Rotation noises with shape (N, M).\n\n    Returns:\n        np.ndarray: Mask to indicate whether the noise is\n            added successfully (pass the collision test).\n    \"\"\"\n    num_boxes = boxes.shape[0]\n    num_tests = loc_noises.shape[1]\n    box_corners = box_np_ops.box2d_to_corner_jit(boxes)\n    current_corners = np.zeros((4, 2), dtype=boxes.dtype)\n    current_box = np.zeros((1, 5), dtype=boxes.dtype)\n    rot_mat_T = np.zeros((2, 2), dtype=boxes.dtype)\n    dst_pos = np.zeros((2, ), dtype=boxes.dtype)\n    success_mask = -np.ones((num_boxes, ), dtype=np.int64)\n    corners_norm = np.zeros((4, 2), dtype=boxes.dtype)\n    corners_norm[1, 1] = 1.0\n    corners_norm[2] = 1.0\n    corners_norm[3, 0] = 1.0\n    corners_norm -= np.array([0.5, 0.5], dtype=boxes.dtype)\n    corners_norm = corners_norm.reshape(4, 2)\n    for i in 
range(num_boxes):\n        if valid_mask[i]:\n            for j in range(num_tests):\n                current_box[0, :] = boxes[i]\n                # current_radius = np.sqrt(boxes[i, 0]**2 + boxes[i, 1]**2)\n                # current_grot = np.arctan2(boxes[i, 0], boxes[i, 1])\n                # dst_grot = current_grot + global_rot_noises[i, j]\n                # dst_pos[0] = current_radius * np.sin(dst_grot)\n                # dst_pos[1] = current_radius * np.cos(dst_grot)\n                dst_pos[0] = boxes[i, 0] * np.cos(global_rot_noises[i, j]) + boxes[i, 1] * np.sin(global_rot_noises[i, j])\n                dst_pos[1] = -boxes[i, 0] * np.sin(global_rot_noises[i, j]) + boxes[i, 1] * np.cos(global_rot_noises[i, j])\n                current_box[0, :2] = dst_pos\n                # current_box[0, -1] += (dst_grot - current_grot)\n                current_box[0, -1] += global_rot_noises[i, j]\n\n                rot_sin = np.sin(current_box[0, -1])\n                rot_cos = np.cos(current_box[0, -1])\n                rot_mat_T[0, 0] = rot_cos\n                rot_mat_T[0, 1] = -rot_sin\n                rot_mat_T[1, 0] = rot_sin\n                rot_mat_T[1, 1] = rot_cos\n                current_corners[:] = current_box[\n                    0, 2:4] * corners_norm @ rot_mat_T + current_box[0, :2]\n                current_corners -= current_box[0, :2]\n                _rotation_box2d_jit_(current_corners, rot_noises[i, j],\n                                     rot_mat_T)\n                current_corners += current_box[0, :2] + loc_noises[i, j, :2]\n                coll_mat = box_collision_test(\n                    current_corners.reshape(1, 4, 2), box_corners)\n                coll_mat[0, i] = False\n                if not coll_mat.any():\n                    success_mask[i] = j\n                    box_corners[i] = current_corners\n                    loc_noises[i, j, :2] += (dst_pos - boxes[i, :2])\n                    # rot_noises[i, j] += (dst_grot - 
current_grot)\n                    rot_noises[i, j] += global_rot_noises[i, j]\n                    break\n    return success_mask\n\n\ndef _select_transform(transform, indices):\n    \"\"\"Select transform.\n\n    Args:\n        transform (np.ndarray): Transforms to select from.\n        indices (np.ndarray): Mask to indicate which transform to select.\n\n    Returns:\n        np.ndarray: Selected transforms.\n    \"\"\"\n    result = np.zeros((transform.shape[0], *transform.shape[2:]),\n                      dtype=transform.dtype)\n    for i in range(transform.shape[0]):\n        if indices[i] != -1:\n            result[i] = transform[i, indices[i]]\n    return result\n\n\n@numba.njit\ndef _rotation_matrix_3d_(rot_mat_T, angle, axis):\n    \"\"\"Get the 3D rotation matrix.\n\n    Args:\n        rot_mat_T (np.ndarray): Transposed rotation matrix.\n        angle (float): Rotation angle.\n        axis (int): Rotation axis.\n    \"\"\"\n    rot_sin = np.sin(angle)\n    rot_cos = np.cos(angle)\n    rot_mat_T[:] = np.eye(3)\n    if axis == 1:\n        rot_mat_T[0, 0] = rot_cos\n        rot_mat_T[0, 2] = -rot_sin\n        rot_mat_T[2, 0] = rot_sin\n        rot_mat_T[2, 2] = rot_cos\n    elif axis == 2 or axis == -1:\n        rot_mat_T[0, 0] = rot_cos\n        rot_mat_T[0, 1] = -rot_sin\n        rot_mat_T[1, 0] = rot_sin\n        rot_mat_T[1, 1] = rot_cos\n    elif axis == 0:\n        rot_mat_T[1, 1] = rot_cos\n        rot_mat_T[1, 2] = -rot_sin\n        rot_mat_T[2, 1] = rot_sin\n        rot_mat_T[2, 2] = rot_cos\n\n\n@numba.njit\ndef points_transform_(points, centers, point_masks, loc_transform,\n                      rot_transform, valid_mask):\n    \"\"\"Apply transforms to points and box centers.\n\n    Args:\n        points (np.ndarray): Input points.\n        centers (np.ndarray): Input box centers.\n        point_masks (np.ndarray): Mask to indicate which points need\n            to be transformed.\n        loc_transform (np.ndarray): Location transform to be 
applied.\n        rot_transform (np.ndarray): Rotation transform to be applied.\n        valid_mask (np.ndarray): Mask to indicate which boxes are valid.\n    \"\"\"\n    num_box = centers.shape[0]\n    num_points = points.shape[0]\n    rot_mat_T = np.zeros((num_box, 3, 3), dtype=points.dtype)\n    for i in range(num_box):\n        _rotation_matrix_3d_(rot_mat_T[i], rot_transform[i], 2)\n    for i in range(num_points):\n        for j in range(num_box):\n            if valid_mask[j]:\n                if point_masks[i, j] == 1:\n                    points[i, :3] -= centers[j, :3]\n                    points[i:i + 1, :3] = points[i:i + 1, :3] @ rot_mat_T[j]\n                    points[i, :3] += centers[j, :3]\n                    points[i, :3] += loc_transform[j]\n                    break  # only apply first box's transform\n\n\n@numba.njit\ndef box3d_transform_(boxes, loc_transform, rot_transform, valid_mask):\n    \"\"\"Transform 3D boxes.\n\n    Args:\n        boxes (np.ndarray): 3D boxes to be transformed.\n        loc_transform (np.ndarray): Location transform to be applied.\n        rot_transform (np.ndarray): Rotation transform to be applied.\n        valid_mask (np.ndarray | None): Mask to indicate which boxes are valid.\n    \"\"\"\n    num_box = boxes.shape[0]\n    for i in range(num_box):\n        if valid_mask[i]:\n            boxes[i, :3] += loc_transform[i]\n            boxes[i, 6] += rot_transform[i]\n\n"
  },
  {
    "path": "pcdet/datasets/augmentor/data_augmentor.py",
    "content": "from functools import partial\n\nimport numpy as np\n\nfrom ...utils import common_utils\nfrom . import augmentor_utils, database_sampler\n\n\nclass DataAugmentor(object):\n    def __init__(self, root_path, augmentor_configs, class_names, logger=None):\n        self.root_path = root_path\n        self.class_names = class_names\n        self.logger = logger\n\n        self.data_augmentor_queue = []\n        aug_config_list = augmentor_configs if isinstance(augmentor_configs, list) \\\n            else augmentor_configs.AUG_CONFIG_LIST\n\n        for cur_cfg in aug_config_list:\n            if not isinstance(augmentor_configs, list):\n                if cur_cfg.NAME in augmentor_configs.DISABLE_AUG_LIST:\n                    continue\n            cur_augmentor = getattr(self, cur_cfg.NAME)(config=cur_cfg)\n            self.data_augmentor_queue.append(cur_augmentor)\n\n    def gt_sampling(self, config=None):\n        db_sampler = database_sampler.DataBaseSampler(\n            root_path=self.root_path,\n            sampler_cfg=config,\n            class_names=self.class_names,\n            logger=self.logger,\n        )\n        return db_sampler\n\n    def da_sampling(self, config=None):\n        db_sampler = database_sampler.DADataBaseSampler(\n            root_path=self.root_path,\n            sampler_cfg=config,\n            class_names=self.class_names,\n            logger=self.logger,\n        )\n        return db_sampler\n\n    def __getstate__(self):\n        d = dict(self.__dict__)\n        del d['logger']\n        return d\n\n    def __setstate__(self, d):\n        self.__dict__.update(d)\n   \n\n\n    def random_world_rotation(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.random_world_rotation, config=config)\n        rot_range = config['WORLD_ROT_ANGLE']\n        if not isinstance(rot_range, list):\n            rot_range = [-rot_range, rot_range]\n\n        gt_boxes, points, param = 
augmentor_utils.global_rotation(\n            data_dict['gt_boxes'], data_dict['points'], rot_range=rot_range\n        )\n\n        data_dict['gt_boxes'] = gt_boxes\n        data_dict['points'] = points\n        aug_param=[param]\n        data_dict['aug_param'] = aug_param\n\n\n        return data_dict\n\n    def random_world_flip(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.random_world_flip, config=config)\n\n        gt_boxes, points = data_dict['gt_boxes'], data_dict['points']\n        for cur_axis in config['ALONG_AXIS_LIST']:\n            assert cur_axis in ['x', 'y']\n            gt_boxes, points, param = getattr(augmentor_utils, 'random_flip_along_%s' % cur_axis)(\n                gt_boxes, points,\n            )\n\n        data_dict['gt_boxes'] = gt_boxes\n        data_dict['points'] = points\n        if 'aug_param' in data_dict:\n            data_dict['aug_param'].append(int(param))\n        else:\n            data_dict['aug_param'] = [param]\n\n        return data_dict\n\n    def random_world_scaling(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.random_world_scaling, config=config)\n\n        gt_boxes, points, param = augmentor_utils.global_scaling(\n            data_dict['gt_boxes'], data_dict['points'], config['WORLD_SCALE_RANGE']\n        )\n        data_dict['gt_boxes'] = gt_boxes\n        data_dict['points'] = points\n        if 'aug_param' in data_dict:\n            data_dict['aug_param'].append(param)\n        else:\n            data_dict['aug_param'] = [param]\n\n        return data_dict\n\n\n    def random_local_noise(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.random_local_noise, config=config)\n        data_dict['gt_boxes'][:, 6] = -data_dict['gt_boxes'][:, 6]\n        augmentor_utils.noise_per_object_v3_(data_dict['gt_boxes'], data_dict['points'], None,\n                            
            data_dict.get('valid_noise', None),\n                                        config['LOCAL_ROT_RANGE'], config['TRANSLATION_STD'],\n                                        config['GLOBAL_ROT_RANGE'], config['EXTRA_WIDTH'])\n        data_dict['gt_boxes'][:, 6] = -data_dict['gt_boxes'][:, 6]\n        if 'valid_noise' in data_dict:\n            data_dict.pop('valid_noise')\n        return data_dict\n\n    def random_local_pyramid_aug(self, data_dict=None, config=None):\n        \"\"\"\n        Refer to the paper:\n            SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud\n        \"\"\"\n        if data_dict is None:\n            return partial(self.random_local_pyramid_aug, config=config)\n\n        gt_boxes, points = data_dict['gt_boxes'], data_dict['points']\n\n        gt_boxes, points, pyramids = augmentor_utils.local_pyramid_dropout(gt_boxes, points, config['DROP_PROB'])\n        gt_boxes, points, pyramids = augmentor_utils.local_pyramid_sparsify(gt_boxes, points,\n                                                                            config['SPARSIFY_PROB'],\n                                                                            config['SPARSIFY_MAX_NUM'],\n                                                                            pyramids)\n        gt_boxes, points = augmentor_utils.local_pyramid_swap(gt_boxes, points,\n                                                              config['SWAP_PROB'],\n                                                              config['SWAP_MAX_NUM'],\n                                                              pyramids)\n        data_dict['gt_boxes'] = gt_boxes\n        data_dict['points'] = points\n        return data_dict\n\n    def forward(self, data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                points: (N, 3 + C_in)\n                gt_boxes: optional, (N, 7) [x, y, z, dx, dy, dz, heading]\n                gt_names: optional, (N), string\n  
              ...\n        Returns:\n        \"\"\"\n        for cur_augmentor in self.data_augmentor_queue:\n            data_dict = cur_augmentor(data_dict=data_dict)\n\n        data_dict['gt_boxes'][:, 6] = common_utils.limit_period(\n            data_dict['gt_boxes'][:, 6], offset=0.5, period=2 * np.pi\n        )\n        if 'aug_param' in data_dict:\n            data_dict['aug_param'] = np.array(data_dict['aug_param'])\n        if 'calib' in data_dict:\n            data_dict.pop('calib')\n        if 'road_plane' in data_dict:\n            data_dict.pop('road_plane')\n\n        return data_dict\n"
  },
  {
    "path": "pcdet/datasets/augmentor/database_sampler.py",
    "content": "import pathlib\nimport pickle\n\nimport numpy as np\n\nfrom ...ops.iou3d_nms import iou3d_nms_utils\nfrom ...utils import box_utils\nimport time\nimport copy\nimport random\n\nclass DataBaseSampler(object):\n    def __init__(self, root_path, sampler_cfg, class_names, logger=None):\n        self.root_path = root_path\n        self.class_names = class_names\n        self.sampler_cfg = sampler_cfg\n\n        #self.gt_path = pathlib.Path(sampler_cfg.GT_PATH)\n        self.use_van = self.sampler_cfg.get('USE_VAN', None)\n        self.logger = logger\n        self.db_infos = {}\n        for class_name in class_names:\n            self.db_infos[class_name] = []\n        if self.use_van:\n            self.db_infos['Van'] = []\n\n        for db_info_path in sampler_cfg.DB_INFO_PATH:\n\n            db_info_path = self.root_path.resolve() / db_info_path\n            with open(str(db_info_path), 'rb') as f:\n                infos = pickle.load(f)\n                for cls in class_names:\n                    if cls in infos.keys():\n                        self.db_infos[cls].extend(infos[cls])\n                if self.use_van:\n                    if 'Van' in infos.keys():\n                        self.db_infos['Van'].extend(infos['Van'])\n\n        for func_name, val in sampler_cfg.PREPARE.items():\n            self.db_infos = getattr(self, func_name)(self.db_infos, val)\n\n        self.sample_groups = {}\n        self.sample_class_num = {}\n        self.limit_whole_scene = sampler_cfg.get('LIMIT_WHOLE_SCENE', False)\n        for x in sampler_cfg.SAMPLE_GROUPS:\n            class_name, sample_num = x.split(':')\n            if class_name not in class_names:\n                if not (self.use_van and class_name == 'Van'):\n                    continue\n            self.sample_class_num[class_name] = sample_num\n            self.sample_groups[class_name] = {\n                'sample_num': sample_num,\n                'pointer': len(self.db_infos[class_name]),\n    
            'indices': np.arange(len(self.db_infos[class_name]))\n            }\n\n    def __getstate__(self):\n        d = dict(self.__dict__)\n        del d['logger']\n        return d\n\n    def __setstate__(self, d):\n        self.__dict__.update(d)\n\n    def filter_by_difficulty(self, db_infos, removed_difficulty):\n        new_db_infos = {}\n        for key, dinfos in db_infos.items():\n            pre_len = len(dinfos)\n            this_infos = []\n            for info in dinfos:\n                if 'difficulty' in info:\n                    if info['difficulty'] not in removed_difficulty:\n                        this_infos.append(info)\n                else:\n                    this_infos.append(info)\n            new_db_infos[key] = this_infos\n            if self.logger is not None:\n                self.logger.info('Database filter by difficulty %s: %d => %d' % (key, pre_len, len(new_db_infos[key])))\n        return new_db_infos\n\n    def filter_by_min_points(self, db_infos, min_gt_points_list):\n        for name_num in min_gt_points_list:\n            name, min_num = name_num.split(':')\n            min_num = int(min_num)\n            if min_num > 0 and name in db_infos.keys():\n                filtered_infos = []\n                for info in db_infos[name]:\n                    if info['num_points_in_gt'] >= min_num:\n                        filtered_infos.append(info)\n\n                if self.logger is not None:\n                    self.logger.info('Database filter by min points %s: %d => %d' %\n                                     (name, len(db_infos[name]), len(filtered_infos)))\n                db_infos[name] = filtered_infos\n\n        return db_infos\n\n    def sample_with_fixed_number(self, class_name, sample_group):\n        \"\"\"\n        Args:\n            class_name:\n            sample_group:\n        Returns:\n\n        \"\"\"\n        sample_num, pointer, indices = int(sample_group['sample_num']), sample_group['pointer'], 
sample_group['indices']\n        if pointer >= len(self.db_infos[class_name]):\n            indices = np.random.permutation(len(self.db_infos[class_name]))\n            pointer = 0\n\n        sampled_dict = [self.db_infos[class_name][idx] for idx in indices[pointer: pointer + sample_num]]\n        pointer += sample_num\n        sample_group['pointer'] = pointer\n        sample_group['indices'] = indices\n        return sampled_dict\n\n    @staticmethod\n    def put_boxes_on_road_planes(gt_boxes, road_planes, calib):\n        \"\"\"\n        Only validate in KITTIDataset\n        Args:\n            gt_boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n            road_planes: [a, b, c, d]\n            calib:\n\n        Returns:\n        \"\"\"\n        a, b, c, d = road_planes\n        center_cam = calib.lidar_to_rect(gt_boxes[:, 0:3])\n        cur_height_cam = (-d - a * center_cam[:, 0] - c * center_cam[:, 2]) / b\n        center_cam[:, 1] = cur_height_cam\n        cur_lidar_height = calib.rect_to_lidar(center_cam)[:, 2]\n        mv_height = gt_boxes[:, 2] - gt_boxes[:, 5] / 2 - cur_lidar_height\n        gt_boxes[:, 2] -= mv_height  # lidar view\n        return gt_boxes, mv_height\n\n    def points_rigid_transform(self,cloud,pose):\n        if cloud.shape[0]==0:\n            return cloud\n        mat=np.ones(shape=(cloud.shape[0],4),dtype=np.float32)\n        pose_mat=np.mat(pose)\n        mat[:,0:3]=cloud[:,0:3]\n        mat=np.mat(mat)\n        transformed_mat=pose_mat*mat.T\n        T=np.array(transformed_mat.T,dtype=np.float32)\n        return T[:,0:3]\n\n    def get_registration_angle(self,mat):\n\n        cos_theta=mat[0,0]\n        sin_theta=mat[1,0]\n\n        if  cos_theta < -1:\n            cos_theta = -1\n        if cos_theta > 1:\n            cos_theta = 1\n\n        theta_cos = np.arccos(cos_theta)\n\n        if sin_theta >= 0:\n            return theta_cos\n        else:\n            return 2 * np.pi - theta_cos\n\n    def registration(self,pose, 
pre_pose, pre_obj_points, pre_box3d_lidar):\n\n        inv_pose_of_last_frame = np.linalg.inv(pose)\n        registration_mat = np.matmul(inv_pose_of_last_frame, pre_pose)\n\n        if len(pre_obj_points)!=0:\n            pre_obj_points[:, 0:3] = self.points_rigid_transform(pre_obj_points, registration_mat)[:,0:3]\n        angle = self.get_registration_angle(registration_mat)\n        pre_box3d_lidar[0:3] = self.points_rigid_transform(np.array([pre_box3d_lidar]), registration_mat)[0, 0:3]\n        pre_box3d_lidar[6]+=angle\n\n        return pre_obj_points, pre_box3d_lidar\n\n    def add_sampled_boxes_to_scene(self, data_dict, sampled_gt_boxes, total_valid_sampled_dict):\n\n        gt_boxes_mask = np.array([n in self.class_names for n in data_dict['gt_names']], dtype=np.bool_)\n        gt_boxes = data_dict['gt_boxes'][gt_boxes_mask]\n        gt_names = data_dict['gt_names'][gt_boxes_mask]\n        if 'gt_tracklets' in data_dict:\n            data_dict['gt_tracklets']=data_dict['gt_tracklets'][gt_boxes_mask]\n        points = data_dict['points']\n        if 'road_plane' in data_dict:\n            sampled_gt_boxes, mv_height = self.put_boxes_on_road_planes(\n                sampled_gt_boxes, data_dict['road_plane'], data_dict['calib']\n            )\n\n\n        obj_points_list = []\n\n        for idx, info in enumerate(total_valid_sampled_dict):\n\n            file_path = self.root_path / info['path']\n            #path = pathlib.Path(self.root_path)\n            #file_path = path / info['path']\n            obj_points = np.fromfile(str(file_path), dtype=np.float32).reshape(\n                [-1, self.sampler_cfg.NUM_POINT_FEATURES])\n\n            obj_points[:, :3] += info['box3d_lidar'][:3]\n\n            if 'road_plane' in data_dict:\n                # mv height\n                obj_points[:, 2] -= mv_height[idx]\n\n            obj_points_list.append(obj_points)\n\n        obj_points = np.concatenate(obj_points_list, axis=0)\n        sampled_gt_names = 
np.array([x['name'] for x in total_valid_sampled_dict])\n\n        if self.use_van:\n            sampled_gt_names = np.array(['Car' if sampled_gt_names[i]=='Van' else sampled_gt_names[i] for i in range(len(sampled_gt_names))])\n\n        large_sampled_gt_boxes = box_utils.enlarge_box3d(\n            sampled_gt_boxes[:, 0:7], extra_width=self.sampler_cfg.REMOVE_EXTRA_WIDTH\n        )\n        points = box_utils.remove_points_in_boxes3d(points, large_sampled_gt_boxes)\n        points = np.concatenate([obj_points[:, 0:points.shape[1]], points], axis=0)\n        gt_names = np.concatenate([gt_names, sampled_gt_names], axis=0)\n        gt_boxes = np.concatenate([gt_boxes, sampled_gt_boxes], axis=0)\n\n        valid_mask = np.ones((len(gt_names),), dtype=np.bool_)\n        valid_mask[:len(gt_names) - len(sampled_gt_names)] = 0\n\n        data_dict['valid_noise'] = valid_mask\n        data_dict['gt_boxes'] = gt_boxes\n        data_dict['gt_names'] = gt_names\n        data_dict['points'] = points\n        if 'road_plane' in data_dict:\n            data_dict.pop('road_plane')\n\n        return data_dict\n\n    def __call__(self, data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                gt_boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n\n        Returns:\n\n        \"\"\"\n        \n        gt_boxes = data_dict['gt_boxes']\n        gt_names = data_dict['gt_names'].astype(str)\n        existed_boxes = gt_boxes\n        total_valid_sampled_dict = []\n\n        for class_name, sample_group in self.sample_groups.items():\n            if self.limit_whole_scene:\n                num_gt = np.sum(class_name == gt_names)\n                sample_group['sample_num'] = str(int(self.sample_class_num[class_name]) - num_gt)\n            if int(sample_group['sample_num']) > 0:\n                sampled_dict = self.sample_with_fixed_number(class_name, sample_group)\n\n                sampled_boxes1 = np.stack([x['box3d_lidar'] for x in sampled_dict], 
 axis=0).astype(np.float32)\n\n                if self.sampler_cfg.get('DATABASE_WITH_FAKELIDAR', False):\n                    sampled_boxes1 = box_utils.boxes3d_kitti_fakelidar_to_lidar(sampled_boxes1)\n                sampled_boxes = copy.deepcopy(sampled_boxes1)\n                iou1 = iou3d_nms_utils.boxes_bev_iou_cpu(sampled_boxes[:, 0:7], existed_boxes[:, 0:7])\n                iou2 = iou3d_nms_utils.boxes_bev_iou_cpu(sampled_boxes[:, 0:7], sampled_boxes[:, 0:7])\n\n                iou2[range(sampled_boxes.shape[0]), range(sampled_boxes.shape[0])] = 0\n                iou1 = iou1 if iou1.shape[1] > 0 else iou2\n                valid_mask = ((iou1.max(axis=1) + iou2.max(axis=1)) == 0).nonzero()[0]\n                valid_sampled_dict = [sampled_dict[x] for x in valid_mask]\n                valid_sampled_boxes = sampled_boxes[valid_mask]\n\n                existed_boxes = np.concatenate((existed_boxes, valid_sampled_boxes), axis=0)\n                total_valid_sampled_dict.extend(valid_sampled_dict)\n\n        sampled_gt_boxes = existed_boxes[gt_boxes.shape[0]:, :]\n        if total_valid_sampled_dict.__len__() > 0:\n\n            data_dict = self.add_sampled_boxes_to_scene(data_dict, sampled_gt_boxes, total_valid_sampled_dict)\n\n\n        return data_dict\n\n\nclass DADataBaseSampler(object):\n    def __init__(self, root_path, sampler_cfg, class_names, logger=None):\n        self.root_path = root_path\n        self.class_names = class_names\n        self.sampler_cfg = sampler_cfg\n\n        # self.gt_path = pathlib.Path(sampler_cfg.GT_PATH)\n        self.use_van = self.sampler_cfg.get('USE_VAN', None)\n        self.min_sampling_dis = sampler_cfg.MIN_SAMPLING_DIS\n        # Fall back to the minimum distance when MAX_SAMPLING_DIS is not configured.\n        self.max_sampling_dis = sampler_cfg.get('MAX_SAMPLING_DIS', sampler_cfg.MIN_SAMPLING_DIS)\n        self.occlusion_noise = sampler_cfg.OCCLUSION_NOISE\n        self.occlusion_offset = sampler_cfg.OCCLUSION_OFFSET\n        self.sampling_method = sampler_cfg.SAMPLING_METHOD\n        self.vert_res = sampler_cfg.VERT_RES\n        
self.hor_res = sampler_cfg.HOR_RES\n\n        self.logger = logger\n        self.db_infos = {}\n        for class_name in class_names:\n            self.db_infos[class_name] = []\n        if self.use_van:\n            self.db_infos['Van'] = []\n\n        for db_info_path in sampler_cfg.DB_INFO_PATH:\n\n            db_info_path = self.root_path.resolve() / db_info_path\n            with open(str(db_info_path), 'rb') as f:\n                infos = pickle.load(f)\n                for cls in class_names:\n                    if cls in infos.keys():\n                        self.db_infos[cls].extend(infos[cls])\n                # [self.db_infos[cur_class].extend(infos[cur_class]) for cur_class in class_names]\n                if self.use_van:\n                    if 'Van' in infos.keys():\n                        self.db_infos['Van'].extend(infos['Van'])\n\n        for func_name, val in sampler_cfg.PREPARE.items():\n            self.db_infos = getattr(self, func_name)(self.db_infos, val)\n\n        self.sample_groups = {}\n        self.sample_class_num = {}\n        self.limit_whole_scene = sampler_cfg.get('LIMIT_WHOLE_SCENE', False)\n        for x in sampler_cfg.SAMPLE_GROUPS:\n            class_name, sample_num = x.split(':')\n            if class_name not in class_names:\n                if not (self.use_van and class_name == 'Van'):\n                    continue\n            self.sample_class_num[class_name] = sample_num\n            self.sample_groups[class_name] = {\n                'sample_num': sample_num,\n                'pointer': len(self.db_infos[class_name]),\n                'indices': np.arange(len(self.db_infos[class_name]))\n            }\n\n    def __getstate__(self):\n        d = dict(self.__dict__)\n        del d['logger']\n        return d\n\n    def __setstate__(self, d):\n        self.__dict__.update(d)\n\n    def to_sphere_coords(self, points):\n        r = np.linalg.norm(points[:, 0:3], ord=2, axis=-1)\n        theta = np.arccos(points[:, 2] / 
r)\n        fan = np.arctan(points[:, 1] / points[:, 0])\n\n        new_points = copy.deepcopy(points)\n        new_points[:, 0] = r\n        new_points[:, 1] = theta\n        new_points[:, 2] = fan\n\n        return new_points\n\n    def la_sampling(self, points, vert_res=0.006, hor_res=0.003):\n        new_points = copy.deepcopy(points)\n\n        sp_coords = self.to_sphere_coords(new_points)\n\n        voxel_dict = {}\n\n        for i, point in enumerate(sp_coords):\n\n            vert_coord = point[1] // vert_res\n            hor_coord = point[2] // hor_res\n\n            voxel_key = str(vert_coord) + '_' + str(hor_coord)\n\n            if voxel_key in voxel_dict:\n\n                voxel_dict[voxel_key]['sp'].append(point)\n                voxel_dict[voxel_key]['pts'].append(new_points[i])\n            else:\n                voxel_dict[voxel_key] = {'sp': [point], 'pts': [new_points[i]]}\n\n        sampled_list = []\n\n        for voxel_key in voxel_dict:\n            sp = voxel_dict[voxel_key]['sp']\n            arg_min = np.argmin(np.array(sp)[:, 1])\n            min_point = voxel_dict[voxel_key]['pts'][arg_min]\n            sampled_list.append(min_point)\n        new_points = np.array(sampled_list)\n        if len(new_points) < 5:\n            return points\n        else:\n            return new_points\n\n    def random_sampling(self, points, box, dis):\n        new_points = copy.deepcopy(points)\n        new_box = copy.deepcopy(box)\n        x_off = dis\n        y_off = 0  # np.random.randn()*10\n\n        new_points[:, 0] -= new_box[0]\n        new_points[:, 1] -= new_box[1]\n\n        new_box[0] = x_off\n        new_box[1] = y_off\n\n        new_points[:, 0] += new_box[0]\n        new_points[:, 1] += new_box[1]\n        nn = random.choices(new_points.tolist(), k=int((1 - dis / 100) ** 3 * 300))\n        return np.array(nn), new_box\n\n    def random_drop_out(self, points, rand_noise=0.2, offset=0.3):\n\n        rand = np.random.choice([0, 1, 2, 3])\n     
   new_points = []\n        for i, p in enumerate(points):\n            if rand == 0 and p[1] + np.random.randn() * rand_noise < offset:\n                new_points.append(points[i])\n            if rand == 1 and p[1] + np.random.randn() * rand_noise >= -offset:\n                new_points.append(points[i])\n            if rand == 2 and p[2] + np.random.randn() * rand_noise < offset:\n                new_points.append(points[i])\n            if rand == 3 and p[2] + np.random.randn() * rand_noise >= -offset:\n                new_points.append(points[i])\n\n        new_points = np.array(new_points)\n        if len(new_points) < 5:\n            return self.random_drop_out(points, rand_noise, offset)\n\n        return new_points\n\n    def filter_by_difficulty(self, db_infos, removed_difficulty):\n        new_db_infos = {}\n        for key, dinfos in db_infos.items():\n            pre_len = len(dinfos)\n            this_infos = []\n            for info in dinfos:\n                if 'difficulty' in info:\n                    if info['difficulty'] not in removed_difficulty:\n                        this_infos.append(info)\n                else:\n                    this_infos.append(info)\n            new_db_infos[key] = this_infos\n            if self.logger is not None:\n                self.logger.info('Database filter by difficulty %s: %d => %d' % (key, pre_len, len(new_db_infos[key])))\n        return new_db_infos\n\n    def filter_by_min_points(self, db_infos, min_gt_points_list):\n        for name_num in min_gt_points_list:\n            name, min_num = name_num.split(':')\n            min_num = int(min_num)\n            if min_num > 0 and name in db_infos.keys():\n                filtered_infos = []\n                for info in db_infos[name]:\n                    if info['num_points_in_gt'] >= min_num:\n                        filtered_infos.append(info)\n\n                if self.logger is not None:\n                    self.logger.info('Database filter by min 
points %s: %d => %d' %\n                                     (name, len(db_infos[name]), len(filtered_infos)))\n                db_infos[name] = filtered_infos\n\n        return db_infos\n\n    def sample_with_fixed_number(self, class_name, sample_group):\n        \"\"\"\n        Args:\n            class_name:\n            sample_group:\n        Returns:\n\n        \"\"\"\n        sample_num, pointer, indices = int(sample_group['sample_num']), sample_group['pointer'], sample_group['indices']\n        if pointer >= len(self.db_infos[class_name]):\n            indices = np.random.permutation(len(self.db_infos[class_name]))\n            pointer = 0\n\n        sampled_dict = [self.db_infos[class_name][idx] for idx in indices[pointer: pointer + sample_num]]\n        pointer += sample_num\n        sample_group['pointer'] = pointer\n        sample_group['indices'] = indices\n        return sampled_dict\n\n    @staticmethod\n    def put_boxes_on_road_planes(gt_boxes, road_planes, calib):\n        \"\"\"\n        Only validate in KITTIDataset\n        Args:\n            gt_boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n            road_planes: [a, b, c, d]\n            calib:\n\n        Returns:\n        \"\"\"\n        a, b, c, d = road_planes\n        center_cam = calib.lidar_to_rect(gt_boxes[:, 0:3])\n        cur_height_cam = (-d - a * center_cam[:, 0] - c * center_cam[:, 2]) / b\n        center_cam[:, 1] = cur_height_cam\n        cur_lidar_height = calib.rect_to_lidar(center_cam)[:, 2]\n        mv_height = gt_boxes[:, 2] - gt_boxes[:, 5] / 2 - cur_lidar_height\n        gt_boxes[:, 2] -= mv_height  # lidar view\n        return gt_boxes, mv_height\n\n    def points_rigid_transform(self, cloud, pose):\n        if cloud.shape[0] == 0:\n            return cloud\n        mat = np.ones(shape=(cloud.shape[0], 4), dtype=np.float32)\n        pose_mat = np.mat(pose)\n        mat[:, 0:3] = cloud[:, 0:3]\n        mat = np.mat(mat)\n        transformed_mat = pose_mat * 
mat.T\n        T = np.array(transformed_mat.T, dtype=np.float32)\n        return T[:, 0:3]\n\n    def get_registration_angle(self, mat):\n\n        cos_theta = mat[0, 0]\n        sin_theta = mat[1, 0]\n\n        if cos_theta < -1:\n            cos_theta = -1\n        if cos_theta > 1:\n            cos_theta = 1\n\n        theta_cos = np.arccos(cos_theta)\n\n        if sin_theta >= 0:\n            return theta_cos\n        else:\n            return 2 * np.pi - theta_cos\n\n    def registration(self, pose, pre_pose, pre_obj_points, pre_box3d_lidar):\n\n        inv_pose_of_last_frame = np.linalg.inv(pose)\n        registration_mat = np.matmul(inv_pose_of_last_frame, pre_pose)\n\n        if len(pre_obj_points) != 0:\n            pre_obj_points[:, 0:3] = self.points_rigid_transform(pre_obj_points, registration_mat)[:, 0:3]\n        angle = self.get_registration_angle(registration_mat)\n        pre_box3d_lidar[0:3] = self.points_rigid_transform(np.array([pre_box3d_lidar]), registration_mat)[0, 0:3]\n        pre_box3d_lidar[6] += angle\n\n        return pre_obj_points, pre_box3d_lidar\n\n\n    def add_sampled_boxes_to_scene(self, data_dict, sampled_gt_boxes, total_valid_sampled_dict):\n\n        gt_boxes_mask = np.array([n in self.class_names for n in data_dict['gt_names']], dtype=np.bool_)\n        gt_boxes = data_dict['gt_boxes'][gt_boxes_mask]\n        gt_names = data_dict['gt_names'][gt_boxes_mask]\n        if 'gt_tracklets' in data_dict:\n            data_dict['gt_tracklets'] = data_dict['gt_tracklets'][gt_boxes_mask]\n        points = data_dict['points']\n        if 'road_plane' in data_dict:\n            sampled_gt_boxes, mv_height = self.put_boxes_on_road_planes(\n                sampled_gt_boxes, data_dict['road_plane'], data_dict['calib']\n            )\n\n        obj_points_list = []\n\n\n        for idx, info in enumerate(total_valid_sampled_dict):\n\n            file_path = self.root_path / info['path']\n            # path = pathlib.Path(self.root_path)\n     
       # file_path = path / info['path']\n            obj_points = np.fromfile(str(file_path), dtype=np.float32).reshape(\n                [-1, self.sampler_cfg.NUM_POINT_FEATURES])\n\n            obj_points[:, :3] += sampled_gt_boxes[idx][:3]\n            '''\n            if self.sampler_cfg.get('USE_ROAD_PLANE', False):\n                # mv height\n                obj_points[:, 2] -= mv_height[idx]\n            '''\n            if self.sampling_method == 'LiDAR-aware':\n\n                obj_points = self.la_sampling(obj_points,\n                                          vert_res=self.vert_res,\n                                          hor_res=self.hor_res)\n                obj_points[:, 0:3] -= sampled_gt_boxes[idx][:3]\n\n                obj_points = self.random_drop_out(obj_points, rand_noise=self.occlusion_noise, offset=self.occlusion_offset)\n\n                obj_points[:, 0:3] += sampled_gt_boxes[idx][:3]\n\n\n            obj_points_list.append(obj_points)\n\n        obj_points = np.concatenate(obj_points_list, axis=0)\n        sampled_gt_names = np.array([x['name'] for x in total_valid_sampled_dict])\n\n        large_sampled_gt_boxes = box_utils.enlarge_box3d(\n            sampled_gt_boxes[:, 0:7], extra_width=self.sampler_cfg.REMOVE_EXTRA_WIDTH\n        )\n        points = box_utils.remove_points_in_boxes3d(points, large_sampled_gt_boxes)\n        points = np.concatenate([obj_points[:, 0:points.shape[1]], points], axis=0)\n\n        if self.use_van:\n            sampled_gt_names = np.array(\n                ['Car' if sampled_gt_names[i] == 'Van' else sampled_gt_names[i] for i in range(len(sampled_gt_names))])\n\n        gt_names = np.concatenate([gt_names, sampled_gt_names], axis=0)\n        gt_boxes = np.concatenate([gt_boxes, sampled_gt_boxes], axis=0)\n\n        valid_mask = np.ones((len(gt_names),), dtype=np.bool_)\n        if 'valid_noise' in data_dict:\n            valid_mask[:len(gt_names) - len(sampled_gt_names)] = data_dict['valid_noise'][:]\n 
       else:\n            valid_mask[:len(gt_names) - len(sampled_gt_names)] = 0\n        data_dict['valid_noise'] = valid_mask\n\n        data_dict['gt_boxes'] = gt_boxes\n        data_dict['gt_names'] = gt_names\n        data_dict['points'] = points\n        if 'road_plane' in data_dict:\n            data_dict.pop('road_plane')\n\n        return data_dict\n\n    def __call__(self, data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                gt_boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n\n        Returns:\n\n        \"\"\"\n\n        gt_boxes = data_dict['gt_boxes']\n        gt_names = data_dict['gt_names'].astype(str)\n        existed_boxes = gt_boxes\n        total_valid_sampled_dict = []\n\n        for class_name, sample_group in self.sample_groups.items():\n            if self.limit_whole_scene:\n                num_gt = np.sum(class_name == gt_names)\n                sample_group['sample_num'] = str(int(self.sample_class_num[class_name]) - num_gt)\n            if int(sample_group['sample_num']) > 0:\n                sampled_dict = self.sample_with_fixed_number(class_name, sample_group)\n\n                sampled_boxes1 = np.stack([x['box3d_lidar'] for x in sampled_dict], axis=0).astype(np.float32)\n\n                if self.sampler_cfg.get('DATABASE_WITH_FAKELIDAR', False):\n                    sampled_boxes1 = box_utils.boxes3d_kitti_fakelidar_to_lidar(sampled_boxes1)\n                sampled_boxes = copy.deepcopy(sampled_boxes1)\n                sampled_boxes[:, 0] += np.random.random()*(self.max_sampling_dis-self.min_sampling_dis) + self.min_sampling_dis\n\n                iou1 = iou3d_nms_utils.boxes_bev_iou_cpu(sampled_boxes[:, 0:7], existed_boxes[:, 0:7])\n                iou2 = iou3d_nms_utils.boxes_bev_iou_cpu(sampled_boxes[:, 0:7], sampled_boxes[:, 0:7])\n\n                iou2[range(sampled_boxes.shape[0]), range(sampled_boxes.shape[0])] = 0\n                iou1 = iou1 if iou1.shape[1] > 0 else iou2\n              
  valid_mask = ((iou1.max(axis=1) + iou2.max(axis=1)) == 0).nonzero()[0]\n                valid_sampled_dict = [sampled_dict[x] for x in valid_mask]\n                valid_sampled_boxes = sampled_boxes[valid_mask]\n                existed_boxes = np.concatenate((existed_boxes, valid_sampled_boxes), axis=0)\n                total_valid_sampled_dict.extend(valid_sampled_dict)\n\n        sampled_gt_boxes = existed_boxes[gt_boxes.shape[0]:, :]\n\n        if total_valid_sampled_dict.__len__() > 0:\n\n            data_dict = self.add_sampled_boxes_to_scene(data_dict, sampled_gt_boxes, total_valid_sampled_dict)\n\n        return data_dict\n"
  },
  {
    "path": "pcdet/datasets/dataset.py",
    "content": "from collections import defaultdict\nfrom pathlib import Path\nimport torch\nimport numpy as np\nimport torch.utils.data as torch_data\nimport os\nfrom ..utils import common_utils\nfrom .augmentor.data_augmentor import DataAugmentor\nfrom .augmentor.X_transform import X_TRANS\nfrom .processor.data_processor import DataProcessor\nfrom .processor.point_feature_encoder import PointFeatureEncoder\nimport copy\nimport time\n\nclass DatasetTemplate(torch_data.Dataset):\n    def __init__(self, dataset_cfg=None, class_names=None, training=True, is_source=True, root_path=None, logger=None,\n                 da_train=False):\n        super().__init__()\n        self.test_flip = False\n        self.dataset_cfg = dataset_cfg\n        self.training = training\n        self.is_source = is_source\n        self.da_train = da_train\n        self.class_names = class_names\n        self.logger = logger\n        self.root_path = root_path if root_path is not None else Path(self.dataset_cfg.DATA_PATH)\n        if self.dataset_cfg is None or class_names is None:\n            return\n\n        self.rot_num = self.dataset_cfg.get('ROT_NUM', 1)\n\n\n        self.point_cloud_range = np.array(self.dataset_cfg.POINT_CLOUD_RANGE, dtype=np.float32)\n\n        self.point_feature_encoder = PointFeatureEncoder(\n            self.dataset_cfg.POINT_FEATURE_ENCODING,\n            point_cloud_range=self.point_cloud_range,\n            rot_num=self.rot_num\n        )\n\n        self.data_augmentor = DataAugmentor(\n            self.root_path, self.dataset_cfg.DATA_AUGMENTOR, self.class_names, logger=self.logger,\n        ) if self.training else None\n\n\n        self.data_processor = DataProcessor(\n            self.dataset_cfg.DATA_PROCESSOR, point_cloud_range=self.point_cloud_range, training=self.training,\n            rot_num=self.rot_num, num_point_features=self.point_feature_encoder.num_point_features\n        )\n\n        x_trans_cfg = self.dataset_cfg.get('X_TRANS', None)\n       
 if x_trans_cfg is not None:\n            self.x_trans = X_TRANS(x_trans_cfg, rot_num=self.rot_num)\n        else:\n            raise NotImplementedError\n\n\n        self.grid_size = self.data_processor.grid_size\n        self.voxel_size = self.data_processor.voxel_size\n        self.total_epochs = 0\n        self._merge_all_iters_to_one_epoch = False\n        self.iter =0\n\n    @property\n    def mode(self):\n        return 'train' if self.training else 'test'\n\n    def __getstate__(self):\n        d = dict(self.__dict__)\n        del d['logger']\n        return d\n\n    def __setstate__(self, d):\n        self.__dict__.update(d)\n\n    @staticmethod\n    def generate_prediction_dicts(batch_dict, pred_dicts, class_names, output_path=None):\n        \"\"\"\n        To support a custom dataset, implement this function to receive the predicted results from the model, and then\n        transform the unified normative coordinate to your required coordinate, and optionally save them to disk.\n\n        Args:\n            batch_dict: dict of original data from the dataloader\n            pred_dicts: dict of predicted results from the model\n                pred_boxes: (N, 7), Tensor\n                pred_scores: (N), Tensor\n                pred_labels: (N), Tensor\n            class_names:\n            output_path: if it is not None, save the results to this path\n        Returns:\n\n        \"\"\"\n\n    def merge_all_iters_to_one_epoch(self, merge=True, epochs=None):\n        if merge:\n            self._merge_all_iters_to_one_epoch = True\n            self.total_epochs = epochs\n        else:\n            self._merge_all_iters_to_one_epoch = False\n\n    def __len__(self):\n        raise NotImplementedError\n\n    def __getitem__(self, index):\n        \"\"\"\n        To support a custom dataset, implement this function to load the raw data (and labels), then transform them to\n        the unified normative coordinate and call the function self.prepare_data() to 
process the data and send them\n        to the model.\n\n        Args:\n            index:\n\n        Returns:\n\n        \"\"\"\n        raise NotImplementedError\n\n    def prepare_data(self, data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                points: (N, 3 + C_in)\n                gt_boxes: optional, (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n                gt_names: optional, (N), string\n                ...\n\n        Returns:\n            data_dict:\n                frame_id: string\n                points: (N, 3 + C_in)\n                gt_boxes: optional, (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n                gt_names: optional, (N), string\n                use_lead_xyz: bool\n                voxels: optional (num_voxels, max_points_per_voxel, 3 + C)\n                voxel_coords: optional (num_voxels, 3)\n                voxel_num_points: optional (num_voxels)\n                ...\n        \"\"\"\n        if self.training:\n            assert 'gt_boxes' in data_dict, 'gt_boxes should be provided for training'\n\n            data_dict = self.data_augmentor.forward(\n                data_dict={\n                    **data_dict,\n                }\n            )\n            if 'road_plane' in data_dict:\n                data_dict.pop('road_plane')\n\n            if self.rot_num>1:\n                data_dict = self.x_trans.input_transform(\n                    data_dict={\n                        **data_dict,\n                    },trans_boxes=True\n                )\n\n        else:\n\n            data_dict = self.x_trans.input_transform(\n                data_dict={\n                    **data_dict,\n                }\n            )\n\n        if data_dict.get('gt_boxes', None) is not None:\n            selected = common_utils.keep_arrays_by_name(data_dict['gt_names'], self.class_names)\n            data_dict['gt_names'] = data_dict['gt_names'][selected]\n            for i in range(self.rot_num):\n              
  if i == 0:\n                    rot_num_id = ''\n                else:\n                    rot_num_id = str(i)\n                if 'gt_boxes'+rot_num_id in data_dict:\n                    data_dict['gt_boxes'+rot_num_id] = data_dict['gt_boxes'+rot_num_id][selected]\n                    gt_classes = np.array([self.class_names.index(n) + 1 for n in data_dict['gt_names']], dtype=np.int32)\n                    gt_boxes = np.concatenate((data_dict['gt_boxes'+rot_num_id], gt_classes.reshape(-1, 1).astype(np.float32)), axis=1)\n                    data_dict['gt_boxes'+rot_num_id] = gt_boxes\n\n        for i in range(self.rot_num):\n            if i ==0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n            if 'mm' in data_dict:\n                data_dict['points_mm'+rot_num_id] = data_dict['points'+rot_num_id][data_dict['points'+rot_num_id][:, -1] == 1]\n                data_dict['points'+rot_num_id] = data_dict['points'+rot_num_id][data_dict['points'+rot_num_id][:, -1] == 2]\n\n        data_dict = self.point_feature_encoder.forward(data_dict)\n        self.iter+=1\n        data_dict = self.data_processor.forward(\n            data_dict=data_dict\n        )\n\n        if self.training and len(data_dict['gt_boxes']) == 0:\n            new_index = np.random.randint(self.__len__())\n            return self.__getitem__(new_index)\n\n        data_dict.pop('gt_names', None)\n        if 'valid_noise' in data_dict:\n            data_dict.pop('valid_noise')\n        return data_dict\n\n    def collate_batch(self, batch_list, _unused=False):\n\n        data_dict = defaultdict(list)\n        for cur_sample in batch_list:\n            for key, val in cur_sample.items():\n                data_dict[key].append(val)\n        batch_size = len(batch_list)\n        ret = {}\n\n        point_key_dict=['points', 'voxel_coords', 'points_mm', 'voxel_coords_mm']\n        for i in range(1, 10):\n\n            
point_key_dict.append('points'+str(i))\n            point_key_dict.append('voxel_coords'+str(i))\n            point_key_dict.append('points_mm'+str(i))\n            point_key_dict.append('voxel_coords_mm'+str(i))\n\n        voxel_key_dict=['voxels', 'voxel_num_points', 'voxels_mm', 'voxel_num_points_mm']\n        for i in range(1, 10):\n            voxel_key_dict.append('voxels'+str(i))\n            voxel_key_dict.append('voxel_num_points' + str(i))\n            voxel_key_dict.append('voxels_mm'+str(i))\n            voxel_key_dict.append('voxel_num_points_mm' + str(i))\n\n        boxes_key = ['gt_boxes']\n        for i in range(1, 10):\n            boxes_key.append('gt_boxes'+str(i))\n\n\n        for key, val in data_dict.items():\n            try:\n                if key in voxel_key_dict:\n                    ret[key] = np.concatenate(val, axis=0)\n                elif key in point_key_dict:\n                    coors = []\n                    for i, coor in enumerate(val):\n                        coor_pad = np.pad(coor, ((0, 0), (1, 0)), mode='constant', constant_values=i)\n                        coors.append(coor_pad)\n                    ret[key] = np.concatenate(coors, axis=0)\n                elif key in boxes_key:\n                    max_gt = max([len(x) for x in val])\n                    batch_gt_boxes3d = np.zeros((batch_size, max_gt, val[0].shape[-1]), dtype=np.float32)\n                    for k in range(batch_size):\n                        batch_gt_boxes3d[k, :val[k].__len__(), :] = val[k]\n                    ret[key] = batch_gt_boxes3d\n                else:\n                    ret[key] = np.stack(val, axis=0)\n            except Exception as e:\n                # Keep the original cause attached so the failing key can be traced.\n                print('Error in collate_batch: key=%s' % key)\n                raise TypeError('Error in collate_batch: key=%s' % key) from e\n\n        ret['batch_size'] = batch_size\n        return ret\n"
  },
  {
    "path": "pcdet/datasets/kitti/kitti_dataset.py",
    "content": "import copy\nimport pickle\nfrom pathlib import Path\n\nimport numpy as np\nfrom skimage import io\n\nfrom pcdet.ops.roiaware_pool3d import roiaware_pool3d_utils\nfrom pcdet.utils import box_utils, calibration_kitti, common_utils, object3d_kitti\nfrom pcdet.datasets.dataset import DatasetTemplate\nfrom pcdet.models.model_utils import model_nms_utils\n\nclass KittiDataset(DatasetTemplate):\n    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None):\n        \"\"\"\n        Args:\n            root_path:\n            dataset_cfg:\n            class_names:\n            training:\n            logger:\n        \"\"\"\n        super().__init__(\n            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger\n        )\n        self.split = self.dataset_cfg.DATA_SPLIT[self.mode]\n        self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')\n\n        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')\n        self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None\n\n        self.kitti_infos = []\n        self.include_kitti_data(self.mode)\n\n    def include_kitti_data(self, mode):\n        if self.logger is not None:\n            self.logger.info('Loading KITTI dataset')\n        kitti_infos = []\n\n        for info_path in self.dataset_cfg.INFO_PATH[mode]:\n            info_path = self.root_path / info_path\n            if not info_path.exists():\n                continue\n            with open(info_path, 'rb') as f:\n                infos = pickle.load(f)\n                kitti_infos.extend(infos)\n\n        self.kitti_infos.extend(kitti_infos)\n\n        if self.logger is not None:\n            self.logger.info('Total samples for KITTI dataset: %d' % (len(kitti_infos)))\n\n    def set_split(self, split):\n        super().__init__(\n            dataset_cfg=self.dataset_cfg, 
class_names=self.class_names, training=self.training, root_path=self.root_path, logger=self.logger\n        )\n        self.split = split\n        self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')\n\n        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')\n        self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None\n\n    def get_lidar(self, idx):\n        lidar_file = self.root_split_path / 'velodyne' / ('%s.bin' % idx)\n        assert lidar_file.exists()\n        return np.fromfile(str(lidar_file), dtype=np.float32).reshape(-1, 4)\n\n    def get_image_shape(self, idx):\n        img_file = self.root_split_path / 'image_2' / ('%s.png' % idx)\n        assert img_file.exists()\n        return np.array(io.imread(img_file).shape[:2], dtype=np.int32)\n\n    def get_label(self, idx):\n        label_file = self.root_split_path / 'label_2' / ('%s.txt' % idx)\n        assert label_file.exists()\n        return object3d_kitti.get_objects_from_label(label_file)\n\n    def get_calib(self, idx):\n        calib_file = self.root_split_path / 'calib' / ('%s.txt' % idx)\n        assert calib_file.exists()\n        return calibration_kitti.Calibration(calib_file)\n\n    def get_road_plane(self, idx):\n        plane_file = self.root_split_path / 'planes' / ('%s.txt' % idx)\n        if not plane_file.exists():\n            return None\n\n        with open(plane_file, 'r') as f:\n            lines = f.readlines()\n        lines = [float(i) for i in lines[3].split()]\n        plane = np.asarray(lines)\n\n        # Ensure normal is always facing up, this is in the rectified camera coordinate\n        if plane[1] > 0:\n            plane = -plane\n\n        norm = np.linalg.norm(plane[0:3])\n        plane = plane / norm\n        return plane\n\n    @staticmethod\n    def get_fov_flag(pts_rect, img_shape, calib):\n        \"\"\"\n        Args:\n            pts_rect:\n            
img_shape:\n            calib:\n\n        Returns:\n\n        \"\"\"\n        pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)\n        val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])\n        val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])\n        val_flag_merge = np.logical_and(val_flag_1, val_flag_2)\n        pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)\n\n        return pts_valid_flag\n\n    def get_infos(self, num_workers=4, has_label=True, count_inside_pts=True, sample_id_list=None):\n        import concurrent.futures as futures\n\n        def process_single_scene(sample_idx):\n            print('%s sample_idx: %s' % (self.split, sample_idx))\n            info = {}\n            pc_info = {'num_features': 4, 'lidar_idx': sample_idx}\n            info['point_cloud'] = pc_info\n\n            image_info = {'image_idx': sample_idx, 'image_shape': self.get_image_shape(sample_idx)}\n            info['image'] = image_info\n            calib = self.get_calib(sample_idx)\n\n            P2 = np.concatenate([calib.P2, np.array([[0., 0., 0., 1.]])], axis=0)\n            R0_4x4 = np.zeros([4, 4], dtype=calib.R0.dtype)\n            R0_4x4[3, 3] = 1.\n            R0_4x4[:3, :3] = calib.R0\n            V2C_4x4 = np.concatenate([calib.V2C, np.array([[0., 0., 0., 1.]])], axis=0)\n            calib_info = {'P2': P2, 'R0_rect': R0_4x4, 'Tr_velo_to_cam': V2C_4x4}\n\n            info['calib'] = calib_info\n\n            if has_label:\n                obj_list = self.get_label(sample_idx)\n                annotations = {}\n                annotations['name'] = np.array([obj.cls_type for obj in obj_list])\n                annotations['truncated'] = np.array([obj.truncation for obj in obj_list])\n                annotations['occluded'] = np.array([obj.occlusion for obj in obj_list])\n                annotations['alpha'] = np.array([obj.alpha for obj in obj_list])\n                
annotations['bbox'] = np.concatenate([obj.box2d.reshape(1, 4) for obj in obj_list], axis=0)\n                annotations['dimensions'] = np.array([[obj.l, obj.h, obj.w] for obj in obj_list])  # lhw(camera) format\n                annotations['location'] = np.concatenate([obj.loc.reshape(1, 3) for obj in obj_list], axis=0)\n                annotations['rotation_y'] = np.array([obj.ry for obj in obj_list])\n                annotations['score'] = np.array([obj.score for obj in obj_list])\n                annotations['difficulty'] = np.array([obj.level for obj in obj_list], np.int32)\n\n                num_objects = len([obj.cls_type for obj in obj_list if obj.cls_type != 'DontCare'])\n                num_gt = len(annotations['name'])\n                index = list(range(num_objects)) + [-1] * (num_gt - num_objects)\n                annotations['index'] = np.array(index, dtype=np.int32)\n\n                loc = annotations['location'][:num_objects]\n                dims = annotations['dimensions'][:num_objects]\n                rots = annotations['rotation_y'][:num_objects]\n                loc_lidar = calib.rect_to_lidar(loc)\n                l, h, w = dims[:, 0:1], dims[:, 1:2], dims[:, 2:3]\n                loc_lidar[:, 2] += h[:, 0] / 2\n                gt_boxes_lidar = np.concatenate([loc_lidar, l, w, h, -(np.pi / 2 + rots[..., np.newaxis])], axis=1)\n                annotations['gt_boxes_lidar'] = gt_boxes_lidar\n\n                info['annos'] = annotations\n\n                if count_inside_pts:\n                    points = self.get_lidar(sample_idx)\n                    calib = self.get_calib(sample_idx)\n                    pts_rect = calib.lidar_to_rect(points[:, 0:3])\n\n                    fov_flag = self.get_fov_flag(pts_rect, info['image']['image_shape'], calib)\n                    pts_fov = points[fov_flag]\n                    corners_lidar = box_utils.boxes_to_corners_3d(gt_boxes_lidar)\n                    num_points_in_gt = -np.ones(num_gt, 
dtype=np.int32)\n\n                    for k in range(num_objects):\n                        flag = box_utils.in_hull(pts_fov[:, 0:3], corners_lidar[k])\n                        num_points_in_gt[k] = flag.sum()\n                    annotations['num_points_in_gt'] = num_points_in_gt\n\n            return info\n\n        sample_id_list = sample_id_list if sample_id_list is not None else self.sample_id_list\n        with futures.ThreadPoolExecutor(num_workers) as executor:\n            infos = executor.map(process_single_scene, sample_id_list)\n        return list(infos)\n\n    def create_groundtruth_database(self, info_path=None, used_classes=None, split='train'):\n        import torch\n\n        database_save_path = Path(self.root_path) / ('gt_database' if split == 'train' else ('gt_database_%s' % split))\n        db_info_save_path = Path(self.root_path) / ('kitti_dbinfos_%s.pkl' % split)\n\n        database_save_path.mkdir(parents=True, exist_ok=True)\n        all_db_infos = {}\n\n        with open(info_path, 'rb') as f:\n            infos = pickle.load(f)\n\n        for k in range(len(infos)):\n            print('gt_database sample: %d/%d' % (k + 1, len(infos)))\n            info = infos[k]\n            sample_idx = info['point_cloud']['lidar_idx']\n            points = self.get_lidar(sample_idx)\n            annos = info['annos']\n            names = annos['name']\n            difficulty = annos['difficulty']\n            bbox = annos['bbox']\n            gt_boxes = annos['gt_boxes_lidar']\n\n            num_obj = gt_boxes.shape[0]\n            point_indices = roiaware_pool3d_utils.points_in_boxes_cpu(\n                torch.from_numpy(points[:, 0:3]), torch.from_numpy(gt_boxes)\n            ).numpy()  # (nboxes, npoints)\n\n            for i in range(num_obj):\n                filename = '%s_%s_%d.bin' % (sample_idx, names[i], i)\n                filepath = database_save_path / filename\n                gt_points = points[point_indices[i] > 0]\n\n                
gt_points[:, :3] -= gt_boxes[i, :3]\n                # Binary mode: tofile() writes raw float32 bytes.\n                with open(filepath, 'wb') as f:\n                    gt_points.tofile(f)\n\n                if (used_classes is None) or names[i] in used_classes:\n                    db_path = str(filepath.relative_to(self.root_path))  # gt_database/xxxxx.bin\n                    db_info = {'name': names[i], 'path': db_path, 'image_idx': sample_idx, 'gt_idx': i,\n                               'box3d_lidar': gt_boxes[i], 'num_points_in_gt': gt_points.shape[0],\n                               'difficulty': difficulty[i], 'bbox': bbox[i], 'score': annos['score'][i]}\n                    if names[i] in all_db_infos:\n                        all_db_infos[names[i]].append(db_info)\n                    else:\n                        all_db_infos[names[i]] = [db_info]\n        for k, v in all_db_infos.items():\n            print('Database %s: %d' % (k, len(v)))\n\n        with open(db_info_save_path, 'wb') as f:\n            pickle.dump(all_db_infos, f)\n\n        return all_db_infos\n\n    #staticmethod\n    def generate_prediction_dicts(self,batch_dict, pred_dicts, class_names, output_path=None):\n        \"\"\"\n        Args:\n            batch_dict:\n                frame_id:\n            pred_dicts: list of pred_dicts\n                pred_boxes: (N, 7), Tensor\n                pred_scores: (N), Tensor\n                pred_labels: (N), Tensor\n            class_names:\n            output_path:\n\n        Returns:\n\n        \"\"\"\n        def get_template_prediction(num_samples):\n            ret_dict = {\n                'name': np.zeros(num_samples), 'truncated': np.zeros(num_samples),\n                'occluded': np.zeros(num_samples), 'alpha': np.zeros(num_samples),\n                'bbox': np.zeros([num_samples, 4]), 'dimensions': np.zeros([num_samples, 3]),\n                'location': np.zeros([num_samples, 3]), 'rotation_y': np.zeros(num_samples),\n                'score': np.zeros(num_samples), 'boxes_lidar': 
np.zeros([num_samples, 7])\n            }\n            return ret_dict\n\n        def generate_single_sample_dict(batch_index, box_dict):\n            pred_scores = box_dict['pred_scores'].cpu().numpy()\n            pred_boxes = box_dict['pred_boxes'].cpu().numpy()\n            pred_labels = box_dict['pred_labels'].cpu().numpy()\n\n            if 'WBF' in box_dict:\n                pred_labels,pred_scores,pred_boxes = model_nms_utils.compute_WBF(pred_labels,pred_scores,pred_boxes)\n\n            pred_dict = get_template_prediction(pred_scores.shape[0])\n            if pred_scores.shape[0] == 0:\n                return pred_dict\n\n            calib = batch_dict['calib'][batch_index]\n            image_shape = batch_dict['image_shape'][batch_index]\n            pred_boxes_camera = box_utils.boxes3d_lidar_to_kitti_camera(pred_boxes, calib)\n            pred_boxes_img = box_utils.boxes3d_kitti_camera_to_imageboxes(\n                pred_boxes_camera, calib, image_shape=image_shape\n            )\n\n            pred_dict['name'] = np.array(class_names)[pred_labels - 1]\n            pred_dict['alpha'] = -np.arctan2(-pred_boxes[:, 1], pred_boxes[:, 0]) + pred_boxes_camera[:, 6]\n            pred_dict['bbox'] = pred_boxes_img\n            height = pred_dict['bbox'][:, 3] - pred_dict['bbox'][:, 1]\n            height_mask = height<25\n            pred_dict['bbox'][height_mask, 3] +=2\n            pred_dict['dimensions'] = pred_boxes_camera[:, 3:6]\n            pred_dict['location'] = pred_boxes_camera[:, 0:3]\n            pred_dict['rotation_y'] = pred_boxes_camera[:, 6]\n            pred_dict['score'] = pred_scores\n            pred_dict['boxes_lidar'] = pred_boxes\n\n            return pred_dict\n\n        annos = []\n        for index, box_dict in enumerate(pred_dicts):\n            frame_id = batch_dict['frame_id'][index]\n\n            single_pred_dict = generate_single_sample_dict(index, box_dict)\n\n            single_pred_dict['frame_id'] = frame_id\n            
annos.append(single_pred_dict)\n\n            if output_path is not None:\n                cur_det_file = output_path / ('%s.txt' % frame_id)\n                with open(cur_det_file, 'w') as f:\n                    bbox = single_pred_dict['bbox']\n                    loc = single_pred_dict['location']\n                    dims = single_pred_dict['dimensions']  # lhw -> hwl\n\n                    for idx in range(len(bbox)):\n                        print('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f'\n                              % (single_pred_dict['name'][idx], single_pred_dict['alpha'][idx],\n                                 bbox[idx][0], bbox[idx][1], bbox[idx][2], bbox[idx][3],\n                                 dims[idx][1], dims[idx][2], dims[idx][0], loc[idx][0],\n                                 loc[idx][1], loc[idx][2], single_pred_dict['rotation_y'][idx],\n                                 single_pred_dict['score'][idx]), file=f)\n\n        return annos\n\n    def evaluation(self, det_annos, class_names, **kwargs):\n        if 'annos' not in self.kitti_infos[0].keys():\n            return None, {}\n\n        from .kitti_object_eval_python import eval as kitti_eval\n\n        eval_det_annos = copy.deepcopy(det_annos)\n        eval_gt_annos = [copy.deepcopy(info['annos']) for info in self.kitti_infos]\n        ap_result_str, ap_dict = kitti_eval.get_official_eval_result(eval_gt_annos, eval_det_annos, class_names)\n\n        return ap_result_str, ap_dict\n\n    def __len__(self):\n        if self._merge_all_iters_to_one_epoch:\n            return len(self.kitti_infos) * self.total_epochs\n\n        return len(self.kitti_infos)\n\n    def __getitem__(self, index):\n        # index = 4\n        if self._merge_all_iters_to_one_epoch:\n            index = index % len(self.kitti_infos)\n\n        info = copy.deepcopy(self.kitti_infos[index])\n\n        sample_idx = info['point_cloud']['lidar_idx']\n\n        points = 
self.get_lidar(sample_idx)\n        calib = self.get_calib(sample_idx)\n\n        img_shape = info['image']['image_shape']\n        if self.dataset_cfg.FOV_POINTS_ONLY:\n            pts_rect = calib.lidar_to_rect(points[:, 0:3])\n            fov_flag = self.get_fov_flag(pts_rect, img_shape, calib)\n            points = points[fov_flag]\n\n        input_dict = {\n            'points': points,\n            'frame_id': sample_idx,\n            'calib': calib,\n        }\n\n        if 'annos' in info:\n            annos = info['annos']\n            annos = common_utils.drop_info_with_name(annos, name='DontCare')\n            loc, dims, rots = annos['location'], annos['dimensions'], annos['rotation_y']\n            gt_names = annos['name']\n            gt_boxes_camera = np.concatenate([loc, dims, rots[..., np.newaxis]], axis=1).astype(np.float32)\n            gt_boxes_lidar = box_utils.boxes3d_kitti_camera_to_lidar(gt_boxes_camera, calib)\n\n            input_dict.update({\n                'gt_names': gt_names,\n                'gt_boxes': gt_boxes_lidar\n            })\n            road_plane = self.get_road_plane(sample_idx)\n            if road_plane is not None:\n                input_dict['road_plane'] = road_plane\n\n        data_dict = self.prepare_data(data_dict=input_dict)\n\n        data_dict['image_shape'] = img_shape\n        return data_dict\n\n\ndef create_kitti_infos(dataset_cfg, class_names, data_path, save_path, workers=4):\n    dataset = KittiDataset(dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path, training=False)\n    train_split, val_split = 'train', 'val'\n\n    train_filename = save_path / ('kitti_infos_%s.pkl' % train_split)\n    val_filename = save_path / ('kitti_infos_%s.pkl' % val_split)\n    trainval_filename = save_path / 'kitti_infos_trainval.pkl'\n    test_filename = save_path / 'kitti_infos_test.pkl'\n\n    print('---------------Start to generate data infos---------------')\n\n    dataset.set_split(train_split)\n    
kitti_infos_train = dataset.get_infos(num_workers=workers, has_label=True, count_inside_pts=True)\n    with open(train_filename, 'wb') as f:\n        pickle.dump(kitti_infos_train, f)\n    print('Kitti info train file is saved to %s' % train_filename)\n\n    dataset.set_split(val_split)\n    kitti_infos_val = dataset.get_infos(num_workers=workers, has_label=True, count_inside_pts=True)\n    with open(val_filename, 'wb') as f:\n        pickle.dump(kitti_infos_val, f)\n    print('Kitti info val file is saved to %s' % val_filename)\n\n    with open(trainval_filename, 'wb') as f:\n        pickle.dump(kitti_infos_train + kitti_infos_val, f)\n    print('Kitti info trainval file is saved to %s' % trainval_filename)\n\n    dataset.set_split('test')\n    kitti_infos_test = dataset.get_infos(num_workers=workers, has_label=False, count_inside_pts=False)\n    with open(test_filename, 'wb') as f:\n        pickle.dump(kitti_infos_test, f)\n    print('Kitti info test file is saved to %s' % test_filename)\n\n    print('---------------Start create groundtruth database for data augmentation---------------')\n    dataset.set_split(train_split)\n    dataset.create_groundtruth_database(train_filename, split=train_split)\n\n    print('---------------Data preparation Done---------------')\n\n\nif __name__ == '__main__':\n    import sys\n    if sys.argv.__len__() > 1 and sys.argv[1] == 'create_kitti_infos':\n        import yaml\n        from pathlib import Path\n        from easydict import EasyDict\n        dataset_cfg = EasyDict(yaml.safe_load(open(sys.argv[2])))\n        ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve()\n        create_kitti_infos(\n            dataset_cfg=dataset_cfg,\n            class_names=['Car', 'Pedestrian', 'Cyclist'],\n            data_path=ROOT_DIR / 'data' / 'kitti',\n            save_path=ROOT_DIR / 'data' / 'kitti'\n        )\n"
  },
  {
    "path": "pcdet/datasets/kitti/kitti_dataset_mm.py",
    "content": "import copy\nimport pickle\nfrom pathlib import Path\n\nimport numpy as np\nfrom skimage import io\n\nfrom pcdet.ops.roiaware_pool3d import roiaware_pool3d_utils\nfrom pcdet.utils import box_utils, calibration_kitti, common_utils, object3d_kitti\nfrom pcdet.datasets.dataset import DatasetTemplate\nfrom pcdet.models.model_utils import model_nms_utils\nimport time\n\nclass KittiDatasetMM(DatasetTemplate):\n    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None):\n        \"\"\"\n        Args:\n            root_path:\n            dataset_cfg:\n            class_names:\n            training:\n            logger:\n        \"\"\"\n        super().__init__(\n            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger\n        )\n        self.split = self.dataset_cfg.DATA_SPLIT[self.mode]\n        self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')\n\n        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')\n        self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None\n\n        self.kitti_infos = []\n        self.include_kitti_data(self.mode)\n\n    def include_kitti_data(self, mode):\n        if self.logger is not None:\n            self.logger.info('Loading KITTI dataset')\n        kitti_infos = []\n\n        for info_path in self.dataset_cfg.INFO_PATH[mode]:\n            info_path = self.root_path / info_path\n            if not info_path.exists():\n                continue\n            with open(info_path, 'rb') as f:\n                infos = pickle.load(f)\n                kitti_infos.extend(infos)\n\n        self.kitti_infos.extend(kitti_infos)\n\n        if self.logger is not None:\n            self.logger.info('Total samples for KITTI dataset: %d' % (len(kitti_infos)))\n\n    def set_split(self, split):\n        super().__init__(\n            dataset_cfg=self.dataset_cfg, 
class_names=self.class_names, training=self.training, root_path=self.root_path, logger=self.logger\n        )\n        self.split = split\n        self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')\n\n        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')\n        self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None\n\n    def get_lidar(self, idx):\n        lidar_file = self.root_split_path / 'velodyne' / ('%s.bin' % idx)\n        assert lidar_file.exists()\n        p = np.fromfile(str(lidar_file), dtype=np.float32).reshape(-1, 4)\n\n        return p\n\n    def get_lidar_mm(self, idx):\n        lidar_file = self.root_split_path / self.dataset_cfg.MM_PATH / ('%s.npy' % idx)\n        assert lidar_file.exists()\n        return np.load(lidar_file).astype(np.float32)\n\n    def get_image_shape(self, idx):\n        img_file = self.root_split_path / 'image_2' / ('%s.png' % idx)\n        assert img_file.exists()\n        return np.array(io.imread(img_file).shape[:2], dtype=np.int32)\n\n    def get_image(self, idx):\n        img_file = self.root_split_path / 'image_2' / ('%s.png' % idx)\n        assert img_file.exists()\n        return np.array(io.imread(img_file))\n\n    def get_label(self, idx):\n        label_file = self.root_split_path / 'label_2' / ('%s.txt' % idx)\n        assert label_file.exists()\n        return object3d_kitti.get_objects_from_label(label_file)\n\n    def get_calib(self, idx):\n        calib_file = self.root_split_path / 'calib' / ('%s.txt' % idx)\n        assert calib_file.exists()\n        return calibration_kitti.Calibration(calib_file)\n\n    def get_road_plane(self, idx):\n        plane_file = self.root_split_path / 'planes' / ('%s.txt' % idx)\n        if not plane_file.exists():\n            return None\n\n        with open(plane_file, 'r') as f:\n            lines = f.readlines()\n        lines = [float(i) for i in lines[3].split()]\n 
       plane = np.asarray(lines)\n\n        # Ensure normal is always facing up, this is in the rectified camera coordinate\n        if plane[1] > 0:\n            plane = -plane\n\n        norm = np.linalg.norm(plane[0:3])\n        plane = plane / norm\n        return plane\n\n    @staticmethod\n    def get_fov_flag(pts_rect, img_shape, calib):\n        \"\"\"\n        Args:\n            pts_rect:\n            img_shape:\n            calib:\n\n        Returns:\n\n        \"\"\"\n        pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)\n        val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])\n        val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])\n        val_flag_merge = np.logical_and(val_flag_1, val_flag_2)\n        pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)\n\n        return pts_valid_flag\n\n    def get_infos(self, num_workers=4, has_label=True, count_inside_pts=True, sample_id_list=None):\n        import concurrent.futures as futures\n\n        def process_single_scene(sample_idx):\n            print('%s sample_idx: %s' % (self.split, sample_idx))\n            info = {}\n            pc_info = {'num_features': 4, 'lidar_idx': sample_idx}\n            info['point_cloud'] = pc_info\n\n            image_info = {'image_idx': sample_idx, 'image_shape': self.get_image_shape(sample_idx)}\n            info['image'] = image_info\n            calib = self.get_calib(sample_idx)\n\n            P2 = np.concatenate([calib.P2, np.array([[0., 0., 0., 1.]])], axis=0)\n            R0_4x4 = np.zeros([4, 4], dtype=calib.R0.dtype)\n            R0_4x4[3, 3] = 1.\n            R0_4x4[:3, :3] = calib.R0\n            V2C_4x4 = np.concatenate([calib.V2C, np.array([[0., 0., 0., 1.]])], axis=0)\n            calib_info = {'P2': P2, 'R0_rect': R0_4x4, 'Tr_velo_to_cam': V2C_4x4}\n\n            info['calib'] = calib_info\n\n            if has_label:\n                obj_list = 
self.get_label(sample_idx)\n                annotations = {}\n                annotations['name'] = np.array([obj.cls_type for obj in obj_list])\n                annotations['truncated'] = np.array([obj.truncation for obj in obj_list])\n                annotations['occluded'] = np.array([obj.occlusion for obj in obj_list])\n                annotations['alpha'] = np.array([obj.alpha for obj in obj_list])\n                annotations['bbox'] = np.concatenate([obj.box2d.reshape(1, 4) for obj in obj_list], axis=0)\n                annotations['dimensions'] = np.array([[obj.l, obj.h, obj.w] for obj in obj_list])  # lhw(camera) format\n                annotations['location'] = np.concatenate([obj.loc.reshape(1, 3) for obj in obj_list], axis=0)\n                annotations['rotation_y'] = np.array([obj.ry for obj in obj_list])\n                annotations['score'] = np.array([obj.score for obj in obj_list])\n                annotations['difficulty'] = np.array([obj.level for obj in obj_list], np.int32)\n\n                num_objects = len([obj.cls_type for obj in obj_list if obj.cls_type != 'DontCare'])\n                num_gt = len(annotations['name'])\n                index = list(range(num_objects)) + [-1] * (num_gt - num_objects)\n                annotations['index'] = np.array(index, dtype=np.int32)\n\n                loc = annotations['location'][:num_objects]\n                dims = annotations['dimensions'][:num_objects]\n                rots = annotations['rotation_y'][:num_objects]\n                loc_lidar = calib.rect_to_lidar(loc)\n                l, h, w = dims[:, 0:1], dims[:, 1:2], dims[:, 2:3]\n                loc_lidar[:, 2] += h[:, 0] / 2\n                gt_boxes_lidar = np.concatenate([loc_lidar, l, w, h, -(np.pi / 2 + rots[..., np.newaxis])], axis=1)\n                annotations['gt_boxes_lidar'] = gt_boxes_lidar\n\n                info['annos'] = annotations\n\n                if count_inside_pts:\n                    points = 
self.get_lidar(sample_idx)\n                    calib = self.get_calib(sample_idx)\n                    pts_rect = calib.lidar_to_rect(points[:, 0:3])\n\n                    fov_flag = self.get_fov_flag(pts_rect, info['image']['image_shape'], calib)\n                    pts_fov = points[fov_flag]\n                    corners_lidar = box_utils.boxes_to_corners_3d(gt_boxes_lidar)\n                    num_points_in_gt = -np.ones(num_gt, dtype=np.int32)\n\n                    for k in range(num_objects):\n                        flag = box_utils.in_hull(pts_fov[:, 0:3], corners_lidar[k])\n                        num_points_in_gt[k] = flag.sum()\n                    annotations['num_points_in_gt'] = num_points_in_gt\n\n            return info\n\n        sample_id_list = sample_id_list if sample_id_list is not None else self.sample_id_list\n        with futures.ThreadPoolExecutor(num_workers) as executor:\n            infos = executor.map(process_single_scene, sample_id_list)\n        return list(infos)\n\n    def create_groundtruth_database(self, info_path=None, used_classes=None, split='train'):\n        import torch\n\n        database_save_path = Path(self.root_path) / ('gt_database_mm' if split == 'train' else ('gt_database_%s_mm' % split))\n        db_info_save_path = Path(self.root_path) / ('kitti_dbinfos_%s_mm.pkl' % split)\n\n        database_save_path.mkdir(parents=True, exist_ok=True)\n        all_db_infos = {}\n\n        with open(info_path, 'rb') as f:\n            infos = pickle.load(f)\n\n        for k in range(len(infos)):\n            print('gt_database sample: %d/%d' % (k + 1, len(infos)))\n            info = infos[k]\n            sample_idx = info['point_cloud']['lidar_idx']\n            points = self.get_lidar_mm(sample_idx)\n            annos = info['annos']\n            names = annos['name']\n            difficulty = annos['difficulty']\n            bbox = annos['bbox']\n            gt_boxes = annos['gt_boxes_lidar']\n\n            num_obj = 
gt_boxes.shape[0]\n            point_indices = roiaware_pool3d_utils.points_in_boxes_cpu(\n                torch.from_numpy(points[:, 0:3]), torch.from_numpy(gt_boxes)\n            ).numpy()  # (nboxes, npoints)\n\n            for i in range(num_obj):\n                filename = '%s_%s_%d.bin' % (sample_idx, names[i], i)\n                filepath = database_save_path / filename\n                gt_points = points[point_indices[i] > 0]\n\n                gt_points[:, :3] -= gt_boxes[i, :3]\n                # Binary mode: tofile() writes raw float32 bytes.\n                with open(filepath, 'wb') as f:\n                    gt_points.tofile(f)\n\n                # Count only points whose last channel equals 2; this count is stored as num_points_in_gt.\n                shape = gt_points[gt_points[:, -1]==2].shape[0]\n\n                if (used_classes is None) or names[i] in used_classes:\n                    db_path = str(filepath.relative_to(self.root_path))  # gt_database_mm/xxxxx.bin\n                    db_info = {'name': names[i], 'path': db_path, 'image_idx': sample_idx, 'gt_idx': i,\n                               'box3d_lidar': gt_boxes[i], 'num_points_in_gt': shape,\n                               'difficulty': difficulty[i], 'bbox': bbox[i], 'score': annos['score'][i]}\n                    if names[i] in all_db_infos:\n                        all_db_infos[names[i]].append(db_info)\n                    else:\n                        all_db_infos[names[i]] = [db_info]\n\n        for k, v in all_db_infos.items():\n            print('Database %s: %d' % (k, len(v)))\n\n        with open(db_info_save_path, 'wb') as f:\n            pickle.dump(all_db_infos, f)\n\n        return all_db_infos\n\n    #staticmethod\n    def generate_prediction_dicts(self,batch_dict, pred_dicts, class_names, output_path=None):\n        \"\"\"\n        Args:\n            batch_dict:\n                frame_id:\n            pred_dicts: list of pred_dicts\n                pred_boxes: (N, 7), Tensor\n                pred_scores: (N), Tensor\n                pred_labels: (N), Tensor\n            class_names:\n            output_path:\n\n        Returns:\n\n        
\"\"\"\n        def get_template_prediction(num_samples):\n            ret_dict = {\n                'name': np.zeros(num_samples), 'truncated': np.zeros(num_samples),\n                'occluded': np.zeros(num_samples), 'alpha': np.zeros(num_samples),\n                'bbox': np.zeros([num_samples, 4]), 'dimensions': np.zeros([num_samples, 3]),\n                'location': np.zeros([num_samples, 3]), 'rotation_y': np.zeros(num_samples),\n                'score': np.zeros(num_samples), 'boxes_lidar': np.zeros([num_samples, 7])\n            }\n            return ret_dict\n\n        def generate_single_sample_dict(batch_index, box_dict):\n            pred_scores = box_dict['pred_scores'].cpu().numpy()\n            pred_boxes = box_dict['pred_boxes'].cpu().numpy()\n            pred_labels = box_dict['pred_labels'].cpu().numpy()\n\n            if 'WBF' in box_dict:\n                pred_labels,pred_scores,pred_boxes = model_nms_utils.compute_WBF(pred_labels,pred_scores,pred_boxes)\n\n\n            pred_dict = get_template_prediction(pred_scores.shape[0])\n            if pred_scores.shape[0] == 0:\n                return pred_dict\n\n            calib = batch_dict['calib'][batch_index]\n            image_shape = batch_dict['image_shape'][batch_index]\n            pred_boxes_camera = box_utils.boxes3d_lidar_to_kitti_camera(pred_boxes, calib)\n            pred_boxes_img = box_utils.boxes3d_kitti_camera_to_imageboxes(\n                pred_boxes_camera, calib, image_shape=image_shape\n            )\n\n            pred_dict['name'] = np.array(class_names)[pred_labels - 1]\n            pred_dict['alpha'] = -np.arctan2(-pred_boxes[:, 1], pred_boxes[:, 0]) + pred_boxes_camera[:, 6]\n            pred_dict['bbox'] = pred_boxes_img\n            height = pred_dict['bbox'][:, 3] - pred_dict['bbox'][:, 1]\n            height_mask = height<25\n            pred_dict['bbox'][height_mask, 3] +=2\n            pred_dict['dimensions'] = pred_boxes_camera[:, 3:6]\n            
pred_dict['location'] = pred_boxes_camera[:, 0:3]\n            pred_dict['rotation_y'] = pred_boxes_camera[:, 6]\n            pred_dict['score'] = pred_scores\n            pred_dict['boxes_lidar'] = pred_boxes\n\n            return pred_dict\n\n        annos = []\n        for index, box_dict in enumerate(pred_dicts):\n            frame_id = batch_dict['frame_id'][index]\n\n            single_pred_dict = generate_single_sample_dict(index, box_dict)\n\n\n\n            single_pred_dict['frame_id'] = frame_id\n            annos.append(single_pred_dict)\n\n            if output_path is not None:\n                cur_det_file = output_path / ('%s.txt' % frame_id)\n                with open(cur_det_file, 'w') as f:\n                    bbox = single_pred_dict['bbox']\n                    loc = single_pred_dict['location']\n                    dims = single_pred_dict['dimensions']  # lhw -> hwl\n\n                    for idx in range(len(bbox)):\n                        print('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f'\n                              % (single_pred_dict['name'][idx], single_pred_dict['alpha'][idx],\n                                 bbox[idx][0], bbox[idx][1], bbox[idx][2], bbox[idx][3],\n                                 dims[idx][1], dims[idx][2], dims[idx][0], loc[idx][0],\n                                 loc[idx][1], loc[idx][2], single_pred_dict['rotation_y'][idx],\n                                 single_pred_dict['score'][idx]), file=f)\n\n        return annos\n\n    def evaluation(self, det_annos, class_names, **kwargs):\n        if 'annos' not in self.kitti_infos[0].keys():\n            return None, {}\n\n        from .kitti_object_eval_python import eval as kitti_eval\n\n        eval_det_annos = copy.deepcopy(det_annos)\n        eval_gt_annos = [copy.deepcopy(info['annos']) for info in self.kitti_infos]\n        ap_result_str, ap_dict = kitti_eval.get_official_eval_result(eval_gt_annos, eval_det_annos, 
class_names)\n\n        return ap_result_str, ap_dict\n\n    def __len__(self):\n        if self._merge_all_iters_to_one_epoch:\n            return len(self.kitti_infos) * self.total_epochs\n\n        return len(self.kitti_infos)\n\n    def __getitem__(self, index):\n        # index = 4\n        if self._merge_all_iters_to_one_epoch:\n            index = index % len(self.kitti_infos)\n\n        info = copy.deepcopy(self.kitti_infos[index])\n\n        sample_idx = info['point_cloud']['lidar_idx']\n\n        points = self.get_lidar_mm(sample_idx)\n\n        calib = self.get_calib(sample_idx)\n\n        img_shape = info['image']['image_shape']\n        if self.dataset_cfg.FOV_POINTS_ONLY:\n            pts_rect = calib.lidar_to_rect(points[:, 0:3])\n            fov_flag = self.get_fov_flag(pts_rect, img_shape, calib)\n            points = points[fov_flag]\n\n        input_dict = {\n            'points': points,\n            'frame_id': sample_idx,\n            'calib': calib,\n        }\n        input_dict.update({\n                'mm': np.ones(shape=(1, 1))\n            })\n\n        if 'annos' in info:\n            annos = info['annos']\n            annos = common_utils.drop_info_with_name(annos, name='DontCare')\n            loc, dims, rots = annos['location'], annos['dimensions'], annos['rotation_y']\n            gt_names = annos['name']\n\n            if (self.dataset_cfg.get('USE_VAN', None) is True) and (self.training is True):\n                gt_names = np.array(['Car' if gt_names[i]=='Van' else gt_names[i] for i in range(len(gt_names))])\n\n            gt_boxes_camera = np.concatenate([loc, dims, rots[..., np.newaxis]], axis=1).astype(np.float32)\n            gt_boxes_lidar = box_utils.boxes3d_kitti_camera_to_lidar(gt_boxes_camera, calib)\n            if self.training and 'num_points_in_gt' in annos:\n                nmask = annos['num_points_in_gt']>0\n                annos['num_points_in_gt'] = annos['num_points_in_gt'][nmask]\n                gt_names = 
gt_names[nmask]\n                gt_boxes_lidar = gt_boxes_lidar[nmask]\n\n            input_dict.update({\n                'gt_names': gt_names,\n                'gt_boxes': gt_boxes_lidar\n            })\n\n\n            road_plane = self.get_road_plane(sample_idx)\n            if road_plane is not None:\n                input_dict['road_plane'] = road_plane\n\n        data_dict = self.prepare_data(data_dict=input_dict)\n        data_dict['image_shape'] = img_shape\n        data_dict['calib'] = calib\n        return data_dict\n\n\ndef create_kitti_infos(dataset_cfg, class_names, data_path, save_path, workers=4):\n    dataset = KittiDatasetMM(dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path, training=False)\n    train_split, val_split, trainval_split = 'train', 'val', 'trainval'\n\n    train_filename = save_path / ('kitti_infos_%s.pkl' % train_split)\n    val_filename = save_path / ('kitti_infos_%s.pkl' % val_split)\n    trainval_filename = save_path / 'kitti_infos_trainval.pkl'\n    test_filename = save_path / 'kitti_infos_test.pkl'\n\n    print('---------------Start to generate data infos---------------')\n    '''\n    dataset.set_split(train_split)\n    kitti_infos_train = dataset.get_infos(num_workers=workers, has_label=True, count_inside_pts=True)\n    with open(train_filename, 'wb') as f:\n        pickle.dump(kitti_infos_train, f)\n    print('Kitti info train file is saved to %s' % train_filename)\n\n    dataset.set_split(val_split)\n    kitti_infos_val = dataset.get_infos(num_workers=workers, has_label=True, count_inside_pts=True)\n    with open(val_filename, 'wb') as f:\n        pickle.dump(kitti_infos_val, f)\n    print('Kitti info val file is saved to %s' % val_filename)\n\n    with open(trainval_filename, 'wb') as f:\n        pickle.dump(kitti_infos_train + kitti_infos_val, f)\n    print('Kitti info trainval file is saved to %s' % trainval_filename)\n    \n    dataset.set_split('test')\n    kitti_infos_test = 
dataset.get_infos(num_workers=workers, has_label=False, count_inside_pts=False)\n    with open(test_filename, 'wb') as f:\n        pickle.dump(kitti_infos_test, f)\n    print('Kitti info test file is saved to %s' % test_filename)\n    '''\n    print('---------------Start create groundtruth database for data augmentation---------------')\n    dataset.set_split('train')\n    dataset.create_groundtruth_database(train_filename, split='train')\n\n    print('---------------Data preparation Done---------------')\n\n\nif __name__ == '__main__':\n    import sys\n    if len(sys.argv) > 1 and sys.argv[1] == 'create_kitti_infos':\n        import yaml\n        from pathlib import Path\n        from easydict import EasyDict\n        # Close the config file deterministically instead of leaking the handle.\n        with open(sys.argv[2]) as cfg_file:\n            dataset_cfg = EasyDict(yaml.safe_load(cfg_file))\n        ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve()\n        create_kitti_infos(\n            dataset_cfg=dataset_cfg,\n            class_names=['Car', 'Pedestrian', 'Cyclist'],\n            data_path=ROOT_DIR / 'data' / 'kitti',\n            save_path=ROOT_DIR / 'data' / 'kitti'\n        )\n"
  },
  {
    "path": "pcdet/datasets/kitti/kitti_object_eval_python/LICENSE",
    "content": "MIT License\n\nCopyright (c) 2018 \n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "pcdet/datasets/kitti/kitti_object_eval_python/README.md",
    "content": "# kitti-object-eval-python\n**Note**: This is borrowed from [traveller59/kitti-object-eval-python](https://github.com/traveller59/kitti-object-eval-python)\n\nFast kitti object detection eval in python(finish eval in less than 10 second), support 2d/bev/3d/aos. , support coco-style AP. If you use command line interface, numba need some time to compile jit functions.\n## Dependencies\nOnly support python 3.6+, need `numpy`, `skimage`, `numba`, `fire`. If you have Anaconda, just install `cudatoolkit` in anaconda. Otherwise, please reference to this [page](https://github.com/numba/numba#custom-python-environments) to set up llvm and cuda for numba.\n* Install by conda:\n```\nconda install -c numba cudatoolkit=x.x  (8.0, 9.0, 9.1, depend on your environment) \n```\n## Usage\n* commandline interface:\n```\npython evaluate.py evaluate --label_path=/path/to/your_gt_label_folder --result_path=/path/to/your_result_folder --label_split_file=/path/to/val.txt --current_class=0 --coco=False\n```\n* python interface:\n```Python\nimport kitti_common as kitti\nfrom eval import get_official_eval_result, get_coco_eval_result\ndef _read_imageset_file(path):\n    with open(path, 'r') as f:\n        lines = f.readlines()\n    return [int(line) for line in lines]\ndet_path = \"/path/to/your_result_folder\"\ndt_annos = kitti.get_label_annos(det_path)\ngt_path = \"/path/to/your_gt_label_folder\"\ngt_split_file = \"/path/to/val.txt\" # from https://xiaozhichen.github.io/files/mv3d/imagesets.tar.gz\nval_image_ids = _read_imageset_file(gt_split_file)\ngt_annos = kitti.get_label_annos(gt_path, val_image_ids)\nprint(get_official_eval_result(gt_annos, dt_annos, 0)) # 6s in my computer\nprint(get_coco_eval_result(gt_annos, dt_annos, 0)) # 18s in my computer\n```\n"
  },
  {
    "path": "pcdet/datasets/kitti/kitti_object_eval_python/eval.py",
    "content": "import io as sysio\n\nimport numba\nimport numpy as np\n\nfrom .rotate_iou import rotate_iou_gpu_eval\nimport pickle\n@numba.jit\ndef get_thresholds(scores: np.ndarray, num_gt, num_sample_pts=41):\n    scores.sort()\n    scores = scores[::-1]\n    current_recall = 0\n    thresholds = []\n    for i, score in enumerate(scores):\n        l_recall = (i + 1) / num_gt\n        if i < (len(scores) - 1):\n            r_recall = (i + 2) / num_gt\n        else:\n            r_recall = l_recall\n        if (((r_recall - current_recall) < (current_recall - l_recall))\n                and (i < (len(scores) - 1))):\n            continue\n        # recall = l_recall\n        thresholds.append(score)\n        current_recall += 1 / (num_sample_pts - 1.0)\n    return thresholds\n\n\ndef clean_data(gt_anno, dt_anno, current_class, difficulty):\n    CLASS_NAMES = ['car', 'pedestrian', 'cyclist', 'van', 'person_sitting', 'truck']\n    MIN_HEIGHT = [40, 25, 25]\n    MAX_OCCLUSION = [0, 1, 2]\n    MAX_TRUNCATION = [0.15, 0.3, 0.5]\n    dc_bboxes, ignored_gt, ignored_dt = [], [], []\n    current_cls_name = CLASS_NAMES[current_class].lower()\n    num_gt = len(gt_anno[\"name\"])\n    num_dt = len(dt_anno[\"name\"])\n    num_valid_gt = 0\n    for i in range(num_gt):\n        bbox = gt_anno[\"bbox\"][i]\n        gt_name = gt_anno[\"name\"][i].lower()\n        height = bbox[3] - bbox[1]\n        valid_class = -1\n        if (gt_name == current_cls_name):\n            valid_class = 1\n        elif (current_cls_name == \"Pedestrian\".lower()\n              and \"Person_sitting\".lower() == gt_name):\n            valid_class = 0\n        elif (current_cls_name == \"Car\".lower() and \"Van\".lower() == gt_name):\n            valid_class = 0\n        else:\n            valid_class = -1\n        ignore = False\n        if ((gt_anno[\"occluded\"][i] > MAX_OCCLUSION[difficulty])\n                or (gt_anno[\"truncated\"][i] > MAX_TRUNCATION[difficulty])\n                or (height <= 
MIN_HEIGHT[difficulty])):\n            # if gt_anno[\"difficulty\"][i] > difficulty or gt_anno[\"difficulty\"][i] == -1:\n            ignore = True\n        if valid_class == 1 and not ignore:\n            ignored_gt.append(0)\n            num_valid_gt += 1\n        elif (valid_class == 0 or (ignore and (valid_class == 1))):\n            ignored_gt.append(1)\n        else:\n            ignored_gt.append(-1)\n    # for i in range(num_gt):\n        if gt_anno[\"name\"][i] == \"DontCare\":\n            dc_bboxes.append(gt_anno[\"bbox\"][i])\n    for i in range(num_dt):\n        if (dt_anno[\"name\"][i].lower() == current_cls_name):\n            valid_class = 1\n        else:\n            valid_class = -1\n        height = abs(dt_anno[\"bbox\"][i, 3] - dt_anno[\"bbox\"][i, 1])\n        if height < MIN_HEIGHT[difficulty]:\n            ignored_dt.append(1)\n        elif valid_class == 1:\n            ignored_dt.append(0)\n        else:\n            ignored_dt.append(-1)\n\n    return num_valid_gt, ignored_gt, ignored_dt, dc_bboxes\n\n\n@numba.jit(nopython=True)\ndef image_box_overlap(boxes, query_boxes, criterion=-1):\n    N = boxes.shape[0]\n    K = query_boxes.shape[0]\n    overlaps = np.zeros((N, K), dtype=boxes.dtype)\n    for k in range(K):\n        qbox_area = ((query_boxes[k, 2] - query_boxes[k, 0]) *\n                     (query_boxes[k, 3] - query_boxes[k, 1]))\n        for n in range(N):\n            iw = (min(boxes[n, 2], query_boxes[k, 2]) -\n                  max(boxes[n, 0], query_boxes[k, 0]))\n            if iw > 0:\n                ih = (min(boxes[n, 3], query_boxes[k, 3]) -\n                      max(boxes[n, 1], query_boxes[k, 1]))\n                if ih > 0:\n                    if criterion == -1:\n                        ua = (\n                            (boxes[n, 2] - boxes[n, 0]) *\n                            (boxes[n, 3] - boxes[n, 1]) + qbox_area - iw * ih)\n                    elif criterion == 0:\n                        ua = ((boxes[n, 2] 
- boxes[n, 0]) *\n                              (boxes[n, 3] - boxes[n, 1]))\n                    elif criterion == 1:\n                        ua = qbox_area\n                    else:\n                        ua = 1.0\n                    overlaps[n, k] = iw * ih / ua\n    return overlaps\n\n\ndef bev_box_overlap(boxes, qboxes, criterion=-1):\n    riou = rotate_iou_gpu_eval(boxes, qboxes, criterion)\n    return riou\n\n\n@numba.jit(nopython=True, parallel=True)\ndef d3_box_overlap_kernel(boxes, qboxes, rinc, criterion=-1):\n    # ONLY support overlap in CAMERA, not lider.\n    N, K = boxes.shape[0], qboxes.shape[0]\n    for i in range(N):\n        for j in range(K):\n            if rinc[i, j] > 0:\n                # iw = (min(boxes[i, 1] + boxes[i, 4], qboxes[j, 1] +\n                #         qboxes[j, 4]) - max(boxes[i, 1], qboxes[j, 1]))\n                iw = (min(boxes[i, 1], qboxes[j, 1]) - max(\n                    boxes[i, 1] - boxes[i, 4], qboxes[j, 1] - qboxes[j, 4]))\n\n                if iw > 0:\n                    area1 = boxes[i, 3] * boxes[i, 4] * boxes[i, 5]\n                    area2 = qboxes[j, 3] * qboxes[j, 4] * qboxes[j, 5]\n                    inc = iw * rinc[i, j]\n                    if criterion == -1:\n                        ua = (area1 + area2 - inc)\n                    elif criterion == 0:\n                        ua = area1\n                    elif criterion == 1:\n                        ua = area2\n                    else:\n                        ua = inc\n                    rinc[i, j] = inc / ua\n                else:\n                    rinc[i, j] = 0.0\n\n\ndef d3_box_overlap(boxes, qboxes, criterion=-1):\n    rinc = rotate_iou_gpu_eval(boxes[:, [0, 2, 3, 5, 6]],\n                               qboxes[:, [0, 2, 3, 5, 6]], 2)\n    d3_box_overlap_kernel(boxes, qboxes, rinc, criterion)\n    return rinc\n\n\n@numba.jit(nopython=True)\ndef compute_statistics_jit(overlaps,\n                           gt_datas,\n                  
         dt_datas,\n                           ignored_gt,\n                           ignored_det,\n                           dc_bboxes,\n                           metric,\n                           min_overlap,\n                           thresh=0,\n                           compute_fp=False,\n                           compute_aos=False):\n\n    det_size = dt_datas.shape[0]\n    gt_size = gt_datas.shape[0]\n    dt_scores = dt_datas[:, -1]\n    dt_alphas = dt_datas[:, 4]\n    gt_alphas = gt_datas[:, 4]\n    dt_bboxes = dt_datas[:, :4]\n    gt_bboxes = gt_datas[:, :4]\n\n    assigned_detection = [False] * det_size\n    ignored_threshold = [False] * det_size\n    if compute_fp:\n        for i in range(det_size):\n            if (dt_scores[i] < thresh):\n                ignored_threshold[i] = True\n    NO_DETECTION = -10000000\n    tp, fp, fn, similarity = 0, 0, 0, 0\n    # thresholds = [0.0]\n    # delta = [0.0]\n    thresholds = np.zeros((gt_size, ))\n    thresh_idx = 0\n    delta = np.zeros((gt_size, ))\n    delta_idx = 0\n    for i in range(gt_size):\n        if ignored_gt[i] == -1:\n            continue\n        det_idx = -1\n        valid_detection = NO_DETECTION\n        max_overlap = 0\n        assigned_ignored_det = False\n\n        for j in range(det_size):\n            if (ignored_det[j] == -1):\n                continue\n            if (assigned_detection[j]):\n                continue\n            if (ignored_threshold[j]):\n                continue\n            overlap = overlaps[j, i]\n            dt_score = dt_scores[j]\n            if (not compute_fp and (overlap > min_overlap)\n                    and dt_score > valid_detection):\n                det_idx = j\n                valid_detection = dt_score\n            elif (compute_fp and (overlap > min_overlap)\n                  and (overlap > max_overlap or assigned_ignored_det)\n                  and ignored_det[j] == 0):\n                max_overlap = overlap\n                det_idx = j\n     
           valid_detection = 1\n                assigned_ignored_det = False\n            elif (compute_fp and (overlap > min_overlap)\n                  and (valid_detection == NO_DETECTION)\n                  and ignored_det[j] == 1):\n                det_idx = j\n                valid_detection = 1\n                assigned_ignored_det = True\n\n        if (valid_detection == NO_DETECTION) and ignored_gt[i] == 0:\n            fn += 1\n        elif ((valid_detection != NO_DETECTION)\n              and (ignored_gt[i] == 1 or ignored_det[det_idx] == 1)):\n            assigned_detection[det_idx] = True\n        elif valid_detection != NO_DETECTION:\n            tp += 1\n            # thresholds.append(dt_scores[det_idx])\n            thresholds[thresh_idx] = dt_scores[det_idx]\n            thresh_idx += 1\n            if compute_aos:\n                # delta.append(gt_alphas[i] - dt_alphas[det_idx])\n                delta[delta_idx] = gt_alphas[i] - dt_alphas[det_idx]\n                delta_idx += 1\n\n            assigned_detection[det_idx] = True\n    if compute_fp:\n        for i in range(det_size):\n            if (not (assigned_detection[i] or ignored_det[i] == -1\n                     or ignored_det[i] == 1 or ignored_threshold[i])):\n                fp += 1\n        nstuff = 0\n        if metric == 0:\n            overlaps_dt_dc = image_box_overlap(dt_bboxes, dc_bboxes, 0)\n            for i in range(dc_bboxes.shape[0]):\n                for j in range(det_size):\n                    if (assigned_detection[j]):\n                        continue\n                    if (ignored_det[j] == -1 or ignored_det[j] == 1):\n                        continue\n                    if (ignored_threshold[j]):\n                        continue\n                    if overlaps_dt_dc[j, i] > min_overlap:\n                        assigned_detection[j] = True\n                        nstuff += 1\n        fp -= nstuff\n        if compute_aos:\n            tmp = np.zeros((fp + 
delta_idx, ))\n            # tmp = [0] * fp\n            for i in range(delta_idx):\n                tmp[i + fp] = (1.0 + np.cos(delta[i])) / 2.0\n                # tmp.append((1.0 + np.cos(delta[i])) / 2.0)\n            # assert len(tmp) == fp + tp\n            # assert len(delta) == tp\n            if tp > 0 or fp > 0:\n                similarity = np.sum(tmp)\n            else:\n                similarity = -1\n    return tp, fp, fn, similarity, thresholds[:thresh_idx]\n\n\ndef get_split_parts(num, num_part):\n    same_part = num // num_part\n    remain_num = num % num_part\n    if same_part == 0:\n        return [num]\n\n    if remain_num == 0:\n        return [same_part] * num_part\n    else:\n        return [same_part] * num_part + [remain_num]\n\n\n@numba.jit(nopython=True)\ndef fused_compute_statistics(overlaps,\n                             pr,\n                             gt_nums,\n                             dt_nums,\n                             dc_nums,\n                             gt_datas,\n                             dt_datas,\n                             dontcares,\n                             ignored_gts,\n                             ignored_dets,\n                             metric,\n                             min_overlap,\n                             thresholds,\n                             compute_aos=False):\n    gt_num = 0\n    dt_num = 0\n    dc_num = 0\n    for i in range(gt_nums.shape[0]):\n        for t, thresh in enumerate(thresholds):\n            overlap = overlaps[dt_num:dt_num + dt_nums[i], gt_num:\n                               gt_num + gt_nums[i]]\n\n            gt_data = gt_datas[gt_num:gt_num + gt_nums[i]]\n            dt_data = dt_datas[dt_num:dt_num + dt_nums[i]]\n            ignored_gt = ignored_gts[gt_num:gt_num + gt_nums[i]]\n            ignored_det = ignored_dets[dt_num:dt_num + dt_nums[i]]\n            dontcare = dontcares[dc_num:dc_num + dc_nums[i]]\n            tp, fp, fn, similarity, _ = 
compute_statistics_jit(\n                overlap,\n                gt_data,\n                dt_data,\n                ignored_gt,\n                ignored_det,\n                dontcare,\n                metric,\n                min_overlap=min_overlap,\n                thresh=thresh,\n                compute_fp=True,\n                compute_aos=compute_aos)\n            pr[t, 0] += tp\n            pr[t, 1] += fp\n            pr[t, 2] += fn\n            if similarity != -1:\n                pr[t, 3] += similarity\n        gt_num += gt_nums[i]\n        dt_num += dt_nums[i]\n        dc_num += dc_nums[i]\n\n\ndef calculate_iou_partly(gt_annos, dt_annos, metric, num_parts=50):\n    \"\"\"fast iou algorithm. this function can be used independently to\n    do result analysis. Must be used in CAMERA coordinate system.\n    Args:\n        gt_annos: dict, must from get_label_annos() in kitti_common.py\n        dt_annos: dict, must from get_label_annos() in kitti_common.py\n        metric: eval type. 0: bbox, 1: bev, 2: 3d\n        num_parts: int. 
a parameter for fast calculate algorithm\n    \"\"\"\n    assert len(gt_annos) == len(dt_annos)\n    total_dt_num = np.stack([len(a[\"name\"]) for a in dt_annos], 0)\n    total_gt_num = np.stack([len(a[\"name\"]) for a in gt_annos], 0)\n    num_examples = len(gt_annos)\n    split_parts = get_split_parts(num_examples, num_parts)\n    parted_overlaps = []\n    example_idx = 0\n\n    for num_part in split_parts:\n        gt_annos_part = gt_annos[example_idx:example_idx + num_part]\n        dt_annos_part = dt_annos[example_idx:example_idx + num_part]\n        if metric == 0:\n            gt_boxes = np.concatenate([a[\"bbox\"] for a in gt_annos_part], 0)\n            dt_boxes = np.concatenate([a[\"bbox\"] for a in dt_annos_part], 0)\n            overlap_part = image_box_overlap(gt_boxes, dt_boxes)\n        elif metric == 1:\n            loc = np.concatenate(\n                [a[\"location\"][:, [0, 2]] for a in gt_annos_part], 0)\n            dims = np.concatenate(\n                [a[\"dimensions\"][:, [0, 2]] for a in gt_annos_part], 0)\n            rots = np.concatenate([a[\"rotation_y\"] for a in gt_annos_part], 0)\n            gt_boxes = np.concatenate(\n                [loc, dims, rots[..., np.newaxis]], axis=1)\n            loc = np.concatenate(\n                [a[\"location\"][:, [0, 2]] for a in dt_annos_part], 0)\n            dims = np.concatenate(\n                [a[\"dimensions\"][:, [0, 2]] for a in dt_annos_part], 0)\n            rots = np.concatenate([a[\"rotation_y\"] for a in dt_annos_part], 0)\n            dt_boxes = np.concatenate(\n                [loc, dims, rots[..., np.newaxis]], axis=1)\n            overlap_part = bev_box_overlap(gt_boxes, dt_boxes).astype(\n                np.float64)\n        elif metric == 2:\n            loc = np.concatenate([a[\"location\"] for a in gt_annos_part], 0)\n            dims = np.concatenate([a[\"dimensions\"] for a in gt_annos_part], 0)\n            rots = np.concatenate([a[\"rotation_y\"] for a in 
gt_annos_part], 0)\n            gt_boxes = np.concatenate(\n                [loc, dims, rots[..., np.newaxis]], axis=1)\n            loc = np.concatenate([a[\"location\"] for a in dt_annos_part], 0)\n            dims = np.concatenate([a[\"dimensions\"] for a in dt_annos_part], 0)\n            rots = np.concatenate([a[\"rotation_y\"] for a in dt_annos_part], 0)\n            dt_boxes = np.concatenate(\n                [loc, dims, rots[..., np.newaxis]], axis=1)\n            overlap_part = d3_box_overlap(gt_boxes, dt_boxes).astype(\n                np.float64)\n        else:\n            raise ValueError(\"unknown metric\")\n        parted_overlaps.append(overlap_part)\n        example_idx += num_part\n    overlaps = []\n    example_idx = 0\n    for j, num_part in enumerate(split_parts):\n        gt_annos_part = gt_annos[example_idx:example_idx + num_part]\n        dt_annos_part = dt_annos[example_idx:example_idx + num_part]\n        gt_num_idx, dt_num_idx = 0, 0\n        for i in range(num_part):\n            gt_box_num = total_gt_num[example_idx + i]\n            dt_box_num = total_dt_num[example_idx + i]\n            overlaps.append(\n                parted_overlaps[j][gt_num_idx:gt_num_idx + gt_box_num,\n                                   dt_num_idx:dt_num_idx + dt_box_num])\n            gt_num_idx += gt_box_num\n            dt_num_idx += dt_box_num\n        example_idx += num_part\n\n    return overlaps, parted_overlaps, total_gt_num, total_dt_num\n\n\ndef _prepare_data(gt_annos, dt_annos, current_class, difficulty):\n    gt_datas_list = []\n    dt_datas_list = []\n    total_dc_num = []\n    ignored_gts, ignored_dets, dontcares = [], [], []\n    total_num_valid_gt = 0\n    for i in range(len(gt_annos)):\n        rets = clean_data(gt_annos[i], dt_annos[i], current_class, difficulty)\n        num_valid_gt, ignored_gt, ignored_det, dc_bboxes = rets\n        ignored_gts.append(np.array(ignored_gt, dtype=np.int64))\n        ignored_dets.append(np.array(ignored_det, 
dtype=np.int64))\n        if len(dc_bboxes) == 0:\n            dc_bboxes = np.zeros((0, 4)).astype(np.float64)\n        else:\n            dc_bboxes = np.stack(dc_bboxes, 0).astype(np.float64)\n        total_dc_num.append(dc_bboxes.shape[0])\n        dontcares.append(dc_bboxes)\n        total_num_valid_gt += num_valid_gt\n        gt_datas = np.concatenate(\n            [gt_annos[i][\"bbox\"], gt_annos[i][\"alpha\"][..., np.newaxis]], 1)\n        dt_datas = np.concatenate([\n            dt_annos[i][\"bbox\"], dt_annos[i][\"alpha\"][..., np.newaxis],\n            dt_annos[i][\"score\"][..., np.newaxis]\n        ], 1)\n        gt_datas_list.append(gt_datas)\n        dt_datas_list.append(dt_datas)\n    total_dc_num = np.stack(total_dc_num, axis=0)\n    return (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets, dontcares,\n            total_dc_num, total_num_valid_gt)\n\n\ndef eval_class(gt_annos,\n               dt_annos,\n               current_classes,\n               difficultys,\n               metric,\n               min_overlaps,\n               compute_aos=False,\n               num_parts=100):\n    \"\"\"Kitti eval. support 2d/bev/3d/aos eval. support 0.5:0.05:0.95 coco AP.\n    Args:\n        gt_annos: dict, must from get_label_annos() in kitti_common.py\n        dt_annos: dict, must from get_label_annos() in kitti_common.py\n        current_classes: list of int, 0: car, 1: pedestrian, 2: cyclist\n        difficultys: list of int. eval difficulty, 0: easy, 1: normal, 2: hard\n        metric: eval type. 0: bbox, 1: bev, 2: 3d\n        min_overlaps: float, min overlap. format: [num_overlap, metric, class].\n        num_parts: int. 
a parameter for fast calculate algorithm\n\n    Returns:\n        dict of recall, precision and aos\n    \"\"\"\n    assert len(gt_annos) == len(dt_annos)\n    num_examples = len(gt_annos)\n    split_parts = get_split_parts(num_examples, num_parts)\n\n    rets = calculate_iou_partly(dt_annos, gt_annos, metric, num_parts)\n    overlaps, parted_overlaps, total_dt_num, total_gt_num = rets\n    N_SAMPLE_PTS = 41\n    num_minoverlap = len(min_overlaps)\n    num_class = len(current_classes)\n    num_difficulty = len(difficultys)\n    precision = np.zeros(\n        [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])\n    recall = np.zeros(\n        [num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])\n    aos = np.zeros([num_class, num_difficulty, num_minoverlap, N_SAMPLE_PTS])\n    for m, current_class in enumerate(current_classes):\n        for l, difficulty in enumerate(difficultys):\n            rets = _prepare_data(gt_annos, dt_annos, current_class, difficulty)\n            (gt_datas_list, dt_datas_list, ignored_gts, ignored_dets,\n             dontcares, total_dc_num, total_num_valid_gt) = rets\n            for k, min_overlap in enumerate(min_overlaps[:, metric, m]):\n                thresholdss = []\n                all_tp=0\n                all_fn = 0\n                for i in range(len(gt_annos)):\n                    rets = compute_statistics_jit(\n                        overlaps[i],\n                        gt_datas_list[i],\n                        dt_datas_list[i],\n                        ignored_gts[i],\n                        ignored_dets[i],\n                        dontcares[i],\n                        metric,\n                        min_overlap=min_overlap,\n                        thresh=0.0,\n                        compute_fp=False)\n                    tp, fp, fn, similarity, thresholds = rets\n                    all_tp+=tp\n                    all_fn += fn\n                    thresholdss += thresholds.tolist()\n\n                
thresholdss = np.array(thresholdss)\n                thresholds = get_thresholds(thresholdss, total_num_valid_gt)\n                thresholds = np.array(thresholds)\n                pr = np.zeros([len(thresholds), 4])\n                idx = 0\n                for j, num_part in enumerate(split_parts):\n                    gt_datas_part = np.concatenate(\n                        gt_datas_list[idx:idx + num_part], 0)\n                    dt_datas_part = np.concatenate(\n                        dt_datas_list[idx:idx + num_part], 0)\n                    dc_datas_part = np.concatenate(\n                        dontcares[idx:idx + num_part], 0)\n                    ignored_dets_part = np.concatenate(\n                        ignored_dets[idx:idx + num_part], 0)\n                    ignored_gts_part = np.concatenate(\n                        ignored_gts[idx:idx + num_part], 0)\n                    fused_compute_statistics(\n                        parted_overlaps[j],\n                        pr,\n                        total_gt_num[idx:idx + num_part],\n                        total_dt_num[idx:idx + num_part],\n                        total_dc_num[idx:idx + num_part],\n                        gt_datas_part,\n                        dt_datas_part,\n                        dc_datas_part,\n                        ignored_gts_part,\n                        ignored_dets_part,\n                        metric,\n                        min_overlap=min_overlap,\n                        thresholds=thresholds,\n                        compute_aos=compute_aos)\n                    idx += num_part\n                for i in range(len(thresholds)):\n                    recall[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 2])\n                    precision[m, l, k, i] = pr[i, 0] / (pr[i, 0] + pr[i, 1])\n                    if compute_aos:\n                        aos[m, l, k, i] = pr[i, 3] / (pr[i, 0] + pr[i, 1])\n                for i in range(len(thresholds)):\n                    
precision[m, l, k, i] = np.max(\n                        precision[m, l, k, i:], axis=-1)\n                    #recall[m, l, k, i] = np.max(recall[m, l, k, i:], axis=-1)\n                    if compute_aos:\n                        aos[m, l, k, i] = np.max(aos[m, l, k, i:], axis=-1)\n    ret_dict = {\n        \"recall\": recall,\n        \"precision\": precision,\n        \"orientation\": aos,\n    }\n    return ret_dict\n\n\ndef get_mAP(prec):\n    sums = 0\n    for i in range(0, prec.shape[-1], 4):\n        sums = sums + prec[..., i]\n    return sums / 11 * 100\n\n\ndef get_mAP_R40(prec):\n    sums = 0\n    for i in range(1, prec.shape[-1]):\n        sums = sums + prec[..., i]\n    return sums / 40 * 100\n\n\ndef print_str(value, *arg, sstream=None):\n    if sstream is None:\n        sstream = sysio.StringIO()\n    sstream.truncate(0)\n    sstream.seek(0)\n    print(value, *arg, file=sstream)\n    return sstream.getvalue()\n\n\ndef do_eval(gt_annos,\n            dt_annos,\n            current_classes,\n            min_overlaps,\n            compute_aos=False,\n            PR_detail_dict=None):\n    # min_overlaps: [num_minoverlap, metric, num_class]\n    difficultys = [0, 1, 2]\n    ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 0,\n                     min_overlaps, compute_aos)\n    # ret: [num_class, num_diff, num_minoverlap, num_sample_points]\n    mAP_bbox = get_mAP(ret[\"precision\"])\n    mAP_bbox_R40 = get_mAP_R40(ret[\"precision\"])\n\n    if PR_detail_dict is not None:\n        PR_detail_dict['bbox'] = ret['precision']\n\n    mAP_aos = mAP_aos_R40 = None\n    if compute_aos:\n        mAP_aos = get_mAP(ret[\"orientation\"])\n        mAP_aos_R40 = get_mAP_R40(ret[\"orientation\"])\n\n        if PR_detail_dict is not None:\n            PR_detail_dict['aos'] = ret['orientation']\n\n    ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 1,\n                     min_overlaps)\n\n    mAP_bev = get_mAP(ret[\"precision\"])\n   
 mAP_bev_R40 = get_mAP_R40(ret[\"precision\"])\n\n    if PR_detail_dict is not None:\n        PR_detail_dict['bev'] = ret['precision']\n\n    ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 2,\n                     min_overlaps)\n    mAP_3d = get_mAP(ret[\"precision\"])\n    mAP_3d_R40 = get_mAP_R40(ret[\"precision\"])\n\n    if PR_detail_dict is not None:\n        PR_detail_dict['3d'] = ret['precision']\n    return mAP_bbox, mAP_bev, mAP_3d, mAP_aos, mAP_bbox_R40, mAP_bev_R40, mAP_3d_R40, mAP_aos_R40\n\n\ndef do_coco_style_eval(gt_annos, dt_annos, current_classes, overlap_ranges,\n                       compute_aos):\n    # overlap_ranges: [range, metric, num_class]\n    min_overlaps = np.zeros([10, *overlap_ranges.shape[1:]])\n    for i in range(overlap_ranges.shape[1]):\n        for j in range(overlap_ranges.shape[2]):\n            start, stop, num = overlap_ranges[:, i, j]\n            # np.linspace requires an integer sample count\n            min_overlaps[:, i, j] = np.linspace(start, stop, int(num))\n    # do_eval returns eight values (AP and AP_R40); keep the four 11-point APs\n    mAP_bbox, mAP_bev, mAP_3d, mAP_aos = do_eval(\n        gt_annos, dt_annos, current_classes, min_overlaps, compute_aos)[:4]\n    # ret: [num_class, num_diff, num_minoverlap]\n    mAP_bbox = mAP_bbox.mean(-1)\n    mAP_bev = mAP_bev.mean(-1)\n    mAP_3d = mAP_3d.mean(-1)\n    if mAP_aos is not None:\n        mAP_aos = mAP_aos.mean(-1)\n    return mAP_bbox, mAP_bev, mAP_3d, mAP_aos\n\n\ndef get_official_eval_result(gt_annos, dt_annos, current_classes, PR_detail_dict=None):\n    overlap_0_7 = np.array([[0.7, 0.5, 0.5, 0.7,\n                             0.5, 0.7], [0.7, 0.5, 0.5, 0.7, 0.5, 0.7],\n                            [0.7, 0.5, 0.5, 0.7, 0.5, 0.7]])\n    overlap_0_5 = np.array([[0.7, 0.5, 0.5, 0.7,\n                             0.5, 0.5], [0.5, 0.25, 0.25, 0.5, 0.25, 0.5],\n                            [0.5, 0.25, 0.25, 0.5, 0.25, 0.5]])\n    min_overlaps = np.stack([overlap_0_7, overlap_0_5], axis=0)  # [2, 3, 6]\n    class_to_name = {\n        0: 'Car',\n        1: 'Pedestrian',\n        2: 'Cyclist',\n        3: 'Van',\n        4: 'Person_sitting',\n  
      5: 'Truck'\n    }\n    name_to_class = {v: n for n, v in class_to_name.items()}\n    if not isinstance(current_classes, (list, tuple)):\n        current_classes = [current_classes]\n    current_classes_int = []\n    for curcls in current_classes:\n        if isinstance(curcls, str):\n            current_classes_int.append(name_to_class[curcls])\n        else:\n            current_classes_int.append(curcls)\n    current_classes = current_classes_int\n    min_overlaps = min_overlaps[:, :, current_classes]\n    result = ''\n    # check whether alpha is valid\n    compute_aos = False\n    for anno in dt_annos:\n        if anno['alpha'].shape[0] != 0:\n            if anno['alpha'][0] != -10:\n                compute_aos = True\n            break\n    mAPbbox, mAPbev, mAP3d, mAPaos, mAPbbox_R40, mAPbev_R40, mAP3d_R40, mAPaos_R40 = do_eval(\n        gt_annos, dt_annos, current_classes, min_overlaps, compute_aos, PR_detail_dict=PR_detail_dict)\n\n    ret_dict = {}\n    for j, curcls in enumerate(current_classes):\n        # mAP threshold array: [num_minoverlap, metric, class]\n        # mAP result: [num_class, num_diff, num_minoverlap]\n        for i in range(min_overlaps.shape[0]):\n            result += print_str(\n                (f\"{class_to_name[curcls]} \"\n                 \"AP@{:.2f}, {:.2f}, {:.2f}:\".format(*min_overlaps[i, :, j])))\n            result += print_str((f\"bbox AP:{mAPbbox[j, 0, i]:.4f}, \"\n                                 f\"{mAPbbox[j, 1, i]:.4f}, \"\n                                 f\"{mAPbbox[j, 2, i]:.4f}\"))\n            result += print_str((f\"bev  AP:{mAPbev[j, 0, i]:.4f}, \"\n                                 f\"{mAPbev[j, 1, i]:.4f}, \"\n                                 f\"{mAPbev[j, 2, i]:.4f}\"))\n            result += print_str((f\"3d   AP:{mAP3d[j, 0, i]:.4f}, \"\n                                 f\"{mAP3d[j, 1, i]:.4f}, \"\n                                 f\"{mAP3d[j, 2, i]:.4f}\"))\n\n            if compute_aos:\n             
   result += print_str((f\"aos  AP:{mAPaos[j, 0, i]:.2f}, \"\n                                     f\"{mAPaos[j, 1, i]:.2f}, \"\n                                     f\"{mAPaos[j, 2, i]:.2f}\"))\n                # if i == 0:\n                   # ret_dict['%s_aos/easy' % class_to_name[curcls]] = mAPaos[j, 0, 0]\n                   # ret_dict['%s_aos/moderate' % class_to_name[curcls]] = mAPaos[j, 1, 0]\n                   # ret_dict['%s_aos/hard' % class_to_name[curcls]] = mAPaos[j, 2, 0]\n\n            result += print_str(\n                (f\"{class_to_name[curcls]} \"\n                 \"AP_R40@{:.2f}, {:.2f}, {:.2f}:\".format(*min_overlaps[i, :, j])))\n            result += print_str((f\"bbox AP:{mAPbbox_R40[j, 0, i]:.4f}, \"\n                                 f\"{mAPbbox_R40[j, 1, i]:.4f}, \"\n                                 f\"{mAPbbox_R40[j, 2, i]:.4f}\"))\n            result += print_str((f\"bev  AP:{mAPbev_R40[j, 0, i]:.4f}, \"\n                                 f\"{mAPbev_R40[j, 1, i]:.4f}, \"\n                                 f\"{mAPbev_R40[j, 2, i]:.4f}\"))\n            result += print_str((f\"3d   AP:{mAP3d_R40[j, 0, i]:.4f}, \"\n                                 f\"{mAP3d_R40[j, 1, i]:.4f}, \"\n                                 f\"{mAP3d_R40[j, 2, i]:.4f}\"))\n            if compute_aos:\n                result += print_str((f\"aos  AP:{mAPaos_R40[j, 0, i]:.2f}, \"\n                                     f\"{mAPaos_R40[j, 1, i]:.2f}, \"\n                                     f\"{mAPaos_R40[j, 2, i]:.2f}\"))\n                if i == 0:\n                   ret_dict['%s_aos/easy_R40' % class_to_name[curcls]] = mAPaos_R40[j, 0, 0]\n                   ret_dict['%s_aos/moderate_R40' % class_to_name[curcls]] = mAPaos_R40[j, 1, 0]\n                   ret_dict['%s_aos/hard_R40' % class_to_name[curcls]] = mAPaos_R40[j, 2, 0]\n\n            if i == 0:\n                # ret_dict['%s_3d/easy' % class_to_name[curcls]] = mAP3d[j, 0, 0]\n                # 
ret_dict['%s_3d/moderate' % class_to_name[curcls]] = mAP3d[j, 1, 0]\n                # ret_dict['%s_3d/hard' % class_to_name[curcls]] = mAP3d[j, 2, 0]\n                # ret_dict['%s_bev/easy' % class_to_name[curcls]] = mAPbev[j, 0, 0]\n                # ret_dict['%s_bev/moderate' % class_to_name[curcls]] = mAPbev[j, 1, 0]\n                # ret_dict['%s_bev/hard' % class_to_name[curcls]] = mAPbev[j, 2, 0]\n                # ret_dict['%s_image/easy' % class_to_name[curcls]] = mAPbbox[j, 0, 0]\n                # ret_dict['%s_image/moderate' % class_to_name[curcls]] = mAPbbox[j, 1, 0]\n                # ret_dict['%s_image/hard' % class_to_name[curcls]] = mAPbbox[j, 2, 0]\n\n                ret_dict['%s_3d/easy_R40' % class_to_name[curcls]] = mAP3d_R40[j, 0, 0]\n                ret_dict['%s_3d/moderate_R40' % class_to_name[curcls]] = mAP3d_R40[j, 1, 0]\n                ret_dict['%s_3d/hard_R40' % class_to_name[curcls]] = mAP3d_R40[j, 2, 0]\n                ret_dict['%s_bev/easy_R40' % class_to_name[curcls]] = mAPbev_R40[j, 0, 0]\n                ret_dict['%s_bev/moderate_R40' % class_to_name[curcls]] = mAPbev_R40[j, 1, 0]\n                ret_dict['%s_bev/hard_R40' % class_to_name[curcls]] = mAPbev_R40[j, 2, 0]\n                ret_dict['%s_image/easy_R40' % class_to_name[curcls]] = mAPbbox_R40[j, 0, 0]\n                ret_dict['%s_image/moderate_R40' % class_to_name[curcls]] = mAPbbox_R40[j, 1, 0]\n                ret_dict['%s_image/hard_R40' % class_to_name[curcls]] = mAPbbox_R40[j, 2, 0]\n\n    return result, ret_dict\n\n\ndef get_coco_eval_result(gt_annos, dt_annos, current_classes):\n    class_to_name = {\n        0: 'Car',\n        1: 'Pedestrian',\n        2: 'Cyclist',\n        3: 'Van',\n        4: 'Person_sitting',\n    }\n    class_to_range = {\n        0: [0.5, 0.95, 10],\n        1: [0.25, 0.7, 10],\n        2: [0.25, 0.7, 10],\n        3: [0.5, 0.95, 10],\n        4: [0.25, 0.7, 10],\n    }\n    name_to_class = {v: n for n, v in class_to_name.items()}\n 
   if not isinstance(current_classes, (list, tuple)):\n        current_classes = [current_classes]\n    current_classes_int = []\n    for curcls in current_classes:\n        if isinstance(curcls, str):\n            current_classes_int.append(name_to_class[curcls])\n        else:\n            current_classes_int.append(curcls)\n    current_classes = current_classes_int\n    overlap_ranges = np.zeros([3, 3, len(current_classes)])\n    for i, curcls in enumerate(current_classes):\n        overlap_ranges[:, :, i] = np.array(\n            class_to_range[curcls])[:, np.newaxis]\n    result = ''\n    # check whether alpha is valid\n    compute_aos = False\n    for anno in dt_annos:\n        if anno['alpha'].shape[0] != 0:\n            if anno['alpha'][0] != -10:\n                compute_aos = True\n            break\n    mAPbbox, mAPbev, mAP3d, mAPaos = do_coco_style_eval(\n        gt_annos, dt_annos, current_classes, overlap_ranges, compute_aos)\n    for j, curcls in enumerate(current_classes):\n        # mAP threshold array: [num_minoverlap, metric, class]\n        # mAP result: [num_class, num_diff, num_minoverlap]\n        o_range = np.array(class_to_range[curcls])[[0, 2, 1]]\n        o_range[1] = (o_range[2] - o_range[0]) / (o_range[1] - 1)\n        result += print_str((f\"{class_to_name[curcls]} \"\n                             \"coco AP@{:.2f}:{:.2f}:{:.2f}:\".format(*o_range)))\n        result += print_str((f\"bbox AP:{mAPbbox[j, 0]:.2f}, \"\n                             f\"{mAPbbox[j, 1]:.2f}, \"\n                             f\"{mAPbbox[j, 2]:.2f}\"))\n        result += print_str((f\"bev  AP:{mAPbev[j, 0]:.2f}, \"\n                             f\"{mAPbev[j, 1]:.2f}, \"\n                             f\"{mAPbev[j, 2]:.2f}\"))\n        result += print_str((f\"3d   AP:{mAP3d[j, 0]:.2f}, \"\n                             f\"{mAP3d[j, 1]:.2f}, \"\n                             f\"{mAP3d[j, 2]:.2f}\"))\n        if compute_aos:\n            result += print_str((f\"aos  
AP:{mAPaos[j, 0]:.2f}, \"\n                                 f\"{mAPaos[j, 1]:.2f}, \"\n                                 f\"{mAPaos[j, 2]:.2f}\"))\n    return result\n"
  },
  {
    "path": "pcdet/datasets/kitti/kitti_object_eval_python/evaluate.py",
    "content": "import time\n\nimport fire\n\nimport pcdet.datasets.kitti.kitti_object_eval_python.kitti_common as kitti\nfrom pcdet.datasets.kitti.kitti_object_eval_python.eval import get_coco_eval_result, get_official_eval_result\nimport pickle\n\ndef _read_imageset_file(path):\n    with open(path, 'r') as f:\n        lines = f.readlines()\n    return [int(line) for line in lines]\n\n\ndef evaluate(label_path,\n             result_path,\n             label_split_file,\n             current_class=[0,1,2],\n             coco=False,\n             score_thresh=-1):\n    dt_annos = pickle.load(open(result_path,'rb'))#kitti.get_label_annos(result_path)\n    if score_thresh > 0:\n        dt_annos = kitti.filter_annos_low_score(dt_annos, score_thresh)\n    val_image_ids = _read_imageset_file(label_split_file)\n    gt_annos = kitti.get_label_annos(label_path, val_image_ids)\n    if coco:\n        return get_coco_eval_result(gt_annos, dt_annos, current_class)\n    else:\n        return get_official_eval_result(gt_annos, dt_annos, current_class)\n\ndef evaluate_dis(label_path,\n             result_path,\n             label_split_file,\n             current_class=[0,1,2],\n             coco=False,\n             min_dis = 0,\n             max_dis = 100):\n    dt_annos = pickle.load(open(result_path,'rb'))\n    val_image_ids = _read_imageset_file(label_split_file)\n    gt_annos = kitti.get_label_annos(label_path, val_image_ids)\n\n    gt_annos = kitti.filter_gt_annos_dis(gt_annos,min_dis,max_dis)\n    dt_annos = kitti.filter_det_annos_dis(dt_annos, min_dis, max_dis)\n\n    if coco:\n        return get_coco_eval_result(gt_annos, dt_annos, current_class)\n    else:\n        return get_official_eval_result(gt_annos, dt_annos, current_class)\n\n"
  },
  {
    "path": "pcdet/datasets/kitti/kitti_object_eval_python/kitti_common.py",
    "content": "import concurrent.futures as futures\nimport os\nimport pathlib\nimport re\nfrom collections import OrderedDict\n\nimport numpy as np\nfrom skimage import io\n\n\ndef get_image_index_str(img_idx):\n    return \"{:06d}\".format(img_idx)\n\n\ndef get_kitti_info_path(idx,\n                        prefix,\n                        info_type='image_2',\n                        file_tail='.png',\n                        training=True,\n                        relative_path=True):\n    img_idx_str = get_image_index_str(idx)\n    img_idx_str += file_tail\n    prefix = pathlib.Path(prefix)\n    if training:\n        file_path = pathlib.Path('training') / info_type / img_idx_str\n    else:\n        file_path = pathlib.Path('testing') / info_type / img_idx_str\n    if not (prefix / file_path).exists():\n        raise ValueError(\"file not exist: {}\".format(file_path))\n    if relative_path:\n        return str(file_path)\n    else:\n        return str(prefix / file_path)\n\n\ndef get_image_path(idx, prefix, training=True, relative_path=True):\n    return get_kitti_info_path(idx, prefix, 'image_2', '.png', training,\n                               relative_path)\n\n\ndef get_label_path(idx, prefix, training=True, relative_path=True):\n    return get_kitti_info_path(idx, prefix, 'label_2', '.txt', training,\n                               relative_path)\n\n\ndef get_velodyne_path(idx, prefix, training=True, relative_path=True):\n    return get_kitti_info_path(idx, prefix, 'velodyne', '.bin', training,\n                               relative_path)\n\n\ndef get_calib_path(idx, prefix, training=True, relative_path=True):\n    return get_kitti_info_path(idx, prefix, 'calib', '.txt', training,\n                               relative_path)\n\n\ndef _extend_matrix(mat):\n    mat = np.concatenate([mat, np.array([[0., 0., 0., 1.]])], axis=0)\n    return mat\n\n\ndef get_kitti_image_info(path,\n                         training=True,\n                         
label_info=True,\n                         velodyne=False,\n                         calib=False,\n                         image_ids=7481,\n                         extend_matrix=True,\n                         num_worker=8,\n                         relative_path=True,\n                         with_imageshape=True):\n    # image_infos = []\n    root_path = pathlib.Path(path)\n    if not isinstance(image_ids, list):\n        image_ids = list(range(image_ids))\n\n    def map_func(idx):\n        image_info = {'image_idx': idx}\n        annotations = None\n        if velodyne:\n            image_info['velodyne_path'] = get_velodyne_path(\n                idx, path, training, relative_path)\n        image_info['img_path'] = get_image_path(idx, path, training,\n                                                relative_path)\n        if with_imageshape:\n            img_path = image_info['img_path']\n            if relative_path:\n                img_path = str(root_path / img_path)\n            image_info['img_shape'] = np.array(\n                io.imread(img_path).shape[:2], dtype=np.int32)\n        if label_info:\n            label_path = get_label_path(idx, path, training, relative_path)\n            if relative_path:\n                label_path = str(root_path / label_path)\n            annotations = get_label_anno(label_path)\n        if calib:\n            calib_path = get_calib_path(\n                idx, path, training, relative_path=False)\n            with open(calib_path, 'r') as f:\n                lines = f.readlines()\n            P0 = np.array(\n                [float(info) for info in lines[0].split(' ')[1:13]]).reshape(\n                    [3, 4])\n            P1 = np.array(\n                [float(info) for info in lines[1].split(' ')[1:13]]).reshape(\n                    [3, 4])\n            P2 = np.array(\n                [float(info) for info in lines[2].split(' ')[1:13]]).reshape(\n                    [3, 4])\n            P3 = np.array(\n        
        [float(info) for info in lines[3].split(' ')[1:13]]).reshape(\n                    [3, 4])\n            if extend_matrix:\n                P0 = _extend_matrix(P0)\n                P1 = _extend_matrix(P1)\n                P2 = _extend_matrix(P2)\n                P3 = _extend_matrix(P3)\n            image_info['calib/P0'] = P0\n            image_info['calib/P1'] = P1\n            image_info['calib/P2'] = P2\n            image_info['calib/P3'] = P3\n            R0_rect = np.array([\n                float(info) for info in lines[4].split(' ')[1:10]\n            ]).reshape([3, 3])\n            if extend_matrix:\n                rect_4x4 = np.zeros([4, 4], dtype=R0_rect.dtype)\n                rect_4x4[3, 3] = 1.\n                rect_4x4[:3, :3] = R0_rect\n            else:\n                rect_4x4 = R0_rect\n            image_info['calib/R0_rect'] = rect_4x4\n            Tr_velo_to_cam = np.array([\n                float(info) for info in lines[5].split(' ')[1:13]\n            ]).reshape([3, 4])\n            Tr_imu_to_velo = np.array([\n                float(info) for info in lines[6].split(' ')[1:13]\n            ]).reshape([3, 4])\n            if extend_matrix:\n                Tr_velo_to_cam = _extend_matrix(Tr_velo_to_cam)\n                Tr_imu_to_velo = _extend_matrix(Tr_imu_to_velo)\n            image_info['calib/Tr_velo_to_cam'] = Tr_velo_to_cam\n            image_info['calib/Tr_imu_to_velo'] = Tr_imu_to_velo\n        if annotations is not None:\n            image_info['annos'] = annotations\n            add_difficulty_to_annos(image_info)\n        return image_info\n\n    with futures.ThreadPoolExecutor(num_worker) as executor:\n        image_infos = executor.map(map_func, image_ids)\n    return list(image_infos)\n\n\ndef filter_kitti_anno(image_anno,\n                      used_classes,\n                      used_difficulty=None,\n                      dontcare_iou=None):\n    if not isinstance(used_classes, (list, tuple)):\n        used_classes = 
[used_classes]\n    img_filtered_annotations = {}\n    relevant_annotation_indices = [\n        i for i, x in enumerate(image_anno['name']) if x in used_classes\n    ]\n    for key in image_anno.keys():\n        img_filtered_annotations[key] = (\n            image_anno[key][relevant_annotation_indices])\n    if used_difficulty is not None:\n        relevant_annotation_indices = [\n            i for i, x in enumerate(img_filtered_annotations['difficulty'])\n            if x in used_difficulty\n        ]\n        for key in image_anno.keys():\n            img_filtered_annotations[key] = (\n                img_filtered_annotations[key][relevant_annotation_indices])\n\n    if 'DontCare' in used_classes and dontcare_iou is not None:\n        dont_care_indices = [\n            i for i, x in enumerate(img_filtered_annotations['name'])\n            if x == 'DontCare'\n        ]\n        # bounding box format [y_min, x_min, y_max, x_max]\n        all_boxes = img_filtered_annotations['bbox']\n        ious = iou(all_boxes, all_boxes[dont_care_indices])\n\n        # Remove all bounding boxes that overlap with a dontcare region.\n        if ious.size > 0:\n            boxes_to_remove = np.amax(ious, axis=1) > dontcare_iou\n            for key in image_anno.keys():\n                img_filtered_annotations[key] = (img_filtered_annotations[key][\n                    np.logical_not(boxes_to_remove)])\n    return img_filtered_annotations\n\ndef filter_annos_low_score(image_annos, thresh):\n    new_image_annos = []\n    for anno in image_annos:\n        img_filtered_annotations = {}\n        relevant_annotation_indices = [\n            i for i, s in enumerate(anno['score']) if s >= thresh\n        ]\n        for key in anno.keys():\n            img_filtered_annotations[key] = (\n                anno[key][relevant_annotation_indices])\n        new_image_annos.append(img_filtered_annotations)\n    return new_image_annos\n\ndef filter_gt_annos_dis(image_annos, dis_min=0, 
dis_max=100):\n    new_image_annos = []\n    for anno in image_annos:\n        img_filtered_annotations = {}\n\n        relevant_annotation_indices = [\n            i for i, s in enumerate(anno['location'])\n            if dis_min < np.sqrt(s[0]**2 + s[2]**2) < dis_max\n        ]\n        for key in anno.keys():\n            img_filtered_annotations[key] = (\n                    anno[key][relevant_annotation_indices])\n        new_image_annos.append(img_filtered_annotations)\n    return new_image_annos\n\n\ndef filter_det_annos_dis(image_annos, dis_min=0, dis_max=100):\n    new_image_annos = []\n    for anno in image_annos:\n        img_filtered_annotations = {}\n        relevant_annotation_indices = [\n            i for i, s in enumerate(anno['location'])\n            if dis_min < np.sqrt(s[0]**2 + s[2]**2) < dis_max\n        ]\n        for key in anno.keys():\n            if key in ['name', 'alpha', 'bbox', 'dimensions',\n                       'location', 'rotation_y', 'score', 'boxes_lidar']:\n                img_filtered_annotations[key] = (\n                        anno[key][relevant_annotation_indices])\n        new_image_annos.append(img_filtered_annotations)\n    return new_image_annos\n\n\ndef kitti_result_line(result_dict, precision=4):\n    prec_float = \"{\" + \":.{}f\".format(precision) + \"}\"\n    res_line = []\n    all_field_default = OrderedDict([\n        ('name', None),\n        ('truncated', -1),\n        ('occluded', -1),\n        ('alpha', -10),\n        ('bbox', None),\n        ('dimensions', [-1, -1, -1]),\n        ('location', [-1000, -1000, -1000]),\n        ('rotation_y', -10),\n        ('score', None),\n    ])\n    res_dict = [(key, None) for key, val in all_field_default.items()]\n    res_dict = OrderedDict(res_dict)\n    for key, val in result_dict.items():\n        if all_field_default[key] is None and val is None:\n            raise ValueError(\"you must specify a value for {}\".format(key))\n        res_dict[key] = val\n\n    for key, val in 
res_dict.items():\n        if key == 'name':\n            res_line.append(val)\n        elif key in ['truncated', 'alpha', 'rotation_y', 'score']:\n            if val is None:\n                res_line.append(str(all_field_default[key]))\n            else:\n                res_line.append(prec_float.format(val))\n        elif key == 'occluded':\n            if val is None:\n                res_line.append(str(all_field_default[key]))\n            else:\n                res_line.append('{}'.format(val))\n        elif key in ['bbox', 'dimensions', 'location']:\n            if val is None:\n                res_line += [str(v) for v in all_field_default[key]]\n            else:\n                res_line += [prec_float.format(v) for v in val]\n        else:\n            raise ValueError(\"unknown key. supported key:{}\".format(\n                res_dict.keys()))\n    return ' '.join(res_line)\n\n\ndef add_difficulty_to_annos(info):\n    min_height = [40, 25,\n                  25]  # minimum height for evaluated groundtruth/detections\n    max_occlusion = [\n        0, 1, 2\n    ]  # maximum occlusion level of the groundtruth used for eval_utils\n    max_trunc = [\n        0.15, 0.3, 0.5\n    ]  # maximum truncation level of the groundtruth used for eval_utils\n    annos = info['annos']\n    dims = annos['dimensions']  # lhw format\n    bbox = annos['bbox']\n    height = bbox[:, 3] - bbox[:, 1]\n    occlusion = annos['occluded']\n    truncation = annos['truncated']\n    diff = []\n    # the np.bool alias was removed in NumPy 1.24; the builtin bool is equivalent\n    easy_mask = np.ones((len(dims), ), dtype=bool)\n    moderate_mask = np.ones((len(dims), ), dtype=bool)\n    hard_mask = np.ones((len(dims), ), dtype=bool)\n    i = 0\n    for h, o, t in zip(height, occlusion, truncation):\n        if o > max_occlusion[0] or h <= min_height[0] or t > max_trunc[0]:\n            easy_mask[i] = False\n        if o > max_occlusion[1] or h <= min_height[1] or t > max_trunc[1]:\n            moderate_mask[i] = False\n        if o > max_occlusion[2] or h 
<= min_height[2] or t > max_trunc[2]:\n            hard_mask[i] = False\n        i += 1\n    is_easy = easy_mask\n    is_moderate = np.logical_xor(easy_mask, moderate_mask)\n    is_hard = np.logical_xor(hard_mask, moderate_mask)\n\n    for i in range(len(dims)):\n        if is_easy[i]:\n            diff.append(0)\n        elif is_moderate[i]:\n            diff.append(1)\n        elif is_hard[i]:\n            diff.append(2)\n        else:\n            diff.append(-1)\n    annos[\"difficulty\"] = np.array(diff, np.int32)\n    return diff\n\n\ndef get_label_anno(label_path):\n    annotations = {}\n    annotations.update({\n        'name': [],\n        'truncated': [],\n        'occluded': [],\n        'alpha': [],\n        'bbox': [],\n        'dimensions': [],\n        'location': [],\n        'rotation_y': []\n    })\n    with open(label_path, 'r') as f:\n        lines = f.readlines()\n    # if len(lines) == 0 or len(lines[0]) < 15:\n    #     content = []\n    # else:\n    content = [line.strip().split(' ') for line in lines]\n    annotations['name'] = np.array([x[0] for x in content])\n    annotations['truncated'] = np.array([float(x[1]) for x in content])\n    annotations['occluded'] = np.array([int(x[2]) for x in content])\n    annotations['alpha'] = np.array([float(x[3]) for x in content])\n    annotations['bbox'] = np.array(\n        [[float(info) for info in x[4:8]] for x in content]).reshape(-1, 4)\n    # dimensions will convert hwl format to standard lhw(camera) format.\n    annotations['dimensions'] = np.array(\n        [[float(info) for info in x[8:11]] for x in content]).reshape(\n            -1, 3)[:, [2, 0, 1]]\n    annotations['location'] = np.array(\n        [[float(info) for info in x[11:14]] for x in content]).reshape(-1, 3)\n    annotations['rotation_y'] = np.array(\n        [float(x[14]) for x in content]).reshape(-1)\n    if len(content) != 0 and len(content[0]) == 16:  # have score\n        annotations['score'] = np.array([float(x[15]) for x in 
content])\n    else:\n        annotations['score'] = np.zeros([len(annotations['bbox'])])\n    return annotations\n\ndef get_label_annos(label_folder, image_ids=None):\n    if image_ids is None:\n        filepaths = pathlib.Path(label_folder).glob('*.txt')\n        prog = re.compile(r'^\\d{6}.txt$')\n        filepaths = filter(lambda f: prog.match(f.name), filepaths)\n        image_ids = [int(p.stem) for p in filepaths]\n        image_ids = sorted(image_ids)\n    if not isinstance(image_ids, list):\n        image_ids = list(range(image_ids))\n    annos = []\n    label_folder = pathlib.Path(label_folder)\n    for idx in image_ids:\n        image_idx = get_image_index_str(idx)\n        label_filename = label_folder / (image_idx + '.txt')\n        annos.append(get_label_anno(label_filename))\n    return annos\n\ndef area(boxes, add1=False):\n    \"\"\"Computes area of boxes.\n\n    Args:\n        boxes: Numpy array with shape [N, 4] holding N boxes\n\n    Returns:\n        a numpy array with shape [N*1] representing box areas\n    \"\"\"\n    if add1:\n        return (boxes[:, 2] - boxes[:, 0] + 1.0) * (\n            boxes[:, 3] - boxes[:, 1] + 1.0)\n    else:\n        return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])\n\n\ndef intersection(boxes1, boxes2, add1=False):\n    \"\"\"Compute pairwise intersection areas between boxes.\n\n    Args:\n        boxes1: a numpy array with shape [N, 4] holding N boxes\n        boxes2: a numpy array with shape [M, 4] holding M boxes\n\n    Returns:\n        a numpy array with shape [N*M] representing pairwise intersection area\n    \"\"\"\n    [y_min1, x_min1, y_max1, x_max1] = np.split(boxes1, 4, axis=1)\n    [y_min2, x_min2, y_max2, x_max2] = np.split(boxes2, 4, axis=1)\n\n    all_pairs_min_ymax = np.minimum(y_max1, np.transpose(y_max2))\n    all_pairs_max_ymin = np.maximum(y_min1, np.transpose(y_min2))\n    if add1:\n        all_pairs_min_ymax += 1.0\n    intersect_heights = np.maximum(\n        
np.zeros(all_pairs_max_ymin.shape),\n        all_pairs_min_ymax - all_pairs_max_ymin)\n\n    all_pairs_min_xmax = np.minimum(x_max1, np.transpose(x_max2))\n    all_pairs_max_xmin = np.maximum(x_min1, np.transpose(x_min2))\n    if add1:\n        all_pairs_min_xmax += 1.0\n    intersect_widths = np.maximum(\n        np.zeros(all_pairs_max_xmin.shape),\n        all_pairs_min_xmax - all_pairs_max_xmin)\n    return intersect_heights * intersect_widths\n\n\ndef iou(boxes1, boxes2, add1=False):\n    \"\"\"Computes pairwise intersection-over-union between box collections.\n\n    Args:\n        boxes1: a numpy array with shape [N, 4] holding N boxes.\n        boxes2: a numpy array with shape [M, 4] holding M boxes.\n\n    Returns:\n        a numpy array with shape [N, M] representing pairwise iou scores.\n    \"\"\"\n    intersect = intersection(boxes1, boxes2, add1)\n    area1 = area(boxes1, add1)\n    area2 = area(boxes2, add1)\n    union = np.expand_dims(\n        area1, axis=1) + np.expand_dims(\n            area2, axis=0) - intersect\n    return intersect / union\n"
  },
  {
    "path": "pcdet/datasets/kitti/kitti_object_eval_python/rotate_iou.py",
    "content": "#####################\n# Based on https://github.com/hongzhenwang/RRPN-revise\n# Licensed under The MIT License\n# Author: yanyan, scrin@foxmail.com\n#####################\nimport math\n\nimport numba\nimport numpy as np\nfrom numba import cuda\n\n\n@numba.jit(nopython=True)\ndef div_up(m, n):\n    return m // n + (m % n > 0)\n\n@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)\ndef trangle_area(a, b, c):\n    return ((a[0] - c[0]) * (b[1] - c[1]) - (a[1] - c[1]) *\n            (b[0] - c[0])) / 2.0\n\n\n@cuda.jit('(float32[:], int32)', device=True, inline=True)\ndef area(int_pts, num_of_inter):\n    area_val = 0.0\n    for i in range(num_of_inter - 2):\n        area_val += abs(\n            trangle_area(int_pts[:2], int_pts[2 * i + 2:2 * i + 4],\n                         int_pts[2 * i + 4:2 * i + 6]))\n    return area_val\n\n\n@cuda.jit('(float32[:], int32)', device=True, inline=True)\ndef sort_vertex_in_convex_polygon(int_pts, num_of_inter):\n    if num_of_inter > 0:\n        center = cuda.local.array((2, ), dtype=numba.float32)\n        center[:] = 0.0\n        for i in range(num_of_inter):\n            center[0] += int_pts[2 * i]\n            center[1] += int_pts[2 * i + 1]\n        center[0] /= num_of_inter\n        center[1] /= num_of_inter\n        v = cuda.local.array((2, ), dtype=numba.float32)\n        vs = cuda.local.array((16, ), dtype=numba.float32)\n        for i in range(num_of_inter):\n            v[0] = int_pts[2 * i] - center[0]\n            v[1] = int_pts[2 * i + 1] - center[1]\n            d = math.sqrt(v[0] * v[0] + v[1] * v[1])\n            v[0] = v[0] / d\n            v[1] = v[1] / d\n            if v[1] < 0:\n                v[0] = -2 - v[0]\n            vs[i] = v[0]\n        j = 0\n        temp = 0\n        for i in range(1, num_of_inter):\n            if vs[i - 1] > vs[i]:\n                temp = vs[i]\n                tx = int_pts[2 * i]\n                ty = int_pts[2 * i + 1]\n                j = 
i\n                while j > 0 and vs[j - 1] > temp:\n                    vs[j] = vs[j - 1]\n                    int_pts[j * 2] = int_pts[j * 2 - 2]\n                    int_pts[j * 2 + 1] = int_pts[j * 2 - 1]\n                    j -= 1\n\n                vs[j] = temp\n                int_pts[j * 2] = tx\n                int_pts[j * 2 + 1] = ty\n\n\n@cuda.jit(\n    '(float32[:], float32[:], int32, int32, float32[:])',\n    device=True,\n    inline=True)\ndef line_segment_intersection(pts1, pts2, i, j, temp_pts):\n    A = cuda.local.array((2, ), dtype=numba.float32)\n    B = cuda.local.array((2, ), dtype=numba.float32)\n    C = cuda.local.array((2, ), dtype=numba.float32)\n    D = cuda.local.array((2, ), dtype=numba.float32)\n\n    A[0] = pts1[2 * i]\n    A[1] = pts1[2 * i + 1]\n\n    B[0] = pts1[2 * ((i + 1) % 4)]\n    B[1] = pts1[2 * ((i + 1) % 4) + 1]\n\n    C[0] = pts2[2 * j]\n    C[1] = pts2[2 * j + 1]\n\n    D[0] = pts2[2 * ((j + 1) % 4)]\n    D[1] = pts2[2 * ((j + 1) % 4) + 1]\n    BA0 = B[0] - A[0]\n    BA1 = B[1] - A[1]\n    DA0 = D[0] - A[0]\n    CA0 = C[0] - A[0]\n    DA1 = D[1] - A[1]\n    CA1 = C[1] - A[1]\n    acd = DA1 * CA0 > CA1 * DA0\n    bcd = (D[1] - B[1]) * (C[0] - B[0]) > (C[1] - B[1]) * (D[0] - B[0])\n    if acd != bcd:\n        abc = CA1 * BA0 > BA1 * CA0\n        abd = DA1 * BA0 > BA1 * DA0\n        if abc != abd:\n            DC0 = D[0] - C[0]\n            DC1 = D[1] - C[1]\n            ABBA = A[0] * B[1] - B[0] * A[1]\n            CDDC = C[0] * D[1] - D[0] * C[1]\n            DH = BA1 * DC0 - BA0 * DC1\n            Dx = ABBA * DC0 - BA0 * CDDC\n            Dy = ABBA * DC1 - BA1 * CDDC\n            temp_pts[0] = Dx / DH\n            temp_pts[1] = Dy / DH\n            return True\n    return False\n\n\n@cuda.jit(\n    '(float32[:], float32[:], int32, int32, float32[:])',\n    device=True,\n    inline=True)\ndef line_segment_intersection_v1(pts1, pts2, i, j, temp_pts):\n    a = cuda.local.array((2, ), dtype=numba.float32)\n    b = 
cuda.local.array((2, ), dtype=numba.float32)\n    c = cuda.local.array((2, ), dtype=numba.float32)\n    d = cuda.local.array((2, ), dtype=numba.float32)\n\n    a[0] = pts1[2 * i]\n    a[1] = pts1[2 * i + 1]\n\n    b[0] = pts1[2 * ((i + 1) % 4)]\n    b[1] = pts1[2 * ((i + 1) % 4) + 1]\n\n    c[0] = pts2[2 * j]\n    c[1] = pts2[2 * j + 1]\n\n    d[0] = pts2[2 * ((j + 1) % 4)]\n    d[1] = pts2[2 * ((j + 1) % 4) + 1]\n\n    area_abc = trangle_area(a, b, c)\n    area_abd = trangle_area(a, b, d)\n\n    if area_abc * area_abd >= 0:\n        return False\n\n    area_cda = trangle_area(c, d, a)\n    area_cdb = area_cda + area_abc - area_abd\n\n    if area_cda * area_cdb >= 0:\n        return False\n    t = area_cda / (area_abd - area_abc)\n\n    dx = t * (b[0] - a[0])\n    dy = t * (b[1] - a[1])\n    temp_pts[0] = a[0] + dx\n    temp_pts[1] = a[1] + dy\n    return True\n\n\n@cuda.jit('(float32, float32, float32[:])', device=True, inline=True)\ndef point_in_quadrilateral(pt_x, pt_y, corners):\n    ab0 = corners[2] - corners[0]\n    ab1 = corners[3] - corners[1]\n\n    ad0 = corners[6] - corners[0]\n    ad1 = corners[7] - corners[1]\n\n    ap0 = pt_x - corners[0]\n    ap1 = pt_y - corners[1]\n\n    abab = ab0 * ab0 + ab1 * ab1\n    abap = ab0 * ap0 + ab1 * ap1\n    adad = ad0 * ad0 + ad1 * ad1\n    adap = ad0 * ap0 + ad1 * ap1\n\n    return abab >= abap and abap >= 0 and adad >= adap and adap >= 0\n\n\n@cuda.jit('(float32[:], float32[:], float32[:])', device=True, inline=True)\ndef quadrilateral_intersection(pts1, pts2, int_pts):\n    num_of_inter = 0\n    for i in range(4):\n        if point_in_quadrilateral(pts1[2 * i], pts1[2 * i + 1], pts2):\n            int_pts[num_of_inter * 2] = pts1[2 * i]\n            int_pts[num_of_inter * 2 + 1] = pts1[2 * i + 1]\n            num_of_inter += 1\n        if point_in_quadrilateral(pts2[2 * i], pts2[2 * i + 1], pts1):\n            int_pts[num_of_inter * 2] = pts2[2 * i]\n            int_pts[num_of_inter * 2 + 1] = pts2[2 * i + 1]\n     
       num_of_inter += 1\n    temp_pts = cuda.local.array((2, ), dtype=numba.float32)\n    for i in range(4):\n        for j in range(4):\n            has_pts = line_segment_intersection(pts1, pts2, i, j, temp_pts)\n            if has_pts:\n                int_pts[num_of_inter * 2] = temp_pts[0]\n                int_pts[num_of_inter * 2 + 1] = temp_pts[1]\n                num_of_inter += 1\n\n    return num_of_inter\n\n\n@cuda.jit('(float32[:], float32[:])', device=True, inline=True)\ndef rbbox_to_corners(corners, rbbox):\n    # generate clockwise corners and rotate it clockwise\n    angle = rbbox[4]\n    a_cos = math.cos(angle)\n    a_sin = math.sin(angle)\n    center_x = rbbox[0]\n    center_y = rbbox[1]\n    x_d = rbbox[2]\n    y_d = rbbox[3]\n    corners_x = cuda.local.array((4, ), dtype=numba.float32)\n    corners_y = cuda.local.array((4, ), dtype=numba.float32)\n    corners_x[0] = -x_d / 2\n    corners_x[1] = -x_d / 2\n    corners_x[2] = x_d / 2\n    corners_x[3] = x_d / 2\n    corners_y[0] = -y_d / 2\n    corners_y[1] = y_d / 2\n    corners_y[2] = y_d / 2\n    corners_y[3] = -y_d / 2\n    for i in range(4):\n        corners[2 *\n                i] = a_cos * corners_x[i] + a_sin * corners_y[i] + center_x\n        corners[2 * i\n                + 1] = -a_sin * corners_x[i] + a_cos * corners_y[i] + center_y\n\n\n@cuda.jit('(float32[:], float32[:])', device=True, inline=True)\ndef inter(rbbox1, rbbox2):\n    corners1 = cuda.local.array((8, ), dtype=numba.float32)\n    corners2 = cuda.local.array((8, ), dtype=numba.float32)\n    intersection_corners = cuda.local.array((16, ), dtype=numba.float32)\n\n    rbbox_to_corners(corners1, rbbox1)\n    rbbox_to_corners(corners2, rbbox2)\n\n    num_intersection = quadrilateral_intersection(corners1, corners2,\n                                                  intersection_corners)\n    sort_vertex_in_convex_polygon(intersection_corners, num_intersection)\n    # print(intersection_corners.reshape([-1, 
2])[:num_intersection])\n\n    return area(intersection_corners, num_intersection)\n\n\n@cuda.jit('(float32[:], float32[:], int32)', device=True, inline=True)\ndef devRotateIoUEval(rbox1, rbox2, criterion=-1):\n    area1 = rbox1[2] * rbox1[3]\n    area2 = rbox2[2] * rbox2[3]\n    area_inter = inter(rbox1, rbox2)\n    if criterion == -1:\n        return area_inter / (area1 + area2 - area_inter)\n    elif criterion == 0:\n        return area_inter / area1\n    elif criterion == 1:\n        return area_inter / area2\n    else:\n        return area_inter\n\n@cuda.jit('(int64, int64, float32[:], float32[:], float32[:], int32)', fastmath=False)\ndef rotate_iou_kernel_eval(N, K, dev_boxes, dev_query_boxes, dev_iou, criterion=-1):\n    threadsPerBlock = 8 * 8\n    row_start = cuda.blockIdx.x\n    col_start = cuda.blockIdx.y\n    tx = cuda.threadIdx.x\n    row_size = min(N - row_start * threadsPerBlock, threadsPerBlock)\n    col_size = min(K - col_start * threadsPerBlock, threadsPerBlock)\n    block_boxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)\n    block_qboxes = cuda.shared.array(shape=(64 * 5, ), dtype=numba.float32)\n\n    dev_query_box_idx = threadsPerBlock * col_start + tx\n    dev_box_idx = threadsPerBlock * row_start + tx\n    if (tx < col_size):\n        block_qboxes[tx * 5 + 0] = dev_query_boxes[dev_query_box_idx * 5 + 0]\n        block_qboxes[tx * 5 + 1] = dev_query_boxes[dev_query_box_idx * 5 + 1]\n        block_qboxes[tx * 5 + 2] = dev_query_boxes[dev_query_box_idx * 5 + 2]\n        block_qboxes[tx * 5 + 3] = dev_query_boxes[dev_query_box_idx * 5 + 3]\n        block_qboxes[tx * 5 + 4] = dev_query_boxes[dev_query_box_idx * 5 + 4]\n    if (tx < row_size):\n        block_boxes[tx * 5 + 0] = dev_boxes[dev_box_idx * 5 + 0]\n        block_boxes[tx * 5 + 1] = dev_boxes[dev_box_idx * 5 + 1]\n        block_boxes[tx * 5 + 2] = dev_boxes[dev_box_idx * 5 + 2]\n        block_boxes[tx * 5 + 3] = dev_boxes[dev_box_idx * 5 + 3]\n        block_boxes[tx * 5 + 
4] = dev_boxes[dev_box_idx * 5 + 4]\n    cuda.syncthreads()\n    if tx < row_size:\n        for i in range(col_size):\n            offset = row_start * threadsPerBlock * K + col_start * threadsPerBlock + tx * K + i\n            dev_iou[offset] = devRotateIoUEval(block_qboxes[i * 5:i * 5 + 5],\n                                           block_boxes[tx * 5:tx * 5 + 5], criterion)\n\n\ndef rotate_iou_gpu_eval(boxes, query_boxes, criterion=-1, device_id=0):\n    \"\"\"Rotated box IoU computed on the GPU. 500x faster than the CPU version\n    (takes 5 ms in one example with numba.cuda code).\n    Converted from [this project](\n        https://github.com/hongzhenwang/RRPN-revise/tree/master/pcdet/rotation).\n\n    Args:\n        boxes (float array: [N, 5]): rbboxes. format: centers, dims,\n            angles (clockwise when positive)\n        query_boxes (float array: [K, 5]): rbboxes in the same format as boxes\n        criterion (int, optional): Defaults to -1. -1 returns IoU, 0 returns\n            intersection / query box area, 1 returns intersection / box area,\n            any other value returns the raw intersection area\n        device_id (int, optional): Defaults to 0. Index of the CUDA device to use\n\n    Returns:\n        np.ndarray: [N, K] array of pairwise values selected by criterion\n    \"\"\"\n    box_dtype = boxes.dtype\n    boxes = boxes.astype(np.float32)\n    query_boxes = query_boxes.astype(np.float32)\n    N = boxes.shape[0]\n    K = query_boxes.shape[0]\n    iou = np.zeros((N, K), dtype=np.float32)\n    if N == 0 or K == 0:\n        return iou\n    threadsPerBlock = 8 * 8\n    cuda.select_device(device_id)\n    blockspergrid = (div_up(N, threadsPerBlock), div_up(K, threadsPerBlock))\n    \n    stream = cuda.stream()\n    with stream.auto_synchronize():\n        boxes_dev = cuda.to_device(boxes.reshape([-1]), stream)\n        query_boxes_dev = cuda.to_device(query_boxes.reshape([-1]), stream)\n        iou_dev = cuda.to_device(iou.reshape([-1]), stream)\n        rotate_iou_kernel_eval[blockspergrid, threadsPerBlock, stream](\n            N, K, boxes_dev, query_boxes_dev, iou_dev, criterion)\n        iou_dev.copy_to_host(iou.reshape([-1]), stream=stream)\n    return iou.astype(boxes.dtype)\n"
  },
  {
    "path": "pcdet/datasets/processor/data_processor.py",
    "content": "from functools import partial\n\nimport numpy as np\nfrom skimage import transform\n\nfrom ...utils import box_utils, common_utils\n\ntv = None\ntry:\n    import cumm.tensorview as tv\nexcept:\n    pass\n\nclass VoxelGeneratorWrapper():\n    def __init__(self, vsize_xyz, coors_range_xyz, num_point_features, max_num_points_per_voxel, max_num_voxels):\n        try:\n            from spconv.utils import VoxelGeneratorV2 as VoxelGenerator\n            self.spconv_ver = 1\n        except:\n            try:\n                from spconv.utils import VoxelGenerator\n                self.spconv_ver = 1\n            except:\n                from spconv.utils import Point2VoxelCPU3d as VoxelGenerator\n                self.spconv_ver = 2\n\n        if self.spconv_ver == 1:\n            self._voxel_generator = VoxelGenerator(\n                voxel_size=vsize_xyz,\n                point_cloud_range=coors_range_xyz,\n                max_num_points=max_num_points_per_voxel,\n                max_voxels=max_num_voxels\n            )\n        else:\n            self._voxel_generator = VoxelGenerator(\n                vsize_xyz=vsize_xyz,\n                coors_range_xyz=coors_range_xyz,\n                num_point_features=num_point_features,\n                max_num_points_per_voxel=max_num_points_per_voxel,\n                max_num_voxels=max_num_voxels\n            )\n\n    def generate(self, points):\n        if self.spconv_ver == 1:\n            voxel_output = self._voxel_generator.generate(points)\n            if isinstance(voxel_output, dict):\n                voxels, coordinates, num_points = \\\n                    voxel_output['voxels'], voxel_output['coordinates'], voxel_output['num_points_per_voxel']\n            else:\n                voxels, coordinates, num_points = voxel_output\n        else:\n            assert tv is not None, f\"Unexpected error, library: 'cumm' wasn't imported properly.\"\n            voxel_output = 
self._voxel_generator.point_to_voxel(tv.from_numpy(points))\n            tv_voxels, tv_coordinates, tv_num_points = voxel_output\n            # make copy with numpy(), since numpy_view() will disappear as soon as the generator is deleted\n            voxels = tv_voxels.numpy()\n            coordinates = tv_coordinates.numpy()\n            num_points = tv_num_points.numpy()\n        return voxels, coordinates, num_points\n\nclass DataProcessor(object):\n    def __init__(self, processor_configs, point_cloud_range, training, rot_num, num_point_features):\n        self.rot_num = rot_num\n        self.point_cloud_range = point_cloud_range\n        self.training = training\n        self.num_point_features = num_point_features\n        self.mode = 'train' if training else 'test'\n        self.grid_size = self.voxel_size = None\n        self.data_processor_queue = []\n\n        self.voxel_generator = None\n\n        for cur_cfg in processor_configs:\n            cur_processor = getattr(self, cur_cfg.NAME)(config=cur_cfg)\n            self.data_processor_queue.append(cur_processor)\n\n    def mask_points_and_boxes_outside_range(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.mask_points_and_boxes_outside_range, config=config)\n\n        for rot_num_id in range(self.rot_num):\n            if rot_num_id == 0:\n                rot_num_id_str = ''\n            else:\n                rot_num_id_str = str(rot_num_id)\n            mask = common_utils.mask_points_by_range(data_dict['points'+rot_num_id_str], self.point_cloud_range)\n            data_dict['points'+rot_num_id_str] = data_dict['points'+rot_num_id_str][mask]\n\n            if 'mm' in data_dict:\n                mask = common_utils.mask_points_by_range(data_dict['points_mm'+rot_num_id_str], self.point_cloud_range)\n                data_dict['points_mm'+rot_num_id_str] = data_dict['points_mm'+rot_num_id_str][mask]\n\n            if 
data_dict.get('gt_boxes'+rot_num_id_str, None) is not None and config.REMOVE_OUTSIDE_BOXES:\n                mask = box_utils.mask_boxes_outside_range_numpy(\n                    data_dict['gt_boxes'+rot_num_id_str], self.point_cloud_range, min_num_corners=config.get('min_num_corners', 1)\n                )\n                data_dict['gt_boxes'+rot_num_id_str] = data_dict['gt_boxes'+rot_num_id_str][mask]\n\n                if rot_num_id==0 and 'gt_tracklets'+rot_num_id_str in data_dict:\n                    data_dict['gt_tracklets'] = data_dict['gt_tracklets'][mask]\n                    data_dict['num_bbs_in_tracklets'] = data_dict['num_bbs_in_tracklets'][mask]\n\n        return data_dict\n\n    def shuffle_points(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.shuffle_points, config=config)\n\n        if config.SHUFFLE_ENABLED[self.mode]:\n\n            for rot_num_id in range(self.rot_num):\n                if rot_num_id == 0:\n                    rot_num_id_str = ''\n                else:\n                    rot_num_id_str = str(rot_num_id)\n                points = data_dict['points'+rot_num_id_str]\n                shuffle_idx = np.random.permutation(points.shape[0])\n                points = points[shuffle_idx]\n                data_dict['points'+rot_num_id_str] = points\n                if 'mm' in data_dict:\n                    points = data_dict['points_mm'+rot_num_id_str]\n                    shuffle_idx = np.random.permutation(points.shape[0])\n                    points = points[shuffle_idx]\n                    data_dict['points_mm'+rot_num_id_str] = points\n\n        return data_dict\n\n    def transform_points_to_voxels(self, data_dict=None, config=None):\n        if data_dict is None:\n            grid_size = (self.point_cloud_range[3:6] - self.point_cloud_range[0:3]) / np.array(config.VOXEL_SIZE)\n            self.grid_size = np.round(grid_size).astype(np.int64)\n            self.voxel_size = 
config.VOXEL_SIZE\n            # just bind the config, we will create the VoxelGeneratorWrapper later,\n            # to avoid pickling issues in multiprocess spawn\n            return partial(self.transform_points_to_voxels, config=config)\n\n        if self.voxel_generator is None:\n            self.voxel_generator = VoxelGeneratorWrapper(\n                vsize_xyz=config.VOXEL_SIZE,\n                coors_range_xyz=self.point_cloud_range,\n                num_point_features=self.num_point_features,\n                max_num_points_per_voxel=config.MAX_POINTS_PER_VOXEL,\n                max_num_voxels=config.MAX_NUMBER_OF_VOXELS[self.mode],\n            )\n\n        for rot_num_id in range(self.rot_num):\n            if rot_num_id == 0:\n                rot_num_id_str = ''\n            else:\n                rot_num_id_str = str(rot_num_id)\n            points = data_dict['points'+rot_num_id_str]\n            voxel_output = self.voxel_generator.generate(points)\n            if isinstance(voxel_output, dict):\n                voxels, coordinates, num_points = \\\n                    voxel_output['voxels'], voxel_output['coordinates'], voxel_output['num_points_per_voxel']\n            else:\n                voxels, coordinates, num_points = voxel_output\n\n            if not data_dict['use_lead_xyz']:\n                voxels = voxels[..., 3:]  # remove xyz in voxels(N, 3)\n\n            data_dict['voxels'+rot_num_id_str] = voxels\n            data_dict['voxel_coords'+rot_num_id_str] = coordinates\n            data_dict['voxel_num_points'+rot_num_id_str] = num_points\n\n            if 'mm' in data_dict:\n                points = data_dict['points_mm'+rot_num_id_str]\n\n                voxel_output = self.voxel_generator.generate(points)\n                if isinstance(voxel_output, dict):\n                    voxels, coordinates, num_points = \\\n                        voxel_output['voxels'], voxel_output['coordinates'], voxel_output['num_points_per_voxel']\n        
        else:\n                    voxels, coordinates, num_points = voxel_output\n\n                if not data_dict['use_lead_xyz']:\n                    voxels = voxels[..., 3:]  # remove xyz in voxels(N, 3)\n\n                data_dict['voxels_mm'+rot_num_id_str] = voxels\n                data_dict['voxel_coords_mm'+rot_num_id_str] = coordinates\n                data_dict['voxel_num_points_mm'+rot_num_id_str] = num_points\n\n        return data_dict\n\n    def sample_points(self, data_dict=None, config=None):\n        if data_dict is None:\n            return partial(self.sample_points, config=config)\n\n        num_points = config.NUM_POINTS[self.mode]\n        if num_points == -1:\n            return data_dict\n\n        points = data_dict['points']\n        if num_points < len(points):\n            pts_depth = np.linalg.norm(points[:, 0:3], axis=1)\n            pts_near_flag = pts_depth < 40.0\n            far_idxs_choice = np.where(pts_near_flag == 0)[0]\n            near_idxs = np.where(pts_near_flag == 1)[0]\n\n            if num_points > len(far_idxs_choice):\n                near_idxs_choice = np.random.choice(near_idxs, num_points - len(far_idxs_choice), replace=False)\n                choice = np.concatenate((near_idxs_choice, far_idxs_choice), axis=0) \\\n                    if len(far_idxs_choice) > 0 else near_idxs_choice\n            else: \n                choice = np.arange(0, len(points), dtype=np.int32)\n                choice = np.random.choice(choice, num_points, replace=False)\n            np.random.shuffle(choice)\n        else:\n            choice = np.arange(0, len(points), dtype=np.int32)\n            if num_points > len(points):\n                extra_choice = np.random.choice(choice, num_points - len(points), replace=False)\n                choice = np.concatenate((choice, extra_choice), axis=0)\n            np.random.shuffle(choice)\n        data_dict['points'] = points[choice]\n        return data_dict\n\n    def forward(self, 
data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                points: (N, 3 + C_in)\n                gt_boxes: optional, (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n                gt_names: optional, (N), string\n                ...\n\n        Returns:\n            data_dict: the input dict after every processor in the queue has been applied\n        \"\"\"\n\n        for cur_processor in self.data_processor_queue:\n            data_dict = cur_processor(data_dict=data_dict)\n\n        return data_dict\n"
  },
  {
    "path": "pcdet/datasets/processor/point_feature_encoder.py",
    "content": "import numpy as np\n\n\nclass PointFeatureEncoder(object):\n    def __init__(self, config, point_cloud_range=None, rot_num=1):\n        super().__init__()\n        self.rot_num=rot_num\n        self.point_encoding_config = config\n        assert list(self.point_encoding_config.src_feature_list[0:3]) == ['x', 'y', 'z']\n        self.used_feature_list = self.point_encoding_config.used_feature_list\n        self.src_feature_list = self.point_encoding_config.src_feature_list\n        self.point_cloud_range = point_cloud_range\n\n    @property\n    def num_point_features(self):\n        return getattr(self, self.point_encoding_config.encoding_type)(points=None)\n\n    def forward(self, data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                points: (N, 3 + C_in)\n                ...\n        Returns:\n            data_dict:\n                points: (N, 3 + C_out),\n                use_lead_xyz: whether to use xyz as point-wise features\n                ...\n        \"\"\"\n\n        for i in range(self.rot_num):\n            if i == 0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n            data_dict['points'+rot_num_id], use_lead_xyz = getattr(self, self.point_encoding_config.encoding_type)(\n                data_dict['points'+rot_num_id]\n            )\n\n            if 'mm' in data_dict:\n                data_dict['points_mm'+rot_num_id], use_lead_xyz = getattr(self, self.point_encoding_config.encoding_type)(\n                    data_dict['points_mm'+rot_num_id]\n                )\n\n        data_dict['use_lead_xyz'] = use_lead_xyz\n\n        return data_dict\n\n    def absolute_coordinates_encoding(self, points=None):\n        if points is None:\n            num_output_features = len(self.used_feature_list)\n            return num_output_features\n\n        point_feature_list = [points[:, 0:3]]\n        for x in self.used_feature_list:\n            if x in ['x', 'y', 'z']:\n   
             continue\n            idx = self.src_feature_list.index(x)\n            point_feature_list.append(points[:, idx:idx+1])\n        point_features = np.concatenate(point_feature_list, axis=1)\n        return point_features, True\n\n    def absolute_coordinates_encoding_mm(self, points=None):\n        if points is None:\n            num_output_features = self.point_encoding_config.num_features\n            return num_output_features\n\n        point_features = points[:, 0:self.point_encoding_config.num_features]\n        return point_features, True\n"
  },
  {
    "path": "pcdet/models/__init__.py",
    "content": "from collections import namedtuple\n\nimport numpy as np\nimport torch\n\nfrom .detectors import build_detector\n\n\ndef build_network(model_cfg, num_class, dataset):\n    model = build_detector(\n        model_cfg=model_cfg, num_class=num_class, dataset=dataset\n    )\n    return model\n\n\ndef load_data_to_gpu(batch_dict):\n    for key, val in batch_dict.items():\n        if not isinstance(val, np.ndarray):\n            continue\n        if key in ['frame_id', 'metadata', 'calib', 'image_shape', 'seq_id']:\n            continue\n        \n\n        batch_dict[key] = torch.from_numpy(val).float().cuda()\n\n\ndef model_fn_decorator():\n    ModelReturn = namedtuple('ModelReturn', ['loss', 'tb_dict', 'disp_dict'])\n\n    def model_func(model, batch_dict):\n        load_data_to_gpu(batch_dict)\n        ret_dict, tb_dict, disp_dict = model(batch_dict)\n\n        loss = ret_dict['loss'].mean()\n        if hasattr(model, 'update_global_step'):\n            model.update_global_step()\n        else:\n            model.module.update_global_step()\n\n        return ModelReturn(loss, tb_dict, disp_dict)\n\n    return model_func\n"
  },
  {
    "path": "pcdet/models/backbones_2d/__init__.py",
    "content": "from .base_bev_backbone import BaseBEVBackbone\n\n__all__ = {\n    'BaseBEVBackbone': BaseBEVBackbone,\n}\n"
  },
  {
    "path": "pcdet/models/backbones_2d/base_bev_backbone.py",
    "content": "import numpy as np\nimport torch\nimport torch.nn as nn\n\n\nclass BaseBEVBackbone(nn.Module):\n    def __init__(self, model_cfg,  input_channels):\n        super().__init__()\n        self.model_cfg = model_cfg\n\n        if self.model_cfg.get('LAYER_NUMS', None) is not None:\n            assert len(self.model_cfg.LAYER_NUMS) == len(self.model_cfg.LAYER_STRIDES) == len(self.model_cfg.NUM_FILTERS)\n            layer_nums = self.model_cfg.LAYER_NUMS\n            layer_strides = self.model_cfg.LAYER_STRIDES\n            num_filters = self.model_cfg.NUM_FILTERS\n        else:\n            layer_nums = layer_strides = num_filters = []\n\n        if self.model_cfg.get('UPSAMPLE_STRIDES', None) is not None:\n            assert len(self.model_cfg.UPSAMPLE_STRIDES) == len(self.model_cfg.NUM_UPSAMPLE_FILTERS)\n            num_upsample_filters = self.model_cfg.NUM_UPSAMPLE_FILTERS\n            upsample_strides = self.model_cfg.UPSAMPLE_STRIDES\n        else:\n            upsample_strides = num_upsample_filters = []\n\n        num_levels = len(layer_nums)\n        c_in_list = [input_channels, *num_filters[:-1]]\n        self.blocks = nn.ModuleList()\n        self.deblocks = nn.ModuleList()\n        for idx in range(num_levels):\n            cur_layers = [\n                nn.ZeroPad2d(1),\n                nn.Conv2d(\n                    c_in_list[idx], num_filters[idx], kernel_size=3,\n                    stride=layer_strides[idx], padding=0, bias=False\n                ),\n                nn.BatchNorm2d(num_filters[idx], eps=1e-3, momentum=0.01),\n                nn.ReLU()\n            ]\n            for k in range(layer_nums[idx]):\n                cur_layers.extend([\n                    nn.Conv2d(num_filters[idx], num_filters[idx], kernel_size=3, padding=1, bias=False),\n                    nn.BatchNorm2d(num_filters[idx], eps=1e-3, momentum=0.01),\n                    nn.ReLU()\n                ])\n            
self.blocks.append(nn.Sequential(*cur_layers))\n            if len(upsample_strides) > 0:\n                stride = upsample_strides[idx]\n                if stride >= 1:\n                    self.deblocks.append(nn.Sequential(\n                        nn.ConvTranspose2d(\n                            num_filters[idx], num_upsample_filters[idx],\n                            upsample_strides[idx],\n                            stride=upsample_strides[idx], bias=False\n                        ),\n                        nn.BatchNorm2d(num_upsample_filters[idx], eps=1e-3, momentum=0.01),\n                        nn.ReLU()\n                    ))\n                else:\n                    stride = np.round(1 / stride).astype(int)\n                    self.deblocks.append(nn.Sequential(\n                        nn.Conv2d(\n                            num_filters[idx], num_upsample_filters[idx],\n                            stride,\n                            stride=stride, bias=False\n                        ),\n                        nn.BatchNorm2d(num_upsample_filters[idx], eps=1e-3, momentum=0.01),\n                        nn.ReLU()\n                    ))\n\n        c_in = sum(num_upsample_filters)\n        if len(upsample_strides) > num_levels:\n            self.deblocks.append(nn.Sequential(\n                nn.ConvTranspose2d(c_in, c_in, upsample_strides[-1], stride=upsample_strides[-1], bias=False),\n                nn.BatchNorm2d(c_in, eps=1e-3, momentum=0.01),\n                nn.ReLU(),\n            ))\n\n        #for child in self.children():\n        #    for param in child.parameters():\n        #        param.requires_grad = False\n        self.num_bev_features_post = c_in\n\n    def forward(self, data_dict):\n        \"\"\"\n        Args:\n            data_dict:\n                spatial_features\n        Returns:\n            data_dict:\n                st_features_2d: block outputs (upsampled when deblocks exist), concatenated along channels\n        \"\"\"\n\n        spatial_features = data_dict['spatial_features']\n        ups = []\n        x = spatial_features\n        for i 
in range(len(self.blocks)):\n            x = self.blocks[i](x)\n\n            stride = int(spatial_features.shape[2] / x.shape[2])\n            if len(self.deblocks) > 0:\n                ups.append(self.deblocks[i](x))\n            else:\n                ups.append(x)\n\n        if len(ups) > 1:\n            x = torch.cat(ups, dim=1)\n        elif len(ups) == 1:\n            x = ups[0]\n\n        if len(self.deblocks) > len(self.blocks):\n            x = self.deblocks[-1](x)\n\n        data_dict['st_features_2d'] = x\n\n\n        return data_dict\n"
  },
  {
    "path": "pcdet/models/backbones_2d/map_to_bev/__init__.py",
    "content": "from .height_compression import BEVPool\nfrom .pointpillar_scatter import PointPillarScatter\n\n__all__ = {\n    'BEVPool': BEVPool,\n    'PointPillarScatter': PointPillarScatter\n}\n"
  },
  {
    "path": "pcdet/models/backbones_2d/map_to_bev/height_compression.py",
    "content": "import torch.nn as nn\nimport numpy as np\nfrom pcdet.datasets.augmentor.X_transform import X_TRANS\nimport torch\n\ndef bilinear_interpolate_torch(im, x, y):\n    \"\"\"\n    Args:\n        im: (H, W, C) [y, x]\n        x: (N)\n        y: (N)\n\n    Returns:\n\n    \"\"\"\n    x0 = torch.floor(x).long()\n    x1 = x0 + 1\n\n    y0 = torch.floor(y).long()\n    y1 = y0 + 1\n\n    x0 = torch.clamp(x0, 0, im.shape[1] - 1)\n    x1 = torch.clamp(x1, 0, im.shape[1] - 1)\n    y0 = torch.clamp(y0, 0, im.shape[0] - 1)\n    y1 = torch.clamp(y1, 0, im.shape[0] - 1)\n\n    Ia = im[y0, x0]\n    Ib = im[y1, x0]\n    Ic = im[y0, x1]\n    Id = im[y1, x1]\n\n    wa = (x1.type_as(x) - x) * (y1.type_as(y) - y)\n    wb = (x1.type_as(x) - x) * (y - y0.type_as(y))\n    wc = (x - x0.type_as(x)) * (y1.type_as(y) - y)\n    wd = (x - x0.type_as(x)) * (y - y0.type_as(y))\n    ans = torch.t((torch.t(Ia) * wa)) + torch.t(torch.t(Ib) * wb) + torch.t(torch.t(Ic) * wc) + torch.t(torch.t(Id) * wd)\n    return ans\n\nclass BEVPool(nn.Module):\n    def __init__(self, model_cfg,  voxel_size=None, point_cloud_range=None):\n        super().__init__()\n        self.model_cfg = model_cfg\n        self.num_bev_features = self.model_cfg.NUM_BEV_FEATURES\n        self.RANGE = [0, -40, -3, 70.4, 40, 1]\n        self.x_trans = X_TRANS()\n        self.point_cloud_range = point_cloud_range\n        self.voxel_size=voxel_size\n\n    def get_pseudo_points(self, pts_range=[0, -40, -3, 70.4, 40, 1], voxel_size=[0.05, 0.05, 0.05], stride=8):\n        x_stride = voxel_size[0] * stride\n        y_stride = voxel_size[1] * stride\n\n        min_x = pts_range[0] + x_stride / 2\n        max_x = pts_range[3] #+ x_stride / 2\n        min_y = pts_range[1] + y_stride / 2\n        max_y = pts_range[4] + y_stride / 2\n\n        x = np.arange(min_x, max_x, x_stride)\n        y = np.arange(min_y, max_y, y_stride)\n\n        x, y = np.meshgrid(x, y)\n        zeo = np.zeros(shape=x.shape)\n\n        grids = 
torch.from_numpy(np.stack([x, y, zeo]).astype(np.float32)).permute(1,2,0).cuda()\n\n        return grids\n\n    def interpolate_from_bev_features(self, points, bev_features, bev_stride):\n\n        cur_batch_points = points\n\n        x_idxs = (cur_batch_points[:, 0] - self.point_cloud_range[0]) / self.voxel_size[0]\n        y_idxs = (cur_batch_points[:, 1] - self.point_cloud_range[1]) / self.voxel_size[1]\n        cur_x_idxs = x_idxs / bev_stride\n        cur_y_idxs = y_idxs / bev_stride\n\n        cur_bev_features = bev_features.permute(1, 2, 0)  # (H, W, C)\n        point_bev_features = bilinear_interpolate_torch(cur_bev_features, cur_x_idxs, cur_y_idxs)\n\n        return point_bev_features\n\n    def bev_align(self, bev_feat, transform_param, stride, stage_i):\n\n        batch_size = len(bev_feat)\n        w, h = bev_feat.shape[-2], bev_feat.shape[-1]\n\n        all_feat = []\n        for bt_i in range(batch_size):\n            cur_bev_feat = bev_feat[bt_i]\n            grid_pts = self.get_pseudo_points(self.point_cloud_range, self.voxel_size, stride)\n\n            grid_pts = grid_pts.reshape(-1, 3)\n            bt_transform_param = transform_param[bt_i]\n            previous_stage_param = bt_transform_param[0]\n            current_stage_param = bt_transform_param[stage_i]\n\n            trans_dict = self.x_trans.forward_with_param({'points': grid_pts,\n                                                                 'transform_param': current_stage_param})\n            trans_dict = self.x_trans.backward_with_param({'points': trans_dict['points'],\n                                                          'transform_param': previous_stage_param})\n\n            aligned_feat = self.interpolate_from_bev_features(trans_dict['points'], cur_bev_feat, stride).reshape(w, h, -1)\n            aligned_feat=aligned_feat.permute(2,0,1)\n            all_feat.append(aligned_feat)\n\n        return torch.stack(all_feat)\n\n    def forward(self, batch_dict):\n        \"\"\"\n 
       Args:\n            batch_dict:\n                encoded_spconv_tensor: sparse tensor\n        Returns:\n            batch_dict:\n                spatial_features:\n\n        \"\"\"\n\n\n        if 'transform_param' in batch_dict:\n            trans_param = batch_dict['transform_param']\n            rot_num = trans_param.shape[1]\n        else:\n            rot_num = 1\n\n        batch_dict['spatial_features_stride'] = batch_dict['encoded_spconv_tensor_stride']\n\n        all_feat = []\n\n        for i in range(rot_num):\n            if i==0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n\n            encoded_spconv_tensor = batch_dict['encoded_spconv_tensor'+rot_num_id]\n            spatial_features = encoded_spconv_tensor.dense()\n            N, C, D, H, W = spatial_features.shape\n            spatial_features = spatial_features.view(N, C * D, H, W)\n\n            batch_dict['spatial_features'+rot_num_id] = spatial_features\n\n            if i==0:\n                all_feat.append(spatial_features)\n            elif 'transform_param' in batch_dict and i>0:\n                aligned_bev_feat = self.bev_align(spatial_features.clone(),\n                                                  batch_dict['transform_param'],\n                                                  batch_dict['spatial_features_stride'],\n                                                  i)\n                all_feat.append(aligned_bev_feat)\n\n\n        if 'transform_param' in batch_dict:\n            all_feat = torch.stack(all_feat)\n\n            if self.model_cfg.get('ALIGN_METHOD', 'none') == 'max':\n                final_feat = all_feat.max(0)[0]\n                batch_dict['spatial_features'] = final_feat\n            elif self.model_cfg.get('ALIGN_METHOD', 'none') == 'mean':\n                final_feat = all_feat.mean(0)\n                batch_dict['spatial_features'] = final_feat\n            else:\n                raise 
NotImplementedError\n\n\n        return batch_dict"
  },
  {
    "path": "pcdet/models/backbones_2d/map_to_bev/pointpillar_scatter.py",
    "content": "import torch\nimport torch.nn as nn\n\n\nclass PointPillarScatter(nn.Module):\n    def __init__(self, model_cfg, grid_size, **kwargs):\n        super().__init__()\n\n        self.model_cfg = model_cfg\n        self.num_bev_features = self.model_cfg.NUM_BEV_FEATURES\n        self.nx, self.ny, self.nz = grid_size\n        assert self.nz == 1\n\n    def forward(self, batch_dict, **kwargs):\n        pillar_features, coords = batch_dict['pillar_features'], batch_dict['voxel_coords']\n        batch_spatial_features = []\n        batch_size = coords[:, 0].max().int().item() + 1\n        for batch_idx in range(batch_size):\n            spatial_feature = torch.zeros(\n                self.num_bev_features,\n                self.nz * self.nx * self.ny,\n                dtype=pillar_features.dtype,\n                device=pillar_features.device)\n\n            batch_mask = coords[:, 0] == batch_idx\n            this_coords = coords[batch_mask, :]\n            indices = this_coords[:, 1] + this_coords[:, 2] * self.nx + this_coords[:, 3]\n            indices = indices.type(torch.long)\n            pillars = pillar_features[batch_mask, :]\n            pillars = pillars.t()\n            spatial_feature[:, indices] = pillars\n            batch_spatial_features.append(spatial_feature)\n\n        batch_spatial_features = torch.stack(batch_spatial_features, 0)\n        batch_spatial_features = batch_spatial_features.view(batch_size, self.num_bev_features * self.nz, self.ny, self.nx)\n        batch_dict['spatial_features'] = batch_spatial_features\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/backbones_3d/__init__.py",
    "content": "from .pointnet2_backbone import PointNet2Backbone, PointNet2MSG\nfrom .spconv_backbone import TeMMVoxelBackBone8x,TeVoxelBackBone8x\n__all__ = {\n    'TeMMVoxelBackBone8x': TeMMVoxelBackBone8x,\n    'TeVoxelBackBone8x': TeVoxelBackBone8x,\n    'PointNet2Backbone': PointNet2Backbone,\n    'PointNet2MSG': PointNet2MSG,\n}\n"
  },
  {
    "path": "pcdet/models/backbones_3d/pfe/__init__.py",
    "content": "from .voxel_set_abstraction import VoxelSetAbstraction\n\n__all__ = {\n    'VoxelSetAbstraction': VoxelSetAbstraction,\n}\n"
  },
  {
    "path": "pcdet/models/backbones_3d/pfe/bev_features_interpolation.py",
    "content": "import torch\nimport torch.nn as nn\n\nfrom ....ops.pointnet2.pointnet2_stack import pointnet2_modules as pointnet2_stack_modules\nfrom ....ops.pointnet2.pointnet2_stack import pointnet2_utils as pointnet2_stack_utils\nfrom ....utils import common_utils\n\n\ndef bilinear_interpolate_torch(im, x, y):\n    \"\"\"\n    Args:\n        im: (H, W, C) [y, x]\n        x: (N)\n        y: (N)\n\n    Returns:\n\n    \"\"\"\n    x0 = torch.floor(x).long()\n    x1 = x0 + 1\n\n    y0 = torch.floor(y).long()\n    y1 = y0 + 1\n\n    x0 = torch.clamp(x0, 0, im.shape[1] - 1)\n    x1 = torch.clamp(x1, 0, im.shape[1] - 1)\n    y0 = torch.clamp(y0, 0, im.shape[0] - 1)\n    y1 = torch.clamp(y1, 0, im.shape[0] - 1)\n\n    Ia = im[y0, x0]\n    Ib = im[y1, x0]\n    Ic = im[y0, x1]\n    Id = im[y1, x1]\n\n    wa = (x1.type_as(x) - x) * (y1.type_as(y) - y)\n    wb = (x1.type_as(x) - x) * (y - y0.type_as(y))\n    wc = (x - x0.type_as(x)) * (y1.type_as(y) - y)\n    wd = (x - x0.type_as(x)) * (y - y0.type_as(y))\n    ans = torch.t((torch.t(Ia) * wa)) + torch.t(torch.t(Ib) * wb) + torch.t(torch.t(Ic) * wc) + torch.t(torch.t(Id) * wd)\n    return ans\n\n\nclass BEVFeaturesInterpolation(nn.Module):\n    def __init__(self, model_cfg, voxel_size, point_cloud_range, num_frames=1, num_bev_features=None,\n                 num_rawpoint_features=None, **kwargs):\n        super().__init__()\n        self.num_frames = num_frames\n        self.model_cfg = model_cfg\n        self.voxel_size = voxel_size\n        self.point_cloud_range = point_cloud_range\n\n        self.SA_layers = nn.ModuleList()\n        self.SA_layer_names = []\n        self.downsample_times_map = {}\n        c_in = 0\n\n        if 'temporal_features' in self.model_cfg.FEATURES_SOURCE:\n            c_bev = num_bev_features\n            c_in += c_bev\n\n        if 'spatial_features' in self.model_cfg.FEATURES_SOURCE:\n            c_bev = num_bev_features\n            c_in += c_bev\n\n        self.output_bev_features = 
nn.Sequential(\n            nn.Linear(c_in, self.model_cfg.NUM_OUTPUT_FEATURES, bias=False),\n            nn.BatchNorm1d(self.model_cfg.NUM_OUTPUT_FEATURES),\n            nn.ReLU(),\n        )\n        self.num_point_features = self.model_cfg.NUM_OUTPUT_FEATURES\n        self.num_point_features_before_fusion = c_in\n\n    def interpolate_from_bev_features(self, points, bev_features, batch_size, bev_stride):\n\n        point_bev_features_list = []\n        for k in range(batch_size):\n\n            points_b = points[:,0]\n\n            cur_batch_points = points[points_b==k]\n\n            x_idxs = (cur_batch_points[:, 1] - self.point_cloud_range[0]) / self.voxel_size[0]\n            y_idxs = (cur_batch_points[:, 2] - self.point_cloud_range[1]) / self.voxel_size[1]\n            cur_x_idxs = x_idxs / bev_stride\n            cur_y_idxs = y_idxs / bev_stride\n\n            cur_bev_features = bev_features[k].permute(1, 2, 0)  # (H, W, C)\n            point_bev_features = bilinear_interpolate_torch(cur_bev_features, cur_x_idxs, cur_y_idxs)\n\n            point_bev_features_list.append(point_bev_features)\n\n        point_bev_features = torch.cat(point_bev_features_list, dim=0)  # (B, N, C0)\n        return point_bev_features\n\n\n    def forward(self, batch_dict):\n\n        for i in range(self.num_frames):\n            if i==0:\n                point_features_list = []\n                if 'temporal_features' in self.model_cfg.FEATURES_SOURCE:\n                    bev_features = batch_dict['temporal_features']\n                    point_bev_features = self.interpolate_from_bev_features(\n                        batch_dict['points'], bev_features, batch_dict['batch_size'],\n                        bev_stride=batch_dict['spatial_features_stride']\n                    )\n                    point_features_list.append(point_bev_features)\n                if 'spatial_features' in self.model_cfg.FEATURES_SOURCE:\n                    bev_features = 
batch_dict['spatial_features']\n                    point_bev_features = self.interpolate_from_bev_features(\n                        batch_dict['points'], bev_features, batch_dict['batch_size'],\n                        bev_stride=batch_dict['spatial_features_stride']\n                    )\n                    point_features_list.append(point_bev_features)\n                point_features = torch.cat(point_features_list, dim=-1)\n\n                point_features = self.output_bev_features(point_features.view(-1, point_features.shape[-1]))\n\n                batch_dict['point_features'] = point_features  # (BxN, C)\n                batch_dict['point_coords'] = batch_dict['points'][:,0:4]  # (BxN, 4)\n            elif 'points'+str(-i) in batch_dict:\n\n                points = batch_dict['points'+str(-i)]\n\n                point_features_list = []\n                if 'temporal_features' in self.model_cfg.FEATURES_SOURCE:\n\n                    bev_features = batch_dict['temporal_features'+str(-i)]\n\n                    point_bev_features = self.interpolate_from_bev_features(\n                        points, bev_features, batch_dict['batch_size'],\n                        bev_stride=batch_dict['spatial_features_stride']\n                    )\n                    point_features_list.append(point_bev_features)\n                if 'spatial_features' in self.model_cfg.FEATURES_SOURCE:\n                    bev_features = batch_dict['spatial_features'+str(-i)]\n                    point_bev_features = self.interpolate_from_bev_features(\n                        points, bev_features, batch_dict['batch_size'],\n                        bev_stride=batch_dict['spatial_features_stride']\n                    )\n                    point_features_list.append(point_bev_features)\n                point_features = torch.cat(point_features_list, dim=-1)\n\n                point_features = self.output_bev_features(point_features.view(-1, point_features.shape[-1]))\n\n                
batch_dict['point_features'+str(-i)] = point_features  # (BxN, C)\n                batch_dict['point_coords'+str(-i)] = batch_dict['points'+str(-i)][:, 0:4]  # (BxN, 4)\n\n        return batch_dict"
  },
  {
    "path": "pcdet/models/backbones_3d/pfe/voxel_set_abstraction.py",
    "content": "import math\nimport numpy as np\nimport torch\nimport torch.nn as nn\n\nfrom ....ops.pointnet2.pointnet2_stack import pointnet2_modules as pointnet2_stack_modules\nfrom ....ops.pointnet2.pointnet2_stack import pointnet2_utils as pointnet2_stack_utils\nfrom ....utils import common_utils\n\n\ndef bilinear_interpolate_torch(im, x, y):\n    \"\"\"\n    Args:\n        im: (H, W, C) [y, x]\n        x: (N)\n        y: (N)\n\n    Returns:\n\n    \"\"\"\n    x0 = torch.floor(x).long()\n    x1 = x0 + 1\n\n    y0 = torch.floor(y).long()\n    y1 = y0 + 1\n\n    x0 = torch.clamp(x0, 0, im.shape[1] - 1)\n    x1 = torch.clamp(x1, 0, im.shape[1] - 1)\n    y0 = torch.clamp(y0, 0, im.shape[0] - 1)\n    y1 = torch.clamp(y1, 0, im.shape[0] - 1)\n\n    Ia = im[y0, x0]\n    Ib = im[y1, x0]\n    Ic = im[y0, x1]\n    Id = im[y1, x1]\n\n    wa = (x1.type_as(x) - x) * (y1.type_as(y) - y)\n    wb = (x1.type_as(x) - x) * (y - y0.type_as(y))\n    wc = (x - x0.type_as(x)) * (y1.type_as(y) - y)\n    wd = (x - x0.type_as(x)) * (y - y0.type_as(y))\n    ans = torch.t((torch.t(Ia) * wa)) + torch.t(torch.t(Ib) * wb) + torch.t(torch.t(Ic) * wc) + torch.t(torch.t(Id) * wd)\n    return ans\n\n\ndef sample_points_with_roi(rois, points, sample_radius_with_roi, num_max_points_of_part=200000):\n    \"\"\"\n    Args:\n        rois: (M, 7 + C)\n        points: (N, 3)\n        sample_radius_with_roi:\n        num_max_points_of_part:\n\n    Returns:\n        sampled_points: (N_out, 3)\n    \"\"\"\n    if points.shape[0] < num_max_points_of_part:\n        distance = (points[:, None, :] - rois[None, :, 0:3]).norm(dim=-1)\n        min_dis, min_dis_roi_idx = distance.min(dim=-1)\n        roi_max_dim = (rois[min_dis_roi_idx, 3:6] / 2).norm(dim=-1)\n        point_mask = min_dis < roi_max_dim + sample_radius_with_roi\n    else:\n        start_idx = 0\n        point_mask_list = []\n        while start_idx < points.shape[0]:\n            distance = (points[start_idx:start_idx + num_max_points_of_part, 
None, :] - rois[None, :, 0:3]).norm(dim=-1)\n            min_dis, min_dis_roi_idx = distance.min(dim=-1)\n            roi_max_dim = (rois[min_dis_roi_idx, 3:6] / 2).norm(dim=-1)\n            cur_point_mask = min_dis < roi_max_dim + sample_radius_with_roi\n            point_mask_list.append(cur_point_mask)\n            start_idx += num_max_points_of_part\n        point_mask = torch.cat(point_mask_list, dim=0)\n\n    sampled_points = points[:1] if point_mask.sum() == 0 else points[point_mask, :]\n\n    return sampled_points, point_mask\n\n\ndef sector_fps(points, num_sampled_points, num_sectors):\n    \"\"\"\n    Args:\n        points: (N, 3)\n        num_sampled_points: int\n        num_sectors: int\n\n    Returns:\n        sampled_points: (N_out, 3)\n    \"\"\"\n    sector_size = np.pi * 2 / num_sectors\n    point_angles = torch.atan2(points[:, 1], points[:, 0]) + np.pi\n    # atan2 can return exactly pi, which would map to index num_sectors; clamp it into the last sector\n    sector_idx = (point_angles / sector_size).floor().clamp(min=0, max=num_sectors - 1)\n    xyz_points_list = []\n    xyz_batch_cnt = []\n    num_sampled_points_list = []\n    for k in range(num_sectors):\n        mask = (sector_idx == k)\n        cur_num_points = mask.sum().item()\n        if cur_num_points > 0:\n            xyz_points_list.append(points[mask])\n            xyz_batch_cnt.append(cur_num_points)\n            ratio = cur_num_points / points.shape[0]\n            num_sampled_points_list.append(\n                min(cur_num_points, math.ceil(ratio * num_sampled_points))\n            )\n\n    if len(xyz_batch_cnt) == 0:\n        xyz_points_list.append(points)\n        xyz_batch_cnt.append(len(points))\n        num_sampled_points_list.append(num_sampled_points)\n        print(f'Warning: empty sector points detected in SectorFPS: points.shape={points.shape}')\n\n    xyz = torch.cat(xyz_points_list, dim=0)\n    xyz_batch_cnt = torch.tensor(xyz_batch_cnt, device=points.device).int()\n    sampled_points_batch_cnt = torch.tensor(num_sampled_points_list, device=points.device).int()\n\n    
sampled_pt_idxs = pointnet2_stack_utils.stack_farthest_point_sample(\n        xyz.contiguous(), xyz_batch_cnt, sampled_points_batch_cnt\n    ).long()\n\n    sampled_points = xyz[sampled_pt_idxs]\n\n    return sampled_points\n\n\nclass VoxelSetAbstraction(nn.Module):\n    def __init__(self, model_cfg, voxel_size, point_cloud_range, num_bev_features=None,\n                 num_rawpoint_features=None, **kwargs):\n        super().__init__()\n        self.model_cfg = model_cfg\n        self.voxel_size = voxel_size\n        self.point_cloud_range = point_cloud_range\n\n        SA_cfg = self.model_cfg.SA_LAYER\n\n        self.SA_layers = nn.ModuleList()\n        self.SA_layer_names = []\n        self.downsample_times_map = {}\n        c_in = 0\n        for src_name in self.model_cfg.FEATURES_SOURCE:\n            if src_name in ['bev', 'raw_points']:\n                continue\n            self.downsample_times_map[src_name] = SA_cfg[src_name].DOWNSAMPLE_FACTOR\n\n            if SA_cfg[src_name].get('INPUT_CHANNELS', None) is None:\n                input_channels = SA_cfg[src_name].MLPS[0][0] \\\n                    if isinstance(SA_cfg[src_name].MLPS[0], list) else SA_cfg[src_name].MLPS[0]\n            else:\n                input_channels = SA_cfg[src_name]['INPUT_CHANNELS']\n\n            cur_layer, cur_num_c_out = pointnet2_stack_modules.build_local_aggregation_module(\n                input_channels=input_channels, config=SA_cfg[src_name]\n            )\n            self.SA_layers.append(cur_layer)\n            self.SA_layer_names.append(src_name)\n\n            c_in += cur_num_c_out\n\n\n        if 'bev' in self.model_cfg.FEATURES_SOURCE:\n            c_bev = num_bev_features\n            c_in += c_bev\n\n\n        if 'raw_points' in self.model_cfg.FEATURES_SOURCE:\n            self.SA_rawpoints, cur_num_c_out = pointnet2_stack_modules.build_local_aggregation_module(\n                input_channels=num_rawpoint_features - 3, config=SA_cfg['raw_points']\n            
)\n            c_in += cur_num_c_out\n\n        self.vsa_point_feature_fusion = nn.Sequential(\n            nn.Linear(c_in, self.model_cfg.NUM_OUTPUT_FEATURES, bias=False),\n            nn.BatchNorm1d(self.model_cfg.NUM_OUTPUT_FEATURES),\n            nn.ReLU(),\n        )\n        self.num_point_features = self.model_cfg.NUM_OUTPUT_FEATURES\n        self.num_point_features_before_fusion = c_in\n\n\n    def interpolate_from_bev_features(self, keypoints, bev_features, batch_size, bev_stride):\n        \"\"\"\n        Args:\n            keypoints: (N1 + N2 + ..., 4)\n            bev_features: (B, C, H, W)\n            batch_size:\n            bev_stride:\n\n        Returns:\n            point_bev_features: (N1 + N2 + ..., C)\n        \"\"\"\n        x_idxs = (keypoints[:, 1] - self.point_cloud_range[0]) / self.voxel_size[0]\n        y_idxs = (keypoints[:, 2] - self.point_cloud_range[1]) / self.voxel_size[1]\n\n        x_idxs = x_idxs / bev_stride\n        y_idxs = y_idxs / bev_stride\n\n        point_bev_features_list = []\n        for k in range(batch_size):\n            bs_mask = (keypoints[:, 0] == k)\n\n            cur_x_idxs = x_idxs[bs_mask]\n            cur_y_idxs = y_idxs[bs_mask]\n            cur_bev_features = bev_features[k].permute(1, 2, 0)  # (H, W, C)\n            point_bev_features = bilinear_interpolate_torch(cur_bev_features, cur_x_idxs, cur_y_idxs)\n            point_bev_features_list.append(point_bev_features)\n\n        point_bev_features = torch.cat(point_bev_features_list, dim=0)  # (N1 + N2 + ..., C)\n        return point_bev_features\n\n    def sectorized_proposal_centric_sampling(self, roi_boxes, points):\n        \"\"\"\n        Args:\n            roi_boxes: (M, 7 + C)\n            points: (N, 3)\n\n        Returns:\n            sampled_points: (N_out, 3)\n        \"\"\"\n\n        sampled_points, _ = sample_points_with_roi(\n            rois=roi_boxes, points=points,\n            
sample_radius_with_roi=self.model_cfg.SPC_SAMPLING.SAMPLE_RADIUS_WITH_ROI,\n            num_max_points_of_part=self.model_cfg.SPC_SAMPLING.get('NUM_POINTS_OF_EACH_SAMPLE_PART', 200000)\n        )\n        sampled_points = sector_fps(\n            points=sampled_points, num_sampled_points=self.model_cfg.NUM_KEYPOINTS,\n            num_sectors=self.model_cfg.SPC_SAMPLING.NUM_SECTORS\n        )\n        return sampled_points\n\n    def get_sampled_points(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n\n        Returns:\n            keypoints: (N1 + N2 + ..., 4), where 4 indicates [bs_idx, x, y, z]\n        \"\"\"\n        batch_size = batch_dict['batch_size']\n        if self.model_cfg.POINT_SOURCE == 'raw_points':\n            src_points = batch_dict['points'][:, 1:4]\n            batch_indices = batch_dict['points'][:, 0].long()\n        elif self.model_cfg.POINT_SOURCE == 'voxel_centers':\n            src_points = common_utils.get_voxel_centers(\n                batch_dict['voxel_coords'][:, 1:4],\n                downsample_times=1,\n                voxel_size=self.voxel_size,\n                point_cloud_range=self.point_cloud_range\n            )\n            batch_indices = batch_dict['voxel_coords'][:, 0].long()\n        else:\n            raise NotImplementedError\n        keypoints_list = []\n        for bs_idx in range(batch_size):\n            bs_mask = (batch_indices == bs_idx)\n            sampled_points = src_points[bs_mask].unsqueeze(dim=0)  # (1, N, 3)\n            if self.model_cfg.SAMPLE_METHOD == 'FPS':\n                cur_pt_idxs = pointnet2_stack_utils.farthest_point_sample(\n                    sampled_points[:, :, 0:3].contiguous(), self.model_cfg.NUM_KEYPOINTS\n                ).long()\n\n                if sampled_points.shape[1] < self.model_cfg.NUM_KEYPOINTS:\n                    times = int(self.model_cfg.NUM_KEYPOINTS / sampled_points.shape[1]) + 1\n                    non_empty = cur_pt_idxs[0, 
:sampled_points.shape[1]]\n                    cur_pt_idxs[0] = non_empty.repeat(times)[:self.model_cfg.NUM_KEYPOINTS]\n\n                keypoints = sampled_points[0][cur_pt_idxs[0]].unsqueeze(dim=0)\n\n            elif self.model_cfg.SAMPLE_METHOD == 'SPC':\n                cur_keypoints = self.sectorized_proposal_centric_sampling(\n                    roi_boxes=batch_dict['rois'][bs_idx], points=sampled_points[0]\n                )\n                bs_idxs = cur_keypoints.new_ones(cur_keypoints.shape[0]) * bs_idx\n                keypoints = torch.cat((bs_idxs[:, None], cur_keypoints), dim=1)\n            else:\n                raise NotImplementedError\n\n            keypoints_list.append(keypoints)\n\n        keypoints = torch.cat(keypoints_list, dim=0)  # (B, M, 3) or (N1 + N2 + ..., 4)\n        if len(keypoints.shape) == 3:\n            batch_idx = torch.arange(batch_size, device=keypoints.device).view(-1, 1).repeat(1, keypoints.shape[1]).view(-1, 1)\n            keypoints = torch.cat((batch_idx.float(), keypoints.view(-1, 3)), dim=1)\n\n        return keypoints\n\n    @staticmethod\n    def aggregate_keypoint_features_from_one_source(\n            batch_size, aggregate_func, xyz, xyz_features, xyz_bs_idxs, new_xyz, new_xyz_batch_cnt,\n            filter_neighbors_with_roi=False, radius_of_neighbor=None, num_max_points_of_part=200000, rois=None\n    ):\n        \"\"\"\n\n        Args:\n            aggregate_func:\n            xyz: (N, 3)\n            xyz_features: (N, C)\n            xyz_bs_idxs: (N)\n            new_xyz: (M, 3)\n            new_xyz_batch_cnt: (batch_size), [N1, N2, ...]\n\n            filter_neighbors_with_roi: True/False\n            radius_of_neighbor: float\n            num_max_points_of_part: int\n            rois: (batch_size, num_rois, 7 + C)\n        Returns:\n\n        \"\"\"\n        xyz_batch_cnt = xyz.new_zeros(batch_size).int()\n        if filter_neighbors_with_roi:\n            point_features = torch.cat((xyz, xyz_features), 
dim=-1) if xyz_features is not None else xyz\n            point_features_list = []\n            for bs_idx in range(batch_size):\n                bs_mask = (xyz_bs_idxs == bs_idx)\n                _, valid_mask = sample_points_with_roi(\n                    rois=rois[bs_idx], points=xyz[bs_mask],\n                    sample_radius_with_roi=radius_of_neighbor, num_max_points_of_part=num_max_points_of_part,\n                )\n                point_features_list.append(point_features[bs_mask][valid_mask])\n                xyz_batch_cnt[bs_idx] = valid_mask.sum()\n\n            valid_point_features = torch.cat(point_features_list, dim=0)\n            xyz = valid_point_features[:, 0:3]\n            xyz_features = valid_point_features[:, 3:] if xyz_features is not None else None\n        else:\n            for bs_idx in range(batch_size):\n                xyz_batch_cnt[bs_idx] = (xyz_bs_idxs == bs_idx).sum()\n\n        pooled_points, pooled_features = aggregate_func(\n            xyz=xyz.contiguous(),\n            xyz_batch_cnt=xyz_batch_cnt,\n            new_xyz=new_xyz,\n            new_xyz_batch_cnt=new_xyz_batch_cnt,\n            # xyz_features is None when the source carries coordinates only\n            features=xyz_features.contiguous() if xyz_features is not None else None,\n        )\n        return pooled_features\n\n    def forward(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                keypoints: (B, num_keypoints, 3)\n                multi_scale_3d_features: {\n                        'x_conv4': ...\n                    }\n                points: optional (N, 1 + 3 + C) [bs_idx, x, y, z, ...]\n                spatial_features: optional\n                spatial_features_stride: optional\n\n        Returns:\n            point_features: (N, C)\n            point_coords: (N, 4)\n\n        \"\"\"\n        keypoints = self.get_sampled_points(batch_dict)\n\n        point_features_list = []\n        if 'bev' in self.model_cfg.FEATURES_SOURCE:\n            point_bev_features = self.interpolate_from_bev_features(\n  
              keypoints, batch_dict['spatial_features'], batch_dict['batch_size'],\n                bev_stride=batch_dict['spatial_features_stride']\n            )\n            point_features_list.append(point_bev_features)\n\n        batch_size = batch_dict['batch_size']\n\n        new_xyz = keypoints[:, 1:4].contiguous()\n        new_xyz_batch_cnt = new_xyz.new_zeros(batch_size).int()\n        for k in range(batch_size):\n            new_xyz_batch_cnt[k] = (keypoints[:, 0] == k).sum()\n\n        if 'raw_points' in self.model_cfg.FEATURES_SOURCE:\n            raw_points = batch_dict['points']\n\n            pooled_features = self.aggregate_keypoint_features_from_one_source(\n                batch_size=batch_size, aggregate_func=self.SA_rawpoints,\n                xyz=raw_points[:, 1:4],\n                xyz_features=raw_points[:, 4:].contiguous() if raw_points.shape[1] > 4 else None,\n                xyz_bs_idxs=raw_points[:, 0],\n                new_xyz=new_xyz, new_xyz_batch_cnt=new_xyz_batch_cnt,\n                filter_neighbors_with_roi=self.model_cfg.SA_LAYER['raw_points'].get('FILTER_NEIGHBOR_WITH_ROI', False),\n                radius_of_neighbor=self.model_cfg.SA_LAYER['raw_points'].get('RADIUS_OF_NEIGHBOR_WITH_ROI', None),\n                rois=batch_dict.get('rois', None)\n            )\n            point_features_list.append(pooled_features)\n\n        for k, src_name in enumerate(self.SA_layer_names):\n            cur_coords = batch_dict['multi_scale_3d_features'][src_name].indices\n            cur_features = batch_dict['multi_scale_3d_features'][src_name].features.contiguous()\n\n            xyz = common_utils.get_voxel_centers(\n                cur_coords[:, 1:4], downsample_times=self.downsample_times_map[src_name],\n                voxel_size=self.voxel_size, point_cloud_range=self.point_cloud_range\n            )\n\n            pooled_features = self.aggregate_keypoint_features_from_one_source(\n                batch_size=batch_size, 
aggregate_func=self.SA_layers[k],\n                xyz=xyz.contiguous(), xyz_features=cur_features, xyz_bs_idxs=cur_coords[:, 0],\n                new_xyz=new_xyz, new_xyz_batch_cnt=new_xyz_batch_cnt,\n                filter_neighbors_with_roi=self.model_cfg.SA_LAYER[src_name].get('FILTER_NEIGHBOR_WITH_ROI', False),\n                radius_of_neighbor=self.model_cfg.SA_LAYER[src_name].get('RADIUS_OF_NEIGHBOR_WITH_ROI', None),\n                rois=batch_dict.get('rois', None)\n            )\n\n            point_features_list.append(pooled_features)\n\n        point_features = torch.cat(point_features_list, dim=-1)\n        point_features = point_features.view(-1, point_features.shape[-1])\n\n        batch_dict['point_features_before_fusion'] = point_features\n\n        point_features = self.vsa_point_feature_fusion(point_features)\n\n        batch_dict['point_features'] = point_features  # (BxN, C)\n        batch_dict['point_coords'] = keypoints  # (BxN, 4)\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/backbones_3d/pointnet2_backbone.py",
    "content": "import torch\nimport torch.nn as nn\n\nfrom ...ops.pointnet2.pointnet2_batch import pointnet2_modules\nfrom ...ops.pointnet2.pointnet2_stack import pointnet2_modules as pointnet2_modules_stack\nfrom ...ops.pointnet2.pointnet2_stack import pointnet2_utils as pointnet2_utils_stack\n\n\nclass PointNet2MSG(nn.Module):\n    def __init__(self, model_cfg, input_channels, **kwargs):\n        super().__init__()\n        self.model_cfg = model_cfg\n\n        self.SA_modules = nn.ModuleList()\n        channel_in = input_channels - 3\n\n        self.num_points_each_layer = []\n        skip_channel_list = [input_channels - 3]\n        for k in range(self.model_cfg.SA_CONFIG.NPOINTS.__len__()):\n            mlps = self.model_cfg.SA_CONFIG.MLPS[k].copy()\n            channel_out = 0\n            for idx in range(mlps.__len__()):\n                mlps[idx] = [channel_in] + mlps[idx]\n                channel_out += mlps[idx][-1]\n\n            self.SA_modules.append(\n                pointnet2_modules.PointnetSAModuleMSG(\n                    npoint=self.model_cfg.SA_CONFIG.NPOINTS[k],\n                    radii=self.model_cfg.SA_CONFIG.RADIUS[k],\n                    nsamples=self.model_cfg.SA_CONFIG.NSAMPLE[k],\n                    mlps=mlps,\n                    use_xyz=self.model_cfg.SA_CONFIG.get('USE_XYZ', True),\n                )\n            )\n            skip_channel_list.append(channel_out)\n            channel_in = channel_out\n\n        self.FP_modules = nn.ModuleList()\n\n        for k in range(self.model_cfg.FP_MLPS.__len__()):\n            pre_channel = self.model_cfg.FP_MLPS[k + 1][-1] if k + 1 < len(self.model_cfg.FP_MLPS) else channel_out\n            self.FP_modules.append(\n                pointnet2_modules.PointnetFPModule(\n                    mlp=[pre_channel + skip_channel_list[k]] + self.model_cfg.FP_MLPS[k]\n                )\n            )\n\n        self.num_point_features = self.model_cfg.FP_MLPS[0][-1]\n\n    def break_up_pc(self, 
pc):\n        batch_idx = pc[:, 0]\n        xyz = pc[:, 1:4].contiguous()\n        features = (pc[:, 4:].contiguous() if pc.size(-1) > 4 else None)\n        return batch_idx, xyz, features\n\n    def forward(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size: int\n                vfe_features: (num_voxels, C)\n                points: (num_points, 4 + C), [batch_idx, x, y, z, ...]\n        Returns:\n            batch_dict:\n                encoded_spconv_tensor: sparse tensor\n                point_features: (N, C)\n        \"\"\"\n        batch_size = batch_dict['batch_size']\n        points = batch_dict['points']\n        batch_idx, xyz, features = self.break_up_pc(points)\n\n        xyz_batch_cnt = xyz.new_zeros(batch_size).int()\n        for bs_idx in range(batch_size):\n            xyz_batch_cnt[bs_idx] = (batch_idx == bs_idx).sum()\n\n        assert xyz_batch_cnt.min() == xyz_batch_cnt.max()\n        xyz = xyz.view(batch_size, -1, 3)\n        features = features.view(batch_size, -1, features.shape[-1]).permute(0, 2, 1) if features is not None else None\n\n        l_xyz, l_features = [xyz], [features]\n        for i in range(len(self.SA_modules)):\n            li_xyz, li_features = self.SA_modules[i](l_xyz[i], l_features[i])\n            l_xyz.append(li_xyz)\n            l_features.append(li_features)\n\n        for i in range(-1, -(len(self.FP_modules) + 1), -1):\n            l_features[i - 1] = self.FP_modules[i](\n                l_xyz[i - 1], l_xyz[i], l_features[i - 1], l_features[i]\n            )  # (B, C, N)\n\n        point_features = l_features[0].permute(0, 2, 1).contiguous()  # (B, N, C)\n        batch_dict['point_features'] = point_features.view(-1, point_features.shape[-1])\n        batch_dict['point_coords'] = torch.cat((batch_idx[:, None].float(), l_xyz[0].view(-1, 3)), dim=1)\n        return batch_dict\n\n\nclass PointNet2Backbone(nn.Module):\n    \"\"\"\n    DO NOT USE THIS CURRENTLY SINCE IT 
MAY HAVE POTENTIAL BUGS, 20200723\n    \"\"\"\n    def __init__(self, model_cfg, input_channels, **kwargs):\n        assert False, 'DO NOT USE THIS CURRENTLY SINCE IT MAY HAVE POTENTIAL BUGS, 20200723'\n        super().__init__()\n        self.model_cfg = model_cfg\n\n        self.SA_modules = nn.ModuleList()\n        channel_in = input_channels - 3\n\n        self.num_points_each_layer = []\n        skip_channel_list = [input_channels]\n        for k in range(self.model_cfg.SA_CONFIG.NPOINTS.__len__()):\n            self.num_points_each_layer.append(self.model_cfg.SA_CONFIG.NPOINTS[k])\n            mlps = self.model_cfg.SA_CONFIG.MLPS[k].copy()\n            channel_out = 0\n            for idx in range(mlps.__len__()):\n                mlps[idx] = [channel_in] + mlps[idx]\n                channel_out += mlps[idx][-1]\n\n            self.SA_modules.append(\n                pointnet2_modules_stack.StackSAModuleMSG(\n                    radii=self.model_cfg.SA_CONFIG.RADIUS[k],\n                    nsamples=self.model_cfg.SA_CONFIG.NSAMPLE[k],\n                    mlps=mlps,\n                    use_xyz=self.model_cfg.SA_CONFIG.get('USE_XYZ', True),\n                )\n            )\n            skip_channel_list.append(channel_out)\n            channel_in = channel_out\n\n        self.FP_modules = nn.ModuleList()\n\n        for k in range(self.model_cfg.FP_MLPS.__len__()):\n            pre_channel = self.model_cfg.FP_MLPS[k + 1][-1] if k + 1 < len(self.model_cfg.FP_MLPS) else channel_out\n            self.FP_modules.append(\n                pointnet2_modules_stack.StackPointnetFPModule(\n                    mlp=[pre_channel + skip_channel_list[k]] + self.model_cfg.FP_MLPS[k]\n                )\n            )\n\n        self.num_point_features = self.model_cfg.FP_MLPS[0][-1]\n\n    def break_up_pc(self, pc):\n        batch_idx = pc[:, 0]\n        xyz = pc[:, 1:4].contiguous()\n        features = (pc[:, 4:].contiguous() if pc.size(-1) > 4 else None)\n        return 
batch_idx, xyz, features\n\n    def forward(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size: int\n                vfe_features: (num_voxels, C)\n                points: (num_points, 4 + C), [batch_idx, x, y, z, ...]\n        Returns:\n            batch_dict:\n                encoded_spconv_tensor: sparse tensor\n                point_features: (N, C)\n        \"\"\"\n        batch_size = batch_dict['batch_size']\n        points = batch_dict['points']\n        batch_idx, xyz, features = self.break_up_pc(points)\n\n        xyz_batch_cnt = xyz.new_zeros(batch_size).int()\n        for bs_idx in range(batch_size):\n            xyz_batch_cnt[bs_idx] = (batch_idx == bs_idx).sum()\n\n        l_xyz, l_features, l_batch_cnt = [xyz], [features], [xyz_batch_cnt]\n        for i in range(len(self.SA_modules)):\n            new_xyz_list = []\n            for k in range(batch_size):\n                if len(l_xyz) == 1:\n                    cur_xyz = l_xyz[0][batch_idx == k]\n                else:\n                    last_num_points = self.num_points_each_layer[i - 1]\n                    cur_xyz = l_xyz[-1][k * last_num_points: (k + 1) * last_num_points]\n                cur_pt_idxs = pointnet2_utils_stack.furthest_point_sample(\n                    cur_xyz[None, :, :].contiguous(), self.num_points_each_layer[i]\n                ).long()[0]\n                if cur_xyz.shape[0] < self.num_points_each_layer[i]:\n                    # cur_xyz is (num_points, 3) and cur_pt_idxs is 1-D after the [0] above,\n                    # so count points along dim 0 and index with a single axis.\n                    empty_num = self.num_points_each_layer[i] - cur_xyz.shape[0]\n                    cur_pt_idxs[-empty_num:] = cur_pt_idxs[:empty_num].clone()\n                new_xyz_list.append(cur_xyz[cur_pt_idxs])\n            new_xyz = torch.cat(new_xyz_list, dim=0)\n\n            new_xyz_batch_cnt = xyz.new_zeros(batch_size).int().fill_(self.num_points_each_layer[i])\n            li_xyz, li_features = self.SA_modules[i](\n                xyz=l_xyz[i], features=l_features[i], xyz_batch_cnt=l_batch_cnt[i],\n         
       new_xyz=new_xyz, new_xyz_batch_cnt=new_xyz_batch_cnt\n            )\n\n            l_xyz.append(li_xyz)\n            l_features.append(li_features)\n            l_batch_cnt.append(new_xyz_batch_cnt)\n\n        l_features[0] = points[:, 1:]\n        for i in range(-1, -(len(self.FP_modules) + 1), -1):\n            l_features[i - 1] = self.FP_modules[i](\n                unknown=l_xyz[i - 1], unknown_batch_cnt=l_batch_cnt[i - 1],\n                known=l_xyz[i], known_batch_cnt=l_batch_cnt[i],\n                unknown_feats=l_features[i - 1], known_feats=l_features[i]\n            )\n\n        batch_dict['point_features'] = l_features[0]\n        batch_dict['point_coords'] = torch.cat((batch_idx[:, None].float(), l_xyz[0]), dim=1)\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/backbones_3d/spconv_backbone.py",
    "content": "from functools import partial\nfrom ...utils.spconv_utils import replace_feature, spconv\nimport torch.nn as nn\nimport numpy as np\nimport torch\n\ndef post_act_block(in_channels, out_channels, kernel_size, indice_key=None, stride=1, padding=0,\n                   conv_type='subm', norm_fn=None):\n\n    if conv_type == 'subm':\n        conv = spconv.SubMConv3d(in_channels, out_channels, kernel_size, bias=False, indice_key=indice_key)\n        relu = nn.ReLU()\n    elif conv_type == 'spconv':\n        conv = spconv.SparseConv3d(in_channels, out_channels, kernel_size, stride=stride, padding=padding,\n                                   bias=False, indice_key=indice_key)\n        relu =nn.ReLU(inplace=True)\n    elif conv_type == 'inverseconv':\n        conv = spconv.SparseInverseConv3d(in_channels, out_channels, kernel_size, indice_key=indice_key, bias=False)\n        relu = nn.ReLU()\n    else:\n        raise NotImplementedError\n\n    m = spconv.SparseSequential(\n        conv,\n        norm_fn(out_channels),\n        relu,\n    )\n\n    return m\n\n\nclass SparseBasicBlock(spconv.SparseModule):\n    expansion = 1\n\n    def __init__(self, inplanes, planes, stride=1, norm_fn=None, downsample=None, indice_key=None):\n        super(SparseBasicBlock, self).__init__()\n\n        assert norm_fn is not None\n        bias = norm_fn is not None\n        self.conv1 = spconv.SubMConv3d(\n            inplanes, planes, kernel_size=3, stride=stride, padding=1, bias=bias, indice_key=indice_key\n        )\n        self.bn1 = norm_fn(planes)\n        self.relu = nn.ReLU()\n        self.conv2 = spconv.SubMConv3d(\n            planes, planes, kernel_size=3, stride=stride, padding=1, bias=bias, indice_key=indice_key\n        )\n        self.bn2 = norm_fn(planes)\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x):\n        identity = x\n\n        out = self.conv1(x)\n        out = replace_feature(out, 
self.bn1(out.features))\n        out = replace_feature(out, self.relu(out.features))\n\n        out = self.conv2(out)\n        out = replace_feature(out, self.bn2(out.features))\n\n        if self.downsample is not None:\n            identity = self.downsample(x)\n\n        out = replace_feature(out, out.features + identity.features)\n        out = replace_feature(out, self.relu(out.features))\n\n        return out\n\nclass BasicBlock(spconv.SparseModule):\n\n    def __init__(self, inplanes, planes,  norm_fn=None, stride=2,  padding=1,  indice_key=None):\n        super(BasicBlock, self).__init__()\n\n        assert norm_fn is not None\n\n        block = post_act_block\n        self.stride = stride\n        if stride >1:\n            self.down_conv = block(inplanes,\n                                    planes,\n                                    3,\n                                    norm_fn=norm_fn,\n                                    stride=2,\n                                    padding=padding,\n                                    indice_key=('sp' + indice_key),\n                                    conv_type='spconv')\n        if stride >1:\n            conv_in = planes\n        else:\n            conv_in = inplanes\n\n        self.conv1 = block(conv_in,\n                              planes // 2,\n                              3,\n                              norm_fn=norm_fn,\n                              padding=1,\n                              indice_key=('subm1' + indice_key))\n        self.conv2 = block(planes//2,\n                              planes // 2,\n                              3,\n                              norm_fn=norm_fn,\n                              padding=1,\n                              indice_key=('subm2' + indice_key))\n\n        self.conv3 = block(planes//2,\n                              planes // 2,\n                              3,\n                              norm_fn=norm_fn,\n                              padding=1,\n  
                            indice_key=('subm3' + indice_key))\n        self.conv4 = block(planes//2,\n                              planes // 2,\n                              3,\n                              norm_fn=norm_fn,\n                              padding=1,\n                              indice_key=('subm4' + indice_key))\n\n\n    def forward(self, x):\n\n        if self.stride>1:\n            x = self.down_conv(x)\n        x1 = self.conv1(x)\n        x2 = self.conv2(x1)\n        x3 = self.conv3(x2)\n        x4 = self.conv4(x3)\n\n        out = replace_feature(x2, torch.cat([x1.features, x4.features],-1))\n\n        return out\n\nclass TeMMVoxelBackBone8x(nn.Module):\n    def __init__(self, model_cfg, input_channels, grid_size,  **kwargs):\n        super().__init__()\n        self.model_cfg = model_cfg\n\n        self.return_num_features_as_dict = model_cfg.RETURN_NUM_FEATURES_AS_DICT\n        self.out_features=model_cfg.OUT_FEATURES\n\n        num_filters = model_cfg.NUM_FILTERS\n\n        norm_fn = partial(nn.BatchNorm1d, eps=1e-3, momentum=0.01)\n\n        self.sparse_shape = grid_size[::-1] + [1, 0, 0]\n\n        self.conv_input = spconv.SparseSequential(\n            spconv.SubMConv3d(input_channels, num_filters[0], 3, padding=1, bias=False, indice_key='subm1'),\n            norm_fn(num_filters[0]),\n            nn.ReLU(),\n        )\n        block = post_act_block\n\n        self.conv1 = spconv.SparseSequential(\n            block(num_filters[0], num_filters[0], 3, norm_fn=norm_fn, padding=1, indice_key='subm1'),\n        )\n\n        self.conv2 = spconv.SparseSequential(\n            # [1600, 1408, 41] <- [800, 704, 21]\n            block(num_filters[0], num_filters[1], 3, norm_fn=norm_fn, stride=2, padding=1, indice_key='spconv2', conv_type='spconv'),\n            block(num_filters[1], num_filters[1], 3, norm_fn=norm_fn, padding=1, indice_key='subm2'),\n            block(num_filters[1], num_filters[1], 3, norm_fn=norm_fn, padding=1, 
indice_key='subm2'),\n        )\n\n        self.conv3 = spconv.SparseSequential(\n            # [800, 704, 21] <- [400, 352, 11]\n            block(num_filters[1], num_filters[2], 3, norm_fn=norm_fn, stride=2, padding=1, indice_key='spconv3', conv_type='spconv'),\n            block(num_filters[2], num_filters[2], 3, norm_fn=norm_fn, padding=1, indice_key='subm3'),\n            block(num_filters[2], num_filters[2], 3, norm_fn=norm_fn, padding=1, indice_key='subm3'),\n        )\n\n        self.conv4 = spconv.SparseSequential(\n            # [400, 352, 11] <- [200, 176, 5]\n            block(num_filters[2], num_filters[3], 3, norm_fn=norm_fn, stride=2, padding=(0, 1, 1), indice_key='spconv4', conv_type='spconv'),\n            block(num_filters[3], num_filters[3], 3, norm_fn=norm_fn, padding=1, indice_key='subm4'),\n            block(num_filters[3], num_filters[3], 3, norm_fn=norm_fn, padding=1, indice_key='subm4'),\n        )\n\n        last_pad = 0\n        last_pad = self.model_cfg.get('last_pad', last_pad)\n        self.conv_out = spconv.SparseSequential(\n            # [200, 150, 5] -> [200, 150, 2]\n            spconv.SparseConv3d(num_filters[3], self.out_features, (3, 1, 1), stride=(2, 1, 1), padding=last_pad,\n                                bias=False, indice_key='spconv_down2'),\n            norm_fn(self.out_features),\n            nn.ReLU(),\n        )\n        if self.model_cfg.get('MM', False):\n            self.conv_input_2 = spconv.SparseSequential(\n                spconv.SubMConv3d(input_channels, num_filters[0], 3, padding=1, bias=False, indice_key='subm1_2'),\n                norm_fn(num_filters[0]),\n                nn.ReLU(),\n            )\n            block = post_act_block\n\n            self.conv1_2 = spconv.SparseSequential(\n                block(num_filters[0], num_filters[0], 3, norm_fn=norm_fn, padding=1, indice_key='subm1_2'),\n            )\n\n            self.conv2_2 = spconv.SparseSequential(\n                # [1600, 1408, 41] <- 
[800, 704, 21]\n                block(num_filters[0], num_filters[1], 3, norm_fn=norm_fn, stride=2, padding=1, indice_key='spconv2_2', conv_type='spconv'),\n                block(num_filters[1], num_filters[1], 3, norm_fn=norm_fn, padding=1, indice_key='subm2_2'),\n                block(num_filters[1], num_filters[1], 3, norm_fn=norm_fn, padding=1, indice_key='subm2_2'),\n            )\n\n            self.conv3_2 = spconv.SparseSequential(\n                # [800, 704, 21] <- [400, 352, 11]\n                block(num_filters[1], num_filters[2], 3, norm_fn=norm_fn, stride=2, padding=1, indice_key='spconv3_2', conv_type='spconv'),\n                block(num_filters[2], num_filters[2], 3, norm_fn=norm_fn, padding=1, indice_key='subm3_2'),\n                block(num_filters[2], num_filters[2], 3, norm_fn=norm_fn, padding=1, indice_key='subm3_2'),\n            )\n\n            self.conv4_2 = spconv.SparseSequential(\n                # [400, 352, 11] <- [200, 176, 5]\n                block(num_filters[2], num_filters[3], 3, norm_fn=norm_fn, stride=2, padding=(0, 1, 1), indice_key='spconv4_2', conv_type='spconv'),\n                block(num_filters[3], num_filters[3], 3, norm_fn=norm_fn, padding=1, indice_key='subm4_2'),\n                block(num_filters[3], num_filters[3], 3, norm_fn=norm_fn, padding=1, indice_key='subm4_2'),\n            )\n\n        self.num_point_features = self.out_features\n\n        if self.return_num_features_as_dict:\n            num_point_features = {}\n            num_point_features.update({\n                'x_conv1': num_filters[0],\n                'x_conv2': num_filters[1],\n                'x_conv3': num_filters[2],\n                'x_conv4': num_filters[3],\n            })\n            self.num_point_features = num_point_features\n\n    def decompose_tensor(self, tensor, i, batch_size):\n        input_shape = tensor.spatial_shape[2]\n        begin_shape_ids = i * (input_shape // 4)\n        end_shape_ids = (i + 1) * (input_shape // 4)\n 
       x_conv3_features = tensor.features\n        x_conv3_coords = tensor.indices\n\n        mask = (begin_shape_ids < x_conv3_coords[:, 3]) & (x_conv3_coords[:, 3] < end_shape_ids)\n        this_conv3_feat = x_conv3_features[mask]\n        this_conv3_coords = x_conv3_coords[mask]\n        this_conv3_coords[:, 3] -= i * (input_shape // 4)\n        this_shape = [tensor.spatial_shape[0], tensor.spatial_shape[1], tensor.spatial_shape[2] // 4]\n\n        this_conv3_tensor = spconv.SparseConvTensor(\n            features=this_conv3_feat,\n            indices=this_conv3_coords.int(),\n            spatial_shape=this_shape,\n            batch_size=batch_size\n        )\n        return this_conv3_tensor\n\n    def forward_test(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size: int\n                vfe_features: (num_voxels, C)\n                voxel_coords: (num_voxels, 4), [batch_idx, z_idx, y_idx, x_idx]\n        Returns:\n            batch_dict:\n                encoded_spconv_tensor: sparse tensor\n        \"\"\"\n\n        if 'transform_param' in batch_dict:\n            trans_param = batch_dict['transform_param']\n            rot_num = trans_param.shape[1]\n        else:\n            rot_num = 1\n\n        all_lidar_feat = []\n        all_lidar_coords = []\n\n        new_shape = [self.sparse_shape[0], self.sparse_shape[1], self.sparse_shape[2] * 4]\n\n        for i in range(rot_num):\n            if i==0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n\n            voxel_features, voxel_coords = batch_dict['voxel_features'+rot_num_id], batch_dict['voxel_coords'+rot_num_id]\n\n            all_lidar_feat.append(voxel_features)\n            new_coord = voxel_coords.clone()\n            new_coord[:, 3] += i*self.sparse_shape[2]\n            all_lidar_coords.append(new_coord)\n        batch_size = batch_dict['batch_size']\n\n        all_lidar_feat = torch.cat(all_lidar_feat, 
0)\n        all_lidar_coords = torch.cat(all_lidar_coords)\n\n        input_sp_tensor = spconv.SparseConvTensor(\n            features=all_lidar_feat,\n            indices=all_lidar_coords.int(),\n            spatial_shape=new_shape,\n            batch_size=batch_size\n        )\n        x = self.conv_input(input_sp_tensor)\n\n        x_conv1 = self.conv1(x)\n        x_conv2 = self.conv2(x_conv1)\n        x_conv3 = self.conv3(x_conv2)\n        x_conv4 = self.conv4(x_conv3)\n        out = self.conv_out(x_conv4)\n        for i in range(rot_num):\n            if i==0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n\n            this_conv3 = self.decompose_tensor(x_conv3, i, batch_size)\n            this_conv4 = self.decompose_tensor(x_conv4, i, batch_size)\n            this_out = self.decompose_tensor(out, i, batch_size)\n\n            batch_dict.update({\n                'encoded_spconv_tensor'+rot_num_id: this_out,\n                'encoded_spconv_tensor_stride'+rot_num_id: 8,\n            })\n            batch_dict.update({\n                'multi_scale_3d_features'+rot_num_id: {\n                    'x_conv1': None,\n                    'x_conv2': None,\n                    'x_conv3': this_conv3,\n                    'x_conv4': this_conv4,\n                },\n                'multi_scale_3d_strides'+rot_num_id: {\n                    'x_conv1': 1,\n                    'x_conv2': 2,\n                    'x_conv3': 4,\n                    'x_conv4': 8,\n                }\n            })\n\n\n        if self.model_cfg.get('MM', False):\n            all_mm_feat = []\n            all_mm_coords = []\n            for i in range(rot_num):\n                if i == 0:\n                    rot_num_id = ''\n                else:\n                    rot_num_id = str(i)\n\n                newvoxel_features, newvoxel_coords = batch_dict['voxel_features_mm'+rot_num_id], batch_dict['voxel_coords_mm'+rot_num_id]\n\n                
all_mm_feat.append(newvoxel_features)\n                new_mm_coord = newvoxel_coords.clone()\n                new_mm_coord[:, 3] += i * self.sparse_shape[2]\n                all_mm_coords.append(new_mm_coord)\n            all_mm_feat = torch.cat(all_mm_feat, 0)\n            all_mm_coords = torch.cat(all_mm_coords)\n\n            newinput_sp_tensor = spconv.SparseConvTensor(\n                features=all_mm_feat,\n                indices=all_mm_coords.int(),\n                spatial_shape=new_shape,\n                batch_size=batch_size\n            )\n\n            newx = self.conv_input_2(newinput_sp_tensor)\n\n            newx_conv1 = self.conv1_2(newx)\n            newx_conv2 = self.conv2_2(newx_conv1)\n            newx_conv3 = self.conv3_2(newx_conv2)\n            newx_conv4 = self.conv4_2(newx_conv3)\n\n            for i in range(rot_num):\n                if i == 0:\n                    rot_num_id = ''\n                else:\n                    rot_num_id = str(i)\n\n                this_conv3 = self.decompose_tensor(newx_conv3, i, batch_size)\n                this_conv4 = self.decompose_tensor(newx_conv4, i, batch_size)\n                batch_dict.update({\n                    'encoded_spconv_tensor_stride_mm'+rot_num_id: 8\n                })\n                batch_dict.update({\n                    'multi_scale_3d_features_mm'+rot_num_id: {\n                        'x_conv1': None,\n                        'x_conv2': None,\n                        'x_conv3': this_conv3,\n                        'x_conv4': this_conv4,\n                    },\n                    'multi_scale_3d_strides'+rot_num_id: {\n                        'x_conv1': 1,\n                        'x_conv2': 2,\n                        'x_conv3': 4,\n                        'x_conv4': 8,\n                    }\n                })\n\n        return batch_dict\n\n    def forward_train(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size: 
int\n                vfe_features: (num_voxels, C)\n                voxel_coords: (num_voxels, 4), [batch_idx, z_idx, y_idx, x_idx]\n        Returns:\n            batch_dict:\n                encoded_spconv_tensor: sparse tensor\n        \"\"\"\n        if 'transform_param' in batch_dict:\n            trans_param = batch_dict['transform_param']\n            rot_num = trans_param.shape[1]\n        else:\n            rot_num = 1\n\n        for i in range(rot_num):\n            if i==0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n\n            voxel_features, voxel_coords = batch_dict['voxel_features'+rot_num_id], batch_dict['voxel_coords'+rot_num_id]\n\n            batch_size = batch_dict['batch_size']\n            input_sp_tensor = spconv.SparseConvTensor(\n                features=voxel_features,\n                indices=voxel_coords.int(),\n                spatial_shape=self.sparse_shape,\n                batch_size=batch_size\n            )\n            x = self.conv_input(input_sp_tensor)\n\n            x_conv1 = self.conv1(x)\n            x_conv2 = self.conv2(x_conv1)\n            x_conv3 = self.conv3(x_conv2)\n            x_conv4 = self.conv4(x_conv3)\n\n            # for detection head\n            # [200, 176, 5] -> [200, 176, 2]\n            out = self.conv_out(x_conv4)\n\n            batch_dict.update({\n                'encoded_spconv_tensor'+rot_num_id: out,\n                'encoded_spconv_tensor_stride'+rot_num_id: 8,\n            })\n            batch_dict.update({\n                'multi_scale_3d_features'+rot_num_id: {\n                    'x_conv1': x_conv1,\n                    'x_conv2': x_conv2,\n                    'x_conv3': x_conv3,\n                    'x_conv4': x_conv4,\n                },\n                'multi_scale_3d_strides'+rot_num_id: {\n                    'x_conv1': 1,\n                    'x_conv2': 2,\n                    'x_conv3': 4,\n                    'x_conv4': 8,\n            
    }\n            })\n\n            if self.model_cfg.get('MM', False):\n                newvoxel_features, newvoxel_coords = batch_dict['voxel_features_mm'+rot_num_id], batch_dict['voxel_coords_mm'+rot_num_id]\n\n                newinput_sp_tensor = spconv.SparseConvTensor(\n                    features=newvoxel_features,\n                    indices=newvoxel_coords.int(),\n                    spatial_shape=self.sparse_shape,\n                    batch_size=batch_size\n                )\n                newx = self.conv_input_2(newinput_sp_tensor)\n\n                newx_conv1 = self.conv1_2(newx)\n                newx_conv2 = self.conv2_2(newx_conv1)\n                newx_conv3 = self.conv3_2(newx_conv2)\n                newx_conv4 = self.conv4_2(newx_conv3)\n\n                # for detection head\n                # [200, 176, 5] -> [200, 176, 2]\n                #newout = self.conv_out(newx_conv4)\n\n                batch_dict.update({\n                    #'encoded_spconv_tensor_mm': newout,\n                    'encoded_spconv_tensor_stride_mm'+rot_num_id: 8\n                })\n                batch_dict.update({\n                    'multi_scale_3d_features_mm'+rot_num_id: {\n                        'x_conv1': newx_conv1,\n                        'x_conv2': newx_conv2,\n                        'x_conv3': newx_conv3,\n                        'x_conv4': newx_conv4,\n                    },\n                    'multi_scale_3d_strides'+rot_num_id: {\n                        'x_conv1': 1,\n                        'x_conv2': 2,\n                        'x_conv3': 4,\n                        'x_conv4': 8,\n                    }\n                })\n\n        return batch_dict\n\n    def forward(self, batch_dict):\n        if self.training:\n            return self.forward_train(batch_dict)\n        else:\n            return self.forward_test(batch_dict)\n\n\nclass TeVoxelBackBone8x(nn.Module):\n    def __init__(self, model_cfg, input_channels, grid_size, 
**kwargs):\n        super().__init__()\n        self.model_cfg = model_cfg\n\n        self.return_num_features_as_dict = model_cfg.RETURN_NUM_FEATURES_AS_DICT\n        self.out_features=model_cfg.OUT_FEATURES\n\n        num_filters = model_cfg.NUM_FILTERS\n\n        norm_fn = partial(nn.BatchNorm1d, eps=1e-3, momentum=0.01)\n\n        self.sparse_shape = grid_size[::-1] + [1, 0, 0]\n\n        self.conv_input = spconv.SparseSequential(\n            spconv.SubMConv3d(input_channels, num_filters[0], 3, padding=1, bias=False, indice_key='subm1'),\n            norm_fn(num_filters[0]),\n            nn.ReLU(),\n        )\n        block = post_act_block\n        self.conv1 = spconv.SparseSequential(\n            block(num_filters[0], num_filters[0], 3, norm_fn=norm_fn, padding=1, indice_key='conv1'),\n        )\n        self.conv2 = BasicBlock(num_filters[0], num_filters[1], norm_fn=norm_fn,  indice_key='conv2')\n        self.conv3 = BasicBlock(num_filters[1], num_filters[2], norm_fn=norm_fn,  indice_key='conv3')\n        self.conv4 = BasicBlock(num_filters[2], num_filters[3], norm_fn=norm_fn,  padding=(0, 1, 1),  indice_key='conv4')\n\n\n        last_pad = 0\n        last_pad = self.model_cfg.get('last_pad', last_pad)\n        self.conv_out = spconv.SparseSequential(\n            # [200, 150, 5] -> [200, 150, 2]\n            spconv.SparseConv3d(num_filters[3], self.out_features, (3, 1, 1), stride=(2, 1, 1), padding=last_pad,\n                                bias=False, indice_key='spconv_down2'),\n            norm_fn(self.out_features),\n            nn.ReLU(),\n        )\n        if self.model_cfg.get('MM', False):\n            self.conv_input_2 = spconv.SparseSequential(\n                spconv.SubMConv3d(input_channels, num_filters[0], 3, padding=1, bias=False, indice_key='subm1_2'),\n                norm_fn(num_filters[0]),\n                nn.ReLU(),\n            )\n\n            self.conv1_2 = spconv.SparseSequential(\n                block(num_filters[0], 
num_filters[0], 3, norm_fn=norm_fn, padding=1, indice_key='conv1_2'),\n            )\n            self.conv2_2 = BasicBlock(num_filters[0], num_filters[1], norm_fn=norm_fn, indice_key='conv2_2')\n            self.conv3_2 = BasicBlock(num_filters[1], num_filters[2], norm_fn=norm_fn, indice_key='conv3_2')\n            self.conv4_2 = BasicBlock(num_filters[2], num_filters[3], norm_fn=norm_fn, padding=(0, 1, 1),  indice_key='conv4_2')\n\n\n        self.num_point_features = self.out_features\n\n        if self.return_num_features_as_dict:\n            num_point_features = {}\n            num_point_features.update({\n                'x_conv1': num_filters[0],\n                'x_conv2': num_filters[1],\n                'x_conv3': num_filters[2],\n                'x_conv4': num_filters[3],\n            })\n            self.num_point_features = num_point_features\n\n\n    def decompose_tensor(self, tensor, i, batch_size):\n        input_shape = tensor.spatial_shape[2]\n        begin_shape_ids = i * (input_shape // 4)\n        end_shape_ids = (i + 1) * (input_shape // 4)\n        x_conv3_features = tensor.features\n        x_conv3_coords = tensor.indices\n\n        mask = (begin_shape_ids < x_conv3_coords[:, 3]) & (x_conv3_coords[:, 3] < end_shape_ids)\n        this_conv3_feat = x_conv3_features[mask]\n        this_conv3_coords = x_conv3_coords[mask]\n        this_conv3_coords[:, 3] -= i * (input_shape // 4)\n        this_shape = [tensor.spatial_shape[0], tensor.spatial_shape[1], tensor.spatial_shape[2] // 4]\n\n        this_conv3_tensor = spconv.SparseConvTensor(\n            features=this_conv3_feat,\n            indices=this_conv3_coords.int(),\n            spatial_shape=this_shape,\n            batch_size=batch_size\n        )\n        return this_conv3_tensor\n\n    def forward_test(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size: int\n                vfe_features: (num_voxels, C)\n                voxel_coords: 
(num_voxels, 4), [batch_idx, z_idx, y_idx, x_idx]\n        Returns:\n            batch_dict:\n                encoded_spconv_tensor: sparse tensor\n        \"\"\"\n\n        if 'transform_param' in batch_dict:\n            trans_param = batch_dict['transform_param']\n            rot_num = trans_param.shape[1]\n        else:\n            rot_num = 1\n\n        all_lidar_feat = []\n        all_lidar_coords = []\n\n        new_shape = [self.sparse_shape[0], self.sparse_shape[1], self.sparse_shape[2] * 4]\n\n        for i in range(rot_num):\n            if i==0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n\n            voxel_features, voxel_coords = batch_dict['voxel_features'+rot_num_id], batch_dict['voxel_coords'+rot_num_id]\n\n            all_lidar_feat.append(voxel_features)\n            new_coord = voxel_coords.clone()\n            new_coord[:, 3] += i*self.sparse_shape[2]\n            all_lidar_coords.append(new_coord)\n        batch_size = batch_dict['batch_size']\n\n        all_lidar_feat = torch.cat(all_lidar_feat, 0)\n        all_lidar_coords = torch.cat(all_lidar_coords)\n\n        input_sp_tensor = spconv.SparseConvTensor(\n            features=all_lidar_feat,\n            indices=all_lidar_coords.int(),\n            spatial_shape=new_shape,\n            batch_size=batch_size\n        )\n        x = self.conv_input(input_sp_tensor)\n\n        x_conv1 = self.conv1(x)\n        x_conv2 = self.conv2(x_conv1)\n        x_conv3 = self.conv3(x_conv2)\n        x_conv4 = self.conv4(x_conv3)\n        out = self.conv_out(x_conv4)\n        for i in range(rot_num):\n            if i==0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n\n            this_conv3 = self.decompose_tensor(x_conv3, i, batch_size)\n            this_conv4 = self.decompose_tensor(x_conv4, i, batch_size)\n            this_out = self.decompose_tensor(out, i, batch_size)\n\n            batch_dict.update({\n         
       'encoded_spconv_tensor'+rot_num_id: this_out,\n                'encoded_spconv_tensor_stride'+rot_num_id: 8,\n            })\n            batch_dict.update({\n                'multi_scale_3d_features'+rot_num_id: {\n                    'x_conv1': None,\n                    'x_conv2': None,\n                    'x_conv3': this_conv3,\n                    'x_conv4': this_conv4,\n                },\n                'multi_scale_3d_strides'+rot_num_id: {\n                    'x_conv1': 1,\n                    'x_conv2': 2,\n                    'x_conv3': 4,\n                    'x_conv4': 8,\n                }\n            })\n\n\n        if self.model_cfg.get('MM', False):\n            all_mm_feat = []\n            all_mm_coords = []\n            for i in range(rot_num):\n                if i == 0:\n                    rot_num_id = ''\n                else:\n                    rot_num_id = str(i)\n\n                newvoxel_features, newvoxel_coords = batch_dict['voxel_features_mm'+rot_num_id], batch_dict['voxel_coords_mm'+rot_num_id]\n\n                all_mm_feat.append(newvoxel_features)\n                new_mm_coord = newvoxel_coords.clone()\n                new_mm_coord[:, 3] += i * self.sparse_shape[2]\n                all_mm_coords.append(new_mm_coord)\n            all_mm_feat = torch.cat(all_mm_feat, 0)\n            all_mm_coords = torch.cat(all_mm_coords)\n\n            newinput_sp_tensor = spconv.SparseConvTensor(\n                features=all_mm_feat,\n                indices=all_mm_coords.int(),\n                spatial_shape=new_shape,\n                batch_size=batch_size\n            )\n\n            newx = self.conv_input_2(newinput_sp_tensor)\n\n            newx_conv1 = self.conv1_2(newx)\n            newx_conv2 = self.conv2_2(newx_conv1)\n            newx_conv3 = self.conv3_2(newx_conv2)\n            newx_conv4 = self.conv4_2(newx_conv3)\n\n            for i in range(rot_num):\n                if i == 0:\n                    rot_num_id = 
''\n                else:\n                    rot_num_id = str(i)\n\n                this_conv3 = self.decompose_tensor(newx_conv3, i, batch_size)\n                this_conv4 = self.decompose_tensor(newx_conv4, i, batch_size)\n                batch_dict.update({\n                    'encoded_spconv_tensor_stride_mm'+rot_num_id: 8\n                })\n                batch_dict.update({\n                    'multi_scale_3d_features_mm'+rot_num_id: {\n                        'x_conv1': None,\n                        'x_conv2': None,\n                        'x_conv3': this_conv3,\n                        'x_conv4': this_conv4,\n                    },\n                    'multi_scale_3d_strides'+rot_num_id: {\n                        'x_conv1': 1,\n                        'x_conv2': 2,\n                        'x_conv3': 4,\n                        'x_conv4': 8,\n                    }\n                })\n\n        return batch_dict\n\n    def forward_train(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size: int\n                vfe_features: (num_voxels, C)\n                voxel_coords: (num_voxels, 4), [batch_idx, z_idx, y_idx, x_idx]\n        Returns:\n            batch_dict:\n                encoded_spconv_tensor: sparse tensor\n        \"\"\"\n        if 'transform_param' in batch_dict:\n            trans_param = batch_dict['transform_param']\n            rot_num = trans_param.shape[1]\n        else:\n            rot_num = 1\n\n\n        for i in range(rot_num):\n            if i==0:\n                rot_num_id = ''\n            else:\n                rot_num_id = str(i)\n\n            voxel_features, voxel_coords = batch_dict['voxel_features'+rot_num_id], batch_dict['voxel_coords'+rot_num_id]\n\n            batch_size = batch_dict['batch_size']\n            input_sp_tensor = spconv.SparseConvTensor(\n                features=voxel_features,\n                indices=voxel_coords.int(),\n                
spatial_shape=self.sparse_shape,\n                batch_size=batch_size\n            )\n            x = self.conv_input(input_sp_tensor)\n\n            x_conv1 = self.conv1(x)\n            x_conv2 = self.conv2(x_conv1)\n            x_conv3 = self.conv3(x_conv2)\n            x_conv4 = self.conv4(x_conv3)\n\n            # for detection head\n            # [200, 176, 5] -> [200, 176, 2]\n            out = self.conv_out(x_conv4)\n\n            batch_dict.update({\n                'encoded_spconv_tensor'+rot_num_id: out,\n                'encoded_spconv_tensor_stride'+rot_num_id: 8,\n            })\n            batch_dict.update({\n                'multi_scale_3d_features'+rot_num_id: {\n                    'x_conv1': x_conv1,\n                    'x_conv2': x_conv2,\n                    'x_conv3': x_conv3,\n                    'x_conv4': x_conv4,\n                },\n                'multi_scale_3d_strides'+rot_num_id: {\n                    'x_conv1': 1,\n                    'x_conv2': 2,\n                    'x_conv3': 4,\n                    'x_conv4': 8,\n                }\n            })\n\n            if self.model_cfg.get('MM', False):\n                newvoxel_features, newvoxel_coords = batch_dict['voxel_features_mm'+rot_num_id], batch_dict['voxel_coords_mm'+rot_num_id]\n\n                newinput_sp_tensor = spconv.SparseConvTensor(\n                    features=newvoxel_features,\n                    indices=newvoxel_coords.int(),\n                    spatial_shape=self.sparse_shape,\n                    batch_size=batch_size\n                )\n                newx = self.conv_input_2(newinput_sp_tensor)\n\n                newx_conv1 = self.conv1_2(newx)\n                newx_conv2 = self.conv2_2(newx_conv1)\n                newx_conv3 = self.conv3_2(newx_conv2)\n                newx_conv4 = self.conv4_2(newx_conv3)\n\n                # for detection head\n                # [200, 176, 5] -> [200, 176, 2]\n                #newout = 
self.conv_out(newx_conv4)\n\n                batch_dict.update({\n                    #'encoded_spconv_tensor_mm': newout,\n                    'encoded_spconv_tensor_stride_mm'+rot_num_id: 8\n                })\n                batch_dict.update({\n                    'multi_scale_3d_features_mm'+rot_num_id: {\n                        'x_conv1': newx_conv1,\n                        'x_conv2': newx_conv2,\n                        'x_conv3': newx_conv3,\n                        'x_conv4': newx_conv4,\n                    },\n                    'multi_scale_3d_strides'+rot_num_id: {\n                        'x_conv1': 1,\n                        'x_conv2': 2,\n                        'x_conv3': 4,\n                        'x_conv4': 8,\n                    }\n                })\n\n        return batch_dict\n\n    def forward(self, batch_dict):\n        if self.training:\n            return self.forward_train(batch_dict)\n        else:\n            return self.forward_test(batch_dict)\n\n\n\n"
  },
  {
    "path": "pcdet/models/backbones_3d/spconv_unet.py",
    "content": "from functools import partial\n\nimport spconv\nimport torch\nimport torch.nn as nn\n\nfrom ...utils import common_utils\nfrom .spconv_backbone import post_act_block\n\n\nclass SparseBasicBlock(spconv.SparseModule):\n    expansion = 1\n\n    def __init__(self, inplanes, planes, stride=1, downsample=None, indice_key=None, norm_fn=None):\n        super(SparseBasicBlock, self).__init__()\n        self.conv1 = spconv.SubMConv3d(\n            inplanes, planes, kernel_size=3, stride=stride, padding=1, bias=False, indice_key=indice_key\n        )\n        self.bn1 = norm_fn(planes)\n        self.relu = nn.ReLU()\n        self.conv2 = spconv.SubMConv3d(\n            planes, planes, kernel_size=3, stride=1, padding=1, bias=False, indice_key=indice_key\n        )\n        self.bn2 = norm_fn(planes)\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x):\n        identity = x.features\n\n        assert x.features.dim() == 2, 'x.features.dim()=%d' % x.features.dim()\n\n        out = self.conv1(x)\n        out.features = self.bn1(out.features)\n        out.features = self.relu(out.features)\n\n        out = self.conv2(out)\n        out.features = self.bn2(out.features)\n\n        if self.downsample is not None:\n            identity = self.downsample(x)\n\n        out.features += identity\n        out.features = self.relu(out.features)\n\n        return out\n\n\nclass UNetV2(nn.Module):\n    \"\"\"\n    Sparse Convolution based UNet for point-wise feature learning.\n    Reference Paper: https://arxiv.org/abs/1907.03670 (Shaoshuai Shi, et. 
al)\n    From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network\n    \"\"\"\n    def __init__(self, model_cfg, input_channels, grid_size, voxel_size, point_cloud_range, **kwargs):\n        super().__init__()\n        self.model_cfg = model_cfg\n        self.sparse_shape = grid_size[::-1] + [1, 0, 0]\n        self.voxel_size = voxel_size\n        self.point_cloud_range = point_cloud_range\n\n        norm_fn = partial(nn.BatchNorm1d, eps=1e-3, momentum=0.01)\n\n        self.conv_input = spconv.SparseSequential(\n            spconv.SubMConv3d(input_channels, 16, 3, padding=1, bias=False, indice_key='subm1'),\n            norm_fn(16),\n            nn.ReLU(),\n        )\n        block = post_act_block\n\n        self.conv1 = spconv.SparseSequential(\n            block(16, 16, 3, norm_fn=norm_fn, padding=1, indice_key='subm1'),\n        )\n\n        self.conv2 = spconv.SparseSequential(\n            # [1600, 1408, 41] <- [800, 704, 21]\n            block(16, 32, 3, norm_fn=norm_fn, stride=2, padding=1, indice_key='spconv2', conv_type='spconv'),\n            block(32, 32, 3, norm_fn=norm_fn, padding=1, indice_key='subm2'),\n            block(32, 32, 3, norm_fn=norm_fn, padding=1, indice_key='subm2'),\n        )\n\n        self.conv3 = spconv.SparseSequential(\n            # [800, 704, 21] <- [400, 352, 11]\n            block(32, 64, 3, norm_fn=norm_fn, stride=2, padding=1, indice_key='spconv3', conv_type='spconv'),\n            block(64, 64, 3, norm_fn=norm_fn, padding=1, indice_key='subm3'),\n            block(64, 64, 3, norm_fn=norm_fn, padding=1, indice_key='subm3'),\n        )\n\n        self.conv4 = spconv.SparseSequential(\n            # [400, 352, 11] <- [200, 176, 5]\n            block(64, 64, 3, norm_fn=norm_fn, stride=2, padding=(0, 1, 1), indice_key='spconv4', conv_type='spconv'),\n            block(64, 64, 3, norm_fn=norm_fn, padding=1, indice_key='subm4'),\n            block(64, 64, 3, norm_fn=norm_fn, 
padding=1, indice_key='subm4'),\n        )\n\n        if self.model_cfg.get('RETURN_ENCODED_TENSOR', True):\n            last_pad = self.model_cfg.get('last_pad', 0)\n\n            self.conv_out = spconv.SparseSequential(\n                # [200, 150, 5] -> [200, 150, 2]\n                spconv.SparseConv3d(64, 128, (3, 1, 1), stride=(2, 1, 1), padding=last_pad,\n                                    bias=False, indice_key='spconv_down2'),\n                norm_fn(128),\n                nn.ReLU(),\n            )\n        else:\n            self.conv_out = None\n\n        # decoder\n        # [400, 352, 11] <- [200, 176, 5]\n        self.conv_up_t4 = SparseBasicBlock(64, 64, indice_key='subm4', norm_fn=norm_fn)\n        self.conv_up_m4 = block(128, 64, 3, norm_fn=norm_fn, padding=1, indice_key='subm4')\n        self.inv_conv4 = block(64, 64, 3, norm_fn=norm_fn, indice_key='spconv4', conv_type='inverseconv')\n\n        # [800, 704, 21] <- [400, 352, 11]\n        self.conv_up_t3 = SparseBasicBlock(64, 64, indice_key='subm3', norm_fn=norm_fn)\n        self.conv_up_m3 = block(128, 64, 3, norm_fn=norm_fn, padding=1, indice_key='subm3')\n        self.inv_conv3 = block(64, 32, 3, norm_fn=norm_fn, indice_key='spconv3', conv_type='inverseconv')\n\n        # [1600, 1408, 41] <- [800, 704, 21]\n        self.conv_up_t2 = SparseBasicBlock(32, 32, indice_key='subm2', norm_fn=norm_fn)\n        self.conv_up_m2 = block(64, 32, 3, norm_fn=norm_fn, indice_key='subm2')\n        self.inv_conv2 = block(32, 16, 3, norm_fn=norm_fn, indice_key='spconv2', conv_type='inverseconv')\n\n        # [1600, 1408, 41] <- [1600, 1408, 41]\n        self.conv_up_t1 = SparseBasicBlock(16, 16, indice_key='subm1', norm_fn=norm_fn)\n        self.conv_up_m1 = block(32, 16, 3, norm_fn=norm_fn, indice_key='subm1')\n\n        self.conv5 = spconv.SparseSequential(\n            block(16, 16, 3, norm_fn=norm_fn, padding=1, indice_key='subm1')\n        )\n        self.num_point_features = 16\n\n    def 
UR_block_forward(self, x_lateral, x_bottom, conv_t, conv_m, conv_inv):\n        x_trans = conv_t(x_lateral)\n        x = x_trans\n        x.features = torch.cat((x_bottom.features, x_trans.features), dim=1)\n        x_m = conv_m(x)\n        x = self.channel_reduction(x, x_m.features.shape[1])\n        x.features = x_m.features + x.features\n        x = conv_inv(x)\n        return x\n\n    @staticmethod\n    def channel_reduction(x, out_channels):\n        \"\"\"\n        Args:\n            x: x.features (N, C1)\n            out_channels: C2\n\n        Returns:\n\n        \"\"\"\n        features = x.features\n        n, in_channels = features.shape\n        assert (in_channels % out_channels == 0) and (in_channels >= out_channels)\n\n        x.features = features.view(n, out_channels, -1).sum(dim=2)\n        return x\n\n    def forward(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size: int\n                vfe_features: (num_voxels, C)\n                voxel_coords: (num_voxels, 4), [batch_idx, z_idx, y_idx, x_idx]\n        Returns:\n            batch_dict:\n                encoded_spconv_tensor: sparse tensor\n                point_features: (N, C)\n        \"\"\"\n        voxel_features, voxel_coords = batch_dict['voxel_features'], batch_dict['voxel_coords']\n        batch_size = batch_dict['batch_size']\n        input_sp_tensor = spconv.SparseConvTensor(\n            features=voxel_features,\n            indices=voxel_coords.int(),\n            spatial_shape=self.sparse_shape,\n            batch_size=batch_size\n        )\n        x = self.conv_input(input_sp_tensor)\n\n        x_conv1 = self.conv1(x)\n        x_conv2 = self.conv2(x_conv1)\n        x_conv3 = self.conv3(x_conv2)\n        x_conv4 = self.conv4(x_conv3)\n\n        if self.conv_out is not None:\n            # for detection head\n            # [200, 176, 5] -> [200, 176, 2]\n            out = self.conv_out(x_conv4)\n            
batch_dict['encoded_spconv_tensor'] = out\n            batch_dict['encoded_spconv_tensor_stride'] = 8\n\n        # for segmentation head\n        # [400, 352, 11] <- [200, 176, 5]\n        x_up4 = self.UR_block_forward(x_conv4, x_conv4, self.conv_up_t4, self.conv_up_m4, self.inv_conv4)\n        # [800, 704, 21] <- [400, 352, 11]\n        x_up3 = self.UR_block_forward(x_conv3, x_up4, self.conv_up_t3, self.conv_up_m3, self.inv_conv3)\n        # [1600, 1408, 41] <- [800, 704, 21]\n        x_up2 = self.UR_block_forward(x_conv2, x_up3, self.conv_up_t2, self.conv_up_m2, self.inv_conv2)\n        # [1600, 1408, 41] <- [1600, 1408, 41]\n        x_up1 = self.UR_block_forward(x_conv1, x_up2, self.conv_up_t1, self.conv_up_m1, self.conv5)\n\n        batch_dict['point_features'] = x_up1.features\n        point_coords = common_utils.get_voxel_centers(\n            x_up1.indices[:, 1:], downsample_times=1, voxel_size=self.voxel_size,\n            point_cloud_range=self.point_cloud_range\n        )\n        batch_dict['point_coords'] = torch.cat((x_up1.indices[:, 0:1].float(), point_coords), dim=1)\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/backbones_3d/vfe/__init__.py",
    "content": "from .mean_vfe import MeanVFE\nfrom .pillar_vfe import PillarVFE\nfrom .vfe_template import VFETemplate\n\n__all__ = {\n    'VFETemplate': VFETemplate,\n    'MeanVFE': MeanVFE,\n    'PillarVFE': PillarVFE\n}\n"
  },
  {
    "path": "pcdet/models/backbones_3d/vfe/mean_vfe.py",
    "content": "import torch\n\nfrom .vfe_template import VFETemplate\n\n\nclass MeanVFE(VFETemplate):\n    def __init__(self, model_cfg, num_point_features, **kwargs):\n        super().__init__(model_cfg=model_cfg)\n        self.num_point_features = num_point_features\n        self.model = self.model_cfg.get('MODEL',None)\n\n    def get_output_feature_dim(self):\n        return self.num_point_features\n\n    def forward(self, batch_dict, **kwargs):\n        \"\"\"\n        Args:\n            batch_dict:\n                voxels: (num_voxels, max_points_per_voxel, C)\n                voxel_num_points: optional (num_voxels)\n            **kwargs:\n\n        Returns:\n            vfe_features: (num_voxels, C)\n        \"\"\"\n\n        if 'transform_param' in batch_dict:\n            trans_param = batch_dict['transform_param']\n            rot_num = trans_param.shape[1]\n        else:\n            rot_num = 1\n\n        for i in range(rot_num):\n            if i==0:\n                frame_id = ''\n            else:\n                frame_id = str(i)\n\n            voxel_features, voxel_num_points = batch_dict['voxels'+frame_id], batch_dict['voxel_num_points'+frame_id]\n            points_mean = voxel_features[:, :, :].sum(dim=1, keepdim=False)\n            normalizer = torch.clamp_min(voxel_num_points.view(-1, 1), min=1.0).type_as(voxel_features)\n            points_mean = points_mean / normalizer\n\n            if self.model is not None:\n                if self.model == 'max':\n                    time_max = voxel_features[:, :, :].max(dim=1, keepdim=False)[0]\n                    points_mean[:, -1] = time_max[:, -1]\n\n            batch_dict['voxel_features'+frame_id] = points_mean.contiguous()\n\n            if 'mm' in batch_dict:\n                voxel_features, voxel_num_points = batch_dict['voxels_mm'+frame_id], batch_dict[\n                    'voxel_num_points_mm'+frame_id]\n                points_mean = voxel_features[:, :, :].sum(dim=1, keepdim=False)\n     
           normalizer = torch.clamp_min(voxel_num_points.view(-1, 1), min=1.0).type_as(voxel_features)\n                points_mean = points_mean / normalizer\n\n                batch_dict['voxel_features_mm'+frame_id] = points_mean.contiguous()\n\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/backbones_3d/vfe/pillar_vfe.py",
    "content": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom .vfe_template import VFETemplate\n\n\nclass PFNLayer(nn.Module):\n    def __init__(self,\n                 in_channels,\n                 out_channels,\n                 use_norm=True,\n                 last_layer=False):\n        super().__init__()\n        \n        self.last_vfe = last_layer\n        self.use_norm = use_norm\n        if not self.last_vfe:\n            out_channels = out_channels // 2\n\n        if self.use_norm:\n            self.linear = nn.Linear(in_channels, out_channels, bias=False)\n            self.norm = nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01)\n        else:\n            self.linear = nn.Linear(in_channels, out_channels, bias=True)\n\n        self.part = 50000\n\n    def forward(self, inputs):\n        if inputs.shape[0] > self.part:\n            # nn.Linear performs randomly when batch size is too large\n            num_parts = inputs.shape[0] // self.part\n            part_linear_out = [self.linear(inputs[num_part*self.part:(num_part+1)*self.part])\n                               for num_part in range(num_parts+1)]\n            x = torch.cat(part_linear_out, dim=0)\n        else:\n            x = self.linear(inputs)\n        torch.backends.cudnn.enabled = False\n        x = self.norm(x.permute(0, 2, 1)).permute(0, 2, 1) if self.use_norm else x\n        torch.backends.cudnn.enabled = True\n        x = F.relu(x)\n        x_max = torch.max(x, dim=1, keepdim=True)[0]\n\n        if self.last_vfe:\n            return x_max\n        else:\n            x_repeat = x_max.repeat(1, inputs.shape[1], 1)\n            x_concatenated = torch.cat([x, x_repeat], dim=2)\n            return x_concatenated\n\n\nclass PillarVFE(VFETemplate):\n    def __init__(self, model_cfg, num_point_features, voxel_size, point_cloud_range):\n        super().__init__(model_cfg=model_cfg)\n\n        self.use_norm = self.model_cfg.USE_NORM\n        self.with_distance = 
self.model_cfg.WITH_DISTANCE\n        self.use_absolute_xyz = self.model_cfg.USE_ABSLOTE_XYZ\n        num_point_features += 6 if self.use_absolute_xyz else 3\n        if self.with_distance:\n            num_point_features += 1\n\n        self.num_filters = self.model_cfg.NUM_FILTERS\n        assert len(self.num_filters) > 0\n        num_filters = [num_point_features] + list(self.num_filters)\n\n        pfn_layers = []\n        for i in range(len(num_filters) - 1):\n            in_filters = num_filters[i]\n            out_filters = num_filters[i + 1]\n            pfn_layers.append(\n                PFNLayer(in_filters, out_filters, self.use_norm, last_layer=(i >= len(num_filters) - 2))\n            )\n        self.pfn_layers = nn.ModuleList(pfn_layers)\n\n        self.voxel_x = voxel_size[0]\n        self.voxel_y = voxel_size[1]\n        self.voxel_z = voxel_size[2]\n        self.x_offset = self.voxel_x / 2 + point_cloud_range[0]\n        self.y_offset = self.voxel_y / 2 + point_cloud_range[1]\n        self.z_offset = self.voxel_z / 2 + point_cloud_range[2]\n\n    def get_output_feature_dim(self):\n        return self.num_filters[-1]\n\n    def get_paddings_indicator(self, actual_num, max_num, axis=0):\n        actual_num = torch.unsqueeze(actual_num, axis + 1)\n        max_num_shape = [1] * len(actual_num.shape)\n        max_num_shape[axis + 1] = -1\n        max_num = torch.arange(max_num, dtype=torch.int, device=actual_num.device).view(max_num_shape)\n        paddings_indicator = actual_num.int() > max_num\n        return paddings_indicator\n\n    def forward(self, batch_dict, **kwargs):\n  \n        voxel_features, voxel_num_points, coords = batch_dict['voxels'], batch_dict['voxel_num_points'], batch_dict['voxel_coords']\n        points_mean = voxel_features[:, :, :3].sum(dim=1, keepdim=True) / voxel_num_points.type_as(voxel_features).view(-1, 1, 1)\n        f_cluster = voxel_features[:, :, :3] - points_mean\n\n        f_center = 
torch.zeros_like(voxel_features[:, :, :3])\n        f_center[:, :, 0] = voxel_features[:, :, 0] - (coords[:, 3].to(voxel_features.dtype).unsqueeze(1) * self.voxel_x + self.x_offset)\n        f_center[:, :, 1] = voxel_features[:, :, 1] - (coords[:, 2].to(voxel_features.dtype).unsqueeze(1) * self.voxel_y + self.y_offset)\n        f_center[:, :, 2] = voxel_features[:, :, 2] - (coords[:, 1].to(voxel_features.dtype).unsqueeze(1) * self.voxel_z + self.z_offset)\n\n        if self.use_absolute_xyz:\n            features = [voxel_features, f_cluster, f_center]\n        else:\n            features = [voxel_features[..., 3:], f_cluster, f_center]\n\n        if self.with_distance:\n            points_dist = torch.norm(voxel_features[:, :, :3], 2, 2, keepdim=True)\n            features.append(points_dist)\n        features = torch.cat(features, dim=-1)\n\n        voxel_count = features.shape[1]\n        mask = self.get_paddings_indicator(voxel_num_points, voxel_count, axis=0)\n        mask = torch.unsqueeze(mask, -1).type_as(voxel_features)\n        features *= mask\n        for pfn in self.pfn_layers:\n            features = pfn(features)\n        features = features.squeeze()\n        batch_dict['pillar_features'] = features\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/backbones_3d/vfe/vfe_template.py",
    "content": "import torch.nn as nn\n\n\nclass VFETemplate(nn.Module):\n    def __init__(self, model_cfg, **kwargs):\n        super().__init__()\n        self.model_cfg = model_cfg\n\n    def get_output_feature_dim(self):\n        raise NotImplementedError\n\n    def forward(self, **kwargs):\n        \"\"\"\n        Args:\n            **kwargs:\n\n        Returns:\n            batch_dict:\n                ...\n                vfe_features: (num_voxels, C)\n        \"\"\"\n        raise NotImplementedError\n"
  },
  {
    "path": "pcdet/models/dense_heads/__init__.py",
    "content": "from .anchor_head_multi import AnchorHeadMulti\nfrom .anchor_head_single import AnchorHeadSingle\nfrom .center_head import CenterHead\nfrom .anchor_head_template import AnchorHeadTemplate\nfrom .point_head_box import PointHeadBox\nfrom .point_head_simple import PointHeadSimple\nfrom .point_intra_part_head import PointIntraPartOffsetHead\n\n__all__ = {\n    'AnchorHeadTemplate': AnchorHeadTemplate,\n    'AnchorHeadSingle': AnchorHeadSingle,\n    'CenterHead': CenterHead,\n    'PointIntraPartOffsetHead': PointIntraPartOffsetHead,\n    'PointHeadSimple': PointHeadSimple,\n    'PointHeadBox': PointHeadBox,\n    'AnchorHeadMulti': AnchorHeadMulti,\n}\n"
  },
  {
    "path": "pcdet/models/dense_heads/anchor_head_multi.py",
    "content": "import numpy as np\nimport torch\nimport torch.nn as nn\n\nfrom ..backbones_2d import BaseBEVBackbone\nfrom .anchor_head_template import AnchorHeadTemplate\n\n\nclass SingleHead(BaseBEVBackbone):\n    def __init__(self, model_cfg, input_channels, num_class, num_anchors_per_location, code_size, rpn_head_cfg=None,\n                 head_label_indices=None, separate_reg_config=None):\n        super().__init__(rpn_head_cfg, input_channels)\n\n        self.num_anchors_per_location = num_anchors_per_location\n        self.num_class = num_class\n        self.code_size = code_size\n        self.model_cfg = model_cfg\n        self.separate_reg_config = separate_reg_config\n        self.register_buffer('head_label_indices', head_label_indices)\n\n        if self.separate_reg_config is not None:\n            code_size_cnt = 0\n            self.conv_box = nn.ModuleDict()\n            self.conv_box_names = []\n            num_middle_conv = self.separate_reg_config.NUM_MIDDLE_CONV\n            num_middle_filter = self.separate_reg_config.NUM_MIDDLE_FILTER\n            conv_cls_list = []\n            c_in = input_channels\n            for k in range(num_middle_conv):\n                conv_cls_list.extend([\n                    nn.Conv2d(\n                        c_in, num_middle_filter,\n                        kernel_size=3, stride=1, padding=1, bias=False\n                    ),\n                    nn.BatchNorm2d(num_middle_filter),\n                    nn.ReLU()\n                ])\n                c_in = num_middle_filter\n            conv_cls_list.append(nn.Conv2d(\n                c_in, self.num_anchors_per_location * self.num_class,\n                kernel_size=3, stride=1, padding=1\n            ))\n            self.conv_cls = nn.Sequential(*conv_cls_list)\n\n            for reg_config in self.separate_reg_config.REG_LIST:\n                reg_name, reg_channel = reg_config.split(':')\n                reg_channel = int(reg_channel)\n                
cur_conv_list = []\n                c_in = input_channels\n                for k in range(num_middle_conv):\n                    cur_conv_list.extend([\n                        nn.Conv2d(\n                            c_in, num_middle_filter,\n                            kernel_size=3, stride=1, padding=1, bias=False\n                        ),\n                        nn.BatchNorm2d(num_middle_filter),\n                        nn.ReLU()\n                    ])\n                    c_in = num_middle_filter\n\n                cur_conv_list.append(nn.Conv2d(\n                    c_in, self.num_anchors_per_location * int(reg_channel),\n                    kernel_size=3, stride=1, padding=1, bias=True\n                ))\n                code_size_cnt += reg_channel\n                self.conv_box[f'conv_{reg_name}'] = nn.Sequential(*cur_conv_list)\n                self.conv_box_names.append(f'conv_{reg_name}')\n\n            for m in self.conv_box.modules():\n                if isinstance(m, nn.Conv2d):\n                    nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')\n                    if m.bias is not None:\n                        nn.init.constant_(m.bias, 0)\n\n            assert code_size_cnt == code_size, f'Code size does not match: {code_size_cnt}:{code_size}'\n        else:\n            self.conv_cls = nn.Conv2d(\n                input_channels, self.num_anchors_per_location * self.num_class,\n                kernel_size=1\n            )\n            self.conv_box = nn.Conv2d(\n                input_channels, self.num_anchors_per_location * self.code_size,\n                kernel_size=1\n            )\n\n        if self.model_cfg.get('USE_DIRECTION_CLASSIFIER', None) is not None:\n            self.conv_dir_cls = nn.Conv2d(\n                input_channels,\n                self.num_anchors_per_location * self.model_cfg.NUM_DIR_BINS,\n                kernel_size=1\n            )\n        else:\n            self.conv_dir_cls = None\n     
   self.use_multihead = self.model_cfg.get('USE_MULTIHEAD', False)\n        self.init_weights()\n\n    def init_weights(self):\n        pi = 0.01\n        if isinstance(self.conv_cls, nn.Conv2d):\n            nn.init.constant_(self.conv_cls.bias, -np.log((1 - pi) / pi))\n        else:\n            nn.init.constant_(self.conv_cls[-1].bias, -np.log((1 - pi) / pi))\n\n    def forward(self, spatial_features_2d):\n        ret_dict = {}\n        spatial_features_2d = super().forward({'spatial_features': spatial_features_2d})['spatial_features_2d']\n\n        cls_preds = self.conv_cls(spatial_features_2d)\n\n        if self.separate_reg_config is None:\n            box_preds = self.conv_box(spatial_features_2d)\n        else:\n            box_preds_list = []\n            for reg_name in self.conv_box_names:\n                box_preds_list.append(self.conv_box[reg_name](spatial_features_2d))\n            box_preds = torch.cat(box_preds_list, dim=1)\n\n        if not self.use_multihead:\n            box_preds = box_preds.permute(0, 2, 3, 1).contiguous()\n            cls_preds = cls_preds.permute(0, 2, 3, 1).contiguous()\n        else:\n            H, W = box_preds.shape[2:]\n            batch_size = box_preds.shape[0]\n            box_preds = box_preds.view(-1, self.num_anchors_per_location,\n                                       self.code_size, H, W).permute(0, 1, 3, 4, 2).contiguous()\n            cls_preds = cls_preds.view(-1, self.num_anchors_per_location,\n                                       self.num_class, H, W).permute(0, 1, 3, 4, 2).contiguous()\n            box_preds = box_preds.view(batch_size, -1, self.code_size)\n            cls_preds = cls_preds.view(batch_size, -1, self.num_class)\n\n        if self.conv_dir_cls is not None:\n            dir_cls_preds = self.conv_dir_cls(spatial_features_2d)\n            if self.use_multihead:\n                dir_cls_preds = dir_cls_preds.view(\n                    -1, self.num_anchors_per_location, 
self.model_cfg.NUM_DIR_BINS, H, W).permute(0, 1, 3, 4,\n                                                                                                  2).contiguous()\n                dir_cls_preds = dir_cls_preds.view(batch_size, -1, self.model_cfg.NUM_DIR_BINS)\n            else:\n                dir_cls_preds = dir_cls_preds.permute(0, 2, 3, 1).contiguous()\n\n        else:\n            dir_cls_preds = None\n\n        ret_dict['cls_preds'] = cls_preds\n        ret_dict['box_preds'] = box_preds\n        ret_dict['dir_cls_preds'] = dir_cls_preds\n\n        return ret_dict\n\n\nclass AnchorHeadMulti(AnchorHeadTemplate):\n    def __init__(self, model_cfg, input_channels, num_class, class_names, grid_size, point_cloud_range,\n                 predict_boxes_when_training=True):\n        super().__init__(\n            model_cfg=model_cfg, num_class=num_class, class_names=class_names, grid_size=grid_size,\n            point_cloud_range=point_cloud_range, predict_boxes_when_training=predict_boxes_when_training\n        )\n        self.model_cfg = model_cfg\n        self.separate_multihead = self.model_cfg.get('SEPARATE_MULTIHEAD', False)\n\n        if self.model_cfg.get('SHARED_CONV_NUM_FILTER', None) is not None:\n            shared_conv_num_filter = self.model_cfg.SHARED_CONV_NUM_FILTER\n            self.shared_conv = nn.Sequential(\n                nn.Conv2d(input_channels, shared_conv_num_filter, 3, stride=1, padding=1, bias=False),\n                nn.BatchNorm2d(shared_conv_num_filter, eps=1e-3, momentum=0.01),\n                nn.ReLU(),\n            )\n        else:\n            self.shared_conv = None\n            shared_conv_num_filter = input_channels\n        self.rpn_heads = None\n        self.make_multihead(shared_conv_num_filter)\n\n    def make_multihead(self, input_channels):\n        rpn_head_cfgs = self.model_cfg.RPN_HEAD_CFGS\n        rpn_heads = []\n        class_names = []\n        for rpn_head_cfg in rpn_head_cfgs:\n            
class_names.extend(rpn_head_cfg['HEAD_CLS_NAME'])\n\n        for rpn_head_cfg in rpn_head_cfgs:\n            num_anchors_per_location = sum([self.num_anchors_per_location[class_names.index(head_cls)]\n                                            for head_cls in rpn_head_cfg['HEAD_CLS_NAME']])\n            head_label_indices = torch.from_numpy(np.array([\n                self.class_names.index(cur_name) + 1 for cur_name in rpn_head_cfg['HEAD_CLS_NAME']\n            ]))\n\n            rpn_head = SingleHead(\n                self.model_cfg, input_channels,\n                len(rpn_head_cfg['HEAD_CLS_NAME']) if self.separate_multihead else self.num_class,\n                num_anchors_per_location, self.box_coder.code_size, rpn_head_cfg,\n                head_label_indices=head_label_indices,\n                separate_reg_config=self.model_cfg.get('SEPARATE_REG_CONFIG', None)\n            )\n            rpn_heads.append(rpn_head)\n        self.rpn_heads = nn.ModuleList(rpn_heads)\n\n    def forward(self, data_dict):\n        spatial_features_2d = data_dict['spatial_features_2d']\n        if self.shared_conv is not None:\n            spatial_features_2d = self.shared_conv(spatial_features_2d)\n\n        ret_dicts = []\n        for rpn_head in self.rpn_heads:\n            ret_dicts.append(rpn_head(spatial_features_2d))\n\n        cls_preds = [ret_dict['cls_preds'] for ret_dict in ret_dicts]\n        box_preds = [ret_dict['box_preds'] for ret_dict in ret_dicts]\n        ret = {\n            'cls_preds': cls_preds if self.separate_multihead else torch.cat(cls_preds, dim=1),\n            'box_preds': box_preds if self.separate_multihead else torch.cat(box_preds, dim=1),\n        }\n\n        if self.model_cfg.get('USE_DIRECTION_CLASSIFIER', False):\n            dir_cls_preds = [ret_dict['dir_cls_preds'] for ret_dict in ret_dicts]\n            ret['dir_cls_preds'] = dir_cls_preds if self.separate_multihead else torch.cat(dir_cls_preds, dim=1)\n\n        
self.forward_ret_dict.update(ret)\n\n        if self.training:\n            targets_dict = self.assign_targets(\n                gt_boxes=data_dict['gt_boxes']\n            )\n            self.forward_ret_dict.update(targets_dict)\n\n        if not self.training or self.predict_boxes_when_training:\n            batch_cls_preds, batch_box_preds = self.generate_predicted_boxes(\n                batch_size=data_dict['batch_size'],\n                cls_preds=ret['cls_preds'], box_preds=ret['box_preds'], dir_cls_preds=ret.get('dir_cls_preds', None)\n            )\n\n            if isinstance(batch_cls_preds, list):\n                multihead_label_mapping = []\n                for idx in range(len(batch_cls_preds)):\n                    multihead_label_mapping.append(self.rpn_heads[idx].head_label_indices)\n\n                data_dict['multihead_label_mapping'] = multihead_label_mapping\n\n            data_dict['batch_cls_preds'] = batch_cls_preds\n            data_dict['batch_box_preds'] = batch_box_preds\n            data_dict['cls_preds_normalized'] = False\n\n        return data_dict\n\n    def get_cls_layer_loss(self):\n        loss_weights = self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS\n        if 'pos_cls_weight' in loss_weights:\n            pos_cls_weight = loss_weights['pos_cls_weight']\n            neg_cls_weight = loss_weights['neg_cls_weight']\n        else:\n            pos_cls_weight = neg_cls_weight = 1.0\n\n        cls_preds = self.forward_ret_dict['cls_preds']\n        box_cls_labels = self.forward_ret_dict['box_cls_labels']\n        if not isinstance(cls_preds, list):\n            cls_preds = [cls_preds]\n        batch_size = int(cls_preds[0].shape[0])\n        cared = box_cls_labels >= 0  # [N, num_anchors]\n        positives = box_cls_labels > 0\n        negatives = box_cls_labels == 0\n        negative_cls_weights = negatives * 1.0 * neg_cls_weight\n\n        cls_weights = (negative_cls_weights + pos_cls_weight * positives).float()\n\n        
reg_weights = positives.float()\n        if self.num_class == 1:\n            # class agnostic\n            box_cls_labels[positives] = 1\n        pos_normalizer = positives.sum(1, keepdim=True).float()\n\n        reg_weights /= torch.clamp(pos_normalizer, min=1.0)\n        cls_weights /= torch.clamp(pos_normalizer, min=1.0)\n        cls_targets = box_cls_labels * cared.type_as(box_cls_labels)\n        one_hot_targets = torch.zeros(\n            *list(cls_targets.shape), self.num_class + 1, dtype=cls_preds[0].dtype, device=cls_targets.device\n        )\n        one_hot_targets.scatter_(-1, cls_targets.unsqueeze(dim=-1).long(), 1.0)\n        one_hot_targets = one_hot_targets[..., 1:]\n        start_idx = c_idx = 0\n        cls_losses = 0\n\n        for idx, cls_pred in enumerate(cls_preds):\n            cur_num_class = self.rpn_heads[idx].num_class\n            cls_pred = cls_pred.view(batch_size, -1, cur_num_class)\n            if self.separate_multihead:\n                one_hot_target = one_hot_targets[:, start_idx:start_idx + cls_pred.shape[1],\n                                 c_idx:c_idx + cur_num_class]\n                c_idx += cur_num_class\n            else:\n                one_hot_target = one_hot_targets[:, start_idx:start_idx + cls_pred.shape[1]]\n            cls_weight = cls_weights[:, start_idx:start_idx + cls_pred.shape[1]]\n            cls_loss_src = self.cls_loss_func(cls_pred, one_hot_target, weights=cls_weight)  # [N, M]\n            cls_loss = cls_loss_src.sum() / batch_size\n            cls_loss = cls_loss * loss_weights['cls_weight']\n            cls_losses += cls_loss\n            start_idx += cls_pred.shape[1]\n        assert start_idx == one_hot_targets.shape[1]\n        tb_dict = {\n            'rpn_loss_cls': cls_losses.item()\n        }\n        return cls_losses, tb_dict\n\n    def get_box_reg_layer_loss(self):\n        box_preds = self.forward_ret_dict['box_preds']\n        box_dir_cls_preds = 
self.forward_ret_dict.get('dir_cls_preds', None)\n        box_reg_targets = self.forward_ret_dict['box_reg_targets']\n        box_cls_labels = self.forward_ret_dict['box_cls_labels']\n\n        positives = box_cls_labels > 0\n        reg_weights = positives.float()\n        pos_normalizer = positives.sum(1, keepdim=True).float()\n        reg_weights /= torch.clamp(pos_normalizer, min=1.0)\n\n        if not isinstance(box_preds, list):\n            box_preds = [box_preds]\n        batch_size = int(box_preds[0].shape[0])\n\n        if isinstance(self.anchors, list):\n            if self.use_multihead:\n                anchors = torch.cat(\n                    [anchor.permute(3, 4, 0, 1, 2, 5).contiguous().view(-1, anchor.shape[-1])\n                     for anchor in self.anchors], dim=0\n                )\n            else:\n                anchors = torch.cat(self.anchors, dim=-3)\n        else:\n            anchors = self.anchors\n        anchors = anchors.view(1, -1, anchors.shape[-1]).repeat(batch_size, 1, 1)\n\n        start_idx = 0\n        box_losses = 0\n        tb_dict = {}\n        for idx, box_pred in enumerate(box_preds):\n            box_pred = box_pred.view(\n                batch_size, -1,\n                box_pred.shape[-1] // self.num_anchors_per_location if not self.use_multihead else box_pred.shape[-1]\n            )\n            box_reg_target = box_reg_targets[:, start_idx:start_idx + box_pred.shape[1]]\n            reg_weight = reg_weights[:, start_idx:start_idx + box_pred.shape[1]]\n            # sin(a - b) = sinacosb-cosasinb\n            if box_dir_cls_preds is not None:\n                box_pred_sin, reg_target_sin = self.add_sin_difference(box_pred, box_reg_target)\n                loc_loss_src = self.reg_loss_func(box_pred_sin, reg_target_sin, weights=reg_weight)  # [N, M]\n            else:\n                loc_loss_src = self.reg_loss_func(box_pred, box_reg_target, weights=reg_weight)  # [N, M]\n            loc_loss = loc_loss_src.sum() 
/ batch_size\n\n            loc_loss = loc_loss * self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS['loc_weight']\n            box_losses += loc_loss\n            tb_dict['rpn_loss_loc'] = tb_dict.get('rpn_loss_loc', 0) + loc_loss.item()\n\n            if box_dir_cls_preds is not None:\n                if not isinstance(box_dir_cls_preds, list):\n                    box_dir_cls_preds = [box_dir_cls_preds]\n                dir_targets = self.get_direction_target(\n                    anchors, box_reg_targets,\n                    dir_offset=self.model_cfg.DIR_OFFSET,\n                    num_bins=self.model_cfg.NUM_DIR_BINS\n                )\n                box_dir_cls_pred = box_dir_cls_preds[idx]\n                dir_logit = box_dir_cls_pred.view(batch_size, -1, self.model_cfg.NUM_DIR_BINS)\n                weights = positives.type_as(dir_logit)\n                weights /= torch.clamp(weights.sum(-1, keepdim=True), min=1.0)\n\n                weight = weights[:, start_idx:start_idx + box_pred.shape[1]]\n                dir_target = dir_targets[:, start_idx:start_idx + box_pred.shape[1]]\n                dir_loss = self.dir_loss_func(dir_logit, dir_target, weights=weight)\n                dir_loss = dir_loss.sum() / batch_size\n                dir_loss = dir_loss * self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS['dir_weight']\n                box_losses += dir_loss\n                tb_dict['rpn_loss_dir'] = tb_dict.get('rpn_loss_dir', 0) + dir_loss.item()\n            start_idx += box_pred.shape[1]\n        return box_losses, tb_dict\n"
  },
  {
    "path": "pcdet/models/dense_heads/anchor_head_single.py",
    "content": "import numpy as np\nimport torch.nn as nn\n\nfrom .anchor_head_template import AnchorHeadTemplate\nimport torch\nimport cv2\nimport numpy as np\n\ndef get_layer(dim,out_dim,init = None):\n    init_func = nn.init.kaiming_normal_\n    layers = []\n    conv = nn.Conv2d(dim, dim,\n                      kernel_size=3, padding=1, bias=True)\n    nn.init.normal_(conv.weight, mean=0, std=0.001)\n    layers.append(conv)\n    layers.append(nn.BatchNorm2d(dim))\n    layers.append(nn.ReLU())\n    conv2 = nn.Conv2d(dim, out_dim,\n                     kernel_size=1, bias=True)\n\n    if init is None:\n        nn.init.normal_(conv2.weight, mean=0, std=0.001)\n        layers.append(conv2)\n\n    else:\n        conv2.bias.data.fill_(init)\n        layers.append(conv2)\n\n    return nn.Sequential(*layers)\n\nclass AnchorHeadSingle(AnchorHeadTemplate):\n    def __init__(self, model_cfg, input_channels, num_class, class_names, grid_size, point_cloud_range,\n                 predict_boxes_when_training=True, **kwargs):\n        super().__init__(\n            model_cfg=model_cfg, num_class=num_class, class_names=class_names, grid_size=grid_size, point_cloud_range=point_cloud_range,\n            predict_boxes_when_training=predict_boxes_when_training\n        )\n        self.grid_size = grid_size  # [1408 1600   40]\n        self.range = point_cloud_range\n\n        self.voxel_size = (point_cloud_range[3] - point_cloud_range[0]) / grid_size[0]\n\n\n        self.num_anchors_per_location = sum(self.num_anchors_per_location)\n\n        self.conv_cls = nn.Conv2d(\n            input_channels, self.num_anchors_per_location * self.num_class,\n            kernel_size=1\n        )\n        self.conv_box = nn.Conv2d(\n            input_channels, self.num_anchors_per_location * self.box_coder.code_size,\n            kernel_size=1\n        )\n\n\n        if self.model_cfg.get('USE_DIRECTION_CLASSIFIER', None) is not None:\n            self.conv_dir_cls = nn.Conv2d(\n                
input_channels,\n                self.num_anchors_per_location * self.model_cfg.NUM_DIR_BINS,\n                kernel_size=1\n            )\n        else:\n            self.conv_dir_cls = None\n        self.init_weights()\n\n        #for child in self.children():\n        #    for param in child.parameters():\n        #        param.requires_grad = False\n\n    def init_weights(self):\n        pi = 0.01\n        nn.init.constant_(self.conv_cls.bias, -np.log((1 - pi) / pi))\n        nn.init.normal_(self.conv_box.weight, mean=0, std=0.001)\n\n    def get_anchor_mask(self,data_dict,shape):\n        # Build a boolean mask over the (H, W) feature map that keeps only cells near at least one point.\n        # Points are binned into a coarse grid at 1/10 of the feature-map resolution; every occupied\n        # coarse cell is then expanded to a 20 x 20 window of fine cells.\n\n        # Size of one coarse cell: voxel size x 8 (assumed backbone downsampling) x 10 (coarse factor).\n        stride = np.round(self.voxel_size*8.*10.)\n\n        minx=self.range[0]\n        miny=self.range[1]\n\n        points = data_dict[\"points\"]\n\n        mask = torch.zeros(shape[-2],shape[-1])\n\n        mask_large = torch.zeros(shape[-2]//10,shape[-1]//10)\n\n        # Columns 1 and 2 are read as x and y (column 0 is assumed to be the batch index).\n        in_x = (points[:, 1] - minx) / stride\n        in_y = (points[:, 2] - miny) / stride\n\n        in_x = in_x.long().clamp(max=shape[-1]//10-1)\n        in_y = in_y.long().clamp(max=shape[-2]//10-1)\n\n\n        mask_large[in_y,in_x] = 1\n\n        mask_large = mask_large.clone().int().detach().cpu().numpy()\n\n        mask_large_index = np.argwhere( mask_large>0 )\n\n        # Scale the occupied coarse indices back to fine feature-map coordinates.\n        mask_large_index = mask_large_index*10\n\n        index_list=[]\n\n        # Offsets -10..9 in both axes give the 20 x 20 window around each occupied cell.\n        for i in np.arange(-10, 10, 1):\n            for j in np.arange(-10, 10, 1):\n                index_list.append(mask_large_index+[i,j])\n\n        index_list = np.concatenate(index_list,0)\n\n        inds = torch.from_numpy(index_list).cuda().long()\n\n        mask[inds[:,0],inds[:,1]]=1\n\n        return mask.bool()\n\n\n    def forward(self, data_dict):\n\n        anchor_mask = self.get_anchor_mask(data_dict,data_dict['st_features_2d'].shape)\n\n        new_anchors = []\n        for anchors in self.anchors_root:\n            new_anchors.append(anchors[:, anchor_mask, ...])\n\n        self.anchors = new_anchors\n\n        st_features_2d = 
data_dict['st_features_2d']\n\n        cls_preds = self.conv_cls(st_features_2d)\n        box_preds = self.conv_box(st_features_2d)\n\n        cls_preds = cls_preds.permute(0, 2, 3, 1).contiguous()[:,anchor_mask,:]  # [N, H, W, C] -> [N, K, C], K = cells kept by anchor_mask\n        box_preds = box_preds.permute(0, 2, 3, 1).contiguous()[:,anchor_mask,:]  # [N, H, W, C] -> [N, K, C], K = cells kept by anchor_mask\n\n        self.forward_ret_dict['cls_preds'] = cls_preds\n        self.forward_ret_dict['box_preds'] = box_preds\n\n        if self.conv_dir_cls is not None:\n            dir_cls_preds = self.conv_dir_cls(st_features_2d)\n            dir_cls_preds = dir_cls_preds.permute(0, 2, 3, 1).contiguous()[:,anchor_mask,:]\n            self.forward_ret_dict['dir_cls_preds'] = dir_cls_preds\n        else:\n            dir_cls_preds = None\n\n\n\n        if self.training:\n            targets_dict = self.assign_targets(\n                gt_boxes=data_dict['gt_boxes']\n            )\n            self.forward_ret_dict.update(targets_dict)\n            data_dict['gt_ious'] = targets_dict['gt_ious']\n\n        if not self.training or self.predict_boxes_when_training:\n            batch_cls_preds, batch_box_preds = self.generate_predicted_boxes(\n                batch_size=data_dict['batch_size'],\n                cls_preds=cls_preds, box_preds=box_preds, dir_cls_preds=dir_cls_preds\n            )\n            data_dict['batch_cls_preds'] = batch_cls_preds\n            data_dict['batch_box_preds'] = batch_box_preds\n            data_dict['cls_preds_normalized'] = False\n\n        if self.model_cfg.get('NMS_CONFIG', None) is not None:\n            self.proposal_layer(\n                data_dict, nms_config=self.model_cfg.NMS_CONFIG['TRAIN' if self.training else 'TEST']\n            )\n\n        return data_dict\n"
  },
  {
    "path": "pcdet/models/dense_heads/anchor_head_template.py",
    "content": "import numpy as np\nimport torch\nimport torch.nn as nn\n\nfrom ...utils import box_coder_utils, common_utils, loss_utils\nfrom .target_assigner.anchor_generator import AnchorGenerator\nfrom .target_assigner.atss_target_assigner import ATSSTargetAssigner\nfrom .target_assigner.axis_aligned_target_assigner import AxisAlignedTargetAssigner\nfrom ...utils.odiou_loss import odiou_3D\nfrom ..model_utils.model_nms_utils import class_agnostic_nms\nimport copy\n\nclass AnchorHeadTemplate(nn.Module):\n    def __init__(self, model_cfg, num_class, class_names, grid_size, point_cloud_range, predict_boxes_when_training):\n        super().__init__()\n        self.model_cfg = model_cfg\n        self.num_class = num_class\n        self.class_names = class_names\n        self.predict_boxes_when_training = predict_boxes_when_training\n        self.use_multihead = self.model_cfg.get('USE_MULTIHEAD', False)\n\n        anchor_target_cfg = self.model_cfg.TARGET_ASSIGNER_CONFIG\n        self.box_coder = getattr(box_coder_utils, anchor_target_cfg.BOX_CODER)(\n            num_dir_bins=anchor_target_cfg.get('NUM_DIR_BINS', 6),\n            **anchor_target_cfg.get('BOX_CODER_CONFIG', {})\n        )\n\n        anchor_generator_cfg = self.model_cfg.ANCHOR_GENERATOR_CONFIG\n\n        self.grid_size = grid_size\n        self.point_cloud_range = point_cloud_range\n\n        anchors, self.num_anchors_per_location = self.generate_anchors(\n            anchor_generator_cfg, grid_size=grid_size, point_cloud_range=point_cloud_range,\n            anchor_ndim=self.box_coder.code_size\n        )\n        self.anchors_root = [x.cuda() for x in anchors]\n\n        self.target_assigner = self.get_target_assigner(anchor_target_cfg)\n\n        self.forward_ret_dict = {}\n        self.build_losses(self.model_cfg.LOSS_CONFIG)\n\n    @staticmethod\n    def generate_anchors(anchor_generator_cfg, grid_size, point_cloud_range, anchor_ndim=7):\n        anchor_generator = AnchorGenerator(\n            
anchor_range=point_cloud_range,\n            anchor_generator_config=anchor_generator_cfg\n        )\n        feature_map_size = [grid_size[:2] // config['feature_map_stride'] for config in anchor_generator_cfg]\n        anchors_list, num_anchors_per_location_list = anchor_generator.generate_anchors(feature_map_size)\n\n        if anchor_ndim != 7:\n            for idx, anchors in enumerate(anchors_list):\n                pad_zeros = anchors.new_zeros([*anchors.shape[0:-1], anchor_ndim - 7])\n                new_anchors = torch.cat((anchors, pad_zeros), dim=-1)\n                anchors_list[idx] = new_anchors\n\n        return anchors_list, num_anchors_per_location_list\n\n    def get_target_assigner(self, anchor_target_cfg):\n        if anchor_target_cfg.NAME == 'ATSS':\n            target_assigner = ATSSTargetAssigner(\n                topk=anchor_target_cfg.TOPK,\n                box_coder=self.box_coder,\n                use_multihead=self.use_multihead,\n                match_height=anchor_target_cfg.MATCH_HEIGHT\n            )\n        elif anchor_target_cfg.NAME == 'AxisAlignedTargetAssigner':\n            target_assigner = AxisAlignedTargetAssigner(\n                model_cfg=self.model_cfg,\n                class_names=self.class_names,\n                box_coder=self.box_coder,\n                grid_size=self.grid_size,\n                point_cloud_range=self.point_cloud_range,\n                match_height=anchor_target_cfg.MATCH_HEIGHT\n            )\n        else:\n            raise NotImplementedError\n        return target_assigner\n\n    def proposal_layer(self, batch_dict, nms_config):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                batch_cls_preds: (B, num_boxes, num_classes | 1) or (N1+N2+..., num_classes | 1)\n                batch_box_preds: (B, num_boxes, 7+C) or (N1+N2+..., 7+C)\n                cls_preds_normalized: indicate whether batch_cls_preds is normalized\n                
batch_index: optional (N1+N2+...)\n            nms_config:\n\n        Returns:\n            batch_dict:\n                rois: (B, num_rois, 7+C)\n                roi_scores: (B, num_rois)\n                roi_labels: (B, num_rois)\n\n        \"\"\"\n        if batch_dict.get('rois', None) is not None:\n            return batch_dict\n\n        batch_size = batch_dict['batch_size']\n        batch_box_preds = batch_dict['batch_box_preds']\n        batch_cls_preds = batch_dict['batch_cls_preds']\n\n\n        rois = batch_box_preds.new_zeros((batch_size, nms_config.NMS_POST_MAXSIZE, batch_box_preds.shape[-1]))\n        roi_scores = batch_box_preds.new_zeros((batch_size, nms_config.NMS_POST_MAXSIZE))\n        roi_labels = batch_box_preds.new_zeros((batch_size, nms_config.NMS_POST_MAXSIZE), dtype=torch.long)\n\n\n        for index in range(batch_size):\n            if batch_dict.get('batch_index', None) is not None:\n                assert batch_cls_preds.shape.__len__() == 2\n                batch_mask = (batch_dict['batch_index'] == index)\n            else:\n                assert batch_dict['batch_cls_preds'].shape.__len__() == 3\n                batch_mask = index\n            box_preds = batch_box_preds[batch_mask]\n            cls_preds = batch_cls_preds[batch_mask]\n\n            cur_roi_scores, cur_roi_labels = torch.max(cls_preds, dim=1)\n\n            if nms_config.MULTI_CLASSES_NMS:\n                raise NotImplementedError\n            else:\n                selected, selected_scores = class_agnostic_nms(\n                    box_scores=cur_roi_scores, box_preds=box_preds, nms_config=nms_config\n                )\n\n            rois[index, :len(selected), :] = box_preds[selected]\n\n            roi_scores[index, :len(selected)] = cur_roi_scores[selected]\n            roi_labels[index, :len(selected)] = cur_roi_labels[selected]\n\n        batch_dict['rois'] = rois\n        batch_dict['roi_scores'] = roi_scores\n\n        batch_dict['roi_labels'] = roi_labels 
+ 1\n        batch_dict['has_class_labels'] = True if batch_cls_preds.shape[-1] > 1 else False\n        batch_dict.pop('batch_index', None)\n        return batch_dict\n\n    def build_losses(self, losses_cfg):\n        self.add_module(\n            'cls_loss_func',\n            loss_utils.SigmoidFocalClassificationLoss(alpha=0.25, gamma=2.0)\n        )\n        reg_loss_name = 'WeightedSmoothL1Loss' if losses_cfg.get('REG_LOSS_TYPE', None) is None \\\n            else losses_cfg.REG_LOSS_TYPE\n        self.add_module(\n            'reg_loss_func',\n            getattr(loss_utils, reg_loss_name)(code_weights=losses_cfg.LOSS_WEIGHTS['code_weights'])\n        )\n        self.add_module(\n            'dir_loss_func',\n            loss_utils.WeightedCrossEntropyLoss()\n        )\n        self.add_module(\n            'od_loss_func',\n            odiou_3D()\n        )\n\n    def assign_targets(self, gt_boxes):\n        \"\"\"\n        Args:\n            gt_boxes: (B, M, 8)\n        Returns:\n\n        \"\"\"\n\n        targets_dict = self.target_assigner.assign_targets(\n            self.anchors, gt_boxes\n        )\n        return targets_dict\n\n    def get_cls_layer_loss(self):\n        cls_preds = self.forward_ret_dict['cls_preds']\n        box_cls_labels = self.forward_ret_dict['box_cls_labels']\n\n        batch_size = int(cls_preds.shape[0])\n        cared = box_cls_labels >= 0  # [N, num_anchors]\n        positives = box_cls_labels > 0\n        negatives = box_cls_labels == 0\n        negative_cls_weights = negatives * 1.0\n        cls_weights = (negative_cls_weights + 1.0 * positives).float()\n        reg_weights = positives.float()\n        if self.num_class == 1:\n            # class agnostic\n            box_cls_labels[positives] = 1\n\n        pos_normalizer = positives.sum(1, keepdim=True).float()\n        reg_weights /= torch.clamp(pos_normalizer, min=1.0)\n        cls_weights /= torch.clamp(pos_normalizer, min=1.0)\n\n\n        cls_targets = box_cls_labels 
* cared.type_as(box_cls_labels)\n        cls_targets = cls_targets.unsqueeze(dim=-1)\n\n        cls_targets = cls_targets.squeeze(dim=-1)\n        one_hot_targets = torch.zeros(\n            *list(cls_targets.shape), self.num_class + 1, dtype=cls_preds.dtype, device=cls_targets.device\n        )\n        one_hot_targets.scatter_(-1, cls_targets.unsqueeze(dim=-1).long(), 1.0)\n        cls_preds = cls_preds.view(batch_size, -1, self.num_class)\n        one_hot_targets = one_hot_targets[..., 1:]\n        cls_loss_src = self.cls_loss_func(cls_preds, one_hot_targets, weights=cls_weights)  # [N, M]\n        cls_loss = cls_loss_src.sum() / batch_size\n\n        cls_loss = cls_loss * self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS['cls_weight']\n        tb_dict = {\n            'rpn_loss_cls': cls_loss.item()\n        }\n        return cls_loss, tb_dict\n\n    @staticmethod\n    def add_sin_difference(boxes1, boxes2, dim=6):\n        assert dim != -1\n        rad_pred_encoding = torch.sin(boxes1[..., dim:dim + 1]) * torch.cos(boxes2[..., dim:dim + 1])\n        rad_tg_encoding = torch.cos(boxes1[..., dim:dim + 1]) * torch.sin(boxes2[..., dim:dim + 1])\n        boxes1 = torch.cat([boxes1[..., :dim], rad_pred_encoding, boxes1[..., dim + 1:]], dim=-1)\n        boxes2 = torch.cat([boxes2[..., :dim], rad_tg_encoding, boxes2[..., dim + 1:]], dim=-1)\n        return boxes1, boxes2\n\n    @staticmethod\n    def get_direction_target(anchors, reg_targets, one_hot=True, dir_offset=0, num_bins=2):\n        batch_size = reg_targets.shape[0]\n        anchors = anchors.view(batch_size, -1, anchors.shape[-1])\n        rot_gt = reg_targets[..., 6] + anchors[..., 6]\n        offset_rot = common_utils.limit_period(rot_gt - dir_offset, 0, 2 * np.pi)\n        dir_cls_targets = torch.floor(offset_rot / (2 * np.pi / num_bins)).long()\n        dir_cls_targets = torch.clamp(dir_cls_targets, min=0, max=num_bins - 1)\n\n        if one_hot:\n            dir_targets = torch.zeros(*list(dir_cls_targets.shape), 
num_bins, dtype=anchors.dtype,\n                                      device=dir_cls_targets.device)\n            dir_targets.scatter_(-1, dir_cls_targets.unsqueeze(dim=-1).long(), 1.0)\n            dir_cls_targets = dir_targets\n        return dir_cls_targets\n\n    def get_box_reg_layer_loss(self):\n        box_preds = self.forward_ret_dict['box_preds']\n        box_dir_cls_preds = self.forward_ret_dict.get('dir_cls_preds', None)\n        box_reg_targets = self.forward_ret_dict['box_reg_targets']\n        box_cls_labels = self.forward_ret_dict['box_cls_labels']\n        batch_size = int(box_preds.shape[0])\n\n        positives = box_cls_labels > 0\n        reg_weights = positives.float()\n        pos_normalizer = positives.sum(1, keepdim=True).float()\n        reg_weights /= torch.clamp(pos_normalizer, min=1.0)\n\n        if isinstance(self.anchors, list):\n            if self.use_multihead:\n                anchors = torch.cat(\n                    [anchor.permute(3, 4, 0, 1, 2, 5).contiguous().view(-1, anchor.shape[-1]) for anchor in\n                     self.anchors], dim=0)\n            else:\n                anchors = torch.cat(self.anchors, dim=-3)\n        else:\n            anchors = self.anchors\n        anchors = anchors.view(1, -1, anchors.shape[-1]).repeat(batch_size, 1, 1)\n        box_preds = box_preds.view(batch_size, -1,\n                                   box_preds.shape[-1] // self.num_anchors_per_location if not self.use_multihead else\n                                   box_preds.shape[-1])\n        # sin(a - b) = sinacosb-cosasinb\n        box_preds_sin, reg_targets_sin = self.add_sin_difference(box_preds, box_reg_targets)\n        loc_loss_src = self.reg_loss_func(box_preds_sin, reg_targets_sin, weights=reg_weights)  # [N, M]\n        loc_loss = loc_loss_src.sum() / batch_size\n\n        loc_loss = loc_loss * self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS['loc_weight']\n        box_loss = loc_loss\n        tb_dict = {\n            'rpn_loss_loc': 
loc_loss.item()\n        }\n\n        if box_dir_cls_preds is not None:\n            dir_targets = self.get_direction_target(\n                anchors, box_reg_targets,\n                dir_offset=self.model_cfg.DIR_OFFSET,\n                num_bins=self.model_cfg.NUM_DIR_BINS\n            )\n\n            dir_logits = box_dir_cls_preds.view(batch_size, -1, self.model_cfg.NUM_DIR_BINS)\n            weights = positives.type_as(dir_logits)\n            weights /= torch.clamp(weights.sum(-1, keepdim=True), min=1.0)\n            dir_loss = self.dir_loss_func(dir_logits, dir_targets, weights=weights)\n            dir_loss = dir_loss.sum() / batch_size\n            dir_loss = dir_loss * self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS['dir_weight']\n            box_loss = box_loss+dir_loss\n            tb_dict['rpn_loss_dir'] = dir_loss.item()\n\n        return box_loss, tb_dict\n\n    def get_od_loss(self):\n        box_preds = self.forward_ret_dict['box_preds']\n        gt_bbs = self.forward_ret_dict['gt_bbs']\n        anchors = copy.deepcopy(self.anchors)\n        anchors = torch.cat(anchors, dim=-3)\n        num_anchors = anchors.view(-1, anchors.shape[-1]).shape[0]\n        batch_anchors = anchors.view(1, -1, anchors.shape[-1]).repeat(len(box_preds), 1, 1)\n\n        batch_box_preds = box_preds.view(len(box_preds), num_anchors, -1)\n        batch_box_preds = self.box_coder.decode_torch(batch_box_preds, batch_anchors)\n\n\n        box_cls_labels = self.forward_ret_dict['box_cls_labels']\n        positives = box_cls_labels > 0\n        positives = positives.view(-1,)\n\n        gt_bbs = gt_bbs.view(-1, anchors.shape[-1])\n        batch_box_preds = batch_box_preds.view(-1, anchors.shape[-1])\n\n        loss = self.od_loss_func(gt_bbs[positives], batch_box_preds[positives], 1, len(box_preds))\n\n        loss = 2*loss/(positives.sum()+1)\n        return loss\n\n    def get_loss(self):\n        cls_loss, tb_dict = self.get_cls_layer_loss()\n        box_loss, tb_dict_box = 
self.get_box_reg_layer_loss()\n\n        tb_dict.update(tb_dict_box)\n\n        rpn_loss = cls_loss + box_loss\n\n        if self.model_cfg.get('OD_LOSS',False):\n            od_loss = self.get_od_loss()\n            rpn_loss += od_loss\n\n        tb_dict['rpn_loss'] = rpn_loss.item()\n        return rpn_loss, tb_dict\n\n    def generate_predicted_boxes(self, batch_size, cls_preds, box_preds, dir_cls_preds=None,):\n        \"\"\"\n        Args:\n            batch_size:\n            cls_preds: (N, H, W, C1)\n            box_preds: (N, H, W, C2)\n            dir_cls_preds: (N, H, W, C3)\n\n        Returns:\n            batch_cls_preds: (B, num_boxes, num_classes)\n            batch_box_preds: (B, num_boxes, 7+C)\n\n        \"\"\"\n        if isinstance(self.anchors, list):\n            if self.use_multihead:\n                anchors = torch.cat([anchor.permute(3, 4, 0, 1, 2, 5).contiguous().view(-1, anchor.shape[-1])\n                                     for anchor in self.anchors], dim=0)\n            else:\n                anchors = torch.cat(self.anchors, dim=-3)\n        else:\n            anchors = self.anchors\n        num_anchors = anchors.view(-1, anchors.shape[-1]).shape[0]\n        batch_anchors = anchors.view(1, -1, anchors.shape[-1]).repeat(batch_size, 1, 1)\n        batch_cls_preds = cls_preds.view(batch_size, num_anchors, -1).float() \\\n            if not isinstance(cls_preds, list) else cls_preds\n        batch_box_preds = box_preds.view(batch_size, num_anchors, -1) if not isinstance(box_preds, list) \\\n            else torch.cat(box_preds, dim=1).view(batch_size, num_anchors, -1)\n        batch_box_preds = self.box_coder.decode_torch(batch_box_preds, batch_anchors)\n\n        if dir_cls_preds is not None:\n            dir_offset = self.model_cfg.DIR_OFFSET\n            dir_limit_offset = self.model_cfg.DIR_LIMIT_OFFSET\n            dir_cls_preds = dir_cls_preds.view(batch_size, num_anchors, -1) if not isinstance(dir_cls_preds, list) \\\n             
   else torch.cat(dir_cls_preds, dim=1).view(batch_size, num_anchors, -1)\n            dir_labels = torch.max(dir_cls_preds, dim=-1)[1]\n\n            period = (2 * np.pi / self.model_cfg.NUM_DIR_BINS)\n            dir_rot = common_utils.limit_period(\n                batch_box_preds[..., 6] - dir_offset, dir_limit_offset, period\n            )\n            batch_box_preds[..., 6] = dir_rot + dir_offset + period * dir_labels.to(batch_box_preds.dtype)\n\n        if isinstance(self.box_coder, box_coder_utils.PreviousResidualDecoder):\n            batch_box_preds[..., 6] = common_utils.limit_period(\n                -(batch_box_preds[..., 6] + np.pi / 2), offset=0.5, period=np.pi * 2\n            )\n\n        return batch_cls_preds, batch_box_preds\n\n    def forward(self, **kwargs):\n        raise NotImplementedError\n"
  },
  {
    "path": "pcdet/models/dense_heads/center_head.py",
    "content": "import copy\nimport numpy as np\nimport torch\nimport torch.nn as nn\nfrom torch.nn.init import kaiming_normal_\nfrom ..model_utils import model_nms_utils\nfrom ..model_utils import centernet_utils\nfrom ...utils import loss_utils\n\n\nclass SeparateHead(nn.Module):\n    def __init__(self, input_channels, sep_head_dict, init_bias=-2.19, use_bias=False):\n        super().__init__()\n        self.sep_head_dict = sep_head_dict\n\n        for cur_name in self.sep_head_dict:\n            output_channels = self.sep_head_dict[cur_name]['out_channels']\n            num_conv = self.sep_head_dict[cur_name]['num_conv']\n\n            fc_list = []\n            for k in range(num_conv - 1):\n                fc_list.append(nn.Sequential(\n                    nn.Conv2d(input_channels, input_channels, kernel_size=3, stride=1, padding=1, bias=use_bias),\n                    nn.BatchNorm2d(input_channels),\n                    nn.ReLU()\n                ))\n            fc_list.append(nn.Conv2d(input_channels, output_channels, kernel_size=3, stride=1, padding=1, bias=True))\n            fc = nn.Sequential(*fc_list)\n            if 'hm' in cur_name:\n                fc[-1].bias.data.fill_(init_bias)\n            else:\n                for m in fc.modules():\n                    if isinstance(m, nn.Conv2d):\n                        kaiming_normal_(m.weight.data)\n                        if hasattr(m, \"bias\") and m.bias is not None:\n                            nn.init.constant_(m.bias, 0)\n\n            self.__setattr__(cur_name, fc)\n\n    def forward(self, x):\n        ret_dict = {}\n        for cur_name in self.sep_head_dict:\n            ret_dict[cur_name] = self.__getattr__(cur_name)(x)\n\n        return ret_dict\n\n\nclass CenterHead(nn.Module):\n    def __init__(self, model_cfg, num_frames, input_channels, num_class, class_names, grid_size, point_cloud_range, voxel_size,\n                 predict_boxes_when_training=True):\n        super().__init__()\n        
self.model_cfg = model_cfg\n        self.num_class = num_class\n        self.grid_size = grid_size\n        self.point_cloud_range = point_cloud_range\n        self.voxel_size = voxel_size\n        self.feature_map_stride = self.model_cfg.TARGET_ASSIGNER_CONFIG.get('FEATURE_MAP_STRIDE', None)\n\n        self.class_names = class_names\n        self.class_names_each_head = []\n        self.class_id_mapping_each_head = []\n\n        for cur_class_names in self.model_cfg.CLASS_NAMES_EACH_HEAD:\n            self.class_names_each_head.append([x for x in cur_class_names if x in class_names])\n            cur_class_id_mapping = torch.from_numpy(np.array(\n                [self.class_names.index(x) for x in cur_class_names if x in class_names]\n            )).cuda()\n            self.class_id_mapping_each_head.append(cur_class_id_mapping)\n\n        total_classes = sum([len(x) for x in self.class_names_each_head])\n        assert total_classes == len(self.class_names), f'class_names_each_head={self.class_names_each_head}'\n\n        self.shared_conv = nn.Sequential(\n            nn.Conv2d(\n                input_channels, self.model_cfg.SHARED_CONV_CHANNEL, 3, stride=1, padding=1,\n                bias=self.model_cfg.get('USE_BIAS_BEFORE_NORM', False)\n            ),\n            nn.BatchNorm2d(self.model_cfg.SHARED_CONV_CHANNEL),\n            nn.ReLU(),\n        )\n\n        self.heads_list = nn.ModuleList()\n        self.separate_head_cfg = self.model_cfg.SEPARATE_HEAD_CFG\n        for idx, cur_class_names in enumerate(self.class_names_each_head):\n            cur_head_dict = copy.deepcopy(self.separate_head_cfg.HEAD_DICT)\n            cur_head_dict['hm'] = dict(out_channels=len(cur_class_names), num_conv=self.model_cfg.NUM_HM_CONV)\n            self.heads_list.append(\n                SeparateHead(\n                    input_channels=self.model_cfg.SHARED_CONV_CHANNEL,\n                    sep_head_dict=cur_head_dict,\n                    init_bias=-2.19,\n               
     use_bias=self.model_cfg.get('USE_BIAS_BEFORE_NORM', False)\n                )\n            )\n        self.predict_boxes_when_training = predict_boxes_when_training\n        self.forward_ret_dict = {}\n        self.build_losses()\n\n    def build_losses(self):\n        self.add_module('hm_loss_func', loss_utils.FocalLossCenterNet())\n        self.add_module('reg_loss_func', loss_utils.RegLossCenterNet())\n\n    def assign_target_of_single_head(\n            self, num_classes, gt_boxes, feature_map_size, feature_map_stride, num_max_objs=500,\n            gaussian_overlap=0.1, min_radius=2\n    ):\n        \"\"\"\n        Args:\n            gt_boxes: (N, 8)\n            feature_map_size: (2), [x, y]\n\n        Returns:\n\n        \"\"\"\n        heatmap = gt_boxes.new_zeros(num_classes, feature_map_size[1], feature_map_size[0])\n        ret_boxes = gt_boxes.new_zeros((num_max_objs, gt_boxes.shape[-1] - 1 + 1))\n        inds = gt_boxes.new_zeros(num_max_objs).long()\n        mask = gt_boxes.new_zeros(num_max_objs).long()\n\n        x, y, z = gt_boxes[:, 0], gt_boxes[:, 1], gt_boxes[:, 2]\n        coord_x = (x - self.point_cloud_range[0]) / self.voxel_size[0] / feature_map_stride\n        coord_y = (y - self.point_cloud_range[1]) / self.voxel_size[1] / feature_map_stride\n        coord_x = torch.clamp(coord_x, min=0, max=feature_map_size[0] - 0.5)  # bugfixed: 1e-6 does not work for center.int()\n        coord_y = torch.clamp(coord_y, min=0, max=feature_map_size[1] - 0.5)  #\n        center = torch.cat((coord_x[:, None], coord_y[:, None]), dim=-1)\n        center_int = center.int()\n        center_int_float = center_int.float()\n\n        dx, dy, dz = gt_boxes[:, 3], gt_boxes[:, 4], gt_boxes[:, 5]\n        dx = dx / self.voxel_size[0] / feature_map_stride\n        dy = dy / self.voxel_size[1] / feature_map_stride\n\n        radius = centernet_utils.gaussian_radius(dx, dy, min_overlap=gaussian_overlap)\n        radius = torch.clamp_min(radius.int(), 
min=min_radius)\n\n        for k in range(min(num_max_objs, gt_boxes.shape[0])):\n            if dx[k] <= 0 or dy[k] <= 0:\n                continue\n\n            if not (0 <= center_int[k][0] <= feature_map_size[0] and 0 <= center_int[k][1] <= feature_map_size[1]):\n                continue\n\n            cur_class_id = (gt_boxes[k, -1] - 1).long()\n            centernet_utils.draw_gaussian_to_heatmap(heatmap[cur_class_id], center[k], radius[k].item())\n\n            inds[k] = center_int[k, 1] * feature_map_size[0] + center_int[k, 0]\n            mask[k] = 1\n\n            ret_boxes[k, 0:2] = center[k] - center_int_float[k].float()\n            ret_boxes[k, 2] = z[k]\n            ret_boxes[k, 3:6] = gt_boxes[k, 3:6].log()\n            ret_boxes[k, 6] = torch.cos(gt_boxes[k, 6])\n            ret_boxes[k, 7] = torch.sin(gt_boxes[k, 6])\n            if gt_boxes.shape[1] > 8:\n                ret_boxes[k, 8:] = gt_boxes[k, 7:-1]\n\n        return heatmap, ret_boxes, inds, mask\n\n    def assign_targets(self, gt_boxes, feature_map_size=None, **kwargs):\n        \"\"\"\n        Args:\n            gt_boxes: (B, M, 8)\n            range_image_polar: (B, 3, H, W)\n            feature_map_size: (2) [H, W]\n            spatial_cartesian: (B, 4, H, W)\n        Returns:\n\n        \"\"\"\n        feature_map_size = feature_map_size[::-1]  # [H, W] ==> [x, y]\n        target_assigner_cfg = self.model_cfg.TARGET_ASSIGNER_CONFIG\n        # feature_map_size = self.grid_size[:2] // target_assigner_cfg.FEATURE_MAP_STRIDE\n\n        batch_size = gt_boxes.shape[0]\n        ret_dict = {\n            'heatmaps': [],\n            'target_boxes': [],\n            'inds': [],\n            'masks': [],\n            'heatmap_masks': []\n        }\n\n        all_names = np.array(['bg', *self.class_names])\n        for idx, cur_class_names in enumerate(self.class_names_each_head):\n            heatmap_list, target_boxes_list, inds_list, masks_list = [], [], [], []\n            for bs_idx in 
range(batch_size):\n                cur_gt_boxes = gt_boxes[bs_idx]\n                gt_class_names = all_names[cur_gt_boxes[:, -1].cpu().long().numpy()]\n\n                gt_boxes_single_head = []\n\n                for idx, name in enumerate(gt_class_names):\n                    if name not in cur_class_names:\n                        continue\n                    temp_box = cur_gt_boxes[idx]\n                    temp_box[-1] = cur_class_names.index(name) + 1\n                    gt_boxes_single_head.append(temp_box[None, :])\n\n                if len(gt_boxes_single_head) == 0:\n                    gt_boxes_single_head = cur_gt_boxes[:0, :]\n                else:\n                    gt_boxes_single_head = torch.cat(gt_boxes_single_head, dim=0)\n\n                heatmap, ret_boxes, inds, mask = self.assign_target_of_single_head(\n                    num_classes=len(cur_class_names), gt_boxes=gt_boxes_single_head.cpu(),\n                    feature_map_size=feature_map_size, feature_map_stride=target_assigner_cfg.FEATURE_MAP_STRIDE,\n                    num_max_objs=target_assigner_cfg.NUM_MAX_OBJS,\n                    gaussian_overlap=target_assigner_cfg.GAUSSIAN_OVERLAP,\n                    min_radius=target_assigner_cfg.MIN_RADIUS,\n                )\n                heatmap_list.append(heatmap.to(gt_boxes_single_head.device))\n                target_boxes_list.append(ret_boxes.to(gt_boxes_single_head.device))\n                inds_list.append(inds.to(gt_boxes_single_head.device))\n                masks_list.append(mask.to(gt_boxes_single_head.device))\n\n            ret_dict['heatmaps'].append(torch.stack(heatmap_list, dim=0))\n            ret_dict['target_boxes'].append(torch.stack(target_boxes_list, dim=0))\n            ret_dict['inds'].append(torch.stack(inds_list, dim=0))\n            ret_dict['masks'].append(torch.stack(masks_list, dim=0))\n        return ret_dict\n\n    def sigmoid(self, x):\n        y = torch.clamp(x.sigmoid(), min=1e-4, max=1 - 
1e-4)\n        return y\n\n    def get_loss(self):\n        pred_dicts = self.forward_ret_dict['pred_dicts']\n        target_dicts = self.forward_ret_dict['target_dicts']\n\n        tb_dict = {}\n        loss = 0\n\n        for idx, pred_dict in enumerate(pred_dicts):\n            pred_dict['hm'] = self.sigmoid(pred_dict['hm'])\n            hm_loss = self.hm_loss_func(pred_dict['hm'], target_dicts['heatmaps'][idx])\n\n            target_boxes = target_dicts['target_boxes'][idx]\n            pred_boxes = torch.cat([pred_dict[head_name] for head_name in self.separate_head_cfg.HEAD_ORDER], dim=1)\n\n            reg_loss = self.reg_loss_func(\n                pred_boxes, target_dicts['masks'][idx], target_dicts['inds'][idx], target_boxes\n            )\n            loc_loss = (reg_loss * reg_loss.new_tensor(self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS['code_weights'])).sum()\n            loc_loss = loc_loss * self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS['loc_weight']\n\n            loss += hm_loss + loc_loss\n            tb_dict['hm_loss_head_%d' % idx] = hm_loss.item()\n            tb_dict['loc_loss_head_%d' % idx] = loc_loss.item()\n\n        tb_dict['rpn_loss'] = loss.item()\n        return loss, tb_dict\n\n    def generate_predicted_boxes(self, batch_size, pred_dicts):\n        post_process_cfg = self.model_cfg.POST_PROCESSING\n        post_center_limit_range = torch.tensor(post_process_cfg.POST_CENTER_LIMIT_RANGE).cuda().float()\n\n        ret_dict = [{\n            'pred_boxes': [],\n            'pred_scores': [],\n            'pred_labels': [],\n        } for k in range(batch_size)]\n        for idx, pred_dict in enumerate(pred_dicts):\n            batch_hm = pred_dict['hm'].sigmoid()\n            batch_center = pred_dict['center']\n            batch_center_z = pred_dict['center_z']\n            batch_dim = pred_dict['dim'].exp()\n            batch_rot_cos = pred_dict['rot'][:, 0].unsqueeze(dim=1)\n            batch_rot_sin = pred_dict['rot'][:, 1].unsqueeze(dim=1)\n      
      batch_vel = pred_dict['vel'] if 'vel' in self.separate_head_cfg.HEAD_ORDER else None\n\n            final_pred_dicts = centernet_utils.decode_bbox_from_heatmap(\n                heatmap=batch_hm, rot_cos=batch_rot_cos, rot_sin=batch_rot_sin,\n                center=batch_center, center_z=batch_center_z, dim=batch_dim, vel=batch_vel,\n                point_cloud_range=self.point_cloud_range, voxel_size=self.voxel_size,\n                feature_map_stride=self.feature_map_stride,\n                K=post_process_cfg.MAX_OBJ_PER_SAMPLE,\n                circle_nms=(post_process_cfg.NMS_CONFIG.NMS_TYPE == 'circle_nms'),\n                score_thresh=post_process_cfg.SCORE_THRESH,\n                post_center_limit_range=post_center_limit_range\n            )\n\n            for k, final_dict in enumerate(final_pred_dicts):\n                final_dict['pred_labels'] = self.class_id_mapping_each_head[idx][final_dict['pred_labels'].long()]\n                if post_process_cfg.NMS_CONFIG.NMS_TYPE != 'circle_nms':\n                    selected, selected_scores = model_nms_utils.class_agnostic_nms(\n                        box_scores=final_dict['pred_scores'], box_preds=final_dict['pred_boxes'],\n                        nms_config=post_process_cfg.NMS_CONFIG,\n                        score_thresh=None\n                    )\n\n                    final_dict['pred_boxes'] = final_dict['pred_boxes'][selected]\n                    final_dict['pred_scores'] = selected_scores\n                    final_dict['pred_labels'] = final_dict['pred_labels'][selected]\n\n                ret_dict[k]['pred_boxes'].append(final_dict['pred_boxes'])\n                ret_dict[k]['pred_scores'].append(final_dict['pred_scores'])\n                ret_dict[k]['pred_labels'].append(final_dict['pred_labels'])\n\n        for k in range(batch_size):\n            ret_dict[k]['pred_boxes'] = torch.cat(ret_dict[k]['pred_boxes'], dim=0)\n            ret_dict[k]['pred_scores'] = 
torch.cat(ret_dict[k]['pred_scores'], dim=0)\n            ret_dict[k]['pred_labels'] = torch.cat(ret_dict[k]['pred_labels'], dim=0) + 1\n\n        return ret_dict\n\n    @staticmethod\n    def reorder_rois_for_refining(batch_size, pred_dicts):\n        num_max_rois = max([len(cur_dict['pred_boxes']) for cur_dict in pred_dicts])\n        num_max_rois = max(1, num_max_rois)  # at least one faked rois to avoid error\n        pred_boxes = pred_dicts[0]['pred_boxes']\n\n        rois = pred_boxes.new_zeros((batch_size, num_max_rois, pred_boxes.shape[-1]))\n        roi_scores = pred_boxes.new_zeros((batch_size, num_max_rois))\n        roi_labels = pred_boxes.new_zeros((batch_size, num_max_rois)).long()\n\n        for bs_idx in range(batch_size):\n            num_boxes = len(pred_dicts[bs_idx]['pred_boxes'])\n\n            rois[bs_idx, :num_boxes, :] = pred_dicts[bs_idx]['pred_boxes']\n            roi_scores[bs_idx, :num_boxes] = pred_dicts[bs_idx]['pred_scores']\n            roi_labels[bs_idx, :num_boxes] = pred_dicts[bs_idx]['pred_labels']\n        return rois, roi_scores, roi_labels\n\n    def forward(self, data_dict):\n        spatial_features_2d = data_dict['st_features_2d']\n        x = self.shared_conv(spatial_features_2d)\n\n        pred_dicts = []\n        for head in self.heads_list:\n            pred_dicts.append(head(x))\n\n        if self.training:\n            target_dict = self.assign_targets(\n                data_dict['gt_boxes'], feature_map_size=spatial_features_2d.size()[2:],\n                feature_map_stride=data_dict.get('spatial_features_2d_strides', None)\n            )\n            self.forward_ret_dict['target_dicts'] = target_dict\n\n        self.forward_ret_dict['pred_dicts'] = pred_dicts\n\n        if not self.training or self.predict_boxes_when_training:\n            pred_dicts = self.generate_predicted_boxes(\n                data_dict['batch_size'], pred_dicts\n            )\n\n            if self.predict_boxes_when_training:\n             
   rois, roi_scores, roi_labels = self.reorder_rois_for_refining(data_dict['batch_size'], pred_dicts)\n                data_dict['rois'] = rois\n                data_dict['roi_scores'] = roi_scores\n                data_dict['roi_labels'] = roi_labels\n                data_dict['has_class_labels'] = True\n            else:\n                data_dict['final_box_dicts'] = pred_dicts\n\n        return data_dict\n"
  },
  {
    "path": "pcdet/models/dense_heads/point_head_box.py",
    "content": "import torch\n\nfrom ...utils import box_coder_utils, box_utils\nfrom .point_head_template import PointHeadTemplate\n\n\nclass PointHeadBox(PointHeadTemplate):\n    \"\"\"\n    A simple point-based segmentation head, which is used for PointRCNN.\n    Reference Paper: https://arxiv.org/abs/1812.04244\n    PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud\n    \"\"\"\n    def __init__(self, num_class, input_channels, model_cfg, predict_boxes_when_training=False, **kwargs):\n        super().__init__(model_cfg=model_cfg, num_class=num_class)\n        self.predict_boxes_when_training = predict_boxes_when_training\n        self.cls_layers = self.make_fc_layers(\n            fc_cfg=self.model_cfg.CLS_FC,\n            input_channels=input_channels,\n            output_channels=num_class\n        )\n\n        target_cfg = self.model_cfg.TARGET_CONFIG\n        self.box_coder = getattr(box_coder_utils, target_cfg.BOX_CODER)(\n            **target_cfg.BOX_CODER_CONFIG\n        )\n        self.box_layers = self.make_fc_layers(\n            fc_cfg=self.model_cfg.REG_FC,\n            input_channels=input_channels,\n            output_channels=self.box_coder.code_size\n        )\n\n    def assign_targets(self, input_dict):\n        \"\"\"\n        Args:\n            input_dict:\n                point_features: (N1 + N2 + N3 + ..., C)\n                batch_size:\n                point_coords: (N1 + N2 + N3 + ..., 4) [bs_idx, x, y, z]\n                gt_boxes (optional): (B, M, 8)\n        Returns:\n            point_cls_labels: (N1 + N2 + N3 + ...), long type, 0:background, -1:ignored\n            point_box_labels: (N1 + N2 + N3 + ..., code_size)\n        \"\"\"\n        point_coords = input_dict['point_coords']\n        gt_boxes = input_dict['gt_boxes']\n        assert gt_boxes.shape.__len__() == 3, 'gt_boxes.shape=%s' % str(gt_boxes.shape)\n        assert point_coords.shape.__len__() in [2], 'points.shape=%s' % str(point_coords.shape)\n\n        
batch_size = gt_boxes.shape[0]\n        extend_gt_boxes = box_utils.enlarge_box3d(\n            gt_boxes.view(-1, gt_boxes.shape[-1]), extra_width=self.model_cfg.TARGET_CONFIG.GT_EXTRA_WIDTH\n        ).view(batch_size, -1, gt_boxes.shape[-1])\n        targets_dict = self.assign_stack_targets(\n            points=point_coords, gt_boxes=gt_boxes, extend_gt_boxes=extend_gt_boxes,\n            set_ignore_flag=True, use_ball_constraint=False,\n            ret_part_labels=False, ret_box_labels=True\n        )\n\n        return targets_dict\n\n    def get_loss(self, tb_dict=None):\n        tb_dict = {} if tb_dict is None else tb_dict\n        point_loss_cls, tb_dict_1 = self.get_cls_layer_loss()\n        point_loss_box, tb_dict_2 = self.get_box_layer_loss()\n\n        point_loss = point_loss_cls + point_loss_box\n        tb_dict.update(tb_dict_1)\n        tb_dict.update(tb_dict_2)\n        return point_loss, tb_dict\n\n    def forward(self, batch_dict):\n\n\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                point_features: (N1 + N2 + N3 + ..., C) or (B, N, C)\n                point_features_before_fusion: (N1 + N2 + N3 + ..., C)\n                point_coords: (N1 + N2 + N3 + ..., 4) [bs_idx, x, y, z]\n                point_labels (optional): (N1 + N2 + N3 + ...)\n                gt_boxes (optional): (B, M, 8)\n        Returns:\n            batch_dict:\n                point_cls_scores: (N1 + N2 + N3 + ..., 1)\n                point_part_offset: (N1 + N2 + N3 + ..., 3)\n        \"\"\"\n\n        if self.training and batch_dict['gt_boxes'].shape[-1]>8:\n            news = batch_dict['gt_boxes'][..., 0:8]\n            news[..., 7] = batch_dict['gt_boxes'][..., -1]\n            batch_dict['gt_boxes'] = news\n\n        if self.model_cfg.get('USE_POINT_FEATURES_BEFORE_FUSION', False):\n            point_features = batch_dict['point_features_before_fusion']\n        else:\n            point_features = 
batch_dict['point_features']\n        point_cls_preds = self.cls_layers(point_features)  # (total_points, num_class)\n        point_box_preds = self.box_layers(point_features)  # (total_points, box_code_size)\n\n        point_cls_preds_max, _ = point_cls_preds.max(dim=-1)\n        batch_dict['point_cls_scores'] = torch.sigmoid(point_cls_preds_max)\n\n        ret_dict = {'point_cls_preds': point_cls_preds,\n                    'point_box_preds': point_box_preds}\n        if self.training:\n            targets_dict = self.assign_targets(batch_dict)\n            ret_dict['point_cls_labels'] = targets_dict['point_cls_labels']\n            ret_dict['point_box_labels'] = targets_dict['point_box_labels']\n\n        if not self.training or self.predict_boxes_when_training:\n            point_cls_preds, point_box_preds = self.generate_predicted_boxes(\n                points=batch_dict['point_coords'][:, 1:4],\n                point_cls_preds=point_cls_preds, point_box_preds=point_box_preds\n            )\n            batch_dict['batch_cls_preds'] = point_cls_preds\n            batch_dict['batch_box_preds'] = point_box_preds\n            batch_dict['batch_index'] = batch_dict['point_coords'][:, 0]\n            batch_dict['cls_preds_normalized'] = False\n\n        self.forward_ret_dict = ret_dict\n\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/dense_heads/point_head_simple.py",
    "content": "import torch\n\nfrom ...utils import box_utils\nfrom .point_head_template import PointHeadTemplate\n\n\nclass PointHeadSimple(PointHeadTemplate):\n    \"\"\"\n    A simple point-based segmentation head, which is used for PV-RCNN keypoint segmentation.\n    Reference Paper: https://arxiv.org/abs/1912.13192\n    PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection\n    \"\"\"\n    def __init__(self, num_class, input_channels, model_cfg, **kwargs):\n        super().__init__(model_cfg=model_cfg, num_class=num_class)\n        self.cls_layers = self.make_fc_layers(\n            fc_cfg=self.model_cfg.CLS_FC,\n            input_channels=input_channels,\n            output_channels=num_class\n        )\n\n    def assign_targets(self, input_dict):\n        \"\"\"\n        Args:\n            input_dict:\n                point_features: (N1 + N2 + N3 + ..., C)\n                batch_size:\n                point_coords: (N1 + N2 + N3 + ..., 4) [bs_idx, x, y, z]\n                gt_boxes (optional): (B, M, 8)\n        Returns:\n            point_cls_labels: (N1 + N2 + N3 + ...), long type, 0:background, -1:ignored\n            point_part_labels: (N1 + N2 + N3 + ..., 3)\n        \"\"\"\n        point_coords = input_dict['point_coords']\n        gt_boxes = input_dict['gt_boxes']\n        assert gt_boxes.shape.__len__() == 3, 'gt_boxes.shape=%s' % str(gt_boxes.shape)\n        assert point_coords.shape.__len__() in [2], 'points.shape=%s' % str(point_coords.shape)\n\n        batch_size = gt_boxes.shape[0]\n        extend_gt_boxes = box_utils.enlarge_box3d(\n            gt_boxes.view(-1, gt_boxes.shape[-1]), extra_width=self.model_cfg.TARGET_CONFIG.GT_EXTRA_WIDTH\n        ).view(batch_size, -1, gt_boxes.shape[-1])\n        targets_dict = self.assign_stack_targets(\n            points=point_coords, gt_boxes=gt_boxes, extend_gt_boxes=extend_gt_boxes,\n            set_ignore_flag=True, use_ball_constraint=False,\n            ret_part_labels=False\n        
)\n\n        return targets_dict\n\n    def get_loss(self, tb_dict=None):\n        tb_dict = {} if tb_dict is None else tb_dict\n        point_loss_cls, tb_dict_1 = self.get_cls_layer_loss()\n\n        point_loss = point_loss_cls\n        tb_dict.update(tb_dict_1)\n        return point_loss, tb_dict\n\n    def forward(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                point_features: (N1 + N2 + N3 + ..., C) or (B, N, C)\n                point_features_before_fusion: (N1 + N2 + N3 + ..., C)\n                point_coords: (N1 + N2 + N3 + ..., 4) [bs_idx, x, y, z]\n                point_labels (optional): (N1 + N2 + N3 + ...)\n                gt_boxes (optional): (B, M, 8)\n        Returns:\n            batch_dict:\n                point_cls_scores: (N1 + N2 + N3 + ..., 1)\n                point_part_offset: (N1 + N2 + N3 + ..., 3)\n        \"\"\"\n        if self.model_cfg.get('USE_POINT_FEATURES_BEFORE_FUSION', False):\n            point_features = batch_dict['point_features_before_fusion']\n        else:\n            point_features = batch_dict['point_features']\n        point_cls_preds = self.cls_layers(point_features)  # (total_points, num_class)\n\n        ret_dict = {\n            'point_cls_preds': point_cls_preds,\n        }\n\n        point_cls_scores = torch.sigmoid(point_cls_preds)\n        batch_dict['point_cls_scores'], _ = point_cls_scores.max(dim=-1)\n\n        if self.training:\n            targets_dict = self.assign_targets(batch_dict)\n            ret_dict['point_cls_labels'] = targets_dict['point_cls_labels']\n        self.forward_ret_dict = ret_dict\n\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/dense_heads/point_head_template.py",
    "content": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom ...ops.roiaware_pool3d import roiaware_pool3d_utils\nfrom ...utils import common_utils, loss_utils\n\n\nclass PointHeadTemplate(nn.Module):\n    def __init__(self, model_cfg, num_class):\n        super().__init__()\n        self.model_cfg = model_cfg\n        self.num_class = num_class\n\n        self.build_losses(self.model_cfg.LOSS_CONFIG)\n        self.forward_ret_dict = None\n\n    def build_losses(self, losses_cfg):\n        self.add_module(\n            'cls_loss_func',\n            loss_utils.SigmoidFocalClassificationLoss(alpha=0.25, gamma=2.0)\n        )\n        reg_loss_type = losses_cfg.get('LOSS_REG', None)\n        if reg_loss_type == 'smooth-l1':\n            self.reg_loss_func = F.smooth_l1_loss\n        elif reg_loss_type == 'l1':\n            self.reg_loss_func = F.l1_loss\n        elif reg_loss_type == 'WeightedSmoothL1Loss':\n            self.reg_loss_func = loss_utils.WeightedSmoothL1Loss(\n                code_weights=losses_cfg.LOSS_WEIGHTS.get('code_weights', None)\n            )\n        else:\n            self.reg_loss_func = F.smooth_l1_loss\n\n    @staticmethod\n    def make_fc_layers(fc_cfg, input_channels, output_channels):\n        fc_layers = []\n        c_in = input_channels\n        for k in range(0, fc_cfg.__len__()):\n            fc_layers.extend([\n                nn.Linear(c_in, fc_cfg[k], bias=False),\n                nn.BatchNorm1d(fc_cfg[k]),\n                nn.ReLU(),\n            ])\n            c_in = fc_cfg[k]\n        fc_layers.append(nn.Linear(c_in, output_channels, bias=True))\n        return nn.Sequential(*fc_layers)\n\n    def assign_stack_targets(self, points, gt_boxes, extend_gt_boxes=None,\n                             ret_box_labels=False, ret_part_labels=False,\n                             set_ignore_flag=True, use_ball_constraint=False, central_radius=2.0):\n        \"\"\"\n        Args:\n            points: (N1 + N2 
+ N3 + ..., 4) [bs_idx, x, y, z]\n            gt_boxes: (B, M, 8)\n            extend_gt_boxes: [B, M, 8]\n            ret_box_labels:\n            ret_part_labels:\n            set_ignore_flag:\n            use_ball_constraint:\n            central_radius:\n\n        Returns:\n            point_cls_labels: (N1 + N2 + N3 + ...), long type, 0:background, -1:ignored\n            point_box_labels: (N1 + N2 + N3 + ..., code_size)\n\n        \"\"\"\n        assert len(points.shape) == 2 and points.shape[1] == 4, 'points.shape=%s' % str(points.shape)\n        assert len(gt_boxes.shape) == 3 and gt_boxes.shape[2] == 8, 'gt_boxes.shape=%s' % str(gt_boxes.shape)\n        assert extend_gt_boxes is None or len(extend_gt_boxes.shape) == 3 and extend_gt_boxes.shape[2] == 8, \\\n            'extend_gt_boxes.shape=%s' % str(extend_gt_boxes.shape)\n        assert set_ignore_flag != use_ball_constraint, 'Choose one only!'\n        batch_size = gt_boxes.shape[0]\n        bs_idx = points[:, 0]\n        point_cls_labels = points.new_zeros(points.shape[0]).long()\n        point_box_labels = gt_boxes.new_zeros((points.shape[0], 8)) if ret_box_labels else None\n        point_part_labels = gt_boxes.new_zeros((points.shape[0], 3)) if ret_part_labels else None\n        for k in range(batch_size):\n            bs_mask = (bs_idx == k)\n            points_single = points[bs_mask][:, 1:4]\n            point_cls_labels_single = point_cls_labels.new_zeros(bs_mask.sum())\n            box_idxs_of_pts = roiaware_pool3d_utils.points_in_boxes_gpu(\n                points_single.unsqueeze(dim=0), gt_boxes[k:k + 1, :, 0:7].contiguous()\n            ).long().squeeze(dim=0)\n            box_fg_flag = (box_idxs_of_pts >= 0)\n            if set_ignore_flag:\n                extend_box_idxs_of_pts = roiaware_pool3d_utils.points_in_boxes_gpu(\n                    points_single.unsqueeze(dim=0), extend_gt_boxes[k:k+1, :, 0:7].contiguous()\n                ).long().squeeze(dim=0)\n                fg_flag = 
box_fg_flag\n                ignore_flag = fg_flag ^ (extend_box_idxs_of_pts >= 0)\n                point_cls_labels_single[ignore_flag] = -1\n            elif use_ball_constraint:\n                box_centers = gt_boxes[k][box_idxs_of_pts][:, 0:3].clone()\n                box_centers[:, 2] += gt_boxes[k][box_idxs_of_pts][:, 5] / 2\n                ball_flag = ((box_centers - points_single).norm(dim=1) < central_radius)\n                fg_flag = box_fg_flag & ball_flag\n            else:\n                raise NotImplementedError\n\n            gt_box_of_fg_points = gt_boxes[k][box_idxs_of_pts[fg_flag]]\n            point_cls_labels_single[fg_flag] = 1 if self.num_class == 1 else gt_box_of_fg_points[:, -1].long()\n            point_cls_labels[bs_mask] = point_cls_labels_single\n\n            if ret_box_labels:\n                point_box_labels_single = point_box_labels.new_zeros((bs_mask.sum(), 8))\n                fg_point_box_labels = self.box_coder.encode_torch(\n                    gt_boxes=gt_box_of_fg_points[:, :-1], points=points_single[fg_flag],\n                    gt_classes=gt_box_of_fg_points[:, -1].long()\n                )\n                point_box_labels_single[fg_flag] = fg_point_box_labels\n                point_box_labels[bs_mask] = point_box_labels_single\n\n            if ret_part_labels:\n                point_part_labels_single = point_part_labels.new_zeros((bs_mask.sum(), 3))\n                transformed_points = points_single[fg_flag] - gt_box_of_fg_points[:, 0:3]\n                transformed_points = common_utils.rotate_points_along_z(\n                    transformed_points.view(-1, 1, 3), -gt_box_of_fg_points[:, 6]\n                ).view(-1, 3)\n                offset = torch.tensor([0.5, 0.5, 0.5]).view(1, 3).type_as(transformed_points)\n                point_part_labels_single[fg_flag] = (transformed_points / gt_box_of_fg_points[:, 3:6]) + offset\n                point_part_labels[bs_mask] = point_part_labels_single\n\n        
targets_dict = {\n            'point_cls_labels': point_cls_labels,\n            'point_box_labels': point_box_labels,\n            'point_part_labels': point_part_labels\n        }\n        return targets_dict\n\n    def get_cls_layer_loss(self, tb_dict=None):\n        point_cls_labels = self.forward_ret_dict['point_cls_labels'].view(-1)\n        point_cls_preds = self.forward_ret_dict['point_cls_preds'].view(-1, self.num_class)\n\n        positives = (point_cls_labels > 0)\n        negative_cls_weights = (point_cls_labels == 0) * 1.0\n        cls_weights = (negative_cls_weights + 1.0 * positives).float()\n        pos_normalizer = positives.sum(dim=0).float()\n        cls_weights /= torch.clamp(pos_normalizer, min=1.0)\n\n        one_hot_targets = point_cls_preds.new_zeros(*list(point_cls_labels.shape), self.num_class + 1)\n        one_hot_targets.scatter_(-1, (point_cls_labels * (point_cls_labels >= 0).long()).unsqueeze(dim=-1).long(), 1.0)\n        one_hot_targets = one_hot_targets[..., 1:]\n        cls_loss_src = self.cls_loss_func(point_cls_preds, one_hot_targets, weights=cls_weights)\n        point_loss_cls = cls_loss_src.sum()\n\n        loss_weights_dict = self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS\n        point_loss_cls = point_loss_cls * loss_weights_dict['point_cls_weight']\n        if tb_dict is None:\n            tb_dict = {}\n        tb_dict.update({\n            'point_loss_cls': point_loss_cls.item(),\n            'point_pos_num': pos_normalizer.item()\n        })\n        return point_loss_cls, tb_dict\n\n    def get_part_layer_loss(self, tb_dict=None):\n        pos_mask = self.forward_ret_dict['point_cls_labels'] > 0\n        pos_normalizer = max(1, (pos_mask > 0).sum().item())\n        point_part_labels = self.forward_ret_dict['point_part_labels']\n        point_part_preds = self.forward_ret_dict['point_part_preds']\n        point_loss_part = F.binary_cross_entropy(torch.sigmoid(point_part_preds), point_part_labels, reduction='none')\n        
point_loss_part = (point_loss_part.sum(dim=-1) * pos_mask.float()).sum() / (3 * pos_normalizer)\n\n        loss_weights_dict = self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS\n        point_loss_part = point_loss_part * loss_weights_dict['point_part_weight']\n        if tb_dict is None:\n            tb_dict = {}\n        tb_dict.update({'point_loss_part': point_loss_part.item()})\n        return point_loss_part, tb_dict\n\n    def get_box_layer_loss(self, tb_dict=None):\n        pos_mask = self.forward_ret_dict['point_cls_labels'] > 0\n        point_box_labels = self.forward_ret_dict['point_box_labels']\n        point_box_preds = self.forward_ret_dict['point_box_preds']\n\n        reg_weights = pos_mask.float()\n        pos_normalizer = pos_mask.sum().float()\n        reg_weights /= torch.clamp(pos_normalizer, min=1.0)\n\n        point_loss_box_src = self.reg_loss_func(\n            point_box_preds[None, ...], point_box_labels[None, ...], weights=reg_weights[None, ...]\n        )\n        point_loss_box = point_loss_box_src.sum()\n\n        loss_weights_dict = self.model_cfg.LOSS_CONFIG.LOSS_WEIGHTS\n        point_loss_box = point_loss_box * loss_weights_dict['point_box_weight']\n        if tb_dict is None:\n            tb_dict = {}\n        tb_dict.update({'point_loss_box': point_loss_box.item()})\n        return point_loss_box, tb_dict\n\n    def generate_predicted_boxes(self, points, point_cls_preds, point_box_preds):\n        \"\"\"\n        Args:\n            points: (N, 3)\n            point_cls_preds: (N, num_class)\n            point_box_preds: (N, box_code_size)\n        Returns:\n            point_cls_preds: (N, num_class)\n            point_box_preds: (N, box_code_size)\n\n        \"\"\"\n        _, pred_classes = point_cls_preds.max(dim=-1)\n        point_box_preds = self.box_coder.decode_torch(point_box_preds, points, pred_classes + 1)\n\n        return point_cls_preds, point_box_preds\n\n    def forward(self, **kwargs):\n        raise NotImplementedError\n"
  },
  {
    "path": "pcdet/models/dense_heads/point_intra_part_head.py",
    "content": "import torch\n\nfrom ...utils import box_coder_utils, box_utils\nfrom .point_head_template import PointHeadTemplate\n\n\nclass PointIntraPartOffsetHead(PointHeadTemplate):\n    \"\"\"\n    Point-based head for predicting the intra-object part locations.\n    Reference Paper: https://arxiv.org/abs/1907.03670\n    From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network\n    \"\"\"\n    def __init__(self, num_class, input_channels, model_cfg, predict_boxes_when_training=False, **kwargs):\n        super().__init__(model_cfg=model_cfg, num_class=num_class)\n        self.predict_boxes_when_training = predict_boxes_when_training\n        self.cls_layers = self.make_fc_layers(\n            fc_cfg=self.model_cfg.CLS_FC,\n            input_channels=input_channels,\n            output_channels=num_class\n        )\n        self.part_reg_layers = self.make_fc_layers(\n            fc_cfg=self.model_cfg.PART_FC,\n            input_channels=input_channels,\n            output_channels=3\n        )\n        target_cfg = self.model_cfg.TARGET_CONFIG\n        if target_cfg.get('BOX_CODER', None) is not None:\n            self.box_coder = getattr(box_coder_utils, target_cfg.BOX_CODER)(\n                **target_cfg.BOX_CODER_CONFIG\n            )\n            self.box_layers = self.make_fc_layers(\n                fc_cfg=self.model_cfg.REG_FC,\n                input_channels=input_channels,\n                output_channels=self.box_coder.code_size\n            )\n        else:\n            self.box_layers = None\n\n    def assign_targets(self, input_dict):\n        \"\"\"\n        Args:\n            input_dict:\n                point_features: (N1 + N2 + N3 + ..., C)\n                batch_size:\n                point_coords: (N1 + N2 + N3 + ..., 4) [bs_idx, x, y, z]\n                gt_boxes (optional): (B, M, 8)\n        Returns:\n            point_cls_labels: (N1 + N2 + N3 + ...), long type, 0:background, -1:ignored\n 
           point_part_labels: (N1 + N2 + N3 + ..., 3)\n        \"\"\"\n        point_coords = input_dict['point_coords']\n        gt_boxes = input_dict['gt_boxes']\n        assert gt_boxes.shape.__len__() == 3, 'gt_boxes.shape=%s' % str(gt_boxes.shape)\n        assert point_coords.shape.__len__() in [2], 'points.shape=%s' % str(point_coords.shape)\n\n        batch_size = gt_boxes.shape[0]\n        extend_gt_boxes = box_utils.enlarge_box3d(\n            gt_boxes.view(-1, gt_boxes.shape[-1]), extra_width=self.model_cfg.TARGET_CONFIG.GT_EXTRA_WIDTH\n        ).view(batch_size, -1, gt_boxes.shape[-1])\n        targets_dict = self.assign_stack_targets(\n            points=point_coords, gt_boxes=gt_boxes, extend_gt_boxes=extend_gt_boxes,\n            set_ignore_flag=True, use_ball_constraint=False,\n            ret_part_labels=True, ret_box_labels=(self.box_layers is not None)\n        )\n\n        return targets_dict\n\n    def get_loss(self, tb_dict=None):\n        tb_dict = {} if tb_dict is None else tb_dict\n        point_loss_cls, tb_dict = self.get_cls_layer_loss(tb_dict)\n        point_loss_part, tb_dict = self.get_part_layer_loss(tb_dict)\n        point_loss = point_loss_cls + point_loss_part\n\n        if self.box_layers is not None:\n            point_loss_box, tb_dict = self.get_box_layer_loss(tb_dict)\n            point_loss += point_loss_box\n        return point_loss, tb_dict\n\n    def forward(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                point_features: (N1 + N2 + N3 + ..., C) or (B, N, C)\n                point_coords: (N1 + N2 + N3 + ..., 4) [bs_idx, x, y, z]\n                point_labels (optional): (N1 + N2 + N3 + ...)\n                gt_boxes (optional): (B, M, 8)\n        Returns:\n            batch_dict:\n                point_cls_scores: (N1 + N2 + N3 + ..., 1)\n                point_part_offset: (N1 + N2 + N3 + ..., 3)\n        \"\"\"\n        point_features = 
batch_dict['point_features']\n        point_cls_preds = self.cls_layers(point_features)  # (total_points, num_class)\n        point_part_preds = self.part_reg_layers(point_features)\n\n        ret_dict = {\n            'point_cls_preds': point_cls_preds,\n            'point_part_preds': point_part_preds,\n        }\n        if self.box_layers is not None:\n            point_box_preds = self.box_layers(point_features)\n            ret_dict['point_box_preds'] = point_box_preds\n\n        point_cls_scores = torch.sigmoid(point_cls_preds)\n        point_part_offset = torch.sigmoid(point_part_preds)\n        batch_dict['point_cls_scores'], _ = point_cls_scores.max(dim=-1)\n        batch_dict['point_part_offset'] = point_part_offset\n\n        if self.training:\n            targets_dict = self.assign_targets(batch_dict)\n            ret_dict['point_cls_labels'] = targets_dict['point_cls_labels']\n            ret_dict['point_part_labels'] = targets_dict.get('point_part_labels')\n            ret_dict['point_box_labels'] = targets_dict.get('point_box_labels')\n\n        if self.box_layers is not None and (not self.training or self.predict_boxes_when_training):\n            point_cls_preds, point_box_preds = self.generate_predicted_boxes(\n                points=batch_dict['point_coords'][:, 1:4],\n                point_cls_preds=point_cls_preds, point_box_preds=ret_dict['point_box_preds']\n            )\n            batch_dict['batch_cls_preds'] = point_cls_preds\n            batch_dict['batch_box_preds'] = point_box_preds\n            batch_dict['batch_index'] = batch_dict['point_coords'][:, 0]\n            batch_dict['cls_preds_normalized'] = False\n\n        self.forward_ret_dict = ret_dict\n        return batch_dict\n"
  },
  {
    "path": "pcdet/models/dense_heads/target_assigner/anchor_generator.py",
    "content": "import torch\n\n\nclass AnchorGenerator(object):\n    def __init__(self, anchor_range, anchor_generator_config):\n        super().__init__()\n        self.anchor_generator_cfg = anchor_generator_config\n        self.anchor_range = anchor_range\n        self.anchor_sizes = [config['anchor_sizes'] for config in anchor_generator_config]\n        self.anchor_rotations = [config['anchor_rotations'] for config in anchor_generator_config]\n        self.anchor_heights = [config['anchor_bottom_heights'] for config in anchor_generator_config]\n        self.align_center = [config.get('align_center', False) for config in anchor_generator_config]\n\n        assert len(self.anchor_sizes) == len(self.anchor_rotations) == len(self.anchor_heights)\n        self.num_of_anchor_sets = len(self.anchor_sizes)\n\n    def generate_anchors(self, grid_sizes):\n        assert len(grid_sizes) == self.num_of_anchor_sets\n        all_anchors = []\n        num_anchors_per_location = []\n        for grid_size, anchor_size, anchor_rotation, anchor_height, align_center in zip(\n                grid_sizes, self.anchor_sizes, self.anchor_rotations, self.anchor_heights, self.align_center):\n\n            num_anchors_per_location.append(len(anchor_rotation) * len(anchor_size) * len(anchor_height))\n            if align_center:\n                x_stride = (self.anchor_range[3] - self.anchor_range[0]) / grid_size[0]\n                y_stride = (self.anchor_range[4] - self.anchor_range[1]) / grid_size[1]\n                x_offset, y_offset = x_stride / 2, y_stride / 2\n            else:\n                x_stride = (self.anchor_range[3] - self.anchor_range[0]) / (grid_size[0] - 1)\n                y_stride = (self.anchor_range[4] - self.anchor_range[1]) / (grid_size[1] - 1)\n                x_offset, y_offset = 0, 0\n\n            x_shifts = torch.arange(\n                self.anchor_range[0] + x_offset, self.anchor_range[3] + 1e-5, step=x_stride, dtype=torch.float32,\n            
).cuda()\n            y_shifts = torch.arange(\n                self.anchor_range[1] + y_offset, self.anchor_range[4] + 1e-5, step=y_stride, dtype=torch.float32,\n            ).cuda()\n            z_shifts = x_shifts.new_tensor(anchor_height)\n\n            num_anchor_size, num_anchor_rotation = anchor_size.__len__(), anchor_rotation.__len__()\n            anchor_rotation = x_shifts.new_tensor(anchor_rotation)\n            anchor_size = x_shifts.new_tensor(anchor_size)\n            x_shifts, y_shifts, z_shifts = torch.meshgrid([\n                x_shifts, y_shifts, z_shifts\n            ])  # [x_grid, y_grid, z_grid]\n            anchors = torch.stack((x_shifts, y_shifts, z_shifts), dim=-1)  # [x, y, z, 3]\n            anchors = anchors[:, :, :, None, :].repeat(1, 1, 1, anchor_size.shape[0], 1)\n            anchor_size = anchor_size.view(1, 1, 1, -1, 3).repeat([*anchors.shape[0:3], 1, 1])\n            anchors = torch.cat((anchors, anchor_size), dim=-1)\n            anchors = anchors[:, :, :, :, None, :].repeat(1, 1, 1, 1, num_anchor_rotation, 1)\n            anchor_rotation = anchor_rotation.view(1, 1, 1, 1, -1, 1).repeat([*anchors.shape[0:3], num_anchor_size, 1, 1])\n            anchors = torch.cat((anchors, anchor_rotation), dim=-1)  # [x, y, z, num_size, num_rot, 7]\n\n            anchors = anchors.permute(2, 1, 0, 3, 4, 5).contiguous()\n            #anchors = anchors.view(-1, anchors.shape[-1])\n            anchors[..., 2] += anchors[..., 5] / 2  # shift to box centers\n            all_anchors.append(anchors)\n        return all_anchors, num_anchors_per_location\n\n\nif __name__ == '__main__':\n    from easydict import EasyDict\n    config = [\n        EasyDict({\n            'anchor_sizes': [[2.1, 4.7, 1.7], [0.86, 0.91, 1.73], [0.84, 1.78, 1.78]],\n            'anchor_rotations': [0, 1.57],\n            'anchor_bottom_heights': [0, 0.5]  # key must match the one read in __init__\n        })\n    ]\n\n    A = AnchorGenerator(\n        anchor_range=[-75.2, -75.2, -2, 75.2, 75.2, 4],\n        
anchor_generator_config=config\n    )\n    A.generate_anchors([[188, 188]])\n"
  },
  {
    "path": "pcdet/models/dense_heads/target_assigner/atss_target_assigner.py",
    "content": "import torch\n\nfrom ....ops.iou3d_nms import iou3d_nms_utils\nfrom ....utils import common_utils\n\n\nclass ATSSTargetAssigner(object):\n    \"\"\"\n    Reference: https://arxiv.org/abs/1912.02424\n    \"\"\"\n    def __init__(self, topk, box_coder, match_height=False):\n        self.topk = topk\n        self.box_coder = box_coder\n        self.match_height = match_height\n\n    def assign_targets(self, anchors_list, gt_boxes_with_classes, use_multihead=False):\n        \"\"\"\n        Args:\n            anchors: [(N, 7), ...]\n            gt_boxes: (B, M, 8)\n        Returns:\n\n        \"\"\"\n        if not isinstance(anchors_list, list):\n            anchors_list = [anchors_list]\n            single_set_of_anchor = True\n        else:\n            single_set_of_anchor = len(anchors_list) == 1\n        cls_labels_list, reg_targets_list, reg_weights_list = [], [], []\n        for anchors in anchors_list:\n            batch_size = gt_boxes_with_classes.shape[0]\n            gt_classes = gt_boxes_with_classes[:, :, -1]\n            gt_boxes = gt_boxes_with_classes[:, :, :-1]\n            if use_multihead:\n                anchors = anchors.permute(3, 4, 0, 1, 2, 5).contiguous().view(-1, anchors.shape[-1])\n            else:\n                anchors = anchors.view(-1, anchors.shape[-1])\n            cls_labels, reg_targets, reg_weights = [], [], []\n            for k in range(batch_size):\n                cur_gt = gt_boxes[k]\n                cnt = cur_gt.__len__() - 1\n                while cnt > 0 and cur_gt[cnt].sum() == 0:\n                    cnt -= 1\n                cur_gt = cur_gt[:cnt + 1]\n\n                cur_gt_classes = gt_classes[k][:cnt + 1]\n                cur_cls_labels, cur_reg_targets, cur_reg_weights = self.assign_targets_single(\n                    anchors, cur_gt, cur_gt_classes\n                )\n                cls_labels.append(cur_cls_labels)\n                reg_targets.append(cur_reg_targets)\n                
reg_weights.append(cur_reg_weights)\n\n            cls_labels = torch.stack(cls_labels, dim=0)\n            reg_targets = torch.stack(reg_targets, dim=0)\n            reg_weights = torch.stack(reg_weights, dim=0)\n            cls_labels_list.append(cls_labels)\n            reg_targets_list.append(reg_targets)\n            reg_weights_list.append(reg_weights)\n\n        if single_set_of_anchor:\n            ret_dict = {\n                'box_cls_labels': cls_labels_list[0],\n                'box_reg_targets': reg_targets_list[0],\n                'reg_weights': reg_weights_list[0]\n            }\n        else:\n            ret_dict = {\n                'box_cls_labels': torch.cat(cls_labels_list, dim=1),\n                'box_reg_targets': torch.cat(reg_targets_list, dim=1),\n                'reg_weights': torch.cat(reg_weights_list, dim=1)\n            }\n        return ret_dict\n\n    def assign_targets_single(self, anchors, gt_boxes, gt_classes):\n        \"\"\"\n        Args:\n            anchors: (N, 7) [x, y, z, dx, dy, dz, heading]\n            gt_boxes: (M, 7) [x, y, z, dx, dy, dz, heading]\n            gt_classes: (M)\n        Returns:\n\n        \"\"\"\n        num_anchor = anchors.shape[0]\n        num_gt = gt_boxes.shape[0]\n\n        # select topk anchors for each gt_boxes\n        if self.match_height:\n            ious = iou3d_nms_utils.boxes_iou3d_gpu(anchors[:, 0:7], gt_boxes[:, 0:7])  # (N, M)\n        else:\n            ious = iou3d_nms_utils.boxes_iou_bev(anchors[:, 0:7], gt_boxes[:, 0:7])\n\n        distance = (anchors[:, None, 0:3] - gt_boxes[None, :, 0:3]).norm(dim=-1)  # (N, M)\n        _, topk_idxs = distance.topk(self.topk, dim=0, largest=False)  # (K, M)\n        candidate_ious = ious[topk_idxs, torch.arange(num_gt)]  # (K, M)\n        iou_mean_per_gt = candidate_ious.mean(dim=0)\n        iou_std_per_gt = candidate_ious.std(dim=0)\n        iou_thresh_per_gt = iou_mean_per_gt + iou_std_per_gt + 1e-6\n        is_pos = candidate_ious >= 
iou_thresh_per_gt[None, :]  # (K, M)\n\n        # check whether anchor_center in gt_boxes, only check BEV x-y axes\n        candidate_anchors = anchors[topk_idxs.view(-1)]  # (KxM, 7)\n        gt_boxes_of_each_anchor = gt_boxes[:, :].repeat(self.topk, 1)  # (KxM, 7)\n        xyz_local = candidate_anchors[:, 0:3] - gt_boxes_of_each_anchor[:, 0:3]\n        xyz_local = common_utils.rotate_points_along_z(\n            xyz_local[:, None, :], -gt_boxes_of_each_anchor[:, 6]\n        ).squeeze(dim=1)\n        xy_local = xyz_local[:, 0:2]\n        lw = gt_boxes_of_each_anchor[:, 3:5][:, [1, 0]]  # bugfixed: w ==> y, l ==> x in local coords\n        is_in_gt = ((xy_local <= lw / 2) & (xy_local >= -lw / 2)).all(dim=-1).view(-1, num_gt)  # (K, M)\n        is_pos = is_pos & is_in_gt  # (K, M)\n\n        for ng in range(num_gt):\n            topk_idxs[:, ng] += ng * num_anchor\n\n        # select the highest IoU if an anchor box is assigned with multiple gt_boxes\n        INF = -0x7FFFFFFF\n        ious_inf = torch.full_like(ious, INF).t().contiguous().view(-1)  # (MxN)\n        index = topk_idxs.view(-1)[is_pos.view(-1)]\n        ious_inf[index] = ious.t().contiguous().view(-1)[index]\n        ious_inf = ious_inf.view(num_gt, -1).t()  # (N, M)\n\n        anchors_to_gt_values, anchors_to_gt_indexs = ious_inf.max(dim=1)\n\n        # match the gt_boxes to the anchors which have maximum iou with them\n        max_iou_of_each_gt, argmax_iou_of_each_gt = ious.max(dim=0)\n        anchors_to_gt_indexs[argmax_iou_of_each_gt] = torch.arange(0, num_gt, device=ious.device)\n        anchors_to_gt_values[argmax_iou_of_each_gt] = max_iou_of_each_gt\n\n        cls_labels = gt_classes[anchors_to_gt_indexs]\n        cls_labels[anchors_to_gt_values == INF] = 0\n        matched_gts = gt_boxes[anchors_to_gt_indexs]\n\n        pos_mask = cls_labels > 0\n        reg_targets = matched_gts.new_zeros((num_anchor, self.box_coder.code_size))\n        reg_weights = matched_gts.new_zeros(num_anchor)\n       
 if pos_mask.sum() > 0:\n            reg_targets[pos_mask > 0] = self.box_coder.encode_torch(matched_gts[pos_mask > 0], anchors[pos_mask > 0])\n            reg_weights[pos_mask] = 1.0\n\n        return cls_labels, reg_targets, reg_weights\n"
  },
  {
    "path": "pcdet/models/dense_heads/target_assigner/axis_aligned_target_assigner.py",
    "content": "import numpy as np\nimport torch\n\nfrom ....ops.iou3d_nms import iou3d_nms_utils\n\nfrom ....utils import box_utils\nimport time\n\nclass AxisAlignedTargetAssigner(object):\n    def __init__(self, model_cfg, class_names, box_coder, grid_size, point_cloud_range, match_height=False,):\n        super().__init__()\n\n        anchor_generator_cfg = model_cfg.ANCHOR_GENERATOR_CONFIG\n        anchor_target_cfg = model_cfg.TARGET_ASSIGNER_CONFIG\n\n        self.grid_size = grid_size  # [1408 1600   40]\n        self.point_cloud_range = point_cloud_range  # [  0.  -40.   -3.   70.4  40.    1. ]\n\n        self.voxel_size = (point_cloud_range[3]-point_cloud_range[0])/grid_size[0]\n\n        self.feature_map_stride = [config['feature_map_stride'] for config in anchor_generator_cfg]\n\n        self.box_coder = box_coder\n        self.match_height = match_height\n        self.class_names = np.array(class_names)\n        self.anchor_class_names = [config['class_name'] for config in anchor_generator_cfg]\n        self.pos_fraction = anchor_target_cfg.POS_FRACTION if anchor_target_cfg.POS_FRACTION >= 0 else None\n        self.sample_size = anchor_target_cfg.SAMPLE_SIZE\n        self.norm_by_num_examples = anchor_target_cfg.NORM_BY_NUM_EXAMPLES\n        self.matched_thresholds = {}\n        self.unmatched_thresholds = {}\n        for config in anchor_generator_cfg:\n            self.matched_thresholds[config['class_name']] = config['matched_threshold']\n            self.unmatched_thresholds[config['class_name']] = config['unmatched_threshold']\n         \n        self.use_multihead = model_cfg.get('USE_MULTIHEAD', False)\n        self.seperate_multihead = model_cfg.get('SEPERATE_MULTIHEAD', False)\n        if self.seperate_multihead:\n            rpn_head_cfgs = model_cfg.RPN_HEAD_CFGS\n            self.gt_remapping = {}\n            for rpn_head_cfg in rpn_head_cfgs:\n                for idx, name in enumerate(rpn_head_cfg['HEAD_CLS_NAME']):\n                    
self.gt_remapping[name] = idx + 1\n\n\n    def assign_targets(self, all_anchors, gt_boxes_with_classes):\n        \"\"\"\n        Args:\n            all_anchors: [(N, 7), ...]\n            gt_boxes: (B, M, 8)\n        Returns:\n\n        \"\"\"\n\n        bbox_targets = []\n        cls_labels = []\n        reg_weights = []\n        gt_ious = []\n\n        batch_size = gt_boxes_with_classes.shape[0]\n        gt_classes = gt_boxes_with_classes[:, :, -1]\n        gt_boxes = gt_boxes_with_classes[:, :, :-1]\n        for k in range(batch_size):\n            cur_gt = gt_boxes[k]\n            cnt = cur_gt.__len__() - 1\n            while cnt > 0 and cur_gt[cnt].sum() == 0:\n                cnt -= 1\n            cur_gt = cur_gt[:cnt + 1]\n            cur_gt_classes = gt_classes[k][:cnt + 1].int()\n\n            target_list = []\n\n            for anchor_class_name, anchors in zip(self.anchor_class_names, all_anchors):\n                if cur_gt_classes.shape[0] > 1:\n                    mask = torch.from_numpy(self.class_names[cur_gt_classes.cpu() - 1] == anchor_class_name)\n                else:\n                    mask = torch.tensor([self.class_names[c - 1] == anchor_class_name\n                                         for c in cur_gt_classes], dtype=torch.bool)\n                this_gt = cur_gt[mask]\n\n                if self.use_multihead:\n                    anchors = anchors.permute(3, 4, 0, 1, 2, 5).contiguous().view(-1, anchors.shape[-1])\n                    if self.seperate_multihead:\n                        selected_classes = cur_gt_classes[mask].clone()\n                        if len(selected_classes) > 0:\n                            new_cls_id = self.gt_remapping[anchor_class_name]\n                            selected_classes[:] = new_cls_id\n                    else:\n                        selected_classes = cur_gt_classes[mask]\n                else:\n                    feature_map_size = anchors.shape[:2] #1,376,376  (y,x)\n\n                    
anchors = anchors.view(-1, anchors.shape[-1])\n                    selected_classes = cur_gt_classes[mask]\n\n                single_target = self.assign_targets_single(\n                    anchors,\n                    this_gt,\n                    gt_classes=selected_classes,\n                    matched_threshold=self.matched_thresholds[anchor_class_name],\n                    unmatched_threshold=self.unmatched_thresholds[anchor_class_name]\n                )\n\n                target_list.append(single_target)\n\n\n            if self.use_multihead:\n                target_dict = {\n                    'box_cls_labels': [t['box_cls_labels'].view(-1) for t in target_list],\n                    'box_reg_targets': [t['box_reg_targets'].view(-1, self.box_coder.code_size) for t in target_list],\n                    'reg_weights': [t['reg_weights'].view(-1) for t in target_list],\n                    'gt_ious': [t['gt_ious'].view(-1) for t in target_list],\n\n                }\n\n                target_dict['box_reg_targets'] = torch.cat(target_dict['box_reg_targets'], dim=0)\n                target_dict['box_cls_labels'] = torch.cat(target_dict['box_cls_labels'], dim=0).view(-1)\n                target_dict['reg_weights'] = torch.cat(target_dict['reg_weights'], dim=0).view(-1)\n                target_dict['gt_ious'] = torch.cat(target_dict['gt_ious'], dim=0).view(-1)\n\n            else:\n                target_dict = {\n                    'box_cls_labels': [t['box_cls_labels'].view(*feature_map_size, -1) for t in target_list],\n                    'gt_ious': [t['gt_ious'].view(*feature_map_size, -1) for t in target_list],\n                    'box_reg_targets': [t['box_reg_targets'].view(*feature_map_size, -1, self.box_coder.code_size)\n                                        for t in target_list],\n                    'reg_weights': [t['reg_weights'].view(*feature_map_size, -1) for t in target_list]\n                }\n                
target_dict['box_reg_targets'] = torch.cat(\n                    target_dict['box_reg_targets'], dim=-2\n                ).view(-1, self.box_coder.code_size)\n\n                target_dict['box_cls_labels'] = torch.cat(target_dict['box_cls_labels'], dim=-1).view(-1)\n                target_dict['gt_ious'] = torch.cat(target_dict['gt_ious'], dim=-1).view(-1)\n                target_dict['reg_weights'] = torch.cat(target_dict['reg_weights'], dim=-1).view(-1)\n            bbox_targets.append(target_dict['box_reg_targets'])\n            cls_labels.append(target_dict['box_cls_labels'])\n            gt_ious.append(target_dict['gt_ious'])\n            reg_weights.append(target_dict['reg_weights'])\n\n        bbox_targets = torch.stack(bbox_targets, dim=0)\n\n        cls_labels = torch.stack(cls_labels, dim=0)\n        reg_weights = torch.stack(reg_weights, dim=0)\n        gt_ious = torch.stack(gt_ious, dim=0)\n\n        all_targets_dict = {\n            'box_cls_labels': cls_labels,\n            'box_reg_targets': bbox_targets,\n            'reg_weights': reg_weights,\n            'gt_ious':gt_ious\n\n        }\n        return all_targets_dict\n\n    def assign_targets_single(self, anchors,\n                         gt_boxes,\n                         gt_classes,\n                         matched_threshold=0.6,\n                         unmatched_threshold=0.45\n                        ):\n\n        num_anchors = anchors.shape[0]\n        num_gt = gt_boxes.shape[0]\n\n        labels = torch.ones((num_anchors,), dtype=torch.int32, device=anchors.device) * -1\n        gt_ids = torch.ones((num_anchors,), dtype=torch.int32, device=anchors.device) * -1\n\n        ious = torch.zeros((num_anchors,), dtype=torch.float, device=anchors.device) * -1\n\n        if len(gt_boxes) > 0 and anchors.shape[0] > 0:\n            anchor_by_gt_overlap = iou3d_nms_utils.boxes_iou3d_gpu(anchors[:, 0:7], gt_boxes[:, 0:7]) \\\n                if self.match_height else 
box_utils.boxes3d_nearest_bev_iou(anchors[:, 0:7], gt_boxes[:, 0:7])\n\n            anchor_to_gt_argmax = torch.from_numpy(anchor_by_gt_overlap.cpu().numpy().argmax(axis=1)).cuda()\n            anchor_to_gt_max = anchor_by_gt_overlap[\n                torch.arange(num_anchors, device=anchors.device), anchor_to_gt_argmax\n            ]\n\n            ious = anchor_to_gt_max\n\n            gt_to_anchor_argmax = torch.from_numpy(anchor_by_gt_overlap.cpu().numpy().argmax(axis=0)).cuda()\n            gt_to_anchor_max = anchor_by_gt_overlap[gt_to_anchor_argmax, torch.arange(num_gt, device=anchors.device)]\n            empty_gt_mask = gt_to_anchor_max == 0\n            gt_to_anchor_max[empty_gt_mask] = -1\n\n            anchors_with_max_overlap = (anchor_by_gt_overlap == gt_to_anchor_max).nonzero()[:, 0]\n            gt_inds_force = anchor_to_gt_argmax[anchors_with_max_overlap]\n            labels[anchors_with_max_overlap] = gt_classes[gt_inds_force]\n            gt_ids[anchors_with_max_overlap] = gt_inds_force.int()\n\n            pos_inds = anchor_to_gt_max >= matched_threshold\n            gt_inds_over_thresh = anchor_to_gt_argmax[pos_inds]\n            labels[pos_inds] = gt_classes[gt_inds_over_thresh]\n            gt_ids[pos_inds] = gt_inds_over_thresh.int()\n            bg_inds = (anchor_to_gt_max < unmatched_threshold).nonzero()[:, 0]\n        else:\n            bg_inds = torch.arange(num_anchors, device=anchors.device)\n\n        fg_inds = (labels > 0).nonzero()[:, 0]\n\n        if self.pos_fraction is not None:\n            num_fg = int(self.pos_fraction * self.sample_size)\n            if len(fg_inds) > num_fg:\n                num_disabled = len(fg_inds) - num_fg\n                disable_inds = torch.randperm(len(fg_inds))[:num_disabled]\n                # disable_inds indexes into fg_inds, so map back to anchor indices\n                labels[fg_inds[disable_inds]] = -1\n                fg_inds = (labels > 0).nonzero()[:, 0]\n\n            num_bg = self.sample_size - (labels > 0).sum()\n            if len(bg_inds) > num_bg:\n                enable_inds 
= bg_inds[torch.randint(0, len(bg_inds), size=(num_bg,))]\n                labels[enable_inds] = 0\n            # bg_inds = torch.nonzero(labels == 0)[:, 0]\n        else:\n            if len(gt_boxes) == 0 or anchors.shape[0] == 0:\n                labels[:] = 0\n            else:\n                labels[bg_inds] = 0\n                labels[anchors_with_max_overlap] = gt_classes[gt_inds_force]\n\n        bbox_targets = anchors.new_zeros((num_anchors, self.box_coder.code_size))\n        if len(gt_boxes) > 0 and anchors.shape[0] > 0:\n            fg_gt_boxes = gt_boxes[anchor_to_gt_argmax[fg_inds], :]\n            fg_anchors = anchors[fg_inds, :]\n            bbox_targets[fg_inds, :] = self.box_coder.encode_torch(fg_gt_boxes, fg_anchors)\n\n        reg_weights = anchors.new_zeros((num_anchors,))\n\n        if self.norm_by_num_examples:\n            num_examples = (labels >= 0).sum()\n            num_examples = num_examples if num_examples > 1.0 else 1.0\n            reg_weights[labels > 0] = 1.0 / num_examples\n        else:\n            reg_weights[labels > 0] = 1.0\n\n        ret_dict = {\n            'box_cls_labels': labels,\n            'box_reg_targets': bbox_targets,\n            'reg_weights': reg_weights,\n            'gt_ious':ious\n        }\n        return ret_dict\n"
  },
  {
    "path": "pcdet/models/detectors/__init__.py",
    "content": "from .detector3d_template import Detector3DTemplate\n\nfrom .voxel_rcnn import VoxelRCNN\n__all__ = {\n    'Detector3DTemplate': Detector3DTemplate,\n    'VoxelRCNN': VoxelRCNN,\n\n}\n\ndef build_detector(model_cfg, num_class, dataset):\n    model = __all__[model_cfg.NAME](\n        model_cfg=model_cfg, num_class=num_class, dataset=dataset\n    )\n\n    return model\n"
  },
  {
    "path": "pcdet/models/detectors/detector3d_template.py",
    "content": "import os\n\nimport torch\nimport torch.nn as nn\nfrom ...utils.spconv_utils import find_all_spconv_keys\nfrom ...ops.iou3d_nms import iou3d_nms_utils\nfrom .. import backbones_2d, backbones_3d, dense_heads, roi_heads\nfrom ..backbones_2d import map_to_bev\nfrom ..backbones_3d import pfe, vfe\nfrom ..model_utils import model_nms_utils\n\nclass Detector3DTemplate(nn.Module):\n    def __init__(self, model_cfg, num_class, dataset):\n        super().__init__()\n        self.model_cfg = model_cfg\n        self.num_class = num_class\n        self.dataset = dataset\n        self.class_names = dataset.class_names\n        self.register_buffer('global_step', torch.LongTensor(1).zero_())\n\n        self.module_topology = [\n            'vfe', 'backbone_3d', 'map_to_bev_module',\n            'backbone_2d', 'dense_head','pfe',  'point_head', 'roi_head'\n        ]\n\n        self.test_filp=dataset.test_flip\n\n    @property\n    def mode(self):\n        return 'TRAIN' if self.training else 'TEST'\n\n    def update_global_step(self):\n        self.global_step += 1\n\n    def build_networks(self):\n        model_info_dict = {\n            'module_list': [],\n            'num_rawpoint_features': self.dataset.point_feature_encoder.num_point_features,\n            'num_point_features': self.dataset.point_feature_encoder.num_point_features,\n            'grid_size': self.dataset.grid_size,\n            'point_cloud_range': self.dataset.point_cloud_range,\n            'voxel_size': self.dataset.voxel_size\n        }\n        for module_name in self.module_topology:\n            module, model_info_dict = getattr(self, 'build_%s' % module_name)(\n                model_info_dict=model_info_dict\n            )\n            self.add_module(module_name, module)\n        return model_info_dict['module_list']\n\n    def build_vfe(self, model_info_dict):\n        if self.model_cfg.get('VFE', None) is None:\n            return None, model_info_dict\n\n        vfe_module = 
vfe.__all__[self.model_cfg.VFE.NAME](\n            model_cfg=self.model_cfg.VFE,\n            num_point_features=model_info_dict['num_rawpoint_features'],\n            point_cloud_range=model_info_dict['point_cloud_range'],\n            voxel_size=model_info_dict['voxel_size'],\n        )\n        model_info_dict['num_point_features'] = vfe_module.get_output_feature_dim()\n        model_info_dict['module_list'].append(vfe_module)\n        return vfe_module, model_info_dict\n\n    def build_backbone_3d(self, model_info_dict):\n        if self.model_cfg.get('BACKBONE_3D', None) is None:\n            return None, model_info_dict\n\n        backbone_3d_module = backbones_3d.__all__[self.model_cfg.BACKBONE_3D.NAME](\n            model_cfg=self.model_cfg.BACKBONE_3D,\n            input_channels=model_info_dict['num_point_features'],\n            grid_size=model_info_dict['grid_size'],\n            voxel_size=model_info_dict['voxel_size'],\n            point_cloud_range=model_info_dict['point_cloud_range']\n        )\n        model_info_dict['module_list'].append(backbone_3d_module)\n        model_info_dict['num_point_features'] = backbone_3d_module.num_point_features\n        return backbone_3d_module, model_info_dict\n\n    def build_map_to_bev_module(self, model_info_dict):\n        if self.model_cfg.get('MAP_TO_BEV', None) is None:\n            return None, model_info_dict\n\n        map_to_bev_module = map_to_bev.__all__[self.model_cfg.MAP_TO_BEV.NAME](\n            model_cfg=self.model_cfg.MAP_TO_BEV,\n            voxel_size=model_info_dict['voxel_size'],\n            point_cloud_range=model_info_dict['point_cloud_range']\n        )\n        model_info_dict['module_list'].append(map_to_bev_module)\n        model_info_dict['num_bev_features'] = map_to_bev_module.num_bev_features\n        return map_to_bev_module, model_info_dict\n\n    def build_backbone_2d(self, model_info_dict):\n        if self.model_cfg.get('BACKBONE_2D', None) is None:\n            return None, 
model_info_dict\n\n        chan=model_info_dict['num_bev_features']\n\n        backbone_2d_module = backbones_2d.__all__[self.model_cfg.BACKBONE_2D.NAME](\n            model_cfg=self.model_cfg.BACKBONE_2D,\n            input_channels=chan,\n        )\n        model_info_dict['module_list'].append(backbone_2d_module)\n        model_info_dict['num_bev_features_post'] = backbone_2d_module.num_bev_features_post\n        return backbone_2d_module, model_info_dict\n\n    def build_pfe(self, model_info_dict):\n        if self.model_cfg.get('PFE', None) is None:\n            return None, model_info_dict\n\n        pfe_module = pfe.__all__[self.model_cfg.PFE.NAME](\n            model_cfg=self.model_cfg.PFE,\n            voxel_size=model_info_dict['voxel_size'],\n            point_cloud_range=model_info_dict['point_cloud_range'],\n            num_bev_features=model_info_dict['num_bev_features'],\n            num_rawpoint_features=model_info_dict['num_rawpoint_features']\n        )\n        model_info_dict['module_list'].append(pfe_module)\n\n        if type(model_info_dict['num_point_features']) is dict:\n            model_info_dict['num_point_features'].update({\"points_bev\": pfe_module.num_point_features})\n        else:\n            model_info_dict['num_point_features'] = pfe_module.num_point_features\n        model_info_dict['num_point_features_before_fusion'] = pfe_module.num_point_features_before_fusion\n        return pfe_module, model_info_dict\n\n    def build_dense_head(self, model_info_dict):\n        if self.model_cfg.get('DENSE_HEAD', None) is None:\n            return None, model_info_dict\n        dense_head_module = dense_heads.__all__[self.model_cfg.DENSE_HEAD.NAME](\n            model_cfg=self.model_cfg.DENSE_HEAD,\n            input_channels=model_info_dict['num_bev_features_post'],\n            num_class=self.num_class if not self.model_cfg.DENSE_HEAD.CLASS_AGNOSTIC else 1,\n            class_names=self.class_names,\n            
grid_size=model_info_dict['grid_size'],\n            point_cloud_range=model_info_dict['point_cloud_range'],\n            predict_boxes_when_training=self.model_cfg.get('ROI_HEAD', False),\n            voxel_size=model_info_dict.get('voxel_size', False)\n        )\n        model_info_dict['module_list'].append(dense_head_module)\n        return dense_head_module, model_info_dict\n\n    def build_point_head(self, model_info_dict):\n        if self.model_cfg.get('POINT_HEAD', None) is None:\n            return None, model_info_dict\n\n        if self.model_cfg.POINT_HEAD.get('USE_POINT_FEATURES_BEFORE_FUSION', False):\n            num_point_features = model_info_dict['num_point_features_before_fusion']\n        else:\n            num_point_features = model_info_dict['num_point_features']\n\n        point_head_module = dense_heads.__all__[self.model_cfg.POINT_HEAD.NAME](\n            model_cfg=self.model_cfg.POINT_HEAD,\n            input_channels=num_point_features,\n            num_class=self.num_class if not self.model_cfg.POINT_HEAD.CLASS_AGNOSTIC else 1,\n            predict_boxes_when_training=self.model_cfg.get('ROI_HEAD', False)\n        )\n\n        model_info_dict['module_list'].append(point_head_module)\n        return point_head_module, model_info_dict\n\n    def build_roi_head(self, model_info_dict):\n        if self.model_cfg.get('ROI_HEAD', None) is None:\n            return None, model_info_dict\n        roi_head_module = roi_heads.__all__[self.model_cfg.ROI_HEAD.NAME](\n            model_cfg=self.model_cfg.ROI_HEAD,\n            input_channels=model_info_dict['num_point_features'],\n            point_cloud_range=model_info_dict['point_cloud_range'],\n            voxel_size=model_info_dict['voxel_size'],\n            num_class=self.num_class if not self.model_cfg.ROI_HEAD.CLASS_AGNOSTIC else 1,\n        )\n\n        model_info_dict['module_list'].append(roi_head_module)\n        return roi_head_module, model_info_dict\n\n    def forward(self, 
**kwargs):\n        raise NotImplementedError\n\n    def post_processing(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                batch_cls_preds: (B, num_boxes, num_classes | 1) or (N1+N2+..., num_classes | 1)\n                                or [(B, num_boxes, num_class1), (B, num_boxes, num_class2) ...]\n                multihead_label_mapping: [(num_class1), (num_class2), ...]\n                batch_box_preds: (B, num_boxes, 7+C) or (N1+N2+..., 7+C)\n                cls_preds_normalized: indicate whether batch_cls_preds is normalized\n                batch_index: optional (N1+N2+...)\n                has_class_labels: True/False\n                roi_labels: (B, num_rois)  1 .. num_classes\n                batch_pred_labels: (B, num_boxes, 1)\n        Returns:\n\n        \"\"\"\n        post_process_cfg = self.model_cfg.POST_PROCESSING\n        batch_size = batch_dict['batch_size']\n        recall_dict = {}\n        pred_dicts = []\n        for index in range(batch_size):\n            if batch_dict.get('batch_index', None) is not None:\n                assert batch_dict['batch_box_preds'].shape.__len__() == 2\n                batch_mask = (batch_dict['batch_index'] == index)\n            else:\n                assert batch_dict['batch_box_preds'].shape.__len__() == 3\n                batch_mask = index\n\n            box_preds = batch_dict['batch_box_preds'][batch_mask]\n            src_box_preds = box_preds\n\n            if not isinstance(batch_dict['batch_cls_preds'], list):\n                cls_preds = batch_dict['batch_cls_preds'][batch_mask]\n\n                src_cls_preds = cls_preds\n                assert cls_preds.shape[1] in [1, self.num_class]\n\n                if not batch_dict['cls_preds_normalized']:\n                    cls_preds = torch.sigmoid(cls_preds)\n            else:\n                cls_preds = [x[batch_mask] for x in batch_dict['batch_cls_preds']]\n                
src_cls_preds = cls_preds\n                if not batch_dict['cls_preds_normalized']:\n                    cls_preds = [torch.sigmoid(x) for x in cls_preds]\n\n            if post_process_cfg.NMS_CONFIG.MULTI_CLASSES_NMS:\n                if not isinstance(cls_preds, list):\n                    cls_preds = [cls_preds]\n                    # Labels are 1-based, so a single head maps its num_class columns to 1..num_class.\n                    multihead_label_mapping = [torch.arange(1, self.num_class + 1, device=cls_preds[0].device)]\n                else:\n                    multihead_label_mapping = batch_dict['multihead_label_mapping']\n\n                cur_start_idx = 0\n                pred_scores, pred_labels, pred_boxes = [], [], []\n                for cur_cls_preds, cur_label_mapping in zip(cls_preds, multihead_label_mapping):\n                    assert cur_cls_preds.shape[1] == len(cur_label_mapping)\n                    cur_box_preds = box_preds[cur_start_idx: cur_start_idx + cur_cls_preds.shape[0]]\n                    cur_pred_scores, cur_pred_labels, cur_pred_boxes = model_nms_utils.multi_classes_nms(\n                        cls_scores=cur_cls_preds, box_preds=cur_box_preds,\n                        nms_config=post_process_cfg.NMS_CONFIG,\n                        score_thresh=post_process_cfg.SCORE_THRESH\n                    )\n                    cur_pred_labels = cur_label_mapping[cur_pred_labels]\n                    pred_scores.append(cur_pred_scores)\n                    pred_labels.append(cur_pred_labels)\n                    pred_boxes.append(cur_pred_boxes)\n                    cur_start_idx += cur_cls_preds.shape[0]\n\n                final_scores = torch.cat(pred_scores, dim=0)\n                final_labels = torch.cat(pred_labels, dim=0)\n                final_boxes = torch.cat(pred_boxes, dim=0)\n            else:\n                cls_preds, label_preds = torch.max(cls_preds, dim=-1)\n                if batch_dict.get('has_class_labels', False):\n                    label_key = 'roi_labels' if 'roi_labels' in batch_dict else 
'batch_pred_labels'\n                    label_preds = batch_dict[label_key][index]\n                else:\n                    label_preds = label_preds + 1\n\n                if post_process_cfg.get('WBF', True):\n                    if post_process_cfg.OUTPUT_RAW_SCORE:\n                        max_cls_preds, _ = torch.max(src_cls_preds, dim=-1)\n\n                    score_mask = cls_preds > post_process_cfg.SCORE_THRESH\n                    final_scores = cls_preds[score_mask]\n                    final_labels = label_preds[score_mask]\n                    final_boxes = box_preds[score_mask]\n                else:\n                    selected, selected_scores = model_nms_utils.class_agnostic_nms(\n                        box_scores=cls_preds, box_preds=box_preds,\n                        nms_config=post_process_cfg.NMS_CONFIG,\n                        score_thresh=post_process_cfg.SCORE_THRESH\n                    )\n\n                    if post_process_cfg.OUTPUT_RAW_SCORE:\n                        max_cls_preds, _ = torch.max(src_cls_preds, dim=-1)\n                        selected_scores = max_cls_preds[selected]\n\n                    final_scores = selected_scores\n                    final_labels = label_preds[selected]\n                    final_boxes = box_preds[selected]\n\n\n            recall_dict = self.generate_recall_record(\n                box_preds=final_boxes if 'rois' not in batch_dict else src_box_preds,\n                recall_dict=recall_dict, batch_index=index, data_dict=batch_dict,\n                thresh_list=post_process_cfg.RECALL_THRESH_LIST\n            )\n\n            record_dict = {\n                'pred_boxes': final_boxes,\n                'pred_scores': final_scores,\n                'pred_labels': final_labels\n            }\n            if post_process_cfg.get('WBF', True):\n                record_dict.update({'WBF': True})\n\n            pred_dicts.append(record_dict)\n\n        return pred_dicts, recall_dict\n\n    
@staticmethod\n    def generate_recall_record(box_preds, recall_dict, batch_index, data_dict=None, thresh_list=None):\n        if 'gt_boxes' not in data_dict:\n            return recall_dict\n\n        rois = data_dict['rois'][batch_index] if 'rois' in data_dict else None\n        gt_boxes = data_dict['gt_boxes'][batch_index]\n\n        if recall_dict.__len__() == 0:\n            recall_dict = {'gt': 0}\n            for cur_thresh in thresh_list:\n                recall_dict['roi_%s' % (str(cur_thresh))] = 0\n                recall_dict['rcnn_%s' % (str(cur_thresh))] = 0\n\n        cur_gt = gt_boxes\n        k = cur_gt.__len__() - 1\n        while k > 0 and cur_gt[k].sum() == 0:\n            k -= 1\n        cur_gt = cur_gt[:k + 1]\n\n        if cur_gt.shape[0] > 0:\n            if box_preds.shape[0] > 0:\n                iou3d_rcnn = iou3d_nms_utils.boxes_iou3d_gpu(box_preds[:, 0:7], cur_gt[:, 0:7])\n            else:\n                iou3d_rcnn = torch.zeros((0, cur_gt.shape[0]))\n\n            if rois is not None:\n                iou3d_roi = iou3d_nms_utils.boxes_iou3d_gpu(rois[:, 0:7], cur_gt[:, 0:7])\n\n            for cur_thresh in thresh_list:\n                if iou3d_rcnn.shape[0] == 0:\n                    recall_dict['rcnn_%s' % str(cur_thresh)] += 0\n                else:\n                    rcnn_recalled = (iou3d_rcnn.max(dim=0)[0] > cur_thresh).sum().item()\n                    recall_dict['rcnn_%s' % str(cur_thresh)] += rcnn_recalled\n                if rois is not None:\n                    roi_recalled = (iou3d_roi.max(dim=0)[0] > cur_thresh).sum().item()\n                    recall_dict['roi_%s' % str(cur_thresh)] += roi_recalled\n\n            recall_dict['gt'] += cur_gt.shape[0]\n        else:\n            gt_iou = box_preds.new_zeros(box_preds.shape[0])\n        return recall_dict\n\n    def _load_state_dict(self, model_state_disk, *, strict=True):\n        state_dict = self.state_dict()  # local cache of state_dict\n\n        spconv_keys = 
find_all_spconv_keys(self)\n\n        update_model_state = {}\n        for key, val in model_state_disk.items():\n            if key in spconv_keys and key in state_dict and state_dict[key].shape != val.shape:\n                # with different spconv versions, we need to adapt weight shapes for spconv blocks\n                # adapt spconv weights from version 1.x to version 2.x if you used weights from spconv 1.x\n\n                val_native = val.transpose(-1, -2)  # (k1, k2, k3, c_in, c_out) to (k1, k2, k3, c_out, c_in)\n                if val_native.shape == state_dict[key].shape:\n                    val = val_native.contiguous()\n                else:\n                    assert val.shape.__len__() == 5, 'currently only spconv 3D is supported'\n                    val_implicit = val.permute(4, 0, 1, 2, 3)  # (k1, k2, k3, c_in, c_out) to (c_out, k1, k2, k3, c_in)\n                    if val_implicit.shape == state_dict[key].shape:\n                        val = val_implicit.contiguous()\n\n            if key in state_dict and state_dict[key].shape == val.shape:\n                update_model_state[key] = val\n                # logger.info('Update weight %s: %s' % (key, str(val.shape)))\n\n        if strict:\n            self.load_state_dict(update_model_state)\n        else:\n            state_dict.update(update_model_state)\n            self.load_state_dict(state_dict)\n        return state_dict, update_model_state\n\n    def load_params_from_file(self, filename, logger, to_cpu=False):\n        if not os.path.isfile(filename):\n            raise FileNotFoundError\n\n        logger.info('==> Loading parameters from checkpoint %s to %s' % (filename, 'CPU' if to_cpu else 'GPU'))\n        loc_type = torch.device('cpu') if to_cpu else None\n        checkpoint = torch.load(filename, map_location=loc_type)\n        model_state_disk = checkpoint['model_state']\n\n        version = checkpoint.get(\"version\", None)\n        if version is not None:\n            
logger.info('==> Checkpoint trained from version: %s' % version)\n\n        state_dict, update_model_state = self._load_state_dict(model_state_disk, strict=False)\n\n        for key in state_dict:\n            if key not in update_model_state:\n                logger.info('Not updated weight %s: %s' % (key, str(state_dict[key].shape)))\n\n        logger.info('==> Done (loaded %d/%d)' % (len(update_model_state), len(state_dict)))\n\n    def load_params_with_optimizer(self, filename, to_cpu=False, optimizer=None, logger=None):\n        if not os.path.isfile(filename):\n            raise FileNotFoundError\n\n        logger.info('==> Loading parameters from checkpoint %s to %s' % (filename, 'CPU' if to_cpu else 'GPU'))\n        loc_type = torch.device('cpu') if to_cpu else None\n        checkpoint = torch.load(filename, map_location=loc_type)\n        epoch = checkpoint.get('epoch', -1)\n        it = checkpoint.get('it', 0.0)\n\n        self._load_state_dict(checkpoint['model_state'], strict=True)\n\n        if optimizer is not None:\n            if 'optimizer_state' in checkpoint and checkpoint['optimizer_state'] is not None:\n                logger.info('==> Loading optimizer parameters from checkpoint %s to %s'\n                            % (filename, 'CPU' if to_cpu else 'GPU'))\n                optimizer.load_state_dict(checkpoint['optimizer_state'])\n            else:\n                assert filename[-4] == '.', filename\n                src_file, ext = filename[:-4], filename[-3:]\n                optimizer_filename = '%s_optim.%s' % (src_file, ext)\n                if os.path.exists(optimizer_filename):\n                    optimizer_ckpt = torch.load(optimizer_filename, map_location=loc_type)\n                    optimizer.load_state_dict(optimizer_ckpt['optimizer_state'])\n\n        if 'version' in checkpoint:\n            logger.info('==> Checkpoint trained from version: %s' % checkpoint['version'])\n        logger.info('==> Done')\n\n        return it, epoch\n"
  },
  {
    "path": "pcdet/models/detectors/voxel_rcnn.py",
    "content": "from .detector3d_template import Detector3DTemplate\n\n\nclass VoxelRCNN(Detector3DTemplate):\n    def __init__(self, model_cfg, num_class, dataset):\n        super().__init__(model_cfg=model_cfg, num_class=num_class, dataset=dataset)\n        self.module_list = self.build_networks()\n\n    def forward(self, batch_dict):\n        # Run every sub-module in build order; each one takes batch_dict and returns it updated.\n        for cur_module in self.module_list:\n            batch_dict = cur_module(batch_dict)\n\n        if self.training:\n            loss, tb_dict, disp_dict = self.get_training_loss()\n\n            ret_dict = {\n                'loss': loss\n            }\n            return ret_dict, tb_dict, disp_dict\n        else:\n            pred_dicts, recall_dicts = self.post_processing(batch_dict)\n            return pred_dicts, recall_dicts, batch_dict\n\n    def get_training_loss(self):\n        disp_dict = {}\n        # Total loss is the sum of the dense (RPN) head loss and the RoI (RCNN) head loss.\n        loss_rpn, tb_dict = self.dense_head.get_loss()\n        loss_rcnn, tb_dict = self.roi_head.get_loss(tb_dict)\n\n        loss = loss_rpn + loss_rcnn\n        return loss, tb_dict, disp_dict\n"
  },
  {
    "path": "pcdet/models/model_utils/centernet_utils.py",
    "content": "# This file is modified from https://github.com/tianweiy/CenterPoint\n\nimport torch\nimport torch.nn.functional as F\nimport numpy as np\nimport numba\n\n\ndef gaussian_radius(height, width, min_overlap=0.5):\n    \"\"\"\n    Args:\n        height: (N)\n        width: (N)\n        min_overlap:\n    Returns:\n    \"\"\"\n    a1 = 1\n    b1 = (height + width)\n    c1 = width * height * (1 - min_overlap) / (1 + min_overlap)\n    sq1 = (b1 ** 2 - 4 * a1 * c1).sqrt()\n    r1 = (b1 + sq1) / 2\n\n    a2 = 4\n    b2 = 2 * (height + width)\n    c2 = (1 - min_overlap) * width * height\n    sq2 = (b2 ** 2 - 4 * a2 * c2).sqrt()\n    r2 = (b2 + sq2) / 2\n\n    a3 = 4 * min_overlap\n    b3 = -2 * min_overlap * (height + width)\n    c3 = (min_overlap - 1) * width * height\n    sq3 = (b3 ** 2 - 4 * a3 * c3).sqrt()\n    r3 = (b3 + sq3) / 2\n    ret = torch.min(torch.min(r1, r2), r3)\n    return ret\n\n\ndef gaussian2D(shape, sigma=1):\n    m, n = [(ss - 1.) / 2. for ss in shape]\n    y, x = np.ogrid[-m:m + 1, -n:n + 1]\n\n    h = np.exp(-(x * x + y * y) / (2 * sigma * sigma))\n    h[h < np.finfo(h.dtype).eps * h.max()] = 0\n    return h\n\n\ndef draw_gaussian_to_heatmap(heatmap, center, radius, k=1, valid_mask=None):\n    diameter = 2 * radius + 1\n    gaussian = gaussian2D((diameter, diameter), sigma=diameter / 6)\n\n    x, y = int(center[0]), int(center[1])\n\n    height, width = heatmap.shape[0:2]\n\n    left, right = min(x, radius), min(width - x, radius + 1)\n    top, bottom = min(y, radius), min(height - y, radius + 1)\n\n    masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right]\n    masked_gaussian = torch.from_numpy(\n        gaussian[radius - top:radius + bottom, radius - left:radius + right]\n    ).to(heatmap.device).float()\n\n    if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0:  # TODO debug\n        if valid_mask is not None:\n            cur_valid_mask = valid_mask[y - top:y + bottom, x - left:x + right]\n            
masked_gaussian = masked_gaussian * cur_valid_mask.float()\n\n        torch.max(masked_heatmap, masked_gaussian * k, out=masked_heatmap)\n    return heatmap\n\n\ndef _nms(heat, kernel=3):\n    pad = (kernel - 1) // 2\n\n    hmax = F.max_pool2d(heat, (kernel, kernel), stride=1, padding=pad)\n    keep = (hmax == heat).float()\n    return heat * keep\n\n\n@numba.jit(nopython=True)\ndef circle_nms(dets, thresh):\n    x1 = dets[:, 0]\n    y1 = dets[:, 1]\n    scores = dets[:, 2]\n    order = scores.argsort()[::-1].astype(np.int32)  # highest->lowest\n    ndets = dets.shape[0]\n    suppressed = np.zeros((ndets), dtype=np.int32)\n    keep = []\n    for _i in range(ndets):\n        i = order[_i]  # start with highest score box\n        if suppressed[i] == 1:  # if any box have enough iou with this, remove it\n            continue\n        keep.append(i)\n        for _j in range(_i + 1, ndets):\n            j = order[_j]\n            if suppressed[j] == 1:\n                continue\n            # calculate center distance between i and j box\n            dist = (x1[i] - x1[j]) ** 2 + (y1[i] - y1[j]) ** 2\n\n            # ovr = inter / areas[j]\n            if dist <= thresh:\n                suppressed[j] = 1\n    return keep\n\n\ndef _circle_nms(boxes, min_radius, post_max_size=83):\n    \"\"\"\n    NMS according to center distance\n    \"\"\"\n    keep = np.array(circle_nms(boxes.cpu().numpy(), thresh=min_radius))[:post_max_size]\n\n    keep = torch.from_numpy(keep).long().to(boxes.device)\n\n    return keep\n\n\ndef _gather_feat(feat, ind, mask=None):\n    dim = feat.size(2)\n    ind = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), dim)\n    feat = feat.gather(1, ind)\n    if mask is not None:\n        mask = mask.unsqueeze(2).expand_as(feat)\n        feat = feat[mask]\n        feat = feat.view(-1, dim)\n    return feat\n\n\ndef _transpose_and_gather_feat(feat, ind):\n    feat = feat.permute(0, 2, 3, 1).contiguous()\n    feat = feat.view(feat.size(0), -1, 
feat.size(3))\n    feat = _gather_feat(feat, ind)\n    return feat\n\n\ndef _topk(scores, K=40):\n    batch, num_class, height, width = scores.size()\n\n    topk_scores, topk_inds = torch.topk(scores.flatten(2, 3), K)\n\n    topk_inds = topk_inds % (height * width)\n    topk_ys = (topk_inds // width).float()\n    topk_xs = (topk_inds % width).int().float()\n\n    topk_score, topk_ind = torch.topk(topk_scores.view(batch, -1), K)\n    topk_classes = (topk_ind // K).int()\n    topk_inds = _gather_feat(topk_inds.view(batch, -1, 1), topk_ind).view(batch, K)\n    topk_ys = _gather_feat(topk_ys.view(batch, -1, 1), topk_ind).view(batch, K)\n    topk_xs = _gather_feat(topk_xs.view(batch, -1, 1), topk_ind).view(batch, K)\n\n    return topk_score, topk_inds, topk_classes, topk_ys, topk_xs\n\n\ndef decode_bbox_from_heatmap(heatmap, rot_cos, rot_sin, center, center_z, dim,\n                             point_cloud_range=None, voxel_size=None, feature_map_stride=None, vel=None, K=100,\n                             circle_nms=False, score_thresh=None, post_center_limit_range=None):\n    batch_size, num_class, _, _ = heatmap.size()\n\n    if circle_nms:\n        # TODO: not checked yet\n        assert False, 'not checked yet'\n        heatmap = _nms(heatmap)\n\n    scores, inds, class_ids, ys, xs = _topk(heatmap, K=K)\n    center = _transpose_and_gather_feat(center, inds).view(batch_size, K, 2)\n    rot_sin = _transpose_and_gather_feat(rot_sin, inds).view(batch_size, K, 1)\n    rot_cos = _transpose_and_gather_feat(rot_cos, inds).view(batch_size, K, 1)\n    center_z = _transpose_and_gather_feat(center_z, inds).view(batch_size, K, 1)\n    dim = _transpose_and_gather_feat(dim, inds).view(batch_size, K, 3)\n\n    angle = torch.atan2(rot_sin, rot_cos)\n    xs = xs.view(batch_size, K, 1) + center[:, :, 0:1]\n    ys = ys.view(batch_size, K, 1) + center[:, :, 1:2]\n\n    xs = xs * feature_map_stride * voxel_size[0] + point_cloud_range[0]\n    ys = ys * feature_map_stride * voxel_size[1] + 
point_cloud_range[1]\n\n    box_part_list = [xs, ys, center_z, dim, angle]\n    if vel is not None:\n        vel = _transpose_and_gather_feat(vel, inds).view(batch_size, K, 2)\n        box_part_list.append(vel)\n\n    final_box_preds = torch.cat((box_part_list), dim=-1)\n    final_scores = scores.view(batch_size, K)\n    final_class_ids = class_ids.view(batch_size, K)\n\n    assert post_center_limit_range is not None\n    mask = (final_box_preds[..., :3] >= post_center_limit_range[:3]).all(2)\n    mask &= (final_box_preds[..., :3] <= post_center_limit_range[3:]).all(2)\n\n    if score_thresh is not None:\n        mask &= (final_scores > score_thresh)\n\n    ret_pred_dicts = []\n    for k in range(batch_size):\n        cur_mask = mask[k]\n        cur_boxes = final_box_preds[k, cur_mask]\n        cur_scores = final_scores[k, cur_mask]\n        cur_labels = final_class_ids[k, cur_mask]\n\n        if circle_nms:\n            assert False, 'not checked yet'\n            centers = cur_boxes[:, [0, 1]]\n            boxes = torch.cat((centers, scores.view(-1, 1)), dim=1)\n            keep = _circle_nms(boxes, min_radius=min_radius, post_max_size=nms_post_max_size)\n\n            cur_boxes = cur_boxes[keep]\n            cur_scores = cur_scores[keep]\n            cur_labels = cur_labels[keep]\n\n        ret_pred_dicts.append({\n            'pred_boxes': cur_boxes,\n            'pred_scores': cur_scores,\n            'pred_labels': cur_labels\n        })\n    return ret_pred_dicts\n"
  },
  {
    "path": "pcdet/models/model_utils/ctrans.py",
    "content": "import torch.nn as nn\nimport pdb\nimport torch\nimport numpy as np\nfrom numpy import *\nimport torch.nn.functional as F\nfrom typing import Optional, List\nfrom torch import Tensor\nimport copy\nfrom copy import deepcopy\nfrom ...utils import common_utils, spconv_utils\n\nclass PositionalEmbedding(nn.Module):\n    def __init__(self, demb=256):\n        super(PositionalEmbedding, self).__init__()\n\n        self.demb = demb\n\n        inv_freq = 1 / (10000 ** (torch.arange(0.0, demb, 2.0) / demb))\n        self.register_buffer('inv_freq', inv_freq)\n\n    # pos_seq =  pos_seq = torch.arange(seq_len-1, -1, -1.0)\n    def forward(self, pos_seq, batch_size=2):\n        sinusoid_inp = torch.ger(pos_seq, self.inv_freq)\n        pos_emb = torch.cat([sinusoid_inp.sin(), sinusoid_inp.cos()], dim=-1)\n\n        if batch_size is not None:\n            return pos_emb[:, None, :].expand(-1, batch_size, -1)\n        else:\n            return pos_emb[:, None, :]\n\nclass CrossAttention(nn.Module):\n\n    def __init__(self, hidden_dim, pos = True, head = 4):\n        super(CrossAttention, self).__init__()\n\n        self.hidden_dim = hidden_dim\n        self.pos_dim = 8\n        self.pos = pos\n\n        if self.pos:\n            self.pos_en = PositionalEmbedding(self.pos_dim)\n\n            self.Q_linear = nn.Linear(hidden_dim+self.pos_dim, hidden_dim, bias=False)\n            self.K_linear = nn.Linear(hidden_dim+self.pos_dim, hidden_dim, bias=False)\n            self.V_linear = nn.Linear(hidden_dim+self.pos_dim, hidden_dim, bias=False)\n        else:\n\n            self.Q_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n            self.K_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n            self.V_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n\n        self.att = nn.MultiheadAttention(hidden_dim, head)\n\n\n    def forward(self, inputs, Q_in): # N,B,C\n\n        batch_size = inputs.shape[1]\n        seq_len = inputs.shape[0]\n\n  
      if self.pos:\n            pos_input = torch.from_numpy(np.arange(seq_len)+1).cuda()\n            pos_input = self.pos_en(pos_input, batch_size)\n            inputs_pos = torch.cat([inputs, pos_input], -1)\n            pos_Q = torch.from_numpy(np.array([seq_len])).cuda()\n            pos_Q = self.pos_en(pos_Q, batch_size)\n            Q_in_pos = torch.cat([Q_in, pos_Q], -1)\n        else:\n            inputs_pos = inputs\n            Q_in_pos = Q_in\n\n        Q = self.Q_linear(Q_in_pos)\n        K = self.K_linear(inputs_pos)\n        V = self.V_linear(inputs_pos)\n\n        out = self.att(Q, K, V)\n\n        return out[0]\n\n\nclass Attention_Layer(nn.Module):\n\n    def __init__(self, hidden_dim):\n        super(Attention_Layer, self).__init__()\n\n        self.hidden_dim = hidden_dim\n\n        self.Q_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n        self.K_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n        self.V_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n\n    def forward(self, inputs): # B,K,N\n\n\n        Q = self.Q_linear(inputs)\n        K = self.K_linear(inputs).permute(0, 2, 1)\n        V = self.V_linear(inputs)\n\n        alpha = torch.matmul(Q, K)\n\n        alpha = F.softmax(alpha, dim=2)\n\n        out = torch.matmul(alpha, V)\n\n        out = torch.mean(out, -2)\n\n        return out\n\n\ndef gen_sample_grid(rois, grid_size=7, grid_offsets=(0, 0), spatial_scale=1.):\n    faked_features = rois.new_ones((grid_size, grid_size))\n    N = rois.shape[0]\n    dense_idx = faked_features.nonzero()  # (N, 2) [x_idx, y_idx]\n    dense_idx = dense_idx.repeat(N, 1, 1).float()  # (B, 7 * 7, 2)\n\n    local_roi_size = rois.view(N, -1)[:, 3:5]\n    local_roi_grid_points = (dense_idx ) / (grid_size-1) * local_roi_size.unsqueeze(dim=1) \\\n                      - (local_roi_size.unsqueeze(dim=1) / 2)  # (B, 7 * 7, 2)\n\n    ones = torch.ones_like(local_roi_grid_points[..., 0:1])\n    local_roi_grid_points = 
torch.cat([local_roi_grid_points, ones], -1)\n\n    global_roi_grid_points = common_utils.rotate_points_along_z(\n        local_roi_grid_points.clone(), rois[:, 6]\n    ).squeeze(dim=1)\n    global_center = rois[:, 0:3].clone()\n    global_roi_grid_points += global_center.unsqueeze(dim=1)\n\n    x = global_roi_grid_points[..., 0:1]\n    y = global_roi_grid_points[..., 1:2]\n\n    x = (x.permute(1, 2, 0).contiguous() + grid_offsets[0]) * spatial_scale\n    y = (y.permute(1, 2, 0).contiguous() + grid_offsets[1]) * spatial_scale\n\n    return x.view(grid_size**2, -1), y.view(grid_size**2, -1)\n\ndef bilinear_interpolate_torch_gridsample(image, samples_x, samples_y):\n    C, H, W = image.shape\n    image = image.unsqueeze(1)  # change to:  C x 1 x H x W        C,K,1,2   C,K,1,1\n\n    samples_x = samples_x.unsqueeze(2)\n    samples_x = samples_x.unsqueeze(3)# 49,K,1,1\n    samples_y = samples_y.unsqueeze(2)\n    samples_y = samples_y.unsqueeze(3)\n\n    samples = torch.cat([samples_x, samples_y], 3)\n    samples[:, :, :, 0] = (samples[:, :, :, 0] / W)  # normalize to between  0 and 1\n\n    samples[:, :, :, 1] = (samples[:, :, :, 1] / H)  # normalize to between  0 and 1\n    samples = samples * 2 - 1  # normalize to between -1 and 1  # 49,K,1,2\n\n    #B,C,H,W\n    #B,H,W,2\n    #B,C,H,W\n\n    return torch.nn.functional.grid_sample(image, samples, align_corners=False)\n\n\n\nclass MLP(nn.Module):\n    \"\"\" Very simple multi-layer perceptron (also called FFN)\"\"\"\n\n    def __init__(self, input_dim, hidden_dim, output_dim, num_layers):\n        super().__init__()\n        self.num_layers = num_layers\n        h = [hidden_dim] * (num_layers - 1)\n        self.layers = nn.ModuleList(nn.Linear(n, k) for n, k in zip([input_dim] + h, h + [output_dim]))\n        self.init_weights()\n\n\n    def forward(self, x):\n        for i, layer in enumerate(self.layers):\n            x = F.relu(layer(x)) if i < self.num_layers - 1 else layer(x)\n        return x\n\n    def 
init_weights(self):\n        init_func = nn.init.xavier_normal_\n        for module_list in [self.layers]:\n            for m in module_list.modules():\n                if isinstance(m, nn.Linear):\n                    init_func(m.weight)\n                    if m.bias is not None:\n                        nn.init.constant_(m.bias, 0)\n\ndef MLP_v2(channels: list, do_bn=True):\n    \"\"\" Multi-layer perceptron \"\"\"\n    n = len(channels)\n    layers = []\n    for i in range(1, n):\n        layers.append(\n            nn.Conv1d(channels[i - 1], channels[i], kernel_size=1, bias=True))\n        if i < (n-1):\n            if do_bn:\n                layers.append(nn.BatchNorm1d(channels[i]))\n            layers.append(nn.ReLU())\n    return nn.Sequential(*layers)\n\nclass Transformer(nn.Module):\n\n    def __init__(self, d_model=512, nhead=8, num_encoder_layers=6,\n                 num_decoder_layers=6, dim_feedforward=2048, dropout=0.1,\n                 activation=\"relu\", normalize_before=False,\n                 return_intermediate_dec=False):\n        super().__init__()\n\n        encoder_layer = TransformerEncoderLayer(d_model, nhead, dim_feedforward,\n                                                dropout, activation, normalize_before)\n        encoder_norm = nn.LayerNorm(d_model) if normalize_before else None\n        self.encoder = TransformerEncoder(encoder_layer, num_encoder_layers, encoder_norm)\n\n        decoder_layer = TransformerDecoderLayer(d_model, nhead, dim_feedforward,\n                                                dropout, activation, normalize_before)\n        decoder_norm = nn.LayerNorm(d_model)\n        self.decoder = TransformerDecoder(decoder_layer, num_decoder_layers, decoder_norm,\n                                          return_intermediate=return_intermediate_dec)\n\n        self._reset_parameters()\n\n        self.d_model = d_model\n        self.nhead = nhead\n\n    def _reset_parameters(self):\n        for p in 
self.parameters():\n            if p.dim() > 1:\n                nn.init.xavier_uniform_(p)\n\n    def forward(self, src, query_embed, pos_embed):\n        bs, n, c = src.shape\n        src = src.permute(1, 0, 2)\n        pos_embed = pos_embed.permute(1, 0, 2)\n        query_embed = query_embed.unsqueeze(1).repeat(1, bs, 1)\n        tgt = torch.zeros_like(query_embed)\n        memory = self.encoder(src, src_key_padding_mask=None, pos=pos_embed)\n        hs = self.decoder(tgt, memory, memory_key_padding_mask=None,\n                          pos=pos_embed, query_pos=query_embed)\n        return hs.transpose(1, 2), memory.permute(1, 2, 0).view(bs, c, n)\n\n\nclass TransformerEncoder(nn.Module):\n\n    def __init__(self, encoder_layer, num_layers, norm=None):\n        super().__init__()\n        self.layers = _get_clones(encoder_layer, num_layers)\n        self.num_layers = num_layers\n        self.norm = norm\n\n    def forward(self, src,\n                mask: Optional[Tensor] = None,\n                src_key_padding_mask: Optional[Tensor] = None,\n                pos: Optional[Tensor] = None):\n        output = src\n\n        for layer in self.layers:\n            output = layer(output, src_mask=mask,\n                           src_key_padding_mask=src_key_padding_mask, pos=pos)\n\n        if self.norm is not None:\n            output = self.norm(output)\n\n        return output\n\n\nclass TransformerDecoder(nn.Module):\n\n    def __init__(self, decoder_layer, num_layers, norm=None, return_intermediate=False):\n        super().__init__()\n        self.layers = _get_clones(decoder_layer, num_layers)\n        self.num_layers = num_layers\n        self.norm = norm\n        self.return_intermediate = return_intermediate\n\n    def forward(self, tgt, memory,\n                tgt_mask: Optional[Tensor] = None,\n                memory_mask: Optional[Tensor] = None,\n                tgt_key_padding_mask: Optional[Tensor] = None,\n                memory_key_padding_mask: 
Optional[Tensor] = None,\n                pos: Optional[Tensor] = None,\n                query_pos: Optional[Tensor] = None):\n        output = tgt\n\n        intermediate = []\n\n        for layer in self.layers:\n            output = layer(output, memory, tgt_mask=tgt_mask,\n                           memory_mask=memory_mask,\n                           tgt_key_padding_mask=tgt_key_padding_mask,\n                           memory_key_padding_mask=memory_key_padding_mask,\n                           pos=pos, query_pos=query_pos)\n            if self.return_intermediate:\n                intermediate.append(self.norm(output))\n\n        if self.norm is not None:\n            output = self.norm(output)\n            if self.return_intermediate:\n                intermediate.pop()\n                intermediate.append(output)\n\n        if self.return_intermediate:\n            return torch.stack(intermediate)\n\n        return output.unsqueeze(0)\n\n\nclass TransformerEncoderLayer(nn.Module):\n\n    def __init__(self, d_model, nhead, dim_feedforward=2048, dropout=0.1,\n                 activation=\"relu\", normalize_before=False):\n        super().__init__()\n        self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout)\n        # Implementation of Feedforward model\n        self.linear1 = nn.Linear(d_model, dim_feedforward)\n        self.dropout = nn.Dropout(dropout)\n        self.linear2 = nn.Linear(dim_feedforward, d_model)\n\n        self.norm1 = nn.LayerNorm(d_model)\n        self.norm2 = nn.LayerNorm(d_model)\n        self.dropout1 = nn.Dropout(dropout)\n        self.dropout2 = nn.Dropout(dropout)\n\n        self.activation = _get_activation_fn(activation)\n        self.normalize_before = normalize_before\n\n    def with_pos_embed(self, tensor, pos: Optional[Tensor]):\n        return tensor if pos is None else tensor + pos\n\n    def forward_post(self,\n                     src,\n                     src_mask: Optional[Tensor] = None,\n         
            src_key_padding_mask: Optional[Tensor] = None,\n                     pos: Optional[Tensor] = None):\n        q = k = self.with_pos_embed(src, pos)\n        src2 = self.self_attn(q, k, value=src, attn_mask=src_mask,\n                              key_padding_mask=src_key_padding_mask)[0]\n        src = src + self.dropout1(src2)\n        src = self.norm1(src)\n        src2 = self.linear2(self.dropout(self.activation(self.linear1(src))))\n        src = src + self.dropout2(src2)\n        src = self.norm2(src)\n        return src\n\n    def forward_pre(self, src,\n                    src_mask: Optional[Tensor] = None,\n                    src_key_padding_mask: Optional[Tensor] = None,\n                    pos: Optional[Tensor] = None):\n        src2 = self.norm1(src)\n        q = k = self.with_pos_embed(src2, pos)\n        src2 = self.self_attn(q, k, value=src2, attn_mask=src_mask,\n                              key_padding_mask=src_key_padding_mask)[0]\n        src = src + self.dropout1(src2)\n        src2 = self.norm2(src)\n        src2 = self.linear2(self.dropout(self.activation(self.linear1(src2))))\n        src = src + self.dropout2(src2)\n        return src\n\n    def forward(self, src,\n                src_mask: Optional[Tensor] = None,\n                src_key_padding_mask: Optional[Tensor] = None,\n                pos: Optional[Tensor] = None):\n        if self.normalize_before:\n            return self.forward_pre(src, src_mask, src_key_padding_mask, pos)\n        return self.forward_post(src, src_mask, src_key_padding_mask, pos)\n\n\ndef attention(query, key,  value):\n    dim = query.shape[1]\n    scores_1 = torch.einsum('bdhn,bdhm->bhnm', query, key) / dim**.5\n    scores_2 = torch.einsum('abcd, aced->abcd', key, scores_1)\n    prob = torch.nn.functional.softmax(scores_2, dim=-1)\n    output = torch.einsum('bnhm,bdhm->bdhn', prob, value)\n    return output, prob\n\nclass MultiHeadedAttention(nn.Module):\n    \"\"\" Multi-head attention to 
increase model expressivity \"\"\"\n    def __init__(self, num_heads: int, d_model: int):\n        super().__init__()\n        assert d_model % num_heads == 0\n        self.dim = d_model // num_heads\n        self.num_heads = num_heads\n        merge = nn.Conv1d(d_model, d_model, kernel_size=1)\n        self.proj = nn.ModuleList([deepcopy(merge) for _ in range(3)])\n        self.down_mlp = MLP(input_dim = self.dim, hidden_dim = 32, output_dim = 1, num_layers = 1)\n\n\n    def forward(self, query, key, value):\n        batch_dim = query.size(0)\n        query, key, value = [l(x).view(batch_dim, self.dim, self.num_heads, -1)\n                             for l, x in zip(self.proj, (query, key, value))]\n        x, prob = attention(query, key, value)\n        x = self.down_mlp(x)\n        return x.contiguous().view(batch_dim, self.dim*self.num_heads, -1)\n\n\nclass TransformerDecoderLayer(nn.Module):\n\n    def __init__(self, d_model, nhead, dim_feedforward=2048, dropout=0.1,\n                 activation=\"relu\", normalize_before=False):\n        super().__init__()\n        self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout)\n        self.multihead_attn = MultiHeadedAttention(nhead, d_model)\n\n        # Implementation of Feedforward model\n        self.linear1 = nn.Linear(d_model, dim_feedforward)\n        self.dropout = nn.Dropout(dropout)\n        self.linear2 = nn.Linear(dim_feedforward, d_model)\n\n        self.norm1 = nn.LayerNorm(d_model)\n        self.norm2 = nn.LayerNorm(d_model)\n        self.norm3 = nn.LayerNorm(d_model)\n        self.dropout1 = nn.Dropout(dropout)\n        self.dropout2 = nn.Dropout(dropout)\n        self.dropout3 = nn.Dropout(dropout)\n\n        self.activation = _get_activation_fn(activation)\n        self.normalize_before = normalize_before\n\n\n    def with_pos_embed(self, tensor, pos: Optional[Tensor]):\n        return tensor if pos is None else tensor + pos\n\n    def 
forward_post(self, tgt, memory,\n                     tgt_mask: Optional[Tensor] = None,\n                     memory_mask: Optional[Tensor] = None,\n                     tgt_key_padding_mask: Optional[Tensor] = None,\n                     memory_key_padding_mask: Optional[Tensor] = None,\n                     pos: Optional[Tensor] = None,\n                     query_pos: Optional[Tensor] = None):\n\n        q = k = self.with_pos_embed(tgt, query_pos)\n        tgt2 = self.self_attn(q, k, value=tgt, attn_mask=tgt_mask,\n                            key_padding_mask=tgt_key_padding_mask)[0]\n        tgt = tgt + self.dropout1(tgt2)\n        tgt = self.norm1(tgt)\n        tgt2 = self.multihead_attn(query=self.with_pos_embed(tgt, query_pos).permute(1,2,0),\n                                key=self.with_pos_embed(memory, pos).permute(1,2,0),\n                                value=memory.permute(1,2,0))\n        tgt2 = tgt2.permute(2,0,1)\n        tgt = tgt + self.dropout2(tgt2)\n        tgt = self.norm2(tgt)\n        tgt2 = self.linear2(self.dropout(self.activation(self.linear1(tgt))))\n        tgt = tgt + self.dropout3(tgt2)\n        tgt = self.norm3(tgt)\n        return tgt\n\n    def forward_pre(self, tgt, memory,\n                    tgt_mask: Optional[Tensor] = None,\n                    memory_mask: Optional[Tensor] = None,\n                    tgt_key_padding_mask: Optional[Tensor] = None,\n                    memory_key_padding_mask: Optional[Tensor] = None,\n                    pos: Optional[Tensor] = None,\n                    query_pos: Optional[Tensor] = None):\n        tgt2 = self.norm1(tgt)\n        q = k = self.with_pos_embed(tgt2, query_pos)\n        tgt2 = self.self_attn(q, k, value=tgt2, attn_mask=tgt_mask,\n                              key_padding_mask=tgt_key_padding_mask)[0]\n        tgt = tgt + self.dropout1(tgt2)\n        tgt2 = self.norm2(tgt)\n        tgt2 = self.multihead_attn(query=self.with_pos_embed(tgt2, query_pos),\n                         
          key=self.with_pos_embed(memory, pos),\n                                   value=memory, attn_mask=memory_mask,\n                                   key_padding_mask=memory_key_padding_mask)[0]\n        tgt = tgt + self.dropout2(tgt2)\n        tgt2 = self.norm3(tgt)\n        tgt2 = self.linear2(self.dropout(self.activation(self.linear1(tgt2))))\n        tgt = tgt + self.dropout3(tgt2)\n        return tgt\n\n    def forward(self, tgt, memory,\n                tgt_mask: Optional[Tensor] = None,\n                memory_mask: Optional[Tensor] = None,\n                tgt_key_padding_mask: Optional[Tensor] = None,\n                memory_key_padding_mask: Optional[Tensor] = None,\n                pos: Optional[Tensor] = None,\n                query_pos: Optional[Tensor] = None):\n        if self.normalize_before:\n            return self.forward_pre(tgt, memory, tgt_mask, memory_mask,\n                                    tgt_key_padding_mask, memory_key_padding_mask, pos, query_pos)\n        return self.forward_post(tgt, memory, tgt_mask, memory_mask,\n                                 tgt_key_padding_mask, memory_key_padding_mask, pos, query_pos)\n\n\ndef _get_clones(module, N):\n    return nn.ModuleList([copy.deepcopy(module) for i in range(N)])\n\n\ndef build_transformer(args):\n    return Transformer(\n        d_model=args.hidden_dim,\n        dropout=args.dropout,\n        nhead=args.nheads,\n        dim_feedforward=args.dim_feedforward,\n        num_encoder_layers=args.enc_layers,\n        num_decoder_layers=args.dec_layers,\n        normalize_before=args.pre_norm,\n        return_intermediate_dec=True,\n    )\n\n\ndef _get_activation_fn(activation):\n    \"\"\"Return an activation function given a string\"\"\"\n    if activation == \"relu\":\n        return F.relu\n    if activation == \"gelu\":\n        return F.gelu\n    if activation == \"glu\":\n        return F.glu\n    raise RuntimeError(f\"activation should be relu/gelu/glu, not {activation}.\")\n"
  },
  {
    "path": "pcdet/models/model_utils/model_nms_utils.py",
    "content": "import torch\nimport numpy as np\nfrom ...ops.iou3d_nms import iou3d_nms_utils\n\ndef limit(ang):\n    ang = ang % (2 * np.pi)\n\n    ang[ang > np.pi] = ang[ang > np.pi] - 2 * np.pi\n\n    ang[ang < -np.pi] = ang[ang < -np.pi] + 2 * np.pi\n\n    return ang\n\ndef compute_WBF(det_names, det_scores, det_boxes, iou_thresh=0.85, iou_thresh2=0.03, type='mean'):\n    if len(det_names) == 0:\n        return det_names, det_scores, det_boxes\n    cluster_id = -1\n    cluster_box_dict = {}\n    cluster_score_dict = {}\n\n    cluster_merged_dict = {}\n    cluster_name_dict = {}\n    '''\n    det_boxes[:, 6] = common_utils.limit_period(\n        det_boxes[:, 6], offset=0.5, period=2 * np.pi\n    )\n    '''\n    det_boxes[:, 6] = limit(det_boxes[:, 6])\n\n    for i, box in enumerate(det_boxes):\n\n        score = det_scores[i]\n        name = det_names[i]\n        if i == 0:\n            cluster_id += 1\n            cluster_box_dict[cluster_id] = [box]\n            cluster_score_dict[cluster_id] = [score]\n            cluster_merged_dict[cluster_id] = box\n            cluster_name_dict[cluster_id] = name\n            continue\n\n        valid_clusters = []\n        keys = list(cluster_merged_dict)\n        keys.sort()\n        for key in keys:\n            valid_clusters.append(cluster_merged_dict[key])\n\n        valid_clusters = np.array(valid_clusters).reshape((-1, 7))\n        ious = iou3d_nms_utils.boxes_bev_iou_cpu(np.array([box[:7]]), valid_clusters)\n\n        argmax = np.argmax(ious, -1)[0]\n        max_iou = np.max(ious, -1)[0]\n\n        if max_iou >= iou_thresh:\n            cluster_box_dict[argmax].append(box)\n            cluster_score_dict[argmax].append(score)\n        elif iou_thresh2<=max_iou<iou_thresh:\n            continue\n        else:\n            cluster_id += 1\n            cluster_box_dict[cluster_id] = [box]\n            cluster_score_dict[cluster_id] = [score]\n            cluster_merged_dict[cluster_id] = box\n            
cluster_name_dict[cluster_id] = name\n\n    out_boxes = []\n    out_scores = []\n    out_name = []\n    for i in cluster_merged_dict.keys():\n        if type == 'mean':\n            score_sum = 0\n            box_sum = np.zeros(shape=(7,))\n\n            angles = []\n\n            for j, sub_score in enumerate(cluster_score_dict[i]):\n                box_sum += cluster_box_dict[i][j]\n                score_sum += sub_score\n                angles.append(cluster_box_dict[i][j][6])\n            box_sum /= len(cluster_score_dict[i])\n            score_sum /= len(cluster_score_dict[i])\n\n            cluster_merged_dict[i][:6] = box_sum[:6]\n\n            angles = np.array(angles)\n            angles = limit(angles)\n            res = angles - cluster_merged_dict[i][6]\n            res = limit(res)\n            res = res[np.abs(res) < 1.5]\n            res = res.mean()\n            b = cluster_merged_dict[i][6] + res\n            cluster_merged_dict[i][6] = b\n\n            out_scores.append(score_sum)\n            out_boxes.append(cluster_merged_dict[i])\n            out_name.append(cluster_name_dict[i])\n        elif type == 'max':\n            out_scores.append(np.max(cluster_score_dict[i]))\n            out_boxes.append(cluster_merged_dict[i])\n            out_name.append(cluster_name_dict[i])\n\n    out_boxes = np.array(out_boxes)\n    out_scores = np.array(out_scores)\n    out_names = np.array(out_name)\n\n    return out_names, out_scores, out_boxes\n\ndef class_agnostic_nms(box_scores, box_preds, nms_config, score_thresh=None):\n    src_box_scores = box_scores\n    if score_thresh is not None:\n        scores_mask = (box_scores >= score_thresh)\n        box_scores = box_scores[scores_mask]\n        box_preds = box_preds[scores_mask]\n\n    selected = []\n    if box_scores.shape[0] > 0:\n        box_scores_nms, indices = torch.topk(box_scores, k=min(nms_config.NMS_PRE_MAXSIZE, box_scores.shape[0]))\n        boxes_for_nms = box_preds[indices]\n        keep_idx, 
selected_scores = getattr(iou3d_nms_utils, nms_config.NMS_TYPE)(\n                boxes_for_nms[:, 0:7], box_scores_nms, nms_config.NMS_THRESH, **nms_config\n        )\n        selected = indices[keep_idx[:nms_config.NMS_POST_MAXSIZE]]\n\n    if score_thresh is not None:\n        original_idxs = scores_mask.nonzero().view(-1)\n        selected = original_idxs[selected]\n    return selected, src_box_scores[selected]\n\n\ndef multi_classes_nms(cls_scores, box_preds, nms_config, score_thresh=None):\n    \"\"\"\n    Args:\n        cls_scores: (N, num_class)\n        box_preds: (N, 7 + C)\n        nms_config:\n        score_thresh:\n\n    Returns:\n        pred_scores: (M)\n        pred_labels: (M), zero-based class indices\n        pred_boxes: (M, 7 + C)\n\n    \"\"\"\n    pred_scores, pred_labels, pred_boxes = [], [], []\n    for k in range(cls_scores.shape[1]):\n        if score_thresh is not None:\n            scores_mask = (cls_scores[:, k] >= score_thresh)\n            box_scores = cls_scores[scores_mask, k]\n            cur_box_preds = box_preds[scores_mask]\n        else:\n            box_scores = cls_scores[:, k]\n            # Without a score threshold, every box is a candidate for this class.\n            cur_box_preds = box_preds\n\n        selected = []\n        if box_scores.shape[0] > 0:\n            box_scores_nms, indices = torch.topk(box_scores, k=min(nms_config.NMS_PRE_MAXSIZE, box_scores.shape[0]))\n            boxes_for_nms = cur_box_preds[indices]\n            keep_idx, selected_scores = getattr(iou3d_nms_utils, nms_config.NMS_TYPE)(\n                    boxes_for_nms[:, 0:7], box_scores_nms, nms_config.NMS_THRESH, **nms_config\n            )\n            selected = indices[keep_idx[:nms_config.NMS_POST_MAXSIZE]]\n\n        pred_scores.append(box_scores[selected])\n        pred_labels.append(box_scores.new_ones(len(selected)).long() * k)\n        pred_boxes.append(cur_box_preds[selected])\n\n    pred_scores = torch.cat(pred_scores, dim=0)\n    pred_labels = torch.cat(pred_labels, dim=0)\n    pred_boxes = torch.cat(pred_boxes, dim=0)\n\n    return pred_scores, pred_labels, pred_boxes\n"
  },
  {
    "path": "pcdet/models/roi_heads/__init__.py",
    "content": "from .roi_head_template import RoIHeadTemplate\n\nfrom .ted_head import TEDMHead, TEDSHead\n\n__all__ = {\n    'RoIHeadTemplate': RoIHeadTemplate,\n    'TEDSHead': TEDSHead,\n    'TEDMHead': TEDMHead\n}\n"
  },
  {
    "path": "pcdet/models/roi_heads/roi_head_template.py",
    "content": "import numpy as np\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom ...utils import box_coder_utils, common_utils, loss_utils, box_utils\nfrom ..model_utils.model_nms_utils import class_agnostic_nms\nfrom .target_assigner.proposal_target_layer import ProposalTargetLayer\nfrom ...utils.bbloss import bb_loss\nimport time\nimport copy\n\nclass RoIHeadTemplate(nn.Module):\n    def __init__(self, num_class, model_cfg):\n        super().__init__()\n        self.model_cfg = model_cfg\n        self.num_class = num_class\n        self.box_coder = getattr(box_coder_utils, self.model_cfg.TARGET_CONFIG.BOX_CODER)(\n            **self.model_cfg.TARGET_CONFIG.get('BOX_CODER_CONFIG', {})\n        )\n        self.proposal_target_layers = []\n        for i in range(6):\n            this_cfg = copy.deepcopy(self.model_cfg.TARGET_CONFIG)\n            proposal_target_layer = ProposalTargetLayer(roi_sampler_cfg=this_cfg)\n            self.proposal_target_layers.append(proposal_target_layer)\n\n        self.build_losses(self.model_cfg.LOSS_CONFIG)\n        self.forward_ret_dict = {}\n\n    def build_losses(self, losses_cfg):\n        self.add_module(\n            'reg_loss_func',\n            loss_utils.WeightedSmoothL1Loss(code_weights=losses_cfg.LOSS_WEIGHTS['code_weights'])\n        )\n\n    def make_fc_layers(self, input_channels, output_channels, fc_list):\n        fc_layers = []\n        pre_channel = input_channels\n        for k in range(0, fc_list.__len__()):\n            fc_layers.extend([\n                nn.Conv1d(pre_channel, fc_list[k], kernel_size=1, bias=False),\n                nn.BatchNorm1d(fc_list[k]),\n                nn.ReLU()\n            ])\n            pre_channel = fc_list[k]\n            if self.model_cfg.DP_RATIO >= 0 and k == 0:\n                fc_layers.append(nn.Dropout(self.model_cfg.DP_RATIO))\n        fc_layers.append(nn.Conv1d(pre_channel, output_channels, kernel_size=1, bias=True))\n        fc_layers = 
nn.Sequential(*fc_layers)\n        return fc_layers\n\n    @torch.no_grad()\n    def proposal_layer(self, batch_dict, nms_config):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                batch_cls_preds: (B, num_boxes, num_classes | 1) or (N1+N2+..., num_classes | 1)\n                batch_box_preds: (B, num_boxes, 7+C) or (N1+N2+..., 7+C)\n                cls_preds_normalized: indicate whether batch_cls_preds is normalized\n                batch_index: optional (N1+N2+...)\n            nms_config:\n\n        Returns:\n            batch_dict:\n                rois: (B, num_rois, 7+C)\n                roi_scores: (B, num_rois)\n                roi_labels: (B, num_rois)\n\n        \"\"\"\n\n        if batch_dict.get('rois', None) is not None:\n            batch_dict['cls_preds_normalized'] = False\n            return batch_dict\n\n        batch_size = batch_dict['batch_size']\n        batch_box_preds = batch_dict['batch_box_preds']\n        batch_cls_preds = batch_dict['batch_cls_preds']\n\n        rois = batch_box_preds.new_zeros((batch_size, nms_config.NMS_POST_MAXSIZE, batch_box_preds.shape[-1]))\n        roi_scores = batch_box_preds.new_zeros((batch_size, nms_config.NMS_POST_MAXSIZE))\n        roi_labels = batch_box_preds.new_zeros((batch_size, nms_config.NMS_POST_MAXSIZE), dtype=torch.long)\n\n\n        for index in range(batch_size):\n            if batch_dict.get('batch_index', None) is not None:\n                assert batch_cls_preds.shape.__len__() == 2\n                batch_mask = (batch_dict['batch_index'] == index)\n            else:\n                assert batch_dict['batch_cls_preds'].shape.__len__() == 3\n                batch_mask = index\n            box_preds = batch_box_preds[batch_mask]\n            cls_preds = batch_cls_preds[batch_mask]\n\n            cur_roi_scores, cur_roi_labels = torch.max(cls_preds, dim=1)\n\n            if nms_config.MULTI_CLASSES_NMS:\n                raise 
NotImplementedError\n            else:\n                selected, selected_scores = class_agnostic_nms(\n                    box_scores=cur_roi_scores, box_preds=box_preds, nms_config=nms_config\n                )\n\n            rois[index, :len(selected), :] = box_preds[selected]\n\n            roi_scores[index, :len(selected)] = cur_roi_scores[selected]\n            roi_labels[index, :len(selected)] = cur_roi_labels[selected]\n\n        batch_dict['rois'] = rois\n        batch_dict['roi_scores'] = roi_scores\n\n        batch_dict['roi_labels'] = roi_labels + 1\n        batch_dict['has_class_labels'] = True if batch_cls_preds.shape[-1] > 1 else False\n        batch_dict.pop('batch_index', None)\n        return batch_dict\n\n    def assign_targets(self, batch_dict, rot_num_id, enable_dif = False):\n        batch_size = batch_dict['batch_size']\n        with torch.no_grad():\n            if rot_num_id == 0:\n                s_str = ''\n            else:\n                s_str = str(rot_num_id)\n            if enable_dif:\n                targets_dict = self.proposal_target_layers[rot_num_id].forward(batch_dict, s_str)\n            else:\n                targets_dict = self.proposal_target_layers[rot_num_id].forward(batch_dict, '')\n\n        rois = targets_dict['rois']  # (B, N, 7 + C)\n        gt_of_rois = targets_dict['gt_of_rois']  # (B, N, 7 + C + 1)\n\n        targets_dict['gt_of_rois_src'] = gt_of_rois.clone().detach()\n\n        # canonical transformation\n        roi_center = rois[:, :, 0:3]\n        roi_ry = rois[:, :, 6] % (2 * np.pi)\n        gt_of_rois[:, :, 0:3] = gt_of_rois[:, :, 0:3] - roi_center\n        gt_of_rois[:, :, 6] = gt_of_rois[:, :, 6] - roi_ry\n\n        # transfer LiDAR coords to local coords\n        gt_of_rois = common_utils.rotate_points_along_z(\n            points=gt_of_rois.view(-1, 1, gt_of_rois.shape[-1]), angle=-roi_ry.view(-1)\n        ).view(batch_size, -1, gt_of_rois.shape[-1])\n\n        # flip orientation if rois have 
opposite orientation\n        heading_label = gt_of_rois[:, :, 6] % (2 * np.pi)  # 0 ~ 2pi\n        opposite_flag = (heading_label > np.pi * 0.5) & (heading_label < np.pi * 1.5)\n        heading_label[opposite_flag] = (heading_label[opposite_flag] + np.pi) % (2 * np.pi)  # (0 ~ pi/2, 3pi/2 ~ 2pi)\n        flag = heading_label > np.pi\n        heading_label[flag] = heading_label[flag] - np.pi * 2  # (-pi/2, pi/2)\n        heading_label = torch.clamp(heading_label, min=-np.pi / 2, max=np.pi / 2)\n\n        gt_of_rois[:, :, 6] = heading_label\n        targets_dict['gt_of_rois'] = gt_of_rois\n        return targets_dict\n\n    def get_box_reg_layer_loss(self, forward_ret_dict):\n        loss_cfgs = self.model_cfg.LOSS_CONFIG\n        code_size = self.box_coder.code_size\n        reg_valid_mask = forward_ret_dict['reg_valid_mask'].view(-1)\n        gt_boxes3d_ct = forward_ret_dict['gt_of_rois'].clone()[..., 0:code_size]\n        gt_of_rois_src = forward_ret_dict['gt_of_rois_src'][..., 0:code_size].view(-1, code_size)\n        rcnn_reg = forward_ret_dict['rcnn_reg']  # (rcnn_batch_size, C)\n        roi_boxes3d = forward_ret_dict['rois']\n        rcnn_batch_size = gt_boxes3d_ct.view(-1, code_size).shape[0]\n\n        fg_mask = (reg_valid_mask > 0)\n        fg_sum = fg_mask.long().sum().item()\n\n        tb_dict = {}\n\n        if loss_cfgs.REG_LOSS == 'smooth-l1':\n            rois_anchor = roi_boxes3d.clone().detach().view(-1, code_size)\n            rois_anchor[:, 0:3] = 0\n            rois_anchor[:, 6] = 0\n            reg_targets = self.box_coder.encode_torch(\n                gt_boxes3d_ct.view(rcnn_batch_size, code_size), rois_anchor\n            )\n\n            rcnn_loss_reg = self.reg_loss_func(\n                rcnn_reg.view(rcnn_batch_size, -1).unsqueeze(dim=0),\n                reg_targets.unsqueeze(dim=0),\n            )  # [B, M, 7]\n            rcnn_loss_reg = (rcnn_loss_reg.view(rcnn_batch_size, -1) * fg_mask.unsqueeze(dim=-1).float()).sum() / max(fg_sum, 
1)\n            rcnn_loss_reg = rcnn_loss_reg * loss_cfgs.LOSS_WEIGHTS['rcnn_reg_weight']\n            tb_dict['rcnn_loss_reg'] = rcnn_loss_reg.item()\n\n            if loss_cfgs.CORNER_LOSS_REGULARIZATION and fg_sum > 0:\n                # TODO: NEED to BE CHECK\n                fg_rcnn_reg = rcnn_reg.view(rcnn_batch_size, -1)[fg_mask]\n                fg_roi_boxes3d = roi_boxes3d.view(-1, code_size)[fg_mask]\n\n                fg_roi_boxes3d = fg_roi_boxes3d.view(1, -1, code_size)\n                batch_anchors = fg_roi_boxes3d.clone().detach()\n                roi_ry = fg_roi_boxes3d[:, :, 6].view(-1)\n                roi_xyz = fg_roi_boxes3d[:, :, 0:3].view(-1, 3)\n                batch_anchors[:, :, 0:3] = 0\n                rcnn_boxes3d = self.box_coder.decode_torch(\n                    fg_rcnn_reg.view(batch_anchors.shape[0], -1, code_size), batch_anchors\n                ).view(-1, code_size)\n\n                rcnn_boxes3d = common_utils.rotate_points_along_z(\n                    rcnn_boxes3d.unsqueeze(dim=1), roi_ry\n                ).squeeze(dim=1)\n                rcnn_boxes3d[:, 0:3] += roi_xyz\n\n                loss_corner = loss_utils.get_corner_loss_lidar(\n                    rcnn_boxes3d[:, 0:7],\n                    gt_of_rois_src[fg_mask][:, 0:7]\n                )\n                loss_corner = loss_corner.mean()\n                loss_corner = loss_corner * loss_cfgs.LOSS_WEIGHTS['rcnn_corner_weight']\n\n                rcnn_loss_reg += loss_corner\n                tb_dict['rcnn_loss_corner'] = loss_corner.item()\n        else:\n            raise NotImplementedError\n\n        reg_valid_mask = forward_ret_dict['reg_valid_mask'].view(-1)\n        code_size = self.box_coder.code_size\n        shape = forward_ret_dict['gt_of_rois'].shape\n        gt_boxes3d_ct = forward_ret_dict['gt_of_rois'].clone().view(shape[0] * shape[1], -1)[:, 0:7]\n        rcnn_reg = forward_ret_dict['rcnn_reg']  # (rcnn_batch_size, C)\n        rois = 
forward_ret_dict['rois'].clone().view(-1, code_size)[:, 0:7]\n        rois[:, 0:3] = 0\n        rois[:, 6] = 0\n\n        batch_box_preds = self.box_coder.decode_torch(rcnn_reg, rois).view(-1, code_size)\n\n        fg_mask = (reg_valid_mask > 0)\n\n        if len(gt_boxes3d_ct[fg_mask]) == 0:\n            b_loss=0\n        else:\n            b_loss = bb_loss(batch_box_preds[fg_mask], gt_boxes3d_ct[\n                fg_mask]).sum()\n            b_loss = b_loss / (fg_mask.sum() + 1)\n\n        return rcnn_loss_reg+b_loss, tb_dict\n\n    def get_box_cls_layer_loss(self, forward_ret_dict):\n        loss_cfgs = self.model_cfg.LOSS_CONFIG\n        rcnn_cls = forward_ret_dict['rcnn_cls']\n        rcnn_cls_labels = forward_ret_dict['rcnn_cls_labels'].view(-1)\n\n        if loss_cfgs.CLS_LOSS == 'BinaryCrossEntropy':\n            rcnn_cls_flat = rcnn_cls.view(-1)\n            batch_loss_cls = F.binary_cross_entropy(torch.sigmoid(rcnn_cls_flat), rcnn_cls_labels.float(), reduction='none')\n            cls_valid_mask = (rcnn_cls_labels >= 0).float()\n            rcnn_loss_cls = (batch_loss_cls * cls_valid_mask).sum() / torch.clamp(cls_valid_mask.sum(), min=1.0)\n        elif loss_cfgs.CLS_LOSS == 'CrossEntropy':\n            batch_loss_cls = F.cross_entropy(rcnn_cls, rcnn_cls_labels, reduction='none', ignore_index=-1)\n            cls_valid_mask = (rcnn_cls_labels >= 0).float()\n            rcnn_loss_cls = (batch_loss_cls * cls_valid_mask).sum() / torch.clamp(cls_valid_mask.sum(), min=1.0)\n        else:\n            raise NotImplementedError\n\n\n        rcnn_loss_cls = rcnn_loss_cls * loss_cfgs.LOSS_WEIGHTS['rcnn_cls_weight']\n        tb_dict = {'rcnn_loss_cls': rcnn_loss_cls.item()}\n        return rcnn_loss_cls, tb_dict\n\n    def get_loss(self, tb_dict=None):\n        tb_dict = {} if tb_dict is None else tb_dict\n        rcnn_loss = 0\n        for i in range(6):\n            if 'targets_dict'+str(i) in self.forward_ret_dict:\n                rcnn_loss_cls, cls_tb_dict = 
self.get_box_cls_layer_loss(self.forward_ret_dict['targets_dict'+str(i)])\n                rcnn_loss += rcnn_loss_cls\n                rcnn_loss_reg, reg_tb_dict = self.get_box_reg_layer_loss(self.forward_ret_dict['targets_dict'+str(i)])\n                rcnn_loss += rcnn_loss_reg\n\n            if 'targets_dict_pi'+str(i) in self.forward_ret_dict:\n                rcnn_loss_cls, cls_tb_dict = self.get_box_cls_layer_loss(self.forward_ret_dict['targets_dict_pi' + str(i)])\n                rcnn_loss += 0.5*rcnn_loss_cls\n                rcnn_loss_reg, reg_tb_dict = self.get_box_reg_layer_loss(self.forward_ret_dict['targets_dict_pi' + str(i)])\n                rcnn_loss += 0.5*rcnn_loss_reg\n\n            if 'targets_dict_p'+str(i) in self.forward_ret_dict:\n                rcnn_loss_cls, cls_tb_dict = self.get_box_cls_layer_loss(self.forward_ret_dict['targets_dict_p' + str(i)])\n                rcnn_loss += 0.5*rcnn_loss_cls\n                rcnn_loss_reg, reg_tb_dict = self.get_box_reg_layer_loss(self.forward_ret_dict['targets_dict_p' + str(i)])\n                rcnn_loss += 0.5*rcnn_loss_reg\n\n        tb_dict['rcnn_loss'] = rcnn_loss.item()\n\n        return rcnn_loss, tb_dict\n\n\n    def generate_predicted_boxes(self, batch_size, rois, cls_preds, box_preds):\n        \"\"\"\n        Args:\n            batch_size:\n            rois: (B, N, 7)\n            cls_preds: (BN, num_class)\n            box_preds: (BN, code_size)\n\n        Returns:\n\n        \"\"\"\n        code_size = self.box_coder.code_size\n        # batch_cls_preds: (B, N, num_class or 1)\n        batch_cls_preds = cls_preds.view(batch_size, -1, cls_preds.shape[-1])\n        batch_box_preds = box_preds.view(batch_size, -1, code_size)\n\n        roi_ry = rois[:, :, 6].view(-1)\n        roi_xyz = rois[:, :, 0:3].view(-1, 3)\n        local_rois = rois.clone()\n        local_rois[:, :, 0:3] = 0\n\n        batch_box_preds = self.box_coder.decode_torch(batch_box_preds, local_rois).view(-1, code_size)\n\n 
       batch_box_preds = common_utils.rotate_points_along_z(\n            batch_box_preds.unsqueeze(dim=1), roi_ry\n        ).squeeze(dim=1)\n        batch_box_preds[:, 0:3] += roi_xyz\n        batch_box_preds = batch_box_preds.view(batch_size, -1, code_size)\n\n\n        return batch_cls_preds, batch_box_preds\n\n"
  },
  {
    "path": "pcdet/models/roi_heads/target_assigner/proposal_target_layer.py",
    "content": "import numpy as np\nimport torch\nimport torch.nn as nn\n\nfrom ....ops.iou3d_nms import iou3d_nms_utils\n\nclass ProposalTargetLayer(nn.Module):\n    def __init__(self, roi_sampler_cfg):\n        super().__init__()\n        self.roi_sampler_cfg = roi_sampler_cfg\n\n    def limit(self,ang):\n        ang = ang % (2 * np.pi)\n\n        ang[ang > np.pi] = ang[ang > np.pi] - 2 * np.pi\n\n        ang[ang < -np.pi] = ang[ang < -np.pi] + 2 * np.pi\n\n        return ang\n\n    def ang_weight(self,pred, gt):\n\n        a = torch.abs(pred - gt)\n        b = 2 * np.pi - torch.abs(pred - gt)\n\n        res = torch.stack([a, b])\n\n        res = torch.min(res, 0)[0]\n\n        return 1 - res / np.pi\n\n    def forward(self, batch_dict, ind=''):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                roi_scores: (B, num_rois)\n                gt_boxes: (B, N, 7 + C + 1)\n                roi_labels: (B, num_rois)\n        Returns:\n            batch_dict:\n                rois: (B, M, 7 + C)\n                gt_of_rois: (B, M, 7 + C)\n                gt_iou_of_rois: (B, M)\n                roi_scores: (B, M)\n                roi_labels: (B, M)\n                reg_valid_mask: (B, M)\n                rcnn_cls_labels: (B, M)\n        \"\"\"\n        batch_rois, batch_gt_of_rois, batch_roi_ious, batch_roi_scores, batch_roi_labels = self.sample_rois_for_rcnn(\n            batch_dict=batch_dict, ind=ind,\n        )\n        # regression valid mask\n\n\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE in ['roi_iou_x', 'roi_ioud_x']:\n            reg_valid_mask = batch_roi_ious.new_zeros(batch_roi_ious.shape).long()\n            for cls_i in range(len(self.roi_sampler_cfg.REG_FG_THRESH)):\n                reg_fg_thresh = self.roi_sampler_cfg.REG_FG_THRESH[cls_i]\n                cls_mask = batch_gt_of_rois[...,-1] == (cls_i+1)\n\n\n                if 
self.roi_sampler_cfg.get('ENABLE_HARD_SAMPLING', False):\n                    mask_hard = (batch_roi_ious < reg_fg_thresh) & (batch_roi_ious > self.roi_sampler_cfg.HARD_SAMPLING_THRESH[cls_i]) & cls_mask\n\n                    mask_prob = mask_hard.new_zeros(mask_hard.size()).bool()\n                    teval = int(1/self.roi_sampler_cfg.HARD_SAMPLING_RATIO[cls_i])\n                    ints = range(np.random.randint(0, teval), mask_prob.shape[0], teval)\n\n                    mask_prob[ints] = 1\n\n                    mask_hard2 = mask_hard * mask_prob\n\n                    this_fg_inds1 = ((batch_roi_ious > reg_fg_thresh) & cls_mask).long()\n                    this_reg_valid_mask = this_fg_inds1 + mask_hard2.long()\n\n                else:\n                    this_reg_valid_mask = ((batch_roi_ious > reg_fg_thresh) & cls_mask).long()\n                reg_valid_mask += this_reg_valid_mask\n        else:\n            reg_valid_mask = (batch_roi_ious > self.roi_sampler_cfg.REG_FG_THRESH).long()\n\n        # classification label\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE == 'cls':\n            batch_cls_labels = (batch_roi_ious > self.roi_sampler_cfg.CLS_FG_THRESH).long()\n            ignore_mask = (batch_roi_ious > self.roi_sampler_cfg.CLS_BG_THRESH) & \\\n                          (batch_roi_ious < self.roi_sampler_cfg.CLS_FG_THRESH)\n            batch_cls_labels[ignore_mask > 0] = -1\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_iou':\n            iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            fg_mask = batch_roi_ious > iou_fg_thresh\n            bg_mask = batch_roi_ious < iou_bg_thresh\n            interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n            batch_cls_labels = (fg_mask > 0).float()\n            batch_cls_labels[interval_mask] = \\\n                (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n        elif 
self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_ioud':\n            iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            fg_mask = batch_roi_ious > iou_fg_thresh\n            bg_mask = batch_roi_ious < iou_bg_thresh\n            interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n            batch_cls_labels = (fg_mask > 0).float()\n            batch_cls_labels[interval_mask] = \\\n                (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n            ang_roi = batch_rois[...,6]\n            ang_gt = batch_gt_of_rois[...,6]\n\n            ang_roi = self.limit(ang_roi)\n            ang_gt = self.limit(ang_gt)\n\n            ang_target = self.ang_weight(ang_roi,ang_gt)\n            direction_constraint = self.roi_sampler_cfg.DIRECTION_MIN\n            direction_constraint2 = self.roi_sampler_cfg.DIRECTION_MAX\n\n            ang_target = (torch.clamp(ang_target, direction_constraint,\n                                      direction_constraint2) - direction_constraint) / (\n                                 direction_constraint2 - direction_constraint)\n\n            batch_cls_labels *= ang_target\n\n\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_ioud_x':\n            all_iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            all_iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            batch_cls_labels = batch_roi_ious.new_zeros(size = batch_roi_ious.shape)\n            for cls_id in range(len(all_iou_bg_thresh)):\n                gt_cls = batch_gt_of_rois[..., -1]\n                iou_fg_thresh = all_iou_fg_thresh[cls_id]\n                iou_bg_thresh = all_iou_bg_thresh[cls_id]\n\n                cls_mask = gt_cls == (cls_id+1)\n\n                fg_mask = batch_roi_ious > iou_fg_thresh\n                bg_mask = batch_roi_ious < iou_bg_thresh\n                interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n                
cls_labels = (fg_mask > 0).float()\n                cls_labels[interval_mask] = \\\n                    (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n                ang_roi = batch_rois[...,6]\n                ang_gt = batch_gt_of_rois[...,6]\n\n                ang_roi = self.limit(ang_roi)\n                ang_gt = self.limit(ang_gt)\n\n                ang_target = self.ang_weight(ang_roi,ang_gt)\n                direction_constraint = self.roi_sampler_cfg.DIRECTION_MIN\n                direction_constraint2 = self.roi_sampler_cfg.DIRECTION_MAX\n\n                ang_target = (torch.clamp(ang_target, direction_constraint, direction_constraint2 ) - direction_constraint) / (\n                            direction_constraint2  - direction_constraint)\n\n                cls_labels*=ang_target\n\n\n                batch_cls_labels[cls_mask] = cls_labels[cls_mask]\n\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_iou_x':\n            all_iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            all_iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            batch_cls_labels = batch_roi_ious.new_zeros(batch_roi_ious.shape)\n            for cls_id in range(len(all_iou_bg_thresh)):\n                gt_cls = batch_gt_of_rois[..., -1]\n                iou_fg_thresh = all_iou_fg_thresh[cls_id]\n                iou_bg_thresh = all_iou_bg_thresh[cls_id]\n\n                cls_mask = gt_cls == (cls_id+1)\n\n                fg_mask = batch_roi_ious > iou_fg_thresh\n                bg_mask = batch_roi_ious < iou_bg_thresh\n                interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n                cls_labels = (fg_mask > 0).float()\n                cls_labels[interval_mask] = \\\n                    (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n                batch_cls_labels[cls_mask] = cls_labels[cls_mask]\n\n        else:\n            raise NotImplementedError\n\n        
targets_dict = {'rois': batch_rois, 'gt_of_rois': batch_gt_of_rois, 'gt_iou_of_rois': batch_roi_ious,\n                        'roi_scores': batch_roi_scores, 'roi_labels': batch_roi_labels,\n                        'reg_valid_mask': reg_valid_mask,\n                        'rcnn_cls_labels': batch_cls_labels}\n\n        return targets_dict\n\n    def sample_rois_for_rcnn(self, batch_dict, ind=''):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                roi_scores: (B, num_rois)\n                gt_boxes: (B, N, 7 + C + 1)\n                roi_labels: (B, num_rois)\n        Returns:\n\n        \"\"\"\n        batch_size = batch_dict['batch_size']\n        rois = batch_dict['rois']\n        roi_scores = batch_dict['roi_scores']\n        roi_labels = batch_dict['roi_labels']\n        gt_boxes = batch_dict['gt_boxes'+ind]\n\n\n        gt_code_size = gt_boxes.shape[-1]\n        roi_code_size = rois.shape[-1]\n        batch_rois = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, roi_code_size)\n        batch_gt_of_rois = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, gt_code_size )\n        batch_roi_ious = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE)\n        batch_roi_scores = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE)\n        batch_roi_labels = rois.new_zeros((batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE), dtype=torch.long)\n\n        for index in range(batch_size):\n            cur_roi, cur_gt, cur_roi_labels, cur_roi_scores = \\\n                rois[index], gt_boxes[index], roi_labels[index], roi_scores[index]\n            k = cur_gt.__len__() - 1\n            while k > 0 and cur_gt[k].sum() == 0:\n                k -= 1\n            cur_gt = cur_gt[:k + 1]\n            cur_gt = cur_gt.new_zeros((1, cur_gt.shape[1])) if len(cur_gt) == 0 else cur_gt\n\n            if 
self.roi_sampler_cfg.get('SAMPLE_ROI_BY_EACH_CLASS', False):\n                max_overlaps, gt_assignment = self.get_max_iou_with_same_class(\n                    rois=cur_roi, roi_labels=cur_roi_labels,\n                    gt_boxes=cur_gt[:, 0:7], gt_labels=cur_gt[:, -1].long()\n                )\n            else:\n                iou3d = iou3d_nms_utils.boxes_iou3d_gpu(cur_roi, cur_gt[:, 0:7])  # (M, N)\n                max_overlaps, gt_assignment = torch.max(iou3d, dim=1)\n\n            if self.roi_sampler_cfg.CLS_SCORE_TYPE in ['roi_iou_x','roi_ioud_x']:\n                sampled_inds = self.subsample_rois(max_overlaps=max_overlaps, gts = cur_gt[gt_assignment])\n            else:\n                sampled_inds = self.subsample_rois(max_overlaps=max_overlaps)\n\n            batch_rois[index] = cur_roi[sampled_inds]\n            batch_roi_labels[index] = cur_roi_labels[sampled_inds]\n            batch_roi_ious[index] = max_overlaps[sampled_inds]\n            batch_roi_scores[index] = cur_roi_scores[sampled_inds]\n            batch_gt_of_rois[index] = cur_gt[gt_assignment[sampled_inds]]\n\n        return batch_rois, batch_gt_of_rois, batch_roi_ious, batch_roi_scores, batch_roi_labels\n\n    def subsample_rois(self, max_overlaps, gts=None):\n        # sample fg, easy_bg, hard_bg\n        fg_rois_per_image = int(np.round(self.roi_sampler_cfg.FG_RATIO * self.roi_sampler_cfg.ROI_PER_IMAGE))\n\n        if gts is None:\n            fg_thresh = min(self.roi_sampler_cfg.REG_FG_THRESH, self.roi_sampler_cfg.CLS_FG_THRESH)\n            fg_inds = ((max_overlaps >= fg_thresh)).nonzero().view(-1)\n        else:\n            fg_inds = max_overlaps.new_zeros(max_overlaps.shape).long()\n            for i in range(len(self.roi_sampler_cfg.CLS_FG_THRESH)):\n                cls_mask = gts[...,-1] == (i+1)\n                this_fg_thresh = min(self.roi_sampler_cfg.REG_FG_THRESH[i], self.roi_sampler_cfg.CLS_FG_THRESH[i])\n\n                this_fg_inds = ((max_overlaps >= 
this_fg_thresh) & cls_mask)\n\n                fg_inds+=this_fg_inds\n            fg_inds = fg_inds.nonzero().view(-1)\n\n\n        easy_bg_inds = ((max_overlaps < self.roi_sampler_cfg.CLS_BG_THRESH_LO)).nonzero().view(-1)\n\n        if gts is None:\n            hard_bg_inds = ((max_overlaps < self.roi_sampler_cfg.REG_FG_THRESH) &\n                            (max_overlaps >= self.roi_sampler_cfg.CLS_BG_THRESH_LO)).nonzero().view(-1)\n        else:\n            hard_bg_inds = max_overlaps.new_zeros(max_overlaps.shape).long()\n            for i in range(len(self.roi_sampler_cfg.REG_FG_THRESH)):\n                cls_mask = gts[...,-1] == (i+1)\n                this_hard_bg_inds = ((max_overlaps < self.roi_sampler_cfg.REG_FG_THRESH[i]) &\n                                (max_overlaps >= self.roi_sampler_cfg.CLS_BG_THRESH_LO) & cls_mask)\n                hard_bg_inds+=this_hard_bg_inds\n            hard_bg_inds = hard_bg_inds.nonzero().view(-1)\n\n\n        fg_num_rois = fg_inds.numel()\n        bg_num_rois = hard_bg_inds.numel() + easy_bg_inds.numel()\n\n        if fg_num_rois > 0 and bg_num_rois > 0:\n            # sampling fg\n            fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)\n\n            rand_num = torch.from_numpy(np.random.permutation(fg_num_rois)).type_as(max_overlaps).long()\n            fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]\n\n            # sampling bg\n            bg_rois_per_this_image = self.roi_sampler_cfg.ROI_PER_IMAGE - fg_rois_per_this_image\n            bg_inds = self.sample_bg_inds(\n                hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, self.roi_sampler_cfg.HARD_BG_RATIO\n            )\n\n        elif fg_num_rois > 0 and bg_num_rois == 0:\n            # sampling fg\n            rand_num = np.floor(np.random.rand(self.roi_sampler_cfg.ROI_PER_IMAGE) * fg_num_rois)\n            rand_num = torch.from_numpy(rand_num).type_as(max_overlaps).long()\n            fg_inds = fg_inds[rand_num]\n            
bg_inds = fg_inds.new_zeros(0)  # empty index tensor; torch.cat below cannot take a list\n\n        elif bg_num_rois > 0 and fg_num_rois == 0:\n            # sampling bg\n            bg_rois_per_this_image = self.roi_sampler_cfg.ROI_PER_IMAGE\n            bg_inds = self.sample_bg_inds(\n                hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, self.roi_sampler_cfg.HARD_BG_RATIO\n            )\n        else:\n            print('maxoverlaps:(min=%f, max=%f)' % (max_overlaps.min().item(), max_overlaps.max().item()))\n            print('ERROR: FG=%d, BG=%d' % (fg_num_rois, bg_num_rois))\n            raise NotImplementedError\n\n        sampled_inds = torch.cat((fg_inds, bg_inds), dim=0)\n        return sampled_inds\n\n    @staticmethod\n    def sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, hard_bg_ratio):\n        if hard_bg_inds.numel() > 0 and easy_bg_inds.numel() > 0:\n            hard_bg_rois_num = min(int(bg_rois_per_this_image * hard_bg_ratio), len(hard_bg_inds))\n            easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num\n\n            # sampling hard bg\n            rand_idx = torch.randint(low=0, high=hard_bg_inds.numel(), size=(hard_bg_rois_num,)).long()\n            hard_bg_inds = hard_bg_inds[rand_idx]\n\n            # sampling easy bg\n            rand_idx = torch.randint(low=0, high=easy_bg_inds.numel(), size=(easy_bg_rois_num,)).long()\n            easy_bg_inds = easy_bg_inds[rand_idx]\n\n            bg_inds = torch.cat([hard_bg_inds, easy_bg_inds], dim=0)\n        elif hard_bg_inds.numel() > 0 and easy_bg_inds.numel() == 0:\n            hard_bg_rois_num = bg_rois_per_this_image\n            # sampling hard bg\n            rand_idx = torch.randint(low=0, high=hard_bg_inds.numel(), size=(hard_bg_rois_num,)).long()\n            bg_inds = hard_bg_inds[rand_idx]\n        elif hard_bg_inds.numel() == 0 and easy_bg_inds.numel() > 0:\n            easy_bg_rois_num = bg_rois_per_this_image\n            # sampling easy bg\n            rand_idx = torch.randint(low=0, high=easy_bg_inds.numel(), 
size=(easy_bg_rois_num,)).long()\n            bg_inds = easy_bg_inds[rand_idx]\n        else:\n            raise NotImplementedError\n\n        return bg_inds\n\n    @staticmethod\n    def get_max_iou_with_same_class(rois, roi_labels, gt_boxes, gt_labels):\n        \"\"\"\n        Args:\n            rois: (N, 7 + C), only the first 7 box parameters are used\n            roi_labels: (N)\n            gt_boxes: (G, 7)\n            gt_labels: (G)\n\n        Returns:\n            max_overlaps: (N), best IoU of each RoI with a GT box of the same class (0 if none)\n            gt_assignment: (N), index into gt_boxes of that best-matching GT box\n\n        \"\"\"\n        max_overlaps = rois.new_zeros(rois.shape[0])\n        gt_assignment = roi_labels.new_zeros(roi_labels.shape[0])\n\n        for k in range(gt_labels.min().item(), gt_labels.max().item() + 1):\n            roi_mask = (roi_labels == k)\n            gt_mask = (gt_labels == k)\n            if roi_mask.sum() > 0 and gt_mask.sum() > 0:\n                cur_roi = rois[roi_mask]\n                cur_gt = gt_boxes[gt_mask]\n                original_gt_assignment = gt_mask.nonzero().view(-1)\n\n                iou3d = iou3d_nms_utils.boxes_iou3d_gpu(cur_roi[:,0:7], cur_gt[:,0:7])  # (M, N)\n                cur_max_overlaps, cur_gt_assignment = torch.max(iou3d, dim=1)\n                max_overlaps[roi_mask] = cur_max_overlaps\n                gt_assignment[roi_mask] = original_gt_assignment[cur_gt_assignment]\n\n        return max_overlaps, gt_assignment\n\nclass ProposalTargetLayerT(nn.Module):\n    def __init__(self, roi_sampler_cfg):\n        super().__init__()\n        self.roi_sampler_cfg = roi_sampler_cfg\n\n    def limit(self,ang):\n        ang = ang % (2 * np.pi)\n\n        ang[ang > np.pi] = ang[ang > np.pi] - 2 * np.pi\n\n        ang[ang < -np.pi] = ang[ang < -np.pi] + 2 * np.pi\n\n        return ang\n\n    def ang_weight(self,pred, gt):\n\n        a = torch.abs(pred - gt)\n        b = 2 * np.pi - torch.abs(pred - gt)\n\n        res = torch.stack([a, b])\n\n        res = torch.min(res, 
0)[0]\n\n        return 1 - res / np.pi\n\n    def forward(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                roi_scores: (B, num_rois)\n                gt_boxes: (B, N, 7 + C + 1)\n                roi_labels: (B, num_rois)\n        Returns:\n            batch_dict:\n                rois: (B, M, 7 + C)\n                gt_of_rois: (B, M, 7 + C)\n                gt_iou_of_rois: (B, M)\n                roi_scores: (B, M)\n                roi_labels: (B, M)\n                reg_valid_mask: (B, M)\n                rcnn_cls_labels: (B, M)\n        \"\"\"\n        batch_rois, batch_gt_of_rois, batch_roi_ious, batch_roi_mious, batch_roi_scores, batch_roi_labels, batch_gt_bbs_mask\\\n            = self.sample_rois_for_rcnn(\n            batch_dict=batch_dict,\n        )\n\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_mious' or self.roi_sampler_cfg.CLS_SCORE_TYPE == 'mcls':\n            batch_roi_ious = batch_roi_mious\n\n        # regression valid mask\n        reg_valid_mask = (batch_roi_ious > self.roi_sampler_cfg.REG_FG_THRESH).long()\n\n        # classification label\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE == 'cls' or self.roi_sampler_cfg.CLS_SCORE_TYPE == 'mcls':\n            batch_cls_labels = (batch_roi_ious > self.roi_sampler_cfg.CLS_FG_THRESH).long()\n            ignore_mask = (batch_roi_ious > self.roi_sampler_cfg.CLS_BG_THRESH) & \\\n                          (batch_roi_ious < self.roi_sampler_cfg.CLS_FG_THRESH)\n            batch_cls_labels[ignore_mask > 0] = -1\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_ious' or self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_mious':\n            iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            fg_mask = batch_roi_ious > iou_fg_thresh\n            bg_mask = batch_roi_ious < iou_bg_thresh\n            
interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n            batch_cls_labels = (fg_mask > 0).float()\n            batch_cls_labels[interval_mask] = \\\n                (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_ioud':\n            iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            fg_mask = batch_roi_ious > iou_fg_thresh\n            bg_mask = batch_roi_ious < iou_bg_thresh\n            interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n            batch_cls_labels = (fg_mask > 0).float()\n            batch_cls_labels[interval_mask] = \\\n                (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n            ang_roi = batch_rois[...,6]\n            ang_gt = batch_gt_of_rois[...,6]\n\n            ang_roi = self.limit(ang_roi)\n            ang_gt = self.limit(ang_gt)\n\n            ang_target = self.ang_weight(ang_roi,ang_gt)\n\n            ang_target = torch.clamp(ang_target,0.0,0.8)/0.8\n\n            batch_cls_labels*=ang_target\n\n\n        else:\n            raise NotImplementedError\n\n        targets_dict = {'rois': batch_rois, 'gt_of_rois': batch_gt_of_rois, 'gt_iou_of_rois': batch_roi_ious,\n                        'roi_scores': batch_roi_scores, 'roi_labels': batch_roi_labels,\n                        'reg_valid_mask': reg_valid_mask, 'gt_bbs_mask': batch_gt_bbs_mask,\n                        'rcnn_cls_labels': batch_cls_labels}\n\n        return targets_dict\n\n\n    def sample_rois_for_rcnn(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                roi_scores: (B, num_rois)\n                gt_boxes: (B, N, 7 + C + 1)\n                roi_labels: (B, num_rois)\n        Returns:\n\n        \"\"\"\n        batch_size = 
batch_dict['batch_size']\n        rois = batch_dict['rois']\n        roi_scores = batch_dict['roi_scores']\n        roi_labels = batch_dict['roi_labels']\n        gt_tracklets = batch_dict['gt_tracklets']\n\n        num_frame = gt_tracklets.shape[-1]//7\n\n        gt_bbs_mask = batch_dict['gt_bbs_mask']\n\n\n        gt_code_size = gt_tracklets.shape[-1]\n        roi_code_size = rois.shape[-1]\n        batch_rois = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, roi_code_size)\n        batch_gt_of_rois = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, gt_code_size )\n        batch_roi_scores = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE)\n        batch_roi_labels = rois.new_zeros((batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE), dtype=torch.long)\n\n        batch_all_roi_ious = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE,num_frame)\n        batch_gt_bbs_mask = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, num_frame)\n\n        for index in range(batch_size):\n            cur_roi, cur_gt, cur_roi_labels, cur_roi_scores,cur_bbs_mask = \\\n                rois[index], gt_tracklets[index], roi_labels[index], roi_scores[index],gt_bbs_mask[index]\n            k = cur_gt.__len__() - 1\n            while k > 0 and cur_gt[k].sum() == 0:\n                k -= 1\n            cur_gt = cur_gt[:k + 1]\n            cur_gt = cur_gt.new_zeros((1, cur_gt.shape[1])) if len(cur_gt) == 0 else cur_gt\n            cur_bbs_mask = cur_bbs_mask.new_zeros((1, cur_bbs_mask.shape[1])) if len(cur_bbs_mask) == 0 else cur_bbs_mask\n\n            if self.roi_sampler_cfg.get('SAMPLE_ROI_BY_EACH_CLASS', False):\n                max_overlaps, gt_assignment = self.get_max_iou_with_same_class(\n                    rois=cur_roi[:,0:7], roi_labels=cur_roi_labels,\n                    gt_boxes=cur_gt[:, 0:7], gt_labels=cur_gt[:, -1].long()\n                )\n            else:\n                iou3d = 
iou3d_nms_utils.boxes_iou3d_gpu(cur_roi[:, 0:7], cur_gt[:, 0:7])  # (M, N)\n                max_overlaps, gt_assignment = torch.max(iou3d, dim=1)\n\n            sampled_inds = self.subsample_rois(max_overlaps=max_overlaps)\n\n            batch_rois[index] = cur_roi[sampled_inds]\n            batch_roi_labels[index] = cur_roi_labels[sampled_inds]\n            batch_roi_scores[index] = cur_roi_scores[sampled_inds]\n            batch_gt_of_rois[index] = cur_gt[gt_assignment[sampled_inds]]\n\n            batch_all_roi_ious[index,:,0] = max_overlaps[sampled_inds]\n\n            batch_gt_bbs_mask[index] = cur_bbs_mask[gt_assignment[sampled_inds]]\n\n        for i in range(1, num_frame):\n            for j in range(batch_size):\n\n                this_roi = batch_rois[j,:,i*7:i*7+7]\n                this_gt_of_roi = batch_gt_of_rois[j,:,i*7:i*7+7]\n\n                all_ious = iou3d_nms_utils.boxes_iou3d_gpu(this_roi[:, 0:7], this_gt_of_roi[:, 0:7])\n                box_num = this_roi.shape[0]\n\n                ious = all_ious[range(box_num),range(box_num)]\n\n                batch_all_roi_ious[j,:,i] = ious\n\n\n        tracks_mean_ious = batch_all_roi_ious.sum(-1)/(batch_gt_bbs_mask.sum(-1)+0.00001)\n\n        return batch_rois, batch_gt_of_rois,batch_all_roi_ious[...,0], tracks_mean_ious, batch_roi_scores, batch_roi_labels, batch_gt_bbs_mask\n\n    def subsample_rois(self, max_overlaps):\n        # sample fg, easy_bg, hard_bg\n        fg_rois_per_image = int(np.round(self.roi_sampler_cfg.FG_RATIO * self.roi_sampler_cfg.ROI_PER_IMAGE))\n        fg_thresh = min(self.roi_sampler_cfg.REG_FG_THRESH, self.roi_sampler_cfg.CLS_FG_THRESH)\n\n        fg_inds = ((max_overlaps >= fg_thresh)).nonzero().view(-1)\n        easy_bg_inds = ((max_overlaps < self.roi_sampler_cfg.CLS_BG_THRESH_LO)).nonzero().view(-1)\n        hard_bg_inds = ((max_overlaps < self.roi_sampler_cfg.REG_FG_THRESH) &\n                (max_overlaps >= 
self.roi_sampler_cfg.CLS_BG_THRESH_LO)).nonzero().view(-1)\n\n        fg_num_rois = fg_inds.numel()\n        bg_num_rois = hard_bg_inds.numel() + easy_bg_inds.numel()\n\n        if fg_num_rois > 0 and bg_num_rois > 0:\n            # sampling fg\n            fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)\n\n            rand_num = torch.from_numpy(np.random.permutation(fg_num_rois)).type_as(max_overlaps).long()\n            fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]\n\n            # sampling bg\n            bg_rois_per_this_image = self.roi_sampler_cfg.ROI_PER_IMAGE - fg_rois_per_this_image\n            bg_inds = self.sample_bg_inds(\n                hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, self.roi_sampler_cfg.HARD_BG_RATIO\n            )\n\n        elif fg_num_rois > 0 and bg_num_rois == 0:\n            # sampling fg\n            rand_num = np.floor(np.random.rand(self.roi_sampler_cfg.ROI_PER_IMAGE) * fg_num_rois)\n            rand_num = torch.from_numpy(rand_num).type_as(max_overlaps).long()\n            fg_inds = fg_inds[rand_num]\n            bg_inds = fg_inds.new_zeros(0)  # empty index tensor; torch.cat below cannot take a list\n\n        elif bg_num_rois > 0 and fg_num_rois == 0:\n            # sampling bg\n            bg_rois_per_this_image = self.roi_sampler_cfg.ROI_PER_IMAGE\n            bg_inds = self.sample_bg_inds(\n                hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, self.roi_sampler_cfg.HARD_BG_RATIO\n            )\n        else:\n            print('maxoverlaps:(min=%f, max=%f)' % (max_overlaps.min().item(), max_overlaps.max().item()))\n            print('ERROR: FG=%d, BG=%d' % (fg_num_rois, bg_num_rois))\n            raise NotImplementedError\n\n        sampled_inds = torch.cat((fg_inds, bg_inds), dim=0)\n        return sampled_inds\n\n    @staticmethod\n    def sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, hard_bg_ratio):\n        if hard_bg_inds.numel() > 0 and easy_bg_inds.numel() > 0:\n            hard_bg_rois_num = min(int(bg_rois_per_this_image * 
hard_bg_ratio), len(hard_bg_inds))\n            easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num\n\n            # sampling hard bg\n            rand_idx = torch.randint(low=0, high=hard_bg_inds.numel(), size=(hard_bg_rois_num,)).long()\n            hard_bg_inds = hard_bg_inds[rand_idx]\n\n            # sampling easy bg\n            rand_idx = torch.randint(low=0, high=easy_bg_inds.numel(), size=(easy_bg_rois_num,)).long()\n            easy_bg_inds = easy_bg_inds[rand_idx]\n\n            bg_inds = torch.cat([hard_bg_inds, easy_bg_inds], dim=0)\n        elif hard_bg_inds.numel() > 0 and easy_bg_inds.numel() == 0:\n            hard_bg_rois_num = bg_rois_per_this_image\n            # sampling hard bg\n            rand_idx = torch.randint(low=0, high=hard_bg_inds.numel(), size=(hard_bg_rois_num,)).long()\n            bg_inds = hard_bg_inds[rand_idx]\n        elif hard_bg_inds.numel() == 0 and easy_bg_inds.numel() > 0:\n            easy_bg_rois_num = bg_rois_per_this_image\n            # sampling easy bg\n            rand_idx = torch.randint(low=0, high=easy_bg_inds.numel(), size=(easy_bg_rois_num,)).long()\n            bg_inds = easy_bg_inds[rand_idx]\n        else:\n            raise NotImplementedError\n\n        return bg_inds\n\n    @staticmethod\n    def get_max_iou_with_same_class(rois, roi_labels, gt_boxes, gt_labels):\n        \"\"\"\n        Args:\n            rois: (N, 7)\n            roi_labels: (N)\n            gt_boxes: (G, 7)\n            gt_labels: (G)\n\n        Returns:\n            max_overlaps: (N), best IoU of each RoI with a GT box of the same class (0 if none)\n            gt_assignment: (N), index into gt_boxes of that best-matching GT box\n\n        \"\"\"\n        max_overlaps = rois.new_zeros(rois.shape[0])\n        gt_assignment = roi_labels.new_zeros(roi_labels.shape[0])\n\n        for k in range(gt_labels.min().item(), gt_labels.max().item() + 1):\n            roi_mask = (roi_labels == k)\n            gt_mask = (gt_labels == k)\n            if roi_mask.sum() > 0 and 
gt_mask.sum() > 0:\n                cur_roi = rois[roi_mask]\n                cur_gt = gt_boxes[gt_mask]\n                original_gt_assignment = gt_mask.nonzero().view(-1)\n\n                iou3d = iou3d_nms_utils.boxes_iou3d_gpu(cur_roi[:,0:7], cur_gt[:,0:7])  # (M, N)\n                cur_max_overlaps, cur_gt_assignment = torch.max(iou3d, dim=1)\n                max_overlaps[roi_mask] = cur_max_overlaps\n                gt_assignment[roi_mask] = original_gt_assignment[cur_gt_assignment]\n\n        return max_overlaps, gt_assignment\n"
  },
  {
    "path": "pcdet/models/roi_heads/target_assigner/proposal_target_layer3.py",
    "content": "import numpy as np\nimport torch\nimport torch.nn as nn\n\nfrom ....ops.iou3d_nms import iou3d_nms_utils\n\nclass ProposalTargetLayer(nn.Module):\n    def __init__(self, roi_sampler_cfg):\n        super().__init__()\n        self.roi_sampler_cfg = roi_sampler_cfg\n\n    def limit(self,ang):\n        ang = ang % (2 * np.pi)\n\n        ang[ang > np.pi] = ang[ang > np.pi] - 2 * np.pi\n\n        ang[ang < -np.pi] = ang[ang < -np.pi] + 2 * np.pi\n\n        return ang\n\n    def ang_weight(self,pred, gt):\n\n        a = torch.abs(pred - gt)\n        b = 2 * np.pi - torch.abs(pred - gt)\n\n        res = torch.stack([a, b])\n\n        res = torch.min(res, 0)[0]\n\n        return 1 - res / np.pi\n\n    def forward(self, batch_dict,ind=''):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                roi_scores: (B, num_rois)\n                gt_boxes: (B, N, 7 + C + 1)\n                roi_labels: (B, num_rois)\n        Returns:\n            batch_dict:\n                rois: (B, M, 7 + C)\n                gt_of_rois: (B, M, 7 + C)\n                gt_iou_of_rois: (B, M)\n                roi_scores: (B, M)\n                roi_labels: (B, M)\n                reg_valid_mask: (B, M)\n                rcnn_cls_labels: (B, M)\n        \"\"\"\n        batch_rois, batch_gt_of_rois, batch_roi_ious, batch_roi_scores, batch_roi_labels = self.sample_rois_for_rcnn(\n            batch_dict=batch_dict,ind=ind,\n        )\n        # regression valid mask\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE in ['roi_ioud_x','roi_ioud']:\n\n            roi_angle = batch_rois[..., 6].clone()\n            gt_angle = batch_gt_of_rois[..., 6].clone()\n            roi_angle = self.limit(roi_angle)\n            gt_angle = self.limit(gt_angle)\n            ang_target = self.ang_weight(roi_angle, gt_angle)\n            direction_constraint = self.roi_sampler_cfg.DIRECTION_C\n\n            ang_target = 
(torch.clamp(ang_target, direction_constraint, 1) - direction_constraint) / (\n                        1 - direction_constraint)\n            batch_roi_ious*=ang_target\n\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE in ['roi_iou_x','roi_ioud_x']:\n            reg_valid_mask = batch_roi_ious.new_zeros(batch_roi_ious.shape).long()\n            for cls_i in range(len(self.roi_sampler_cfg.REG_FG_THRESH)):\n                reg_fg_thresh = self.roi_sampler_cfg.REG_FG_THRESH[cls_i]\n                cls_mask = batch_gt_of_rois[...,-1] == (cls_i+1)\n                this_reg_valid_mask = ((batch_roi_ious > reg_fg_thresh) & cls_mask).long()\n                reg_valid_mask += this_reg_valid_mask\n        else:\n            reg_valid_mask = (batch_roi_ious > self.roi_sampler_cfg.REG_FG_THRESH).long()\n\n        # classification label\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE == 'cls':\n            batch_cls_labels = (batch_roi_ious > self.roi_sampler_cfg.CLS_FG_THRESH).long()\n            ignore_mask = (batch_roi_ious > self.roi_sampler_cfg.CLS_BG_THRESH) & \\\n                          (batch_roi_ious < self.roi_sampler_cfg.CLS_FG_THRESH)\n            batch_cls_labels[ignore_mask > 0] = -1\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE in ['roi_iou','roi_ioud']:\n            iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            fg_mask = batch_roi_ious > iou_fg_thresh\n            bg_mask = batch_roi_ious < iou_bg_thresh\n            interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n            batch_cls_labels = (fg_mask > 0).float()\n            batch_cls_labels[interval_mask] = \\\n                (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE in ['roi_iou_x','roi_ioud_x']:\n            all_iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            all_iou_fg_thresh = 
self.roi_sampler_cfg.CLS_FG_THRESH\n            batch_cls_labels = batch_roi_ious.new_zeros(batch_roi_ious.shape)\n            for cls_id in range(len(all_iou_bg_thresh)):\n                gt_cls = batch_gt_of_rois[..., -1]\n                iou_fg_thresh = all_iou_fg_thresh[cls_id]\n                iou_bg_thresh = all_iou_bg_thresh[cls_id]\n\n                cls_mask = gt_cls == (cls_id+1)\n\n                fg_mask = batch_roi_ious > iou_fg_thresh\n                bg_mask = batch_roi_ious < iou_bg_thresh\n                interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n                cls_labels = (fg_mask > 0).float()\n                cls_labels[interval_mask] = \\\n                    (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n                batch_cls_labels[cls_mask] = cls_labels[cls_mask]\n\n        else:\n            raise NotImplementedError\n\n        targets_dict = {'rois'+ind: batch_rois, 'gt_of_rois'+ind: batch_gt_of_rois, 'gt_iou_of_rois'+ind: batch_roi_ious,\n                        'roi_scores'+ind: batch_roi_scores, 'roi_labels'+ind: batch_roi_labels,\n                        'reg_valid_mask'+ind: reg_valid_mask,\n                        'rcnn_cls_labels'+ind: batch_cls_labels}\n\n        return targets_dict\n\n    def sample_rois_for_rcnn(self, batch_dict, ind=''):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                roi_scores: (B, num_rois)\n                gt_boxes: (B, N, 7 + C + 1)\n                roi_labels: (B, num_rois)\n        Returns:\n\n        \"\"\"\n        batch_size = batch_dict['batch_size']\n        rois = batch_dict['rois'+ind]\n        roi_scores = batch_dict['roi_scores'+ind]\n        roi_labels = batch_dict['roi_labels']\n        gt_boxes = batch_dict['gt_boxes']\n\n        gt_code_size = gt_boxes.shape[-1]\n        roi_code_size = rois.shape[-1]\n        batch_rois = 
rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, roi_code_size)\n        batch_gt_of_rois = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, gt_code_size )\n        batch_roi_ious = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE)\n        batch_roi_scores = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE)\n        batch_roi_labels = rois.new_zeros((batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE), dtype=torch.long)\n\n        for index in range(batch_size):\n            cur_roi, cur_gt, cur_roi_labels, cur_roi_scores = \\\n                rois[index], gt_boxes[index], roi_labels[index], roi_scores[index]\n            k = cur_gt.__len__() - 1\n            while k > 0 and cur_gt[k].sum() == 0:\n                k -= 1\n            cur_gt = cur_gt[:k + 1]\n            cur_gt = cur_gt.new_zeros((1, cur_gt.shape[1])) if len(cur_gt) == 0 else cur_gt\n\n            if self.roi_sampler_cfg.get('SAMPLE_ROI_BY_EACH_CLASS', False):\n                max_overlaps, gt_assignment = self.get_max_iou_with_same_class(\n                    rois=cur_roi, roi_labels=cur_roi_labels,\n                    gt_boxes=cur_gt[:, 0:7], gt_labels=cur_gt[:, -1].long()\n                )\n            else:\n                iou3d = iou3d_nms_utils.boxes_iou3d_gpu(cur_roi, cur_gt[:, 0:7])  # (M, N)\n                max_overlaps, gt_assignment = torch.max(iou3d, dim=1)\n\n            if self.roi_sampler_cfg.CLS_SCORE_TYPE in ['roi_iou_x','roi_ioud_x']:\n                sampled_inds = self.subsample_rois(max_overlaps=max_overlaps,gts = cur_gt[gt_assignment])\n            else:\n                sampled_inds = self.subsample_rois(max_overlaps=max_overlaps)\n\n            batch_rois[index] = cur_roi[sampled_inds]\n            batch_roi_labels[index] = cur_roi_labels[sampled_inds]\n            batch_roi_ious[index] = max_overlaps[sampled_inds]\n            batch_roi_scores[index] = cur_roi_scores[sampled_inds]\n            batch_gt_of_rois[index] = 
cur_gt[gt_assignment[sampled_inds]]\n\n        return batch_rois, batch_gt_of_rois, batch_roi_ious, batch_roi_scores, batch_roi_labels\n\n    def subsample_rois(self, max_overlaps, gts=None):\n        # sample fg, easy_bg, hard_bg\n        fg_rois_per_image = int(np.round(self.roi_sampler_cfg.FG_RATIO * self.roi_sampler_cfg.ROI_PER_IMAGE))\n\n        if gts is None:\n            fg_thresh = min(self.roi_sampler_cfg.REG_FG_THRESH, self.roi_sampler_cfg.CLS_FG_THRESH)\n            fg_inds = ((max_overlaps >= fg_thresh)).nonzero().view(-1)\n        else:\n            fg_inds = max_overlaps.new_zeros(max_overlaps.shape).long()\n            for i in range(len(self.roi_sampler_cfg.CLS_FG_THRESH)):\n                cls_mask = gts[...,-1] == (i+1)\n                this_fg_thresh = min(self.roi_sampler_cfg.REG_FG_THRESH[i], self.roi_sampler_cfg.CLS_FG_THRESH[i])\n                this_fg_inds = (max_overlaps >= this_fg_thresh) & cls_mask\n                fg_inds+=this_fg_inds\n            fg_inds = fg_inds.nonzero().view(-1)\n\n\n        easy_bg_inds = ((max_overlaps < self.roi_sampler_cfg.CLS_BG_THRESH_LO)).nonzero().view(-1)\n\n        if gts is None:\n            hard_bg_inds = ((max_overlaps < self.roi_sampler_cfg.REG_FG_THRESH) &\n                            (max_overlaps >= self.roi_sampler_cfg.CLS_BG_THRESH_LO)).nonzero().view(-1)\n        else:\n            hard_bg_inds = max_overlaps.new_zeros(max_overlaps.shape).long()\n            for i in range(len(self.roi_sampler_cfg.REG_FG_THRESH)):\n                cls_mask = gts[...,-1] == (i+1)\n                this_hard_bg_inds = ((max_overlaps < self.roi_sampler_cfg.REG_FG_THRESH[i]) &\n                                (max_overlaps >= self.roi_sampler_cfg.CLS_BG_THRESH_LO) & cls_mask)\n                hard_bg_inds+=this_hard_bg_inds\n            hard_bg_inds = hard_bg_inds.nonzero().view(-1)\n\n\n        fg_num_rois = fg_inds.numel()\n        bg_num_rois = hard_bg_inds.numel() + easy_bg_inds.numel()\n\n        if 
fg_num_rois > 0 and bg_num_rois > 0:\n            # sampling fg\n            fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)\n\n            rand_num = torch.from_numpy(np.random.permutation(fg_num_rois)).type_as(max_overlaps).long()\n            fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]\n\n            # sampling bg\n            bg_rois_per_this_image = self.roi_sampler_cfg.ROI_PER_IMAGE - fg_rois_per_this_image\n            bg_inds = self.sample_bg_inds(\n                hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, self.roi_sampler_cfg.HARD_BG_RATIO\n            )\n\n        elif fg_num_rois > 0 and bg_num_rois == 0:\n            # sampling fg\n            rand_num = np.floor(np.random.rand(self.roi_sampler_cfg.ROI_PER_IMAGE) * fg_num_rois)\n            rand_num = torch.from_numpy(rand_num).type_as(max_overlaps).long()\n            fg_inds = fg_inds[rand_num]\n            # empty index tensor (same dtype/device as fg_inds) so torch.cat below only sees tensors\n            bg_inds = fg_inds.new_zeros(0)\n\n        elif bg_num_rois > 0 and fg_num_rois == 0:\n            # sampling bg\n            bg_rois_per_this_image = self.roi_sampler_cfg.ROI_PER_IMAGE\n            bg_inds = self.sample_bg_inds(\n                hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, self.roi_sampler_cfg.HARD_BG_RATIO\n            )\n        else:\n            print('maxoverlaps:(min=%f, max=%f)' % (max_overlaps.min().item(), max_overlaps.max().item()))\n            print('ERROR: FG=%d, BG=%d' % (fg_num_rois, bg_num_rois))\n            raise NotImplementedError\n\n        sampled_inds = torch.cat((fg_inds, bg_inds), dim=0)\n        return sampled_inds\n\n    @staticmethod\n    def sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, hard_bg_ratio):\n        if hard_bg_inds.numel() > 0 and easy_bg_inds.numel() > 0:\n            hard_bg_rois_num = min(int(bg_rois_per_this_image * hard_bg_ratio), len(hard_bg_inds))\n            easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num\n\n            # sampling hard bg\n            rand_idx = 
torch.randint(low=0, high=hard_bg_inds.numel(), size=(hard_bg_rois_num,)).long()\n            hard_bg_inds = hard_bg_inds[rand_idx]\n\n            # sampling easy bg\n            rand_idx = torch.randint(low=0, high=easy_bg_inds.numel(), size=(easy_bg_rois_num,)).long()\n            easy_bg_inds = easy_bg_inds[rand_idx]\n\n            bg_inds = torch.cat([hard_bg_inds, easy_bg_inds], dim=0)\n        elif hard_bg_inds.numel() > 0 and easy_bg_inds.numel() == 0:\n            hard_bg_rois_num = bg_rois_per_this_image\n            # sampling hard bg\n            rand_idx = torch.randint(low=0, high=hard_bg_inds.numel(), size=(hard_bg_rois_num,)).long()\n            bg_inds = hard_bg_inds[rand_idx]\n        elif hard_bg_inds.numel() == 0 and easy_bg_inds.numel() > 0:\n            easy_bg_rois_num = bg_rois_per_this_image\n            # sampling easy bg\n            rand_idx = torch.randint(low=0, high=easy_bg_inds.numel(), size=(easy_bg_rois_num,)).long()\n            bg_inds = easy_bg_inds[rand_idx]\n        else:\n            raise NotImplementedError\n\n        return bg_inds\n\n    @staticmethod\n    def get_max_iou_with_same_class(rois, roi_labels, gt_boxes, gt_labels):\n        \"\"\"\n        Args:\n            rois: (N, 7)\n            roi_labels: (N)\n            gt_boxes: (N, )\n            gt_labels:\n\n        Returns:\n\n        \"\"\"\n        \"\"\"\n        :param rois: (N, 7)\n        :param roi_labels: (N)\n        :param gt_boxes: (N, 8)\n        :return:\n        \"\"\"\n        max_overlaps = rois.new_zeros(rois.shape[0])\n        gt_assignment = roi_labels.new_zeros(roi_labels.shape[0])\n\n        for k in range(gt_labels.min().item(), gt_labels.max().item() + 1):\n            roi_mask = (roi_labels == k)\n            gt_mask = (gt_labels == k)\n            if roi_mask.sum() > 0 and gt_mask.sum() > 0:\n                cur_roi = rois[roi_mask]\n                cur_gt = gt_boxes[gt_mask]\n                original_gt_assignment = 
gt_mask.nonzero().view(-1)\n\n                iou3d = iou3d_nms_utils.boxes_iou3d_gpu(cur_roi[:,0:7], cur_gt[:,0:7])  # (M, N)\n                cur_max_overlaps, cur_gt_assignment = torch.max(iou3d, dim=1)\n                max_overlaps[roi_mask] = cur_max_overlaps\n                gt_assignment[roi_mask] = original_gt_assignment[cur_gt_assignment]\n\n        return max_overlaps, gt_assignment\n\nclass ProposalTargetLayerT(nn.Module):\n    def __init__(self, roi_sampler_cfg):\n        super().__init__()\n        self.roi_sampler_cfg = roi_sampler_cfg\n\n    def limit(self,ang):\n        ang = ang % (2 * np.pi)\n\n        ang[ang > np.pi] = ang[ang > np.pi] - 2 * np.pi\n\n        ang[ang < -np.pi] = ang[ang < -np.pi] + 2 * np.pi\n\n        return ang\n\n    def ang_weight(self,pred, gt):\n\n        a = torch.abs(pred - gt)\n        b = 2 * np.pi - torch.abs(pred - gt)\n\n        res = torch.stack([a, b])\n\n        res = torch.min(res, 0)[0]\n\n        return 1 - res / np.pi\n\n    def forward(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                roi_scores: (B, num_rois)\n                gt_boxes: (B, N, 7 + C + 1)\n                roi_labels: (B, num_rois)\n        Returns:\n            batch_dict:\n                rois: (B, M, 7 + C)\n                gt_of_rois: (B, M, 7 + C)\n                gt_iou_of_rois: (B, M)\n                roi_scores: (B, M)\n                roi_labels: (B, M)\n                reg_valid_mask: (B, M)\n                rcnn_cls_labels: (B, M)\n        \"\"\"\n        batch_rois, batch_gt_of_rois, batch_roi_ious, batch_roi_mious, batch_roi_scores, batch_roi_labels, batch_gt_bbs_mask\\\n            = self.sample_rois_for_rcnn(\n            batch_dict=batch_dict,\n        )\n\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_mious' or self.roi_sampler_cfg.CLS_SCORE_TYPE == 'mcls':\n            batch_roi_ious = batch_roi_mious\n\n 
       # regression valid mask\n        reg_valid_mask = (batch_roi_ious > self.roi_sampler_cfg.REG_FG_THRESH).long()\n\n        # classification label\n        if self.roi_sampler_cfg.CLS_SCORE_TYPE == 'cls' or self.roi_sampler_cfg.CLS_SCORE_TYPE == 'mcls':\n            batch_cls_labels = (batch_roi_ious > self.roi_sampler_cfg.CLS_FG_THRESH).long()\n            ignore_mask = (batch_roi_ious > self.roi_sampler_cfg.CLS_BG_THRESH) & \\\n                          (batch_roi_ious < self.roi_sampler_cfg.CLS_FG_THRESH)\n            batch_cls_labels[ignore_mask > 0] = -1\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_ious' or self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_mious':\n            iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            fg_mask = batch_roi_ious > iou_fg_thresh\n            bg_mask = batch_roi_ious < iou_bg_thresh\n            interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n            batch_cls_labels = (fg_mask > 0).float()\n            batch_cls_labels[interval_mask] = \\\n                (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n        elif self.roi_sampler_cfg.CLS_SCORE_TYPE == 'roi_ioud':\n            iou_bg_thresh = self.roi_sampler_cfg.CLS_BG_THRESH\n            iou_fg_thresh = self.roi_sampler_cfg.CLS_FG_THRESH\n            fg_mask = batch_roi_ious > iou_fg_thresh\n            bg_mask = batch_roi_ious < iou_bg_thresh\n            interval_mask = (fg_mask == 0) & (bg_mask == 0)\n\n            batch_cls_labels = (fg_mask > 0).float()\n            batch_cls_labels[interval_mask] = \\\n                (batch_roi_ious[interval_mask] - iou_bg_thresh) / (iou_fg_thresh - iou_bg_thresh)\n\n            ang_roi = batch_rois[...,6]\n            ang_gt = batch_gt_of_rois[...,6]\n\n            ang_roi = self.limit(ang_roi)\n            ang_gt = self.limit(ang_gt)\n\n            ang_target = self.ang_weight(ang_roi,ang_gt)\n\n 
           ang_target = torch.clamp(ang_target,0.0,0.8)/0.8\n\n            batch_cls_labels*=ang_target\n\n\n        else:\n            raise NotImplementedError\n\n        targets_dict = {'rois': batch_rois, 'gt_of_rois': batch_gt_of_rois, 'gt_iou_of_rois': batch_roi_ious,\n                        'roi_scores': batch_roi_scores, 'roi_labels': batch_roi_labels,\n                        'reg_valid_mask': reg_valid_mask, 'gt_bbs_mask': batch_gt_bbs_mask,\n                        'rcnn_cls_labels': batch_cls_labels}\n\n        return targets_dict\n\n\n    def sample_rois_for_rcnn(self, batch_dict):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                roi_scores: (B, num_rois)\n                gt_boxes: (B, N, 7 + C + 1)\n                roi_labels: (B, num_rois)\n        Returns:\n\n        \"\"\"\n        batch_size = batch_dict['batch_size']\n        rois = batch_dict['rois']\n        roi_scores = batch_dict['roi_scores']\n        roi_labels = batch_dict['roi_labels']\n        gt_tracklets = batch_dict['gt_tracklets']\n\n        num_frame = gt_tracklets.shape[-1]//7\n\n        gt_bbs_mask = batch_dict['gt_bbs_mask']\n\n\n        gt_code_size = gt_tracklets.shape[-1]\n        roi_code_size = rois.shape[-1]\n        batch_rois = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, roi_code_size)\n        batch_gt_of_rois = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, gt_code_size )\n        batch_roi_scores = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE)\n        batch_roi_labels = rois.new_zeros((batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE), dtype=torch.long)\n\n        batch_all_roi_ious = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE,num_frame)\n        batch_gt_bbs_mask = rois.new_zeros(batch_size, self.roi_sampler_cfg.ROI_PER_IMAGE, num_frame)\n\n        for index in range(batch_size):\n            cur_roi, 
cur_gt, cur_roi_labels, cur_roi_scores,cur_bbs_mask = \\\n                rois[index], gt_tracklets[index], roi_labels[index], roi_scores[index],gt_bbs_mask[index]\n            k = cur_gt.__len__() - 1\n            while k > 0 and cur_gt[k].sum() == 0:\n                k -= 1\n            cur_gt = cur_gt[:k + 1]\n            cur_gt = cur_gt.new_zeros((1, cur_gt.shape[1])) if len(cur_gt) == 0 else cur_gt\n            cur_bbs_mask = cur_bbs_mask.new_zeros((1, cur_bbs_mask.shape[1])) if len(cur_bbs_mask) == 0 else cur_bbs_mask\n\n            if self.roi_sampler_cfg.get('SAMPLE_ROI_BY_EACH_CLASS', False):\n                max_overlaps, gt_assignment = self.get_max_iou_with_same_class(\n                    rois=cur_roi[:,0:7], roi_labels=cur_roi_labels,\n                    gt_boxes=cur_gt[:, 0:7], gt_labels=cur_gt[:, -1].long()\n                )\n            else:\n                iou3d = iou3d_nms_utils.boxes_iou3d_gpu(cur_roi[:, 0:7], cur_gt[:, 0:7])  # (M, N)\n                max_overlaps, gt_assignment = torch.max(iou3d, dim=1)\n\n            sampled_inds = self.subsample_rois(max_overlaps=max_overlaps)\n\n            batch_rois[index] = cur_roi[sampled_inds]\n            batch_roi_labels[index] = cur_roi_labels[sampled_inds]\n            batch_roi_scores[index] = cur_roi_scores[sampled_inds]\n            batch_gt_of_rois[index] = cur_gt[gt_assignment[sampled_inds]]\n\n            batch_all_roi_ious[index,:,0] = max_overlaps[sampled_inds]\n\n            batch_gt_bbs_mask[index] = cur_bbs_mask[gt_assignment[sampled_inds]]\n\n        for i in range(1, num_frame):\n            for j in range(batch_size):\n\n                this_roi = batch_rois[j,:,i*7:i*7+7]\n                this_gt_of_roi = batch_gt_of_rois[j,:,i*7:i*7+7]\n\n                all_ious = iou3d_nms_utils.boxes_iou3d_gpu(this_roi[:, 0:7], this_gt_of_roi[:, 0:7])\n                box_num = this_roi.shape[0]\n\n                ious = all_ious[range(box_num),range(box_num)]\n\n                
batch_all_roi_ious[j,:,i] = ious\n\n\n        tracks_mean_ious = batch_all_roi_ious.sum(-1)/(batch_gt_bbs_mask.sum(-1)+0.00001)\n\n        return batch_rois, batch_gt_of_rois,batch_all_roi_ious[...,0], tracks_mean_ious, batch_roi_scores, batch_roi_labels, batch_gt_bbs_mask\n\n    def subsample_rois(self, max_overlaps):\n        # sample fg, easy_bg, hard_bg\n        fg_rois_per_image = int(np.round(self.roi_sampler_cfg.FG_RATIO * self.roi_sampler_cfg.ROI_PER_IMAGE))\n        fg_thresh = min(self.roi_sampler_cfg.REG_FG_THRESH, self.roi_sampler_cfg.CLS_FG_THRESH)\n\n        fg_inds = ((max_overlaps >= fg_thresh)).nonzero().view(-1)\n        easy_bg_inds = ((max_overlaps < self.roi_sampler_cfg.CLS_BG_THRESH_LO)).nonzero().view(-1)\n        hard_bg_inds = ((max_overlaps < self.roi_sampler_cfg.REG_FG_THRESH) &\n                (max_overlaps >= self.roi_sampler_cfg.CLS_BG_THRESH_LO)).nonzero().view(-1)\n\n        fg_num_rois = fg_inds.numel()\n        bg_num_rois = hard_bg_inds.numel() + easy_bg_inds.numel()\n\n        if fg_num_rois > 0 and bg_num_rois > 0:\n            # sampling fg\n            fg_rois_per_this_image = min(fg_rois_per_image, fg_num_rois)\n\n            rand_num = torch.from_numpy(np.random.permutation(fg_num_rois)).type_as(max_overlaps).long()\n            fg_inds = fg_inds[rand_num[:fg_rois_per_this_image]]\n\n            # sampling bg\n            bg_rois_per_this_image = self.roi_sampler_cfg.ROI_PER_IMAGE - fg_rois_per_this_image\n            bg_inds = self.sample_bg_inds(\n                hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, self.roi_sampler_cfg.HARD_BG_RATIO\n            )\n\n        elif fg_num_rois > 0 and bg_num_rois == 0:\n            # sampling fg\n            rand_num = np.floor(np.random.rand(self.roi_sampler_cfg.ROI_PER_IMAGE) * fg_num_rois)\n            rand_num = torch.from_numpy(rand_num).type_as(max_overlaps).long()\n            fg_inds = fg_inds[rand_num]\n            # empty index tensor (same dtype/device as fg_inds) so torch.cat below only sees tensors\n            bg_inds = fg_inds.new_zeros(0)\n\n        elif bg_num_rois > 0 and 
fg_num_rois == 0:\n            # sampling bg\n            bg_rois_per_this_image = self.roi_sampler_cfg.ROI_PER_IMAGE\n            bg_inds = self.sample_bg_inds(\n                hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, self.roi_sampler_cfg.HARD_BG_RATIO\n            )\n        else:\n            print('maxoverlaps:(min=%f, max=%f)' % (max_overlaps.min().item(), max_overlaps.max().item()))\n            print('ERROR: FG=%d, BG=%d' % (fg_num_rois, bg_num_rois))\n            raise NotImplementedError\n\n        sampled_inds = torch.cat((fg_inds, bg_inds), dim=0)\n        return sampled_inds\n\n    @staticmethod\n    def sample_bg_inds(hard_bg_inds, easy_bg_inds, bg_rois_per_this_image, hard_bg_ratio):\n        if hard_bg_inds.numel() > 0 and easy_bg_inds.numel() > 0:\n            hard_bg_rois_num = min(int(bg_rois_per_this_image * hard_bg_ratio), len(hard_bg_inds))\n            easy_bg_rois_num = bg_rois_per_this_image - hard_bg_rois_num\n\n            # sampling hard bg\n            rand_idx = torch.randint(low=0, high=hard_bg_inds.numel(), size=(hard_bg_rois_num,)).long()\n            hard_bg_inds = hard_bg_inds[rand_idx]\n\n            # sampling easy bg\n            rand_idx = torch.randint(low=0, high=easy_bg_inds.numel(), size=(easy_bg_rois_num,)).long()\n            easy_bg_inds = easy_bg_inds[rand_idx]\n\n            bg_inds = torch.cat([hard_bg_inds, easy_bg_inds], dim=0)\n        elif hard_bg_inds.numel() > 0 and easy_bg_inds.numel() == 0:\n            hard_bg_rois_num = bg_rois_per_this_image\n            # sampling hard bg\n            rand_idx = torch.randint(low=0, high=hard_bg_inds.numel(), size=(hard_bg_rois_num,)).long()\n            bg_inds = hard_bg_inds[rand_idx]\n        elif hard_bg_inds.numel() == 0 and easy_bg_inds.numel() > 0:\n            easy_bg_rois_num = bg_rois_per_this_image\n            # sampling easy bg\n            rand_idx = torch.randint(low=0, high=easy_bg_inds.numel(), size=(easy_bg_rois_num,)).long()\n            
bg_inds = easy_bg_inds[rand_idx]\n        else:\n            raise NotImplementedError\n\n        return bg_inds\n\n    @staticmethod\n    def get_max_iou_with_same_class(rois, roi_labels, gt_boxes, gt_labels):\n        \"\"\"\n        Args:\n            rois: (N, 7)\n            roi_labels: (N)\n            gt_boxes: (N, )\n            gt_labels:\n\n        Returns:\n\n        \"\"\"\n        \"\"\"\n        :param rois: (N, 7)\n        :param roi_labels: (N)\n        :param gt_boxes: (N, 8)\n        :return:\n        \"\"\"\n        max_overlaps = rois.new_zeros(rois.shape[0])\n        gt_assignment = roi_labels.new_zeros(roi_labels.shape[0])\n\n        for k in range(gt_labels.min().item(), gt_labels.max().item() + 1):\n            roi_mask = (roi_labels == k)\n            gt_mask = (gt_labels == k)\n            if roi_mask.sum() > 0 and gt_mask.sum() > 0:\n                cur_roi = rois[roi_mask]\n                cur_gt = gt_boxes[gt_mask]\n                original_gt_assignment = gt_mask.nonzero().view(-1)\n\n                iou3d = iou3d_nms_utils.boxes_iou3d_gpu(cur_roi[:,0:7], cur_gt[:,0:7])  # (M, N)\n                cur_max_overlaps, cur_gt_assignment = torch.max(iou3d, dim=1)\n                max_overlaps[roi_mask] = cur_max_overlaps\n                gt_assignment[roi_mask] = original_gt_assignment[cur_gt_assignment]\n\n        return max_overlaps, gt_assignment\n"
  },
  {
    "path": "pcdet/models/roi_heads/ted_head.py",
    "content": "import torch\nimport torch.nn as nn\nfrom .roi_head_template import RoIHeadTemplate\nfrom ...utils import common_utils, spconv_utils\nfrom ...ops.pointnet2.pointnet2_stack import voxel_pool_modules as voxelpool_stack_modules\nfrom torch.autograd import Variable\nimport torch.nn.functional as F\nimport numpy as np\nfrom functools import partial\nimport pickle\nimport copy\n\nfrom pcdet.datasets.augmentor.X_transform import X_TRANS\n\nclass PositionalEmbedding(nn.Module):\n    def __init__(self, demb=256):\n        super(PositionalEmbedding, self).__init__()\n\n        self.demb = demb\n\n        inv_freq = 1 / (10000 ** (torch.arange(0.0, demb, 2.0) / demb))\n        self.register_buffer('inv_freq', inv_freq)\n\n    # pos_seq =  pos_seq = torch.arange(seq_len-1, -1, -1.0)\n    def forward(self, pos_seq, batch_size=2):\n        sinusoid_inp = torch.ger(pos_seq, self.inv_freq)\n        pos_emb = torch.cat([sinusoid_inp.sin(), sinusoid_inp.cos()], dim=-1)\n\n        if batch_size is not None:\n            return pos_emb[:, None, :].expand(-1, batch_size, -1)\n        else:\n            return pos_emb[:, None, :]\n\nclass CrossAttention(nn.Module):\n\n    def __init__(self, hidden_dim, pos = True, head = 4):\n        super(CrossAttention, self).__init__()\n\n        self.hidden_dim = hidden_dim\n        self.pos_dim = 8\n        self.pos = pos\n\n        if self.pos:\n            self.pos_en = PositionalEmbedding(self.pos_dim)\n\n            self.Q_linear = nn.Linear(hidden_dim+self.pos_dim, hidden_dim, bias=False)\n            self.K_linear = nn.Linear(hidden_dim+self.pos_dim, hidden_dim, bias=False)\n            self.V_linear = nn.Linear(hidden_dim+self.pos_dim, hidden_dim, bias=False)\n        else:\n\n            self.Q_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n            self.K_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n            self.V_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n\n        self.att = 
nn.MultiheadAttention(hidden_dim, head)\n\n\n    def forward(self, inputs, Q_in): # N,B,C\n\n        batch_size = inputs.shape[1]\n        seq_len = inputs.shape[0]\n\n        if self.pos:\n            pos_input = torch.from_numpy(np.arange(seq_len)+1).cuda()\n            pos_input = self.pos_en(pos_input, batch_size)\n            inputs_pos = torch.cat([inputs, pos_input], -1)\n            pos_Q = torch.from_numpy(np.array([seq_len])).cuda()\n            pos_Q = self.pos_en(pos_Q, batch_size)\n            Q_in_pos = torch.cat([Q_in, pos_Q], -1)\n        else:\n            inputs_pos = inputs\n            Q_in_pos = Q_in\n\n        Q = self.Q_linear(Q_in_pos)\n        K = self.K_linear(inputs_pos)\n        V = self.V_linear(inputs_pos)\n\n        out = self.att(Q, K, V)\n\n        return out[0]\n\nclass Attention_Layer(nn.Module):\n\n    def __init__(self, hidden_dim):\n        super(Attention_Layer, self).__init__()\n\n        self.hidden_dim = hidden_dim\n\n        self.Q_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n        self.K_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n        self.V_linear = nn.Linear(hidden_dim, hidden_dim, bias=False)\n\n    def forward(self, inputs): # B,K,N\n\n\n        Q = self.Q_linear(inputs)\n        K = self.K_linear(inputs).permute(0, 2, 1)\n        V = self.V_linear(inputs)\n\n        alpha = torch.matmul(Q, K)\n\n        alpha = F.softmax(alpha, dim=2)\n\n        out = torch.matmul(alpha, V)\n\n        out = torch.mean(out, -2)\n\n        return out\n\ndef gen_sample_grid(rois, grid_size=7, grid_offsets=(0, 0), spatial_scale=1.):\n    faked_features = rois.new_ones((grid_size, grid_size))\n    N = rois.shape[0]\n    dense_idx = faked_features.nonzero()  # (N, 2) [x_idx, y_idx]\n    dense_idx = dense_idx.repeat(N, 1, 1).float()  # (B, 7 * 7, 2)\n\n    local_roi_size = rois.view(N, -1)[:, 3:5]\n    local_roi_grid_points = (dense_idx ) / (grid_size-1) * local_roi_size.unsqueeze(dim=1) \\\n                     
 - (local_roi_size.unsqueeze(dim=1) / 2)  # (B, 7 * 7, 2)\n\n    ones = torch.ones_like(local_roi_grid_points[..., 0:1])\n    local_roi_grid_points = torch.cat([local_roi_grid_points, ones], -1)\n\n    global_roi_grid_points = common_utils.rotate_points_along_z(\n        local_roi_grid_points.clone(), rois[:, 6]\n    ).squeeze(dim=1)\n    global_center = rois[:, 0:3].clone()\n    global_roi_grid_points += global_center.unsqueeze(dim=1)\n\n    x = global_roi_grid_points[..., 0:1]\n    y = global_roi_grid_points[..., 1:2]\n\n    x = (x.permute(1, 2, 0).contiguous() + grid_offsets[0]) * spatial_scale\n    y = (y.permute(1, 2, 0).contiguous() + grid_offsets[1]) * spatial_scale\n\n    return x.view(grid_size**2, -1), y.view(grid_size**2, -1)\n\ndef bilinear_interpolate_torch_gridsample(image, samples_x, samples_y):\n    C, H, W = image.shape\n    image = image.unsqueeze(1)  # change to:  C x 1 x H x W        C,K,1,2   C,K,1,1\n\n    samples_x = samples_x.unsqueeze(2)\n    samples_x = samples_x.unsqueeze(3)# 49,K,1,1\n    samples_y = samples_y.unsqueeze(2)\n    samples_y = samples_y.unsqueeze(3)\n\n    samples = torch.cat([samples_x, samples_y], 3)\n    samples[:, :, :, 0] = (samples[:, :, :, 0] / W)  # normalize to between  0 and 1\n\n    samples[:, :, :, 1] = (samples[:, :, :, 1] / H)  # normalize to between  0 and 1\n    samples = samples * 2 - 1  # normalize to between -1 and 1  # 49,K,1,2\n\n    #B,C,H,W\n    #B,H,W,2\n    #B,C,H,W\n\n    return torch.nn.functional.grid_sample(image, samples, align_corners=False)\n\nclass TEDSHead(RoIHeadTemplate):\n    def __init__(self, input_channels, model_cfg, point_cloud_range=None, voxel_size=None, num_class=1,\n                 **kwargs):\n        super().__init__(num_class=num_class,  model_cfg=model_cfg)\n        self.model_cfg = model_cfg\n        self.pool_cfg = model_cfg.ROI_GRID_POOL\n        LAYER_cfg = self.pool_cfg.POOL_LAYERS\n        self.point_cloud_range = point_cloud_range\n        self.voxel_size = 
voxel_size\n        self.rot_num = 3\n\n        self.x_trans_train = X_TRANS()\n\n        c_out = 0\n        self.roi_grid_pool_layers = nn.ModuleList()\n        for src_name in self.pool_cfg.FEATURES_SOURCE:\n            mlps = LAYER_cfg[src_name].MLPS\n            for k in range(len(mlps)):\n                mlps[k] = [input_channels[src_name]] + mlps[k]\n            pool_layer = voxelpool_stack_modules.NeighborVoxelSAModuleMSG(\n                query_ranges=LAYER_cfg[src_name].QUERY_RANGES,\n                nsamples=LAYER_cfg[src_name].NSAMPLE,\n                radii=LAYER_cfg[src_name].POOL_RADIUS,\n                mlps=mlps,\n                pool_method=LAYER_cfg[src_name].POOL_METHOD,\n            )\n\n            self.roi_grid_pool_layers.append(pool_layer)\n\n            c_out += sum([x[-1] for x in mlps])\n\n        GRID_SIZE = self.model_cfg.ROI_GRID_POOL.GRID_SIZE\n        pre_channel = GRID_SIZE * GRID_SIZE * GRID_SIZE * c_out\n        shared_fc_list = []\n        for k in range(0, self.model_cfg.SHARED_FC.__len__()):\n            shared_fc_list.extend([\n                nn.Linear(pre_channel, self.model_cfg.SHARED_FC[k], bias=False),\n                nn.BatchNorm1d(self.model_cfg.SHARED_FC[k]),\n                nn.ReLU(inplace=True)\n            ])\n            pre_channel = self.model_cfg.SHARED_FC[k]\n\n            if k != self.model_cfg.SHARED_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                shared_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n        self.shared_fc_layers=nn.Sequential(*shared_fc_list)\n\n\n        self.shared_channel = pre_channel\n\n        pre_channel = self.model_cfg.SHARED_FC[-1] * 2\n        cls_fc_list = []\n        for k in range(0, self.model_cfg.CLS_FC.__len__()):\n            cls_fc_list.extend([\n                nn.Linear(pre_channel, self.model_cfg.CLS_FC[k], bias=False),\n                nn.BatchNorm1d(self.model_cfg.CLS_FC[k]),\n                nn.ReLU()\n            ])\n            
pre_channel = self.model_cfg.CLS_FC[k]\n\n            if k != self.model_cfg.CLS_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                cls_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n\n        cls_fc_list.append(nn.Linear(pre_channel, self.num_class, bias=True))\n        self.cls_layers=nn.Sequential(*cls_fc_list)\n\n        pre_channel = self.model_cfg.SHARED_FC[-1] * 2\n        reg_fc_list = []\n        for k in range(0, self.model_cfg.REG_FC.__len__()):\n            reg_fc_list.extend([\n                nn.Linear(pre_channel, self.model_cfg.REG_FC[k], bias=False),\n                nn.BatchNorm1d(self.model_cfg.REG_FC[k]),\n                nn.ReLU()\n            ])\n            pre_channel = self.model_cfg.REG_FC[k]\n\n            if k != self.model_cfg.REG_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                reg_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n\n        reg_fc_list.append(nn.Linear(pre_channel, self.box_coder.code_size * self.num_class, bias=True))\n        reg_fc_layers = nn.Sequential(*reg_fc_list)\n        self.reg_layers=reg_fc_layers\n\n\n        self.cross_attention_layers = Attention_Layer(self.shared_channel)\n\n\n        self.init_weights()\n        self.ious = {0: [], 1: [], 2: [], 3: []}\n\n    def init_weights(self):\n        init_func = nn.init.xavier_normal_\n        for trans_module in [self.cls_layers, self.reg_layers]:\n            for m in trans_module.modules():\n                if isinstance(m, nn.Linear):\n                    init_func(m.weight)\n                    if m.bias is not None:\n                        nn.init.constant_(m.bias, 0)\n        for trans_module in [self.cls_layers, self.reg_layers]:\n            nn.init.normal_(trans_module[-1].weight, 0, 0.01)\n            nn.init.constant_(trans_module[-1].bias, 0)\n        for m in self.shared_fc_layers.modules():\n            if isinstance(m, nn.Linear):\n                init_func(m.weight)\n                if m.bias is not 
None:\n                    nn.init.constant_(m.bias, 0)\n\n\n    def roi_grid_pool(self, batch_dict, i):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                point_coords: (num_points, 4)  [bs_idx, x, y, z]\n                point_features: (num_points, C)\n                point_cls_scores: (N1 + N2 + N3 + ..., 1)\n                point_part_offset: (N1 + N2 + N3 + ..., 3)\n        Returns:\n\n        \"\"\"\n\n        if i==0:\n            rot_num_id = ''\n        else:\n            rot_num_id = str(i)\n\n        rois = batch_dict['rois'].clone()\n\n        batch_size = batch_dict['batch_size']\n        with_vf_transform = batch_dict.get('with_voxel_feature_transform', False)\n\n        roi_grid_xyz, _ = self.get_global_grid_points_of_roi(\n            rois, grid_size=self.pool_cfg.GRID_SIZE\n        )  # (BxN, 6x6x6, 3)\n        # roi_grid_xyz: (B, Nx6x6x6, 3)\n        roi_grid_xyz = roi_grid_xyz.view(batch_size, -1, 3)\n\n        # compute the voxel coordinates of grid points\n        roi_grid_coords_x = (roi_grid_xyz[:, :, 0:1] - self.point_cloud_range[0]) // self.voxel_size[0]\n        roi_grid_coords_y = (roi_grid_xyz[:, :, 1:2] - self.point_cloud_range[1]) // self.voxel_size[1]\n        roi_grid_coords_z = (roi_grid_xyz[:, :, 2:3] - self.point_cloud_range[2]) // self.voxel_size[2]\n        # roi_grid_coords: (B, Nx6x6x6, 3)\n        roi_grid_coords = torch.cat([roi_grid_coords_x, roi_grid_coords_y, roi_grid_coords_z], dim=-1)\n\n        batch_idx = rois.new_zeros(batch_size, roi_grid_coords.shape[1], 1)\n        for bs_idx in range(batch_size):\n            batch_idx[bs_idx, :, 0] = bs_idx\n        # roi_grid_coords: (B, Nx6x6x6, 4)\n        # roi_grid_coords = torch.cat([batch_idx, roi_grid_coords], dim=-1)\n        # roi_grid_coords = roi_grid_coords.int()\n        roi_grid_batch_cnt = rois.new_zeros(batch_size).int().fill_(roi_grid_coords.shape[1])\n\n        
pooled_features_list = []\n        for k, src_name in enumerate(self.pool_cfg.FEATURES_SOURCE):\n            pool_layer = self.roi_grid_pool_layers[k]\n            if src_name in ['x_conv1', 'x_conv2', 'x_conv3', 'x_conv4']:\n\n                cur_stride = batch_dict['multi_scale_3d_strides'][src_name]\n\n                j=i\n                while 'multi_scale_3d_features'+rot_num_id not in batch_dict:\n                    j-=1\n                    rot_num_id = str(j)\n\n                cur_sp_tensors = batch_dict['multi_scale_3d_features'+rot_num_id][src_name]\n\n                if with_vf_transform:\n                    cur_sp_tensors = batch_dict['multi_scale_3d_features_post'][src_name]\n                else:\n                    cur_sp_tensors = batch_dict['multi_scale_3d_features'+rot_num_id][src_name]\n\n                # compute voxel center xyz and batch_cnt\n                cur_coords = cur_sp_tensors.indices\n                cur_voxel_xyz = common_utils.get_voxel_centers(\n                    cur_coords[:, 1:4],\n                    downsample_times=cur_stride,\n                    voxel_size=self.voxel_size,\n                    point_cloud_range=self.point_cloud_range\n                )  #\n                cur_voxel_xyz_batch_cnt = cur_voxel_xyz.new_zeros(batch_size).int()\n                for bs_idx in range(batch_size):\n                    cur_voxel_xyz_batch_cnt[bs_idx] = (cur_coords[:, 0] == bs_idx).sum()\n                # get voxel2point tensor\n\n                v2p_ind_tensor = spconv_utils.generate_voxel2pinds(cur_sp_tensors)\n\n                # compute the grid coordinates in this scale, in [batch_idx, x y z] order\n                cur_roi_grid_coords = roi_grid_coords // cur_stride\n                cur_roi_grid_coords = torch.cat([batch_idx, cur_roi_grid_coords], dim=-1)\n                cur_roi_grid_coords = cur_roi_grid_coords.int()\n                # voxel neighbor aggregation\n                pooled_features = pool_layer(\n             
       xyz=cur_voxel_xyz.contiguous(),\n                    xyz_batch_cnt=cur_voxel_xyz_batch_cnt,\n                    new_xyz=roi_grid_xyz.contiguous().view(-1, 3),\n                    new_xyz_batch_cnt=roi_grid_batch_cnt,\n                    new_coords=cur_roi_grid_coords.contiguous().view(-1, 4),\n                    features=cur_sp_tensors.features.contiguous(),\n                    voxel2point_indices=v2p_ind_tensor\n                )\n\n                pooled_features = pooled_features.view(\n                    -1, self.pool_cfg.GRID_SIZE ** 3,\n                    pooled_features.shape[-1]\n                )  # (BxN, 6x6x6, C)\n                pooled_features_list.append(pooled_features)\n\n        ms_pooled_features = torch.cat(pooled_features_list, dim=-1)\n\n        return ms_pooled_features\n\n    def get_global_grid_points_of_roi(self, rois, grid_size):\n        rois = rois.view(-1, rois.shape[-1])\n        batch_size_rcnn = rois.shape[0]\n\n        local_roi_grid_points = self.get_dense_grid_points(rois, batch_size_rcnn, grid_size)  # (B, 6x6x6, 3)\n        global_roi_grid_points = common_utils.rotate_points_along_z(\n            local_roi_grid_points.clone(), rois[:, 6]\n        ).squeeze(dim=1)\n        global_center = rois[:, 0:3].clone()\n        global_roi_grid_points += global_center.unsqueeze(dim=1)\n        return global_roi_grid_points, local_roi_grid_points\n\n    @staticmethod\n    def get_dense_grid_points(rois, batch_size_rcnn, grid_size):\n        faked_features = rois.new_ones((grid_size, grid_size, grid_size))\n        dense_idx = faked_features.nonzero()  # (N, 3) [x_idx, y_idx, z_idx]\n        dense_idx = dense_idx.repeat(batch_size_rcnn, 1, 1).float()  # (B, 6x6x6, 3)\n\n        local_roi_size = rois.view(batch_size_rcnn, -1)[:, 3:6]\n        roi_grid_points = (dense_idx + 0.5) / grid_size * local_roi_size.unsqueeze(dim=1) \\\n                          - (local_roi_size.unsqueeze(dim=1) / 2)  # (B, 6x6x6, 3)\n        return 
roi_grid_points\n\n    def roi_x_trans(self, rois, rot_num_i, transform_param):\n\n        batch_size = len(rois)\n        rois = rois.clone()\n\n        x_transformed_roi = []\n\n\n        for bt_i in range(batch_size):\n\n            cur_roi = rois[bt_i]\n            bt_transform_param = transform_param[bt_i]\n            previous_trans_param = bt_transform_param[rot_num_i-1]\n            current_trans_param = bt_transform_param[rot_num_i]\n\n            transed_roi = self.x_trans_train.backward_with_param({'boxes': cur_roi,\n                                                                  'transform_param': previous_trans_param})\n            transed_roi = self.x_trans_train.forward_with_param({'boxes': transed_roi['boxes'],\n                                                                  'transform_param': current_trans_param})\n\n            x_transformed_roi.append(transed_roi['boxes'])\n\n        return torch.stack(x_transformed_roi)\n\n    def pred_x_trans(self, preds, rot_num_i, transform_param):\n\n        batch_size = len(preds)\n        preds = preds.clone()\n\n        x_transformed_roi = []\n\n        for bt_i in range(batch_size):\n\n            cur_roi = preds[bt_i]\n            bt_transform_param = transform_param[bt_i]\n            current_trans_param = bt_transform_param[rot_num_i]\n\n            transed_roi = self.x_trans_train.backward_with_param({'boxes': cur_roi,\n                                                                  'transform_param': current_trans_param})\n\n            x_transformed_roi.append(transed_roi['boxes'])\n\n        return torch.stack(x_transformed_roi)\n\n    def multi_grid_pool_aggregation(self, batch_dict, targets_dict):\n\n        all_preds = []\n        all_scores = []\n\n        all_shared_features = []\n\n        for i in range(self.rot_num):\n\n            rot_num_id = str(i)\n\n            if i >= 1 and 'transform_param' in batch_dict:\n                batch_dict['rois'] = 
self.roi_x_trans(batch_dict['rois'], i, batch_dict['transform_param'])\n\n            if self.training:\n                targets_dict = self.assign_targets(batch_dict, i, enable_dif=True)\n\n                batch_dict['rois'] = targets_dict['rois']\n\n                batch_dict['roi_labels'] = targets_dict['roi_labels']\n\n            if 'transform_param' in batch_dict:\n                pooled_features = self.roi_grid_pool(batch_dict, i)\n            else:\n                pooled_features = self.roi_grid_pool(batch_dict, 0)\n\n            pooled_features = pooled_features.view(pooled_features.size(0), -1)\n\n            shared_features = self.shared_fc_layers(pooled_features)\n            shared_features = shared_features.unsqueeze(0)  # 1,B,C\n            all_shared_features.append(shared_features)\n            pre_feat = torch.cat(all_shared_features, 0)\n            attentive_cur_feat = self.cross_attention_layers(pre_feat.permute(1, 0, 2)).unsqueeze(0)\n            attentive_cur_feat = torch.cat([attentive_cur_feat, shared_features], -1)\n            attentive_cur_feat = attentive_cur_feat.squeeze(0)  # B, C*2\n\n            rcnn_cls = self.cls_layers(attentive_cur_feat)\n            rcnn_reg = self.reg_layers(attentive_cur_feat)\n\n            batch_cls_preds, batch_box_preds = self.generate_predicted_boxes(\n                batch_size=batch_dict['batch_size'], rois=batch_dict['rois'], cls_preds=rcnn_cls, box_preds=rcnn_reg\n            )\n\n            if self.training:\n\n                targets_dict['rcnn_cls'] = rcnn_cls\n                targets_dict['rcnn_reg'] = rcnn_reg\n\n                self.forward_ret_dict['targets_dict' + rot_num_id] = targets_dict\n\n            batch_dict['rois'] = batch_box_preds\n            batch_dict['roi_scores'] = batch_cls_preds.squeeze(-1)\n\n            outs = batch_box_preds.clone()\n            if 'transform_param' in batch_dict:\n                outs = self.pred_x_trans(outs, i, batch_dict['transform_param'])\n\n      
      all_preds.append(outs)\n            all_scores.append(batch_cls_preds)\n\n        return torch.mean(torch.stack(all_preds), 0), torch.mean(torch.stack(all_scores), 0)\n\n\n    def forward(self, batch_dict):\n\n        if 'transform_param' in batch_dict:\n            trans_param = batch_dict['transform_param']\n            self.rot_num = trans_param.shape[1]\n\n        targets_dict = self.proposal_layer(\n            batch_dict, nms_config=self.model_cfg.NMS_CONFIG['TRAIN' if self.training else 'TEST']\n        )\n\n        boxes, scores = self.multi_grid_pool_aggregation(batch_dict, targets_dict)\n\n        if not self.training:\n            batch_dict['batch_box_preds'] = boxes\n            batch_dict['batch_cls_preds'] = scores\n\n        return batch_dict\n    \nclass TEDMHead(RoIHeadTemplate):\n    def __init__(self, input_channels, model_cfg, point_cloud_range=None, voxel_size=None, num_class=1,\n                 **kwargs):\n        super().__init__(num_class=num_class, model_cfg=model_cfg)\n        self.model_cfg = model_cfg\n        self.pool_cfg = model_cfg.ROI_GRID_POOL\n        self.pool_cfg_mm = model_cfg.ROI_GRID_POOL_MM\n        LAYER_cfg = self.pool_cfg.POOL_LAYERS\n        LAYER_cfg_mm = self.pool_cfg_mm.POOL_LAYERS\n        self.point_cloud_range = point_cloud_range\n        self.voxel_size = voxel_size\n        self.rot_num = 3\n\n        self.x_trans_train = X_TRANS()\n\n        c_out = 0\n        self.roi_grid_pool_layers = nn.ModuleList()\n        for src_name in self.pool_cfg.FEATURES_SOURCE:\n            mlps = LAYER_cfg[src_name].MLPS\n            for k in range(len(mlps)):\n                mlps[k] = [input_channels[src_name]] + mlps[k]\n            pool_layer = voxelpool_stack_modules.NeighborVoxelSAModuleMSG(\n                query_ranges=LAYER_cfg[src_name].QUERY_RANGES,\n                nsamples=LAYER_cfg[src_name].NSAMPLE,\n                radii=LAYER_cfg[src_name].POOL_RADIUS,\n                mlps=mlps,\n                
pool_method=LAYER_cfg[src_name].POOL_METHOD,\n            )\n\n            self.roi_grid_pool_layers.append(pool_layer)\n\n            c_out += sum([x[-1] for x in mlps])\n\n        c_out_mm = 0\n        self.roi_grid_pool_layers_mm = nn.ModuleList()\n        feat = self.pool_cfg_mm.get('FEAT_NUM', 1)\n        for src_name in self.pool_cfg_mm.FEATURES_SOURCE:\n            mlps = LAYER_cfg_mm[src_name].MLPS\n            for k in range(len(mlps)):\n                mlps[k] = [input_channels[src_name]*feat] + mlps[k]\n            pool_layer = voxelpool_stack_modules.NeighborVoxelSAModuleMSG(\n                query_ranges=LAYER_cfg_mm[src_name].QUERY_RANGES,\n                nsamples=LAYER_cfg_mm[src_name].NSAMPLE,\n                radii=LAYER_cfg_mm[src_name].POOL_RADIUS,\n                mlps=mlps,\n                pool_method=LAYER_cfg_mm[src_name].POOL_METHOD,\n            )\n\n            self.roi_grid_pool_layers_mm.append(pool_layer)\n\n            c_out_mm += sum([x[-1] for x in mlps])\n\n\n\n        self.shared_fc_layers = nn.ModuleList()\n\n        for i in range(self.rot_num):\n            GRID_SIZE = self.model_cfg.ROI_GRID_POOL.GRID_SIZE\n            pre_channel = GRID_SIZE * GRID_SIZE * GRID_SIZE * c_out\n            shared_fc_list = []\n            for k in range(0, self.model_cfg.SHARED_FC.__len__()):\n                shared_fc_list.extend([\n                    nn.Linear(pre_channel, self.model_cfg.SHARED_FC[k], bias=False),\n                    nn.BatchNorm1d(self.model_cfg.SHARED_FC[k]),\n                    nn.ReLU(inplace=True)\n                ])\n                pre_channel = self.model_cfg.SHARED_FC[k]\n\n                if k != self.model_cfg.SHARED_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                    shared_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n            self.shared_fc_layers.append(nn.Sequential(*shared_fc_list))\n            break\n\n        self.shared_fc_layers_mm = nn.ModuleList()\n\n        for i in 
range(self.rot_num):\n            GRID_SIZE = self.model_cfg.ROI_GRID_POOL_MM.GRID_SIZE\n            pre_channel = GRID_SIZE * GRID_SIZE * GRID_SIZE * c_out_mm\n            shared_fc_list = []\n            for k in range(0, self.model_cfg.SHARED_FC.__len__()):\n                shared_fc_list.extend([\n                    nn.Linear(pre_channel, self.model_cfg.SHARED_FC[k], bias=False),\n                    nn.BatchNorm1d(self.model_cfg.SHARED_FC[k]),\n                    nn.ReLU(inplace=True)\n                ])\n                pre_channel = self.model_cfg.SHARED_FC[k]\n\n                if k != self.model_cfg.SHARED_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                    shared_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n            self.shared_fc_layers_mm.append(nn.Sequential(*shared_fc_list))\n            break\n\n        self.shared_channel = pre_channel\n\n        self.cls_layers = nn.ModuleList()\n        self.reg_layers = nn.ModuleList()\n\n        for i in range(self.rot_num):\n            pre_channel = self.model_cfg.SHARED_FC[-1] * 2 * 2\n            cls_fc_list = []\n            for k in range(0, self.model_cfg.CLS_FC.__len__()):\n                cls_fc_list.extend([\n                    nn.Linear(pre_channel, self.model_cfg.CLS_FC[k], bias=False),\n                    nn.BatchNorm1d(self.model_cfg.CLS_FC[k]),\n                    nn.ReLU()\n                ])\n                pre_channel = self.model_cfg.CLS_FC[k]\n\n                if k != self.model_cfg.CLS_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                    cls_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n\n            cls_fc_list.append(nn.Linear(pre_channel, self.num_class, bias=True))\n            cls_fc_layers = nn.Sequential(*cls_fc_list)\n            self.cls_layers.append(cls_fc_layers)\n\n            pre_channel = self.model_cfg.SHARED_FC[-1] * 2 * 2\n            reg_fc_list = []\n            for k in range(0, 
self.model_cfg.REG_FC.__len__()):\n                reg_fc_list.extend([\n                    nn.Linear(pre_channel, self.model_cfg.REG_FC[k], bias=False),\n                    nn.BatchNorm1d(self.model_cfg.REG_FC[k]),\n                    nn.ReLU()\n                ])\n                pre_channel = self.model_cfg.REG_FC[k]\n\n                if k != self.model_cfg.REG_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                    reg_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n\n            reg_fc_list.append(nn.Linear(pre_channel, self.box_coder.code_size * self.num_class, bias=True))\n            reg_fc_layers = nn.Sequential(*reg_fc_list)\n            self.reg_layers.append(reg_fc_layers)\n            break\n\n        self.cls_layers_P = nn.ModuleList()\n        self.reg_layers_P = nn.ModuleList()\n\n        for i in range(self.rot_num):\n            pre_channel = self.model_cfg.SHARED_FC[-1] * 2\n            cls_fc_list = []\n            for k in range(0, self.model_cfg.CLS_FC.__len__()):\n                cls_fc_list.extend([\n                    nn.Linear(pre_channel, self.model_cfg.CLS_FC[k], bias=False),\n                    nn.BatchNorm1d(self.model_cfg.CLS_FC[k]),\n                    nn.ReLU()\n                ])\n                pre_channel = self.model_cfg.CLS_FC[k]\n\n                if k != self.model_cfg.CLS_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                    cls_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n\n            cls_fc_list.append(nn.Linear(pre_channel, self.num_class, bias=True))\n            cls_fc_layers = nn.Sequential(*cls_fc_list)\n            self.cls_layers_P.append(cls_fc_layers)\n\n            pre_channel = self.model_cfg.SHARED_FC[-1] * 2\n            reg_fc_list = []\n            for k in range(0, self.model_cfg.REG_FC.__len__()):\n                reg_fc_list.extend([\n                    nn.Linear(pre_channel, self.model_cfg.REG_FC[k], bias=False),\n                    
nn.BatchNorm1d(self.model_cfg.REG_FC[k]),\n                    nn.ReLU()\n                ])\n                pre_channel = self.model_cfg.REG_FC[k]\n\n                if k != self.model_cfg.REG_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                    reg_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n\n            reg_fc_list.append(nn.Linear(pre_channel, self.box_coder.code_size * self.num_class, bias=True))\n            reg_fc_layers = nn.Sequential(*reg_fc_list)\n            self.reg_layers_P.append(reg_fc_layers)\n            break\n\n        self.cls_layers_PI = nn.ModuleList()\n        self.reg_layers_PI = nn.ModuleList()\n\n        for i in range(self.rot_num):\n            pre_channel = self.model_cfg.SHARED_FC[-1] * 2\n            cls_fc_list = []\n            for k in range(0, self.model_cfg.CLS_FC.__len__()):\n                cls_fc_list.extend([\n                    nn.Linear(pre_channel, self.model_cfg.CLS_FC[k], bias=False),\n                    nn.BatchNorm1d(self.model_cfg.CLS_FC[k]),\n                    nn.ReLU()\n                ])\n                pre_channel = self.model_cfg.CLS_FC[k]\n\n                if k != self.model_cfg.CLS_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                    cls_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n\n            cls_fc_list.append(nn.Linear(pre_channel, self.num_class, bias=True))\n            cls_fc_layers = nn.Sequential(*cls_fc_list)\n            self.cls_layers_PI.append(cls_fc_layers)\n\n            pre_channel = self.model_cfg.SHARED_FC[-1] * 2\n            reg_fc_list = []\n            for k in range(0, self.model_cfg.REG_FC.__len__()):\n                reg_fc_list.extend([\n                    nn.Linear(pre_channel, self.model_cfg.REG_FC[k], bias=False),\n                    nn.BatchNorm1d(self.model_cfg.REG_FC[k]),\n                    nn.ReLU()\n                ])\n                pre_channel = self.model_cfg.REG_FC[k]\n\n                if k != 
self.model_cfg.REG_FC.__len__() - 1 and self.model_cfg.DP_RATIO > 0:\n                    reg_fc_list.append(nn.Dropout(self.model_cfg.DP_RATIO))\n\n            reg_fc_list.append(nn.Linear(pre_channel, self.box_coder.code_size * self.num_class, bias=True))\n            reg_fc_layers = nn.Sequential(*reg_fc_list)\n            self.reg_layers_PI.append(reg_fc_layers)\n            break\n\n\n        if self.model_cfg.get('PART', False):\n            self.grid_offsets = self.model_cfg.PART.GRID_OFFSETS\n            self.featmap_stride = self.model_cfg.PART.FEATMAP_STRIDE\n            part_inchannel = self.model_cfg.PART.IN_CHANNEL\n            self.num_parts = self.model_cfg.PART.SIZE ** 2\n\n            self.conv_part = nn.Sequential(\n                nn.Conv2d(part_inchannel, part_inchannel, 3, 1, padding=1, bias=False),\n                nn.BatchNorm2d(part_inchannel, eps=1e-3, momentum=0.01),\n                nn.ReLU(inplace=True),\n                nn.Conv2d(part_inchannel, self.num_parts, 1, 1, padding=0, bias=False),\n            )\n            self.gen_grid_fn = partial(gen_sample_grid, grid_offsets=self.grid_offsets,\n                                   spatial_scale=1 / self.featmap_stride)\n\n        self.cross_attention_layers = nn.ModuleList()\n        for i in range(self.rot_num):\n            this_mo = CrossAttention(self.shared_channel)\n            # print(count_parameters(this_mo))\n            # input()\n            self.cross_attention_layers.append(this_mo)\n\n\n\n        self.cross_attention_layers_mm = nn.ModuleList()\n        for i in range(self.rot_num):\n            this_mo = CrossAttention(self.shared_channel)\n            # print(count_parameters(this_mo))\n            # input()\n            self.cross_attention_layers_mm.append(this_mo)\n\n\n\n        self.init_weights()\n        self.ious = {0: [], 1: [], 2: [], 3: []}\n\n    def init_weights(self):\n        init_func = nn.init.xavier_normal_\n        for module_list in [self.cls_layers, 
self.reg_layers]:\n            for trans_module in module_list:\n                for m in trans_module.modules():\n                    if isinstance(m, nn.Linear):\n                        init_func(m.weight)\n                        if m.bias is not None:\n                            nn.init.constant_(m.bias, 0)\n        for module_list in [self.cls_layers, self.reg_layers]:\n            for trans_module in module_list:\n                nn.init.normal_(trans_module[-1].weight, 0, 0.01)\n                nn.init.constant_(trans_module[-1].bias, 0)\n        for m in self.shared_fc_layers.modules():\n            if isinstance(m, nn.Linear):\n                init_func(m.weight)\n                if m.bias is not None:\n                    nn.init.constant_(m.bias, 0)\n\n    def obtain_conf_preds(self, confi_im, anchors):\n\n        confi = []\n\n        for i, im in enumerate(confi_im):\n            boxes = anchors[i]\n            im = confi_im[i]\n            if len(boxes) == 0:\n                confi.append(torch.empty(0).type_as(im))\n            else:\n                (xs, ys) = self.gen_grid_fn(boxes)\n                out = bilinear_interpolate_torch_gridsample(im, xs, ys)\n                x = torch.mean(out, 0).view(-1, 1)\n                confi.append(x)\n\n        confi = torch.cat(confi)\n\n        return confi\n\n    def roi_part_pool(self, batch_dict, parts_feat):\n        rois = batch_dict['rois_score'].clone()\n        confi_preds = self.obtain_conf_preds(parts_feat, rois)\n\n        return confi_preds\n\n    def roi_grid_pool(self, batch_dict, i):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                point_coords: (num_points, 4)  [bs_idx, x, y, z]\n                point_features: (num_points, C)\n                point_cls_scores: (N1 + N2 + N3 + ..., 1)\n                point_part_offset: (N1 + N2 + N3 + ..., 3)\n        Returns:\n\n        \"\"\"\n\n        if 
i==0:\n            rot_num_id = ''\n        else:\n            rot_num_id = str(i)\n\n        rois = batch_dict['rois'].clone()\n\n        batch_size = batch_dict['batch_size']\n        with_vf_transform = batch_dict.get('with_voxel_feature_transform', False)\n\n        roi_grid_xyz, _ = self.get_global_grid_points_of_roi(\n            rois, grid_size=self.pool_cfg.GRID_SIZE\n        )  # (BxN, 6x6x6, 3)\n        # roi_grid_xyz: (B, Nx6x6x6, 3)\n        roi_grid_xyz = roi_grid_xyz.view(batch_size, -1, 3)\n\n        # compute the voxel coordinates of grid points\n        roi_grid_coords_x = (roi_grid_xyz[:, :, 0:1] - self.point_cloud_range[0]) // self.voxel_size[0]\n        roi_grid_coords_y = (roi_grid_xyz[:, :, 1:2] - self.point_cloud_range[1]) // self.voxel_size[1]\n        roi_grid_coords_z = (roi_grid_xyz[:, :, 2:3] - self.point_cloud_range[2]) // self.voxel_size[2]\n        # roi_grid_coords: (B, Nx6x6x6, 3)\n        roi_grid_coords = torch.cat([roi_grid_coords_x, roi_grid_coords_y, roi_grid_coords_z], dim=-1)\n\n        batch_idx = rois.new_zeros(batch_size, roi_grid_coords.shape[1], 1)\n        for bs_idx in range(batch_size):\n            batch_idx[bs_idx, :, 0] = bs_idx\n        # roi_grid_coords: (B, Nx6x6x6, 4)\n        # roi_grid_coords = torch.cat([batch_idx, roi_grid_coords], dim=-1)\n        # roi_grid_coords = roi_grid_coords.int()\n        roi_grid_batch_cnt = rois.new_zeros(batch_size).int().fill_(roi_grid_coords.shape[1])\n\n        pooled_features_list = []\n        for k, src_name in enumerate(self.pool_cfg.FEATURES_SOURCE):\n            pool_layer = self.roi_grid_pool_layers[k]\n            if src_name in ['x_conv1', 'x_conv2', 'x_conv3', 'x_conv4']:\n\n                cur_stride = batch_dict['multi_scale_3d_strides'][src_name]\n\n                j=i\n                while 'multi_scale_3d_features'+rot_num_id not in batch_dict:\n                    j-=1\n                    rot_num_id = str(j)\n\n                cur_sp_tensors = 
batch_dict['multi_scale_3d_features'+rot_num_id][src_name]\n\n                if with_vf_transform:\n                    cur_sp_tensors = batch_dict['multi_scale_3d_features_post'][src_name]\n                else:\n                    cur_sp_tensors = batch_dict['multi_scale_3d_features'+rot_num_id][src_name]\n\n                # compute voxel center xyz and batch_cnt\n                cur_coords = cur_sp_tensors.indices\n                cur_voxel_xyz = common_utils.get_voxel_centers(\n                    cur_coords[:, 1:4],\n                    downsample_times=cur_stride,\n                    voxel_size=self.voxel_size,\n                    point_cloud_range=self.point_cloud_range\n                )  #\n                cur_voxel_xyz_batch_cnt = cur_voxel_xyz.new_zeros(batch_size).int()\n                for bs_idx in range(batch_size):\n                    cur_voxel_xyz_batch_cnt[bs_idx] = (cur_coords[:, 0] == bs_idx).sum()\n                # get voxel2point tensor\n\n                v2p_ind_tensor = spconv_utils.generate_voxel2pinds(cur_sp_tensors)\n\n                # compute the grid coordinates in this scale, in [batch_idx, x y z] order\n                cur_roi_grid_coords = roi_grid_coords // cur_stride\n                cur_roi_grid_coords = torch.cat([batch_idx, cur_roi_grid_coords], dim=-1)\n                cur_roi_grid_coords = cur_roi_grid_coords.int()\n                # voxel neighbor aggregation\n                pooled_features = pool_layer(\n                    xyz=cur_voxel_xyz.contiguous(),\n                    xyz_batch_cnt=cur_voxel_xyz_batch_cnt,\n                    new_xyz=roi_grid_xyz.contiguous().view(-1, 3),\n                    new_xyz_batch_cnt=roi_grid_batch_cnt,\n                    new_coords=cur_roi_grid_coords.contiguous().view(-1, 4),\n                    features=cur_sp_tensors.features.contiguous(),\n                    voxel2point_indices=v2p_ind_tensor\n                )\n\n                pooled_features = pooled_features.view(\n  
                  -1, self.pool_cfg.GRID_SIZE ** 3,\n                    pooled_features.shape[-1]\n                )  # (BxN, 6x6x6, C)\n                pooled_features_list.append(pooled_features)\n\n        ms_pooled_features = torch.cat(pooled_features_list, dim=-1)\n\n        return ms_pooled_features\n    def roi_grid_pool_mm(self, batch_dict, i):\n        \"\"\"\n        Args:\n            batch_dict:\n                batch_size:\n                rois: (B, num_rois, 7 + C)\n                point_coords: (num_points, 4)  [bs_idx, x, y, z]\n                point_features: (num_points, C)\n                point_cls_scores: (N1 + N2 + N3 + ..., 1)\n                point_part_offset: (N1 + N2 + N3 + ..., 3)\n        Returns:\n\n        \"\"\"\n\n        if i==0:\n            rot_num_id = ''\n        else:\n            rot_num_id = str(i)\n\n        rois = batch_dict['rois'].clone()\n        #rois[:, 3:5] = rois[:, 3:5]*0.5\n\n        batch_size = batch_dict['batch_size']\n        with_vf_transform = batch_dict.get('with_voxel_feature_transform', False)\n\n        roi_grid_xyz, _ = self.get_global_grid_points_of_roi(\n            rois, grid_size=self.pool_cfg_mm.GRID_SIZE\n        )  # (BxN, 6x6x6, 3)\n        # roi_grid_xyz: (B, Nx6x6x6, 3)\n        roi_grid_xyz = roi_grid_xyz.view(batch_size, -1, 3)\n\n        # compute the voxel coordinates of grid points\n        roi_grid_coords_x = (roi_grid_xyz[:, :, 0:1] - self.point_cloud_range[0]) // self.voxel_size[0]\n        roi_grid_coords_y = (roi_grid_xyz[:, :, 1:2] - self.point_cloud_range[1]) // self.voxel_size[1]\n        roi_grid_coords_z = (roi_grid_xyz[:, :, 2:3] - self.point_cloud_range[2]) // self.voxel_size[2]\n        # roi_grid_coords: (B, Nx6x6x6, 3)\n        roi_grid_coords = torch.cat([roi_grid_coords_x, roi_grid_coords_y, roi_grid_coords_z], dim=-1)\n\n        batch_idx = rois.new_zeros(batch_size, roi_grid_coords.shape[1], 1)\n        for bs_idx in range(batch_size):\n            batch_idx[bs_idx, :, 
0] = bs_idx\n        # roi_grid_coords: (B, Nx6x6x6, 4)\n        # roi_grid_coords = torch.cat([batch_idx, roi_grid_coords], dim=-1)\n        # roi_grid_coords = roi_grid_coords.int()\n        roi_grid_batch_cnt = rois.new_zeros(batch_size).int().fill_(roi_grid_coords.shape[1])\n\n        pooled_features_list = []\n        for k, src_name in enumerate(self.pool_cfg_mm.FEATURES_SOURCE):\n            pool_layer = self.roi_grid_pool_layers_mm[k]\n            if src_name in ['x_conv1', 'x_conv2', 'x_conv3', 'x_conv4']:\n\n                cur_stride = batch_dict['multi_scale_3d_strides'][src_name]\n                j=i\n                while 'multi_scale_3d_features_mm'+rot_num_id not in batch_dict:\n                    j-=1\n                    rot_num_id = str(j)\n                cur_sp_tensors = batch_dict['multi_scale_3d_features_mm'+rot_num_id][src_name]\n\n                if with_vf_transform:\n                    cur_sp_tensors = batch_dict['multi_scale_3d_features_post'][src_name]\n                else:\n                    cur_sp_tensors = batch_dict['multi_scale_3d_features_mm'+rot_num_id][src_name]\n\n                # compute voxel center xyz and batch_cnt\n                cur_coords = cur_sp_tensors.indices\n                cur_voxel_xyz = common_utils.get_voxel_centers(\n                    cur_coords[:, 1:4],\n                    downsample_times=cur_stride,\n                    voxel_size=self.voxel_size,\n                    point_cloud_range=self.point_cloud_range\n                )  #\n                cur_voxel_xyz_batch_cnt = cur_voxel_xyz.new_zeros(batch_size).int()\n                for bs_idx in range(batch_size):\n                    cur_voxel_xyz_batch_cnt[bs_idx] = (cur_coords[:, 0] == bs_idx).sum()\n                # get voxel2point tensor\n\n                v2p_ind_tensor = spconv_utils.generate_voxel2pinds(cur_sp_tensors)\n\n                # compute the grid coordinates in this scale, in [batch_idx, x y z] order\n                
cur_roi_grid_coords = roi_grid_coords // cur_stride\n                cur_roi_grid_coords = torch.cat([batch_idx, cur_roi_grid_coords], dim=-1)\n                cur_roi_grid_coords = cur_roi_grid_coords.int()\n                # voxel neighbor aggregation\n                pooled_features = pool_layer(\n                    xyz=cur_voxel_xyz.contiguous(),\n                    xyz_batch_cnt=cur_voxel_xyz_batch_cnt,\n                    new_xyz=roi_grid_xyz.contiguous().view(-1, 3),\n                    new_xyz_batch_cnt=roi_grid_batch_cnt,\n                    new_coords=cur_roi_grid_coords.contiguous().view(-1, 4),\n                    features=cur_sp_tensors.features.contiguous(),\n                    voxel2point_indices=v2p_ind_tensor\n                )\n\n                pooled_features = pooled_features.view(\n                    -1, self.pool_cfg_mm.GRID_SIZE ** 3,\n                    pooled_features.shape[-1]\n                )  # (BxN, 6x6x6, C)\n                pooled_features_list.append(pooled_features)\n\n        ms_pooled_features = torch.cat(pooled_features_list, dim=-1)\n\n        return ms_pooled_features\n\n    def get_global_grid_points_of_roi(self, rois, grid_size):\n        rois = rois.view(-1, rois.shape[-1])\n        batch_size_rcnn = rois.shape[0]\n\n        local_roi_grid_points = self.get_dense_grid_points(rois, batch_size_rcnn, grid_size)  # (B, 6x6x6, 3)\n        global_roi_grid_points = common_utils.rotate_points_along_z(\n            local_roi_grid_points.clone(), rois[:, 6]\n        ).squeeze(dim=1)\n        global_center = rois[:, 0:3].clone()\n        global_roi_grid_points += global_center.unsqueeze(dim=1)\n        return global_roi_grid_points, local_roi_grid_points\n\n    @staticmethod\n    def get_dense_grid_points(rois, batch_size_rcnn, grid_size):\n        faked_features = rois.new_ones((grid_size, grid_size, grid_size))\n        dense_idx = faked_features.nonzero()  # (N, 3) [x_idx, y_idx, z_idx]\n        dense_idx = 
dense_idx.repeat(batch_size_rcnn, 1, 1).float()  # (B, 6x6x6, 3)\n\n        local_roi_size = rois.view(batch_size_rcnn, -1)[:, 3:6]\n        roi_grid_points = (dense_idx + 0.5) / grid_size * local_roi_size.unsqueeze(dim=1) \\\n                          - (local_roi_size.unsqueeze(dim=1) / 2)  # (B, 6x6x6, 3)\n        return roi_grid_points\n\n\n    def roi_x_trans(self, rois, trans_i, transform_param):\n        while trans_i>=len(transform_param[0]):\n            trans_i-=1\n\n        batch_size = len(rois)\n        rois = rois.clone()\n\n        x_transformed_roi = []\n\n        for bt_i in range(batch_size):\n\n            cur_roi = rois[bt_i]\n            bt_transform_param = transform_param[bt_i]\n            previous_trans_param = bt_transform_param[trans_i-1]\n            current_trans_param = bt_transform_param[trans_i]\n\n            transed_roi = self.x_trans_train.backward_with_param({'boxes': cur_roi,\n                                                                  'transform_param': previous_trans_param})\n            transed_roi = self.x_trans_train.forward_with_param({'boxes': transed_roi['boxes'],\n                                                                  'transform_param': current_trans_param})\n\n            x_transformed_roi.append(transed_roi['boxes'])\n\n        return torch.stack(x_transformed_roi)\n\n    def roi_score_trans(self, rois, trans_i, transform_param):\n        while trans_i>=len(transform_param[0]):\n            trans_i-=1\n\n        batch_size = len(rois)\n        rois = rois.clone()\n\n        x_transformed_roi = []\n\n        for bt_i in range(batch_size):\n\n            cur_roi = rois[bt_i]\n            bt_transform_param = transform_param[bt_i]\n            previous_trans_param = bt_transform_param[0]\n            current_trans_param = bt_transform_param[trans_i]\n\n            transed_roi = self.x_trans_train.backward_with_param({'boxes': cur_roi,\n                                                                  
'transform_param': current_trans_param})\n            transed_roi = self.x_trans_train.forward_with_param({'boxes': transed_roi['boxes'],\n                                                                  'transform_param': previous_trans_param})\n\n            x_transformed_roi.append(transed_roi['boxes'])\n\n        return torch.stack(x_transformed_roi)\n\n    def pred_x_trans(self, preds, trans_i, transform_param):\n        while trans_i>=len(transform_param[0]):\n            trans_i-=1\n\n        batch_size = len(preds)\n        preds = preds.clone()\n\n        x_transformed_roi = []\n\n        for bt_i in range(batch_size):\n\n            cur_roi = preds[bt_i]\n            bt_transform_param = transform_param[bt_i]\n            current_trans_param = bt_transform_param[trans_i]\n\n            transed_roi = self.x_trans_train.backward_with_param({'boxes': cur_roi,\n                                                                  'transform_param': current_trans_param})\n\n            x_transformed_roi.append(transed_roi['boxes'])\n\n        return torch.stack(x_transformed_roi)\n\n    def multi_grid_pool_aggregation(self, batch_dict, targets_dict):\n        if self.model_cfg.get('PART', False):\n            feat_2d = batch_dict['st_features_2d']\n            parts_feat = self.conv_part(feat_2d)\n\n        all_preds = []\n        all_scores = []\n\n        all_shared_features = []\n        all_shared_features_mm = []\n\n        for i in range(self.rot_num):\n\n            rot_num_id = str(i)\n\n            if i >= 1 and 'transform_param' in batch_dict:\n                batch_dict['rois'] = self.roi_x_trans(batch_dict['rois'], i, batch_dict['transform_param'])\n\n\n            if self.training:\n                targets_dict = self.assign_targets(batch_dict, i, enable_dif=True)\n                targets_dict['aug_param'] = batch_dict['aug_param']\n                targets_dict['image_shape'] = batch_dict['image_shape']\n                targets_dict['calib'] = 
batch_dict['calib']\n                batch_dict['rois'] = targets_dict['rois']\n\n                batch_dict['roi_labels'] = targets_dict['roi_labels']\n\n\n            if i >= 1 and 'transform_param' in batch_dict:\n                batch_dict['rois_score'] = self.roi_score_trans(batch_dict['rois'], i, batch_dict['transform_param'])\n            else:\n                batch_dict['rois_score'] = batch_dict['rois']\n            if self.model_cfg.get('PART', False):\n                part_scores = self.roi_part_pool(batch_dict, parts_feat)\n\n\n            if 'transform_param' in batch_dict:\n                pooled_features = self.roi_grid_pool(batch_dict, i)\n                pooled_features_mm = self.roi_grid_pool_mm(batch_dict, i)\n            else:\n                pooled_features = self.roi_grid_pool(batch_dict, 0)\n                pooled_features_mm = self.roi_grid_pool_mm(batch_dict, 0)\n\n            pooled_features = pooled_features.view(pooled_features.size(0), -1)\n\n            shared_features = self.shared_fc_layers[0](pooled_features)\n            shared_features = shared_features.unsqueeze(0)  # 1,B,C\n            all_shared_features.append(shared_features)\n            pre_feat = torch.cat(all_shared_features, 0)\n            cur_feat = self.cross_attention_layers[i](pre_feat, shared_features)\n            cur_feat = torch.cat([cur_feat, shared_features], -1)\n            cur_feat = cur_feat.squeeze(0)  # B, C*2\n\n\n            pooled_features_mm = pooled_features_mm.view(pooled_features_mm.size(0), -1)\n            shared_features_mm = self.shared_fc_layers_mm[0](pooled_features_mm)\n            shared_features_mm = shared_features_mm.unsqueeze(0)  # 1,B,C\n            all_shared_features_mm.append(shared_features_mm)\n            pre_feat_mm = torch.cat(all_shared_features_mm, 0)\n            cur_feat_mm = self.cross_attention_layers_mm[i](pre_feat_mm, shared_features_mm)\n            cur_feat_mm = torch.cat([cur_feat_mm, shared_features_mm], -1)\n    
        cur_feat_mm = cur_feat_mm.squeeze(0)  # B, C*2\n\n            final_feat = torch.cat([cur_feat_mm, cur_feat],-1)\n            rcnn_cls = self.cls_layers[0](final_feat)\n            rcnn_reg = self.reg_layers[0](final_feat)\n            rcnn_cls_pi = self.cls_layers_PI[0](cur_feat_mm)\n            rcnn_reg_pi = self.reg_layers_PI[0](cur_feat_mm)\n            rcnn_cls_p = self.cls_layers_P[0](cur_feat)\n            rcnn_reg_p = self.reg_layers_P[0](cur_feat)\n\n\n            if self.model_cfg.get('PART', False):\n                rcnn_cls = rcnn_cls+part_scores\n                rcnn_cls_pi = rcnn_cls_pi+part_scores\n                rcnn_cls_p = rcnn_cls_p+part_scores\n\n            batch_cls_preds, batch_box_preds = self.generate_predicted_boxes(\n                batch_size=batch_dict['batch_size'], rois=batch_dict['rois'], cls_preds=rcnn_cls, box_preds=rcnn_reg\n            )\n\n            outs = batch_box_preds.clone()\n            if 'transform_param' in batch_dict:\n                outs = self.pred_x_trans(outs, i, batch_dict['transform_param'])\n            all_preds.append(outs)\n            all_scores.append(batch_cls_preds)\n\n            if self.training:\n                targets_dict_pi = copy.deepcopy(targets_dict)\n                targets_dict_p = copy.deepcopy(targets_dict)\n                targets_dict['rcnn_cls'] = rcnn_cls\n                targets_dict['rcnn_reg'] = rcnn_reg\n                targets_dict_pi['rcnn_cls'] = rcnn_cls_pi\n                targets_dict_pi['rcnn_reg'] = rcnn_reg_pi\n                targets_dict_p['rcnn_cls'] = rcnn_cls_p\n                targets_dict_p['rcnn_reg'] = rcnn_reg_p\n\n                self.forward_ret_dict['targets_dict' + rot_num_id] = targets_dict\n                self.forward_ret_dict['targets_dict_pi' + rot_num_id] = targets_dict_pi\n                self.forward_ret_dict['targets_dict_p' + rot_num_id] = targets_dict_p\n\n            batch_dict['rois'] = batch_box_preds\n            
batch_dict['roi_scores'] = batch_cls_preds.squeeze(-1)\n\n        return torch.mean(torch.stack(all_preds), 0), torch.mean(torch.stack(all_scores), 0)\n\n    def forward(self, batch_dict):\n\n        if 'transform_param' in batch_dict:\n            trans_param = batch_dict['transform_param']\n            self.rot_num = trans_param.shape[1]\n\n        targets_dict = self.proposal_layer(\n            batch_dict, nms_config=self.model_cfg.NMS_CONFIG['TRAIN' if self.training else 'TEST']\n        )\n\n        boxes, scores = self.multi_grid_pool_aggregation(batch_dict,targets_dict)\n\n        if not self.training:\n            batch_dict['batch_box_preds'] = boxes\n            batch_dict['batch_cls_preds'] = scores\n\n        return batch_dict\n"
  },
  {
    "path": "pcdet/ops/dcn/__init__.py",
    "content": "from .deform_conv import (DeformConv, DeformConvPack, ModulatedDeformConv,\n                          ModulatedDeformConvPack, deform_conv,\n                          modulated_deform_conv)\n\n__all__ = [\n    'DeformConv', 'DeformConvPack', 'ModulatedDeformConv',\n    'ModulatedDeformConvPack', 'deform_conv', 'modulated_deform_conv',\n]\n"
  },
  {
    "path": "pcdet/ops/dcn/deform_conv.py",
    "content": "import math\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom torch.autograd import Function\nfrom torch.autograd.function import once_differentiable\nfrom torch.nn.modules.utils import _pair, _single\n\n# from mmdet.utils import print_log\nfrom . import deform_conv_cuda\n\n\nclass DeformConvFunction(Function):\n\n    @staticmethod\n    def forward(ctx,\n                input,\n                offset,\n                weight,\n                stride=1,\n                padding=0,\n                dilation=1,\n                groups=1,\n                deformable_groups=1,\n                im2col_step=64):\n        if input is not None and input.dim() != 4:\n            raise ValueError(\n                'Expected 4D tensor as input, got {}D tensor instead.'.format(\n                    input.dim()))\n        ctx.stride = _pair(stride)\n        ctx.padding = _pair(padding)\n        ctx.dilation = _pair(dilation)\n        ctx.groups = groups\n        ctx.deformable_groups = deformable_groups\n        ctx.im2col_step = im2col_step\n\n        ctx.save_for_backward(input, offset, weight)\n\n        output = input.new_empty(\n            DeformConvFunction._output_size(input, weight, ctx.padding,\n                                            ctx.dilation, ctx.stride))\n\n        ctx.bufs_ = [input.new_empty(0), input.new_empty(0)]  # columns, ones\n\n        if not input.is_cuda:\n            raise NotImplementedError\n        else:\n            cur_im2col_step = min(ctx.im2col_step, input.shape[0])\n            assert (input.shape[0] %\n                    cur_im2col_step) == 0, 'im2col step must divide batchsize'\n            deform_conv_cuda.deform_conv_forward_cuda(\n                input, weight, offset, output, ctx.bufs_[0], ctx.bufs_[1],\n                weight.size(3), weight.size(2), ctx.stride[1], ctx.stride[0],\n                ctx.padding[1], ctx.padding[0], ctx.dilation[1],\n                ctx.dilation[0], 
ctx.groups, ctx.deformable_groups,\n                cur_im2col_step)\n        return output\n\n    @staticmethod\n    @once_differentiable\n    def backward(ctx, grad_output):\n        input, offset, weight = ctx.saved_tensors\n\n        grad_input = grad_offset = grad_weight = None\n\n        if not grad_output.is_cuda:\n            raise NotImplementedError\n        else:\n            cur_im2col_step = min(ctx.im2col_step, input.shape[0])\n            assert (input.shape[0] %\n                    cur_im2col_step) == 0, 'im2col step must divide batchsize'\n\n            if ctx.needs_input_grad[0] or ctx.needs_input_grad[1]:\n                grad_input = torch.zeros_like(input)\n                grad_offset = torch.zeros_like(offset)\n                deform_conv_cuda.deform_conv_backward_input_cuda(\n                    input, offset, grad_output, grad_input,\n                    grad_offset, weight, ctx.bufs_[0], weight.size(3),\n                    weight.size(2), ctx.stride[1], ctx.stride[0],\n                    ctx.padding[1], ctx.padding[0], ctx.dilation[1],\n                    ctx.dilation[0], ctx.groups, ctx.deformable_groups,\n                    cur_im2col_step)\n\n            if ctx.needs_input_grad[2]:\n                grad_weight = torch.zeros_like(weight)\n                deform_conv_cuda.deform_conv_backward_parameters_cuda(\n                    input, offset, grad_output,\n                    grad_weight, ctx.bufs_[0], ctx.bufs_[1], weight.size(3),\n                    weight.size(2), ctx.stride[1], ctx.stride[0],\n                    ctx.padding[1], ctx.padding[0], ctx.dilation[1],\n                    ctx.dilation[0], ctx.groups, ctx.deformable_groups, 1,\n                    cur_im2col_step)\n\n        return (grad_input, grad_offset, grad_weight, None, None, None, None,\n                None)\n\n    @staticmethod\n    def _output_size(input, weight, padding, dilation, stride):\n        channels = weight.size(0)\n        output_size = 
(input.size(0), channels)\n        for d in range(input.dim() - 2):\n            in_size = input.size(d + 2)\n            pad = padding[d]\n            kernel = dilation[d] * (weight.size(d + 2) - 1) + 1\n            stride_ = stride[d]\n            output_size += ((in_size + (2 * pad) - kernel) // stride_ + 1, )\n        if not all(map(lambda s: s > 0, output_size)):\n            raise ValueError(\n                'convolution input is too small (output would be {})'.format(\n                    'x'.join(map(str, output_size))))\n        return output_size\n\n\nclass ModulatedDeformConvFunction(Function):\n\n    @staticmethod\n    def forward(ctx,\n                input,\n                offset,\n                mask,\n                weight,\n                bias=None,\n                stride=1,\n                padding=0,\n                dilation=1,\n                groups=1,\n                deformable_groups=1):\n        ctx.stride = stride\n        ctx.padding = padding\n        ctx.dilation = dilation\n        ctx.groups = groups\n        ctx.deformable_groups = deformable_groups\n        ctx.with_bias = bias is not None\n        if not ctx.with_bias:\n            bias = input.new_empty(1)  # fake tensor\n        if not input.is_cuda:\n            raise NotImplementedError\n        if weight.requires_grad or mask.requires_grad or offset.requires_grad \\\n                or input.requires_grad:\n            ctx.save_for_backward(input, offset, mask, weight, bias)\n        output = input.new_empty(\n            ModulatedDeformConvFunction._infer_shape(ctx, input, weight))\n        ctx._bufs = [input.new_empty(0), input.new_empty(0)]\n        deform_conv_cuda.modulated_deform_conv_cuda_forward(\n            input, weight, bias, ctx._bufs[0], offset, mask, output,\n            ctx._bufs[1], weight.shape[2], weight.shape[3], ctx.stride,\n            ctx.stride, ctx.padding, ctx.padding, ctx.dilation, ctx.dilation,\n            ctx.groups, ctx.deformable_groups, 
ctx.with_bias)\n        return output\n\n    @staticmethod\n    @once_differentiable\n    def backward(ctx, grad_output):\n        if not grad_output.is_cuda:\n            raise NotImplementedError\n        input, offset, mask, weight, bias = ctx.saved_tensors\n        grad_input = torch.zeros_like(input)\n        grad_offset = torch.zeros_like(offset)\n        grad_mask = torch.zeros_like(mask)\n        grad_weight = torch.zeros_like(weight)\n        grad_bias = torch.zeros_like(bias)\n        deform_conv_cuda.modulated_deform_conv_cuda_backward(\n            input, weight, bias, ctx._bufs[0], offset, mask, ctx._bufs[1],\n            grad_input, grad_weight, grad_bias, grad_offset, grad_mask,\n            grad_output, weight.shape[2], weight.shape[3], ctx.stride,\n            ctx.stride, ctx.padding, ctx.padding, ctx.dilation, ctx.dilation,\n            ctx.groups, ctx.deformable_groups, ctx.with_bias)\n        if not ctx.with_bias:\n            grad_bias = None\n\n        return (grad_input, grad_offset, grad_mask, grad_weight, grad_bias,\n                None, None, None, None, None)\n\n    @staticmethod\n    def _infer_shape(ctx, input, weight):\n        n = input.size(0)\n        channels_out = weight.size(0)\n        height, width = input.shape[2:4]\n        kernel_h, kernel_w = weight.shape[2:4]\n        height_out = (height + 2 * ctx.padding -\n                      (ctx.dilation * (kernel_h - 1) + 1)) // ctx.stride + 1\n        width_out = (width + 2 * ctx.padding -\n                     (ctx.dilation * (kernel_w - 1) + 1)) // ctx.stride + 1\n        return n, channels_out, height_out, width_out\n\n\ndeform_conv = DeformConvFunction.apply\nmodulated_deform_conv = ModulatedDeformConvFunction.apply\n\n\nclass DeformConv(nn.Module):\n\n    def __init__(self,\n                 in_channels,\n                 out_channels,\n                 kernel_size,\n                 stride=1,\n                 padding=0,\n                 dilation=1,\n                 
groups=1,\n                 deformable_groups=1,\n                 bias=False):\n        super(DeformConv, self).__init__()\n\n        assert not bias, 'DeformConv does not support a bias term'\n        assert in_channels % groups == 0, \\\n            'in_channels {} is not divisible by groups {}'.format(\n                in_channels, groups)\n        assert out_channels % groups == 0, \\\n            'out_channels {} is not divisible by groups {}'.format(\n                out_channels, groups)\n\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.kernel_size = _pair(kernel_size)\n        self.stride = _pair(stride)\n        self.padding = _pair(padding)\n        self.dilation = _pair(dilation)\n        self.groups = groups\n        self.deformable_groups = deformable_groups\n        # enable compatibility with nn.Conv2d\n        self.transposed = False\n        self.output_padding = _single(0)\n\n        self.weight = nn.Parameter(\n            torch.Tensor(out_channels, in_channels // self.groups,\n                         *self.kernel_size))\n\n        self.reset_parameters()\n\n    def reset_parameters(self):\n        n = self.in_channels\n        for k in self.kernel_size:\n            n *= k\n        stdv = 1. 
/ math.sqrt(n)\n        self.weight.data.uniform_(-stdv, stdv)\n\n    def forward(self, x, offset):\n        # To fix an assert error in deform_conv_cuda.cpp:128\n        # input image is smaller than kernel\n        input_pad = (\n            x.size(2) < self.kernel_size[0] or x.size(3) < self.kernel_size[1])\n        if input_pad:\n            pad_h = max(self.kernel_size[0] - x.size(2), 0)\n            pad_w = max(self.kernel_size[1] - x.size(3), 0)\n            x = F.pad(x, (0, pad_w, 0, pad_h), 'constant', 0).contiguous()\n            offset = F.pad(offset, (0, pad_w, 0, pad_h), 'constant',\n                           0).contiguous()\n        out = deform_conv(x, offset, self.weight, self.stride, self.padding,\n                          self.dilation, self.groups, self.deformable_groups)\n        if input_pad:\n            out = out[:, :, :out.size(2) - pad_h, :out.size(3) -\n                      pad_w].contiguous()\n        return out\n\n\nclass DeformConvPack(DeformConv):\n    \"\"\"A Deformable Conv Encapsulation that acts as normal Conv layers.\n\n    Args:\n        in_channels (int): Same as nn.Conv2d.\n        out_channels (int): Same as nn.Conv2d.\n        kernel_size (int or tuple[int]): Same as nn.Conv2d.\n        stride (int or tuple[int]): Same as nn.Conv2d.\n        padding (int or tuple[int]): Same as nn.Conv2d.\n        dilation (int or tuple[int]): Same as nn.Conv2d.\n        groups (int): Same as nn.Conv2d.\n        bias (bool or str): If specified as `auto`, it will be decided by the\n            norm_cfg. 
Bias will be set as True if norm_cfg is None, otherwise\n            False.\n    \"\"\"\n\n    _version = 2\n\n    def __init__(self, *args, **kwargs):\n        super(DeformConvPack, self).__init__(*args, **kwargs)\n\n        self.conv_offset = nn.Conv2d(\n            self.in_channels,\n            self.deformable_groups * 2 * self.kernel_size[0] *\n            self.kernel_size[1],\n            kernel_size=self.kernel_size,\n            stride=_pair(self.stride),\n            padding=_pair(self.padding),\n            bias=True)\n        self.init_offset()\n\n    def init_offset(self):\n        self.conv_offset.weight.data.zero_()\n        self.conv_offset.bias.data.zero_()\n\n    def forward(self, x):\n        offset = self.conv_offset(x)\n        return deform_conv(x, offset, self.weight, self.stride, self.padding,\n                           self.dilation, self.groups, self.deformable_groups)\n\n    def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,\n                              missing_keys, unexpected_keys, error_msgs):\n        version = local_metadata.get('version', None)\n\n        if version is None or version < 2:\n            # the key is different in early versions\n            # In version < 2, DeformConvPack loads previous benchmark models.\n            if (prefix + 'conv_offset.weight' not in state_dict\n                    and prefix[:-1] + '_offset.weight' in state_dict):\n                state_dict[prefix + 'conv_offset.weight'] = state_dict.pop(\n                    prefix[:-1] + '_offset.weight')\n            if (prefix + 'conv_offset.bias' not in state_dict\n                    and prefix[:-1] + '_offset.bias' in state_dict):\n                state_dict[prefix +\n                           'conv_offset.bias'] = state_dict.pop(prefix[:-1] +\n                                                                '_offset.bias')\n\n        if version is not None and version > 1:\n            # mmdet's print_log is not imported in this module, so use print.\n            print(\n                
'DeformConvPack {} is upgraded to version 2.'.format(\n                    prefix.rstrip('.')))\n\n        super()._load_from_state_dict(state_dict, prefix, local_metadata,\n                                      strict, missing_keys, unexpected_keys,\n                                      error_msgs)\n\n\nclass ModulatedDeformConv(nn.Module):\n\n    def __init__(self,\n                 in_channels,\n                 out_channels,\n                 kernel_size,\n                 stride=1,\n                 padding=0,\n                 dilation=1,\n                 groups=1,\n                 deformable_groups=1,\n                 bias=True):\n        super(ModulatedDeformConv, self).__init__()\n        self.in_channels = in_channels\n        self.out_channels = out_channels\n        self.kernel_size = _pair(kernel_size)\n        self.stride = stride\n        self.padding = padding\n        self.dilation = dilation\n        self.groups = groups\n        self.deformable_groups = deformable_groups\n        self.with_bias = bias\n        # enable compatibility with nn.Conv2d\n        self.transposed = False\n        self.output_padding = _single(0)\n\n        self.weight = nn.Parameter(\n            torch.Tensor(out_channels, in_channels // groups,\n                         *self.kernel_size))\n        if bias:\n            self.bias = nn.Parameter(torch.Tensor(out_channels))\n        else:\n            self.register_parameter('bias', None)\n        self.reset_parameters()\n\n    def reset_parameters(self):\n        n = self.in_channels\n        for k in self.kernel_size:\n            n *= k\n        stdv = 1. 
/ math.sqrt(n)\n        self.weight.data.uniform_(-stdv, stdv)\n        if self.bias is not None:\n            self.bias.data.zero_()\n\n    def forward(self, x, offset, mask):\n        return modulated_deform_conv(x, offset, mask, self.weight, self.bias,\n                                     self.stride, self.padding, self.dilation,\n                                     self.groups, self.deformable_groups)\n\n\nclass ModulatedDeformConvPack(ModulatedDeformConv):\n    \"\"\"A ModulatedDeformable Conv Encapsulation that acts as normal Conv layers.\n\n    Args:\n        in_channels (int): Same as nn.Conv2d.\n        out_channels (int): Same as nn.Conv2d.\n        kernel_size (int or tuple[int]): Same as nn.Conv2d.\n        stride (int or tuple[int]): Same as nn.Conv2d.\n        padding (int or tuple[int]): Same as nn.Conv2d.\n        dilation (int or tuple[int]): Same as nn.Conv2d.\n        groups (int): Same as nn.Conv2d.\n        bias (bool or str): If specified as `auto`, it will be decided by the\n            norm_cfg. 
Bias will be set as True if norm_cfg is None, otherwise\n            False.\n    \"\"\"\n\n    _version = 2\n\n    def __init__(self, *args, **kwargs):\n        super(ModulatedDeformConvPack, self).__init__(*args, **kwargs)\n\n        self.conv_offset = nn.Conv2d(\n            self.in_channels,\n            self.deformable_groups * 3 * self.kernel_size[0] *\n            self.kernel_size[1],\n            kernel_size=self.kernel_size,\n            stride=_pair(self.stride),\n            padding=_pair(self.padding),\n            bias=True)\n        self.init_offset()\n\n    def init_offset(self):\n        self.conv_offset.weight.data.zero_()\n        self.conv_offset.bias.data.zero_()\n\n    def forward(self, x):\n        out = self.conv_offset(x)\n        o1, o2, mask = torch.chunk(out, 3, dim=1)\n        offset = torch.cat((o1, o2), dim=1)\n        mask = torch.sigmoid(mask)\n        return modulated_deform_conv(x, offset, mask, self.weight, self.bias,\n                                     self.stride, self.padding, self.dilation,\n                                     self.groups, self.deformable_groups)\n\n    def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,\n                              missing_keys, unexpected_keys, error_msgs):\n        version = local_metadata.get('version', None)\n\n        if version is None or version < 2:\n            # the key is different in early versions\n            # In version < 2, ModulatedDeformConvPack\n            # loads previous benchmark models.\n            if (prefix + 'conv_offset.weight' not in state_dict\n                    and prefix[:-1] + '_offset.weight' in state_dict):\n                state_dict[prefix + 'conv_offset.weight'] = state_dict.pop(\n                    prefix[:-1] + '_offset.weight')\n            if (prefix + 'conv_offset.bias' not in state_dict\n                    and prefix[:-1] + '_offset.bias' in state_dict):\n                state_dict[prefix +\n                         
  'conv_offset.bias'] = state_dict.pop(prefix[:-1] +\n                                                                '_offset.bias')\n\n        if version is not None and version > 1:\n            # mmdet's print_log is not imported in this module, so use print.\n            print(\n                'ModulatedDeformConvPack {} is upgraded to version 2.'.format(\n                    prefix.rstrip('.')))\n\n        super()._load_from_state_dict(state_dict, prefix, local_metadata,\n                                      strict, missing_keys, unexpected_keys,\n                                      error_msgs)\n"
  },
  {
    "path": "pcdet/ops/dcn/setup.py",
    "content": "from setuptools import setup\nfrom torch.utils.cpp_extension import BuildExtension, CUDAExtension\n\nsetup(\n    name='masked_conv',\n    ext_modules=[\n        CUDAExtension('deform_conv_cuda', [\n            'src/deform_conv_cuda.cpp',\n            'src/deform_conv_cuda_kernel.cu',\n        ],\n        define_macros=[('WITH_CUDA', None)],\n        extra_compile_args={\n            'cxx': [],\n            'nvcc': [\n                '-D__CUDA_NO_HALF_OPERATORS__',\n                '-D__CUDA_NO_HALF_CONVERSIONS__',\n                '-D__CUDA_NO_HALF2_OPERATORS__',\n        ]})],\n        cmdclass={'build_ext': BuildExtension})\n\n"
  },
  {
    "path": "pcdet/ops/dcn/src/deform_conv_cuda.cpp",
    "content": "// modify from\n// https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/deform_conv_cuda.c\n\n#include <torch/extension.h>\n#include <ATen/DeviceGuard.h>\n\n#include <cmath>\n#include <vector>\n\nvoid deformable_im2col(const at::Tensor data_im, const at::Tensor data_offset,\n                       const int channels, const int height, const int width,\n                       const int ksize_h, const int ksize_w, const int pad_h,\n                       const int pad_w, const int stride_h, const int stride_w,\n                       const int dilation_h, const int dilation_w,\n                       const int parallel_imgs, const int deformable_group,\n                       at::Tensor data_col);\n\nvoid deformable_col2im(const at::Tensor data_col, const at::Tensor data_offset,\n                       const int channels, const int height, const int width,\n                       const int ksize_h, const int ksize_w, const int pad_h,\n                       const int pad_w, const int stride_h, const int stride_w,\n                       const int dilation_h, const int dilation_w,\n                       const int parallel_imgs, const int deformable_group,\n                       at::Tensor grad_im);\n\nvoid deformable_col2im_coord(\n    const at::Tensor data_col, const at::Tensor data_im,\n    const at::Tensor data_offset, const int channels, const int height,\n    const int width, const int ksize_h, const int ksize_w, const int pad_h,\n    const int pad_w, const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w, const int parallel_imgs,\n    const int deformable_group, at::Tensor grad_offset);\n\nvoid modulated_deformable_im2col_cuda(\n    const at::Tensor data_im, const at::Tensor data_offset,\n    const at::Tensor data_mask, const int batch_size, const int channels,\n    const int height_im, const int width_im, const int height_col,\n    const int width_col, const int 
kernel_h, const int kernel_w,\n    const int pad_h, const int pad_w, const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w, const int deformable_group,\n    at::Tensor data_col);\n\nvoid modulated_deformable_col2im_cuda(\n    const at::Tensor data_col, const at::Tensor data_offset,\n    const at::Tensor data_mask, const int batch_size, const int channels,\n    const int height_im, const int width_im, const int height_col,\n    const int width_col, const int kernel_h, const int kernel_w,\n    const int pad_h, const int pad_w, const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w, const int deformable_group,\n    at::Tensor grad_im);\n\nvoid modulated_deformable_col2im_coord_cuda(\n    const at::Tensor data_col, const at::Tensor data_im,\n    const at::Tensor data_offset, const at::Tensor data_mask,\n    const int batch_size, const int channels, const int height_im,\n    const int width_im, const int height_col, const int width_col,\n    const int kernel_h, const int kernel_w, const int pad_h, const int pad_w,\n    const int stride_h, const int stride_w, const int dilation_h,\n    const int dilation_w, const int deformable_group, at::Tensor grad_offset,\n    at::Tensor grad_mask);\n\nvoid shape_check(at::Tensor input, at::Tensor offset, at::Tensor *gradOutput,\n                 at::Tensor weight, int kH, int kW, int dH, int dW, int padH,\n                 int padW, int dilationH, int dilationW, int group,\n                 int deformable_group) {\n  TORCH_CHECK(weight.ndimension() == 4,\n           \"4D weight tensor (nOutputPlane,nInputPlane,kH,kW) expected, \"\n           \"but got: %s\",\n           weight.ndimension());\n\n  TORCH_CHECK(weight.is_contiguous(), \"weight tensor has to be contiguous\");\n\n  TORCH_CHECK(kW > 0 && kH > 0,\n           \"kernel size should be greater than zero, but got kH: %d kW: %d\", kH,\n           kW);\n\n  TORCH_CHECK((weight.size(2) == kH && weight.size(3) == 
kW),\n           \"kernel size should be consistent with weight, \",\n           \"but got kH: %d kW: %d weight.size(2): %d, weight.size(3): %d\", kH,\n           kW, weight.size(2), weight.size(3));\n\n  TORCH_CHECK(dW > 0 && dH > 0,\n           \"stride should be greater than zero, but got dH: %d dW: %d\", dH, dW);\n\n  TORCH_CHECK(\n      dilationW > 0 && dilationH > 0,\n      \"dilation should be greater than 0, but got dilationH: %d dilationW: %d\",\n      dilationH, dilationW);\n\n  int ndim = input.ndimension();\n  int dimf = 0;\n  int dimh = 1;\n  int dimw = 2;\n\n  if (ndim == 4) {\n    dimf++;\n    dimh++;\n    dimw++;\n  }\n\n  TORCH_CHECK(ndim == 3 || ndim == 4, \"3D or 4D input tensor expected but got: %s\",\n           ndim);\n\n  long nInputPlane = weight.size(1) * group;\n  long inputHeight = input.size(dimh);\n  long inputWidth = input.size(dimw);\n  long nOutputPlane = weight.size(0);\n  long outputHeight =\n      (inputHeight + 2 * padH - (dilationH * (kH - 1) + 1)) / dH + 1;\n  long outputWidth =\n      (inputWidth + 2 * padW - (dilationW * (kW - 1) + 1)) / dW + 1;\n\n  TORCH_CHECK(nInputPlane % deformable_group == 0,\n           \"input channels must divide deformable group size\");\n\n  if (outputWidth < 1 || outputHeight < 1)\n    AT_ERROR(\n        \"Given input size: (%ld x %ld x %ld). \"\n        \"Calculated output size: (%ld x %ld x %ld). 
Output size is too small\",\n        nInputPlane, inputHeight, inputWidth, nOutputPlane, outputHeight,\n        outputWidth);\n\n  TORCH_CHECK(input.size(dimf) == nInputPlane,\n           \"invalid number of input planes, expected: %d, but got: %d\",\n           nInputPlane, input.size(dimf));\n\n  TORCH_CHECK((inputHeight >= kH && inputWidth >= kW),\n           \"input image is smaller than kernel\");\n\n  TORCH_CHECK((offset.size(2) == outputHeight && offset.size(3) == outputWidth),\n           \"invalid spatial size of offset, expected height: %d width: %d, but \"\n           \"got height: %d width: %d\",\n           outputHeight, outputWidth, offset.size(2), offset.size(3));\n\n  TORCH_CHECK((offset.size(1) == deformable_group * 2 * kH * kW),\n           \"invalid number of channels of offset\");\n\n  if (gradOutput != NULL) {\n    TORCH_CHECK(gradOutput->size(dimf) == nOutputPlane,\n             \"invalid number of gradOutput planes, expected: %d, but got: %d\",\n             nOutputPlane, gradOutput->size(dimf));\n\n    TORCH_CHECK((gradOutput->size(dimh) == outputHeight &&\n              gradOutput->size(dimw) == outputWidth),\n             \"invalid size of gradOutput, expected height: %d width: %d , but \"\n             \"got height: %d width: %d\",\n             outputHeight, outputWidth, gradOutput->size(dimh),\n             gradOutput->size(dimw));\n  }\n}\n\nint deform_conv_forward_cuda(at::Tensor input, at::Tensor weight,\n                             at::Tensor offset, at::Tensor output,\n                             at::Tensor columns, at::Tensor ones, int kW,\n                             int kH, int dW, int dH, int padW, int padH,\n                             int dilationW, int dilationH, int group,\n                             int deformable_group, int im2col_step) {\n  // todo: resize columns to include im2col: done\n  // todo: add im2col_step as input\n  // todo: add new output buffer and transpose it to output (or directly\n  // transpose output) 
todo: possibly change data indexing because of\n  // parallel_imgs\n\n  shape_check(input, offset, NULL, weight, kH, kW, dH, dW, padH, padW,\n              dilationH, dilationW, group, deformable_group);\n  at::DeviceGuard guard(input.device());\n\n  input = input.contiguous();\n  offset = offset.contiguous();\n  weight = weight.contiguous();\n\n  int batch = 1;\n  if (input.ndimension() == 3) {\n    // Force batch\n    batch = 0;\n    input.unsqueeze_(0);\n    offset.unsqueeze_(0);\n  }\n\n  // todo: assert batchsize dividable by im2col_step\n\n  long batchSize = input.size(0);\n  long nInputPlane = input.size(1);\n  long inputHeight = input.size(2);\n  long inputWidth = input.size(3);\n\n  long nOutputPlane = weight.size(0);\n\n  long outputWidth =\n      (inputWidth + 2 * padW - (dilationW * (kW - 1) + 1)) / dW + 1;\n  long outputHeight =\n      (inputHeight + 2 * padH - (dilationH * (kH - 1) + 1)) / dH + 1;\n\n  TORCH_CHECK((offset.size(0) == batchSize), \"invalid batch size of offset\");\n\n  output = output.view({batchSize / im2col_step, im2col_step, nOutputPlane,\n                        outputHeight, outputWidth});\n  columns = at::zeros(\n      {nInputPlane * kW * kH, im2col_step * outputHeight * outputWidth},\n      input.options());\n\n  if (ones.ndimension() != 2 ||\n      ones.size(0) * ones.size(1) < outputHeight * outputWidth) {\n    ones = at::ones({outputHeight, outputWidth}, input.options());\n  }\n\n  input = input.view({batchSize / im2col_step, im2col_step, nInputPlane,\n                      inputHeight, inputWidth});\n  offset =\n      offset.view({batchSize / im2col_step, im2col_step,\n                   deformable_group * 2 * kH * kW, outputHeight, outputWidth});\n\n  at::Tensor output_buffer =\n      at::zeros({batchSize / im2col_step, nOutputPlane,\n                 im2col_step * outputHeight, outputWidth},\n                output.options());\n\n  output_buffer = output_buffer.view(\n      {output_buffer.size(0), group, 
output_buffer.size(1) / group,\n       output_buffer.size(2), output_buffer.size(3)});\n\n  for (int elt = 0; elt < batchSize / im2col_step; elt++) {\n    deformable_im2col(input[elt], offset[elt], nInputPlane, inputHeight,\n                      inputWidth, kH, kW, padH, padW, dH, dW, dilationH,\n                      dilationW, im2col_step, deformable_group, columns);\n\n    columns = columns.view({group, columns.size(0) / group, columns.size(1)});\n    weight = weight.view({group, weight.size(0) / group, weight.size(1),\n                          weight.size(2), weight.size(3)});\n\n    for (int g = 0; g < group; g++) {\n      output_buffer[elt][g] = output_buffer[elt][g]\n                                  .flatten(1)\n                                  .addmm_(weight[g].flatten(1), columns[g])\n                                  .view_as(output_buffer[elt][g]);\n    }\n  }\n\n  output_buffer = output_buffer.view(\n      {output_buffer.size(0), output_buffer.size(1) * output_buffer.size(2),\n       output_buffer.size(3), output_buffer.size(4)});\n\n  output_buffer = output_buffer.view({batchSize / im2col_step, nOutputPlane,\n                                      im2col_step, outputHeight, outputWidth});\n  output_buffer.transpose_(1, 2);\n  output.copy_(output_buffer);\n  output = output.view({batchSize, nOutputPlane, outputHeight, outputWidth});\n\n  input = input.view({batchSize, nInputPlane, inputHeight, inputWidth});\n  offset = offset.view(\n      {batchSize, deformable_group * 2 * kH * kW, outputHeight, outputWidth});\n\n  if (batch == 0) {\n    output = output.view({nOutputPlane, outputHeight, outputWidth});\n    input = input.view({nInputPlane, inputHeight, inputWidth});\n    offset = offset.view({offset.size(1), offset.size(2), offset.size(3)});\n  }\n\n  return 1;\n}\n\nint deform_conv_backward_input_cuda(at::Tensor input, at::Tensor offset,\n                                    at::Tensor gradOutput, at::Tensor gradInput,\n                                
    at::Tensor gradOffset, at::Tensor weight,\n                                    at::Tensor columns, int kW, int kH, int dW,\n                                    int dH, int padW, int padH, int dilationW,\n                                    int dilationH, int group,\n                                    int deformable_group, int im2col_step) {\n  shape_check(input, offset, &gradOutput, weight, kH, kW, dH, dW, padH, padW,\n              dilationH, dilationW, group, deformable_group);\n  at::DeviceGuard guard(input.device());\n\n  input = input.contiguous();\n  offset = offset.contiguous();\n  gradOutput = gradOutput.contiguous();\n  weight = weight.contiguous();\n\n  int batch = 1;\n\n  if (input.ndimension() == 3) {\n    // Force batch\n    batch = 0;\n    input = input.view({1, input.size(0), input.size(1), input.size(2)});\n    offset = offset.view({1, offset.size(0), offset.size(1), offset.size(2)});\n    gradOutput = gradOutput.view(\n        {1, gradOutput.size(0), gradOutput.size(1), gradOutput.size(2)});\n  }\n\n  long batchSize = input.size(0);\n  long nInputPlane = input.size(1);\n  long inputHeight = input.size(2);\n  long inputWidth = input.size(3);\n\n  long nOutputPlane = weight.size(0);\n\n  long outputWidth =\n      (inputWidth + 2 * padW - (dilationW * (kW - 1) + 1)) / dW + 1;\n  long outputHeight =\n      (inputHeight + 2 * padH - (dilationH * (kH - 1) + 1)) / dH + 1;\n\n  TORCH_CHECK((offset.size(0) == batchSize), \"invalid batch size of offset\");\n  gradInput = gradInput.view({batchSize, nInputPlane, inputHeight, inputWidth});\n  columns = at::zeros(\n      {nInputPlane * kW * kH, im2col_step * outputHeight * outputWidth},\n      input.options());\n\n  // change order of grad output\n  gradOutput = gradOutput.view({batchSize / im2col_step, im2col_step,\n                                nOutputPlane, outputHeight, outputWidth});\n  gradOutput.transpose_(1, 2);\n\n  gradInput = gradInput.view({batchSize / im2col_step, im2col_step, 
nInputPlane,\n                              inputHeight, inputWidth});\n  input = input.view({batchSize / im2col_step, im2col_step, nInputPlane,\n                      inputHeight, inputWidth});\n  gradOffset = gradOffset.view({batchSize / im2col_step, im2col_step,\n                                deformable_group * 2 * kH * kW, outputHeight,\n                                outputWidth});\n  offset =\n      offset.view({batchSize / im2col_step, im2col_step,\n                   deformable_group * 2 * kH * kW, outputHeight, outputWidth});\n\n  for (int elt = 0; elt < batchSize / im2col_step; elt++) {\n    // divide into groups\n    columns = columns.view({group, columns.size(0) / group, columns.size(1)});\n    weight = weight.view({group, weight.size(0) / group, weight.size(1),\n                          weight.size(2), weight.size(3)});\n    gradOutput = gradOutput.view(\n        {gradOutput.size(0), group, gradOutput.size(1) / group,\n         gradOutput.size(2), gradOutput.size(3), gradOutput.size(4)});\n\n    for (int g = 0; g < group; g++) {\n      columns[g] = columns[g].addmm_(weight[g].flatten(1).transpose(0, 1),\n                                     gradOutput[elt][g].flatten(1), 0.0f, 1.0f);\n    }\n\n    columns =\n        columns.view({columns.size(0) * columns.size(1), columns.size(2)});\n    gradOutput = gradOutput.view(\n        {gradOutput.size(0), gradOutput.size(1) * gradOutput.size(2),\n         gradOutput.size(3), gradOutput.size(4), gradOutput.size(5)});\n\n    deformable_col2im_coord(columns, input[elt], offset[elt], nInputPlane,\n                            inputHeight, inputWidth, kH, kW, padH, padW, dH, dW,\n                            dilationH, dilationW, im2col_step, deformable_group,\n                            gradOffset[elt]);\n\n    deformable_col2im(columns, offset[elt], nInputPlane, inputHeight,\n                      inputWidth, kH, kW, padH, padW, dH, dW, dilationH,\n                      dilationW, im2col_step, deformable_group, 
gradInput[elt]);\n  }\n\n  gradOutput.transpose_(1, 2);\n  gradOutput =\n      gradOutput.view({batchSize, nOutputPlane, outputHeight, outputWidth});\n\n  gradInput = gradInput.view({batchSize, nInputPlane, inputHeight, inputWidth});\n  input = input.view({batchSize, nInputPlane, inputHeight, inputWidth});\n  gradOffset = gradOffset.view(\n      {batchSize, deformable_group * 2 * kH * kW, outputHeight, outputWidth});\n  offset = offset.view(\n      {batchSize, deformable_group * 2 * kH * kW, outputHeight, outputWidth});\n\n  if (batch == 0) {\n    gradOutput = gradOutput.view({nOutputPlane, outputHeight, outputWidth});\n    input = input.view({nInputPlane, inputHeight, inputWidth});\n    gradInput = gradInput.view({nInputPlane, inputHeight, inputWidth});\n    offset = offset.view({offset.size(1), offset.size(2), offset.size(3)});\n    // offset is 3D at this point, so take the sizes from gradOffset itself\n    gradOffset = gradOffset.view(\n        {gradOffset.size(1), gradOffset.size(2), gradOffset.size(3)});\n  }\n\n  return 1;\n}\n\nint deform_conv_backward_parameters_cuda(\n    at::Tensor input, at::Tensor offset, at::Tensor gradOutput,\n    at::Tensor gradWeight,  // at::Tensor gradBias,\n    at::Tensor columns, at::Tensor ones, int kW, int kH, int dW, int dH,\n    int padW, int padH, int dilationW, int dilationH, int group,\n    int deformable_group, float scale, int im2col_step) {\n  // todo: transpose and reshape outGrad\n  // todo: reshape columns\n  // todo: add im2col_step as input\n\n  shape_check(input, offset, &gradOutput, gradWeight, kH, kW, dH, dW, padH,\n              padW, dilationH, dilationW, group, deformable_group);\n  at::DeviceGuard guard(input.device());\n\n  input = input.contiguous();\n  offset = offset.contiguous();\n  gradOutput = gradOutput.contiguous();\n\n  int batch = 1;\n\n  if (input.ndimension() == 3) {\n    // Force batch\n    batch = 0;\n    input = input.view(\n        at::IntList({1, input.size(0), input.size(1), input.size(2)}));\n    gradOutput = gradOutput.view(\n        {1, gradOutput.size(0), 
gradOutput.size(1), gradOutput.size(2)});\n  }\n\n  long batchSize = input.size(0);\n  long nInputPlane = input.size(1);\n  long inputHeight = input.size(2);\n  long inputWidth = input.size(3);\n\n  long nOutputPlane = gradWeight.size(0);\n\n  long outputWidth =\n      (inputWidth + 2 * padW - (dilationW * (kW - 1) + 1)) / dW + 1;\n  long outputHeight =\n      (inputHeight + 2 * padH - (dilationH * (kH - 1) + 1)) / dH + 1;\n\n  TORCH_CHECK((offset.size(0) == batchSize), \"invalid batch size of offset\");\n\n  columns = at::zeros(\n      {nInputPlane * kW * kH, im2col_step * outputHeight * outputWidth},\n      input.options());\n\n  gradOutput = gradOutput.view({batchSize / im2col_step, im2col_step,\n                                nOutputPlane, outputHeight, outputWidth});\n  gradOutput.transpose_(1, 2);\n\n  at::Tensor gradOutputBuffer = at::zeros_like(gradOutput);\n  gradOutputBuffer =\n      gradOutputBuffer.view({batchSize / im2col_step, nOutputPlane, im2col_step,\n                             outputHeight, outputWidth});\n  gradOutputBuffer.copy_(gradOutput);\n  gradOutputBuffer =\n      gradOutputBuffer.view({batchSize / im2col_step, nOutputPlane,\n                             im2col_step * outputHeight, outputWidth});\n\n  gradOutput.transpose_(1, 2);\n  gradOutput =\n      gradOutput.view({batchSize, nOutputPlane, outputHeight, outputWidth});\n\n  input = input.view({batchSize / im2col_step, im2col_step, nInputPlane,\n                      inputHeight, inputWidth});\n  offset =\n      offset.view({batchSize / im2col_step, im2col_step,\n                   deformable_group * 2 * kH * kW, outputHeight, outputWidth});\n\n  for (int elt = 0; elt < batchSize / im2col_step; elt++) {\n    deformable_im2col(input[elt], offset[elt], nInputPlane, inputHeight,\n                      inputWidth, kH, kW, padH, padW, dH, dW, dilationH,\n                      dilationW, im2col_step, deformable_group, columns);\n\n    // divide into group\n    gradOutputBuffer = 
gradOutputBuffer.view(\n        {gradOutputBuffer.size(0), group, gradOutputBuffer.size(1) / group,\n         gradOutputBuffer.size(2), gradOutputBuffer.size(3)});\n    columns = columns.view({group, columns.size(0) / group, columns.size(1)});\n    gradWeight =\n        gradWeight.view({group, gradWeight.size(0) / group, gradWeight.size(1),\n                         gradWeight.size(2), gradWeight.size(3)});\n\n    for (int g = 0; g < group; g++) {\n      gradWeight[g] = gradWeight[g]\n                          .flatten(1)\n                          .addmm_(gradOutputBuffer[elt][g].flatten(1),\n                                  columns[g].transpose(1, 0), 1.0, scale)\n                          .view_as(gradWeight[g]);\n    }\n    gradOutputBuffer = gradOutputBuffer.view(\n        {gradOutputBuffer.size(0),\n         gradOutputBuffer.size(1) * gradOutputBuffer.size(2),\n         gradOutputBuffer.size(3), gradOutputBuffer.size(4)});\n    columns =\n        columns.view({columns.size(0) * columns.size(1), columns.size(2)});\n    gradWeight = gradWeight.view({gradWeight.size(0) * gradWeight.size(1),\n                                  gradWeight.size(2), gradWeight.size(3),\n                                  gradWeight.size(4)});\n  }\n\n  input = input.view({batchSize, nInputPlane, inputHeight, inputWidth});\n  offset = offset.view(\n      {batchSize, deformable_group * 2 * kH * kW, outputHeight, outputWidth});\n\n  if (batch == 0) {\n    gradOutput = gradOutput.view({nOutputPlane, outputHeight, outputWidth});\n    input = input.view({nInputPlane, inputHeight, inputWidth});\n  }\n\n  return 1;\n}\n\nvoid modulated_deform_conv_cuda_forward(\n    at::Tensor input, at::Tensor weight, at::Tensor bias, at::Tensor ones,\n    at::Tensor offset, at::Tensor mask, at::Tensor output, at::Tensor columns,\n    int kernel_h, int kernel_w, const int stride_h, const int stride_w,\n    const int pad_h, const int pad_w, const int dilation_h,\n    const int dilation_w, const int group, 
const int deformable_group,\n    const bool with_bias) {\n  TORCH_CHECK(input.is_contiguous(), \"input tensor has to be contiguous\");\n  TORCH_CHECK(weight.is_contiguous(), \"weight tensor has to be contiguous\");\n  at::DeviceGuard guard(input.device());\n\n  const int batch = input.size(0);\n  const int channels = input.size(1);\n  const int height = input.size(2);\n  const int width = input.size(3);\n\n  const int channels_out = weight.size(0);\n  const int channels_kernel = weight.size(1);\n  const int kernel_h_ = weight.size(2);\n  const int kernel_w_ = weight.size(3);\n\n  if (kernel_h_ != kernel_h || kernel_w_ != kernel_w)\n    AT_ERROR(\"Input shape and kernel shape won't match: (%d x %d vs %d x %d).\",\n             kernel_h, kernel_w, kernel_h_, kernel_w_);\n  if (channels != channels_kernel * group)\n    AT_ERROR(\"Input shape and kernel channels won't match: (%d vs %d).\",\n             channels, channels_kernel * group);\n\n  const int height_out =\n      (height + 2 * pad_h - (dilation_h * (kernel_h - 1) + 1)) / stride_h + 1;\n  const int width_out =\n      (width + 2 * pad_w - (dilation_w * (kernel_w - 1) + 1)) / stride_w + 1;\n\n  if (ones.ndimension() != 2 ||\n      ones.size(0) * ones.size(1) < height_out * width_out) {\n    // Resize plane and fill with ones...\n    ones = at::ones({height_out, width_out}, input.options());\n  }\n\n  // resize output\n  output = output.view({batch, channels_out, height_out, width_out}).zero_();\n  // resize temporary columns\n  columns =\n      at::zeros({channels * kernel_h * kernel_w, 1 * height_out * width_out},\n                input.options());\n\n  output = output.view({output.size(0), group, output.size(1) / group,\n                        output.size(2), output.size(3)});\n\n  for (int b = 0; b < batch; b++) {\n    modulated_deformable_im2col_cuda(\n        input[b], offset[b], mask[b], 1, channels, height, width, height_out,\n        width_out, kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w,\n      
  dilation_h, dilation_w, deformable_group, columns);\n\n    // divide into group\n    weight = weight.view({group, weight.size(0) / group, weight.size(1),\n                          weight.size(2), weight.size(3)});\n    columns = columns.view({group, columns.size(0) / group, columns.size(1)});\n\n    for (int g = 0; g < group; g++) {\n      output[b][g] = output[b][g]\n                         .flatten(1)\n                         .addmm_(weight[g].flatten(1), columns[g])\n                         .view_as(output[b][g]);\n    }\n\n    weight = weight.view({weight.size(0) * weight.size(1), weight.size(2),\n                          weight.size(3), weight.size(4)});\n    columns =\n        columns.view({columns.size(0) * columns.size(1), columns.size(2)});\n  }\n\n  output = output.view({output.size(0), output.size(1) * output.size(2),\n                        output.size(3), output.size(4)});\n\n  if (with_bias) {\n    output += bias.view({1, bias.size(0), 1, 1});\n  }\n}\n\nvoid modulated_deform_conv_cuda_backward(\n    at::Tensor input, at::Tensor weight, at::Tensor bias, at::Tensor ones,\n    at::Tensor offset, at::Tensor mask, at::Tensor columns,\n    at::Tensor grad_input, at::Tensor grad_weight, at::Tensor grad_bias,\n    at::Tensor grad_offset, at::Tensor grad_mask, at::Tensor grad_output,\n    int kernel_h, int kernel_w, int stride_h, int stride_w, int pad_h,\n    int pad_w, int dilation_h, int dilation_w, int group, int deformable_group,\n    const bool with_bias) {\n  TORCH_CHECK(input.is_contiguous(), \"input tensor has to be contiguous\");\n  TORCH_CHECK(weight.is_contiguous(), \"weight tensor has to be contiguous\");\n  at::DeviceGuard guard(input.device());\n\n  const int batch = input.size(0);\n  const int channels = input.size(1);\n  const int height = input.size(2);\n  const int width = input.size(3);\n\n  const int channels_kernel = weight.size(1);\n  const int kernel_h_ = weight.size(2);\n  const int kernel_w_ = weight.size(3);\n  if (kernel_h_ 
!= kernel_h || kernel_w_ != kernel_w)\n    AT_ERROR(\"Input shape and kernel shape won't match: (%d x %d vs %d x %d).\",\n             kernel_h, kernel_w, kernel_h_, kernel_w_);\n  if (channels != channels_kernel * group)\n    AT_ERROR(\"Input shape and kernel channels won't match: (%d vs %d).\",\n             channels, channels_kernel * group);\n\n  const int height_out =\n      (height + 2 * pad_h - (dilation_h * (kernel_h - 1) + 1)) / stride_h + 1;\n  const int width_out =\n      (width + 2 * pad_w - (dilation_w * (kernel_w - 1) + 1)) / stride_w + 1;\n\n  if (ones.ndimension() != 2 ||\n      ones.size(0) * ones.size(1) < height_out * width_out) {\n    // Resize plane and fill with ones...\n    ones = at::ones({height_out, width_out}, input.options());\n  }\n\n  grad_input = grad_input.view({batch, channels, height, width});\n  columns = at::zeros({channels * kernel_h * kernel_w, height_out * width_out},\n                      input.options());\n\n  grad_output =\n      grad_output.view({grad_output.size(0), group, grad_output.size(1) / group,\n                        grad_output.size(2), grad_output.size(3)});\n\n  for (int b = 0; b < batch; b++) {\n    // divide into groups\n    columns = columns.view({group, columns.size(0) / group, columns.size(1)});\n    weight = weight.view({group, weight.size(0) / group, weight.size(1),\n                          weight.size(2), weight.size(3)});\n\n    for (int g = 0; g < group; g++) {\n      columns[g].addmm_(weight[g].flatten(1).transpose(0, 1),\n                        grad_output[b][g].flatten(1), 0.0f, 1.0f);\n    }\n\n    columns =\n        columns.view({columns.size(0) * columns.size(1), columns.size(2)});\n    weight = weight.view({weight.size(0) * weight.size(1), weight.size(2),\n                          weight.size(3), weight.size(4)});\n\n    // gradient w.r.t. 
input coordinate data\n    modulated_deformable_col2im_coord_cuda(\n        columns, input[b], offset[b], mask[b], 1, channels, height, width,\n        height_out, width_out, kernel_h, kernel_w, pad_h, pad_w, stride_h,\n        stride_w, dilation_h, dilation_w, deformable_group, grad_offset[b],\n        grad_mask[b]);\n    // gradient w.r.t. input data\n    modulated_deformable_col2im_cuda(\n        columns, offset[b], mask[b], 1, channels, height, width, height_out,\n        width_out, kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w,\n        dilation_h, dilation_w, deformable_group, grad_input[b]);\n\n    // gradient w.r.t. weight, dWeight should accumulate across the batch and\n    // group\n    modulated_deformable_im2col_cuda(\n        input[b], offset[b], mask[b], 1, channels, height, width, height_out,\n        width_out, kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w,\n        dilation_h, dilation_w, deformable_group, columns);\n\n    columns = columns.view({group, columns.size(0) / group, columns.size(1)});\n    grad_weight = grad_weight.view({group, grad_weight.size(0) / group,\n                                    grad_weight.size(1), grad_weight.size(2),\n                                    grad_weight.size(3)});\n    if (with_bias)\n      grad_bias = grad_bias.view({group, grad_bias.size(0) / group});\n\n    for (int g = 0; g < group; g++) {\n      grad_weight[g] =\n          grad_weight[g]\n              .flatten(1)\n              .addmm_(grad_output[b][g].flatten(1), columns[g].transpose(0, 1))\n              .view_as(grad_weight[g]);\n      if (with_bias) {\n        grad_bias[g] =\n            grad_bias[g]\n                .view({-1, 1})\n                .addmm_(grad_output[b][g].flatten(1), ones.view({-1, 1}))\n                .view(-1);\n      }\n    }\n\n    columns =\n        columns.view({columns.size(0) * columns.size(1), columns.size(2)});\n    grad_weight = grad_weight.view({grad_weight.size(0) * grad_weight.size(1),\n              
                      grad_weight.size(2), grad_weight.size(3),\n                                    grad_weight.size(4)});\n    if (with_bias)\n      grad_bias = grad_bias.view({grad_bias.size(0) * grad_bias.size(1)});\n  }\n  grad_output = grad_output.view({grad_output.size(0) * grad_output.size(1),\n                                  grad_output.size(2), grad_output.size(3),\n                                  grad_output.size(4)});\n}\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n  m.def(\"deform_conv_forward_cuda\", &deform_conv_forward_cuda,\n        \"deform forward (CUDA)\");\n  m.def(\"deform_conv_backward_input_cuda\", &deform_conv_backward_input_cuda,\n        \"deform_conv_backward_input (CUDA)\");\n  m.def(\"deform_conv_backward_parameters_cuda\",\n        &deform_conv_backward_parameters_cuda,\n        \"deform_conv_backward_parameters (CUDA)\");\n  m.def(\"modulated_deform_conv_cuda_forward\",\n        &modulated_deform_conv_cuda_forward,\n        \"modulated deform conv forward (CUDA)\");\n  m.def(\"modulated_deform_conv_cuda_backward\",\n        &modulated_deform_conv_cuda_backward,\n        \"modulated deform conv backward (CUDA)\");\n}\n"
  },
  {
    "path": "pcdet/ops/dcn/src/deform_conv_cuda_kernel.cu",
    "content": "/*!\n ******************* BEGIN Caffe Copyright Notice and Disclaimer ****************\n *\n * COPYRIGHT\n *\n * All contributions by the University of California:\n * Copyright (c) 2014-2017 The Regents of the University of California (Regents)\n * All rights reserved.\n *\n * All other contributions:\n * Copyright (c) 2014-2017, the respective contributors\n * All rights reserved.\n *\n * Caffe uses a shared copyright model: each contributor holds copyright over\n * their contributions to Caffe. The project versioning records all such\n * contribution and copyright details. If a contributor wants to further mark\n * their specific copyright on a particular contribution, they should indicate\n * their copyright solely in the commit message of the change when it is\n * committed.\n *\n * LICENSE\n *\n * Redistribution and use in source and binary forms, with or without\n * modification, are permitted provided that the following conditions are met:\n *\n * 1. Redistributions of source code must retain the above copyright notice, this\n * list of conditions and the following disclaimer.\n * 2. Redistributions in binary form must reproduce the above copyright notice,\n * this list of conditions and the following disclaimer in the documentation\n * and/or other materials provided with the distribution.\n *\n * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND\n * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED\n * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\n * DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR\n * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES\n * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND\n * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS\n * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n *\n * CONTRIBUTION AGREEMENT\n *\n * By contributing to the BVLC/caffe repository through pull-request, comment,\n * or otherwise, the contributor releases their content to the\n * license and copyright terms herein.\n *\n ***************** END Caffe Copyright Notice and Disclaimer ********************\n *\n * Copyright (c) 2018 Microsoft\n * Licensed under The MIT License [see LICENSE for details]\n * \\file modulated_deformable_im2col.cuh\n * \\brief Function definitions of converting an image to\n * column matrix based on kernel, padding, dilation, and offset.\n * These functions are mainly used in deformable convolution operators.\n * \\ref: https://arxiv.org/abs/1703.06211\n * \\author Yuwen Xiong, Haozhi Qi, Jifeng Dai, Xizhou Zhu, Han Hu, Dazhi Cheng\n */\n\n// modified from https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/deform_conv_cuda_kernel.cu\n\n#include <ATen/ATen.h>\n#include <ATen/cuda/CUDAContext.h>\n#include <THC/THCAtomics.cuh>\n#include <stdio.h>\n#include <math.h>\n#include <float.h>\n\nusing namespace at;\n\n#define CUDA_KERNEL_LOOP(i, n)                                 \\\n  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); \\\n       i += blockDim.x * gridDim.x)\n\nconst int CUDA_NUM_THREADS = 1024;\nconst int kMaxGridNum = 65535;\n\ninline int GET_BLOCKS(const int N)\n{\n  return std::min(kMaxGridNum, (N + CUDA_NUM_THREADS - 1) / 
CUDA_NUM_THREADS);\n}\n\ntemplate <typename scalar_t>\n__device__ scalar_t deformable_im2col_bilinear(const scalar_t *bottom_data, const int data_width,\n                                               const int height, const int width, scalar_t h, scalar_t w)\n{\n\n  int h_low = floor(h);\n  int w_low = floor(w);\n  int h_high = h_low + 1;\n  int w_high = w_low + 1;\n\n  scalar_t lh = h - h_low;\n  scalar_t lw = w - w_low;\n  scalar_t hh = 1 - lh, hw = 1 - lw;\n\n  scalar_t v1 = 0;\n  if (h_low >= 0 && w_low >= 0)\n    v1 = bottom_data[h_low * data_width + w_low];\n  scalar_t v2 = 0;\n  if (h_low >= 0 && w_high <= width - 1)\n    v2 = bottom_data[h_low * data_width + w_high];\n  scalar_t v3 = 0;\n  if (h_high <= height - 1 && w_low >= 0)\n    v3 = bottom_data[h_high * data_width + w_low];\n  scalar_t v4 = 0;\n  if (h_high <= height - 1 && w_high <= width - 1)\n    v4 = bottom_data[h_high * data_width + w_high];\n\n  scalar_t w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;\n\n  scalar_t val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);\n  return val;\n}\n\ntemplate <typename scalar_t>\n__device__ scalar_t get_gradient_weight(scalar_t argmax_h, scalar_t argmax_w,\n                                        const int h, const int w, const int height, const int width)\n{\n\n  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 || argmax_w >= width)\n  {\n    //empty\n    return 0;\n  }\n\n  int argmax_h_low = floor(argmax_h);\n  int argmax_w_low = floor(argmax_w);\n  int argmax_h_high = argmax_h_low + 1;\n  int argmax_w_high = argmax_w_low + 1;\n\n  scalar_t weight = 0;\n  if (h == argmax_h_low && w == argmax_w_low)\n    weight = (h + 1 - argmax_h) * (w + 1 - argmax_w);\n  if (h == argmax_h_low && w == argmax_w_high)\n    weight = (h + 1 - argmax_h) * (argmax_w + 1 - w);\n  if (h == argmax_h_high && w == argmax_w_low)\n    weight = (argmax_h + 1 - h) * (w + 1 - argmax_w);\n  if (h == argmax_h_high && w == argmax_w_high)\n    weight = (argmax_h + 1 - h) * 
(argmax_w + 1 - w);\n  return weight;\n}\n\ntemplate <typename scalar_t>\n__device__ scalar_t get_coordinate_weight(scalar_t argmax_h, scalar_t argmax_w,\n                                          const int height, const int width, const scalar_t *im_data,\n                                          const int data_width, const int bp_dir)\n{\n\n  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 || argmax_w >= width)\n  {\n    //empty\n    return 0;\n  }\n\n  int argmax_h_low = floor(argmax_h);\n  int argmax_w_low = floor(argmax_w);\n  int argmax_h_high = argmax_h_low + 1;\n  int argmax_w_high = argmax_w_low + 1;\n\n  scalar_t weight = 0;\n\n  if (bp_dir == 0)\n  {\n    if (argmax_h_low >= 0 && argmax_w_low >= 0)\n      weight += -1 * (argmax_w_low + 1 - argmax_w) * im_data[argmax_h_low * data_width + argmax_w_low];\n    if (argmax_h_low >= 0 && argmax_w_high <= width - 1)\n      weight += -1 * (argmax_w - argmax_w_low) * im_data[argmax_h_low * data_width + argmax_w_high];\n    if (argmax_h_high <= height - 1 && argmax_w_low >= 0)\n      weight += (argmax_w_low + 1 - argmax_w) * im_data[argmax_h_high * data_width + argmax_w_low];\n    if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)\n      weight += (argmax_w - argmax_w_low) * im_data[argmax_h_high * data_width + argmax_w_high];\n  }\n  else if (bp_dir == 1)\n  {\n    if (argmax_h_low >= 0 && argmax_w_low >= 0)\n      weight += -1 * (argmax_h_low + 1 - argmax_h) * im_data[argmax_h_low * data_width + argmax_w_low];\n    if (argmax_h_low >= 0 && argmax_w_high <= width - 1)\n      weight += (argmax_h_low + 1 - argmax_h) * im_data[argmax_h_low * data_width + argmax_w_high];\n    if (argmax_h_high <= height - 1 && argmax_w_low >= 0)\n      weight += -1 * (argmax_h - argmax_h_low) * im_data[argmax_h_high * data_width + argmax_w_low];\n    if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)\n      weight += (argmax_h - argmax_h_low) * im_data[argmax_h_high * data_width + argmax_w_high];\n  
}\n\n  return weight;\n}\n\ntemplate <typename scalar_t>\n__global__ void deformable_im2col_gpu_kernel(const int n, const scalar_t *data_im, const scalar_t *data_offset,\n                                             const int height, const int width, const int kernel_h, const int kernel_w,\n                                             const int pad_h, const int pad_w, const int stride_h, const int stride_w,\n                                             const int dilation_h, const int dilation_w, const int channel_per_deformable_group,\n                                             const int batch_size, const int num_channels, const int deformable_group,\n                                             const int height_col, const int width_col,\n                                             scalar_t *data_col)\n{\n  CUDA_KERNEL_LOOP(index, n)\n  {\n    // index index of output matrix\n    const int w_col = index % width_col;\n    const int h_col = (index / width_col) % height_col;\n    const int b_col = (index / width_col / height_col) % batch_size;\n    const int c_im = (index / width_col / height_col) / batch_size;\n    const int c_col = c_im * kernel_h * kernel_w;\n\n    // compute deformable group index\n    const int deformable_group_index = c_im / channel_per_deformable_group;\n\n    const int h_in = h_col * stride_h - pad_h;\n    const int w_in = w_col * stride_w - pad_w;\n    scalar_t *data_col_ptr = data_col + ((c_col * batch_size + b_col) * height_col + h_col) * width_col + w_col;\n    //const scalar_t* data_im_ptr = data_im + ((b_col * num_channels + c_im) * height + h_in) * width + w_in;\n    const scalar_t *data_im_ptr = data_im + (b_col * num_channels + c_im) * height * width;\n    const scalar_t *data_offset_ptr = data_offset + (b_col * deformable_group + deformable_group_index) * 2 * kernel_h * kernel_w * height_col * width_col;\n\n    for (int i = 0; i < kernel_h; ++i)\n    {\n      for (int j = 0; j < kernel_w; ++j)\n      {\n        const int 
data_offset_h_ptr = ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col + w_col;\n        const int data_offset_w_ptr = ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col + w_col;\n        const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];\n        const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];\n        scalar_t val = static_cast<scalar_t>(0);\n        const scalar_t h_im = h_in + i * dilation_h + offset_h;\n        const scalar_t w_im = w_in + j * dilation_w + offset_w;\n        if (h_im > -1 && w_im > -1 && h_im < height && w_im < width)\n        {\n          //const scalar_t map_h = i * dilation_h + offset_h;\n          //const scalar_t map_w = j * dilation_w + offset_w;\n          //const int cur_height = height - h_in;\n          //const int cur_width = width - w_in;\n          //val = deformable_im2col_bilinear(data_im_ptr, width, cur_height, cur_width, map_h, map_w);\n          val = deformable_im2col_bilinear(data_im_ptr, width, height, width, h_im, w_im);\n        }\n        *data_col_ptr = val;\n        data_col_ptr += batch_size * height_col * width_col;\n      }\n    }\n  }\n}\n\nvoid deformable_im2col(\n    const at::Tensor data_im, const at::Tensor data_offset, const int channels,\n    const int height, const int width, const int ksize_h, const int ksize_w,\n    const int pad_h, const int pad_w, const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w, const int parallel_imgs,\n    const int deformable_group, at::Tensor data_col)\n{\n  // num_axes should be smaller than block size\n  // todo: check parallel_imgs is correctly passed in\n  int height_col = (height + 2 * pad_h - (dilation_h * (ksize_h - 1) + 1)) / stride_h + 1;\n  int width_col = (width + 2 * pad_w - (dilation_w * (ksize_w - 1) + 1)) / stride_w + 1;\n  int num_kernels = channels * height_col * width_col * parallel_imgs;\n  int channel_per_deformable_group = channels / deformable_group;\n\n  
AT_DISPATCH_FLOATING_TYPES_AND_HALF(\n      data_im.scalar_type(), \"deformable_im2col_gpu\", ([&] {\n        const scalar_t *data_im_ = data_im.data<scalar_t>();\n        const scalar_t *data_offset_ = data_offset.data<scalar_t>();\n        scalar_t *data_col_ = data_col.data<scalar_t>();\n\n        deformable_im2col_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream()>>>(\n            num_kernels, data_im_, data_offset_, height, width, ksize_h, ksize_w,\n            pad_h, pad_w, stride_h, stride_w, dilation_h, dilation_w,\n            channel_per_deformable_group, parallel_imgs, channels, deformable_group,\n            height_col, width_col, data_col_);\n      }));\n\n  cudaError_t err = cudaGetLastError();\n  if (err != cudaSuccess)\n  {\n    printf(\"error in deformable_im2col: %s\\n\", cudaGetErrorString(err));\n  }\n}\n\ntemplate <typename scalar_t>\n__global__ void deformable_col2im_gpu_kernel(\n    const int n, const scalar_t *data_col, const scalar_t *data_offset,\n    const int channels, const int height, const int width,\n    const int kernel_h, const int kernel_w,\n    const int pad_h, const int pad_w,\n    const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w,\n    const int channel_per_deformable_group,\n    const int batch_size, const int deformable_group,\n    const int height_col, const int width_col,\n    scalar_t *grad_im)\n{\n  CUDA_KERNEL_LOOP(index, n)\n  {\n    const int j = (index / width_col / height_col / batch_size) % kernel_w;\n    const int i = (index / width_col / height_col / batch_size / kernel_w) % kernel_h;\n    const int c = index / width_col / height_col / batch_size / kernel_w / kernel_h;\n    // compute the start and end of the output\n\n    const int deformable_group_index = c / channel_per_deformable_group;\n\n    int w_out = index % width_col;\n    int h_out = (index / width_col) % height_col;\n    int b = (index / width_col / height_col) % 
batch_size;\n    int w_in = w_out * stride_w - pad_w;\n    int h_in = h_out * stride_h - pad_h;\n\n    const scalar_t *data_offset_ptr = data_offset + (b * deformable_group + deformable_group_index) *\n                                                        2 * kernel_h * kernel_w * height_col * width_col;\n    const int data_offset_h_ptr = ((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out;\n    const int data_offset_w_ptr = ((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out;\n    const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];\n    const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];\n    const scalar_t cur_inv_h_data = h_in + i * dilation_h + offset_h;\n    const scalar_t cur_inv_w_data = w_in + j * dilation_w + offset_w;\n\n    const scalar_t cur_top_grad = data_col[index];\n    const int cur_h = (int)cur_inv_h_data;\n    const int cur_w = (int)cur_inv_w_data;\n    for (int dy = -2; dy <= 2; dy++)\n    {\n      for (int dx = -2; dx <= 2; dx++)\n      {\n        if (cur_h + dy >= 0 && cur_h + dy < height &&\n            cur_w + dx >= 0 && cur_w + dx < width &&\n            abs(cur_inv_h_data - (cur_h + dy)) < 1 &&\n            abs(cur_inv_w_data - (cur_w + dx)) < 1)\n        {\n          int cur_bottom_grad_pos = ((b * channels + c) * height + cur_h + dy) * width + cur_w + dx;\n          scalar_t weight = get_gradient_weight(cur_inv_h_data, cur_inv_w_data, cur_h + dy, cur_w + dx, height, width);\n          atomicAdd(grad_im + cur_bottom_grad_pos, weight * cur_top_grad);\n        }\n      }\n    }\n  }\n}\n\nvoid deformable_col2im(\n    const at::Tensor data_col, const at::Tensor data_offset, const int channels,\n    const int height, const int width, const int ksize_h,\n    const int ksize_w, const int pad_h, const int pad_w,\n    const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w,\n    const int parallel_imgs, const int deformable_group,\n    at::Tensor 
grad_im)\n{\n\n  // todo: make sure parallel_imgs is passed in correctly\n  int height_col = (height + 2 * pad_h - (dilation_h * (ksize_h - 1) + 1)) / stride_h + 1;\n  int width_col = (width + 2 * pad_w - (dilation_w * (ksize_w - 1) + 1)) / stride_w + 1;\n  int num_kernels = channels * ksize_h * ksize_w * height_col * width_col * parallel_imgs;\n  int channel_per_deformable_group = channels / deformable_group;\n\n  AT_DISPATCH_FLOATING_TYPES_AND_HALF(\n      data_col.scalar_type(), \"deformable_col2im_gpu\", ([&] {\n        const scalar_t *data_col_ = data_col.data<scalar_t>();\n        const scalar_t *data_offset_ = data_offset.data<scalar_t>();\n        scalar_t *grad_im_ = grad_im.data<scalar_t>();\n\n        deformable_col2im_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream()>>>(\n            num_kernels, data_col_, data_offset_, channels, height, width, ksize_h,\n            ksize_w, pad_h, pad_w, stride_h, stride_w,\n            dilation_h, dilation_w, channel_per_deformable_group,\n            parallel_imgs, deformable_group, height_col, width_col, grad_im_);\n      }));\n\n  cudaError_t err = cudaGetLastError();\n  if (err != cudaSuccess)\n  {\n    printf(\"error in deformable_col2im: %s\\n\", cudaGetErrorString(err));\n  }\n}\n\ntemplate <typename scalar_t>\n__global__ void deformable_col2im_coord_gpu_kernel(const int n, const scalar_t *data_col,\n                                                   const scalar_t *data_im, const scalar_t *data_offset,\n                                                   const int channels, const int height, const int width,\n                                                   const int kernel_h, const int kernel_w,\n                                                   const int pad_h, const int pad_w,\n                                                   const int stride_h, const int stride_w,\n                                                   const int dilation_h, const int dilation_w,\n 
                                                  const int channel_per_deformable_group,\n                                                   const int batch_size, const int offset_channels, const int deformable_group,\n                                                   const int height_col, const int width_col, scalar_t *grad_offset)\n{\n  CUDA_KERNEL_LOOP(index, n)\n  {\n    scalar_t val = 0;\n    int w = index % width_col;\n    int h = (index / width_col) % height_col;\n    int c = (index / width_col / height_col) % offset_channels;\n    int b = (index / width_col / height_col) / offset_channels;\n    // compute the start and end of the output\n\n    const int deformable_group_index = c / (2 * kernel_h * kernel_w);\n    const int col_step = kernel_h * kernel_w;\n    int cnt = 0;\n    const scalar_t *data_col_ptr = data_col + deformable_group_index * channel_per_deformable_group *\n                                                  batch_size * width_col * height_col;\n    const scalar_t *data_im_ptr = data_im + (b * deformable_group + deformable_group_index) *\n                                                channel_per_deformable_group / kernel_h / kernel_w * height * width;\n    const scalar_t *data_offset_ptr = data_offset + (b * deformable_group + deformable_group_index) * 2 *\n                                                        kernel_h * kernel_w * height_col * width_col;\n\n    const int offset_c = c - deformable_group_index * 2 * kernel_h * kernel_w;\n\n    for (int col_c = (offset_c / 2); col_c < channel_per_deformable_group; col_c += col_step)\n    {\n      const int col_pos = (((col_c * batch_size + b) * height_col) + h) * width_col + w;\n      const int bp_dir = offset_c % 2;\n\n      int j = (col_pos / width_col / height_col / batch_size) % kernel_w;\n      int i = (col_pos / width_col / height_col / batch_size / kernel_w) % kernel_h;\n      int w_out = col_pos % width_col;\n      int h_out = (col_pos / width_col) % height_col;\n      int w_in = 
w_out * stride_w - pad_w;\n      int h_in = h_out * stride_h - pad_h;\n      const int data_offset_h_ptr = (((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out);\n      const int data_offset_w_ptr = (((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out);\n      const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];\n      const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];\n      scalar_t inv_h = h_in + i * dilation_h + offset_h;\n      scalar_t inv_w = w_in + j * dilation_w + offset_w;\n      if (inv_h <= -1 || inv_w <= -1 || inv_h >= height || inv_w >= width)\n      {\n        inv_h = inv_w = -2;\n      }\n      const scalar_t weight = get_coordinate_weight(\n          inv_h, inv_w,\n          height, width, data_im_ptr + cnt * height * width, width, bp_dir);\n      val += weight * data_col_ptr[col_pos];\n      cnt += 1;\n    }\n\n    grad_offset[index] = val;\n  }\n}\n\nvoid deformable_col2im_coord(\n    const at::Tensor data_col, const at::Tensor data_im, const at::Tensor data_offset,\n    const int channels, const int height, const int width, const int ksize_h,\n    const int ksize_w, const int pad_h, const int pad_w, const int stride_h,\n    const int stride_w, const int dilation_h, const int dilation_w,\n    const int parallel_imgs, const int deformable_group, at::Tensor grad_offset)\n{\n\n  int height_col = (height + 2 * pad_h - (dilation_h * (ksize_h - 1) + 1)) / stride_h + 1;\n  int width_col = (width + 2 * pad_w - (dilation_w * (ksize_w - 1) + 1)) / stride_w + 1;\n  int num_kernels = height_col * width_col * 2 * ksize_h * ksize_w * deformable_group * parallel_imgs;\n  int channel_per_deformable_group = channels * ksize_h * ksize_w / deformable_group;\n\n  AT_DISPATCH_FLOATING_TYPES_AND_HALF(\n      data_col.scalar_type(), \"deformable_col2im_coord_gpu\", ([&] {\n        const scalar_t *data_col_ = data_col.data<scalar_t>();\n        const scalar_t *data_im_ = data_im.data<scalar_t>();\n        const 
scalar_t *data_offset_ = data_offset.data<scalar_t>();\n        scalar_t *grad_offset_ = grad_offset.data<scalar_t>();\n\n        deformable_col2im_coord_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream()>>>(\n            num_kernels, data_col_, data_im_, data_offset_, channels, height, width,\n            ksize_h, ksize_w, pad_h, pad_w, stride_h, stride_w,\n            dilation_h, dilation_w, channel_per_deformable_group,\n            parallel_imgs, 2 * ksize_h * ksize_w * deformable_group, deformable_group,\n            height_col, width_col, grad_offset_);\n      }));\n}\n\ntemplate <typename scalar_t>\n__device__ scalar_t dmcn_im2col_bilinear(const scalar_t *bottom_data, const int data_width,\n                                         const int height, const int width, scalar_t h, scalar_t w)\n{\n  int h_low = floor(h);\n  int w_low = floor(w);\n  int h_high = h_low + 1;\n  int w_high = w_low + 1;\n\n  scalar_t lh = h - h_low;\n  scalar_t lw = w - w_low;\n  scalar_t hh = 1 - lh, hw = 1 - lw;\n\n  scalar_t v1 = 0;\n  if (h_low >= 0 && w_low >= 0)\n    v1 = bottom_data[h_low * data_width + w_low];\n  scalar_t v2 = 0;\n  if (h_low >= 0 && w_high <= width - 1)\n    v2 = bottom_data[h_low * data_width + w_high];\n  scalar_t v3 = 0;\n  if (h_high <= height - 1 && w_low >= 0)\n    v3 = bottom_data[h_high * data_width + w_low];\n  scalar_t v4 = 0;\n  if (h_high <= height - 1 && w_high <= width - 1)\n    v4 = bottom_data[h_high * data_width + w_high];\n\n  scalar_t w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;\n\n  scalar_t val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);\n  return val;\n}\n\ntemplate <typename scalar_t>\n__device__ scalar_t dmcn_get_gradient_weight(scalar_t argmax_h, scalar_t argmax_w,\n                                             const int h, const int w, const int height, const int width)\n{\n  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 || argmax_w >= width)\n  {\n    //empty\n    return 
0;\n  }\n\n  int argmax_h_low = floor(argmax_h);\n  int argmax_w_low = floor(argmax_w);\n  int argmax_h_high = argmax_h_low + 1;\n  int argmax_w_high = argmax_w_low + 1;\n\n  scalar_t weight = 0;\n  if (h == argmax_h_low && w == argmax_w_low)\n    weight = (h + 1 - argmax_h) * (w + 1 - argmax_w);\n  if (h == argmax_h_low && w == argmax_w_high)\n    weight = (h + 1 - argmax_h) * (argmax_w + 1 - w);\n  if (h == argmax_h_high && w == argmax_w_low)\n    weight = (argmax_h + 1 - h) * (w + 1 - argmax_w);\n  if (h == argmax_h_high && w == argmax_w_high)\n    weight = (argmax_h + 1 - h) * (argmax_w + 1 - w);\n  return weight;\n}\n\ntemplate <typename scalar_t>\n__device__ scalar_t dmcn_get_coordinate_weight(scalar_t argmax_h, scalar_t argmax_w,\n                                               const int height, const int width, const scalar_t *im_data,\n                                               const int data_width, const int bp_dir)\n{\n  if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 || argmax_w >= width)\n  {\n    //empty\n    return 0;\n  }\n\n  int argmax_h_low = floor(argmax_h);\n  int argmax_w_low = floor(argmax_w);\n  int argmax_h_high = argmax_h_low + 1;\n  int argmax_w_high = argmax_w_low + 1;\n\n  scalar_t weight = 0;\n\n  if (bp_dir == 0)\n  {\n    if (argmax_h_low >= 0 && argmax_w_low >= 0)\n      weight += -1 * (argmax_w_low + 1 - argmax_w) * im_data[argmax_h_low * data_width + argmax_w_low];\n    if (argmax_h_low >= 0 && argmax_w_high <= width - 1)\n      weight += -1 * (argmax_w - argmax_w_low) * im_data[argmax_h_low * data_width + argmax_w_high];\n    if (argmax_h_high <= height - 1 && argmax_w_low >= 0)\n      weight += (argmax_w_low + 1 - argmax_w) * im_data[argmax_h_high * data_width + argmax_w_low];\n    if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)\n      weight += (argmax_w - argmax_w_low) * im_data[argmax_h_high * data_width + argmax_w_high];\n  }\n  else if (bp_dir == 1)\n  {\n    if (argmax_h_low >= 0 && 
argmax_w_low >= 0)\n      weight += -1 * (argmax_h_low + 1 - argmax_h) * im_data[argmax_h_low * data_width + argmax_w_low];\n    if (argmax_h_low >= 0 && argmax_w_high <= width - 1)\n      weight += (argmax_h_low + 1 - argmax_h) * im_data[argmax_h_low * data_width + argmax_w_high];\n    if (argmax_h_high <= height - 1 && argmax_w_low >= 0)\n      weight += -1 * (argmax_h - argmax_h_low) * im_data[argmax_h_high * data_width + argmax_w_low];\n    if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)\n      weight += (argmax_h - argmax_h_low) * im_data[argmax_h_high * data_width + argmax_w_high];\n  }\n\n  return weight;\n}\n\ntemplate <typename scalar_t>\n__global__ void modulated_deformable_im2col_gpu_kernel(const int n,\n                                                       const scalar_t *data_im, const scalar_t *data_offset, const scalar_t *data_mask,\n                                                       const int height, const int width, const int kernel_h, const int kernel_w,\n                                                       const int pad_h, const int pad_w,\n                                                       const int stride_h, const int stride_w,\n                                                       const int dilation_h, const int dilation_w,\n                                                       const int channel_per_deformable_group,\n                                                       const int batch_size, const int num_channels, const int deformable_group,\n                                                       const int height_col, const int width_col,\n                                                       scalar_t *data_col)\n{\n  CUDA_KERNEL_LOOP(index, n)\n  {\n    // index index of output matrix\n    const int w_col = index % width_col;\n    const int h_col = (index / width_col) % height_col;\n    const int b_col = (index / width_col / height_col) % batch_size;\n    const int c_im = (index / width_col / height_col) / 
batch_size;\n    const int c_col = c_im * kernel_h * kernel_w;\n\n    // compute deformable group index\n    const int deformable_group_index = c_im / channel_per_deformable_group;\n\n    const int h_in = h_col * stride_h - pad_h;\n    const int w_in = w_col * stride_w - pad_w;\n\n    scalar_t *data_col_ptr = data_col + ((c_col * batch_size + b_col) * height_col + h_col) * width_col + w_col;\n    //const float* data_im_ptr = data_im + ((b_col * num_channels + c_im) * height + h_in) * width + w_in;\n    const scalar_t *data_im_ptr = data_im + (b_col * num_channels + c_im) * height * width;\n    const scalar_t *data_offset_ptr = data_offset + (b_col * deformable_group + deformable_group_index) * 2 * kernel_h * kernel_w * height_col * width_col;\n\n    const scalar_t *data_mask_ptr = data_mask + (b_col * deformable_group + deformable_group_index) * kernel_h * kernel_w * height_col * width_col;\n\n    for (int i = 0; i < kernel_h; ++i)\n    {\n      for (int j = 0; j < kernel_w; ++j)\n      {\n        const int data_offset_h_ptr = ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col + w_col;\n        const int data_offset_w_ptr = ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col + w_col;\n        const int data_mask_hw_ptr = ((i * kernel_w + j) * height_col + h_col) * width_col + w_col;\n        const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];\n        const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];\n        const scalar_t mask = data_mask_ptr[data_mask_hw_ptr];\n        scalar_t val = static_cast<scalar_t>(0);\n        const scalar_t h_im = h_in + i * dilation_h + offset_h;\n        const scalar_t w_im = w_in + j * dilation_w + offset_w;\n        //if (h_im >= 0 && w_im >= 0 && h_im < height && w_im < width) {\n        if (h_im > -1 && w_im > -1 && h_im < height && w_im < width)\n        {\n          //const float map_h = i * dilation_h + offset_h;\n          //const float map_w = j * dilation_w + offset_w;\n          
//const int cur_height = height - h_in;\n          //const int cur_width = width - w_in;\n          //val = dmcn_im2col_bilinear(data_im_ptr, width, cur_height, cur_width, map_h, map_w);\n          val = dmcn_im2col_bilinear(data_im_ptr, width, height, width, h_im, w_im);\n        }\n        *data_col_ptr = val * mask;\n        data_col_ptr += batch_size * height_col * width_col;\n        //data_col_ptr += height_col * width_col;\n      }\n    }\n  }\n}\n\ntemplate <typename scalar_t>\n__global__ void modulated_deformable_col2im_gpu_kernel(const int n,\n                                                       const scalar_t *data_col, const scalar_t *data_offset, const scalar_t *data_mask,\n                                                       const int channels, const int height, const int width,\n                                                       const int kernel_h, const int kernel_w,\n                                                       const int pad_h, const int pad_w,\n                                                       const int stride_h, const int stride_w,\n                                                       const int dilation_h, const int dilation_w,\n                                                       const int channel_per_deformable_group,\n                                                       const int batch_size, const int deformable_group,\n                                                       const int height_col, const int width_col,\n                                                       scalar_t *grad_im)\n{\n  CUDA_KERNEL_LOOP(index, n)\n  {\n    const int j = (index / width_col / height_col / batch_size) % kernel_w;\n    const int i = (index / width_col / height_col / batch_size / kernel_w) % kernel_h;\n    const int c = index / width_col / height_col / batch_size / kernel_w / kernel_h;\n    // compute the start and end of the output\n\n    const int deformable_group_index = c / channel_per_deformable_group;\n\n    int w_out = 
index % width_col;\n    int h_out = (index / width_col) % height_col;\n    int b = (index / width_col / height_col) % batch_size;\n    int w_in = w_out * stride_w - pad_w;\n    int h_in = h_out * stride_h - pad_h;\n\n    const scalar_t *data_offset_ptr = data_offset + (b * deformable_group + deformable_group_index) * 2 * kernel_h * kernel_w * height_col * width_col;\n    const scalar_t *data_mask_ptr = data_mask + (b * deformable_group + deformable_group_index) * kernel_h * kernel_w * height_col * width_col;\n    const int data_offset_h_ptr = ((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out;\n    const int data_offset_w_ptr = ((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out;\n    const int data_mask_hw_ptr = ((i * kernel_w + j) * height_col + h_out) * width_col + w_out;\n    const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];\n    const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];\n    const scalar_t mask = data_mask_ptr[data_mask_hw_ptr];\n    const scalar_t cur_inv_h_data = h_in + i * dilation_h + offset_h;\n    const scalar_t cur_inv_w_data = w_in + j * dilation_w + offset_w;\n\n    const scalar_t cur_top_grad = data_col[index] * mask;\n    const int cur_h = (int)cur_inv_h_data;\n    const int cur_w = (int)cur_inv_w_data;\n    for (int dy = -2; dy <= 2; dy++)\n    {\n      for (int dx = -2; dx <= 2; dx++)\n      {\n        if (cur_h + dy >= 0 && cur_h + dy < height &&\n            cur_w + dx >= 0 && cur_w + dx < width &&\n            abs(cur_inv_h_data - (cur_h + dy)) < 1 &&\n            abs(cur_inv_w_data - (cur_w + dx)) < 1)\n        {\n          int cur_bottom_grad_pos = ((b * channels + c) * height + cur_h + dy) * width + cur_w + dx;\n          scalar_t weight = dmcn_get_gradient_weight(cur_inv_h_data, cur_inv_w_data, cur_h + dy, cur_w + dx, height, width);\n          atomicAdd(grad_im + cur_bottom_grad_pos, weight * cur_top_grad);\n        }\n      }\n    }\n  }\n}\n\ntemplate <typename 
scalar_t>\n__global__ void modulated_deformable_col2im_coord_gpu_kernel(const int n,\n                                                             const scalar_t *data_col, const scalar_t *data_im,\n                                                             const scalar_t *data_offset, const scalar_t *data_mask,\n                                                             const int channels, const int height, const int width,\n                                                             const int kernel_h, const int kernel_w,\n                                                             const int pad_h, const int pad_w,\n                                                             const int stride_h, const int stride_w,\n                                                             const int dilation_h, const int dilation_w,\n                                                             const int channel_per_deformable_group,\n                                                             const int batch_size, const int offset_channels, const int deformable_group,\n                                                             const int height_col, const int width_col,\n                                                             scalar_t *grad_offset, scalar_t *grad_mask)\n{\n  CUDA_KERNEL_LOOP(index, n)\n  {\n    scalar_t val = 0, mval = 0;\n    int w = index % width_col;\n    int h = (index / width_col) % height_col;\n    int c = (index / width_col / height_col) % offset_channels;\n    int b = (index / width_col / height_col) / offset_channels;\n    // compute the start and end of the output\n\n    const int deformable_group_index = c / (2 * kernel_h * kernel_w);\n    const int col_step = kernel_h * kernel_w;\n    int cnt = 0;\n    const scalar_t *data_col_ptr = data_col + deformable_group_index * channel_per_deformable_group * batch_size * width_col * height_col;\n    const scalar_t *data_im_ptr = data_im + (b * deformable_group + deformable_group_index) * 
channel_per_deformable_group / kernel_h / kernel_w * height * width;\n    const scalar_t *data_offset_ptr = data_offset + (b * deformable_group + deformable_group_index) * 2 * kernel_h * kernel_w * height_col * width_col;\n    const scalar_t *data_mask_ptr = data_mask + (b * deformable_group + deformable_group_index) * kernel_h * kernel_w * height_col * width_col;\n\n    const int offset_c = c - deformable_group_index * 2 * kernel_h * kernel_w;\n\n    for (int col_c = (offset_c / 2); col_c < channel_per_deformable_group; col_c += col_step)\n    {\n      const int col_pos = (((col_c * batch_size + b) * height_col) + h) * width_col + w;\n      const int bp_dir = offset_c % 2;\n\n      int j = (col_pos / width_col / height_col / batch_size) % kernel_w;\n      int i = (col_pos / width_col / height_col / batch_size / kernel_w) % kernel_h;\n      int w_out = col_pos % width_col;\n      int h_out = (col_pos / width_col) % height_col;\n      int w_in = w_out * stride_w - pad_w;\n      int h_in = h_out * stride_h - pad_h;\n      const int data_offset_h_ptr = (((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out);\n      const int data_offset_w_ptr = (((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out);\n      const int data_mask_hw_ptr = (((i * kernel_w + j) * height_col + h_out) * width_col + w_out);\n      const scalar_t offset_h = data_offset_ptr[data_offset_h_ptr];\n      const scalar_t offset_w = data_offset_ptr[data_offset_w_ptr];\n      const scalar_t mask = data_mask_ptr[data_mask_hw_ptr];\n      scalar_t inv_h = h_in + i * dilation_h + offset_h;\n      scalar_t inv_w = w_in + j * dilation_w + offset_w;\n      if (inv_h <= -1 || inv_w <= -1 || inv_h >= height || inv_w >= width)\n      {\n        inv_h = inv_w = -2;\n      }\n      else\n      {\n        mval += data_col_ptr[col_pos] * dmcn_im2col_bilinear(data_im_ptr + cnt * height * width, width, height, width, inv_h, inv_w);\n      }\n      const scalar_t weight = 
dmcn_get_coordinate_weight(\n          inv_h, inv_w,\n          height, width, data_im_ptr + cnt * height * width, width, bp_dir);\n      val += weight * data_col_ptr[col_pos] * mask;\n      cnt += 1;\n    }\n    // KERNEL_ASSIGN(grad_offset[index], offset_req, val);\n    grad_offset[index] = val;\n    if (offset_c % 2 == 0)\n      // KERNEL_ASSIGN(grad_mask[(((b * deformable_group + deformable_group_index) * kernel_h * kernel_w + offset_c / 2) * height_col + h) * width_col + w], mask_req, mval);\n      grad_mask[(((b * deformable_group + deformable_group_index) * kernel_h * kernel_w + offset_c / 2) * height_col + h) * width_col + w] = mval;\n  }\n}\n\nvoid modulated_deformable_im2col_cuda(\n    const at::Tensor data_im, const at::Tensor data_offset, const at::Tensor data_mask,\n    const int batch_size, const int channels, const int height_im, const int width_im,\n    const int height_col, const int width_col, const int kernel_h, const int kernel_w,\n    const int pad_h, const int pad_w, const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w,\n    const int deformable_group, at::Tensor data_col)\n{\n  // num_axes should be smaller than block size\n  const int channel_per_deformable_group = channels / deformable_group;\n  const int num_kernels = channels * batch_size * height_col * width_col;\n\n  AT_DISPATCH_FLOATING_TYPES_AND_HALF(\n      data_im.scalar_type(), \"modulated_deformable_im2col_gpu\", ([&] {\n        const scalar_t *data_im_ = data_im.data<scalar_t>();\n        const scalar_t *data_offset_ = data_offset.data<scalar_t>();\n        const scalar_t *data_mask_ = data_mask.data<scalar_t>();\n        scalar_t *data_col_ = data_col.data<scalar_t>();\n\n        modulated_deformable_im2col_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream()>>>(\n            num_kernels, data_im_, data_offset_, data_mask_, height_im, width_im, kernel_h, kernel_w,\n            pad_h, pad_w, stride_h, 
stride_w, dilation_h, dilation_w, channel_per_deformable_group,\n            batch_size, channels, deformable_group, height_col, width_col, data_col_);\n      }));\n\n  cudaError_t err = cudaGetLastError();\n  if (err != cudaSuccess)\n  {\n    printf(\"error in modulated_deformable_im2col_cuda: %s\\n\", cudaGetErrorString(err));\n  }\n}\n\nvoid modulated_deformable_col2im_cuda(\n    const at::Tensor data_col, const at::Tensor data_offset, const at::Tensor data_mask,\n    const int batch_size, const int channels, const int height_im, const int width_im,\n    const int height_col, const int width_col, const int kernel_h, const int kernel_w,\n    const int pad_h, const int pad_w, const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w,\n    const int deformable_group, at::Tensor grad_im)\n{\n\n  const int channel_per_deformable_group = channels / deformable_group;\n  const int num_kernels = channels * kernel_h * kernel_w * batch_size * height_col * width_col;\n\n  AT_DISPATCH_FLOATING_TYPES_AND_HALF(\n      data_col.scalar_type(), \"modulated_deformable_col2im_gpu\", ([&] {\n        const scalar_t *data_col_ = data_col.data<scalar_t>();\n        const scalar_t *data_offset_ = data_offset.data<scalar_t>();\n        const scalar_t *data_mask_ = data_mask.data<scalar_t>();\n        scalar_t *grad_im_ = grad_im.data<scalar_t>();\n\n        modulated_deformable_col2im_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream()>>>(\n            num_kernels, data_col_, data_offset_, data_mask_, channels, height_im, width_im,\n            kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w,\n            dilation_h, dilation_w, channel_per_deformable_group,\n            batch_size, deformable_group, height_col, width_col, grad_im_);\n      }));\n\n  cudaError_t err = cudaGetLastError();\n  if (err != cudaSuccess)\n  {\n    printf(\"error in modulated_deformable_col2im_cuda: %s\\n\", cudaGetErrorString(err));\n  
}\n}\n\nvoid modulated_deformable_col2im_coord_cuda(\n    const at::Tensor data_col, const at::Tensor data_im, const at::Tensor data_offset, const at::Tensor data_mask,\n    const int batch_size, const int channels, const int height_im, const int width_im,\n    const int height_col, const int width_col, const int kernel_h, const int kernel_w,\n    const int pad_h, const int pad_w, const int stride_h, const int stride_w,\n    const int dilation_h, const int dilation_w,\n    const int deformable_group,\n    at::Tensor grad_offset, at::Tensor grad_mask)\n{\n  const int num_kernels = batch_size * height_col * width_col * 2 * kernel_h * kernel_w * deformable_group;\n  const int channel_per_deformable_group = channels * kernel_h * kernel_w / deformable_group;\n\n  AT_DISPATCH_FLOATING_TYPES_AND_HALF(\n      data_col.scalar_type(), \"modulated_deformable_col2im_coord_gpu\", ([&] {\n        const scalar_t *data_col_ = data_col.data<scalar_t>();\n        const scalar_t *data_im_ = data_im.data<scalar_t>();\n        const scalar_t *data_offset_ = data_offset.data<scalar_t>();\n        const scalar_t *data_mask_ = data_mask.data<scalar_t>();\n        scalar_t *grad_offset_ = grad_offset.data<scalar_t>();\n        scalar_t *grad_mask_ = grad_mask.data<scalar_t>();\n\n        modulated_deformable_col2im_coord_gpu_kernel<<<GET_BLOCKS(num_kernels), CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream()>>>(\n            num_kernels, data_col_, data_im_, data_offset_, data_mask_, channels, height_im, width_im,\n            kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w,\n            dilation_h, dilation_w, channel_per_deformable_group,\n            batch_size, 2 * kernel_h * kernel_w * deformable_group, deformable_group, height_col, width_col,\n            grad_offset_, grad_mask_);\n      }));\n  cudaError_t err = cudaGetLastError();\n  if (err != cudaSuccess)\n  {\n    printf(\"error in modulated_deformable_col2im_coord_cuda: %s\\n\", cudaGetErrorString(err));\n  }\n}\n"
  },
  {
    "path": "pcdet/ops/dcn/src/deform_pool_cuda.cpp",
    "content": "// modify from\n// https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/modulated_dcn_cuda.c\n\n// based on\n// author: Charles Shang\n// https://github.com/torch/cunn/blob/master/lib/THCUNN/generic/SpatialConvolutionMM.cu\n\n#include <torch/extension.h>\n#include <ATen/DeviceGuard.h>\n\n#include <cmath>\n#include <vector>\n\nvoid DeformablePSROIPoolForward(\n    const at::Tensor data, const at::Tensor bbox, const at::Tensor trans,\n    at::Tensor out, at::Tensor top_count, const int batch, const int channels,\n    const int height, const int width, const int num_bbox,\n    const int channels_trans, const int no_trans, const float spatial_scale,\n    const int output_dim, const int group_size, const int pooled_size,\n    const int part_size, const int sample_per_part, const float trans_std);\n\nvoid DeformablePSROIPoolBackwardAcc(\n    const at::Tensor out_grad, const at::Tensor data, const at::Tensor bbox,\n    const at::Tensor trans, const at::Tensor top_count, at::Tensor in_grad,\n    at::Tensor trans_grad, const int batch, const int channels,\n    const int height, const int width, const int num_bbox,\n    const int channels_trans, const int no_trans, const float spatial_scale,\n    const int output_dim, const int group_size, const int pooled_size,\n    const int part_size, const int sample_per_part, const float trans_std);\n\nvoid deform_psroi_pooling_cuda_forward(\n    at::Tensor input, at::Tensor bbox, at::Tensor trans, at::Tensor out,\n    at::Tensor top_count, const int no_trans, const float spatial_scale,\n    const int output_dim, const int group_size, const int pooled_size,\n    const int part_size, const int sample_per_part, const float trans_std) {\n  TORCH_CHECK(input.is_contiguous(), \"input tensor has to be contiguous\");\n  at::DeviceGuard guard(input.device());\n\n  const int batch = input.size(0);\n  const int channels = input.size(1);\n  const int height = input.size(2);\n  const int 
width = input.size(3);\n  const int channels_trans = no_trans ? 2 : trans.size(1);\n\n  const int num_bbox = bbox.size(0);\n  if (num_bbox != out.size(0))\n    AT_ERROR(\"Output shape and bbox number won't match: (\",\n             out.size(0), \" vs \", num_bbox, \").\");\n\n  DeformablePSROIPoolForward(\n      input, bbox, trans, out, top_count, batch, channels, height, width,\n      num_bbox, channels_trans, no_trans, spatial_scale, output_dim, group_size,\n      pooled_size, part_size, sample_per_part, trans_std);\n}\n\nvoid deform_psroi_pooling_cuda_backward(\n    at::Tensor out_grad, at::Tensor input, at::Tensor bbox, at::Tensor trans,\n    at::Tensor top_count, at::Tensor input_grad, at::Tensor trans_grad,\n    const int no_trans, const float spatial_scale, const int output_dim,\n    const int group_size, const int pooled_size, const int part_size,\n    const int sample_per_part, const float trans_std) {\n  TORCH_CHECK(out_grad.is_contiguous(), \"out_grad tensor has to be contiguous\");\n  TORCH_CHECK(input.is_contiguous(), \"input tensor has to be contiguous\");\n  at::DeviceGuard guard(input.device());\n\n  const int batch = input.size(0);\n  const int channels = input.size(1);\n  const int height = input.size(2);\n  const int width = input.size(3);\n  const int channels_trans = no_trans ? 
2 : trans.size(1);\n\n  const int num_bbox = bbox.size(0);\n  if (num_bbox != out_grad.size(0))\n    AT_ERROR(\"Output shape and bbox number won't match: (\",\n             out_grad.size(0), \" vs \", num_bbox, \").\");\n\n  DeformablePSROIPoolBackwardAcc(\n      out_grad, input, bbox, trans, top_count, input_grad, trans_grad, batch,\n      channels, height, width, num_bbox, channels_trans, no_trans,\n      spatial_scale, output_dim, group_size, pooled_size, part_size,\n      sample_per_part, trans_std);\n}\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n  m.def(\"deform_psroi_pooling_cuda_forward\", &deform_psroi_pooling_cuda_forward,\n        \"deform psroi pooling forward(CUDA)\");\n  m.def(\"deform_psroi_pooling_cuda_backward\",\n        &deform_psroi_pooling_cuda_backward,\n        \"deform psroi pooling backward(CUDA)\");\n}\n"
  },
  {
    "path": "pcdet/ops/dcn/src/deform_pool_cuda_kernel.cu",
    "content": "/*!\n * Copyright (c) 2017 Microsoft\n * Licensed under The MIT License [see LICENSE for details]\n * \\file deformable_psroi_pooling.cu\n * \\brief\n * \\author Yi Li, Guodong Zhang, Jifeng Dai\n*/\n/***************** Adapted by Charles Shang *********************/\n// modify from https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/cuda/deform_psroi_pooling_cuda.cu\n\n#include <ATen/ATen.h>\n#include <THC/THCAtomics.cuh>\n#include <stdio.h>\n#include <math.h>\n#include <algorithm>\n\nusing namespace at;\n\n#define CUDA_KERNEL_LOOP(i, n)                        \\\n  for (int i = blockIdx.x * blockDim.x + threadIdx.x; \\\n       i < (n);                                       \\\n       i += blockDim.x * gridDim.x)\n\nconst int CUDA_NUM_THREADS = 1024;\ninline int GET_BLOCKS(const int N)\n{\n  return (N + CUDA_NUM_THREADS - 1) / CUDA_NUM_THREADS;\n}\n\ntemplate <typename scalar_t>\n__device__ scalar_t bilinear_interp(\n    const scalar_t *data,\n    const scalar_t x,\n    const scalar_t y,\n    const int width,\n    const int height)\n{\n  int x1 = floor(x);\n  int x2 = ceil(x);\n  int y1 = floor(y);\n  int y2 = ceil(y);\n  scalar_t dist_x = (scalar_t)(x - x1);\n  scalar_t dist_y = (scalar_t)(y - y1);\n  scalar_t value11 = data[y1 * width + x1];\n  scalar_t value12 = data[y2 * width + x1];\n  scalar_t value21 = data[y1 * width + x2];\n  scalar_t value22 = data[y2 * width + x2];\n  scalar_t value = (1 - dist_x) * (1 - dist_y) * value11 + (1 - dist_x) * dist_y * value12 + dist_x * (1 - dist_y) * value21 + dist_x * dist_y * value22;\n  return value;\n}\n\ntemplate <typename scalar_t>\n__global__ void DeformablePSROIPoolForwardKernel(\n    const int count,\n    const scalar_t *bottom_data,\n    const scalar_t spatial_scale,\n    const int channels,\n    const int height, const int width,\n    const int pooled_height, const int pooled_width,\n    const scalar_t *bottom_rois, const scalar_t *bottom_trans,\n    
const int no_trans,\n    const scalar_t trans_std,\n    const int sample_per_part,\n    const int output_dim,\n    const int group_size,\n    const int part_size,\n    const int num_classes,\n    const int channels_each_class,\n    scalar_t *top_data,\n    scalar_t *top_count)\n{\n  CUDA_KERNEL_LOOP(index, count)\n  {\n    // The output is in order (n, ctop, ph, pw)\n    int pw = index % pooled_width;\n    int ph = (index / pooled_width) % pooled_height;\n    int ctop = (index / pooled_width / pooled_height) % output_dim;\n    int n = index / pooled_width / pooled_height / output_dim;\n\n    // [start, end) interval for spatial sampling\n    const scalar_t *offset_bottom_rois = bottom_rois + n * 5;\n    int roi_batch_ind = offset_bottom_rois[0];\n    scalar_t roi_start_w = (scalar_t)(round(offset_bottom_rois[1])) * spatial_scale - 0.5;\n    scalar_t roi_start_h = (scalar_t)(round(offset_bottom_rois[2])) * spatial_scale - 0.5;\n    scalar_t roi_end_w = (scalar_t)(round(offset_bottom_rois[3]) + 1.) * spatial_scale - 0.5;\n    scalar_t roi_end_h = (scalar_t)(round(offset_bottom_rois[4]) + 1.) * spatial_scale - 0.5;\n\n    // Force too small ROIs to be 1x1\n    scalar_t roi_width = max(roi_end_w - roi_start_w, 0.1); //avoid 0\n    scalar_t roi_height = max(roi_end_h - roi_start_h, 0.1);\n\n    // Compute w and h at bottom\n    scalar_t bin_size_h = roi_height / (scalar_t)(pooled_height);\n    scalar_t bin_size_w = roi_width / (scalar_t)(pooled_width);\n\n    scalar_t sub_bin_size_h = bin_size_h / (scalar_t)(sample_per_part);\n    scalar_t sub_bin_size_w = bin_size_w / (scalar_t)(sample_per_part);\n\n    int part_h = floor((scalar_t)(ph) / pooled_height * part_size);\n    int part_w = floor((scalar_t)(pw) / pooled_width * part_size);\n    int class_id = ctop / channels_each_class;\n    scalar_t trans_x = no_trans ? 
(scalar_t)(0) : bottom_trans[(((n * num_classes + class_id) * 2) * part_size + part_h) * part_size + part_w] * (scalar_t)trans_std;\n    scalar_t trans_y = no_trans ? (scalar_t)(0) : bottom_trans[(((n * num_classes + class_id) * 2 + 1) * part_size + part_h) * part_size + part_w] * (scalar_t)trans_std;\n\n    scalar_t wstart = (scalar_t)(pw)*bin_size_w + roi_start_w;\n    wstart += trans_x * roi_width;\n    scalar_t hstart = (scalar_t)(ph)*bin_size_h + roi_start_h;\n    hstart += trans_y * roi_height;\n\n    scalar_t sum = 0;\n    int count = 0;\n    int gw = floor((scalar_t)(pw)*group_size / pooled_width);\n    int gh = floor((scalar_t)(ph)*group_size / pooled_height);\n    gw = min(max(gw, 0), group_size - 1);\n    gh = min(max(gh, 0), group_size - 1);\n\n    const scalar_t *offset_bottom_data = bottom_data + (roi_batch_ind * channels) * height * width;\n    for (int ih = 0; ih < sample_per_part; ih++)\n    {\n      for (int iw = 0; iw < sample_per_part; iw++)\n      {\n        scalar_t w = wstart + iw * sub_bin_size_w;\n        scalar_t h = hstart + ih * sub_bin_size_h;\n        // bilinear interpolation\n        if (w < -0.5 || w > width - 0.5 || h < -0.5 || h > height - 0.5)\n        {\n          continue;\n        }\n        w = min(max(w, 0.), width - 1.);\n        h = min(max(h, 0.), height - 1.);\n        int c = (ctop * group_size + gh) * group_size + gw;\n        scalar_t val = bilinear_interp(offset_bottom_data + c * height * width, w, h, width, height);\n        sum += val;\n        count++;\n      }\n    }\n    top_data[index] = count == 0 ? 
(scalar_t)(0) : sum / count;\n    top_count[index] = count;\n  }\n}\n\ntemplate <typename scalar_t>\n__global__ void DeformablePSROIPoolBackwardAccKernel(\n    const int count,\n    const scalar_t *top_diff,\n    const scalar_t *top_count,\n    const int num_rois,\n    const scalar_t spatial_scale,\n    const int channels,\n    const int height, const int width,\n    const int pooled_height, const int pooled_width,\n    const int output_dim,\n    scalar_t *bottom_data_diff, scalar_t *bottom_trans_diff,\n    const scalar_t *bottom_data,\n    const scalar_t *bottom_rois,\n    const scalar_t *bottom_trans,\n    const int no_trans,\n    const scalar_t trans_std,\n    const int sample_per_part,\n    const int group_size,\n    const int part_size,\n    const int num_classes,\n    const int channels_each_class)\n{\n  CUDA_KERNEL_LOOP(index, count)\n  {\n    // The output is in order (n, ctop, ph, pw)\n    int pw = index % pooled_width;\n    int ph = (index / pooled_width) % pooled_height;\n    int ctop = (index / pooled_width / pooled_height) % output_dim;\n    int n = index / pooled_width / pooled_height / output_dim;\n\n    // [start, end) interval for spatial sampling\n    const scalar_t *offset_bottom_rois = bottom_rois + n * 5;\n    int roi_batch_ind = offset_bottom_rois[0];\n    scalar_t roi_start_w = (scalar_t)(round(offset_bottom_rois[1])) * spatial_scale - 0.5;\n    scalar_t roi_start_h = (scalar_t)(round(offset_bottom_rois[2])) * spatial_scale - 0.5;\n    scalar_t roi_end_w = (scalar_t)(round(offset_bottom_rois[3]) + 1.) * spatial_scale - 0.5;\n    scalar_t roi_end_h = (scalar_t)(round(offset_bottom_rois[4]) + 1.) 
* spatial_scale - 0.5;\n\n    // Force too small ROIs to be 1x1\n    scalar_t roi_width = max(roi_end_w - roi_start_w, 0.1); //avoid 0\n    scalar_t roi_height = max(roi_end_h - roi_start_h, 0.1);\n\n    // Compute w and h at bottom\n    scalar_t bin_size_h = roi_height / (scalar_t)(pooled_height);\n    scalar_t bin_size_w = roi_width / (scalar_t)(pooled_width);\n\n    scalar_t sub_bin_size_h = bin_size_h / (scalar_t)(sample_per_part);\n    scalar_t sub_bin_size_w = bin_size_w / (scalar_t)(sample_per_part);\n\n    int part_h = floor((scalar_t)(ph) / pooled_height * part_size);\n    int part_w = floor((scalar_t)(pw) / pooled_width * part_size);\n    int class_id = ctop / channels_each_class;\n    scalar_t trans_x = no_trans ? (scalar_t)(0) : bottom_trans[(((n * num_classes + class_id) * 2) * part_size + part_h) * part_size + part_w] * (scalar_t)trans_std;\n    scalar_t trans_y = no_trans ? (scalar_t)(0) : bottom_trans[(((n * num_classes + class_id) * 2 + 1) * part_size + part_h) * part_size + part_w] * (scalar_t)trans_std;\n\n    scalar_t wstart = (scalar_t)(pw)*bin_size_w + roi_start_w;\n    wstart += trans_x * roi_width;\n    scalar_t hstart = (scalar_t)(ph)*bin_size_h + roi_start_h;\n    hstart += trans_y * roi_height;\n\n    if (top_count[index] <= 0)\n    {\n      continue;\n    }\n    scalar_t diff_val = top_diff[index] / top_count[index];\n    const scalar_t *offset_bottom_data = bottom_data + roi_batch_ind * channels * height * width;\n    scalar_t *offset_bottom_data_diff = bottom_data_diff + roi_batch_ind * channels * height * width;\n    int gw = floor((scalar_t)(pw)*group_size / pooled_width);\n    int gh = floor((scalar_t)(ph)*group_size / pooled_height);\n    gw = min(max(gw, 0), group_size - 1);\n    gh = min(max(gh, 0), group_size - 1);\n\n    for (int ih = 0; ih < sample_per_part; ih++)\n    {\n      for (int iw = 0; iw < sample_per_part; iw++)\n      {\n        scalar_t w = wstart + iw * sub_bin_size_w;\n        scalar_t h = hstart + ih * 
sub_bin_size_h;\n        // bilinear interpolation\n        if (w < -0.5 || w > width - 0.5 || h < -0.5 || h > height - 0.5)\n        {\n          continue;\n        }\n        w = min(max(w, 0.), width - 1.);\n        h = min(max(h, 0.), height - 1.);\n        int c = (ctop * group_size + gh) * group_size + gw;\n        // backward on feature\n        int x0 = floor(w);\n        int x1 = ceil(w);\n        int y0 = floor(h);\n        int y1 = ceil(h);\n        scalar_t dist_x = w - x0, dist_y = h - y0;\n        scalar_t q00 = (1 - dist_x) * (1 - dist_y);\n        scalar_t q01 = (1 - dist_x) * dist_y;\n        scalar_t q10 = dist_x * (1 - dist_y);\n        scalar_t q11 = dist_x * dist_y;\n        int bottom_index_base = c * height * width;\n        atomicAdd(offset_bottom_data_diff + bottom_index_base + y0 * width + x0, q00 * diff_val);\n        atomicAdd(offset_bottom_data_diff + bottom_index_base + y1 * width + x0, q01 * diff_val);\n        atomicAdd(offset_bottom_data_diff + bottom_index_base + y0 * width + x1, q10 * diff_val);\n        atomicAdd(offset_bottom_data_diff + bottom_index_base + y1 * width + x1, q11 * diff_val);\n\n        if (no_trans)\n        {\n          continue;\n        }\n        scalar_t U00 = offset_bottom_data[bottom_index_base + y0 * width + x0];\n        scalar_t U01 = offset_bottom_data[bottom_index_base + y1 * width + x0];\n        scalar_t U10 = offset_bottom_data[bottom_index_base + y0 * width + x1];\n        scalar_t U11 = offset_bottom_data[bottom_index_base + y1 * width + x1];\n        scalar_t diff_x = (U11 * dist_y + U10 * (1 - dist_y) - U01 * dist_y - U00 * (1 - dist_y)) * trans_std * diff_val;\n        diff_x *= roi_width;\n        scalar_t diff_y = (U11 * dist_x + U01 * (1 - dist_x) - U10 * dist_x - U00 * (1 - dist_x)) * trans_std * diff_val;\n        diff_y *= roi_height;\n\n        atomicAdd(bottom_trans_diff + (((n * num_classes + class_id) * 2) * part_size + part_h) * part_size + part_w, diff_x);\n        
atomicAdd(bottom_trans_diff + (((n * num_classes + class_id) * 2 + 1) * part_size + part_h) * part_size + part_w, diff_y);\n      }\n    }\n  }\n}\n\nvoid DeformablePSROIPoolForward(const at::Tensor data,\n                                const at::Tensor bbox,\n                                const at::Tensor trans,\n                                at::Tensor out,\n                                at::Tensor top_count,\n                                const int batch,\n                                const int channels,\n                                const int height,\n                                const int width,\n                                const int num_bbox,\n                                const int channels_trans,\n                                const int no_trans,\n                                const float spatial_scale,\n                                const int output_dim,\n                                const int group_size,\n                                const int pooled_size,\n                                const int part_size,\n                                const int sample_per_part,\n                                const float trans_std)\n{\n  const int pooled_height = pooled_size;\n  const int pooled_width = pooled_size;\n  const int count = num_bbox * output_dim * pooled_height * pooled_width;\n  const int num_classes = no_trans ? 1 : channels_trans / 2;\n  const int channels_each_class = no_trans ? output_dim : output_dim / num_classes;\n\n  AT_DISPATCH_FLOATING_TYPES_AND_HALF(\n      data.scalar_type(), \"deformable_psroi_pool_forward\", ([&] {\n        const scalar_t *bottom_data = data.data<scalar_t>();\n        const scalar_t *bottom_rois = bbox.data<scalar_t>();\n        const scalar_t *bottom_trans = no_trans ? 
NULL : trans.data<scalar_t>();\n        scalar_t *top_data = out.data<scalar_t>();\n        scalar_t *top_count_data = top_count.data<scalar_t>();\n\n        DeformablePSROIPoolForwardKernel<<<GET_BLOCKS(count), CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream()>>>(\n            count, bottom_data, (scalar_t)spatial_scale, channels, height, width, pooled_height, pooled_width,\n            bottom_rois, bottom_trans, no_trans, (scalar_t)trans_std, sample_per_part, output_dim,\n            group_size, part_size, num_classes, channels_each_class, top_data, top_count_data);\n      }));\n\n  cudaError_t err = cudaGetLastError();\n  if (err != cudaSuccess)\n  {\n    printf(\"error in DeformablePSROIPoolForward: %s\\n\", cudaGetErrorString(err));\n  }\n}\n\nvoid DeformablePSROIPoolBackwardAcc(const at::Tensor out_grad,\n                                    const at::Tensor data,\n                                    const at::Tensor bbox,\n                                    const at::Tensor trans,\n                                    const at::Tensor top_count,\n                                    at::Tensor in_grad,\n                                    at::Tensor trans_grad,\n                                    const int batch,\n                                    const int channels,\n                                    const int height,\n                                    const int width,\n                                    const int num_bbox,\n                                    const int channels_trans,\n                                    const int no_trans,\n                                    const float spatial_scale,\n                                    const int output_dim,\n                                    const int group_size,\n                                    const int pooled_size,\n                                    const int part_size,\n                                    const int sample_per_part,\n                                    const float 
trans_std)\n{\n  // LOG(INFO) << \"DeformablePSROIPoolBackward\";\n  const int num_rois = num_bbox;\n  const int pooled_height = pooled_size;\n  const int pooled_width = pooled_size;\n  const int count = num_bbox * output_dim * pooled_height * pooled_width;\n  const int num_classes = no_trans ? 1 : channels_trans / 2;\n  const int channels_each_class = no_trans ? output_dim : output_dim / num_classes;\n\n  AT_DISPATCH_FLOATING_TYPES_AND_HALF(\n      out_grad.scalar_type(), \"deformable_psroi_pool_backward_acc\", ([&] {\n        const scalar_t *top_diff = out_grad.data<scalar_t>();\n        const scalar_t *bottom_data = data.data<scalar_t>();\n        const scalar_t *bottom_rois = bbox.data<scalar_t>();\n        const scalar_t *bottom_trans = no_trans ? NULL : trans.data<scalar_t>();\n        scalar_t *bottom_data_diff = in_grad.data<scalar_t>();\n        scalar_t *bottom_trans_diff = no_trans ? NULL : trans_grad.data<scalar_t>();\n        const scalar_t *top_count_data = top_count.data<scalar_t>();\n\n        DeformablePSROIPoolBackwardAccKernel<<<GET_BLOCKS(count), CUDA_NUM_THREADS, 0, at::cuda::getCurrentCUDAStream()>>>(\n            count, top_diff, top_count_data, num_rois, (scalar_t)spatial_scale, channels, height, width,\n            pooled_height, pooled_width, output_dim, bottom_data_diff, bottom_trans_diff,\n            bottom_data, bottom_rois, bottom_trans, no_trans, (scalar_t)trans_std, sample_per_part,\n            group_size, part_size, num_classes, channels_each_class);\n      }));\n\n  cudaError_t err = cudaGetLastError();\n  if (err != cudaSuccess)\n  {\n    printf(\"error in DeformablePSROIPoolBackwardAcc: %s\\n\", cudaGetErrorString(err));\n  }\n}\n"
  },
  {
    "path": "pcdet/ops/iou3d_nms/iou3d_nms_utils.py",
    "content": "\"\"\"\n3D IoU Calculation and Rotated NMS\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n\"\"\"\nimport torch\n\nfrom ...utils import common_utils\nfrom . import iou3d_nms_cuda\n\n\ndef boxes_bev_iou_cpu(boxes_a, boxes_b):\n    \"\"\"\n    Args:\n        boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n        boxes_b: (N, 7) [x, y, z, dx, dy, dz, heading]\n\n    Returns:\n\n    \"\"\"\n    boxes_a, is_numpy = common_utils.check_numpy_to_torch(boxes_a)\n    boxes_b, is_numpy = common_utils.check_numpy_to_torch(boxes_b)\n    assert not (boxes_a.is_cuda or boxes_b.is_cuda), 'Only support CPU tensors'\n    assert boxes_a.shape[1] == 7 and boxes_b.shape[1] == 7\n    ans_iou = boxes_a.new_zeros(torch.Size((boxes_a.shape[0], boxes_b.shape[0])))\n    iou3d_nms_cuda.boxes_iou_bev_cpu(boxes_a.contiguous(), boxes_b.contiguous(), ans_iou)\n\n    return ans_iou.numpy() if is_numpy else ans_iou\n\n\ndef boxes_iou_bev(boxes_a, boxes_b):\n    \"\"\"\n    Args:\n        boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n        boxes_b: (N, 7) [x, y, z, dx, dy, dz, heading]\n\n    Returns:\n        ans_iou: (N, M)\n    \"\"\"\n    assert boxes_a.shape[1] == boxes_b.shape[1] == 7\n    ans_iou = torch.cuda.FloatTensor(torch.Size((boxes_a.shape[0], boxes_b.shape[0]))).zero_()\n\n    iou3d_nms_cuda.boxes_iou_bev_gpu(boxes_a.contiguous(), boxes_b.contiguous(), ans_iou)\n\n    return ans_iou\n\ndef boxes_dis(boxes_a, boxes_b):\n    \"\"\"\n    Args:\n        boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n        boxes_b: (N, 7) [x, y, z, dx, dy, dz, heading]\n\n    Returns:\n        dis: (N, M)\n    \"\"\"\n    n,k = boxes_a.shape\n    m,k2 = boxes_b.shape\n\n    new_boxes_a = boxes_a.unsqueeze(1).expand(n, m, k)\n    new_boxes_b = boxes_b.unsqueeze(0).expand(n, m, k2)\n\n    dis = (new_boxes_a[..., 0:2] - new_boxes_b[..., 0:2])**2\n    dis = torch.sqrt(torch.sum(dis, dim=-1))\n    return dis\n\n\ndef boxes_iou3d_gpu(boxes_a, boxes_b):\n    \"\"\"\n    Args:\n  
      boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n        boxes_b: (M, 7) [x, y, z, dx, dy, dz, heading]\n\n    Returns:\n        ans_iou: (N, M)\n    \"\"\"\n    assert boxes_a.shape[1] == boxes_b.shape[1] == 7\n\n    # height overlap\n    boxes_a_height_max = (boxes_a[:, 2] + boxes_a[:, 5] / 2).view(-1, 1)\n    boxes_a_height_min = (boxes_a[:, 2] - boxes_a[:, 5] / 2).view(-1, 1)\n    boxes_b_height_max = (boxes_b[:, 2] + boxes_b[:, 5] / 2).view(1, -1)\n    boxes_b_height_min = (boxes_b[:, 2] - boxes_b[:, 5] / 2).view(1, -1)\n\n    # bev overlap\n    overlaps_bev = torch.cuda.FloatTensor(torch.Size((boxes_a.shape[0], boxes_b.shape[0]))).zero_()  # (N, M)\n    iou3d_nms_cuda.boxes_overlap_bev_gpu(boxes_a.contiguous(), boxes_b.contiguous(), overlaps_bev)\n\n    max_of_min = torch.max(boxes_a_height_min, boxes_b_height_min)\n    min_of_max = torch.min(boxes_a_height_max, boxes_b_height_max)\n    overlaps_h = torch.clamp(min_of_max - max_of_min, min=0)\n\n    # 3d iou\n    overlaps_3d = overlaps_bev * overlaps_h\n\n    vol_a = (boxes_a[:, 3] * boxes_a[:, 4] * boxes_a[:, 5]).view(-1, 1)\n    vol_b = (boxes_b[:, 3] * boxes_b[:, 4] * boxes_b[:, 5]).view(1, -1)\n\n    iou3d = overlaps_3d / torch.clamp(vol_a + vol_b - overlaps_3d, min=1e-6)\n\n    return iou3d\n\n\ndef nms_gpu(boxes, scores, thresh, pre_maxsize=None, **kwargs):\n    \"\"\"\n    :param boxes: (N, 7) [x, y, z, dx, dy, dz, heading]\n    :param scores: (N)\n    :param thresh: IoU threshold used for suppression\n    :param pre_maxsize: if not None, keep only this many top-scoring boxes before NMS\n    :return: (indices into the input boxes that are kept, None)\n    \"\"\"\n    assert boxes.shape[1] == 7\n    order = scores.sort(0, descending=True)[1]\n    if pre_maxsize is not None:\n        order = order[:pre_maxsize]\n\n    boxes = boxes[order].contiguous()\n    keep = torch.LongTensor(boxes.size(0))\n    num_out = iou3d_nms_cuda.nms_gpu(boxes, keep, thresh)\n    return order[keep[:num_out].cuda()].contiguous(), None\n\n\ndef nms_normal_gpu(boxes, scores, thresh, **kwargs):\n    \"\"\"\n    :param boxes: (N, 7) [x, y, z, dx, dy, dz, heading]\n    :param scores: (N)\n    
:param thresh:\n    :return:\n    \"\"\"\n    assert boxes.shape[1] == 7\n    order = scores.sort(0, descending=True)[1]\n\n    boxes = boxes[order].contiguous()\n\n    keep = torch.LongTensor(boxes.size(0))\n    num_out = iou3d_nms_cuda.nms_normal_gpu(boxes, keep, thresh)\n    return order[keep[:num_out].cuda()].contiguous(), None\n"
  },
  {
    "path": "pcdet/ops/iou3d_nms/src/iou3d_cpu.cpp",
    "content": "/*\n3D Rotated IoU Calculation (CPU)\nWritten by Shaoshuai Shi\nAll Rights Reserved 2020.\n*/\n\n#include <stdio.h>\n#include <math.h>\n#include <torch/serialize/tensor.h>\n#include <torch/extension.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"iou3d_cpu.h\"\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\ninline float min(float a, float b){\n    return a > b ? b : a;\n}\n\ninline float max(float a, float b){\n    return a > b ? a : b;\n}\n\nconst float EPS = 1e-8;\nstruct Point {\n    float x, y;\n    __device__ Point() {}\n    __device__ Point(double _x, double _y){\n        x = _x, y = _y;\n    }\n\n    __device__ void set(float _x, float _y){\n        x = _x; y = _y;\n    }\n\n    __device__ Point operator +(const Point &b)const{\n        return Point(x + b.x, y + b.y);\n    }\n\n    __device__ Point operator -(const Point &b)const{\n        return Point(x - b.x, y - b.y);\n    }\n};\n\ninline float cross(const Point &a, const Point &b){\n    return a.x * b.y - a.y * b.x;\n}\n\ninline float cross(const Point &p1, const Point &p2, const Point &p0){\n    return (p1.x - p0.x) * (p2.y - p0.y) - (p2.x - p0.x) * (p1.y - p0.y);\n}\n\ninline int check_rect_cross(const Point &p1, const Point &p2, const Point &q1, const Point &q2){\n    int ret = min(p1.x,p2.x) <= max(q1.x,q2.x)  &&\n              min(q1.x,q2.x) <= max(p1.x,p2.x) &&\n              min(p1.y,p2.y) <= max(q1.y,q2.y) &&\n              min(q1.y,q2.y) <= max(p1.y,p2.y);\n    return ret;\n}\n\ninline int check_in_box2d(const float *box, const Point &p){\n   
 //params: (7) [x, y, z, dx, dy, dz, heading]\n    const float MARGIN = 1e-2;\n\n    float center_x = box[0], center_y = box[1];\n    float angle_cos = cos(-box[6]), angle_sin = sin(-box[6]);  // rotate the point in the opposite direction of box\n    float rot_x = (p.x - center_x) * angle_cos + (p.y - center_y) * (-angle_sin);\n    float rot_y = (p.x - center_x) * angle_sin + (p.y - center_y) * angle_cos;\n\n    return (fabs(rot_x) < box[3] / 2 + MARGIN && fabs(rot_y) < box[4] / 2 + MARGIN);\n}\n\ninline int intersection(const Point &p1, const Point &p0, const Point &q1, const Point &q0, Point &ans){\n    // fast exclusion\n    if (check_rect_cross(p0, p1, q0, q1) == 0) return 0;\n\n    // check cross standing\n    float s1 = cross(q0, p1, p0);\n    float s2 = cross(p1, q1, p0);\n    float s3 = cross(p0, q1, q0);\n    float s4 = cross(q1, p1, q0);\n\n    if (!(s1 * s2 > 0 && s3 * s4 > 0)) return 0;\n\n    // calculate intersection of two lines\n    float s5 = cross(q1, p1, p0);\n    if(fabs(s5 - s1) > EPS){\n        ans.x = (s5 * q0.x - s1 * q1.x) / (s5 - s1);\n        ans.y = (s5 * q0.y - s1 * q1.y) / (s5 - s1);\n\n    }\n    else{\n        float a0 = p0.y - p1.y, b0 = p1.x - p0.x, c0 = p0.x * p1.y - p1.x * p0.y;\n        float a1 = q0.y - q1.y, b1 = q1.x - q0.x, c1 = q0.x * q1.y - q1.x * q0.y;\n        float D = a0 * b1 - a1 * b0;\n\n        ans.x = (b0 * c1 - b1 * c0) / D;\n        ans.y = (a1 * c0 - a0 * c1) / D;\n    }\n\n    return 1;\n}\n\ninline void rotate_around_center(const Point &center, const float angle_cos, const float angle_sin, Point &p){\n    float new_x = (p.x - center.x) * angle_cos + (p.y - center.y) * (-angle_sin) + center.x;\n    float new_y = (p.x - center.x) * angle_sin + (p.y - center.y) * angle_cos + center.y;\n    p.set(new_x, new_y);\n}\n\ninline int point_cmp(const Point &a, const Point &b, const Point &center){\n    return atan2(a.y - center.y, a.x - center.x) > atan2(b.y - center.y, b.x - center.x);\n}\n\ninline float 
box_overlap(const float *box_a, const float *box_b){\n    // params: box_a (7) [x, y, z, dx, dy, dz, heading]\n    // params: box_b (7) [x, y, z, dx, dy, dz, heading]\n\n//    float a_x1 = box_a[0], a_y1 = box_a[1], a_x2 = box_a[2], a_y2 = box_a[3], a_angle = box_a[4];\n//    float b_x1 = box_b[0], b_y1 = box_b[1], b_x2 = box_b[2], b_y2 = box_b[3], b_angle = box_b[4];\n    float a_angle = box_a[6], b_angle = box_b[6];\n    float a_dx_half = box_a[3] / 2, b_dx_half = box_b[3] / 2, a_dy_half = box_a[4] / 2, b_dy_half = box_b[4] / 2;\n    float a_x1 = box_a[0] - a_dx_half, a_y1 = box_a[1] - a_dy_half;\n    float a_x2 = box_a[0] + a_dx_half, a_y2 = box_a[1] + a_dy_half;\n    float b_x1 = box_b[0] - b_dx_half, b_y1 = box_b[1] - b_dy_half;\n    float b_x2 = box_b[0] + b_dx_half, b_y2 = box_b[1] + b_dy_half;\n\n    Point center_a(box_a[0], box_a[1]);\n    Point center_b(box_b[0], box_b[1]);\n\n    Point box_a_corners[5];\n    box_a_corners[0].set(a_x1, a_y1);\n    box_a_corners[1].set(a_x2, a_y1);\n    box_a_corners[2].set(a_x2, a_y2);\n    box_a_corners[3].set(a_x1, a_y2);\n\n    Point box_b_corners[5];\n    box_b_corners[0].set(b_x1, b_y1);\n    box_b_corners[1].set(b_x2, b_y1);\n    box_b_corners[2].set(b_x2, b_y2);\n    box_b_corners[3].set(b_x1, b_y2);\n\n    // get oriented corners\n    float a_angle_cos = cos(a_angle), a_angle_sin = sin(a_angle);\n    float b_angle_cos = cos(b_angle), b_angle_sin = sin(b_angle);\n\n    for (int k = 0; k < 4; k++){\n        rotate_around_center(center_a, a_angle_cos, a_angle_sin, box_a_corners[k]);\n        rotate_around_center(center_b, b_angle_cos, b_angle_sin, box_b_corners[k]);\n    }\n\n    box_a_corners[4] = box_a_corners[0];\n    box_b_corners[4] = box_b_corners[0];\n\n    // get intersection of lines\n    Point cross_points[16];\n    Point poly_center;\n    int cnt = 0, flag = 0;\n\n    poly_center.set(0, 0);\n    for (int i = 0; i < 4; i++){\n        for (int j = 0; j < 4; j++){\n            flag = 
intersection(box_a_corners[i + 1], box_a_corners[i], box_b_corners[j + 1], box_b_corners[j], cross_points[cnt]);\n            if (flag){\n                poly_center = poly_center + cross_points[cnt];\n                cnt++;\n            }\n        }\n    }\n\n    // check corners\n    for (int k = 0; k < 4; k++){\n        if (check_in_box2d(box_a, box_b_corners[k])){\n            poly_center = poly_center + box_b_corners[k];\n            cross_points[cnt] = box_b_corners[k];\n            cnt++;\n        }\n        if (check_in_box2d(box_b, box_a_corners[k])){\n            poly_center = poly_center + box_a_corners[k];\n            cross_points[cnt] = box_a_corners[k];\n            cnt++;\n        }\n    }\n\n    poly_center.x /= cnt;\n    poly_center.y /= cnt;\n\n    // sort the points of polygon\n    Point temp;\n    for (int j = 0; j < cnt - 1; j++){\n        for (int i = 0; i < cnt - j - 1; i++){\n            if (point_cmp(cross_points[i], cross_points[i + 1], poly_center)){\n                temp = cross_points[i];\n                cross_points[i] = cross_points[i + 1];\n                cross_points[i + 1] = temp;\n            }\n        }\n    }\n\n    // get the overlap areas\n    float area = 0;\n    for (int k = 0; k < cnt - 1; k++){\n        area += cross(cross_points[k] - cross_points[0], cross_points[k + 1] - cross_points[0]);\n    }\n\n    return fabs(area) / 2.0;\n}\n\ninline float iou_bev(const float *box_a, const float *box_b){\n    // params: box_a (7) [x, y, z, dx, dy, dz, heading]\n    // params: box_b (7) [x, y, z, dx, dy, dz, heading]\n    float sa = box_a[3] * box_a[4];\n    float sb = box_b[3] * box_b[4];\n    float s_overlap = box_overlap(box_a, box_b);\n    return s_overlap / fmaxf(sa + sb - s_overlap, EPS);\n}\n\n\nint boxes_iou_bev_cpu(at::Tensor boxes_a_tensor, at::Tensor boxes_b_tensor, at::Tensor ans_iou_tensor){\n    // params boxes_a_tensor: (N, 7) [x, y, z, dx, dy, dz, heading]\n    // params boxes_b_tensor: (M, 7) [x, y, z, dx, dy, 
dz, heading]\n    // params ans_iou_tensor: (N, M)\n\n    CHECK_CONTIGUOUS(boxes_a_tensor);\n    CHECK_CONTIGUOUS(boxes_b_tensor);\n\n    int num_boxes_a = boxes_a_tensor.size(0);\n    int num_boxes_b = boxes_b_tensor.size(0);\n    const float *boxes_a = boxes_a_tensor.data<float>();\n    const float *boxes_b = boxes_b_tensor.data<float>();\n    float *ans_iou = ans_iou_tensor.data<float>();\n\n    for (int i = 0; i < num_boxes_a; i++){\n        for (int j = 0; j < num_boxes_b; j++){\n            ans_iou[i * num_boxes_b + j] = iou_bev(boxes_a + i * 7, boxes_b + j * 7);\n        }\n    }\n    return 1;\n}\n"
  },
  {
    "path": "pcdet/ops/iou3d_nms/src/iou3d_cpu.h",
    "content": "#ifndef IOU3D_CPU_H\n#define IOU3D_CPU_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\nint boxes_iou_bev_cpu(at::Tensor boxes_a_tensor, at::Tensor boxes_b_tensor, at::Tensor ans_iou_tensor);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/iou3d_nms/src/iou3d_nms.cpp",
    "content": "/*\n3D IoU Calculation and Rotated NMS(modified from 2D NMS written by others)\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n#include <torch/serialize/tensor.h>\n#include <torch/extension.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"iou3d_nms.h\"\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))\n\n#define CHECK_ERROR(ans) { gpuAssert((ans), __FILE__, __LINE__); }\ninline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)\n{\n   if (code != cudaSuccess)\n   {\n      fprintf(stderr,\"GPUassert: %s %s %d\\n\", cudaGetErrorString(code), file, line);\n      if (abort) exit(code);\n   }\n}\n\nconst int THREADS_PER_BLOCK_NMS = sizeof(unsigned long long) * 8;\n\n\nvoid boxesoverlapLauncher(const int num_a, const float *boxes_a, const int num_b, const float *boxes_b, float *ans_overlap);\nvoid boxesioubevLauncher(const int num_a, const float *boxes_a, const int num_b, const float *boxes_b, float *ans_iou);\nvoid nmsLauncher(const float *boxes, unsigned long long * mask, int boxes_num, float nms_overlap_thresh);\nvoid nmsNormalLauncher(const float *boxes, unsigned long long * mask, int boxes_num, float nms_overlap_thresh);\n\n\nint boxes_overlap_bev_gpu(at::Tensor boxes_a, at::Tensor boxes_b, at::Tensor ans_overlap){\n    // params boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n    // params boxes_b: (M, 7) [x, y, z, dx, dy, dz, heading]\n    // params ans_overlap: (N, M)\n\n    CHECK_INPUT(boxes_a);\n    
CHECK_INPUT(boxes_b);\n    CHECK_INPUT(ans_overlap);\n\n    int num_a = boxes_a.size(0);\n    int num_b = boxes_b.size(0);\n\n    const float * boxes_a_data = boxes_a.data<float>();\n    const float * boxes_b_data = boxes_b.data<float>();\n    float * ans_overlap_data = ans_overlap.data<float>();\n\n    boxesoverlapLauncher(num_a, boxes_a_data, num_b, boxes_b_data, ans_overlap_data);\n\n    return 1;\n}\n\nint boxes_iou_bev_gpu(at::Tensor boxes_a, at::Tensor boxes_b, at::Tensor ans_iou){\n    // params boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n    // params boxes_b: (M, 7) [x, y, z, dx, dy, dz, heading]\n    // params ans_iou: (N, M)\n    CHECK_INPUT(boxes_a);\n    CHECK_INPUT(boxes_b);\n    CHECK_INPUT(ans_iou);\n\n    int num_a = boxes_a.size(0);\n    int num_b = boxes_b.size(0);\n\n    const float * boxes_a_data = boxes_a.data<float>();\n    const float * boxes_b_data = boxes_b.data<float>();\n    float * ans_iou_data = ans_iou.data<float>();\n\n    boxesioubevLauncher(num_a, boxes_a_data, num_b, boxes_b_data, ans_iou_data);\n\n    return 1;\n}\n\nint nms_gpu(at::Tensor boxes, at::Tensor keep, float nms_overlap_thresh){\n    // params boxes: (N, 7) [x, y, z, dx, dy, dz, heading]\n    // params keep: (N)\n    CHECK_INPUT(boxes);\n    CHECK_CONTIGUOUS(keep);\n\n    int boxes_num = boxes.size(0);\n    const float * boxes_data = boxes.data<float>();\n    long * keep_data = keep.data<long>();\n\n    const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);\n\n    unsigned long long *mask_data = NULL;\n    CHECK_ERROR(cudaMalloc((void**)&mask_data, boxes_num * col_blocks * sizeof(unsigned long long)));\n    nmsLauncher(boxes_data, mask_data, boxes_num, nms_overlap_thresh);\n\n    // unsigned long long mask_cpu[boxes_num * col_blocks];\n    // unsigned long long *mask_cpu = new unsigned long long [boxes_num * col_blocks];\n    std::vector<unsigned long long> mask_cpu(boxes_num * col_blocks);\n\n//    printf(\"boxes_num=%d, col_blocks=%d\\n\", boxes_num, 
col_blocks);\n    CHECK_ERROR(cudaMemcpy(&mask_cpu[0], mask_data, boxes_num * col_blocks * sizeof(unsigned long long),\n                           cudaMemcpyDeviceToHost));\n\n    cudaFree(mask_data);\n\n    unsigned long long remv_cpu[col_blocks];\n    memset(remv_cpu, 0, col_blocks * sizeof(unsigned long long));\n\n    int num_to_keep = 0;\n\n    for (int i = 0; i < boxes_num; i++){\n        int nblock = i / THREADS_PER_BLOCK_NMS;\n        int inblock = i % THREADS_PER_BLOCK_NMS;\n\n        if (!(remv_cpu[nblock] & (1ULL << inblock))){\n            keep_data[num_to_keep++] = i;\n            unsigned long long *p = &mask_cpu[0] + i * col_blocks;\n            for (int j = nblock; j < col_blocks; j++){\n                remv_cpu[j] |= p[j];\n            }\n        }\n    }\n    if ( cudaSuccess != cudaGetLastError() ) printf( \"Error!\\n\" );\n\n    return num_to_keep;\n}\n\n\nint nms_normal_gpu(at::Tensor boxes, at::Tensor keep, float nms_overlap_thresh){\n    // params boxes: (N, 7) [x, y, z, dx, dy, dz, heading]\n    // params keep: (N)\n\n    CHECK_INPUT(boxes);\n    CHECK_CONTIGUOUS(keep);\n\n    int boxes_num = boxes.size(0);\n    const float * boxes_data = boxes.data<float>();\n    long * keep_data = keep.data<long>();\n\n    const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);\n\n    unsigned long long *mask_data = NULL;\n    CHECK_ERROR(cudaMalloc((void**)&mask_data, boxes_num * col_blocks * sizeof(unsigned long long)));\n    nmsNormalLauncher(boxes_data, mask_data, boxes_num, nms_overlap_thresh);\n\n    // unsigned long long mask_cpu[boxes_num * col_blocks];\n    // unsigned long long *mask_cpu = new unsigned long long [boxes_num * col_blocks];\n    std::vector<unsigned long long> mask_cpu(boxes_num * col_blocks);\n\n//    printf(\"boxes_num=%d, col_blocks=%d\\n\", boxes_num, col_blocks);\n    CHECK_ERROR(cudaMemcpy(&mask_cpu[0], mask_data, boxes_num * col_blocks * sizeof(unsigned long long),\n                           
cudaMemcpyDeviceToHost));\n\n    cudaFree(mask_data);\n\n    unsigned long long remv_cpu[col_blocks];\n    memset(remv_cpu, 0, col_blocks * sizeof(unsigned long long));\n\n    int num_to_keep = 0;\n\n    for (int i = 0; i < boxes_num; i++){\n        int nblock = i / THREADS_PER_BLOCK_NMS;\n        int inblock = i % THREADS_PER_BLOCK_NMS;\n\n        if (!(remv_cpu[nblock] & (1ULL << inblock))){\n            keep_data[num_to_keep++] = i;\n            unsigned long long *p = &mask_cpu[0] + i * col_blocks;\n            for (int j = nblock; j < col_blocks; j++){\n                remv_cpu[j] |= p[j];\n            }\n        }\n    }\n    if ( cudaSuccess != cudaGetLastError() ) printf( \"Error!\\n\" );\n\n    return num_to_keep;\n}\n\n\n"
  },
  {
    "path": "pcdet/ops/iou3d_nms/src/iou3d_nms.h",
    "content": "#ifndef IOU3D_NMS_H\n#define IOU3D_NMS_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\nint boxes_overlap_bev_gpu(at::Tensor boxes_a, at::Tensor boxes_b, at::Tensor ans_overlap);\nint boxes_iou_bev_gpu(at::Tensor boxes_a, at::Tensor boxes_b, at::Tensor ans_iou);\nint nms_gpu(at::Tensor boxes, at::Tensor keep, float nms_overlap_thresh);\nint nms_normal_gpu(at::Tensor boxes, at::Tensor keep, float nms_overlap_thresh);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/iou3d_nms/src/iou3d_nms_api.cpp",
    "content": "#include <torch/serialize/tensor.h>\n#include <torch/extension.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\n#include \"iou3d_cpu.h\"\n#include \"iou3d_nms.h\"\n\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n\tm.def(\"boxes_overlap_bev_gpu\", &boxes_overlap_bev_gpu, \"oriented boxes overlap\");\n\tm.def(\"boxes_iou_bev_gpu\", &boxes_iou_bev_gpu, \"oriented boxes iou\");\n\tm.def(\"nms_gpu\", &nms_gpu, \"oriented nms gpu\");\n\tm.def(\"nms_normal_gpu\", &nms_normal_gpu, \"nms gpu\");\n\tm.def(\"boxes_iou_bev_cpu\", &boxes_iou_bev_cpu, \"oriented boxes iou\");\n}\n"
  },
  {
    "path": "pcdet/ops/iou3d_nms/src/iou3d_nms_kernel.cu",
    "content": "/*\n3D IoU Calculation and Rotated NMS(modified from 2D NMS written by others)\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <stdio.h>\n#define THREADS_PER_BLOCK 16\n#define DIVUP(m, n) ((m) / (n) + ((m) % (n) > 0))\n\n// #define DEBUG\nconst int THREADS_PER_BLOCK_NMS = sizeof(unsigned long long) * 8;\nconst float EPS = 1e-8;\nstruct Point {\n    float x, y;\n    __device__ Point() {}\n    __device__ Point(double _x, double _y){\n        x = _x, y = _y;\n    }\n\n    __device__ void set(float _x, float _y){\n        x = _x; y = _y;\n    }\n\n    __device__ Point operator +(const Point &b)const{\n        return Point(x + b.x, y + b.y);\n    }\n\n    __device__ Point operator -(const Point &b)const{\n        return Point(x - b.x, y - b.y);\n    }\n};\n\n__device__ inline float cross(const Point &a, const Point &b){\n    return a.x * b.y - a.y * b.x;\n}\n\n__device__ inline float cross(const Point &p1, const Point &p2, const Point &p0){\n    return (p1.x - p0.x) * (p2.y - p0.y) - (p2.x - p0.x) * (p1.y - p0.y);\n}\n\n__device__ int check_rect_cross(const Point &p1, const Point &p2, const Point &q1, const Point &q2){\n    int ret = min(p1.x,p2.x) <= max(q1.x,q2.x)  &&\n              min(q1.x,q2.x) <= max(p1.x,p2.x) &&\n              min(p1.y,p2.y) <= max(q1.y,q2.y) &&\n              min(q1.y,q2.y) <= max(p1.y,p2.y);\n    return ret;\n}\n\n__device__ inline int check_in_box2d(const float *box, const Point &p){\n    //params: (7) [x, y, z, dx, dy, dz, heading]\n    const float MARGIN = 1e-2;\n\n    float center_x = box[0], center_y = box[1];\n    float angle_cos = cos(-box[6]), angle_sin = sin(-box[6]);  // rotate the point in the opposite direction of box\n    float rot_x = (p.x - center_x) * angle_cos + (p.y - center_y) * (-angle_sin);\n    float rot_y = (p.x - center_x) * angle_sin + (p.y - center_y) * angle_cos;\n\n    return (fabs(rot_x) < box[3] / 2 + MARGIN && fabs(rot_y) < box[4] / 2 + MARGIN);\n}\n\n__device__ inline 
int intersection(const Point &p1, const Point &p0, const Point &q1, const Point &q0, Point &ans){\n    // fast exclusion\n    if (check_rect_cross(p0, p1, q0, q1) == 0) return 0;\n\n    // check cross standing\n    float s1 = cross(q0, p1, p0);\n    float s2 = cross(p1, q1, p0);\n    float s3 = cross(p0, q1, q0);\n    float s4 = cross(q1, p1, q0);\n\n    if (!(s1 * s2 > 0 && s3 * s4 > 0)) return 0;\n\n    // calculate intersection of two lines\n    float s5 = cross(q1, p1, p0);\n    if(fabs(s5 - s1) > EPS){\n        ans.x = (s5 * q0.x - s1 * q1.x) / (s5 - s1);\n        ans.y = (s5 * q0.y - s1 * q1.y) / (s5 - s1);\n\n    }\n    else{\n        float a0 = p0.y - p1.y, b0 = p1.x - p0.x, c0 = p0.x * p1.y - p1.x * p0.y;\n        float a1 = q0.y - q1.y, b1 = q1.x - q0.x, c1 = q0.x * q1.y - q1.x * q0.y;\n        float D = a0 * b1 - a1 * b0;\n\n        ans.x = (b0 * c1 - b1 * c0) / D;\n        ans.y = (a1 * c0 - a0 * c1) / D;\n    }\n\n    return 1;\n}\n\n__device__ inline void rotate_around_center(const Point &center, const float angle_cos, const float angle_sin, Point &p){\n    float new_x = (p.x - center.x) * angle_cos + (p.y - center.y) * (-angle_sin) + center.x;\n    float new_y = (p.x - center.x) * angle_sin + (p.y - center.y) * angle_cos + center.y;\n    p.set(new_x, new_y);\n}\n\n__device__ inline int point_cmp(const Point &a, const Point &b, const Point &center){\n    return atan2(a.y - center.y, a.x - center.x) > atan2(b.y - center.y, b.x - center.x);\n}\n\n__device__ inline float box_overlap(const float *box_a, const float *box_b){\n    // params box_a: [x, y, z, dx, dy, dz, heading]\n    // params box_b: [x, y, z, dx, dy, dz, heading]\n\n    float a_angle = box_a[6], b_angle = box_b[6];\n    float a_dx_half = box_a[3] / 2, b_dx_half = box_b[3] / 2, a_dy_half = box_a[4] / 2, b_dy_half = box_b[4] / 2;\n    float a_x1 = box_a[0] - a_dx_half, a_y1 = box_a[1] - a_dy_half;\n    float a_x2 = box_a[0] + a_dx_half, a_y2 = box_a[1] + a_dy_half;\n    float b_x1 = box_b[0] 
- b_dx_half, b_y1 = box_b[1] - b_dy_half;\n    float b_x2 = box_b[0] + b_dx_half, b_y2 = box_b[1] + b_dy_half;\n\n    Point center_a(box_a[0], box_a[1]);\n    Point center_b(box_b[0], box_b[1]);\n\n#ifdef DEBUG\n    printf(\"a: (%.3f, %.3f, %.3f, %.3f, %.3f), b: (%.3f, %.3f, %.3f, %.3f, %.3f)\\n\", a_x1, a_y1, a_x2, a_y2, a_angle,\n           b_x1, b_y1, b_x2, b_y2, b_angle);\n    printf(\"center a: (%.3f, %.3f), b: (%.3f, %.3f)\\n\", center_a.x, center_a.y, center_b.x, center_b.y);\n#endif\n\n    Point box_a_corners[5];\n    box_a_corners[0].set(a_x1, a_y1);\n    box_a_corners[1].set(a_x2, a_y1);\n    box_a_corners[2].set(a_x2, a_y2);\n    box_a_corners[3].set(a_x1, a_y2);\n\n    Point box_b_corners[5];\n    box_b_corners[0].set(b_x1, b_y1);\n    box_b_corners[1].set(b_x2, b_y1);\n    box_b_corners[2].set(b_x2, b_y2);\n    box_b_corners[3].set(b_x1, b_y2);\n\n    // get oriented corners\n    float a_angle_cos = cos(a_angle), a_angle_sin = sin(a_angle);\n    float b_angle_cos = cos(b_angle), b_angle_sin = sin(b_angle);\n\n    for (int k = 0; k < 4; k++){\n#ifdef DEBUG\n        printf(\"before corner %d: a(%.3f, %.3f), b(%.3f, %.3f) \\n\", k, box_a_corners[k].x, box_a_corners[k].y, box_b_corners[k].x, box_b_corners[k].y);\n#endif\n        rotate_around_center(center_a, a_angle_cos, a_angle_sin, box_a_corners[k]);\n        rotate_around_center(center_b, b_angle_cos, b_angle_sin, box_b_corners[k]);\n#ifdef DEBUG\n        printf(\"corner %d: a(%.3f, %.3f), b(%.3f, %.3f) \\n\", k, box_a_corners[k].x, box_a_corners[k].y, box_b_corners[k].x, box_b_corners[k].y);\n#endif\n    }\n\n    box_a_corners[4] = box_a_corners[0];\n    box_b_corners[4] = box_b_corners[0];\n\n    // get intersection of lines\n    Point cross_points[16];\n    Point poly_center;\n    int cnt = 0, flag = 0;\n\n    poly_center.set(0, 0);\n    for (int i = 0; i < 4; i++){\n        for (int j = 0; j < 4; j++){\n            flag = intersection(box_a_corners[i + 1], box_a_corners[i], box_b_corners[j + 1], 
box_b_corners[j], cross_points[cnt]);\n            if (flag){\n                poly_center = poly_center + cross_points[cnt];\n                cnt++;\n#ifdef DEBUG\n                printf(\"Cross points (%.3f, %.3f): a(%.3f, %.3f)->(%.3f, %.3f), b(%.3f, %.3f)->(%.3f, %.3f) \\n\",\n                    cross_points[cnt - 1].x, cross_points[cnt - 1].y,\n                    box_a_corners[i].x, box_a_corners[i].y, box_a_corners[i + 1].x, box_a_corners[i + 1].y,\n                    box_b_corners[j].x, box_b_corners[j].y, box_b_corners[j + 1].x, box_b_corners[j + 1].y);\n#endif\n            }\n        }\n    }\n\n    // check corners\n    for (int k = 0; k < 4; k++){\n        if (check_in_box2d(box_a, box_b_corners[k])){\n            poly_center = poly_center + box_b_corners[k];\n            cross_points[cnt] = box_b_corners[k];\n            cnt++;\n#ifdef DEBUG\n                printf(\"b corners in a: corner_b(%.3f, %.3f)\", cross_points[cnt - 1].x, cross_points[cnt - 1].y);\n#endif\n        }\n        if (check_in_box2d(box_b, box_a_corners[k])){\n            poly_center = poly_center + box_a_corners[k];\n            cross_points[cnt] = box_a_corners[k];\n            cnt++;\n#ifdef DEBUG\n                printf(\"a corners in b: corner_a(%.3f, %.3f)\", cross_points[cnt - 1].x, cross_points[cnt - 1].y);\n#endif\n        }\n    }\n\n    poly_center.x /= cnt;\n    poly_center.y /= cnt;\n\n    // sort the points of polygon\n    Point temp;\n    for (int j = 0; j < cnt - 1; j++){\n        for (int i = 0; i < cnt - j - 1; i++){\n            if (point_cmp(cross_points[i], cross_points[i + 1], poly_center)){\n                temp = cross_points[i];\n                cross_points[i] = cross_points[i + 1];\n                cross_points[i + 1] = temp;\n            }\n        }\n    }\n\n#ifdef DEBUG\n    printf(\"cnt=%d\\n\", cnt);\n    for (int i = 0; i < cnt; i++){\n        printf(\"All cross point %d: (%.3f, %.3f)\\n\", i, cross_points[i].x, cross_points[i].y);\n    
}\n#endif\n\n    // get the overlap areas\n    float area = 0;\n    for (int k = 0; k < cnt - 1; k++){\n        area += cross(cross_points[k] - cross_points[0], cross_points[k + 1] - cross_points[0]);\n    }\n\n    return fabs(area) / 2.0;\n}\n\n__device__ inline float iou_bev(const float *box_a, const float *box_b){\n    // params box_a: [x, y, z, dx, dy, dz, heading]\n    // params box_b: [x, y, z, dx, dy, dz, heading]\n    float sa = box_a[3] * box_a[4];\n    float sb = box_b[3] * box_b[4];\n    float s_overlap = box_overlap(box_a, box_b);\n    return s_overlap / fmaxf(sa + sb - s_overlap, EPS);\n}\n\n__global__ void boxes_overlap_kernel(const int num_a, const float *boxes_a, const int num_b, const float *boxes_b, float *ans_overlap){\n    // params boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n    // params boxes_b: (M, 7) [x, y, z, dx, dy, dz, heading]\n    const int a_idx = blockIdx.y * THREADS_PER_BLOCK + threadIdx.y;\n    const int b_idx = blockIdx.x * THREADS_PER_BLOCK + threadIdx.x;\n\n    if (a_idx >= num_a || b_idx >= num_b){\n        return;\n    }\n    const float * cur_box_a = boxes_a + a_idx * 7;\n    const float * cur_box_b = boxes_b + b_idx * 7;\n    float s_overlap = box_overlap(cur_box_a, cur_box_b);\n    ans_overlap[a_idx * num_b + b_idx] = s_overlap;\n}\n\n__global__ void boxes_iou_bev_kernel(const int num_a, const float *boxes_a, const int num_b, const float *boxes_b, float *ans_iou){\n    // params boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n    // params boxes_b: (M, 7) [x, y, z, dx, dy, dz, heading]\n    const int a_idx = blockIdx.y * THREADS_PER_BLOCK + threadIdx.y;\n    const int b_idx = blockIdx.x * THREADS_PER_BLOCK + threadIdx.x;\n\n    if (a_idx >= num_a || b_idx >= num_b){\n        return;\n    }\n\n    const float * cur_box_a = boxes_a + a_idx * 7;\n    const float * cur_box_b = boxes_b + b_idx * 7;\n    float cur_iou_bev = iou_bev(cur_box_a, cur_box_b);\n    ans_iou[a_idx * num_b + b_idx] = cur_iou_bev;\n}\n\n__global__ void 
nms_kernel(const int boxes_num, const float nms_overlap_thresh,\n                           const float *boxes, unsigned long long *mask){\n    //params: boxes (N, 7) [x, y, z, dx, dy, dz, heading]\n    //params: mask (N, N/THREADS_PER_BLOCK_NMS)\n\n    const int row_start = blockIdx.y;\n    const int col_start = blockIdx.x;\n\n    // if (row_start > col_start) return;\n\n    const int row_size = fminf(boxes_num - row_start * THREADS_PER_BLOCK_NMS, THREADS_PER_BLOCK_NMS);\n    const int col_size = fminf(boxes_num - col_start * THREADS_PER_BLOCK_NMS, THREADS_PER_BLOCK_NMS);\n\n    __shared__ float block_boxes[THREADS_PER_BLOCK_NMS * 7];\n\n    if (threadIdx.x < col_size) {\n        block_boxes[threadIdx.x * 7 + 0] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 0];\n        block_boxes[threadIdx.x * 7 + 1] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 1];\n        block_boxes[threadIdx.x * 7 + 2] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 2];\n        block_boxes[threadIdx.x * 7 + 3] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 3];\n        block_boxes[threadIdx.x * 7 + 4] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 4];\n        block_boxes[threadIdx.x * 7 + 5] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 5];\n        block_boxes[threadIdx.x * 7 + 6] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 6];\n    }\n    __syncthreads();\n\n    if (threadIdx.x < row_size) {\n        const int cur_box_idx = THREADS_PER_BLOCK_NMS * row_start + threadIdx.x;\n        const float *cur_box = boxes + cur_box_idx * 7;\n\n        int i = 0;\n        unsigned long long t = 0;\n        int start = 0;\n        if (row_start == col_start) {\n          start = threadIdx.x + 1;\n        }\n        for (i = start; i < col_size; i++) {\n            if (iou_bev(cur_box, block_boxes + i * 7) > nms_overlap_thresh){\n                t |= 1ULL << i;\n            }\n    
    }\n        const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);\n        mask[cur_box_idx * col_blocks + col_start] = t;\n    }\n}\n\n\n__device__ inline float iou_normal(float const * const a, float const * const b) {\n    //params: a: [x, y, z, dx, dy, dz, heading]\n    //params: b: [x, y, z, dx, dy, dz, heading]\n\n    float left = fmaxf(a[0] - a[3] / 2, b[0] - b[3] / 2), right = fminf(a[0] + a[3] / 2, b[0] + b[3] / 2);\n    float top = fmaxf(a[1] - a[4] / 2, b[1] - b[4] / 2), bottom = fminf(a[1] + a[4] / 2, b[1] + b[4] / 2);\n    float width = fmaxf(right - left, 0.f), height = fmaxf(bottom - top, 0.f);\n    float interS = width * height;\n    float Sa = a[3] * a[4];\n    float Sb = b[3] * b[4];\n    return interS / fmaxf(Sa + Sb - interS, EPS);\n}\n\n\n__global__ void nms_normal_kernel(const int boxes_num, const float nms_overlap_thresh,\n                           const float *boxes, unsigned long long *mask){\n    //params: boxes (N, 7) [x, y, z, dx, dy, dz, heading]\n    //params: mask (N, N/THREADS_PER_BLOCK_NMS)\n\n    const int row_start = blockIdx.y;\n    const int col_start = blockIdx.x;\n\n    // if (row_start > col_start) return;\n\n    const int row_size = fminf(boxes_num - row_start * THREADS_PER_BLOCK_NMS, THREADS_PER_BLOCK_NMS);\n    const int col_size = fminf(boxes_num - col_start * THREADS_PER_BLOCK_NMS, THREADS_PER_BLOCK_NMS);\n\n    __shared__ float block_boxes[THREADS_PER_BLOCK_NMS * 7];\n\n    if (threadIdx.x < col_size) {\n        block_boxes[threadIdx.x * 7 + 0] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 0];\n        block_boxes[threadIdx.x * 7 + 1] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 1];\n        block_boxes[threadIdx.x * 7 + 2] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 2];\n        block_boxes[threadIdx.x * 7 + 3] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 3];\n        block_boxes[threadIdx.x * 7 + 4] = boxes[(THREADS_PER_BLOCK_NMS 
* col_start + threadIdx.x) * 7 + 4];\n        block_boxes[threadIdx.x * 7 + 5] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 5];\n        block_boxes[threadIdx.x * 7 + 6] = boxes[(THREADS_PER_BLOCK_NMS * col_start + threadIdx.x) * 7 + 6];\n    }\n    __syncthreads();\n\n    if (threadIdx.x < row_size) {\n        const int cur_box_idx = THREADS_PER_BLOCK_NMS * row_start + threadIdx.x;\n        const float *cur_box = boxes + cur_box_idx * 7;\n\n        int i = 0;\n        unsigned long long t = 0;\n        int start = 0;\n        if (row_start == col_start) {\n          start = threadIdx.x + 1;\n        }\n        for (i = start; i < col_size; i++) {\n            if (iou_normal(cur_box, block_boxes + i * 7) > nms_overlap_thresh){\n                t |= 1ULL << i;\n            }\n        }\n        const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);\n        mask[cur_box_idx * col_blocks + col_start] = t;\n    }\n}\n\n\n\n\n\nvoid boxesoverlapLauncher(const int num_a, const float *boxes_a, const int num_b, const float *boxes_b, float *ans_overlap){\n\n    dim3 blocks(DIVUP(num_b, THREADS_PER_BLOCK), DIVUP(num_a, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK, THREADS_PER_BLOCK);\n\n    boxes_overlap_kernel<<<blocks, threads>>>(num_a, boxes_a, num_b, boxes_b, ans_overlap);\n#ifdef DEBUG\n    cudaDeviceSynchronize();  // for using printf in kernel function\n#endif\n}\n\nvoid boxesioubevLauncher(const int num_a, const float *boxes_a, const int num_b, const float *boxes_b, float *ans_iou){\n\n    dim3 blocks(DIVUP(num_b, THREADS_PER_BLOCK), DIVUP(num_a, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK, THREADS_PER_BLOCK);\n\n    boxes_iou_bev_kernel<<<blocks, threads>>>(num_a, boxes_a, num_b, boxes_b, ans_iou);\n#ifdef DEBUG\n    cudaDeviceSynchronize();  // for using printf in kernel function\n#endif\n}\n\n\nvoid nmsLauncher(const float *boxes, 
unsigned long long * mask, int boxes_num, float nms_overlap_thresh){\n    dim3 blocks(DIVUP(boxes_num, THREADS_PER_BLOCK_NMS),\n                DIVUP(boxes_num, THREADS_PER_BLOCK_NMS));\n    dim3 threads(THREADS_PER_BLOCK_NMS);\n    nms_kernel<<<blocks, threads>>>(boxes_num, nms_overlap_thresh, boxes, mask);\n}\n\n\nvoid nmsNormalLauncher(const float *boxes, unsigned long long * mask, int boxes_num, float nms_overlap_thresh){\n    dim3 blocks(DIVUP(boxes_num, THREADS_PER_BLOCK_NMS),\n                DIVUP(boxes_num, THREADS_PER_BLOCK_NMS));\n    dim3 threads(THREADS_PER_BLOCK_NMS);\n    nms_normal_kernel<<<blocks, threads>>>(boxes_num, nms_overlap_thresh, boxes, mask);\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/pointnet2_modules.py",
    "content": "from typing import List\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom . import pointnet2_utils\n\n\nclass _PointnetSAModuleBase(nn.Module):\n\n    def __init__(self):\n        super().__init__()\n        self.npoint = None\n        self.groupers = None\n        self.mlps = None\n        self.pool_method = 'max_pool'\n\n    def forward(self, xyz: torch.Tensor, features: torch.Tensor = None, new_xyz=None) -> (torch.Tensor, torch.Tensor):\n        \"\"\"\n        :param xyz: (B, N, 3) tensor of the xyz coordinates of the features\n        :param features: (B, N, C) tensor of the descriptors of the features\n        :param new_xyz:\n        :return:\n            new_xyz: (B, npoint, 3) tensor of the new features' xyz\n            new_features: (B, \sum_k(mlps[k][-1]), npoint) tensor of the new_features descriptors\n        \"\"\"\n        new_features_list = []\n\n        xyz_flipped = xyz.transpose(1, 2).contiguous()\n        if new_xyz is None:\n            new_xyz = pointnet2_utils.gather_operation(\n                xyz_flipped,\n                pointnet2_utils.furthest_point_sample(xyz, self.npoint)\n            ).transpose(1, 2).contiguous() if self.npoint is not None else None\n\n        for i in range(len(self.groupers)):\n            new_features = self.groupers[i](xyz, new_xyz, features)  # (B, C, npoint, nsample)\n\n            new_features = self.mlps[i](new_features)  # (B, mlp[-1], npoint, nsample)\n            if self.pool_method == 'max_pool':\n                new_features = F.max_pool2d(\n                    new_features, kernel_size=[1, new_features.size(3)]\n                )  # (B, mlp[-1], npoint, 1)\n            elif self.pool_method == 'avg_pool':\n                new_features = F.avg_pool2d(\n                    new_features, kernel_size=[1, new_features.size(3)]\n                )  # (B, mlp[-1], npoint, 1)\n            else:\n                raise NotImplementedError\n\n            
new_features = new_features.squeeze(-1)  # (B, mlp[-1], npoint)\n            new_features_list.append(new_features)\n\n        return new_xyz, torch.cat(new_features_list, dim=1)\n\n\nclass PointnetSAModuleMSG(_PointnetSAModuleBase):\n    \"\"\"Pointnet set abstraction layer with multiscale grouping\"\"\"\n\n    def __init__(self, *, npoint: int, radii: List[float], nsamples: List[int], mlps: List[List[int]], bn: bool = True,\n                 use_xyz: bool = True, pool_method='max_pool'):\n        \"\"\"\n        :param npoint: int\n        :param radii: list of float, list of radii to group with\n        :param nsamples: list of int, number of samples in each ball query\n        :param mlps: list of list of int, spec of the pointnet before the global pooling for each scale\n        :param bn: whether to use batchnorm\n        :param use_xyz:\n        :param pool_method: max_pool / avg_pool\n        \"\"\"\n        super().__init__()\n\n        assert len(radii) == len(nsamples) == len(mlps)\n\n        self.npoint = npoint\n        self.groupers = nn.ModuleList()\n        self.mlps = nn.ModuleList()\n        for i in range(len(radii)):\n            radius = radii[i]\n            nsample = nsamples[i]\n            self.groupers.append(\n                pointnet2_utils.QueryAndGroup(radius, nsample, use_xyz=use_xyz)\n                if npoint is not None else pointnet2_utils.GroupAll(use_xyz)\n            )\n            mlp_spec = mlps[i]\n            if use_xyz:\n                mlp_spec[0] += 3\n\n            shared_mlps = []\n            for k in range(len(mlp_spec) - 1):\n                shared_mlps.extend([\n                    nn.Conv2d(mlp_spec[k], mlp_spec[k + 1], kernel_size=1, bias=False),\n                    nn.BatchNorm2d(mlp_spec[k + 1]),\n                    nn.ReLU()\n                ])\n            self.mlps.append(nn.Sequential(*shared_mlps))\n\n        self.pool_method = pool_method\n\n\nclass PointnetSAModule(PointnetSAModuleMSG):\n    
\"\"\"Pointnet set abstraction layer\"\"\"\n\n    def __init__(self, *, mlp: List[int], npoint: int = None, radius: float = None, nsample: int = None,\n                 bn: bool = True, use_xyz: bool = True, pool_method='max_pool'):\n        \"\"\"\n        :param mlp: list of int, spec of the pointnet before the global max_pool\n        :param npoint: int, number of features\n        :param radius: float, radius of ball\n        :param nsample: int, number of samples in the ball query\n        :param bn: whether to use batchnorm\n        :param use_xyz:\n        :param pool_method: max_pool / avg_pool\n        \"\"\"\n        super().__init__(\n            mlps=[mlp], npoint=npoint, radii=[radius], nsamples=[nsample], bn=bn, use_xyz=use_xyz,\n            pool_method=pool_method\n        )\n\n\nclass PointnetFPModule(nn.Module):\n    r\"\"\"Propagates the features of one set to another\"\"\"\n\n    def __init__(self, *, mlp: List[int], bn: bool = True):\n        \"\"\"\n        :param mlp: list of int\n        :param bn: whether to use batchnorm\n        \"\"\"\n        super().__init__()\n\n        shared_mlps = []\n        for k in range(len(mlp) - 1):\n            shared_mlps.extend([\n                nn.Conv2d(mlp[k], mlp[k + 1], kernel_size=1, bias=False),\n                nn.BatchNorm2d(mlp[k + 1]),\n                nn.ReLU()\n            ])\n        self.mlp = nn.Sequential(*shared_mlps)\n\n    def forward(\n            self, unknown: torch.Tensor, known: torch.Tensor, unknow_feats: torch.Tensor, known_feats: torch.Tensor\n    ) -> torch.Tensor:\n        \"\"\"\n        :param unknown: (B, n, 3) tensor of the xyz positions of the unknown features\n        :param known: (B, m, 3) tensor of the xyz positions of the known features\n        :param unknow_feats: (B, C1, n) tensor of the features to be propagated to\n        :param known_feats: (B, C2, m) tensor of features to be propagated\n        :return:\n            new_features: (B, mlp[-1], n) tensor of the 
features of the unknown features\n        \"\"\"\n        if known is not None:\n            dist, idx = pointnet2_utils.three_nn(unknown, known)\n            dist_recip = 1.0 / (dist + 1e-8)\n            norm = torch.sum(dist_recip, dim=2, keepdim=True)\n            weight = dist_recip / norm\n\n            interpolated_feats = pointnet2_utils.three_interpolate(known_feats, idx, weight)\n        else:\n            interpolated_feats = known_feats.expand(*known_feats.size()[0:2], unknown.size(1))\n\n        if unknow_feats is not None:\n            new_features = torch.cat([interpolated_feats, unknow_feats], dim=1)  # (B, C2 + C1, n)\n        else:\n            new_features = interpolated_feats\n\n        new_features = new_features.unsqueeze(-1)\n        new_features = self.mlp(new_features)\n\n        return new_features.squeeze(-1)\n\n\nif __name__ == \"__main__\":\n    pass\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/pointnet2_utils.py",
    "content": "from typing import Tuple\n\nimport torch\nimport torch.nn as nn\nfrom torch.autograd import Function, Variable\n\nfrom . import pointnet2_batch_cuda as pointnet2\n\n\nclass FurthestPointSampling(Function):\n    @staticmethod\n    def forward(ctx, xyz: torch.Tensor, npoint: int) -> torch.Tensor:\n        \"\"\"\n        Uses iterative furthest point sampling to select a set of npoint features that have the largest\n        minimum distance\n        :param ctx:\n        :param xyz: (B, N, 3) where N > npoint\n        :param npoint: int, number of features in the sampled set\n        :return:\n             output: (B, npoint) tensor containing the set\n        \"\"\"\n        assert xyz.is_contiguous()\n\n        B, N, _ = xyz.size()\n        output = torch.cuda.IntTensor(B, npoint)\n        temp = torch.cuda.FloatTensor(B, N).fill_(1e10)\n\n        pointnet2.furthest_point_sampling_wrapper(B, N, npoint, xyz, temp, output)\n        return output\n\n    @staticmethod\n    def backward(xyz, a=None):\n        return None, None\n\n\nfurthest_point_sample = FurthestPointSampling.apply\n\n\nclass GatherOperation(Function):\n\n    @staticmethod\n    def forward(ctx, features: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:\n        \"\"\"\n        :param ctx:\n        :param features: (B, C, N)\n        :param idx: (B, npoint) index tensor of the features to gather\n        :return:\n            output: (B, C, npoint)\n        \"\"\"\n        assert features.is_contiguous()\n        assert idx.is_contiguous()\n\n        B, npoint = idx.size()\n        _, C, N = features.size()\n        output = torch.cuda.FloatTensor(B, C, npoint)\n\n        pointnet2.gather_points_wrapper(B, C, N, npoint, features, idx, output)\n\n        ctx.for_backwards = (idx, C, N)\n        return output\n\n    @staticmethod\n    def backward(ctx, grad_out):\n        idx, C, N = ctx.for_backwards\n        B, npoint = idx.size()\n\n        grad_features = 
Variable(torch.cuda.FloatTensor(B, C, N).zero_())\n        grad_out_data = grad_out.data.contiguous()\n        pointnet2.gather_points_grad_wrapper(B, C, N, npoint, grad_out_data, idx, grad_features.data)\n        return grad_features, None\n\n\ngather_operation = GatherOperation.apply\n\n\nclass ThreeNN(Function):\n\n    @staticmethod\n    def forward(ctx, unknown: torch.Tensor, known: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:\n        \"\"\"\n        Find the three nearest neighbors of unknown in known\n        :param ctx:\n        :param unknown: (B, N, 3)\n        :param known: (B, M, 3)\n        :return:\n            dist: (B, N, 3) l2 distance to the three nearest neighbors\n            idx: (B, N, 3) index of 3 nearest neighbors\n        \"\"\"\n        assert unknown.is_contiguous()\n        assert known.is_contiguous()\n\n        B, N, _ = unknown.size()\n        m = known.size(1)\n        dist2 = torch.cuda.FloatTensor(B, N, 3)\n        idx = torch.cuda.IntTensor(B, N, 3)\n\n        pointnet2.three_nn_wrapper(B, N, m, unknown, known, dist2, idx)\n        return torch.sqrt(dist2), idx\n\n    @staticmethod\n    def backward(ctx, a=None, b=None):\n        return None, None\n\n\nthree_nn = ThreeNN.apply\n\n\nclass ThreeInterpolate(Function):\n\n    @staticmethod\n    def forward(ctx, features: torch.Tensor, idx: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:\n        \"\"\"\n        Performs weight linear interpolation on 3 features\n        :param ctx:\n        :param features: (B, C, M) Features descriptors to be interpolated from\n        :param idx: (B, n, 3) three nearest neighbors of the target features in features\n        :param weight: (B, n, 3) weights\n        :return:\n            output: (B, C, N) tensor of the interpolated features\n        \"\"\"\n        assert features.is_contiguous()\n        assert idx.is_contiguous()\n        assert weight.is_contiguous()\n\n        B, c, m = features.size()\n        n = idx.size(1)\n       
 ctx.three_interpolate_for_backward = (idx, weight, m)\n        output = torch.cuda.FloatTensor(B, c, n)\n\n        pointnet2.three_interpolate_wrapper(B, c, m, n, features, idx, weight, output)\n        return output\n\n    @staticmethod\n    def backward(ctx, grad_out: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:\n        \"\"\"\n        :param ctx:\n        :param grad_out: (B, C, N) tensor with gradients of outputs\n        :return:\n            grad_features: (B, C, M) tensor with gradients of features\n            None:\n            None:\n        \"\"\"\n        idx, weight, m = ctx.three_interpolate_for_backward\n        B, c, n = grad_out.size()\n\n        grad_features = Variable(torch.cuda.FloatTensor(B, c, m).zero_())\n        grad_out_data = grad_out.data.contiguous()\n\n        pointnet2.three_interpolate_grad_wrapper(B, c, n, m, grad_out_data, idx, weight, grad_features.data)\n        return grad_features, None, None\n\n\nthree_interpolate = ThreeInterpolate.apply\n\n\nclass GroupingOperation(Function):\n\n    @staticmethod\n    def forward(ctx, features: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:\n        \"\"\"\n        :param ctx:\n        :param features: (B, C, N) tensor of features to group\n        :param idx: (B, npoint, nsample) tensor containing the indicies of features to group with\n        :return:\n            output: (B, C, npoint, nsample) tensor\n        \"\"\"\n        assert features.is_contiguous()\n        assert idx.is_contiguous()\n\n        B, nfeatures, nsample = idx.size()\n        _, C, N = features.size()\n        output = torch.cuda.FloatTensor(B, C, nfeatures, nsample)\n\n        pointnet2.group_points_wrapper(B, C, N, nfeatures, nsample, features, idx, output)\n\n        ctx.for_backwards = (idx, N)\n        return output\n\n    @staticmethod\n    def backward(ctx, grad_out: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:\n        \"\"\"\n        :param ctx:\n        :param grad_out: (B, 
C, npoint, nsample) tensor of the gradients of the output from forward\n        :return:\n            grad_features: (B, C, N) gradient of the features\n        \"\"\"\n        idx, N = ctx.for_backwards\n\n        B, C, npoint, nsample = grad_out.size()\n        grad_features = Variable(torch.cuda.FloatTensor(B, C, N).zero_())\n\n        grad_out_data = grad_out.data.contiguous()\n        pointnet2.group_points_grad_wrapper(B, C, N, npoint, nsample, grad_out_data, idx, grad_features.data)\n        return grad_features, None\n\n\ngrouping_operation = GroupingOperation.apply\n\n\nclass BallQuery(Function):\n\n    @staticmethod\n    def forward(ctx, radius: float, nsample: int, xyz: torch.Tensor, new_xyz: torch.Tensor) -> torch.Tensor:\n        \"\"\"\n        :param ctx:\n        :param radius: float, radius of the balls\n        :param nsample: int, maximum number of features in the balls\n        :param xyz: (B, N, 3) xyz coordinates of the features\n        :param new_xyz: (B, npoint, 3) centers of the ball query\n        :return:\n            idx: (B, npoint, nsample) tensor with the indices of the features that form the query balls\n        \"\"\"\n        assert new_xyz.is_contiguous()\n        assert xyz.is_contiguous()\n\n        B, N, _ = xyz.size()\n        npoint = new_xyz.size(1)\n        idx = torch.cuda.IntTensor(B, npoint, nsample).zero_()\n\n        pointnet2.ball_query_wrapper(B, N, npoint, radius, nsample, new_xyz, xyz, idx)\n        return idx\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None\n\n\nball_query = BallQuery.apply\n\n\nclass QueryAndGroup(nn.Module):\n    def __init__(self, radius: float, nsample: int, use_xyz: bool = True):\n        \"\"\"\n        :param radius: float, radius of ball\n        :param nsample: int, maximum number of features to gather in the ball\n        :param use_xyz:\n        \"\"\"\n        super().__init__()\n        self.radius, self.nsample, self.use_xyz = radius, 
nsample, use_xyz\n\n    def forward(self, xyz: torch.Tensor, new_xyz: torch.Tensor, features: torch.Tensor = None) -> torch.Tensor:\n        \"\"\"\n        :param xyz: (B, N, 3) xyz coordinates of the features\n        :param new_xyz: (B, npoint, 3) centroids\n        :param features: (B, C, N) descriptors of the features\n        :return:\n            new_features: (B, 3 + C, npoint, nsample)\n        \"\"\"\n        idx = ball_query(self.radius, self.nsample, xyz, new_xyz)\n        xyz_trans = xyz.transpose(1, 2).contiguous()\n        grouped_xyz = grouping_operation(xyz_trans, idx)  # (B, 3, npoint, nsample)\n        grouped_xyz -= new_xyz.transpose(1, 2).unsqueeze(-1)\n\n        if features is not None:\n            grouped_features = grouping_operation(features, idx)\n            if self.use_xyz:\n                new_features = torch.cat([grouped_xyz, grouped_features], dim=1)  # (B, 3 + C, npoint, nsample)\n            else:\n                new_features = grouped_features\n        else:\n            assert self.use_xyz, \"Cannot have no features and not use xyz as a feature!\"\n            new_features = grouped_xyz\n\n        return new_features\n\n\nclass GroupAll(nn.Module):\n    def __init__(self, use_xyz: bool = True):\n        super().__init__()\n        self.use_xyz = use_xyz\n\n    def forward(self, xyz: torch.Tensor, new_xyz: torch.Tensor, features: torch.Tensor = None):\n        \"\"\"\n        :param xyz: (B, N, 3) xyz coordinates of the features\n        :param new_xyz: ignored\n        :param features: (B, C, N) descriptors of the features\n        :return:\n            new_features: (B, 3 + C, 1, N)\n        \"\"\"\n        grouped_xyz = xyz.transpose(1, 2).unsqueeze(2)\n        if features is not None:\n            grouped_features = features.unsqueeze(2)\n            if self.use_xyz:\n                new_features = torch.cat([grouped_xyz, grouped_features], dim=1)  # (B, 3 + C, 1, N)\n            else:\n                new_features = 
grouped_features\n        else:\n            new_features = grouped_xyz\n\n        return new_features\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/ball_query.cpp",
    "content": "/*\nbatch version of ball query, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"ball_query_gpu.h\"\n\nextern THCState *state;\n\n#define CHECK_CUDA(x) do { \\\n\t  if (!x.type().is_cuda()) { \\\n\t\t      fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n\t\t      exit(-1); \\\n\t\t    } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n\t  if (!x.is_contiguous()) { \\\n\t\t      fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n\t\t      exit(-1); \\\n\t\t    } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nint ball_query_wrapper_fast(int b, int n, int m, float radius, int nsample, \n    at::Tensor new_xyz_tensor, at::Tensor xyz_tensor, at::Tensor idx_tensor) {\n    CHECK_INPUT(new_xyz_tensor);\n    CHECK_INPUT(xyz_tensor);\n    const float *new_xyz = new_xyz_tensor.data<float>();\n    const float *xyz = xyz_tensor.data<float>();\n    int *idx = idx_tensor.data<int>();\n    \n    ball_query_kernel_launcher_fast(b, n, m, radius, nsample, new_xyz, xyz, idx);\n    return 1;\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/ball_query_gpu.cu",
    "content": "/*\nbatch version of ball query, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"ball_query_gpu.h\"\n#include \"cuda_utils.h\"\n\n\n__global__ void ball_query_kernel_fast(int b, int n, int m, float radius, int nsample, \n    const float *__restrict__ new_xyz, const float *__restrict__ xyz, int *__restrict__ idx) {\n    // new_xyz: (B, M, 3)\n    // xyz: (B, N, 3)\n    // output:\n    //      idx: (B, M, nsample)\n    int bs_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (bs_idx >= b || pt_idx >= m) return;\n\n    new_xyz += bs_idx * m * 3 + pt_idx * 3;\n    xyz += bs_idx * n * 3;\n    idx += bs_idx * m * nsample + pt_idx * nsample;\n\n    float radius2 = radius * radius;\n    float new_x = new_xyz[0];\n    float new_y = new_xyz[1];\n    float new_z = new_xyz[2];\n\n    int cnt = 0;\n    for (int k = 0; k < n; ++k) {\n        float x = xyz[k * 3 + 0];\n        float y = xyz[k * 3 + 1];\n        float z = xyz[k * 3 + 2];\n        float d2 = (new_x - x) * (new_x - x) + (new_y - y) * (new_y - y) + (new_z - z) * (new_z - z);\n        if (d2 < radius2){\n            if (cnt == 0){\n                for (int l = 0; l < nsample; ++l) {\n                    idx[l] = k;\n                }\n            }\n            idx[cnt] = k;\n            ++cnt;\n            if (cnt >= nsample) break;\n        }\n    }\n}\n\n\nvoid ball_query_kernel_launcher_fast(int b, int n, int m, float radius, int nsample, \\\n    const float *new_xyz, const float *xyz, int *idx) {\n    // new_xyz: (B, M, 3)\n    // xyz: (B, N, 3)\n    // output:\n    //      idx: (B, M, nsample)\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(m, THREADS_PER_BLOCK), b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    ball_query_kernel_fast<<<blocks, threads>>>(b, n, m, radius, 
nsample, new_xyz, xyz, idx);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/ball_query_gpu.h",
    "content": "#ifndef _BALL_QUERY_GPU_H\n#define _BALL_QUERY_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\nint ball_query_wrapper_fast(int b, int n, int m, float radius, int nsample, \n\tat::Tensor new_xyz_tensor, at::Tensor xyz_tensor, at::Tensor idx_tensor);\n\nvoid ball_query_kernel_launcher_fast(int b, int n, int m, float radius, int nsample, \n\tconst float *xyz, const float *new_xyz, int *idx);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/cuda_utils.h",
    "content": "#ifndef _CUDA_UTILS_H\n#define _CUDA_UTILS_H\n\n#include <cmath>\n\n#define TOTAL_THREADS 1024\n#define THREADS_PER_BLOCK 256\n#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))\n\ninline int opt_n_threads(int work_size) {\n    const int pow_2 = std::log(static_cast<double>(work_size)) / std::log(2.0);\n\n    return max(min(1 << pow_2, TOTAL_THREADS), 1);\n}\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/group_points.cpp",
    "content": "/*\nbatch version of point grouping, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include <vector>\n#include <THC/THC.h>\n#include \"group_points_gpu.h\"\n\nextern THCState *state;\n\n\nint group_points_grad_wrapper_fast(int b, int c, int n, int npoints, int nsample, \n    at::Tensor grad_out_tensor, at::Tensor idx_tensor, at::Tensor grad_points_tensor) {\n\n    float *grad_points = grad_points_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    const float *grad_out = grad_out_tensor.data<float>();\n\n    group_points_grad_kernel_launcher_fast(b, c, n, npoints, nsample, grad_out, idx, grad_points);\n    return 1;\n}\n\n\nint group_points_wrapper_fast(int b, int c, int n, int npoints, int nsample, \n    at::Tensor points_tensor, at::Tensor idx_tensor, at::Tensor out_tensor) {\n\n    const float *points = points_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    float *out = out_tensor.data<float>();\n\n    group_points_kernel_launcher_fast(b, c, n, npoints, nsample, points, idx, out);\n    return 1;\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/group_points_gpu.cu",
    "content": "/*\nbatch version of point grouping, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"cuda_utils.h\"\n#include \"group_points_gpu.h\"\n\n\n__global__ void group_points_grad_kernel_fast(int b, int c, int n, int npoints, int nsample, \n    const float *__restrict__ grad_out, const int *__restrict__ idx, float *__restrict__ grad_points) {\n    // grad_out: (B, C, npoints, nsample)\n    // idx: (B, npoints, nsample)\n    // output:\n    //      grad_points: (B, C, N)\n    int bs_idx = blockIdx.z;\n    int c_idx = blockIdx.y;\n    int index = blockIdx.x * blockDim.x + threadIdx.x;\n    int pt_idx = index / nsample;\n    if (bs_idx >= b || c_idx >= c || pt_idx >= npoints) return;\n\n    int sample_idx = index % nsample;\n    grad_out += bs_idx * c * npoints * nsample + c_idx * npoints * nsample + pt_idx * nsample + sample_idx;\n    idx += bs_idx * npoints * nsample + pt_idx * nsample + sample_idx; \n    \n    atomicAdd(grad_points + bs_idx * c * n + c_idx * n + idx[0] , grad_out[0]);\n}\n\nvoid group_points_grad_kernel_launcher_fast(int b, int c, int n, int npoints, int nsample, \n    const float *grad_out, const int *idx, float *grad_points) {\n    // grad_out: (B, C, npoints, nsample)\n    // idx: (B, npoints, nsample)\n    // output:\n    //      grad_points: (B, C, N)\n    cudaError_t err;\n    dim3 blocks(DIVUP(npoints * nsample, THREADS_PER_BLOCK), c, b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    group_points_grad_kernel_fast<<<blocks, threads>>>(b, c, n, npoints, nsample, grad_out, idx, grad_points);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n\n__global__ void group_points_kernel_fast(int b, int c, int n, int npoints, int nsample, \n    
const float *__restrict__ points, const int *__restrict__ idx, float *__restrict__ out) {\n    // points: (B, C, N)\n    // idx: (B, npoints, nsample)\n    // output:\n    //      out: (B, C, npoints, nsample)\n    int bs_idx = blockIdx.z;\n    int c_idx = blockIdx.y;\n    int index = blockIdx.x * blockDim.x + threadIdx.x;\n    int pt_idx = index / nsample;\n    if (bs_idx >= b || c_idx >= c || pt_idx >= npoints) return;\n\n    int sample_idx = index % nsample;\n\n    idx += bs_idx * npoints * nsample + pt_idx * nsample + sample_idx; \n    int in_idx = bs_idx * c * n + c_idx * n + idx[0];\n    int out_idx = bs_idx * c * npoints * nsample + c_idx * npoints * nsample + pt_idx * nsample + sample_idx;\n\n    out[out_idx] = points[in_idx];\n}\n\n\nvoid group_points_kernel_launcher_fast(int b, int c, int n, int npoints, int nsample, \n    const float *points, const int *idx, float *out) {\n    // points: (B, C, N)\n    // idx: (B, npoints, nsample)\n    // output:\n    //      out: (B, C, npoints, nsample)\n    cudaError_t err;\n    dim3 blocks(DIVUP(npoints * nsample, THREADS_PER_BLOCK), c, b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    group_points_kernel_fast<<<blocks, threads>>>(b, c, n, npoints, nsample, points, idx, out);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/group_points_gpu.h",
    "content": "#ifndef _GROUP_POINTS_GPU_H\n#define _GROUP_POINTS_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include <vector>\n\n\nint group_points_wrapper_fast(int b, int c, int n, int npoints, int nsample, \n    at::Tensor points_tensor, at::Tensor idx_tensor, at::Tensor out_tensor);\n\nvoid group_points_kernel_launcher_fast(int b, int c, int n, int npoints, int nsample, \n    const float *points, const int *idx, float *out);\n\nint group_points_grad_wrapper_fast(int b, int c, int n, int npoints, int nsample, \n    at::Tensor grad_out_tensor, at::Tensor idx_tensor, at::Tensor grad_points_tensor);\n\nvoid group_points_grad_kernel_launcher_fast(int b, int c, int n, int npoints, int nsample, \n    const float *grad_out, const int *idx, float *grad_points);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/interpolate.cpp",
    "content": "/*\nbatch version of point interpolation, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"interpolate_gpu.h\"\n\nextern THCState *state;\n\n\nvoid three_nn_wrapper_fast(int b, int n, int m, at::Tensor unknown_tensor, \n    at::Tensor known_tensor, at::Tensor dist2_tensor, at::Tensor idx_tensor) {\n    const float *unknown = unknown_tensor.data<float>();\n    const float *known = known_tensor.data<float>();\n    float *dist2 = dist2_tensor.data<float>();\n    int *idx = idx_tensor.data<int>();\n\n    three_nn_kernel_launcher_fast(b, n, m, unknown, known, dist2, idx);\n}\n\n\nvoid three_interpolate_wrapper_fast(int b, int c, int m, int n,\n                         at::Tensor points_tensor,\n                         at::Tensor idx_tensor,\n                         at::Tensor weight_tensor,\n                         at::Tensor out_tensor) {\n\n    const float *points = points_tensor.data<float>();\n    const float *weight = weight_tensor.data<float>();\n    float *out = out_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n\n    three_interpolate_kernel_launcher_fast(b, c, m, n, points, idx, weight, out);\n}\n\nvoid three_interpolate_grad_wrapper_fast(int b, int c, int n, int m,\n                            at::Tensor grad_out_tensor,\n                            at::Tensor idx_tensor,\n                            at::Tensor weight_tensor,\n                            at::Tensor grad_points_tensor) {\n\n    const float *grad_out = grad_out_tensor.data<float>();\n    const float *weight = weight_tensor.data<float>();\n    float *grad_points = grad_points_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n\n    
three_interpolate_grad_kernel_launcher_fast(b, c, n, m, grad_out, idx, weight, grad_points);\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/interpolate_gpu.cu",
    "content": "/*\nbatch version of point interpolation, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"cuda_utils.h\"\n#include \"interpolate_gpu.h\"\n\n\n__global__ void three_nn_kernel_fast(int b, int n, int m, const float *__restrict__ unknown, \n    const float *__restrict__ known, float *__restrict__ dist2, int *__restrict__ idx) {\n    // unknown: (B, N, 3)\n    // known: (B, M, 3)\n    // output: \n    //      dist2: (B, N, 3)\n    //      idx: (B, N, 3)\n    \n    int bs_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (bs_idx >= b || pt_idx >= n) return;\n\n    unknown += bs_idx * n * 3 + pt_idx * 3;\n    known += bs_idx * m * 3;\n    dist2 += bs_idx * n * 3 + pt_idx * 3;\n    idx += bs_idx * n * 3 + pt_idx * 3;\n\n    float ux = unknown[0];\n    float uy = unknown[1];\n    float uz = unknown[2];\n\n    double best1 = 1e40, best2 = 1e40, best3 = 1e40;\n    int besti1 = 0, besti2 = 0, besti3 = 0;\n    for (int k = 0; k < m; ++k) {\n        float x = known[k * 3 + 0];\n        float y = known[k * 3 + 1];\n        float z = known[k * 3 + 2];\n        float d = (ux - x) * (ux - x) + (uy - y) * (uy - y) + (uz - z) * (uz - z);\n        if (d < best1) {\n            best3 = best2; besti3 = besti2;\n            best2 = best1; besti2 = besti1;\n            best1 = d; besti1 = k;\n        } \n        else if (d < best2) {\n            best3 = best2; besti3 = besti2;\n            best2 = d; besti2 = k;\n        } \n        else if (d < best3) {\n            best3 = d; besti3 = k;\n        }\n    }\n    dist2[0] = best1; dist2[1] = best2; dist2[2] = best3;\n    idx[0] = besti1; idx[1] = besti2; idx[2] = besti3;\n}\n\n\nvoid three_nn_kernel_launcher_fast(int b, int n, int m, const float *unknown, \n    const float *known, float *dist2, int *idx) {\n    // unknown: (B, N, 3)\n  
  // known: (B, M, 3)\n    // output: \n    //      dist2: (B, N, 3)\n    //      idx: (B, N, 3)\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(n, THREADS_PER_BLOCK), b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    three_nn_kernel_fast<<<blocks, threads>>>(b, n, m, unknown, known, dist2, idx);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n\n__global__ void three_interpolate_kernel_fast(int b, int c, int m, int n, const float *__restrict__ points, \n    const int *__restrict__ idx, const float *__restrict__ weight, float *__restrict__ out) {\n    // points: (B, C, M)\n    // idx: (B, N, 3)\n    // weight: (B, N, 3)\n    // output:\n    //      out: (B, C, N)\n\n    int bs_idx = blockIdx.z;\n    int c_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n\n    if (bs_idx >= b || c_idx >= c || pt_idx >= n) return;\n\n    weight += bs_idx * n * 3 + pt_idx * 3;\n    points += bs_idx * c * m + c_idx * m;\n    idx += bs_idx * n * 3 + pt_idx * 3;\n    out += bs_idx * c * n + c_idx * n;\n\n    out[pt_idx] = weight[0] * points[idx[0]] + weight[1] * points[idx[1]] + weight[2] * points[idx[2]];\n}\n\nvoid three_interpolate_kernel_launcher_fast(int b, int c, int m, int n, \n    const float *points, const int *idx, const float *weight, float *out) {\n    // points: (B, C, M)\n    // idx: (B, N, 3)\n    // weight: (B, N, 3)\n    // output:\n    //      out: (B, C, N)\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(n, THREADS_PER_BLOCK), c, b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n    three_interpolate_kernel_fast<<<blocks, threads>>>(b, c, m, n, points, idx, weight, out);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    
}\n}\n\n\n__global__ void three_interpolate_grad_kernel_fast(int b, int c, int n, int m, const float *__restrict__ grad_out, \n    const int *__restrict__ idx, const float *__restrict__ weight, float *__restrict__ grad_points) {\n    // grad_out: (B, C, N)\n    // weight: (B, N, 3)\n    // output:\n    //      grad_points: (B, C, M)\n\n    int bs_idx = blockIdx.z;\n    int c_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n\n    if (bs_idx >= b || c_idx >= c || pt_idx >= n) return;\n    \n    grad_out += bs_idx * c * n + c_idx * n + pt_idx;\n    weight += bs_idx * n * 3 + pt_idx * 3;\n    grad_points += bs_idx * c * m + c_idx * m;\n    idx += bs_idx * n * 3 + pt_idx * 3;\n\n\n    atomicAdd(grad_points + idx[0], grad_out[0] * weight[0]);\n    atomicAdd(grad_points + idx[1], grad_out[0] * weight[1]);\n    atomicAdd(grad_points + idx[2], grad_out[0] * weight[2]);\n}\n\nvoid three_interpolate_grad_kernel_launcher_fast(int b, int c, int n, int m, const float *grad_out, \n    const int *idx, const float *weight, float *grad_points) {\n    // grad_out: (B, C, N)\n    // weight: (B, N, 3)\n    // output:\n    //      grad_points: (B, C, M)\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(n, THREADS_PER_BLOCK), c, b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n    three_interpolate_grad_kernel_fast<<<blocks, threads>>>(b, c, n, m, grad_out, idx, weight, grad_points);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/interpolate_gpu.h",
    "content": "#ifndef _INTERPOLATE_GPU_H\n#define _INTERPOLATE_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include<vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\n\nvoid three_nn_wrapper_fast(int b, int n, int m, at::Tensor unknown_tensor, \n  at::Tensor known_tensor, at::Tensor dist2_tensor, at::Tensor idx_tensor);\n\nvoid three_nn_kernel_launcher_fast(int b, int n, int m, const float *unknown,\n\tconst float *known, float *dist2, int *idx);\n\n\nvoid three_interpolate_wrapper_fast(int b, int c, int m, int n, at::Tensor points_tensor, \n    at::Tensor idx_tensor, at::Tensor weight_tensor, at::Tensor out_tensor);\n\nvoid three_interpolate_kernel_launcher_fast(int b, int c, int m, int n, \n    const float *points, const int *idx, const float *weight, float *out);\n\n\nvoid three_interpolate_grad_wrapper_fast(int b, int c, int n, int m, at::Tensor grad_out_tensor, \n    at::Tensor idx_tensor, at::Tensor weight_tensor, at::Tensor grad_points_tensor);\n\nvoid three_interpolate_grad_kernel_launcher_fast(int b, int c, int n, int m, const float *grad_out, \n    const int *idx, const float *weight, float *grad_points);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/pointnet2_api.cpp",
    "content": "#include <torch/serialize/tensor.h>\n#include <torch/extension.h>\n\n#include \"ball_query_gpu.h\"\n#include \"group_points_gpu.h\"\n#include \"sampling_gpu.h\"\n#include \"interpolate_gpu.h\"\n\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n    m.def(\"ball_query_wrapper\", &ball_query_wrapper_fast, \"ball_query_wrapper_fast\");\n\n    m.def(\"group_points_wrapper\", &group_points_wrapper_fast, \"group_points_wrapper_fast\");\n    m.def(\"group_points_grad_wrapper\", &group_points_grad_wrapper_fast, \"group_points_grad_wrapper_fast\");\n\n    m.def(\"gather_points_wrapper\", &gather_points_wrapper_fast, \"gather_points_wrapper_fast\");\n    m.def(\"gather_points_grad_wrapper\", &gather_points_grad_wrapper_fast, \"gather_points_grad_wrapper_fast\");\n\n    m.def(\"furthest_point_sampling_wrapper\", &furthest_point_sampling_wrapper, \"furthest_point_sampling_wrapper\");\n    \n    m.def(\"three_nn_wrapper\", &three_nn_wrapper_fast, \"three_nn_wrapper_fast\");\n    m.def(\"three_interpolate_wrapper\", &three_interpolate_wrapper_fast, \"three_interpolate_wrapper_fast\");\n    m.def(\"three_interpolate_grad_wrapper\", &three_interpolate_grad_wrapper_fast, \"three_interpolate_grad_wrapper_fast\");\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/sampling.cpp",
    "content": "/*\nbatch version of point sampling and gathering, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <ATen/cuda/CUDAContext.h>\n#include <vector>\n#include <THC/THC.h>\n\n#include \"sampling_gpu.h\"\n\nextern THCState *state;\n\n\nint gather_points_wrapper_fast(int b, int c, int n, int npoints, \n    at::Tensor points_tensor, at::Tensor idx_tensor, at::Tensor out_tensor){\n    const float *points = points_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    float *out = out_tensor.data<float>();\n\n    gather_points_kernel_launcher_fast(b, c, n, npoints, points, idx, out);\n    return 1;\n}\n\n\nint gather_points_grad_wrapper_fast(int b, int c, int n, int npoints, \n    at::Tensor grad_out_tensor, at::Tensor idx_tensor, at::Tensor grad_points_tensor) {\n\n    const float *grad_out = grad_out_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    float *grad_points = grad_points_tensor.data<float>();\n\n    gather_points_grad_kernel_launcher_fast(b, c, n, npoints, grad_out, idx, grad_points);\n    return 1;\n}\n\n\nint furthest_point_sampling_wrapper(int b, int n, int m, \n    at::Tensor points_tensor, at::Tensor temp_tensor, at::Tensor idx_tensor) {\n\n    const float *points = points_tensor.data<float>();\n    float *temp = temp_tensor.data<float>();\n    int *idx = idx_tensor.data<int>();\n\n    furthest_point_sampling_kernel_launcher(b, n, m, points, temp, idx);\n    return 1;\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/sampling_gpu.cu",
    "content": "/*\nbatch version of point sampling and gathering, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"cuda_utils.h\"\n#include \"sampling_gpu.h\"\n\n\n__global__ void gather_points_kernel_fast(int b, int c, int n, int m, \n    const float *__restrict__ points, const int *__restrict__ idx, float *__restrict__ out) {\n    // points: (B, C, N)\n    // idx: (B, M)\n    // output:\n    //      out: (B, C, M)\n\n    int bs_idx = blockIdx.z;\n    int c_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (bs_idx >= b || c_idx >= c || pt_idx >= m) return;\n\n    out += bs_idx * c * m + c_idx * m + pt_idx;\n    idx += bs_idx * m + pt_idx;\n    points += bs_idx * c * n + c_idx * n;\n    out[0] = points[idx[0]];\n}\n\nvoid gather_points_kernel_launcher_fast(int b, int c, int n, int npoints, \n    const float *points, const int *idx, float *out) {\n    // points: (B, C, N)\n    // idx: (B, npoints)\n    // output:\n    //      out: (B, C, npoints)\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(npoints, THREADS_PER_BLOCK), c, b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    gather_points_kernel_fast<<<blocks, threads>>>(b, c, n, npoints, points, idx, out);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void gather_points_grad_kernel_fast(int b, int c, int n, int m, const float *__restrict__ grad_out, \n    const int *__restrict__ idx, float *__restrict__ grad_points) {\n    // grad_out: (B, C, M)\n    // idx: (B, M)\n    // output:\n    //      grad_points: (B, C, N)\n\n    int bs_idx = blockIdx.z;\n    int c_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (bs_idx >= b || c_idx >= c || 
pt_idx >= m) return;\n\n    grad_out += bs_idx * c * m + c_idx * m + pt_idx;\n    idx += bs_idx * m + pt_idx;\n    grad_points += bs_idx * c * n + c_idx * n;\n\n    atomicAdd(grad_points + idx[0], grad_out[0]);\n}\n\nvoid gather_points_grad_kernel_launcher_fast(int b, int c, int n, int npoints, \n    const float *grad_out, const int *idx, float *grad_points) {\n    // grad_out: (B, C, npoints)\n    // idx: (B, npoints)\n    // output:\n    //      grad_points: (B, C, N)\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(npoints, THREADS_PER_BLOCK), c, b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    gather_points_grad_kernel_fast<<<blocks, threads>>>(b, c, n, npoints, grad_out, idx, grad_points);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n\n__device__ void __update(float *__restrict__ dists, int *__restrict__ dists_i, int idx1, int idx2){\n    const float v1 = dists[idx1], v2 = dists[idx2];\n    const int i1 = dists_i[idx1], i2 = dists_i[idx2];\n    dists[idx1] = max(v1, v2);\n    dists_i[idx1] = v2 > v1 ? 
i2 : i1;\n}\n\ntemplate <unsigned int block_size>\n__global__ void furthest_point_sampling_kernel(int b, int n, int m, \n    const float *__restrict__ dataset, float *__restrict__ temp, int *__restrict__ idxs) {\n    // dataset: (B, N, 3)\n    // tmp: (B, N)\n    // output:\n    //      idx: (B, M)\n\n    if (m <= 0) return;\n    __shared__ float dists[block_size];\n    __shared__ int dists_i[block_size];\n\n    int batch_index = blockIdx.x;\n    dataset += batch_index * n * 3;\n    temp += batch_index * n;\n    idxs += batch_index * m;\n\n    int tid = threadIdx.x;\n    const int stride = block_size;\n\n    int old = 0;\n    if (threadIdx.x == 0)\n    idxs[0] = old;\n\n    __syncthreads();\n    for (int j = 1; j < m; j++) {\n    int besti = 0;\n    float best = -1;\n    float x1 = dataset[old * 3 + 0];\n    float y1 = dataset[old * 3 + 1];\n    float z1 = dataset[old * 3 + 2];\n    for (int k = tid; k < n; k += stride) {\n        float x2, y2, z2;\n        x2 = dataset[k * 3 + 0];\n        y2 = dataset[k * 3 + 1];\n        z2 = dataset[k * 3 + 2];\n        // float mag = (x2 * x2) + (y2 * y2) + (z2 * z2);\n        // if (mag <= 1e-3)\n        // continue;\n\n        float d = (x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1) + (z2 - z1) * (z2 - z1);\n        float d2 = min(d, temp[k]);\n        temp[k] = d2;\n        besti = d2 > best ? k : besti;\n        best = d2 > best ? 
d2 : best;\n    }\n    dists[tid] = best;\n    dists_i[tid] = besti;\n    __syncthreads();\n\n    if (block_size >= 1024) {\n        if (tid < 512) {\n            __update(dists, dists_i, tid, tid + 512);\n        }\n        __syncthreads();\n    }\n\n    if (block_size >= 512) {\n        if (tid < 256) {\n            __update(dists, dists_i, tid, tid + 256);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 256) {\n        if (tid < 128) {\n            __update(dists, dists_i, tid, tid + 128);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 128) {\n        if (tid < 64) {\n            __update(dists, dists_i, tid, tid + 64);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 64) {\n        if (tid < 32) {\n            __update(dists, dists_i, tid, tid + 32);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 32) {\n        if (tid < 16) {\n            __update(dists, dists_i, tid, tid + 16);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 16) {\n        if (tid < 8) {\n            __update(dists, dists_i, tid, tid + 8);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 8) {\n        if (tid < 4) {\n            __update(dists, dists_i, tid, tid + 4);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 4) {\n        if (tid < 2) {\n            __update(dists, dists_i, tid, tid + 2);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 2) {\n        if (tid < 1) {\n            __update(dists, dists_i, tid, tid + 1);\n        }\n        __syncthreads();\n    }\n\n    old = dists_i[0];\n    if (tid == 0)\n        idxs[j] = old;\n    }\n}\n\nvoid furthest_point_sampling_kernel_launcher(int b, int n, int m, \n    const float *dataset, float *temp, int *idxs) {\n    // dataset: (B, N, 3)\n    // tmp: (B, N)\n    // output:\n    //      idx: (B, M)\n\n    cudaError_t err;\n    unsigned int n_threads = opt_n_threads(n);\n\n    switch 
(n_threads) {\n        case 1024:\n        furthest_point_sampling_kernel<1024><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 512:\n        furthest_point_sampling_kernel<512><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 256:\n        furthest_point_sampling_kernel<256><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 128:\n        furthest_point_sampling_kernel<128><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 64:\n        furthest_point_sampling_kernel<64><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 32:\n        furthest_point_sampling_kernel<32><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 16:\n        furthest_point_sampling_kernel<16><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 8:\n        furthest_point_sampling_kernel<8><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 4:\n        furthest_point_sampling_kernel<4><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 2:\n        furthest_point_sampling_kernel<2><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 1:\n        furthest_point_sampling_kernel<1><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        default:\n        furthest_point_sampling_kernel<512><<<b, n_threads>>>(b, n, m, dataset, temp, idxs);\n    }\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_batch/src/sampling_gpu.h",
    "content": "#ifndef _SAMPLING_GPU_H\n#define _SAMPLING_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <ATen/cuda/CUDAContext.h>\n#include<vector>\n\n\nint gather_points_wrapper_fast(int b, int c, int n, int npoints, \n    at::Tensor points_tensor, at::Tensor idx_tensor, at::Tensor out_tensor);\n\nvoid gather_points_kernel_launcher_fast(int b, int c, int n, int npoints, \n    const float *points, const int *idx, float *out);\n\n\nint gather_points_grad_wrapper_fast(int b, int c, int n, int npoints, \n    at::Tensor grad_out_tensor, at::Tensor idx_tensor, at::Tensor grad_points_tensor);\n\nvoid gather_points_grad_kernel_launcher_fast(int b, int c, int n, int npoints, \n    const float *grad_out, const int *idx, float *grad_points);\n\n\nint furthest_point_sampling_wrapper(int b, int n, int m, \n    at::Tensor points_tensor, at::Tensor temp_tensor, at::Tensor idx_tensor);\n\nvoid furthest_point_sampling_kernel_launcher(int b, int n, int m, \n    const float *dataset, float *temp, int *idxs);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/pointnet2_modules.py",
    "content": "from typing import List\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom . import pointnet2_utils\n\n\ndef build_local_aggregation_module(input_channels, config):\n    local_aggregation_name = config.get('NAME', 'StackSAModuleMSG')\n\n    if local_aggregation_name == 'StackSAModuleMSG':\n        mlps = config.MLPS\n        for k in range(len(mlps)):\n            mlps[k] = [input_channels] + mlps[k]\n        cur_layer = StackSAModuleMSG(\n            radii=config.POOL_RADIUS, nsamples=config.NSAMPLE, mlps=mlps, use_xyz=True, pool_method='max_pool',\n        )\n        num_c_out = sum([x[-1] for x in mlps])\n    elif local_aggregation_name == 'VectorPoolAggregationModuleMSG':\n        cur_layer = VectorPoolAggregationModuleMSG(input_channels=input_channels, config=config)\n        num_c_out = config.MSG_POST_MLPS[-1]\n    else:\n        raise NotImplementedError\n\n    return cur_layer, num_c_out\n\n\nclass StackSAModuleMSG(nn.Module):\n\n    def __init__(self, *, radii: List[float], nsamples: List[int], mlps: List[List[int]],\n                 use_xyz: bool = True, pool_method='max_pool'):\n        \"\"\"\n        Args:\n            radii: list of float, list of radii to group with\n            nsamples: list of int, number of samples in each ball query\n            mlps: list of list of int, spec of the pointnet before the global pooling for each scale\n            use_xyz:\n            pool_method: max_pool / avg_pool\n        \"\"\"\n        super().__init__()\n\n        assert len(radii) == len(nsamples) == len(mlps)\n\n        self.groupers = nn.ModuleList()\n        self.mlps = nn.ModuleList()\n        for i in range(len(radii)):\n            radius = radii[i]\n            nsample = nsamples[i]\n            self.groupers.append(pointnet2_utils.QueryAndGroup(radius, nsample, use_xyz=use_xyz))\n            mlp_spec = mlps[i]\n            if use_xyz:\n                mlp_spec[0] += 3\n\n            shared_mlps = []\n 
           for k in range(len(mlp_spec) - 1):\n                shared_mlps.extend([\n                    nn.Conv2d(mlp_spec[k], mlp_spec[k + 1], kernel_size=1, bias=False),\n                    nn.BatchNorm2d(mlp_spec[k + 1]),\n                    nn.ReLU()\n                ])\n            self.mlps.append(nn.Sequential(*shared_mlps))\n        self.pool_method = pool_method\n\n        self.init_weights()\n\n    def init_weights(self):\n        for m in self.modules():\n            if isinstance(m, nn.Conv2d):\n                nn.init.kaiming_normal_(m.weight)\n                if m.bias is not None:\n                    nn.init.constant_(m.bias, 0)\n            if isinstance(m, nn.BatchNorm2d):\n                nn.init.constant_(m.weight, 1.0)\n                nn.init.constant_(m.bias, 0)\n\n    def forward(self, xyz, xyz_batch_cnt, new_xyz, new_xyz_batch_cnt, features=None, empty_voxel_set_zeros=True):\n        \"\"\"\n        :param xyz: (N1 + N2 ..., 3) tensor of the xyz coordinates of the features\n        :param xyz_batch_cnt: (batch_size), [N1, N2, ...]\n        :param new_xyz: (M1 + M2 ..., 3)\n        :param new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n        :param features: (N1 + N2 ..., C) tensor of the descriptors of the features\n        :return:\n            new_xyz: (M1 + M2 ..., 3) tensor of the new features' xyz\n            new_features: (M1 + M2 ..., \\sum_k(mlps[k][-1])) tensor of the new_features descriptors\n        \"\"\"\n        new_features_list = []\n        for k in range(len(self.groupers)):\n            new_features, ball_idxs = self.groupers[k](\n                xyz, xyz_batch_cnt, new_xyz, new_xyz_batch_cnt, features\n            )  # (M1 + M2, C, nsample)\n            new_features = new_features.permute(1, 0, 2).unsqueeze(dim=0)  # (1, C, M1 + M2 ..., nsample)\n            new_features = self.mlps[k](new_features)  # (1, C, M1 + M2 ..., nsample)\n\n            if self.pool_method == 'max_pool':\n                new_features = 
F.max_pool2d(\n                    new_features, kernel_size=[1, new_features.size(3)]\n                ).squeeze(dim=-1)  # (1, C, M1 + M2 ...)\n            elif self.pool_method == 'avg_pool':\n                new_features = F.avg_pool2d(\n                    new_features, kernel_size=[1, new_features.size(3)]\n                ).squeeze(dim=-1)  # (1, C, M1 + M2 ...)\n            else:\n                raise NotImplementedError\n            new_features = new_features.squeeze(dim=0).permute(1, 0)  # (M1 + M2 ..., C)\n            new_features_list.append(new_features)\n\n        new_features = torch.cat(new_features_list, dim=1)  # (M1 + M2 ..., C)\n\n        return new_xyz, new_features\n\n\nclass StackPointnetFPModule(nn.Module):\n    def __init__(self, *, mlp: List[int]):\n        \"\"\"\n        Args:\n            mlp: list of int\n        \"\"\"\n        super().__init__()\n        shared_mlps = []\n        for k in range(len(mlp) - 1):\n            shared_mlps.extend([\n                nn.Conv2d(mlp[k], mlp[k + 1], kernel_size=1, bias=False),\n                nn.BatchNorm2d(mlp[k + 1]),\n                nn.ReLU()\n            ])\n        self.mlp = nn.Sequential(*shared_mlps)\n\n    def forward(self, unknown, unknown_batch_cnt, known, known_batch_cnt, unknown_feats=None, known_feats=None):\n        \"\"\"\n        Args:\n            unknown: (N1 + N2 ..., 3)\n            known: (M1 + M2 ..., 3)\n            unknown_feats: (N1 + N2 ..., C1)\n            known_feats: (M1 + M2 ..., C2)\n\n        Returns:\n            new_features: (N1 + N2 ..., C_out)\n        \"\"\"\n        dist, idx = pointnet2_utils.three_nn(unknown, unknown_batch_cnt, known, known_batch_cnt)\n        dist_recip = 1.0 / (dist + 1e-8)\n        norm = torch.sum(dist_recip, dim=-1, keepdim=True)\n        weight = dist_recip / norm\n\n        interpolated_feats = pointnet2_utils.three_interpolate(known_feats, idx, weight)\n\n        if unknown_feats is not None:\n            new_features = 
torch.cat([interpolated_feats, unknown_feats], dim=1)  # (N1 + N2 ..., C2 + C1)\n        else:\n            new_features = interpolated_feats\n        new_features = new_features.permute(1, 0)[None, :, :, None]  # (1, C, N1 + N2 ..., 1)\n        new_features = self.mlp(new_features)\n\n        new_features = new_features.squeeze(dim=0).squeeze(dim=-1).permute(1, 0)  # (N1 + N2 ..., C)\n        return new_features\n\n\nclass VectorPoolLocalInterpolateModule(nn.Module):\n    def __init__(self, mlp, num_voxels, max_neighbour_distance, nsample, neighbor_type, use_xyz=True,\n                 neighbour_distance_multiplier=1.0, xyz_encoding_type='concat'):\n        \"\"\"\n        Args:\n            mlp:\n            num_voxels:\n            max_neighbour_distance:\n            neighbor_type: 1: ball, others: cube\n            nsample: find all (-1), find limited number(>0)\n            use_xyz:\n            neighbour_distance_multiplier:\n            xyz_encoding_type:\n        \"\"\"\n        super().__init__()\n        self.num_voxels = num_voxels  # [num_grid_x, num_grid_y, num_grid_z]: number of grids in each local area centered at new_xyz\n        self.num_total_grids = self.num_voxels[0] * self.num_voxels[1] * self.num_voxels[2]\n        self.max_neighbour_distance = max_neighbour_distance\n        self.neighbor_distance_multiplier = neighbour_distance_multiplier\n        self.nsample = nsample\n        self.neighbor_type = neighbor_type\n        self.use_xyz = use_xyz\n        self.xyz_encoding_type = xyz_encoding_type\n\n        if mlp is not None:\n            if self.use_xyz:\n                mlp[0] += 9 if self.xyz_encoding_type == 'concat' else 0\n            shared_mlps = []\n            for k in range(len(mlp) - 1):\n                shared_mlps.extend([\n                    nn.Conv2d(mlp[k], mlp[k + 1], kernel_size=1, bias=False),\n                    nn.BatchNorm2d(mlp[k + 1]),\n                    nn.ReLU()\n                ])\n            self.mlp = 
nn.Sequential(*shared_mlps)\n        else:\n            self.mlp = None\n\n        self.num_avg_length_of_neighbor_idxs = 1000\n\n    def forward(self, support_xyz, support_features, xyz_batch_cnt, new_xyz, new_xyz_grid_centers, new_xyz_batch_cnt):\n        \"\"\"\n        Args:\n            support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n            support_features: (N1 + N2 ..., C) point-wise features\n            xyz_batch_cnt: (batch_size), [N1, N2, ...]\n            new_xyz: (M1 + M2 ..., 3) centers of the ball query\n            new_xyz_grid_centers: (M1 + M2 ..., num_total_grids, 3) grids centers of each grid\n            new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n        Returns:\n            new_features: (N1 + N2 ..., C_out)\n        \"\"\"\n        with torch.no_grad():\n            dist, idx, num_avg_length_of_neighbor_idxs = pointnet2_utils.three_nn_for_vector_pool_by_two_step(\n                support_xyz, xyz_batch_cnt, new_xyz, new_xyz_grid_centers, new_xyz_batch_cnt,\n                self.max_neighbour_distance, self.nsample, self.neighbor_type,\n                self.num_avg_length_of_neighbor_idxs, self.num_total_grids, self.neighbor_distance_multiplier\n            )\n        self.num_avg_length_of_neighbor_idxs = max(self.num_avg_length_of_neighbor_idxs, num_avg_length_of_neighbor_idxs.item())\n\n        dist_recip = 1.0 / (dist + 1e-8)\n        norm = torch.sum(dist_recip, dim=-1, keepdim=True)\n        weight = dist_recip / torch.clamp_min(norm, min=1e-8)\n\n        empty_mask = (idx.view(-1, 3)[:, 0] == -1)\n        idx.view(-1, 3)[empty_mask] = 0\n\n        interpolated_feats = pointnet2_utils.three_interpolate(support_features, idx.view(-1, 3), weight.view(-1, 3))\n        interpolated_feats = interpolated_feats.view(idx.shape[0], idx.shape[1], -1)  # (M1 + M2 ..., num_total_grids, C)\n        if self.use_xyz:\n            near_known_xyz = support_xyz[idx.view(-1, 3).long()].view(-1, 3, 3)  # ( (M1 + M2 
...)*num_total_grids, 3)\n            local_xyz = (new_xyz_grid_centers.view(-1, 1, 3) - near_known_xyz).view(-1, idx.shape[1], 9)\n            if self.xyz_encoding_type == 'concat':\n                interpolated_feats = torch.cat((interpolated_feats, local_xyz), dim=-1)  # ( M1 + M2 ..., num_total_grids, 9+C)\n            else:\n                raise NotImplementedError\n\n        new_features = interpolated_feats.view(-1, interpolated_feats.shape[-1])  # ((M1 + M2 ...) * num_total_grids, C)\n        new_features[empty_mask, :] = 0\n        if self.mlp is not None:\n            new_features = new_features.permute(1, 0)[None, :, :, None]  # (1, C, N1 + N2 ..., 1)\n            new_features = self.mlp(new_features)\n\n            new_features = new_features.squeeze(dim=0).squeeze(dim=-1).permute(1, 0)  # (N1 + N2 ..., C)\n        return new_features\n\n\nclass VectorPoolAggregationModule(nn.Module):\n    def __init__(\n            self, input_channels, num_local_voxel=(3, 3, 3), local_aggregation_type='local_interpolation',\n            num_reduced_channels=30, num_channels_of_local_aggregation=32, post_mlps=(128,),\n            max_neighbor_distance=None, neighbor_nsample=-1, neighbor_type=0, neighbor_distance_multiplier=2.0):\n        super().__init__()\n        self.num_local_voxel = num_local_voxel\n        self.total_voxels = self.num_local_voxel[0] * self.num_local_voxel[1] * self.num_local_voxel[2]\n        self.local_aggregation_type = local_aggregation_type\n        assert self.local_aggregation_type in ['local_interpolation', 'voxel_avg_pool', 'voxel_random_choice']\n        self.input_channels = input_channels\n        self.num_reduced_channels = input_channels if num_reduced_channels is None else num_reduced_channels\n        self.num_channels_of_local_aggregation = num_channels_of_local_aggregation\n        self.max_neighbour_distance = max_neighbor_distance\n        self.neighbor_nsample = neighbor_nsample\n        self.neighbor_type = neighbor_type  # 
1: ball, others: cube\n\n        if self.local_aggregation_type == 'local_interpolation':\n            self.local_interpolate_module = VectorPoolLocalInterpolateModule(\n                mlp=None, num_voxels=self.num_local_voxel,\n                max_neighbour_distance=self.max_neighbour_distance,\n                nsample=self.neighbor_nsample,\n                neighbor_type=self.neighbor_type,\n                neighbour_distance_multiplier=neighbor_distance_multiplier,\n            )\n            num_c_in = (self.num_reduced_channels + 9) * self.total_voxels\n        else:\n            self.local_interpolate_module = None\n            num_c_in = (self.num_reduced_channels + 3) * self.total_voxels\n\n        num_c_out = self.total_voxels * self.num_channels_of_local_aggregation\n\n        self.separate_local_aggregation_layer = nn.Sequential(\n            nn.Conv1d(num_c_in, num_c_out, kernel_size=1, groups=self.total_voxels, bias=False),\n            nn.BatchNorm1d(num_c_out),\n            nn.ReLU()\n        )\n\n        post_mlp_list = []\n        c_in = num_c_out\n        for cur_num_c in post_mlps:\n            post_mlp_list.extend([\n                nn.Conv1d(c_in, cur_num_c, kernel_size=1, bias=False),\n                nn.BatchNorm1d(cur_num_c),\n                nn.ReLU()\n            ])\n            c_in = cur_num_c\n        self.post_mlps = nn.Sequential(*post_mlp_list)\n\n        self.num_mean_points_per_grid = 20\n        self.init_weights()\n\n    def init_weights(self):\n        for m in self.modules():\n            if isinstance(m, nn.Conv2d) or isinstance(m, nn.Conv1d):\n                nn.init.kaiming_normal_(m.weight)\n                if m.bias is not None:\n                    nn.init.constant_(m.bias, 0)\n            if isinstance(m, nn.BatchNorm2d) or isinstance(m, nn.BatchNorm1d):\n                nn.init.constant_(m.weight, 1.0)\n                nn.init.constant_(m.bias, 0)\n\n    def extra_repr(self) -> str:\n        ret = 
f'radius={self.max_neighbour_distance}, local_voxels=({self.num_local_voxel}, ' \\\n              f'local_aggregation_type={self.local_aggregation_type}, ' \\\n              f'num_c_reduction={self.input_channels}->{self.num_reduced_channels}, ' \\\n              f'num_c_local_aggregation={self.num_channels_of_local_aggregation}'\n        return ret\n\n    def vector_pool_with_voxel_query(self, xyz, xyz_batch_cnt, features, new_xyz, new_xyz_batch_cnt):\n        use_xyz = 1\n        pooling_type = 0 if self.local_aggregation_type == 'voxel_avg_pool' else 1\n\n        new_features, new_local_xyz, num_mean_points_per_grid, point_cnt_of_grid = pointnet2_utils.vector_pool_with_voxel_query_op(\n            xyz, xyz_batch_cnt, features, new_xyz, new_xyz_batch_cnt,\n            self.num_local_voxel[0], self.num_local_voxel[1], self.num_local_voxel[2],\n            self.max_neighbour_distance, self.num_reduced_channels, use_xyz,\n            self.num_mean_points_per_grid, self.neighbor_nsample, self.neighbor_type,\n            pooling_type\n        )\n        self.num_mean_points_per_grid = max(self.num_mean_points_per_grid, num_mean_points_per_grid.item())\n\n        num_new_pts = new_features.shape[0]\n        new_local_xyz = new_local_xyz.view(num_new_pts, -1, 3)  # (N, num_voxel, 3)\n        new_features = new_features.view(num_new_pts, -1, self.num_reduced_channels)  # (N, num_voxel, C)\n        new_features = torch.cat((new_local_xyz, new_features), dim=-1).view(num_new_pts, -1)\n\n        return new_features, point_cnt_of_grid\n\n    @staticmethod\n    def get_dense_voxels_by_center(point_centers, max_neighbour_distance, num_voxels):\n        \"\"\"\n        Args:\n            point_centers: (N, 3)\n            max_neighbour_distance: float\n            num_voxels: [num_x, num_y, num_z]\n\n        Returns:\n            voxel_centers: (N, total_voxels, 3)\n        \"\"\"\n        R = max_neighbour_distance\n        device = point_centers.device\n        x_grids = 
torch.arange(-R + R / num_voxels[0], R - R / num_voxels[0] + 1e-5, 2 * R / num_voxels[0], device=device)\n        y_grids = torch.arange(-R + R / num_voxels[1], R - R / num_voxels[1] + 1e-5, 2 * R / num_voxels[1], device=device)\n        z_grids = torch.arange(-R + R / num_voxels[2], R - R / num_voxels[2] + 1e-5, 2 * R / num_voxels[2], device=device)\n        x_offset, y_offset, z_offset = torch.meshgrid(x_grids, y_grids, z_grids)  # shape: [num_x, num_y, num_z]\n        xyz_offset = torch.cat((\n            x_offset.contiguous().view(-1, 1),\n            y_offset.contiguous().view(-1, 1),\n            z_offset.contiguous().view(-1, 1)), dim=-1\n        )\n        voxel_centers = point_centers[:, None, :] + xyz_offset[None, :, :]\n        return voxel_centers\n\n    def vector_pool_with_local_interpolate(self, xyz, xyz_batch_cnt, features, new_xyz, new_xyz_batch_cnt):\n        \"\"\"\n        Args:\n            xyz: (N, 3)\n            xyz_batch_cnt: (batch_size)\n            features: (N, C)\n            new_xyz: (M, 3)\n            new_xyz_batch_cnt: (batch_size)\n        Returns:\n            new_features: (M, total_voxels * C)\n        \"\"\"\n        voxel_centers = self.get_dense_voxels_by_center(\n            point_centers=new_xyz, max_neighbour_distance=self.max_neighbour_distance, num_voxels=self.num_local_voxel\n        )  # (M1 + M2 + ..., total_voxels, 3)\n        voxel_features = self.local_interpolate_module.forward(\n            support_xyz=xyz, support_features=features, xyz_batch_cnt=xyz_batch_cnt,\n            new_xyz=new_xyz, new_xyz_grid_centers=voxel_centers, new_xyz_batch_cnt=new_xyz_batch_cnt\n        )  # ((M1 + M2 ...) 
* total_voxels, C)\n\n        voxel_features = voxel_features.contiguous().view(-1, self.total_voxels * voxel_features.shape[-1])\n        return voxel_features\n\n    def forward(self, xyz, xyz_batch_cnt, new_xyz, new_xyz_batch_cnt, features, **kwargs):\n        \"\"\"\n        :param xyz: (N1 + N2 ..., 3) tensor of the xyz coordinates of the features\n        :param xyz_batch_cnt: (batch_size), [N1, N2, ...]\n        :param new_xyz: (M1 + M2 ..., 3)\n        :param new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n        :param features: (N1 + N2 ..., C) tensor of the descriptors of the features\n        :return:\n            new_xyz: (M1 + M2 ..., 3) tensor of the new features' xyz\n            new_features: (M1 + M2 ..., \\sum_k(mlps[k][-1])) tensor of the new_features descriptors\n        \"\"\"\n        N, C = features.shape\n\n        assert C % self.num_reduced_channels == 0, \\\n            f'the input channels ({C}) should be an integral multiple of num_reduced_channels({self.num_reduced_channels})'\n\n        features = features.view(N, -1, self.num_reduced_channels).sum(dim=1)\n\n        if self.local_aggregation_type in ['voxel_avg_pool', 'voxel_random_choice']:\n            vector_features, point_cnt_of_grid = self.vector_pool_with_voxel_query(\n                xyz=xyz, xyz_batch_cnt=xyz_batch_cnt, features=features,\n                new_xyz=new_xyz, new_xyz_batch_cnt=new_xyz_batch_cnt\n            )\n        elif self.local_aggregation_type == 'local_interpolation':\n            vector_features = self.vector_pool_with_local_interpolate(\n                xyz=xyz, xyz_batch_cnt=xyz_batch_cnt, features=features,\n                new_xyz=new_xyz, new_xyz_batch_cnt=new_xyz_batch_cnt\n            )  # (M1 + M2 + ..., total_voxels * C)\n        else:\n            raise NotImplementedError\n\n        vector_features = vector_features.permute(1, 0)[None, :, :]  # (1, num_voxels * C, M1 + M2 ...)\n\n        new_features = 
self.separate_local_aggregation_layer(vector_features)\n\n        new_features = self.post_mlps(new_features)\n        new_features = new_features.squeeze(dim=0).permute(1, 0)\n        return new_xyz, new_features\n\n\nclass VectorPoolAggregationModuleMSG(nn.Module):\n    def __init__(self, input_channels, config):\n        super().__init__()\n        self.model_cfg = config\n        self.num_groups = self.model_cfg.NUM_GROUPS\n\n        self.layers = []\n        c_in = 0\n        for k in range(self.num_groups):\n            cur_config = self.model_cfg[f'GROUP_CFG_{k}']\n            cur_vector_pool_module = VectorPoolAggregationModule(\n                input_channels=input_channels, num_local_voxel=cur_config.NUM_LOCAL_VOXEL,\n                post_mlps=cur_config.POST_MLPS,\n                max_neighbor_distance=cur_config.MAX_NEIGHBOR_DISTANCE,\n                neighbor_nsample=cur_config.NEIGHBOR_NSAMPLE,\n                local_aggregation_type=self.model_cfg.LOCAL_AGGREGATION_TYPE,\n                num_reduced_channels=self.model_cfg.get('NUM_REDUCED_CHANNELS', None),\n                num_channels_of_local_aggregation=self.model_cfg.NUM_CHANNELS_OF_LOCAL_AGGREGATION,\n                neighbor_distance_multiplier=2.0\n            )\n            self.__setattr__(f'layer_{k}', cur_vector_pool_module)\n            c_in += cur_config.POST_MLPS[-1]\n\n        c_in += 3  # use_xyz\n\n        shared_mlps = []\n        for cur_num_c in self.model_cfg.MSG_POST_MLPS:\n            shared_mlps.extend([\n                nn.Conv1d(c_in, cur_num_c, kernel_size=1, bias=False),\n                nn.BatchNorm1d(cur_num_c),\n                nn.ReLU()\n            ])\n            c_in = cur_num_c\n        self.msg_post_mlps = nn.Sequential(*shared_mlps)\n\n    def forward(self, **kwargs):\n        features_list = []\n        for k in range(self.num_groups):\n            cur_xyz, cur_features = self.__getattr__(f'layer_{k}')(**kwargs)\n            
features_list.append(cur_features)\n\n        features = torch.cat(features_list, dim=-1)\n        features = torch.cat((cur_xyz, features), dim=-1)\n        features = features.permute(1, 0)[None, :, :]  # (1, C, N)\n        new_features = self.msg_post_mlps(features)\n        new_features = new_features.squeeze(dim=0).permute(1, 0)  # (N, C)\n\n        return cur_xyz, new_features\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/pointnet2_utils.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.autograd import Function, Variable\n\nfrom . import pointnet2_stack_cuda as pointnet2\n\n\nclass BallQuery(Function):\n\n    @staticmethod\n    def forward(ctx, radius: float, nsample: int, xyz: torch.Tensor, xyz_batch_cnt: torch.Tensor,\n                new_xyz: torch.Tensor, new_xyz_batch_cnt):\n        \"\"\"\n        Args:\n            ctx:\n            radius: float, radius of the balls\n            nsample: int, maximum number of features in the balls\n            xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n            xyz_batch_cnt: (batch_size), [N1, N2, ...]\n            new_xyz: (M1 + M2 ..., 3) centers of the ball query\n            new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n\n        Returns:\n            idx: (M1 + M2, nsample) tensor with the indicies of the features that form the query balls\n        \"\"\"\n        assert new_xyz.is_contiguous()\n        assert new_xyz_batch_cnt.is_contiguous()\n        assert xyz.is_contiguous()\n        assert xyz_batch_cnt.is_contiguous()\n\n        B = xyz_batch_cnt.shape[0]\n        M = new_xyz.shape[0]\n        idx = torch.cuda.IntTensor(M, nsample).zero_()\n\n        pointnet2.ball_query_wrapper(B, M, radius, nsample, new_xyz, new_xyz_batch_cnt, xyz, xyz_batch_cnt, idx)\n        empty_ball_mask = (idx[:, 0] == -1)\n        idx[empty_ball_mask] = 0\n        return idx, empty_ball_mask\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None\n\n\nball_query = BallQuery.apply\n\n\nclass GroupingOperation(Function):\n\n    @staticmethod\n    def forward(ctx, features: torch.Tensor, features_batch_cnt: torch.Tensor,\n                idx: torch.Tensor, idx_batch_cnt: torch.Tensor):\n        \"\"\"\n        Args:\n            ctx:\n            features: (N1 + N2 ..., C) tensor of features to group\n            features_batch_cnt: (batch_size) [N1 + N2 ...] 
tensor containing the number of features in each batch sample\n            idx: (M1 + M2 ..., nsample) tensor containing the indices of features to group with\n            idx_batch_cnt: (batch_size) [M1, M2, ...] tensor containing the number of index rows in each batch sample\n\n        Returns:\n            output: (M1 + M2, C, nsample) tensor\n        \"\"\"\n        assert features.is_contiguous()\n        assert features_batch_cnt.is_contiguous()\n        assert idx.is_contiguous()\n        assert idx_batch_cnt.is_contiguous()\n\n        assert features.shape[0] == features_batch_cnt.sum(), \\\n            'features: %s, features_batch_cnt: %s' % (str(features.shape), str(features_batch_cnt))\n        assert idx.shape[0] == idx_batch_cnt.sum(), \\\n            'idx: %s, idx_batch_cnt: %s' % (str(idx.shape), str(idx_batch_cnt))\n\n        M, nsample = idx.size()\n        N, C = features.size()\n        B = idx_batch_cnt.shape[0]\n        output = torch.cuda.FloatTensor(M, C, nsample)\n\n        pointnet2.group_points_wrapper(B, M, C, nsample, features, features_batch_cnt, idx, idx_batch_cnt, output)\n\n        ctx.for_backwards = (B, N, idx, features_batch_cnt, idx_batch_cnt)\n        return output\n\n    @staticmethod\n    def backward(ctx, grad_out: torch.Tensor):\n        \"\"\"\n        Args:\n            ctx:\n            grad_out: (M1 + M2 ..., C, nsample) tensor of the gradients of the output from forward\n\n        Returns:\n            grad_features: (N1 + N2 ..., C) gradient of the features\n        \"\"\"\n        B, N, idx, features_batch_cnt, idx_batch_cnt = ctx.for_backwards\n\n        M, C, nsample = grad_out.size()\n        grad_features = Variable(torch.cuda.FloatTensor(N, C).zero_())\n\n        grad_out_data = grad_out.data.contiguous()\n        pointnet2.group_points_grad_wrapper(B, M, C, N, nsample, grad_out_data, idx,\n                                            idx_batch_cnt, features_batch_cnt, grad_features.data)\n        return grad_features, 
None, None, None\n\n\ngrouping_operation = GroupingOperation.apply\n\n\nclass QueryAndGroup(nn.Module):\n    def __init__(self, radius: float, nsample: int, use_xyz: bool = True):\n        \"\"\"\n        Args:\n            radius: float, radius of ball\n            nsample: int, maximum number of features to gather in the ball\n            use_xyz: bool, whether to concatenate the relative xyz offsets to the grouped features\n        \"\"\"\n        super().__init__()\n        self.radius, self.nsample, self.use_xyz = radius, nsample, use_xyz\n\n    def forward(self, xyz: torch.Tensor, xyz_batch_cnt: torch.Tensor,\n                new_xyz: torch.Tensor, new_xyz_batch_cnt: torch.Tensor,\n                features: torch.Tensor = None):\n        \"\"\"\n        Args:\n            xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n            xyz_batch_cnt: (batch_size), [N1, N2, ...]\n            new_xyz: (M1 + M2 ..., 3) centers of the ball query\n            new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n            features: (N1 + N2 ..., C) tensor of features to group\n\n        Returns:\n            new_features: (M1 + M2 ..., C + 3, nsample) tensor if use_xyz, else (M1 + M2 ..., C, nsample)\n            idx: (M1 + M2 ..., nsample) tensor of the grouped indices\n        \"\"\"\n        assert xyz.shape[0] == xyz_batch_cnt.sum(), 'xyz: %s, xyz_batch_cnt: %s' % (str(xyz.shape), str(xyz_batch_cnt))\n        assert new_xyz.shape[0] == new_xyz_batch_cnt.sum(), \\\n            'new_xyz: %s, new_xyz_batch_cnt: %s' % (str(new_xyz.shape), str(new_xyz_batch_cnt))\n\n        # idx: (M1 + M2 ..., nsample), empty_ball_mask: (M1 + M2 ...)\n        idx, empty_ball_mask = ball_query(self.radius, self.nsample, xyz, xyz_batch_cnt, new_xyz, new_xyz_batch_cnt)\n        grouped_xyz = grouping_operation(xyz, xyz_batch_cnt, idx, new_xyz_batch_cnt)  # (M1 + M2, 3, nsample)\n        grouped_xyz -= new_xyz.unsqueeze(-1)\n\n        grouped_xyz[empty_ball_mask] = 0\n\n        if features is not None:\n            grouped_features = grouping_operation(features, xyz_batch_cnt, idx, new_xyz_batch_cnt)  # (M1 + M2, C, nsample)\n            grouped_features[empty_ball_mask] = 0\n   
         if self.use_xyz:\n                new_features = torch.cat([grouped_xyz, grouped_features], dim=1)  # (M1 + M2 ..., C + 3, nsample)\n            else:\n                new_features = grouped_features\n        else:\n            assert self.use_xyz, \"Cannot have no features and not use xyz as a feature!\"\n            new_features = grouped_xyz\n\n        return new_features, idx\n\n\nclass FarthestPointSampling(Function):\n    @staticmethod\n    def forward(ctx, xyz: torch.Tensor, npoint: int):\n        \"\"\"\n        Args:\n            ctx:\n            xyz: (B, N, 3) where N > npoint\n            npoint: int, number of features in the sampled set\n\n        Returns:\n            output: (B, npoint) tensor containing the indices of the sampled set\n        \"\"\"\n        assert xyz.is_contiguous()\n\n        B, N, _ = xyz.size()\n        output = torch.cuda.IntTensor(B, npoint)\n        temp = torch.cuda.FloatTensor(B, N).fill_(1e10)\n\n        pointnet2.farthest_point_sampling_wrapper(B, N, npoint, xyz, temp, output)\n        return output\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None\n\n\nfarthest_point_sample = furthest_point_sample = FarthestPointSampling.apply\n\n\nclass StackFarthestPointSampling(Function):\n    @staticmethod\n    def forward(ctx, xyz, xyz_batch_cnt, npoint):\n        \"\"\"\n        Args:\n            ctx:\n            xyz: (N1 + N2 + ..., 3) where N > npoint\n            xyz_batch_cnt: [N1, N2, ...]\n            npoint: int, list of ints, or int tensor (M1, M2, ...); number of features to sample per batch sample\n\n        Returns:\n            output: (npoint.sum()) tensor containing the indices of the sampled set\n        \"\"\"\n        assert xyz.is_contiguous() and xyz.shape[1] == 3\n\n        batch_size = len(xyz_batch_cnt)\n        if not isinstance(npoint, torch.Tensor):\n            if not isinstance(npoint, list):\n                npoint = [npoint for i in range(batch_size)]\n            npoint = torch.tensor(npoint, 
device=xyz.device).int()\n\n        N, _ = xyz.size()\n        temp = torch.cuda.FloatTensor(N).fill_(1e10)\n        output = torch.cuda.IntTensor(npoint.sum().item())\n\n        pointnet2.stack_farthest_point_sampling_wrapper(xyz, temp, xyz_batch_cnt, output, npoint)\n        return output\n\n    @staticmethod\n    def backward(ctx, a=None):\n        # one gradient slot per forward input (xyz, xyz_batch_cnt, npoint)\n        return None, None, None\n\n\nstack_farthest_point_sample = StackFarthestPointSampling.apply\n\n\nclass ThreeNN(Function):\n    @staticmethod\n    def forward(ctx, unknown, unknown_batch_cnt, known, known_batch_cnt):\n        \"\"\"\n        Args:\n            ctx:\n            unknown: (N1 + N2..., 3)\n            unknown_batch_cnt: (batch_size), [N1, N2, ...]\n            known: (M1 + M2..., 3)\n            known_batch_cnt: (batch_size), [M1, M2, ...]\n\n        Returns:\n            dist: (N1 + N2 ..., 3)  l2 distance to the three nearest neighbors\n            idx: (N1 + N2 ..., 3)  index of the three nearest neighbors, range [0, M1+M2+...]\n        \"\"\"\n        assert len(unknown.shape) == 2 and unknown.shape[1] == 3\n        assert len(known.shape) == 2 and known.shape[1] == 3\n        assert len(unknown_batch_cnt) == len(known_batch_cnt)\n\n        dist2 = unknown.new_zeros(unknown.shape)\n        idx = unknown_batch_cnt.new_zeros(unknown.shape).int()\n\n        pointnet2.three_nn_wrapper(\n            unknown.contiguous(), unknown_batch_cnt.contiguous(),\n            known.contiguous(), known_batch_cnt.contiguous(), dist2, idx\n        )\n        return torch.sqrt(dist2), idx\n\n    @staticmethod\n    def backward(ctx, a=None, b=None):\n        # one gradient slot per forward input (unknown, unknown_batch_cnt, known, known_batch_cnt)\n        return None, None, None, None\n\n\nthree_nn = ThreeNN.apply\n\n\nclass ThreeInterpolate(Function):\n\n    @staticmethod\n    def forward(ctx, features: torch.Tensor, idx: torch.Tensor, weight: torch.Tensor):\n        \"\"\"\n        Args:\n            ctx:\n            features: (M1 + M2 ..., C)\n            idx: [N1 + N2 ..., 3]\n            weight: [N1 + N2 ..., 
3]\n\n        Returns:\n            out_tensor: (N1 + N2 ..., C)\n        \"\"\"\n        assert idx.shape[0] == weight.shape[0] and idx.shape[1] == weight.shape[1] == 3\n\n        ctx.three_interpolate_for_backward = (idx, weight, features.shape[0])\n        output = features.new_zeros((idx.shape[0], features.shape[1]))\n        pointnet2.three_interpolate_wrapper(features.contiguous(), idx.contiguous(), weight.contiguous(), output)\n        return output\n\n    @staticmethod\n    def backward(ctx, grad_out: torch.Tensor):\n        \"\"\"\n        Args:\n            ctx:\n            grad_out: (N1 + N2 ..., C)\n\n        Returns:\n            grad_features: (M1 + M2 ..., C)\n        \"\"\"\n        idx, weight, M = ctx.three_interpolate_for_backward\n        grad_features = grad_out.new_zeros((M, grad_out.shape[1]))\n        pointnet2.three_interpolate_grad_wrapper(\n            grad_out.contiguous(), idx.contiguous(), weight.contiguous(), grad_features\n        )\n        return grad_features, None, None\n\n\nthree_interpolate = ThreeInterpolate.apply\n\n\nclass ThreeNNForVectorPoolByTwoStep(Function):\n    @staticmethod\n    def forward(ctx, support_xyz, xyz_batch_cnt, new_xyz, new_xyz_grid_centers, new_xyz_batch_cnt,\n                max_neighbour_distance, nsample, neighbor_type, avg_length_of_neighbor_idxs, num_total_grids,\n                neighbor_distance_multiplier):\n        \"\"\"\n        Args:\n            ctx:\n            // support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n            // xyz_batch_cnt: (batch_size), [N1, N2, ...]\n            // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n            // new_xyz_grid_centers: (M1 + M2 ..., num_total_grids, 3) grids centers of each grid\n            // new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n            // nsample: find all (-1), find limited number(>0)\n            // neighbor_type: 1: ball, others: cube\n            // neighbor_distance_multiplier: query_distance = 
neighbor_distance_multiplier * max_neighbour_distance\n\n        Returns:\n            // new_xyz_grid_idxs: (M1 + M2 ..., num_total_grids, 3) three-nn\n            // new_xyz_grid_dist2: (M1 + M2 ..., num_total_grids, 3) square of dist of three-nn\n        \"\"\"\n        num_new_xyz = new_xyz.shape[0]\n        new_xyz_grid_dist2 = new_xyz_grid_centers.new_zeros(new_xyz_grid_centers.shape)\n        new_xyz_grid_idxs = new_xyz_grid_centers.new_zeros(new_xyz_grid_centers.shape).int().fill_(-1)\n\n        while True:\n            num_max_sum_points = avg_length_of_neighbor_idxs * num_new_xyz\n            stack_neighbor_idxs = new_xyz_grid_idxs.new_zeros(num_max_sum_points)\n            start_len = new_xyz_grid_idxs.new_zeros(num_new_xyz, 2).int()\n            cumsum = new_xyz_grid_idxs.new_zeros(1)\n\n            pointnet2.query_stacked_local_neighbor_idxs_wrapper_stack(\n                support_xyz.contiguous(), xyz_batch_cnt.contiguous(),\n                new_xyz.contiguous(), new_xyz_batch_cnt.contiguous(),\n                stack_neighbor_idxs.contiguous(), start_len.contiguous(), cumsum,\n                avg_length_of_neighbor_idxs, max_neighbour_distance * neighbor_distance_multiplier,\n                nsample, neighbor_type\n            )\n            avg_length_of_neighbor_idxs = cumsum[0].item() // num_new_xyz + int(cumsum[0].item() % num_new_xyz > 0)\n\n            if cumsum[0] <= num_max_sum_points:\n                break\n\n        stack_neighbor_idxs = stack_neighbor_idxs[:cumsum[0]]\n        pointnet2.query_three_nn_by_stacked_local_idxs_wrapper_stack(\n            support_xyz, new_xyz, new_xyz_grid_centers, new_xyz_grid_idxs, new_xyz_grid_dist2,\n            stack_neighbor_idxs, start_len, num_new_xyz, num_total_grids\n        )\n\n        return torch.sqrt(new_xyz_grid_dist2), new_xyz_grid_idxs, torch.tensor(avg_length_of_neighbor_idxs)\n\n\nthree_nn_for_vector_pool_by_two_step = ThreeNNForVectorPoolByTwoStep.apply\n\n\nclass 
VectorPoolWithVoxelQuery(Function):\n    @staticmethod\n    def forward(ctx, support_xyz: torch.Tensor, xyz_batch_cnt: torch.Tensor, support_features: torch.Tensor,\n                new_xyz: torch.Tensor, new_xyz_batch_cnt: torch.Tensor, num_grid_x, num_grid_y, num_grid_z,\n                max_neighbour_distance, num_c_out_each_grid, use_xyz,\n                num_mean_points_per_grid=100, nsample=-1, neighbor_type=0, pooling_type=0):\n        \"\"\"\n        Args:\n            ctx:\n            support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n            xyz_batch_cnt: (batch_size), [N1, N2, ...]\n            support_features: (N1 + N2 ..., C)\n            new_xyz: (M1 + M2 ..., 3) centers of new positions\n            new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n            num_grid_x: number of grids in each local area centered at new_xyz\n            num_grid_y:\n            num_grid_z:\n            max_neighbour_distance:\n            num_c_out_each_grid:\n            use_xyz:\n            neighbor_type: 1: ball, others: cube:\n            pooling_type: 0: avg_pool, 1: random choice\n        Returns:\n            new_features: (M1 + M2 ..., num_c_out)\n        \"\"\"\n        assert support_xyz.is_contiguous()\n        assert support_features.is_contiguous()\n        assert xyz_batch_cnt.is_contiguous()\n        assert new_xyz.is_contiguous()\n        assert new_xyz_batch_cnt.is_contiguous()\n        num_total_grids = num_grid_x * num_grid_y * num_grid_z\n        num_c_out = num_c_out_each_grid * num_total_grids\n        N, num_c_in = support_features.shape\n        M = new_xyz.shape[0]\n\n        assert num_c_in % num_c_out_each_grid == 0, \\\n            f'the input channels ({num_c_in}) should be an integral multiple of num_c_out_each_grid({num_c_out_each_grid})'\n\n        while True:\n            new_features = support_features.new_zeros((M, num_c_out))\n            new_local_xyz = support_features.new_zeros((M, 3 * num_total_grids))\n     
       point_cnt_of_grid = xyz_batch_cnt.new_zeros((M, num_total_grids))\n\n            num_max_sum_points = num_mean_points_per_grid * M\n            grouped_idxs = xyz_batch_cnt.new_zeros((num_max_sum_points, 3))\n\n            num_cum_sum = pointnet2.vector_pool_wrapper(\n                support_xyz, xyz_batch_cnt, support_features, new_xyz, new_xyz_batch_cnt,\n                new_features, new_local_xyz, point_cnt_of_grid, grouped_idxs,\n                num_grid_x, num_grid_y, num_grid_z, max_neighbour_distance, use_xyz,\n                num_max_sum_points, nsample, neighbor_type, pooling_type\n            )\n            num_mean_points_per_grid = num_cum_sum // M + int(num_cum_sum % M > 0)\n            if num_cum_sum <= num_max_sum_points:\n                break\n\n        grouped_idxs = grouped_idxs[:num_cum_sum]\n\n        normalizer = torch.clamp_min(point_cnt_of_grid[:, :, None].float(), min=1e-6)\n        new_features = (new_features.view(-1, num_total_grids, num_c_out_each_grid) / normalizer).view(-1, num_c_out)\n\n        if use_xyz:\n            new_local_xyz = (new_local_xyz.view(-1, num_total_grids, 3) / normalizer).view(-1, num_total_grids * 3)\n\n        num_mean_points_per_grid = torch.Tensor([num_mean_points_per_grid]).int()\n        nsample = torch.Tensor([nsample]).int()\n        ctx.vector_pool_for_backward = (point_cnt_of_grid, grouped_idxs, N, num_c_in)\n        ctx.mark_non_differentiable(new_local_xyz, num_mean_points_per_grid, nsample, point_cnt_of_grid)\n        return new_features, new_local_xyz, num_mean_points_per_grid, point_cnt_of_grid\n\n    @staticmethod\n    def backward(ctx, grad_new_features: torch.Tensor, grad_local_xyz: torch.Tensor, grad_num_cum_sum, grad_point_cnt_of_grid):\n        \"\"\"\n        Args:\n            ctx:\n            grad_new_features: (M1 + M2 ..., num_c_out), num_c_out = num_c_out_each_grid * num_total_grids\n\n        Returns:\n            grad_support_features: (N1 + N2 ..., C_in)\n        \"\"\"\n     
   point_cnt_of_grid, grouped_idxs, N, num_c_in = ctx.vector_pool_for_backward\n        grad_support_features = grad_new_features.new_zeros((N, num_c_in))\n\n        pointnet2.vector_pool_grad_wrapper(\n            grad_new_features.contiguous(), point_cnt_of_grid, grouped_idxs,\n            grad_support_features\n        )\n\n        return None, None, grad_support_features, None, None, None, None, None, None, None, None, None, None, None, None\n\n\nvector_pool_with_voxel_query_op = VectorPoolWithVoxelQuery.apply\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/ball_query.cpp",
    "content": "/*\nStacked-batch-data version of ball query, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"ball_query_gpu.h\"\n\nextern THCState *state;\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\nint ball_query_wrapper_stack(int B, int M, float radius, int nsample,\n    at::Tensor new_xyz_tensor, at::Tensor new_xyz_batch_cnt_tensor,\n    at::Tensor xyz_tensor, at::Tensor xyz_batch_cnt_tensor, at::Tensor idx_tensor) {\n    CHECK_INPUT(new_xyz_tensor);\n    CHECK_INPUT(xyz_tensor);\n    CHECK_INPUT(new_xyz_batch_cnt_tensor);\n    CHECK_INPUT(xyz_batch_cnt_tensor);\n\n    const float *new_xyz = new_xyz_tensor.data<float>();\n    const float *xyz = xyz_tensor.data<float>();\n    const int *new_xyz_batch_cnt = new_xyz_batch_cnt_tensor.data<int>();\n    const int *xyz_batch_cnt = xyz_batch_cnt_tensor.data<int>();\n    int *idx = idx_tensor.data<int>();\n\n    ball_query_kernel_launcher_stack(B, M, radius, nsample, new_xyz, new_xyz_batch_cnt, xyz, xyz_batch_cnt, idx);\n    return 1;\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/ball_query_deform.cpp",
    "content": "#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"ball_query_deform_gpu.h\"\n\nextern THCState *state;\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\nint ball_query_deform_wrapper_stack(int B, int M, int nsample,\n    at::Tensor new_xyz_tensor, at::Tensor new_xyz_r_tensor, at::Tensor new_xyz_batch_cnt_tensor,\n    at::Tensor xyz_tensor, at::Tensor xyz_batch_cnt_tensor, at::Tensor idx_tensor) {\n    CHECK_INPUT(new_xyz_tensor);\n    CHECK_INPUT(new_xyz_r_tensor);\n    CHECK_INPUT(xyz_tensor);\n    CHECK_INPUT(new_xyz_batch_cnt_tensor);\n    CHECK_INPUT(xyz_batch_cnt_tensor);\n\n    const float *new_xyz = new_xyz_tensor.data<float>();\n    const float *new_xyz_r = new_xyz_r_tensor.data<float>();\n    const float *xyz = xyz_tensor.data<float>();\n    const int *new_xyz_batch_cnt = new_xyz_batch_cnt_tensor.data<int>();\n    const int *xyz_batch_cnt = xyz_batch_cnt_tensor.data<int>();\n    int *idx = idx_tensor.data<int>();\n\n    ball_query_deform_kernel_launcher_stack(B, M, nsample, new_xyz, new_xyz_r, new_xyz_batch_cnt, xyz, xyz_batch_cnt, idx);\n    return 1;\n}"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/ball_query_deform_gpu.cu",
    "content": "#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"ball_query_deform_gpu.h\"\n#include \"cuda_utils.h\"\n\n\n__global__ void ball_query_deform_kernel_stack(int B, int M, int nsample, \\\n    const float *new_xyz, const float *new_xyz_r, const int *new_xyz_batch_cnt, const float *xyz, const int *xyz_batch_cnt, int *idx) {\n    // :param xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // :param xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // :param new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // :param new_xyz_r: (M1 + M2 ..., 1) radius for each new point\n    // :param new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // output:\n    //      idx: (M, nsample)\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (pt_idx >= M) return;\n\n    int bs_idx = 0, pt_cnt = new_xyz_batch_cnt[0];\n    for (int k = 1; k < B; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += new_xyz_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int xyz_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) xyz_batch_start_idx += xyz_batch_cnt[k];\n    // for (int k = 0; k < bs_idx; k++) new_xyz_batch_start_idx += new_xyz_batch_cnt[k];\n\n    new_xyz += pt_idx * 3;\n    new_xyz_r += pt_idx; //add\n    xyz += xyz_batch_start_idx * 3;\n    idx += pt_idx * nsample;\n\n    float radius = new_xyz_r[0];\n    float radius2 = radius * radius;\n    float new_x = new_xyz[0];\n    float new_y = new_xyz[1];\n    float new_z = new_xyz[2];\n    int n = xyz_batch_cnt[bs_idx];\n\n    int cnt = 0;\n    for (int k = 0; k < n; ++k) {\n        float x = xyz[k * 3 + 0];\n        float y = xyz[k * 3 + 1];\n        float z = xyz[k * 3 + 2];\n        float d2 = (new_x - x) * (new_x - x) + (new_y - y) * (new_y - y) + (new_z - z) * (new_z - z);\n        if (d2 < radius2){\n            if (cnt == 0){\n                for (int l = 0; l < nsample; ++l) {\n                    idx[l] = k;\n                }\n            }\n         
   idx[cnt] = k;\n            ++cnt;\n            if (cnt >= nsample) break;\n        }\n    }\n    if (cnt == 0) idx[0] = -1;\n}\n\n\nvoid ball_query_deform_kernel_launcher_stack(int B, int M, int nsample,\n    const float *new_xyz, const float *new_xyz_r, const int *new_xyz_batch_cnt, const float *xyz, const int *xyz_batch_cnt, int *idx){\n    // :param xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // :param xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // :param new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // :param new_xyz_r: (M1 + M2 ..., 1) radius for each new point\n    // :param new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // output:\n    //      idx: (M, nsample)\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(M, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    ball_query_deform_kernel_stack<<<blocks, threads>>>(B, M, nsample, new_xyz, new_xyz_r, new_xyz_batch_cnt, xyz, xyz_batch_cnt, idx);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/ball_query_deform_gpu.h",
    "content": "#ifndef _STACK_BALL_QUERY_DEFORM_GPU_H\n#define _STACK_BALL_QUERY_DEFORM_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\nint ball_query_deform_wrapper_stack(int B, int M, int nsample,\n    at::Tensor new_xyz_tensor, at::Tensor new_xyz_r_tensor, at::Tensor new_xyz_batch_cnt_tensor,\n    at::Tensor xyz_tensor, at::Tensor xyz_batch_cnt_tensor, at::Tensor idx_tensor);\n\n\nvoid ball_query_deform_kernel_launcher_stack(int B, int M, int nsample,\n    const float *new_xyz, const float *new_xyz_r, const int *new_xyz_batch_cnt, const float *xyz, const int *xyz_batch_cnt, int *idx);\n\n\n#endif"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/ball_query_gpu.cu",
    "content": "/*\nStacked-batch-data version of ball query, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"ball_query_gpu.h\"\n#include \"cuda_utils.h\"\n\n\n__global__ void ball_query_kernel_stack(int B, int M, float radius, int nsample, \\\n    const float *new_xyz, const int *new_xyz_batch_cnt, const float *xyz, const int *xyz_batch_cnt, int *idx) {\n    // :param xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // :param xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // :param new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // :param new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // output:\n    //      idx: (M, nsample)\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (pt_idx >= M) return;\n\n    int bs_idx = 0, pt_cnt = new_xyz_batch_cnt[0];\n    for (int k = 1; k < B; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += new_xyz_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int xyz_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) xyz_batch_start_idx += xyz_batch_cnt[k];\n    // for (int k = 0; k < bs_idx; k++) new_xyz_batch_start_idx += new_xyz_batch_cnt[k];\n\n    new_xyz += pt_idx * 3;\n    xyz += xyz_batch_start_idx * 3;\n    idx += pt_idx * nsample;\n\n    float radius2 = radius * radius;\n    float new_x = new_xyz[0];\n    float new_y = new_xyz[1];\n    float new_z = new_xyz[2];\n    int n = xyz_batch_cnt[bs_idx];\n\n    int cnt = 0;\n    for (int k = 0; k < n; ++k) {\n        float x = xyz[k * 3 + 0];\n        float y = xyz[k * 3 + 1];\n        float z = xyz[k * 3 + 2];\n        float d2 = (new_x - x) * (new_x - x) + (new_y - y) * (new_y - y) + (new_z - z) * (new_z - z);\n        if (d2 < radius2){\n            if (cnt == 0){\n                for (int l = 0; l < nsample; ++l) {\n                    idx[l] = k;\n                
}\n            }\n            idx[cnt] = k;\n            ++cnt;\n            if (cnt >= nsample) break;\n        }\n    }\n    if (cnt == 0) idx[0] = -1;\n}\n\n\nvoid ball_query_kernel_launcher_stack(int B, int M, float radius, int nsample,\n    const float *new_xyz, const int *new_xyz_batch_cnt, const float *xyz, const int *xyz_batch_cnt, int *idx){\n    // :param xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // :param xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // :param new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // :param new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // output:\n    //      idx: (M, nsample)\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(M, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    ball_query_kernel_stack<<<blocks, threads>>>(B, M, radius, nsample, new_xyz, new_xyz_batch_cnt, xyz, xyz_batch_cnt, idx);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/ball_query_gpu.h",
    "content": "/*\nStacked-batch-data version of ball query, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#ifndef _STACK_BALL_QUERY_GPU_H\n#define _STACK_BALL_QUERY_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\nint ball_query_wrapper_stack(int B, int M, float radius, int nsample,\n    at::Tensor new_xyz_tensor, at::Tensor new_xyz_batch_cnt_tensor,\n    at::Tensor xyz_tensor, at::Tensor xyz_batch_cnt_tensor, at::Tensor idx_tensor);\n\n\nvoid ball_query_kernel_launcher_stack(int B, int M, float radius, int nsample,\n    const float *new_xyz, const int *new_xyz_batch_cnt, const float *xyz, const int *xyz_batch_cnt, int *idx);\n\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/cuda_utils.h",
    "content": "#ifndef _STACK_CUDA_UTILS_H\n#define _STACK_CUDA_UTILS_H\n\n#include <cmath>\n\n#define THREADS_PER_BLOCK 256\n#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/group_points.cpp",
    "content": "/*\nStacked-batch-data version of point grouping, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include <vector>\n#include <THC/THC.h>\n#include \"group_points_gpu.h\"\n\nextern THCState *state;\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nint group_points_grad_wrapper_stack(int B, int M, int C, int N, int nsample,\n    at::Tensor grad_out_tensor, at::Tensor idx_tensor, at::Tensor idx_batch_cnt_tensor,\n    at::Tensor features_batch_cnt_tensor, at::Tensor grad_features_tensor) {\n\n    CHECK_INPUT(grad_out_tensor);\n    CHECK_INPUT(idx_tensor);\n    CHECK_INPUT(idx_batch_cnt_tensor);\n    CHECK_INPUT(features_batch_cnt_tensor);\n    CHECK_INPUT(grad_features_tensor);\n\n    const float *grad_out = grad_out_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    const int *idx_batch_cnt = idx_batch_cnt_tensor.data<int>();\n    const int *features_batch_cnt = features_batch_cnt_tensor.data<int>();\n    float *grad_features = grad_features_tensor.data<float>();\n\n    group_points_grad_kernel_launcher_stack(B, M, C, N, nsample, grad_out, idx, idx_batch_cnt, features_batch_cnt, grad_features);\n    return 1;\n}\n\n\nint group_points_wrapper_stack(int B, int M, int C, int nsample,\n    at::Tensor features_tensor, at::Tensor features_batch_cnt_tensor,\n    at::Tensor idx_tensor, at::Tensor idx_batch_cnt_tensor, at::Tensor out_tensor) {\n\n    
CHECK_INPUT(features_tensor);\n    CHECK_INPUT(features_batch_cnt_tensor);\n    CHECK_INPUT(idx_tensor);\n    CHECK_INPUT(idx_batch_cnt_tensor);\n    CHECK_INPUT(out_tensor);\n\n    const float *features = features_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    const int *features_batch_cnt = features_batch_cnt_tensor.data<int>();\n    const int *idx_batch_cnt = idx_batch_cnt_tensor.data<int>();\n    float *out = out_tensor.data<float>();\n\n    group_points_kernel_launcher_stack(B, M, C, nsample, features, features_batch_cnt, idx, idx_batch_cnt, out);\n    return 1;\n}"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/group_points_gpu.cu",
    "content": "/*\nStacked-batch-data version of point grouping, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"cuda_utils.h\"\n#include \"group_points_gpu.h\"\n\n\n__global__ void group_points_grad_kernel_stack(int B, int M, int C, int N, int nsample,\n    const float *grad_out, const int *idx, const int *idx_batch_cnt, const int *features_batch_cnt, float *grad_features) {\n    // :param grad_out: (M1 + M2 ..., C, nsample) tensor of the gradients of the output from forward\n    // :param idx: (M1 + M2 ..., nsample) tensor containing the indicies of features to group with\n    // :param idx_batch_cnt: (batch_size) [M1 + M2 ...] tensor containing the indicies of features to group with\n    // :param features_batch_cnt: (batch_size) [N1 + N2 ...] tensor containing the indicies of features to group with\n    // :return:\n    //     grad_features: (N1 + N2 ..., C) gradient of the features\n    int index = blockIdx.x * blockDim.x + threadIdx.x;\n    int sample_idx = index % nsample;\n    int C_idx = (index / nsample) % C;\n    int pt_idx = (index / nsample / C);\n\n    if (pt_idx >= M || C_idx >= C || sample_idx >= nsample) return;\n\n    int bs_idx = 0, pt_cnt = idx_batch_cnt[0];\n    for (int k = 1; k < B; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += idx_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int features_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) features_batch_start_idx += features_batch_cnt[k];\n\n    grad_out += pt_idx * C * nsample + C_idx * nsample + sample_idx;\n    idx += pt_idx * nsample + sample_idx;\n    grad_features += (features_batch_start_idx + idx[0]) * C + C_idx;\n\n    atomicAdd(grad_features, grad_out[0]);\n}\n\nvoid group_points_grad_kernel_launcher_stack(int B, int M, int C, int N, int nsample,\n    const float *grad_out, const int *idx, const int 
*idx_batch_cnt, const int *features_batch_cnt, float *grad_features) {\n    // :param grad_out: (M1 + M2 ..., C, nsample) tensor of the gradients of the output from forward\n    // :param idx: (M1 + M2 ..., nsample) tensor containing the batch-local indices of features to group with\n    // :param idx_batch_cnt: (batch_size) [M1, M2, ...] tensor containing the number of idx rows in each batch element\n    // :param features_batch_cnt: (batch_size) [N1, N2, ...] tensor containing the number of feature rows in each batch element\n    // :return:\n    //     grad_features: (N1 + N2 ..., C) gradient of the features\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(M * C * nsample, THREADS_PER_BLOCK));  // 1D grid: one thread per (point, channel, sample) element\n    dim3 threads(THREADS_PER_BLOCK);\n\n    group_points_grad_kernel_stack<<<blocks, threads>>>(B, M, C, N, nsample, grad_out, idx, idx_batch_cnt, features_batch_cnt, grad_features);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n\n__global__ void group_points_kernel_stack(int B, int M, int C, int nsample,\n    const float *features, const int *features_batch_cnt, const int *idx, const int *idx_batch_cnt, float *out) {\n    // :param features: (N1 + N2 ..., C) tensor of features to group\n    // :param features_batch_cnt: (batch_size) [N1, N2, ...] tensor containing the number of feature rows in each batch element\n    // :param idx: (M1 + M2 ..., nsample) tensor containing the batch-local indices of features to group with\n    // :param idx_batch_cnt: (batch_size) [M1, M2, ...] 
tensor containing the number of idx rows in each batch element\n    // :return:\n    //     output: (M1 + M2 ..., C, nsample) tensor\n    int index = blockIdx.x * blockDim.x + threadIdx.x;\n    int sample_idx = index % nsample;\n    int C_idx = (index / nsample) % C;\n    int pt_idx = (index / nsample / C);\n\n    if (pt_idx >= M || C_idx >= C || sample_idx >= nsample) return;\n\n    int bs_idx = 0, pt_cnt = idx_batch_cnt[0];\n    for (int k = 1; k < B; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += idx_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int features_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) features_batch_start_idx += features_batch_cnt[k];\n    features += features_batch_start_idx * C;\n\n    idx += pt_idx * nsample + sample_idx;\n    int in_idx = idx[0] * C + C_idx;\n    int out_idx = pt_idx * C * nsample + C_idx * nsample + sample_idx;\n\n    out[out_idx] = features[in_idx];\n}\n\n\nvoid group_points_kernel_launcher_stack(int B, int M, int C, int nsample,\n    const float *features, const int *features_batch_cnt, const int *idx, const int *idx_batch_cnt, float *out) {\n    // :param features: (N1 + N2 ..., C) tensor of features to group\n    // :param features_batch_cnt: (batch_size) [N1, N2, ...] tensor containing the number of feature rows in each batch element\n    // :param idx: (M1 + M2 ..., nsample) tensor containing the batch-local indices of features to group with\n    // :param idx_batch_cnt: (batch_size) [M1, M2, ...] 
tensor containing the number of idx rows in each batch element\n    // :return:\n    //     output: (M1 + M2 ..., C, nsample) tensor\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(M * C * nsample, THREADS_PER_BLOCK));  // 1D grid: one thread per (point, channel, sample) element\n    dim3 threads(THREADS_PER_BLOCK);\n\n    group_points_kernel_stack<<<blocks, threads>>>(B, M, C, nsample, features, features_batch_cnt, idx, idx_batch_cnt, out);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/group_points_gpu.h",
    "content": "/*\nStacked-batch-data version of point grouping, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#ifndef _STACK_GROUP_POINTS_GPU_H\n#define _STACK_GROUP_POINTS_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include <vector>\n\n\nint group_points_wrapper_stack(int B, int M, int C, int nsample,\n    at::Tensor features_tensor, at::Tensor features_batch_cnt_tensor,\n    at::Tensor idx_tensor, at::Tensor idx_batch_cnt_tensor, at::Tensor out_tensor);\n\nvoid group_points_kernel_launcher_stack(int B, int M, int C, int nsample,\n    const float *features, const int *features_batch_cnt, const int *idx, const int *idx_batch_cnt, float *out);\n\nint group_points_grad_wrapper_stack(int B, int M, int C, int N, int nsample,\n    at::Tensor grad_out_tensor, at::Tensor idx_tensor, at::Tensor idx_batch_cnt_tensor,\n    at::Tensor features_batch_cnt_tensor, at::Tensor grad_features_tensor);\n\nvoid group_points_grad_kernel_launcher_stack(int B, int M, int C, int N, int nsample,\n    const float *grad_out, const int *idx, const int *idx_batch_cnt, const int *features_batch_cnt, float *grad_features);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/interpolate.cpp",
    "content": "/*\nStacked-batch-data version of point interpolation, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"interpolate_gpu.h\"\n\nextern THCState *state;\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nvoid three_nn_wrapper_stack(at::Tensor unknown_tensor, \n    at::Tensor unknown_batch_cnt_tensor, at::Tensor known_tensor, \n    at::Tensor known_batch_cnt_tensor, at::Tensor dist2_tensor, at::Tensor idx_tensor){\n    // unknown: (N1 + N2 ..., 3)\n    // unknown_batch_cnt: (batch_size), [N1, N2, ...]\n    // known: (M1 + M2 ..., 3)\n    // known_batch_cnt: (batch_size), [M1, M2, ...]\n    // Return:\n    // dist: (N1 + N2 ..., 3)  l2 distance to the three nearest neighbors\n    // idx: (N1 + N2 ..., 3)  index of the three nearest neighbors\n    CHECK_INPUT(unknown_tensor);\n    CHECK_INPUT(unknown_batch_cnt_tensor);\n    CHECK_INPUT(known_tensor);\n    CHECK_INPUT(known_batch_cnt_tensor);\n    CHECK_INPUT(dist2_tensor);\n    CHECK_INPUT(idx_tensor);\n\n    int batch_size = unknown_batch_cnt_tensor.size(0);\n    int N = unknown_tensor.size(0);\n    int M = known_tensor.size(0);\n    const float *unknown = unknown_tensor.data<float>();\n    const int *unknown_batch_cnt = unknown_batch_cnt_tensor.data<int>();\n    const float *known = known_tensor.data<float>();\n    
const int *known_batch_cnt = known_batch_cnt_tensor.data<int>();\n    float *dist2 = dist2_tensor.data<float>();\n    int *idx = idx_tensor.data<int>();\n\n    three_nn_kernel_launcher_stack(batch_size, N, M, unknown, unknown_batch_cnt, known, known_batch_cnt, dist2, idx);\n}\n\n\nvoid three_interpolate_wrapper_stack(at::Tensor features_tensor, \n    at::Tensor idx_tensor, at::Tensor weight_tensor, at::Tensor out_tensor) {\n    // features_tensor: (M1 + M2 ..., C)\n    // idx_tensor: [N1 + N2 ..., 3]\n    // weight_tensor: [N1 + N2 ..., 3]\n    // Return:\n    // out_tensor: (N1 + N2 ..., C)\n    CHECK_INPUT(features_tensor);\n    CHECK_INPUT(idx_tensor);\n    CHECK_INPUT(weight_tensor);\n    CHECK_INPUT(out_tensor);\n\n    int N = out_tensor.size(0);\n    int channels = features_tensor.size(1);\n    const float *features = features_tensor.data<float>();\n    const float *weight = weight_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    float *out = out_tensor.data<float>();\n\n    three_interpolate_kernel_launcher_stack(N, channels, features, idx, weight, out);\n}\n\n\nvoid three_interpolate_grad_wrapper_stack(at::Tensor grad_out_tensor, at::Tensor idx_tensor,\n    at::Tensor weight_tensor, at::Tensor grad_features_tensor) {\n    // grad_out_tensor: (N1 + N2 ..., C)\n    // idx_tensor: [N1 + N2 ..., 3]\n    // weight_tensor: [N1 + N2 ..., 3]\n    // Return:\n    // grad_features_tensor: (M1 + M2 ..., C)\n    CHECK_INPUT(grad_out_tensor);\n    CHECK_INPUT(idx_tensor);\n    CHECK_INPUT(weight_tensor);\n    CHECK_INPUT(grad_features_tensor);\n\n    int N = grad_out_tensor.size(0);\n    int channels = grad_out_tensor.size(1);\n    const float *grad_out = grad_out_tensor.data<float>();\n    const float *weight = weight_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    float *grad_features = grad_features_tensor.data<float>();\n    \n    // printf(\"N=%d, channels=%d\\n\", N, channels);\n    
three_interpolate_grad_kernel_launcher_stack(N, channels, grad_out, idx, weight, grad_features);\n}"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/interpolate_gpu.cu",
    "content": "/*\nStacked-batch-data version of point interpolation, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"cuda_utils.h\"\n#include \"interpolate_gpu.h\"\n\n\n__global__ void three_nn_kernel_stack(int batch_size, int N, int M, const float *unknown, \n    const int *unknown_batch_cnt, const float *known, const int *known_batch_cnt,\n    float *dist2, int *idx) {\n    // unknown: (N1 + N2 ..., 3)\n    // unknown_batch_cnt: (batch_size), [N1, N2, ...]\n    // known: (M1 + M2 ..., 3)\n    // known_batch_cnt: (batch_size), [M1, M2, ...]\n    // Return:\n    // dist: (N1 + N2 ..., 3)  l2 distance to the three nearest neighbors\n    // idx: (N1 + N2 ..., 3)  index of the three nearest neighbors\n\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (pt_idx >= N) return;\n\n    int bs_idx = 0, pt_cnt = unknown_batch_cnt[0];\n    for (int k = 1; k < batch_size; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += unknown_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int cur_num_known_points = known_batch_cnt[bs_idx];\n\n    int known_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) known_batch_start_idx += known_batch_cnt[k];\n\n    known += known_batch_start_idx * 3;\n    unknown += pt_idx * 3;\n    dist2 += pt_idx * 3;\n    idx += pt_idx * 3;\n\n    float ux = unknown[0];\n    float uy = unknown[1];\n    float uz = unknown[2];\n\n    double best1 = 1e40, best2 = 1e40, best3 = 1e40;\n    int besti1 = 0, besti2 = 0, besti3 = 0;\n    for (int k = 0; k < cur_num_known_points; ++k) {\n        float x = known[k * 3 + 0];\n        float y = known[k * 3 + 1];\n        float z = known[k * 3 + 2];\n        float d = (ux - x) * (ux - x) + (uy - y) * (uy - y) + (uz - z) * (uz - z);\n        if (d < best1) {\n            best3 = best2; besti3 = besti2;\n            best2 = 
best1; besti2 = besti1;\n            best1 = d; besti1 = k;\n        } \n        else if (d < best2) {\n            best3 = best2; besti3 = besti2;\n            best2 = d; besti2 = k;\n        } \n        else if (d < best3) {\n            best3 = d; besti3 = k;\n        }\n    }\n    dist2[0] = best1; dist2[1] = best2; dist2[2] = best3;\n    idx[0] = besti1 + known_batch_start_idx; \n    idx[1] = besti2 + known_batch_start_idx; \n    idx[2] = besti3 + known_batch_start_idx;\n}\n\n\nvoid three_nn_kernel_launcher_stack(int batch_size, int N, int M, const float *unknown, \n    const int *unknown_batch_cnt, const float *known, const int *known_batch_cnt,\n    float *dist2, int *idx) {\n    // unknown: (N1 + N2 ..., 3)\n    // unknown_batch_cnt: (batch_size), [N1, N2, ...]\n    // known: (M1 + M2 ..., 3)\n    // known_batch_cnt: (batch_size), [M1, M2, ...]\n    // Return:\n    // dist: (N1 + N2 ..., 3)  l2 distance to the three nearest neighbors\n    // idx: (N1 + N2 ..., 3)  index of the three nearest neighbors\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(N, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    three_nn_kernel_stack<<<blocks, threads>>>(\n        batch_size, N, M, unknown, unknown_batch_cnt, \n        known, known_batch_cnt, dist2, idx\n    );\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n\n\n__global__ void three_interpolate_kernel_stack(int N, int channels, const float *features, \n    const int *idx, const float *weight, float *out) {\n    // features: (M1 + M2 ..., C)\n    // idx: [N1 + N2 ..., 3]\n    // weight: [N1 + N2 ..., 3]\n    // Return:\n    // out: (N1 + N2 ..., C)\n\n    int c_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (pt_idx >= N || c_idx >= channels) return;\n\n    weight += pt_idx * 3;\n    idx += pt_idx * 3;\n   
 out += pt_idx * channels + c_idx;\n\n    out[0] = weight[0] * features[idx[0] * channels + c_idx] + \n        weight[1] * features[idx[1] * channels + c_idx] + \n        weight[2] * features[idx[2] * channels + c_idx];\n}\n\n\n\nvoid three_interpolate_kernel_launcher_stack(int N, int channels,\n    const float *features, const int *idx, const float *weight, float *out) {\n    // features: (M1 + M2 ..., C)\n    // idx: [N1 + N2 ..., 3]\n    // weight: [N1 + N2 ..., 3]\n    // Return:\n    // out: (N1 + N2 ..., C)\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(N, THREADS_PER_BLOCK), channels);\n    dim3 threads(THREADS_PER_BLOCK);\n    three_interpolate_kernel_stack<<<blocks, threads>>>(N, channels, features, idx, weight, out);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n\n__global__ void three_interpolate_grad_kernel_stack(int N, int channels, const float *grad_out, \n    const int *idx, const float *weight, float *grad_features) {\n    // grad_out_tensor: (N1 + N2 ..., C)\n    // idx_tensor: [N1 + N2 ..., 3]\n    // weight_tensor: [N1 + N2 ..., 3]\n    // Return:\n    // grad_features_tensor: (M1 + M2 ..., C)\n\n    int c_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (pt_idx >= N || c_idx >= channels) return;\n\n    grad_out += pt_idx * channels + c_idx;\n    weight += pt_idx * 3;\n    idx += pt_idx * 3;\n    \n    // printf(\"pt_idx=%d, c_idx=%d, idx=(%d, %d, %d), grad_out=%f\\n\", pt_idx, c_idx, idx[0], idx[1], idx[2], grad_out[0]);\n\n    atomicAdd(grad_features + idx[0] * channels + c_idx, grad_out[0] * weight[0]);\n    atomicAdd(grad_features + idx[1] * channels + c_idx, grad_out[0] * weight[1]);\n    atomicAdd(grad_features + idx[2] * channels + c_idx, grad_out[0] * weight[2]);\n}\n\n\nvoid three_interpolate_grad_kernel_launcher_stack(int N, int channels, const float *grad_out, \n    const 
int *idx, const float *weight, float *grad_features) {\n    // grad_out_tensor: (N1 + N2 ..., C)\n    // idx_tensor: [N1 + N2 ..., 3]\n    // weight_tensor: [N1 + N2 ..., 3]\n    // Return:\n    // grad_features_tensor: (M1 + M2 ..., C)\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(N, THREADS_PER_BLOCK), channels);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n    three_interpolate_grad_kernel_stack<<<blocks, threads>>>(\n        N, channels, grad_out, idx, weight, grad_features\n    );\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/interpolate_gpu.h",
    "content": "#ifndef _INTERPOLATE_GPU_H\n#define _INTERPOLATE_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include<vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\n\nvoid three_nn_wrapper_stack(at::Tensor unknown_tensor, \n    at::Tensor unknown_batch_cnt_tensor, at::Tensor known_tensor, \n    at::Tensor known_batch_cnt_tensor, at::Tensor dist2_tensor, at::Tensor idx_tensor);\n\n\nvoid three_interpolate_wrapper_stack(at::Tensor features_tensor, \n    at::Tensor idx_tensor, at::Tensor weight_tensor, at::Tensor out_tensor);\n\n\n\nvoid three_interpolate_grad_wrapper_stack(at::Tensor grad_out_tensor, at::Tensor idx_tensor,\n    at::Tensor weight_tensor, at::Tensor grad_features_tensor);\n\n\nvoid three_nn_kernel_launcher_stack(int batch_size, int N, int M, const float *unknown, \n    const int *unknown_batch_cnt, const float *known, const int *known_batch_cnt,\n    float *dist2, int *idx);\n\n\nvoid three_interpolate_kernel_launcher_stack(int N, int channels,\n    const float *features, const int *idx, const float *weight, float *out);\n\n\n\nvoid three_interpolate_grad_kernel_launcher_stack(int N, int channels, const float *grad_out, \n    const int *idx, const float *weight, float *grad_features);\n\n\n\n#endif"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/pointnet2_api.cpp",
    "content": "#include <torch/serialize/tensor.h>\n#include <torch/extension.h>\n\n#include \"ball_query_gpu.h\"\n#include \"group_points_gpu.h\"\n#include \"sampling_gpu.h\"\n#include \"interpolate_gpu.h\"\n#include \"voxel_query_gpu.h\"\n#include \"ball_query_deform_gpu.h\"\n#include \"vector_pool_gpu.h\"\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n    m.def(\"ball_query_wrapper\", &ball_query_wrapper_stack, \"ball_query_wrapper_stack\");\n    m.def(\"voxel_query_wrapper\", &voxel_query_wrapper_stack, \"voxel_query_wrapper_stack\");\n    m.def(\"ball_query_deform_wrapper\", &ball_query_deform_wrapper_stack, \"ball_query_deform_wrapper_stack\");\n\n    m.def(\"farthest_point_sampling_wrapper\", &farthest_point_sampling_wrapper, \"farthest_point_sampling_wrapper\");\n    m.def(\"stack_farthest_point_sampling_wrapper\", &stack_farthest_point_sampling_wrapper, \"stack_farthest_point_sampling_wrapper\");\n\n    m.def(\"group_points_wrapper\", &group_points_wrapper_stack, \"group_points_wrapper_stack\");\n    m.def(\"group_points_grad_wrapper\", &group_points_grad_wrapper_stack, \"group_points_grad_wrapper_stack\");\n\n    m.def(\"three_nn_wrapper\", &three_nn_wrapper_stack, \"three_nn_wrapper_stack\");\n    m.def(\"three_interpolate_wrapper\", &three_interpolate_wrapper_stack, \"three_interpolate_wrapper_stack\");\n    m.def(\"three_interpolate_grad_wrapper\", &three_interpolate_grad_wrapper_stack, \"three_interpolate_grad_wrapper_stack\");\n\n\n    m.def(\"query_stacked_local_neighbor_idxs_wrapper_stack\", &query_stacked_local_neighbor_idxs_wrapper_stack, \"query_stacked_local_neighbor_idxs_wrapper_stack\");\n    m.def(\"query_three_nn_by_stacked_local_idxs_wrapper_stack\", &query_three_nn_by_stacked_local_idxs_wrapper_stack, \"query_three_nn_by_stacked_local_idxs_wrapper_stack\");\n\n    m.def(\"vector_pool_wrapper\", &vector_pool_wrapper_stack, \"vector_pool_grad_wrapper_stack\");\n    m.def(\"vector_pool_grad_wrapper\", &vector_pool_grad_wrapper_stack, 
\"vector_pool_grad_wrapper_stack\");\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/sampling.cpp",
    "content": "#include <torch/serialize/tensor.h>\n#include <ATen/cuda/CUDAContext.h>\n#include <vector>\n#include <THC/THC.h>\n\n#include \"sampling_gpu.h\"\n\nextern THCState *state;\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nint farthest_point_sampling_wrapper(int b, int n, int m,\n    at::Tensor points_tensor, at::Tensor temp_tensor, at::Tensor idx_tensor) {\n\n    CHECK_INPUT(points_tensor);\n    CHECK_INPUT(temp_tensor);\n    CHECK_INPUT(idx_tensor);\n\n    const float *points = points_tensor.data<float>();\n    float *temp = temp_tensor.data<float>();\n    int *idx = idx_tensor.data<int>();\n\n    farthest_point_sampling_kernel_launcher(b, n, m, points, temp, idx);\n    return 1;\n}\n\n\nint stack_farthest_point_sampling_wrapper(at::Tensor points_tensor,\n  at::Tensor temp_tensor, at::Tensor xyz_batch_cnt_tensor, at::Tensor idx_tensor,\n  at::Tensor num_sampled_points_tensor) {\n\n    CHECK_INPUT(points_tensor);\n    CHECK_INPUT(temp_tensor);\n    CHECK_INPUT(idx_tensor);\n    CHECK_INPUT(xyz_batch_cnt_tensor);\n    CHECK_INPUT(num_sampled_points_tensor);\n\n    int batch_size = xyz_batch_cnt_tensor.size(0);\n    int N = points_tensor.size(0);\n    const float *points = points_tensor.data<float>();\n    float *temp = temp_tensor.data<float>();\n    int *xyz_batch_cnt = xyz_batch_cnt_tensor.data<int>();\n    int *idx = idx_tensor.data<int>();\n    int *num_sampled_points = num_sampled_points_tensor.data<int>();\n\n    stack_farthest_point_sampling_kernel_launcher(N, batch_size, points, temp, xyz_batch_cnt, idx, num_sampled_points);\n    return 1;\n}"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/sampling_gpu.cu",
    "content": "#include <stdio.h>\n#include <stdlib.h>\n\n#include \"cuda_utils.h\"\n#include \"sampling_gpu.h\"\n#define TOTAL_THREADS 1024\n\n\ninline int opt_n_threads(int work_size) {\n    const int pow_2 = std::log(static_cast<double>(work_size)) / std::log(2.0);\n\n    return max(min(1 << pow_2, TOTAL_THREADS), 1);\n}\n\n\n__device__ void __update(float *__restrict__ dists, int *__restrict__ dists_i, int idx1, int idx2){\n    const float v1 = dists[idx1], v2 = dists[idx2];\n    const int i1 = dists_i[idx1], i2 = dists_i[idx2];\n    dists[idx1] = max(v1, v2);\n    dists_i[idx1] = v2 > v1 ? i2 : i1;\n}\n\n\ntemplate <unsigned int block_size>\n__global__ void farthest_point_sampling_kernel(int b, int n, int m,\n    const float *__restrict__ dataset, float *__restrict__ temp, int *__restrict__ idxs) {\n    // dataset: (B, N, 3)\n    // tmp: (B, N)\n    // output:\n    //      idx: (B, M)\n\n    if (m <= 0) return;\n    __shared__ float dists[block_size];\n    __shared__ int dists_i[block_size];\n\n    int batch_index = blockIdx.x;\n    dataset += batch_index * n * 3;\n    temp += batch_index * n;\n    idxs += batch_index * m;\n\n    int tid = threadIdx.x;\n    const int stride = block_size;\n\n    int old = 0;\n    if (threadIdx.x == 0)\n    idxs[0] = old;\n\n    __syncthreads();\n    for (int j = 1; j < m; j++) {\n    int besti = 0;\n    float best = -1;\n    float x1 = dataset[old * 3 + 0];\n    float y1 = dataset[old * 3 + 1];\n    float z1 = dataset[old * 3 + 2];\n    for (int k = tid; k < n; k += stride) {\n        float x2, y2, z2;\n        x2 = dataset[k * 3 + 0];\n        y2 = dataset[k * 3 + 1];\n        z2 = dataset[k * 3 + 2];\n        // float mag = (x2 * x2) + (y2 * y2) + (z2 * z2);\n        // if (mag <= 1e-3)\n        // continue;\n\n        float d = (x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1) + (z2 - z1) * (z2 - z1);\n        float d2 = min(d, temp[k]);\n        temp[k] = d2;\n        besti = d2 > best ? k : besti;\n        best = d2 > best ? 
d2 : best;\n    }\n    dists[tid] = best;\n    dists_i[tid] = besti;\n    __syncthreads();\n\n    if (block_size >= 1024) {\n        if (tid < 512) {\n            __update(dists, dists_i, tid, tid + 512);\n        }\n        __syncthreads();\n    }\n\n    if (block_size >= 512) {\n        if (tid < 256) {\n            __update(dists, dists_i, tid, tid + 256);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 256) {\n        if (tid < 128) {\n            __update(dists, dists_i, tid, tid + 128);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 128) {\n        if (tid < 64) {\n            __update(dists, dists_i, tid, tid + 64);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 64) {\n        if (tid < 32) {\n            __update(dists, dists_i, tid, tid + 32);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 32) {\n        if (tid < 16) {\n            __update(dists, dists_i, tid, tid + 16);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 16) {\n        if (tid < 8) {\n            __update(dists, dists_i, tid, tid + 8);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 8) {\n        if (tid < 4) {\n            __update(dists, dists_i, tid, tid + 4);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 4) {\n        if (tid < 2) {\n            __update(dists, dists_i, tid, tid + 2);\n        }\n        __syncthreads();\n    }\n    if (block_size >= 2) {\n        if (tid < 1) {\n            __update(dists, dists_i, tid, tid + 1);\n        }\n        __syncthreads();\n    }\n\n    old = dists_i[0];\n    if (tid == 0)\n        idxs[j] = old;\n    }\n}\n\nvoid farthest_point_sampling_kernel_launcher(int b, int n, int m,\n    const float *dataset, float *temp, int *idxs) {\n    // dataset: (B, N, 3)\n    // tmp: (B, N)\n    // output:\n    //      idx: (B, M)\n\n    cudaError_t err;\n    unsigned int n_threads = opt_n_threads(n);\n\n    switch 
(n_threads) {\n        case 1024:\n        farthest_point_sampling_kernel<1024><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 512:\n        farthest_point_sampling_kernel<512><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 256:\n        farthest_point_sampling_kernel<256><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 128:\n        farthest_point_sampling_kernel<128><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 64:\n        farthest_point_sampling_kernel<64><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 32:\n        farthest_point_sampling_kernel<32><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 16:\n        farthest_point_sampling_kernel<16><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 8:\n        farthest_point_sampling_kernel<8><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 4:\n        farthest_point_sampling_kernel<4><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 2:\n        farthest_point_sampling_kernel<2><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        case 1:\n        farthest_point_sampling_kernel<1><<<b, n_threads>>>(b, n, m, dataset, temp, idxs); break;\n        default:\n        farthest_point_sampling_kernel<512><<<b, n_threads>>>(b, n, m, dataset, temp, idxs);\n    }\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n\ntemplate <unsigned int block_size>\n__global__ void stack_farthest_point_sampling_kernel(int batch_size, int N,\n    const float *dataset, float *temp, int *xyz_batch_cnt, int *idxs, int *num_sampled_points) {\n    // \"\"\"\n    // Args:\n    //     ctx:\n    //     dataset: (N1 + N2 + ..., 3) where N > npoint\n    //     temp: (N1 + N2 + ...) 
where N > npoint\n    //     xyz_batch_cnt: [N1, N2, ...]\n    //     num_sampled_points: [M1, M2, ...] int, number of features in the sampled set\n\n    // Returns:\n    //     idxs: (npoint.sum()) tensor containing the set,\n    //     npoint: (M1, M2, ...)\n    // \"\"\"\n\n    __shared__ float dists[block_size];\n    __shared__ int dists_i[block_size];\n\n    int bs_idx = blockIdx.x;\n\n    int xyz_batch_start_idx = 0, idxs_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++){\n        xyz_batch_start_idx += xyz_batch_cnt[k];\n        idxs_start_idx += num_sampled_points[k];\n    }\n\n    dataset += xyz_batch_start_idx * 3;\n    temp += xyz_batch_start_idx;\n    idxs += idxs_start_idx;\n\n    int n = xyz_batch_cnt[bs_idx];\n    int m = num_sampled_points[bs_idx];\n\n    int tid = threadIdx.x;\n    const int stride = block_size;\n\n    int old = 0;\n    if (threadIdx.x == 0) idxs[0] = xyz_batch_start_idx;\n\n    __syncthreads();\n    for (int j = 1; j < m; j++) {\n        int besti = 0;\n        float best = -1;\n        float x1 = dataset[old * 3 + 0];\n        float y1 = dataset[old * 3 + 1];\n        float z1 = dataset[old * 3 + 2];\n        for (int k = tid; k < n; k += stride) {\n            float x2, y2, z2;\n            x2 = dataset[k * 3 + 0];\n            y2 = dataset[k * 3 + 1];\n            z2 = dataset[k * 3 + 2];\n            // float mag = (x2 * x2) + (y2 * y2) + (z2 * z2);\n            // if (mag <= 1e-3)\n            // continue;\n\n            float d = (x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1) + (z2 - z1) * (z2 - z1);\n            float d2 = min(d, temp[k]);\n            temp[k] = d2;\n            besti = d2 > best ? k : besti;\n            best = d2 > best ? 
d2 : best;\n        }\n        dists[tid] = best;\n        dists_i[tid] = besti;\n        __syncthreads();\n\n        if (block_size >= 1024) {\n            if (tid < 512) {\n                __update(dists, dists_i, tid, tid + 512);\n            }\n            __syncthreads();\n        }\n\n        if (block_size >= 512) {\n            if (tid < 256) {\n                __update(dists, dists_i, tid, tid + 256);\n            }\n            __syncthreads();\n        }\n        if (block_size >= 256) {\n            if (tid < 128) {\n                __update(dists, dists_i, tid, tid + 128);\n            }\n            __syncthreads();\n        }\n        if (block_size >= 128) {\n            if (tid < 64) {\n                __update(dists, dists_i, tid, tid + 64);\n            }\n            __syncthreads();\n        }\n        if (block_size >= 64) {\n            if (tid < 32) {\n                __update(dists, dists_i, tid, tid + 32);\n            }\n            __syncthreads();\n        }\n        if (block_size >= 32) {\n            if (tid < 16) {\n                __update(dists, dists_i, tid, tid + 16);\n            }\n            __syncthreads();\n        }\n        if (block_size >= 16) {\n            if (tid < 8) {\n                __update(dists, dists_i, tid, tid + 8);\n            }\n            __syncthreads();\n        }\n        if (block_size >= 8) {\n            if (tid < 4) {\n                __update(dists, dists_i, tid, tid + 4);\n            }\n            __syncthreads();\n        }\n        if (block_size >= 4) {\n            if (tid < 2) {\n                __update(dists, dists_i, tid, tid + 2);\n            }\n            __syncthreads();\n        }\n        if (block_size >= 2) {\n            if (tid < 1) {\n                __update(dists, dists_i, tid, tid + 1);\n            }\n            __syncthreads();\n        }\n\n        old = dists_i[0];\n        if (tid == 0)\n            idxs[j] = old + xyz_batch_start_idx;\n    }\n}\n\n\nvoid 
stack_farthest_point_sampling_kernel_launcher(int N, int batch_size,\n    const float *dataset, float *temp, int *xyz_batch_cnt, int *idxs, int *num_sampled_points) {\n    // \"\"\"\n    // Args:\n    //     ctx:\n    //     dataset: (N1 + N2 + ..., 3) where N > npoint\n    //     temp: (N1 + N2 + ...) where N > npoint\n    //     xyz_batch_cnt: [N1, N2, ...]\n    //     num_sampled_points: [M1, M2, ...] int, number of features in the sampled set\n\n    // Returns:\n    //     idxs: (npoint.sum()) tensor containing the set,\n    //     npoint: (M1, M2, ...)\n    // \"\"\"\n\n    cudaError_t err;\n\n    // The kernel is always launched with 1024 threads per block, so no\n    // per-call thread count is computed here.\n    stack_farthest_point_sampling_kernel<1024><<<batch_size, 1024>>>(\n        batch_size, N, dataset, temp, xyz_batch_cnt, idxs, num_sampled_points\n    );\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/sampling_gpu.h",
    "content": "#ifndef _SAMPLING_GPU_H\n#define _SAMPLING_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <ATen/cuda/CUDAContext.h>\n#include<vector>\n\n\nint farthest_point_sampling_wrapper(int b, int n, int m,\n    at::Tensor points_tensor, at::Tensor temp_tensor, at::Tensor idx_tensor);\n\nvoid farthest_point_sampling_kernel_launcher(int b, int n, int m,\n    const float *dataset, float *temp, int *idxs);\n\nint stack_farthest_point_sampling_wrapper(\n    at::Tensor points_tensor, at::Tensor temp_tensor, at::Tensor xyz_batch_cnt_tensor,\n    at::Tensor idx_tensor, at::Tensor num_sampled_points_tensor);\n\n\nvoid stack_farthest_point_sampling_kernel_launcher(int N, int batch_size,\n    const float *dataset, float *temp, int *xyz_batch_cnt, int *idxs, int *num_sampled_points);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/vector_pool.cpp",
    "content": "/*\nVector-pool aggregation based local feature aggregation for point cloud.\nPV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection\nhttps://arxiv.org/abs/2102.00463\n\nWritten by Shaoshuai Shi\nAll Rights Reserved 2020.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"vector_pool_gpu.h\"\n\nextern THCState *state;\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nint query_stacked_local_neighbor_idxs_wrapper_stack(at::Tensor support_xyz_tensor, at::Tensor xyz_batch_cnt_tensor,\n    at::Tensor new_xyz_tensor, at::Tensor new_xyz_batch_cnt_tensor,\n    at::Tensor stack_neighbor_idxs_tensor, at::Tensor start_len_tensor, at::Tensor cumsum_tensor,\n    int avg_length_of_neighbor_idxs, float max_neighbour_distance, int nsample, int neighbor_type){\n    // support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // new_xyz_grid_centers: (M1 + M2 ..., num_total_grids, 3) grids centers of each grid\n    // new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // new_xyz_grid_idxs: (M1 + M2 ..., num_total_grids, 3) three-nn\n    // new_xyz_grid_dist2: (M1 + M2 ..., num_total_grids, 3) square of dist of three-nn\n    // num_grid_x, num_grid_y, num_grid_z: number of grids in each local area centered at new_xyz\n    // nsample: find all (-1), find limited number(>0)\n    // neighbor_type: 1: ball, 
others: cube\n\n    CHECK_INPUT(support_xyz_tensor);\n    CHECK_INPUT(xyz_batch_cnt_tensor);\n    CHECK_INPUT(new_xyz_tensor);\n    CHECK_INPUT(new_xyz_batch_cnt_tensor);\n    CHECK_INPUT(stack_neighbor_idxs_tensor);\n    CHECK_INPUT(start_len_tensor);\n    CHECK_INPUT(cumsum_tensor);\n\n    const float *support_xyz = support_xyz_tensor.data<float>();\n    const int *xyz_batch_cnt = xyz_batch_cnt_tensor.data<int>();\n    const float *new_xyz = new_xyz_tensor.data<float>();\n    const int *new_xyz_batch_cnt = new_xyz_batch_cnt_tensor.data<int>();\n    int *stack_neighbor_idxs = stack_neighbor_idxs_tensor.data<int>();\n    int *start_len = start_len_tensor.data<int>();\n    int *cumsum = cumsum_tensor.data<int>();\n\n    int batch_size = xyz_batch_cnt_tensor.size(0);\n    int M = new_xyz_tensor.size(0);\n\n    query_stacked_local_neighbor_idxs_kernel_launcher_stack(\n        support_xyz, xyz_batch_cnt, new_xyz, new_xyz_batch_cnt,\n        stack_neighbor_idxs, start_len, cumsum, avg_length_of_neighbor_idxs,\n        max_neighbour_distance, batch_size, M, nsample, neighbor_type\n    );\n    return 0;\n}\n\n\nint query_three_nn_by_stacked_local_idxs_wrapper_stack(at::Tensor support_xyz_tensor,\n    at::Tensor new_xyz_tensor, at::Tensor new_xyz_grid_centers_tensor,\n    at::Tensor new_xyz_grid_idxs_tensor, at::Tensor new_xyz_grid_dist2_tensor,\n    at::Tensor stack_neighbor_idxs_tensor, at::Tensor start_len_tensor,\n    int M, int num_total_grids){\n    // support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // new_xyz_grid_centers: (M1 + M2 ..., num_total_grids, 3) grids centers of each grid\n    // new_xyz_grid_idxs: (M1 + M2 ..., num_total_grids, 3) three-nn\n    // new_xyz_grid_dist2: (M1 + M2 ..., num_total_grids, 3) square of dist of three-nn\n    // stack_neighbor_idxs: (max_length_of_neighbor_idxs)\n    // start_len: (M1 + M2, 2)  [start_offset, neighbor_length]\n\n    
CHECK_INPUT(support_xyz_tensor);\n    CHECK_INPUT(new_xyz_tensor);\n    CHECK_INPUT(new_xyz_grid_centers_tensor);\n    CHECK_INPUT(new_xyz_grid_idxs_tensor);\n    CHECK_INPUT(new_xyz_grid_dist2_tensor);\n    CHECK_INPUT(stack_neighbor_idxs_tensor);\n    CHECK_INPUT(start_len_tensor);\n\n    const float *support_xyz = support_xyz_tensor.data<float>();\n    const float *new_xyz = new_xyz_tensor.data<float>();\n    const float *new_xyz_grid_centers = new_xyz_grid_centers_tensor.data<float>();\n    int *new_xyz_grid_idxs = new_xyz_grid_idxs_tensor.data<int>();\n    float *new_xyz_grid_dist2 = new_xyz_grid_dist2_tensor.data<float>();\n    int *stack_neighbor_idxs = stack_neighbor_idxs_tensor.data<int>();\n    int *start_len = start_len_tensor.data<int>();\n\n    query_three_nn_by_stacked_local_idxs_kernel_launcher_stack(\n        support_xyz, new_xyz, new_xyz_grid_centers,\n        new_xyz_grid_idxs, new_xyz_grid_dist2, stack_neighbor_idxs, start_len,\n        M, num_total_grids\n    );\n    return 0;\n}\n\n\nint vector_pool_wrapper_stack(at::Tensor support_xyz_tensor, at::Tensor xyz_batch_cnt_tensor,\n    at::Tensor support_features_tensor, at::Tensor new_xyz_tensor, at::Tensor new_xyz_batch_cnt_tensor,\n    at::Tensor new_features_tensor, at::Tensor new_local_xyz_tensor,\n    at::Tensor point_cnt_of_grid_tensor, at::Tensor grouped_idxs_tensor,\n    int num_grid_x, int num_grid_y, int num_grid_z, float max_neighbour_distance, int use_xyz,\n    int num_max_sum_points, int nsample, int neighbor_type, int pooling_type){\n    // support_xyz_tensor: (N1 + N2 ..., 3) xyz coordinates of the features\n    // support_features_tensor: (N1 + N2 ..., C)\n    // xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // new_xyz_tensor: (M1 + M2 ..., 3) centers of new positions\n    // new_features_tensor: (M1 + M2 ..., C)\n    // new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // point_cnt_of_grid: (M1 + M2 ..., num_total_grids)\n    // grouped_idxs_tensor: (num_max_sum_points, 3)\n    
// num_grid_x, num_grid_y, num_grid_z: number of grids in each local area centered at new_xyz\n    // use_xyz: whether to calculate new_local_xyz\n    // neighbor_type: 1: ball, others: cube\n    // pooling_type: 0: avg_pool, 1: random choice\n\n    CHECK_INPUT(support_xyz_tensor);\n    CHECK_INPUT(support_features_tensor);\n    CHECK_INPUT(xyz_batch_cnt_tensor);\n    CHECK_INPUT(new_xyz_tensor);\n    CHECK_INPUT(new_xyz_batch_cnt_tensor);\n    CHECK_INPUT(new_features_tensor);\n    CHECK_INPUT(new_local_xyz_tensor);\n    CHECK_INPUT(point_cnt_of_grid_tensor);\n    CHECK_INPUT(grouped_idxs_tensor);\n\n    const float *support_xyz = support_xyz_tensor.data<float>();\n    const float *support_features = support_features_tensor.data<float>();\n    const int *xyz_batch_cnt = xyz_batch_cnt_tensor.data<int>();\n    const float *new_xyz = new_xyz_tensor.data<float>();\n    const int *new_xyz_batch_cnt = new_xyz_batch_cnt_tensor.data<int>();\n    float *new_features = new_features_tensor.data<float>();\n    float *new_local_xyz = new_local_xyz_tensor.data<float>();\n    int *point_cnt_of_grid = point_cnt_of_grid_tensor.data<int>();\n    int *grouped_idxs = grouped_idxs_tensor.data<int>();\n\n    int N = support_xyz_tensor.size(0);\n    int batch_size = xyz_batch_cnt_tensor.size(0);\n    int M = new_xyz_tensor.size(0);\n    int num_c_out = new_features_tensor.size(1);\n    int num_c_in = support_features_tensor.size(1);\n    int num_total_grids = point_cnt_of_grid_tensor.size(1);\n\n    int cum_sum = vector_pool_kernel_launcher_stack(\n        support_xyz, support_features, xyz_batch_cnt,\n        new_xyz, new_features, new_local_xyz, new_xyz_batch_cnt,\n        point_cnt_of_grid, grouped_idxs,\n        num_grid_x, num_grid_y, num_grid_z, max_neighbour_distance,\n        batch_size, N, M, num_c_in, num_c_out, num_total_grids, use_xyz, num_max_sum_points, nsample, neighbor_type, pooling_type\n    );\n    return cum_sum;\n}\n\n\nint vector_pool_grad_wrapper_stack(at::Tensor 
grad_new_features_tensor,\n    at::Tensor point_cnt_of_grid_tensor, at::Tensor grouped_idxs_tensor,\n    at::Tensor grad_support_features_tensor) {\n    // grad_new_features_tensor: (M1 + M2 ..., C_out)\n    // point_cnt_of_grid_tensor: (M1 + M2 ..., num_total_grids)\n    // grouped_idxs_tensor: (num_max_sum_points, 3) [idx of support_xyz, idx of new_xyz, idx of grid_idx in new_xyz]\n    // grad_support_features_tensor: (N1 + N2 ..., C_in)\n\n    CHECK_INPUT(grad_new_features_tensor);\n    CHECK_INPUT(point_cnt_of_grid_tensor);\n    CHECK_INPUT(grouped_idxs_tensor);\n    CHECK_INPUT(grad_support_features_tensor);\n\n    int M = grad_new_features_tensor.size(0);\n    int num_c_out = grad_new_features_tensor.size(1);\n    int N = grad_support_features_tensor.size(0);\n    int num_c_in = grad_support_features_tensor.size(1);\n    int num_total_grids = point_cnt_of_grid_tensor.size(1);\n    int num_max_sum_points = grouped_idxs_tensor.size(0);\n\n    const float *grad_new_features = grad_new_features_tensor.data<float>();\n    const int *point_cnt_of_grid = point_cnt_of_grid_tensor.data<int>();\n    const int *grouped_idxs = grouped_idxs_tensor.data<int>();\n    float *grad_support_features = grad_support_features_tensor.data<float>();\n\n    vector_pool_grad_kernel_launcher_stack(\n        grad_new_features, point_cnt_of_grid, grouped_idxs, grad_support_features,\n        N, M, num_c_out, num_c_in, num_total_grids, num_max_sum_points\n    );\n    return 1;\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/vector_pool_gpu.cu",
    "content": "/*\nVector-pool aggregation based local feature aggregation for point cloud.\nPV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection\nhttps://arxiv.org/abs/2102.00463\n\nWritten by Shaoshuai Shi\nAll Rights Reserved 2020.\n*/\n\n\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"vector_pool_gpu.h\"\n#include \"cuda_utils.h\"\n\n\n__global__ void query_three_nn_by_stacked_local_idxs_kernel(\n    const float *support_xyz, const float *new_xyz, const float *new_xyz_grid_centers,\n    int *new_xyz_grid_idxs, float *new_xyz_grid_dist2,\n    const int *stack_neighbor_idxs, const int *start_len,\n    int M, int num_total_grids){\n    // support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // new_xyz_grid_centers: (M1 + M2 ..., num_total_grids, 3) grids centers of each grid\n    // new_xyz_grid_idxs: (M1 + M2 ..., num_total_grids, 3) three-nn\n    // new_xyz_grid_dist2: (M1 + M2 ..., num_total_grids, 3) square of dist of three-nn\n    // stack_neighbor_idxs: (max_length_of_neighbor_idxs)\n    // start_len: (M1 + M2, 2)  [start_offset, neighbor_length]\n\n    int grid_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n\n    if (pt_idx >= M || grid_idx >= num_total_grids) return;\n\n    new_xyz += pt_idx * 3;\n    new_xyz_grid_centers += pt_idx * num_total_grids * 3 + grid_idx * 3;\n    new_xyz_grid_idxs += pt_idx * num_total_grids * 3 + grid_idx * 3;\n    new_xyz_grid_dist2 += pt_idx * num_total_grids * 3 + grid_idx * 3;\n\n    start_len += pt_idx * 2;\n    stack_neighbor_idxs += start_len[0];\n    int neighbor_length = start_len[1];\n\n    float center_x = new_xyz_grid_centers[0];\n    float center_y = new_xyz_grid_centers[1];\n    float center_z = new_xyz_grid_centers[2];\n\n    double best1 = 1e40, best2 = 1e40, best3 = 1e40;\n    int besti1 = -1, besti2 = -1, besti3 = -1;\n    for (int 
k = 0; k < neighbor_length; k++){\n        int cur_neighbor_idx = stack_neighbor_idxs[k];\n\n        float x = support_xyz[cur_neighbor_idx * 3 + 0];\n        float y = support_xyz[cur_neighbor_idx * 3 + 1];\n        float z = support_xyz[cur_neighbor_idx * 3 + 2];\n\n        float d = (center_x - x) * (center_x - x) + (center_y - y) * (center_y - y) + (center_z - z) * (center_z - z);\n\n        if (d < best1) {\n            best3 = best2; besti3 = besti2;\n            best2 = best1; besti2 = besti1;\n            best1 = d; besti1 = cur_neighbor_idx;\n        }\n        else if (d < best2) {\n            best3 = best2; besti3 = besti2;\n            best2 = d; besti2 = cur_neighbor_idx;\n        }\n        else if (d < best3) {\n            best3 = d; besti3 = cur_neighbor_idx;\n        }\n    }\n    if (besti2 == -1){\n        besti2 = besti1; best2 = best1;\n    }\n    if (besti3 == -1){\n        besti3 = besti1; best3 = best1;\n    }\n    new_xyz_grid_dist2[0] = best1;\n    new_xyz_grid_dist2[1] = best2;\n    new_xyz_grid_dist2[2] = best3;\n    new_xyz_grid_idxs[0] = besti1;\n    new_xyz_grid_idxs[1] = besti2;\n    new_xyz_grid_idxs[2] = besti3;\n}\n\n\nint query_three_nn_by_stacked_local_idxs_kernel_launcher_stack(\n    const float *support_xyz, const float *new_xyz, const float *new_xyz_grid_centers,\n    int *new_xyz_grid_idxs, float *new_xyz_grid_dist2,\n    const int *stack_neighbor_idxs, const int *start_len,\n    int M, int num_total_grids){\n    // support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // new_xyz_grid_centers: (M1 + M2 ..., num_total_grids, 3) grids centers of each grid\n    // new_xyz_grid_idxs: (M1 + M2 ..., num_total_grids, 3) three-nn\n    // new_xyz_grid_dist2: (M1 + M2 ..., num_total_grids, 3) square of dist of three-nn\n    // stack_neighbor_idxs: (max_length_of_neighbor_idxs)\n    // start_len: (M1 + M2, 2)  [start_offset, neighbor_length]\n\n    cudaError_t 
err;\n    dim3 blocks(DIVUP(M, THREADS_PER_BLOCK), num_total_grids);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    query_three_nn_by_stacked_local_idxs_kernel<<<blocks, threads>>>(\n        support_xyz, new_xyz, new_xyz_grid_centers,\n        new_xyz_grid_idxs, new_xyz_grid_dist2, stack_neighbor_idxs, start_len,\n        M, num_total_grids\n    );\n\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n    return 0;\n}\n\n\n__global__ void query_stacked_local_neighbor_idxs_kernel(\n    const float *support_xyz, const int *xyz_batch_cnt, const float *new_xyz, const int *new_xyz_batch_cnt,\n    int *stack_neighbor_idxs, int *start_len, int *cumsum, int avg_length_of_neighbor_idxs,\n    float max_neighbour_distance, int batch_size, int M, int nsample, int neighbor_type){\n    // support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // stack_neighbor_idxs: (max_length_of_neighbor_idxs)\n    // start_len: (M1 + M2, 2)  [start_offset, neighbor_length]\n    // cumsum: (1), max offset of current data in stack_neighbor_idxs\n    // max_neighbour_distance: float\n    // nsample: find all (-1), find limited number(>0)\n    // neighbor_type: 1: ball, others: cube\n\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (pt_idx >= M) return;\n\n    int bs_idx = 0, pt_cnt = new_xyz_batch_cnt[0];\n    for (int k = 1; k < batch_size; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += new_xyz_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int xyz_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) xyz_batch_start_idx += xyz_batch_cnt[k];\n\n    
support_xyz += xyz_batch_start_idx * 3;\n    new_xyz += pt_idx * 3;\n    start_len += pt_idx * 2;\n\n    float new_x = new_xyz[0];\n    float new_y = new_xyz[1];\n    float new_z = new_xyz[2];\n    int n = xyz_batch_cnt[bs_idx];\n\n    float local_x, local_y, local_z;\n    float radius2 = max_neighbour_distance * max_neighbour_distance;\n\n    int temp_idxs[1000];\n\n    int sample_cnt = 0;\n    for (int k = 0; k < n; ++k) {\n        local_x = support_xyz[k * 3 + 0] - new_x;\n        local_y = support_xyz[k * 3 + 1] - new_y;\n        local_z = support_xyz[k * 3 + 2] - new_z;\n\n        if (neighbor_type == 1){\n            // ball\n            if (local_x * local_x + local_y * local_y + local_z * local_z > radius2){\n                continue;\n            }\n        }\n        else{\n            // voxel\n            if ((fabs(local_x) > max_neighbour_distance) |\n                (fabs(local_y) > max_neighbour_distance) |\n                (fabs(local_z) > max_neighbour_distance)){\n                continue;\n            }\n        }\n        if (sample_cnt < 1000){\n            temp_idxs[sample_cnt] = k;\n        }\n        else{\n            break;\n        }\n        sample_cnt++;\n        if (nsample > 0 && sample_cnt >= nsample) break;\n    }\n    start_len[0] = atomicAdd(cumsum, sample_cnt);\n    start_len[1] = sample_cnt;\n\n    int max_thresh = avg_length_of_neighbor_idxs * M;\n    if (start_len[0] >= max_thresh) return;\n\n    stack_neighbor_idxs += start_len[0];\n    if (start_len[0] + sample_cnt >= max_thresh) sample_cnt = max_thresh - start_len[0];\n\n    for (int k = 0; k < sample_cnt; k++){\n        stack_neighbor_idxs[k] = temp_idxs[k] + xyz_batch_start_idx;\n    }\n}\n\n\nint query_stacked_local_neighbor_idxs_kernel_launcher_stack(\n    const float *support_xyz, const int *xyz_batch_cnt, const float *new_xyz, const int *new_xyz_batch_cnt,\n    int *stack_neighbor_idxs, int *start_len, int *cumsum, int avg_length_of_neighbor_idxs,\n    float 
max_neighbour_distance, int batch_size, int M, int nsample, int neighbor_type){\n    // support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // stack_neighbor_idxs: (max_length_of_neighbor_idxs)\n    // start_len: (M1 + M2, 2)  [start_offset, neighbor_length]\n    // cumsum: (1), max offset of current data in stack_neighbor_idxs\n    // max_neighbour_distance: float\n    // nsample: find all (-1), find limited number(>0)\n    // neighbor_type: 1: ball, others: cube\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(M, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    query_stacked_local_neighbor_idxs_kernel<<<blocks, threads>>>(\n        support_xyz, xyz_batch_cnt, new_xyz, new_xyz_batch_cnt,\n        stack_neighbor_idxs, start_len, cumsum, avg_length_of_neighbor_idxs,\n        max_neighbour_distance, batch_size, M, nsample, neighbor_type\n    );\n\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n    return 0;\n}\n\n\n__global__ void vector_pool_kernel_stack(\n    const float *support_xyz, const float *support_features, const int *xyz_batch_cnt,\n    const float *new_xyz, float *new_features, float *new_local_xyz, const int *new_xyz_batch_cnt,\n    int num_grid_x, int num_grid_y, int num_grid_z, float max_neighbour_distance,\n    int batch_size, int M, int num_c_in, int num_c_out,\n    int num_c_each_grid, int num_total_grids, int *point_cnt_of_grid, int *grouped_idxs,\n    int use_xyz, float grid_size_x, float grid_size_y,\n    float grid_size_z, int *cum_sum, int num_max_sum_points, int nsample, int neighbor_type, int pooling_type){\n    // 
support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // support_features: (N1 + N2 ..., C)\n    // xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // new_features: (M1 + M2 ..., C), C = num_total_grids * num_c_each_grid\n    // new_local_xyz: (M1 + M2 ..., 3 * num_total_grids)\n    // new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // num_grid_x, num_grid_y, num_grid_z: number of grids in each local area centered at new_xyz\n    // point_cnt_of_grid: (M1 + M2 ..., num_total_grids)\n    // grouped_idxs: (num_max_sum_points, 3)[idx of support_xyz, idx of new_xyz, idx of grid_idx in new_xyz]\n    // use_xyz: whether to calculate new_local_xyz\n    // neighbor_type: 1: ball, others: cube\n    // pooling_type: 0: avg_pool, 1: random choice\n\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (pt_idx >= M) return;\n\n    int bs_idx = 0, pt_cnt = new_xyz_batch_cnt[0];\n    for (int k = 1; k < batch_size; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += new_xyz_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int xyz_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) xyz_batch_start_idx += xyz_batch_cnt[k];\n\n    support_xyz += xyz_batch_start_idx * 3;\n    support_features += xyz_batch_start_idx * num_c_in;\n\n    new_xyz += pt_idx * 3;\n    new_features += pt_idx * num_c_out;\n    point_cnt_of_grid += pt_idx * num_total_grids;\n    new_local_xyz += pt_idx * 3 * num_total_grids;\n\n    float new_x = new_xyz[0];\n    float new_y = new_xyz[1];\n    float new_z = new_xyz[2];\n    int n = xyz_batch_cnt[bs_idx], grid_idx_x, grid_idx_y, grid_idx_z, grid_idx;\n    float local_x, local_y, local_z;\n    float radius2 = max_neighbour_distance * max_neighbour_distance;\n\n    int sample_cnt = 0;\n    for (int k = 0; k < n; ++k) {\n        local_x = support_xyz[k * 3 + 0] - new_x;\n        local_y = support_xyz[k * 3 + 1] - new_y;\n        local_z = support_xyz[k * 3 + 2] 
- new_z;\n\n        if (neighbor_type == 1){\n            // ball\n            if (local_x * local_x + local_y * local_y + local_z * local_z > radius2){\n                continue;\n            }\n        }\n        else{\n            // voxel\n            if ((fabs(local_x) > max_neighbour_distance) |\n                (fabs(local_y) > max_neighbour_distance) |\n                (fabs(local_z) > max_neighbour_distance)){\n                continue;\n            }\n        }\n\n        grid_idx_x = floorf((local_x + max_neighbour_distance) / grid_size_x);\n        grid_idx_y = floorf((local_y + max_neighbour_distance) / grid_size_y);\n        grid_idx_z = floorf((local_z + max_neighbour_distance) / grid_size_z);\n        grid_idx = grid_idx_x * num_grid_y * num_grid_z + grid_idx_y * num_grid_z + grid_idx_z;\n        grid_idx = min(max(grid_idx, 0), num_total_grids - 1);\n\n        if (pooling_type == 0){\n            // avg pooling\n            point_cnt_of_grid[grid_idx] ++;\n\n            for (int i = 0; i < num_c_in; i++){\n                new_features[grid_idx * num_c_each_grid + i % num_c_each_grid] += support_features[k * num_c_in + i];\n            }\n            if (use_xyz){\n                new_local_xyz[grid_idx * 3 + 0] += local_x;\n                new_local_xyz[grid_idx * 3 + 1] += local_y;\n                new_local_xyz[grid_idx * 3 + 2] += local_z;\n            }\n\n            int cnt = atomicAdd(cum_sum, 1);\n            if (cnt >= num_max_sum_points) continue;  // keep counting so cum_sum still reports the total number of points\n\n            grouped_idxs[cnt * 3 + 0] = xyz_batch_start_idx + k;\n            grouped_idxs[cnt * 3 + 1] = pt_idx;\n            grouped_idxs[cnt * 3 + 2] = grid_idx;\n\n            sample_cnt++;\n            if(nsample > 0 && sample_cnt >= nsample) break;\n        }\n        else if (pooling_type == 1){\n            // randomly choose one point within each sub-voxel\n            // printf(\"new_xyz=(%.2f, %.2f, %.2f, ), find neighbor k=%d: 
support_xyz=(%.2f, %.2f, %.2f), local_xyz=(%.2f, %.2f, %.2f), neighbor=%.2f, grid_idx=%d, point_cnt_of_grid_idx=%d\\n\",\n            // new_x, new_y, new_z, k, support_xyz[k * 3 + 0], support_xyz[k * 3 + 1], support_xyz[k * 3 + 2], local_x, local_y, local_z, max_neighbour_distance, grid_idx, point_cnt_of_grid[grid_idx]);\n\n            if (point_cnt_of_grid[grid_idx] == 0){\n                point_cnt_of_grid[grid_idx] ++;\n                for (int i = 0; i < num_c_in; i++){\n                    new_features[grid_idx * num_c_each_grid + i % num_c_each_grid] = support_features[k * num_c_in + i];\n                }\n                if (use_xyz){\n                    new_local_xyz[grid_idx * 3 + 0] = local_x;\n                    new_local_xyz[grid_idx * 3 + 1] = local_y;\n                    new_local_xyz[grid_idx * 3 + 2] = local_z;\n                }\n\n                int cnt = atomicAdd(cum_sum, 1);\n                if (cnt >= num_max_sum_points) continue;  // keep counting so cum_sum still reports the total number of points\n\n                grouped_idxs[cnt * 3 + 0] = xyz_batch_start_idx + k;\n                grouped_idxs[cnt * 3 + 1] = pt_idx;\n                grouped_idxs[cnt * 3 + 2] = grid_idx;\n\n                sample_cnt++;\n                if((nsample > 0 && sample_cnt >= nsample) || sample_cnt >= num_total_grids) break;\n            }\n\n        }\n\n    }\n}\n\n\nint vector_pool_kernel_launcher_stack(\n    const float *support_xyz, const float *support_features, const int *xyz_batch_cnt,\n    const float *new_xyz, float *new_features, float *new_local_xyz, const int *new_xyz_batch_cnt,\n    int *point_cnt_of_grid, int *grouped_idxs,\n    int num_grid_x, int num_grid_y, int num_grid_z, float max_neighbour_distance,\n    int batch_size, int N, int M, int num_c_in, int num_c_out, int num_total_grids,\n    int use_xyz, int num_max_sum_points, int nsample, int neighbor_type, int pooling_type){\n    // support_xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n    // 
support_features: (N1 + N2 ..., C)\n    // xyz_batch_cnt: (batch_size), [N1, N2, ...]\n    // new_xyz: (M1 + M2 ..., 3) centers of the ball query\n    // new_features: (M1 + M2 ..., C)\n    // new_local_xyz: (M1 + M2 ..., 3 * num_total_grids)\n    // new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n    // num_grid_x, num_grid_y, num_grid_z: number of grids in each local area centered at new_xyz\n    // use_xyz: whether to calculate new_local_xyz\n    // grouped_idxs: (num_max_sum_points, 3)[idx of support_xyz, idx of new_xyz, idx of grid_idx in new_xyz]\n    // neighbor_type: 1: ball, others: cube\n    // pooling_type: 0: avg_pool, 1: random choice\n\n\n    cudaError_t err;\n    int num_c_each_grid = num_c_out / num_total_grids;\n    float grid_size_x = max_neighbour_distance * 2 / num_grid_x;\n    float grid_size_y = max_neighbour_distance * 2 / num_grid_y;\n    float grid_size_z = max_neighbour_distance * 2 / num_grid_z;\n\n    dim3 blocks(DIVUP(M, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    int cum_sum = 0;\n    int *p_cum_sum;\n    cudaMalloc((void**)&p_cum_sum, sizeof(int));\n    cudaMemcpy(p_cum_sum, &cum_sum, sizeof(int), cudaMemcpyHostToDevice);\n\n    vector_pool_kernel_stack<<<blocks, threads>>>(\n        support_xyz, support_features, xyz_batch_cnt,\n        new_xyz, new_features, new_local_xyz, new_xyz_batch_cnt,\n        num_grid_x, num_grid_y, num_grid_z, max_neighbour_distance,\n        batch_size, M, num_c_in, num_c_out,\n        num_c_each_grid, num_total_grids, point_cnt_of_grid, grouped_idxs,\n        use_xyz, grid_size_x, grid_size_y, grid_size_z, p_cum_sum, num_max_sum_points,\n        nsample, neighbor_type, pooling_type\n    );\n\n    cudaMemcpy(&cum_sum, p_cum_sum, sizeof(int), cudaMemcpyDeviceToHost);\n    cudaFree(p_cum_sum);  // release the device-side counter so it is not leaked on every call\n\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", 
cudaGetErrorString(err));\n        exit(-1);\n    }\n    return cum_sum;\n}\n\n\n__global__ void vector_pool_grad_kernel_stack(const float *grad_new_features,\n    const int *point_cnt_of_grid, const int *grouped_idxs,\n    float *grad_support_features, int N, int M, int num_c_out, int num_c_in,\n    int num_c_each_grid, int num_total_grids, int num_max_sum_points){\n    // grad_new_features: (M1 + M2 ..., C_out)\n    // point_cnt_of_grid: (M1 + M2 ..., num_total_grids)\n    // grouped_idxs: (num_max_sum_points, 3) [idx of support_xyz, idx of new_xyz, idx of grid_idx in new_xyz]\n    // grad_support_features: (N1 + N2 ..., C_in)\n\n    int channel_idx = blockIdx.y;\n    int index = blockIdx.x * blockDim.x + threadIdx.x;\n\n    if (index >= num_max_sum_points || channel_idx >= num_c_in) return;\n\n    int idx_of_support_xyz = grouped_idxs[index * 3 + 0];\n    int idx_of_new_xyz = grouped_idxs[index * 3 + 1];\n    int idx_of_grid_idx = grouped_idxs[index * 3 + 2];\n\n    int num_total_pts = point_cnt_of_grid[idx_of_new_xyz * num_total_grids + idx_of_grid_idx];\n    grad_support_features += idx_of_support_xyz * num_c_in + channel_idx;\n\n    grad_new_features += idx_of_new_xyz * num_c_out + idx_of_grid_idx * num_c_each_grid;\n    int channel_idx_of_cin = channel_idx % num_c_each_grid;\n    float cur_grad = 1 / fmaxf(float(num_total_pts), 1.0);\n    atomicAdd(grad_support_features, grad_new_features[channel_idx_of_cin] * cur_grad);\n}\n\n\nvoid vector_pool_grad_kernel_launcher_stack(\n    const float *grad_new_features, const int *point_cnt_of_grid, const int *grouped_idxs,\n    float *grad_support_features, int N, int M, int num_c_out, int num_c_in, int num_total_grids,\n    int num_max_sum_points){\n    // grad_new_features: (M1 + M2 ..., C_out)\n    // point_cnt_of_grid: (M1 + M2 ..., num_total_grids)\n    // grouped_idxs: (num_max_sum_points, 3) [idx of support_xyz, idx of new_xyz, idx of grid_idx in new_xyz]\n    // grad_support_features: (N1 + N2 ..., C_in)\n    
int num_c_each_grid = num_c_out / num_total_grids;\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_max_sum_points, THREADS_PER_BLOCK), num_c_in);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    vector_pool_grad_kernel_stack<<<blocks, threads>>>(\n        grad_new_features, point_cnt_of_grid, grouped_idxs, grad_support_features,\n        N, M, num_c_out, num_c_in, num_c_each_grid, num_total_grids, num_max_sum_points\n    );\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/vector_pool_gpu.h",
    "content": "/*\nVector-pool aggregation based local feature aggregation for point cloud.\nPV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection\nhttps://arxiv.org/abs/2102.00463\n\nWritten by Shaoshuai Shi\nAll Rights Reserved 2020.\n*/\n\n\n#ifndef _STACK_VECTOR_POOL_GPU_H\n#define _STACK_VECTOR_POOL_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\n\nint query_stacked_local_neighbor_idxs_kernel_launcher_stack(\n    const float *support_xyz, const int *xyz_batch_cnt, const float *new_xyz, const int *new_xyz_batch_cnt,\n    int *stack_neighbor_idxs, int *start_len, int *cumsum, int avg_length_of_neighbor_idxs,\n    float max_neighbour_distance, int batch_size, int M, int nsample, int neighbor_type);\n\nint query_stacked_local_neighbor_idxs_wrapper_stack(at::Tensor support_xyz_tensor, at::Tensor xyz_batch_cnt_tensor,\n    at::Tensor new_xyz_tensor, at::Tensor new_xyz_batch_cnt_tensor,\n    at::Tensor stack_neighbor_idxs_tensor, at::Tensor start_len_tensor, at::Tensor cumsum_tensor,\n    int avg_length_of_neighbor_idxs, float max_neighbour_distance, int nsample, int neighbor_type);\n\n\nint query_three_nn_by_stacked_local_idxs_kernel_launcher_stack(\n    const float *support_xyz, const float *new_xyz, const float *new_xyz_grid_centers,\n    int *new_xyz_grid_idxs, float *new_xyz_grid_dist2,\n    const int *stack_neighbor_idxs, const int *start_len,\n    int M, int num_total_grids);\n\nint query_three_nn_by_stacked_local_idxs_wrapper_stack(at::Tensor support_xyz_tensor,\n    at::Tensor new_xyz_tensor, at::Tensor new_xyz_grid_centers_tensor,\n    at::Tensor new_xyz_grid_idxs_tensor, at::Tensor new_xyz_grid_dist2_tensor,\n    at::Tensor stack_neighbor_idxs_tensor, at::Tensor start_len_tensor,\n    int M, int num_total_grids);\n\n\nint vector_pool_wrapper_stack(at::Tensor support_xyz_tensor, at::Tensor xyz_batch_cnt_tensor,\n    at::Tensor 
support_features_tensor, at::Tensor new_xyz_tensor, at::Tensor new_xyz_batch_cnt_tensor,\n    at::Tensor new_features_tensor, at::Tensor new_local_xyz,\n    at::Tensor point_cnt_of_grid_tensor, at::Tensor grouped_idxs_tensor,\n    int num_grid_x, int num_grid_y, int num_grid_z, float max_neighbour_distance, int use_xyz,\n    int num_max_sum_points, int nsample, int neighbor_type, int pooling_type);\n\n\nint vector_pool_kernel_launcher_stack(\n    const float *support_xyz, const float *support_features, const int *xyz_batch_cnt,\n    const float *new_xyz, float *new_features, float * new_local_xyz, const int *new_xyz_batch_cnt,\n    int *point_cnt_of_grid, int *grouped_idxs,\n    int num_grid_x, int num_grid_y, int num_grid_z, float max_neighbour_distance,\n    int batch_size, int N, int M, int num_c_in, int num_c_out, int num_total_grids, int use_xyz,\n    int num_max_sum_points, int nsample, int neighbor_type, int pooling_type);\n\n\nint vector_pool_grad_wrapper_stack(at::Tensor grad_new_features_tensor,\n    at::Tensor point_cnt_of_grid_tensor, at::Tensor grouped_idxs_tensor,\n    at::Tensor grad_support_features_tensor);\n\n\nvoid vector_pool_grad_kernel_launcher_stack(\n    const float *grad_new_features, const int *point_cnt_of_grid, const int *grouped_idxs,\n    float *grad_support_features, int N, int M, int num_c_out, int num_c_in, int num_total_grids,\n    int num_max_sum_points);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/voxel_query.cpp",
    "content": "#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"voxel_query_gpu.h\"\n\nextern THCState *state;\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nint voxel_query_wrapper_stack(int M, int R1, int R2, int R3, int nsample, float radius, \n    int z_range, int y_range, int x_range, at::Tensor new_xyz_tensor, at::Tensor xyz_tensor, \n    at::Tensor new_coords_tensor, at::Tensor point_indices_tensor, at::Tensor idx_tensor) {\n    CHECK_INPUT(new_coords_tensor);\n    CHECK_INPUT(point_indices_tensor);\n    CHECK_INPUT(new_xyz_tensor);\n    CHECK_INPUT(xyz_tensor);\n    \n    const float *new_xyz = new_xyz_tensor.data<float>();\n    const float *xyz = xyz_tensor.data<float>();\n    const int *new_coords = new_coords_tensor.data<int>();\n    const int *point_indices = point_indices_tensor.data<int>();\n    int *idx = idx_tensor.data<int>();\n\n    voxel_query_kernel_launcher_stack(M, R1, R2, R3, nsample, radius, z_range, y_range, x_range, new_xyz, xyz, new_coords, point_indices, idx);\n    return 1;\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/voxel_query_gpu.cu",
    "content": "#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <curand_kernel.h>\n\n#include \"voxel_query_gpu.h\"\n#include \"cuda_utils.h\"\n\n\n__global__ void voxel_query_kernel_stack(int M, int R1, int R2, int R3, int nsample, \n            float radius, int z_range, int y_range, int x_range, const float *new_xyz, \n            const float *xyz, const int *new_coords, const int *point_indices, int *idx) {\n    // :param new_coords: (M1 + M2 ..., 4) centers of the ball query\n    // :param point_indices: (B, Z, Y, X)\n    // output:\n    //      idx: (M1 + M2, nsample)\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (pt_idx >= M) return;\n    \n    new_xyz += pt_idx * 3;\n    new_coords += pt_idx * 4;\n    idx += pt_idx * nsample;\n\n    curandState state;\n    curand_init(pt_idx, 0, 0, &state);\n    \n    float radius2 = radius * radius;\n    float new_x = new_xyz[0];\n    float new_y = new_xyz[1];\n    float new_z = new_xyz[2];\n\n    int batch_idx = new_coords[0];\n    int new_coords_z = new_coords[1];\n    int new_coords_y = new_coords[2];\n    int new_coords_x = new_coords[3];\n    \n    int cnt = 0;\n    int cnt2 = 0;\n    // for (int dz = -1*z_range; dz <= z_range; ++dz) {\n    for (int dz = -1*z_range; dz <= z_range; ++dz) {\n        int z_coord = new_coords_z + dz;\n        if (z_coord < 0 || z_coord >= R1) continue;\n\n        for (int dy = -1*y_range; dy <= y_range; ++dy) {\n            int y_coord = new_coords_y + dy;\n            if (y_coord < 0 || y_coord >= R2) continue;\n\n            for (int dx = -1*x_range; dx <= x_range; ++dx) {\n                int x_coord = new_coords_x + dx;\n                if (x_coord < 0 || x_coord >= R3) continue;\n\n                int index = batch_idx * R1 * R2 * R3 + \\\n                            z_coord * R2 * R3 + \\\n                            y_coord * R3 + \\\n                            x_coord;\n                int neighbor_idx = point_indices[index];\n              
  if (neighbor_idx < 0) continue;\n                \n                float x_per = xyz[neighbor_idx*3 + 0];\n                float y_per = xyz[neighbor_idx*3 + 1];\n                float z_per = xyz[neighbor_idx*3 + 2];\n\n                float dist2 = (x_per - new_x) * (x_per - new_x) + (y_per - new_y) * (y_per - new_y) + (z_per - new_z) * (z_per - new_z);\n\n                if (dist2 > radius2) continue;\n                \n                ++cnt2;\n\n                if (cnt < nsample) {\n                    if (cnt == 0) {\n                        for (int l = 0; l < nsample; ++l) {\n                            idx[l] = neighbor_idx;\n                        }\n                    }\n                    idx[cnt] = neighbor_idx;\n                    ++cnt;\n                }\n                // else {\n                //     float rnd = curand_uniform(&state);\n                //     if (rnd < (float(nsample) / cnt2)) {\n                //         int insertidx = ceilf(curand_uniform(&state) * nsample) - 1;\n                //         idx[insertidx] = neighbor_idx;\n                //     }\n                // }\n            }\n        }\n    }\n   if (cnt == 0) idx[0] = -1;\n}\n\n\nvoid voxel_query_kernel_launcher_stack(int M, int R1, int R2, int R3, int nsample,\n    float radius, int z_range, int y_range, int x_range, const float *new_xyz, \n    const float *xyz, const int *new_coords, const int *point_indices, int *idx) {\n    // :param new_coords: (M1 + M2 ..., 4) centers of the voxel query\n    // :param point_indices: (B, Z, Y, X) \n    // output:\n    //      idx: (M1 + M2, nsample)\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(M, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    voxel_query_kernel_stack<<<blocks, threads>>>(M, R1, R2, R3, nsample, radius, z_range, y_range, x_range, new_xyz, xyz, new_coords, point_indices, idx);\n    // cudaDeviceSynchronize();  // for using printf in kernel 
function\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/src/voxel_query_gpu.h",
    "content": "#ifndef _STACK_VOXEL_QUERY_GPU_H\n#define _STACK_VOXEL_QUERY_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\nint voxel_query_wrapper_stack(int M, int R1, int R2, int R3, int nsample, float radius, \n    int z_range, int y_range, int x_range, at::Tensor new_xyz_tensor, at::Tensor xyz_tensor, \n    at::Tensor new_coords_tensor, at::Tensor point_indices_tensor, at::Tensor idx_tensor);\n\n\nvoid voxel_query_kernel_launcher_stack(int M, int R1, int R2, int R3, int nsample,\n    float radius, int z_range, int y_range, int x_range, const float *new_xyz, \n    const float *xyz, const int *new_coords, const int *point_indices, int *idx);\n\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/voxel_pool_modules.py",
    "content": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom . import voxel_query_utils\nfrom typing import List\n\n\nclass NeighborVoxelSAModuleMSG(nn.Module):\n                 \n    def __init__(self, *, query_ranges: List[List[int]], radii: List[float], \n        nsamples: List[int], mlps: List[List[int]], use_xyz: bool = True, pool_method='max_pool'):\n        \"\"\"\n        Args:\n            query_ranges: list of int, list of neighbor ranges to group with\n            nsamples: list of int, number of samples in each ball query\n            mlps: list of list of int, spec of the pointnet before the global pooling for each scale\n            use_xyz:\n            pool_method: max_pool / avg_pool\n        \"\"\"\n        super().__init__()\n\n        assert len(query_ranges) == len(nsamples) == len(mlps)\n        \n        self.groupers = nn.ModuleList()\n        self.mlps_in = nn.ModuleList()\n        self.mlps_pos = nn.ModuleList()\n        self.mlps_out = nn.ModuleList()\n        for i in range(len(query_ranges)):\n            max_range = query_ranges[i]\n            nsample = nsamples[i]\n            radius = radii[i]\n            self.groupers.append(voxel_query_utils.VoxelQueryAndGrouping(max_range, radius, nsample))\n            mlp_spec = mlps[i]\n\n            cur_mlp_in = nn.Sequential(\n                nn.Conv1d(mlp_spec[0], mlp_spec[1], kernel_size=1, bias=False),\n                nn.BatchNorm1d(mlp_spec[1])\n            )\n            \n            cur_mlp_pos = nn.Sequential(\n                nn.Conv2d(3, mlp_spec[1], kernel_size=1, bias=False),\n                nn.BatchNorm2d(mlp_spec[1])\n            )\n\n            cur_mlp_out = nn.Sequential(\n                nn.Conv1d(mlp_spec[1], mlp_spec[2], kernel_size=1, bias=False),\n                nn.BatchNorm1d(mlp_spec[2]),\n                nn.ReLU()\n            )\n\n            self.mlps_in.append(cur_mlp_in)\n            self.mlps_pos.append(cur_mlp_pos)\n           
 self.mlps_out.append(cur_mlp_out)\n\n        self.relu = nn.ReLU()\n        self.pool_method = pool_method\n\n        self.init_weights()\n\n    def init_weights(self):\n        for m in self.modules():\n            if isinstance(m, nn.Conv2d) or isinstance(m, nn.Conv1d):\n                nn.init.kaiming_normal_(m.weight)\n                if m.bias is not None:\n                    nn.init.constant_(m.bias, 0)\n            if isinstance(m, nn.BatchNorm2d) or isinstance(m, nn.BatchNorm1d):\n                nn.init.constant_(m.weight, 1.0)\n                nn.init.constant_(m.bias, 0)\n\n    def forward(self, xyz, xyz_batch_cnt, new_xyz, new_xyz_batch_cnt, \\\n                                        new_coords, features, voxel2point_indices):\n        \"\"\"\n        :param xyz: (N1 + N2 ..., 3) tensor of the xyz coordinates of the features\n        :param xyz_batch_cnt: (batch_size), [N1, N2, ...]\n        :param new_xyz: (M1 + M2 ..., 3)\n        :param new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n        :param new_coords: (M1 + M2 ..., 4) voxel coordinates of new_xyz, [batch_idx, x, y, z]\n        :param features: (N1 + N2 ..., C) tensor of the descriptors of the features\n        :param voxel2point_indices: (B, Z, Y, X) tensor of point indices\n        :return:\n            new_features: (M1 + M2 ..., \sum_k(mlps[k][-1])) tensor of the new_features descriptors\n        \"\"\"\n        # change the order to [batch_idx, z, y, x]\n        new_coords = new_coords[:, [0, 3, 2, 1]].contiguous()\n        new_features_list = []\n        for k in range(len(self.groupers)):\n            # features_in: (1, C, N1+N2)\n            features_in = features.permute(1, 0).unsqueeze(0)\n            features_in = self.mlps_in[k](features_in)\n            # features_in: (1, N1+N2, C)\n            features_in = features_in.permute(0, 2, 1).contiguous()\n            # features_in: (N1+N2, C)\n            features_in = features_in.view(-1, features_in.shape[-1])\n            # grouped_features: (M1+M2, C, 
nsample)\n            # grouped_xyz: (M1+M2, 3, nsample)\n            grouped_features, grouped_xyz, empty_ball_mask = self.groupers[k](\n                new_coords, xyz, xyz_batch_cnt, new_xyz, new_xyz_batch_cnt, features_in, voxel2point_indices\n            )\n            grouped_features[empty_ball_mask] = 0\n\n            # grouped_features: (1, C, M1+M2, nsample)\n            grouped_features = grouped_features.permute(1, 0, 2).unsqueeze(dim=0)\n            # grouped_xyz: (M1+M2, 3, nsample)\n            grouped_xyz = grouped_xyz - new_xyz.unsqueeze(-1)\n            grouped_xyz[empty_ball_mask] = 0\n            # grouped_xyz: (1, 3, M1+M2, nsample)\n            grouped_xyz = grouped_xyz.permute(1, 0, 2).unsqueeze(0)\n            # grouped_xyz: (1, C, M1+M2, nsample)\n            position_features = self.mlps_pos[k](grouped_xyz)\n            new_features = grouped_features + position_features\n            new_features = self.relu(new_features)\n            \n            if self.pool_method == 'max_pool':\n                new_features = F.max_pool2d(\n                    new_features, kernel_size=[1, new_features.size(3)]\n                ).squeeze(dim=-1)  # (1, C, M1 + M2 ...)\n            elif self.pool_method == 'avg_pool':\n                new_features = F.avg_pool2d(\n                    new_features, kernel_size=[1, new_features.size(3)]\n                ).squeeze(dim=-1)  # (1, C, M1 + M2 ...)\n            else:\n                raise NotImplementedError\n            \n            new_features = self.mlps_out[k](new_features)\n            new_features = new_features.squeeze(dim=0).permute(1, 0)  # (M1 + M2 ..., C)\n            new_features_list.append(new_features)\n        \n        # (M1 + M2 ..., C)\n        new_features = torch.cat(new_features_list, dim=1)\n        return new_features\n\n"
  },
  {
    "path": "pcdet/ops/pointnet2/pointnet2_stack/voxel_query_utils.py",
    "content": "import torch\nfrom torch.autograd import Variable\nfrom torch.autograd import Function\nimport torch.nn as nn\nfrom typing import List\n\nfrom . import pointnet2_stack_cuda as pointnet2\nfrom . import pointnet2_utils\n\nclass VoxelQuery(Function):\n\n    @staticmethod\n    def forward(ctx, max_range: List[int], radius: float, nsample: int, xyz: torch.Tensor, \\\n                    new_xyz: torch.Tensor, new_coords: torch.Tensor, point_indices: torch.Tensor):\n        \"\"\"\n        Args:\n            ctx:\n            max_range: list of int, [z_range, y_range, x_range], max range of voxels to be grouped\n            radius: float, radius of the balls\n            nsample: int, maximum number of features in the balls\n            xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n            new_xyz: (M1 + M2 ..., 3) centers of the ball query\n            new_coords: (M1 + M2, 4), [batch_id, z, y, x] coordinates of keypoints\n            point_indices: (batch_size, Z, Y, X) 4-D tensor recording the point indices of voxels\n        Returns:\n            idx: (M1 + M2, nsample) tensor with the indices of the features that form the query balls\n            empty_ball_mask: (M1 + M2) mask marking the query balls that contain no points\n        \"\"\"\n        assert new_xyz.is_contiguous()\n        assert xyz.is_contiguous()\n        assert new_coords.is_contiguous()\n        assert point_indices.is_contiguous()\n\n        M = new_coords.shape[0]\n        B, Z, Y, X = point_indices.shape\n        idx = torch.cuda.IntTensor(M, nsample).zero_()\n\n        z_range, y_range, x_range = max_range\n        pointnet2.voxel_query_wrapper(M, Z, Y, X, nsample, radius, z_range, y_range, x_range, \\\n                    new_xyz, xyz, new_coords, point_indices, idx)\n\n        empty_ball_mask = (idx[:, 0] == -1)\n        idx[empty_ball_mask] = 0\n\n        return idx, empty_ball_mask\n\n    @staticmethod\n    def backward(ctx, grad_idx=None, grad_mask=None):\n        # one gradient slot per forward input; none of them is differentiable\n        return None, None, None, None, None, None, None\n\nvoxel_query = VoxelQuery.apply\n\n\nclass VoxelQueryAndGrouping(nn.Module):\n    def __init__(self, max_range: List[int], radius: float, nsample: int):\n        \"\"\"\n        Args:\n            max_range: list of int, [z_range, y_range, x_range], max range of voxels to be grouped\n            radius: float, radius of ball\n            
nsample: int, maximum number of features to gather in the ball\n        \"\"\"\n        super().__init__()\n        self.max_range, self.radius, self.nsample = max_range, radius, nsample\n\n    def forward(self, new_coords: torch.Tensor, xyz: torch.Tensor, xyz_batch_cnt: torch.Tensor,\n                new_xyz: torch.Tensor, new_xyz_batch_cnt: torch.Tensor,\n                features: torch.Tensor, voxel2point_indices: torch.Tensor):\n        \"\"\"\n        Args:\n            new_coords: (M1 + M2 ..., 4) [batch_idx, z, y, x] voxel indices of the ball query centers\n            xyz: (N1 + N2 ..., 3) xyz coordinates of the features\n            xyz_batch_cnt: (batch_size), [N1, N2, ...]\n            new_xyz: (M1 + M2 ..., 3) centers of the ball query\n            new_xyz_batch_cnt: (batch_size), [M1, M2, ...]\n            features: (N1 + N2 ..., C) tensor of features to group\n            voxel2point_indices: (B, Z, Y, X) tensor of point indices of voxels\n\n        Returns:\n            grouped_features: (M1 + M2, C, nsample) tensor\n            grouped_xyz: (M1 + M2, 3, nsample) tensor\n            empty_ball_mask: (M1 + M2) mask marking the query balls that contain no points\n        \"\"\"\n        assert xyz.shape[0] == xyz_batch_cnt.sum(), 'xyz: %s, xyz_batch_cnt: %s' % (str(xyz.shape), str(xyz_batch_cnt))\n        assert new_coords.shape[0] == new_xyz_batch_cnt.sum(), \\\n            'new_coords: %s, new_xyz_batch_cnt: %s' % (str(new_coords.shape), str(new_xyz_batch_cnt))\n        batch_size = xyz_batch_cnt.shape[0]\n        \n        # idx: (M1 + M2 ..., nsample), empty_ball_mask: (M1 + M2 ...)\n        idx1, empty_ball_mask1 = voxel_query(self.max_range, self.radius, self.nsample, xyz, new_xyz, new_coords, voxel2point_indices)\n\n        idx1 = idx1.view(batch_size, -1, self.nsample)\n        count = 0\n        for bs_idx in range(batch_size):\n            idx1[bs_idx] -= count\n            count += xyz_batch_cnt[bs_idx]\n        idx1 = idx1.view(-1, self.nsample)\n        idx1[empty_ball_mask1] = 0\n\n        idx = idx1\n        empty_ball_mask = empty_ball_mask1\n        \n        grouped_xyz = 
pointnet2_utils.grouping_operation(xyz, xyz_batch_cnt, idx, new_xyz_batch_cnt)\n        # grouped_features: (M1 + M2, C, nsample)\n        grouped_features = pointnet2_utils.grouping_operation(features, xyz_batch_cnt, idx, new_xyz_batch_cnt)  \n        \n        return grouped_features, grouped_xyz, empty_ball_mask\n"
  },
  {
    "path": "pcdet/ops/roiaware_pool3d/roiaware_pool3d_utils.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.autograd import Function\n\nfrom ...utils import common_utils\nfrom . import roiaware_pool3d_cuda\n\n\ndef points_in_boxes_cpu(points, boxes):\n    \"\"\"\n    Args:\n        points: (num_points, 3)\n        boxes: [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center, each box DO NOT overlaps\n    Returns:\n        point_indices: (N, num_points)\n    \"\"\"\n    assert boxes.shape[1] == 7\n    assert points.shape[1] == 3\n    points, is_numpy = common_utils.check_numpy_to_torch(points)\n    boxes, is_numpy = common_utils.check_numpy_to_torch(boxes)\n\n    point_indices = points.new_zeros((boxes.shape[0], points.shape[0]), dtype=torch.int)\n    roiaware_pool3d_cuda.points_in_boxes_cpu(boxes.float().contiguous(), points.float().contiguous(), point_indices)\n\n    return point_indices.numpy() if is_numpy else point_indices\n\n\ndef points_in_boxes_gpu(points, boxes):\n    \"\"\"\n    :param points: (B, M, 3)\n    :param boxes: (B, T, 7), num_valid_boxes <= T\n    :return box_idxs_of_pts: (B, M), default background = -1\n    \"\"\"\n    assert boxes.shape[0] == points.shape[0]\n    assert boxes.shape[2] == 7 and points.shape[2] == 3\n    batch_size, num_points, _ = points.shape\n\n    box_idxs_of_pts = points.new_zeros((batch_size, num_points), dtype=torch.int).fill_(-1)\n    roiaware_pool3d_cuda.points_in_boxes_gpu(boxes.contiguous(), points.contiguous(), box_idxs_of_pts)\n\n    return box_idxs_of_pts\n\n\nclass RoIAwarePool3d(nn.Module):\n    def __init__(self, out_size, max_pts_each_voxel=128):\n        super().__init__()\n        self.out_size = out_size\n        self.max_pts_each_voxel = max_pts_each_voxel\n\n    def forward(self, rois, pts, pts_feature, pool_method='max'):\n        assert pool_method in ['max', 'avg']\n        return RoIAwarePool3dFunction.apply(rois, pts, pts_feature, self.out_size, self.max_pts_each_voxel, pool_method)\n\n\nclass RoIAwarePool3dFunction(Function):\n    
@staticmethod\n    def forward(ctx, rois, pts, pts_feature, out_size, max_pts_each_voxel, pool_method):\n        \"\"\"\n        Args:\n            ctx:\n            rois: (N, 7) [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n            pts: (npoints, 3)\n            pts_feature: (npoints, C)\n            out_size: int or tuple, like 7 or (7, 7, 7)\n            max_pts_each_voxel:\n            pool_method: 'max' or 'avg'\n\n        Returns:\n            pooled_features: (N, out_x, out_y, out_z, C)\n        \"\"\"\n        assert rois.shape[1] == 7 and pts.shape[1] == 3\n        if isinstance(out_size, int):\n            out_x = out_y = out_z = out_size\n        else:\n            assert len(out_size) == 3\n            for k in range(3):\n                assert isinstance(out_size[k], int)\n            out_x, out_y, out_z = out_size\n\n        num_rois = rois.shape[0]\n        num_channels = pts_feature.shape[-1]\n        num_pts = pts.shape[0]\n\n        pooled_features = pts_feature.new_zeros((num_rois, out_x, out_y, out_z, num_channels))\n        argmax = pts_feature.new_zeros((num_rois, out_x, out_y, out_z, num_channels), dtype=torch.int)\n        pts_idx_of_voxels = pts_feature.new_zeros((num_rois, out_x, out_y, out_z, max_pts_each_voxel), dtype=torch.int)\n\n        pool_method_map = {'max': 0, 'avg': 1}\n        pool_method = pool_method_map[pool_method]\n        roiaware_pool3d_cuda.forward(rois, pts, pts_feature, argmax, pts_idx_of_voxels, pooled_features, pool_method)\n\n        ctx.roiaware_pool3d_for_backward = (pts_idx_of_voxels, argmax, pool_method, num_pts, num_channels)\n        return pooled_features\n\n    @staticmethod\n    def backward(ctx, grad_out):\n        \"\"\"\n        :param grad_out: (N, out_x, out_y, out_z, C)\n        :return:\n            grad_in: (npoints, C)\n        \"\"\"\n        pts_idx_of_voxels, argmax, pool_method, num_pts, num_channels = ctx.roiaware_pool3d_for_backward\n\n        grad_in = 
grad_out.new_zeros((num_pts, num_channels))\n        roiaware_pool3d_cuda.backward(pts_idx_of_voxels, argmax, grad_out.contiguous(), grad_in, pool_method)\n\n        return None, None, grad_in, None, None, None\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "pcdet/ops/roiaware_pool3d/src/roiaware_pool3d.cpp",
    "content": "/*\nRoI-aware point cloud feature pooling\nReference paper:  https://arxiv.org/abs/1907.03670\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <torch/extension.h>\n#include <assert.h>\n\n\n//#define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, \" must be a CUDAtensor \")\n//#define CHECK_CONTIGUOUS(x) AT_CHECK(x.is_contiguous(), #x, \" must be contiguous \")\n//#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nvoid roiaware_pool3d_launcher(int boxes_num, int pts_num, int channels, int max_pts_each_voxel,\n    int out_x, int out_y, int out_z, const float *rois, const float *pts, const float *pts_feature,\n    int *argmax, int *pts_idx_of_voxels, float *pooled_features, int pool_method);\n\nvoid roiaware_pool3d_backward_launcher(int boxes_num, int out_x, int out_y, int out_z, int channels, int max_pts_each_voxel,\n    const int *pts_idx_of_voxels, const int *argmax, const float *grad_out, float *grad_in, int pool_method);\n\nvoid points_in_boxes_launcher(int batch_size, int boxes_num, int pts_num, const float *boxes,\n    const float *pts, int *box_idx_of_points);\n\nint roiaware_pool3d_gpu(at::Tensor rois, at::Tensor pts, at::Tensor pts_feature, at::Tensor argmax,\n    at::Tensor pts_idx_of_voxels, at::Tensor pooled_features, int pool_method){\n    // params rois: (N, 7) [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n    // params pts: (npoints, 3) [x, y, z]\n    // params pts_feature: (npoints, C)\n    // params argmax: (N, out_x, out_y, out_z, C)\n    // params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel)\n    // params pooled_features: (N, out_x, out_y, out_z, C)\n    // params pool_method: 0: max_pool 1: avg_pool\n\n//    CHECK_INPUT(rois);\n//    CHECK_INPUT(pts);\n//    CHECK_INPUT(pts_feature);\n//    CHECK_INPUT(argmax);\n//    CHECK_INPUT(pts_idx_of_voxels);\n//    CHECK_INPUT(pooled_features);\n\n    int boxes_num = rois.size(0);\n  
  int pts_num = pts.size(0);\n    int channels = pts_feature.size(1);\n    int max_pts_each_voxel = pts_idx_of_voxels.size(4);  // index 0 is the counter\n    int out_x = pts_idx_of_voxels.size(1);\n    int out_y = pts_idx_of_voxels.size(2);\n    int out_z = pts_idx_of_voxels.size(3);\n    assert ((out_x < 256) && (out_y < 256) && (out_z < 256));  // we encode index with 8bit\n\n    const float *rois_data = rois.data<float>();\n    const float *pts_data = pts.data<float>();\n    const float *pts_feature_data = pts_feature.data<float>();\n    int *argmax_data = argmax.data<int>();\n    int *pts_idx_of_voxels_data = pts_idx_of_voxels.data<int>();\n    float *pooled_features_data = pooled_features.data<float>();\n\n    roiaware_pool3d_launcher(boxes_num, pts_num, channels, max_pts_each_voxel, out_x, out_y, out_z,\n        rois_data, pts_data, pts_feature_data, argmax_data, pts_idx_of_voxels_data, pooled_features_data, pool_method);\n\n    return 1;\n}\n\nint roiaware_pool3d_gpu_backward(at::Tensor pts_idx_of_voxels, at::Tensor argmax, at::Tensor grad_out, at::Tensor grad_in, int pool_method){\n    // params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel)\n    // params argmax: (N, out_x, out_y, out_z, C)\n    // params grad_out: (N, out_x, out_y, out_z, C)\n    // params grad_in: (npoints, C), return value\n    // params pool_method: 0: max_pool 1: avg_pool\n\n//    CHECK_INPUT(pts_idx_of_voxels);\n//    CHECK_INPUT(argmax);\n//    CHECK_INPUT(grad_out);\n//    CHECK_INPUT(grad_in);\n\n    int boxes_num = pts_idx_of_voxels.size(0);\n    int out_x = pts_idx_of_voxels.size(1);\n    int out_y = pts_idx_of_voxels.size(2);\n    int out_z = pts_idx_of_voxels.size(3);\n    int max_pts_each_voxel = pts_idx_of_voxels.size(4);  // index 0 is the counter\n    int channels = grad_out.size(4);\n\n    const int *pts_idx_of_voxels_data = pts_idx_of_voxels.data<int>();\n    const int *argmax_data = argmax.data<int>();\n    const float *grad_out_data = 
grad_out.data<float>();\n    float *grad_in_data = grad_in.data<float>();\n\n    roiaware_pool3d_backward_launcher(boxes_num, out_x, out_y, out_z, channels, max_pts_each_voxel,\n        pts_idx_of_voxels_data, argmax_data, grad_out_data, grad_in_data, pool_method);\n\n    return 1;\n}\n\nint points_in_boxes_gpu(at::Tensor boxes_tensor, at::Tensor pts_tensor, at::Tensor box_idx_of_points_tensor){\n    // params boxes: (B, N, 7) [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n    // params pts: (B, npoints, 3) [x, y, z]\n    // params boxes_idx_of_points: (B, npoints), default -1\n\n//    CHECK_INPUT(boxes_tensor);\n//    CHECK_INPUT(pts_tensor);\n//    CHECK_INPUT(box_idx_of_points_tensor);\n\n    int batch_size = boxes_tensor.size(0);\n    int boxes_num = boxes_tensor.size(1);\n    int pts_num = pts_tensor.size(1);\n\n    const float *boxes = boxes_tensor.data<float>();\n    const float *pts = pts_tensor.data<float>();\n    int *box_idx_of_points = box_idx_of_points_tensor.data<int>();\n\n    points_in_boxes_launcher(batch_size, boxes_num, pts_num, boxes, pts, box_idx_of_points);\n\n    return 1;\n}\n\n\ninline void lidar_to_local_coords_cpu(float shift_x, float shift_y, float rot_angle, float &local_x, float &local_y){\n    float cosa = cos(-rot_angle), sina = sin(-rot_angle);\n    local_x = shift_x * cosa + shift_y * (-sina);\n    local_y = shift_x * sina + shift_y * cosa;\n}\n\n\ninline int check_pt_in_box3d_cpu(const float *pt, const float *box3d, float &local_x, float &local_y){\n    // param pt: (x, y, z)\n    // param box3d: [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center\n    const float MARGIN = 1e-2;\n    float x = pt[0], y = pt[1], z = pt[2];\n    float cx = box3d[0], cy = box3d[1], cz = box3d[2];\n    float dx = box3d[3], dy = box3d[4], dz = box3d[5], rz = box3d[6];\n\n    if (fabsf(z - cz) > dz / 2.0) return 0;\n    lidar_to_local_coords_cpu(x - cx, y - cy, rz, local_x, local_y);\n    float in_flag = (fabs(local_x) < dx / 2.0 + 
MARGIN) & (fabs(local_y) < dy / 2.0 + MARGIN);\n    return in_flag;\n}\n\n\nint points_in_boxes_cpu(at::Tensor boxes_tensor, at::Tensor pts_tensor, at::Tensor pts_indices_tensor){\n    // params boxes: (N, 7) [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center; boxes do not overlap\n    // params pts: (num_points, 3) [x, y, z]\n    // params pts_indices: (N, num_points)\n\n//    CHECK_CONTIGUOUS(boxes_tensor);\n//    CHECK_CONTIGUOUS(pts_tensor);\n//    CHECK_CONTIGUOUS(pts_indices_tensor);\n\n    int boxes_num = boxes_tensor.size(0);\n    int pts_num = pts_tensor.size(0);\n\n    const float *boxes = boxes_tensor.data<float>();\n    const float *pts = pts_tensor.data<float>();\n    int *pts_indices = pts_indices_tensor.data<int>();\n\n    float local_x = 0, local_y = 0;\n    for (int i = 0; i < boxes_num; i++){\n        for (int j = 0; j < pts_num; j++){\n            int cur_in_flag = check_pt_in_box3d_cpu(pts + j * 3, boxes + i * 7, local_x, local_y);\n            pts_indices[i * pts_num + j] = cur_in_flag;\n        }\n    }\n\n    return 1;\n}\n\n\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n    m.def(\"forward\", &roiaware_pool3d_gpu, \"roiaware pool3d forward (CUDA)\");\n    m.def(\"backward\", &roiaware_pool3d_gpu_backward, \"roiaware pool3d backward (CUDA)\");\n    m.def(\"points_in_boxes_gpu\", &points_in_boxes_gpu, \"points_in_boxes_gpu forward (CUDA)\");\n    m.def(\"points_in_boxes_cpu\", &points_in_boxes_cpu, \"points_in_boxes_cpu forward (CPU)\");\n}\n"
  },
  {
    "path": "pcdet/ops/roiaware_pool3d/src/roiaware_pool3d_kernel.cu",
    "content": "/*\nRoI-aware point cloud feature pooling\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <math.h>\n#include <stdio.h>\n\n#define THREADS_PER_BLOCK 256\n#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))\n// #define DEBUG\n\n\n__device__ inline void lidar_to_local_coords(float shift_x, float shift_y, float rot_angle, float &local_x, float &local_y){\n    float cosa = cos(-rot_angle), sina = sin(-rot_angle);\n    local_x = shift_x * cosa + shift_y * (-sina);\n    local_y = shift_x * sina + shift_y * cosa;\n}\n\n\n__device__ inline int check_pt_in_box3d(const float *pt, const float *box3d, float &local_x, float &local_y){\n    // param pt: (x, y, z)\n    // param box3d: [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n\n    const float MARGIN = 1e-5;\n    float x = pt[0], y = pt[1], z = pt[2];\n    float cx = box3d[0], cy = box3d[1], cz = box3d[2];\n    float dx = box3d[3], dy = box3d[4], dz = box3d[5], rz = box3d[6];\n\n    if (fabsf(z - cz) > dz / 2.0) return 0;\n    lidar_to_local_coords(x - cx, y - cy, rz, local_x, local_y);\n    float in_flag = (fabs(local_x) < dx / 2.0 + MARGIN) & (fabs(local_y) < dy / 2.0 + MARGIN);\n    return in_flag;\n}\n\n\n__global__ void generate_pts_mask_for_box3d(int boxes_num, int pts_num, int out_x, int out_y, int out_z,\n    const float *rois, const float *pts, int *pts_mask){\n    // params rois: [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n    // params pts: (npoints, 3) [x, y, z]\n    // params pts_mask: (N, npoints): -1 means point doesnot in this box, otherwise: encode (x_idxs, y_idxs, z_idxs) by binary bit\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    int box_idx = blockIdx.y;\n    if (pt_idx >= pts_num || box_idx >= boxes_num) return;\n\n    pts += pt_idx * 3;\n    rois += box_idx * 7;\n    pts_mask += box_idx * pts_num + pt_idx;\n\n    float local_x = 0, local_y = 0;\n    int cur_in_flag = check_pt_in_box3d(pts, rois, local_x, local_y);\n\n   
 pts_mask[0] = -1;\n    if (cur_in_flag > 0){\n        float local_z = pts[2] - rois[2];\n        float dx = rois[3], dy = rois[4], dz = rois[5];\n\n        float x_res = dx / out_x;\n        float y_res = dy / out_y;\n        float z_res = dz / out_z;\n\n        unsigned int x_idx = int((local_x + dx / 2) / x_res);\n        unsigned int y_idx = int((local_y + dy / 2) / y_res);\n        unsigned int z_idx = int((local_z + dz / 2) / z_res);\n\n        x_idx = min(max(x_idx, 0), out_x - 1);\n        y_idx = min(max(y_idx, 0), out_y - 1);\n        z_idx = min(max(z_idx, 0), out_z - 1);\n\n        unsigned int idx_encoding = (x_idx << 16) + (y_idx << 8) + z_idx;\n        pts_mask[0] = idx_encoding;\n    }\n}\n\n\n__global__ void collect_inside_pts_for_box3d(int boxes_num, int pts_num, int max_pts_each_voxel,\n    int out_x, int out_y, int out_z, const int *pts_mask, int *pts_idx_of_voxels){\n    // params pts_mask: (N, npoints)  0 or 1\n    // params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel)\n\n    int box_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (box_idx >= boxes_num) return;\n\n    int max_num_pts = max_pts_each_voxel - 1;  // index 0 is the counter\n    pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel;\n\n    for (int k = 0; k < pts_num; k++){\n        if (pts_mask[box_idx * pts_num + k] != -1){\n            unsigned int idx_encoding = pts_mask[box_idx * pts_num + k];\n            unsigned int x_idx = (idx_encoding >> 16) & 0xFF;\n            unsigned int y_idx = (idx_encoding >> 8) & 0xFF;\n            unsigned int z_idx = idx_encoding & 0xFF;\n            unsigned int base_offset = x_idx * out_y * out_z * max_pts_each_voxel + y_idx * out_z * max_pts_each_voxel + z_idx * max_pts_each_voxel;\n            unsigned int cnt = pts_idx_of_voxels[base_offset];\n            if (cnt < max_num_pts){\n                pts_idx_of_voxels[base_offset + cnt + 1] = k;\n                pts_idx_of_voxels[base_offset]++;\n     
       }\n#ifdef DEBUG\n        printf(\"collect: pts_%d, idx(%d, %d, %d), idx_encoding=%x\\n\",\n            k, x_idx, y_idx, z_idx, idx_encoding);\n#endif\n\n        }\n    }\n}\n\n\n__global__ void roiaware_maxpool3d(int boxes_num, int pts_num, int channels, int max_pts_each_voxel, int out_x,\n    int out_y, int out_z, const float *pts_feature, const int *pts_idx_of_voxels, float *pooled_features, int *argmax){\n    // params pts_feature: (npoints, C)\n    // params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel), index 0 is the counter\n    // params pooled_features: (N, out_x, out_y, out_z, C)\n    // params argmax: (N, out_x, out_y, out_z, C)\n\n    int box_idx = blockIdx.z;\n    int channel_idx = blockIdx.y;\n    int voxel_idx_flat = blockIdx.x * blockDim.x + threadIdx.x;\n\n    int x_idx = voxel_idx_flat / (out_y * out_z);\n    int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;\n    int z_idx = voxel_idx_flat % out_z;\n    if (box_idx >= boxes_num || channel_idx >= channels|| x_idx >= out_x || y_idx >= out_y || z_idx >= out_z) return;\n\n#ifdef DEBUG\n    printf(\"src pts_idx_of_voxels: (%p, ), argmax: %p\\n\", pts_idx_of_voxels, argmax);\n#endif\n\n    int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;\n    pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel + offset_base * max_pts_each_voxel;\n    pooled_features += box_idx * out_x * out_y * out_z * channels + offset_base * channels + channel_idx;\n    argmax += box_idx * out_x * out_y * out_z * channels + offset_base * channels + channel_idx;\n\n    int argmax_idx = -1;\n    float max_val = -1e50;\n\n    int total_pts = pts_idx_of_voxels[0];\n\n    for (int k = 1; k <= total_pts; k++){\n        if (pts_feature[pts_idx_of_voxels[k] * channels + channel_idx] > max_val){\n            max_val = pts_feature[pts_idx_of_voxels[k] * channels + channel_idx];\n            argmax_idx = pts_idx_of_voxels[k];\n        }\n    }\n\n    if (argmax_idx != 
-1){\n        pooled_features[0] = max_val;\n    }\n    argmax[0] = argmax_idx;\n\n#ifdef DEBUG\n    printf(\"channel_%d idx(%d, %d, %d), argmax_idx=(%d, %.3f), total=%d, after pts_idx: %p, argmax: (%p, %d)\\n\",\n        channel_idx, x_idx, y_idx, z_idx, argmax_idx, max_val, total_pts, pts_idx_of_voxels, argmax, argmax_idx);\n#endif\n}\n\n\n__global__ void roiaware_avgpool3d(int boxes_num, int pts_num, int channels, int max_pts_each_voxel, int out_x,\n    int out_y, int out_z, const float *pts_feature, const int *pts_idx_of_voxels, float *pooled_features){\n    // params pts_feature: (npoints, C)\n    // params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel), index 0 is the counter\n    // params pooled_features: (N, out_x, out_y, out_z, C)\n    // params argmax: (N, out_x, out_y, out_z, C)\n\n    int box_idx = blockIdx.z;\n    int channel_idx = blockIdx.y;\n    int voxel_idx_flat = blockIdx.x * blockDim.x + threadIdx.x;\n\n    int x_idx = voxel_idx_flat / (out_y * out_z);\n    int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;\n    int z_idx = voxel_idx_flat % out_z;\n    if (box_idx >= boxes_num || channel_idx >= channels|| x_idx >= out_x || y_idx >= out_y || z_idx >= out_z) return;\n\n    int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;\n    pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel + offset_base * max_pts_each_voxel;\n    pooled_features += box_idx * out_x * out_y * out_z * channels + offset_base * channels + channel_idx;\n\n    float sum_val = 0;\n    int total_pts = pts_idx_of_voxels[0];\n\n    for (int k = 1; k <= total_pts; k++){\n        sum_val += pts_feature[pts_idx_of_voxels[k] * channels + channel_idx];\n    }\n\n    if (total_pts > 0){\n        pooled_features[0] = sum_val / total_pts;\n    }\n}\n\n\nvoid roiaware_pool3d_launcher(int boxes_num, int pts_num, int channels, int max_pts_each_voxel, int out_x, int out_y, int out_z,\n    const float *rois, const float *pts, const 
float *pts_feature, int *argmax, int *pts_idx_of_voxels, float *pooled_features, int pool_method){\n    // params rois: (N, 7) [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n    // params pts: (npoints, 3) [x, y, z]\n    // params pts_feature: (npoints, C)\n    // params argmax: (N, out_x, out_y, out_z, C)\n    // params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel)\n    // params pooled_features: (N, out_x, out_y, out_z, C)\n    // params pool_method: 0: max_pool 1: avg_pool\n\n    int *pts_mask = NULL;\n    cudaMalloc(&pts_mask, boxes_num * pts_num * sizeof(int));  // (N, M)\n    cudaMemset(pts_mask, -1, boxes_num * pts_num * sizeof(int));\n\n    dim3 blocks_mask(DIVUP(pts_num, THREADS_PER_BLOCK), boxes_num);\n    dim3 threads(THREADS_PER_BLOCK);\n    generate_pts_mask_for_box3d<<<blocks_mask, threads>>>(boxes_num, pts_num, out_x, out_y, out_z, rois, pts, pts_mask);\n\n    // TODO: Merge the collect and pool functions, SS\n\n    dim3 blocks_collect(DIVUP(boxes_num, THREADS_PER_BLOCK));\n    collect_inside_pts_for_box3d<<<blocks_collect, threads>>>(boxes_num, pts_num, max_pts_each_voxel,\n        out_x, out_y, out_z, pts_mask, pts_idx_of_voxels);\n\n    dim3 blocks_pool(DIVUP(out_x * out_y * out_z, THREADS_PER_BLOCK), channels, boxes_num);\n    if (pool_method == 0){\n        roiaware_maxpool3d<<<blocks_pool, threads>>>(boxes_num, pts_num, channels, max_pts_each_voxel, out_x, out_y, out_z,\n            pts_feature, pts_idx_of_voxels, pooled_features, argmax);\n    }\n    else if (pool_method == 1){\n        roiaware_avgpool3d<<<blocks_pool, threads>>>(boxes_num, pts_num, channels, max_pts_each_voxel, out_x, out_y, out_z,\n            pts_feature, pts_idx_of_voxels, pooled_features);\n    }\n\n\n    cudaFree(pts_mask);\n\n#ifdef DEBUG\n    cudaDeviceSynchronize();  // for using printf in kernel function\n#endif\n}\n\n\n__global__ void roiaware_maxpool3d_backward(int boxes_num, int channels, int out_x, int out_y, int out_z,\n    const 
int *argmax, const float *grad_out, float *grad_in){\n    // params argmax: (N, out_x, out_y, out_z, C)\n    // params grad_out: (N, out_x, out_y, out_z, C)\n    // params grad_in: (npoints, C), return value\n\n    int box_idx = blockIdx.z;\n    int channel_idx = blockIdx.y;\n    int voxel_idx_flat = blockIdx.x * blockDim.x + threadIdx.x;\n\n    int x_idx = voxel_idx_flat / (out_y * out_z);\n    int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;\n    int z_idx = voxel_idx_flat % out_z;\n    if (box_idx >= boxes_num || channel_idx >= channels|| x_idx >= out_x || y_idx >= out_y || z_idx >= out_z) return;\n\n    int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;\n    argmax += box_idx * out_x * out_y * out_z * channels + offset_base * channels + channel_idx;\n    grad_out += box_idx * out_x * out_y * out_z * channels + offset_base * channels + channel_idx;\n\n    if (argmax[0] == -1) return;\n\n    atomicAdd(grad_in + argmax[0] * channels + channel_idx, grad_out[0] * 1);\n}\n\n\n__global__ void roiaware_avgpool3d_backward(int boxes_num, int channels, int out_x, int out_y, int out_z,\n    int max_pts_each_voxel, const int *pts_idx_of_voxels, const float *grad_out, float *grad_in){\n    // params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel)\n    // params grad_out: (N, out_x, out_y, out_z, C)\n    // params grad_in: (npoints, C), return value\n\n    int box_idx = blockIdx.z;\n    int channel_idx = blockIdx.y;\n    int voxel_idx_flat = blockIdx.x * blockDim.x + threadIdx.x;\n\n    int x_idx = voxel_idx_flat / (out_y * out_z);\n    int y_idx = (voxel_idx_flat - x_idx * (out_y * out_z)) / out_z;\n    int z_idx = voxel_idx_flat % out_z;\n    if (box_idx >= boxes_num || channel_idx >= channels|| x_idx >= out_x || y_idx >= out_y || z_idx >= out_z) return;\n\n    int offset_base = x_idx * out_y * out_z + y_idx * out_z + z_idx;\n    pts_idx_of_voxels += box_idx * out_x * out_y * out_z * max_pts_each_voxel + offset_base * 
max_pts_each_voxel;\n    grad_out += box_idx * out_x * out_y * out_z * channels + offset_base * channels + channel_idx;\n\n\n    int total_pts = pts_idx_of_voxels[0];\n    float cur_grad = 1 / fmaxf(float(total_pts), 1.0);\n    for (int k = 1; k <= total_pts; k++){\n        atomicAdd(grad_in + pts_idx_of_voxels[k] * channels + channel_idx, grad_out[0] * cur_grad);\n    }\n}\n\n\nvoid roiaware_pool3d_backward_launcher(int boxes_num, int out_x, int out_y, int out_z, int channels, int max_pts_each_voxel,\n    const int *pts_idx_of_voxels, const int *argmax, const float *grad_out, float *grad_in, int pool_method){\n    // params pts_idx_of_voxels: (N, out_x, out_y, out_z, max_pts_each_voxel)\n    // params argmax: (N, out_x, out_y, out_z, C)\n    // params grad_out: (N, out_x, out_y, out_z, C)\n    // params grad_in: (npoints, C), return value\n    // params pool_method: 0: max_pool, 1: avg_pool\n\n    dim3 blocks(DIVUP(out_x * out_y * out_z, THREADS_PER_BLOCK), channels, boxes_num);\n    dim3 threads(THREADS_PER_BLOCK);\n    if (pool_method == 0){\n        roiaware_maxpool3d_backward<<<blocks, threads>>>(\n            boxes_num, channels, out_x, out_y, out_z, argmax, grad_out, grad_in\n        );\n    }\n    else if (pool_method == 1){\n        roiaware_avgpool3d_backward<<<blocks, threads>>>(\n            boxes_num, channels, out_x, out_y, out_z, max_pts_each_voxel, pts_idx_of_voxels, grad_out, grad_in\n        );\n    }\n\n}\n\n\n__global__ void points_in_boxes_kernel(int batch_size, int boxes_num, int pts_num, const float *boxes,\n    const float *pts, int *box_idx_of_points){\n    // params boxes: (B, N, 7) [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n    // params pts: (B, npoints, 3) [x, y, z] in LiDAR coordinate\n    // params boxes_idx_of_points: (B, npoints), default -1\n\n    int bs_idx = blockIdx.y;\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (bs_idx >= batch_size || pt_idx >= pts_num) return;\n\n    boxes += bs_idx * 
boxes_num * 7;\n    pts += bs_idx * pts_num * 3 + pt_idx * 3;\n    box_idx_of_points += bs_idx * pts_num + pt_idx;\n\n    float local_x = 0, local_y = 0;\n    int cur_in_flag = 0;\n    for (int k = 0; k < boxes_num; k++){\n        cur_in_flag = check_pt_in_box3d(pts, boxes + k * 7, local_x, local_y);\n        if (cur_in_flag){\n            box_idx_of_points[0] = k;\n            break;\n        }\n    }\n}\n\n\nvoid points_in_boxes_launcher(int batch_size, int boxes_num, int pts_num, const float *boxes,\n    const float *pts, int *box_idx_of_points){\n    // params boxes: (B, N, 7) [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n    // params pts: (B, npoints, 3) [x, y, z]\n    // params boxes_idx_of_points: (B, npoints), default -1\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(pts_num, THREADS_PER_BLOCK), batch_size);\n    dim3 threads(THREADS_PER_BLOCK);\n    points_in_boxes_kernel<<<blocks, threads>>>(batch_size, boxes_num, pts_num, boxes, pts, box_idx_of_points);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n\n#ifdef DEBUG\n    cudaDeviceSynchronize();  // for using printf in kernel function\n#endif\n}\n"
  },
  {
    "path": "pcdet/ops/roipoint_pool3d/roipoint_pool3d_utils.py",
    "content": "import torch\nimport torch.nn as nn\nfrom torch.autograd import Function\n\nfrom ...utils import box_utils\nfrom . import roipoint_pool3d_cuda\n\n\nclass RoIPointPool3d(nn.Module):\n    def __init__(self, num_sampled_points=512, pool_extra_width=1.0):\n        super().__init__()\n        self.num_sampled_points = num_sampled_points\n        self.pool_extra_width = pool_extra_width\n\n    def forward(self, points, point_features, boxes3d):\n        \"\"\"\n        Args:\n            points: (B, N, 3)\n            point_features: (B, N, C)\n            boxes3d: (B, M, 7), [x, y, z, dx, dy, dz, heading]\n\n        Returns:\n            pooled_features: (B, M, 512, 3 + C)\n            pooled_empty_flag: (B, M)\n        \"\"\"\n        return RoIPointPool3dFunction.apply(\n            points, point_features, boxes3d, self.pool_extra_width, self.num_sampled_points\n        )\n\n\nclass RoIPointPool3dFunction(Function):\n    @staticmethod\n    def forward(ctx, points, point_features, boxes3d, pool_extra_width, num_sampled_points=512):\n        \"\"\"\n        Args:\n            ctx:\n            points: (B, N, 3)\n            point_features: (B, N, C)\n            boxes3d: (B, num_boxes, 7), [x, y, z, dx, dy, dz, heading]\n            pool_extra_width:\n            num_sampled_points:\n\n        Returns:\n            pooled_features: (B, num_boxes, 512, 3 + C)\n            pooled_empty_flag: (B, num_boxes)\n        \"\"\"\n        assert points.shape.__len__() == 3 and points.shape[2] == 3\n        batch_size, boxes_num, feature_len = points.shape[0], boxes3d.shape[1], point_features.shape[2]\n        pooled_boxes3d = box_utils.enlarge_box3d(boxes3d.view(-1, 7), pool_extra_width).view(batch_size, -1, 7)\n\n        pooled_features = point_features.new_zeros((batch_size, boxes_num, num_sampled_points, 3 + feature_len))\n        pooled_empty_flag = point_features.new_zeros((batch_size, boxes_num)).int()\n\n        roipoint_pool3d_cuda.forward(\n            
points.contiguous(), pooled_boxes3d.contiguous(),\n            point_features.contiguous(), pooled_features, pooled_empty_flag\n        )\n\n        return pooled_features, pooled_empty_flag\n\n    @staticmethod\n    def backward(ctx, grad_out):\n        raise NotImplementedError\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "pcdet/ops/roipoint_pool3d/src/roipoint_pool3d.cpp",
    "content": "#include <torch/serialize/tensor.h>\n#include <torch/extension.h>\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nvoid roipool3dLauncher(int batch_size, int pts_num, int boxes_num, int feature_in_len, int sampled_pts_num,\n                       const float *xyz, const float *boxes3d, const float *pts_feature, float *pooled_features, int *pooled_empty_flag);\n\n\nint roipool3d_gpu(at::Tensor xyz, at::Tensor boxes3d, at::Tensor pts_feature, at::Tensor pooled_features, at::Tensor pooled_empty_flag){\n    // params xyz: (B, N, 3)\n    // params boxes3d: (B, M, 7)\n    // params pts_feature: (B, N, C)\n    // params pooled_features: (B, M, 512, 3+C)\n    // params pooled_empty_flag: (B, M)\n    CHECK_INPUT(xyz);\n    CHECK_INPUT(boxes3d);\n    CHECK_INPUT(pts_feature);\n    CHECK_INPUT(pooled_features);\n    CHECK_INPUT(pooled_empty_flag);\n\n    int batch_size = xyz.size(0);\n    int pts_num = xyz.size(1);\n    int boxes_num = boxes3d.size(1);\n    int feature_in_len = pts_feature.size(2);\n    int sampled_pts_num = pooled_features.size(2);\n\n\n    const float * xyz_data = xyz.data<float>();\n    const float * boxes3d_data = boxes3d.data<float>();\n    const float * pts_feature_data = pts_feature.data<float>();\n    float * pooled_features_data = pooled_features.data<float>();\n    int * pooled_empty_flag_data = pooled_empty_flag.data<int>();\n\n    roipool3dLauncher(batch_size, pts_num, boxes_num, feature_in_len, sampled_pts_num,\n                       xyz_data, boxes3d_data, pts_feature_data, pooled_features_data, 
pooled_empty_flag_data);\n\n\n\n    return 1;\n}\n\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n    m.def(\"forward\", &roipool3d_gpu, \"roipool3d forward (CUDA)\");\n}\n\n"
  },
  {
    "path": "pcdet/ops/roipoint_pool3d/src/roipoint_pool3d_kernel.cu",
    "content": "/*\nPoint cloud feature pooling\nWritten by Shaoshuai Shi\nAll Rights Reserved 2018.\n*/\n\n#include <math.h>\n#include <stdio.h>\n\n#define THREADS_PER_BLOCK 256\n#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))\n// #define DEBUG\n\n\n__device__ inline void lidar_to_local_coords(float shift_x, float shift_y, float rot_angle, float &local_x, float &local_y){\n    float cosa = cos(-rot_angle), sina = sin(-rot_angle);\n    local_x = shift_x * cosa + shift_y * (-sina);\n    local_y = shift_x * sina + shift_y * cosa;\n}\n\n\n__device__ inline int check_pt_in_box3d(const float *pt, const float *box3d, float &local_x, float &local_y){\n    // param pt: (x, y, z)\n    // param box3d: [x, y, z, dx, dy, dz, heading] (x, y, z) is the box center\n\n    const float MARGIN = 1e-5;\n    float x = pt[0], y = pt[1], z = pt[2];\n    float cx = box3d[0], cy = box3d[1], cz = box3d[2];\n    float dx = box3d[3], dy = box3d[4], dz = box3d[5], rz = box3d[6];\n\n    if (fabsf(z - cz) > dz / 2.0) return 0;\n    lidar_to_local_coords(x - cx, y - cy, rz, local_x, local_y);\n    float in_flag = (fabs(local_x) < dx / 2.0 + MARGIN) & (fabs(local_y) < dy / 2.0 + MARGIN);\n    return in_flag;\n}\n\n\n__global__ void assign_pts_to_box3d(int batch_size, int pts_num, int boxes_num, const float *xyz, const float *boxes3d, int *pts_assign){\n    // params xyz: (B, N, 3)\n    // params boxes3d: (B, M, 7)\n    // params pts_assign: (B, N, M): idx of the corresponding box3d, -1 means background points\n    int pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    int box_idx = blockIdx.y;\n    int bs_idx = blockIdx.z;\n\n    if (pt_idx >= pts_num || box_idx >= boxes_num || bs_idx >= batch_size){\n        return;\n    }\n    int assign_idx = bs_idx * pts_num * boxes_num + pt_idx * boxes_num + box_idx;\n    pts_assign[assign_idx] = 0;\n\n    int box_offset = bs_idx * boxes_num * 7 + box_idx * 7;\n    int pt_offset = bs_idx * pts_num * 3 + pt_idx * 3;\n\n\n    float local_x = 0, local_y = 
0;\n    int cur_in_flag = check_pt_in_box3d(xyz + pt_offset, boxes3d + box_offset, local_x, local_y);\n    pts_assign[assign_idx] = cur_in_flag;\n    // printf(\"bs=%d, pt=%d, in=%d\\n\", bs_idx, pt_idx, pts_assign[bs_idx * pts_num + pt_idx]);\n}\n\n\n__global__ void get_pooled_idx(int batch_size, int pts_num, int boxes_num, int sampled_pts_num,\n                               const int *pts_assign, int *pts_idx, int *pooled_empty_flag){\n    // params pts_assign: (B, N, M), 1 if the point lies inside the box, otherwise 0\n    // params pts_idx: (B, M, sampled_pts_num)\n    // params pooled_empty_flag: (B, M), set to 1 for boxes that contain no points\n\n    int boxes_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (boxes_idx >= boxes_num){\n        return;\n    }\n\n    int bs_idx = blockIdx.y;\n\n    int cnt = 0;\n    for (int k = 0; k < pts_num; k++){\n        if (pts_assign[bs_idx * pts_num * boxes_num + k * boxes_num + boxes_idx]){\n            if (cnt < sampled_pts_num){\n                pts_idx[bs_idx * boxes_num * sampled_pts_num + boxes_idx * sampled_pts_num + cnt] = k;\n                cnt++;\n            }\n            else break;\n        }\n    }\n\n    if (cnt == 0){\n        pooled_empty_flag[bs_idx * boxes_num + boxes_idx] = 1;\n    }\n    else if (cnt < sampled_pts_num){\n        // fewer than sampled_pts_num points found: fill the remaining slots by cycling through the collected points\n        for (int k = cnt; k < sampled_pts_num; k++){\n            int duplicate_idx = k % cnt;\n            int base_offset = bs_idx * boxes_num * sampled_pts_num + boxes_idx * sampled_pts_num;\n            pts_idx[base_offset + k] = pts_idx[base_offset + duplicate_idx];\n        }\n    }\n}\n\n\n__global__ void roipool3d_forward(int batch_size, int pts_num, int boxes_num, int feature_in_len, int sampled_pts_num,\n                                   const float *xyz, const int *pts_idx, const float *pts_feature,\n                                   float *pooled_features, int *pooled_empty_flag){\n    // params xyz: (B, N, 3)\n    // params pts_idx: (B, M, sampled_pts_num)\n    // 
params pts_feature: (B, N, C)\n    // params pooled_features: (B, M, 512, 3+C)\n    // params pooled_empty_flag: (B, M)\n\n    int sample_pt_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    int box_idx = blockIdx.y;\n    int bs_idx = blockIdx.z;\n\n    if (sample_pt_idx >= sampled_pts_num || box_idx >= boxes_num || bs_idx >= batch_size){\n        return;\n    }\n\n    if (pooled_empty_flag[bs_idx * boxes_num + box_idx]){\n        return;\n    }\n\n    int temp_idx = bs_idx * boxes_num * sampled_pts_num + box_idx * sampled_pts_num + sample_pt_idx;\n    int src_pt_idx = pts_idx[temp_idx];\n    int dst_feature_offset = temp_idx * (3 + feature_in_len);\n\n    for (int j = 0; j < 3; j++)\n        pooled_features[dst_feature_offset + j] = xyz[bs_idx * pts_num * 3 + src_pt_idx * 3 + j];\n\n    int src_feature_offset = bs_idx * pts_num * feature_in_len + src_pt_idx * feature_in_len;\n    for (int j = 0; j < feature_in_len; j++)\n        pooled_features[dst_feature_offset + 3 + j] = pts_feature[src_feature_offset + j];\n}\n\n\nvoid roipool3dLauncher(int batch_size, int pts_num, int boxes_num, int feature_in_len, int sampled_pts_num,\n                       const float *xyz, const float *boxes3d, const float *pts_feature, float *pooled_features, int *pooled_empty_flag){\n\n    // printf(\"batch_size=%d, pts_num=%d, boxes_num=%d\\n\", batch_size, pts_num, boxes_num);\n    int *pts_assign = NULL;\n    cudaMalloc(&pts_assign, batch_size * pts_num * boxes_num * sizeof(int));  // (batch_size, N, M)\n    // cudaMemset(&pts_assign, -1, batch_size * pts_num * boxes_num * sizeof(int));\n\n    dim3 blocks(DIVUP(pts_num, THREADS_PER_BLOCK), boxes_num, batch_size);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n    assign_pts_to_box3d<<<blocks, threads>>>(batch_size, pts_num, boxes_num, xyz, boxes3d, pts_assign);\n\n    int *pts_idx = NULL;\n    cudaMalloc(&pts_idx, batch_size * boxes_num * sampled_pts_num * sizeof(int));  // (batch_size, M, 
sampled_pts_num)\n\n    dim3 blocks2(DIVUP(boxes_num, THREADS_PER_BLOCK), batch_size);  // blockIdx.x(col), blockIdx.y(row)\n    get_pooled_idx<<<blocks2, threads>>>(batch_size, pts_num, boxes_num, sampled_pts_num, pts_assign, pts_idx, pooled_empty_flag);\n\n    dim3 blocks_pool(DIVUP(sampled_pts_num, THREADS_PER_BLOCK), boxes_num, batch_size);\n    roipool3d_forward<<<blocks_pool, threads>>>(batch_size, pts_num, boxes_num, feature_in_len, sampled_pts_num,\n                                                      xyz, pts_idx, pts_feature, pooled_features, pooled_empty_flag);\n\n    cudaFree(pts_assign);\n    cudaFree(pts_idx);\n\n#ifdef DEBUG\n    cudaDeviceSynchronize();  // for using printf in kernel function\n#endif\n}"
  },
  {
    "path": "pcdet/ops/votr_ops/src/build_attention_indices.cpp",
    "content": "/*\nFind indices for each attention pattern\nWritten by Jiageng Mao\n*/\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"build_attention_indices_gpu.h\"\n\nextern THCState *state;\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\nint sparse_local_attention_with_tensor_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int attend_size, int attend_range,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor, at::Tensor xyz_to_vidx_tensor) {\n    CHECK_INPUT(attend_indices_tensor);\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n\n    int *attend_indices = attend_indices_tensor.data<int>();\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n\n    sparse_local_attention_with_tensor_kernel_launcher(x_max, y_max, z_max, x_stride, y_stride, z_stride, num_voxels, attend_size, attend_range,\n                                                        attend_indices, v_indices, xyz_to_vidx);\n    return 1;\n}\n\nint sparse_local_attention_with_hash_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                at::Tensor attend_indices_tensor, 
at::Tensor v_indices_tensor, at::Tensor xyz_to_vidx_tensor) {\n    CHECK_INPUT(attend_indices_tensor);\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n\n    int *attend_indices = attend_indices_tensor.data<int>();\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n\n    sparse_local_attention_with_hash_kernel_launcher(x_max, y_max, z_max, x_stride, y_stride, z_stride, num_voxels, attend_size, attend_range, hash_size,\n                                                        attend_indices, v_indices, xyz_to_vidx);\n    return 1;\n}\n\nint subm_local_attention_with_tensor_wrapper(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor, at::Tensor xyz_to_vidx_tensor) {\n    CHECK_INPUT(attend_indices_tensor);\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n\n    int *attend_indices = attend_indices_tensor.data<int>();\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n\n    subm_local_attention_with_tensor_kernel_launcher(x_max, y_max, z_max, num_voxels, attend_size, attend_range,\n                                                        attend_indices, v_indices, xyz_to_vidx);\n    return 1;\n}\n\nint subm_local_attention_with_hash_wrapper(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor, at::Tensor xyz_to_vidx_tensor) {\n    CHECK_INPUT(attend_indices_tensor);\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n\n    int *attend_indices = attend_indices_tensor.data<int>();\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *xyz_to_vidx = 
xyz_to_vidx_tensor.data<int>();\n\n    subm_local_attention_with_hash_kernel_launcher(x_max, y_max, z_max, num_voxels, attend_size, attend_range, hash_size,\n                                                        attend_indices, v_indices, xyz_to_vidx);\n    return 1;\n}\n\nint sparse_strided_attention_with_tensor_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                    int num_voxels, int attend_size, int num_range,\n                                                    at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor,\n                                                    at::Tensor xyz_to_vidx_tensor, at::Tensor range_spec_tensor) {\n    CHECK_INPUT(attend_indices_tensor);\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n    CHECK_INPUT(range_spec_tensor);\n\n    int *attend_indices = attend_indices_tensor.data<int>();\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n    const int *range_spec = range_spec_tensor.data<int>();\n\n    sparse_strided_attention_with_tensor_kernel_launcher(x_max, y_max, z_max, x_stride, y_stride, z_stride, num_voxels, attend_size, num_range,\n                                                       attend_indices, v_indices, xyz_to_vidx, range_spec);\n    return 1;\n}\n\nint sparse_strided_attention_with_hash_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                    int num_voxels, int attend_size, int num_range, int hash_size,\n                                                    at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor,\n                                                    at::Tensor xyz_to_vidx_tensor, at::Tensor range_spec_tensor) {\n    CHECK_INPUT(attend_indices_tensor);\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n    
CHECK_INPUT(range_spec_tensor);\n\n    int *attend_indices = attend_indices_tensor.data<int>();\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n    const int *range_spec = range_spec_tensor.data<int>();\n\n    sparse_strided_attention_with_hash_kernel_launcher(x_max, y_max, z_max, x_stride, y_stride, z_stride, num_voxels, attend_size, num_range, hash_size,\n                                                       attend_indices, v_indices, xyz_to_vidx, range_spec);\n    return 1;\n}\n\nint subm_strided_attention_with_tensor_wrapper(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int num_range,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor,\n                                                at::Tensor xyz_to_vidx_tensor, at::Tensor range_spec_tensor) {\n    CHECK_INPUT(attend_indices_tensor);\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n    CHECK_INPUT(range_spec_tensor);\n\n    int *attend_indices = attend_indices_tensor.data<int>();\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n    const int *range_spec = range_spec_tensor.data<int>();\n\n    subm_strided_attention_with_tensor_kernel_launcher(x_max, y_max, z_max, num_voxels, attend_size, num_range,\n                                                       attend_indices, v_indices, xyz_to_vidx, range_spec);\n    return 1;\n}\n\nint subm_strided_attention_with_hash_wrapper(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int num_range, int hash_size,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor,\n                                                at::Tensor xyz_to_vidx_tensor, at::Tensor range_spec_tensor) {\n    CHECK_INPUT(attend_indices_tensor);\n    CHECK_INPUT(v_indices_tensor);\n  
  CHECK_INPUT(xyz_to_vidx_tensor);\n    CHECK_INPUT(range_spec_tensor);\n\n    int *attend_indices = attend_indices_tensor.data<int>();\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n    const int *range_spec = range_spec_tensor.data<int>();\n\n    subm_strided_attention_with_hash_kernel_launcher(x_max, y_max, z_max, num_voxels, attend_size, num_range, hash_size,\n                                                       attend_indices, v_indices, xyz_to_vidx, range_spec);\n    return 1;\n}"
  },
  {
    "path": "pcdet/ops/votr_ops/src/build_attention_indices_gpu.cu",
    "content": "/*\nFind indices for each attention pattern\nWritten by Jiageng Mao\n*/\n\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"build_attention_indices_gpu.h\"\n#include \"votr_cuda_utils.h\"\n\n__device__ int simple_hash(int k, int hash_size) {\n    return k % hash_size;\n}\n\n__device__ int hash_table_find(int &key, int &hash_size, const int *xyz_to_vidx) {\n    int hash_idx = simple_hash(key, hash_size);\n    int v_idx = EMPTY_KEY;\n    int prob_cnt = 0;\n    while (true) {\n        // found\n        if (xyz_to_vidx[hash_idx * 2 + 0] == key) {\n            v_idx = xyz_to_vidx[hash_idx * 2 + 1];\n            break;\n        }\n        // empty, not found\n        if (xyz_to_vidx[hash_idx * 2 + 0] == EMPTY_KEY) {\n            break;\n        }\n        // linear probing\n        hash_idx = (hash_idx + 1) % hash_size;\n        // security in case of dead loop\n        prob_cnt += 1;\n        if (prob_cnt >= hash_size) break;\n    }\n    return v_idx;\n}\n\n__global__ void sparse_local_attention_with_tensor_kernel(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                            int num_voxels, int attend_size, int attend_range,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx) {\n    /*\n        in sparse attention, voxels are not necessary at the non-empty location\n        attend_indices: [num_voxels, attend_size] for gather attend indices\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, x_max, y_max, z_max] voxel coordinates to voxel indices\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n\n    xyz_to_vidx 
+= bs_idx * x_max * y_max * z_max;\n\n    int num_samples = 0;\n    for (int sz_idx = z_idx * z_stride - attend_range; sz_idx <= z_idx * z_stride + (z_stride - 1) + attend_range; ++sz_idx){\n        if (sz_idx >= z_max || sz_idx < 0) continue;\n        for (int sy_idx = y_idx * y_stride - attend_range; sy_idx <= y_idx * y_stride + (y_stride - 1) + attend_range; ++sy_idx){\n            if (sy_idx >= y_max || sy_idx < 0) continue;\n            for (int sx_idx = x_idx * x_stride - attend_range; sx_idx <= x_idx * x_stride + (x_stride - 1) + attend_range; ++sx_idx){\n                if (sx_idx >= x_max || sx_idx < 0) continue;\n                int sv_idx = xyz_to_vidx[sx_idx * y_max * z_max + sy_idx * z_max + sz_idx];\n                if (sv_idx != EMPTY_KEY) { // found non-empty index\n                    if (num_samples >= attend_size) return; // full and return\n                    attend_indices[th_idx * attend_size + num_samples] = sv_idx;\n                    num_samples++;\n                }else { // not found\n                    ;\n                }\n            }\n        }\n    }\n    return;\n}\n\nvoid sparse_local_attention_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                        int num_voxels, int attend_size, int attend_range,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx) {\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    sparse_local_attention_with_tensor_kernel<<<blocks, threads>>>(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                                    num_voxels, attend_size, attend_range,\n                                                                    attend_indices, v_indices, xyz_to_vidx);\n    // 
cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void sparse_local_attention_with_hash_kernel(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                            int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx) {\n    /*\n        in sparse attention, the query voxel is not necessarily at a non-empty location\n        attend_indices: [num_voxels, attend_size] for gather attend indices\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, hash_size, 2] voxel coordinates to voxel indices\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n\n    xyz_to_vidx += bs_idx * hash_size * 2;\n\n    int num_samples = 0;\n    for (int sz_idx = z_idx * z_stride - attend_range; sz_idx <= z_idx * z_stride + (z_stride - 1) + attend_range; ++sz_idx){\n        if (sz_idx >= z_max || sz_idx < 0) continue;\n        for (int sy_idx = y_idx * y_stride - attend_range; sy_idx <= y_idx * y_stride + (y_stride - 1) + attend_range; ++sy_idx){\n            if (sy_idx >= y_max || sy_idx < 0) continue;\n            for (int sx_idx = x_idx * x_stride - attend_range; sx_idx <= x_idx * x_stride + (x_stride - 1) + attend_range; ++sx_idx){\n                if (sx_idx >= x_max || sx_idx < 0) continue;\n                int skey = sx_idx * y_max * z_max + sy_idx * z_max + sz_idx;\n                int sv_idx = hash_table_find(skey, 
hash_size, xyz_to_vidx);\n                if (sv_idx != EMPTY_KEY) { // found non-empty index\n                    if (num_samples >= attend_size) return; // full and return\n                    attend_indices[th_idx * attend_size + num_samples] = sv_idx;\n                    num_samples++;\n                }else { // not found\n                    ;\n                }\n            }\n        }\n    }\n    return;\n}\n\nvoid sparse_local_attention_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                        int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx) {\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    sparse_local_attention_with_hash_kernel<<<blocks, threads>>>(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                                    num_voxels, attend_size, attend_range, hash_size,\n                                                                    attend_indices, v_indices, xyz_to_vidx);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void subm_local_attention_with_tensor_kernel(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx) {\n    /*\n        attend_indices: [num_voxels, attend_size] for gather attend indices\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, 
x_max, y_max, z_max] voxel coordinates to voxel indices\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n    if (x_idx >= x_max || x_idx < 0 || y_idx < 0 || y_idx >= y_max || z_idx < 0 || z_idx >= z_max) return;\n    xyz_to_vidx += bs_idx * x_max * y_max * z_max;\n\n    int num_samples = 0;\n    for (int sz_idx = z_idx - attend_range; sz_idx <= z_idx + attend_range; ++sz_idx){\n        if (sz_idx >= z_max || sz_idx < 0) continue;\n        for (int sy_idx = y_idx - attend_range; sy_idx <= y_idx + attend_range; ++sy_idx){\n            if (sy_idx >= y_max || sy_idx < 0) continue;\n            for (int sx_idx = x_idx - attend_range; sx_idx <= x_idx + attend_range; ++sx_idx){\n                if (sx_idx >= x_max || sx_idx < 0) continue;\n                int sv_idx = xyz_to_vidx[sx_idx * y_max * z_max + sy_idx * z_max + sz_idx];\n                if (sv_idx != EMPTY_KEY) { // found non-empty index\n                    if (num_samples >= attend_size) return; // full and return\n                    attend_indices[th_idx * attend_size + num_samples] = sv_idx;\n                    num_samples++;\n                }else { // not found\n                    ;\n                }\n            }\n        }\n    }\n    return;\n}\n\nvoid subm_local_attention_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx){\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    subm_local_attention_with_tensor_kernel<<<blocks, threads>>>(x_max, y_max, z_max, num_voxels,\n       
                                                             attend_size, attend_range, attend_indices, v_indices, xyz_to_vidx);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void subm_local_attention_with_hash_kernel(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx) {\n    /*\n        attend_indices: [num_voxels, attend_size] for gather attend indices\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, hash_size, 2] voxel coordinates to voxel indices\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n    if (x_idx >= x_max || x_idx < 0 || y_idx < 0 || y_idx >= y_max || z_idx < 0 || z_idx >= z_max) return;\n    xyz_to_vidx += bs_idx * hash_size * 2;\n\n    int num_samples = 0;\n    for (int sz_idx = z_idx - attend_range; sz_idx <= z_idx + attend_range; ++sz_idx){\n        if (sz_idx >= z_max || sz_idx < 0) continue;\n        for (int sy_idx = y_idx - attend_range; sy_idx <= y_idx + attend_range; ++sy_idx){\n            if (sy_idx >= y_max || sy_idx < 0) continue;\n            for (int sx_idx = x_idx - attend_range; sx_idx <= x_idx + attend_range; ++sx_idx){\n                if (sx_idx >= x_max || sx_idx < 0) continue;\n                int skey = sx_idx * y_max * z_max + sy_idx * z_max + sz_idx;\n                int sv_idx = hash_table_find(skey, hash_size, xyz_to_vidx);\n                if (sv_idx != 
EMPTY_KEY) { // found non-empty index\n                    if (num_samples >= attend_size) return; // full and return\n                    attend_indices[th_idx * attend_size + num_samples] = sv_idx;\n                    num_samples++;\n                }else { // not found\n                    ;\n                }\n            }\n        }\n    }\n    return;\n}\n\nvoid subm_local_attention_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx){\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    subm_local_attention_with_hash_kernel<<<blocks, threads>>>(x_max, y_max, z_max, num_voxels, attend_size, attend_range, hash_size,\n                                                                    attend_indices, v_indices, xyz_to_vidx);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void sparse_strided_attention_with_tensor_kernel(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                            int num_voxels, int attend_size, int num_range,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec) {\n    /*\n        attend_indices: [num_voxels, attend_size] for gather attend indices\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, x_max, y_max, z_max] voxel coordinates to voxel indices\n        range_spec: [num_range, 3] half start/end range & stride\n    
*/\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n    if (x_idx >= x_max || x_idx < 0 || y_idx < 0 || y_idx >= y_max || z_idx < 0 || z_idx >= z_max) return;\n    xyz_to_vidx += bs_idx * x_max * y_max * z_max;\n\n    int num_samples = 0;\n    for (int range_idx = 0; range_idx < num_range; ++range_idx) {\n        int search_x_start_range = range_spec[range_idx * 9 + 0];\n        int search_x_end_range = range_spec[range_idx * 9 + 1];\n        int search_x_stride = range_spec[range_idx * 9 + 2];\n        int search_y_start_range = range_spec[range_idx * 9 + 3];\n        int search_y_end_range = range_spec[range_idx * 9 + 4];\n        int search_y_stride = range_spec[range_idx * 9 + 5];\n        int search_z_start_range = range_spec[range_idx * 9 + 6];\n        int search_z_end_range = range_spec[range_idx * 9 + 7];\n        int search_z_stride = range_spec[range_idx * 9 + 8];\n        for (int z_offset = 0; z_offset < search_z_end_range; z_offset += search_z_stride) {\n        for (int y_offset = 0; y_offset < search_y_end_range; y_offset += search_y_stride) {\n        for (int x_offset = 0; x_offset < search_x_end_range; x_offset += search_x_stride) {\n             if ((x_offset < search_x_start_range) && (y_offset < search_y_start_range)\n             && (z_offset < search_z_start_range)) {\n                continue;\n             }\n            // each loop process 8 points\n            for (int sz_idx = z_idx * z_stride - z_offset; sz_idx <= z_idx * z_stride + (z_stride - 1) + z_offset; sz_idx += (2 * z_offset + z_stride - 1)){\n                if (sz_idx >= z_max || sz_idx < 0) continue;\n                for (int sy_idx = y_idx * y_stride - y_offset; sy_idx <= y_idx * y_stride + (y_stride - 1) + y_offset; sy_idx += (2 * y_offset + 
y_stride - 1)){\n                    if (sy_idx >= y_max || sy_idx < 0) continue;\n                    for (int sx_idx = x_idx * x_stride - x_offset; sx_idx <= x_idx * x_stride + (x_stride - 1) + x_offset; sx_idx += (2 * x_offset + x_stride - 1)){\n                        if (sx_idx >= x_max || sx_idx < 0) continue;\n                        int sv_idx = xyz_to_vidx[sx_idx * y_max * z_max + sy_idx * z_max + sz_idx];\n                        if (sv_idx != EMPTY_KEY) { // found non-empty index\n                            if (num_samples >= attend_size) return; // full and return\n                            attend_indices[th_idx * attend_size + num_samples] = sv_idx;\n                            num_samples++;\n                        }else { // not found\n                            ;\n                        }\n                    }\n                }\n            }\n        }\n        }\n        }\n\n    }\n    return;\n}\n\nvoid sparse_strided_attention_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                            int num_voxels, int attend_size, int num_range,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec){\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    sparse_strided_attention_with_tensor_kernel<<<blocks, threads>>>(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                                        num_voxels, attend_size, num_range,\n                                                                        attend_indices, v_indices, xyz_to_vidx, range_spec);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        
fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void sparse_strided_attention_with_hash_kernel(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                            int num_voxels, int attend_size, int num_range, int hash_size,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec) {\n    /*\n        attend_indices: [num_voxels, attend_size] for gather attend indices\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, hash_size, 2] voxel coordinates to voxel indices\n        range_spec: [num_range, 3] half start/end range & stride\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n    if (x_idx >= x_max || x_idx < 0 || y_idx < 0 || y_idx >= y_max || z_idx < 0 || z_idx >= z_max) return;\n    xyz_to_vidx += bs_idx * hash_size * 2;\n\n    int num_samples = 0;\n    for (int range_idx = 0; range_idx < num_range; ++range_idx) {\n        int search_x_start_range = range_spec[range_idx * 9 + 0];\n        int search_x_end_range = range_spec[range_idx * 9 + 1];\n        int search_x_stride = range_spec[range_idx * 9 + 2];\n        int search_y_start_range = range_spec[range_idx * 9 + 3];\n        int search_y_end_range = range_spec[range_idx * 9 + 4];\n        int search_y_stride = range_spec[range_idx * 9 + 5];\n        int search_z_start_range = range_spec[range_idx * 9 + 6];\n        int search_z_end_range = range_spec[range_idx * 9 + 7];\n        int search_z_stride = range_spec[range_idx * 9 + 8];\n        for (int z_offset = 0; z_offset < search_z_end_range; z_offset += 
search_z_stride) {\n        for (int y_offset = 0; y_offset < search_y_end_range; y_offset += search_y_stride) {\n        for (int x_offset = 0; x_offset < search_x_end_range; x_offset += search_x_stride) {\n            if ((x_offset < search_x_start_range) && (y_offset < search_y_start_range)\n             && (z_offset < search_z_start_range)) {\n                continue;\n             }\n            // each loop process 8 points\n            for (int sz_idx = z_idx * z_stride - z_offset; sz_idx <= z_idx * z_stride + (z_stride - 1) + z_offset; sz_idx += (2 * z_offset + z_stride - 1)){\n                if (sz_idx >= z_max || sz_idx < 0) continue;\n                for (int sy_idx = y_idx * y_stride - y_offset; sy_idx <= y_idx * y_stride + (y_stride - 1) + y_offset; sy_idx += (2 * y_offset + y_stride - 1)){\n                    if (sy_idx >= y_max || sy_idx < 0) continue;\n                    for (int sx_idx = x_idx * x_stride - x_offset; sx_idx <= x_idx * x_stride + (x_stride - 1) + x_offset; sx_idx += (2 * x_offset + x_stride - 1)){\n                        if (sx_idx >= x_max || sx_idx < 0) continue;\n                        int skey = sx_idx * y_max * z_max + sy_idx * z_max + sz_idx;\n                        int sv_idx = hash_table_find(skey, hash_size, xyz_to_vidx);\n                        if (sv_idx != EMPTY_KEY) { // found non-empty index\n                            if (num_samples >= attend_size) return; // full and return\n                            attend_indices[th_idx * attend_size + num_samples] = sv_idx;\n                            num_samples++;\n                        }else { // not found\n                            ;\n                        }\n                    }\n                }\n            }\n        }\n        }\n        }\n\n    }\n    return;\n}\n\nvoid sparse_strided_attention_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                         
   int num_voxels, int attend_size, int num_range, int hash_size,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec){\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    sparse_strided_attention_with_hash_kernel<<<blocks, threads>>>(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                                        num_voxels, attend_size, num_range, hash_size,\n                                                                        attend_indices, v_indices, xyz_to_vidx, range_spec);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void subm_strided_attention_with_tensor_kernel(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int num_range,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec) {\n    /*\n        attend_indices: [num_voxels, attend_size] for gather attend indices\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, x_max, y_max, z_max] voxel coordinates to voxel indices\n        range_spec: [num_range, 3] half start/end range & stride\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n    if (x_idx >= x_max || x_idx < 0 || y_idx < 0 || y_idx >= y_max || z_idx < 0 || z_idx >= z_max) return;\n    xyz_to_vidx += 
bs_idx * x_max * y_max * z_max;\n\n    int num_samples = 0;\n    for (int range_idx = 0; range_idx < num_range; ++range_idx) {\n        int search_x_start_range = range_spec[range_idx * 9 + 0];\n        int search_x_end_range = range_spec[range_idx * 9 + 1];\n        int search_x_stride = range_spec[range_idx * 9 + 2];\n        int search_y_start_range = range_spec[range_idx * 9 + 3];\n        int search_y_end_range = range_spec[range_idx * 9 + 4];\n        int search_y_stride = range_spec[range_idx * 9 + 5];\n        int search_z_start_range = range_spec[range_idx * 9 + 6];\n        int search_z_end_range = range_spec[range_idx * 9 + 7];\n        int search_z_stride = range_spec[range_idx * 9 + 8];\n        int x_step = 0;\n        int y_step = 0;\n        int z_step = 0;\n        for (int z_offset = 0; z_offset < search_z_end_range; z_offset += search_z_stride) {\n        for (int y_offset = 0; y_offset < search_y_end_range; y_offset += search_y_stride) {\n        for (int x_offset = 0; x_offset < search_x_end_range; x_offset += search_x_stride) {\n            if ((x_offset < search_x_start_range) && (y_offset < search_y_start_range)\n             && (z_offset < search_z_start_range)) {\n                continue;\n             }\n            // each iteration processes up to 8 points\n            if (z_offset == 0) {\n                z_step = 1;\n            } else {\n                z_step = 2 * z_offset;\n            }\n            for (int sz_idx = z_idx - z_offset; sz_idx <= z_idx + z_offset; sz_idx += z_step){\n                if (sz_idx >= z_max || sz_idx < 0) continue;\n                if (y_offset == 0) {\n                    y_step = 1;\n                } else {\n                    y_step = 2 * y_offset;\n                }\n                for (int sy_idx = y_idx - y_offset; sy_idx <= y_idx + y_offset; sy_idx += y_step){\n                    if (sy_idx >= y_max || sy_idx < 0) continue;\n                   
 if (x_offset == 0) {\n                        x_step = 1;\n                    } else {\n                        x_step = 2 * x_offset;\n                    }\n                    for (int sx_idx = x_idx - x_offset; sx_idx <= x_idx + x_offset; sx_idx += x_step){\n                        if (sx_idx >= x_max || sx_idx < 0) continue;\n                        int sv_idx = xyz_to_vidx[sx_idx * y_max * z_max + sy_idx * z_max + sz_idx];\n                        if (sv_idx != EMPTY_KEY) { // found non-empty index\n                            if (num_samples >= attend_size) return; // full and return\n                            attend_indices[th_idx * attend_size + num_samples] = sv_idx;\n                            num_samples++;\n                        }else { // not found\n                            ;\n                        }\n                    }\n                }\n            }\n        }\n        }\n        }\n\n    }\n    return;\n}\n\nvoid subm_strided_attention_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int num_range,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec){\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    subm_strided_attention_with_tensor_kernel<<<blocks, threads>>>(x_max, y_max, z_max, num_voxels, attend_size, num_range,\n                                                                    attend_indices, v_indices, xyz_to_vidx, range_spec);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void subm_strided_attention_with_hash_kernel(int x_max, int y_max, int z_max, int 
num_voxels, int attend_size, int num_range, int hash_size,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec) {\n    /*\n        attend_indices: [num_voxels, attend_size] for gather attend indices\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, hash_size, 2] voxel coordinates to voxel indices\n        range_spec: [num_range, 3] half start/end range & stride\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n    if (x_idx >= x_max || x_idx < 0 || y_idx < 0 || y_idx >= y_max || z_idx < 0 || z_idx >= z_max) return;\n    xyz_to_vidx += bs_idx * hash_size * 2;\n\n    int num_samples = 0;\n    for (int range_idx = 0; range_idx < num_range; ++range_idx) {\n        int search_x_start_range = range_spec[range_idx * 9 + 0];\n        int search_x_end_range = range_spec[range_idx * 9 + 1];\n        int search_x_stride = range_spec[range_idx * 9 + 2];\n        int search_y_start_range = range_spec[range_idx * 9 + 3];\n        int search_y_end_range = range_spec[range_idx * 9 + 4];\n        int search_y_stride = range_spec[range_idx * 9 + 5];\n        int search_z_start_range = range_spec[range_idx * 9 + 6];\n        int search_z_end_range = range_spec[range_idx * 9 + 7];\n        int search_z_stride = range_spec[range_idx * 9 + 8];\n        int x_step = 0;\n        int y_step = 0;\n        int z_step = 0;\n        for (int z_offset = 0; z_offset < search_z_end_range; z_offset += search_z_stride) {\n        for (int y_offset = 0; y_offset < search_y_end_range; y_offset += search_y_stride) {\n        for (int x_offset = 0; x_offset < search_x_end_range; x_offset += search_x_stride) {\n            if ((x_offset 
< search_x_start_range) && (y_offset < search_y_start_range)\n             && (z_offset < search_z_start_range)) {\n                continue;\n             }\n            // each loop process 8 points\n            if (z_offset == 0) {\n                z_step = 1;\n            } else {\n                z_step = 2 * z_offset;\n            }\n            for (int sz_idx = z_idx - z_offset; sz_idx <= z_idx + z_offset; sz_idx += z_step){\n                if (sz_idx >= z_max || sz_idx < 0) continue;\n                if (y_offset == 0) {\n                    y_step = 1;\n                } else {\n                    y_step = 2 * y_offset;\n                }\n                for (int sy_idx = y_idx - y_offset; sy_idx <= y_idx + y_offset; sy_idx += y_step){\n                    if (sy_idx >= y_max || sy_idx < 0) continue;\n                    if (x_offset == 0) {\n                        x_step = 1;\n                    } else {\n                        x_step = 2 * x_offset;\n                    }\n                    for (int sx_idx = x_idx - x_offset; sx_idx <= x_idx + x_offset; sx_idx += x_step){\n                        if (sx_idx >= x_max || sx_idx < 0) continue;\n                        int skey = sx_idx * y_max * z_max + sy_idx * z_max + sz_idx;\n                        int sv_idx = hash_table_find(skey, hash_size, xyz_to_vidx);\n                        if (sv_idx != EMPTY_KEY) { // found non-empty index\n                            if (num_samples >= attend_size) return; // full and return\n                            attend_indices[th_idx * attend_size + num_samples] = sv_idx;\n                            num_samples++;\n                        }else { // not found\n                            ;\n                        }\n                    }\n                }\n            }\n        }\n        }\n        }\n\n    }\n    return;\n}\n\nvoid subm_strided_attention_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int 
num_range, int hash_size,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec){\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    subm_strided_attention_with_hash_kernel<<<blocks, threads>>>(x_max, y_max, z_max, num_voxels, attend_size, num_range, hash_size,\n                                                                    attend_indices, v_indices, xyz_to_vidx, range_spec);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}"
  },
  {
    "path": "pcdet/ops/votr_ops/src/build_attention_indices_gpu.h",
    "content": "/*\nFind indices for each attention pattern\nWritten by Jiageng Mao\n*/\n\n#ifndef BUILD_ATTENTION_INDICES_GPU_H\n#define BUILD_ATTENTION_INDICES_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\nint subm_local_attention_with_tensor_wrapper(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor, at::Tensor xyz_to_vidx_tensor);\n\nint subm_local_attention_with_hash_wrapper(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor, at::Tensor xyz_to_vidx_tensor);\n\nint subm_strided_attention_with_tensor_wrapper(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int num_range,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor,\n                                                at::Tensor xyz_to_vidx_tensor, at::Tensor range_spec_tensor);\n\nint subm_strided_attention_with_hash_wrapper(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int num_range, int hash_size,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor,\n                                                at::Tensor xyz_to_vidx_tensor, at::Tensor range_spec_tensor);\n\nvoid subm_local_attention_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx);\n\nvoid subm_local_attention_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int attend_range, int hash_size,\n                 
                                       int *attend_indices, const int *v_indices, const int *xyz_to_vidx);\n\nvoid subm_strided_attention_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int num_range,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec);\n\nvoid subm_strided_attention_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int attend_size, int num_range, int hash_size,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec);\n\nint sparse_local_attention_with_tensor_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int attend_size, int attend_range,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor, at::Tensor xyz_to_vidx_tensor);\n\nint sparse_local_attention_with_hash_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor, at::Tensor xyz_to_vidx_tensor);\n\nint sparse_strided_attention_with_tensor_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                    int num_voxels, int attend_size, int num_range,\n                                                    at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor,\n                                                    at::Tensor xyz_to_vidx_tensor, at::Tensor range_spec_tensor);\n\nint sparse_strided_attention_with_hash_wrapper(int x_max, int y_max, int z_max, 
int x_stride, int y_stride, int z_stride,\n                                                    int num_voxels, int attend_size, int num_range, int hash_size,\n                                                    at::Tensor attend_indices_tensor, at::Tensor v_indices_tensor,\n                                                    at::Tensor xyz_to_vidx_tensor, at::Tensor range_spec_tensor);\n\nvoid sparse_local_attention_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                        int num_voxels, int attend_size, int attend_range,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx);\n\nvoid sparse_local_attention_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                        int num_voxels, int attend_size, int attend_range, int hash_size,\n                                                        int *attend_indices, const int *v_indices, const int *xyz_to_vidx);\n\nvoid sparse_strided_attention_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                            int num_voxels, int attend_size, int num_range,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec);\n\nvoid sparse_strided_attention_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                            int num_voxels, int attend_size, int num_range, int hash_size,\n                                                            int *attend_indices, const int *v_indices, const int *xyz_to_vidx, const int *range_spec);\n#endif"
  },
  {
    "path": "pcdet/ops/votr_ops/src/build_mapping.cpp",
    "content": "/*\nBuilding xyz -> idx sparse tensor mapping\nWritten by Jiageng Mao\n*/\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <THC/THC.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include \"build_mapping_gpu.h\"\n\nextern THCState *state;\n\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\nint build_mapping_with_tensor_wrapper(int x_max, int y_max, int z_max, int num_voxels,\n                                        at::Tensor v_indices_tensor, at::Tensor v_bs_cnt_tensor, at::Tensor xyz_to_vidx_tensor) {\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(v_bs_cnt_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *v_bs_cnt = v_bs_cnt_tensor.data<int>();\n    int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n\n    build_mapping_with_tensor_kernel_launcher(x_max, y_max, z_max, num_voxels, v_indices, v_bs_cnt, xyz_to_vidx);\n    return 1;\n}\n\nint downsample_with_tensor_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                    int num_voxels, int num_ds_voxels,\n                                    at::Tensor v_indices_tensor, at::Tensor ds_v_indices_tensor, at::Tensor xyz_to_vidx_tensor, at::Tensor vcount_tensor) {\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(ds_v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n    CHECK_INPUT(vcount_tensor);\n\n    const int *v_indices = v_indices_tensor.data<int>();\n    int *ds_v_indices = ds_v_indices_tensor.data<int>();\n    int 
*xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n    int *vcount = vcount_tensor.data<int>();\n\n    downsample_with_tensor_kernel_launcher(x_max, y_max, z_max, x_stride, y_stride, z_stride, num_voxels, num_ds_voxels,\n                                                v_indices, ds_v_indices, xyz_to_vidx, vcount);\n    return 1;\n}\n\nint build_mapping_with_hash_wrapper(int x_max, int y_max, int z_max, int num_voxels, int hash_size,\n                                        at::Tensor v_indices_tensor, at::Tensor v_bs_cnt_tensor, at::Tensor xyz_to_vidx_tensor) {\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(v_bs_cnt_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n\n    const int *v_indices = v_indices_tensor.data<int>();\n    const int *v_bs_cnt = v_bs_cnt_tensor.data<int>();\n    int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n\n    build_mapping_with_hash_kernel_launcher(x_max, y_max, z_max, num_voxels, hash_size, v_indices, v_bs_cnt, xyz_to_vidx);\n    return 1;\n}\n\nint downsample_with_hash_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                    int num_voxels, int num_ds_voxels, int hash_size,\n                                    at::Tensor v_indices_tensor, at::Tensor ds_v_indices_tensor, at::Tensor xyz_to_vidx_tensor, at::Tensor vcount_tensor) {\n    CHECK_INPUT(v_indices_tensor);\n    CHECK_INPUT(ds_v_indices_tensor);\n    CHECK_INPUT(xyz_to_vidx_tensor);\n    CHECK_INPUT(vcount_tensor);\n\n    const int *v_indices = v_indices_tensor.data<int>();\n    int *ds_v_indices = ds_v_indices_tensor.data<int>();\n    int *xyz_to_vidx = xyz_to_vidx_tensor.data<int>();\n    int *vcount = vcount_tensor.data<int>();\n\n    downsample_with_hash_kernel_launcher(x_max, y_max, z_max, x_stride, y_stride, z_stride, num_voxels, num_ds_voxels, hash_size,\n                                                v_indices, ds_v_indices, xyz_to_vidx, vcount);\n    return 1;\n}"
  },
  {
    "path": "pcdet/ops/votr_ops/src/build_mapping_gpu.cu",
    "content": "/*\nBuilding xyz -> idx sparse tensor mapping\nWritten by Jiageng Mao\n*/\n\n#include <math.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"build_mapping_gpu.h\"\n#include \"votr_cuda_utils.h\"\n\n// 32 bit Murmur3 hash\n// unsigned int -> int, k >= 0, hash_size >0, should be ok?\n__device__ int murmur_hash(int k, int hash_size) {\n    k ^= k >> 16;\n    k *= 0x85ebca6b;\n    k ^= k >> 13;\n    k *= 0xc2b2ae35;\n    k ^= k >> 16;\n    //return k & (hash_size-1);\n    return k % hash_size;\n}\n\n__device__ int hash(int k, int hash_size) {\n    return k % hash_size;\n}\n\n__device__ void hash_table_insert(int &key, int &value, int &hash_size, int *xyz_to_vidx) {\n    /*\n        xyz_to_idx (hash_size, 2) NO BATCH SIZE\n    */\n    int hash_idx = hash(key, hash_size);\n    int prob_cnt = 0;\n    while(true) {\n        int prev_key = atomicCAS(xyz_to_vidx + hash_idx*2 + 0, EMPTY_KEY, key); // insert key when empty\n        if (prev_key == EMPTY_KEY || prev_key == key) {\n            xyz_to_vidx[hash_idx*2 + 1] = value; // insert value\n            break;\n        }\n        // linear probing\n        hash_idx = (hash_idx + 1) % hash_size;\n\n        // security in case of dead loop\n        prob_cnt += 1;\n        if (prob_cnt >= hash_size) break;\n    }\n}\n\n__global__ void downsample_with_tensor_kernel(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int num_ds_voxels,\n                                                const int *v_indices, int *ds_v_indices, int *xyz_to_vidx, int *vcount) {\n    /*\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        ds_v_indices: [bs, num_ds_voxels, 3] downsampled voxels, -1 if not unique\n        xyz_to_vidx: [bs, x_max, y_max, z_max] downsampled dense map\n        vcount: [bs]\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int 
bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n\n    int ds_z_idx = z_idx / z_stride;\n    int ds_y_idx = y_idx / y_stride;\n    int ds_x_idx = x_idx / x_stride;\n\n    if (ds_x_idx >= x_max || ds_x_idx < 0 || ds_y_idx < 0 || ds_y_idx >= y_max || ds_z_idx < 0 || ds_z_idx >= z_max) return;\n\n    xyz_to_vidx += bs_idx * x_max * y_max * z_max;\n    ds_v_indices += bs_idx * num_ds_voxels * 3;\n\n    int ret_v = atomicExch(xyz_to_vidx + ds_x_idx * y_max * z_max + ds_y_idx * z_max + ds_z_idx, BLK_SIGNAL);\n    if (ret_v == BLK_SIGNAL){ // kill all block threads\n        return;\n    } else if (ret_v != EMPTY_KEY) { // already occupied\n        ret_v = atomicExch(xyz_to_vidx + ds_x_idx * y_max * z_max + ds_y_idx * z_max + ds_z_idx, ret_v);\n        return;\n    } else if (ret_v == EMPTY_KEY) {\n        int v_idx = atomicAdd(vcount + bs_idx, 1);\n        ds_v_indices[v_idx * 3 + 0] = ds_z_idx;\n        ds_v_indices[v_idx * 3 + 1] = ds_y_idx;\n        ds_v_indices[v_idx * 3 + 2] = ds_x_idx;\n        ret_v = atomicExch(xyz_to_vidx + ds_x_idx * y_max * z_max + ds_y_idx * z_max + ds_z_idx, v_idx);\n        return;\n    }\n\n}\n\nvoid downsample_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int num_ds_voxels,\n                                                const int *v_indices, int *ds_v_indices, int *xyz_to_vidx, int *vcount) {\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    downsample_with_tensor_kernel<<<blocks, threads>>>(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                num_voxels, num_ds_voxels,\n                                                v_indices, ds_v_indices, 
xyz_to_vidx, vcount);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n\n}\n\n__global__ void downsample_with_hash_kernel(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int num_ds_voxels, int hash_size,\n                                                const int *v_indices, int *ds_v_indices, int *xyz_to_vidx, int *vcount) {\n    /*\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        ds_v_indices: [bs, num_ds_voxels, 3] downsampled voxels, -1 if not unique\n        xyz_to_vidx: [bs, hash_size, 2] downsampled dense map\n        vcount: [bs]\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n\n    int ds_z_idx = z_idx / z_stride;\n    int ds_y_idx = y_idx / y_stride;\n    int ds_x_idx = x_idx / x_stride;\n\n    if (ds_x_idx >= x_max || ds_x_idx < 0 || ds_y_idx < 0 || ds_y_idx >= y_max || ds_z_idx < 0 || ds_z_idx >= z_max) return;\n\n    xyz_to_vidx += bs_idx * hash_size * 2;\n    ds_v_indices += bs_idx * num_ds_voxels * 3;\n\n    int key = ds_x_idx * y_max * z_max + ds_y_idx * z_max + ds_z_idx;\n    // hash table with force insert, reject duplicates\n    int hash_idx = hash(key, hash_size);\n    int prob_cnt = 0;\n    while(true) {\n        int prev_key = atomicCAS(xyz_to_vidx + hash_idx*2 + 0, EMPTY_KEY, key); // insert key when empty\n        if (prev_key == EMPTY_KEY) {\n            int v_idx = atomicAdd(vcount + bs_idx, 1);\n            ds_v_indices[v_idx * 3 + 0] = ds_z_idx; // insert zyx to ds_indices\n            
ds_v_indices[v_idx * 3 + 1] = ds_y_idx;\n            ds_v_indices[v_idx * 3 + 2] = ds_x_idx;\n            xyz_to_vidx[hash_idx*2 + 1] = v_idx; // insert value to hash table\n            break;\n        } else if (prev_key == key) { // already occupied\n            break;\n        }\n        // linear probing\n        hash_idx = (hash_idx + 1) % hash_size;\n        // security in case of dead loop\n        prob_cnt += 1;\n        if (prob_cnt >= hash_size) break;\n    }\n}\n\nvoid downsample_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int num_ds_voxels, int hash_size,\n                                                const int *v_indices, int *ds_v_indices, int *xyz_to_vidx, int *vcount) {\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    downsample_with_hash_kernel<<<blocks, threads>>>(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                num_voxels, num_ds_voxels, hash_size,\n                                                v_indices, ds_v_indices, xyz_to_vidx, vcount);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n\n}\n\n__global__ void build_mapping_with_tensor_kernel(int x_max, int y_max, int z_max, int num_voxels,\n                                                    const int *v_indices, const int *v_bs_cnt, int *xyz_to_vidx) {\n    /*\n        v_indices: [num_voxels, 4] bs + zyx indices of voxels\n        xyz_to_vidx: [bs, x_max, y_max, z_max] voxel coordinates to voxel indices\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n\n    
int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = v_indices[th_idx * 4 + 3];\n    if (x_idx >= x_max || x_idx < 0 || y_idx < 0 || y_idx >= y_max || z_idx < 0 || z_idx >= z_max) return;\n\n    int v_sum = 0;\n    int bs_cnt = bs_idx - 1;\n    while(bs_cnt >= 0){\n        v_sum += v_bs_cnt[bs_cnt];\n        bs_cnt--;\n    }\n    int v_idx = th_idx - v_sum; // v_idx for this sample\n\n    xyz_to_vidx[bs_idx * x_max * y_max * z_max + x_idx * y_max * z_max + y_idx * z_max + z_idx] = v_idx;\n}\n\nvoid build_mapping_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels,\n                                                const int *v_indices, const int *v_bs_cnt, int *xyz_to_vidx){\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    build_mapping_with_tensor_kernel<<<blocks, threads>>>(x_max, y_max, z_max, num_voxels, v_indices, v_bs_cnt, xyz_to_vidx);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n__global__ void build_mapping_with_hash_kernel(int x_max, int y_max, int z_max, int num_voxels, int hash_size,\n                                                const int *v_indices, const int *v_bs_cnt, int *xyz_to_vidx) {\n    /*\n        v_indices: [N1+N2, 4] bs zyx indices of voxels\n        v_bs_cnt: [bs] num_voxels in each sample\n        xyz_to_vidx: [B, hash_size, 2] hash table key-value for dim-2\n    */\n\n    int th_idx = blockIdx.x * blockDim.x + threadIdx.x;\n    if (th_idx >= num_voxels) return;\n    int bs_idx = v_indices[th_idx * 4 + 0];\n    int z_idx = v_indices[th_idx * 4 + 1];\n    int y_idx = v_indices[th_idx * 4 + 2];\n    int x_idx = 
v_indices[th_idx * 4 + 3];\n\n    int v_sum = 0;\n    int bs_cnt = bs_idx - 1;\n    while(bs_cnt >= 0){\n        v_sum += v_bs_cnt[bs_cnt];\n        bs_cnt--;\n    }\n    int v_idx = th_idx - v_sum; // v_idx for this sample\n\n    xyz_to_vidx += bs_idx * hash_size * 2;\n    if (x_idx >= x_max || x_idx < 0 || y_idx < 0 || y_idx >= y_max || z_idx < 0 || z_idx >= z_max) return; // out of bound\n\n    // key -> [x_max, y_max, z_max] value -> v_idx\n    int key = x_idx * y_max * z_max + y_idx * z_max + z_idx;\n    hash_table_insert(key, v_idx, hash_size, xyz_to_vidx);\n\n    return;\n}\n\nvoid build_mapping_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int hash_size,\n                                                const int *v_indices, const int *v_bs_cnt, int *xyz_to_vidx){\n\n    cudaError_t err;\n\n    dim3 blocks(DIVUP(num_voxels, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    build_mapping_with_hash_kernel<<<blocks, threads>>>(x_max, y_max, z_max, num_voxels, hash_size,\n                                                            v_indices, v_bs_cnt, xyz_to_vidx);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n"
  },
  {
    "path": "pcdet/ops/votr_ops/src/build_mapping_gpu.h",
    "content": "/*\nBuilding xyz -> idx sparse tensor mapping\nWritten by Jiageng Mao\n*/\n\n#ifndef BUILD_MAPPING_GPU_H\n#define BUILD_MAPPING_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <vector>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n\nint build_mapping_with_tensor_wrapper(int x_max, int y_max, int z_max, int num_voxels,\n                                        at::Tensor v_indices_tensor, at::Tensor v_bs_cnt_tensor, at::Tensor xyz_to_vidx_tensor);\n\nvoid build_mapping_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels,\n                                                const int *v_indices, const int *v_bs_cnt, int *xyz_to_vidx);\n\nint build_mapping_with_hash_wrapper(int x_max, int y_max, int z_max, int num_voxels, int hash_size,\n                                        at::Tensor v_indices_tensor, at::Tensor v_bs_cnt_tensor, at::Tensor xyz_to_vidx_tensor);\n\nvoid build_mapping_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int num_voxels, int hash_size,\n                                                const int *v_indices, const int *v_bs_cnt, int *xyz_to_vidx);\n\nint downsample_with_tensor_wrapper(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                            int num_voxels, int num_ds_voxels,\n                                            at::Tensor v_indices_tensor, at::Tensor ds_v_indices_tensor,\n                                            at::Tensor xyz_to_vidx_tensor, at::Tensor vcount_tensor);\n\nvoid downsample_with_tensor_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int num_ds_voxels,\n                                                const int *v_indices, int *ds_v_indices,\n                                                int *xyz_to_vidx, int *vcount);\n\nint downsample_with_hash_wrapper(int x_max, int y_max, int z_max, int 
x_stride, int y_stride, int z_stride,\n                                            int num_voxels, int num_ds_voxels, int hash_size,\n                                            at::Tensor v_indices_tensor, at::Tensor ds_v_indices_tensor,\n                                            at::Tensor xyz_to_vidx_tensor, at::Tensor vcount_tensor);\n\nvoid downsample_with_hash_kernel_launcher(int x_max, int y_max, int z_max, int x_stride, int y_stride, int z_stride,\n                                                int num_voxels, int num_ds_voxels, int hash_size,\n                                                const int *v_indices, int *ds_v_indices,\n                                                int *xyz_to_vidx, int *vcount);\n#endif"
  },
  {
    "path": "pcdet/ops/votr_ops/src/group_features.cpp",
    "content": "/*\nStacked-batch-data version of point grouping, modified from the original implementation of official PointNet++ codes.\nWritten by Shaoshuai Shi\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <torch/serialize/tensor.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include <vector>\n#include <THC/THC.h>\n#include \"group_features_gpu.h\"\n\nextern THCState *state;\n#define CHECK_CUDA(x) do { \\\n  if (!x.type().is_cuda()) { \\\n    fprintf(stderr, \"%s must be CUDA tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_CONTIGUOUS(x) do { \\\n  if (!x.is_contiguous()) { \\\n    fprintf(stderr, \"%s must be contiguous tensor at %s:%d\\n\", #x, __FILE__, __LINE__); \\\n    exit(-1); \\\n  } \\\n} while (0)\n#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)\n\n\nint group_features_grad_wrapper_stack(int B, int M, int C, int N, int nsample,\n    at::Tensor grad_out_tensor, at::Tensor idx_tensor, at::Tensor idx_batch_cnt_tensor,\n    at::Tensor features_batch_cnt_tensor, at::Tensor grad_features_tensor) {\n\n    CHECK_INPUT(grad_out_tensor);\n    CHECK_INPUT(idx_tensor);\n    CHECK_INPUT(idx_batch_cnt_tensor);\n    CHECK_INPUT(features_batch_cnt_tensor);\n    CHECK_INPUT(grad_features_tensor);\n\n    const float *grad_out = grad_out_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    const int *idx_batch_cnt = idx_batch_cnt_tensor.data<int>();\n    const int *features_batch_cnt = features_batch_cnt_tensor.data<int>();\n    float *grad_features = grad_features_tensor.data<float>();\n\n    group_features_grad_kernel_launcher_stack(B, M, C, N, nsample, grad_out, idx, idx_batch_cnt, features_batch_cnt, grad_features);\n    return 1;\n}\n\n\nint group_features_wrapper_stack(int B, int M, int C, int nsample,\n    at::Tensor features_tensor, at::Tensor features_batch_cnt_tensor,\n    at::Tensor idx_tensor, at::Tensor idx_batch_cnt_tensor, at::Tensor out_tensor) {\n\n    
CHECK_INPUT(features_tensor);\n    CHECK_INPUT(features_batch_cnt_tensor);\n    CHECK_INPUT(idx_tensor);\n    CHECK_INPUT(idx_batch_cnt_tensor);\n    CHECK_INPUT(out_tensor);\n\n    const float *features = features_tensor.data<float>();\n    const int *idx = idx_tensor.data<int>();\n    const int *features_batch_cnt = features_batch_cnt_tensor.data<int>();\n    const int *idx_batch_cnt = idx_batch_cnt_tensor.data<int>();\n    float *out = out_tensor.data<float>();\n\n    group_features_kernel_launcher_stack(B, M, C, nsample, features, features_batch_cnt, idx, idx_batch_cnt, out);\n    return 1;\n}"
  },
  {
    "path": "pcdet/ops/votr_ops/src/group_features_gpu.cu",
    "content": "/*\nModified from group points, don't care indices with -1(<0)\nWritten by Jiageng Mao\nAll Rights Reserved 2019-2020.\n*/\n\n\n#include <stdio.h>\n#include <stdlib.h>\n\n#include \"votr_cuda_utils.h\"\n#include \"group_features_gpu.h\"\n\n\n__global__ void group_features_grad_kernel_stack(int B, int M, int C, int N, int nsample,\n    const float *grad_out, const int *idx, const int *idx_batch_cnt, const int *features_batch_cnt, float *grad_features) {\n    // :param grad_out: (M1 + M2 ..., C, nsample) tensor of the gradients of the output from forward\n    // :param idx: (M1 + M2 ..., nsample) tensor containing the indicies of features to group with\n    // :param idx_batch_cnt: (batch_size) [M1 + M2 ...] tensor containing the indicies of features to group with\n    // :param features_batch_cnt: (batch_size) [N1 + N2 ...] tensor containing the indicies of features to group with\n    // :return:\n    //     grad_features: (N1 + N2 ..., C) gradient of the features\n    int index = blockIdx.x * blockDim.x + threadIdx.x;\n    int sample_idx = index % nsample;\n    int C_idx = (index / nsample) % C;\n    int pt_idx = (index / nsample / C);\n\n    if (pt_idx >= M || C_idx >= C || sample_idx >= nsample) return;\n\n    idx += pt_idx * nsample + sample_idx;\n    if (idx[0] < 0) return; // don't care neg indices\n\n    int bs_idx = 0, pt_cnt = idx_batch_cnt[0];\n    for (int k = 1; k < B; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += idx_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int features_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) features_batch_start_idx += features_batch_cnt[k];\n\n    grad_out += pt_idx * C * nsample + C_idx * nsample + sample_idx;\n\n    grad_features += (features_batch_start_idx + idx[0]) * C + C_idx;\n    atomicAdd(grad_features, grad_out[0]);\n}\n\nvoid group_features_grad_kernel_launcher_stack(int B, int M, int C, int N, int nsample,\n    const float *grad_out, const int *idx, const int 
*idx_batch_cnt, const int *features_batch_cnt, float *grad_features) {\n    // :param grad_out: (M1 + M2 ..., C, nsample) tensor of the gradients of the output from forward\n    // :param idx: (M1 + M2 ..., nsample) tensor containing the indices of features to group with\n    // :param idx_batch_cnt: (batch_size) [M1, M2, ...] tensor containing the number of idx rows in each batch sample\n    // :param features_batch_cnt: (batch_size) [N1, N2, ...] tensor containing the number of features in each batch sample\n    // :return:\n    //     grad_features: (N1 + N2 ..., C) gradient of the features\n\n    cudaError_t err;\n    // dim3 blocks(DIVUP(npoints * nsample, THREADS_PER_BLOCK), c, b);  // blockIdx.x(col), blockIdx.y(row)\n    dim3 blocks(DIVUP(M * C * nsample, THREADS_PER_BLOCK));  // 1D grid: one thread per (point, channel, sample)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    group_features_grad_kernel_stack<<<blocks, threads>>>(B, M, C, N, nsample, grad_out, idx, idx_batch_cnt, features_batch_cnt, grad_features);\n\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}\n\n\n__global__ void group_features_kernel_stack(int B, int M, int C, int nsample,\n    const float *features, const int *features_batch_cnt, const int *idx, const int *idx_batch_cnt, float *out) {\n    // :param features: (N1 + N2 ..., C) tensor of features to group\n    // :param features_batch_cnt: (batch_size) [N1, N2, ...] tensor containing the number of features in each batch sample\n    // :param idx: (M1 + M2 ..., nsample) tensor containing the indices of features to group with\n    // :param idx_batch_cnt: (batch_size) [M1 + M2 ...] 
tensor containing the indicies of features to group with\n    // :return:\n    //     output: (M1 + M2, C, nsample) tensor\n    int index = blockIdx.x * blockDim.x + threadIdx.x;\n    int sample_idx = index % nsample;\n    int C_idx = (index / nsample) % C;\n    int pt_idx = (index / nsample / C);\n\n    if (pt_idx >= M || C_idx >= C || sample_idx >= nsample) return;\n\n    idx += pt_idx * nsample + sample_idx;\n    if (idx[0] < 0) return; // don't care neg indices\n\n    int bs_idx = 0, pt_cnt = idx_batch_cnt[0];\n    for (int k = 1; k < B; k++){\n        if (pt_idx < pt_cnt) break;\n        pt_cnt += idx_batch_cnt[k];\n        bs_idx = k;\n    }\n\n    int features_batch_start_idx = 0;\n    for (int k = 0; k < bs_idx; k++) features_batch_start_idx += features_batch_cnt[k];\n    features += features_batch_start_idx * C;\n\n    int in_idx = idx[0] * C + C_idx;\n    int out_idx = pt_idx * C * nsample + C_idx * nsample + sample_idx;\n\n    out[out_idx] = features[in_idx];\n}\n\n\nvoid group_features_kernel_launcher_stack(int B, int M, int C, int nsample,\n    const float *features, const int *features_batch_cnt, const int *idx, const int *idx_batch_cnt, float *out) {\n    // :param features: (N1 + N2 ..., C) tensor of features to group\n    // :param features_batch_cnt: (batch_size) [N1 + N2 ...] tensor containing the indicies of features to group with\n    // :param idx: (M1 + M2 ..., nsample) tensor containing the indicies of features to group with\n    // :param idx_batch_cnt: (batch_size) [M1 + M2 ...] 
tensor containing the indicies of features to group with\n    // :return:\n    //     output: (M1 + M2, C, nsample) tensor\n\n    cudaError_t err;\n    dim3 blocks(DIVUP(M * C * nsample, THREADS_PER_BLOCK));  // blockIdx.x(col), blockIdx.y(row)\n    dim3 threads(THREADS_PER_BLOCK);\n\n    group_features_kernel_stack<<<blocks, threads>>>(B, M, C, nsample, features, features_batch_cnt, idx, idx_batch_cnt, out);\n    // cudaDeviceSynchronize();  // for using printf in kernel function\n    err = cudaGetLastError();\n    if (cudaSuccess != err) {\n        fprintf(stderr, \"CUDA kernel failed : %s\\n\", cudaGetErrorString(err));\n        exit(-1);\n    }\n}"
  },
  {
    "path": "pcdet/ops/votr_ops/src/group_features_gpu.h",
    "content": "/*\nModified from group points, don't care indices with -1(<0)\nWritten by Jiageng Mao\nAll Rights Reserved 2019-2020.\n*/\n\n\n#ifndef _STACK_GROUP_FEATURES_GPU_H\n#define _STACK_GROUP_FEATURES_GPU_H\n\n#include <torch/serialize/tensor.h>\n#include <cuda.h>\n#include <cuda_runtime_api.h>\n#include <vector>\n\n\nint group_features_wrapper_stack(int B, int M, int C, int nsample,\n    at::Tensor features_tensor, at::Tensor features_batch_cnt_tensor,\n    at::Tensor idx_tensor, at::Tensor idx_batch_cnt_tensor, at::Tensor out_tensor);\n\nvoid group_features_kernel_launcher_stack(int B, int M, int C, int nsample,\n    const float *features, const int *features_batch_cnt, const int *idx, const int *idx_batch_cnt, float *out);\n\nint group_features_grad_wrapper_stack(int B, int M, int C, int N, int nsample,\n    at::Tensor grad_out_tensor, at::Tensor idx_tensor, at::Tensor idx_batch_cnt_tensor,\n    at::Tensor features_batch_cnt_tensor, at::Tensor grad_features_tensor);\n\nvoid group_features_grad_kernel_launcher_stack(int B, int M, int C, int N, int nsample,\n    const float *grad_out, const int *idx, const int *idx_batch_cnt, const int *features_batch_cnt, float *grad_features);\n\n#endif\n"
  },
  {
    "path": "pcdet/ops/votr_ops/src/votr_api.cpp",
    "content": "#include <torch/serialize/tensor.h>\n#include <torch/extension.h>\n\n#include \"build_mapping_gpu.h\"\n#include \"build_attention_indices_gpu.h\"\n#include \"group_features_gpu.h\"\n\nPYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {\n    m.def(\"build_mapping_with_tensor_wrapper\", &build_mapping_with_tensor_wrapper, \"build_mapping_with_tensor_wrapper\");\n    m.def(\"build_mapping_with_hash_wrapper\", &build_mapping_with_hash_wrapper, \"build_mapping_with_hash_wrapper\");\n    m.def(\"downsample_with_tensor_wrapper\", &downsample_with_tensor_wrapper, \"downsample_with_tensor_wrapper\");\n    m.def(\"downsample_with_hash_wrapper\", &downsample_with_hash_wrapper, \"downsample_with_hash_wrapper\");\n    m.def(\"subm_local_attention_with_tensor_wrapper\", &subm_local_attention_with_tensor_wrapper, \"subm_local_attention_with_tensor_wrapper\");\n    m.def(\"subm_local_attention_with_hash_wrapper\", &subm_local_attention_with_hash_wrapper, \"subm_local_attention_with_hash_wrapper\");\n    m.def(\"sparse_local_attention_with_tensor_wrapper\", &sparse_local_attention_with_tensor_wrapper, \"sparse_local_attention_with_tensor_wrapper\");\n    m.def(\"sparse_local_attention_with_hash_wrapper\", &sparse_local_attention_with_hash_wrapper, \"sparse_local_attention_with_hash_wrapper\");\n    m.def(\"subm_strided_attention_with_tensor_wrapper\", &subm_strided_attention_with_tensor_wrapper, \"subm_strided_attention_with_tensor_wrapper\");\n    m.def(\"subm_strided_attention_with_hash_wrapper\", &subm_strided_attention_with_hash_wrapper, \"subm_strided_attention_with_hash_wrapper\");\n    m.def(\"sparse_strided_attention_with_tensor_wrapper\", &sparse_strided_attention_with_tensor_wrapper, \"sparse_strided_attention_with_tensor_wrapper\");\n    m.def(\"sparse_strided_attention_with_hash_wrapper\", &sparse_strided_attention_with_hash_wrapper, \"sparse_strided_attention_with_hash_wrapper\");\n    m.def(\"group_features_grad_wrapper\", &group_features_grad_wrapper_stack, 
\"group_features_grad_wrapper_stack\");\n    m.def(\"group_features_wrapper\", &group_features_wrapper_stack, \"group_features_wrapper_stack\");\n}\n"
  },
  {
    "path": "pcdet/ops/votr_ops/src/votr_cuda_utils.h",
    "content": "#ifndef VOTR_CUDA_UTILS_H\n#define VOTR_CUDA_UTILS_H\n\n#include <cmath>\n\n#define THREADS_PER_BLOCK 256\n#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))\n#define EMPTY_KEY -1\n#define BLK_SIGNAL -2\n\n#endif"
  },
  {
    "path": "pcdet/ops/votr_ops/votr_utils.py",
    "content": "import torch\nfrom torch.autograd import Function, Variable\n\nfrom . import votr_ops_cuda as votr\n\nclass BuildTensorTable(Function):\n\n    @staticmethod\n    def forward(ctx, batch_size, spatial_shape, voxel_indices, v_bs_cnt):\n        \"\"\"\n        Args:\n            ctx:\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x)\n        Returns:\n        \"\"\"\n        x_max, y_max, z_max = spatial_shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n        dense_map = torch.zeros((batch_size, x_max, y_max, z_max)).int().fill_(-1)\n        dense_map = dense_map.to(voxel_indices.device)\n\n        votr.build_mapping_with_tensor_wrapper(x_max, y_max, z_max, num_voxels, voxel_indices, v_bs_cnt, dense_map)\n        return dense_map\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None\n\nbuild_tensor_table = BuildTensorTable.apply\n\nclass BuildHashTable(Function):\n\n    @staticmethod\n    def forward(ctx, batch_size, hash_size, spatial_shape, voxel_indices, v_bs_cnt):\n        \"\"\"\n        Args:\n            ctx:\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x)\n        Returns:\n        \"\"\"\n        x_max, y_max, z_max = spatial_shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n        dense_map = torch.zeros((batch_size, hash_size, 2)).int().fill_(-1)\n        dense_map = dense_map.to(voxel_indices.device)\n\n        votr.build_mapping_with_hash_wrapper(x_max, y_max, z_max, num_voxels, hash_size, voxel_indices, v_bs_cnt, dense_map)\n        return dense_map\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None\n\nbuild_hash_table = BuildHashTable.apply\n\nclass TensorDownSample(Function):\n    @staticmethod\n    def forward(ctx, strides, num_ds_voxels, batch_size, spatial_shape, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            
voxel_indices: (num_voxels, 4) (bs_idx, z, y, x)\n        Returns:\n        \"\"\"\n        x_stride, y_stride, z_stride = strides\n        x_max, y_max, z_max = spatial_shape\n        dense_map = torch.zeros((batch_size, x_max, y_max, z_max)).int().fill_(-1)\n        dense_map = dense_map.to(voxel_indices.device)\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n        ds_voxel_indices = torch.zeros((batch_size, num_ds_voxels, 3)).int().fill_(-1).to(voxel_indices.device)\n        vcount = torch.zeros(batch_size).int().to(voxel_indices.device)\n        votr.downsample_with_tensor_wrapper(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                num_voxels, num_ds_voxels,\n                                                voxel_indices, ds_voxel_indices, dense_map, vcount)\n        ds_voxel_list = []\n        for i in range(batch_size):\n            ds_voxel = ds_voxel_indices[i]\n            ds_voxel = ds_voxel[ds_voxel[:, 0] >= 0] # not -1\n            bs_idx = torch.zeros((ds_voxel.shape[0], 1)).int().fill_(i).to(voxel_indices.device)\n            ds_voxel = torch.cat([bs_idx, ds_voxel], dim = 1)\n            ds_voxel_list.append(ds_voxel)\n\n        output_voxels = torch.cat(ds_voxel_list, dim = 0).contiguous()\n        return output_voxels, dense_map\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None\n\ntensor_down_sample = TensorDownSample.apply\n\nclass HashTableDownSample(Function):\n    @staticmethod\n    def forward(ctx, strides, num_ds_voxels, batch_size, hash_size, spatial_shape, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x)\n        Returns:\n        \"\"\"\n        x_stride, y_stride, z_stride = strides\n        x_max, y_max, z_max = spatial_shape\n        dense_map = torch.zeros((batch_size, hash_size, 2)).int().fill_(-1)\n        dense_map = 
dense_map.to(voxel_indices.device)\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n        ds_voxel_indices = torch.zeros((batch_size, num_ds_voxels, 3)).int().fill_(-1).to(voxel_indices.device)\n        vcount = torch.zeros(batch_size).int().to(voxel_indices.device)\n        votr.downsample_with_hash_wrapper(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                num_voxels, num_ds_voxels, hash_size,\n                                                voxel_indices, ds_voxel_indices, dense_map, vcount)\n        ds_voxel_list = []\n        for i in range(batch_size):\n            ds_voxel = ds_voxel_indices[i]\n            ds_voxel = ds_voxel[ds_voxel[:, 0] >= 0] # not -1\n            bs_idx = torch.zeros((ds_voxel.shape[0], 1)).int().fill_(i).to(voxel_indices.device)\n            ds_voxel = torch.cat([bs_idx, ds_voxel], dim = 1)\n            ds_voxel_list.append(ds_voxel)\n\n        output_voxels = torch.cat(ds_voxel_list, dim = 0).contiguous()\n        return output_voxels, dense_map\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None\n\nhash_table_down_sample = HashTableDownSample.apply\n\nclass SparseLocalAttentionTensorIndices(Function):\n\n    @staticmethod\n    def forward(ctx, attend_size, attend_range, strides, dense_map, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            dense_map: (bs_idx, x_max, y_max, z_max) -> old map table\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x) -> new downsampled indices\n        Returns:\n        \"\"\"\n        x_stride, y_stride, z_stride = strides\n        batch_size, x_max, y_max, z_max = dense_map.shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n\n        attend_indices = torch.zeros((num_voxels, attend_size)).int().fill_(-1).to(voxel_indices.device)\n\n        
votr.sparse_local_attention_with_tensor_wrapper(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                        num_voxels, attend_size, attend_range,\n                                                        attend_indices, voxel_indices, dense_map)\n        return attend_indices\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None, None\n\nsparse_local_attention_tensor_indices = SparseLocalAttentionTensorIndices.apply\n\nclass SparseLocalAttentionHashIndices(Function):\n\n    @staticmethod\n    def forward(ctx, spatial_shape, attend_size, attend_range, strides, dense_map, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            dense_map: (bs_idx, hash_size, 2) -> old map table\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x) -> new downsampled indices\n        Returns:\n        \"\"\"\n        x_stride, y_stride, z_stride = strides\n        x_max, y_max, z_max = spatial_shape\n        batch_size, hash_size, _ = dense_map.shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n\n        attend_indices = torch.zeros((num_voxels, attend_size)).int().fill_(-1).to(voxel_indices.device)\n\n        votr.sparse_local_attention_with_hash_wrapper(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                        num_voxels, attend_size, attend_range, hash_size,\n                                                        attend_indices, voxel_indices, dense_map)\n        return attend_indices\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None, None\n\nsparse_local_attention_hash_indices = SparseLocalAttentionHashIndices.apply\n\nclass SparseStridedAttentionTensorIndices(Function):\n\n    @staticmethod\n    def forward(ctx, attend_size, range_spec, strides, dense_map, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n         
   dense_map: (bs_idx, x_max, y_max, z_max) -> old map table\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x) -> new downsampled indices\n        Returns:\n        \"\"\"\n        x_stride, y_stride, z_stride = strides\n        batch_size, x_max, y_max, z_max = dense_map.shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n        range_spec = torch.tensor(range_spec).int().to(voxel_indices.device)\n        num_range = range_spec.shape[0]\n        attend_indices = torch.zeros((num_voxels, attend_size)).int().fill_(-1).to(voxel_indices.device)\n\n        votr.sparse_strided_attention_with_tensor_wrapper(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                            num_voxels, attend_size, num_range,\n                                                            attend_indices, voxel_indices,\n                                                            dense_map, range_spec)\n        return attend_indices\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None, None\n\nsparse_strided_attention_tensor_indices = SparseStridedAttentionTensorIndices.apply\n\nclass SparseStridedAttentionHashIndices(Function):\n\n    @staticmethod\n    def forward(ctx, spatial_shape, attend_size, range_spec, strides, dense_map, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            dense_map: (bs_idx, x_max, y_max, z_max) -> old map table\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x) -> new downsampled indices\n        Returns:\n        \"\"\"\n        x_stride, y_stride, z_stride = strides\n        x_max, y_max, z_max = spatial_shape\n        batch_size, hash_size, _ = dense_map.shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n        range_spec = torch.tensor(range_spec).int().to(voxel_indices.device)\n        num_range = range_spec.shape[0]\n        
attend_indices = torch.zeros((num_voxels, attend_size)).int().fill_(-1).to(voxel_indices.device)\n\n        votr.sparse_strided_attention_with_hash_wrapper(x_max, y_max, z_max, x_stride, y_stride, z_stride,\n                                                            num_voxels, attend_size, num_range, hash_size,\n                                                            attend_indices, voxel_indices,\n                                                            dense_map, range_spec)\n        return attend_indices\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None, None\n\nsparse_strided_attention_hash_indices = SparseStridedAttentionHashIndices.apply\n\nclass SubMLocalAttentionTensorIndices(Function):\n\n    @staticmethod\n    def forward(ctx, attend_size, attend_range, dense_map, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x)\n        Returns:\n        \"\"\"\n\n        batch_size, x_max, y_max, z_max = dense_map.shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n\n        attend_indices = torch.zeros((num_voxels, attend_size)).int().fill_(-1).to(voxel_indices.device)\n\n        votr.subm_local_attention_with_tensor_wrapper(x_max, y_max, z_max,\n                                                        num_voxels, attend_size, attend_range,\n                                                        attend_indices, voxel_indices, dense_map)\n        return attend_indices\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None, None\n\nsubm_local_attention_tensor_indices = SubMLocalAttentionTensorIndices.apply\n\nclass SubMLocalAttentionHashIndices(Function):\n\n    @staticmethod\n    def forward(ctx, spatial_shape, attend_size, attend_range, dense_map, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            voxel_indices: (num_voxels, 4) 
(bs_idx, z, y, x)\n        Returns:\n        \"\"\"\n\n        x_max, y_max, z_max = spatial_shape\n        batch_size, hash_size, _ = dense_map.shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n\n        attend_indices = torch.zeros((num_voxels, attend_size)).int().fill_(-1).to(voxel_indices.device)\n\n        votr.subm_local_attention_with_hash_wrapper(x_max, y_max, z_max,\n                                                        num_voxels, attend_size, attend_range, hash_size,\n                                                        attend_indices, voxel_indices, dense_map)\n        return attend_indices\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None, None\n\nsubm_local_attention_hash_indices = SubMLocalAttentionHashIndices.apply\n\nclass SubMStridedAttentionTensorIndices(Function):\n\n    @staticmethod\n    def forward(ctx, attend_size, range_spec, dense_map, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x)\n        Returns:\n        \"\"\"\n        batch_size, x_max, y_max, z_max = dense_map.shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n        range_spec = torch.tensor(range_spec).int().to(voxel_indices.device)\n        num_range = range_spec.shape[0]\n        attend_indices = torch.zeros((num_voxels, attend_size)).int().fill_(-1).to(voxel_indices.device)\n\n        votr.subm_strided_attention_with_tensor_wrapper(x_max, y_max, z_max,\n                                                            num_voxels, attend_size, num_range,\n                                                            attend_indices, voxel_indices,\n                                                            dense_map, range_spec)\n        return attend_indices\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None, 
None\n\nsubm_strided_attention_tensor_indices = SubMStridedAttentionTensorIndices.apply\n\nclass SubMStridedAttentionHashIndices(Function):\n\n    @staticmethod\n    def forward(ctx, spatial_shape, attend_size, range_spec, dense_map, voxel_indices):\n        \"\"\"\n        Args:\n            ctx:\n            voxel_indices: (num_voxels, 4) (bs_idx, z, y, x)\n        Returns:\n        \"\"\"\n        x_max, y_max, z_max = spatial_shape\n        batch_size, hash_size, _ = dense_map.shape\n        num_voxels = voxel_indices.shape[0]\n        assert voxel_indices.is_contiguous()\n        range_spec = torch.tensor(range_spec).int().to(voxel_indices.device)\n        num_range = range_spec.shape[0]\n        attend_indices = torch.zeros((num_voxels, attend_size)).int().fill_(-1).to(voxel_indices.device)\n\n        votr.subm_strided_attention_with_hash_wrapper(x_max, y_max, z_max,\n                                                            num_voxels, attend_size, num_range, hash_size,\n                                                            attend_indices, voxel_indices,\n                                                            dense_map, range_spec)\n        return attend_indices\n\n    @staticmethod\n    def backward(ctx, a=None):\n        return None, None, None, None, None\n\nsubm_strided_attention_hash_indices = SubMStridedAttentionHashIndices.apply\n\nclass GroupingOperation(Function):\n\n    @staticmethod\n    def forward(ctx, features: torch.Tensor, features_batch_cnt: torch.Tensor,\n                idx: torch.Tensor, idx_batch_cnt: torch.Tensor):\n        \"\"\"\n        Args:\n            ctx:\n            features: (N1 + N2 ..., C) tensor of features to group\n            features_batch_cnt: (batch_size) [N1 + N2 ...] tensor containing the indicies of features to group with\n            idx: (M1 + M2 ..., nsample) tensor containing the indicies of features to group with\n            idx_batch_cnt: (batch_size) [M1 + M2 ...] 
tensor containing the number of idx rows in each batch element\n\n        Returns:\n            output: (M1 + M2 ..., C, nsample) tensor\n        \"\"\"\n        assert features.is_contiguous()\n        assert features_batch_cnt.is_contiguous()\n        assert idx.is_contiguous()\n        assert idx_batch_cnt.is_contiguous()\n\n        assert features.shape[0] == features_batch_cnt.sum(), \\\n            'features: %s, features_batch_cnt: %s' % (str(features.shape), str(features_batch_cnt))\n        assert idx.shape[0] == idx_batch_cnt.sum(), \\\n            'idx: %s, idx_batch_cnt: %s' % (str(idx.shape), str(idx_batch_cnt))\n\n        M, nsample = idx.size()\n        N, C = features.size()\n        B = idx_batch_cnt.shape[0]\n        # zero-initialised, so entries whose idx is negative stay 0\n        output = torch.zeros((M, C, nsample), dtype=torch.float32, device=features.device)\n\n        votr.group_features_wrapper(B, M, C, nsample, features, features_batch_cnt, idx, idx_batch_cnt, output)\n\n        ctx.for_backwards = (B, N, idx, features_batch_cnt, idx_batch_cnt)\n        return output\n\n    @staticmethod\n    def backward(ctx, grad_out: torch.Tensor):\n        \"\"\"\n        Args:\n            ctx:\n            grad_out: (M1 + M2 ..., C, nsample) tensor of the gradients of the output from forward\n\n        Returns:\n            grad_features: (N1 + N2 ..., C) gradient of the features\n        \"\"\"\n        B, N, idx, features_batch_cnt, idx_batch_cnt = ctx.for_backwards\n\n        M, C, nsample = grad_out.size()\n        # accumulated into by atomicAdd in the CUDA kernel, so it must start at 0\n        grad_features = torch.zeros((N, C), dtype=torch.float32, device=grad_out.device)\n\n        grad_out_data = grad_out.data.contiguous()\n        votr.group_features_grad_wrapper(B, M, C, N, nsample, grad_out_data, idx,\n                                            idx_batch_cnt, features_batch_cnt, grad_features)\n        return grad_features, None, None, None\n\ngrouping_operation = GroupingOperation.apply"
  },
  {
    "path": "pcdet/utils/__init__.py",
    "content": ""
  },
  {
    "path": "pcdet/utils/bbloss.py",
    "content": "import torch\nimport numpy as np\n\ndef limit( ang):\n    ang = ang % (2 * np.pi)\n\n    ang[ang > np.pi] = ang[ang > np.pi] - 2 * np.pi\n\n    ang[ang < -np.pi] = ang[ang < -np.pi] + 2 * np.pi\n\n    return ang\n\n\ndef ang_weight(pred, gt):\n\n    a2 = torch.abs(torch.sin(pred - gt))\n\n    return 1-a2\n\ndef compute_iou(x,w,y,l):\n    zmax1 = x + w * 0.5\n    zmin1 = x - w * 0.5\n    zmax2 = y + l * 0.5\n    zmin2 = y - l * 0.5\n    z_overlap = (torch.min(zmax1, zmax2) - torch.max(zmin1, zmin2)).clamp_min(0.)\n    all_lap = (torch.max(zmax1, zmax2) - torch.min(zmin1, zmin2)).clamp_min(0.)\n    iou = z_overlap / all_lap\n    return iou\n\ndef bb_loss(pred, target):\n    iouw = compute_iou(pred[..., 0], pred[..., 3], target[..., 0], target[..., 3])\n    ioul = compute_iou(pred[..., 1], pred[..., 4], target[..., 1], target[..., 4])\n    iouh = compute_iou(pred[..., 2], pred[..., 5], target[..., 2], target[..., 5])\n\n    a_p = limit(pred[..., 6])\n    a_g = limit(target[..., 6])\n    ioua = ang_weight(a_p, a_g)\n\n    iou = iouw*ioul*iouh*ioua\n\n    diff_angle = pred[:, -1] - target[:, -1]\n    angle_factor = 1.25 * (1.0 - torch.abs(torch.cos(diff_angle)))\n\n    center_dist_square = torch.pow(target[:, 0:3] - pred[:, 0:3], 2).sum(-1)\n\n    finall_loss = 1-iou + angle_factor + center_dist_square\n\n    return finall_loss*1.5\n\nclass APLoss(torch.autograd.Function):\n    @staticmethod\n    def forward(ctx, logits, targets):\n\n        classification_grads, classification_losses = AP_loss(logits, targets)\n        #########################################################\n\n        ctx.save_for_backward(classification_grads, None)\n        return classification_losses\n\n    @staticmethod\n    def backward(ctx, out_grad1):\n        g1,g2 = ctx.saved_tensors\n        return g1 * out_grad1, None\n\n\ndef AP_loss(logits, targets):\n    delta = 1.0\n\n    grad = torch.zeros(logits.shape).cuda()\n    metric = torch.zeros(1).cuda()\n\n    if 
torch.max(targets) <= 0:\n        return grad, metric\n\n    labels_p = (targets == 1)\n    fg_logits = logits[labels_p]\n    threshold_logit = torch.min(fg_logits) - delta #-0.9\n\n    ######## Ignore those negative j that satisfy (L_{ij}=0 for all positive i), to accelerate the AP-loss computation.\n    valid_labels_n = ((targets == 0) & (logits >= threshold_logit))\n    valid_bg_logits = logits[valid_labels_n]\n    valid_bg_grad = torch.zeros(len(valid_bg_logits)).cuda()\n    ########\n\n    fg_num = len(fg_logits)\n    prec = torch.zeros(fg_num).cuda()\n    order = torch.argsort(fg_logits)\n    max_prec = 0\n\n    for ii in order:\n        tmp1 = fg_logits - fg_logits[ii]\n        tmp1 = torch.clamp(tmp1 / (2 * delta) + 0.5, min=0, max=1)\n        tmp2 = valid_bg_logits - fg_logits[ii]\n        tmp2 = torch.clamp(tmp2 / (2 * delta) + 0.5, min=0, max=1)\n        a = torch.sum(tmp1) + 0.5\n        b = torch.sum(tmp2)\n        tmp2 /= (a + b)\n        current_prec = a / (a + b)\n        if (max_prec <= current_prec):\n            max_prec = current_prec\n        else:\n            tmp2 *= ((1 - max_prec) / (1 - current_prec))\n        valid_bg_grad += tmp2\n        prec[ii] = max_prec\n\n    grad[valid_labels_n] = valid_bg_grad\n    grad[labels_p] = -(1 - prec)\n\n    fg_num = max(fg_num, 1)\n\n    grad /= (fg_num)\n\n    metric = torch.sum(prec, dim=0, keepdim=True) / fg_num\n\n    return grad, 1 - metric"
  },
  {
    "path": "pcdet/utils/box_coder_utils.py",
    "content": "import numpy as np\nimport torch\n\n\nclass ResidualCoder(object):\n    def __init__(self, code_size=7, encode_angle_by_sincos=False, **kwargs):\n        super().__init__()\n        self.code_size = code_size\n        self.encode_angle_by_sincos = encode_angle_by_sincos\n        if self.encode_angle_by_sincos:\n            self.code_size += 1\n\n    def encode_torch(self, boxes, anchors):\n        \"\"\"\n        Args:\n            boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n            anchors: (N, 7 + C) [x, y, z, dx, dy, dz, heading or *[cos, sin], ...]\n\n        Returns:\n\n        \"\"\"\n        anchors[:, 3:6] = torch.clamp_min(anchors[:, 3:6], min=1e-5)\n        boxes[:, 3:6] = torch.clamp_min(boxes[:, 3:6], min=1e-5)\n\n        xa, ya, za, dxa, dya, dza, ra, *cas = torch.split(anchors, 1, dim=-1)\n\n        xg, yg, zg, dxg, dyg, dzg, rg, *cgs = torch.split(boxes, 1, dim=-1)\n\n        diagonal = torch.sqrt(dxa ** 2 + dya ** 2)\n        xt = (xg - xa) / diagonal\n        yt = (yg - ya) / diagonal\n        zt = (zg - za) / dza\n        dxt = torch.log(dxg / dxa)\n        dyt = torch.log(dyg / dya)\n        dzt = torch.log(dzg / dza)\n        if self.encode_angle_by_sincos:\n            rt_cos = torch.cos(rg) - torch.cos(ra)\n            rt_sin = torch.sin(rg) - torch.sin(ra)\n            rts = [rt_cos, rt_sin]\n        else:\n            rts = [rg - ra]\n\n        cts = [g - a for g, a in zip(cgs, cas)]\n        return torch.cat([xt, yt, zt, dxt, dyt, dzt, *rts, *cts], dim=-1)\n\n    def decode_torch(self, box_encodings, anchors):\n        \"\"\"\n        Args:\n            box_encodings: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading or *[cos, sin], ...]\n            anchors: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n\n        Returns:\n\n        \"\"\"\n        xa, ya, za, dxa, dya, dza, ra, *cas = torch.split(anchors, 1, dim=-1)\n\n        if not self.encode_angle_by_sincos:\n            xt, 
yt, zt, dxt, dyt, dzt, rt, *cts = torch.split(box_encodings, 1, dim=-1)\n        else:\n            xt, yt, zt, dxt, dyt, dzt, cost, sint, *cts = torch.split(box_encodings, 1, dim=-1)\n\n        diagonal = torch.sqrt(dxa ** 2 + dya ** 2)\n        xg = xt * diagonal + xa\n        yg = yt * diagonal + ya\n        zg = zt * dza + za\n\n        dxg = torch.exp(dxt) * dxa\n        dyg = torch.exp(dyt) * dya\n        dzg = torch.exp(dzt) * dza\n\n        if self.encode_angle_by_sincos:\n            rg_cos = cost + torch.cos(ra)\n            rg_sin = sint + torch.sin(ra)\n            rg = torch.atan2(rg_sin, rg_cos)\n        else:\n            rg = rt + ra\n\n        cgs = [t + a for t, a in zip(cts, cas)]\n        return torch.cat([xg, yg, zg, dxg, dyg, dzg, rg, *cgs], dim=-1)\n\nclass ResidualCoderV2(object):\n    def __init__(self, code_size=7, encode_angle_by_sincos=False, **kwargs):\n        super().__init__()\n        self.code_size = code_size\n        self.encode_angle_by_sincos = encode_angle_by_sincos\n        if self.encode_angle_by_sincos:\n            self.code_size += 1\n\n    def encode_torch(self, boxes, anchors):\n        \"\"\"\n        Args:\n            boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n            anchors: (N, 7 + C) [x, y, z, dx, dy, dz, heading or *[cos, sin], ...]\n\n        Returns:\n\n        \"\"\"\n        anchors[:, 3:6] = torch.clamp_min(anchors[:, 3:6], min=1e-5)\n        boxes[:, 3:6] = torch.clamp_min(boxes[:, 3:6], min=1e-5)\n\n        xa, ya, za, dxa, dya, dza, ra, *cas = torch.split(anchors, 1, dim=-1)\n        xg, yg, zg, dxg, dyg, dzg, rg, *cgs = torch.split(boxes, 1, dim=-1)\n\n        za = za - dza/2\n        zg = zg - dzg / 2\n\n        xt = (xg - xa)\n        yt = (yg - ya)\n        zt = (zg - za)\n\n        dxt = torch.log(dxg )\n        dyt = torch.log(dyg )\n        dzt = torch.log(dzg )\n        if self.encode_angle_by_sincos:\n            rt_cos = torch.cos(rg)\n            rt_sin = torch.sin(rg)\n        
    rts = [rt_cos, rt_sin]\n        else:\n            rts = [rg - ra]\n\n        cts = [g - a for g, a in zip(cgs, cas)]\n        return torch.cat([xt, yt, zt, dxt, dyt, dzt, *rts, *cts], dim=-1)\n\n    def decode_torch(self, box_encodings, anchors):\n        \"\"\"\n        Args:\n            box_encodings: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading or *[cos, sin], ...]\n            anchors: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n\n        Returns:\n\n        \"\"\"\n        xa, ya, za, dxa, dya, dza, ra, *cas = torch.split(anchors, 1, dim=-1)\n\n        if not self.encode_angle_by_sincos:\n            xt, yt, zt, dxt, dyt, dzt, rt, *cts = torch.split(box_encodings, 1, dim=-1)\n        else:\n            xt, yt, zt, dxt, dyt, dzt, cost, sint, *cts = torch.split(box_encodings, 1, dim=-1)\n\n        za = za - dza / 2\n\n        xg = xt + xa\n        yg = yt + ya\n        zg = zt + za\n\n        dxg = torch.exp(dxt)\n        dyg = torch.exp(dyt)\n        dzg = torch.exp(dzt)\n\n        zg = zg + dzg/2\n\n        if self.encode_angle_by_sincos:\n            rg = torch.atan2(sint, cost)\n        else:\n            rg = rt + ra\n\n        cgs = [t + a for t, a in zip(cts, cas)]\n        return torch.cat([xg, yg, zg, dxg, dyg, dzg, rg, *cgs], dim=-1)\n\nclass ResidualCoderFree(object):\n    def __init__(self, code_size=8,  **kwargs):\n        super().__init__()\n        self.code_size = code_size\n\n    def encode_torch(self, boxes, centers):\n        \"\"\"\n        Args:\n            boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n            anchors: (N, 7 + C) [x, y, z, dx, dy, dz, heading or *[cos, sin], ...]\n\n        Returns:\n\n        \"\"\"\n\n        boxes[:, 3:6] = torch.clamp_min(boxes[:, 3:6], min=1e-5)\n\n        xa, ya,  *cas = torch.split(centers, 1, dim=-1)\n        xg, yg, zg, dxg, dyg, dzg, rg, *cgs = torch.split(boxes, 1, dim=-1)\n\n        xt = (xg - xa)\n        yt = (yg - ya)\n        zt = zg\n     
   dxt = torch.log(dxg)\n        dyt = torch.log(dyg)\n        dzt = torch.log(dzg)\n\n        rt_cos = torch.cos(rg)\n        rt_sin = torch.sin(rg)\n        rts = [rt_cos, rt_sin]\n\n        return torch.cat([xt, yt, zt, dxt, dyt, dzt, *rts, *cgs], dim=-1)\n\n    def decode_torch(self, box_encodings, centers):\n        \"\"\"\n        Args:\n            box_encodings: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading or *[cos, sin], ...]\n            anchors: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n\n        Returns:\n\n        \"\"\"\n        xa, ya, *cas = torch.split(centers, 1, dim=-1)\n\n        xt, yt, zt, dxt, dyt, dzt, cost, sint, *cts = torch.split(box_encodings, 1, dim=-1)\n\n        xg = xt + xa\n        yg = yt + ya\n        zg = zt\n\n        dxg = torch.exp(dxt)\n        dyg = torch.exp(dyt)\n        dzg = torch.exp(dzt)\n\n        rg = torch.atan2(sint, cost)\n\n        return torch.cat([xg, yg, zg, dxg, dyg, dzg, rg, *cts], dim=-1)\n\nclass PreviousResidualDecoder(object):\n    def __init__(self, code_size=7, **kwargs):\n        super().__init__()\n        self.code_size = code_size\n\n    @staticmethod\n    def decode_torch(box_encodings, anchors):\n        \"\"\"\n        Args:\n            box_encodings:  (B, N, 7 + ?) 
x, y, z, w, l, h, r, custom values\n            anchors: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n\n        Returns:\n\n        \"\"\"\n        xa, ya, za, dxa, dya, dza, ra, *cas = torch.split(anchors, 1, dim=-1)\n        xt, yt, zt, wt, lt, ht, rt, *cts = torch.split(box_encodings, 1, dim=-1)\n\n        diagonal = torch.sqrt(dxa ** 2 + dya ** 2)\n        xg = xt * diagonal + xa\n        yg = yt * diagonal + ya\n        zg = zt * dza + za\n\n        dxg = torch.exp(lt) * dxa\n        dyg = torch.exp(wt) * dya\n        dzg = torch.exp(ht) * dza\n        rg = rt + ra\n\n        cgs = [t + a for t, a in zip(cts, cas)]\n        return torch.cat([xg, yg, zg, dxg, dyg, dzg, rg, *cgs], dim=-1)\n\n\nclass PreviousResidualRoIDecoder(object):\n    def __init__(self, code_size=7, **kwargs):\n        super().__init__()\n        self.code_size = code_size\n\n    @staticmethod\n    def decode_torch(box_encodings, anchors):\n        \"\"\"\n        Args:\n            box_encodings:  (B, N, 7 + ?) 
x, y, z, w, l, h, r, custom values\n            anchors: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n\n        Returns:\n\n        \"\"\"\n        xa, ya, za, dxa, dya, dza, ra, *cas = torch.split(anchors, 1, dim=-1)\n        xt, yt, zt, wt, lt, ht, rt, *cts = torch.split(box_encodings, 1, dim=-1)\n\n        diagonal = torch.sqrt(dxa ** 2 + dya ** 2)\n        xg = xt * diagonal + xa\n        yg = yt * diagonal + ya\n        zg = zt * dza + za\n\n        dxg = torch.exp(lt) * dxa\n        dyg = torch.exp(wt) * dya\n        dzg = torch.exp(ht) * dza\n        rg = ra - rt\n\n        cgs = [t + a for t, a in zip(cts, cas)]\n        return torch.cat([xg, yg, zg, dxg, dyg, dzg, rg, *cgs], dim=-1)\n\n\nclass PointResidualCoder(object):\n    def __init__(self, code_size=8, use_mean_size=True, **kwargs):\n        super().__init__()\n        self.code_size = code_size\n        self.use_mean_size = use_mean_size\n        if self.use_mean_size:\n            self.mean_size = torch.from_numpy(np.array(kwargs['mean_size'])).cuda().float()\n            assert self.mean_size.min() > 0\n\n    def encode_torch(self, gt_boxes, points, gt_classes=None):\n        \"\"\"\n        Args:\n            gt_boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]\n            points: (N, 3) [x, y, z]\n            gt_classes: (N) [1, num_classes]\n        Returns:\n            box_coding: (N, 8 + C)\n        \"\"\"\n        gt_boxes[:, 3:6] = torch.clamp_min(gt_boxes[:, 3:6], min=1e-5)\n\n        xg, yg, zg, dxg, dyg, dzg, rg, *cgs = torch.split(gt_boxes, 1, dim=-1)\n        xa, ya, za = torch.split(points, 1, dim=-1)\n\n        if self.use_mean_size:\n            assert gt_classes.max() <= self.mean_size.shape[0]\n            point_anchor_size = self.mean_size[gt_classes - 1]\n            dxa, dya, dza = torch.split(point_anchor_size, 1, dim=-1)\n            diagonal = torch.sqrt(dxa ** 2 + dya ** 2)\n            xt = (xg - xa) / diagonal\n            yt = (yg - ya) / 
diagonal\n            zt = (zg - za) / dza\n            dxt = torch.log(dxg / dxa)\n            dyt = torch.log(dyg / dya)\n            dzt = torch.log(dzg / dza)\n        else:\n            xt = (xg - xa)\n            yt = (yg - ya)\n            zt = (zg - za)\n            dxt = torch.log(dxg)\n            dyt = torch.log(dyg)\n            dzt = torch.log(dzg)\n\n        cts = [g for g in cgs]\n        return torch.cat([xt, yt, zt, dxt, dyt, dzt, torch.cos(rg), torch.sin(rg), *cts], dim=-1)\n\n    def decode_torch(self, box_encodings, points, pred_classes=None):\n        \"\"\"\n        Args:\n            box_encodings: (N, 8 + C) [x, y, z, dx, dy, dz, cos, sin, ...]\n            points: [x, y, z]\n            pred_classes: (N) [1, num_classes]\n        Returns:\n\n        \"\"\"\n        xt, yt, zt, dxt, dyt, dzt, cost, sint, *cts = torch.split(box_encodings, 1, dim=-1)\n        xa, ya, za = torch.split(points, 1, dim=-1)\n\n        if self.use_mean_size:\n            assert pred_classes.max() <= self.mean_size.shape[0]\n            point_anchor_size = self.mean_size[pred_classes - 1]\n            dxa, dya, dza = torch.split(point_anchor_size, 1, dim=-1)\n            diagonal = torch.sqrt(dxa ** 2 + dya ** 2)\n            xg = xt * diagonal + xa\n            yg = yt * diagonal + ya\n            zg = zt * dza + za\n\n            dxg = torch.exp(dxt) * dxa\n            dyg = torch.exp(dyt) * dya\n            dzg = torch.exp(dzt) * dza\n        else:\n            xg = xt + xa\n            yg = yt + ya\n            zg = zt + za\n            dxg, dyg, dzg = torch.split(torch.exp(box_encodings[..., 3:6]), 1, dim=-1)\n\n        rg = torch.atan2(sint, cost)\n\n        cgs = [t for t in cts]\n        return torch.cat([xg, yg, zg, dxg, dyg, dzg, rg, *cgs], dim=-1)\n"
  },
  {
    "path": "pcdet/utils/box_np_ops.py",
    "content": "import numba\nimport numpy as np\n\n\ndef corners_nd(dims, origin=0.5):\n    \"\"\"Generate relative box corners based on length per dim and origin point.\n\n    Args:\n        dims (np.ndarray, shape=[N, ndim]): Array of length per dim\n        origin (list or array or float): origin point relate to smallest point.\n\n    Returns:\n        np.ndarray, shape=[N, 2 ** ndim, ndim]: Returned corners.\n        point layout example: (2d) x0y0, x0y1, x1y0, x1y1;\n            (3d) x0y0z0, x0y0z1, x0y1z0, x0y1z1, x1y0z0, x1y0z1, x1y1z0, x1y1z1\n            where x0 < x1, y0 < y1, z0 < z1.\n    \"\"\"\n    ndim = int(dims.shape[1])\n    corners_norm = np.stack(\n        np.unravel_index(np.arange(2**ndim), [2] * ndim),\n        axis=1).astype(dims.dtype)\n    # now corners_norm has format: (2d) x0y0, x0y1, x1y0, x1y1\n    # (3d) x0y0z0, x0y0z1, x0y1z0, x0y1z1, x1y0z0, x1y0z1, x1y1z0, x1y1z1\n    # so need to convert to a format which is convenient to do other computing.\n    # for 2d boxes, format is clockwise start with minimum point\n    # for 3d boxes, please draw lines by your hand.\n    if ndim == 2:\n        # generate clockwise box corners\n        corners_norm = corners_norm[[0, 1, 3, 2]]\n    elif ndim == 3:\n        corners_norm = corners_norm[[0, 1, 3, 2, 4, 5, 7, 6]]\n    corners_norm = corners_norm - np.array(origin, dtype=dims.dtype)\n    corners = dims.reshape([-1, 1, ndim]) * corners_norm.reshape(\n        [1, 2**ndim, ndim])\n    return corners\n\n\ndef rotation_3d_in_axis(points, angles, axis=0):\n    \"\"\"Rotate points in specific axis.\n\n    Args:\n        points (np.ndarray, shape=[N, point_size, 3]]):\n        angles (np.ndarray, shape=[N]]):\n        axis (int): Axis to rotate at.\n\n    Returns:\n        np.ndarray: Rotated points.\n    \"\"\"\n    # points: [N, point_size, 3]\n    rot_sin = np.sin(angles)\n    rot_cos = np.cos(angles)\n    ones = np.ones_like(rot_cos)\n    zeros = np.zeros_like(rot_cos)\n    if axis == 1:\n        
rot_mat_T = np.stack([[rot_cos, zeros, -rot_sin], [zeros, ones, zeros],\n                              [rot_sin, zeros, rot_cos]])\n    elif axis == 2 or axis == -1:\n        rot_mat_T = np.stack([[rot_cos, -rot_sin, zeros],\n                              [rot_sin, rot_cos, zeros], [zeros, zeros, ones]])\n    elif axis == 0:\n        rot_mat_T = np.stack([[zeros, rot_cos, -rot_sin],\n                              [zeros, rot_sin, rot_cos], [ones, zeros, zeros]])\n    else:\n        raise ValueError('axis should in range')\n\n    return np.einsum('aij,jka->aik', points, rot_mat_T)\n\n\ndef center_to_corner_box3d(centers,\n                           dims,\n                           angles=None,\n                           origin=(0.5, 1.0, 0.5),\n                           axis=1):\n    \"\"\"Convert kitti locations, dimensions and angles to corners.\n\n    Args:\n        centers (np.ndarray): Locations in kitti label file with shape (N, 3).\n        dims (np.ndarray): Dimensions in kitti label file with shape (N, 3).\n        angles (np.ndarray): Rotation_y in kitti label file with shape (N).\n        origin (list or array or float): Origin point relate to smallest point.\n            use (0.5, 1.0, 0.5) in camera and (0.5, 0.5, 0) in lidar.\n        axis (int): Rotation axis. 1 for camera and 2 for lidar.\n\n    Returns:\n        np.ndarray: Corners with the shape of (N, 8, 3).\n            6 -------- 5\n           /|         /|\n          2 -------- 1 .\n          | |        | |\n          . 
7 -------- 4\n          |/         |/\n          3 -------- 0\n    \"\"\"\n    # 'length' in kitti format is in x axis.\n    # yzx(hwl)(kitti label file)<->xyz(lhw)(camera)<->z(-x)(-y)(wlh)(lidar)\n    # center in kitti format is [0.5, 1.0, 0.5] in xyz.\n    corners = corners_nd(dims, origin=origin)\n    # corners: [N, 8, 3]\n    if angles is not None:\n        corners = rotation_3d_in_axis(corners, angles, axis=axis)\n    corners += centers.reshape([-1, 1, 3])\n    return corners\n\n\n@numba.jit(nopython=True)\ndef box2d_to_corner_jit(boxes):\n    \"\"\"Convert box2d to corner.\n\n    Args:\n        boxes (np.ndarray, shape=[N, 5]): Boxes2d with rotation.\n\n    Returns:\n        box_corners (np.ndarray, shape=[N, 4, 2]): Box corners.\n            2 ------ 3\n           /        /\n          1 ------ 0\n    \"\"\"\n    num_box = boxes.shape[0]\n    corners_norm = np.zeros((4, 2), dtype=boxes.dtype)\n    corners_norm[1, 1] = 1.0\n    corners_norm[2] = 1.0\n    corners_norm[3, 0] = 1.0\n    corners_norm -= np.array([0.5, 0.5], dtype=boxes.dtype)\n    corners = boxes.reshape(num_box, 1, 5)[:, :, 2:4] * corners_norm.reshape(\n        1, 4, 2)\n    rot_mat_T = np.zeros((2, 2), dtype=boxes.dtype)\n    box_corners = np.zeros((num_box, 4, 2), dtype=boxes.dtype)\n    for i in range(num_box):\n        rot_sin = np.sin(boxes[i, -1])\n        rot_cos = np.cos(boxes[i, -1])\n        rot_mat_T[0, 0] = rot_cos\n        rot_mat_T[0, 1] = -rot_sin\n        rot_mat_T[1, 0] = rot_sin\n        rot_mat_T[1, 1] = rot_cos\n        box_corners[i] = corners[i] @ rot_mat_T + boxes[i, :2]\n    return box_corners\n\n\n@numba.njit\ndef corner_to_standup_nd_jit(boxes_corner):\n    \"\"\"Convert boxes_corner to aligned (min-max) boxes.\n\n    Args:\n        boxes_corner (np.ndarray, shape=[N, 2**dim, dim]): Boxes corners.\n\n    Returns:\n        np.ndarray, shape=[N, dim*2]: Aligned (min-max) boxes.\n    \"\"\"\n    num_boxes = boxes_corner.shape[0]\n    ndim = boxes_corner.shape[-1]\n    
result = np.zeros((num_boxes, ndim * 2), dtype=boxes_corner.dtype)\n    for i in range(num_boxes):\n        for j in range(ndim):\n            result[i, j] = np.min(boxes_corner[i, :, j])\n        for j in range(ndim):\n            result[i, j + ndim] = np.max(boxes_corner[i, :, j])\n    return result\n\n\n@numba.jit(nopython=True)\ndef corner_to_surfaces_3d_jit(corners):\n    \"\"\"Convert 3d box corners from corner function above to surfaces that\n    normal vectors all direct to internal.\n\n    Args:\n        corners (np.ndarray): 3d box corners with the shape of (N, 8, 3).\n            6 -------- 5\n           /|         /|\n          2 -------- 1 .\n          | |        | |\n          . 7 -------- 4\n          |/         |/\n          3 -------- 0\n    Returns:\n        np.ndarray: Surfaces with the shape of (N, 6, 4, 3).\n    \"\"\"\n    # box_corners: [N, 8, 3], must from corner functions in this module\n    num_boxes = corners.shape[0]\n    surfaces = np.zeros((num_boxes, 6, 4, 3), dtype=corners.dtype)\n    corner_idxes = np.array([\n        0, 1, 2, 3, 7, 6, 5, 4, 0, 3, 7, 4, 1, 5, 6, 2, 0, 4, 5, 1, 3, 2, 6, 7\n    ]).reshape(6, 4)\n    for i in range(num_boxes):\n        for j in range(6):\n            for k in range(4):\n                surfaces[i, j, k] = corners[i, corner_idxes[j, k]]\n    return surfaces\n\n\ndef rotation_points_single_angle(points, angle, axis=0):\n    \"\"\"Rotate points with a single angle.\n\n    Args:\n        points (np.ndarray, shape=[N, 3]]):\n        angles (np.ndarray, shape=[1]]):\n        axis (int): Axis to rotate at.\n\n    Returns:\n        np.ndarray: Rotated points.\n    \"\"\"\n    # points: [N, 3]\n    rot_sin = np.sin(angle)\n    rot_cos = np.cos(angle)\n    if axis == 1:\n        rot_mat_T = np.array(\n            [[rot_cos, 0, -rot_sin], [0, 1, 0], [rot_sin, 0, rot_cos]],\n            dtype=points.dtype)\n    elif axis == 2 or axis == -1:\n        rot_mat_T = np.array(\n            [[rot_cos, -rot_sin, 0], 
[rot_sin, rot_cos, 0], [0, 0, 1]],\n            dtype=points.dtype)\n    elif axis == 0:\n        rot_mat_T = np.array(\n            [[1, 0, 0], [0, rot_cos, -rot_sin], [0, rot_sin, rot_cos]],\n            dtype=points.dtype)\n    else:\n        raise ValueError('axis should in range')\n\n    return points @ rot_mat_T, rot_mat_T\n\n\ndef corner_to_surfaces_3d(corners):\n    \"\"\"convert 3d box corners from corner function above to surfaces that\n    normal vectors all direct to internal.\n\n    Args:\n        corners (np.ndarray): 3D box corners with shape of (N, 8, 3).\n\n    Returns:\n        np.ndarray: Surfaces with the shape of (N, 6, 4, 3).\n    \"\"\"\n    # box_corners: [N, 8, 3], must from corner functions in this module\n    surfaces = np.array([\n        [corners[:, 0], corners[:, 1], corners[:, 2], corners[:, 3]],\n        [corners[:, 7], corners[:, 6], corners[:, 5], corners[:, 4]],\n        [corners[:, 0], corners[:, 3], corners[:, 7], corners[:, 4]],\n        [corners[:, 1], corners[:, 5], corners[:, 6], corners[:, 2]],\n        [corners[:, 0], corners[:, 4], corners[:, 5], corners[:, 1]],\n        [corners[:, 3], corners[:, 2], corners[:, 6], corners[:, 7]],\n    ]).transpose([2, 0, 1, 3])\n    return surfaces\n\n\ndef surface_equ_3d(polygon_surfaces):\n    \"\"\"\n\n    Args:\n        polygon_surfaces (np.ndarray): Polygon surfaces with shape of\n            [num_polygon, max_num_surfaces, max_num_points_of_surface, 3].\n            All surfaces' normal vector must direct to internal.\n            Max_num_points_of_surface must at least 3.\n\n    Returns:\n        tuple: normal vector and its direction.\n    \"\"\"\n    # return [a, b, c], d in ax+by+cz+d=0\n    # polygon_surfaces: [num_polygon, num_surfaces, num_points_of_polygon, 3]\n    surface_vec = polygon_surfaces[:, :, :2, :] - \\\n        polygon_surfaces[:, :, 1:3, :]\n    # normal_vec: [..., 3]\n    normal_vec = np.cross(surface_vec[:, :, 0, :], surface_vec[:, :, 1, :])\n    # 
print(normal_vec.shape, points[..., 0, :].shape)\n    # d = -np.inner(normal_vec, points[..., 0, :])\n    d = np.einsum('aij, aij->ai', normal_vec, polygon_surfaces[:, :, 0, :])\n    return normal_vec, -d\n\n\n@numba.njit\ndef _points_in_convex_polygon_3d_jit(points, polygon_surfaces, normal_vec, d,\n                                     num_surfaces):\n    \"\"\"\n    Args:\n        points (np.ndarray): Input points with shape of (num_points, 3).\n        polygon_surfaces (np.ndarray): Polygon surfaces with shape of\n            (num_polygon, max_num_surfaces, max_num_points_of_surface, 3).\n            All surfaces' normal vector must direct to internal.\n            Max_num_points_of_surface must at least 3.\n        normal_vec (np.ndarray): Normal vector of polygon_surfaces.\n        d (int): Directions of normal vector.\n        num_surfaces (np.ndarray): Number of surfaces a polygon contains\n            shape of (num_polygon).\n\n    Returns:\n        np.ndarray: Result matrix with the shape of [num_points, num_polygon].\n    \"\"\"\n    max_num_surfaces, max_num_points_of_surface = polygon_surfaces.shape[1:3]\n    num_points = points.shape[0]\n    num_polygons = polygon_surfaces.shape[0]\n    ret = np.ones((num_points, num_polygons), dtype=np.bool_)\n    sign = 0.0\n    for i in range(num_points):\n        for j in range(num_polygons):\n            for k in range(max_num_surfaces):\n                if k > num_surfaces[j]:\n                    break\n                sign = (\n                    points[i, 0] * normal_vec[j, k, 0] +\n                    points[i, 1] * normal_vec[j, k, 1] +\n                    points[i, 2] * normal_vec[j, k, 2] + d[j, k])\n                if sign >= 0:\n                    ret[i, j] = False\n                    break\n    return ret\n\n\ndef points_in_convex_polygon_3d_jit(points,\n                                    polygon_surfaces,\n                                    num_surfaces=None):\n    \"\"\"Check points is in 3d 
convex polygons.\n\n    Args:\n        points (np.ndarray): Input points with shape of (num_points, 3).\n        polygon_surfaces (np.ndarray): Polygon surfaces with shape of \\\n            (num_polygon, max_num_surfaces, max_num_points_of_surface, 3). \\\n            All surfaces' normal vector must direct to internal. \\\n            Max_num_points_of_surface must at least 3.\n        num_surfaces (np.ndarray): Number of surfaces a polygon contains \\\n            shape of (num_polygon).\n\n    Returns:\n        np.ndarray: Result matrix with the shape of [num_points, num_polygon].\n    \"\"\"\n    max_num_surfaces, max_num_points_of_surface = polygon_surfaces.shape[1:3]\n    # num_points = points.shape[0]\n    num_polygons = polygon_surfaces.shape[0]\n    if num_surfaces is None:\n        num_surfaces = np.full((num_polygons, ), 9999999, dtype=np.int64)\n    normal_vec, d = surface_equ_3d(polygon_surfaces[:, :, :3, :])\n    # normal_vec: [num_polygon, max_num_surfaces, 3]\n    # d: [num_polygon, max_num_surfaces]\n    return _points_in_convex_polygon_3d_jit(points, polygon_surfaces,\n                                            normal_vec, d, num_surfaces)\n\n\n@numba.jit\ndef points_in_convex_polygon_jit(points, polygon, clockwise=True):\n    \"\"\"Check points is in 2d convex polygons. 
True when point in polygon.\n\n    Args:\n        points (np.ndarray): Input points with the shape of [num_points, 2].\n        polygon (np.ndarray): Input polygon with the shape of\n            [num_polygon, num_points_of_polygon, 2].\n        clockwise (bool): Indicate polygon is clockwise.\n\n    Returns:\n        np.ndarray: Result matrix with the shape of [num_points, num_polygon].\n    \"\"\"\n    # first convert polygon to directed lines\n    num_points_of_polygon = polygon.shape[1]\n    num_points = points.shape[0]\n    num_polygons = polygon.shape[0]\n    # if clockwise:\n    #     vec1 = polygon - polygon[:, [num_points_of_polygon - 1] +\n    #                              list(range(num_points_of_polygon - 1)), :]\n    # else:\n    #     vec1 = polygon[:, [num_points_of_polygon - 1] +\n    #                    list(range(num_points_of_polygon - 1)), :] - polygon\n    # vec1: [num_polygon, num_points_of_polygon, 2]\n    vec1 = np.zeros((2), dtype=polygon.dtype)\n    ret = np.zeros((num_points, num_polygons), dtype=np.bool_)\n    success = True\n    cross = 0.0\n    for i in range(num_points):\n        for j in range(num_polygons):\n            success = True\n            for k in range(num_points_of_polygon):\n                if clockwise:\n                    vec1 = polygon[j, k] - polygon[j, k - 1]\n                else:\n                    vec1 = polygon[j, k - 1] - polygon[j, k]\n                cross = vec1[1] * (polygon[j, k, 0] - points[i, 0])\n                cross -= vec1[0] * (polygon[j, k, 1] - points[i, 1])\n                if cross >= 0:\n                    success = False\n                    break\n            ret[i, j] = success\n    return ret\n"
  },
  {
    "path": "pcdet/utils/box_utils.py",
    "content": "import numpy as np\nimport scipy\nimport torch\nfrom scipy.spatial import Delaunay\n\nfrom ..ops.roiaware_pool3d import roiaware_pool3d_utils\nfrom . import common_utils\n\n\ndef in_hull(p, hull):\n    \"\"\"\n    :param p: (N, K) test points\n    :param hull: (M, K) M corners of a box\n    :return (N) bool\n    \"\"\"\n    try:\n        if not isinstance(hull, Delaunay):\n            hull = Delaunay(hull)\n        flag = hull.find_simplex(p) >= 0\n    except scipy.spatial.qhull.QhullError:\n        print('Warning: not a hull %s' % str(hull))\n        flag = np.zeros(p.shape[0], dtype=np.bool)\n\n    return flag\n\n\ndef boxes_to_corners_3d(boxes3d):\n    \"\"\"\n        7 -------- 4\n       /|         /|\n      6 -------- 5 .\n      | |        | |\n      . 3 -------- 0\n      |/         |/\n      2 -------- 1\n    Args:\n        boxes3d:  (N, 7) [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center\n\n    Returns:\n    \"\"\"\n    boxes3d, is_numpy = common_utils.check_numpy_to_torch(boxes3d)\n\n    template = boxes3d.new_tensor((\n        [1, 1, -1], [1, -1, -1], [-1, -1, -1], [-1, 1, -1],\n        [1, 1, 1], [1, -1, 1], [-1, -1, 1], [-1, 1, 1],\n    )) / 2\n\n    corners3d = boxes3d[:, None, 3:6].repeat(1, 8, 1) * template[None, :, :]\n    corners3d = common_utils.rotate_points_along_z(corners3d.view(-1, 8, 3), boxes3d[:, 6]).view(-1, 8, 3)\n    corners3d += boxes3d[:, None, 0:3]\n\n    return corners3d.numpy() if is_numpy else corners3d\n\n\ndef mask_boxes_outside_range_numpy(boxes, limit_range, min_num_corners=1):\n    \"\"\"\n    Args:\n        boxes: (N, 7) [x, y, z, dx, dy, dz, heading, ...], (x, y, z) is the box center\n        limit_range: [minx, miny, minz, maxx, maxy, maxz]\n        min_num_corners:\n\n    Returns:\n\n    \"\"\"\n    if boxes.shape[1] > 7:\n        boxes = boxes[:, 0:7]\n    corners = boxes_to_corners_3d(boxes)  # (N, 8, 3)\n    mask = ((corners >= limit_range[0:3]) & (corners <= limit_range[3:6])).all(axis=2)\n    
mask = mask.sum(axis=1) >= min_num_corners  # (N)\n\n    return mask\n\n\ndef remove_points_in_boxes3d(points, boxes3d):\n    \"\"\"\n    Args:\n        points: (num_points, 3 + C)\n        boxes3d: (N, 7) [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center, each box DO NOT overlaps\n\n    Returns:\n\n    \"\"\"\n    boxes3d, is_numpy = common_utils.check_numpy_to_torch(boxes3d)\n    points, is_numpy = common_utils.check_numpy_to_torch(points)\n    point_masks = roiaware_pool3d_utils.points_in_boxes_cpu(points[:, 0:3], boxes3d)\n    points = points[point_masks.sum(dim=0) == 0]\n\n    return points.numpy() if is_numpy else points\n\n\ndef boxes3d_kitti_camera_to_lidar(boxes3d_camera, calib):\n    \"\"\"\n    Args:\n        boxes3d_camera: (N, 7) [x, y, z, l, h, w, r] in rect camera coords\n        calib:\n\n    Returns:\n        boxes3d_lidar: [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center\n\n    \"\"\"\n    xyz_camera = boxes3d_camera[:, 0:3]\n    l, h, w, r = boxes3d_camera[:, 3:4], boxes3d_camera[:, 4:5], boxes3d_camera[:, 5:6], boxes3d_camera[:, 6:7]\n    xyz_lidar = calib.rect_to_lidar(xyz_camera)\n    xyz_lidar[:, 2] += h[:, 0] / 2\n    return np.concatenate([xyz_lidar, l, w, h, -(r + np.pi / 2)], axis=-1)\n\n\ndef boxes3d_kitti_fakelidar_to_lidar(boxes3d_lidar):\n    \"\"\"\n    Args:\n        boxes3d_fakelidar: (N, 7) [x, y, z, w, l, h, r] in old LiDAR coordinates, z is bottom center\n\n    Returns:\n        boxes3d_lidar: [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center\n\n    \"\"\"\n    w, l, h, r = boxes3d_lidar[:, 3:4], boxes3d_lidar[:, 4:5], boxes3d_lidar[:, 5:6], boxes3d_lidar[:, 6:7]\n    boxes3d_lidar[:, 2] += h[:, 0] / 2\n    return np.concatenate([boxes3d_lidar[:, 0:3], l, w, h, -(r + np.pi / 2)], axis=-1)\n\n\ndef boxes3d_kitti_lidar_to_fakelidar(boxes3d_lidar):\n    \"\"\"\n    Args:\n        boxes3d_lidar: (N, 7) [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center\n\n    Returns:\n        
boxes3d_fakelidar: [x, y, z, w, l, h, r] in old LiDAR coordinates, z is bottom center\n\n    \"\"\"\n    dx, dy, dz, heading = boxes3d_lidar[:, 3:4], boxes3d_lidar[:, 4:5], boxes3d_lidar[:, 5:6], boxes3d_lidar[:, 6:7]\n    boxes3d_lidar[:, 2] -= dz[:, 0] / 2\n    return np.concatenate([boxes3d_lidar[:, 0:3], dy, dx, dz, -heading - np.pi / 2], axis=-1)\n\n\ndef enlarge_box3d(boxes3d, extra_width=(0, 0, 0)):\n    \"\"\"\n    Args:\n        boxes3d: [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center\n        extra_width: [extra_x, extra_y, extra_z]\n\n    Returns:\n\n    \"\"\"\n    boxes3d, is_numpy = common_utils.check_numpy_to_torch(boxes3d)\n    large_boxes3d = boxes3d.clone()\n\n    large_boxes3d[:, 3:6] += boxes3d.new_tensor(extra_width)[None, :]\n    return large_boxes3d\n\n\ndef boxes3d_lidar_to_kitti_camera(boxes3d_lidar, calib):\n    \"\"\"\n    :param boxes3d_lidar: (N, 7) [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center\n    :param calib:\n    :return:\n        boxes3d_camera: (N, 7) [x, y, z, l, h, w, r] in rect camera coords\n    \"\"\"\n    xyz_lidar = boxes3d_lidar[:, 0:3]\n    l, w, h, r = boxes3d_lidar[:, 3:4], boxes3d_lidar[:, 4:5], boxes3d_lidar[:, 5:6], boxes3d_lidar[:, 6:7]\n\n    xyz_lidar[:, 2] -= h.reshape(-1) / 2\n    xyz_cam = calib.lidar_to_rect(xyz_lidar)\n    # xyz_cam[:, 1] += h.reshape(-1) / 2\n    r = -r - np.pi / 2\n    return np.concatenate([xyz_cam, l, h, w, r], axis=-1)\n\n\ndef boxes3d_to_corners3d_kitti_camera(boxes3d, bottom_center=True):\n    \"\"\"\n    :param boxes3d: (N, 7) [x, y, z, l, h, w, ry] in camera coords, see the definition of ry in KITTI dataset\n    :param bottom_center: whether y is on the bottom center of object\n    :return: corners3d: (N, 8, 3)\n        7 -------- 4\n       /|         /|\n      6 -------- 5 .\n      | |        | |\n      . 
3 -------- 0\n      |/         |/\n      2 -------- 1\n    \"\"\"\n    boxes_num = boxes3d.shape[0]\n    l, h, w = boxes3d[:, 3], boxes3d[:, 4], boxes3d[:, 5]\n    x_corners = np.array([l / 2., l / 2., -l / 2., -l / 2., l / 2., l / 2., -l / 2., -l / 2], dtype=np.float32).T\n    z_corners = np.array([w / 2., -w / 2., -w / 2., w / 2., w / 2., -w / 2., -w / 2., w / 2.], dtype=np.float32).T\n    if bottom_center:\n        y_corners = np.zeros((boxes_num, 8), dtype=np.float32)\n        y_corners[:, 4:8] = -h.reshape(boxes_num, 1).repeat(4, axis=1)  # (N, 8)\n    else:\n        y_corners = np.array([h / 2., h / 2., h / 2., h / 2., -h / 2., -h / 2., -h / 2., -h / 2.], dtype=np.float32).T\n\n    ry = boxes3d[:, 6]\n    zeros, ones = np.zeros(ry.size, dtype=np.float32), np.ones(ry.size, dtype=np.float32)\n    rot_list = np.array([[np.cos(ry), zeros, -np.sin(ry)],\n                         [zeros, ones, zeros],\n                         [np.sin(ry), zeros, np.cos(ry)]])  # (3, 3, N)\n    R_list = np.transpose(rot_list, (2, 0, 1))  # (N, 3, 3)\n\n    temp_corners = np.concatenate((x_corners.reshape(-1, 8, 1), y_corners.reshape(-1, 8, 1),\n                                   z_corners.reshape(-1, 8, 1)), axis=2)  # (N, 8, 3)\n    rotated_corners = np.matmul(temp_corners, R_list)  # (N, 8, 3)\n    x_corners, y_corners, z_corners = rotated_corners[:, :, 0], rotated_corners[:, :, 1], rotated_corners[:, :, 2]\n\n    x_loc, y_loc, z_loc = boxes3d[:, 0], boxes3d[:, 1], boxes3d[:, 2]\n\n    x = x_loc.reshape(-1, 1) + x_corners.reshape(-1, 8)\n    y = y_loc.reshape(-1, 1) + y_corners.reshape(-1, 8)\n    z = z_loc.reshape(-1, 1) + z_corners.reshape(-1, 8)\n\n    corners = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1), z.reshape(-1, 8, 1)), axis=2)\n\n    return corners.astype(np.float32)\n\n\ndef boxes3d_kitti_camera_to_imageboxes(boxes3d, calib, image_shape=None):\n    \"\"\"\n    :param boxes3d: (N, 7) [x, y, z, l, h, w, r] in rect camera coords\n    :param calib:\n    
:return:\n        box_2d_preds: (N, 4) [x1, y1, x2, y2]\n    \"\"\"\n    corners3d = boxes3d_to_corners3d_kitti_camera(boxes3d)\n    pts_img, _ = calib.rect_to_img(corners3d.reshape(-1, 3))\n    corners_in_image = pts_img.reshape(-1, 8, 2)\n\n    min_uv = np.min(corners_in_image, axis=1)  # (N, 2)\n    max_uv = np.max(corners_in_image, axis=1)  # (N, 2)\n    boxes2d_image = np.concatenate([min_uv, max_uv], axis=1)\n    if image_shape is not None:\n        boxes2d_image[:, 0] = np.clip(boxes2d_image[:, 0], a_min=0, a_max=image_shape[1] - 1)\n        boxes2d_image[:, 1] = np.clip(boxes2d_image[:, 1], a_min=0, a_max=image_shape[0] - 1)\n        boxes2d_image[:, 2] = np.clip(boxes2d_image[:, 2], a_min=0, a_max=image_shape[1] - 1)\n        boxes2d_image[:, 3] = np.clip(boxes2d_image[:, 3], a_min=0, a_max=image_shape[0] - 1)\n\n    return boxes2d_image\n\n\ndef boxes_iou_normal(boxes_a, boxes_b):\n    \"\"\"\n    Args:\n        boxes_a: (N, 4) [x1, y1, x2, y2]\n        boxes_b: (M, 4) [x1, y1, x2, y2]\n\n    Returns:\n\n    \"\"\"\n    assert boxes_a.shape[1] == boxes_b.shape[1] == 4\n    x_min = torch.max(boxes_a[:, 0, None], boxes_b[None, :, 0])\n    x_max = torch.min(boxes_a[:, 2, None], boxes_b[None, :, 2])\n    y_min = torch.max(boxes_a[:, 1, None], boxes_b[None, :, 1])\n    y_max = torch.min(boxes_a[:, 3, None], boxes_b[None, :, 3])\n    x_len = torch.clamp_min(x_max - x_min, min=0)\n    y_len = torch.clamp_min(y_max - y_min, min=0)\n    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])\n    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])\n    a_intersect_b = x_len * y_len\n    iou = a_intersect_b / torch.clamp_min(area_a[:, None] + area_b[None, :] - a_intersect_b, min=1e-6)\n    return iou\n\n\ndef boxes3d_lidar_to_aligned_bev_boxes(boxes3d):\n    \"\"\"\n    Args:\n        boxes3d: (N, 7 + C) [x, y, z, dx, dy, dz, heading] in lidar coordinate\n\n    Returns:\n        aligned_bev_boxes: (N, 4) [x1, y1, x2, y2] in 
the above lidar coordinate\n    \"\"\"\n    rot_angle = common_utils.limit_period(boxes3d[:, 6], offset=0.5, period=np.pi).abs()\n    choose_dims = torch.where(rot_angle[:, None] < np.pi / 4, boxes3d[:, [3, 4]], boxes3d[:, [4, 3]])\n    aligned_bev_boxes = torch.cat((boxes3d[:, 0:2] - choose_dims / 2, boxes3d[:, 0:2] + choose_dims / 2), dim=1)\n    return aligned_bev_boxes\n\n\ndef boxes3d_nearest_bev_iou(boxes_a, boxes_b):\n    \"\"\"\n    Args:\n        boxes_a: (N, 7) [x, y, z, dx, dy, dz, heading]\n        boxes_b: (M, 7) [x, y, z, dx, dy, dz, heading]\n\n    Returns:\n\n    \"\"\"\n    boxes_bev_a = boxes3d_lidar_to_aligned_bev_boxes(boxes_a)\n    boxes_bev_b = boxes3d_lidar_to_aligned_bev_boxes(boxes_b)\n\n    return boxes_iou_normal(boxes_bev_a, boxes_bev_b)\n"
  },
  {
    "path": "pcdet/utils/calibration_kitti.py",
    "content": "import numpy as np\nimport re\nimport torch\n'''\ndef get_calib_from_file(calib_file):\n    with open(calib_file) as f:\n        lines = f.readlines()\n\n    obj = lines[2].strip().split(' ')[1:]\n    P2 = np.array(obj, dtype=np.float32)\n    obj = lines[3].strip().split(' ')[1:]\n    P3 = np.array(obj, dtype=np.float32)\n    obj = lines[4].strip().split(' ')[1:]\n    R0 = np.array(obj, dtype=np.float32)\n    obj = lines[5].strip().split(' ')[1:]\n    Tr_velo_to_cam = np.array(obj, dtype=np.float32)\n\n    return {'P2': P2.reshape(3, 4),\n            'P3': P3.reshape(3, 4),\n            'R0': R0.reshape(3, 3),\n            'Tr_velo2cam': Tr_velo_to_cam.reshape(3, 4)}\n'''\n\ndef get_calib_from_file(filepath):\n    ''' Read in a calibration file and parse into a dictionary.\n    Ref: https://github.com/utiasSTARS/pykitti/blob/master/pykitti/utils.py\n    '''\n\n    data2 = {}\n    R0 = np.array([[ 0.99992624,  0.00965411, -0.0072371 ],\n                  [-0.00968531,  0.99994343, -0.00433077],\n                  [ 0.00719491,  0.00440054,  0.99996366]])\n    with open(filepath) as f:\n        for line in f.readlines():\n            if line[:2] == \"P2\":\n                P2 = re.split(\" \", line.strip())\n                P2 = np.array(P2[-12:], np.float32)\n\n            if line[:2] == \"P3\":\n                P3 = re.split(\" \", line.strip())\n                P3 = np.array(P3[-12:], np.float32)\n\n            if line[:14] == \"Tr_velo_to_cam\" or line[:11] == \"Tr_velo_cam\":\n                vtc_mat = re.split(\" \", line.strip())\n                vtc_mat = np.array(vtc_mat[-12:], np.float32)\n\n            if line[:7] == \"R0_rect\" or line[:6] == \"R_rect\":\n                R0 = re.split(\" \", line.strip())\n                R0 = np.array(R0[-9:], np.float32)\n\n    data2[\"P2\"]=P2.reshape(3, 4)\n    data2[\"P3\"]=P3.reshape(3, 4)\n    data2[\"Tr_velo2cam\"]=vtc_mat.reshape(3, 4)\n    data2[\"R0\"]=R0.reshape(3, 3)\n\n    return 
data2\n\n\n\nclass Calibration(object):\n    def __init__(self, calib_file):\n        if not isinstance(calib_file, dict):\n            calib = get_calib_from_file(calib_file)\n        else:\n            calib = calib_file\n\n        self.P2 = calib['P2']  # 3 x 4\n        self.R0 = calib['R0']  # 3 x 3\n        self.V2C = calib['Tr_velo2cam']  # 3 x 4\n\n        # Camera intrinsics and extrinsics\n        self.cu = self.P2[0, 2]\n        self.cv = self.P2[1, 2]\n        self.fu = self.P2[0, 0]\n        self.fv = self.P2[1, 1]\n        self.tx = self.P2[0, 3] / (-self.fu)\n        self.ty = self.P2[1, 3] / (-self.fv)\n\n    def cart_to_hom(self, pts):\n        \"\"\"\n        :param pts: (N, 3 or 2)\n        :return pts_hom: (N, 4 or 3)\n        \"\"\"\n        pts_hom = np.hstack((pts, np.ones((pts.shape[0], 1), dtype=np.float32)))\n        return pts_hom\n\n    def cart_to_hom_cuda(self, pts):\n        \"\"\"\n        :param pts: (N, 3 or 2)\n        :return pts_hom: (N, 4 or 3)\n        \"\"\"\n        pts_hom = torch.cat([pts, torch.ones((pts.shape[0], 1)).to(pts.device)], -1)\n        return pts_hom\n\n    def rect_to_lidar(self, pts_rect):\n        \"\"\"\n        :param pts_rect: (N, 3)\n        :return pts_lidar: (N, 3)\n        \"\"\"\n        pts_rect_hom = self.cart_to_hom(pts_rect)  # (N, 4)\n        R0_ext = np.hstack((self.R0, np.zeros((3, 1), dtype=np.float32)))  # (3, 4)\n        R0_ext = np.vstack((R0_ext, np.zeros((1, 4), dtype=np.float32)))  # (4, 4)\n        R0_ext[3, 3] = 1\n        V2C_ext = np.vstack((self.V2C, np.zeros((1, 4), dtype=np.float32)))  # (4, 4)\n        V2C_ext[3, 3] = 1\n\n        pts_lidar = np.dot(pts_rect_hom, np.linalg.inv(np.dot(R0_ext, V2C_ext).T))\n        return pts_lidar[:, 0:3]\n\n    def lidar_to_rect(self, pts_lidar):\n        \"\"\"\n        :param pts_lidar: (N, 3)\n        :return pts_rect: (N, 3)\n        \"\"\"\n        pts_lidar_hom = self.cart_to_hom(pts_lidar)\n        pts_rect = np.dot(pts_lidar_hom, 
np.dot(self.V2C.T, self.R0.T))\n        # pts_rect = reduce(np.dot, (pts_lidar_hom, self.V2C.T, self.R0.T))\n        return pts_rect\n\n    def lidar_to_rect_cuda(self, pts_lidar):\n        \"\"\"\n        :param pts_lidar: (N, 3)\n        :return pts_rect: (N, 3)\n        \"\"\"\n        pts_lidar_hom = self.cart_to_hom_cuda(pts_lidar)\n        V2C = torch.from_numpy(self.V2C.T).to(pts_lidar.device)\n        R0 = torch.from_numpy(self.R0.T).to(pts_lidar.device)\n        pts_rect = torch.matmul(pts_lidar_hom, torch.matmul(V2C, R0))\n        # pts_rect = reduce(np.dot, (pts_lidar_hom, self.V2C.T, self.R0.T))\n        return pts_rect\n\n    def rect_to_img(self, pts_rect):\n        \"\"\"\n        :param pts_rect: (N, 3)\n        :return pts_img: (N, 2)\n        \"\"\"\n        pts_rect_hom = self.cart_to_hom(pts_rect)\n        pts_2d_hom = np.dot(pts_rect_hom, self.P2.T)\n        pts_img = (pts_2d_hom[:, 0:2].T / pts_rect_hom[:, 2]).T  # (N, 2)\n        pts_rect_depth = pts_2d_hom[:, 2] - self.P2.T[3, 2]  # depth in rect camera coord\n        return pts_img, pts_rect_depth\n\n    def rect_to_img_cuda(self, pts_rect):\n        \"\"\"\n        :param pts_rect: (N, 3)\n        :return pts_img: (N, 2)\n        \"\"\"\n        pts_rect_hom = self.cart_to_hom_cuda(pts_rect)\n        P2 = torch.from_numpy(self.P2.T).to(pts_rect.device)\n        pts_2d_hom = torch.matmul(pts_rect_hom, P2)\n        pts_img = (pts_2d_hom[:, 0:2].T / pts_rect_hom[:, 2]).T  # (N, 2)\n        pts_rect_depth = pts_2d_hom[:, 2] - P2[3, 2]  # depth in rect camera coord\n        return pts_img, pts_rect_depth\n\n    def lidar_to_img(self, pts_lidar):\n        \"\"\"\n        :param pts_lidar: (N, 3)\n        :return pts_img: (N, 2)\n        \"\"\"\n        pts_rect = self.lidar_to_rect(pts_lidar)\n        pts_img, pts_depth = self.rect_to_img(pts_rect)\n        return pts_img, pts_depth\n\n    def img_to_rect(self, u, v, depth_rect):\n        \"\"\"\n        :param u: (N)\n        :param v: (N)\n    
    :param depth_rect: (N)\n        :return:\n        \"\"\"\n        x = ((u - self.cu) * depth_rect) / self.fu + self.tx\n        y = ((v - self.cv) * depth_rect) / self.fv + self.ty\n        pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), depth_rect.reshape(-1, 1)), axis=1)\n        return pts_rect\n\n    def corners3d_to_img_boxes(self, corners3d):\n        \"\"\"\n        :param corners3d: (N, 8, 3) corners in rect coordinate\n        :return: boxes: (None, 4) [x1, y1, x2, y2] in rgb coordinate\n        :return: boxes_corner: (None, 8) [xi, yi] in rgb coordinate\n        \"\"\"\n        sample_num = corners3d.shape[0]\n        corners3d_hom = np.concatenate((corners3d, np.ones((sample_num, 8, 1))), axis=2)  # (N, 8, 4)\n\n        img_pts = np.matmul(corners3d_hom, self.P2.T)  # (N, 8, 3)\n\n        x, y = img_pts[:, :, 0] / img_pts[:, :, 2], img_pts[:, :, 1] / img_pts[:, :, 2]\n        x1, y1 = np.min(x, axis=1), np.min(y, axis=1)\n        x2, y2 = np.max(x, axis=1), np.max(y, axis=1)\n\n        boxes = np.concatenate((x1.reshape(-1, 1), y1.reshape(-1, 1), x2.reshape(-1, 1), y2.reshape(-1, 1)), axis=1)\n        boxes_corner = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1)), axis=2)\n\n        return boxes, boxes_corner\n"
  },
  {
    "path": "pcdet/utils/common_utils.py",
    "content": "import logging\nimport os\nimport pickle\nimport random\nimport shutil\nimport subprocess\n\nimport numpy as np\nimport torch\nimport torch.distributed as dist\nimport torch.multiprocessing as mp\n\n\ndef check_numpy_to_torch(x):\n    if isinstance(x, np.ndarray):\n        return torch.from_numpy(x).float(), True\n    return x, False\n\n\ndef limit_period(val, offset=0.5, period=np.pi):\n    val, is_numpy = check_numpy_to_torch(val)\n    ans = val - torch.floor(val / period + offset) * period\n    return ans.numpy() if is_numpy else ans\n\n\ndef drop_info_with_name(info, name):\n    ret_info = {}\n    keep_indices = [i for i, x in enumerate(info['name']) if x != name]\n    for key in info.keys():\n        ret_info[key] = info[key][keep_indices]\n    return ret_info\n\n\ndef rotate_points_along_z(points, angle):\n    \"\"\"\n    Args:\n        points: (B, N, 3 + C)\n        angle: (B), angle along z-axis, angle increases x ==> y\n    Returns:\n\n    \"\"\"\n    points, is_numpy = check_numpy_to_torch(points)\n    angle, _ = check_numpy_to_torch(angle)\n\n    cosa = torch.cos(angle)\n    sina = torch.sin(angle)\n    zeros = angle.new_zeros(points.shape[0])\n    ones = angle.new_ones(points.shape[0])\n    rot_matrix = torch.stack((\n        cosa,  sina, zeros,\n        -sina, cosa, zeros,\n        zeros, zeros, ones\n    ), dim=1).view(-1, 3, 3).float()\n    points_rot = torch.matmul(points[:, :, 0:3], rot_matrix)\n    points_rot = torch.cat((points_rot, points[:, :, 3:]), dim=-1)\n    return points_rot.numpy() if is_numpy else points_rot\n\n\ndef mask_points_by_range(points, limit_range):\n    mask = (points[:, 0] >= limit_range[0]) & (points[:, 0] <= limit_range[3]) \\\n           & (points[:, 1] >= limit_range[1]) & (points[:, 1] <= limit_range[4])\n    return mask\n\n\ndef get_voxel_centers(voxel_coords, downsample_times, voxel_size, point_cloud_range):\n    \"\"\"\n    Args:\n        voxel_coords: (N, 3)\n        downsample_times:\n        
voxel_size:\n        point_cloud_range:\n\n    Returns:\n\n    \"\"\"\n    assert voxel_coords.shape[1] == 3\n    voxel_centers = voxel_coords[:, [2, 1, 0]].float()  # (xyz)\n    voxel_size = torch.tensor(voxel_size, device=voxel_centers.device).float() * downsample_times\n    pc_range = torch.tensor(point_cloud_range[0:3], device=voxel_centers.device).float()\n    voxel_centers = (voxel_centers + 0.5) * voxel_size + pc_range\n    return voxel_centers\n\n\ndef create_logger(log_file=None, rank=0, log_level=logging.INFO):\n    logger = logging.getLogger(__name__)\n    logger.setLevel(log_level if rank == 0 else 'ERROR')\n    formatter = logging.Formatter('%(asctime)s  %(levelname)5s  %(message)s')\n    console = logging.StreamHandler()\n    console.setLevel(log_level if rank == 0 else 'ERROR')\n    console.setFormatter(formatter)\n    logger.addHandler(console)\n    if log_file is not None:\n        file_handler = logging.FileHandler(filename=log_file)\n        file_handler.setLevel(log_level if rank == 0 else 'ERROR')\n        file_handler.setFormatter(formatter)\n        logger.addHandler(file_handler)\n    return logger\n\n\ndef set_random_seed(seed):\n    random.seed(seed)\n    np.random.seed(seed)\n    torch.manual_seed(seed)\n    torch.backends.cudnn.deterministic = True\n    torch.backends.cudnn.benchmark = False\n\n\ndef keep_arrays_by_name(gt_names, used_classes):\n    inds = [i for i, x in enumerate(gt_names) if x in used_classes]\n    inds = np.array(inds, dtype=np.int64)\n    return inds\n\n\ndef init_dist_slurm(tcp_port, local_rank, backend='nccl'):\n    \"\"\"\n    modified from https://github.com/open-mmlab/mmdetection\n    Args:\n        tcp_port:\n        backend:\n\n    Returns:\n\n    \"\"\"\n    proc_id = int(os.environ['SLURM_PROCID'])\n    ntasks = int(os.environ['SLURM_NTASKS'])\n    node_list = os.environ['SLURM_NODELIST']\n    num_gpus = torch.cuda.device_count()\n    torch.cuda.set_device(proc_id % num_gpus)\n    addr = 
subprocess.getoutput('scontrol show hostname {} | head -n1'.format(node_list))\n    os.environ['MASTER_PORT'] = str(tcp_port)\n    os.environ['MASTER_ADDR'] = addr\n    os.environ['WORLD_SIZE'] = str(ntasks)\n    os.environ['RANK'] = str(proc_id)\n    dist.init_process_group(backend=backend)\n\n    total_gpus = dist.get_world_size()\n    rank = dist.get_rank()\n    return total_gpus, rank\n\n\ndef init_dist_pytorch(tcp_port, local_rank, backend='nccl'):\n    if mp.get_start_method(allow_none=True) is None:\n        mp.set_start_method('spawn')\n\n    num_gpus = torch.cuda.device_count()\n    torch.cuda.set_device(local_rank % num_gpus)\n    dist.init_process_group(\n        backend=backend,\n        init_method='tcp://127.0.0.1:%d' % tcp_port,\n        rank=local_rank,\n        world_size=num_gpus\n    )\n    rank = dist.get_rank()\n    return num_gpus, rank\n\n\ndef get_dist_info():\n    if torch.__version__ < '1.0':\n        initialized = dist._initialized\n    else:\n        if dist.is_available():\n            initialized = dist.is_initialized()\n        else:\n            initialized = False\n    if initialized:\n        rank = dist.get_rank()\n        world_size = dist.get_world_size()\n    else:\n        rank = 0\n        world_size = 1\n    return rank, world_size\n\n\ndef merge_results_dist(result_part, size, tmpdir):\n    rank, world_size = get_dist_info()\n    os.makedirs(tmpdir, exist_ok=True)\n\n    dist.barrier()\n    pickle.dump(result_part, open(os.path.join(tmpdir, 'result_part_{}.pkl'.format(rank)), 'wb'))\n    dist.barrier()\n\n    if rank != 0:\n        return None\n\n    part_list = []\n    for i in range(world_size):\n        part_file = os.path.join(tmpdir, 'result_part_{}.pkl'.format(i))\n        part_list.append(pickle.load(open(part_file, 'rb')))\n\n    ordered_results = []\n    for res in zip(*part_list):\n        ordered_results.extend(list(res))\n    ordered_results = ordered_results[:size]\n    shutil.rmtree(tmpdir)\n    return 
ordered_results\n"
  },
  {
    "path": "pcdet/utils/commu_utils.py",
    "content": "\"\"\"\nThis file contains primitives for multi-gpu communication.\nThis is useful when doing distributed training.\n\ndeeply borrow from maskrcnn-benchmark and ST3D\n\"\"\"\n\nimport pickle\nimport time\n\nimport torch\nimport torch.distributed as dist\n\n\ndef get_world_size():\n    if not dist.is_available():\n        return 1\n    if not dist.is_initialized():\n        return 1\n    return dist.get_world_size()\n\n\ndef get_rank():\n    if not dist.is_available():\n        return 0\n    if not dist.is_initialized():\n        return 0\n    return dist.get_rank()\n\n\ndef is_main_process():\n    return get_rank() == 0\n\n\ndef synchronize():\n    \"\"\"\n    Helper function to synchronize (barrier) among all processes when\n    using distributed training\n    \"\"\"\n    if not dist.is_available():\n        return\n    if not dist.is_initialized():\n        return\n    world_size = dist.get_world_size()\n    if world_size == 1:\n        return\n    dist.barrier()\n\n\ndef all_gather(data):\n    \"\"\"\n    Run all_gather on arbitrary picklable data (not necessarily tensors)\n    Args:\n        data: any picklable object\n    Returns:\n        list[data]: list of data gathered from each rank\n    \"\"\"\n    world_size = get_world_size()\n    if world_size == 1:\n        return [data]\n\n    # serialized to a Tensor\n    origin_size = None\n    if not isinstance(data, torch.Tensor):\n        buffer = pickle.dumps(data)\n        storage = torch.ByteStorage.from_buffer(buffer)\n        tensor = torch.ByteTensor(storage).to(\"cuda\")\n    else:\n        origin_size = data.size()\n        tensor = data.reshape(-1)\n\n    tensor_type = tensor.dtype\n\n    # obtain Tensor size of each rank\n    local_size = torch.LongTensor([tensor.numel()]).to(\"cuda\")\n    size_list = [torch.LongTensor([0]).to(\"cuda\") for _ in range(world_size)]\n    dist.all_gather(size_list, local_size)\n    size_list = [int(size.item()) for size in size_list]\n    max_size = 
max(size_list)\n\n    # receiving Tensor from all ranks\n    # we pad the tensor because torch all_gather does not support\n    # gathering tensors of different shapes\n    tensor_list = []\n    for _ in size_list:\n        tensor_list.append(torch.FloatTensor(size=(max_size,)).cuda().to(tensor_type))\n    if local_size != max_size:\n        padding = torch.FloatTensor(size=(max_size - local_size,)).cuda().to(tensor_type)\n        tensor = torch.cat((tensor, padding), dim=0)\n    dist.all_gather(tensor_list, tensor)\n\n    data_list = []\n    for size, tensor in zip(size_list, tensor_list):\n        if origin_size is None:\n            buffer = tensor.cpu().numpy().tobytes()[:size]\n            data_list.append(pickle.loads(buffer))\n        else:\n            buffer = tensor[:size]\n            data_list.append(buffer)\n\n    if origin_size is not None:\n        new_shape = [-1] + list(origin_size[1:])\n        resized_list = []\n        for data in data_list:\n            # suppose the difference of tensor size exist in first dimension\n            data = data.reshape(new_shape)\n            resized_list.append(data)\n\n        return resized_list\n    else:\n        return data_list\n\n\ndef reduce_dict(input_dict, average=True):\n    \"\"\"\n    Args:\n        input_dict (dict): all the values will be reduced\n        average (bool): whether to do average or sum\n    Reduce the values in the dictionary from all processes so that process with rank\n    0 has the averaged results. 
Returns a dict with the same fields as\n    input_dict, after reduction.\n    \"\"\"\n    world_size = get_world_size()\n    if world_size < 2:\n        return input_dict\n    with torch.no_grad():\n        names = []\n        values = []\n        # sort the keys so that they are consistent across processes\n        for k in sorted(input_dict.keys()):\n            names.append(k)\n            values.append(input_dict[k])\n        values = torch.stack(values, dim=0)\n        dist.reduce(values, dst=0)\n        if dist.get_rank() == 0 and average:\n            # only main process gets accumulated, so only divide by\n            # world_size in this case\n            values /= world_size\n        reduced_dict = {k: v for k, v in zip(names, values)}\n    return reduced_dict\n\n\ndef average_reduce_value(data):\n    data_list = all_gather(data)\n    return sum(data_list) / len(data_list)\n\n\ndef all_reduce(data, op=\"sum\", average=False):\n\n    def op_map(op):\n        op_dict = {\n            \"SUM\": dist.ReduceOp.SUM,\n            \"MAX\": dist.ReduceOp.MAX,\n            \"MIN\": dist.ReduceOp.MIN,\n            \"PRODUCT\": dist.ReduceOp.PRODUCT,\n        }\n        return op_dict[op]\n\n    world_size = get_world_size()\n    if world_size > 1:\n        reduced_data = data.clone()\n        dist.all_reduce(reduced_data, op=op_map(op.upper()))\n        if average:\n            assert op.upper() == 'SUM'\n            return reduced_data / world_size\n        else:\n            return reduced_data\n    return data\n\n\n@torch.no_grad()\ndef concat_all_gather(tensor):\n    \"\"\"\n    Performs all_gather operation on the provided tensors.\n    *** Warning ***: torch.distributed.all_gather has no gradient.\n    \"\"\"\n    tensors_gather = [torch.ones_like(tensor)\n        for _ in range(torch.distributed.get_world_size())]\n    torch.distributed.all_gather(tensors_gather, tensor, async_op=False)\n\n    output = torch.cat(tensors_gather, dim=0)\n    return output\n"
  },
  {
    "path": "pcdet/utils/loss_utils.py",
    "content": "import numpy as np\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom . import box_utils\n\n\nclass SigmoidFocalClassificationLoss(nn.Module):\n    \"\"\"\n    Sigmoid focal cross entropy loss.\n    \"\"\"\n\n    def __init__(self, gamma: float = 2.0, alpha: float = 0.25):\n        \"\"\"\n        Args:\n            gamma: Weighting parameter to balance loss for hard and easy examples.\n            alpha: Weighting parameter to balance loss for positive and negative examples.\n        \"\"\"\n        super(SigmoidFocalClassificationLoss, self).__init__()\n        self.alpha = alpha\n        self.gamma = gamma\n\n    @staticmethod\n    def sigmoid_cross_entropy_with_logits(input: torch.Tensor, target: torch.Tensor):\n        \"\"\" PyTorch Implementation for tf.nn.sigmoid_cross_entropy_with_logits:\n            max(x, 0) - x * z + log(1 + exp(-abs(x))) in\n            https://www.tensorflow.org/api_docs/python/tf/nn/sigmoid_cross_entropy_with_logits\n\n        Args:\n            input: (B, #anchors, #classes) float tensor.\n                Predicted logits for each class\n            target: (B, #anchors, #classes) float tensor.\n                One-hot encoded classification targets\n\n        Returns:\n            loss: (B, #anchors, #classes) float tensor.\n                Sigmoid cross entropy loss without reduction\n        \"\"\"\n        loss = torch.clamp(input, min=0) - input * target + \\\n               torch.log1p(torch.exp(-torch.abs(input)))\n        return loss\n\n    def forward(self, input: torch.Tensor, target: torch.Tensor, weights: torch.Tensor):\n        \"\"\"\n        Args:\n            input: (B, #anchors, #classes) float tensor.\n                Predicted logits for each class\n            target: (B, #anchors, #classes) float tensor.\n                One-hot encoded classification targets\n            weights: (B, #anchors) float tensor.\n                Anchor-wise weights.\n\n        Returns:\n   
         weighted_loss: (B, #anchors, #classes) float tensor after weighting.\n        \"\"\"\n        pred_sigmoid = torch.sigmoid(input)\n        alpha_weight = target * self.alpha + (1 - target) * (1 - self.alpha)\n        pt = target * (1.0 - pred_sigmoid) + (1.0 - target) * pred_sigmoid\n        focal_weight = alpha_weight * torch.pow(pt, self.gamma)\n\n        bce_loss = self.sigmoid_cross_entropy_with_logits(input, target)\n\n        loss = focal_weight * bce_loss\n\n        if weights.shape.__len__() == 2 or \\\n                (weights.shape.__len__() == 1 and target.shape.__len__() == 2):\n            weights = weights.unsqueeze(-1)\n\n        assert weights.shape.__len__() == loss.shape.__len__()\n\n        return loss * weights\n\n\nclass WeightedSmoothL1Loss(nn.Module):\n    \"\"\"\n    Code-wise Weighted Smooth L1 Loss modified based on fvcore.nn.smooth_l1_loss\n    https://github.com/facebookresearch/fvcore/blob/master/fvcore/nn/smooth_l1_loss.py\n                  | 0.5 * x ** 2 / beta   if abs(x) < beta\n    smoothl1(x) = |\n                  | abs(x) - 0.5 * beta   otherwise,\n    where x = input - target.\n    \"\"\"\n    def __init__(self, beta: float = 1.0 / 9.0, code_weights: list = None):\n        \"\"\"\n        Args:\n            beta: Scalar float.\n                L1 to L2 change point.\n                For beta values < 1e-5, L1 loss is computed.\n            code_weights: (#codes) float list if not None.\n                Code-wise weights.\n        \"\"\"\n        super(WeightedSmoothL1Loss, self).__init__()\n        self.beta = beta\n        if code_weights is not None:\n            self.code_weights = np.array(code_weights, dtype=np.float32)\n            self.code_weights = torch.from_numpy(self.code_weights).cuda()\n        else:\n            # forward() checks this attribute, so it must exist even without weights\n            self.code_weights = None\n\n    @staticmethod\n    def smooth_l1_loss(diff, beta):\n        if beta < 1e-5:\n            loss = torch.abs(diff)\n        else:\n            n = torch.abs(diff)\n            loss = torch.where(n < beta, 0.5 * n ** 2 
/ beta, n - 0.5 * beta)\n\n        return loss\n\n    def forward(self, input: torch.Tensor, target: torch.Tensor, weights: torch.Tensor = None):\n        \"\"\"\n        Args:\n            input: (B, #anchors, #codes) float tensor.\n                Encoded predicted locations of objects.\n            target: (B, #anchors, #codes) float tensor.\n                Regression targets.\n            weights: (B, #anchors) float tensor if not None.\n\n        Returns:\n            loss: (B, #anchors, #codes) float tensor.\n                Weighted smooth l1 loss without reduction.\n        \"\"\"\n        target = torch.where(torch.isnan(target), input, target)  # ignore nan targets\n\n        diff = input - target\n        # code-wise weighting\n        if self.code_weights is not None:\n            diff = diff * self.code_weights.view(1, 1, -1)\n\n        loss = self.smooth_l1_loss(diff, self.beta)\n\n        # anchor-wise weighting\n        if weights is not None:\n            assert weights.shape[0] == loss.shape[0] and weights.shape[1] == loss.shape[1]\n            loss = loss * weights.unsqueeze(-1)\n\n        return loss\n\n\nclass WeightedL1Loss(nn.Module):\n    def __init__(self, code_weights: list = None):\n        \"\"\"\n        Args:\n            code_weights: (#codes) float list if not None.\n                Code-wise weights.\n        \"\"\"\n        super(WeightedL1Loss, self).__init__()\n        if code_weights is not None:\n            self.code_weights = np.array(code_weights, dtype=np.float32)\n            self.code_weights = torch.from_numpy(self.code_weights).cuda()\n        else:\n            # forward() checks this attribute, so it must exist even without weights\n            self.code_weights = None\n\n    def forward(self, input: torch.Tensor, target: torch.Tensor, weights: torch.Tensor = None):\n        \"\"\"\n        Args:\n            input: (B, #anchors, #codes) float tensor.\n                Encoded predicted locations of objects.\n            target: (B, #anchors, #codes) float tensor.\n                Regression targets.\n            weights: (B, #anchors) float tensor if not 
None.\n\n        Returns:\n            loss: (B, #anchors, #codes) float tensor.\n                Weighted L1 loss without reduction.\n        \"\"\"\n        target = torch.where(torch.isnan(target), input, target)  # ignore nan targets\n\n        diff = input - target\n        # code-wise weighting\n        if self.code_weights is not None:\n            diff = diff * self.code_weights.view(1, 1, -1)\n\n        loss = torch.abs(diff)\n\n        # anchor-wise weighting\n        if weights is not None:\n            assert weights.shape[0] == loss.shape[0] and weights.shape[1] == loss.shape[1]\n            loss = loss * weights.unsqueeze(-1)\n\n        return loss\n\n\nclass WeightedCrossEntropyLoss(nn.Module):\n    \"\"\"\n    Transform the input to fit the format of PyTorch's official cross entropy loss\n    with anchor-wise weighting.\n    \"\"\"\n    def __init__(self):\n        super(WeightedCrossEntropyLoss, self).__init__()\n\n    def forward(self, input: torch.Tensor, target: torch.Tensor, weights: torch.Tensor):\n        \"\"\"\n        Args:\n            input: (B, #anchors, #classes) float tensor.\n                Predicted logits for each class.\n            target: (B, #anchors, #classes) float tensor.\n                One-hot classification targets.\n            weights: (B, #anchors) float tensor.\n                Anchor-wise weights.\n\n        Returns:\n            loss: (B, #anchors) float tensor.\n                Weighted cross entropy loss without reduction\n        \"\"\"\n        input = input.permute(0, 2, 1)\n        target = target.argmax(dim=-1)\n        loss = F.cross_entropy(input, target, reduction='none') * weights\n        return loss\n\n\ndef get_corner_loss_lidar(pred_bbox3d: torch.Tensor, gt_bbox3d: torch.Tensor):\n    \"\"\"\n    Args:\n        pred_bbox3d: (N, 7) float Tensor.\n        gt_bbox3d: (N, 7) float Tensor.\n\n    Returns:\n        corner_loss: (N) float Tensor.\n    \"\"\"\n    assert pred_bbox3d.shape[0] == 
gt_bbox3d.shape[0]\n\n    pred_box_corners = box_utils.boxes_to_corners_3d(pred_bbox3d)\n    gt_box_corners = box_utils.boxes_to_corners_3d(gt_bbox3d)\n\n    gt_bbox3d_flip = gt_bbox3d.clone()\n    gt_bbox3d_flip[:, 6] += np.pi\n    gt_box_corners_flip = box_utils.boxes_to_corners_3d(gt_bbox3d_flip)\n    # (N, 8)\n    corner_dist = torch.min(torch.norm(pred_box_corners - gt_box_corners, dim=2),\n                            torch.norm(pred_box_corners - gt_box_corners_flip, dim=2))\n    # (N, 8)\n    corner_loss = WeightedSmoothL1Loss.smooth_l1_loss(corner_dist, beta=1.0)\n\n    return corner_loss.mean(dim=1)\n\n\ndef compute_fg_mask(gt_boxes2d, shape, downsample_factor=1, device=torch.device(\"cpu\")):\n    \"\"\"\n    Compute foreground mask for images\n    Args:\n        gt_boxes2d: (B, N, 4), 2D box labels\n        shape: torch.Size or tuple, Foreground mask desired shape\n        downsample_factor: int, Downsample factor for image\n        device: torch.device, Foreground mask desired device\n    Returns:\n        fg_mask (shape), Foreground mask\n    \"\"\"\n    fg_mask = torch.zeros(shape, dtype=torch.bool, device=device)\n\n    # Set box corners\n    gt_boxes2d /= downsample_factor\n    gt_boxes2d[:, :, :2] = torch.floor(gt_boxes2d[:, :, :2])\n    gt_boxes2d[:, :, 2:] = torch.ceil(gt_boxes2d[:, :, 2:])\n    gt_boxes2d = gt_boxes2d.long()\n\n    # Set all values within each box to True\n    B, N = gt_boxes2d.shape[:2]\n    for b in range(B):\n        for n in range(N):\n            u1, v1, u2, v2 = gt_boxes2d[b, n]\n            fg_mask[b, v1:v2, u1:u2] = True\n\n    return fg_mask\n\n\ndef neg_loss_cornernet(pred, gt, mask=None):\n    \"\"\"\n    Refer to https://github.com/tianweiy/CenterPoint.\n    Modified focal loss. Exactly the same as CornerNet. 
Runs faster and costs a little bit more memory\n    Args:\n        pred: (batch x c x h x w)\n        gt: (batch x c x h x w)\n        mask: (batch x h x w)\n    Returns:\n    \"\"\"\n    pos_inds = gt.eq(1).float()\n    neg_inds = gt.lt(1).float()\n\n    neg_weights = torch.pow(1 - gt, 4)\n\n    loss = 0\n\n    pos_loss = torch.log(pred) * torch.pow(1 - pred, 2) * pos_inds\n    neg_loss = torch.log(1 - pred) * torch.pow(pred, 2) * neg_weights * neg_inds\n\n    if mask is not None:\n        mask = mask[:, None, :, :].float()\n        pos_loss = pos_loss * mask\n        neg_loss = neg_loss * mask\n        num_pos = (pos_inds.float() * mask).sum()\n    else:\n        num_pos = pos_inds.float().sum()\n\n    pos_loss = pos_loss.sum()\n    neg_loss = neg_loss.sum()\n\n    if num_pos == 0:\n        loss = loss - neg_loss\n    else:\n        loss = loss - (pos_loss + neg_loss) / num_pos\n    return loss\n\n\nclass FocalLossCenterNet(nn.Module):\n    \"\"\"\n    Refer to https://github.com/tianweiy/CenterPoint\n    \"\"\"\n    def __init__(self):\n        super(FocalLossCenterNet, self).__init__()\n        self.neg_loss = neg_loss_cornernet\n\n    def forward(self, out, target, mask=None):\n        return self.neg_loss(out, target, mask=mask)\n\n\ndef _reg_loss(regr, gt_regr, mask):\n    \"\"\"\n    Refer to https://github.com/tianweiy/CenterPoint\n    L1 regression loss\n    Args:\n        regr (batch x max_objects x dim)\n        gt_regr (batch x max_objects x dim)\n        mask (batch x max_objects)\n    Returns:\n    \"\"\"\n    num = mask.float().sum()\n    mask = mask.unsqueeze(2).expand_as(gt_regr).float()\n    isnotnan = (~ torch.isnan(gt_regr)).float()\n    mask *= isnotnan\n    regr = regr * mask\n    gt_regr = gt_regr * mask\n\n    loss = torch.abs(regr - gt_regr)\n    loss = loss.transpose(2, 0)\n\n    loss = torch.sum(loss, dim=2)\n    loss = torch.sum(loss, dim=1)\n    # else:\n    #  # D x M x B\n    #  loss = loss.reshape(loss.shape[0], -1)\n\n    # loss = 
loss / (num + 1e-4)\n    loss = loss / torch.clamp_min(num, min=1.0)\n    return loss\n\n\ndef _gather_feat(feat, ind, mask=None):\n    dim = feat.size(2)\n    ind = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), dim)\n    feat = feat.gather(1, ind)\n    if mask is not None:\n        mask = mask.unsqueeze(2).expand_as(feat)\n        feat = feat[mask]\n        feat = feat.view(-1, dim)\n    return feat\n\n\ndef _transpose_and_gather_feat(feat, ind):\n    feat = feat.permute(0, 2, 3, 1).contiguous()\n    feat = feat.view(feat.size(0), -1, feat.size(3))\n    feat = _gather_feat(feat, ind)\n    return feat\n\n\nclass RegLossCenterNet(nn.Module):\n    \"\"\"\n    Refer to https://github.com/tianweiy/CenterPoint\n    \"\"\"\n\n    def __init__(self):\n        super(RegLossCenterNet, self).__init__()\n\n    def forward(self, output, mask, ind=None, target=None):\n        \"\"\"\n        Args:\n            output: (batch x dim x h x w) or (batch x max_objects)\n            mask: (batch x max_objects)\n            ind: (batch x max_objects)\n            target: (batch x max_objects x dim)\n        Returns:\n        \"\"\"\n        if ind is None:\n            pred = output\n        else:\n            pred = _transpose_and_gather_feat(output, ind)\n        loss = _reg_loss(pred, target, mask)\n        return loss"
  },
  {
    "path": "pcdet/utils/object3d_kitti.py",
    "content": "import numpy as np\n\n\ndef get_objects_from_label(label_file):\n    with open(label_file, 'r') as f:\n        lines = f.readlines()\n    objects = [Object3d(line) for line in lines]\n    if len(objects) == 0:\n        return [Object3d('DontCare -1 -1 -4.0061 0.0000 198.4733 416.3764 373.0000 1.5332 1.6821 4.2322 -2.7611 1.6843 4.1515 -4.5719')]\n    return objects\ndef get_objects_from_tracking_label(label_file):\n    objects = [Object3d(line) for line in label_file]\n    return objects\n\n\ndef cls_type_to_id(cls_type):\n    type_to_id = {'Car': 1, 'Pedestrian': 2, 'Cyclist': 3, 'Van': 4}\n    if cls_type not in type_to_id.keys():\n        return -1\n    return type_to_id[cls_type]\n\n\nclass Object3d(object):\n    def __init__(self, line):\n        label = line.strip().split(' ')\n        self.src = line\n        self.cls_type = label[0]\n        self.cls_id = cls_type_to_id(self.cls_type)\n        self.truncation = float(label[1])\n        self.occlusion = float(label[2])  # 0:fully visible 1:partly occluded 2:largely occluded 3:unknown\n        self.alpha = float(label[3])\n        self.box2d = np.array((float(label[4]), float(label[5]), float(label[6]), float(label[7])), dtype=np.float32)\n        self.h = float(label[8])\n        self.w = float(label[9])\n        self.l = float(label[10])\n        self.loc = np.array((float(label[11]), float(label[12]), float(label[13])), dtype=np.float32)\n        self.dis_to_cam = np.linalg.norm(self.loc)\n        self.ry = float(label[14])\n        self.score = float(label[15]) if label.__len__() == 16 else -1.0\n        self.level_str = None\n        self.ob_id = -1\n        if len(label)>15:\n            self.ob_id=label[-1]\n            self.level = self.get_kitti_tracking_obj_level()\n        else:\n            self.level = self.get_kitti_obj_level()\n\n    def get_kitti_obj_level(self):\n        height = float(self.box2d[3]) - float(self.box2d[1]) + 1\n\n        if height >= 40 and self.truncation <= 
0.15 and self.occlusion <= 0:\n            self.level_str = 'Easy'\n            return 0  # Easy\n        elif height >= 25 and self.truncation <= 0.3 and self.occlusion <= 1:\n            self.level_str = 'Moderate'\n            return 1  # Moderate\n        elif height >= 25 and self.truncation <= 0.5 and self.occlusion <= 2:\n            self.level_str = 'Hard'\n            return 2  # Hard\n        else:\n            self.level_str = 'UnKnown'\n            return -1\n    def get_kitti_tracking_obj_level(self):\n        height = float(self.box2d[3]) - float(self.box2d[1]) + 1\n\n        if height >= 40 and self.truncation <= 0 and self.occlusion <= 0:\n            self.level_str = 'Easy'\n            return 0  # Easy\n        elif height >= 25 and self.truncation <= 1 and self.occlusion <= 1:\n            self.level_str = 'Moderate'\n            return 1  # Moderate\n        elif height >= 25 and self.truncation <= 2 and self.occlusion <= 2:\n            self.level_str = 'Hard'\n            return 2  # Hard\n        else:\n            self.level_str = 'UnKnown'\n            return -1\n\n    def generate_corners3d(self):\n        \"\"\"\n        generate corners3d representation for this object\n        :return corners_3d: (8, 3) corners of box3d in camera coord\n        \"\"\"\n        l, h, w = self.l, self.h, self.w\n        x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]\n        y_corners = [0, 0, 0, 0, -h, -h, -h, -h]\n        z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]\n\n        R = np.array([[np.cos(self.ry), 0, np.sin(self.ry)],\n                      [0, 1, 0],\n                      [-np.sin(self.ry), 0, np.cos(self.ry)]])\n        corners3d = np.vstack([x_corners, y_corners, z_corners])  # (3, 8)\n        corners3d = np.dot(R, corners3d).T\n        corners3d = corners3d + self.loc\n        return corners3d\n\n    def to_str(self):\n        print_str = '%s %.3f %.3f %.3f box2d: %s hwl: [%.3f %.3f 
%.3f] pos: %s ry: %.3f' \\\n                     % (self.cls_type, self.truncation, self.occlusion, self.alpha, self.box2d, self.h, self.w, self.l,\n                        self.loc, self.ry)\n        return print_str\n\n    def to_kitti_format(self):\n        kitti_str = '%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f' \\\n                    % (self.cls_type, self.truncation, int(self.occlusion), self.alpha, self.box2d[0], self.box2d[1],\n                       self.box2d[2], self.box2d[3], self.h, self.w, self.l, self.loc[0], self.loc[1], self.loc[2],\n                       self.ry)\n        return kitti_str\n"
  },
  {
    "path": "pcdet/utils/odiou_loss.py",
    "content": "### Compute the IOU of two rotated 2D rectangle\n\nimport math\nimport numpy as np\nimport sys\nimport random\nimport torch\nfrom torch.autograd import Function\nimport torch.nn as nn\n# from compute_ious import compute_ious_whih_shapely\nfrom scipy.spatial import ConvexHull\n\n\n## This function is used to determine whether a point is inside a rectangle or not\nclass compute_vertex(Function):\n    '''\n    Compute the corners which are inside the rectangles\n    '''\n\n    @staticmethod\n    def forward(ctx, corners_gboxes, corners_qboxes):\n\n        np_corners_gboxes = corners_gboxes.cpu().numpy()\n        np_corners_qboxes = corners_qboxes.cpu().detach().numpy()\n        N = corners_gboxes.shape[0]\n        num_of_intersections = np.zeros((N,), dtype=np.int32)\n        intersections = np.zeros((N, 16), dtype=np.float32)\n        flags_qboxes = np.zeros((N, 4), dtype=np.float32)\n        flags_gboxes = np.zeros((N, 4), dtype=np.float32)\n        flags_inters = np.zeros((N, 4, 4), dtype=np.float32)\n\n        for iter in range(N):\n            # step 1: determine how many corners from corners_gboxes inside the np_qboxes\n            ab0 = np_corners_qboxes[iter, 2] - np_corners_qboxes[iter, 0]\n            ab1 = np_corners_qboxes[iter, 3] - np_corners_qboxes[iter, 1]\n            ad0 = np_corners_qboxes[iter, 6] - np_corners_qboxes[iter, 0]\n            ad1 = np_corners_qboxes[iter, 7] - np_corners_qboxes[iter, 1]\n            for i in range(4):\n                ap0 = np_corners_gboxes[iter, i * 2] - np_corners_qboxes[iter, 0]\n                ap1 = np_corners_gboxes[iter, i * 2 + 1] - np_corners_qboxes[iter, 1]\n                abab = ab0 * ab0 + ab1 * ab1\n                abap = ab0 * ap0 + ab1 * ap1\n                adad = ad0 * ad0 + ad1 * ad1\n                adap = ad0 * ap0 + ad1 * ap1\n                if (abab >= abap and abap >= 0 and adad >= adap and adap >= 0):\n                    intersections[iter, num_of_intersections[iter] * 2] = 
np_corners_gboxes[iter, i * 2]\n                    intersections[iter, num_of_intersections[iter] * 2 + 1] = np_corners_gboxes[iter, i * 2 + 1]\n                    num_of_intersections[iter] += 1\n                    flags_gboxes[iter, i] = 1.0\n\n            # step 2: determine how many corners from np_qboxes inside corners_gboxes\n            ab0 = np_corners_gboxes[iter, 2] - np_corners_gboxes[iter, 0]\n            ab1 = np_corners_gboxes[iter, 3] - np_corners_gboxes[iter, 1]\n            ad0 = np_corners_gboxes[iter, 6] - np_corners_gboxes[iter, 0]\n            ad1 = np_corners_gboxes[iter, 7] - np_corners_gboxes[iter, 1]\n            for i in range(4):\n                ap0 = np_corners_qboxes[iter, i * 2] - np_corners_gboxes[iter, 0]\n                ap1 = np_corners_qboxes[iter, i * 2 + 1] - np_corners_gboxes[iter, 1]\n                abab = ab0 * ab0 + ab1 * ab1\n                abap = ab0 * ap0 + ab1 * ap1\n                adad = ad0 * ad0 + ad1 * ad1\n                adap = ad0 * ap0 + ad1 * ap1\n                if (abab >= abap and abap >= 0 and adad >= adap and adap >= 0):\n                    intersections[iter, num_of_intersections[iter] * 2] = np_corners_qboxes[iter, i * 2]\n                    intersections[iter, num_of_intersections[iter] * 2 + 1] = np_corners_qboxes[iter, i * 2 + 1]\n                    num_of_intersections[iter] += 1\n                    flags_qboxes[iter, i] = 1.0\n\n            # step 3: find the intersection of all the edges\n            for i in range(4):\n                for j in range(4):\n                    A = np.zeros((2,), dtype=np.float32)\n                    B = np.zeros((2,), dtype=np.float32)\n                    C = np.zeros((2,), dtype=np.float32)\n                    D = np.zeros((2,), dtype=np.float32)\n\n                    A[0] = np_corners_gboxes[iter, 2 * i]\n                    A[1] = np_corners_gboxes[iter, 2 * i + 1]\n                    B[0] = np_corners_gboxes[iter, 2 * ((i + 1) % 4)]\n               
     B[1] = np_corners_gboxes[iter, 2 * ((i + 1) % 4) + 1]\n\n                    C[0] = np_corners_qboxes[iter, 2 * j]\n                    C[1] = np_corners_qboxes[iter, 2 * j + 1]\n                    D[0] = np_corners_qboxes[iter, 2 * ((j + 1) % 4)]\n                    D[1] = np_corners_qboxes[iter, 2 * ((j + 1) % 4) + 1]\n\n                    BA0 = B[0] - A[0]\n                    BA1 = B[1] - A[1]\n                    CA0 = C[0] - A[0]\n                    CA1 = C[1] - A[1]\n                    DA0 = D[0] - A[0]\n                    DA1 = D[1] - A[1]\n\n                    acd = DA1 * CA0 > CA1 * DA0\n                    bcd = (D[1] - B[1]) * (C[0] - B[0]) > (C[1] - B[1]) * (D[0] - B[0])\n                    if acd != bcd:\n                        abc = CA1 * BA0 > BA1 * CA0\n                        abd = DA1 * BA0 > BA1 * DA0\n                        if abc != abd:\n                            DC0 = D[0] - C[0]\n                            DC1 = D[1] - C[1]\n                            ABBA = A[0] * B[1] - B[0] * A[1]\n                            CDDC = C[0] * D[1] - D[0] * C[1]\n                            DH = BA1 * DC0 - BA0 * DC1\n                            Dx = ABBA * DC0 - BA0 * CDDC\n                            Dy = ABBA * DC1 - BA1 * CDDC\n                            # DH = (B[1] - A[1]) * (D[0] - C[0]) - (B[0] - A[0]) * (D[1] - C[1])\n                            # Dx = (A[0] * B[1] - B[0] * A[1]) * (D[0] - C[0]) - (B[0] - A[0]) * (C[0] * D[1] - D[0] * C[1])\n                            # Dy = (A[0] * B[1] - B[0] * A[1]) * (D[1] - C[1]) - (B[1] - A[1]) * (C[0] * D[1] - D[0] * C[1])\n                            if (num_of_intersections[iter] > 7):\n                                print(\"iter = \", iter)\n                                print(\"(%.4f %.4f) (%.4f %.4f) (%.4f %.4f) (%.4f %.4f)\" % (\n                                    np_corners_gboxes[iter, 0], np_corners_gboxes[iter, 1],\n                                    np_corners_gboxes[iter, 
2], np_corners_gboxes[iter, 3],\n                                    np_corners_gboxes[iter, 4], np_corners_gboxes[iter, 5],\n                                    np_corners_gboxes[iter, 6], np_corners_gboxes[iter, 7]))\n                                print(\"(%.4f %.4f) (%.4f %.4f) (%.4f %.4f) (%.4f %.4f)\" % (\n                                    np_corners_qboxes[iter, 0], np_corners_qboxes[iter, 1],\n                                    np_corners_qboxes[iter, 2], np_corners_qboxes[iter, 3],\n                                    np_corners_qboxes[iter, 4], np_corners_qboxes[iter, 5],\n                                    np_corners_qboxes[iter, 6], np_corners_qboxes[iter, 7]))\n                                continue\n                            intersections[iter, num_of_intersections[iter] * 2] = Dx / DH\n                            intersections[iter, num_of_intersections[iter] * 2 + 1] = Dy / DH\n                            num_of_intersections[iter] += 1\n                            flags_inters[iter, i, j] = 1.0\n\n        ctx.save_for_backward(corners_qboxes)\n        ctx.corners_gboxes = corners_gboxes\n        ctx.flags_qboxes = flags_qboxes\n        ctx.flags_gboxes = flags_gboxes\n        ctx.flags_inters = flags_inters\n        # convert numpy to tensor\n        tensor_intersections = torch.from_numpy(intersections)\n        tensor_num_of_intersections = torch.from_numpy(num_of_intersections)\n        return tensor_intersections, tensor_num_of_intersections.detach()\n\n    @staticmethod\n    def backward(ctx, *grad_outputs):\n        _variables = ctx.saved_tensors\n        corners_qboxes = _variables[0]\n        corners_gboxes = ctx.corners_gboxes\n        flags_qboxes = ctx.flags_qboxes\n        flags_gboxes = ctx.flags_gboxes\n        flags_inters = ctx.flags_inters\n        grad_output = grad_outputs[0]\n\n        np_corners_gboxes = corners_gboxes.cpu().numpy()\n        np_corners_qboxes = corners_qboxes.cpu().detach().numpy()\n\n        N = 
flags_qboxes.shape[0]\n        n_of_inter = np.zeros((N,), dtype=np.int32)\n\n        ### Check whether here is correct or not\n        Jacbian_qboxes = np.zeros((N, 8, 16), dtype=np.float32)\n        Jacbian_gboxes = np.zeros((N, 8, 16), dtype=np.float32)\n\n        for iter in range(N):\n\n            for i in range(4):\n                if (flags_gboxes[iter, i] > 0):\n                    Jacbian_gboxes[iter, i * 2, n_of_inter[iter] * 2] += 1.0\n                    Jacbian_gboxes[iter, i * 2 + 1, n_of_inter[iter] * 2 + 1] += 1.0\n                    n_of_inter[iter] += 1\n\n            for i in range(4):\n                if (flags_qboxes[iter, i] > 0):\n                    Jacbian_qboxes[iter, i * 2, n_of_inter[iter] * 2] += 1.0\n                    Jacbian_qboxes[iter, i * 2 + 1, n_of_inter[iter] * 2 + 1] += 1.0\n                    n_of_inter[iter] += 1\n\n            for i in range(4):\n                for j in range(4):\n                    if (flags_inters[iter, i, j] > 0):\n                        ###\n                        A = np.zeros((2,), dtype=np.float32)\n                        B = np.zeros((2,), dtype=np.float32)\n                        C = np.zeros((2,), dtype=np.float32)\n                        D = np.zeros((2,), dtype=np.float32)\n                        A[0] = np_corners_gboxes[iter, 2 * i]\n                        A[1] = np_corners_gboxes[iter, 2 * i + 1]\n\n                        B[0] = np_corners_gboxes[iter, 2 * ((i + 1) % 4)]\n                        B[1] = np_corners_gboxes[iter, 2 * ((i + 1) % 4) + 1]\n\n                        C[0] = np_corners_qboxes[iter, 2 * j]\n                        C[1] = np_corners_qboxes[iter, 2 * j + 1]\n\n                        D[0] = np_corners_qboxes[iter, 2 * ((j + 1) % 4)]\n                        D[1] = np_corners_qboxes[iter, 2 * ((j + 1) % 4) + 1]\n                        BA0 = B[0] - A[0]\n                        BA1 = B[1] - A[1]\n                        CA0 = C[0] - A[0]\n                       
 CA1 = C[1] - A[1]\n                        DA0 = D[0] - A[0]\n                        DA1 = D[1] - A[1]\n                        acd = DA1 * CA0 > CA1 * DA0\n                        bcd = (D[1] - B[1]) * (C[0] - B[0]) > (C[1] - B[1]) * (D[0] - B[0])\n\n                        if acd != bcd:\n                            abc = CA1 * BA0 > BA1 * CA0\n                            abd = DA1 * BA0 > BA1 * DA0\n                            if abc != abd:\n                                DC0 = D[0] - C[0]\n                                DC1 = D[1] - C[1]\n                                ABBA = A[0] * B[1] - B[0] * A[1]\n                                CDDC = C[0] * D[1] - D[0] * C[1]\n                                DH = BA1 * DC0 - BA0 * DC1\n                                Dx = ABBA * DC0 - BA0 * CDDC\n                                Dy = ABBA * DC1 - BA1 * CDDC\n\n                                # DH = (B[1] - A[1]) * (D[0] - C[0]) - (B[0] - A[0]) * (D[1] - C[1])\n                                # Dx = (A[0] * B[1] - B[0] * A[1]) * (D[0] - C[0]) - (B[0] - A[0]) * (C[0] * D[1] - D[0] * C[1])\n                                det_DxA0 = B[1] * (D[0] - C[0]) + (C[0] * D[1] - D[0] * C[1])\n                                det_DxA1 = - B[0] * (D[0] - C[0])\n                                det_DxB0 = - A[1] * (D[0] - C[0]) - (C[0] * D[1] - D[0] * C[1])\n                                det_DxB1 = A[0] * (D[0] - C[0])\n                                det_DxC0 = - (A[0] * B[1] - B[0] * A[1]) - (B[0] - A[0]) * D[1]\n                                det_DxC1 = (B[0] - A[0]) * D[0]\n                                det_DxD0 = (A[0] * B[1] - B[0] * A[1]) + (B[0] - A[0]) * C[1]\n                                det_DxD1 = -(B[0] - A[0]) * C[0]\n                                # Dy = (A[0] * B[1] - B[0] * A[1]) * (D[1] - C[1]) - (B[1] - A[1]) * (C[0] * D[1] - D[0] * C[1])\n                                det_DyA0 = B[1] * (D[1] - C[1])\n                                det_DyA1 = - B[0] * 
(D[1] - C[1]) + (C[0] * D[1] - D[0] * C[1])\n                                det_DyB0 = -  A[1] * (D[1] - C[1])\n                                det_DyB1 = A[0] * (D[1] - C[1]) - (C[0] * D[1] - D[0] * C[1])\n\n                                det_DyC0 = - (B[1] - A[1]) * D[1]\n                                det_DyC1 = - (A[0] * B[1] - B[0] * A[1]) + (B[1] - A[1]) * D[0]\n                                det_DyD0 = (B[1] - A[1]) * C[1]\n                                det_DyD1 = (A[0] * B[1] - B[0] * A[1]) - (B[1] - A[1]) * C[0]\n                                # DH = (B[1] - A[1]) * (D[0] - C[0]) - (B[0] - A[0]) * (D[1] - C[1])\n                                det_DHA0 = (D[1] - C[1])\n                                det_DHA1 = - (D[0] - C[0])\n                                det_DHB0 = - (D[1] - C[1])\n                                det_DHB1 = (D[0] - C[0])\n                                det_DHC0 = - (B[1] - A[1])\n                                det_DHC1 = (B[0] - A[0])\n                                det_DHD0 = (B[1] - A[1])\n                                det_DHD1 = - (B[0] - A[0])\n\n                                DHDH = DH * DH\n                                Jacbian_gboxes[iter, i * 2, n_of_inter[iter] * 2] += (det_DxA0 * DH - Dx * det_DHA0) / DHDH\n                                Jacbian_gboxes[iter, i * 2, n_of_inter[iter] * 2 + 1] += (det_DyA0 * DH - Dy * det_DHA0) / DHDH\n\n                                Jacbian_gboxes[iter, i * 2 + 1, n_of_inter[iter] * 2] += (det_DxA1 * DH - Dx * det_DHA1) / DHDH\n                                Jacbian_gboxes[iter, i * 2 + 1, n_of_inter[iter] * 2 + 1] += (det_DyA1 * DH - Dy * det_DHA1) / DHDH\n\n                                Jacbian_gboxes[iter, 2 * ((i + 1) % 4), n_of_inter[iter] * 2] += (det_DxB0 * DH - Dx * det_DHB0) / DHDH\n                                Jacbian_gboxes[iter, 2 * ((i + 1) % 4), n_of_inter[iter] * 2 + 1] += (det_DyB0 * DH - Dy * det_DHB0) / DHDH\n\n                                
Jacbian_gboxes[iter, 2 * ((i + 1) % 4) + 1, n_of_inter[iter] * 2] += (det_DxB1 * DH - Dx * det_DHB1) / DHDH\n                                Jacbian_gboxes[iter, 2 * ((i + 1) % 4) + 1, n_of_inter[iter] * 2 + 1] += (det_DyB1 * DH - Dy * det_DHB1) / DHDH\n\n                                Jacbian_qboxes[iter, j * 2, n_of_inter[iter] * 2] += (det_DxC0 * DH - Dx * det_DHC0) / DHDH\n                                Jacbian_qboxes[iter, j * 2, n_of_inter[iter] * 2 + 1] += (det_DyC0 * DH - Dy * det_DHC0) / DHDH\n\n                                Jacbian_qboxes[iter, j * 2 + 1, n_of_inter[iter] * 2] += (det_DxC1 * DH - Dx * det_DHC1) / DHDH\n                                Jacbian_qboxes[iter, j * 2 + 1, n_of_inter[iter] * 2 + 1] += (det_DyC1 * DH - Dy * det_DHC1) / DHDH\n\n                                Jacbian_qboxes[iter, 2 * ((j + 1) % 4), n_of_inter[iter] * 2] += (det_DxD0 * DH - Dx * det_DHD0) / DHDH\n                                Jacbian_qboxes[iter, 2 * ((j + 1) % 4), n_of_inter[iter] * 2 + 1] += (det_DyD0 * DH - Dy * det_DHD0) / DHDH\n\n                                Jacbian_qboxes[iter, 2 * ((j + 1) % 4) + 1, n_of_inter[iter] * 2] += (det_DxD1 * DH - Dx * det_DHD1) / DHDH\n                                Jacbian_qboxes[iter, 2 * ((j + 1) % 4) + 1, n_of_inter[iter] * 2 + 1] += (det_DyD1 * DH - Dy * det_DHD1) / DHDH\n\n                                n_of_inter[iter] += 1\n\n        tensor_Jacbian_gboxes = torch.from_numpy(Jacbian_gboxes).to(torch.device(corners_qboxes.device))\n        tensor_Jacbian_qboxes = torch.from_numpy(Jacbian_qboxes).to(torch.device(corners_qboxes.device))\n        grad_output_cuda = grad_output.to(torch.device(corners_qboxes.device))\n        # print(\"grad_output_cuda =\", grad_output_cuda.shape)\n        tensor_grad_corners_gboxes = tensor_Jacbian_gboxes.matmul(grad_output_cuda.unsqueeze(2)).squeeze(2)\n        tensor_grad_corners_qboxes = tensor_Jacbian_qboxes.matmul(grad_output_cuda.unsqueeze(2)).squeeze(2)\n        return 
tensor_grad_corners_gboxes, tensor_grad_corners_qboxes\n\n\nclass sort_vertex(Function):\n    @staticmethod\n    def forward(ctx, int_pts, num_of_inter):\n        np_int_pts = int_pts.detach().numpy()\n        #np_num_of_inter = num_of_inter.detach().numpy()\n        np_num_of_inter = num_of_inter\n        N = int_pts.shape[0]\n        np_sorted_indexs = np.zeros((N, 8), dtype=np.int32)\n        sorted_int_pts = np.zeros((N, 16), dtype=np.float32)\n        for iter in range(N):\n            if np_num_of_inter[iter] > 0:\n                center = np.zeros((2,), dtype=np.float32)\n                for i in range(np_num_of_inter[iter]):\n                    center[0] += np_int_pts[iter, 2 * i]\n                    center[1] += np_int_pts[iter, 2 * i + 1]\n                center[0] /= np_num_of_inter[iter].float()\n                center[1] /= np_num_of_inter[iter].float()\n\n                angle = np.zeros((8,), dtype=np.float32)\n                v = np.zeros((2,), dtype=np.float32)\n\n                for i in range(np_num_of_inter[iter]):\n                    v[0] = np_int_pts[iter, 2 * i] - center[0]\n                    v[1] = np_int_pts[iter, 2 * i + 1] - center[1]\n                    d = math.sqrt(v[0] * v[0] + v[1] * v[1])\n                    v[0] = v[0] / d\n                    v[1] = v[1] / d\n                    anglei = math.atan2(v[1], v[0])\n                    if anglei < 0:\n                        angle[i] = anglei + 2 * 3.1415926\n                    else:\n                        angle[i] = anglei\n                # sort angles in descending order\n                np_sorted_indexs[iter, :] = np.argsort(-angle)\n                for i in range(np_num_of_inter[iter]):\n                    sorted_int_pts[iter, 2 * i] = np_int_pts[iter, 2 * np_sorted_indexs[iter, i]]\n                    sorted_int_pts[iter, 2 * i + 1] = np_int_pts[iter, 2 * np_sorted_indexs[iter, i] + 1]\n\n        # convert numpy to tensor\n        ctx.save_for_backward(int_pts, 
num_of_inter)\n        ctx.np_sorted_indexs = np_sorted_indexs\n        tensor_sorted_int_pts = torch.from_numpy(sorted_int_pts)\n        return tensor_sorted_int_pts\n\n    @staticmethod\n    def backward(ctx, grad_output):\n        int_pts, num_of_inter = ctx.saved_tensors\n        np_sorted_indexs = ctx.np_sorted_indexs\n\n        N = int_pts.shape[0]\n        Jacbian_int_pts = np.zeros((N, 16, 16), dtype=np.float32)\n        for iter in range(N):\n            for i in range(num_of_inter[iter]):\n                Jacbian_int_pts[iter, 2 * np_sorted_indexs[iter, i], 2 * i] = 1\n                Jacbian_int_pts[iter, 2 * np_sorted_indexs[iter, i] + 1, 2 * i + 1] = 1\n\n        tensor_Jacbian_int_pts = torch.from_numpy(Jacbian_int_pts).to(torch.device(int_pts.device))\n        grad_output_cuda = grad_output.to(torch.device(int_pts.device))\n        tensor_grad_int_pts = tensor_Jacbian_int_pts.matmul(grad_output_cuda.unsqueeze(2)).squeeze(2)\n        # todo: my second addition\n        # my_add_1 = torch.zeros(tensor_grad_int_pts.shape[0], dtype=torch.float32)\n        return tensor_grad_int_pts, None\n\n\nclass area_polygon(Function):\n\n    @staticmethod\n    def forward(ctx, int_pts, num_of_inter):\n        ctx.save_for_backward(int_pts, num_of_inter)\n        np_int_pts = int_pts.detach().numpy()\n        #np_num_of_inter = num_of_inter.detach().numpy()\n        np_num_of_inter = num_of_inter\n        N = int_pts.shape[0]\n        areas = np.zeros((N,), dtype=np.float32)\n\n        for iter in range(N):\n            for i in range(np_num_of_inter[iter] - 2):\n                p1 = np_int_pts[iter, 0:2]\n                p2 = np_int_pts[iter, 2 * i + 2:2 * i + 4]\n                p3 = np_int_pts[iter, 2 * i + 4:2 * i + 6]\n                areas[iter] += abs(((p1[0] - p3[0]) * (p2[1] - p3[1]) - (p1[1] - p3[1]) * (p2[0] - p3[0])) / 2.0)\n\n        tensor_areas = torch.from_numpy(areas)\n\n        return tensor_areas\n\n    @staticmethod\n    def backward(ctx, 
*grad_outputs):\n\n        int_pts, num_of_inter = ctx.saved_tensors\n        np_int_pts = int_pts.detach().numpy()\n        np_num_of_inter = num_of_inter.detach().numpy()\n        grad_output0 = grad_outputs[0]\n        N = int_pts.shape[0]\n        grad_int_pts = np.zeros((N, 16), dtype=np.float32)\n\n        for iter in range(N):\n            if (np_num_of_inter[iter] > 2):\n                for i in range(np_num_of_inter[iter]):\n                    if i == 0:\n                        for j in range(np_num_of_inter[iter] - 2):\n                            p1 = np_int_pts[iter, 0:2]\n                            p2 = np_int_pts[iter, 2 * j + 2:2 * j + 4]\n                            p3 = np_int_pts[iter, 2 * j + 4:2 * j + 6]\n\n                            if ((p1[0] - p3[0]) * (p2[1] - p3[1]) - (p1[1] - p3[1]) * (p2[0] - p3[0])) > 0:\n                                grad_int_pts[iter, 0] += (p2[1] - p3[1]) * grad_output0[iter] * 0.5\n                                grad_int_pts[iter, 1] += -(p2[0] - p3[0]) * grad_output0[iter] * 0.5\n                            else:\n                                grad_int_pts[iter, 0] += -(p2[1] - p3[1]) * grad_output0[iter] * 0.5\n                                grad_int_pts[iter, 1] += (p2[0] - p3[0]) * grad_output0[iter] * 0.5\n\n                    elif i == 1:\n                        p1 = np_int_pts[iter, 0:2]\n                        p2 = np_int_pts[iter, 2:4]\n                        p3 = np_int_pts[iter, 4:6]\n                        if ((p1[0] - p3[0]) * (p2[1] - p3[1]) - (p1[1] - p3[1]) * (p2[0] - p3[0])) > 0:\n                            grad_int_pts[iter, 2] = -(p1[1] - p3[1]) * grad_output0[iter] * 0.5\n                            grad_int_pts[iter, 3] = (p1[0] - p3[0]) * grad_output0[iter] * 0.5\n                        else:\n                            grad_int_pts[iter, 2] = (p1[1] - p3[1]) * grad_output0[iter] * 0.5\n                            grad_int_pts[iter, 3] = -(p1[0] - p3[0]) * grad_output0[iter] * 
0.5\n\n                    elif i == np_num_of_inter[iter] - 1:\n\n                        p1 = np_int_pts[iter, 2 * (np_num_of_inter[iter] - 2):2 * (np_num_of_inter[iter] - 1)]\n                        p2 = np_int_pts[iter, 2 * (np_num_of_inter[iter] - 1):2 * (np_num_of_inter[iter])]\n                        p3 = np_int_pts[iter, 0:2]\n                        if ((p1[0] - p3[0]) * (p2[1] - p3[1]) - (p1[1] - p3[1]) * (p2[0] - p3[0])) > 0:\n                            grad_int_pts[iter, 2 * (np_num_of_inter[iter] - 1)] = - (p1[1] - p3[1]) * grad_output0[\n                                iter] * 0.5\n                            grad_int_pts[iter, 2 * np_num_of_inter[iter] - 1] = (p1[0] - p3[0]) * grad_output0[\n                                iter] * 0.5\n                        else:\n                            grad_int_pts[iter, 2 * (np_num_of_inter[iter] - 1)] = (p1[1] - p3[1]) * grad_output0[\n                                iter] * 0.5\n                            grad_int_pts[iter, 2 * np_num_of_inter[iter] - 1] = - (p1[0] - p3[0]) * grad_output0[\n                                iter] * 0.5\n                    else:\n                        p1 = np_int_pts[iter, 0:2]\n                        p2 = np_int_pts[iter, 2 * i - 2: 2 * i]\n                        p3 = np_int_pts[iter, 2 * i: 2 * i + 2]\n                        if ((p1[0] - p3[0]) * (p2[1] - p3[1]) - (p1[1] - p3[1]) * (p2[0] - p3[0])) > 0:\n                            grad_int_pts[iter, i * 2] += (- (p2[1] - p3[1]) + (p1[1] - p3[1])) * grad_output0[\n                                iter] * 0.5\n                            grad_int_pts[iter, i * 2 + 1] += (- (p1[0] - p3[0]) + (p2[0] - p3[0])) * grad_output0[\n                                iter] * 0.5\n                        else:\n                            grad_int_pts[iter, i * 2] += ((p2[1] - p3[1]) - (p1[1] - p3[1])) * grad_output0[iter] * 0.5\n                            grad_int_pts[iter, i * 2 + 1] += ((p1[0] - p3[0]) - (p2[0] - p3[0])) * 
grad_output0[\n                                iter] * 0.5\n\n                        p1 = np_int_pts[iter, 0:2]\n                        p2 = np_int_pts[iter, 2 * i: 2 * i + 2]\n                        p3 = np_int_pts[iter, 2 * i + 2: 2 * i + 4]\n                        if ((p1[0] - p3[0]) * (p2[1] - p3[1]) - (p1[1] - p3[1]) * (p2[0] - p3[0])) > 0:\n                            grad_int_pts[iter, i * 2] += - (p1[1] - p3[1]) * grad_output0[iter] * 0.5\n                            grad_int_pts[iter, i * 2 + 1] += (p1[0] - p3[0]) * grad_output0[iter] * 0.5\n                        else:\n                            grad_int_pts[iter, i * 2] += (p1[1] - p3[1]) * grad_output0[iter] * 0.5\n                            grad_int_pts[iter, i * 2 + 1] += -(p1[0] - p3[0]) * grad_output0[iter] * 0.5\n\n        tensor_grad_int_pts = torch.from_numpy(grad_int_pts)\n        # todo: my first addition.\n        # my_add_0 = torch.zeros(tensor_grad_int_pts.shape[0], dtype=torch.float32)\n        #print(\"area_polygon backward\")\n        return tensor_grad_int_pts, None\n\n\n## Transform the (cx, cy, w, l, theta) representation to 4 corners representation\nclass rbbox_to_corners(nn.Module):\n\n    def _init_(self, rbbox):\n        super(rbbox_to_corners, self)._init_()\n        self.rbbox = rbbox\n        return\n\n    def forward(ctx, rbbox):\n        '''\n                    There is no rotation performed here. As axis are aligned.\n                                          ^ [y]\n                                     1 --------- 2\n                                     /          /    --->\n                                    0 -------- 3     [x]\n                    Each node has the coordinate of [x, y]. 
Corresponds to the order of the input.\n                    Output: [N, 8]\n                            [x_0, y_0, x_1, y_1, x_2, y_2, x_3, y_3],\n                            if ry > 0, then rotate clockwise.\n                '''\n\n        assert rbbox.shape[1] == 5\n        device = rbbox.device\n        corners = torch.zeros((rbbox.shape[0], 8), dtype=torch.float32, device=device)\n        dxcos = rbbox[:, 2].mul(torch.cos(rbbox[:, 4])) / 2.0\n        dxsin = rbbox[:, 2].mul(torch.sin(rbbox[:, 4])) / 2.0\n        dycos = rbbox[:, 3].mul(torch.cos(rbbox[:, 4])) / 2.0\n        dysin = rbbox[:, 3].mul(torch.sin(rbbox[:, 4])) / 2.0\n        corners[:, 0] = -dxcos - dysin + rbbox[:, 0]\n        corners[:, 1] = dxsin - dycos + rbbox[:, 1]\n        corners[:, 2] = -dxcos + dysin + rbbox[:, 0]\n        corners[:, 3] = dxsin + dycos + rbbox[:, 1]\n\n        corners[:, 4] = dxcos + dysin + rbbox[:, 0]\n        corners[:, 5] = -dxsin + dycos + rbbox[:, 1]\n        corners[:, 6] = dxcos - dysin + rbbox[:, 0]\n        corners[:, 7] = -dxsin - dycos + rbbox[:, 1]\n        return corners\n\n\nclass rinter_area_compute(nn.Module):\n\n    def _init_(self, corners_gboxes, corners_qboxes):\n        super(rinter_area_compute, self)._init_()\n        self.corners_gboxes = corners_gboxes\n        self.corners_qboxes = corners_qboxes\n        return\n\n    def forward(ctx, corners_gboxes, corners_qboxes):\n        # custom autograd Functions with static forward/backward are invoked through .apply\n        intersections, num_of_intersections = compute_vertex.apply(corners_gboxes, corners_qboxes)\n        num_of_intersections = num_of_intersections.detach()\n        sorted_int_pts = sort_vertex.apply(intersections, num_of_intersections)\n        # x = sorted_int_pts.clone()\n        # x[0, 4:6] = sorted_int_pts[0, 6:8]\n        # x[0, 6:8] = sorted_int_pts[0, 4:6]\n        inter_area = area_polygon.apply(sorted_int_pts, num_of_intersections)\n        return inter_area\n\n\nclass find_convex_hull(Function):\n    # get the minimum bounding box from a set of points, points are reordered with an 
anti-clockwise order.\n    # and those points inside the minimum bbox are removed.\n    @staticmethod\n    def forward(ctx, corners):\n        np_corners = corners.cpu().detach().numpy()\n        hull = ConvexHull(np_corners)\n        M = hull.nsimplex\n        index = hull.vertices\n        hull_points_2d = np.zeros((M, 2), np.float32)\n        for i in range(M):\n            hull_points_2d[i, 0] = np_corners[index[i], 0]\n            hull_points_2d[i, 1] = np_corners[index[i], 1]\n        tensor_hull_points_2d = torch.from_numpy(hull_points_2d).to(torch.device(corners.device))\n        ctx.index = index\n        return tensor_hull_points_2d\n\n    @staticmethod\n    def backward(ctx, *grad_outputs):\n        index = ctx.index\n        grad_output0 = grad_outputs[0]\n        device = grad_output0.device\n        grad_corners = torch.zeros((8, 2), dtype=torch.float32, device=device)\n        for i in range(len(index)):\n            grad_corners[index[i], 0] = grad_output0[i, 0]\n            grad_corners[index[i], 1] = grad_output0[i, 1]\n        return grad_corners\n\n\n## nn Module\nclass mbr_convex_hull(nn.Module):\n    '''\n        Minimum Bounding Rectangle (MBR)\n        Algorithm core: The orientation of the MBR is the same as that of one of the edges of the point cloud convex hull, which means\n        the resulting rectangle must overlap with at least one of the edges.\n    '''\n\n    def _init_(self, hull_points_2d):\n        super(mbr_convex_hull, self)._init_()\n        self.hull_points_2d = hull_points_2d\n        return\n\n    def forward(ctx, hull_points_2d):\n        device = hull_points_2d.device\n        N = hull_points_2d.shape[0]\n        edges = hull_points_2d[1:N, :].add(- hull_points_2d[0:N - 1, :])\n        edge_angles = torch.atan2(edges[:, 1], edges[:, 0])\n        edge_angles = torch.fmod(edge_angles, 3.1415926 / 2.0)\n        edge_angles = torch.abs(edge_angles)\n        # edge_angles = torch.unique(edge_angles)\n        # 
print(\"edge_angles =\", edge_angles)\n        a = torch.stack((torch.cos(edge_angles), torch.cos(edge_angles - 3.1415926 / 2.0)), 1)\n        a = torch.unsqueeze(a, 1)\n        b = torch.stack((torch.cos(edge_angles + 3.1415926 / 2.0), torch.cos(edge_angles)), 1)\n        b = torch.unsqueeze(b, 1)\n        R_tensor = torch.cat((a, b), 1)\n        hull_points_2d_ = torch.unsqueeze(torch.transpose(hull_points_2d, 0, 1), 0)\n        rot_points = R_tensor.matmul(hull_points_2d_)\n        min_x = torch.min(rot_points, 2)[0]\n        max_x = torch.max(rot_points, 2)[0]\n        areas = (max_x[:, 0] - min_x[:, 0]).mul(max_x[:, 1] - min_x[:, 1])\n        return torch.min(areas)\n\n\nclass mbr_area_compute(nn.Module):\n    # get the minimum bounding box from a set of points\n\n    def _init_(self, corners):\n        super(mbr_area_compute, self)._init_()\n        self.corners = corners\n        return\n\n    def forward(ctx, corners):\n        # np_corners = corners.numpy()\n        N = corners.shape[0]\n        # mbr_rect_areas   = torch.zeros((N,), dtype=torch.float32)\n        mbr_rect_area = []\n        for i in range(N):\n            mbr_rect_area.append(torch.zeros((1,), dtype=torch.float32, device=corners.device))\n        # mbr_rect_areas   = torch.zeros((N,), dtype=torch.float32, device = corners_gboxes.device)\n        for iter in range(N):\n            convex_hull_pts = find_convex_hull(corners[iter, :, :].squeeze())\n            mbr_convex_hull_object = mbr_convex_hull()\n            mbr_rect_area[iter] = mbr_convex_hull_object(convex_hull_pts)\n        mbr_rect_areas = torch.stack(mbr_rect_area)  # torch.cat(mbr_rect_area)\n        # ctx.save_for_backward(corners)\n        return mbr_rect_areas\n\n\n\nclass mbr_diag_convex_hull(nn.Module):\n    '''\n        # added by zhengwu\n    '''\n\n    def _init_(self, hull_points_2d):\n        super(mbr_diag_convex_hull, self)._init_()\n        self.hull_points_2d = hull_points_2d\n        return\n\n    def forward(ctx, 
hull_points_2d):\n        device = hull_points_2d.device\n        N = hull_points_2d.shape[0]\n        edges = hull_points_2d[1:N, :].add(- hull_points_2d[0:N - 1, :])\n        edge_angles = torch.atan2(edges[:, 1], edges[:, 0])\n        edge_angles = torch.fmod(edge_angles, 3.1415926 / 2.0)\n        edge_angles = torch.abs(edge_angles)\n        # edge_angles = torch.unique(edge_angles)\n        # print(\"edge_angles =\", edge_angles)\n        a = torch.stack((torch.cos(edge_angles), torch.cos(edge_angles - 3.1415926 / 2.0)), 1)\n        a = torch.unsqueeze(a, 1)\n        b = torch.stack((torch.cos(edge_angles + 3.1415926 / 2.0), torch.cos(edge_angles)), 1)\n        b = torch.unsqueeze(b, 1)\n        R_tensor = torch.cat((a, b), 1)\n        hull_points_2d_ = torch.unsqueeze(torch.transpose(hull_points_2d, 0, 1), 0)\n        rot_points = R_tensor.matmul(hull_points_2d_)\n        min_x = torch.min(rot_points, 2)[0]\n        max_x = torch.max(rot_points, 2)[0]\n        areas = (max_x[:, 0] - min_x[:, 0]).mul(max_x[:, 1] - min_x[:, 1])\n        # modified here\n        min_index = torch.argmin(areas)\n        corner_max, corner_min = max_x[min_index], min_x[min_index]\n        diag = torch.sqrt((corner_max[0] - corner_min[0]) ** 2 + (corner_max[1] - corner_min[1]) ** 2)\n        return diag\n\n\nclass mbr_diag_compute(nn.Module):\n    # added by zhengwu\n    def _init_(self, corners):\n        super(mbr_diag_compute, self)._init_()\n        self.corners = corners\n        return\n\n    def forward(ctx, corners):\n        N = corners.shape[0]\n        mbr_rect_diag = []\n        for iter in range(N):\n            convex_hull_pts = find_convex_hull(corners[iter, :, :].squeeze())\n            mbr_diag_convex_hull_object = mbr_diag_convex_hull()\n            mbr_rect_diag.append(mbr_diag_convex_hull_object(convex_hull_pts))\n        mbr_rect_diags = torch.stack(mbr_rect_diag)\n        return mbr_rect_diags\n\n\nclass _second_box_decode_operation(nn.Module):\n    \"\"\"box 
decode for VoxelNet in lidar\n    Args:\n        boxes ([N, 7] Tensor): normal boxes: x, y, z, w, l, h, r\n        anchors ([N, 7] Tensor): anchors\n    \"\"\"\n\n    # need to convert box_encodings to z-bottom format\n\n    def _init_(self, box_encodings, anchors, encode_angle_to_vector, smooth_dim):\n        super(_second_box_decode_operation, self)._init_()\n        self.box_encodings = box_encodings\n        self.anchors = anchors\n        self.encode_angle_to_vector = False\n        self.smooth_dim = False\n        return\n\n    def forward(ctx, box_encodings, anchors, encode_angle_to_vector, smooth_dim):\n\n        \"\"\"box decode for VoxelNet in lidar\n        Args:\n            boxes ([N, 7] Tensor): normal boxes: x, y, z, w, l, h, r\n            anchors ([N, 7] Tensor): anchors\n        \"\"\"\n        xa, ya, za, wa, la, ha, ra = torch.split(anchors, 1, dim=-1)\n        if encode_angle_to_vector:\n            xt, yt, zt, wt, lt, ht, rtx, rty = torch.split(box_encodings, 1, dim=-1)\n        else:\n            xt, yt, zt, wt, lt, ht, rt = torch.split(box_encodings, 1, dim=-1)\n        # xt, yt, zt, wt, lt, ht, rt = torch.split(box_encodings, 1, dim=-1)\n        za = za + ha / 2.\n        diagonal = torch.sqrt(la ** 2 + wa ** 2)\n        xg = xt * diagonal + xa\n        yg = yt * diagonal + ya\n        zg = zt * ha + za\n        if smooth_dim:\n            lg = (lt + 1) * la\n            wg = (wt + 1) * wa\n            hg = (ht + 1) * ha\n        else:\n\n            lg = torch.exp(lt) * la\n            wg = torch.exp(wt) * wa\n            hg = torch.exp(ht) * ha\n        if encode_angle_to_vector:\n            rax = torch.cos(ra)\n            ray = torch.sin(ra)\n            rgx = rtx + rax\n            rgy = rty + ray\n            rg = torch.atan2(rgy, rgx)\n        else:\n            rg = rt + ra\n        zg = zg - hg / 2.\n        return torch.cat([xg, yg, zg, wg, lg, hg, rg], dim=-1)\n\n\n###################################\n# simplified 
version\n###################################\n\nclass rbbox_corners_aligned(nn.Module):\n\n    def _init_(self, gboxes):\n        super(rbbox_corners_aligned, self)._init_()\n        self.corners_gboxes = gboxes\n        return\n\n    def forward(ctx, gboxes):\n        '''\n            There is no rotation performed here. As axis are aligned.\n                                  ^ [y]\n                             1 --------- 2\n                             /          /    --->\n                            0 -------- 3     [x]\n            Each node has the coordinate of [x, y]. Corresponding the order of input.\n            Output: [N, 2, 4]\n                    [[x_0, x_1, x_2, x_3],\n                     [y_0, y_1, y_2, y_3]].\n        '''\n        N = gboxes.shape[0]\n\n        center_x = gboxes[:, 0]\n        center_y = gboxes[:, 1]\n\n        x_d = gboxes[:, 2]\n        y_d = gboxes[:, 3]\n\n        corners = torch.zeros([N, 2, 4], device=gboxes.device, dtype=torch.float32)\n\n        corners[:, 0, 0] = x_d.mul(-0.5)\n        corners[:, 1, 0] = y_d.mul(-0.5)\n\n        corners[:, 0, 1] = x_d.mul(-0.5)\n        corners[:, 1, 1] = y_d.mul(0.5)\n\n        corners[:, 0, 2] = x_d.mul(0.5)\n        corners[:, 1, 2] = y_d.mul(0.5)\n\n        corners[:, 0, 3] = x_d.mul(0.5)\n        corners[:, 1, 3] = y_d.mul(-0.5)\n\n        b = center_x.unsqueeze(1).repeat(1, 4).unsqueeze(1)\n        c = center_y.unsqueeze(1).repeat(1, 4).unsqueeze(1)\n\n        return (corners + torch.cat((b, c), 1))\n\n\nclass align_inter_aligned(nn.Module):\n\n    def _init_(self, gboxes, qboxes):\n        super(align_inter_aligned, self)._init_()\n        self.gboxes = gboxes\n        self.qboxes = qboxes\n        return\n\n    def forward(ctx, gboxes, qboxes):\n        N = gboxes.shape[0]\n        M = qboxes.shape[0]\n        eps = 1e-5\n        assert N == M\n\n        ## we can project the 3D bounding boxes into 3 different plane\n        ## Notice: ry is not used here.\n\n        ## view1 xoz 
plane\n        inter_area_xoz = torch.zeros((N,), device=gboxes.device, dtype=torch.float32)\n        mbr_area_xoz = torch.zeros((N,), device=gboxes.device, dtype=torch.float32)\n        rbbox_corners_aligned_object = rbbox_corners_aligned()\n        rotated_corners1 = rbbox_corners_aligned_object(gboxes[:, [0, 2, 3, 5, 6]])\n        rotated_corners2 = rbbox_corners_aligned_object(qboxes[:, [0, 2, 3, 5, 6]])\n\n        for i in range(N):\n            iw = (min(rotated_corners1[i, 0, 2], rotated_corners2[i, 0, 2]) -\n                  max(rotated_corners1[i, 0, 1], rotated_corners2[i, 0, 1]) + eps)\n            if (iw > 0):\n                ih = ((min(rotated_corners1[i, 1, 1], rotated_corners2[i, 1, 1]) -\n                       max(rotated_corners1[i, 1, 0], rotated_corners2[i, 1, 0]) + eps))\n                if (ih > 0):\n                    inter_area_xoz[i] = iw * ih\n            iwmbr = (max(rotated_corners1[i, 0, 3], rotated_corners2[i, 0, 3]) -\n                     min(rotated_corners1[i, 0, 0], rotated_corners2[i, 0, 0]) + eps)\n            ihmbr = ((max(rotated_corners1[i, 1, 1], rotated_corners2[i, 1, 1]) -\n                      min(rotated_corners1[i, 1, 0], rotated_corners2[i, 1, 0]) + eps))\n            mbr_area_xoz[i] = iwmbr * ihmbr\n\n        ### view2 xoy plane\n        inter_area_xoy = torch.zeros((N,), device=gboxes.device, dtype=torch.float32)\n        mbr_area_xoy = torch.zeros((N,), device=gboxes.device, dtype=torch.float32)\n        rotated_corners1 = rbbox_corners_aligned_object(gboxes[:, [0, 1, 3, 4, 6]])\n        rotated_corners2 = rbbox_corners_aligned_object(qboxes[:, [0, 1, 3, 4, 6]])\n        for i in range(N):\n            iw = (min(rotated_corners1[i, 0, 2], rotated_corners2[i, 0, 2]) -\n                  max(rotated_corners1[i, 0, 1], rotated_corners2[i, 0, 1]) + eps)\n            if (iw > 0):\n                ih = ((min(rotated_corners1[i, 1, 1], rotated_corners2[i, 1, 1]) -\n                       max(rotated_corners1[i, 1, 0], 
rotated_corners2[i, 1, 0]) + eps))\n                if (ih > 0):\n                    inter_area_xoy[i] = iw * ih\n            iwmbr = (max(rotated_corners1[i, 0, 3], rotated_corners2[i, 0, 3]) -\n                     min(rotated_corners1[i, 0, 0], rotated_corners2[i, 0, 0]) + eps)\n            ihmbr = ((max(rotated_corners1[i, 1, 1], rotated_corners2[i, 1, 1]) -\n                      min(rotated_corners1[i, 1, 0], rotated_corners2[i, 1, 0]) + eps))\n            mbr_area_xoy[i] = iwmbr * ihmbr\n\n        ### view3 yoz plane\n        inter_area_yoz = torch.zeros((N,), device=gboxes.device, dtype=torch.float32)\n        mbr_area_yoz = torch.zeros((N,), device=gboxes.device, dtype=torch.float32)\n        rotated_corners1 = rbbox_corners_aligned_object(gboxes[:, [1, 2, 4, 5, 6]])\n        rotated_corners2 = rbbox_corners_aligned_object(qboxes[:, [1, 2, 4, 5, 6]])\n        for i in range(N):\n            iw = (min(rotated_corners1[i, 0, 2], rotated_corners2[i, 0, 2]) -\n                  max(rotated_corners1[i, 0, 1], rotated_corners2[i, 0, 1]) + eps)\n            if (iw > 0):\n                ih = ((min(rotated_corners1[i, 1, 1], rotated_corners2[i, 1, 1]) -\n                       max(rotated_corners1[i, 1, 0], rotated_corners2[i, 1, 0]) + eps))\n                if (ih > 0):\n                    inter_area_yoz[i] = iw * ih\n            iwmbr = (max(rotated_corners1[i, 0, 3], rotated_corners2[i, 0, 3]) -\n                     min(rotated_corners1[i, 0, 0], rotated_corners2[i, 0, 0]) + eps)\n            ihmbr = ((max(rotated_corners1[i, 1, 1], rotated_corners2[i, 1, 1]) -\n                      min(rotated_corners1[i, 1, 0], rotated_corners2[i, 1, 0]) + eps))\n            mbr_area_yoz[i] = iwmbr * ihmbr\n\n        return inter_area_xoz, mbr_area_xoz, inter_area_xoy, mbr_area_xoy, inter_area_yoz, mbr_area_yoz\n\n\nclass odiou_3D(nn.Module):\n    def _init_(self, gboxes=None, qboxes=None, aligned=False):\n        super(odiou_3D, self)._init_()\n        self.gboxes = 
gboxes\n        self.qboxes = qboxes\n        self.aligned = aligned\n        return\n\n    def forward(ctx, gboxes, qboxes, weights, batch_size):\n        '''\n            gboxes / qboxes: [N, 7], [x, y, z, w, l, h, ry] in velo coord.\n            Notice: (x, y, z) is the real center of bbox.\n        '''\n\n        xa, ya, za, dxa, dya, dza, ra = torch.split(gboxes, 1, dim=-1)\n        gboxes = torch.cat([xa, ya, za, dya, dxa, dza, ra], dim=-1)\n\n        xa1, ya1, za1, dxa1, dya1, dza1, ra1 = torch.split(qboxes, 1, dim=-1)\n        qboxes = torch.cat([xa1, ya1, za1, dya1, dxa1, dza1, ra1], dim=-1)\n\n        assert gboxes.shape[0] == qboxes.shape[0]\n        indicator = torch.gt(gboxes[:, 3], 0) & torch.gt(gboxes[:, 4], 0) & torch.gt(gboxes[:, 5], 0) \\\n                    & torch.gt(qboxes[:, 3], 0) & torch.gt(qboxes[:, 4], 0) & torch.gt(qboxes[:, 5], 0)\n        index_loc = torch.nonzero(indicator)\n        # todo: my addtion to avoid too large number after model initialization.\n        gboxes = torch.clamp(gboxes, -200.0, 200.0)\n        qboxes = torch.clamp(qboxes, -200.0, 200.0)\n        odious = torch.zeros([gboxes.shape[0], ], device=gboxes.device, dtype=torch.float32)\n        if gboxes.shape[0] == 0 or qboxes.shape[0] == 0:\n            return torch.unsqueeze(odious, 1)\n\n        diff_angle = qboxes[:, -1] - gboxes[:, -1]\n        angle_factor = 1.25 * (1.0 - torch.abs(torch.cos(diff_angle)))\n        rbbox_to_corners_object = rbbox_to_corners()\n        corners_gboxes = rbbox_to_corners_object(gboxes[:, [0, 1, 3, 4, 6]])\n        corners_qboxes = rbbox_to_corners_object(qboxes[:, [0, 1, 3, 4, 6]])\n        corners_gboxes_1 = torch.stack((corners_gboxes[:, [0, 2, 4, 6]], corners_gboxes[:, [1, 3, 5, 7]]), 2)\n        corners_qboxes_1 = torch.stack((corners_qboxes[:, [0, 2, 4, 6]], corners_qboxes[:, [1, 3, 5, 7]]), 2)\n        corners_pts = torch.cat((corners_gboxes_1, corners_qboxes_1), 1)\n\n        # compute the inter area\n        
rinter_area_compute_object = rinter_area_compute()\n        inter_area = rinter_area_compute_object(corners_gboxes, corners_qboxes)\n\n        # compute center distance\n        center_dist_square = torch.pow(gboxes[:, 0:3] - qboxes[:, 0:3], 2).sum(-1)\n\n        # compute the mbr bev diag\n        mbr_diag_compute_object = mbr_diag_compute()\n        mbr_diag_bev = mbr_diag_compute_object(corners_pts)\n        inter_h = (torch.min(gboxes[:, 2] + 0.5 * gboxes[:, 5], qboxes[:, 2] + 0.5 * qboxes[:, 5]) -\n                   torch.max(gboxes[:, 2] - 0.5 * gboxes[:, 5], qboxes[:, 2] - 0.5 * qboxes[:, 5]))\n        oniou_h = (torch.max(gboxes[:, 2] + 0.5 * gboxes[:, 5], qboxes[:, 2] + 0.5 * qboxes[:, 5]) -\n                   torch.min(gboxes[:, 2] - 0.5 * gboxes[:, 5], qboxes[:, 2] - 0.5 * qboxes[:, 5]))\n        inter_h[inter_h < 0] = 0\n        mbr_diag_3d_square = mbr_diag_bev**2 + inter_h ** 2 + 1e-7\n\n        volume_gboxes = gboxes[:, 3].mul(gboxes[:, 4]).mul(gboxes[:, 5])\n        volume_qboxes = qboxes[:, 3].mul(qboxes[:, 4]).mul(qboxes[:, 5])\n        inter_area_cuda = inter_area.to(torch.device(gboxes.device))\n        volume_inc = inter_h.mul(inter_area_cuda)\n        volume_union = (volume_gboxes + volume_qboxes - volume_inc)\n        center_dist_square_cuda = center_dist_square.to(torch.device(gboxes.device))\n        mbr_diag_3d_square_cuda = mbr_diag_3d_square.to(torch.device(gboxes.device))\n\n        ious = torch.div(volume_inc, volume_union)\n        dp = torch.div(center_dist_square_cuda[index_loc[:, 0]], mbr_diag_3d_square_cuda[index_loc[:, 0]])\n        odious[index_loc[:, 0]] = 1 - ious[index_loc[:, 0]] + dp + angle_factor\n        batch_ious = odious * weights\n        ious_loss = 2.0 * batch_ious.sum() / batch_size\n        return ious_loss\n\n\ncompute_vertex = compute_vertex.apply\nsort_vertex = sort_vertex.apply\narea_polygon = area_polygon.apply\nfind_convex_hull = find_convex_hull.apply"
  },
  {
    "path": "pcdet/utils/spconv_utils.py",
    "content": "import torch\n\n\ndef scatter_point_inds(indices, point_inds, shape):\n    ret = -1 * torch.ones(*shape, dtype=point_inds.dtype, device=point_inds.device)\n    ndim = indices.shape[-1]\n    flattened_indices = indices.view(-1, ndim)\n    slices = [flattened_indices[:, i] for i in range(ndim)]\n    ret[slices] = point_inds\n    return ret\n\n\ndef generate_voxel2pinds(sparse_tensor):\n    device = sparse_tensor.indices.device\n    batch_size = sparse_tensor.batch_size\n    spatial_shape = sparse_tensor.spatial_shape\n    indices = sparse_tensor.indices.long()\n    point_indices = torch.arange(indices.shape[0], device=device, dtype=torch.int32)\n    output_shape = [batch_size] + list(spatial_shape)\n    v2pinds_tensor = scatter_point_inds(indices, point_indices, output_shape)\n    return v2pinds_tensor\n\ndef generate_voxel2pinds2(batch_size,spatial_shape,indices):\n    indices = indices.long()\n    device = indices.device\n    point_indices = torch.arange(indices.shape[0], device=device, dtype=torch.int32)\n    output_shape = [batch_size] + list(spatial_shape)\n    v2pinds_tensor = scatter_point_inds(indices, point_indices, output_shape)\n    return v2pinds_tensor\n\nfrom typing import Set\n\ntry:\n    import spconv.pytorch as spconv\nexcept:\n    import spconv as spconv\n\nimport torch.nn as nn\n\n\ndef find_all_spconv_keys(model: nn.Module, prefix=\"\") -> Set[str]:\n    \"\"\"\n    Finds all spconv keys that need to have weight's transposed\n    \"\"\"\n    found_keys: Set[str] = set()\n    for name, child in model.named_children():\n        new_prefix = f\"{prefix}.{name}\" if prefix != \"\" else name\n\n        if isinstance(child, spconv.conv.SparseConvolution):\n            new_prefix = f\"{new_prefix}.weight\"\n            found_keys.add(new_prefix)\n\n        found_keys.update(find_all_spconv_keys(child, prefix=new_prefix))\n\n    return found_keys\n\n\ndef replace_feature(out, new_features):\n    if \"replace_feature\" in out.__dir__():\n   
     # spconv 2.x behaviour\n        return out.replace_feature(new_features)\n    else:\n        out.features = new_features\n        return out\n"
  },
  {
    "path": "pcdet/utils/transform_utils.py",
    "content": "import math\nimport torch\n\ntry:\n    from kornia.geometry.conversions import (\n        convert_points_to_homogeneous,\n        convert_points_from_homogeneous,\n    )\nexcept:\n    pass \n    # print('Warning: kornia is not installed. This package is only required by CaDDN')\n\n\ndef project_to_image(project, points):\n    \"\"\"\n    Project points to image\n    Args:\n        project [torch.tensor(..., 3, 4)]: Projection matrix\n        points [torch.Tensor(..., 3)]: 3D points\n    Returns:\n        points_img [torch.Tensor(..., 2)]: Points in image\n        points_depth [torch.Tensor(...)]: Depth of each point\n    \"\"\"\n    # Reshape tensors to expected shape\n    points = convert_points_to_homogeneous(points)\n    points = points.unsqueeze(dim=-1)\n    project = project.unsqueeze(dim=1)\n\n    # Transform points to image and get depths\n    points_t = project @ points\n    points_t = points_t.squeeze(dim=-1)\n    points_img = convert_points_from_homogeneous(points_t)\n    points_depth = points_t[..., -1] - project[..., 2, 3]\n\n    return points_img, points_depth\n\n\ndef normalize_coords(coords, shape):\n    \"\"\"\n    Normalize coordinates of a grid between [-1, 1]\n    Args:\n        coords: (..., 3), Coordinates in grid\n        shape: (3), Grid shape\n    Returns:\n        norm_coords: (.., 3), Normalized coordinates in grid\n    \"\"\"\n    min_n = -1\n    max_n = 1\n    shape = torch.flip(shape, dims=[0])  # Reverse ordering of shape\n\n    # Subtract 1 since pixel indexing from [0, shape - 1]\n    norm_coords = coords / (shape - 1) * (max_n - min_n) + min_n\n    return norm_coords\n\n\ndef bin_depths(depth_map, mode, depth_min, depth_max, num_bins, target=False):\n    \"\"\"\n    Converts depth map into bin indices\n    Args:\n        depth_map: (H, W), Depth Map\n        mode: string, Discretiziation mode (See https://arxiv.org/pdf/2005.13423.pdf for more details)\n            UD: Uniform discretiziation\n            LID: Linear 
increasing discretization\n            SID: Spacing increasing discretization\n        depth_min: float, Minimum depth value\n        depth_max: float, Maximum depth value\n        num_bins: int, Number of depth bins\n        target: bool, Whether the depth bins indices will be used for a target tensor in loss comparison\n    Returns:\n        indices: (H, W), Depth bin indices\n    \"\"\"\n    if mode == \"UD\":\n        bin_size = (depth_max - depth_min) / num_bins\n        indices = ((depth_map - depth_min) / bin_size)\n    elif mode == \"LID\":\n        bin_size = 2 * (depth_max - depth_min) / (num_bins * (1 + num_bins))\n        indices = -0.5 + 0.5 * torch.sqrt(1 + 8 * (depth_map - depth_min) / bin_size)\n    elif mode == \"SID\":\n        indices = num_bins * (torch.log(1 + depth_map) - math.log(1 + depth_min)) / \\\n            (math.log(1 + depth_max) - math.log(1 + depth_min))\n    else:\n        raise NotImplementedError\n\n    if target:\n        # Remove indices outside of bounds\n        mask = (indices < 0) | (indices > num_bins) | (~torch.isfinite(indices))\n        indices[mask] = num_bins\n\n        # Convert to integer\n        indices = indices.type(torch.int64)\n    return indices\n"
  },
  {
    "path": "pcdet/version.py",
    "content": "__version__ = \"0.3.0+0000000\"\n"
  },
  {
    "path": "requirements.txt",
    "content": "numpy\ntorch>=1.1\nnumba\ntensorboardX\neasydict\npyyaml\nscikit-image\ntqdm\n"
  },
  {
    "path": "setup.py",
    "content": "import os\nimport subprocess\n\nfrom setuptools import find_packages, setup\nfrom torch.utils.cpp_extension import BuildExtension, CUDAExtension\n\n\ndef get_git_commit_number():\n    if not os.path.exists('.git'):\n        return '0000000'\n\n    cmd_out = subprocess.run(['git', 'rev-parse', 'HEAD'], stdout=subprocess.PIPE)\n    git_commit_number = cmd_out.stdout.decode('utf-8')[:7]\n    return git_commit_number\n\n\ndef make_cuda_ext(name, module, sources):\n    cuda_ext = CUDAExtension(\n        name='%s.%s' % (module, name),\n        sources=[os.path.join(*module.split('.'), src) for src in sources]\n    )\n    return cuda_ext\n\n\ndef write_version_to_file(version, target_file):\n    with open(target_file, 'w') as f:\n        print('__version__ = \"%s\"' % version, file=f)\n\n\nif __name__ == '__main__':\n    version = '0.3.0+%s' % get_git_commit_number()\n    write_version_to_file(version, 'pcdet/version.py')\n\n    setup(\n        name='pcdet',\n        version=version,\n        description='OpenPCDet is a general codebase for 3D object detection from point cloud',\n        install_requires=[\n            'numpy',\n            'torch>=1.1',\n            'spconv',\n            'numba',\n            'tensorboardX',\n            'easydict',\n            'pyyaml'\n        ],\n        author='Shaoshuai Shi',\n        author_email='shaoshuaics@gmail.com',\n        license='Apache License 2.0',\n        packages=find_packages(exclude=['tools', 'data', 'output']),\n        cmdclass={'build_ext': BuildExtension},\n        ext_modules=[\n            make_cuda_ext(\n                name='votr_ops_cuda',\n                module='pcdet.ops.votr_ops',\n                sources=[\n                    'src/votr_api.cpp',\n                    'src/build_mapping.cpp',\n                    'src/build_mapping_gpu.cu',\n                    'src/build_attention_indices.cpp',\n                    'src/build_attention_indices_gpu.cu',\n                    
'src/group_features.cpp',\n                    'src/group_features_gpu.cu',\n                ],\n            ),\n            make_cuda_ext(\n                name='iou3d_nms_cuda',\n                module='pcdet.ops.iou3d_nms',\n                sources=[\n                    'src/iou3d_cpu.cpp',\n                    'src/iou3d_nms_api.cpp',\n                    'src/iou3d_nms.cpp',\n                    'src/iou3d_nms_kernel.cu',\n                ]\n            ),\n            make_cuda_ext(\n                name='roiaware_pool3d_cuda',\n                module='pcdet.ops.roiaware_pool3d',\n                sources=[\n                    'src/roiaware_pool3d.cpp',\n                    'src/roiaware_pool3d_kernel.cu',\n                ]\n            ),\n            make_cuda_ext(\n                name='roipoint_pool3d_cuda',\n                module='pcdet.ops.roipoint_pool3d',\n                sources=[\n                    'src/roipoint_pool3d.cpp',\n                    'src/roipoint_pool3d_kernel.cu',\n                ]\n            ),\n            make_cuda_ext(\n                name='pointnet2_stack_cuda',\n                module='pcdet.ops.pointnet2.pointnet2_stack',\n                sources=[\n                    'src/pointnet2_api.cpp',\n                    'src/ball_query.cpp',\n                    'src/ball_query_gpu.cu',\n                    'src/group_points.cpp',\n                    'src/group_points_gpu.cu',\n                    'src/sampling.cpp',\n                    'src/sampling_gpu.cu', \n                    'src/interpolate.cpp', \n                    'src/interpolate_gpu.cu',\n                    'src/voxel_query.cpp', \n                    'src/voxel_query_gpu.cu',\n                    'src/ball_query_deform.cpp',\n                    'src/ball_query_deform_gpu.cu',\n                    'src/vector_pool.cpp',\n                    'src/vector_pool_gpu.cu'\n                ],\n            ),\n            make_cuda_ext(\n                
name='pointnet2_batch_cuda',\n                module='pcdet.ops.pointnet2.pointnet2_batch',\n                sources=[\n                    'src/pointnet2_api.cpp',\n                    'src/ball_query.cpp',\n                    'src/ball_query_gpu.cu',\n                    'src/group_points.cpp',\n                    'src/group_points_gpu.cu',\n                    'src/interpolate.cpp',\n                    'src/interpolate_gpu.cu',\n                    'src/sampling.cpp',\n                    'src/sampling_gpu.cu',\n\n                ],\n            ),\n        ],\n    )\n"
  },
  {
    "path": "tools/PENet/CoordConv.py",
    "content": "from __future__ import print_function\n\nimport numpy as np\n\nclass AddCoordsNp():\n\t\"\"\"Add coords to a tensor\"\"\"\n\tdef __init__(self, x_dim=64, y_dim=64, with_r=False):\n\t\tself.x_dim = x_dim\n\t\tself.y_dim = y_dim\n\t\tself.with_r = with_r\n\n\tdef call(self):\n\t\t\"\"\"\n\t\tinput_tensor: (batch, x_dim, y_dim, c)\n\t\t\"\"\"\n\t\t#batch_size_tensor = np.shape(input_tensor)[0]\n\n\t\txx_ones = np.ones([self.x_dim], dtype=np.int32)\n\t\txx_ones = np.expand_dims(xx_ones, 1)\n\n\t\t#print(xx_ones.shape)\n\n\t\txx_range = np.expand_dims(np.arange(self.y_dim), 0)\n\t\t#xx_range = np.expand_dims(xx_range, 1)\n\n\t\t#print(xx_range.shape)\n\n\t\txx_channel = np.matmul(xx_ones, xx_range)\n\t\txx_channel = np.expand_dims(xx_channel, -1)\n\n\t\tyy_ones = np.ones([self.y_dim], dtype=np.int32)\n\t\tyy_ones = np.expand_dims(yy_ones, 0)\n\n\t\t#print(yy_ones.shape)\n\n\t\tyy_range = np.expand_dims(np.arange(self.x_dim), 1)\n\t\t#yy_range = np.expand_dims(yy_range, -1)\n\n\t\t#print(yy_range.shape)\n\n\t\tyy_channel = np.matmul(yy_range, yy_ones)\n\t\tyy_channel = np.expand_dims(yy_channel, -1)\n\n\t\txx_channel = xx_channel.astype('float32') / (self.y_dim - 1)\n\t\tyy_channel = yy_channel.astype('float32') / (self.x_dim - 1)\n\n\t\txx_channel = xx_channel*2 - 1\n\t\tyy_channel = yy_channel*2 - 1\n\t\n\n\t\t#xx_channel = xx_channel.repeat(batch_size_tensor, axis=0)\n\t\t#yy_channel = yy_channel.repeat(batch_size_tensor, axis=0)\n\n\t\tret = np.concatenate([xx_channel, yy_channel], axis=-1)\n\n\t\tif self.with_r:\n\t\t\trr = np.sqrt( np.square(xx_channel-0.5) + np.square(yy_channel-0.5))\n\t\t\tret = np.concatenate([ret, rr], axis=-1)\n\n\t\treturn ret\n"
  },
  {
    "path": "tools/PENet/LICENSE",
    "content": "MIT License\n\nCopyright (c) 2018 Fangchang Ma\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "tools/PENet/basic.py",
    "content": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nimport math\n\ngks = 5\npad = [i for i in range(gks*gks)]\nshift = torch.zeros(gks*gks, 4)\nfor i in range(gks):\n    for j in range(gks):\n        top = i\n        bottom = gks-1-i\n        left = j\n        right = gks-1-j\n        pad[i*gks + j] = torch.nn.ZeroPad2d((left, right, top, bottom))\n        #shift[i*gks + j, :] = torch.tensor([left, right, top, bottom])\nmid_pad = torch.nn.ZeroPad2d(((gks-1)/2, (gks-1)/2, (gks-1)/2, (gks-1)/2))\nzero_pad = pad[0]\n\ngks2 = 3     #guide kernel size\npad2 = [i for i in range(gks2*gks2)]\nshift = torch.zeros(gks2*gks2, 4)\nfor i in range(gks2):\n    for j in range(gks2):\n        top = i\n        bottom = gks2-1-i\n        left = j\n        right = gks2-1-j\n        pad2[i*gks2 + j] = torch.nn.ZeroPad2d((left, right, top, bottom))\n\ngks3 = 7     #guide kernel size\npad3 = [i for i in range(gks3*gks3)]\nshift = torch.zeros(gks3*gks3, 4)\nfor i in range(gks3):\n    for j in range(gks3):\n        top = i\n        bottom = gks3-1-i\n        left = j\n        right = gks3-1-j\n        pad3[i*gks3 + j] = torch.nn.ZeroPad2d((left, right, top, bottom))\n\ndef weights_init(m):\n    # Initialize filters with Gaussian random weights\n    if isinstance(m, nn.Conv2d):\n        n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels\n        m.weight.data.normal_(0, math.sqrt(2. / n))\n        if m.bias is not None:\n            m.bias.data.zero_()\n    elif isinstance(m, nn.ConvTranspose2d):\n        n = m.kernel_size[0] * m.kernel_size[1] * m.in_channels\n        m.weight.data.normal_(0, math.sqrt(2. 
/ n))\n        if m.bias is not None:\n            m.bias.data.zero_()\n    elif isinstance(m, nn.BatchNorm2d):\n        m.weight.data.fill_(1)\n        m.bias.data.zero_()\n\ndef convbnrelu(in_channels, out_channels, kernel_size=3,stride=1, padding=1):\n    return nn.Sequential(\n\t\tnn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding, bias=False),\n\t\tnn.BatchNorm2d(out_channels),\n\t\tnn.ReLU(inplace=True)\n\t)\n\ndef deconvbnrelu(in_channels, out_channels, kernel_size=5, stride=2, padding=2, output_padding=1):\n    return nn.Sequential(\n\t\tnn.ConvTranspose2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding, output_padding=output_padding, bias=False),\n\t\tnn.BatchNorm2d(out_channels),\n\t\tnn.ReLU(inplace=True)\n\t)\n\ndef convbn(in_channels, out_channels, kernel_size=3,stride=1, padding=1):\n    return nn.Sequential(\n\t\tnn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding, bias=False),\n\t\tnn.BatchNorm2d(out_channels)\n\t)\n\ndef deconvbn(in_channels, out_channels, kernel_size=4, stride=2, padding=1, output_padding=0):\n    return nn.Sequential(\n\t\tnn.ConvTranspose2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding, output_padding=output_padding, bias=False),\n\t\tnn.BatchNorm2d(out_channels)\n\t)\n\nclass BasicBlock(nn.Module):\n    expansion = 1\n    __constants__ = ['downsample']\n\n    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,\n                 base_width=64, dilation=1, norm_layer=None):\n        super(BasicBlock, self).__init__()\n        if norm_layer is None:\n            norm_layer = nn.BatchNorm2d\n            #norm_layer = encoding.nn.BatchNorm2d\n        if groups != 1 or base_width != 64:\n            raise ValueError('BasicBlock only supports groups=1 and base_width=64')\n        if dilation > 1:\n            raise NotImplementedError(\"Dilation > 1 not 
supported in BasicBlock\")\n        # Both self.conv1 and self.downsample layers downsample the input when stride != 1\n        self.conv1 = conv3x3(inplanes, planes, stride)\n        self.bn1 = norm_layer(planes)\n        self.relu = nn.ReLU(inplace=True)\n        self.conv2 = conv3x3(planes, planes)\n        self.bn2 = norm_layer(planes)\n        if stride != 1 or inplanes != planes:\n            downsample = nn.Sequential(\n                conv1x1(inplanes, planes, stride),\n                norm_layer(planes),\n            )\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x):\n        identity = x\n\n        out = self.conv1(x)\n        out = self.bn1(out)\n        out = self.relu(out)\n\n        out = self.conv2(out)\n        out = self.bn2(out)\n\n        if self.downsample is not None:\n            identity = self.downsample(x)\n\n        out += identity\n        out = self.relu(out)\n\n        return out\n\ndef conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1, bias=False, padding=1):\n    \"\"\"3x3 convolution with padding\"\"\"\n    if padding >= 1:\n        padding = dilation\n    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,\n                     padding=padding, groups=groups, bias=bias, dilation=dilation)\n\ndef conv1x1(in_planes, out_planes, stride=1, groups=1, bias=False):\n    \"\"\"1x1 convolution\"\"\"\n    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, groups=groups, bias=bias)\n\nclass SparseDownSampleClose(nn.Module):\n    def __init__(self, stride):\n        super(SparseDownSampleClose, self).__init__()\n        self.pooling = nn.MaxPool2d(stride, stride)\n        self.large_number = 600\n    def forward(self, d, mask):\n        encode_d = - (1-mask)*self.large_number - d\n\n        d = - self.pooling(encode_d)\n        mask_result = self.pooling(mask)\n        d_result = d - (1-mask_result)*self.large_number\n\n        return d_result, 
mask_result\n\nclass CSPNGenerate(nn.Module):\n    def __init__(self, in_channels, kernel_size):\n        super(CSPNGenerate, self).__init__()\n        self.kernel_size = kernel_size\n        self.generate = convbn(in_channels, self.kernel_size * self.kernel_size - 1, kernel_size=3, stride=1, padding=1)\n\n    def forward(self, feature):\n\n        guide = self.generate(feature)\n\n        #normalization\n        guide_sum = torch.sum(guide.abs(), dim=1).unsqueeze(1)\n        guide = torch.div(guide, guide_sum)\n        guide_mid = (1 - torch.sum(guide, dim=1)).unsqueeze(1)\n\n        #padding\n        weight_pad = [i for i in range(self.kernel_size * self.kernel_size)]\n        for t in range(self.kernel_size*self.kernel_size):\n            zero_pad = 0\n            if(self.kernel_size==3):\n                zero_pad = pad2[t]\n            elif(self.kernel_size==5):\n                zero_pad = pad[t]\n            elif(self.kernel_size==7):\n                zero_pad = pad3[t]\n            if(t < int((self.kernel_size*self.kernel_size-1)/2)):\n                weight_pad[t] = zero_pad(guide[:, t:t+1, :, :])\n            elif(t > int((self.kernel_size*self.kernel_size-1)/2)):\n                weight_pad[t] = zero_pad(guide[:, t-1:t, :, :])\n            else:\n                weight_pad[t] = zero_pad(guide_mid)\n\n        guide_weight = torch.cat([weight_pad[t] for t in range(self.kernel_size*self.kernel_size)], dim=1)\n        return guide_weight\n\nclass CSPN(nn.Module):\n  def __init__(self, kernel_size):\n      super(CSPN, self).__init__()\n      self.kernel_size = kernel_size\n\n  def forward(self, guide_weight, hn, h0):\n\n        #CSPN\n        half = int(0.5 * (self.kernel_size * self.kernel_size - 1))\n        result_pad = [i for i in range(self.kernel_size * self.kernel_size)]\n        for t in range(self.kernel_size*self.kernel_size):\n            zero_pad = 0\n            if(self.kernel_size==3):\n                zero_pad = pad2[t]\n            
elif(self.kernel_size==5):\n                zero_pad = pad[t]\n            elif(self.kernel_size==7):\n                zero_pad = pad3[t]\n            if(t == half):\n                result_pad[t] = zero_pad(h0)\n            else:\n                result_pad[t] = zero_pad(hn)\n        guide_result = torch.cat([result_pad[t] for t in range(self.kernel_size*self.kernel_size)], dim=1)\n        #guide_result = torch.cat([result0_pad, result1_pad, result2_pad, result3_pad,result4_pad, result5_pad, result6_pad, result7_pad, result8_pad], 1)\n\n        guide_result = torch.sum((guide_weight.mul(guide_result)), dim=1)\n        guide_result = guide_result[:, int((self.kernel_size-1)/2):-int((self.kernel_size-1)/2), int((self.kernel_size-1)/2):-int((self.kernel_size-1)/2)]\n\n        return guide_result.unsqueeze(dim=1)\n\nclass CSPNGenerateAccelerate(nn.Module):\n    def __init__(self, in_channels, kernel_size):\n        super(CSPNGenerateAccelerate, self).__init__()\n        self.kernel_size = kernel_size\n        self.generate = convbn(in_channels, self.kernel_size * self.kernel_size - 1, kernel_size=3, stride=1, padding=1)\n\n    def forward(self, feature):\n\n        guide = self.generate(feature)\n\n        #normalization in standard CSPN\n        #'''\n        guide_sum = torch.sum(guide.abs(), dim=1).unsqueeze(1)\n        guide = torch.div(guide, guide_sum)\n        guide_mid = (1 - torch.sum(guide, dim=1)).unsqueeze(1)\n        #'''\n        #weight_pad = [i for i in range(self.kernel_size * self.kernel_size)]\n\n        half1, half2 = torch.chunk(guide, 2, dim=1)\n        output =  torch.cat((half1, guide_mid, half2), dim=1)\n        return output\n\ndef kernel_trans(kernel, weight):\n    kernel_size = int(math.sqrt(kernel.size()[1]))\n    kernel = F.conv2d(kernel, weight, stride=1, padding=int((kernel_size-1)/2))\n    return kernel\n\nclass CSPNAccelerate(nn.Module):\n    def __init__(self, kernel_size, dilation=1, padding=1, stride=1):\n        
super(CSPNAccelerate, self).__init__()\n        self.kernel_size = kernel_size\n        self.dilation = dilation\n        self.padding = padding\n        self.stride = stride\n\n    def forward(self, kernel, input, input0): #with standard CSPN, an addition input0 port is added\n        bs = input.size()[0]\n        h, w = input.size()[2], input.size()[3]\n        input_im2col = F.unfold(input, self.kernel_size, self.dilation, self.padding, self.stride)\n        kernel = kernel.reshape(bs, self.kernel_size * self.kernel_size, h * w)\n\n        # standard CSPN\n        input0 = input0.view(bs, 1, h * w)\n        mid_index = int((self.kernel_size*self.kernel_size-1)/2)\n        input_im2col[:, mid_index:mid_index+1, :] = input0\n\n        #print(input_im2col.size(), kernel.size())\n        output = torch.einsum('ijk,ijk->ik', (input_im2col, kernel))\n        return output.view(bs, 1, h, w)\n\nclass GeometryFeature(nn.Module):\n    def __init__(self):\n        super(GeometryFeature, self).__init__()\n\n    def forward(self, z, vnorm, unorm, h, w, ch, cw, fh, fw):\n        x = z*(0.5*h*(vnorm+1)-ch)/fh\n        y = z*(0.5*w*(unorm+1)-cw)/fw\n        return torch.cat((x, y, z),1)\n\nclass BasicBlockGeo(nn.Module):\n    expansion = 1\n    __constants__ = ['downsample']\n\n    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,\n                 base_width=64, dilation=1, norm_layer=None, geoplanes=3):\n        super(BasicBlockGeo, self).__init__()\n\n        if norm_layer is None:\n            norm_layer = nn.BatchNorm2d\n            #norm_layer = encoding.nn.BatchNorm2d\n        if groups != 1 or base_width != 64:\n            raise ValueError('BasicBlock only supports groups=1 and base_width=64')\n        if dilation > 1:\n            raise NotImplementedError(\"Dilation > 1 not supported in BasicBlock\")\n        # Both self.conv1 and self.downsample layers downsample the input when stride != 1\n        self.conv1 = conv3x3(inplanes + geoplanes, 
planes, stride)\n        self.bn1 = norm_layer(planes)\n        self.relu = nn.ReLU(inplace=True)\n        self.conv2 = conv3x3(planes+geoplanes, planes)\n        self.bn2 = norm_layer(planes)\n        if stride != 1 or inplanes != planes:\n            downsample = nn.Sequential(\n                conv1x1(inplanes+geoplanes, planes, stride),\n                norm_layer(planes),\n            )\n        self.downsample = downsample\n        self.stride = stride\n\n    def forward(self, x, g1=None, g2=None):\n        identity = x\n        if g1 is not None:\n            x = torch.cat((x, g1), 1)\n        out = self.conv1(x)\n        out = self.bn1(out)\n        out = self.relu(out)\n\n        if g2 is not None:\n            out = torch.cat((g2,out), 1)\n        out = self.conv2(out)\n        out = self.bn2(out)\n\n        if self.downsample is not None:\n            identity = self.downsample(x)\n\n        out += identity\n        out = self.relu(out)\n\n        return out\n\n\n"
  },
  {
    "path": "tools/PENet/criteria.py",
    "content": "import torch\nimport torch.nn as nn\n\nloss_names = ['l1', 'l2']\n\nclass MaskedMSELoss(nn.Module):\n    def __init__(self):\n        super(MaskedMSELoss, self).__init__()\n\n    def forward(self, pred, target):\n        assert pred.dim() == target.dim(), \"inconsistent dimensions\"\n        valid_mask = (target > 0).detach()\n        diff = target - pred\n        diff = diff[valid_mask]\n        self.loss = (diff**2).mean()\n        return self.loss\n\n\nclass MaskedL1Loss(nn.Module):\n    def __init__(self):\n        super(MaskedL1Loss, self).__init__()\n\n    def forward(self, pred, target, weight=None):\n        assert pred.dim() == target.dim(), \"inconsistent dimensions\"\n        valid_mask = (target > 0).detach()\n        diff = target - pred\n        diff = diff[valid_mask]\n        self.loss = diff.abs().mean()\n        return self.loss\n"
  },
  {
    "path": "tools/PENet/dataloaders/calib_cam_to_cam.txt",
    "content": "calib_time: 09-Jan-2012 13:57:47\ncorner_dist: 9.950000e-02\nS_00: 1.392000e+03 5.120000e+02\nK_00: 9.842439e+02 0.000000e+00 6.900000e+02 0.000000e+00 9.808141e+02 2.331966e+02 0.000000e+00 0.000000e+00 1.000000e+00\nD_00: -3.728755e-01 2.037299e-01 2.219027e-03 1.383707e-03 -7.233722e-02\nR_00: 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00\nT_00: 2.573699e-16 -1.059758e-16 1.614870e-16\nS_rect_00: 1.242000e+03 3.750000e+02\nR_rect_00: 9.999239e-01 9.837760e-03 -7.445048e-03 -9.869795e-03 9.999421e-01 -4.278459e-03 7.402527e-03 4.351614e-03 9.999631e-01\nP_rect_00: 7.215377e+02 0.000000e+00 6.095593e+02 0.000000e+00 0.000000e+00 7.215377e+02 1.728540e+02 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00\nS_01: 1.392000e+03 5.120000e+02\nK_01: 9.895267e+02 0.000000e+00 7.020000e+02 0.000000e+00 9.878386e+02 2.455590e+02 0.000000e+00 0.000000e+00 1.000000e+00\nD_01: -3.644661e-01 1.790019e-01 1.148107e-03 -6.298563e-04 -5.314062e-02\nR_01: 9.993513e-01 1.860866e-02 -3.083487e-02 -1.887662e-02 9.997863e-01 -8.421873e-03 3.067156e-02 8.998467e-03 9.994890e-01\nT_01: -5.370000e-01 4.822061e-03 -1.252488e-02\nS_rect_01: 1.242000e+03 3.750000e+02\nR_rect_01: 9.996878e-01 -8.976826e-03 2.331651e-02 8.876121e-03 9.999508e-01 4.418952e-03 -2.335503e-02 -4.210612e-03 9.997184e-01\nP_rect_01: 7.215377e+02 0.000000e+00 6.095593e+02 -3.875744e+02 0.000000e+00 7.215377e+02 1.728540e+02 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00\nS_02: 1.392000e+03 5.120000e+02\nK_02: 9.597910e+02 0.000000e+00 6.960217e+02 0.000000e+00 9.569251e+02 2.241806e+02 0.000000e+00 0.000000e+00 1.000000e+00\nD_02: -3.691481e-01 1.968681e-01 1.353473e-03 5.677587e-04 -6.770705e-02\nR_02: 9.999758e-01 -5.267463e-03 -4.552439e-03 5.251945e-03 9.999804e-01 -3.413835e-03 4.570332e-03 3.389843e-03 9.999838e-01\nT_02: 5.956621e-02 2.900141e-04 2.577209e-03\nS_rect_02: 1.242000e+03 
3.750000e+02\nR_rect_02: 9.998817e-01 1.511453e-02 -2.841595e-03 -1.511724e-02 9.998853e-01 -9.338510e-04 2.827154e-03 9.766976e-04 9.999955e-01\nP_rect_02: 7.215377e+02 0.000000e+00 6.095593e+02 4.485728e+01 0.000000e+00 7.215377e+02 1.728540e+02 2.163791e-01 0.000000e+00 0.000000e+00 1.000000e+00 2.745884e-03\nS_03: 1.392000e+03 5.120000e+02\nK_03: 9.037596e+02 0.000000e+00 6.957519e+02 0.000000e+00 9.019653e+02 2.242509e+02 0.000000e+00 0.000000e+00 1.000000e+00\nD_03: -3.639558e-01 1.788651e-01 6.029694e-04 -3.922424e-04 -5.382460e-02\nR_03: 9.995599e-01 1.699522e-02 -2.431313e-02 -1.704422e-02 9.998531e-01 -1.809756e-03 2.427880e-02 2.223358e-03 9.997028e-01\nT_03: -4.731050e-01 5.551470e-03 -5.250882e-03\nS_rect_03: 1.242000e+03 3.750000e+02\nR_rect_03: 9.998321e-01 -7.193136e-03 1.685599e-02 7.232804e-03 9.999712e-01 -2.293585e-03 -1.683901e-02 2.415116e-03 9.998553e-01\nP_rect_03: 7.215377e+02 0.000000e+00 6.095593e+02 -3.395242e+02 0.000000e+00 7.215377e+02 1.728540e+02 2.199936e+00 0.000000e+00 0.000000e+00 1.000000e+00 2.729905e-03\n"
  },
  {
    "path": "tools/PENet/dataloaders/calibration_kitti.py",
    "content": "import numpy as np\nimport re\n'''\ndef get_calib_from_file(calib_file):\n    with open(calib_file) as f:\n        lines = f.readlines()\n\n    obj = lines[2].strip().split(' ')[1:]\n    P2 = np.array(obj, dtype=np.float32)\n    obj = lines[3].strip().split(' ')[1:]\n    P3 = np.array(obj, dtype=np.float32)\n    obj = lines[4].strip().split(' ')[1:]\n    R0 = np.array(obj, dtype=np.float32)\n    obj = lines[5].strip().split(' ')[1:]\n    Tr_velo_to_cam = np.array(obj, dtype=np.float32)\n\n    return {'P2': P2.reshape(3, 4),\n            'P3': P3.reshape(3, 4),\n            'R0': R0.reshape(3, 3),\n            'Tr_velo2cam': Tr_velo_to_cam.reshape(3, 4)}\n'''\n\ndef get_calib_from_file(filepath):\n    ''' Read in a calibration file and parse into a dictionary.\n    Ref: https://github.com/utiasSTARS/pykitti/blob/master/pykitti/utils.py\n    '''\n\n    data2 = {}\n    R0 = np.array([[ 0.99992624,  0.00965411, -0.0072371 ],\n                                          [-0.00968531,  0.99994343, -0.00433077],\n                                          [ 0.00719491,  0.00440054,  0.99996366]])\n    with open(filepath) as f:\n        for line in f.readlines():\n            if line[:2] == \"P2\":\n                P2 = re.split(\" \", line.strip())\n                P2 = np.array(P2[-12:], np.float32)\n\n            if line[:2] == \"P3\":\n                P3 = re.split(\" \", line.strip())\n                P3 = np.array(P3[-12:], np.float32)\n\n            if line[:14] == \"Tr_velo_to_cam\" or line[:11] == \"Tr_velo_cam\":\n                vtc_mat = re.split(\" \", line.strip())\n                vtc_mat = np.array(vtc_mat[-12:], np.float32)\n\n            if line[:7] == \"R0_rect\" or line[:6] == \"R_rect\":\n                R0 = re.split(\" \", line.strip())\n                R0 = np.array(R0[-9:], np.float32)\n\n    data2[\"P2\"]=P2.reshape(3, 4)\n    data2[\"P3\"]=P3.reshape(3, 4)\n    data2[\"Tr_velo2cam\"]=vtc_mat.reshape(3, 4)\n    
data2[\"R0\"]=R0.reshape(3, 3)\n\n    return data2\n\n\n\nclass Calibration(object):\n    def __init__(self, calib_file):\n        if not isinstance(calib_file, dict):\n            calib = get_calib_from_file(calib_file)\n        else:\n            calib = calib_file\n\n        self.P2 = calib['P2']  # 3 x 4\n        self.R0 = calib['R0']  # 3 x 3\n        self.V2C = calib['Tr_velo2cam']  # 3 x 4\n\n        # Camera intrinsics and extrinsics\n        self.cu = self.P2[0, 2]\n        self.cv = self.P2[1, 2]\n        self.fu = self.P2[0, 0]\n        self.fv = self.P2[1, 1]\n        self.tx = self.P2[0, 3] / (-self.fu)\n        self.ty = self.P2[1, 3] / (-self.fv)\n\n    def cart_to_hom(self, pts):\n        \"\"\"\n        :param pts: (N, 3 or 2)\n        :return pts_hom: (N, 4 or 3)\n        \"\"\"\n        pts_hom = np.hstack((pts, np.ones((pts.shape[0], 1), dtype=np.float32)))\n        return pts_hom\n\n    def rect_to_lidar(self, pts_rect):\n        \"\"\"\n        :param pts_lidar: (N, 3)\n        :return pts_rect: (N, 3)\n        \"\"\"\n        pts_rect_hom = self.cart_to_hom(pts_rect)  # (N, 4)\n        R0_ext = np.hstack((self.R0, np.zeros((3, 1), dtype=np.float32)))  # (3, 4)\n        R0_ext = np.vstack((R0_ext, np.zeros((1, 4), dtype=np.float32)))  # (4, 4)\n        R0_ext[3, 3] = 1\n        V2C_ext = np.vstack((self.V2C, np.zeros((1, 4), dtype=np.float32)))  # (4, 4)\n        V2C_ext[3, 3] = 1\n\n        pts_lidar = np.dot(pts_rect_hom, np.linalg.inv(np.dot(R0_ext, V2C_ext).T))\n        return pts_lidar[:, 0:3]\n\n    def lidar_to_rect(self, pts_lidar):\n        \"\"\"\n        :param pts_lidar: (N, 3)\n        :return pts_rect: (N, 3)\n        \"\"\"\n        pts_lidar_hom = self.cart_to_hom(pts_lidar)\n        pts_rect = np.dot(pts_lidar_hom, np.dot(self.V2C.T, self.R0.T))\n        # pts_rect = reduce(np.dot, (pts_lidar_hom, self.V2C.T, self.R0.T))\n        return pts_rect\n\n    def rect_to_img(self, pts_rect):\n        \"\"\"\n        :param pts_rect: 
(N, 3)\n        :return pts_img: (N, 2)\n        \"\"\"\n        pts_rect_hom = self.cart_to_hom(pts_rect)\n        pts_2d_hom = np.dot(pts_rect_hom, self.P2.T)\n        pts_img = (pts_2d_hom[:, 0:2].T / pts_rect_hom[:, 2]).T  # (N, 2)\n        pts_rect_depth = pts_2d_hom[:, 2] - self.P2.T[3, 2]  # depth in rect camera coord\n        return pts_img, pts_rect_depth\n\n    def lidar_to_img(self, pts_lidar):\n        \"\"\"\n        :param pts_lidar: (N, 3)\n        :return pts_img: (N, 2)\n        \"\"\"\n        pts_rect = self.lidar_to_rect(pts_lidar)\n        pts_img, pts_depth = self.rect_to_img(pts_rect)\n        return pts_img, pts_depth\n\n    def img_to_rect(self, u, v, depth_rect):\n        \"\"\"\n        :param u: (N)\n        :param v: (N)\n        :param depth_rect: (N)\n        :return:\n        \"\"\"\n        x = ((u - self.cu) * depth_rect) / self.fu + self.tx\n        y = ((v - self.cv) * depth_rect) / self.fv + self.ty\n        pts_rect = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1), depth_rect.reshape(-1, 1)), axis=1)\n        return pts_rect\n\n    def corners3d_to_img_boxes(self, corners3d):\n        \"\"\"\n        :param corners3d: (N, 8, 3) corners in rect coordinate\n        :return: boxes: (None, 4) [x1, y1, x2, y2] in rgb coordinate\n        :return: boxes_corner: (None, 8) [xi, yi] in rgb coordinate\n        \"\"\"\n        sample_num = corners3d.shape[0]\n        corners3d_hom = np.concatenate((corners3d, np.ones((sample_num, 8, 1))), axis=2)  # (N, 8, 4)\n\n        img_pts = np.matmul(corners3d_hom, self.P2.T)  # (N, 8, 3)\n\n        x, y = img_pts[:, :, 0] / img_pts[:, :, 2], img_pts[:, :, 1] / img_pts[:, :, 2]\n        x1, y1 = np.min(x, axis=1), np.min(y, axis=1)\n        x2, y2 = np.max(x, axis=1), np.max(y, axis=1)\n\n        boxes = np.concatenate((x1.reshape(-1, 1), y1.reshape(-1, 1), x2.reshape(-1, 1), y2.reshape(-1, 1)), axis=1)\n        boxes_corner = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1)), axis=2)\n\n 
       return boxes, boxes_corner\n"
  },
  {
    "path": "tools/PENet/dataloaders/kitti_loader.py",
    "content": "import os\nimport os.path\nimport glob\nimport fnmatch  # pattern matching\nimport numpy as np\nfrom numpy import linalg as LA\nfrom random import choice\nfrom PIL import Image\nimport torch\nimport torch.utils.data as data\nimport cv2\nfrom dataloaders import transforms\nimport CoordConv\nfrom dataloaders.my_loader import MyLoader\n\ninput_options = ['d', 'rgb', 'rgbd', 'g', 'gd']\noheight, owidth, cwidth = 256, 1216, 1216\ndef load_calib():\n    \"\"\"\n    Temporarily hardcoding the calibration matrix using calib file from 2011_09_26\n    \"\"\"\n    calib = open(\"dataloaders/calib_cam_to_cam.txt\", \"r\")\n    lines = calib.readlines()\n    P_rect_line = lines[25]\n\n    Proj_str = P_rect_line.split(\":\")[1].split(\" \")[1:]\n    Proj = np.reshape(np.array([float(p) for p in Proj_str]),\n                      (3, 4)).astype(np.float32)\n    K = Proj[:3, :3]  # camera matrix\n\n    # note: we will take the center crop of the images during augmentation\n    # that changes the optical centers, but not focal lengths\n    # K[0, 2] = K[0, 2] - 13  # from width = 1242 to 1216, with a 13-pixel cut on both sides\n    # K[1, 2] = K[1, 2] - 11.5  # from width = 375 to 352, with a 11.5-pixel cut on both sides\n    K[0, 2] = K[0, 2] - 13;\n    K[1, 2] = K[1, 2] - 11.5;\n    return K\n\n\ndef get_paths_and_transform(split, args):\n    assert (args.use_d or args.use_rgb\n            or args.use_g), 'no proper input selected'\n\n    if split == \"train\":\n        transform = train_transform\n        # transform = val_transform\n        '''\n        glob_d = os.path.join(\n            args.data_folder,\n            'data_depth_velodyne/train/*_sync/proj_depth/velodyne_raw/image_0[2,3]/*.png'\n        )\n        glob_gt = os.path.join(\n            args.data_folder,\n            'data_depth_annotated/train/*_sync/proj_depth/groundtruth/image_0[2,3]/*.png'\n        )\n\n        def get_rgb_paths(p):\n            ps = p.split('/')\n            date_liststr = 
[]\n            date_liststr.append(ps[-5][:10])\n            # pnew = '/'.join([args.data_folder] + ['data_rgb'] + ps[-6:-4] +\n            #                ps[-2:-1] + ['data'] + ps[-1:])\n            pnew = '/'.join(date_liststr + ps[-5:-4] + ps[-2:-1] + ['data'] + ps[-1:])\n            pnew = os.path.join(args.data_folder_rgb, pnew)\n            return pnew\n        '''\n    elif split == \"val\":\n        if args.val == \"full\":\n            transform = val_transform\n            '''\n            glob_d = os.path.join(\n                args.data_folder,\n                'data_depth_velodyne/val/*_sync/proj_depth/velodyne_raw/image_0[2,3]/*.png'\n            )\n            glob_gt = os.path.join(\n                args.data_folder,\n                'data_depth_annotated/val/*_sync/proj_depth/groundtruth/image_0[2,3]/*.png'\n            )\n\n            def get_rgb_paths(p):\n                ps = p.split('/')\n                date_liststr = []\n                date_liststr.append(ps[-5][:10])\n                # pnew = '/'.join(ps[:-7] +\n                #   ['data_rgb']+ps[-6:-4]+ps[-2:-1]+['data']+ps[-1:])\n                pnew = '/'.join(date_liststr + ps[-5:-4] + ps[-2:-1] + ['data'] + ps[-1:])\n                pnew = os.path.join(args.data_folder_rgb, pnew)\n                return pnew\n            '''\n        elif args.val == \"select\":\n            # transform = no_transform\n            transform = val_transform\n            '''\n            glob_d = os.path.join(\n                args.data_folder,\n                \"data_depth_selection/val_selection_cropped/velodyne_raw/*.png\")\n            glob_gt = os.path.join(\n                args.data_folder,\n                \"data_depth_selection/val_selection_cropped/groundtruth_depth/*.png\"\n            )\n\n            def get_rgb_paths(p):\n                return p.replace(\"groundtruth_depth\", \"image\")\n\n            '''\n    elif split == \"test_completion\":\n        transform = no_transform\n      
  '''\n        glob_d = os.path.join(\n            args.data_folder,\n            \"data_depth_selection/test_depth_completion_anonymous/velodyne_raw/*.png\"\n        )\n        glob_gt = None  # \"test_depth_completion_anonymous/\"\n        glob_rgb = os.path.join(\n            args.data_folder,\n            \"data_depth_selection/test_depth_completion_anonymous/image/*.png\")\n        '''\n    elif split == \"test_prediction\":\n        transform = no_transform\n        '''\n        glob_d = None\n        glob_gt = None  # \"test_depth_completion_anonymous/\"\n        glob_rgb = os.path.join(\n            args.data_folder,\n            \"data_depth_selection/test_depth_prediction_anonymous/image/*.png\")\n        '''\n    else:\n        raise ValueError(\"Unrecognized split \" + str(split))\n    '''\n    if glob_gt is not None:\n        # train or val-full or val-select\n        paths_d = sorted(glob.glob(glob_d))\n        paths_gt = sorted(glob.glob(glob_gt))\n        paths_rgb = [get_rgb_paths(p) for p in paths_gt]\n    else:\n        # test only has d or rgb\n        paths_rgb = sorted(glob.glob(glob_rgb))\n        paths_gt = [None] * len(paths_rgb)\n        if split == \"test_prediction\":\n            paths_d = [None] * len(\n                paths_rgb)  # test_prediction has no sparse depth\n        else:\n            paths_d = sorted(glob.glob(glob_d))\n\n    if len(paths_d) == 0 and len(paths_rgb) == 0 and len(paths_gt) == 0:\n        raise (RuntimeError(\"Found 0 images under {}\".format(glob_gt)))\n    if len(paths_d) == 0 and args.use_d:\n        raise (RuntimeError(\"Requested sparse depth but none was found\"))\n    if len(paths_rgb) == 0 and args.use_rgb:\n        raise (RuntimeError(\"Requested rgb images but none was found\"))\n    if len(paths_rgb) == 0 and args.use_g:\n        raise (RuntimeError(\"Requested gray images but no rgb was found\"))\n    if len(paths_rgb) != len(paths_d) or len(paths_rgb) != len(paths_gt):\n        
print(len(paths_rgb), len(paths_d), len(paths_gt))\n        # for i in range(999):\n        #    print(\"#####\")\n        #    print(paths_rgb[i])\n        #    print(paths_d[i])\n        #    print(paths_gt[i])\n        # raise (RuntimeError(\"Produced different sizes for datasets\"))\n    #paths = {\"rgb\": paths_rgb, \"d\": paths_d, \"gt\": paths_gt}\n    '''\n    paths = None\n    return paths, transform\n\n\ndef rgb_read(filename):\n    assert os.path.exists(filename), \"file not found: {}\".format(filename)\n    img_file = Image.open(filename)\n    # rgb_png = np.array(img_file, dtype=float) / 255.0 # scale pixels to the range [0,1]\n    rgb_png = np.array(img_file, dtype='uint8')  # in the range [0,255]\n    img_file.close()\n    return rgb_png\n\n\ndef depth_read(filename):\n    # loads depth map D from png file\n    # and returns it as a numpy array,\n    # for details see readme.txt\n    assert os.path.exists(filename), \"file not found: {}\".format(filename)\n    img_file = Image.open(filename)\n    depth_png = np.array(img_file, dtype=int)\n    img_file.close()\n    # make sure we have a proper 16bit depth map here.. 
not 8bit!\n    assert np.max(depth_png) > 255, \\\n        \"np.max(depth_png)={}, path={}\".format(np.max(depth_png), filename)\n\n    # np.float was removed in NumPy 1.24; np.float64 is the equivalent dtype\n    depth = depth_png.astype(np.float64) / 256.\n    # depth[depth_png == 0] = -1.\n    depth = np.expand_dims(depth, -1)\n    return depth\n\ndef drop_depth_measurements(depth, prob_keep):\n    mask = np.random.binomial(1, prob_keep, depth.shape)\n    depth *= mask\n    return depth\n\ndef train_transform(rgb, sparse, target, position, args):\n    # s = np.random.uniform(1.0, 1.5) # random scaling\n    # angle = np.random.uniform(-5.0, 5.0) # random rotation degrees\n    oheight = args.val_h\n    owidth = args.val_w\n\n    do_flip = np.random.uniform(0.0, 1.0) < 0.5  # random horizontal flip\n\n    transforms_list = [\n        # transforms.Rotate(angle),\n        # transforms.Resize(s),\n        transforms.BottomCrop((oheight, owidth)),\n        transforms.HorizontalFlip(do_flip)\n    ]\n\n    # if small_training == True:\n    # transforms_list.append(transforms.RandomCrop((rheight, rwidth)))\n\n    transform_geometric = transforms.Compose(transforms_list)\n\n    if sparse is not None:\n        sparse = transform_geometric(sparse)\n    # target may be None (KittiDepth.__getitem__ passes None), so guard it as val_transform does\n    if target is not None:\n        target = transform_geometric(target)\n    if rgb is not None:\n        brightness = np.random.uniform(max(0, 1 - args.jitter),\n                                       1 + args.jitter)\n        contrast = np.random.uniform(max(0, 1 - args.jitter), 1 + args.jitter)\n        saturation = np.random.uniform(max(0, 1 - args.jitter),\n                                       1 + args.jitter)\n        transform_rgb = transforms.Compose([\n            transforms.ColorJitter(brightness, contrast, saturation, 0),\n            transform_geometric\n        ])\n        rgb = transform_rgb(rgb)\n    # sparse = drop_depth_measurements(sparse, 0.9)\n\n    if position is not None:\n        bottom_crop_only = transforms.Compose([transforms.BottomCrop((oheight, owidth))])\n        position = bottom_crop_only(position)\n\n    # 
random crop\n    #if small_training == True:\n    if args.not_random_crop == False:\n        h = oheight\n        w = owidth\n        rheight = args.random_crop_height\n        rwidth = args.random_crop_width\n        # randomlize\n        i = np.random.randint(0, h - rheight + 1)\n        j = np.random.randint(0, w - rwidth + 1)\n\n        if rgb is not None:\n            if rgb.ndim == 3:\n                rgb = rgb[i:i + rheight, j:j + rwidth, :]\n            elif rgb.ndim == 2:\n                rgb = rgb[i:i + rheight, j:j + rwidth]\n\n        if sparse is not None:\n            if sparse.ndim == 3:\n                sparse = sparse[i:i + rheight, j:j + rwidth, :]\n            elif sparse.ndim == 2:\n                sparse = sparse[i:i + rheight, j:j + rwidth]\n\n        if target is not None:\n            if target.ndim == 3:\n                target = target[i:i + rheight, j:j + rwidth, :]\n            elif target.ndim == 2:\n                target = target[i:i + rheight, j:j + rwidth]\n\n        if position is not None:\n            if position.ndim == 3:\n                position = position[i:i + rheight, j:j + rwidth, :]\n            elif position.ndim == 2:\n                position = position[i:i + rheight, j:j + rwidth]\n\n    return rgb, sparse, target, position\n\ndef val_transform(rgb, sparse, target, position, args):\n    oheight = args.val_h\n    owidth = args.val_w\n\n    transform = transforms.Compose([\n        transforms.BottomCrop((oheight, owidth)),\n    ])\n    if rgb is not None:\n        rgb = transform(rgb)\n    if sparse is not None:\n        sparse = transform(sparse)\n    if target is not None:\n        target = transform(target)\n    if position is not None:\n        position = transform(position)\n\n    return rgb, sparse, target, position\n\n\ndef no_transform(rgb, sparse, target, position, args):\n    return rgb, sparse, target, position\n\n\nto_tensor = transforms.ToTensor()\nto_float_tensor = lambda x: to_tensor(x).float()\n\n\ndef 
handle_gray(rgb, args):\n    if rgb is None:\n        return None, None\n    if not args.use_g:\n        return rgb, None\n    else:\n        img = np.array(Image.fromarray(rgb).convert('L'))\n        img = np.expand_dims(img, -1)\n        if not args.use_rgb:\n            rgb_ret = None\n        else:\n            rgb_ret = rgb\n        return rgb_ret, img\n\n\ndef get_rgb_near(path, args):\n    assert path is not None, \"path is None\"\n\n    def extract_frame_id(filename):\n        head, tail = os.path.split(filename)\n        number_string = tail[0:tail.find('.')]\n        number = int(number_string)\n        return head, number\n\n    def get_nearby_filename(filename, new_id):\n        head, _ = os.path.split(filename)\n        new_filename = os.path.join(head, '%010d.png' % new_id)\n        return new_filename\n\n    head, number = extract_frame_id(path)\n    count = 0\n    max_frame_diff = 3\n    candidates = [\n        i - max_frame_diff for i in range(max_frame_diff * 2 + 1)\n        if i - max_frame_diff != 0\n    ]\n    while True:\n        random_offset = choice(candidates)\n        path_near = get_nearby_filename(path, number + random_offset)\n        if os.path.exists(path_near):\n            break\n        # count the failed trial so the loop cannot spin forever\n        count += 1\n        assert count < 20, \"cannot find a nearby frame in 20 trials for {}\".format(path_near)\n\n    return rgb_read(path_near)\n\n\nclass KittiDepth(data.Dataset):\n    \"\"\"A data loader for the Kitti dataset\n    \"\"\"\n\n    def __init__(self, split, args):\n        self.args = args\n        self.split = split\n        paths, transform = get_paths_and_transform(split, args)\n        self.paths = paths\n        self.transform = transform\n        self.K = load_calib()\n        self.threshold_translation = 0.1\n        self.my_loader = MyLoader(args.detpath)\n\n    def __getraw__(self, index):\n        rgb = rgb_read(self.paths['rgb'][index]) if \\\n            (self.paths['rgb'][index] is not None and (self.args.use_rgb or self.args.use_g)) else 
None\n        sparse = depth_read(self.paths['d'][index]) if \\\n            (self.paths['d'][index] is not None and self.args.use_d) else None\n        target = depth_read(self.paths['gt'][index]) if \\\n            self.paths['gt'][index] is not None else None\n        return rgb, sparse, target\n\n    def __getitem__(self, index):\n        rgb, sparse = self.my_loader[index]\n\n        target = None\n        position = CoordConv.AddCoordsNp(self.args.val_h, self.args.val_w)\n        position = position.call()\n        rgb, sparse, target, position = self.transform(rgb, sparse, target, position, self.args)\n\n        rgb, gray = handle_gray(rgb, self.args)\n        # candidates = {\"rgb\": rgb, \"d\": sparse, \"gt\": target, \\\n        #              \"g\": gray, \"r_mat\": r_mat, \"t_vec\": t_vec, \"rgb_near\": rgb_near}\n        candidates = {\"rgb\": rgb, \"d\": sparse, \"gt\": target, \\\n                      \"g\": gray, 'position': position, 'K': self.K}\n\n        items = {\n            key: to_float_tensor(val)\n            for key, val in candidates.items() if val is not None\n        }\n\n        return items\n\n    def __len__(self):\n        return len(self.my_loader)"
  },
  {
    "path": "tools/PENet/dataloaders/my_loader.py",
    "content": "from dataloaders import calibration_kitti\nimport numpy as np\nfrom skimage import io\nimport cv2\nfrom PIL import Image\nimport os\nimport copy\nimport torch\nfrom dataloaders.spconv_utils import replace_feature, spconv\nfrom torch import nn\nimport torch.nn.functional as F\n\nimport torch\nimport numpy as np\ntv = None\ntry:\n    import cumm.tensorview as tv\nexcept:\n    pass\nclass VoxelGeneratorWrapper():\n    def __init__(self, vsize_xyz, coors_range_xyz, num_point_features, max_num_points_per_voxel, max_num_voxels):\n        try:\n            from spconv.utils import VoxelGeneratorV2 as VoxelGenerator\n            self.spconv_ver = 1\n        except:\n            try:\n                from spconv.utils import VoxelGenerator\n                self.spconv_ver = 1\n            except:\n                from spconv.utils import Point2VoxelCPU3d as VoxelGenerator\n                self.spconv_ver = 2\n\n        if self.spconv_ver == 1:\n            self._voxel_generator = VoxelGenerator(\n                voxel_size=vsize_xyz,\n                point_cloud_range=coors_range_xyz,\n                max_num_points=max_num_points_per_voxel,\n                max_voxels=max_num_voxels\n            )\n        else:\n            self._voxel_generator = VoxelGenerator(\n                vsize_xyz=vsize_xyz,\n                coors_range_xyz=coors_range_xyz,\n                num_point_features=num_point_features,\n                max_num_points_per_voxel=max_num_points_per_voxel,\n                max_num_voxels=max_num_voxels\n            )\n\n    def generate(self, points):\n        if self.spconv_ver == 1:\n            voxel_output = self._voxel_generator.generate(points)\n            if isinstance(voxel_output, dict):\n                voxels, coordinates, num_points = \\\n                    voxel_output['voxels'], voxel_output['coordinates'], voxel_output['num_points_per_voxel']\n            else:\n                voxels, coordinates, num_points = 
voxel_output\n        else:\n            assert tv is not None, f\"Unexpected error, library: 'cumm' wasn't imported properly.\"\n            voxel_output = self._voxel_generator.point_to_voxel(tv.from_numpy(points))\n            tv_voxels, tv_coordinates, tv_num_points = voxel_output\n            # make copy with numpy(), since numpy_view() will disappear as soon as the generator is deleted\n            voxels = tv_voxels.numpy()\n            coordinates = tv_coordinates.numpy()\n            num_points = tv_num_points.numpy()\n        return voxels, coordinates, num_points\n\nvoxel_generator = VoxelGeneratorWrapper(\n        vsize_xyz=[200, 0.002, 0.002],\n        coors_range_xyz=[-100,-5,-5,100,5,5],\n        num_point_features=11,\n        max_num_points_per_voxel=100,\n        max_num_voxels=1000000,\n    )\n\n\ndef get_fov_flag(pts_rect, img_shape, calib):\n    \"\"\"\n    Args:\n        pts_rect:\n        img_shape:\n        calib:\n\n    Returns:\n\n    \"\"\"\n    pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)\n    val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])\n    val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])\n    val_flag_merge = np.logical_and(val_flag_1, val_flag_2)\n    pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)\n    return pts_valid_flag\n\ndef load_depth_input(calib, image, points):\n    image = copy.deepcopy(image)\n    pts_rect = calib.lidar_to_rect(points[:, 0:3])\n    fov_flag = get_fov_flag(pts_rect, image.shape, calib)\n    points = points[fov_flag]\n\n    pts_rect = calib.lidar_to_rect(points[:, 0:3])\n    pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)\n\n    val_inds = (pts_img[:, 0] >= 0) & (pts_img[:, 1] >= 0)\n    val_inds = val_inds & (pts_img[:, 0] < image.shape[1]) & (pts_img[:, 1] < image.shape[0])\n\n    pts_img = pts_img[val_inds].astype(np.int32)\n    depth = pts_rect_depth[val_inds]\n\n    new_im = 
np.zeros(shape=image.shape[0:2])\n    new_im[pts_img[:, 1], pts_img[:, 0]] = depth\n    depth = np.expand_dims(new_im, -1)\n    rgb_png = np.array(image, dtype='uint8')\n\n    return rgb_png, depth\n\ndef depth_read(filename):\n    # loads depth map D from png file\n    # and returns it as a numpy array,\n    # for details see readme.txt\n    assert os.path.exists(filename), \"file not found: {}\".format(filename)\n    img_file = Image.open(filename)\n    depth_png = np.array(img_file, dtype=int)\n    img_file.close()\n    # make sure we have a proper 16bit depth map here.. not 8bit!\n    assert np.max(depth_png) > 255, \\\n        \"np.max(depth_png)={}, path={}\".format(np.max(depth_png), filename)\n\n    depth = depth_png.astype(np.float32) / 256.\n    # depth[depth_png == 0] = -1.\n    depth = np.expand_dims(depth, -1)\n\n    return depth\n\ndef depth2points(depth, calib):\n    depth[depth<0.1] = 0\n    uv = depth.nonzero()\n    depth_val = depth[depth>0]\n\n    p_rect = calib.img_to_rect(uv[1], uv[0], depth_val)\n    p_lidar = calib.rect_to_lidar(p_rect)\n\n    return p_lidar\n\ndef depth2pointsrgb(depth, image, calib):\n    depth[depth<0.1] = 0\n    uv = depth.nonzero()\n    depth_val = depth[depth>0]\n\n    new_p = np.zeros(shape=(uv[0].shape[0], 6))\n\n    p_rect = calib.img_to_rect(uv[1], uv[0], depth_val)\n    p_lidar = calib.rect_to_lidar(p_rect)\n    new_p[:, 0:3] = p_lidar\n    new_p[:, 3:] = image[uv[0], uv[1]]\n\n    return new_p\n\ndef to_sphere_coords(points):\n    r = np.linalg.norm(points[:, 0:3], ord=2, axis=-1)\n    theta = np.arccos(points[:, 2]/r)\n    fan = np.arctan(points[:, 1]/points[:, 0])\n\n    new_points = copy.deepcopy(points)\n    new_points[:, 0] = r\n    new_points[:, 1] = theta\n    new_points[:, 2] = fan\n    mask1 = new_points[:, 1]>1.5\n\n    new_points=new_points[mask1]\n    points = points[mask1]\n\n    return new_points, points\n\ndef de_noise(points, vert_res = 0.05, hor_res = 0.05):\n    new_points = 
copy.deepcopy(points)\n\n    sp_coords, new_points = to_sphere_coords(new_points)\n\n    voxel_dict = {}\n\n    for i, point in enumerate(sp_coords):\n\n        vert_coord = point[1]//vert_res\n        hor_coord = point[2]//hor_res\n\n        voxel_key = str(vert_coord)+'_'+str(hor_coord)\n\n        if voxel_key in voxel_dict:\n\n            voxel_dict[voxel_key]['sp'].append(point)\n            voxel_dict[voxel_key]['pts'].append(new_points[i])\n        else:\n            voxel_dict[voxel_key] = {'sp': [point], 'pts': [new_points[i]]}\n\n    sampled_list = []\n\n    for voxel_key in voxel_dict:\n\n        sp = voxel_dict[voxel_key]['pts']\n        if len(sp)<=20:\n            continue\n\n        sampled_list+=sp\n\n    return np.array(sampled_list)\n\ndef la_sampling(points, vert_res = 0.002, hor_res = 0.002):\n    new_points = copy.deepcopy(points)\n\n    sp_coords, new_points = to_sphere_coords(new_points)\n    voxel_dict = {}\n\n    for i, point in enumerate(sp_coords):\n\n        vert_coord = point[1]//vert_res\n        hor_coord = point[2]//hor_res\n\n        voxel_key = str(vert_coord)+'_'+str(hor_coord)\n\n        if voxel_key in voxel_dict:\n\n            voxel_dict[voxel_key]['sp'].append(point)\n            voxel_dict[voxel_key]['pts'].append(new_points[i])\n        else:\n            voxel_dict[voxel_key] = {'sp': [point], 'pts': [new_points[i]]}\n\n    sampled_list = []\n\n    for voxel_key in voxel_dict:\n\n        sp = voxel_dict[voxel_key]['pts'] #N,10\n\n        arg_min = np.argmin(np.array(sp)[:, 0])\n        min_point = voxel_dict[voxel_key]['pts'][arg_min]\n        sampled_list.append(min_point)\n\n    return np.array(sampled_list)\n\ndef la_sampling2(points, vert_res=0.002, hor_res=0.002):\n    new_points = copy.deepcopy(points)\n\n    sp_coords, new_points = to_sphere_coords(new_points)\n\n    cat_points = np.concatenate([sp_coords,new_points[:,0:3]],-1)\n    voxels, coordinates, num_points = voxel_generator.generate(cat_points)\n    finals = 
[]\n    for i,voxel in enumerate(voxels):\n        pt_n = num_points[i]\n        arg_min = np.argmin(np.array(voxel[:pt_n, 10]))\n        finals.append(voxel[arg_min])\n    finals = np.array(finals)\n    return np.concatenate([finals[:, 8:11], finals[:, 3:8]],-1)\n\n\ndef voxel_sampling(point2, res_x=0.05, res_y=0.05, res_z = 0.05):\n\n    min_x = -100\n    min_y = -100\n    min_z = -10\n\n    voxels = {}\n\n    for point in point2:\n        x = point[0]\n        y = point[1]\n        z = point[2]\n\n        x_coord = (x-min_x)//res_x\n        y_coord = (y-min_y)//res_y\n        z_coord = (z-min_z)//res_z\n\n        key = str(x_coord)+'_'+str(y_coord)+'_'+str(z_coord)\n\n        voxels[key] = point\n\n    return np.array(list(voxels.values()))\n\ndef lidar_guied_voxel_sampling(point2, ref_points, res_x=0.2, res_y=0.2, res_z = 0.2):\n\n    min_x = -100\n    min_y = -100\n    min_z = -10\n\n    voxels = {}\n\n    for point in ref_points:\n        x = point[0]\n        y = point[1]\n        z = point[2]\n\n        x_coord = (x-min_x)//res_x\n        y_coord = (y-min_y)//res_y\n        z_coord = (z-min_z)//res_z\n\n        key = str(x_coord)+'_'+str(y_coord)+'_'+str(z_coord)\n\n        voxels[key] = 1\n\n    new_points = []\n    for point in point2:\n        x = point[0]\n        y = point[1]\n        z = point[2]\n\n        x_coord = (x - min_x) // res_x\n        y_coord = (y - min_y) // res_y\n        z_coord = (z - min_z) // res_z\n\n        key = str(x_coord) + '_' + str(y_coord) + '_' + str(z_coord)\n\n        if key in voxels:\n            new_points.append(point)\n\n    return np.array(new_points)\n\ndef lidar_guied_dis_sampling(point2, ref_points, dis = 0.3, res_z = 0.3):\n    # mask on the absolute coordinate so points beyond -100 are caught as well\n    point2[np.abs(point2[:, 0]) > 100] = 100\n    point2[np.abs(point2[:, 1]) > 100] = 100\n    new_points=[]\n    for i, point in enumerate(ref_points):\n        if i%1000==0:\n            print(i)\n        x = point[0]\n        y = point[1]\n        z = point[2]\n        mask_x = 
np.abs(point2[:, 0] - x) < dis\n        mask_y = np.abs(point2[:, 1] - y) < dis\n        mask_z = np.abs(point2[:, 2] - z) < res_z\n\n        mask = mask_x*mask_z*mask_y\n\n        new_points.append(point2[mask])\n\n        point2[mask]=10000\n\n    return np.concatenate(new_points)\n\ndef range_sampling(points2, ref_points, calib, pix_dis_x = 1, pix_dis_y = 7, depth_dis = 0.3):\n    pts_img2, pts_depth2 = calib.lidar_to_img(points2[:, 0:3])\n    ref_img, ref_depth = calib.lidar_to_img(ref_points[:, 0:3])\n\n    pts = np.concatenate([pts_img2, pts_depth2.reshape(pts_img2.shape[0], 1)], -1)\n    ref = np.concatenate([ref_img, ref_depth.reshape(ref_img.shape[0], 1)], -1)\n\n    new_points=[]\n\n    for i, point in enumerate(ref):\n        if i%1000==0:\n            print(i)\n        x = point[0]\n        y = point[1]\n        dis = point[2]\n        mask_x = np.abs(pts[:, 0] - x) < pix_dis_x\n        mask_y = np.abs(pts[:, 1] - y) < pix_dis_y\n        mask_z = np.abs(pts[:, 2] - dis) < depth_dis\n\n        mask = mask_x*mask_z*mask_y\n\n        new_points.append(points2[mask])\n\n        pts[mask]=100000\n\n    return np.concatenate(new_points)\ndef range_sampling_torch(points2, ref_points, calib, pix_dis_x = 4, pix_dis_y = 7, depth_dis = 0.5):\n    pts_img2, pts_depth2 = calib.lidar_to_img(points2[:, 0:3])\n    ref_img, ref_depth = calib.lidar_to_img(ref_points[:, 0:3])\n\n    pts = np.concatenate([pts_img2, pts_depth2.reshape(pts_img2.shape[0], 1)], -1)\n    ref = np.concatenate([ref_img, ref_depth.reshape(ref_img.shape[0], 1)], -1)\n\n    pts_t = torch.from_numpy(pts).cuda()\n\n    mask_all = torch.zeros((points2.shape[0],)).bool().cuda()\n\n    for i, point in enumerate(ref):\n\n        x = point[0]\n        y = point[1]\n        dis = point[2]\n        mask_x = torch.abs(pts_t[:, 0] - x) < pix_dis_x\n        mask_y = torch.abs(pts_t[:, 1] - y) < pix_dis_y\n        mask_z1 = (pts_t[:, 2] - dis) < depth_dis\n        mask_z2 = (pts_t[:, 2] - dis) > 0\n        
mask_z = mask_z1*mask_z2\n\n        mask = mask_x*mask_z*mask_y\n        pts_t[mask] = 100000\n        mask_all+=mask\n\n    return points2[mask_all.cpu().numpy()]\n\ndef depth2pointsrgbp(depth, image, calib, lidar):\n    depth[depth<0.01] = 0\n    uv = depth.nonzero()\n    depth_val = depth[depth>0]\n\n    new_p = np.zeros(shape=(uv[0].shape[0], 8))\n\n    p_rect = calib.img_to_rect(uv[1], uv[0], depth_val)\n    p_lidar = calib.rect_to_lidar(p_rect)\n    new_p[:, 0:3] = p_lidar\n    new_p[:, 4:7] = image[uv[0], uv[1]]/3\n    new_p = new_p[new_p[:, 2] < 1.]\n    new_p = la_sampling2(new_p)\n    new_p[:, -1] = 1\n\n    new_lidar = np.zeros(shape=(lidar.shape[0], 8))\n    new_lidar[:, 0:4] = lidar[:, 0:4]\n    new_lidar[:, 3] *= 10\n    new_lidar[:, -1] = 2\n\n    #new_p = new_p[new_p[:, 2]<1.]\n    #_, new_p = to_sphere_coords(new_p)\n    #new_p = voxel_sampling(new_p)\n    #new_p = range_sampling_torch(new_p, new_lidar, calib)\n\n    all_points = np.concatenate([new_lidar, new_p], 0)\n\n    return all_points\n\nclass MyLoader():\n    def __init__(self, root_path=''):\n        self.root_path = root_path\n        self.file_list = self.include_all_files()\n\n    def include_all_files(self):\n        velo_path = os.path.join(self.root_path, 'velodyne')\n        all_files = os.listdir(velo_path)\n        all_files.sort()\n\n        all_files = [x[0:6] for x in all_files]\n\n        return all_files\n\n    def __len__(self):\n        return len(self.file_list)\n\n    def __getitem__(self, item):\n        file_idx = self.file_list[item]\n        file_image_path = os.path.join(self.root_path, 'image_2', file_idx+'.png')\n        file_velo_path = os.path.join(self.root_path, 'velodyne', file_idx+'.bin')\n        file_calib = os.path.join(self.root_path, 'calib', file_idx+'.txt')\n\n        calib = calibration_kitti.Calibration(file_calib)\n        points = np.fromfile(str(file_velo_path), dtype=np.float32).reshape(-1, 4)\n        image = np.array(io.imread(file_image_path), 
dtype=np.int32)\n        image = image[:352, :1216]\n\n        rgb, depth = load_depth_input(calib, image, points)\n\n        return rgb, depth\n"
  },
  {
    "path": "tools/PENet/dataloaders/spconv_utils.py",
    "content": "import torch\n\n\ndef scatter_point_inds(indices, point_inds, shape):\n    ret = -1 * torch.ones(*shape, dtype=point_inds.dtype, device=point_inds.device)\n    ndim = indices.shape[-1]\n    flattened_indices = indices.view(-1, ndim)\n    slices = [flattened_indices[:, i] for i in range(ndim)]\n    ret[slices] = point_inds\n    return ret\n\n\ndef generate_voxel2pinds(sparse_tensor):\n    device = sparse_tensor.indices.device\n    batch_size = sparse_tensor.batch_size\n    spatial_shape = sparse_tensor.spatial_shape\n    indices = sparse_tensor.indices.long()\n    point_indices = torch.arange(indices.shape[0], device=device, dtype=torch.int32)\n    output_shape = [batch_size] + list(spatial_shape)\n    v2pinds_tensor = scatter_point_inds(indices, point_indices, output_shape)\n    return v2pinds_tensor\n\ndef generate_voxel2pinds2(batch_size,spatial_shape,indices):\n    indices = indices.long()\n    device = indices.device\n    point_indices = torch.arange(indices.shape[0], device=device, dtype=torch.int32)\n    output_shape = [batch_size] + list(spatial_shape)\n    v2pinds_tensor = scatter_point_inds(indices, point_indices, output_shape)\n    return v2pinds_tensor\n\nfrom typing import Set\n\ntry:\n    import spconv.pytorch as spconv\nexcept:\n    import spconv as spconv\n\nimport torch.nn as nn\n\n\ndef find_all_spconv_keys(model: nn.Module, prefix=\"\") -> Set[str]:\n    \"\"\"\n    Finds all spconv keys that need to have weight's transposed\n    \"\"\"\n    found_keys: Set[str] = set()\n    for name, child in model.named_children():\n        new_prefix = f\"{prefix}.{name}\" if prefix != \"\" else name\n\n        if isinstance(child, spconv.conv.SparseConvolution):\n            new_prefix = f\"{new_prefix}.weight\"\n            found_keys.add(new_prefix)\n\n        found_keys.update(find_all_spconv_keys(child, prefix=new_prefix))\n\n    return found_keys\n\n\ndef replace_feature(out, new_features):\n    if \"replace_feature\" in out.__dir__():\n   
     # spconv 2.x behaviour\n        return out.replace_feature(new_features)\n    else:\n        out.features = new_features\n        return out\n"
  },
  {
    "path": "tools/PENet/dataloaders/transforms.py",
    "content": "from __future__ import division\nimport torch\nimport math\nimport random\n\nfrom PIL import Image, ImageOps, ImageEnhance\ntry:\n    import accimage\nexcept ImportError:\n    accimage = None\n\nimport numpy as np\nimport numbers\nimport types\nimport collections\nimport warnings\n\nimport scipy.ndimage.interpolation as itpl\nimport skimage.transform\n\n\ndef _is_numpy_image(img):\n    return isinstance(img, np.ndarray) and (img.ndim in {2, 3})\n\n\ndef _is_pil_image(img):\n    if accimage is not None:\n        return isinstance(img, (Image.Image, accimage.Image))\n    else:\n        return isinstance(img, Image.Image)\n\n\ndef _is_tensor_image(img):\n    return torch.is_tensor(img) and img.ndimension() == 3\n\n\ndef adjust_brightness(img, brightness_factor):\n    \"\"\"Adjust brightness of an Image.\n\n    Args:\n        img (PIL Image): PIL Image to be adjusted.\n        brightness_factor (float):  How much to adjust the brightness. Can be\n            any non negative number. 0 gives a black image, 1 gives the\n            original image while 2 increases the brightness by a factor of 2.\n\n    Returns:\n        PIL Image: Brightness adjusted image.\n    \"\"\"\n    if not _is_pil_image(img):\n        raise TypeError('img should be PIL Image. Got {}'.format(type(img)))\n\n    enhancer = ImageEnhance.Brightness(img)\n    img = enhancer.enhance(brightness_factor)\n    return img\n\n\ndef adjust_contrast(img, contrast_factor):\n    \"\"\"Adjust contrast of an Image.\n\n    Args:\n        img (PIL Image): PIL Image to be adjusted.\n        contrast_factor (float): How much to adjust the contrast. Can be any\n            non negative number. 0 gives a solid gray image, 1 gives the\n            original image while 2 increases the contrast by a factor of 2.\n\n    Returns:\n        PIL Image: Contrast adjusted image.\n    \"\"\"\n    if not _is_pil_image(img):\n        raise TypeError('img should be PIL Image. 
Got {}'.format(type(img)))\n\n    enhancer = ImageEnhance.Contrast(img)\n    img = enhancer.enhance(contrast_factor)\n    return img\n\n\ndef adjust_saturation(img, saturation_factor):\n    \"\"\"Adjust color saturation of an image.\n\n    Args:\n        img (PIL Image): PIL Image to be adjusted.\n        saturation_factor (float):  How much to adjust the saturation. 0 will\n            give a black and white image, 1 will give the original image while\n            2 will enhance the saturation by a factor of 2.\n\n    Returns:\n        PIL Image: Saturation adjusted image.\n    \"\"\"\n    if not _is_pil_image(img):\n        raise TypeError('img should be PIL Image. Got {}'.format(type(img)))\n\n    enhancer = ImageEnhance.Color(img)\n    img = enhancer.enhance(saturation_factor)\n    return img\n\n\ndef adjust_hue(img, hue_factor):\n    \"\"\"Adjust hue of an image.\n\n    The image hue is adjusted by converting the image to HSV and\n    cyclically shifting the intensities in the hue channel (H).\n    The image is then converted back to original image mode.\n\n    `hue_factor` is the amount of shift in H channel and must be in the\n    interval `[-0.5, 0.5]`.\n\n    See https://en.wikipedia.org/wiki/Hue for more details on Hue.\n\n    Args:\n        img (PIL Image): PIL Image to be adjusted.\n        hue_factor (float):  How much to shift the hue channel. Should be in\n            [-0.5, 0.5]. 0.5 and -0.5 give complete reversal of hue channel in\n            HSV space in positive and negative direction respectively.\n            0 means no shift. Therefore, both -0.5 and 0.5 will give an image\n            with complementary colors while 0 gives the original image.\n\n    Returns:\n        PIL Image: Hue adjusted image.\n    \"\"\"\n    if not (-0.5 <= hue_factor <= 0.5):\n        raise ValueError(\n            'hue_factor {} is not in [-0.5, 0.5].'.format(hue_factor))\n\n    if not _is_pil_image(img):\n        raise TypeError('img should be PIL Image. 
Got {}'.format(type(img)))\n\n    input_mode = img.mode\n    if input_mode in {'L', '1', 'I', 'F'}:\n        return img\n\n    h, s, v = img.convert('HSV').split()\n\n    np_h = np.array(h, dtype=np.uint8)\n    # uint8 addition take cares of rotation across boundaries\n    with np.errstate(over='ignore'):\n        np_h += np.uint8(hue_factor * 255)\n    h = Image.fromarray(np_h, 'L')\n\n    img = Image.merge('HSV', (h, s, v)).convert(input_mode)\n    return img\n\n\ndef adjust_gamma(img, gamma, gain=1):\n    \"\"\"Perform gamma correction on an image.\n\n    Also known as Power Law Transform. Intensities in RGB mode are adjusted\n    based on the following equation:\n\n        I_out = 255 * gain * ((I_in / 255) ** gamma)\n\n    See https://en.wikipedia.org/wiki/Gamma_correction for more details.\n\n    Args:\n        img (PIL Image): PIL Image to be adjusted.\n        gamma (float): Non negative real number. gamma larger than 1 make the\n            shadows darker, while gamma smaller than 1 make dark regions\n            lighter.\n        gain (float): The constant multiplier.\n    \"\"\"\n    if not _is_pil_image(img):\n        raise TypeError('img should be PIL Image. 
Got {}'.format(type(img)))\n\n    if gamma < 0:\n        raise ValueError('Gamma should be a non-negative real number')\n\n    input_mode = img.mode\n    img = img.convert('RGB')\n\n    np_img = np.array(img, dtype=np.float32)\n    np_img = 255 * gain * ((np_img / 255)**gamma)\n    np_img = np.uint8(np.clip(np_img, 0, 255))\n\n    img = Image.fromarray(np_img, 'RGB').convert(input_mode)\n    return img\n\n\nclass Compose(object):\n    \"\"\"Composes several transforms together.\n\n    Args:\n        transforms (list of ``Transform`` objects): list of transforms to compose.\n\n    Example:\n        >>> transforms.Compose([\n        >>>     transforms.CenterCrop(10),\n        >>>     transforms.ToTensor(),\n        >>> ])\n    \"\"\"\n    def __init__(self, transforms):\n        self.transforms = transforms\n\n    def __call__(self, img):\n        for t in self.transforms:\n            img = t(img)\n        return img\n\n\nclass ToTensor(object):\n    \"\"\"Convert a ``numpy.ndarray`` to tensor.\n\n    Converts a numpy.ndarray (H x W x C) to a torch.FloatTensor of shape (C x H x W).\n    \"\"\"\n    def __call__(self, img):\n        \"\"\"Convert a ``numpy.ndarray`` to tensor.\n\n        Args:\n            img (numpy.ndarray): Image to be converted to tensor.\n\n        Returns:\n            Tensor: Converted image.\n        \"\"\"\n        if not (_is_numpy_image(img)):\n            raise TypeError('img should be ndarray. Got {}'.format(type(img)))\n\n        if isinstance(img, np.ndarray):\n            # handle numpy array\n            if img.ndim == 3:\n                img = torch.from_numpy(img.transpose((2, 0, 1)).copy())\n            elif img.ndim == 2:\n                img = torch.from_numpy(img.copy())\n            else:\n                raise RuntimeError(\n                    'img should be ndarray with 2 or 3 dimensions. 
Got {}'.\n                    format(img.ndim))\n\n            return img\n\n\nclass NormalizeNumpyArray(object):\n    \"\"\"Normalize a ``numpy.ndarray`` with mean and standard deviation.\n    Given mean: ``(M1,...,Mn)`` and std: ``(M1,..,Mn)`` for ``n`` channels, this transform\n    will normalize each channel of the input ``numpy.ndarray`` i.e.\n    ``input[channel] = (input[channel] - mean[channel]) / std[channel]``\n\n    Args:\n        mean (sequence): Sequence of means for each channel.\n        std (sequence): Sequence of standard deviations for each channel.\n    \"\"\"\n    def __init__(self, mean, std):\n        self.mean = mean\n        self.std = std\n\n    def __call__(self, img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray): Image of size (H, W, C) to be normalized.\n\n        Returns:\n            Tensor: Normalized image.\n        \"\"\"\n        if not (_is_numpy_image(img)):\n            raise TypeError('img should be ndarray. Got {}'.format(type(img)))\n        # TODO: make efficient\n        print(img.shape)\n        for i in range(3):\n            img[:, :, i] = (img[:, :, i] - self.mean[i]) / self.std[i]\n        return img\n\n\nclass NormalizeTensor(object):\n    \"\"\"Normalize an tensor image with mean and standard deviation.\n    Given mean: ``(M1,...,Mn)`` and std: ``(M1,..,Mn)`` for ``n`` channels, this transform\n    will normalize each channel of the input ``torch.*Tensor`` i.e.\n    ``input[channel] = (input[channel] - mean[channel]) / std[channel]``\n\n    Args:\n        mean (sequence): Sequence of means for each channel.\n        std (sequence): Sequence of standard deviations for each channel.\n    \"\"\"\n    def __init__(self, mean, std):\n        self.mean = mean\n        self.std = std\n\n    def __call__(self, tensor):\n        \"\"\"\n        Args:\n            tensor (Tensor): Tensor image of size (C, H, W) to be normalized.\n\n        Returns:\n            Tensor: Normalized Tensor image.\n        
\"\"\"\n        if not _is_tensor_image(tensor):\n            raise TypeError('tensor is not a torch image.')\n        # TODO: make efficient\n        for t, m, s in zip(tensor, self.mean, self.std):\n            t.sub_(m).div_(s)\n        return tensor\n\n\nclass Rotate(object):\n    \"\"\"Rotates the given ``numpy.ndarray``.\n\n    Args:\n        angle (float): The rotation angle in degrees.\n    \"\"\"\n    def __init__(self, angle):\n        self.angle = angle\n\n    def __call__(self, img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray (C x H x W)): Image to be rotated.\n\n        Returns:\n            img (numpy.ndarray (C x H x W)): Rotated image.\n        \"\"\"\n\n        # order=0 means nearest-neighbor type interpolation\n        return skimage.transform.rotate(img, self.angle, resize=False, order=0)\n\n\nclass Resize(object):\n    \"\"\"Resize the the given ``numpy.ndarray`` to the given size.\n    Args:\n        size (sequence or int): Desired output size. If size is a sequence like\n            (h, w), output size will be matched to this. If size is an int,\n            smaller edge of the image will be matched to this number.\n            i.e, if height > width, then image will be rescaled to\n            (size * height / width, size)\n        interpolation (int, optional): Desired interpolation. 
Default is\n            ``'nearest'``\n    \"\"\"\n    def __init__(self, size, interpolation='nearest'):\n        assert isinstance(size, float)\n        self.size = size\n        self.interpolation = interpolation\n\n    def __call__(self, img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray (C x H x W)): Image to be scaled.\n        Returns:\n            img (numpy.ndarray (C x H x W)): Rescaled image.\n        \"\"\"\n        if img.ndim == 3:\n            return skimage.transform.rescale(img, self.size, order=0)\n        elif img.ndim == 2:\n            return skimage.transform.rescale(img, self.size, order=0)\n        else:\n            raise RuntimeError(\n                'img should be ndarray with 2 or 3 dimensions. Got {}'.format(\n                    img.ndim))\n\n\nclass CenterCrop(object):\n    \"\"\"Crops the given ``numpy.ndarray`` at the center.\n\n    Args:\n        size (sequence or int): Desired output size of the crop. If size is an\n            int instead of sequence like (h, w), a square crop (size, size) is\n            made.\n    \"\"\"\n    def __init__(self, size):\n        if isinstance(size, numbers.Number):\n            self.size = (int(size), int(size))\n        else:\n            self.size = size\n\n    @staticmethod\n    def get_params(img, output_size):\n        \"\"\"Get parameters for ``crop`` for center crop.\n\n        Args:\n            img (numpy.ndarray (C x H x W)): Image to be cropped.\n            output_size (tuple): Expected output size of the crop.\n\n        Returns:\n            tuple: params (i, j, h, w) to be passed to ``crop`` for center crop.\n        \"\"\"\n        h = img.shape[0]\n        w = img.shape[1]\n        th, tw = output_size\n        i = int(round((h - th) / 2.))\n        j = int(round((w - tw) / 2.))\n\n        # # randomized cropping\n        # i = np.random.randint(i-3, i+4)\n        # j = np.random.randint(j-3, j+4)\n\n        return i, j, th, tw\n\n    def __call__(self, 
img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray (C x H x W)): Image to be cropped.\n\n        Returns:\n            img (numpy.ndarray (C x H x W)): Cropped image.\n        \"\"\"\n        i, j, h, w = self.get_params(img, self.size)\n        \"\"\"\n        i: Upper pixel coordinate.\n        j: Left pixel coordinate.\n        h: Height of the cropped image.\n        w: Width of the cropped image.\n        \"\"\"\n        if not (_is_numpy_image(img)):\n            raise TypeError('img should be ndarray. Got {}'.format(type(img)))\n        if img.ndim == 3:\n            return img[i:i + h, j:j + w, :]\n        elif img.ndim == 2:\n            return img[i:i + h, j:j + w]\n        else:\n            raise RuntimeError(\n                'img should be ndarray with 2 or 3 dimensions. Got {}'.format(\n                    img.ndim))\n\n\nclass BottomCrop(object):\n    \"\"\"Crops the given ``numpy.ndarray`` at the bottom.\n\n    Args:\n        size (sequence or int): Desired output size of the crop. 
If size is an\n            int instead of sequence like (h, w), a square crop (size, size) is\n            made.\n    \"\"\"\n    def __init__(self, size):\n        if isinstance(size, numbers.Number):\n            self.size = (int(size), int(size))\n        else:\n            self.size = size\n\n    @staticmethod\n    def get_params(img, output_size):\n        \"\"\"Get parameters for ``crop`` for bottom crop.\n\n        Args:\n            img (numpy.ndarray (C x H x W)): Image to be cropped.\n            output_size (tuple): Expected output size of the crop.\n\n        Returns:\n            tuple: params (i, j, h, w) to be passed to ``crop`` for bottom crop.\n        \"\"\"\n        h = img.shape[0]\n        w = img.shape[1]\n        th, tw = output_size\n        i = h - th\n        j = int(round((w - tw) / 2.))\n\n        # randomized left and right cropping\n        # i = np.random.randint(i-3, i+4)\n        # j = np.random.randint(j-1, j+1)\n\n        return i, j, th, tw\n\n    def __call__(self, img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray (C x H x W)): Image to be cropped.\n\n        Returns:\n            img (numpy.ndarray (C x H x W)): Cropped image.\n        \"\"\"\n        i, j, h, w = self.get_params(img, self.size)\n        \"\"\"\n        i: Upper pixel coordinate.\n        j: Left pixel coordinate.\n        h: Height of the cropped image.\n        w: Width of the cropped image.\n        \"\"\"\n        if not (_is_numpy_image(img)):\n            raise TypeError('img should be ndarray. Got {}'.format(type(img)))\n        if img.ndim == 3:\n            return img[i:i + h, j:j + w, :]\n        elif img.ndim == 2:\n            return img[i:i + h, j:j + w]\n        else:\n            raise RuntimeError(\n                'img should be ndarray with 2 or 3 dimensions. 
Got {}'.format(\n                    img.ndim))\n\n\nclass RandomCrop(object):\n    \"\"\"Crops the given ``numpy.ndarray`` at a random location.\n\n    Args:\n        size (sequence or int): Desired output size of the crop. If size is an\n            int instead of sequence like (h, w), a square crop (size, size) is\n            made.\n    \"\"\"\n    def __init__(self, size):\n        if isinstance(size, numbers.Number):\n            self.size = (int(size), int(size))\n        else:\n            self.size = size\n\n    @staticmethod\n    def get_params(img, output_size):\n        \"\"\"Get parameters for ``crop`` for a random crop.\n\n        Args:\n            img (numpy.ndarray (H x W x C)): Image to be cropped.\n            output_size (tuple): Expected output size of the crop.\n\n        Returns:\n            tuple: params (i, j, h, w) to be passed to ``crop`` for a random crop.\n        \"\"\"\n        h = img.shape[0]\n        w = img.shape[1]\n        th, tw = output_size\n\n        # sample the top-left corner uniformly over all valid positions\n        i = np.random.randint(0, h-th+1)\n        j = np.random.randint(0, w-tw+1)\n\n        return i, j, th, tw\n\n    def __call__(self, img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray (H x W x C)): Image to be cropped.\n\n        Returns:\n            img (numpy.ndarray (H x W x C)): Cropped image.\n        \"\"\"\n        i, j, h, w = self.get_params(img, self.size)\n        \"\"\"\n        i: Upper pixel coordinate.\n        j: Left pixel coordinate.\n        h: Height of the cropped image.\n        w: Width of the cropped image.\n        \"\"\"\n        if not (_is_numpy_image(img)):\n            raise TypeError('img should be ndarray. Got {}'.format(type(img)))\n        if img.ndim == 3:\n            return img[i:i + h, j:j + w, :]\n        elif img.ndim == 2:\n            return img[i:i + h, j:j + w]\n        else:\n            raise RuntimeError(\n                'img should be ndarray with 2 or 3 dimensions. 
Got {}'.format(\n                    img.ndim))\n\n\nclass Crop(object):\n    \"\"\"Crops the given ``numpy.ndarray`` to an explicit pixel box.\n\n    Args:\n        crop (sequence): Crop bounds (x_l, x_r, y_b, y_t) in pixels. Columns\n            x_l:x_r and rows y_b:y_t of the image are kept.\n    \"\"\"\n    def __init__(self, crop):\n        self.crop = crop\n\n    @staticmethod\n    def get_params(img, crop):\n        \"\"\"Validate the crop bounds against the image size.\n\n        Args:\n            img (numpy.ndarray (H x W x C)): Image to be cropped.\n            crop (sequence): Crop bounds (x_l, x_r, y_b, y_t) in pixels.\n\n        Returns:\n            tuple: the validated bounds (x_l, x_r, y_b, y_t).\n        \"\"\"\n        x_l, x_r, y_b, y_t = crop\n        h = img.shape[0]\n        w = img.shape[1]\n        assert x_l >= 0 and x_l < w\n        assert x_r >= 0 and x_r < w\n        assert y_b >= 0 and y_b < h\n        assert y_t >= 0 and y_t < h\n        assert x_l < x_r and y_b < y_t\n\n        return x_l, x_r, y_b, y_t\n\n    def __call__(self, img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray (H x W x C)): Image to be cropped.\n\n        Returns:\n            img (numpy.ndarray (H x W x C)): Cropped image.\n        \"\"\"\n        x_l, x_r, y_b, y_t = self.get_params(img, self.crop)\n        \"\"\"\n        x_l: First column kept (inclusive).\n        x_r: Column bound (exclusive).\n        y_b: First row kept (inclusive).\n        y_t: Row bound (exclusive).\n        \"\"\"\n        if not (_is_numpy_image(img)):\n            raise TypeError('img should be ndarray. Got {}'.format(type(img)))\n        if img.ndim == 3:\n            return img[y_b:y_t, x_l:x_r, :]\n        elif img.ndim == 2:\n            return img[y_b:y_t, x_l:x_r]\n        else:\n            raise RuntimeError(\n                'img should be ndarray with 2 or 3 dimensions. 
Got {}'.format(\n                    img.ndim))\n\n\nclass Lambda(object):\n    \"\"\"Apply a user-defined lambda as a transform.\n\n    Args:\n        lambd (function): Lambda/function to be used for transform.\n    \"\"\"\n    def __init__(self, lambd):\n        assert isinstance(lambd, types.LambdaType)\n        self.lambd = lambd\n\n    def __call__(self, img):\n        return self.lambd(img)\n\n\nclass HorizontalFlip(object):\n    \"\"\"Horizontally flip the given ``numpy.ndarray``.\n\n    Args:\n        do_flip (boolean): whether or not do horizontal flip.\n\n    \"\"\"\n    def __init__(self, do_flip):\n        self.do_flip = do_flip\n\n    def __call__(self, img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray (C x H x W)): Image to be flipped.\n\n        Returns:\n            img (numpy.ndarray (C x H x W)): flipped image.\n        \"\"\"\n        if not (_is_numpy_image(img)):\n            raise TypeError('img should be ndarray. Got {}'.format(type(img)))\n\n        if self.do_flip:\n            return np.fliplr(img)\n        else:\n            return img\n\n\nclass ColorJitter(object):\n    \"\"\"Randomly change the brightness, contrast and saturation of an image.\n\n    Args:\n        brightness (float): How much to jitter brightness. brightness_factor\n            is chosen uniformly from [max(0, 1 - brightness), 1 + brightness].\n        contrast (float): How much to jitter contrast. contrast_factor\n            is chosen uniformly from [max(0, 1 - contrast), 1 + contrast].\n        saturation (float): How much to jitter saturation. saturation_factor\n            is chosen uniformly from [max(0, 1 - saturation), 1 + saturation].\n        hue(float): How much to jitter hue. hue_factor is chosen uniformly from\n            [-hue, hue]. 
Should be >=0 and <= 0.5.\n    \"\"\"\n    def __init__(self, brightness=0, contrast=0, saturation=0, hue=0):\n        transforms = []\n        transforms.append(\n            Lambda(lambda img: adjust_brightness(img, brightness)))\n        transforms.append(Lambda(lambda img: adjust_contrast(img, contrast)))\n        transforms.append(\n            Lambda(lambda img: adjust_saturation(img, saturation)))\n        transforms.append(Lambda(lambda img: adjust_hue(img, hue)))\n        np.random.shuffle(transforms)\n        self.transform = Compose(transforms)\n\n    def __call__(self, img):\n        \"\"\"\n        Args:\n            img (numpy.ndarray (C x H x W)): Input image.\n\n        Returns:\n            img (numpy.ndarray (C x H x W)): Color jittered image.\n        \"\"\"\n        if not (_is_numpy_image(img)):\n            raise TypeError('img should be ndarray. Got {}'.format(type(img)))\n\n        pil = Image.fromarray(img)\n        return np.array(self.transform(pil))\n"
  },
  {
    "path": "tools/PENet/helper.py",
    "content": "import math\nimport os, time\nimport shutil\nimport torch\nimport csv\nimport vis_utils\nfrom metrics import Result\n\nfieldnames = [\n    'epoch', 'rmse', 'photo', 'mae', 'irmse', 'imae', 'mse', 'absrel', 'lg10',\n    'silog', 'squared_rel', 'delta1', 'delta2', 'delta3', 'data_time',\n    'gpu_time'\n]\n\n\nclass logger:\n    def __init__(self, args, prepare=True):\n        self.args = args\n        output_directory = get_folder_name(args)\n        self.output_directory = output_directory\n        self.best_result = Result()\n        self.best_result.set_to_worst()\n\n        if not prepare:\n            return\n        if not os.path.exists(output_directory):\n            os.makedirs(output_directory)\n        self.train_csv = os.path.join(output_directory, 'train.csv')\n        self.val_csv = os.path.join(output_directory, 'val.csv')\n        self.best_txt = os.path.join(output_directory, 'best.txt')\n\n        # backup the source code\n        if args.resume == '':\n            print(\"=> creating source code backup ...\")\n            backup_directory = os.path.join(output_directory, \"code_backup\")\n            self.backup_directory = backup_directory\n            backup_source_code(backup_directory)\n            # create new csv files with only header\n            with open(self.train_csv, 'w') as csvfile:\n                writer = csv.DictWriter(csvfile, fieldnames=fieldnames)\n                writer.writeheader()\n            with open(self.val_csv, 'w') as csvfile:\n                writer = csv.DictWriter(csvfile, fieldnames=fieldnames)\n                writer.writeheader()\n            print(\"=> finished creating source code backup.\")\n\n    def conditional_print(self, split, i, epoch, lr, n_set, blk_avg_meter,\n                          avg_meter):\n        if (i + 1) % self.args.print_freq == 0:\n            avg = avg_meter.average()\n            blk_avg = blk_avg_meter.average()\n            print('=> output: 
{}'.format(self.output_directory))\n            print(\n                '{split} Epoch: {0} [{1}/{2}]\\tlr={lr} '\n                't_Data={blk_avg.data_time:.3f}({average.data_time:.3f}) '\n                't_GPU={blk_avg.gpu_time:.3f}({average.gpu_time:.3f})\\n\\t'\n                'RMSE={blk_avg.rmse:.2f}({average.rmse:.2f}) '\n                'MAE={blk_avg.mae:.2f}({average.mae:.2f}) '\n                'iRMSE={blk_avg.irmse:.2f}({average.irmse:.2f}) '\n                'iMAE={blk_avg.imae:.2f}({average.imae:.2f})\\n\\t'\n                'silog={blk_avg.silog:.2f}({average.silog:.2f}) '\n                'squared_rel={blk_avg.squared_rel:.2f}({average.squared_rel:.2f}) '\n                'Delta1={blk_avg.delta1:.3f}({average.delta1:.3f}) '\n                'REL={blk_avg.absrel:.3f}({average.absrel:.3f})\\n\\t'\n                'Lg10={blk_avg.lg10:.3f}({average.lg10:.3f}) '\n                'Photometric={blk_avg.photometric:.3f}({average.photometric:.3f}) '\n                .format(epoch,\n                        i + 1,\n                        n_set,\n                        lr=lr,\n                        blk_avg=blk_avg,\n                        average=avg,\n                        split=split.capitalize()))\n            blk_avg_meter.reset(False)\n\n    def conditional_save_info(self, split, average_meter, epoch):\n        avg = average_meter.average()\n        if split == \"train\":\n            csvfile_name = self.train_csv\n        elif split == \"val\":\n            csvfile_name = self.val_csv\n        elif split == \"eval\":\n            eval_filename = os.path.join(self.output_directory, 'eval.txt')\n            self.save_single_txt(eval_filename, avg, epoch)\n            return avg\n        elif \"test\" in split:\n            return avg\n        else:\n            raise ValueError(\"wrong split provided to logger\")\n        with open(csvfile_name, 'a') as csvfile:\n            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)\n            
writer.writerow({\n                'epoch': epoch,\n                'rmse': avg.rmse,\n                'photo': avg.photometric,\n                'mae': avg.mae,\n                'irmse': avg.irmse,\n                'imae': avg.imae,\n                'mse': avg.mse,\n                'silog': avg.silog,\n                'squared_rel': avg.squared_rel,\n                'absrel': avg.absrel,\n                'lg10': avg.lg10,\n                'delta1': avg.delta1,\n                'delta2': avg.delta2,\n                'delta3': avg.delta3,\n                'gpu_time': avg.gpu_time,\n                'data_time': avg.data_time\n            })\n        return avg\n\n    def save_single_txt(self, filename, result, epoch):\n        with open(filename, 'w') as txtfile:\n            txtfile.write(\n                (\"rank_metric={}\\n\" + \"epoch={}\\n\" + \"rmse={:.3f}\\n\" +\n                 \"mae={:.3f}\\n\" + \"silog={:.3f}\\n\" + \"squared_rel={:.3f}\\n\" +\n                 \"irmse={:.3f}\\n\" + \"imae={:.3f}\\n\" + \"mse={:.3f}\\n\" +\n                 \"absrel={:.3f}\\n\" + \"lg10={:.3f}\\n\" + \"delta1={:.3f}\\n\" +\n                 \"t_gpu={:.4f}\").format(self.args.rank_metric, epoch,\n                                        result.rmse, result.mae, result.silog,\n                                        result.squared_rel, result.irmse,\n                                        result.imae, result.mse, result.absrel,\n                                        result.lg10, result.delta1,\n                                        result.gpu_time))\n\n    def save_best_txt(self, result, epoch):\n        self.save_single_txt(self.best_txt, result, epoch)\n\n    def _get_img_comparison_name(self, mode, epoch, is_best=False):\n        if mode == 'eval':\n            return self.output_directory + '/comparison_eval.png'\n        if mode == 'val':\n            if is_best:\n                return self.output_directory + '/comparison_best.png'\n            else:\n            
    return self.output_directory + '/comparison_' + str(epoch) + '.png'\n\n    def conditional_save_img_comparison(self, mode, i, ele, pred, epoch, predrgb=None, predg=None, extra=None, extra2=None, extrargb=None):\n        # save 8 images for visualization\n        if mode == 'val' or mode == 'eval':\n            skip = 100\n            if i == 0:\n                self.img_merge = vis_utils.merge_into_row(ele, pred, predrgb, predg, extra, extra2, extrargb)\n            elif i % skip == 0 and i < 8 * skip:\n                row = vis_utils.merge_into_row(ele, pred, predrgb, predg, extra, extra2, extrargb)\n                self.img_merge = vis_utils.add_row(self.img_merge, row)\n            elif i == 8 * skip:\n                filename = self._get_img_comparison_name(mode, epoch)\n                vis_utils.save_image(self.img_merge, filename)\n\n    def save_img_comparison_as_best(self, mode, epoch):\n        if mode == 'val':\n            filename = self._get_img_comparison_name(mode, epoch, is_best=True)\n            vis_utils.save_image(self.img_merge, filename)\n\n    def get_ranking_error(self, result):\n        return getattr(result, self.args.rank_metric)\n\n    def rank_conditional_save_best(self, mode, result, epoch):\n        error = self.get_ranking_error(result)\n        best_error = self.get_ranking_error(self.best_result)\n        is_best = error < best_error\n        if is_best and mode == \"val\":\n            self.old_best_result = self.best_result\n            self.best_result = result\n            self.save_best_txt(result, epoch)\n        return is_best\n\n    def conditional_save_pred(self, mode, i, pred, epoch):\n        if (\"test\" in mode or mode == \"eval\") and self.args.save_pred:\n\n            # save images for visualization/ testing\n            image_folder = os.path.join(self.output_directory,\n                                        mode + \"_output\")\n            if not os.path.exists(image_folder):\n                
os.makedirs(image_folder)\n            img = torch.squeeze(pred.data.cpu()).numpy()\n            filename = os.path.join(image_folder, '{0:010d}.png'.format(i))\n            vis_utils.save_depth_as_uint16png(img, filename)\n\n    def conditional_summarize(self, mode, avg, is_best):\n        print(\"\\n*\\nSummary of \", mode, \"round\")\n        print(''\n              'RMSE={average.rmse:.3f}\\n'\n              'MAE={average.mae:.3f}\\n'\n              'Photo={average.photometric:.3f}\\n'\n              'iRMSE={average.irmse:.3f}\\n'\n              'iMAE={average.imae:.3f}\\n'\n              'squared_rel={average.squared_rel}\\n'\n              'silog={average.silog}\\n'\n              'Delta1={average.delta1:.3f}\\n'\n              'REL={average.absrel:.3f}\\n'\n              'Lg10={average.lg10:.3f}\\n'\n              't_GPU={time:.3f}'.format(average=avg, time=avg.gpu_time))\n        if is_best and mode == \"val\":\n            print(\"New best model by %s (was %.3f)\" %\n                  (self.args.rank_metric,\n                   self.get_ranking_error(self.old_best_result)))\n        elif mode == \"val\":\n            print(\"(best %s is %.3f)\" %\n                  (self.args.rank_metric,\n                   self.get_ranking_error(self.best_result)))\n        print(\"*\\n\")\n\n\nignore_hidden = shutil.ignore_patterns(\".\", \"..\", \".git*\", \"*pycache*\",\n                                       \"*build\", \"*.fuse*\", \"*_drive_*\")\n\n\ndef backup_source_code(backup_directory):\n    if os.path.exists(backup_directory):\n        shutil.rmtree(backup_directory)\n    shutil.copytree('.', backup_directory, ignore=ignore_hidden)\n\n\ndef adjust_learning_rate(lr_init, optimizer, epoch, args):\n    \"\"\"Sets the learning rate with a step schedule: lr_init scaled by a fixed factor per epoch range\"\"\"\n    #lr = lr_init * (0.5**(epoch // 5))\n    #'''\n    lr = lr_init\n    if (args.network_model == 'pe' and args.freeze_backbone == False):\n        if (epoch >= 10):\n            lr = lr_init * 0.5\n        if (epoch >= 20):\n            lr = lr_init * 0.1\n        if (epoch >= 30):\n            lr = lr_init * 0.01\n        if (epoch >= 40):\n            lr = lr_init * 0.0005\n        if (epoch >= 50):\n            lr = lr_init * 0.00001\n    else:\n        if (epoch >= 10):\n            lr = lr_init * 0.5\n        if (epoch >= 15):\n            lr = lr_init * 0.1\n        if (epoch >= 25):\n            lr = lr_init * 0.01\n    #'''\n\n    for param_group in optimizer.param_groups:\n        param_group['lr'] = lr\n    return lr\n\ndef save_checkpoint(state, is_best, epoch, output_directory):\n    checkpoint_filename = os.path.join(output_directory,\n                                       'checkpoint-' + str(epoch) + '.pth.tar')\n    torch.save(state, checkpoint_filename)\n    if is_best:\n        best_filename = os.path.join(output_directory, 'model_best.pth.tar')\n        shutil.copyfile(checkpoint_filename, best_filename)\n    if epoch > 0:\n        prev_checkpoint_filename = os.path.join(\n            output_directory, 'checkpoint-' + str(epoch - 1) + '.pth.tar')\n        if os.path.exists(prev_checkpoint_filename):\n            os.remove(prev_checkpoint_filename)\n\n\ndef get_folder_name(args):\n    current_time = time.strftime('%Y-%m-%d@%H-%M')\n    return os.path.join(args.result,\n        'input={}.criterion={}.lr={}.bs={}.wd={}.jitter={}.time={}'.\n        format(args.input, args.criterion, \\\n            args.lr, args.batch_size, args.weight_decay, \\\n            args.jitter, current_time\n            ))\n\n\navgpool = torch.nn.AvgPool2d(kernel_size=2, stride=2).cuda()\n\n\ndef multiscale(img):\n    img1 = avgpool(img)\n    img2 = avgpool(img1)\n    img3 = avgpool(img2)\n    img4 = avgpool(img3)\n    img5 = avgpool(img4)\n    return img5, img4, img3, img2, img1\n"
  },
  {
    "path": "tools/PENet/main.py",
    "content": "import argparse\nimport os\n#os.environ[\"CUDA_VISIBLE_DEVICES\"] = '1'\nimport torch\nimport torch.nn.parallel\nimport torch.optim\nimport torch.utils.data\nimport time\n\nfrom dataloaders.kitti_loader import load_calib, input_options, KittiDepth\nfrom metrics import AverageMeter, Result\nimport criteria\nimport helper\nimport vis_utils\n\nfrom model import ENet\nfrom model import PENet_C1_train\nfrom model import PENet_C2_train\n#from model import PENet_C4_train (Not Implemented)\nfrom model import PENet_C1\nfrom model import PENet_C2\nfrom model import PENet_C4\nimport time\n\nparser = argparse.ArgumentParser(description='Sparse-to-Dense')\nparser.add_argument('-n',\n                    '--network-model',\n                    type=str,\n                    default=\"pe\",\n                    choices=[\"e\", \"pe\"],\n                    help='choose a model: enet or penet'\n                    )\nparser.add_argument('--workers',\n                    default=4,\n                    type=int,\n                    metavar='N',\n                    help='number of data loading workers (default: 4)')\nparser.add_argument('--epochs',\n                    default=100,\n                    type=int,\n                    metavar='N',\n                    help='number of total epochs to run (default: 100)')\nparser.add_argument('--start-epoch',\n                    default=0,\n                    type=int,\n                    metavar='N',\n                    help='manual epoch number (useful on restarts)')\nparser.add_argument('--start-epoch-bias',\n                    default=0,\n                    type=int,\n                    metavar='N',\n                    help='manual epoch number bias(useful on restarts)')\nparser.add_argument('-c',\n                    '--criterion',\n                    metavar='LOSS',\n                    default='l2',\n                    choices=criteria.loss_names,\n                    help='loss function: | 
'.join(criteria.loss_names) +\n                    ' (default: l2)')\nparser.add_argument('-b',\n                    '--batch-size',\n                    default=1,\n                    type=int,\n                    help='mini-batch size (default: 1)')\nparser.add_argument('--lr',\n                    '--learning-rate',\n                    default=1e-3,\n                    type=float,\n                    metavar='LR',\n                    help='initial learning rate (default: 1e-3)')\nparser.add_argument('--weight-decay',\n                    '--wd',\n                    default=1e-6,\n                    type=float,\n                    metavar='W',\n                    help='weight decay (default: 1e-6)')\nparser.add_argument('--print-freq',\n                    '-p',\n                    default=10,\n                    type=int,\n                    metavar='N',\n                    help='print frequency (default: 10)')\nparser.add_argument('--resume',\n                    default='',\n                    type=str,\n                    metavar='PATH',\n                    help='path to latest checkpoint (default: none)')\nparser.add_argument('--data-folder',\n                    default='/data/dataset/kitti_depth/depth',\n                    type=str,\n                    metavar='PATH',\n                    help='data folder (default: /data/dataset/kitti_depth/depth)')\nparser.add_argument('--data-folder-rgb',\n                    default='/data/dataset/kitti_raw',\n                    type=str,\n                    metavar='PATH',\n                    help='data folder rgb (default: /data/dataset/kitti_raw)')\nparser.add_argument('--data-folder-save',\n                    default='',\n                    type=str,\n                    metavar='PATH',\n                    help='data folder test results (default: none)')\nparser.add_argument('--detpath',\n                    default='../../data/kitti/training',\n                    type=str,\n                    metavar='PATH',\n                    
help='data folder of 3D object detection')\nparser.add_argument('-i',\n                    '--input',\n                    type=str,\n                    default='rgbd',\n                    choices=input_options,\n                    help='input: ' + ' | '.join(input_options))\nparser.add_argument('--val',\n                    type=str,\n                    default=\"select\",\n                    choices=[\"select\", \"full\"],\n                    help='full or select validation set')\nparser.add_argument('--jitter',\n                    type=float,\n                    default=0.1,\n                    help='color jitter for images')\nparser.add_argument('--rank-metric',\n                    type=str,\n                    default='rmse',\n                    choices=[m for m in dir(Result()) if not m.startswith('_')],\n                    help='metrics for which best result is saved')\n\nparser.add_argument('-e', '--evaluate', default='pe.pth.tar', type=str, metavar='PATH')\nparser.add_argument('-f', '--freeze-backbone', action=\"store_true\", default=False,\n                    help='freeze parameters in backbone')\nparser.add_argument('--test', action=\"store_true\", default=True,\n                    help='save result kitti test dataset for submission')\nparser.add_argument('--cpu', action=\"store_true\", default=False, help='run on cpu')\n\n#random cropping\nparser.add_argument('--not-random-crop', action=\"store_true\", default=False,\n                    help='prohibit random cropping')\nparser.add_argument('-he', '--random-crop-height', default=320, type=int, metavar='N',\n                    help='random crop height')\nparser.add_argument('-w', '--random-crop-width', default=1216, type=int, metavar='N',\n                    help='random crop width')\n\n#geometric encoding\nparser.add_argument('-co', '--convolutional-layer-encoding', default=\"xyz\", type=str,\n                    choices=[\"std\", \"z\", \"uv\", \"xyz\"],\n                    
help='information concatenated in encoder convolutional layers')\n\n#dilated rate of DA-CSPN++\nparser.add_argument('-d', '--dilation-rate', default=\"2\", type=int,\n                    choices=[1, 2, 4],\n                    help='CSPN++ dilation rate')\n\nargs = parser.parse_args()\nargs.result = os.path.join('..', 'results')\nargs.use_rgb = ('rgb' in args.input)\nargs.use_d = 'd' in args.input\nargs.use_g = 'g' in args.input\nargs.val_h = 352\nargs.val_w = 1216\nprint(args)\n\ncuda = torch.cuda.is_available() and not args.cpu\nif cuda:\n    import torch.backends.cudnn as cudnn\n    cudnn.benchmark = True\n    device = torch.device(\"cuda\")\nelse:\n    device = torch.device(\"cpu\")\nprint(\"=> using '{}' for computation.\".format(device))\n\n# define loss functions\ndepth_criterion = criteria.MaskedMSELoss() if (\n    args.criterion == 'l2') else criteria.MaskedL1Loss()\n\n#multi batch\nmulti_batch_size = 1\ndef iterate(mode, args, loader, model, optimizer, logger, epoch):\n    actual_epoch = epoch - args.start_epoch + args.start_epoch_bias\n\n    block_average_meter = AverageMeter()\n    block_average_meter.reset(False)\n    average_meter = AverageMeter()\n    meters = [block_average_meter, average_meter]\n\n    # switch to appropriate mode\n    assert mode in [\"train\", \"val\", \"eval\", \"test_prediction\", \"test_completion\"], \\\n        \"unsupported mode: {}\".format(mode)\n    if mode == 'train':\n        model.train()\n        lr = helper.adjust_learning_rate(args.lr, optimizer, actual_epoch, args)\n    else:\n        model.eval()\n        lr = 0\n\n    torch.cuda.empty_cache()\n\n    for i, batch_data in enumerate(loader):\n\n        dstart = time.time()\n        batch_data = {\n            key: val.to(device)\n            for key, val in batch_data.items() if val is not None\n        }\n\n        gt = batch_data[\n            'gt'] if mode != 'test_prediction' and mode != 'test_completion' else None\n        data_time = time.time() - dstart\n\n   
     pred = None\n        start = None\n        gpu_time = 0\n\n        #start = time.time()\n        #pred = model(batch_data)\n        #gpu_time = time.time() - start\n\n        #'''\n        if(args.network_model == 'e'):\n            start = time.time()\n            st1_pred, st2_pred, pred = model(batch_data)\n        else:\n            start = time.time()\n            pred = model(batch_data)\n\n        if(args.evaluate):\n            gpu_time = time.time() - start\n        #'''\n\n        depth_loss, photometric_loss, smooth_loss, mask = 0, 0, 0, None\n\n        # inter loss_param\n        st1_loss, st2_loss, loss = 0, 0, 0\n        w_st1, w_st2 = 0, 0\n        round1, round2, round3 = 1, 3, None\n        if(actual_epoch <= round1):\n            w_st1, w_st2 = 0.2, 0.2\n        elif(actual_epoch <= round2):\n            w_st1, w_st2 = 0.05, 0.05\n        else:\n            w_st1, w_st2 = 0, 0\n\n        if mode == 'train':\n            # Loss 1: the direct depth supervision from ground truth label\n            # mask=1 indicates that a pixel does not ground truth labels\n            depth_loss = depth_criterion(pred, gt)\n\n            if args.network_model == 'e':\n                st1_loss = depth_criterion(st1_pred, gt)\n                st2_loss = depth_criterion(st2_pred, gt)\n                loss = (1 - w_st1 - w_st2) * depth_loss + w_st1 * st1_loss + w_st2 * st2_loss\n            else:\n                loss = depth_loss\n\n            if i % multi_batch_size == 0:\n                optimizer.zero_grad()\n            loss.backward()\n\n            if i % multi_batch_size == (multi_batch_size-1) or i==(len(loader)-1):\n                optimizer.step()\n            print(\"loss:\", loss, \" epoch:\", epoch, \" \", i, \"/\", len(loader))\n\n        if mode == \"test_completion\":\n\n            vis_utils.save_depth_as_points(pred, i, args.detpath)\n\n        if(not args.evaluate):\n            gpu_time = time.time() - start\n        # measure accuracy and 
record loss\n        with torch.no_grad():\n            mini_batch_size = next(iter(batch_data.values())).size(0)\n            result = Result()\n            if mode != 'test_prediction' and mode != 'test_completion':\n                result.evaluate(pred.data, gt.data, photometric_loss)\n                [\n                    m.update(result, gpu_time, data_time, mini_batch_size)\n                    for m in meters\n                ]\n\n                if mode != 'train':\n                    logger.conditional_print(mode, i, epoch, lr, len(loader),\n                                     block_average_meter, average_meter)\n                logger.conditional_save_img_comparison(mode, i, batch_data, pred,\n                                                   epoch)\n                logger.conditional_save_pred(mode, i, pred, epoch)\n        end_time = time.time()-dstart\n        print('iter: ', i,'  ',  'remain time:', (len(loader)-i)*end_time//60, 'min')\n    avg = logger.conditional_save_info(mode, average_meter, epoch)\n    is_best = logger.rank_conditional_save_best(mode, avg, epoch)\n    if is_best and not (mode == \"train\"):\n        logger.save_img_comparison_as_best(mode, epoch)\n    logger.conditional_summarize(mode, avg, is_best)\n\n    return avg, is_best\n\ndef main():\n    global args\n    checkpoint = None\n    is_eval = False\n    if args.evaluate:\n        args_new = args\n        if os.path.isfile(args.evaluate):\n            print(\"=> loading checkpoint '{}' ... 
\".format(args.evaluate),\n                  end='')\n            checkpoint = torch.load(args.evaluate, map_location=device)\n            #args = checkpoint['args']\n            args.start_epoch = checkpoint['epoch'] + 1\n            args.data_folder = args_new.data_folder\n            args.val = args_new.val\n            is_eval = True\n\n            print(\"Completed.\")\n        else:\n            is_eval = True\n            print(\"No model found at '{}'\".format(args.evaluate))\n            #return\n\n    elif args.resume:  # optionally resume from a checkpoint\n        args_new = args\n        if os.path.isfile(args.resume):\n            print(\"=> loading checkpoint '{}' ... \".format(args.resume),\n                  end='')\n            checkpoint = torch.load(args.resume, map_location=device)\n\n            args.start_epoch = checkpoint['epoch'] + 1\n            args.data_folder = args_new.data_folder\n            args.val = args_new.val\n            print(\"Completed. Resuming from epoch {}.\".format(\n                checkpoint['epoch']))\n        else:\n            print(\"No checkpoint found at '{}'\".format(args.resume))\n            return\n\n    print(\"=> creating model and optimizer ... 
\", end='')\n    model = None\n    penet_accelerated = False\n    if (args.network_model == 'e'):\n        model = ENet(args).to(device)\n    elif (is_eval == False):\n        if (args.dilation_rate == 1):\n            model = PENet_C1_train(args).to(device)\n        elif (args.dilation_rate == 2):\n            model = PENet_C2_train(args).to(device)\n        elif (args.dilation_rate == 4):\n            model = PENet_C4(args).to(device)\n            penet_accelerated = True\n    else:\n        if (args.dilation_rate == 1):\n            model = PENet_C1(args).to(device)\n            penet_accelerated = True\n        elif (args.dilation_rate == 2):\n            model = PENet_C2(args).to(device)\n            penet_accelerated = True\n        elif (args.dilation_rate == 4):\n            model = PENet_C4(args).to(device)\n            penet_accelerated = True\n\n    if (penet_accelerated == True):\n        model.encoder3.requires_grad = False\n        model.encoder5.requires_grad = False\n        model.encoder7.requires_grad = False\n\n    model_named_params = None\n    model_bone_params = None\n    model_new_params = None\n    optimizer = None\n\n    if checkpoint is not None:\n        #print(checkpoint.keys())\n        if (args.freeze_backbone == True):\n            model.backbone.load_state_dict(checkpoint['model'])\n        else:\n            model.load_state_dict(checkpoint['model'], strict=False)\n        #optimizer.load_state_dict(checkpoint['optimizer'])\n        print(\"=> checkpoint state loaded.\")\n\n    logger = helper.logger(args)\n    if checkpoint is not None:\n        logger.best_result = checkpoint['best_result']\n        del checkpoint\n    print(\"=> logger created.\")\n\n    test_dataset = None\n    test_loader = None\n    if (args.test):\n        test_dataset = KittiDepth('test_completion', args)\n        test_loader = torch.utils.data.DataLoader(\n            test_dataset,\n            batch_size=1,\n            shuffle=False,\n            
num_workers=1,\n            pin_memory=True)\n        iterate(\"test_completion\", args, test_loader, model, None, logger, 0)\n        return\n\n    val_dataset = KittiDepth('val', args)\n    val_loader = torch.utils.data.DataLoader(\n        val_dataset,\n        batch_size=1,\n        shuffle=False,\n        num_workers=2,\n        pin_memory=True)  # set batch size to be 1 for validation\n    print(\"\\t==> val_loader size:{}\".format(len(val_loader)))\n\n    if is_eval == True:\n        for p in model.parameters():\n            p.requires_grad = False\n\n        result, is_best = iterate(\"val\", args, val_loader, model, None, logger,\n                              args.start_epoch - 1)\n        return\n\n    if (args.freeze_backbone == True):\n        for p in model.backbone.parameters():\n            p.requires_grad = False\n        model_named_params = [\n            p for _, p in model.named_parameters() if p.requires_grad\n        ]\n        optimizer = torch.optim.Adam(model_named_params, lr=args.lr, weight_decay=args.weight_decay, betas=(0.9, 0.99))\n    elif (args.network_model == 'pe'):\n        model_bone_params = [\n            p for _, p in model.backbone.named_parameters() if p.requires_grad\n        ]\n        model_new_params = [\n            p for _, p in model.named_parameters() if p.requires_grad\n        ]\n        model_new_params = list(set(model_new_params) - set(model_bone_params))\n        optimizer = torch.optim.Adam([{'params': model_bone_params, 'lr': args.lr / 10}, {'params': model_new_params}],\n                                     lr=args.lr, weight_decay=args.weight_decay, betas=(0.9, 0.99))\n    else:\n        model_named_params = [\n            p for _, p in model.named_parameters() if p.requires_grad\n        ]\n        optimizer = torch.optim.Adam(model_named_params, lr=args.lr, weight_decay=args.weight_decay, betas=(0.9, 0.99))\n    print(\"completed.\")\n\n    model = torch.nn.DataParallel(model)\n\n    # Data loading code\n 
   print(\"=> creating data loaders ... \")\n    if not is_eval:\n        train_dataset = KittiDepth('train', args)\n        train_loader = torch.utils.data.DataLoader(train_dataset,\n                                                   batch_size=args.batch_size,\n                                                   shuffle=True,\n                                                   num_workers=args.workers,\n                                                   pin_memory=True,\n                                                   sampler=None)\n        print(\"\\t==> train_loader size:{}\".format(len(train_loader)))\n\n    print(\"=> starting main loop ...\")\n    for epoch in range(args.start_epoch, args.epochs):\n        print(\"=> starting training epoch {} ..\".format(epoch))\n        iterate(\"train\", args, train_loader, model, optimizer, logger, epoch)  # train for one epoch\n\n        # validation memory reset\n        for p in model.parameters():\n            p.requires_grad = False\n        result, is_best = iterate(\"val\", args, val_loader, model, None, logger, epoch)  # evaluate on validation set\n\n        for p in model.parameters():\n            p.requires_grad = True\n        if (args.freeze_backbone == True):\n            for p in model.module.backbone.parameters():\n                p.requires_grad = False\n        if (penet_accelerated == True):\n            model.module.encoder3.requires_grad = False\n            model.module.encoder5.requires_grad = False\n            model.module.encoder7.requires_grad = False\n\n        helper.save_checkpoint({ # save checkpoint\n            'epoch': epoch,\n            'model': model.module.state_dict(),\n            'best_result': logger.best_result,\n            'optimizer' : optimizer.state_dict(),\n            'args' : args,\n        }, is_best, epoch, logger.output_directory)\n\n\nif __name__ == '__main__':\n    main()"
  },
  {
    "path": "tools/PENet/metrics.py",
    "content": "import torch\nimport math\nimport numpy as np\n\nlg_e_10 = math.log(10)\n\n\ndef log10(x):\n    \"\"\"Return a new tensor with the base-10 logarithm of the elements of x.\"\"\"\n    return torch.log(x) / lg_e_10\n\n\nclass Result(object):\n    def __init__(self):\n        self.irmse = 0\n        self.imae = 0\n        self.mse = 0\n        self.rmse = 0\n        self.mae = 0\n        self.absrel = 0\n        self.squared_rel = 0\n        self.lg10 = 0\n        self.delta1 = 0\n        self.delta2 = 0\n        self.delta3 = 0\n        self.data_time = 0\n        self.gpu_time = 0\n        self.silog = 0  # Scale invariant logarithmic error [log(m)*100]\n        self.photometric = 0\n\n    def set_to_worst(self):\n        self.irmse = np.inf\n        self.imae = np.inf\n        self.mse = np.inf\n        self.rmse = np.inf\n        self.mae = np.inf\n        self.absrel = np.inf\n        self.squared_rel = np.inf\n        self.lg10 = np.inf\n        self.silog = np.inf\n        self.delta1 = 0\n        self.delta2 = 0\n        self.delta3 = 0\n        self.data_time = 0\n        self.gpu_time = 0\n\n    def update(self, irmse, imae, mse, rmse, mae, absrel, squared_rel, lg10, \\\n            delta1, delta2, delta3, gpu_time, data_time, silog, photometric=0):\n        self.irmse = irmse\n        self.imae = imae\n        self.mse = mse\n        self.rmse = rmse\n        self.mae = mae\n        self.absrel = absrel\n        self.squared_rel = squared_rel\n        self.lg10 = lg10\n        self.delta1 = delta1\n        self.delta2 = delta2\n        self.delta3 = delta3\n        self.data_time = data_time\n        self.gpu_time = gpu_time\n        self.silog = silog\n        self.photometric = photometric\n\n    def evaluate(self, output, target, photometric=0):\n        valid_mask = target > 0.1\n\n        # convert from meters to mm\n        output_mm = 1e3 * output[valid_mask]\n        target_mm = 1e3 * target[valid_mask]\n\n        abs_diff = 
(output_mm - target_mm).abs()\n\n        self.mse = float((torch.pow(abs_diff, 2)).mean())\n        self.rmse = math.sqrt(self.mse)\n        self.mae = float(abs_diff.mean())\n        self.lg10 = float((log10(output_mm) - log10(target_mm)).abs().mean())\n        self.absrel = float((abs_diff / target_mm).mean())\n        self.squared_rel = float(((abs_diff / target_mm)**2).mean())\n\n        maxRatio = torch.max(output_mm / target_mm, target_mm / output_mm)\n        self.delta1 = float((maxRatio < 1.25).float().mean())\n        self.delta2 = float((maxRatio < 1.25**2).float().mean())\n        self.delta3 = float((maxRatio < 1.25**3).float().mean())\n        self.data_time = 0\n        self.gpu_time = 0\n\n        # silog uses meters\n        err_log = torch.log(target[valid_mask]) - torch.log(output[valid_mask])\n        normalized_squared_log = (err_log**2).mean()\n        log_mean = err_log.mean()\n        self.silog = math.sqrt(normalized_squared_log -\n                               log_mean * log_mean) * 100\n\n        # convert from meters to km\n        inv_output_km = (1e-3 * output[valid_mask])**(-1)\n        inv_target_km = (1e-3 * target[valid_mask])**(-1)\n        abs_inv_diff = (inv_output_km - inv_target_km).abs()\n        self.irmse = math.sqrt((torch.pow(abs_inv_diff, 2)).mean())\n        self.imae = float(abs_inv_diff.mean())\n\n        self.photometric = float(photometric)\n\n\nclass AverageMeter(object):\n    def __init__(self):\n        self.reset(time_stable=True)\n\n    def reset(self, time_stable):\n        self.count = 0.0\n        self.sum_irmse = 0\n        self.sum_imae = 0\n        self.sum_mse = 0\n        self.sum_rmse = 0\n        self.sum_mae = 0\n        self.sum_absrel = 0\n        self.sum_squared_rel = 0\n        self.sum_lg10 = 0\n        self.sum_delta1 = 0\n        self.sum_delta2 = 0\n        self.sum_delta3 = 0\n        self.sum_data_time = 0\n        self.sum_gpu_time = 0\n        self.sum_photometric = 0\n        
self.sum_silog = 0\n        self.time_stable = time_stable\n        self.time_stable_counter_init = 10\n        self.time_stable_counter = self.time_stable_counter_init\n\n    def update(self, result, gpu_time, data_time, n=1):\n        self.count += n\n        self.sum_irmse += n * result.irmse\n        self.sum_imae += n * result.imae\n        self.sum_mse += n * result.mse\n        self.sum_rmse += n * result.rmse\n        self.sum_mae += n * result.mae\n        self.sum_absrel += n * result.absrel\n        self.sum_squared_rel += n * result.squared_rel\n        self.sum_lg10 += n * result.lg10\n        self.sum_delta1 += n * result.delta1\n        self.sum_delta2 += n * result.delta2\n        self.sum_delta3 += n * result.delta3\n        self.sum_data_time += n * data_time\n        if self.time_stable == True and self.time_stable_counter > 0:\n            self.time_stable_counter = self.time_stable_counter - 1\n        else:\n            self.sum_gpu_time += n * gpu_time\n        self.sum_silog += n * result.silog\n        self.sum_photometric += n * result.photometric\n\n    def average(self):\n        avg = Result()\n        if self.time_stable == True:\n            if self.count > 0 and self.count - self.time_stable_counter_init > 0:\n                avg.update(\n                    self.sum_irmse / self.count, self.sum_imae / self.count,\n                    self.sum_mse / self.count, self.sum_rmse / self.count,\n                    self.sum_mae / self.count, self.sum_absrel / self.count,\n                    self.sum_squared_rel / self.count, self.sum_lg10 / self.count,\n                    self.sum_delta1 / self.count, self.sum_delta2 / self.count,\n                    self.sum_delta3 / self.count, self.sum_gpu_time / (self.count - self.time_stable_counter_init),\n                    self.sum_data_time / self.count, self.sum_silog / self.count,\n                    self.sum_photometric / self.count)\n            elif self.count > 0:\n                
avg.update(\n                    self.sum_irmse / self.count, self.sum_imae / self.count,\n                    self.sum_mse / self.count, self.sum_rmse / self.count,\n                    self.sum_mae / self.count, self.sum_absrel / self.count,\n                    self.sum_squared_rel / self.count, self.sum_lg10 / self.count,\n                    self.sum_delta1 / self.count, self.sum_delta2 / self.count,\n                    self.sum_delta3 / self.count, 0,\n                    self.sum_data_time / self.count, self.sum_silog / self.count,\n                    self.sum_photometric / self.count)\n        elif self.count > 0:\n            avg.update(\n                self.sum_irmse / self.count, self.sum_imae / self.count,\n                self.sum_mse / self.count, self.sum_rmse / self.count,\n                self.sum_mae / self.count, self.sum_absrel / self.count,\n                self.sum_squared_rel / self.count, self.sum_lg10 / self.count,\n                self.sum_delta1 / self.count, self.sum_delta2 / self.count,\n                self.sum_delta3 / self.count, self.sum_gpu_time / self.count,\n                self.sum_data_time / self.count, self.sum_silog / self.count,\n                self.sum_photometric / self.count)\n        return avg\n"
  },
  {
    "path": "tools/PENet/model.py",
    "content": "from basic import *\n\nclass ENet(nn.Module):\n    def __init__(self, args):\n        super(ENet, self).__init__()\n        self.args = args\n        self.geofeature = None\n        self.geoplanes = 3\n        if self.args.convolutional_layer_encoding == \"xyz\":\n            self.geofeature = GeometryFeature()\n        elif self.args.convolutional_layer_encoding == \"std\":\n            self.geoplanes = 0\n        elif self.args.convolutional_layer_encoding == \"uv\":\n            self.geoplanes = 2\n        elif self.args.convolutional_layer_encoding == \"z\":\n            self.geoplanes = 1\n\n        # rgb encoder\n        self.rgb_conv_init = convbnrelu(in_channels=4, out_channels=32, kernel_size=5, stride=1, padding=2)\n\n        self.rgb_encoder_layer1 = BasicBlockGeo(inplanes=32, planes=64, stride=2, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer2 = BasicBlockGeo(inplanes=64, planes=64, stride=1, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer3 = BasicBlockGeo(inplanes=64, planes=128, stride=2, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer4 = BasicBlockGeo(inplanes=128, planes=128, stride=1, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer5 = BasicBlockGeo(inplanes=128, planes=256, stride=2, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer6 = BasicBlockGeo(inplanes=256, planes=256, stride=1, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer7 = BasicBlockGeo(inplanes=256, planes=512, stride=2, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer8 = BasicBlockGeo(inplanes=512, planes=512, stride=1, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer9 = BasicBlockGeo(inplanes=512, planes=1024, stride=2, geoplanes=self.geoplanes)\n        self.rgb_encoder_layer10 = BasicBlockGeo(inplanes=1024, planes=1024, stride=1, geoplanes=self.geoplanes)\n\n        self.rgb_decoder_layer8 = deconvbnrelu(in_channels=1024, out_channels=512, kernel_size=5, stride=2, padding=2, 
output_padding=1)\n        self.rgb_decoder_layer6 = deconvbnrelu(in_channels=512, out_channels=256, kernel_size=5, stride=2, padding=2, output_padding=1)\n        self.rgb_decoder_layer4 = deconvbnrelu(in_channels=256, out_channels=128, kernel_size=5, stride=2, padding=2, output_padding=1)\n        self.rgb_decoder_layer2 = deconvbnrelu(in_channels=128, out_channels=64, kernel_size=5, stride=2, padding=2, output_padding=1)\n        self.rgb_decoder_layer0 = deconvbnrelu(in_channels=64, out_channels=32, kernel_size=5, stride=2, padding=2, output_padding=1)\n        self.rgb_decoder_output = deconvbnrelu(in_channels=32, out_channels=2, kernel_size=3, stride=1, padding=1, output_padding=0)\n\n\n        # depth encoder\n        self.depth_conv_init = convbnrelu(in_channels=2, out_channels=32, kernel_size=5, stride=1, padding=2)\n\n        self.depth_layer1 = BasicBlockGeo(inplanes=32, planes=64, stride=2, geoplanes=self.geoplanes)\n        self.depth_layer2 = BasicBlockGeo(inplanes=64, planes=64, stride=1, geoplanes=self.geoplanes)\n        self.depth_layer3 = BasicBlockGeo(inplanes=128, planes=128, stride=2, geoplanes=self.geoplanes)\n        self.depth_layer4 = BasicBlockGeo(inplanes=128, planes=128, stride=1, geoplanes=self.geoplanes)\n        self.depth_layer5 = BasicBlockGeo(inplanes=256, planes=256, stride=2, geoplanes=self.geoplanes)\n        self.depth_layer6 = BasicBlockGeo(inplanes=256, planes=256, stride=1, geoplanes=self.geoplanes)\n        self.depth_layer7 = BasicBlockGeo(inplanes=512, planes=512, stride=2, geoplanes=self.geoplanes)\n        self.depth_layer8 = BasicBlockGeo(inplanes=512, planes=512, stride=1, geoplanes=self.geoplanes)\n        self.depth_layer9 = BasicBlockGeo(inplanes=1024, planes=1024, stride=2, geoplanes=self.geoplanes)\n        self.depth_layer10 = BasicBlockGeo(inplanes=1024, planes=1024, stride=1, geoplanes=self.geoplanes)\n\n        # decoder\n        self.decoder_layer1 = deconvbnrelu(in_channels=1024, out_channels=512, 
kernel_size=5, stride=2, padding=2, output_padding=1)\n        self.decoder_layer2 = deconvbnrelu(in_channels=512, out_channels=256, kernel_size=5, stride=2, padding=2, output_padding=1)\n        self.decoder_layer3 = deconvbnrelu(in_channels=256, out_channels=128, kernel_size=5, stride=2, padding=2, output_padding=1)\n        self.decoder_layer4 = deconvbnrelu(in_channels=128, out_channels=64, kernel_size=5, stride=2, padding=2, output_padding=1)\n        self.decoder_layer5 = deconvbnrelu(in_channels=64, out_channels=32, kernel_size=5, stride=2, padding=2, output_padding=1)\n\n        self.decoder_layer6 = convbnrelu(in_channels=32, out_channels=2, kernel_size=3, stride=1, padding=1)\n        self.softmax = nn.Softmax(dim=1)\n        self.pooling = nn.AvgPool2d(kernel_size=2)\n        self.sparsepooling = SparseDownSampleClose(stride=2)\n\n        weights_init(self)\n\n    def forward(self, input):\n        #independent input\n        rgb = input['rgb']\n        d = input['d']\n\n        position = input['position']\n        K = input['K']\n        unorm = position[:, 0:1, :, :]\n        vnorm = position[:, 1:2, :, :]\n\n        f352 = K[:, 1, 1]\n        f352 = f352.unsqueeze(1)\n        f352 = f352.unsqueeze(2)\n        f352 = f352.unsqueeze(3)\n        c352 = K[:, 1, 2]\n        c352 = c352.unsqueeze(1)\n        c352 = c352.unsqueeze(2)\n        c352 = c352.unsqueeze(3)\n        f1216 = K[:, 0, 0]\n        f1216 = f1216.unsqueeze(1)\n        f1216 = f1216.unsqueeze(2)\n        f1216 = f1216.unsqueeze(3)\n        c1216 = K[:, 0, 2]\n        c1216 = c1216.unsqueeze(1)\n        c1216 = c1216.unsqueeze(2)\n        c1216 = c1216.unsqueeze(3)\n\n        vnorm_s2 = self.pooling(vnorm)\n        vnorm_s3 = self.pooling(vnorm_s2)\n        vnorm_s4 = self.pooling(vnorm_s3)\n        vnorm_s5 = self.pooling(vnorm_s4)\n        vnorm_s6 = self.pooling(vnorm_s5)\n\n        unorm_s2 = self.pooling(unorm)\n        unorm_s3 = self.pooling(unorm_s2)\n        unorm_s4 = 
self.pooling(unorm_s3)\n        unorm_s5 = self.pooling(unorm_s4)\n        unorm_s6 = self.pooling(unorm_s5)\n\n        valid_mask = torch.where(d>0, torch.full_like(d, 1.0), torch.full_like(d, 0.0))\n        d_s2, vm_s2 = self.sparsepooling(d, valid_mask)\n        d_s3, vm_s3 = self.sparsepooling(d_s2, vm_s2)\n        d_s4, vm_s4 = self.sparsepooling(d_s3, vm_s3)\n        d_s5, vm_s5 = self.sparsepooling(d_s4, vm_s4)\n        d_s6, vm_s6 = self.sparsepooling(d_s5, vm_s5)\n\n        geo_s1 = None\n        geo_s2 = None\n        geo_s3 = None\n        geo_s4 = None\n        geo_s5 = None\n        geo_s6 = None\n\n        if self.args.convolutional_layer_encoding == \"xyz\":\n            geo_s1 = self.geofeature(d, vnorm, unorm, 352, 1216, c352, c1216, f352, f1216)\n            geo_s2 = self.geofeature(d_s2, vnorm_s2, unorm_s2, 352 / 2, 1216 / 2, c352, c1216, f352, f1216)\n            geo_s3 = self.geofeature(d_s3, vnorm_s3, unorm_s3, 352 / 4, 1216 / 4, c352, c1216, f352, f1216)\n            geo_s4 = self.geofeature(d_s4, vnorm_s4, unorm_s4, 352 / 8, 1216 / 8, c352, c1216, f352, f1216)\n            geo_s5 = self.geofeature(d_s5, vnorm_s5, unorm_s5, 352 / 16, 1216 / 16, c352, c1216, f352, f1216)\n            geo_s6 = self.geofeature(d_s6, vnorm_s6, unorm_s6, 352 / 32, 1216 / 32, c352, c1216, f352, f1216)\n        elif self.args.convolutional_layer_encoding == \"uv\":\n            geo_s1 = torch.cat((vnorm, unorm), dim=1)\n            geo_s2 = torch.cat((vnorm_s2, unorm_s2), dim=1)\n            geo_s3 = torch.cat((vnorm_s3, unorm_s3), dim=1)\n            geo_s4 = torch.cat((vnorm_s4, unorm_s4), dim=1)\n            geo_s5 = torch.cat((vnorm_s5, unorm_s5), dim=1)\n            geo_s6 = torch.cat((vnorm_s6, unorm_s6), dim=1)\n        elif self.args.convolutional_layer_encoding == \"z\":\n            geo_s1 = d\n            geo_s2 = d_s2\n            geo_s3 = d_s3\n            geo_s4 = d_s4\n            geo_s5 = d_s5\n            geo_s6 = d_s6\n\n        #embeded input\n    
    #rgb = input[:, 0:3, :, :]\n        #d = input[:, 3:4, :, :]\n\n        # b 1 352 1216\n        rgb_feature = self.rgb_conv_init(torch.cat((rgb, d), dim=1))\n        rgb_feature1 = self.rgb_encoder_layer1(rgb_feature, geo_s1, geo_s2) # b 64 176 608\n        rgb_feature2 = self.rgb_encoder_layer2(rgb_feature1, geo_s2, geo_s2) # b 64 176 608\n        rgb_feature3 = self.rgb_encoder_layer3(rgb_feature2, geo_s2, geo_s3) # b 128 88 304\n        rgb_feature4 = self.rgb_encoder_layer4(rgb_feature3, geo_s3, geo_s3) # b 128 88 304\n        rgb_feature5 = self.rgb_encoder_layer5(rgb_feature4, geo_s3, geo_s4) # b 256 44 152\n        rgb_feature6 = self.rgb_encoder_layer6(rgb_feature5, geo_s4, geo_s4) # b 256 44 152\n        rgb_feature7 = self.rgb_encoder_layer7(rgb_feature6, geo_s4, geo_s5) # b 512 22 76\n        rgb_feature8 = self.rgb_encoder_layer8(rgb_feature7, geo_s5, geo_s5) # b 512 22 76\n        rgb_feature9 = self.rgb_encoder_layer9(rgb_feature8, geo_s5, geo_s6) # b 1024 11 38\n        rgb_feature10 = self.rgb_encoder_layer10(rgb_feature9, geo_s6, geo_s6) # b 1024 11 38\n\n        rgb_feature_decoder8 = self.rgb_decoder_layer8(rgb_feature10)\n        rgb_feature8_plus = rgb_feature_decoder8 + rgb_feature8\n\n        rgb_feature_decoder6 = self.rgb_decoder_layer6(rgb_feature8_plus)\n        rgb_feature6_plus = rgb_feature_decoder6 + rgb_feature6\n\n        rgb_feature_decoder4 = self.rgb_decoder_layer4(rgb_feature6_plus)\n        rgb_feature4_plus = rgb_feature_decoder4 + rgb_feature4\n\n        rgb_feature_decoder2 = self.rgb_decoder_layer2(rgb_feature4_plus)\n        rgb_feature2_plus = rgb_feature_decoder2 + rgb_feature2   # b 64 176 608\n\n        rgb_feature_decoder0 = self.rgb_decoder_layer0(rgb_feature2_plus)\n        rgb_feature0_plus = rgb_feature_decoder0 + rgb_feature\n\n        rgb_output = self.rgb_decoder_output(rgb_feature0_plus)\n        rgb_depth = rgb_output[:, 0:1, :, :]\n        rgb_conf = rgb_output[:, 1:2, :, :]\n\n        # 
-----------------------------------------------------------------------\n        # mask = torch.where(d>0, torch.full_like(d, 1.0), torch.full_like(d, 0.0))\n        # input = torch.cat([d, mask], 1)\n\n        sparsed_feature = self.depth_conv_init(torch.cat((d, rgb_depth), dim=1))\n        sparsed_feature1 = self.depth_layer1(sparsed_feature, geo_s1, geo_s2)# b 64 176 608\n        sparsed_feature2 = self.depth_layer2(sparsed_feature1, geo_s2, geo_s2) # b 64 176 608\n\n        sparsed_feature2_plus = torch.cat([rgb_feature2_plus, sparsed_feature2], 1)\n        sparsed_feature3 = self.depth_layer3(sparsed_feature2_plus, geo_s2, geo_s3) # b 128 88 304\n        sparsed_feature4 = self.depth_layer4(sparsed_feature3, geo_s3, geo_s3) # b 128 88 304\n\n        sparsed_feature4_plus = torch.cat([rgb_feature4_plus, sparsed_feature4], 1)\n        sparsed_feature5 = self.depth_layer5(sparsed_feature4_plus, geo_s3, geo_s4) # b 256 44 152\n        sparsed_feature6 = self.depth_layer6(sparsed_feature5, geo_s4, geo_s4) # b 256 44 152\n\n        sparsed_feature6_plus = torch.cat([rgb_feature6_plus, sparsed_feature6], 1)\n        sparsed_feature7 = self.depth_layer7(sparsed_feature6_plus, geo_s4, geo_s5) # b 512 22 76\n        sparsed_feature8 = self.depth_layer8(sparsed_feature7, geo_s5, geo_s5) # b 512 22 76\n\n        sparsed_feature8_plus = torch.cat([rgb_feature8_plus, sparsed_feature8], 1)\n        sparsed_feature9 = self.depth_layer9(sparsed_feature8_plus, geo_s5, geo_s6) # b 1024 11 38\n        sparsed_feature10 = self.depth_layer10(sparsed_feature9, geo_s6, geo_s6) # b 1024 11 38\n\n        # -----------------------------------------------------------------------------------------\n\n        fusion1 = rgb_feature10 + sparsed_feature10\n        decoder_feature1 = self.decoder_layer1(fusion1)\n\n        fusion2 = sparsed_feature8 + decoder_feature1\n        decoder_feature2 = self.decoder_layer2(fusion2)\n\n        fusion3 = sparsed_feature6 + decoder_feature2\n        
decoder_feature3 = self.decoder_layer3(fusion3)\n\n        fusion4 = sparsed_feature4 + decoder_feature3\n        decoder_feature4 = self.decoder_layer4(fusion4)\n\n        fusion5 = sparsed_feature2 + decoder_feature4\n        decoder_feature5 = self.decoder_layer5(fusion5)\n\n        depth_output = self.decoder_layer6(decoder_feature5)\n        d_depth, d_conf = torch.chunk(depth_output, 2, dim=1)\n\n        rgb_conf, d_conf = torch.chunk(self.softmax(torch.cat((rgb_conf, d_conf), dim=1)), 2, dim=1)\n        output = rgb_conf*rgb_depth + d_conf*d_depth\n\n        if(self.args.network_model == 'e'):\n            return rgb_depth, d_depth, output\n        elif(self.args.dilation_rate == 1):\n            return torch.cat((rgb_feature0_plus, decoder_feature5),1), output\n        elif (self.args.dilation_rate == 2):\n            return torch.cat((rgb_feature0_plus, decoder_feature5), 1), torch.cat((rgb_feature2_plus, decoder_feature4),1), output\n        elif (self.args.dilation_rate == 4):\n            return torch.cat((rgb_feature0_plus, decoder_feature5), 1), torch.cat((rgb_feature2_plus, decoder_feature4),1),\\\n                   torch.cat((rgb_feature4_plus, decoder_feature3), 1), output\n\nclass PENet_C1(nn.Module):\n    def __init__(self, args):\n        super(PENet_C1, self).__init__()\n\n        self.backbone = ENet(args)\n        #self.backbone = Bone()\n        self.mask_layer = convbn(64, 3)\n\n        self.kernel_conf_layer = convbn(64, 3)\n        self.iter_conf_layer = convbn(64, 12)\n        self.iter_guide_layer3 = CSPNGenerateAccelerate(64, 3)\n        self.iter_guide_layer5 = CSPNGenerateAccelerate(64, 5)\n        self.iter_guide_layer7 = CSPNGenerateAccelerate(64, 7)\n        self.softmax = nn.Softmax(dim=1)\n        self.CSPN3 = CSPNAccelerate(3)\n        self.CSPN5 = CSPNAccelerate(5, padding=2)\n        self.CSPN7 = CSPNAccelerate(7, padding=3)\n\n        # CSPN new\n        ks = 3\n        encoder3 = torch.zeros(ks * ks, ks * ks, ks, 
ks).cuda()\n        kernel_range_list = [i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder3[index] = 1\n        self.encoder3 = nn.Parameter(encoder3, requires_grad=False)\n\n        ks = 5\n        encoder5 = torch.zeros(ks * ks, ks * ks, ks, ks).cuda()\n        kernel_range_list = [i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder5[index] = 1\n        self.encoder5 = nn.Parameter(encoder5, requires_grad=False)\n\n        ks = 7\n        encoder7 = torch.zeros(ks * ks, ks * ks, ks, ks).cuda()\n        kernel_range_list = [i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder7[index] = 1\n        self.encoder7 = nn.Parameter(encoder7, requires_grad=False)\n\n        weights_init(self)\n\n    def forward(self, input):\n        #rgb = input['rgb']\n        d = input['d']\n        valid_mask = torch.where(d>0, torch.full_like(d, 1.0), torch.full_like(d, 0.0))\n\n        feature, coarse_depth= self.backbone(input)\n\n        mask = self.mask_layer(feature)\n        mask = torch.sigmoid(mask)\n\n        mask = mask*valid_mask\n        mask3 = mask[:, 0:1, :, :]\n        mask5 = mask[:, 1:2, :, :]\n        mask7 = mask[:, 2:3, :, :]\n\n        kernel_conf = self.kernel_conf_layer(feature)\n        kernel_conf = 
self.softmax(kernel_conf)\n        kernel_conf3 = kernel_conf[:, 0:1, :, :]\n        kernel_conf5 = kernel_conf[:, 1:2, :, :]\n        kernel_conf7 = kernel_conf[:, 2:3, :, :]\n\n        conf = self.iter_conf_layer(feature)\n        conf3 = conf[:, 0:4, :, :]\n        conf5 = conf[:, 4:8, :, :]\n        conf7 = conf[:, 8:12, :, :]\n        conf3 = self.softmax(conf3)\n        conf5 = self.softmax(conf5)\n        conf7 = self.softmax(conf7)\n\n        guide3 = self.iter_guide_layer3(feature)\n        guide5 = self.iter_guide_layer5(feature)\n        guide7 = self.iter_guide_layer7(feature)\n\n        #init\n        depth = coarse_depth\n        depth3 = depth\n        depth5 = depth\n        depth7 = depth\n\n        d3_list = [i for i in range(4)]\n        d5_list = [i for i in range(4)]\n        d7_list = [i for i in range(4)]\n\n        #prop\n        guide3 = kernel_trans(guide3, self.encoder3)\n        guide5 = kernel_trans(guide5, self.encoder5)\n        guide7 = kernel_trans(guide7, self.encoder7)\n\n        for i in range(12):\n            depth3 = self.CSPN3(guide3, depth3, depth)\n            depth3 = mask3*d + (1-mask3)*depth3\n            depth5 = self.CSPN5(guide5, depth5, depth)\n            depth5 = mask5*d + (1-mask5)*depth5\n            depth7 = self.CSPN7(guide7, depth7, depth)\n            depth7 = mask7*d + (1-mask7)*depth7\n\n            if(i==2):\n                d3_list[0] = depth3\n                d5_list[0] = depth5\n                d7_list[0] = depth7\n\n            if(i==5):\n                d3_list[1] = depth3\n                d5_list[1] = depth5\n                d7_list[1] = depth7\n\n            if(i==8):\n                d3_list[2] = depth3\n                d5_list[2] = depth5\n                d7_list[2] = depth7\n\n            if(i==11):\n                d3_list[3] = depth3\n                d5_list[3] = depth5\n                d7_list[3] = depth7\n\n        refined_depth = \\\n        d3_list[0] * (kernel_conf3 * conf3[:, 0:1, :, :]) 
+ \\\n        d3_list[1] * (kernel_conf3 * conf3[:, 1:2, :, :]) + \\\n        d3_list[2] * (kernel_conf3 * conf3[:, 2:3, :, :]) + \\\n        d3_list[3] * (kernel_conf3 * conf3[:, 3:4, :, :]) + \\\n        d5_list[0] * (kernel_conf5 * conf5[:, 0:1, :, :]) + \\\n        d5_list[1] * (kernel_conf5 * conf5[:, 1:2, :, :]) + \\\n        d5_list[2] * (kernel_conf5 * conf5[:, 2:3, :, :]) + \\\n        d5_list[3] * (kernel_conf5 * conf5[:, 3:4, :, :]) + \\\n        d7_list[0] * (kernel_conf7 * conf7[:, 0:1, :, :]) + \\\n        d7_list[1] * (kernel_conf7 * conf7[:, 1:2, :, :]) + \\\n        d7_list[2] * (kernel_conf7 * conf7[:, 2:3, :, :]) + \\\n        d7_list[3] * (kernel_conf7 * conf7[:, 3:4, :, :])\n\n        return refined_depth\n\nclass PENet_C2(nn.Module):\n    def __init__(self, args):\n        super(PENet_C2, self).__init__()\n\n        self.backbone = ENet(args)\n\n        self.kernel_conf_layer = convbn(64, 3)\n        self.mask_layer = convbn(64, 1)\n        self.iter_guide_layer3 = CSPNGenerateAccelerate(64, 3)\n        self.iter_guide_layer5 = CSPNGenerateAccelerate(64, 5)\n        self.iter_guide_layer7 = CSPNGenerateAccelerate(64, 7)\n\n        self.kernel_conf_layer_s2 = convbn(128, 3)\n        self.mask_layer_s2 = convbn(128, 1)\n        self.iter_guide_layer3_s2 = CSPNGenerateAccelerate(128, 3)\n        self.iter_guide_layer5_s2 = CSPNGenerateAccelerate(128, 5)\n        self.iter_guide_layer7_s2 = CSPNGenerateAccelerate(128, 7)\n\n        self.upsample = nn.UpsamplingBilinear2d(scale_factor=2)\n        self.nnupsample = nn.UpsamplingNearest2d(scale_factor=2)\n        self.downsample = SparseDownSampleClose(stride=2)\n        self.softmax = nn.Softmax(dim=1)\n        self.CSPN3 = CSPNAccelerate(kernel_size=3, dilation=1, padding=1, stride=1)\n        self.CSPN5 = CSPNAccelerate(kernel_size=5, dilation=1, padding=2, stride=1)\n        self.CSPN7 = CSPNAccelerate(kernel_size=7, dilation=1, padding=3, stride=1)\n        self.CSPN3_s2 = 
CSPNAccelerate(kernel_size=3, dilation=2, padding=2, stride=1)\n        self.CSPN5_s2 = CSPNAccelerate(kernel_size=5, dilation=2, padding=4, stride=1)\n        self.CSPN7_s2 = CSPNAccelerate(kernel_size=7, dilation=2, padding=6, stride=1)\n\n        # CSPN\n        ks = 3\n        encoder3 = torch.zeros(ks * ks, ks * ks, ks, ks).cuda()\n        kernel_range_list = [i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder3[index] = 1\n        self.encoder3 = nn.Parameter(encoder3, requires_grad=False)\n\n        ks = 5\n        encoder5 = torch.zeros(ks * ks, ks * ks, ks, ks).cuda()\n        kernel_range_list = [i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder5[index] = 1\n        self.encoder5 = nn.Parameter(encoder5, requires_grad=False)\n\n        ks = 7\n        encoder7 = torch.zeros(ks * ks, ks * ks, ks, ks).cuda()\n        kernel_range_list = [i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder7[index] = 1\n        self.encoder7 = nn.Parameter(encoder7, requires_grad=False)\n\n        weights_init(self)\n\n    def forward(self, input):\n\n        d = input['d']\n        valid_mask = torch.where(d>0, torch.full_like(d, 1.0), torch.full_like(d, 0.0))\n\n        feature_s1, feature_s2, coarse_depth = 
self.backbone(input)\n        depth = coarse_depth\n\n        d_s2, valid_mask_s2 = self.downsample(d, valid_mask)\n        mask_s2 = self.mask_layer_s2(feature_s2)\n        mask_s2 = torch.sigmoid(mask_s2)\n        mask_s2 = mask_s2*valid_mask_s2\n\n        kernel_conf_s2 = self.kernel_conf_layer_s2(feature_s2)\n        kernel_conf_s2 = self.softmax(kernel_conf_s2)\n        kernel_conf3_s2 = self.nnupsample(kernel_conf_s2[:, 0:1, :, :])\n        kernel_conf5_s2 = self.nnupsample(kernel_conf_s2[:, 1:2, :, :])\n        kernel_conf7_s2 = self.nnupsample(kernel_conf_s2[:, 2:3, :, :])\n\n        guide3_s2 = self.iter_guide_layer3_s2(feature_s2)\n        guide5_s2 = self.iter_guide_layer5_s2(feature_s2)\n        guide7_s2 = self.iter_guide_layer7_s2(feature_s2)\n\n        depth_s2 = self.nnupsample(d_s2)\n        mask_s2 = self.nnupsample(mask_s2)\n        depth3 = depth5 = depth7 = depth\n\n        mask = self.mask_layer(feature_s1)\n        mask = torch.sigmoid(mask)\n        mask = mask * valid_mask\n\n        kernel_conf = self.kernel_conf_layer(feature_s1)\n        kernel_conf = self.softmax(kernel_conf)\n        kernel_conf3 = kernel_conf[:, 0:1, :, :]\n        kernel_conf5 = kernel_conf[:, 1:2, :, :]\n        kernel_conf7 = kernel_conf[:, 2:3, :, :]\n\n        guide3 = self.iter_guide_layer3(feature_s1)\n        guide5 = self.iter_guide_layer5(feature_s1)\n        guide7 = self.iter_guide_layer7(feature_s1)\n\n        guide3 = kernel_trans(guide3, self.encoder3)\n        guide5 = kernel_trans(guide5, self.encoder5)\n        guide7 = kernel_trans(guide7, self.encoder7)\n\n        guide3_s2 = kernel_trans(guide3_s2, self.encoder3)\n        guide5_s2 = kernel_trans(guide5_s2, self.encoder5)\n        guide7_s2 = kernel_trans(guide7_s2, self.encoder7)\n\n        guide3_s2 = self.nnupsample(guide3_s2)\n        guide5_s2 = self.nnupsample(guide5_s2)\n        guide7_s2 = self.nnupsample(guide7_s2)\n\n        for i in range(6):\n            depth3 = 
self.CSPN3_s2(guide3_s2, depth3, coarse_depth)\n            depth3 = mask_s2*depth_s2 + (1-mask_s2)*depth3\n            depth5 = self.CSPN5_s2(guide5_s2, depth5, coarse_depth)\n            depth5 = mask_s2*depth_s2 + (1-mask_s2)*depth5\n            depth7 = self.CSPN7_s2(guide7_s2, depth7, coarse_depth)\n            depth7 = mask_s2*depth_s2 + (1-mask_s2)*depth7\n\n        depth_s2 = kernel_conf3_s2*depth3 + kernel_conf5_s2*depth5 + kernel_conf7_s2*depth7\n        refined_depth_s2 = depth_s2\n\n        depth3 = depth5 = depth7 = refined_depth_s2\n\n        #prop\n        for i in range(6):\n            depth3 = self.CSPN3(guide3, depth3, depth_s2)\n            depth3 = mask*d + (1-mask)*depth3\n            depth5 = self.CSPN5(guide5, depth5, depth_s2)\n            depth5 = mask*d + (1-mask)*depth5\n            depth7 = self.CSPN7(guide7, depth7, depth_s2)\n            depth7 = mask*d + (1-mask)*depth7\n\n        refined_depth = kernel_conf3*depth3 + kernel_conf5*depth5 + kernel_conf7*depth7\n\n        return refined_depth\n\nclass PENet_C4(nn.Module):\n    def __init__(self, args):\n        super(PENet_C4, self).__init__()\n\n        self.backbone = ENet(args)\n\n        self.kernel_conf_layer = convbn(64, 3)\n        self.mask_layer = convbn(64, 1)\n        self.prop_mask_layer = convbn(64, 1)\n        self.iter_guide_layer3 = CSPNGenerateAccelerate(64, 3)\n        self.iter_guide_layer5 = CSPNGenerateAccelerate(64, 5)\n        self.iter_guide_layer7 = CSPNGenerateAccelerate(64, 7)\n\n        self.kernel_conf_layer_s2 = convbn(128, 3)\n        self.mask_layer_s2 = convbn(128, 1)\n        self.prop_mask_layer_s2 = convbn(128, 1)\n        self.iter_guide_layer3_s2 = CSPNGenerateAccelerate(128, 3)\n        self.iter_guide_layer5_s2 = CSPNGenerateAccelerate(128, 5)\n        self.iter_guide_layer7_s2 = CSPNGenerateAccelerate(128, 7)\n\n        self.kernel_conf_layer_s3 = convbn(256, 3)\n        self.mask_layer_s3 = convbn(256, 1)\n        self.prop_mask_layer_s3 = 
convbn(256, 1)\n        self.iter_guide_layer3_s3 = CSPNGenerateAccelerate(256, 3)\n        self.iter_guide_layer5_s3 = CSPNGenerateAccelerate(256, 5)\n        self.iter_guide_layer7_s3 = CSPNGenerateAccelerate(256, 7)\n\n        self.upsample = nn.UpsamplingBilinear2d(scale_factor=2)\n        self.upsample4 = nn.UpsamplingBilinear2d(scale_factor=4)\n        self.nnupsample = nn.UpsamplingNearest2d(scale_factor=2)\n        self.nnupsample4 = nn.UpsamplingNearest2d(scale_factor=4)\n        self.downsample = SparseDownSampleClose(stride=2)\n        self.softmax = nn.Softmax(dim=1)\n        self.CSPN3 = CSPNAccelerate(kernel_size=3, dilation=1, padding=1, stride=1)\n        self.CSPN5 = CSPNAccelerate(kernel_size=5, dilation=1, padding=2, stride=1)\n        self.CSPN7 = CSPNAccelerate(kernel_size=7, dilation=1, padding=3, stride=1)\n        self.CSPN3_s2 = CSPNAccelerate(kernel_size=3, dilation=2, padding=2, stride=1)\n        self.CSPN5_s2 = CSPNAccelerate(kernel_size=5, dilation=2, padding=4, stride=1)\n        self.CSPN7_s2 = CSPNAccelerate(kernel_size=7, dilation=2, padding=6, stride=1)\n        self.CSPN3_s3 = CSPNAccelerate(kernel_size=3, dilation=4, padding=4, stride=1)\n        self.CSPN5_s3 = CSPNAccelerate(kernel_size=5, dilation=4, padding=8, stride=1)\n        self.CSPN7_s3 = CSPNAccelerate(kernel_size=7, dilation=4, padding=12, stride=1)\n\n        # CSPN\n        ks = 3\n        encoder3 = torch.zeros(ks * ks, ks * ks, ks, ks).cuda()\n        kernel_range_list = [i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder3[index] = 1\n        self.encoder3 = nn.Parameter(encoder3, requires_grad=False)\n\n        ks = 5\n        encoder5 = torch.zeros(ks * ks, ks * ks, ks, ks).cuda()\n        kernel_range_list = 
[i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder5[index] = 1\n        self.encoder5 = nn.Parameter(encoder5, requires_grad=False)\n\n        ks = 7\n        encoder7 = torch.zeros(ks * ks, ks * ks, ks, ks).cuda()\n        kernel_range_list = [i for i in range(ks - 1, -1, -1)]\n        ls = []\n        for i in range(ks):\n            ls.extend(kernel_range_list)\n        index = [[j for j in range(ks * ks - 1, -1, -1)], [j for j in range(ks * ks)], \\\n                 [val for val in kernel_range_list for j in range(ks)], ls]\n        encoder7[index] = 1\n        self.encoder7 = nn.Parameter(encoder7, requires_grad=False)\n\n        weights_init(self)\n\n    def forward(self, input):\n        #rgb = input['rgb']\n        d = input['d']\n        valid_mask = torch.where(d>0, torch.full_like(d, 1.0), torch.full_like(d, 0.0))\n\n        feature_s1, feature_s2, feature_s3, coarse_depth = self.backbone(input)\n        depth = coarse_depth\n\n        d_s2, valid_mask_s2 = self.downsample(d, valid_mask)\n        d_s3, valid_mask_s3 = self.downsample(d_s2, valid_mask_s2)\n\n        #s3\n        mask_s3 = self.mask_layer_s3(feature_s3)\n        mask_s3 = torch.sigmoid(mask_s3)\n        mask_s3 = mask_s3 * valid_mask_s3\n        prop_mask_s3 = self.prop_mask_layer_s3(feature_s3)\n        prop_mask_s3 = torch.sigmoid(prop_mask_s3)\n\n        kernel_conf_s3 = self.kernel_conf_layer_s3(feature_s3)\n        kernel_conf_s3 = self.softmax(kernel_conf_s3)\n        kernel_conf3_s3 = self.nnupsample4(kernel_conf_s3[:, 0:1, :, :])\n        kernel_conf5_s3 = self.nnupsample4(kernel_conf_s3[:, 1:2, :, :])\n        kernel_conf7_s3 = self.nnupsample4(kernel_conf_s3[:, 2:3, :, :])\n\n        guide3_s3 = 
self.iter_guide_layer3_s3(feature_s3)\n        guide5_s3 = self.iter_guide_layer5_s3(feature_s3)\n        guide7_s3 = self.iter_guide_layer7_s3(feature_s3)\n\n        guide3_s3 = kernel_trans(guide3_s3, self.encoder3)\n        guide5_s3 = kernel_trans(guide5_s3, self.encoder5)\n        guide7_s3 = kernel_trans(guide7_s3, self.encoder7)\n\n        guide3_s3 = prop_mask_s3*guide3_s3\n        guide5_s3 = prop_mask_s3*guide5_s3\n        guide7_s3 = prop_mask_s3*guide7_s3\n\n        guide3_s3 = self.nnupsample4(guide3_s3)\n        guide5_s3 = self.nnupsample4(guide5_s3)\n        guide7_s3 = self.nnupsample4(guide7_s3)\n\n        depth_s3 = self.nnupsample4(d_s3)\n        mask_s3 = self.nnupsample4(mask_s3)\n        depth3 = depth5 = depth7 = depth\n\n        for i in range(4):\n            depth3 = self.CSPN3_s3(guide3_s3, depth3, coarse_depth)\n            depth3 = mask_s3 * depth_s3 + (1 - mask_s3) * depth3\n            depth5 = self.CSPN5_s3(guide5_s3, depth5, coarse_depth)\n            depth5 = mask_s3 * depth_s3 + (1 - mask_s3) * depth5\n            depth7 = self.CSPN7_s3(guide7_s3, depth7, coarse_depth)\n            depth7 = mask_s3 * depth_s3 + (1 - mask_s3) * depth7\n\n        depth_s3 = kernel_conf3_s3 * depth3 + kernel_conf5_s3 * depth5 + kernel_conf7_s3 * depth7\n        refined_depth_s3 = depth_s3\n\n        #s2\n        mask_s2 = self.mask_layer_s2(feature_s2)\n        mask_s2 = torch.sigmoid(mask_s2)\n        mask_s2 = mask_s2*valid_mask_s2\n        prop_mask_s2 = self.prop_mask_layer_s2(feature_s2)\n        prop_mask_s2 = torch.sigmoid(prop_mask_s2)\n\n        kernel_conf_s2 = self.kernel_conf_layer_s2(feature_s2)\n        kernel_conf_s2 = self.softmax(kernel_conf_s2)\n        kernel_conf3_s2 = self.nnupsample(kernel_conf_s2[:, 0:1, :, :])\n        kernel_conf5_s2 = self.nnupsample(kernel_conf_s2[:, 1:2, :, :])\n        kernel_conf7_s2 = self.nnupsample(kernel_conf_s2[:, 2:3, :, :])\n\n        guide3_s2 = self.iter_guide_layer3_s2(feature_s2)\n        
guide5_s2 = self.iter_guide_layer5_s2(feature_s2)\n        guide7_s2 = self.iter_guide_layer7_s2(feature_s2)\n\n        guide3_s2 = kernel_trans(guide3_s2, self.encoder3)\n        guide5_s2 = kernel_trans(guide5_s2, self.encoder5)\n        guide7_s2 = kernel_trans(guide7_s2, self.encoder7)\n\n        guide3_s2 = prop_mask_s2*guide3_s2\n        guide5_s2 = prop_mask_s2*guide5_s2\n        guide7_s2 = prop_mask_s2*guide7_s2\n\n        guide3_s2 = self.nnupsample(guide3_s2)\n        guide5_s2 = self.nnupsample(guide5_s2)\n        guide7_s2 = self.nnupsample(guide7_s2)\n\n        depth_s2 = self.nnupsample(d_s2)\n        mask_s2 = self.nnupsample(mask_s2)\n        depth3 = depth5 = depth7 = refined_depth_s3\n\n        for i in range(4):\n            depth3 = self.CSPN3_s2(guide3_s2, depth3, depth_s3)\n            depth3 = mask_s2*depth_s2 + (1-mask_s2)*depth3\n            depth5 = self.CSPN5_s2(guide5_s2, depth5, depth_s3)\n            depth5 = mask_s2*depth_s2 + (1-mask_s2)*depth5\n            depth7 = self.CSPN7_s2(guide7_s2, depth7, depth_s3)\n            depth7 = mask_s2*depth_s2 + (1-mask_s2)*depth7\n\n        depth_s2 = kernel_conf3_s2*depth3 + kernel_conf5_s2*depth5 + kernel_conf7_s2*depth7\n        refined_depth_s2 = depth_s2\n\n        #s1\n        mask = self.mask_layer(feature_s1)\n        mask = torch.sigmoid(mask)\n        mask = mask*valid_mask\n        prop_mask = self.prop_mask_layer(feature_s1)\n        prop_mask = torch.sigmoid(prop_mask)\n\n        kernel_conf = self.kernel_conf_layer(feature_s1)\n        kernel_conf = self.softmax(kernel_conf)\n        kernel_conf3 = kernel_conf[:, 0:1, :, :]\n        kernel_conf5 = kernel_conf[:, 1:2, :, :]\n        kernel_conf7 = kernel_conf[:, 2:3, :, :]\n\n        guide3 = self.iter_guide_layer3(feature_s1)\n        guide5 = self.iter_guide_layer5(feature_s1)\n        guide7 = self.iter_guide_layer7(feature_s1)\n\n        guide3 = kernel_trans(guide3, self.encoder3)\n        guide5 = kernel_trans(guide5, 
self.encoder5)\n        guide7 = kernel_trans(guide7, self.encoder7)\n\n        guide3 = prop_mask*guide3\n        guide5 = prop_mask*guide5\n        guide7 = prop_mask*guide7\n\n        depth3 = depth5 = depth7 = refined_depth_s2\n\n        for i in range(4):\n            depth3 = self.CSPN3(guide3, depth3, depth_s2)\n            depth3 = mask*d + (1-mask)*depth3\n            depth5 = self.CSPN5(guide5, depth5, depth_s2)\n            depth5 = mask*d + (1-mask)*depth5\n            depth7 = self.CSPN7(guide7, depth7, depth_s2)\n            depth7 = mask*d + (1-mask)*depth7\n\n        refined_depth = kernel_conf3*depth3 + kernel_conf5*depth5 + kernel_conf7*depth7\n        return refined_depth\n\nclass PENet_C1_train(nn.Module):\n    def __init__(self, args):\n        super(PENet_C1_train, self).__init__()\n\n        self.backbone = ENet(args)\n        self.mask_layer = convbn(64, 3)\n\n        self.kernel_conf_layer = convbn(64, 3)\n        self.iter_conf_layer = convbn(64, 12)\n        self.iter_guide_layer3 = CSPNGenerate(64, 3)\n        self.iter_guide_layer5 = CSPNGenerate(64, 5)\n        self.iter_guide_layer7 = CSPNGenerate(64, 7)\n        self.softmax = nn.Softmax(dim=1)\n        self.CSPN3 = CSPN(3)\n        self.CSPN5 = CSPN(5)\n        self.CSPN7 = CSPN(7)\n\n        weights_init(self)\n\n    def forward(self, input):\n        #rgb = input['rgb']\n        d = input['d']\n        valid_mask = torch.where(d>0, torch.full_like(d, 1.0), torch.full_like(d, 0.0))\n\n        feature, coarse_depth = self.backbone(input)\n\n        mask = self.mask_layer(feature)\n        mask = torch.sigmoid(mask)\n        mask = mask*valid_mask\n        mask3 = mask[:, 0:1, :, :]\n        mask5 = mask[:, 1:2, :, :]\n        mask7 = mask[:, 2:3, :, :]\n\n        kernel_conf = self.kernel_conf_layer(feature)\n        kernel_conf = self.softmax(kernel_conf)\n        kernel_conf3 = kernel_conf[:, 0:1, :, :]\n        kernel_conf5 = kernel_conf[:, 1:2, :, :]\n        kernel_conf7 = 
kernel_conf[:, 2:3, :, :]\n\n        conf = self.iter_conf_layer(feature)\n        conf3 = conf[:, 0:4, :, :]\n        conf5 = conf[:, 4:8, :, :]\n        conf7 = conf[:, 8:12, :, :]\n        conf3 = self.softmax(conf3)\n        conf5 = self.softmax(conf5)\n        conf7 = self.softmax(conf7)\n\n        #guide3 = self.iter_guide_layer3(feature)\n        #guide5 = self.iter_guide_layer5(feature)\n        #guide7 = self.iter_guide_layer7(feature)\n\n        #init\n        depth = coarse_depth\n        depth3 = depth\n        depth5 = depth\n        depth7 = depth\n\n        d3_list = [i for i in range(4)]\n        d5_list = [i for i in range(4)]\n        d7_list = [i for i in range(4)]\n\n        #prop\n        guide3 = self.iter_guide_layer3(feature)\n        guide5 = self.iter_guide_layer5(feature)\n        guide7 = self.iter_guide_layer7(feature)\n\n        for i in range(12):\n            depth3 = self.CSPN3(guide3, depth3, depth)\n            depth3 = mask3*d + (1-mask3)*depth3\n            depth5 = self.CSPN5(guide5, depth5, depth)\n            depth5 = mask5*d + (1-mask5)*depth5\n            depth7 = self.CSPN7(guide7, depth7, depth)\n            depth7 = mask7*d + (1-mask7)*depth7\n\n            if(i==2):\n                d3_list[0] = depth3\n                d5_list[0] = depth5\n                d7_list[0] = depth7\n\n            if(i==5):\n                d3_list[1] = depth3\n                d5_list[1] = depth5\n                d7_list[1] = depth7\n\n            if(i==8):\n                d3_list[2] = depth3\n                d5_list[2] = depth5\n                d7_list[2] = depth7\n\n            if(i==11):\n                d3_list[3] = depth3\n                d5_list[3] = depth5\n                d7_list[3] = depth7\n\n        refined_depth = \\\n        d3_list[0] * (kernel_conf3 * conf3[:, 0:1, :, :]) + \\\n        d3_list[1] * (kernel_conf3 * conf3[:, 1:2, :, :]) + \\\n        d3_list[2] * (kernel_conf3 * conf3[:, 2:3, :, :]) + \\\n        d3_list[3] * 
(kernel_conf3 * conf3[:, 3:4, :, :]) + \\\n        d5_list[0] * (kernel_conf5 * conf5[:, 0:1, :, :]) + \\\n        d5_list[1] * (kernel_conf5 * conf5[:, 1:2, :, :]) + \\\n        d5_list[2] * (kernel_conf5 * conf5[:, 2:3, :, :]) + \\\n        d5_list[3] * (kernel_conf5 * conf5[:, 3:4, :, :]) + \\\n        d7_list[0] * (kernel_conf7 * conf7[:, 0:1, :, :]) + \\\n        d7_list[1] * (kernel_conf7 * conf7[:, 1:2, :, :]) + \\\n        d7_list[2] * (kernel_conf7 * conf7[:, 2:3, :, :]) + \\\n        d7_list[3] * (kernel_conf7 * conf7[:, 3:4, :, :])\n\n        return refined_depth\n\nclass PENet_C2_train(nn.Module):\n    def __init__(self, args):\n        super(PENet_C2_train, self).__init__()\n\n        self.backbone = ENet(args)\n\n        self.kernel_conf_layer = convbn(64, 3)\n        self.mask_layer = convbn(64, 1)\n        self.iter_guide_layer3 = CSPNGenerate(64, 3)\n        self.iter_guide_layer5 = CSPNGenerate(64, 5)\n        self.iter_guide_layer7 = CSPNGenerate(64, 7)\n\n        self.kernel_conf_layer_s2 = convbn(128, 3)\n        self.mask_layer_s2 = convbn(128, 1)\n        self.iter_guide_layer3_s2 = CSPNGenerate(128, 3)\n        self.iter_guide_layer5_s2 = CSPNGenerate(128, 5)\n        self.iter_guide_layer7_s2 = CSPNGenerate(128, 7)\n\n        self.dimhalf_s2 = convbnrelu(128, 64, 1, 1, 0)\n        self.att_12 = convbnrelu(128, 2)\n\n        self.upsample = nn.UpsamplingBilinear2d(scale_factor=2)\n        self.downsample = SparseDownSampleClose(stride=2)\n        self.softmax = nn.Softmax(dim=1)\n        self.CSPN3 = CSPN(3)\n        self.CSPN5 = CSPN(5)\n        self.CSPN7 = CSPN(7)\n\n        weights_init(self)\n\n    def forward(self, input):\n        d = input['d']\n        valid_mask = torch.where(d>0, torch.full_like(d, 1.0), torch.full_like(d, 0.0))\n\n        feature_s1, feature_s2, coarse_depth = self.backbone(input)\n        depth = coarse_depth\n\n        d_s2, valid_mask_s2 = self.downsample(d, valid_mask)\n        mask_s2 = 
self.mask_layer_s2(feature_s2)\n        mask_s2 = torch.sigmoid(mask_s2)\n        mask_s2 = mask_s2*valid_mask_s2\n\n        kernel_conf_s2 = self.kernel_conf_layer_s2(feature_s2)\n        kernel_conf_s2 = self.softmax(kernel_conf_s2)\n        kernel_conf3_s2 = kernel_conf_s2[:, 0:1, :, :]\n        kernel_conf5_s2 = kernel_conf_s2[:, 1:2, :, :]\n        kernel_conf7_s2 = kernel_conf_s2[:, 2:3, :, :]\n\n        mask = self.mask_layer(feature_s1)\n        mask = torch.sigmoid(mask)\n        mask = mask*valid_mask\n\n        kernel_conf = self.kernel_conf_layer(feature_s1)\n        kernel_conf = self.softmax(kernel_conf)\n        kernel_conf3 = kernel_conf[:, 0:1, :, :]\n        kernel_conf5 = kernel_conf[:, 1:2, :, :]\n        kernel_conf7 = kernel_conf[:, 2:3, :, :]\n\n        feature_12 = torch.cat((feature_s1, self.upsample(self.dimhalf_s2(feature_s2))), 1)\n        att_map_12 = self.softmax(self.att_12(feature_12))\n\n        guide3_s2 = self.iter_guide_layer3_s2(feature_s2)\n        guide5_s2 = self.iter_guide_layer5_s2(feature_s2)\n        guide7_s2 = self.iter_guide_layer7_s2(feature_s2)\n        guide3 = self.iter_guide_layer3(feature_s1)\n        guide5 = self.iter_guide_layer5(feature_s1)\n        guide7 = self.iter_guide_layer7(feature_s1)\n\n        depth_s2 = depth\n        depth_s2_00 = depth_s2[:, :, 0::2, 0::2]\n        depth_s2_01 = depth_s2[:, :, 0::2, 1::2]\n        depth_s2_10 = depth_s2[:, :, 1::2, 0::2]\n        depth_s2_11 = depth_s2[:, :, 1::2, 1::2]\n\n        depth_s2_00_h0 = depth3_s2_00 = depth5_s2_00 = depth7_s2_00 = depth_s2_00\n        depth_s2_01_h0 = depth3_s2_01 = depth5_s2_01 = depth7_s2_01 = depth_s2_01\n        depth_s2_10_h0 = depth3_s2_10 = depth5_s2_10 = depth7_s2_10 = depth_s2_10\n        depth_s2_11_h0 = depth3_s2_11 = depth5_s2_11 = depth7_s2_11 = depth_s2_11\n\n        for i in range(6):\n            depth3_s2_00 = self.CSPN3(guide3_s2, depth3_s2_00, depth_s2_00_h0)\n            depth3_s2_00 = mask_s2*d_s2 + 
(1-mask_s2)*depth3_s2_00\n            depth5_s2_00 = self.CSPN5(guide5_s2, depth5_s2_00, depth_s2_00_h0)\n            depth5_s2_00 = mask_s2*d_s2 + (1-mask_s2)*depth5_s2_00\n            depth7_s2_00 = self.CSPN7(guide7_s2, depth7_s2_00, depth_s2_00_h0)\n            depth7_s2_00 = mask_s2*d_s2 + (1-mask_s2)*depth7_s2_00\n\n            depth3_s2_01 = self.CSPN3(guide3_s2, depth3_s2_01, depth_s2_01_h0)\n            depth3_s2_01 = mask_s2*d_s2 + (1-mask_s2)*depth3_s2_01\n            depth5_s2_01 = self.CSPN5(guide5_s2, depth5_s2_01, depth_s2_01_h0)\n            depth5_s2_01 = mask_s2*d_s2 + (1-mask_s2)*depth5_s2_01\n            depth7_s2_01 = self.CSPN7(guide7_s2, depth7_s2_01, depth_s2_01_h0)\n            depth7_s2_01 = mask_s2*d_s2 + (1-mask_s2)*depth7_s2_01\n\n            depth3_s2_10 = self.CSPN3(guide3_s2, depth3_s2_10, depth_s2_10_h0)\n            depth3_s2_10 = mask_s2*d_s2 + (1-mask_s2)*depth3_s2_10\n            depth5_s2_10 = self.CSPN5(guide5_s2, depth5_s2_10, depth_s2_10_h0)\n            depth5_s2_10 = mask_s2*d_s2 + (1-mask_s2)*depth5_s2_10\n            depth7_s2_10 = self.CSPN7(guide7_s2, depth7_s2_10, depth_s2_10_h0)\n            depth7_s2_10 = mask_s2*d_s2 + (1-mask_s2)*depth7_s2_10\n\n            depth3_s2_11 = self.CSPN3(guide3_s2, depth3_s2_11, depth_s2_11_h0)\n            depth3_s2_11 = mask_s2*d_s2 + (1-mask_s2)*depth3_s2_11\n            depth5_s2_11 = self.CSPN5(guide5_s2, depth5_s2_11, depth_s2_11_h0)\n            depth5_s2_11 = mask_s2*d_s2 + (1-mask_s2)*depth5_s2_11\n            depth7_s2_11 = self.CSPN7(guide7_s2, depth7_s2_11, depth_s2_11_h0)\n            depth7_s2_11 = mask_s2*d_s2 + (1-mask_s2)*depth7_s2_11\n\n        depth_s2_00 = kernel_conf3_s2*depth3_s2_00 + kernel_conf5_s2*depth5_s2_00 + kernel_conf7_s2*depth7_s2_00\n        depth_s2_01 = kernel_conf3_s2*depth3_s2_01 + kernel_conf5_s2*depth5_s2_01 + kernel_conf7_s2*depth7_s2_01\n        depth_s2_10 = kernel_conf3_s2*depth3_s2_10 + kernel_conf5_s2*depth5_s2_10 + 
kernel_conf7_s2*depth7_s2_10\n        depth_s2_11 = kernel_conf3_s2*depth3_s2_11 + kernel_conf5_s2*depth5_s2_11 + kernel_conf7_s2*depth7_s2_11\n\n        depth_s2[:, :, 0::2, 0::2] = depth_s2_00\n        depth_s2[:, :, 0::2, 1::2] = depth_s2_01\n        depth_s2[:, :, 1::2, 0::2] = depth_s2_10\n        depth_s2[:, :, 1::2, 1::2] = depth_s2_11\n\n        #feature_12 = torch.cat((feature_s1, self.upsample(self.dimhalf_s2(feature_s2))), 1)\n        #att_map_12 = self.softmax(self.att_12(feature_12))\n        refined_depth_s2 = depth*att_map_12[:, 0:1, :, :] + depth_s2*att_map_12[:, 1:2, :, :]\n        #refined_depth_s2 = depth\n\n        depth3 = depth5 = depth7 = refined_depth_s2\n\n        #prop\n        for i in range(6):\n            depth3 = self.CSPN3(guide3, depth3, depth)\n            depth3 = mask*d + (1-mask)*depth3\n            depth5 = self.CSPN5(guide5, depth5, depth)\n            depth5 = mask*d + (1-mask)*depth5\n            depth7 = self.CSPN7(guide7, depth7, depth)\n            depth7 = mask*d + (1-mask)*depth7\n\n        refined_depth = kernel_conf3*depth3 + kernel_conf5*depth5 + kernel_conf7*depth7\n        return refined_depth\n"
  },
  {
    "path": "tools/PENet/vis_utils.py",
    "content": "import os\n\nimport matplotlib.pyplot as plt\nfrom PIL import Image\nimport numpy as np\nimport cv2\nfrom dataloaders import calibration_kitti\nfrom skimage import io\nimport cv2\n\ncmap = plt.cm.jet\ncmap2 = plt.cm.nipy_spectral\n\nfrom dataloaders.my_loader import depth2pointsrgb, depth2pointsrgbp\n\ndef validcrop(img):\n    ratio = 256/1216\n    h = img.size()[2]\n    w = img.size()[3]\n    return img[:, :, h-int(ratio*w):, :]\n\ndef depth_colorize(depth):\n    depth = (depth - np.min(depth)) / (np.max(depth) - np.min(depth))\n    depth = 255 * cmap(depth)[:, :, :3]  # H, W, C\n    return depth.astype('uint8')\n\ndef feature_colorize(feature):\n    feature = (feature - np.min(feature)) / ((np.max(feature) - np.min(feature)))\n    feature = 255 * cmap2(feature)[:, :, :3]\n    return feature.astype('uint8')\n\ndef mask_vis(mask):\n    mask = (mask - np.min(mask)) / (np.max(mask) - np.min(mask))\n    mask = 255 * mask\n    return mask.astype('uint8')\n\ndef merge_into_row(ele, pred, predrgb=None, predg=None, extra=None, extra2=None, extrargb=None):\n    def preprocess_depth(x):\n        y = np.squeeze(x.data.cpu().numpy())\n        return depth_colorize(y)\n\n    # if is gray, transforms to rgb\n    img_list = []\n    if 'rgb' in ele:\n        rgb = np.squeeze(ele['rgb'][0, ...].data.cpu().numpy())\n        rgb = np.transpose(rgb, (1, 2, 0))\n        img_list.append(rgb)\n    elif 'g' in ele:\n        g = np.squeeze(ele['g'][0, ...].data.cpu().numpy())\n        g = np.array(Image.fromarray(g).convert('RGB'))\n        img_list.append(g)\n    if 'd' in ele:\n        img_list.append(preprocess_depth(ele['d'][0, ...]))\n        img_list.append(preprocess_depth(pred[0, ...]))\n    if extrargb is not None:\n        img_list.append(preprocess_depth(extrargb[0, ...]))\n    if predrgb is not None:\n        predrgb = np.squeeze(ele['rgb'][0, ...].data.cpu().numpy())\n        predrgb = np.transpose(predrgb, (1, 2, 0))\n        #predrgb = 
predrgb.astype('uint8')\n        img_list.append(predrgb)\n    if predg is not None:\n        predg = np.squeeze(predg[0, ...].data.cpu().numpy())\n        predg = mask_vis(predg)\n        predg = np.array(Image.fromarray(predg).convert('RGB'))\n        #predg = predg.astype('uint8')\n        img_list.append(predg)\n    if extra is not None:\n        extra = np.squeeze(extra[0, ...].data.cpu().numpy())\n        extra = mask_vis(extra)\n        extra = np.array(Image.fromarray(extra).convert('RGB'))\n        img_list.append(extra)\n    if extra2 is not None:\n        extra2 = np.squeeze(extra2[0, ...].data.cpu().numpy())\n        extra2 = mask_vis(extra2)\n        extra2 = np.array(Image.fromarray(extra2).convert('RGB'))\n        img_list.append(extra2)\n    if 'gt' in ele:\n        img_list.append(preprocess_depth(ele['gt'][0, ...]))\n\n    img_merge = np.hstack(img_list)\n    return img_merge.astype('uint8')\n\n\ndef add_row(img_merge, row):\n    return np.vstack([img_merge, row])\n\n\ndef save_image(img_merge, filename):\n    image_to_write = cv2.cvtColor(img_merge, cv2.COLOR_RGB2BGR)\n    cv2.imwrite(filename, image_to_write)\n\ndef save_image_torch(rgb, filename):\n    #torch2numpy\n    rgb = validcrop(rgb)\n    rgb = np.squeeze(rgb[0, ...].data.cpu().numpy())\n    #print(rgb.size())\n    rgb = np.transpose(rgb, (1, 2, 0))\n    rgb = rgb.astype('uint8')\n    image_to_write = cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR)\n    cv2.imwrite(filename, image_to_write)\n\ndef save_depth_as_uint16png(img, filename):\n    #from tensor\n    img = np.squeeze(img.data.cpu().numpy())\n    img = (img * 256).astype('uint16')\n    cv2.imwrite(filename, img)\n\ndef get_fov_flag(pts_rect, img_shape, calib):\n    \"\"\"\n    Args:\n        pts_rect:\n        img_shape:\n        calib:\n\n    Returns:\n\n    \"\"\"\n    pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)\n    val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])\n    val_flag_2 = 
np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])\n    val_flag_merge = np.logical_and(val_flag_1, val_flag_2)\n    pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)\n    return pts_valid_flag\n\ndef save_depth_as_points(depth, idx, root_path):\n\n    file_idx = str(idx).zfill(6)\n    file_image_path = os.path.join(root_path, 'image_2', file_idx + '.png')\n    file_velo_path = os.path.join(root_path, 'velodyne', file_idx + '.bin')\n    file_calib = os.path.join(root_path, 'calib', file_idx + '.txt')\n\n    calib = calibration_kitti.Calibration(file_calib)\n\n    lidar = np.fromfile(str(file_velo_path), dtype=np.float32).reshape(-1, 4)\n    image = np.array(io.imread(file_image_path), dtype=np.int32)\n    image = image[:352, :1216]\n\n    pts_rect = calib.lidar_to_rect(lidar[:, 0:3])\n    fov_flag = get_fov_flag(pts_rect, image.shape, calib)\n    lidar = lidar[fov_flag]\n\n\n    paths = os.path.join(root_path, 'velodyne_depth')\n    if not os.path.exists(paths):\n        os.makedirs(paths)\n\n    out_path = os.path.join(paths, file_idx + '.npy')\n    depth = depth.cpu().detach().numpy().reshape(352, 1216,1)\n    final_points = depth2pointsrgbp(depth, image, calib, lidar)\n    final_points = final_points.astype(np.float16)\n    np.save(out_path, final_points)\n\n\ndef save_depth_as_uint16png_upload(img, filename):\n    #from tensor\n    img = np.squeeze(img.data.cpu().numpy())\n    img = (img * 256.0).astype('uint16')\n    img_buffer = img.tobytes()\n    imgsave = Image.new(\"I\", img.T.shape)\n    imgsave.frombytes(img_buffer, 'raw', \"I;16\")\n    imgsave.save(filename)\n\ndef save_depth_as_uint8colored(img, filename):\n    #from tensor\n    img = validcrop(img)\n    img = np.squeeze(img.data.cpu().numpy())\n    img = depth_colorize(img)\n    img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)\n    cv2.imwrite(filename, img)\n\ndef save_mask_as_uint8colored(img, filename, colored=True, normalized=True):\n    img = validcrop(img)\n    img = 
np.squeeze(img.data.cpu().numpy())\n    # Rescale to [0, 1] unless the caller says the mask is already normalized.\n    if not normalized:\n        img = (img - np.min(img)) / (np.max(img) - np.min(img))\n    if colored:\n        img = 255 * cmap(img)[:, :, :3]  # H, W, C (RGB)\n    else:\n        img = 255 * img\n    img = img.astype('uint8')\n    # RGB2BGR needs a 3-channel image; a grayscale mask is written as-is.\n    if img.ndim == 3:\n        img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)\n    cv2.imwrite(filename, img)\n\ndef save_feature_as_uint8colored(img, filename):\n    img = validcrop(img)\n    img = np.squeeze(img.data.cpu().numpy())\n    img = feature_colorize(img)\n    img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)\n    cv2.imwrite(filename, img)\n"
  },
  {
    "path": "tools/cfgs/dataset_configs/kitti_dataset.yaml",
    "content": "DATA_PATH: '../data/kitti'\n\nDATASET: 'KittiDataset'\n\nMM_PATH: 'velodyne_depth'\n\nPOINT_CLOUD_RANGE: [0, -40, -3, 70.4, 40, 1]\n\nDATA_SPLIT: {\n    'train': train,\n    'test': val\n}\n\nINFO_PATH: {\n    'train': [kitti_infos_train.pkl],\n    'test': [kitti_infos_val.pkl],\n}\n\nFOV_POINTS_ONLY: True\n\nX_TRANS:\n    AUG_CONFIG_LIST:\n        - NAME: world_rotation\n          WORLD_ROT_ANGLE: [0.39269908,0 , 0.39269908, -0.39269908, -0.39269908, 0]\n        - NAME: world_flip\n          ALONG_AXIS_LIST: [0, 1, 1, 0, 1, 0]\n        - NAME: world_scaling\n          WORLD_SCALE_RANGE: [ 0.98, 1.02, 1., 0.98, 1.02, 1.]\n\nDATA_AUGMENTOR:\n    DISABLE_AUG_LIST: ['placeholder']\n    AUG_CONFIG_LIST:\n        - NAME: gt_sampling\n          USE_ROAD_PLANE: True\n          DB_INFO_PATH:\n              - kitti_dbinfos_train.pkl\n          PREPARE: {\n             filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],\n             filter_by_difficulty: [-1],\n          }\n\n          SAMPLE_GROUPS: ['Car:15','Pedestrian:10', 'Cyclist:10']\n          NUM_POINT_FEATURES: 4\n          DATABASE_WITH_FAKELIDAR: False\n          REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]\n          LIMIT_WHOLE_SCENE: False\n\n        - NAME: random_world_flip\n          ALONG_AXIS_LIST: ['x']\n\n        - NAME: random_world_rotation\n          WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]\n\n        - NAME: random_world_scaling\n          WORLD_SCALE_RANGE: [0.95, 1.05]\n\n\n\nPOINT_FEATURE_ENCODING: {\n    encoding_type: absolute_coordinates_encoding,\n    used_feature_list: ['x', 'y', 'z', 'intensity'],\n    src_feature_list: ['x', 'y', 'z', 'intensity'],\n}\n\n\nDATA_PROCESSOR:\n    - NAME: mask_points_and_boxes_outside_range\n      REMOVE_OUTSIDE_BOXES: True\n\n    - NAME: shuffle_points\n      SHUFFLE_ENABLED: {\n        'train': True,\n        'test': False\n      }\n\n    - NAME: transform_points_to_voxels\n      VOXEL_SIZE: [0.05, 0.05, 0.1]\n      MAX_POINTS_PER_VOXEL: 
5\n      MAX_NUMBER_OF_VOXELS: {\n        'train': 1600000,\n        'test': 4000000\n      }\n"
  },
  {
    "path": "tools/cfgs/models/kitti/TED-M.yaml",
    "content": "CLASS_NAMES: ['Car']\n\nDATA_CONFIG:\n    _BASE_CONFIG_: cfgs/dataset_configs/kitti_dataset.yaml\n    DATASET: 'KittiDatasetMM'\n    MM_PATH: 'velodyne_depth'\n    ROT_NUM: 3\n    USE_VAN: True\n\n    DATA_SPLIT: {\n        'train': train,\n        'test': val\n    }\n\n    INFO_PATH: {\n        'train': [kitti_infos_train.pkl],\n        'test': [kitti_infos_val.pkl],\n    }\n\n    DATA_AUGMENTOR:\n        DISABLE_AUG_LIST: ['placeholder']\n        AUG_CONFIG_LIST:\n            - NAME: gt_sampling\n              USE_ROAD_PLANE: True\n              DB_INFO_PATH:\n                  - kitti_dbinfos_train_mm.pkl\n              PREPARE: {\n                  filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],\n                  filter_by_difficulty: [-1],\n              }\n\n              SAMPLE_GROUPS: ['Car:10', 'Pedestrian:10', 'Cyclist:10']\n              NUM_POINT_FEATURES: 8\n              DATABASE_WITH_FAKELIDAR: False\n              REMOVE_EXTRA_WIDTH: [0.0, 0.0, -0.2]\n              LIMIT_WHOLE_SCENE: False\n\n            - NAME: da_sampling\n              USE_ROAD_PLANE: True\n              DB_INFO_PATH:\n                - kitti_dbinfos_train_mm.pkl\n              PREPARE: {\n                filter_by_min_points: ['Car:5'],\n                filter_by_difficulty: [-1],\n              }\n\n              SAMPLE_GROUPS: ['Car:10']\n\n              MIN_SAMPLING_DIS: 0\n              MAX_SAMPLING_DIS: 20\n              OCCLUSION_NOISE: 0.2\n              OCCLUSION_OFFSET: 2.\n              SAMPLING_METHOD: 'LiDAR-aware'\n              VERT_RES: 0.006\n              HOR_RES: 0.003\n\n              NUM_POINT_FEATURES: 8\n              DATABASE_WITH_FAKELIDAR: False\n              REMOVE_EXTRA_WIDTH: [0.0, 0.0, -0.2]\n              LIMIT_WHOLE_SCENE: False\n\n            - NAME: random_local_noise\n              LOCAL_ROT_RANGE: [-0.78539816, 0.78539816]\n              TRANSLATION_STD: [1.0, 1.0, 0.5]\n              GLOBAL_ROT_RANGE: [0.0, 
0.0]\n              EXTRA_WIDTH: [0.2, 0.2, 0.]\n\n            - NAME: random_world_rotation\n              WORLD_ROT_ANGLE: [-0.39269908, 0.39269908]\n\n            - NAME: random_world_scaling\n              WORLD_SCALE_RANGE: [0.95, 1.05]\n\n            - NAME: random_local_pyramid_aug\n              DROP_PROB: 0.25\n              SPARSIFY_PROB: 0.05\n              SPARSIFY_MAX_NUM: 50\n              SWAP_PROB: 0.1\n              SWAP_MAX_NUM: 50\n\n    X_TRANS:\n      AUG_CONFIG_LIST:\n        - NAME: world_rotation\n          WORLD_ROT_ANGLE: [0.39269908, 0, 0.39269908, -0.39269908, -0.39269908, 0]\n        - NAME: world_flip\n          ALONG_AXIS_LIST: [0, 1, 1, 0, 1, 0]\n        - NAME: world_scaling\n          WORLD_SCALE_RANGE: [ 0.98, 1.02, 1., 0.98, 1.02, 1.]\n\n\n    POINT_FEATURE_ENCODING: {\n        encoding_type: absolute_coordinates_encoding_mm,\n        used_feature_list: ['x', 'y', 'z', 'intensity'],\n        src_feature_list: ['x', 'y', 'z', 'intensity'],\n        num_features: 8\n    }\n\n    DATA_PROCESSOR:\n        - NAME: mask_points_and_boxes_outside_range\n          REMOVE_OUTSIDE_BOXES: True\n\n        - NAME: shuffle_points\n          SHUFFLE_ENABLED: {\n            'train': True,\n            'test': True\n          }\n\n        - NAME: transform_points_to_voxels\n          VOXEL_SIZE: [0.05, 0.05, 0.05]\n          MAX_POINTS_PER_VOXEL: 5\n          MAX_NUMBER_OF_VOXELS: {\n            'train': 16000,\n            'test': 40000\n          }\n\nMODEL:\n    NAME: VoxelRCNN\n\n    VFE:\n        NAME: MeanVFE\n        MODEL: 'max'\n\n    BACKBONE_3D:\n        NAME: TeMMVoxelBackBone8x\n        NUM_FILTERS: [16, 32, 64, 64]\n        RETURN_NUM_FEATURES_AS_DICT: True\n        OUT_FEATURES: 64\n        MM: True\n\n    MAP_TO_BEV:\n        NAME: BEVPool\n        NUM_BEV_FEATURES: 256\n        ALIGN_METHOD: 'max'\n\n\n    BACKBONE_2D:\n        NAME: BaseBEVBackbone\n\n        LAYER_NUMS: [4, 4]\n        LAYER_STRIDES: [1, 2]\n        NUM_FILTERS: 
[64, 128]\n        UPSAMPLE_STRIDES: [1, 2]\n        NUM_UPSAMPLE_FILTERS: [128, 128]\n\n    DENSE_HEAD:\n        NAME: AnchorHeadSingle\n        CLASS_AGNOSTIC: False\n\n        USE_DIRECTION_CLASSIFIER: True\n        DIR_OFFSET: 0.78539\n        DIR_LIMIT_OFFSET: 0.0\n        NUM_DIR_BINS: 2\n\n        ANCHOR_GENERATOR_CONFIG: [\n            {\n                'class_name': 'Car',\n                'anchor_sizes': [[3.9, 1.6, 1.56]],\n                'anchor_rotations': [0, 1.57],\n                'anchor_bottom_heights': [-1.78],\n                'align_center': False,\n                'feature_map_stride': 8,\n                'matched_threshold': 0.6,\n                'unmatched_threshold': 0.45\n            }\n        ]\n        TARGET_ASSIGNER_CONFIG:\n            NAME: AxisAlignedTargetAssigner\n            POS_FRACTION: -1.0\n            SAMPLE_SIZE: 512\n            NORM_BY_NUM_EXAMPLES: False\n            MATCH_HEIGHT: False\n            BOX_CODER: ResidualCoder\n\n        LOSS_CONFIG:\n            LOSS_WEIGHTS: {\n                'cls_weight': 1.0,\n                'loc_weight': 2.0,\n                'dir_weight': 0.2,\n                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]\n            }\n\n\n    ROI_HEAD:\n        NAME: TEDMHead\n        CLASS_AGNOSTIC: True\n\n        SHARED_FC: [256, 256]\n        CLS_FC: [256, 256]\n        REG_FC: [256, 256]\n        DP_RATIO: 0.01\n\n        PART:\n          IN_CHANNEL: 256\n          SIZE: 7\n          GRID_OFFSETS: [0., 40.]\n          FEATMAP_STRIDE: 0.4\n\n        NMS_CONFIG:\n            TRAIN:\n                NMS_TYPE: nms_gpu\n                MULTI_CLASSES_NMS: False\n                NMS_PRE_MAXSIZE: 4000\n                NMS_POST_MAXSIZE: 512\n                NMS_THRESH: 0.8\n            TEST:\n                NMS_TYPE: nms_gpu\n                MULTI_CLASSES_NMS: False\n                USE_FAST_NMS: True\n                SCORE_THRESH: 0.0\n                NMS_PRE_MAXSIZE: 4000\n                
NMS_POST_MAXSIZE: 50\n                NMS_THRESH: 0.75\n\n        ROI_GRID_POOL:\n            FEATURES_SOURCE: ['x_conv3','x_conv4']\n            PRE_MLP: True\n            GRID_SIZE: 6\n            POOL_LAYERS:\n                x_conv3:\n                    MLPS: [[32, 32], [32, 32]]\n                    QUERY_RANGES: [[2, 2, 2], [4, 4, 4]]\n                    POOL_RADIUS: [0.4, 0.8]\n                    NSAMPLE: [16, 16]\n                    POOL_METHOD: max_pool\n                x_conv4:\n                    MLPS: [[32, 32], [32, 32]]\n                    QUERY_RANGES: [[2, 2, 2], [4, 4, 4]]\n                    POOL_RADIUS: [0.8, 1.6]\n                    NSAMPLE: [16, 16]\n                    POOL_METHOD: max_pool\n\n        ROI_GRID_POOL_MM:\n            FEATURES_SOURCE: ['x_conv3','x_conv4']\n            PRE_MLP: True\n            GRID_SIZE: 4\n            POOL_LAYERS:\n                x_conv3:\n                    MLPS: [[32, 32], [32, 32]]\n                    QUERY_RANGES: [[2, 2, 2], [4, 4, 4]]\n                    POOL_RADIUS: [0.4, 0.8]\n                    NSAMPLE: [16, 16]\n                    POOL_METHOD: max_pool\n                x_conv4:\n                    MLPS: [[32, 32], [32, 32]]\n                    QUERY_RANGES: [[2, 2, 2], [4, 4, 4]]\n                    POOL_RADIUS: [0.8, 1.6]\n                    NSAMPLE: [16, 16]\n                    POOL_METHOD: max_pool\n\n\n        TARGET_CONFIG:\n            BOX_CODER: ResidualCoder\n            ROI_PER_IMAGE: 160\n            FG_RATIO: 0.5\n            SAMPLE_ROI_BY_EACH_CLASS: True\n            CLS_SCORE_TYPE: roi_iou_x\n            CLS_FG_THRESH: [0.75]\n            CLS_BG_THRESH: [0.25]\n            CLS_BG_THRESH_LO: 0.1\n            HARD_BG_RATIO: 0.8\n            REG_FG_THRESH: [0.55]\n            ENABLE_HARD_SAMPLING: True\n            HARD_SAMPLING_THRESH: [0.5]\n            HARD_SAMPLING_RATIO: [0.5]\n\n\n        LOSS_CONFIG:\n            CLS_LOSS: BinaryCrossEntropy\n            REG_LOSS: 
smooth-l1\n            CORNER_LOSS_REGULARIZATION: True\n            GRID_3D_IOU_LOSS: False\n            LOSS_WEIGHTS: {\n                'rcnn_cls_weight': 1.0,\n                'rcnn_reg_weight': 1.0,\n                'rcnn_corner_weight': 1.0,\n                'rcnn_iou3d_weight': 1.0,\n                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]\n            }\n\n    POST_PROCESSING:\n        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]\n        SCORE_THRESH: 0.7\n        OUTPUT_RAW_SCORE: False\n        EVAL_METRIC: kitti\n\n        NMS_CONFIG:\n            MULTI_CLASSES_NMS: False\n            NMS_TYPE: nms_gpu\n            NMS_THRESH: 0.1\n            NMS_PRE_MAXSIZE: 4096\n            NMS_POST_MAXSIZE: 500\n\n\nOPTIMIZATION:\n    BATCH_SIZE_PER_GPU: 2\n    NUM_EPOCHS: 30\n\n    OPTIMIZER: adam_onecycle\n    LR: 0.01\n    WEIGHT_DECAY: 0.01\n    MOMENTUM: 0.9\n\n    MOMS: [0.95, 0.85]\n    PCT_START: 0.4\n    DIV_FACTOR: 10\n    DECAY_STEP_LIST: [35, 45]\n    LR_DECAY: 0.1\n    LR_CLIP: 0.0000001\n\n    LR_WARMUP: False\n    WARMUP_EPOCH: 1\n\n    GRAD_NORM_CLIP: 10"
  },
  {
    "path": "tools/cfgs/models/kitti/TED-S.yaml",
    "content": "CLASS_NAMES: ['Car']\n\nDATA_CONFIG:\n    _BASE_CONFIG_: cfgs/dataset_configs/kitti_dataset.yaml\n    DATASET: 'KittiDataset'\n    ROT_NUM: 3\n    USE_VAN: True\n\n    DATA_SPLIT: {\n        'train': train,\n        'test': val\n    }\n\n    INFO_PATH: {\n        'train': [kitti_infos_train.pkl],\n        'test': [kitti_infos_val.pkl],\n    }\n\n    DATA_AUGMENTOR:\n        DISABLE_AUG_LIST: ['placeholder']\n        AUG_CONFIG_LIST:\n            - NAME: gt_sampling\n              USE_ROAD_PLANE: True\n              DB_INFO_PATH:\n                  - kitti_dbinfos_train.pkl\n              PREPARE: {\n                  filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],\n                  filter_by_difficulty: [-1],\n              }\n\n              SAMPLE_GROUPS: ['Car:10', 'Pedestrian:10', 'Cyclist:10']\n              NUM_POINT_FEATURES: 4\n              DATABASE_WITH_FAKELIDAR: False\n              REMOVE_EXTRA_WIDTH: [0.0, 0.0, -0.2]\n              LIMIT_WHOLE_SCENE: False\n\n            - NAME: da_sampling\n              USE_ROAD_PLANE: True\n              DB_INFO_PATH:\n                - kitti_dbinfos_train.pkl\n              PREPARE: {\n                filter_by_min_points: ['Car:5'],\n                filter_by_difficulty: [-1],\n              }\n\n              SAMPLE_GROUPS: ['Car:10']\n\n              MIN_SAMPLING_DIS: 0\n              MAX_SAMPLING_DIS: 20\n              OCCLUSION_NOISE: 0.2\n              OCCLUSION_OFFSET: 2.\n              SAMPLING_METHOD: 'LiDAR-aware'\n              VERT_RES: 0.006\n              HOR_RES: 0.003\n\n              NUM_POINT_FEATURES: 4\n              DATABASE_WITH_FAKELIDAR: False\n              REMOVE_EXTRA_WIDTH: [0.0, 0.0, -0.2]\n              LIMIT_WHOLE_SCENE: False\n\n            - NAME: random_local_noise\n              LOCAL_ROT_RANGE: [-0.78539816, 0.78539816]\n              TRANSLATION_STD: [1.0, 1.0, 0.5]\n              GLOBAL_ROT_RANGE: [0.0, 0.0]\n              EXTRA_WIDTH: [0.2, 0.2, 
0.]\n\n            - NAME: random_world_rotation\n              WORLD_ROT_ANGLE: [-0.39269908, 0.39269908]\n\n            - NAME: random_world_scaling\n              WORLD_SCALE_RANGE: [0.95, 1.05]\n\n            - NAME: random_local_pyramid_aug\n              DROP_PROB: 0.25\n              SPARSIFY_PROB: 0.05\n              SPARSIFY_MAX_NUM: 50\n              SWAP_PROB: 0.1\n              SWAP_MAX_NUM: 50\n\n    X_TRANS:\n      AUG_CONFIG_LIST:\n        - NAME: world_rotation\n          WORLD_ROT_ANGLE: [0.39269908, 0, 0.39269908, -0.39269908, -0.39269908, 0]\n        - NAME: world_flip\n          ALONG_AXIS_LIST: [0, 1, 1, 0, 1, 0]\n        - NAME: world_scaling\n          WORLD_SCALE_RANGE: [ 0.98, 1.02, 1., 0.98, 1.02, 1.]\n\n\n\n    POINT_FEATURE_ENCODING: {\n        encoding_type: absolute_coordinates_encoding_mm,\n        used_feature_list: ['x', 'y', 'z', 'intensity'],\n        src_feature_list: ['x', 'y', 'z', 'intensity'],\n        num_features: 4\n    }\n\n    DATA_PROCESSOR:\n        - NAME: mask_points_and_boxes_outside_range\n          REMOVE_OUTSIDE_BOXES: True\n\n        - NAME: shuffle_points\n          SHUFFLE_ENABLED: {\n            'train': True,\n            'test': True\n          }\n\n        - NAME: transform_points_to_voxels\n          VOXEL_SIZE: [0.05, 0.05, 0.05]  \n          MAX_POINTS_PER_VOXEL: 5\n          MAX_NUMBER_OF_VOXELS: {\n            'train': 16000,\n            'test': 40000\n          }\n\nMODEL:\n    NAME: VoxelRCNN\n\n    VFE:\n        NAME: MeanVFE\n        MODEL: 'max'\n\n    BACKBONE_3D:\n        NAME: TeVoxelBackBone8x\n        NUM_FILTERS: [16, 32, 64, 64]\n        RETURN_NUM_FEATURES_AS_DICT: True\n        OUT_FEATURES: 64\n\n    MAP_TO_BEV:\n        NAME: BEVPool\n        NUM_BEV_FEATURES: 256\n        ALIGN_METHOD: 'max'\n\n    BACKBONE_2D:\n        NAME: BaseBEVBackbone\n\n        LAYER_NUMS: [4, 4]\n        LAYER_STRIDES: [1, 2]\n        NUM_FILTERS: [64, 128]\n        UPSAMPLE_STRIDES: [1, 2]\n        
NUM_UPSAMPLE_FILTERS: [128, 128]\n\n    DENSE_HEAD:\n        NAME: AnchorHeadSingle\n        CLASS_AGNOSTIC: False\n\n        USE_DIRECTION_CLASSIFIER: True\n        DIR_OFFSET: 0.78539\n        DIR_LIMIT_OFFSET: 0.0\n        NUM_DIR_BINS: 2\n\n        ANCHOR_GENERATOR_CONFIG: [\n            {\n                'class_name': 'Car',\n                'anchor_sizes': [[3.9, 1.6, 1.56]],\n                'anchor_rotations': [0, 1.57],\n                'anchor_bottom_heights': [-1.78],\n                'align_center': False,\n                'feature_map_stride': 8,\n                'matched_threshold': 0.6,\n                'unmatched_threshold': 0.45\n            }\n        ]\n        TARGET_ASSIGNER_CONFIG:\n            NAME: AxisAlignedTargetAssigner\n            POS_FRACTION: -1.0\n            SAMPLE_SIZE: 512\n            NORM_BY_NUM_EXAMPLES: False\n            MATCH_HEIGHT: False\n            BOX_CODER: ResidualCoder\n\n        LOSS_CONFIG:\n            LOSS_WEIGHTS: {\n                'cls_weight': 1.0,\n                'loc_weight': 2.0,\n                'dir_weight': 0.2,\n                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]\n            }\n\n\n    ROI_HEAD:\n        NAME: TEDSHead\n        CLASS_AGNOSTIC: True\n\n        SHARED_FC: [256, 256]\n        CLS_FC: [256, 256]\n        REG_FC: [256, 256]\n        DP_RATIO: 0.01\n\n        NMS_CONFIG:\n            TRAIN:\n                NMS_TYPE: nms_gpu\n                MULTI_CLASSES_NMS: False\n                NMS_PRE_MAXSIZE: 4000\n                NMS_POST_MAXSIZE: 512\n                NMS_THRESH: 0.8\n            TEST:\n                NMS_TYPE: nms_gpu\n                MULTI_CLASSES_NMS: False\n                USE_FAST_NMS: True\n                SCORE_THRESH: 0.0\n                NMS_PRE_MAXSIZE: 4000\n                NMS_POST_MAXSIZE: 50\n                NMS_THRESH: 0.75\n\n        ROI_GRID_POOL:\n            FEATURES_SOURCE: ['x_conv3','x_conv4']\n            PRE_MLP: True\n            
GRID_SIZE: 6\n            POOL_LAYERS:\n                x_conv3:\n                    MLPS: [[32, 32], [32, 32]]\n                    QUERY_RANGES: [[2, 2, 2], [4, 4, 4]]\n                    POOL_RADIUS: [0.4, 0.8]\n                    NSAMPLE: [16, 16]\n                    POOL_METHOD: max_pool\n                x_conv4:\n                    MLPS: [[32, 32], [32, 32]]\n                    QUERY_RANGES: [[2, 2, 2], [4, 4, 4]]\n                    POOL_RADIUS: [0.8, 1.6]\n                    NSAMPLE: [16, 16]\n                    POOL_METHOD: max_pool\n\n\n        TARGET_CONFIG:\n            BOX_CODER: ResidualCoder\n            ROI_PER_IMAGE: 160\n            FG_RATIO: 0.5\n            SAMPLE_ROI_BY_EACH_CLASS: True\n            CLS_SCORE_TYPE: roi_iou_x\n            CLS_FG_THRESH: [0.75]\n            CLS_BG_THRESH: [0.25]\n            CLS_BG_THRESH_LO: 0.1\n            HARD_BG_RATIO: 0.8\n            REG_FG_THRESH: [0.55]\n            ENABLE_HARD_SAMPLING: True\n            HARD_SAMPLING_THRESH: [0.5]\n            HARD_SAMPLING_RATIO: [0.5]\n\n\n        LOSS_CONFIG:\n            CLS_LOSS: BinaryCrossEntropy\n            REG_LOSS: smooth-l1\n            CORNER_LOSS_REGULARIZATION: True\n            GRID_3D_IOU_LOSS: False\n            LOSS_WEIGHTS: {\n                'rcnn_cls_weight': 1.0,\n                'rcnn_reg_weight': 1.0,\n                'rcnn_corner_weight': 1.0,\n                'rcnn_iou3d_weight': 1.0,\n                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]\n            }\n\n    POST_PROCESSING:\n        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]\n        SCORE_THRESH: 0.25\n        OUTPUT_RAW_SCORE: False\n        EVAL_METRIC: kitti\n\n        NMS_CONFIG:\n            MULTI_CLASSES_NMS: False\n            NMS_TYPE: nms_gpu\n            NMS_THRESH: 0.1\n            NMS_PRE_MAXSIZE: 4096\n            NMS_POST_MAXSIZE: 500\n\n\nOPTIMIZATION:\n    BATCH_SIZE_PER_GPU: 2\n    NUM_EPOCHS: 40\n\n    OPTIMIZER: adam_onecycle\n    LR: 0.01\n    
WEIGHT_DECAY: 0.01\n    MOMENTUM: 0.9\n\n    MOMS: [0.95, 0.85]\n    PCT_START: 0.4\n    DIV_FACTOR: 10\n    DECAY_STEP_LIST: [35, 45]\n    LR_DECAY: 0.1\n    LR_CLIP: 0.0000001\n\n    LR_WARMUP: False\n    WARMUP_EPOCH: 1\n\n    GRAD_NORM_CLIP: 10"
  },
  {
    "path": "tools/dist_test.sh",
    "content": "#!/usr/bin/env bash\n\nCUDA_VISIBLE_DEVICES=1,2,3,4 nohup python3 -m torch.distributed.launch --nproc_per_node=4 test.py --launcher pytorch > log-test.txt &\n\n"
  },
  {
    "path": "tools/dist_train.sh",
    "content": "#!/usr/bin/env bash\n\nCUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 nohup python3 -m torch.distributed.launch --nproc_per_node=8 train.py --launcher pytorch > log.txt&\n"
  },
  {
    "path": "tools/eval_utils/eval_utils.py",
    "content": "import pickle\nimport time\n\nimport numpy as np\nimport torch\nimport tqdm\nimport os\n\nfrom pcdet.models import load_data_to_gpu\nfrom pcdet.utils import common_utils\nimport time\n\n\ndef statistics_info(cfg, ret_dict, metric, disp_dict):\n    for cur_thresh in cfg.MODEL.POST_PROCESSING.RECALL_THRESH_LIST:\n        metric['recall_roi_%s' % str(cur_thresh)] += ret_dict.get('roi_%s' % str(cur_thresh), 0)\n        metric['recall_rcnn_%s' % str(cur_thresh)] += ret_dict.get('rcnn_%s' % str(cur_thresh), 0)\n    metric['gt_num'] += ret_dict.get('gt', 0)\n    min_thresh = cfg.MODEL.POST_PROCESSING.RECALL_THRESH_LIST[0]\n    disp_dict['recall_%s' % str(min_thresh)] = \\\n        '(%d, %d) / %d' % (metric['recall_roi_%s' % str(min_thresh)], metric['recall_rcnn_%s' % str(min_thresh)], metric['gt_num'])\n\n\ndef eval_one_epoch(cfg, model, dataloader, epoch_id, logger, dist_test=False, save_to_file=True, result_dir=None):\n    result_dir.mkdir(parents=True, exist_ok=True)\n\n    final_output_dir = result_dir / 'final_result' / 'data'\n    if save_to_file:\n        final_output_dir.mkdir(parents=True, exist_ok=True)\n\n    metric = {\n        'gt_num': 0,\n    }\n    for cur_thresh in cfg.MODEL.POST_PROCESSING.RECALL_THRESH_LIST:\n        metric['recall_roi_%s' % str(cur_thresh)] = 0\n        metric['recall_rcnn_%s' % str(cur_thresh)] = 0\n\n    dataset = dataloader.dataset\n    class_names = dataset.class_names\n    det_annos = []\n\n    logger.info('*************** EPOCH %s EVALUATION *****************' % epoch_id)\n    if dist_test:\n        num_gpus = torch.cuda.device_count()\n        local_rank = cfg.LOCAL_RANK % num_gpus\n        model = torch.nn.parallel.DistributedDataParallel(\n                model,\n                device_ids=[local_rank],\n                broadcast_buffers=False\n        )\n    model.eval()\n\n    if cfg.LOCAL_RANK == 0:\n        progress_bar = tqdm.tqdm(total=len(dataloader), leave=True, desc='eval', dynamic_ncols=True)\n    
start_time = time.time()\n    for i, batch_dict in enumerate(dataloader):\n        load_data_to_gpu(batch_dict)\n        #begin = time.time()\n\n        with torch.no_grad():\n            pred_dicts, ret_dict, batch_dict = model(batch_dict)\n        disp_dict = {}\n        #end = time.time()\n        #print(end-begin)\n\n        statistics_info(cfg, ret_dict, metric, disp_dict)\n        annos = dataset.generate_prediction_dicts(\n            batch_dict, pred_dicts, class_names,\n            output_path=final_output_dir if save_to_file else None\n        )\n        det_annos += annos\n        if cfg.LOCAL_RANK == 0:\n            progress_bar.set_postfix(disp_dict)\n            progress_bar.update()\n\n    if cfg.LOCAL_RANK == 0:\n        progress_bar.close()\n\n    if dist_test:\n        rank, world_size = common_utils.get_dist_info()\n        det_annos = common_utils.merge_results_dist(det_annos, len(dataset), tmpdir=result_dir / 'tmpdir')\n        metric = common_utils.merge_results_dist([metric], world_size, tmpdir=result_dir / 'tmpdir')\n\n    logger.info('*************** Performance of EPOCH %s *****************' % epoch_id)\n    sec_per_example = (time.time() - start_time) / len(dataloader.dataset)\n    logger.info('Generate label finished(sec_per_example: %.4f second).' 
% sec_per_example)\n\n    if cfg.LOCAL_RANK != 0:\n        return {}\n\n    ret_dict = {}\n    if dist_test:\n        for key, val in metric[0].items():\n            for k in range(1, world_size):\n                metric[0][key] += metric[k][key]\n        metric = metric[0]\n\n    gt_num_cnt = metric['gt_num']\n    for cur_thresh in cfg.MODEL.POST_PROCESSING.RECALL_THRESH_LIST:\n        cur_roi_recall = metric['recall_roi_%s' % str(cur_thresh)] / max(gt_num_cnt, 1)\n        cur_rcnn_recall = metric['recall_rcnn_%s' % str(cur_thresh)] / max(gt_num_cnt, 1)\n        logger.info('recall_roi_%s: %f' % (cur_thresh, cur_roi_recall))\n        logger.info('recall_rcnn_%s: %f' % (cur_thresh, cur_rcnn_recall))\n        ret_dict['recall/roi_%s' % str(cur_thresh)] = cur_roi_recall\n        ret_dict['recall/rcnn_%s' % str(cur_thresh)] = cur_rcnn_recall\n\n    total_pred_objects = 0\n    for anno in det_annos:\n        total_pred_objects += anno['name'].__len__()\n    logger.info('Average predicted number of objects(%d samples): %.3f'\n                % (len(det_annos), total_pred_objects / max(1, len(det_annos))))\n\n    path = result_dir / 'result.pkl'\n    if os.path.exists(path):\n        path = result_dir / ('result_'+str(time.time())[:10]+'.pkl')\n\n    with open(path, 'wb') as f:\n        pickle.dump(det_annos, f)\n    \n    result_str, result_dict = dataset.evaluation(\n        det_annos, class_names,\n        eval_metric=cfg.MODEL.POST_PROCESSING.EVAL_METRIC,\n        output_path=final_output_dir\n    )\n\n    logger.info(result_str)\n    ret_dict.update(result_dict)\n\n    logger.info('Result is saved to %s' % result_dir)\n    logger.info('****************Evaluation done.*****************')\n    \n    return ret_dict\n\n\nif __name__ == '__main__':\n    pass\n"
  },
  {
    "path": "tools/test.py",
    "content": "import os\nimport argparse\nimport datetime\nimport glob\nimport re\nimport time\nfrom pathlib import Path\n\nimport numpy as np\nimport torch\nfrom tensorboardX import SummaryWriter\n\nfrom eval_utils import eval_utils\nfrom pcdet.config import cfg, cfg_from_list, cfg_from_yaml_file, log_config_to_file\nfrom pcdet.datasets import build_dataloader\nfrom pcdet.models import build_network\nfrom pcdet.utils import common_utils\nimport warnings\nwarnings.filterwarnings(\"ignore\")\n\ndef parse_config():\n    parser = argparse.ArgumentParser(description='arg parser')\n    parser.add_argument('--cfg_file', type=str, default=\"cfgs/models/kitti/TED-S.yaml\", help='specify the config for training')\n\n    parser.add_argument('--batch_size', type=int, default=None, required=False, help='batch size for training')\n    parser.add_argument('--workers', type=int, default=0, help='number of workers for dataloader')\n    parser.add_argument('--extra_tag', type=str, default='default', help='extra tag for this experiment')\n    parser.add_argument('--ckpt', type=str, default=\"TED-S.pth\", help='checkpoint to start from')\n    parser.add_argument('--launcher', choices=['none', 'pytorch', 'slurm'], default='none')\n    parser.add_argument('--tcp_port', type=int, default=18888, help='tcp port for distrbuted training')\n    parser.add_argument('--local_rank', type=int, default=0, help='local rank for distributed training')\n    parser.add_argument('--set', dest='set_cfgs', default=None, nargs=argparse.REMAINDER,\n                        help='set extra config keys if needed')\n\n    parser.add_argument('--max_waiting_mins', type=int, default=30, help='max waiting minutes')\n    parser.add_argument('--start_epoch', type=int, default=0, help='')\n    parser.add_argument('--eval_tag', type=str, default='default', help='eval tag for this experiment')\n    parser.add_argument('--eval_all', action='store_true', default=False, help='whether to evaluate all checkpoints')\n    
parser.add_argument('--ckpt_dir', type=str, default=None, help='specify a ckpt directory to be evaluated if needed')\n    parser.add_argument('--save_to_file', action='store_true', default=False, help='')\n\n    args = parser.parse_args()\n\n    cfg_from_yaml_file(args.cfg_file, cfg)\n    cfg.TAG = Path(args.cfg_file).stem\n    cfg.EXP_GROUP_PATH = '/'.join(args.cfg_file.split('/')[1:-1])  # remove 'cfgs' and 'xxxx.yaml'\n\n    np.random.seed(1024)\n\n\n    if args.set_cfgs is not None:\n        cfg_from_list(args.set_cfgs, cfg)\n\n    return args, cfg\n\n\ndef eval_single_ckpt(model, test_loader, args, eval_output_dir, logger, epoch_id, dist_test=False):\n    # load checkpoint\n    model.load_params_from_file(filename=args.ckpt, logger=logger, to_cpu=dist_test)\n    model.cuda()\n\n    # start evaluation\n    eval_utils.eval_one_epoch(\n        cfg, model, test_loader, epoch_id, logger, dist_test=dist_test,\n        result_dir=eval_output_dir, save_to_file=args.save_to_file\n    )\n\n\ndef get_no_evaluated_ckpt(ckpt_dir, ckpt_record_file, args):\n    ckpt_list = glob.glob(os.path.join(ckpt_dir, '*checkpoint_epoch_*.pth'))\n    ckpt_list.sort(key=os.path.getmtime)\n    evaluated_ckpt_list = [float(x.strip()) for x in open(ckpt_record_file, 'r').readlines()]\n\n    for cur_ckpt in ckpt_list:\n        num_list = re.findall('checkpoint_epoch_(.*).pth', cur_ckpt)\n        if num_list.__len__() == 0:\n            continue\n\n        epoch_id = num_list[-1]\n        if 'optim' in epoch_id:\n            continue\n        if float(epoch_id) not in evaluated_ckpt_list and int(float(epoch_id)) >= args.start_epoch:\n            return epoch_id, cur_ckpt\n    return -1, None\n\n\ndef repeat_eval_ckpt(model, test_loader, args, eval_output_dir, logger, ckpt_dir, dist_test=False):\n    # evaluated ckpt record\n    ckpt_record_file = eval_output_dir / ('eval_list_%s.txt' % cfg.DATA_CONFIG.DATA_SPLIT['test'])\n    with open(ckpt_record_file, 'a'):\n        pass\n\n    # tensorboard 
log\n    if cfg.LOCAL_RANK == 0:\n        tb_log = SummaryWriter(log_dir=str(eval_output_dir / ('tensorboard_%s' % cfg.DATA_CONFIG.DATA_SPLIT['test'])))\n    total_time = 0\n    first_eval = True\n    while True:\n        # check whether there is checkpoint which is not evaluated\n        cur_epoch_id, cur_ckpt = get_no_evaluated_ckpt(ckpt_dir, ckpt_record_file, args)\n        if cur_epoch_id == -1 or int(float(cur_epoch_id)) < args.start_epoch:\n            break\n        total_time = 0\n        first_eval = False\n\n        model.load_params_from_file(filename=cur_ckpt, logger=logger, to_cpu=dist_test)\n        model.cuda()\n\n        # start evaluation\n        cur_result_dir = eval_output_dir / ('epoch_%s' % cur_epoch_id) / cfg.DATA_CONFIG.DATA_SPLIT['test']\n        tb_dict = eval_utils.eval_one_epoch(\n            cfg, model, test_loader, cur_epoch_id, logger, dist_test=dist_test,\n            result_dir=cur_result_dir, save_to_file=args.save_to_file\n        )\n\n        if cfg.LOCAL_RANK == 0:\n            for key, val in tb_dict.items():\n                tb_log.add_scalar(key, val, cur_epoch_id)\n\n        # record this epoch which has been evaluated\n        with open(ckpt_record_file, 'a') as f:\n            print('%s' % cur_epoch_id, file=f)\n        logger.info('Epoch %s has been evaluated' % cur_epoch_id)\n\n\ndef main():\n    args, cfg = parse_config()\n    if args.launcher == 'none':\n        dist_test = False\n        total_gpus = 1\n    else:\n        total_gpus, cfg.LOCAL_RANK = getattr(common_utils, 'init_dist_%s' % args.launcher)(\n            args.tcp_port, args.local_rank, backend='nccl'\n        )\n        dist_test = True\n\n    if args.batch_size is None:\n        args.batch_size = cfg.OPTIMIZATION.BATCH_SIZE_PER_GPU\n    else:\n        assert args.batch_size % total_gpus == 0, 'Batch size should match the number of gpus'\n        args.batch_size = args.batch_size // total_gpus\n\n    output_dir = cfg.ROOT_DIR / 'output' / 
cfg.EXP_GROUP_PATH / cfg.TAG / args.extra_tag\n    output_dir.mkdir(parents=True, exist_ok=True)\n\n    eval_output_dir = output_dir / 'eval'\n\n    if not args.eval_all:\n        num_list = re.findall(r'\\d+', args.ckpt) if args.ckpt is not None else []\n        epoch_id = num_list[-1] if num_list.__len__() > 0 else 'no_number'\n        eval_output_dir = eval_output_dir / ('epoch_%s' % epoch_id) / cfg.DATA_CONFIG.DATA_SPLIT['test']\n    else:\n        eval_output_dir = eval_output_dir / 'eval_all_default'\n\n    if args.eval_tag is not None:\n        eval_output_dir = eval_output_dir / args.eval_tag\n\n    eval_output_dir.mkdir(parents=True, exist_ok=True)\n    log_file = eval_output_dir / ('log_eval_%s.txt' % datetime.datetime.now().strftime('%Y%m%d-%H%M%S'))\n    logger = common_utils.create_logger(log_file, rank=cfg.LOCAL_RANK)\n\n    # log to file\n    logger.info('**********************Start logging**********************')\n    gpu_list = os.environ['CUDA_VISIBLE_DEVICES'] if 'CUDA_VISIBLE_DEVICES' in os.environ.keys() else 'ALL'\n    logger.info('CUDA_VISIBLE_DEVICES=%s' % gpu_list)\n\n    if dist_test:\n        logger.info('total_batch_size: %d' % (total_gpus * args.batch_size))\n    for key, val in vars(args).items():\n        logger.info('{:16} {}'.format(key, val))\n    log_config_to_file(cfg, logger=logger)\n\n    ckpt_dir = args.ckpt_dir if args.ckpt_dir is not None else output_dir / 'ckpt'\n\n    test_set, test_loader, sampler = build_dataloader(\n        dataset_cfg=cfg.DATA_CONFIG,\n        class_names=cfg.CLASS_NAMES,\n        batch_size=args.batch_size,\n        dist=dist_test, workers=args.workers, logger=logger, training=False\n    )\n\n    model = build_network(model_cfg=cfg.MODEL, num_class=len(cfg.CLASS_NAMES), dataset=test_set)\n    with torch.no_grad():\n        if args.eval_all:\n            repeat_eval_ckpt(model, test_loader, args, eval_output_dir, logger, ckpt_dir, dist_test=dist_test)\n        else:\n            eval_single_ckpt(model, 
test_loader, args, eval_output_dir, logger, epoch_id, dist_test=dist_test)\n\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "tools/train.py",
    "content": "import os\nimport argparse\nimport datetime\nimport glob\nfrom pathlib import Path\nfrom test import repeat_eval_ckpt\nimport torch\nimport torch.distributed as dist\nimport torch.nn as nn\nfrom tensorboardX import SummaryWriter\n\nfrom pcdet.config import cfg, cfg_from_list, cfg_from_yaml_file, log_config_to_file\nfrom pcdet.datasets import build_dataloader\nfrom pcdet.models import build_network, model_fn_decorator\nfrom pcdet.utils import common_utils\nfrom train_utils.optimization import build_optimizer, build_scheduler\nfrom train_utils.train_utils import train_model\nimport warnings\nwarnings.filterwarnings(\"ignore\")\n\ndef parse_config():\n    parser = argparse.ArgumentParser(description='arg parser')\n    parser.add_argument('--cfg_file', type=str, default=\"cfgs/models/kitti/TED-M.yaml\", help='specify the config for training')\n    parser.add_argument('--batch_size', type=int, default=None, required=False, help='batch size for training')\n    parser.add_argument('--epochs', type=int, default=None, required=False, help='number of epochs to train for')\n    parser.add_argument('--workers', type=int, default=0, help='number of workers for dataloader')\n    parser.add_argument('--extra_tag', type=str, default='default', help='extra tag for this experiment')\n    parser.add_argument('--ckpt', type=str, default=None, help='checkpoint to start from')\n    parser.add_argument('--pretrained_model', type=str, default=None, help='pretrained_model')\n    parser.add_argument('--launcher', choices=['none', 'pytorch', 'slurm'], default='none')\n    parser.add_argument('--tcp_port', type=int, default=23271, help='tcp port for distrbuted training')\n    parser.add_argument('--sync_bn', action='store_true', default=False, help='whether to use sync bn')\n    parser.add_argument('--fix_random_seed', action='store_true', default=True, help='')\n    parser.add_argument('--ckpt_save_interval', type=int, default=1, help='number of training epochs')\n    
parser.add_argument('--local_rank', type=int, default=0, help='local rank for distributed training')\n    parser.add_argument('--max_ckpt_save_num', type=int, default=10, help='max number of saved checkpoint')\n    parser.add_argument('--merge_all_iters_to_one_epoch', action='store_true', default=False, help='')\n    parser.add_argument('--set', dest='set_cfgs', default=None, nargs=argparse.REMAINDER,\n                        help='set extra config keys if needed')\n\n    parser.add_argument('--max_waiting_mins', type=int, default=0, help='max waiting minutes')\n    parser.add_argument('--start_epoch', type=int, default=0, help='')\n    parser.add_argument('--save_to_file', action='store_true', default=False, help='')\n\n    args = parser.parse_args()\n\n    cfg_from_yaml_file(args.cfg_file, cfg)\n    cfg.TAG = Path(args.cfg_file).stem\n    cfg.EXP_GROUP_PATH = '/'.join(args.cfg_file.split('/')[1:-1])  # remove 'cfgs' and 'xxxx.yaml'\n\n    if args.set_cfgs is not None:\n        cfg_from_list(args.set_cfgs, cfg)\n\n    return args, cfg\n\ndef main():\n    args, cfg = parse_config()\n\n    if args.launcher == 'none':\n        dist_train = False\n        total_gpus = 1\n    else:\n        total_gpus, cfg.LOCAL_RANK = getattr(common_utils, 'init_dist_%s' % args.launcher)(\n            args.tcp_port, args.local_rank, backend='nccl'\n        )\n        dist_train = True\n\n    if args.batch_size is None:\n        args.batch_size = cfg.OPTIMIZATION.BATCH_SIZE_PER_GPU\n    else:\n        assert args.batch_size % total_gpus == 0, 'Batch size should match the number of gpus'\n        args.batch_size = args.batch_size // total_gpus\n\n    args.epochs = cfg.OPTIMIZATION.NUM_EPOCHS if args.epochs is None else args.epochs\n\n    if args.fix_random_seed:\n        common_utils.set_random_seed(666)\n\n    output_dir = cfg.ROOT_DIR / 'output' / cfg.EXP_GROUP_PATH / cfg.TAG / args.extra_tag\n    ckpt_dir = output_dir / 'ckpt'\n    output_dir.mkdir(parents=True, exist_ok=True)\n    
ckpt_dir.mkdir(parents=True, exist_ok=True)\n\n    log_file = output_dir / ('log_train_%s.txt' % datetime.datetime.now().strftime('%Y%m%d-%H%M%S'))\n    logger = common_utils.create_logger(log_file, rank=cfg.LOCAL_RANK)\n\n    # log to file\n    logger.info('**********************Start logging**********************')\n    gpu_list = os.environ['CUDA_VISIBLE_DEVICES'] if 'CUDA_VISIBLE_DEVICES' in os.environ.keys() else 'ALL'\n    logger.info('CUDA_VISIBLE_DEVICES=%s' % gpu_list)\n\n    if dist_train:\n        logger.info('total_batch_size: %d' % (total_gpus * args.batch_size))\n    for key, val in vars(args).items():\n        logger.info('{:16} {}'.format(key, val))\n    log_config_to_file(cfg, logger=logger)\n    if cfg.LOCAL_RANK == 0:\n        os.system('cp %s %s' % (args.cfg_file, output_dir))\n\n    tb_log = SummaryWriter(log_dir=str(output_dir / 'tensorboard')) if cfg.LOCAL_RANK == 0 else None\n\n    # -----------------------create dataloader & network & optimizer---------------------------\n    train_set, train_loader, train_sampler = build_dataloader(\n        dataset_cfg=cfg.DATA_CONFIG,\n        class_names=cfg.CLASS_NAMES,\n        batch_size=args.batch_size,\n        dist=dist_train, workers=args.workers,\n        logger=logger,\n        training=True,\n        merge_all_iters_to_one_epoch=args.merge_all_iters_to_one_epoch,\n        total_epochs=args.epochs,\n    )\n\n    model = build_network(model_cfg=cfg.MODEL, num_class=len(cfg.CLASS_NAMES), dataset=train_set)\n    if args.sync_bn:\n        model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)\n    model.cuda()\n\n    optimizer = build_optimizer(model, cfg.OPTIMIZATION)\n\n    # load checkpoint if it is possible\n    start_epoch = it = 0\n    last_epoch = -1\n    if args.pretrained_model is not None:\n        model.load_params_from_file(filename=args.pretrained_model, to_cpu=dist_train, logger=logger)\n\n    if args.ckpt is not None:\n        it, start_epoch = 
model.load_params_with_optimizer(args.ckpt, to_cpu=dist_train, optimizer=optimizer, logger=logger)\n        last_epoch = start_epoch + 1\n    else:\n        ckpt_list = glob.glob(str(ckpt_dir / '*checkpoint_epoch_*.pth'))\n        if len(ckpt_list) > 0:\n            ckpt_list.sort(key=os.path.getmtime)\n            it, start_epoch = model.load_params_with_optimizer(\n                ckpt_list[-1], to_cpu=dist_train, optimizer=optimizer, logger=logger\n            )\n            last_epoch = start_epoch + 1\n\n    model.train()  # call before wrapping in DistributedDataParallel so that frozen parameters are supported\n    if dist_train:\n        model = nn.parallel.DistributedDataParallel(model, device_ids=[cfg.LOCAL_RANK % torch.cuda.device_count()])#,find_unused_parameters=True\n    logger.info(model)\n\n    lr_scheduler, lr_warmup_scheduler = build_scheduler(\n        optimizer, total_iters_each_epoch=len(train_loader), total_epochs=args.epochs,\n        last_epoch=last_epoch, optim_cfg=cfg.OPTIMIZATION\n    )\n\n    # -----------------------start training---------------------------\n    logger.info('**********************Start training %s/%s(%s)**********************'\n                % (cfg.EXP_GROUP_PATH, cfg.TAG, args.extra_tag))\n    train_model(\n        model,\n        optimizer,\n        train_loader,\n        model_func=model_fn_decorator(),\n        lr_scheduler=lr_scheduler,\n        optim_cfg=cfg.OPTIMIZATION,\n        start_epoch=start_epoch,\n        total_epochs=args.epochs,\n        start_iter=it,\n        rank=cfg.LOCAL_RANK,\n        tb_log=tb_log,\n        ckpt_save_dir=ckpt_dir,\n        train_sampler=train_sampler,\n        lr_warmup_scheduler=lr_warmup_scheduler,\n        ckpt_save_interval=args.ckpt_save_interval,\n        max_ckpt_save_num=args.max_ckpt_save_num,\n        merge_all_iters_to_one_epoch=args.merge_all_iters_to_one_epoch\n    )\n\n    logger.info('**********************End training %s/%s(%s)**********************\\n\\n\\n'\n                % 
(cfg.EXP_GROUP_PATH, cfg.TAG, args.extra_tag))\n\n    logger.info('**********************Start evaluation %s/%s(%s)**********************' %\n                (cfg.EXP_GROUP_PATH, cfg.TAG, args.extra_tag))\n\n    test_set, test_loader, sampler = build_dataloader(\n        dataset_cfg=cfg.DATA_CONFIG,\n        class_names=cfg.CLASS_NAMES,\n        batch_size=args.batch_size,\n        dist=dist_train, workers=args.workers, logger=logger, training=False\n    )\n    eval_output_dir = output_dir / 'eval' / ('eval_with_train')\n    eval_output_dir.mkdir(parents=True, exist_ok=True)\n    args.start_epoch = max(args.epochs - 10, 0)  # Only evaluate the last 10 epochs\n\n    repeat_eval_ckpt(\n        model.module if dist_train else model,\n        test_loader, args, eval_output_dir, logger, ckpt_dir,\n        dist_test=dist_train\n    )\n\n    logger.info('**********************End evaluation %s/%s(%s)**********************' %\n                (cfg.EXP_GROUP_PATH, cfg.TAG, args.extra_tag))\n\n\n\n\nif __name__ == '__main__':\n    main()\n\n"
  },
  {
    "path": "tools/train_utils/optimization/__init__.py",
    "content": "from functools import partial\n\nimport torch.nn as nn\nimport torch.optim as optim\nimport torch.optim.lr_scheduler as lr_sched\n\nfrom .fastai_optim import OptimWrapper\nfrom .learning_schedules_fastai import CosineWarmupLR, OneCycle,CosineWarmup\n\n\ndef build_optimizer(model, optim_cfg):\n    if optim_cfg.OPTIMIZER == 'adam':\n        optimizer = optim.Adam(model.parameters(), lr=optim_cfg.LR, weight_decay=optim_cfg.WEIGHT_DECAY)\n    elif optim_cfg.OPTIMIZER == 'sgd':\n        optimizer = optim.SGD(\n            model.parameters(), lr=optim_cfg.LR, weight_decay=optim_cfg.WEIGHT_DECAY,\n            momentum=optim_cfg.MOMENTUM\n        )\n    elif optim_cfg.OPTIMIZER == 'adam_onecycle' or optim_cfg.OPTIMIZER == 'adam_cosin':\n        def children(m: nn.Module):\n            return list(m.children())\n\n        def num_children(m: nn.Module) -> int:\n            return len(children(m))\n\n        flatten_model = lambda m: sum(map(flatten_model, m.children()), []) if num_children(m) else [m]\n        get_layer_groups = lambda m: [nn.Sequential(*flatten_model(m))]\n\n        optimizer_func = partial(optim.Adam, betas=(0.9, 0.99))\n        optimizer = OptimWrapper.create(\n            optimizer_func, 3e-3, get_layer_groups(model), wd=optim_cfg.WEIGHT_DECAY, true_wd=True, bn_wd=True\n        )\n    else:\n        raise NotImplementedError\n\n    return optimizer\n\n\ndef build_scheduler(optimizer, total_iters_each_epoch, total_epochs, last_epoch, optim_cfg):\n    decay_steps = [x * total_iters_each_epoch for x in optim_cfg.DECAY_STEP_LIST]\n    def lr_lbmd(cur_epoch):\n        cur_decay = 1\n        for decay_step in decay_steps:\n            if cur_epoch >= decay_step:\n                cur_decay = cur_decay * optim_cfg.LR_DECAY\n        return max(cur_decay, optim_cfg.LR_CLIP / optim_cfg.LR)\n\n    lr_warmup_scheduler = None\n    total_steps = total_iters_each_epoch * total_epochs\n    if optim_cfg.OPTIMIZER == 'adam_onecycle':\n        lr_scheduler 
= OneCycle(\n            optimizer, total_steps, optim_cfg.LR, list(optim_cfg.MOMS), optim_cfg.DIV_FACTOR, optim_cfg.PCT_START\n        )\n    elif optim_cfg.OPTIMIZER == 'adam_cosin':\n        lr_scheduler = CosineWarmup(\n            optimizer, total_steps, optim_cfg.WARMUP_EPOCH * total_iters_each_epoch, optim_cfg.LR, list(optim_cfg.MOMS),\n            optim_cfg.DIV_FACTOR, optim_cfg.PCT_START\n        )\n    else:\n        lr_scheduler = lr_sched.LambdaLR(optimizer, lr_lbmd, last_epoch=last_epoch)\n\n        if optim_cfg.LR_WARMUP:\n            lr_warmup_scheduler = CosineWarmupLR(\n                optimizer, T_max=optim_cfg.WARMUP_EPOCH * total_iters_each_epoch,\n                eta_min=optim_cfg.LR / optim_cfg.DIV_FACTOR\n            )\n\n    return lr_scheduler, lr_warmup_scheduler\n"
  },
  {
    "path": "tools/train_utils/optimization/fastai_optim.py",
    "content": "# This file is modified from https://github.com/traveller59/second.pytorch\n\nfrom collections.abc import Iterable\n\nimport torch\nfrom torch import nn\nfrom torch._utils import _unflatten_dense_tensors\nfrom torch.nn.utils import parameters_to_vector\n\nbn_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d, nn.SyncBatchNorm)\n\n\ndef split_bn_bias(layer_groups):\n    \"Split the layers in `layer_groups` into batchnorm (`bn_types`) and non-batchnorm groups.\"\n    split_groups = []\n    for l in layer_groups:\n        l1, l2 = [], []\n        for c in l.children():\n            if isinstance(c, bn_types):\n                l2.append(c)\n            else:\n                l1.append(c)\n        split_groups += [nn.Sequential(*l1), nn.Sequential(*l2)]\n    return split_groups\n\n\ndef get_master(layer_groups, flat_master: bool = False):\n    \"Return two lists, one for the model parameters in FP16 and one for the master parameters in FP32.\"\n    split_groups = split_bn_bias(layer_groups)\n    model_params = [[param for param in lg.parameters() if param.requires_grad] for lg in split_groups]\n    if flat_master:\n        master_params = []\n        for lg in model_params:\n            if len(lg) != 0:\n                mp = parameters_to_vector([param.data.float() for param in lg])\n                mp = torch.nn.Parameter(mp, requires_grad=True)\n                if mp.grad is None: mp.grad = mp.new(*mp.size())\n                master_params.append([mp])\n            else:\n                master_params.append([])\n        return model_params, master_params\n    else:\n        master_params = [[param.clone().float().detach() for param in lg] for lg in model_params]\n        for mp in master_params:\n            for param in mp: param.requires_grad = True\n        return model_params, master_params\n\n\ndef model_g2master_g(model_params, master_params, flat_master: bool = False) -> None:\n    \"Copy the `model_params` gradients to `master_params` for the 
optimizer step.\"\n    if flat_master:\n        for model_group, master_group in zip(model_params, master_params):\n            if len(master_group) != 0:\n                master_group[0].grad.data.copy_(parameters_to_vector([p.grad.data.float() for p in model_group]))\n    else:\n        for model_group, master_group in zip(model_params, master_params):\n            for model, master in zip(model_group, master_group):\n                if model.grad is not None:\n                    if master.grad is None: master.grad = master.data.new(*master.data.size())\n                    master.grad.data.copy_(model.grad.data)\n                else:\n                    master.grad = None\n\n\ndef master2model(model_params, master_params, flat_master: bool = False) -> None:\n    \"Copy `master_params` to `model_params`.\"\n    if flat_master:\n        for model_group, master_group in zip(model_params, master_params):\n            if len(model_group) != 0:\n                for model, master in zip(model_group, _unflatten_dense_tensors(master_group[0].data, model_group)):\n                    model.data.copy_(master)\n    else:\n        for model_group, master_group in zip(model_params, master_params):\n            for model, master in zip(model_group, master_group): model.data.copy_(master.data)\n\n\ndef listify(p=None, q=None):\n    \"Make `p` listy and the same length as `q`.\"\n    if p is None:\n        p = []\n    elif isinstance(p, str):\n        p = [p]\n    elif not isinstance(p, Iterable):\n        p = [p]\n    n = q if type(q) == int else len(p) if q is None else len(q)\n    if len(p) == 1: p = p * n\n    assert len(p) == n, f'List len mismatch ({len(p)} vs {n})'\n    return list(p)\n\n\ndef trainable_params(m: nn.Module):\n    \"Return list of trainable params in `m`.\"\n    res = filter(lambda p: p.requires_grad, m.parameters())\n    return res\n\n\ndef is_tuple(x) -> bool: return isinstance(x, tuple)\n\n\n# copy from fastai.\nclass OptimWrapper():\n    \"Basic 
wrapper around `opt` to simplify hyper-parameters changes.\"\n\n    def __init__(self, opt, wd, true_wd: bool = False, bn_wd: bool = True):\n        self.opt, self.true_wd, self.bn_wd = opt, true_wd, bn_wd\n        self.opt_keys = list(self.opt.param_groups[0].keys())\n        self.opt_keys.remove('params')\n        self.read_defaults()\n        self.wd = wd\n\n    @classmethod\n    def create(cls, opt_func, lr,\n               layer_groups, **kwargs):\n        \"Create an `optim.Optimizer` from `opt_func` with `lr`. Set lr on `layer_groups`.\"\n        split_groups = split_bn_bias(layer_groups)\n        opt = opt_func([{'params': trainable_params(l), 'lr': 0} for l in split_groups])\n        opt = cls(opt, **kwargs)\n        opt.lr, opt.opt_func = listify(lr, layer_groups), opt_func\n        return opt\n\n    def new(self, layer_groups):\n        \"Create a new `OptimWrapper` from `self` with another `layer_groups` but the same hyper-parameters.\"\n        opt_func = getattr(self, 'opt_func', self.opt.__class__)\n        split_groups = split_bn_bias(layer_groups)\n        opt = opt_func([{'params': trainable_params(l), 'lr': 0} for l in split_groups])\n        return self.create(opt_func, self.lr, layer_groups, wd=self.wd, true_wd=self.true_wd, bn_wd=self.bn_wd)\n\n    def __repr__(self) -> str:\n        return f'OptimWrapper over {repr(self.opt)}.\\nTrue weight decay: {self.true_wd}'\n\n    # Pytorch optimizer methods\n    def step(self) -> None:\n        \"Set weight decay and step optimizer.\"\n        # weight decay outside of optimizer step (AdamW)\n        if self.true_wd:\n            for lr, wd, pg1, pg2 in zip(self._lr, self._wd, self.opt.param_groups[::2], self.opt.param_groups[1::2]):\n                for p in pg1['params']:\n                    # When some parameters are fixed:  Shaoshuai Shi\n                    if p.requires_grad is False:\n                        continue\n                    p.data.mul_(1 - wd * lr)\n                if 
self.bn_wd:\n                    for p in pg2['params']:\n                        # When some parameters are fixed:  Shaoshuai Shi\n                        if p.requires_grad is False:\n                            continue\n                        p.data.mul_(1 - wd * lr)\n            self.set_val('weight_decay', listify(0, self._wd))\n        self.opt.step()\n\n    def zero_grad(self) -> None:\n        \"Clear optimizer gradients.\"\n        self.opt.zero_grad()\n\n    # Passthrough to the inner opt.\n    def __getattr__(self, k: str):\n        return getattr(self.opt, k, None)\n\n    def clear(self):\n        \"Reset the state of the inner optimizer.\"\n        sd = self.state_dict()\n        sd['state'] = {}\n        self.load_state_dict(sd)\n\n    # Hyperparameters as properties\n    @property\n    def lr(self) -> float:\n        return self._lr[-1]\n\n    @lr.setter\n    def lr(self, val: float) -> None:\n        self._lr = self.set_val('lr', listify(val, self._lr))\n\n    @property\n    def mom(self) -> float:\n        return self._mom[-1]\n\n    @mom.setter\n    def mom(self, val: float) -> None:\n        if 'momentum' in self.opt_keys:\n            self.set_val('momentum', listify(val, self._mom))\n        elif 'betas' in self.opt_keys:\n            self.set_val('betas', (listify(val, self._mom), self._beta))\n        self._mom = listify(val, self._mom)\n\n    @property\n    def beta(self) -> float:\n        return None if self._beta is None else self._beta[-1]\n\n    @beta.setter\n    def beta(self, val: float) -> None:\n        \"Set beta (or alpha as makes sense for given optimizer).\"\n        if val is None: return\n        if 'betas' in self.opt_keys:\n            self.set_val('betas', (self._mom, listify(val, self._beta)))\n        elif 'alpha' in self.opt_keys:\n            self.set_val('alpha', listify(val, self._beta))\n        self._beta = listify(val, self._beta)\n\n    @property\n    def wd(self) -> float:\n        return self._wd[-1]\n\n    
@wd.setter\n    def wd(self, val: float) -> None:\n        \"Set weight decay.\"\n        if not self.true_wd: self.set_val('weight_decay', listify(val, self._wd), bn_groups=self.bn_wd)\n        self._wd = listify(val, self._wd)\n\n    # Helper functions\n    def read_defaults(self) -> None:\n        \"Read the values inside the optimizer for the hyper-parameters.\"\n        self._beta = None\n        if 'lr' in self.opt_keys: self._lr = self.read_val('lr')\n        if 'momentum' in self.opt_keys: self._mom = self.read_val('momentum')\n        if 'alpha' in self.opt_keys: self._beta = self.read_val('alpha')\n        if 'betas' in self.opt_keys: self._mom, self._beta = self.read_val('betas')\n        if 'weight_decay' in self.opt_keys: self._wd = self.read_val('weight_decay')\n\n    def set_val(self, key: str, val, bn_groups: bool = True):\n        \"Set `val` inside the optimizer dictionary at `key`.\"\n        if is_tuple(val): val = [(v1, v2) for v1, v2 in zip(*val)]\n        for v, pg1, pg2 in zip(val, self.opt.param_groups[::2], self.opt.param_groups[1::2]):\n            pg1[key] = v\n            if bn_groups: pg2[key] = v\n        return val\n\n    def read_val(self, key: str):\n        \"Read a hyperparameter `key` in the optimizer dictionary.\"\n        val = [pg[key] for pg in self.opt.param_groups[::2]]\n        if is_tuple(val[0]): val = [o[0] for o in val], [o[1] for o in val]\n        return val\n\n\nclass FastAIMixedOptim(OptimWrapper):\n    @classmethod\n    def create(cls, opt_func, lr,\n               layer_groups, model, flat_master=False, loss_scale=512.0, **kwargs):\n        \"Create an `optim.Optimizer` from `opt_func` with `lr`. 
Set lr on `layer_groups`.\"\n        opt = OptimWrapper.create(opt_func, lr, layer_groups, **kwargs)\n        opt.model_params, opt.master_params = get_master(layer_groups, flat_master)\n        opt.flat_master = flat_master\n        opt.loss_scale = loss_scale\n        opt.model = model\n        # Changes the optimizer so that the optimization step is done in FP32.\n        # opt = self.learn.opt\n        mom, wd, beta = opt.mom, opt.wd, opt.beta\n        lrs = [lr for lr in opt._lr for _ in range(2)]\n        opt_params = [{'params': mp, 'lr': lr} for mp, lr in zip(opt.master_params, lrs)]\n        opt.opt = opt_func(opt_params)\n        opt.mom, opt.wd, opt.beta = mom, wd, beta\n        return opt\n\n    def step(self):\n        model_g2master_g(self.model_params, self.master_params, self.flat_master)\n        for group in self.master_params:\n            for param in group: param.grad.div_(self.loss_scale)\n        super(FastAIMixedOptim, self).step()\n        self.model.zero_grad()\n        # Update the params from master to model.\n        master2model(self.model_params, self.master_params, self.flat_master)\n"
  },
  {
    "path": "tools/train_utils/optimization/learning_schedules_fastai.py",
    "content": "# This file is modified from https://github.com/traveller59/second.pytorch\n\nimport math\nfrom functools import partial\n\nimport numpy as np\nimport torch.optim.lr_scheduler as lr_sched\n\nfrom .fastai_optim import OptimWrapper\n\n\nclass LRSchedulerStep(object):\n    def __init__(self, fai_optimizer: OptimWrapper, total_step, lr_phases,\n                 mom_phases):\n        # if not isinstance(fai_optimizer, OptimWrapper):\n        #     raise TypeError('{} is not a fastai OptimWrapper'.format(\n        #         type(fai_optimizer).__name__))\n        self.optimizer = fai_optimizer\n        self.total_step = total_step\n        self.lr_phases = []\n\n        for i, (start, lambda_func) in enumerate(lr_phases):\n            if len(self.lr_phases) != 0:\n                assert self.lr_phases[-1][0] < start\n            if isinstance(lambda_func, str):\n                lambda_func = eval(lambda_func)\n            if i < len(lr_phases) - 1:\n                self.lr_phases.append((int(start * total_step), int(lr_phases[i + 1][0] * total_step), lambda_func))\n            else:\n                self.lr_phases.append((int(start * total_step), total_step, lambda_func))\n        assert self.lr_phases[0][0] == 0\n        self.mom_phases = []\n        for i, (start, lambda_func) in enumerate(mom_phases):\n            if len(self.mom_phases) != 0:\n                assert self.mom_phases[-1][0] < start\n            if isinstance(lambda_func, str):\n                lambda_func = eval(lambda_func)\n            if i < len(mom_phases) - 1:\n                self.mom_phases.append((int(start * total_step), int(mom_phases[i + 1][0] * total_step), lambda_func))\n            else:\n                self.mom_phases.append((int(start * total_step), total_step, lambda_func))\n        assert self.mom_phases[0][0] == 0\n\n    def step(self, step):\n        for start, end, func in self.lr_phases:\n            if step >= start:\n                self.optimizer.lr = 
func((step - start) / (end - start))\n        for start, end, func in self.mom_phases:\n            if step >= start:\n                self.optimizer.mom = func((step - start) / (end - start))\n\n\ndef annealing_cos(start, end, pct):\n    # print(pct, start, end)\n    \"Cosine anneal from `start` to `end` as pct goes from 0.0 to 1.0.\"\n    cos_out = np.cos(np.pi * pct) + 1\n    return end + (start - end) / 2 * cos_out\n\n\nclass OneCycle(LRSchedulerStep):\n    def __init__(self, fai_optimizer, total_step, lr_max, moms, div_factor,\n                 pct_start):\n        self.lr_max = lr_max\n        self.moms = moms\n        self.div_factor = div_factor\n        self.pct_start = pct_start\n        a1 = int(total_step * self.pct_start)\n        a2 = total_step - a1\n        low_lr = self.lr_max / self.div_factor\n        lr_phases = ((0, partial(annealing_cos, low_lr, self.lr_max)),\n                     (self.pct_start,\n                      partial(annealing_cos, self.lr_max, low_lr / 1e4)))\n        mom_phases = ((0, partial(annealing_cos, *self.moms)),\n                      (self.pct_start, partial(annealing_cos,\n                                               *self.moms[::-1])))\n        fai_optimizer.lr, fai_optimizer.mom = low_lr, self.moms[0]\n        super().__init__(fai_optimizer, total_step, lr_phases, mom_phases)\n\nclass CosineWarmup():\n    def __init__(self, optimizer, total_step, up_steps, lr_max, moms, div_factor,\n                 pct_start):\n        self.scheme = OneCycle(optimizer, up_steps, lr_max, moms, div_factor,\n                 pct_start)\n        self.total_step = total_step\n        self.up_steps = up_steps\n\n    def step(self, step):\n\n        this_step = step%self.up_steps\n        self.scheme.step(this_step)\n\nclass CosineWarmupLR(lr_sched._LRScheduler):\n    def __init__(self, optimizer, T_max, eta_min=0, last_epoch=-1):\n        self.T_max = T_max\n        self.eta_min = eta_min\n        super(CosineWarmupLR, 
self).__init__(optimizer, last_epoch)\n\n    def get_lr(self):\n        return [self.eta_min + (base_lr - self.eta_min) *\n                (1 - math.cos(math.pi * self.last_epoch / self.T_max)) / 2\n                for base_lr in self.base_lrs]\n\n\nclass FakeOptim:\n    def __init__(self):\n        self.lr = 0\n        self.mom = 0\n\n\nif __name__ == \"__main__\":\n    import matplotlib.pyplot as plt\n\n    opt = FakeOptim()  # 3e-3, wd=0.4, div_factor=10\n    schd = CosineWarmup(opt, 1000,100, 3e-3, (0.95, 0.85), 10.0, 0.1)\n\n    lrs = []\n    moms = []\n    for i in range(1000):\n        schd.step(i)\n        lrs.append(opt.lr)\n        moms.append(opt.mom)\n    plt.plot(lrs)\n    # plt.plot(moms)\n    plt.show()\n    plt.plot(moms)\n    plt.show()\n"
  },
  {
    "path": "tools/train_utils/train_utils.py",
    "content": "import glob\nimport os\n\nimport torch\nimport tqdm\nfrom torch.nn.utils import clip_grad_norm_\nimport numpy as np\nfrom pcdet.models import load_data_to_gpu\nimport copy\nimport pcdet.datasets.augmentor.augmentor_utils as uti\n\n\ndef train_one_epoch(model, optimizer, train_loader, model_func, lr_scheduler, accumulated_iter, optim_cfg,\n                    rank, tbar, total_it_each_epoch, dataloader_iter, tb_log=None, leave_pbar=False):\n    if total_it_each_epoch == len(train_loader):\n        dataloader_iter = iter(train_loader)\n\n    if rank == 0:\n        pbar = tqdm.tqdm(total=total_it_each_epoch, leave=leave_pbar, desc='train', dynamic_ncols=True)\n\n    accus = 1\n\n    for cur_it in range(total_it_each_epoch):\n        try:\n            batch = next(dataloader_iter)\n        except StopIteration:\n            dataloader_iter = iter(train_loader)\n            batch = next(dataloader_iter)\n            print('new iters')\n\n        lr_scheduler.step(accumulated_iter)\n\n        try:\n            cur_lr = float(optimizer.lr)\n        except:\n            cur_lr = optimizer.param_groups[0]['lr']\n\n        if tb_log is not None:\n            tb_log.add_scalar('meta_data/learning_rate', cur_lr, accumulated_iter)\n\n\n        model.train()\n\n        loss, tb_dict, disp_dict = model_func(model, batch)\n        loss = loss/accus\n        \n        loss.backward()\n\n        if ((cur_it + 1) % accus) == 0:\n            clip_grad_norm_(model.parameters(), optim_cfg.GRAD_NORM_CLIP)\n            optimizer.step()\n            optimizer.zero_grad()\n        \n        accumulated_iter += 1\n        disp_dict.update({'loss': loss.item()*accus, 'lr': cur_lr})\n\n        # log to console and tensorboard\n        if rank == 0:\n            pbar.update()\n            pbar.set_postfix(dict(total_it=accumulated_iter))\n            tbar.set_postfix(disp_dict)\n            tbar.refresh()\n\n            if tb_log is not None:\n                
tb_log.add_scalar('train/loss', loss, accumulated_iter)\n                tb_log.add_scalar('meta_data/learning_rate', cur_lr, accumulated_iter)\n                for key, val in tb_dict.items():\n                    tb_log.add_scalar('train/' + key, val, accumulated_iter)\n    if rank == 0:\n        pbar.close()\n    return accumulated_iter\n\ndef train_model(model, optimizer, train_loader, model_func, lr_scheduler, optim_cfg,\n                start_epoch, total_epochs, start_iter, rank, tb_log, ckpt_save_dir, train_sampler=None,\n                lr_warmup_scheduler=None, ckpt_save_interval=1, max_ckpt_save_num=50,\n                merge_all_iters_to_one_epoch=False):\n    accumulated_iter = start_iter\n    with tqdm.trange(start_epoch, total_epochs, desc='epochs', dynamic_ncols=True, leave=(rank == 0)) as tbar:\n        total_it_each_epoch = len(train_loader)\n        if merge_all_iters_to_one_epoch:\n            assert hasattr(train_loader.dataset, 'merge_all_iters_to_one_epoch')\n            train_loader.dataset.merge_all_iters_to_one_epoch(merge=True, epochs=total_epochs)\n            total_it_each_epoch = len(train_loader) // max(total_epochs, 1)\n\n        dataloader_iter = iter(train_loader)\n        for cur_epoch in tbar:\n            if train_sampler is not None:\n                train_sampler.set_epoch(cur_epoch)\n\n            # train one epoch\n            if lr_warmup_scheduler is not None:\n                cur_scheduler = lr_warmup_scheduler\n            else:\n                cur_scheduler = lr_scheduler\n            accumulated_iter = train_one_epoch(\n                model, optimizer, train_loader, model_func,\n                lr_scheduler=cur_scheduler,\n                accumulated_iter=accumulated_iter, optim_cfg=optim_cfg,\n                rank=rank, tbar=tbar, tb_log=tb_log,\n                leave_pbar=(cur_epoch + 1 == total_epochs),\n                total_it_each_epoch=total_it_each_epoch,\n                dataloader_iter=dataloader_iter\n     
       )\n\n            # save trained model\n            trained_epoch = cur_epoch + 1\n            if trained_epoch % ckpt_save_interval == 0 and rank == 0:\n\n                ckpt_list = glob.glob(str(ckpt_save_dir / 'checkpoint_epoch_*.pth'))\n                ckpt_list.sort(key=os.path.getmtime)\n\n                if ckpt_list.__len__() >= max_ckpt_save_num:\n                    for cur_file_idx in range(0, len(ckpt_list) - max_ckpt_save_num + 1):\n                        os.remove(ckpt_list[cur_file_idx])\n\n                ckpt_name = ckpt_save_dir / ('checkpoint_epoch_%d' % trained_epoch)\n                save_checkpoint(\n                    checkpoint_state(model, optimizer, trained_epoch, accumulated_iter), filename=ckpt_name,\n                )\n\ndef model_state_to_cpu(model_state):\n    model_state_cpu = type(model_state)()  # ordered dict\n    for key, val in model_state.items():\n        model_state_cpu[key] = val.cpu()\n    return model_state_cpu\n\n\ndef checkpoint_state(model=None, optimizer=None, epoch=None, it=None):\n    optim_state = optimizer.state_dict() if optimizer is not None else None\n    if model is not None:\n        if isinstance(model, torch.nn.parallel.DistributedDataParallel):\n            model_state = model_state_to_cpu(model.module.state_dict())\n        else:\n            model_state = model.state_dict()\n    else:\n        model_state = None\n\n    try:\n        import pcdet\n        version = 'pcdet+' + pcdet.__version__\n    except:\n        version = 'none'\n\n    return {'epoch': epoch, 'it': it, 'model_state': model_state, 'optimizer_state': optim_state, 'version': version}\n\n\ndef save_checkpoint(state, filename='checkpoint'):\n    if False and 'optimizer_state' in state:\n        optimizer_state = state['optimizer_state']\n        state.pop('optimizer_state', None)\n        optimizer_filename = '{}_optim.pth'.format(filename)\n        torch.save({'optimizer_state': optimizer_state}, 
optimizer_filename,_use_new_zipfile_serialization=False)\n\n    filename = '{}.pth'.format(filename)\n    torch.save(state, filename,_use_new_zipfile_serialization=False)\n"
  },
  {
    "path": "tools/visual_utils/visualize_utils.py",
    "content": "import mayavi.mlab as mlab\nimport numpy as np\nimport torch\n\nbox_colormap = [\n    [1, 1, 1],\n    [0, 1, 0],\n    [0, 1, 1],\n    [1, 1, 0],\n]\n\n\ndef check_numpy_to_torch(x):\n    if isinstance(x, np.ndarray):\n        return torch.from_numpy(x).float(), True\n    return x, False\n\n\ndef rotate_points_along_z(points, angle):\n    \"\"\"\n    Args:\n        points: (B, N, 3 + C)\n        angle: (B), angle along z-axis, angle increases x ==> y\n    Returns:\n\n    \"\"\"\n    points, is_numpy = check_numpy_to_torch(points)\n    angle, _ = check_numpy_to_torch(angle)\n\n    cosa = torch.cos(angle)\n    sina = torch.sin(angle)\n    zeros = angle.new_zeros(points.shape[0])\n    ones = angle.new_ones(points.shape[0])\n    rot_matrix = torch.stack((\n        cosa,  sina, zeros,\n        -sina, cosa, zeros,\n        zeros, zeros, ones\n    ), dim=1).view(-1, 3, 3).float()\n    points_rot = torch.matmul(points[:, :, 0:3], rot_matrix)\n    points_rot = torch.cat((points_rot, points[:, :, 3:]), dim=-1)\n    return points_rot.numpy() if is_numpy else points_rot\n\n\ndef boxes_to_corners_3d(boxes3d):\n    \"\"\"\n        7 -------- 4\n       /|         /|\n      6 -------- 5 .\n      | |        | |\n      . 
3 -------- 0\n      |/         |/\n      2 -------- 1\n    Args:\n        boxes3d:  (N, 7) [x, y, z, dx, dy, dz, heading], (x, y, z) is the box center\n\n    Returns:\n    \"\"\"\n    boxes3d, is_numpy = check_numpy_to_torch(boxes3d)\n\n    template = boxes3d.new_tensor((\n        [1, 1, -1], [1, -1, -1], [-1, -1, -1], [-1, 1, -1],\n        [1, 1, 1], [1, -1, 1], [-1, -1, 1], [-1, 1, 1],\n    )) / 2\n\n    corners3d = boxes3d[:, None, 3:6].repeat(1, 8, 1) * template[None, :, :]\n    corners3d = rotate_points_along_z(corners3d.view(-1, 8, 3), boxes3d[:, 6]).view(-1, 8, 3)\n    corners3d += boxes3d[:, None, 0:3]\n\n    return corners3d.numpy() if is_numpy else corners3d\n\n\ndef visualize_pts(pts, fig=None, bgcolor=(0, 0, 0), fgcolor=(1.0, 1.0, 1.0),\n                  show_intensity=False, size=(600, 600), draw_origin=True):\n    if not isinstance(pts, np.ndarray):\n        pts = pts.cpu().numpy()\n    if fig is None:\n        fig = mlab.figure(figure=None, bgcolor=bgcolor, fgcolor=fgcolor, engine=None, size=size)\n\n    if show_intensity:\n        G = mlab.points3d(pts[:, 0], pts[:, 1], pts[:, 2], pts[:, 3], mode='point',\n                          colormap='gnuplot', scale_factor=1, figure=fig)\n    else:\n        G = mlab.points3d(pts[:, 0], pts[:, 1], pts[:, 2], mode='point',\n                          colormap='gnuplot', scale_factor=1, figure=fig)\n    if draw_origin:\n        mlab.points3d(0, 0, 0, color=(1, 1, 1), mode='cube', scale_factor=0.2)\n        mlab.plot3d([0, 3], [0, 0], [0, 0], color=(0, 0, 1), tube_radius=0.1)\n        mlab.plot3d([0, 0], [0, 3], [0, 0], color=(0, 1, 0), tube_radius=0.1)\n        mlab.plot3d([0, 0], [0, 0], [0, 3], color=(1, 0, 0), tube_radius=0.1)\n\n    return fig\n\n\ndef draw_sphere_pts(pts, color=(0, 1, 0), fig=None, bgcolor=(0, 0, 0), scale_factor=0.2):\n    if not isinstance(pts, np.ndarray):\n        pts = pts.cpu().numpy()\n\n    if fig is None:\n        fig = mlab.figure(figure=None, bgcolor=bgcolor, fgcolor=None, 
engine=None, size=(600, 600))\n\n    if isinstance(color, np.ndarray) and color.shape[0] == 1:\n        color = color[0]\n        color = (color[0] / 255.0, color[1] / 255.0, color[2] / 255.0)\n\n    if isinstance(color, np.ndarray):\n        pts_color = np.zeros((pts.__len__(), 4), dtype=np.uint8)\n        pts_color[:, 0:3] = color\n        pts_color[:, 3] = 255\n        G = mlab.points3d(pts[:, 0], pts[:, 1], pts[:, 2], np.arange(0, pts_color.__len__()), mode='sphere',\n                          scale_factor=scale_factor, figure=fig)\n        G.glyph.color_mode = 'color_by_scalar'\n        G.glyph.scale_mode = 'scale_by_vector'\n        G.module_manager.scalar_lut_manager.lut.table = pts_color\n    else:\n        mlab.points3d(pts[:, 0], pts[:, 1], pts[:, 2], mode='sphere', color=color,\n                      colormap='gnuplot', scale_factor=scale_factor, figure=fig)\n\n    mlab.points3d(0, 0, 0, color=(1, 1, 1), mode='cube', scale_factor=0.2)\n    mlab.plot3d([0, 3], [0, 0], [0, 0], color=(0, 0, 1), line_width=3, tube_radius=None, figure=fig)\n    mlab.plot3d([0, 0], [0, 3], [0, 0], color=(0, 1, 0), line_width=3, tube_radius=None, figure=fig)\n    mlab.plot3d([0, 0], [0, 0], [0, 3], color=(1, 0, 0), line_width=3, tube_radius=None, figure=fig)\n\n    return fig\n\n\ndef draw_grid(x1, y1, x2, y2, fig, tube_radius=None, color=(0.5, 0.5, 0.5)):\n    mlab.plot3d([x1, x1], [y1, y2], [0, 0], color=color, tube_radius=tube_radius, line_width=1, figure=fig)\n    mlab.plot3d([x2, x2], [y1, y2], [0, 0], color=color, tube_radius=tube_radius, line_width=1, figure=fig)\n    mlab.plot3d([x1, x2], [y1, y1], [0, 0], color=color, tube_radius=tube_radius, line_width=1, figure=fig)\n    mlab.plot3d([x1, x2], [y2, y2], [0, 0], color=color, tube_radius=tube_radius, line_width=1, figure=fig)\n    return fig\n\n\ndef draw_multi_grid_range(fig, grid_size=20, bv_range=(-60, -60, 60, 60)):\n    for x in range(bv_range[0], bv_range[2], grid_size):\n        for y in range(bv_range[1], 
bv_range[3], grid_size):\n            fig = draw_grid(x, y, x + grid_size, y + grid_size, fig)\n\n    return fig\n\n\ndef draw_scenes(points, gt_boxes=None, ref_boxes=None, ref_scores=None, ref_labels=None):\n    if not isinstance(points, np.ndarray):\n        points = points.cpu().numpy()\n    if ref_boxes is not None and not isinstance(ref_boxes, np.ndarray):\n        ref_boxes = ref_boxes.cpu().numpy()\n    if gt_boxes is not None and not isinstance(gt_boxes, np.ndarray):\n        gt_boxes = gt_boxes.cpu().numpy()\n    if ref_scores is not None and not isinstance(ref_scores, np.ndarray):\n        ref_scores = ref_scores.cpu().numpy()\n    if ref_labels is not None and not isinstance(ref_labels, np.ndarray):\n        ref_labels = ref_labels.cpu().numpy()\n\n    fig = visualize_pts(points)\n    fig = draw_multi_grid_range(fig, bv_range=(0, -40, 80, 40))\n    if gt_boxes is not None:\n        corners3d = boxes_to_corners_3d(gt_boxes)\n        fig = draw_corners3d(corners3d, fig=fig, color=(0, 0, 1), max_num=100)\n\n    if ref_boxes is not None and len(ref_boxes) > 0:\n        ref_corners3d = boxes_to_corners_3d(ref_boxes)\n        if ref_labels is None:\n            fig = draw_corners3d(ref_corners3d, fig=fig, color=(0, 1, 0), cls=ref_scores, max_num=100)\n        else:\n            for k in range(ref_labels.min(), ref_labels.max() + 1):\n                cur_color = tuple(box_colormap[k % len(box_colormap)])\n                mask = (ref_labels == k)\n                fig = draw_corners3d(ref_corners3d[mask], fig=fig, color=cur_color, cls=ref_scores[mask], max_num=100)\n    mlab.view(azimuth=-179, elevation=54.0, distance=104.0, roll=90.0)\n    return fig\n\n\ndef draw_corners3d(corners3d, fig, color=(1, 1, 1), line_width=2, cls=None, tag='', max_num=500, tube_radius=None):\n    \"\"\"\n    :param corners3d: (N, 8, 3)\n    :param fig:\n    :param color:\n    :param line_width:\n    :param cls:\n    :param tag:\n    :param max_num:\n    :return:\n    \"\"\"\n    
num = min(max_num, len(corners3d))  # mlab is already imported at module level\n    for n in range(num):\n        b = corners3d[n]  # (8, 3)\n\n        if cls is not None:\n            # label the box at corner 6: scores as 2-decimal floats, anything else as a string\n            if isinstance(cls, np.ndarray):\n                mlab.text3d(b[6, 0], b[6, 1], b[6, 2], '%.2f' % cls[n], scale=(0.3, 0.3, 0.3), color=color, figure=fig)\n            else:\n                mlab.text3d(b[6, 0], b[6, 1], b[6, 2], '%s' % cls[n], scale=(0.3, 0.3, 0.3), color=color, figure=fig)\n\n        for k in range(0, 4):\n            # bottom face edge (corners 0-3)\n            i, j = k, (k + 1) % 4\n            mlab.plot3d([b[i, 0], b[j, 0]], [b[i, 1], b[j, 1]], [b[i, 2], b[j, 2]], color=color, tube_radius=tube_radius,\n                        line_width=line_width, figure=fig)\n\n            # top face edge (corners 4-7)\n            i, j = k + 4, (k + 1) % 4 + 4\n            mlab.plot3d([b[i, 0], b[j, 0]], [b[i, 1], b[j, 1]], [b[i, 2], b[j, 2]], color=color, tube_radius=tube_radius,\n                        line_width=line_width, figure=fig)\n\n            # vertical edge joining bottom and top faces\n            i, j = k, k + 4\n            mlab.plot3d([b[i, 0], b[j, 0]], [b[i, 1], b[j, 1]], [b[i, 2], b[j, 2]], color=color, tube_radius=tube_radius,\n                        line_width=line_width, figure=fig)\n\n        # diagonal cross on the +x (heading) face, corners 0, 1, 4, 5\n        i, j = 0, 5\n        mlab.plot3d([b[i, 0], b[j, 0]], [b[i, 1], b[j, 1]], [b[i, 2], b[j, 2]], color=color, tube_radius=tube_radius,\n                    line_width=line_width, figure=fig)\n        i, j = 1, 4\n        mlab.plot3d([b[i, 0], b[j, 0]], [b[i, 1], b[j, 1]], [b[i, 2], b[j, 2]], color=color, tube_radius=tube_radius,\n                    line_width=line_width, figure=fig)\n\n    return fig\n"
  }
]